Loren on the Art of MATLAB

Turn ideas into MATLAB

Note

Loren on the Art of MATLAB has been archived and will not be updated.

Singing the Praises of Strings

MATLAB R2016b brings a new way to work with textual data. The new string datatype hadn't gotten enough attention from me until recently. I have been chatting with colleagues Matt Tearle and Adam Sifounakis, and we have each discovered a similar, beautiful code pattern in MATLAB for generating a sequence of strings.

MathWorks History with Textual Data

Early on, MATLAB had character arrays. Let's create one.

myCharPets = ['dog ';'cat ';'fish']
myCharPets =
dog 
cat 
fish

Notice how I had to add trailing blanks to the first two pets? Every row of a character array must have the same length, and my final pet, a fish, required more memory (like Dory from Finding Nemo).

I can find my second pet, but, to be fair, I also have to remove the trailing blank.

pet2 = deblank(myCharPets(2,:))
pet2 =
cat
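
As a small aside, the manual padding can be avoided. The sketch below assumes the same three pet names and uses char with several inputs, which pads each row with trailing blanks; strtrim then strips them again when a single row is pulled out. The variable name paddedPets is only for illustration.

```matlab
% char with several inputs pads each row with trailing blanks
% so that all rows end up with the same length.
paddedPets = char('dog', 'cat', 'fish');

% Same 3-by-4 character matrix as the hand-padded version.
sameAsManual = isequal(paddedPets, ['dog '; 'cat '; 'fish']);

% Pull out one row and strip the padding again.
pet2 = strtrim(paddedPets(2, :));
```

The padding still exists in memory; char only saves the typing.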

With MATLAB 5.0, we introduced cell arrays and then cell arrays of strings. Since each cell contains its own MATLAB array, there is no need for each array to contain the same number of elements. So we can do this, exploiting some "new" syntax.

myCellPets = {'dog';'cat';'fish'}
myCellPets =
  3×1 cell array
    'dog'
    'cat'
    'fish'

I can find the second pet on the list with some similar "new" syntax, curly-brace indexing.

pet2 = myCellPets{2}
pet2 =
cat

String Datatype

In MATLAB Release R2016b, we introduced the notion of a string. Now I can create an array of textual data another way.

myStringPets = string(myCellPets)
myStringPets = 
  3×1 string array
    "dog"
    "cat"
    "fish"

And I can find my second pet again.

pet2 = myStringPets(2)
pet2 = 
  string
    "cat"

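Because all three representations will coexist in real code for a while, here is a minimal sketch of converting between them. It rebuilds the pet list so that it runs on its own, and uses char, cellstr, and strlength, which all appear in the list of string methods near the end of this post. The names petAsChar, petsAsCell, and nameLengths are made up for the example.

```matlab
myCellPets   = {'dog'; 'cat'; 'fish'};
myStringPets = string(myCellPets);

% Convert one string element back to a character vector.
petAsChar = char(myStringPets(2));

% Convert the whole string array back to a cell array of character vectors.
petsAsCell = cellstr(myStringPets);

% Each element of a string array is a single item, so numel counts pets,
% while strlength counts the characters inside each pet's name.
numPets     = numel(myStringPets);
nameLengths = strlength(myStringPets);
```

Note that no trailing blanks are needed at any point: each string element keeps track of its own length.
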
I think the notation feels much more natural. And I can add strings together.

allofmypets = myStringPets(1) + ' & ' + myStringPets(2) + ' & ' + myStringPets(3)
allofmypets = 
  string
    "dog & cat & fish"

Ok, yes, I really should vectorize that. And I can do that with strings!
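
One possible vectorized form, sketched here with the join method (it appears in the methods list later in this post), glues all elements of the array together with a delimiter of my choosing. The pet list is rebuilt so the snippet stands alone.

```matlab
myStringPets = string({'dog'; 'cat'; 'fish'});

% Join every element of the column vector, placing ' & ' between neighbors.
allofmypets = join(myStringPets, ' & ');
```

This produces the same "dog & cat & fish" scalar string as the chained plus version, and it keeps working if a fourth pet joins the family.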

But wait, there's more!

You may remember that Steve Eddins recently posted on my blog about implicit expansion. Well, we can take good advantage of that with strings.

Suppose I want to create an array of directory names, each ending in one of a sequence of years.

dirnames = string('C:\work\data\yob\') + (2000:2010)'
dirnames = 
  11×1 string array
    "C:\work\data\yob\2000"
    "C:\work\data\yob\2001"
    "C:\work\data\yob\2002"
    "C:\work\data\yob\2003"
    "C:\work\data\yob\2004"
    "C:\work\data\yob\2005"
    "C:\work\data\yob\2006"
    "C:\work\data\yob\2007"
    "C:\work\data\yob\2008"
    "C:\work\data\yob\2009"
    "C:\work\data\yob\2010"

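To see what the one-line expression above is doing, here is a rough loop-based equivalent for comparison. It is only a sketch: the names prefix, years, and dirnamesLoop are introduced for illustration, and the loop builds a cell array of character vectors with num2str before converting it to a string array.

```matlab
prefix = 'C:\work\data\yob\';
years  = (2000:2010)';

% Loop version: build each name by hand.
dirnamesLoop = cell(numel(years), 1);
for k = 1:numel(years)
    dirnamesLoop{k} = [prefix num2str(years(k))];
end

% String version: the scalar string expands against the 11-by-1 column,
% and each number is converted to text before being appended.
dirnames = string(prefix) + years;

sameResult = isequal(dirnames, string(dirnamesLoop));
```

The string version says the same thing in one line, with no preallocation and no explicit number-to-text conversion.
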
And if I want to add months, I can do that too.

quarterlyMonths = string({'Jan','Apr','Jul','Oct'});
dirname = string('C:\root\') + quarterlyMonths + (2000:2010)'
dirname = 
  11×4 string array
  Columns 1 through 3
    "C:\root\Jan2000"    "C:\root\Apr2000"    "C:\root\Jul2000"
    "C:\root\Jan2001"    "C:\root\Apr2001"    "C:\root\Jul2001"
    "C:\root\Jan2002"    "C:\root\Apr2002"    "C:\root\Jul2002"
    "C:\root\Jan2003"    "C:\root\Apr2003"    "C:\root\Jul2003"
    "C:\root\Jan2004"    "C:\root\Apr2004"    "C:\root\Jul2004"
    "C:\root\Jan2005"    "C:\root\Apr2005"    "C:\root\Jul2005"
    "C:\root\Jan2006"    "C:\root\Apr2006"    "C:\root\Jul2006"
    "C:\root\Jan2007"    "C:\root\Apr2007"    "C:\root\Jul2007"
    "C:\root\Jan2008"    "C:\root\Apr2008"    "C:\root\Jul2008"
    "C:\root\Jan2009"    "C:\root\Apr2009"    "C:\root\Jul2009"
    "C:\root\Jan2010"    "C:\root\Apr2010"    "C:\root\Jul2010"
  Column 4
    "C:\root\Oct2000"
    "C:\root\Oct2001"
    "C:\root\Oct2002"
    "C:\root\Oct2003"
    "C:\root\Oct2004"
    "C:\root\Oct2005"
    "C:\root\Oct2006"
    "C:\root\Oct2007"
    "C:\root\Oct2008"
    "C:\root\Oct2009"
    "C:\root\Oct2010"

How cool is that!
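
The shape of that result follows the usual implicit expansion rules: a scalar plus a 1-by-4 row plus an 11-by-1 column gives an 11-by-4 array. The sketch below rebuilds the array and then picks pieces out of it; contains is one of the string methods listed later in this post, and the names years, julyDirs, and allDirs are made up for the example.

```matlab
quarterlyMonths = string({'Jan', 'Apr', 'Jul', 'Oct'});   % 1-by-4 row
years           = (2000:2010)';                           % 11-by-1 column
dirname = string('C:\root\') + quarterlyMonths + years;   % 11-by-4 result

% Rows follow the years and columns follow the months.
oneDir = dirname(3, 2);

% Logical indexing keeps only the July directories, as a column.
julyDirs = dirname(contains(dirname, 'Jul'));

% Flatten the whole grid into a single 44-by-1 list, e.g. for looping.
allDirs = dirname(:);
```

Here oneDir is the April 2002 entry, matching the third row and second column of the display above.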

Is There More?

This is just the beginning for strings. You can find out what else is available now.

methods(string)
Methods for class string:

cellstr         extractAfter    le              split           
char            extractBefore   lower           splitlines      
compose         extractBetween  lt              startsWith      
contains        ge              ne              strip           
count           gt              pad             strlength       
double          insertAfter     plus            upper           
endsWith        insertBefore    replace         
eq              ismissing       replaceBetween  
erase           issorted        reverse         
eraseBetween    join            sort            
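
To give a flavor of that list, here is a short sketch that tries a handful of the methods on the pet names from earlier. All of the functions called are taken from the list above; the variable names are just for illustration.

```matlab
pets = string({'dog'; 'cat'; 'fish'});

shouted   = upper(pets);                        % all uppercase
lengths   = strlength(pets);                    % characters per element
startsC   = startsWith(pets, 'c');              % logical mask
upgraded  = replace(pets, 'fish', 'goldfish');  % substring replacement
backwards = reverse(pets);                      % reverse each element
padded    = pad(pets);                          % pad to the longest length

% split undoes a join: a scalar string becomes a column of pieces.
pieces = split(string('dog & cat & fish'), ' & ');
```

Every one of these operates on the whole array at once, so there is no cellfun or loop in sight.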

And you can bet we have plans to add more capabilities for strings over time. What features would you like to see us add? Let us know here.




Published with MATLAB® R2016b

