Singing the Praises of Strings
There is a new way to work with textual data in MATLAB R2016b. The new string datatype hadn't gotten enough attention from me until recently. I have been chatting with colleagues Matt Tearle and Adam Sifounakis, and we have each discovered a similar, beautiful code pattern in MATLAB for generating a sequence of strings.
MathWorks History with Textual Data
Early on, MATLAB had character arrays. Let's create one.
myCharPets = ['dog ';'cat ';'fish']
myCharPets =
dog 
cat 
fish
Notice how I had to add a trailing blank to each of the first 2 pets because my final pet, a fish, required more memory (like Dory from Finding Nemo). Every row of a character array must have the same number of columns.
I can find my second pet, but, to be fair, I also have to remove the trailing blank.
pet2 = deblank(myCharPets(2,:))
pet2 = cat
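If padding by hand feels tedious, here is a minimal sketch of letting MATLAB do the padding and the trimming. It assumes the same three pet names; the variable trimmedPets is just an illustrative name.

```matlab
% char pads each row with trailing blanks to the length of the longest input
myCharPets = char('dog', 'cat', 'fish');
paddedSize = size(myCharPets);      % 3 rows by 4 columns

% cellstr removes the trailing blanks from every row in one call
trimmedPets = cellstr(myCharPets);
pet2 = trimmedPets{2};              % 'cat', with no trailing blank
```

The padding is still there in the character array; char simply saves me from counting blanks myself.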
With MATLAB 5.0, we introduced cell arrays and then cell arrays of strings. Since each cell contains its own MATLAB array, there is no need for each array to contain the same number of elements. So we can do this, exploiting some "new" syntax.
myCellPets = {'dog';'cat';'fish'}
myCellPets =
  3×1 cell array
    'dog'
    'cat'
    'fish'
I can find the second pet on the list, with some more, but similar, "new" syntax.
pet2 = myCellPets{2}
pet2 = cat
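To make the difference between the two kinds of indexing concrete, here is a small sketch using the same pets. The names petCell, petChars, nameLengths, and isCat are only for illustration.

```matlab
myCellPets = {'dog'; 'cat'; 'fish'};

% Parentheses return a 1x1 cell; braces return the char vector stored inside
petCell  = myCellPets(2);
petChars = myCellPets{2};

% Each cell holds its own array, so the lengths can differ
nameLengths = cellfun(@numel, myCellPets);   % [3; 3; 4]

% strcmp compares every cell against a single char vector
isCat = strcmp(myCellPets, 'cat');           % [false; true; false]
```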
String Datatype
In MATLAB Release R2016b, we introduced the notion of a string. Now I can create an array of textual data another way.
myStringPets = string(myCellPets)
myStringPets =
  3×1 string array
    "dog"
    "cat"
    "fish"
And I can find my second pet again.
pet2 = myStringPets(2)
pet2 = string "cat"
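One thing worth keeping in mind is that each element of a string array is a whole piece of text, not a single character. Here is a short sketch of what that means in practice, assuming the same pets. I build the strings with the string function, since that is the form used throughout this post.

```matlab
myStringPets = string({'dog'; 'cat'; 'fish'});

pet2 = myStringPets(2);                  % a 1x1 string
numElements = numel(pet2);               % 1 element, not 3 characters
nameLengths = strlength(myStringPets);   % characters per element: [3; 3; 4]

% Convert back to a char vector when older code expects one
pet2Chars = char(pet2);                  % 'cat'

% == compares whole strings elementwise
isCat = (myStringPets == 'cat');         % [false; true; false]
```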
I think the notation feels much more natural. And I can add strings together.
allofmypets = myStringPets(1) + ' & ' + myStringPets(2) + ' & ' + myStringPets(3)
allofmypets = string "dog & cat & fish"
Ok, yes, I really should vectorize that. And I can do that with strings!
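One possible way to vectorize the concatenation is the join method, which appears in the methods list for the string class. This is a sketch rather than the only way to do it.

```matlab
myStringPets = string({'dog'; 'cat'; 'fish'});

% join combines the elements of the column, placing the delimiter between them
allofmypets = join(myStringPets, ' & ');   % "dog & cat & fish"
```

This works unchanged no matter how many pets are in the array.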
But wait, there's more!
You may remember that Steve Eddins recently posted on my blog about implicit expansion. Well, we can take good advantage of that with strings.
Suppose I want to create an array of directory names, each with a year from a sequence embedded in it. Adding a number to a string converts the number to text and appends it.
dirnames = string('C:\work\data\yob\') + (2000:2010)'
dirnames =
  11×1 string array
    "C:\work\data\yob\2000"
    "C:\work\data\yob\2001"
    "C:\work\data\yob\2002"
    "C:\work\data\yob\2003"
    "C:\work\data\yob\2004"
    "C:\work\data\yob\2005"
    "C:\work\data\yob\2006"
    "C:\work\data\yob\2007"
    "C:\work\data\yob\2008"
    "C:\work\data\yob\2009"
    "C:\work\data\yob\2010"
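To see what the plus operator is doing for me here, this sketch spells out the number-to-string conversion as a separate step. The names root, years, and yearStrings are only for illustration.

```matlab
root  = string('C:\work\data\yob\');   % 1x1 string
years = (2000:2010)';                  % 11x1 double

% Convert the numbers to strings explicitly, then concatenate
yearStrings = string(years);           % 11x1 string array
dirnames    = root + yearStrings;      % scalar expands against the column

% Letting plus convert the numbers gives the same result
sameResult = isequal(dirnames, root + years);
```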
And if I want to add months, I can do that too.
quarterlyMonths = string({'Jan','Apr','Jul','Oct'});
dirname = string('C:\root\') + quarterlyMonths + (2000:2010)'
dirname =
  11×4 string array
  Columns 1 through 3
    "C:\root\Jan2000"    "C:\root\Apr2000"    "C:\root\Jul2000"
    "C:\root\Jan2001"    "C:\root\Apr2001"    "C:\root\Jul2001"
    "C:\root\Jan2002"    "C:\root\Apr2002"    "C:\root\Jul2002"
    "C:\root\Jan2003"    "C:\root\Apr2003"    "C:\root\Jul2003"
    "C:\root\Jan2004"    "C:\root\Apr2004"    "C:\root\Jul2004"
    "C:\root\Jan2005"    "C:\root\Apr2005"    "C:\root\Jul2005"
    "C:\root\Jan2006"    "C:\root\Apr2006"    "C:\root\Jul2006"
    "C:\root\Jan2007"    "C:\root\Apr2007"    "C:\root\Jul2007"
    "C:\root\Jan2008"    "C:\root\Apr2008"    "C:\root\Jul2008"
    "C:\root\Jan2009"    "C:\root\Apr2009"    "C:\root\Jul2009"
    "C:\root\Jan2010"    "C:\root\Apr2010"    "C:\root\Jul2010"
  Column 4
    "C:\root\Oct2000"
    "C:\root\Oct2001"
    "C:\root\Oct2002"
    "C:\root\Oct2003"
    "C:\root\Oct2004"
    "C:\root\Oct2005"
    "C:\root\Oct2006"
    "C:\root\Oct2007"
    "C:\root\Oct2008"
    "C:\root\Oct2009"
    "C:\root\Oct2010"
How cool is that!
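The shapes are what make this work: a row of months plus a column of years expands to a full grid. Here is a smaller sketch of the same pattern, with made-up variable names, that is easier to check by eye.

```matlab
months = string({'Jan', 'Jul'});   % 1x2 row
years  = (2000:2002)';             % 3x1 column

% Row + column expands to 3x2: one row per year, one column per month
labels    = months + years;
gridSize  = size(labels);          % [3 2]
oneLabel  = labels(2, 1);          % "Jan2001"

% Flatten the grid into a single column (column-major order) if a list is needed
allLabels = labels(:);             % 6x1, all Jan entries first, then Jul
```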
Is There More?
This is just the beginning for strings. You can find out what else is available now.
methods(string)
Methods for class string:

cellstr         extractAfter    le              split
char            extractBefore   lower           splitlines
compose         extractBetween  lt              startsWith
contains        ge              ne              strip
count           gt              pad             strlength
double          insertAfter     plus            upper
endsWith        insertBefore    replace
eq              ismissing       replaceBetween
erase           issorted        reverse
eraseBetween    join            sort
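To give a flavor of a few of these, here is a quick sketch applying four of the listed methods to my pets. The choice of methods and the variable names are mine, purely for illustration.

```matlab
myStringPets = string({'dog'; 'cat'; 'fish'});

shouting    = upper(myStringPets);                        % "DOG", "CAT", "FISH"
startsWithC = startsWith(myStringPets, 'c');              % [false; true; false]
hasLetterI  = contains(myStringPets, 'i');                % [false; false; true]
renamed     = replace(myStringPets, 'fish', 'goldfish');  % third element becomes "goldfish"
```

Each of these operates on the whole array at once, so there is no loop to write.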
And you can bet we have plans to add more capabilities for strings over time. What features would you like to see us add? Let us know here.