{"id":2141,"date":"2016-12-22T11:42:06","date_gmt":"2016-12-22T16:42:06","guid":{"rendered":"https:\/\/blogs.mathworks.com\/loren\/?p=2141"},"modified":"2018-06-12T18:52:44","modified_gmt":"2018-06-12T23:52:44","slug":"singing-the-praises-of-strings","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/loren\/2016\/12\/22\/singing-the-praises-of-strings\/","title":{"rendered":"Singing the Praises of Strings"},"content":{"rendered":"\r\n<div class=\"content\"><!--introduction--><p>There is a new way to work with textual data in MATLAB R2016b. The new <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/string.html\"><tt>string<\/tt><\/a> datatype didn't get enough attention from me until recently. I have been chatting with colleagues <a href=\"https:\/\/www.mathworks.com\/matlabcentral\/profile\/authors\/1455089-matt-tearle\">Matt Tearle<\/a> and <a href=\"https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/51009-enigma?s_tid=srchtitle\">Adam Sifounakis<\/a> and we have each discovered a similar beautiful code pattern in MATLAB for generating a sequence of strings.<\/p><!--\/introduction--><h3>Contents<\/h3><div><ul><li><a href=\"#2ff5c85e-8be6-4a11-bd3a-4752bdb2d8d8\">MathWorks History with Textual Data<\/a><\/li><li><a href=\"#22a6560d-be2f-4d6a-9238-a8c2fb48600a\">String Datatype<\/a><\/li><li><a href=\"#2c047cad-83e1-4db1-8a84-e32d52343c08\">But wait, there's more!<\/a><\/li><li><a href=\"#2fcd942b-d216-4269-9bdb-86cc80ed7f32\">Is There More?<\/a><\/li><\/ul><\/div><h4>MathWorks History with Textual Data<a name=\"2ff5c85e-8be6-4a11-bd3a-4752bdb2d8d8\"><\/a><\/h4><p>Early on, MATLAB had character arrays.  
Let's create one.<\/p><pre class=\"codeinput\">myCharPets = [<span class=\"string\">'dog '<\/span>;<span class=\"string\">'cat '<\/span>;<span class=\"string\">'fish'<\/span>]\r\n<\/pre><pre class=\"codeoutput\">myCharPets =\r\ndog \r\ncat \r\nfish\r\n<\/pre><p>Notice how I had to add trailing blanks for the first two pets because my final pet, a fish, required more memory (like Dory from <a href=\"http:\/\/www.imdb.com\/title\/tt0266543\/\">Finding Nemo<\/a>?).<\/p><p>I can find my second pet, but, to be fair, I also have to remove the trailing blank.<\/p><pre class=\"codeinput\">pet2 = deblank(myCharPets(2,:))\r\n<\/pre><pre class=\"codeoutput\">pet2 =\r\ncat\r\n<\/pre><p>With MATLAB 5.0, we introduced <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/cell-arrays.html\"><tt>cell<\/tt> arrays<\/a> and then cell arrays of strings.  Since each cell contains its own MATLAB array, there is no need for each array to contain the same number of elements.  So we can do this, exploiting some \"new\" syntax.<\/p><pre class=\"codeinput\">myCellPets = {<span class=\"string\">'dog'<\/span>;<span class=\"string\">'cat'<\/span>;<span class=\"string\">'fish'<\/span>}\r\n<\/pre><pre class=\"codeoutput\">myCellPets =\r\n  3&times;1 cell array\r\n    'dog'\r\n    'cat'\r\n    'fish'\r\n<\/pre><p>I can find the second pet on the list, with some more, but similar, \"new\" syntax.<\/p><pre class=\"codeinput\">pet2 = myCellPets{2}\r\n<\/pre><pre class=\"codeoutput\">pet2 =\r\ncat\r\n<\/pre><h4>String Datatype<a name=\"22a6560d-be2f-4d6a-9238-a8c2fb48600a\"><\/a><\/h4><p>In MATLAB Release R2016b, we introduced the notion of a <tt>string<\/tt>. 
Now I can create an array of textual data another way.<\/p><pre class=\"codeinput\">myStringPets = string(myCellPets)\r\n<\/pre><pre class=\"codeoutput\">myStringPets = \r\n  3&times;1 string array\r\n    \"dog\"\r\n    \"cat\"\r\n    \"fish\"\r\n<\/pre><p>And I can find my second pet again.<\/p><pre class=\"codeinput\">pet2 = myStringPets(2)\r\n<\/pre><pre class=\"codeoutput\">pet2 = \r\n  string\r\n    \"cat\"\r\n<\/pre><p>I think the notation feels much more natural. And I can add strings together.<\/p><pre class=\"codeinput\">allofmypets = myStringPets(1) + <span class=\"string\">' &amp; '<\/span> + myStringPets(2) + <span class=\"string\">' &amp; '<\/span> + myStringPets(3)\r\n<\/pre><pre class=\"codeoutput\">allofmypets = \r\n  string\r\n    \"dog &amp; cat &amp; fish\"\r\n<\/pre><p>OK, yes, I really should vectorize that.  And I can do that with strings, for example with <tt>join(myStringPets,' &amp; ')<\/tt>!<\/p><h4>But wait, there's more!<a name=\"2c047cad-83e1-4db1-8a84-e32d52343c08\"><\/a><\/h4><p>You may remember that Steve Eddins recently posted on my blog about <a href=\"https:\/\/blogs.mathworks.com\/loren\/2016\/10\/24\/matlab-arithmetic-expands-in-r2016b\/\">implicit expansion<\/a>. 
Well, we can take good advantage of that with strings.<\/p><p>Suppose I want to create an array of directory names that embed a sequence of years.<\/p><pre class=\"codeinput\">dirnames = string(<span class=\"string\">'C:\\work\\data\\yob\\'<\/span>) + (2000:2010)'\r\n<\/pre><pre class=\"codeoutput\">dirnames = \r\n  11&times;1 string array\r\n    \"C:\\work\\data\\yob\\2000\"\r\n    \"C:\\work\\data\\yob\\2001\"\r\n    \"C:\\work\\data\\yob\\2002\"\r\n    \"C:\\work\\data\\yob\\2003\"\r\n    \"C:\\work\\data\\yob\\2004\"\r\n    \"C:\\work\\data\\yob\\2005\"\r\n    \"C:\\work\\data\\yob\\2006\"\r\n    \"C:\\work\\data\\yob\\2007\"\r\n    \"C:\\work\\data\\yob\\2008\"\r\n    \"C:\\work\\data\\yob\\2009\"\r\n    \"C:\\work\\data\\yob\\2010\"\r\n<\/pre><p>And if I want to add months, I can do that too.<\/p><pre class=\"codeinput\">quarterlyMonths = string({<span class=\"string\">'Jan'<\/span>,<span class=\"string\">'Apr'<\/span>,<span class=\"string\">'Jul'<\/span>,<span class=\"string\">'Oct'<\/span>});\r\ndirname = string(<span class=\"string\">'C:\\root\\'<\/span>) + quarterlyMonths + (2000:2010)'\r\n<\/pre><pre class=\"codeoutput\">dirname = \r\n  11&times;4 string array\r\n  Columns 1 through 3\r\n    \"C:\\root\\Jan2000\"    \"C:\\root\\Apr2000\"    \"C:\\root\\Jul2000\"\r\n    \"C:\\root\\Jan2001\"    \"C:\\root\\Apr2001\"    \"C:\\root\\Jul2001\"\r\n    \"C:\\root\\Jan2002\"    \"C:\\root\\Apr2002\"    \"C:\\root\\Jul2002\"\r\n    \"C:\\root\\Jan2003\"    \"C:\\root\\Apr2003\"    \"C:\\root\\Jul2003\"\r\n    \"C:\\root\\Jan2004\"    \"C:\\root\\Apr2004\"    \"C:\\root\\Jul2004\"\r\n    \"C:\\root\\Jan2005\"    \"C:\\root\\Apr2005\"    \"C:\\root\\Jul2005\"\r\n    \"C:\\root\\Jan2006\"    \"C:\\root\\Apr2006\"    \"C:\\root\\Jul2006\"\r\n    \"C:\\root\\Jan2007\"    \"C:\\root\\Apr2007\"    \"C:\\root\\Jul2007\"\r\n    \"C:\\root\\Jan2008\"    \"C:\\root\\Apr2008\"    \"C:\\root\\Jul2008\"\r\n    \"C:\\root\\Jan2009\"    \"C:\\root\\Apr2009\"    
\"C:\\root\\Jul2009\"\r\n    \"C:\\root\\Jan2010\"    \"C:\\root\\Apr2010\"    \"C:\\root\\Jul2010\"\r\n  Column 4\r\n    \"C:\\root\\Oct2000\"\r\n    \"C:\\root\\Oct2001\"\r\n    \"C:\\root\\Oct2002\"\r\n    \"C:\\root\\Oct2003\"\r\n    \"C:\\root\\Oct2004\"\r\n    \"C:\\root\\Oct2005\"\r\n    \"C:\\root\\Oct2006\"\r\n    \"C:\\root\\Oct2007\"\r\n    \"C:\\root\\Oct2008\"\r\n    \"C:\\root\\Oct2009\"\r\n    \"C:\\root\\Oct2010\"\r\n<\/pre><p>How cool is that!<\/p><h4>Is There More?<a name=\"2fcd942b-d216-4269-9bdb-86cc80ed7f32\"><\/a><\/h4><p>This is just the beginning for strings.  You can find out what else is available now.<\/p><pre class=\"codeinput\">methods(string)\r\n<\/pre><pre class=\"codeoutput\">\r\nMethods for class string:\r\n\r\ncellstr         extractAfter    le              split           \r\nchar            extractBefore   lower           splitlines      \r\ncompose         extractBetween  lt              startsWith      \r\ncontains        ge              ne              strip           \r\ncount           gt              pad             strlength       \r\ndouble          insertAfter     plus            upper           \r\nendsWith        insertBefore    replace         \r\neq              ismissing       replaceBetween  \r\nerase           issorted        reverse         \r\neraseBetween    join            sort            \r\n\r\n<\/pre><p>And you can bet we have plans to add more capabilities for strings over time.  What features would you like to see us add?  
Let us know <a href=\"https:\/\/blogs.mathworks.com\/loren\/?p=2141#respond\">here<\/a>.<\/p><p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br>Published with MATLAB&reg; R2016b<br><\/p><\/div>","protected":false},"excerpt":{"rendered":"<!--introduction--><p>There is a new way to work with textual data in MATLAB R2016b. The new <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/string.html\"><tt>string<\/tt><\/a> datatype didn't get enough attention from me until recently. I have been chatting with colleagues <a href=\"https:\/\/www.mathworks.com\/matlabcentral\/profile\/authors\/1455089-matt-tearle\">Matt Tearle<\/a> and <a href=\"https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/51009-enigma?s_tid=srchtitle\">Adam Sifounakis<\/a> and we have each discovered a similar beautiful code pattern in MATLAB for generating a sequence of strings.... 
<a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/loren\/2016\/12\/22\/singing-the-praises-of-strings\/\">read more >><\/a><\/p>","protected":false},"author":39,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[6,2],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/2141"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/users\/39"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/comments?post=2141"}],"version-history":[{"count":6,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/2141\/revisions"}],"predecessor-version":[{"id":2954,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/2141\/revisions\/2954"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/media?parent=2141"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/categories?post=2141"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/tags?post=2141"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}