{"id":70,"date":"2006-12-20T12:53:18","date_gmt":"2006-12-20T17:53:18","guid":{"rendered":"https:\/\/blogs.mathworks.com\/loren\/?p=70"},"modified":"2006-12-14T12:58:35","modified_gmt":"2006-12-14T17:58:35","slug":"finding-strings","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/loren\/2006\/12\/20\/finding-strings\/","title":{"rendered":"Finding Strings"},"content":{"rendered":"<div xmlns:mwsh=\"https:\/\/www.mathworks.com\/namespace\/mcode\/v1\/syntaxhighlight.dtd\" class=\"content\">\r\n   <introduction>\r\n      <p>Over the years, MATLAB has become a friendlier environment for working with character information.  MATLAB has a rich set\r\n         of text handling functions, ranging from the simple, to the all-powerful <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/regexp.html\"><tt>regexp<\/tt><\/a> functionality (covered <a href=\"https:\/\/blogs.mathworks.com\/loren\/?p=27\">here<\/a>). I'm going to cover a few of the simple and very useful string functions today.\r\n      <\/p>\r\n   <\/introduction>\r\n   <h3>Contents<\/h3>\r\n   <div>\r\n      <ul>\r\n         <li><a href=\"#1\">Use strfind<\/a><\/li>\r\n         <li><a href=\"#3\">Use strrep<\/a><\/li>\r\n         <li><a href=\"#4\">Use strncmp<\/a><\/li>\r\n         <li><a href=\"#5\">Use strcmpi<\/a><\/li>\r\n         <li><a href=\"#6\">Use ismember<\/a><\/li>\r\n         <li><a href=\"#7\">Summary<\/a><\/li>\r\n      <\/ul>\r\n   <\/div>\r\n   <h3>Use strfind<a name=\"1\"><\/a><\/h3>\r\n   <p>Use <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/strfind.html\"><tt>strfind<\/tt><\/a> instead of <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/findstr.html\"><tt>findstr<\/tt><\/a> or <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/find.html\"><tt>find<\/tt><\/a> for string searches.\r\n   <\/p>\r\n   <div>\r\n      <ul>\r\n         <li>Preferred<\/li>\r\n      <\/ul>\r\n   <\/div><pre>             strfind('abc','a')<\/pre><div>\r\n      
<ul>\r\n         <li>Not recommended<\/li>\r\n      <\/ul>\r\n   <\/div><pre>             findstr('abc','a')<\/pre><p>This usage is potentially a bit slower and may cause confusion, since <tt>findstr<\/tt> searches for the shorter string inside the longer one regardless of argument order, so there is no way to know which string was found in the\r\n      other one.\r\n   <\/p>\r\n   <div>\r\n      <ul>\r\n         <li>Not recommended<\/li>\r\n      <\/ul>\r\n   <\/div><pre>             find('abc'=='a')<\/pre><p>This usage is about 5 times slower than <tt>strfind<\/tt>, and is not robust, since it only works if one of the arguments to <tt>==<\/tt> is scalar.\r\n   <\/p>\r\n   <div>\r\n      <ul>\r\n         <li>Benefits<\/li>\r\n      <\/ul>\r\n   <\/div><pre>      - Speed improvement, less memory (no temporary for results of logical statement inside find)\r\n      - No ambiguity on which string to index into later, if desired\r\n      - Code is more robust than using FIND, which can't handle as general a case and is not as fast.<\/pre><h3>Use strrep<a name=\"3\"><\/a><\/h3>\r\n   <p>Use <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/strrep.html\"><tt>strrep<\/tt><\/a> instead of replacing values via indexing.\r\n   <\/p>\r\n   <div>\r\n      <ul>\r\n         <li>Preferred (removing blanks from a string)<\/li>\r\n      <\/ul>\r\n   <\/div><pre>             str = strrep(str,' ','')<\/pre><div>\r\n      <ul>\r\n         <li>Not recommended<\/li>\r\n      <\/ul>\r\n   <\/div><pre>             ind = find(str==' '); str(ind) = []\r\n             str(str==' ') = []<\/pre><div>\r\n      <ul>\r\n         <li>Preferred (remove <tt>&amp;<\/tt> from strings, e.g., menu accelerators)\r\n         <\/li>\r\n      <\/ul>\r\n   <\/div><pre>             str = strrep(str,'&amp;','')<\/pre><div>\r\n      <ul>\r\n         <li>Not recommended<\/li>\r\n      <\/ul>\r\n   <\/div><pre>             menuLabelStr(find(menuLabelStr=='&amp;')) = []<\/pre><div>\r\n      <ul>\r\n         <li>Benefits<\/li>\r\n      <\/ul>\r\n   <\/div><pre>      - speed\r\n      - readability\r\n      - 
more general, i.e., replacement strings don't need to be the same\r\n        size (or empty) as the strings they replace<\/pre><h3>Use strncmp<a name=\"4\"><\/a><\/h3>\r\n   <p>Use <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/strncmp.html\"><tt>strncmp<\/tt><\/a> instead of <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/strmatch.html\"><tt>strmatch<\/tt><\/a> with literal second input.\r\n   <\/p>\r\n   <div>\r\n      <ul>\r\n         <li>Preferred<\/li>\r\n      <\/ul>\r\n   <\/div><pre>             strncmp(str,'string',length(str))<\/pre><div>\r\n      <ul>\r\n         <li>Not recommended<\/li>\r\n      <\/ul>\r\n   <\/div><pre>             strmatch(str,'string')<\/pre><div>\r\n      <ul>\r\n         <li>Not recommended<\/li>\r\n      <\/ul>\r\n   <\/div><pre>             strmatch(str,'string','exact')<\/pre><div>\r\n      <ul>\r\n         <li>Benefits<\/li>\r\n      <\/ul>\r\n   <\/div><pre>      - speed<\/pre><div>\r\n      <ul>\r\n         <li>Note<\/li>\r\n      <\/ul>\r\n   <\/div><pre>      - strmatch returns indices where the string is found, while strncmp\r\n        returns true\/false, so upgrading code requires more than just copy\/paste.<\/pre><h3>Use strcmpi<a name=\"5\"><\/a><\/h3>\r\n   <p>Use <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/strcmpi.html\"><tt>strcmpi<\/tt><\/a> instead of using <tt>strcmp<\/tt> with <tt>upper<\/tt> or <tt>lower<\/tt>.\r\n   <\/p>\r\n   <div>\r\n      <ul>\r\n         <li>Preferred<\/li>\r\n      <\/ul>\r\n   <\/div><pre>             strcmpi(str,'lcstring')<\/pre><div>\r\n      <ul>\r\n         <li>Not recommended<\/li>\r\n      <\/ul>\r\n   <\/div><pre>             strcmp(lower(str),'lcstring')<\/pre><div>\r\n      <ul>\r\n         <li>Benefits<\/li>\r\n      <\/ul>\r\n   <\/div><pre>      - speed\r\n           - fewer function calls\r\n           - fewer temporary variables\r\n      - readability<\/pre><h3>Use ismember<a name=\"6\"><\/a><\/h3>\r\n   <p>Use <a 
href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/ismember.html\"><tt>ismember<\/tt><\/a> to vectorize string finding operations.\r\n   <\/p>\r\n   <div>\r\n      <ul>\r\n         <li>Preferred<\/li>\r\n      <\/ul>\r\n   <\/div><pre>             pets = {'cat';'dog';'dog';'dog';'giraffe';'hamster'}\r\n             species = {'cat' 'dog'}\r\n             [tf, loc] = ismember(pets, species)<\/pre><div>\r\n      <ul>\r\n         <li>Not recommended<\/li>\r\n      <\/ul>\r\n   <\/div><pre>             locs = zeros(length(pets),1);\r\n             for k = 1:length(species)\r\n                 tf =  strcmp(pets, species(k));\r\n                 locs(tf) = k;\r\n             end<\/pre><div>\r\n      <ul>\r\n         <li>Benefits<\/li>\r\n      <\/ul>\r\n   <\/div><pre>      - speed<\/pre><div>\r\n      <ul>\r\n         <li>Note<\/li>\r\n      <\/ul>\r\n   <\/div><pre>      - strfind works on cell arrays of strings and returns results\r\n        in a cell array, with relevant indices.  It does partial matching.\r\n      - ismember requires an exact match.  The outputs are different\r\n        than strfind's, so coding is not just a matter of direct\r\n        substitution.<\/pre><h3>Summary<a name=\"7\"><\/a><\/h3>\r\n   <p>I've talked about a few simple string functions available in MATLAB.  
Do you have some simple string recommendations for users?\r\n       Post your ideas <a href=\"?p=70#respond\">here<\/a>.\r\n   <\/p><script language=\"JavaScript\">\r\n<!--\r\n\r\n    function grabCode_fa6fbc57bbcf4e32bdd043ac2f2459b1() {\r\n        \/\/ Remember the title so we can use it in the new page\r\n        title = document.title;\r\n\r\n        \/\/ Break up these strings so that their presence\r\n        \/\/ in the Javascript doesn't mess up the search for\r\n        \/\/ the MATLAB code.\r\n        t1='fa6fbc57bbcf4e32bdd043ac2f2459b1 ' + '##### ' + 'SOURCE BEGIN' + ' #####';\r\n        t2='##### ' + 'SOURCE END' + ' #####' + ' fa6fbc57bbcf4e32bdd043ac2f2459b1';\r\n    \r\n        b=document.getElementsByTagName('body')[0];\r\n        i1=b.innerHTML.indexOf(t1)+t1.length;\r\n        i2=b.innerHTML.indexOf(t2);\r\n \r\n        code_string = b.innerHTML.substring(i1, i2);\r\n        code_string = code_string.replace(\/REPLACE_WITH_DASH_DASH\/g,'--');\r\n\r\n        \/\/ Use \/x3C\/g instead of the less-than character to avoid errors \r\n        \/\/ in the XML parser.\r\n        \/\/ Use '\\x26#60;' instead of '<' so that the XML parser\r\n        \/\/ doesn't go ahead and substitute the less-than character. 
\r\n        code_string = code_string.replace(\/\\x3C\/g, '\\x26#60;');\r\n\r\n        author = 'Loren Shure';\r\n        copyright = 'Copyright 2006 The MathWorks, Inc.';\r\n\r\n        w = window.open();\r\n        d = w.document;\r\n        d.write('<pre>\\n');\r\n        d.write(code_string);\r\n\r\n        \/\/ Add author and copyright lines at the bottom if specified.\r\n        if ((author.length > 0) || (copyright.length > 0)) {\r\n            d.writeln('');\r\n            d.writeln('%%');\r\n            if (author.length > 0) {\r\n                d.writeln('% _' + author + '_');\r\n            }\r\n            if (copyright.length > 0) {\r\n                d.writeln('% _' + copyright + '_');\r\n            }\r\n        }\r\n\r\n        d.write('<\/pre>\\n');\r\n      \r\n      d.title = title + ' (MATLAB code)';\r\n      d.close();\r\n      }   \r\n      \r\n-->\r\n<\/script><p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br><a href=\"javascript:grabCode_fa6fbc57bbcf4e32bdd043ac2f2459b1()\"><span style=\"font-size: x-small;        font-style: italic;\">Get \r\n            the MATLAB code \r\n            <noscript>(requires JavaScript)<\/noscript><\/span><\/a><br><br>\r\n      Published with MATLAB&reg; 7.3<br><\/p>\r\n<\/div>\r\n<!--\r\nfa6fbc57bbcf4e32bdd043ac2f2459b1 ##### SOURCE BEGIN #####\r\n%% Finding Strings\r\n% Over the years, MATLAB has become a friendlier environment for working\r\n% with character information.  MATLAB has a rich set of text handling\r\n% functions, ranging from the simple, to the all-powerful \r\n% <https:\/\/www.mathworks.com\/help\/matlab\/ref\/regexp.html |regexp|>\r\n% functionality (covered <https:\/\/blogs.mathworks.com\/loren\/?p=27 here>).  
\r\n% I'm going to cover a few of the simple and very useful string functions\r\n% today.\r\n%% Use strfind\r\n% Use <https:\/\/www.mathworks.com\/help\/matlab\/ref\/strfind.html |strfind|>\r\n% instead of <https:\/\/www.mathworks.com\/help\/matlab\/ref\/findstr.html |findstr|>\r\n% or <https:\/\/www.mathworks.com\/help\/matlab\/ref\/find.html |find|> for string searches.\r\n% \r\n% * Preferred\r\n%\r\n%               strfind('abc','a')\r\n%\r\n% * Not recommended\r\n%\r\n%               findstr('abc','a') \r\n%\r\n% This usage is potentially a bit slower and may cause confusion, since\r\n% |findstr| searches for the shorter string inside the longer one regardless\r\n% of argument order, so there is no way to know which string was found in\r\n% the other one.\r\n%\r\n% * Not recommended\r\n%\r\n%               find('abc'=='a')\r\n%%\r\n% This usage is about 5 times slower than |strfind|, and is not robust, since\r\n% it only works if one of the arguments to |==| is scalar.\r\n%\r\n% * Benefits\r\n% \r\n%        - Speed improvement, less memory (no temporary for results of logical statement inside find)\r\n%        - No ambiguity on which string to index into later, if desired\r\n%        - Code is more robust than using FIND, which can't handle as general a case and is not as fast.\r\n% \r\n%%  Use strrep\r\n% Use <https:\/\/www.mathworks.com\/help\/matlab\/ref\/strrep.html |strrep|>\r\n% instead of replacing values via indexing.\r\n% \r\n% * Preferred (removing blanks from a string)\r\n%\r\n%               str = strrep(str,' ','')\r\n%\r\n% * Not recommended\r\n%\r\n%               ind = find(str==' '); str(ind) = []\r\n%               str(str==' ') = []\r\n% \r\n% * Preferred (remove |&| from strings, e.g., menu accelerators)\r\n%\r\n%               str = strrep(str,'&','')\r\n%\r\n% * Not recommended\r\n% \r\n%               menuLabelStr(find(menuLabelStr=='&')) = []\r\n%\r\n% * Benefits\r\n% \r\n%        - speed\r\n%        - readability\r\n%        - more general, i.e., replacement strings don't need to be the same\r\n%          size (or empty) as the strings they 
replace\r\n%%  Use strncmp\r\n% Use <https:\/\/www.mathworks.com\/help\/matlab\/ref\/strncmp.html |strncmp|> \r\n% instead of <https:\/\/www.mathworks.com\/help\/matlab\/ref\/strmatch.html |strmatch|>\r\n% with literal second input.\r\n%\r\n% * Preferred\r\n%\r\n%               strncmp(str,'string',length(str))\r\n%\r\n% * Not recommended\r\n%\r\n%               strmatch(str,'string')\r\n%\r\n% * Not recommended\r\n%\r\n%               strmatch(str,'string','exact')\r\n%\r\n% * Benefits\r\n% \r\n%        - speed\r\n% \r\n% * Note\r\n% \r\n%        - strmatch returns indices where the string is found, while strncmp\r\n%          returns true\/false, so upgrading code requires more than just copy\/paste.\r\n%\r\n%% Use strcmpi\r\n% Use <https:\/\/www.mathworks.com\/help\/matlab\/ref\/strcmpi.html |strcmpi|>\r\n% instead of using |strcmp| with |upper| or |lower|.\r\n%\r\n% * Preferred\r\n%\r\n%               strcmpi(str,'lcstring')\r\n%\r\n% * Not recommended\r\n% \r\n%               strcmp(lower(str),'lcstring')\r\n%\r\n% * Benefits\r\n% \r\n%        - speed\r\n%             - fewer function calls\r\n%             - fewer temporary variables\r\n%        - readability\r\n%\r\n%% Use ismember\r\n% Use\r\n% <https:\/\/www.mathworks.com\/help\/matlab\/ref\/ismember.html |ismember|>\r\n% to vectorize string finding operations.  \r\n%\r\n% * Preferred\r\n%\r\n%               pets = {'cat';'dog';'dog';'dog';'giraffe';'hamster'}\r\n%               species = {'cat' 'dog'}\r\n%               [tf, loc] = ismember(pets, species)  \r\n%\r\n% * Not recommended\r\n% \r\n%               locs = zeros(length(pets),1);\r\n%               for k = 1:length(species)\r\n%                   tf =  strcmp(pets, species(k));\r\n%                   locs(tf) = k;    \r\n%               end\r\n%\r\n% * Benefits\r\n% \r\n%        - speed\r\n%\r\n% * Note\r\n%\r\n%        - strfind works on cell arrays of strings and returns results \r\n%          in a cell array, with relevant indices.  
It does partial matching.  \r\n%        - ismember requires an exact match.  The outputs are different\r\n%          than strfind's, so coding is not just a matter of direct\r\n%          substitution.\r\n%\r\n%% Summary\r\n% I've talked about a few simple string functions available in MATLAB.  Do\r\n% you have some simple string recommendations for users?  Post your ideas \r\n% <?p=70#respond here>.\r\n\r\n\r\n##### SOURCE END ##### fa6fbc57bbcf4e32bdd043ac2f2459b1\r\n-->","protected":false},"excerpt":{"rendered":"<p>\r\n   \r\n      Over the years, MATLAB has become a friendlier environment for working with character information.  MATLAB has a rich set\r\n         of text handling functions, ranging from the simple,... <a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/loren\/2006\/12\/20\/finding-strings\/\">read more >><\/a><\/p>","protected":false},"author":39,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[2],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/70"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/users\/39"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/comments?post=70"}],"version-history":[{"count":0,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/70\/revisions"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/media?parent=70"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/categories?post=70"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/tags?post=70"}]
,"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}