Loren on the Art of MATLAB

Turn ideas into MATLAB

Finding Strings

Over the years, MATLAB has become a friendlier environment for working with character information. MATLAB has a rich set of text-handling functions, ranging from the simple to the all-powerful regexp functionality. I'm going to cover a few of the simple and very useful string functions today.

Use strfind

Use strfind instead of findstr or find for string searches.

  • Preferred
             strfind('abc','a')
  • Not recommended
             findstr('abc','a')

This usage may be slightly slower and can cause confusion: findstr searches for the shorter input within the longer one, so the call itself does not tell you which string was found in the other.

  • Not recommended
             find('abc'=='a')

This usage is about 5 times slower than strfind and is not robust, since it only works when one of the arguments to == is scalar, that is, when the pattern is a single character.

  • Benefits
      - Speed improvement and less memory (no temporary array for the result of the logical comparison inside find)
      - No ambiguity about which string to index into later, if desired
      - More robust code: find with == can't handle as general a case as strfind, nor is it as fast.
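
As a rough sketch of the points above (the sample string and variable names are made up for illustration), strfind takes the text first and the pattern second, and it handles multi-character patterns that the == approach cannot:

```matlab
% Illustrative values only; str and the variable names are assumptions.
str = 'abcabc';

% strfind(text, pattern): the argument order says which string is searched.
idxSingle = strfind(str, 'a');     % 1 4
idxMulti  = strfind(str, 'bc');    % 2 5

% Swapping the arguments searches the short string for the long one,
% so nothing is found.
idxSwapped = strfind('a', str);    % empty

% find with == gives the same answer for a single character,
% but it builds a temporary logical array first.
idxFind = find(str == 'a');        % 1 4
```

Because the searched text is always the first input, the indices returned can be used directly to index into str.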

Use strrep

Use strrep instead of replacing values via indexing.

  • Preferred (removing blanks from a string)
             str = strrep(str,' ','')
  • Not recommended
             ind = find(str==' '); str(ind) = []
             str(str==' ') = []
  • Preferred (remove & from strings, e.g., menu accelerators)
             menuLabelStr = strrep(menuLabelStr,'&','')
  • Not recommended
             menuLabelStr(find(menuLabelStr=='&')) = []
  • Benefits
      - speed
      - readability
      - more general, i.e., the replacement string doesn't need to be
        empty or the same size as the string it replaces
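
A small sketch of the generality point, using made-up sample strings: the same strrep call deletes characters, and it also substitutes text of a different length, which plain indexed assignment cannot do in one step.

```matlab
% Sample strings are assumptions chosen for illustration.
spaced = 'a b  c';
noBlanks = strrep(spaced, ' ', '');       % 'abc'

menuLabelStr = '&File';
plainLabel = strrep(menuLabelStr, '&', '');   % 'File'

% The replacement can be longer than the text it replaces.
expr = 'x=1, y=2';
padded = strrep(expr, '=', ' == ');       % 'x == 1, y == 2'
```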

Use strncmp

Use strncmp instead of strmatch with a literal second input.

  • Preferred
             strncmp(str,'string',length(str))
  • Not recommended
             strmatch(str,'string')
  • Not recommended
             strmatch(str,'string','exact')
  • Benefits
      - speed
  • Note
      - strmatch returns indices where the string is found, while strncmp
        returns true/false, so upgrading code requires more than just copy/paste.
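
To illustrate the note above, here is a minimal sketch of recovering strmatch-style indices from strncmp. The candidate list and variable names are hypothetical, and strcmp is used here as a stand-in for the 'exact' case, since strncmp only compares the first length(str) characters.

```matlab
% Hypothetical data for illustration.
str = 'str';
candidates = {'string', 'strong', 'other', 'str'};

% strncmp returns one logical per candidate: does it start with str?
tf = strncmp(str, candidates, length(str));   % 1 1 0 1

% Wrap in find to get indices, as code written around strmatch would expect.
idx = find(tf);                               % 1 2 4

% For whole-string matches, strcmp compares the full text.
exactIdx = find(strcmp(str, candidates));     % 4
```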

Use strcmpi

Use strcmpi instead of using strcmp with upper or lower.

  • Preferred
             strcmpi(str,'lcstring')
  • Not recommended
             strcmp(lower(str),'lcstring')
  • Benefits
      - speed
           - fewer function calls
           - fewer temporary variables
      - readability
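
A quick sketch with an assumed sample value shows that the two forms agree; strcmpi simply skips the intermediate lowercase copy.

```matlab
% str is an assumed sample value.
str = 'LcString';

tfPreferred = strcmpi(str, 'lcstring');        % true, one call, no temporary
tfOld = strcmp(lower(str), 'lcstring');        % true, but creates lower(str) first

% A genuinely different string still fails either way.
tfOther = strcmpi(str, 'ucstring');            % false
```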

Use ismember

Use ismember to vectorize string finding operations.

  • Preferred
             pets = {'cat';'dog';'dog';'dog';'giraffe';'hamster'}
             species = {'cat' 'dog'}
             [tf, loc] = ismember(pets, species)
  • Not recommended
             locs = zeros(length(pets),1);
             for k = 1:length(species)
                 tf = strcmp(pets, species{k});
                 locs(tf) = k;
             end
  • Benefits
      - speed
  • Note
      - strfind works on cell arrays of strings and returns results
        in a cell array, with relevant indices.  It does partial matching.
      - ismember requires an exact match.  The outputs are different
        than strfind's, so coding is not just a matter of direct
        substitution.
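
The difference between the two outputs can be sketched with the same pets and species data. The partial pattern 'do' and the variable names hits and partialMatch are assumptions added for illustration.

```matlab
pets = {'cat'; 'dog'; 'dog'; 'dog'; 'giraffe'; 'hamster'};
species = {'cat' 'dog'};

% ismember: exact matches only. tf flags membership and loc gives the
% position in species (0 where there is no match).
[tf, loc] = ismember(pets, species);
% tf  = [1;1;1;1;0;0], loc = [1;2;2;2;0;0]

% strfind on a cell array: one cell per element, holding the start
% indices of the pattern (empty where it does not occur).
hits = strfind(pets, 'do');
partialMatch = ~cellfun(@isempty, hits);    % [0;1;1;1;0;0]

% The partial pattern is not an exact member of pets.
isExact = ismember('do', pets);             % false
```

To turn strfind's cell output into a logical mask comparable to tf, the cellfun call above is one option.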

Summary

I've talked about a few simple string functions available in MATLAB. Do you have some simple string recommendations for users? Post your ideas here.




Published with MATLAB® 7.3
