Over the years, MATLAB has become a friendlier environment for working with character information. MATLAB has a rich set of text handling functions, ranging from the simple to the all-powerful regexp functionality (covered here). I'm going to cover a few of the simple and very useful string functions today.
Use strfind
Use strfind instead of findstr or find for string searches.
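As a minimal sketch of the difference, assuming a made-up sample string (the variable names here are for illustration only), strfind always searches its first argument for its second, while the find/== idiom only works as a single-character search:

```matlab
% Hypothetical sample text, used only for illustration.
str = 'abcabc';

% Preferred: strfind(text, pattern) returns the starting index of each match.
idx = strfind(str, 'bc');      % starting indices 2 and 5

% The find/== idiom can only look for a single character.
idxChar = find(str == 'b');    % indices 2 and 5

% strfind never swaps the roles of its arguments: a pattern longer than
% the text simply yields an empty result.
none = strfind('bc', str);     % empty
```

Because the argument order is fixed, the returned indices always refer to the first input, which is the ambiguity that findstr leaves open.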
- Preferred
strfind('abc','a')
- Not recommended
findstr('abc','a')
This usage is potentially a bit slower and may cause confusion, since there is no way to know which string was found in the other one.
- Not recommended
find('abc'=='a')
This usage is about 5 times slower than strfind, and is not robust, since it only works if one of the arguments to == is scalar.
- Benefits
- Speed improvement, less memory (no temporary for the results of the logical statement inside find)
- No ambiguity about which string to index into later, if desired
- Code is more robust than code using find, which can't handle as general a case and is not as fast.
Use strrep
Use strrep instead of replacing values via indexing.
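A short sketch of the same idea, using made-up sample strings (all variable names are assumptions for illustration): strrep handles deletion and replacement with a string of a different length in a single call.

```matlab
% Hypothetical sample text, used only for illustration.
str = 'a b  c';

% Preferred: replace every blank with the empty string.
noBlanks = strrep(str, ' ', '');       % 'abc'

% Indexing alternative: same result here, but it only works for
% single-character targets.
viaIndex = str;
viaIndex(viaIndex == ' ') = [];        % 'abc'

% strrep also accepts a replacement longer than the text it replaces.
longer = strrep('x&y', '&', ' and ');  % 'x and y'
```

The last call is the case that logical indexing cannot express directly, since the replacement changes the length of the string.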
- Preferred (removing blanks from a string)
str = strrep(str,' ','')
- Not recommended
ind = find(str==' '); str(ind) = []
str(str==' ') = []
- Preferred (remove & from strings, e.g., menu accelerators)
menuLabelStr = strrep(menuLabelStr,'&','')
- Not recommended
menuLabelStr(find(menuLabelStr=='&')) = []
- Benefits
- speed
- readability
- more general, i.e., replacement strings don't need to be the same size (or empty) as the strings they replace
Use strncmp
Use strncmp instead of strmatch with literal second input.
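Since strmatch returns indices while strncmp and strcmp return logical values, one way to port code is to wrap the comparison in find. The sketch below assumes a hypothetical cell array of candidate strings; the names and values are for illustration only.

```matlab
% Hypothetical candidates and prefix, used only for illustration.
names  = {'string'; 'strong'; 'str'; 'other'};
prefix = 'str';

% Prefix match: compare only the first length(prefix) characters.
isPrefix  = strncmp(names, prefix, length(prefix));  % logical column vector
prefixIdx = find(isPrefix);                          % rows 1, 2 and 3

% Exact match: strcmp compares whole strings.
isExact  = strcmp(names, prefix);
exactIdx = find(isExact);                            % row 3 only
```

Where only a true/false answer is needed, the logical output can be used directly and the call to find dropped.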
- Preferred
strncmp(str,'string',length(str))
- Not recommended
strmatch(str,'string')
- Not recommended
strmatch(str,'string','exact')
- Benefits
- speed
- Note
- strmatch returns indices where the string is found, while strncmp returns true/false, so upgrading code requires more than just copy/paste.
Use strcmpi
Use strcmpi instead of using strcmp with upper or lower.
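A brief sketch with a made-up input value (the variable names are assumptions for illustration) shows that the two forms agree, and that strcmpi also works element-wise on a cell array of strings:

```matlab
% Hypothetical user input, used only for illustration.
method = 'Cubic';

% Preferred: one call, no temporary lower-case copy.
tfPreferred = strcmpi(method, 'cubic');         % true

% Equivalent result with an extra function call and a temporary.
tfManual = strcmp(lower(method), 'cubic');      % true

% strcmpi also compares each element of a cell array of strings.
choices = {'LINEAR', 'Cubic', 'nearest'};
mask = strcmpi(choices, 'cubic');               % true only for element 2
```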
- Preferred
strcmpi(str,'lcstring')
- Not recommended
strcmp(lower(str),'lcstring')
- Benefits
- speed
- fewer function calls
- fewer temporary variables
- readability
Use ismember
Use ismember to vectorize string finding operations.
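To make the difference between exact and partial matching concrete, here is a small sketch that reuses the pets data from this section; the hasA variable and the choice of the letter 'a' are assumptions added for illustration.

```matlab
pets    = {'cat'; 'dog'; 'dog'; 'dog'; 'giraffe'; 'hamster'};
species = {'cat', 'dog'};

% ismember: exact, whole-string matching.
% tf flags the pets found in species; loc gives the position in species
% (0 where there is no match).
[tf, loc] = ismember(pets, species);

% strfind: partial matching. For a cell array input it returns a cell
% array holding the match indices for each element.
hits = strfind(pets, 'a');
hasA = ~cellfun(@isempty, hits);   % true for cat, giraffe and hamster
```

Here loc comes out as [1;2;2;2;0;0], whereas strfind reports where inside each string the pattern starts, so the two outputs are not interchangeable.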
- Preferred
pets = {'cat';'dog';'dog';'dog';'giraffe';'hamster'}
species = {'cat' 'dog'}
[tf, loc] = ismember(pets, species)
- Not recommended
locs = zeros(length(pets),1);
for k = 1:length(species)
    tf = strcmp(pets, species{k});
    locs(tf) = k;
end
- Benefits
- speed
- Note
- strfind works on cell arrays of strings and returns results in a cell array, with relevant indices. It does partial matching.
- ismember requires an exact match. The outputs are different than strfind's, so coding is not just a matter of direct substitution.
Summary
I've talked about a few simple string functions available in MATLAB. Do you have some simple string recommendations for users? Post your ideas here.
Published with MATLAB® 7.3

Hi Loren, where are menu accelerators used?
Kathirvel-
Menu accelerators are particular to Windows and they correspond to the underlined letters in the menus that you access with Alt-theChosenLetter. They allow you to navigate the menus without the mouse. They are different than Ctrl-someLetter in that these don’t navigate the menu, they are simply a direct shortcut to a particular action.
–Loren
Loren,
Regarding your comments on strcmpi, the case where I do find myself having to use calls to lower is in switch statements. Is there a way around that? (Other than obviously coding so that I always use lower case, etc…. You know something that doesn’t require me to be smarter : )
Dan-
Instead of using lower in the switch statement itself, you can reduce the burden by having all your switch cases be lower case, then simply lower only the input string before entering the switch statement, like this:
switch lower(method)
    case {'linear','bilinear'}
        disp('Method is linear')
    case 'cubic'
        disp('Method is cubic')
    case 'nearest'
        disp('Method is nearest')
    otherwise
        disp('Unknown method.')
end
But that might be what you meant already. I don’t know of a way to totally avoid the lower, but at least you don’t have to do strcmp(lower(…)) everywhere.
–Loren
Thanks Loren,
That’s what I am already doing, mostly because I lifted the technique out of one of TMW’s toolboxes. I was just wondering if there was a lovely little undocumented switchi out there, or something…
Dan
Hello,
Is there a way to search a vector of numbers for a smaller set of numbers? Say, for example, I have a vector with the following numbers:
x = [1 2 3 4 5 1 2 3 4 5 4 6 7 1 2 3 4 5];
Now from this vector I want to know when [4 6 7] occurs or if it even occurs in that order, is there a way to do this?
Thanks,
-M. Zia
as it was mentioned over and over in ML’s NG CSSM, the prefix STR in STRFIND simply means a string (of bits) and does not imply a string of characters (in the end, every data type is represented in the computer’s memory as a boring string of 0s and 1s…)
hence
x=[1 2 3 4 5 1 2 3 4 5 4 6 7 pi 1 2 3 4 5];
ix=strfind(x,[4,6,7,pi])
% ix = 11
us
Hi,
For string comparisons, does using isequal instead of strcmpi or strncmp give any advantage in terms of speed?
Sj-
If you really want to be sure you are comparing strings, you should use the str* functions. isequal doesn’t care about class, so you would get the following to be true:
f = 'hello'
d = double(f)
isequal(f,d)
ans =
     1
strcmp(f,d)
ans =
     0
As a result, I think you can expect the string functions to be generally higher performance since no conversions take place.
–Loren
That makes sense, Loren. Thanks a bunch!
~sj
Loren,
Can you please comment on the speed of regexp and regexprep.
I am a perl user and work frequently with regular expressions for string manipulations. Matlab has all the necessary functions in place, but they seem to be quite slow.
Eric
Loren, in the past I have always found ‘ismember’ to be a v_e_r_y slow function, so I was quite surprised to see your posting recommending its use. Maybe I’m missing something here? I tried 1000 iterations of the code you suggested and found ‘ismember’ to be a factor of 10 or more slower! (I’m running 7.3.0.298 (R2006b) on a 1.67 GHz PowerPC G4 under Mac OS X 10.4.8 with 2 GB of RAM. Would it matter that I’m on a Mac?)
pets = {'cat';'dog';'dog';'dog';'giraffe';'hamster'};
species = {'cat' 'dog'};
tic
for lp = 1:1000
[tf, loc] = ismember(pets, species);
end
toc
tic
for lp = 1:1000
locs = zeros(length(pets),1);
for k = 1:length(species)
tf = strcmp(pets, species(k));
locs(tf) = k;
end
end
toc
isequal( loc, locs )
% My results: ‘ismember’ first, then ‘strcmp’
%——————————
Elapsed time is 2.670689 seconds.
Elapsed time is 0.100176 seconds.
ans =
1
Your thoughts?
Eric-
When I wrote what I wrote, I was not focusing on performance. In addition, timing results depend on the computer architecture, whether or not MATLAB has a JIT there, and many more parameters. Also, timing depends heavily on the size of the problem you pose. I personally am often (but not always) willing to live with lower performance for smaller inputs, provided there is enough benefit for large inputs. The reason I am not always willing to do this is that sometimes those smaller inputs occur in a loop and must be processed a huge number of times.
The reason to recommend ismember for vectorizing is if that aspect of the code is helpful to people. Sometimes shorter code is more readable and maintainable, if not quite as fast.
Design trade-offs are hard to make. And they are situational.
–Loren
Is there any way to get the selected string from an edit box into the program workspace without using Ctrl-C?
Sridhar-
You should read about handle graphics. You’ll need to get the 'String' property value from your edit box.
–Loren
I meant getting a partially selected string. Is there any getselected command? The “string” property returns the entire string. How can we extract the part of the string that the user has selected?
Let us assume a string in the edit box:
st = 'subject predicate noun';
If the user selects “noun” inside the edit box and clicks a menu to perform some operation on this selection, how would the program know that “noun” has been selected out of “subject predicate noun”?
Is there any getselected?
Sridhar-
There is nothing built into MATLAB for that. You’d need to write your own code for getting the data from the edit box and analyzing it.
–Loren
You are probably not still checking this, but this is the closest info I can find related to my question.
I am trying to compare 2 string arrays and return only the full strings (not substrings) that both have in common (comparing 2 lists of names to find the matching names). Is there a simple way to do this? I can match using single strings, but not the whole array.
Thanks for any help you can provide.
Phoebe-
If you are looking to see if 2 strings are equal, check out the functions in the strcmp or ismember families.
–Loren
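As a hedged sketch of that suggestion, assuming the two lists are stored as cell arrays of strings (the names and variables below are made up for illustration), ismember gives whole-string matches in the original order, and intersect, one of the related set functions, returns the common names sorted and without duplicates:

```matlab
% Hypothetical name lists, used only for illustration.
listA = {'Ann'; 'Bob'; 'Cara'; 'Dan'};
listB = {'Dan'; 'Ann'; 'Annabel'};

% ismember compares whole strings, so 'Ann' does not match 'Annabel'.
inBoth = ismember(listA, listB);
common = listA(inBoth);                  % 'Ann' and 'Dan', in listA order

% intersect returns the strings common to both lists, sorted.
commonSorted = intersect(listA, listB);
```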