## Loren on the Art of MATLAB: Turn ideas into MATLAB

Note

Loren on the Art of MATLAB has been retired and will not be updated.

# String Things

Working with text in MATLAB has evolved over time. Way back, text data was stored in double arrays with an internal flag to denote that it was meant to be text. We then changed this representation so that character arrays were their very own type. More recently, as I mentioned earlier, we introduced a string datatype to make working with text data more efficient and natural. Let me show you a little more.

#### How to Compare Text: the Olden Days

Early on in MATLAB, we used the function strcmp to compare strings. A big caveat for many people is that strcmp does not behave the same way as its C-language counterpart: in MATLAB it returns logical 1 (true) when the inputs match, whereas the C function returns 0 for equal strings. Over time we added a few more comparison functions, strcmpi, strncmp, and strncmpi, to allow case-insensitive matches and to constrain the match to at most the first n characters.

Let's do some comparisons now. First on cell arrays of strings...

cellChars = {'Mercury','Venus','Earth','Mars'}

cellChars =
1×4 cell array
{'Mercury'}    {'Venus'}    {'Earth'}    {'Mars'}

TF = strcmp('fred',cellChars)

TF =
1×4 logical array
0   0   0   0

TF = strcmp('Venus',cellChars)

TF =
1×4 logical array
0   1   0   0

TF = strncmp('Mars', cellChars, 2)

TF =
1×4 logical array
0   0   0   1

TF = strncmp('Marvelous', cellChars, 2)

TF =
1×4 logical array
0   0   0   1

TF = strncmp('Marvelous', cellChars, 4)

TF =
1×4 logical array
0   0   0   0

TF = strcmpi('mars', cellChars)

TF =
1×4 logical array
0   0   0   1

TF = strcmpi('mar', cellChars)

TF =
1×4 logical array
0   0   0   0
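The examples above exercise strcmp, strncmp, and strcmpi. As a small sketch of the remaining combination, strncmpi compares only the first n characters and ignores case. The last line illustrates the C-language caveat: MATLAB returns logical 1 (true) on a match, where C's strcmp returns 0. The variable name isMatch is just for illustration.

```matlab
% Cell array of character vectors, as in the examples above
cellChars = {'Mercury','Venus','Earth','Mars'};

% Case-insensitive comparison of the first 3 characters only:
% 'MAR' matches the start of 'Mars' and nothing else
TF = strncmpi('MARVELOUS', cellChars, 3)

% In MATLAB, a match yields logical 1 (true), not 0 as in C
isMatch = strcmp('Mars', 'Mars')
```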


#### More Modern, Not Identical Use

We also introduced categorical arrays for cases where limiting the set of string choices was appropriate. When using categorical variables, you may use == for comparisons.

catStr = categorical(cellChars)

catStr =
1×4 categorical array
Mercury      Venus      Earth      Mars

TF = 'Mars' == catStr

TF =
1×4 logical array
0   0   0   1
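Here is a short sketch of what "limiting the set of choices" looks like in practice. It lists the allowed values with categories, uses ~= the same way as ==, and checks whether a name is one of the categories with iscategory. The variable names (cats, notMars, hasPluto) are illustrative only.

```matlab
cellChars = {'Mercury','Venus','Earth','Mars'};
catStr = categorical(cellChars);

% The set of allowed values; categories are stored in sorted order
cats = categories(catStr)

% ~= is available alongside ==, and the result can index the array
notMars = catStr(catStr ~= 'Mars')

% Check whether a name belongs to the set of categories
hasPluto = iscategory(catStr, 'Pluto')
```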


#### String Comparisons Circa 2020

And now for string comparisons.

str = string(cellChars) % or ["Mercury","Venus","Earth","Mars"]

str =
1×4 string array
"Mercury"    "Venus"    "Earth"    "Mars"


I can still use the strcmp family of functions (strcmp, strcmpi, strncmp, strncmpi), but I am not restricted to them.

TF = strcmp('Mars',str)

TF =
1×4 logical array
0   0   0   1


We can now use == and related operators without worrying about the size-mismatch issues that can arise when character arrays are compared element by element.

TF = str ~= "Mars"

TF =
1×4 logical array
1   1   1   0
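To make the contrast with character arrays concrete, here is a minimal sketch. With character vectors, == compares element by element, so two words of different lengths cannot be compared that way. With strings, each element is a whole piece of text. The try/catch is only there so the sketch runs to completion, and the variable names are illustrative.

```matlab
% Character vectors of equal length: one logical per character
perChar = 'Mars' == 'Mark'

% Character vectors of different lengths (1x4 vs 1x5) have incompatible sizes
try
    'Mars' == 'Venus';
    charCompareFailed = false;
catch
    charCompareFailed = true;   % the comparison above raises an error
end

% String scalars: one logical for the whole piece of text
wholeText = "Mars" == "Venus"
```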


And most recently, we introduced the function matches.

TF = matches(str,"Earth")

TF =
1×4 logical array
0   0   1   0


It has some nice features that make handling string arrays quite nifty, like looking for planets with an orbit inside Earth's.

TF = matches(str,["Mercury","Venus"])

TF =
1×4 logical array
1   1   0   0
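A quick sketch of what matches is doing here: each element must equal one of the candidates in its entirety. For this exact, case-sensitive case the result agrees with ismember, and it differs from contains, which looks for a substring anywhere in the element. The variable names are illustrative, and the comparison with ismember and contains is my own aside for context.

```matlab
str = ["Mercury","Venus","Earth","Mars"];
inner = ["Mercury","Venus"];

% Whole-element matching against any of several candidates
sameAsIsmember = isequal(matches(str, inner), ismember(str, inner))

% matches needs the entire element to match; contains accepts a substring
wholeMatch = matches(str, "ar")     % no planet is exactly "ar"
subMatch   = contains(str, "ar")    % "Earth" and "Mars" contain "ar"
```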


And I can, of course, ignore case, with code that, to me, appears less cryptic than strcmpi.

TF = matches(str,"earth","IgnoreCase",true)

TF =
1×4 logical array
0   0   1   0


As is true in all of these cases, we can index into the original array with the logical output to extract the relevant item(s).

str(TF)

ans =
"Earth"
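Putting the pieces together, here is a small sketch that combines several candidates with the IgnoreCase option and then uses the logical result three ways: to extract the matching elements, to get their positions with find, and to extract everything else with ~. The variable names are illustrative.

```matlab
str = ["Mercury","Venus","Earth","Mars"];

% Several candidates at once, ignoring case
TF = matches(str, ["mercury","venus"], "IgnoreCase", true);

innerPlanets = str(TF)    % elements that matched
positions    = find(TF)   % their indices in str
others       = str(~TF)   % elements that did not match
```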


#### My Advice: Err on the Side of Code Readability

I haven't touched on performance here, but one of the drivers for the recent string datatype is efficiency and performance. We've worked hard to overlay that with functions that make your code highly readable. This makes code maintenance and code transfer go much more smoothly. I tend to favor this over eking out the last fractional second of speed. In the case of strings, you may not even need to make that tradeoff.