Today I'd like to introduce a guest blogger, Stephen Doe, who works for the MATLAB Documentation team here at MathWorks. In today's post, Stephen discusses how, and why, you might want to update your code to accept string arrays as inputs.
In R2016b, MATLAB® introduced the string data type as a new data type for text. Each element of a string array stores a sequence of characters. You can use standard array indexing and operations on string arrays, along with string manipulation functions introduced in R2016b. Also, you can use many MATLAB functions (such as sort and unique) on string arrays.
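As a quick illustration of those operations, here is a minimal sketch. The sample words are made up for this example, and the double-quote syntax assumes R2017a or later.

```matlab
% Hypothetical sample data: a row vector of strings.
words = ["pear" "apple" "fig" "apple"];

second   = words(2);         % parentheses index into the array and return a string
sorted   = sort(words);      % sort works directly on string arrays
distinct = unique(words);    % unique returns the distinct strings in sorted order
lens     = strlength(words); % number of characters in each element
```

Note that indexing with parentheses selects whole strings, not characters. That detail matters later in this post.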
Here's a short example. Start with some text you can store as one string--or as a string scalar, a term we use to describe a string array with one element. As of R2017a, you can use double-quotes to create a string. (Single-quotes still create a character vector.)
str = "A horse! a horse! My kingdom for a horse!"
str = "A horse! a horse! My kingdom for a horse!"
If you use the split function to split that string on space characters, then the result is a string array with nine elements. Instead of getting a cell array, you get a homogeneous array storing text--an array that has the same data type you started with. The split function returns the string array as a column vector. Let's reshape it as a row vector of strings for more compact display.
str = split(str);
str = str'
str = 
  1×9 string array
  Columns 1 through 8
    "A"    "horse!"    "a"    "horse!"    "My"    "kingdom"    "for"    "a"
  Column 9
    "horse!"
String arrays now provide a powerful way to deal with text in your data. As you work with your text in a string array, you never have to resort to cell arrays or curly brace indexing. I won't belabor the point, because Loren already has published some wonderful guest posts on using string arrays. If you are looking for more examples using strings, see: Introducing String Arrays, Singing the Praises of Strings, and Working with Text in MATLAB.
And of course, you can find the documentation for string arrays at Characters and Strings.
A few paragraphs ago, I said that "many" MATLAB functions now accept string arrays as input arguments. In fact, in a future release nearly all MathWorks® products will work with string arrays as inputs.
Here's what we mean when we say that a product "works with" string arrays as inputs:
- If an input argument can be a single piece of text, then you can specify it as a character vector or as a string scalar. That is, you can use either single-quotes or double-quotes.
- If an input argument can contain multiple pieces of text, then you can specify it as a cell array of character vectors or as a string array.
We're holding to this pattern in old functions where we are adding support for string arrays, and in new functions going forward. Consider the split function I used to split a string. It works just as well on character vectors or cell arrays of character vectors as it does on strings. (Note that functions which have always returned a character vector in previous releases, such as the fileread function, will continue to do so for compatibility.)
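To make the pattern concrete, here is a small sketch that calls split with each of the four input types. The sample text is made up for illustration, and the double-quote syntax assumes R2017a or later.

```matlab
% The same function, called with each supported text type.
fromChar   = split('a horse');                    % character vector in, cell array out
fromString = split("a horse");                    % string scalar in, string array out
fromCell   = split({'a horse'; 'my kingdom'});    % cell array of character vectors in
fromArray  = split(["a horse"; "my kingdom"]);    % string array in

% The output text type follows the input text type.
types = {class(fromChar), class(fromString), class(fromCell), class(fromArray)};
```

In this sketch, the character vector and cell array inputs come back as cell arrays, while the string inputs come back as string arrays.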
We think this approach is the most seamless way to support use of string arrays throughout our products. Our advice is to take the same approach in your own code. If you maintain code of your own, or write code for other MATLAB users, then it is to your advantage to update your code to accept string arrays as inputs, while maintaining support for the older types for storing text.
Right now, you are thinking, "Great! But how do I do that?" (Well, perhaps "great" is not the exact word that came to mind.)
To help you update your code, the R2018a documentation provides guidelines to Update Your Code to Accept Strings. You can apply these guidelines to your code now. The guidelines, and our documentation, also provide links to helper functions, such as convertStringsToChars, that smooth the way to accepting strings in your code.
Here is a short example that shows two ways to put these guidelines into action. I'll start with a fast and simple way that uses one of our helper functions. Then I'll follow with another way that is a little more time consuming, but is also a little more forward-looking.
Suppose this code is your original function to abbreviate the name of a city using its first three letters, making those letters uppercase. Up to now, you have specified a city's name as a character vector. For example, if the input argument is 'Boston', then the output argument is 'BOS'.
function abr = abbreviateName(str)
    abr = upper(str(1:3));
end
If your user instead specifies the string "Boston" as the input argument, then the function throws an indexing error at run time. For a string array, the syntax str(1:3) means: return the first, second, and third strings as a three-element string array. But "Boston" all by itself is a 1-by-1 string array, so there are no second and third elements to return!
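You can see the mismatch for yourself with a few lines at the command line. This is only a sketch: the variable names are made up, and the try/catch is there just so that the script keeps running after the failure.

```matlab
str = "Boston";
n   = numel(str);        % 1: a string scalar is a single array element
len = strlength(str);    % 6: strlength counts the characters inside it

try
    abr = upper(str(1:3));   % asks for elements 1 through 3 of a 1-by-1 array
catch err
    msg = err.message;       % the out-of-range indexing error, caught at run time
end
```

The body of abbreviateName indexes exactly this way, which is why it fails for string inputs.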
One way to support both character vectors and strings is to use the helper function convertStringsToChars at the beginning of the original code. convertStringsToChars processes an arbitrary number of input arguments and, if any of them are strings, converts them. It leaves all other data types alone.
function abr = abbreviateName(str)
    str = convertStringsToChars(str);
    abr = upper(str(1:3));
end
With the helper function, you can accept input string arrays without altering any of the code that follows. Also, the output argument type does not differ from the type returned by the original code.
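Here is a short sketch of how convertStringsToChars treats different kinds of inputs. The input values are made up for illustration, and the helper itself requires R2017b or later.

```matlab
% Several inputs of mixed types, converted in one call.
[a, b, c] = convertStringsToChars("Boston", ["New York" "Chicago"], 42);
% a is the character vector 'Boston'
% b is the cell array {'New York', 'Chicago'}
% c is the number 42, left alone

% After conversion, the original indexing code works on characters again.
str = convertStringsToChars("Boston");
abr = upper(str(1:3));
```

A string scalar becomes a character vector, and a string array with more than one element becomes a cell array of character vectors. If your function can receive either, the code that follows the conversion needs to handle both cases.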
A second way is to use one of the new functions for text manipulation, such as extractBefore. These new functions also work with character vectors and cell arrays of character vectors. And so, this new version of abbreviateName works the same way for inputs of any text data type.
function abr = abbreviateName(str)
    abr = extractBefore(str,4);
    abr = upper(abr);
end
While this code supports both strings and character vectors, I did have to rewrite the code. Also, this code returns a string if the input is a string. The original code always returned a character vector.
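To check how the output type tracks the input type, here is a sketch that wraps the same two calls in an anonymous function. The handle name abbreviate is hypothetical and is used only so that the example fits in a script.

```matlab
% Same logic as the extractBefore version of abbreviateName, as a function handle.
abbreviate = @(str) upper(extractBefore(str, 4));

fromChar   = abbreviate('Boston');                % 'BOS', a character vector
fromString = abbreviate("Boston");                % "BOS", a string
fromArray  = abbreviate(["Boston" "Chicago"]);    % ["BOS" "CHI"], a string array
fromCell   = abbreviate({'Boston', 'Chicago'});   % {'BOS', 'CHI'}, a cell array
```

If your callers depend on always getting a character vector back, one option is to convert the result with the char function before returning it.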
Which way is the right way for you? Well, that is the "choose your own adventure" part. If you are doing a lot of work with text, then it might be time for you to do a deep dive into your code and rewrite its internals to use string arrays. If not, then use convertStringsToChars and minimize the pain of updating your code. On either path, look for help in our guidelines.
And if you have things to say about string arrays, or your efforts to update your own code, then let us know here.
Published with MATLAB® R2018a
4 Comments
In the past, I have created several functions that accept a character vector, a character array or a cell array of character vectors as input argument.
This input is first converted to a cell array with cellstr and handled in the rest of the code as a cell array of character vectors.
The nice advantage is that since R2016b these functions also automatically accept strings as input, because strings will also be converted to a cell array of character vectors.
So no code changes are required, and the code remains compatible with older versions of MATLAB.
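A minimal sketch of the cellstr approach described in this comment might look like the following. The variable names and sample city names are made up, and the last line is just one hypothetical example of downstream code that assumes a cell array of character vectors.

```matlab
% Normalize every supported text type to a cell array of character vectors.
c1 = cellstr('Boston');                  % character vector -> {'Boston'}
c2 = cellstr(['Boston '; 'Chicago']);    % character array  -> {'Boston'; 'Chicago'}
c3 = cellstr({'Boston', 'Chicago'});     % a cell array comes back unchanged
c4 = cellstr(["Boston" "Chicago"]);      % string array     -> {'Boston', 'Chicago'}

% Downstream code only has to handle one type.
abr = cellfun(@(s) upper(s(1:3)), c4, 'UniformOutput', false);
```

Only the fourth line needs R2017a or later, because it uses double-quotes to build the test input. The cellstr calls themselves are the part that keeps working in older releases.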
Please can you make it possible to enter a double-quoted string argument in the same way as a single-quoted char in command-style function invocation? For example, in the MATLAB command window:
dir 'C:\Program Files\MATLAB\R2018a'
works fine although
dir "C:\Program Files\MATLAB\R2018a"
gives the error Too many input arguments.
This would be especially useful because Windows uses double quotes to specify paths. If you “Copy as Path” from Windows explorer you have to manually change the double quotes back into single quotes after pasting into MATLAB, which is a pain.
Julian, thank you for your feedback.
The command form of function calls supports only the use of single-quotes. The purpose is to allow inputs that include space characters. Unfortunately, the string team found that supporting double-quotes for the command form introduced too many incompatibilities. Therefore, we do not support double-quotes for command form. We do regret that this choice leads to the inconvenience you cite in your comment.
To work around this issue, we recommend that you use the functional form. In fact, as a general rule the functional form is a safer choice. To extend your example, this call works in R2018a:
dir("C:\Program Files\MATLAB\R2018a")
We recommend this approach instead of changing double-quotes to single-quotes.
Stephen, thanks for your reply. I do miss this potential feature, but I had not considered the compatibility issues that you have to face.
When bashing away at the command line I skip using the functional form when I can to save keying pesky matching brackets, quotes and commas – as these feel like an extra burden (especially brackets, which need SHIFT button on the keyboard). This could be made easier if the bracket matching / quote matching features in the new Live Editor were also present for the command line (and old editor!). I was very pleased to discover this feature in the Live Editor because I had often thought about suggesting it to MathWorks. I can’t really make too much use of it while I am not using Live Editor for most of my work.