Loren on the Art of MATLAB

Note

Loren on the Art of MATLAB has been archived and will not be updated.

Accept String Inputs in Your Code

Today I'd like to introduce a guest blogger, Stephen Doe, who works for the MATLAB Documentation team here at MathWorks. In today's post, Stephen discusses how, and why, you might want to update your code to accept string arrays as inputs.

What Are Strings?

In R2016b, MATLAB® introduced the string data type as a new data type for text. Each element of a string array stores a sequence of characters. You can use standard array indexing and operations on string arrays, along with string manipulation functions introduced in R2016b. Also, you can use many MATLAB functions (such as sort and unique) on string arrays.

Here's a short example. Start with some text you can store as one string--or as a string scalar, a term we use to describe a string array with one element. As of R2017a, you can use double-quotes to create a string. (Single-quotes still create a character vector.)

str = "A horse! a horse! My kingdom for a horse!"
str = 
    "A horse! a horse! My kingdom for a horse!"

If you use the split function to split that string on space characters, then the result is a string array with nine elements. Instead of getting a cell array, you get a homogeneous array storing text--an array that has the same data type you started with. The split function returns the string array as a column vector. Let's reshape it as a row vector of strings for more compact display.

str = split(str);
str = str'
str = 
  1×9 string array
  Columns 1 through 8
    "A"    "horse!"    "a"    "horse!"    "My"    "kingdom"    "for"    "a"
  Column 9
    "horse!"

String arrays now provide a powerful way to deal with text in your data. As you work with your text in a string array, you never have to resort to cell arrays or curly brace indexing. I won't belabor the point, because Loren has already published some wonderful guest posts on using string arrays. If you are looking for more examples using strings, see: Introducing String Arrays, Singing the Praises of Strings, and Working with Text in MATLAB.

And of course, you can find the documentation for string arrays at Characters and Strings.
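To make the "no cell arrays, no curly braces" point concrete, here is a small sketch that continues the split example from above. The variable names words, third, distinct, and nChars are made up for illustration.

```matlab
% Recreate the split example from above (double quotes need R2017a or later).
str = "A horse! a horse! My kingdom for a horse!";
words = split(str)';          % 1-by-9 string array

% Parentheses indexing returns a string scalar; no curly braces needed.
third = words(3);             % "a"

% Many general-purpose functions accept string arrays directly.
distinct = unique(words);     % sorted, with duplicates removed
nChars = strlength(words);    % number of characters in each element
```

Every result except nChars is still a string array, so the text stays in one homogeneous type from start to finish.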

Public Service Announcement!

A few paragraphs ago, I said that "many" MATLAB functions now accept string arrays as input arguments. In fact, in a future release nearly all MathWorks® products will work with string arrays as inputs.

Here's what we mean when we say that a product "works with" string arrays as inputs:

  • If an input argument can be a single piece of text, then you can specify it as a character vector or as a string scalar. That is, you can use either single-quotes or double-quotes.
  • If an input argument can contain multiple pieces of text, then you can specify it as a cell array of character vectors or as a string array.

We're holding to this pattern in old functions where we are adding support for string arrays, and in new functions going forward. Consider the split function I used to split a string. It works just as well on character vectors or cell arrays of character vectors as it does on strings. (Note that functions which have always returned a character vector in previous releases, such as the fileread function, will continue to do so for compatibility.)
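As a quick check of that pattern, the sketch below calls split with a character vector and with a string scalar. The variable names c and s are only for illustration.

```matlab
% Same text, two input types.
c = split('My kingdom for a horse');   % character vector in
s = split("My kingdom for a horse");   % string scalar in

class(c)               % 'cell': a cell array of character vectors
class(s)               % 'string'

% The pieces of text are the same either way.
isequal(string(c), s)  % true
```

The function accepts either type, and the output type follows the input type.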

We think this approach is the most seamless way to support use of string arrays throughout our products. Our advice is to take the same approach in your own code. If you maintain code of your own, or write code for other MATLAB users, then it is to your advantage to update your code to accept string arrays as inputs, while maintaining support for the older types for storing text.

How to Make Your Code Work with Strings

Right now, you are thinking, "Great! But how do I do that?" (Well, perhaps "great" is not the exact word that came to mind.)

To help you update your code, the R2018a documentation provides guidelines to Update Your Code to Accept Strings. You can apply these guidelines to your code now. The guidelines, and our documentation, also provide links to helper functions, such as convertStringsToChars, that smooth the way to accepting strings in your code.

Here is a short example that shows two ways to put these guidelines into action. I'll start with a fast and simple way that uses one of our helper functions. Then I'll follow with another way that is a little more time-consuming, but is also a little more forward-looking.

Suppose this code is your original function to abbreviate the name of a city using its first three letters, making those letters uppercase. Up to now, you have specified a city's name as a character vector. For example, if the input argument is 'Boston', then the output argument is 'BOS'.

function abr = abbreviateName(str)
    abr = upper(str(1:3));
end

If your user instead specifies the string "Boston" as the input argument, then the function throws an error at run time. (It is an indexing error, not a syntax error.) For a string array, the syntax str(1:3) means, return the first, second, and third strings as a three-element string array. But "Boston" all by itself is a 1-by-1 string array, so there are no second and third elements to return!
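A quick way to see the mismatch is to try the indexing at the command line. This is only a sketch, and the exact wording of the error message differs between releases, so it is not reproduced here.

```matlab
str = "Boston";

size(str)         % [1 1]: one element, no matter how long the text is
strlength(str)    % 6: the number of characters in that one element

try
    str(1:3)      % asks for elements 1 through 3 of a 1-by-1 array
catch err
    % An index-out-of-bounds error; the message text depends on the release.
    disp(err.message)
end
```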

One way to support both character vectors and strings is to use the helper function convertStringsToChars at the beginning of the original code. convertStringsToChars processes an arbitrary number of input arguments and, if any of them are strings, converts them: a string scalar becomes a character vector, and any other string array becomes a cell array of character vectors. It leaves all other data types alone.

function abr = abbreviateName(str)
    str = convertStringsToChars(str);
    abr = upper(str(1:3));
end

With the helper function, you can accept input string arrays without altering any of the code that follows. Also, the output argument type does not differ from the type returned by the original code.
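To see what the helper does with several inputs at once, here is a small sketch. The three input values are arbitrary examples.

```matlab
% One call converts every string input and passes everything else through.
[a, b, c] = convertStringsToChars("Boston", ["Natick" "Cambridge"], 42);

class(a)   % 'char':   the string scalar became the character vector 'Boston'
class(b)   % 'cell':   the string array became {'Natick','Cambridge'}
class(c)   % 'double': the non-text input is unchanged
```

In a function that takes a variable number of inputs, the same idea is often written as [varargin{:}] = convertStringsToChars(varargin{:});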

A second way is to use one of the new functions for text manipulation, such as extractBefore. These new functions also work with character vectors and cell arrays of character vectors. And so, this new version of abbreviateName works the same way for inputs of any text data type.

function abr = abbreviateName(str)
    abr = extractBefore(str,4);
    abr = upper(abr);
end

While this code supports both strings and character vectors, I did have to rewrite the code. Also, this code returns a string if the input is a string. The original code always returned a character vector.
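The difference in output types can be checked side by side. In the sketch below, the anonymous functions withConvert and withExtract are hypothetical stand-ins for the two versions of abbreviateName, written as handles only so that the snippet can be pasted at the command line.

```matlab
% Stand-in for the convertStringsToChars version of abbreviateName.
firstThree  = @(c) c(1:3);
withConvert = @(str) upper(firstThree(convertStringsToChars(str)));

% Stand-in for the extractBefore version of abbreviateName.
withExtract = @(str) upper(extractBefore(str, 4));

class(withConvert('Boston'))   % 'char'
class(withConvert("Boston"))   % 'char':   always a character vector
class(withExtract('Boston'))   % 'char'
class(withExtract("Boston"))   % 'string': output type follows the input type
```

If callers of your function depend on getting a character vector back, that difference is worth checking before you switch to the second approach.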

Pulling More Strings

Which way is the right way for you? Well, that is the "choose your own adventure" part. If you are doing a lot of work with text, then it might be time for you to do a deep dive into your code and rewrite its internals to use string arrays. If not, then use convertStringsToChars and minimize the pain of updating your code. On either path, look for help in our guidelines.
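For the deep-dive path, one possible shape (a sketch, not the only way to do it) is the mirror image of convertStringsToChars: convert whatever text arrives to a string array at the top, and use string functions from there on. The variable names below are made up for illustration.

```matlab
% Legacy callers might still pass a cell array of character vectors.
names = {'Boston', 'Natick', 'Cambridge'};

% Normalize to a string array once, at the top. The string function accepts
% a character vector, a cell array of character vectors, or a string array.
names = string(names);

% From here on, the code works on every element without a loop or cellfun.
abr = upper(extractBefore(names, 4));   % ["BOS" "NAT" "CAM"]

% If older callers expect the legacy type back, convert on the way out.
abrCell = cellstr(abr);                 % {'BOS','NAT','CAM'}
```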

And if you have things to say about string arrays, or your efforts to update your own code, then let us know here.




Published with MATLAB® R2018a

