File Exchange Pick of the Week

Our best user submissions

R2020b: pattern (new way to regular express)

Jiro's Pick this week is the new pattern matching capabilities that were added in the newest release, R2020b.

I've always had a love-hate relationship with regular expressions. They are a powerful technique for string searching, but they are quite complicated as well. The fact that there are whole books on regular expressions goes to show that it is not a trivial technique to master completely. We have Picked a couple of entries related to regular expressions in the past, such as a regular expression cheat sheet and a regular expression builder app.

With R2020b, there is a whole new way of searching and modifying text. It involves a much simpler method of building pattern expressions using simple functions.

Let's look at an example. Here is a string vector.

str = ["When I joined MathWorks, the version of MATLAB was R2006a."
  "When I moved to Japan, r2014a had just been released."
  "Now, six years later, the current version is R2020B."];

Let's say that I want to extract all of the MATLAB releases from the text. All I need to do is search for patterns starting with "R", followed by 4 digits, followed by "a" or "b".

pat = "R" + digitsPattern(4) + ("a"|"b")
pat = 
    "R" + digitsPattern(4) + ("a" | "b")

To allow case-insensitive search (so that "r2014a" and "R2020B" also match),

pat = caseInsensitivePattern(pat)
pat = 
    caseInsensitivePattern("R" + digitsPattern(4) + ("a" | "b"))

Now we simply extract!

extract(str, pat)
ans = 
  3×1 string array
    "R2006a"
    "r2014a"
    "R2020B"

For reference, if we were to do this with regular expressions,

regexp(str, "[Rr]\d{4}[aAbB]", "match")
ans =
  3×1 cell array
    {["R2006a"]}
    {["r2014a"]}
    {["R2020B"]}

You be the judge as to which one is easier to understand.
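
The same pattern object is not limited to extract. As a minimal sketch, reusing the str and pat from above, here it is passed to contains, count, and replace. The variable names and the "<release>" placeholder text are just illustrative choices of mine.

```matlab
% Same sample text and pattern as in the example above.
str = ["When I joined MathWorks, the version of MATLAB was R2006a."
  "When I moved to Japan, r2014a had just been released."
  "Now, six years later, the current version is R2020B."];
pat = caseInsensitivePattern("R" + digitsPattern(4) + ("a"|"b"));

% Which elements mention a release at all? (logical column vector)
hasRelease = contains(str, pat);

% How many releases does each element mention?
numReleases = count(str, pat);

% Swap every release name for a placeholder (hypothetical text).
masked = replace(str, pat, "<release>");
```

Because the pattern is an ordinary variable, it can be built once and handed to whichever text function fits the task.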

Take a look at this page for more examples. Patterns will not be able to completely replace regular expressions, but not to worry! You can use regexpPattern to match regular expressions.
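
As a rough sketch of that fallback, the regular expression from the comparison above can be wrapped in regexpPattern and then used like any other pattern. Nothing here is new beyond the variable names rePat and releases, which are my own.

```matlab
% Same sample text as in the example above.
str = ["When I joined MathWorks, the version of MATLAB was R2006a."
  "When I moved to Japan, r2014a had just been released."
  "Now, six years later, the current version is R2020B."];

% Wrap the regular expression so it behaves as a pattern object.
rePat = regexpPattern("[Rr]\d{4}[aAbB]");

% extract returns a string array here, rather than the cell array
% that regexp(..., "match") produces for the same input.
releases = extract(str, rePat);
```

This keeps the regular expression for the parts that need it while still working with the pattern-based text functions.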

EDIT: Check out one of the comments below to see a slightly more complicated example.


Give it a try and let us know what you think here.

Published with MATLAB® R2020b
