<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: MATLAB, Strings, and Regular Expressions</title>
	<atom:link href="http://blogs.mathworks.com/loren/2006/04/05/regexp-how-tos/feed/" rel="self" type="application/rss+xml" />
	<link>http://blogs.mathworks.com/loren/2006/04/05/regexp-how-tos/</link>
	<description>Loren Shure works on design of the MATLAB language at MathWorks. She writes here about once a week on MATLAB programming and related topics.</description>
	<lastBuildDate>Thu, 09 Feb 2012 04:19:21 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
	<item>
		<title>By: Jason Breslau</title>
		<link>http://blogs.mathworks.com/loren/2006/04/05/regexp-how-tos/#comment-32862</link>
		<dc:creator>Jason Breslau</dc:creator>
		<pubDate>Tue, 27 Dec 2011 15:11:29 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.mathworks.com/loren/?p=27#comment-32862</guid>
		<description>This is a limitation of dynamic expressions which is not well documented.

The expression generated dynamically needs to be a complete expression.  What this means is that if you take the pattern returned by the dynamic operator, it should be able to match as a standalone expression.

The dynamic portion of your pattern produces the expression &quot;4&quot; which will match the number &quot;4&quot;, as opposed to being used as a quantifier for the greater expression which contains it.  The way you handle this is to generate more of your expression as part of the dynamic component:


&gt;&gt; ptrn = &#039;^(\d)((??(\\s+[A-D]){$1}))&#039;;
&gt;&gt; [m,t]=regexpi(tst,ptrn,&#039;match&#039;,&#039;tokens&#039;,&#039;once&#039;)

m =

4 A B C D


t = 

    &#039;4&#039;    &#039; A B C D&#039;


A few things to note here:

1) Changing the dynamic component to (??{$1}) is not enough, as it is still not a complete expression, just the quantifier.

2) You have to escape the \s in the expression, as the entire subexpression is parsed like a replace string for regexprep.

3) To capture the second portion, I needed to add another set of parentheses, as dynamic expressions cannot create new capturing groups.

I hope that helps,

-=&gt;J</description>
		<content:encoded><![CDATA[<p>This is a limitation of dynamic expressions which is not well documented.</p>
<p>The expression generated dynamically needs to be a complete expression.  What this means is that if you take the pattern returned by the dynamic operator, it should be able to match as a standalone expression.</p>
<p>The dynamic portion of your pattern produces the expression &#8220;4&#8221; which will match the number &#8220;4&#8221;, as opposed to being used as a quantifier for the greater expression which contains it.  The way you handle this is to generate more of your expression as part of the dynamic component:</p>
<p>&gt;&gt; ptrn = '^(\d)((??(\\s+[A-D]){$1}))';<br />
&gt;&gt; [m,t]=regexpi(tst,ptrn,'match','tokens','once')</p>
<p>m =</p>
<p>4 A B C D</p>
<p>t = </p>
<p>    '4'    ' A B C D'</p>
<p>A few things to note here:</p>
<p>1) Changing the dynamic component to (??{$1}) is not enough, as it is still not a complete expression, just the quantifier.</p>
<p>2) You have to escape the \s in the expression, as the entire subexpression is parsed like a replace string for regexprep.</p>
<p>3) To capture the second portion, I needed to add another set of parentheses, as dynamic expressions cannot create new capturing groups.</p>
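<p>A minimal sketch of a two-pass alternative that avoids dynamic expressions altogether, assuming it is acceptable to read the count first and then build an ordinary static pattern with sprintf (the variable n is introduced here only for illustration):</p>
<pre>
tst = '4 A B C D 3';

% Pass 1: read the leading digit that gives the repeat count.
n = regexp(tst, '^\d', 'match', 'once');

% Pass 2: splice the count into a static pattern.
% sprintf turns \\d and \\s into \d and \s; (?:...) groups without capturing,
% so the outer parentheses capture the whole repeated portion.
ptrn = sprintf('^(\\d)((?:\\s+[A-D]){%s})', n);

[m, t] = regexpi(tst, ptrn, 'match', 'tokens', 'once')
</pre>
<p>For this input the generated pattern is ^(\d)((?:\s+[A-D]){4}), which should give the same match and tokens as the static pattern with the count written out.</p>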
<p>I hope that helps,</p>
<p>-=&gt;J</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Adrian Thompson</title>
		<link>http://blogs.mathworks.com/loren/2006/04/05/regexp-how-tos/#comment-32861</link>
		<dc:creator>Adrian Thompson</dc:creator>
		<pubDate>Sat, 24 Dec 2011 03:26:15 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.mathworks.com/loren/?p=27#comment-32861</guid>
		<description>Loren,

Something&#039;s not working for me. 

Why does the following use of a dynamic regular expression not return the same result as its static equivalent? Am I missing something about when dynamic expressions can be used in R2011a?

&gt;&gt; tst=&#039;4 A B C D 3&#039;;
&gt;&gt; ptrn=&#039;^(\d)(\s+[A-D]){4}&#039;;
&gt;&gt; [m,t]=regexpi(tst,ptrn,&#039;match&#039;,&#039;tokens&#039;,&#039;once&#039;)

m =

4 A B C D


t = 

    &#039;4&#039;    &#039; A B C D&#039;


&gt;&gt; ptrn=&#039;^(\d)(\s+[A-D]){(??$1)}&#039;

ptrn =

^(\d)(\s+[A-D]){(??$1)}

&gt;&gt; [m,t]=regexpi(tst,ptrn,&#039;match&#039;,&#039;tokens&#039;,&#039;once&#039;)

m =

     &#039;&#039;



t = 

     {}

NOTE: The only thing that has changed is that I&#039;ve replaced the number 4 with the dynamic regular expression (??$1) that should equate to exactly the same character &#039;4&#039;, as per the tokens shown above. I also have the same problem when I use a dynamic function call, as in (?@return_same_character($1)). The token is definitely captured, but it seems like the string is not being properly updated before the final matching is attempted.

Thanks for your help on this; I&#039;m stymied.

Regards,
Adrian</description>
		<content:encoded><![CDATA[<p>Loren,</p>
<p>Something&#8217;s not working for me. </p>
<p>Why does the following use of a dynamic regular expression not return the same result as its static equivalent? Am I missing something about when dynamic expressions can be used in R2011a?</p>
<p>&gt;&gt; tst='4 A B C D 3';<br />
&gt;&gt; ptrn='^(\d)(\s+[A-D]){4}';<br />
&gt;&gt; [m,t]=regexpi(tst,ptrn,'match','tokens','once')</p>
<p>m =</p>
<p>4 A B C D</p>
<p>t = </p>
<p>    '4'    ' A B C D'</p>
<p>&gt;&gt; ptrn='^(\d)(\s+[A-D]){(??$1)}'</p>
<p>ptrn =</p>
<p>^(\d)(\s+[A-D]){(??$1)}</p>
<p>&gt;&gt; [m,t]=regexpi(tst,ptrn,'match','tokens','once')</p>
<p>m =</p>
<p>     ''</p>
<p>t = </p>
<p>     {}</p>
<p>NOTE: The only thing that has changed is that I&#8217;ve replaced the number 4 with the dynamic regular expression (??$1) that should equate to exactly the same character '4', as per the tokens shown above. I also have the same problem when I use a dynamic function call, as in (?@return_same_character($1)). The token is definitely captured, but it seems like the string is not being properly updated before the final matching is attempted.</p>
<p>Thanks for your help on this; I&#8217;m stymied.</p>
<p>Regards,<br />
Adrian</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: altreus</title>
		<link>http://blogs.mathworks.com/loren/2006/04/05/regexp-how-tos/#comment-32548</link>
		<dc:creator>altreus</dc:creator>
		<pubDate>Sun, 16 Oct 2011 08:07:51 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.mathworks.com/loren/?p=27#comment-32548</guid>
		<description>Although this post is from 2006, it&#039;s now 2011, so we&#039;ve caught up!

The same regex can be used in Perl: they just didn&#039;t because it&#039;s horrible :) Clear code is preferable unless you&#039;re intentionally obfuscating.

&lt;pre&gt;
use List::Util qw(shuffle); #List::Util is core
while (&lt;&gt;) {
  s{(?&lt;=\w)(\w{2,})(?=\w)}{join &#039;&#039;, shuffle split //, $1}ge;
  print $_, &quot;\n&quot;;
}
&lt;/pre&gt;

Code evaluation like this has always been possible in Perl 5 as far as I can tell.

We also have named capture groups, which are stored in the special hash %+ since changing the way regexes return would introduce inconsistency:

(dump function provided by Data::Dump)
&lt;pre&gt;
use 5.010;
my @date = &#039;11/26/1977&#039; =~ m{(\d+)/(\d+)/(\d+)};
dump \@date;

# [11, 26, 1977]


&#039;11/26/1977&#039; =~ m{(?&lt;month&gt;\d+)/(?&lt;day&gt;\d+)/(?&lt;year&gt;\d+)};
my %date = %+;

dump \%date;

# { day =&gt; 26, month =&gt; 11, year =&gt; 1977 }
&lt;/pre&gt;

This is a feature of 5.10. Although many places are still running 5.8.8, it&#039;s hardly Perl&#039;s fault that people can&#039;t keep up :) 5.10 is itself end-of-life; 5.14 is current and 5.12 is considered old.

Case preservation is a thing we don&#039;t do. I suspect if you ask one of the core team why, they will give you one of two answers:

1. We don&#039;t know of anyone who has ever wanted it
2. We can&#039;t make it work in a consistent way.

Kirk out</description>
		<content:encoded><![CDATA[<p>Although this post is from 2006, it&#8217;s now 2011, so we&#8217;ve caught up!</p>
<p>The same regex can be used in Perl: they just didn&#8217;t because it&#8217;s horrible :) Clear code is preferable unless you&#8217;re intentionally obfuscating.</p>
<pre>
use List::Util qw(shuffle); #List::Util is core
while (&lt;&gt;) {
  s{(?&lt;=\w)(\w{2,})(?=\w)}{join '', shuffle split //, $1}ge;
  print $_, "\n";
}
</pre>
<p>Code evaluation like this has always been possible in Perl 5 as far as I can tell.</p>
<p>We also have named capture groups, which are stored in the special hash %+ since changing the way regexes return would introduce inconsistency:</p>
<p>(dump function provided by Data::Dump)</p>
<pre>
use 5.010;
my @date = '11/26/1977' =~ m{(\d+)/(\d+)/(\d+)};
dump \@date;

# [11, 26, 1977]

'11/26/1977' =~ m{(?&lt;month&gt;\d+)/(?&lt;day&gt;\d+)/(?&lt;year&gt;\d+)};
my %date = %+;

dump \%date;

# { day =&gt; 26, month =&gt; 11, year =&gt; 1977 }
</pre>
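<p>For comparison, a minimal sketch of the same date example using MATLAB named tokens, assuming the 'names' option of regexp (the variable names d and ymd are just for illustration):</p>
<pre>
% Named tokens come back as fields of a struct.
d = regexp('11/26/1977', '(?&lt;month&gt;\d+)/(?&lt;day&gt;\d+)/(?&lt;year&gt;\d+)', 'names');

% The fields hold text, so convert them when numbers are needed.
ymd = str2double({d.year, d.month, d.day})
</pre>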
<p>This is a feature of 5.10. Although many places are still running 5.8.8, it&#8217;s hardly Perl&#8217;s fault that people can&#8217;t keep up :) 5.10 is itself end-of-life; 5.14 is current and 5.12 is considered old.</p>
<p>Case preservation is a thing we don&#8217;t do. I suspect if you ask one of the core team why, they will give you one of two answers:</p>
<p>1. We don&#8217;t know of anyone who has ever wanted it<br />
2. We can&#8217;t make it work in a consistent way.</p>
<p>Kirk out</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Brad  Stiritz</title>
		<link>http://blogs.mathworks.com/loren/2006/04/05/regexp-how-tos/#comment-31924</link>
		<dc:creator>Brad  Stiritz</dc:creator>
		<pubDate>Mon, 20 Dec 2010 15:32:36 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.mathworks.com/loren/?p=27#comment-31924</guid>
		<description>Hi Loren, thanks for your empathy &amp; encouragement. I&#039;ve submitted an enhancement request &amp; referred to our dialogue here as supporting evidence. Fingers crossed! ;)

Happy holidays,
Brad</description>
		<content:encoded><![CDATA[<p>Hi Loren, thanks for your empathy &amp; encouragement. I&#8217;ve submitted an enhancement request &amp; referred to our dialogue here as supporting evidence. Fingers crossed! ;)</p>
<p>Happy holidays,<br />
Brad</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Loren</title>
		<link>http://blogs.mathworks.com/loren/2006/04/05/regexp-how-tos/#comment-31923</link>
		<dc:creator>Loren</dc:creator>
		<pubDate>Mon, 20 Dec 2010 12:02:52 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.mathworks.com/loren/?p=27#comment-31923</guid>
		<description>Brad-

I have the same troubles with regexp as you describe and would love to see a GUI that could help me out!  Please make this into an enhancement request by using the support link on the right side of my blog.  Thanks!

--Loren</description>
		<content:encoded><![CDATA[<p>Brad-</p>
<p>I have the same troubles with regexp as you describe and would love to see a GUI that could help me out!  Please make this into an enhancement request by using the support link on the right side of my blog.  Thanks!</p>
<p>&#8211;Loren</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Brad  Stiritz</title>
		<link>http://blogs.mathworks.com/loren/2006/04/05/regexp-how-tos/#comment-31922</link>
		<dc:creator>Brad  Stiritz</dc:creator>
		<pubDate>Sun, 19 Dec 2010 17:27:48 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.mathworks.com/loren/?p=27#comment-31922</guid>
		<description>I occasionally need to use regexp() within MATLAB. When I do, it&#039;s usually a frustrating experience. The documentation for regular expressions is by necessity &amp; by nature really long &amp; really dense! The doc page: User&#039;s Guide\Programming Fundamentals\ Basic Program Components\Regular Expressions is 48 letter-size pages long, 13,000 words!!

Consequently, if I&#039;m doing anything even slightly non-trivial, it can be very time-consuming to work out the proper match expression by trial and error at the MATLAB command line. 

In desperation I Googled for a Reg. Expr. utility &amp; found a free .NET-based offering called Expresso. This is a multi-pane app (similar in look to MATLAB) that allows one to see (in a tree view) how match expressions parse out. It&#039;s really lowered the stress-level for me &amp; speeds things up dramatically when I have to use regexp() in MATLAB.

IMHO, MathWorks should consider offering some kind of analyzer like Expresso within the MATLAB environment. Otherwise, in my experience, reg. expr&#039;s can be a real bottleneck for rapid code development.

Any comments appreciated,
Respectfully,
Brad</description>
		<content:encoded><![CDATA[<p>I occasionally need to use regexp() within MATLAB. When I do, it&#8217;s usually a frustrating experience. The documentation for regular expressions is by necessity &amp; by nature really long &amp; really dense! The doc page: User&#8217;s Guide\Programming Fundamentals\ Basic Program Components\Regular Expressions is 48 letter-size pages long, 13,000 words!!</p>
<p>Consequently, if I&#8217;m doing anything even slightly non-trivial, it can be very time-consuming to work out the proper match expression by trial and error at the MATLAB command line. </p>
<p>In desperation I Googled for a Reg. Expr. utility &amp; found a free .NET-based offering called Expresso. This is a multi-pane app (similar in look to MATLAB) that allows one to see (in a tree view) how match expressions parse out. It&#8217;s really lowered the stress-level for me &amp; speeds things up dramatically when I have to use regexp() in MATLAB.</p>
<p>IMHO, MathWorks should consider offering some kind of analyzer like Expresso within the MATLAB environment. Otherwise, in my experience, reg. expr&#8217;s can be a real bottleneck for rapid code development.</p>
<p>Any comments appreciated,<br />
Respectfully,<br />
Brad</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ray</title>
		<link>http://blogs.mathworks.com/loren/2006/04/05/regexp-how-tos/#comment-30976</link>
		<dc:creator>Ray</dc:creator>
		<pubDate>Tue, 19 Jan 2010 04:27:36 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.mathworks.com/loren/?p=27#comment-30976</guid>
		<description>It would be nice to have a MATLAB language/text processing toolbox.</description>
		<content:encoded><![CDATA[<p>It would be nice to have a MATLAB language/text processing toolbox.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Prakash</title>
		<link>http://blogs.mathworks.com/loren/2006/04/05/regexp-how-tos/#comment-30880</link>
		<dc:creator>Prakash</dc:creator>
		<pubDate>Mon, 07 Dec 2009 06:43:55 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.mathworks.com/loren/?p=27#comment-30880</guid>
		<description>I just figured it out. To get the number from the 7th row and 31st column of a cell array, use

str2double(c{7,31}{1})

Long live braces. Thanks for the platform.</description>
		<content:encoded><![CDATA[<p>I just figured it out. To get the number from the 7th row and 31st column of a cell array, use</p>
<p>str2double(c{7,31}{1})</p>
<p>Long live braces. Thanks for the platform.</p>
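<p>A minimal sketch of why the double braces are needed, assuming c came from regexp called on a cell array of strings with the 'tokens' and 'once' options (the sample strings and pattern below are made up for illustration):</p>
<pre>
strs = {'x=1.5', 'y=2'; 'z=30', 'w=4.25'};    % hypothetical 2x2 input

% With a cell array input, regexp returns a cell array of the same size.
% Each element is itself a cell array holding that string's tokens.
c = regexp(strs, '=(\d+\.?\d*)', 'tokens', 'once');

% The first braces pick the string (row 2, column 1);
% the second braces pick its first token.
val = str2double(c{2,1}{1})

% Convert the first token of every element in one call.
vals = cellfun(@(t) str2double(t{1}), c)
</pre>
<p>Here val should be 30 and vals a 2x2 numeric array. Without the 'once' option there is one more level of nesting, so the indexing would become c{2,1}{1}{1}.</p>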
]]></content:encoded>
	</item>
	<item>
		<title>By: Prakash</title>
		<link>http://blogs.mathworks.com/loren/2006/04/05/regexp-how-tos/#comment-30879</link>
		<dc:creator>Prakash</dc:creator>
		<pubDate>Mon, 07 Dec 2009 05:55:35 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.mathworks.com/loren/?p=27#comment-30879</guid>
		<description>I am unable to get numbers from tokens. Say the token variable c is shown as 8x42 (a cell array?). How do I get the number corresponding to, say, the element in the 7th row and 31st column?

How do I use str2double? For a 1x1 cell array, 
str2double(c{1}{1}); 
seems to work. I cannot figure out what would work for later elements of the cell array.</description>
		<content:encoded><![CDATA[<p>I am unable to get numbers from tokens. Say the token variable c is shown as 8x42 (a cell array?). How do I get the number corresponding to, say, the element in the 7th row and 31st column?</p>
<p>How do I use str2double? For a 1x1 cell array,<br />
str2double(c{1}{1});<br />
seems to work. I cannot figure out what would work for later elements of the cell array.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Loren</title>
		<link>http://blogs.mathworks.com/loren/2006/04/05/regexp-how-tos/#comment-30707</link>
		<dc:creator>Loren</dc:creator>
		<pubDate>Tue, 27 Oct 2009 17:06:17 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.mathworks.com/loren/?p=27#comment-30707</guid>
		<description>Alexander-

&lt;a href=&quot;http://www.mathworks.com/access/helpdesk/help/techdoc/index.html?/access/helpdesk/help/techdoc/matlab_prog/f0-42649.html&quot; rel=&quot;nofollow&quot;&gt;The documentation&lt;/a&gt; covers what MATLAB supports.

--Loren</description>
		<content:encoded><![CDATA[<p>Alexander-</p>
<p><a href="http://www.mathworks.com/access/helpdesk/help/techdoc/index.html?/access/helpdesk/help/techdoc/matlab_prog/f0-42649.html" rel="nofollow">The documentation</a> covers what MATLAB supports.</p>
<p>&#8211;Loren</p>
]]></content:encoded>
	</item>
</channel>
</rss>

