{"id":4844,"date":"2013-09-27T09:00:10","date_gmt":"2013-09-27T13:00:10","guid":{"rendered":"https:\/\/blogs.mathworks.com\/pick\/?p=4844"},"modified":"2016-07-06T09:26:39","modified_gmt":"2016-07-06T13:26:39","slug":"regexp-builder","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/pick\/2013\/09\/27\/regexp-builder\/","title":{"rendered":"Regexp Builder"},"content":{"rendered":"\r\n<div class=\"content\"><p><a href=\"https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/authors\/29096\">Idin<\/a>'s pick for this week is the <a href=\"https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/41899-regexpbuilder\"><tt>regexpBuilder<\/tt><\/a> by <a href=\"https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/authors\/289864\">Michael Ryan<\/a>.<\/p><p><a href=\"http:\/\/en.wikipedia.org\/wiki\/Regular_expression\">Regular expressions<\/a> can be a powerful tool in searching through strings or documents. If you haven't used them before, think of them as a string pattern matching tool on steroids.<\/p><p>MATLAB introduced support for regular expressions in MATLAB 6.5 (R13). You can read about regular expressions in MATLAB <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/matlab_prog\/regular-expressions.html\">here<\/a>. There are also a couple of blog posts on Loren Shure's Art of MATLAB <a href=\"https:\/\/blogs.mathworks.com\/loren\/2012\/10\/18\/learning-to-love-regular-expressions\/\">here<\/a> and <a href=\"https:\/\/blogs.mathworks.com\/loren\/2006\/04\/05\/regexp-how-tos\/\">here<\/a>.<\/p><p>Despite their usefulness, the drawback to regular expressions has always been their non-intuitive syntax. For example, how do you find all strings that start with a capital \"B\" and end with \"e\"? Here is one option:<\/p><pre class=\"language-matlab\">regexp(<span class=\"string\">'Bobbie was born on Bastille Day before dawn.'<\/span>, <span class=\"string\">'B[a-z]*e'<\/span>)\r\n<\/pre><p>Now what if I want to ignore case? 
Or allow spaces within the string? Regular expressions can handle all these cases, but it takes some experience, and some doing.<\/p><p>Michael's regexpBuilder is an app that allows the user to interactively build their regular expression and see the results in real time. It even has checkboxes for some of the more common tasks (e.g. \"ignore case\", or \"match once\"). This can help reduce the time (and frustration) needed to construct the desired regular expression.<\/p><p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/pick\/idin\/potw_regexpbuilder\/regexpBuilder.png\" alt=\"\"> <\/p><p>The user interface is quite simple: enter the text-to-be-searched in the big \"Text\" box, then start entering a regular expression in the top left \"Regexp\" box. The tool displays the results by highlighting and\/or underlining the matches in the text window, but it also goes further: it lists all the different outputs from MATLAB's regexp function on the right-hand side of the screen. And just to make things a little more convenient, when you press the \"Evaluate\" button, it also echoes the equivalent MATLAB command at the MATLAB command prompt.<\/p><p><b>Suggestions for improvements<\/b><\/p><p>This app accomplishes its advertised task quite well. 
Here are some ideas on what could be added:<\/p><div><ul><li>The user interface could be cleaned up to look more professional.<\/li><li>The text boxes on the right aren't supposed to be used by the user, and should probably be locked so the user doesn't type in them.<\/li><li>The ability to import text files and\/or web pages would be nice.<\/li><\/ul><\/div><p>As always, your thoughts and comments <a href=\"https:\/\/blogs.mathworks.com\/pick\/?p=4844#respond\">here<\/a> are greatly appreciated.<\/p><script language=\"JavaScript\"> <!-- \r\n    function grabCode_f094e17d7cad4a8aacee552636d92050() {\r\n        \/\/ Remember the title so we can use it in the new page\r\n        title = document.title;\r\n\r\n        \/\/ Break up these strings so that their presence\r\n        \/\/ in the Javascript doesn't mess up the search for\r\n        \/\/ the MATLAB code.\r\n        t1='f094e17d7cad4a8aacee552636d92050 ' + '##### ' + 'SOURCE BEGIN' + ' #####';\r\n        t2='##### ' + 'SOURCE END' + ' #####' + ' f094e17d7cad4a8aacee552636d92050';\r\n    \r\n        b=document.getElementsByTagName('body')[0];\r\n        i1=b.innerHTML.indexOf(t1)+t1.length;\r\n        i2=b.innerHTML.indexOf(t2);\r\n \r\n        code_string = b.innerHTML.substring(i1, i2);\r\n        code_string = code_string.replace(\/REPLACE_WITH_DASH_DASH\/g,'--');\r\n\r\n        \/\/ Use \/x3C\/g instead of the less-than character to avoid errors \r\n        \/\/ in the XML parser.\r\n        \/\/ Use '\\x26#60;' instead of '<' so that the XML parser\r\n        \/\/ doesn't go ahead and substitute the less-than character. 
\r\n        code_string = code_string.replace(\/\\x3C\/g, '\\x26#60;');\r\n\r\n        copyright = 'Copyright 2013 The MathWorks, Inc.';\r\n\r\n        w = window.open();\r\n        d = w.document;\r\n        d.write('<pre>\\n');\r\n        d.write(code_string);\r\n\r\n        \/\/ Add copyright line at the bottom if specified.\r\n        if (copyright.length > 0) {\r\n            d.writeln('');\r\n            d.writeln('%%');\r\n            if (copyright.length > 0) {\r\n                d.writeln('% _' + copyright + '_');\r\n            }\r\n        }\r\n\r\n        d.write('<\/pre>\\n');\r\n\r\n        d.title = title + ' (MATLAB code)';\r\n        d.close();\r\n    }   \r\n     --> <\/script><p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br><a href=\"javascript:grabCode_f094e17d7cad4a8aacee552636d92050()\"><span style=\"font-size: x-small;        font-style: italic;\">Get \r\n      the MATLAB code <noscript>(requires JavaScript)<\/noscript><\/span><\/a><br><br>\r\n      Published with MATLAB&reg; R2013b<br><\/p><p class=\"footer\"><br>\r\n      Published with MATLAB&reg; R2013b<br><\/p><\/div><!--\r\nf094e17d7cad4a8aacee552636d92050 ##### SOURCE BEGIN #####\r\n%%\r\n% <https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/authors\/29096\r\n% Idin>'s pick for this week is the\r\n% <https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/41899-regexpbuilder\r\n% |regexpBuilder|> by\r\n% <https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/authors\/289864\r\n% Michael Ryan>.\r\n%\r\n% <http:\/\/en.wikipedia.org\/wiki\/Regular_expression Regular expressions> can\r\n% be a powerful tool in searching through strings or documents. 
If you\r\n% haven't used them before, think of them as a string pattern matching tool\r\n% on steroids.\r\n%\r\n% MATLAB introduced support for regular expressions in MATLAB 6.5 (R13).\r\n% You can read about regular expressions in MATLAB\r\n% <https:\/\/www.mathworks.com\/help\/matlab\/matlab_prog\/regular-expressions.html\r\n% here>. There are also a couple of blog posts on Loren Shure's Art of\r\n% MATLAB\r\n% <https:\/\/blogs.mathworks.com\/loren\/2012\/10\/18\/learning-to-love-regular-expressions\/\r\n% here> and <https:\/\/blogs.mathworks.com\/loren\/2006\/04\/05\/regexp-how-tos\/\r\n% here>.\r\n%\r\n% Despite their usefulness, the drawback to regular expressions has always\r\n% been their non-intuitive syntax. For example, how do you find all strings\r\n% that start with a capital \"B\" and end with \"e\"? Here is one option:\r\n%\r\n%   regexp('Bobbie was born on Bastille Day before dawn.', 'B[a-z]*e')\r\n%\r\n% Now what if I want to ignore case? Or allow spaces within the string?\r\n% Regular expressions can handle all these cases, but it takes some\r\n% experience, and some doing.\r\n%\r\n% Michael's regexpBuilder is an app that allows the user to interactively\r\n% build their regular expression and see the results in real time. It even\r\n% has checkboxes for some of the more common tasks (e.g. \"ignore case\", or\r\n% \"match once\"). This can help reduce the time (and frustration) needed to\r\n% construct the desired regular expression.\r\n%\r\n% <<regexpBuilder.png>>\r\n%\r\n% The user interface is quite simple: enter the text-to-be-searched in the\r\n% big \"Text\" box, then start entering a regular expression in the top left\r\n% \"Regexp\" box. The tool displays the results by highlighting and\/or\r\n% underlining the matches in the text window, but it also goes further: it\r\n% lists all the different outputs from MATLAB's regexp function on the\r\n% right-hand side of the screen. 
And just to make things a little more\r\n% convenient, when you press the \"Evaluate\" button, it also echoes the\r\n% equivalent MATLAB command at the MATLAB command prompt.\r\n%\r\n% *Suggestions for improvements*\r\n%\r\n% This app accomplishes its advertised task quite well. Here are some\r\n% ideas on what could be added:\r\n%\r\n% * The user interface could be cleaned up to look more professional\r\n% * The text boxes on the right aren't supposed to be used by the user, and\r\n% should probably be locked so the user doesn't type in them.\r\n% * Ability to import text files and\/or web page would be nice (as in\r\n% <https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/40781-regexphelper\r\n% this> FileExchange submission).\r\n%\r\n% As always, your thoughts and comments\r\n% <https:\/\/blogs.mathworks.com\/pick\/?p=4844#respond here> are greatly\r\n% appreciated.\r\n\r\n##### SOURCE END ##### f094e17d7cad4a8aacee552636d92050\r\n-->","protected":false},"excerpt":{"rendered":"<div class=\"overview-image\"><img decoding=\"async\"  class=\"img-responsive\" src=\"https:\/\/blogs.mathworks.com\/images\/pick\/idin\/potw_regexpbuilder\/regexpBuilder.png\" onError=\"this.style.display ='none';\" \/><\/div><p>\r\nIdin's pick for this week is the regexpBuilder by Michael Ryan.Regular expressions can be a powerful tool in searching through strings or documents. If you haven't used them before, think of them... 
<a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/pick\/2013\/09\/27\/regexp-builder\/\">read more >><\/a><\/p>","protected":false},"author":36,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[16],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/pick\/wp-json\/wp\/v2\/posts\/4844"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/pick\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/pick\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/pick\/wp-json\/wp\/v2\/users\/36"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/pick\/wp-json\/wp\/v2\/comments?post=4844"}],"version-history":[{"count":8,"href":"https:\/\/blogs.mathworks.com\/pick\/wp-json\/wp\/v2\/posts\/4844\/revisions"}],"predecessor-version":[{"id":7602,"href":"https:\/\/blogs.mathworks.com\/pick\/wp-json\/wp\/v2\/posts\/4844\/revisions\/7602"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/pick\/wp-json\/wp\/v2\/media?parent=4844"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/pick\/wp-json\/wp\/v2\/categories?post=4844"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/pick\/wp-json\/wp\/v2\/tags?post=4844"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}