{"id":132,"date":"2008-03-27T15:08:43","date_gmt":"2008-03-27T20:08:43","guid":{"rendered":"https:\/\/blogs.mathworks.com\/loren\/2008\/03\/27\/a-way-to-automate-regular-renaming\/"},"modified":"2016-07-31T14:10:01","modified_gmt":"2016-07-31T19:10:01","slug":"a-way-to-automate-regular-renaming","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/loren\/2008\/03\/27\/a-way-to-automate-regular-renaming\/","title":{"rendered":"A Way to Automate &#8220;Regular&#8221; Renaming"},"content":{"rendered":"<div class=\"content\">\n<p>Recently someone at MathWorks asked me how he could automate the renaming of a bunch of M-files containing underscores (<tt>'_'<\/tt>) in the names with derived names that removed the underscores and used <a href=\"http:\/\/en.wikipedia.org\/wiki\/CamelCase\">camelCasing<\/a> instead. You may have similar name manipulation operations you need to perform.<\/p>\n<p>&nbsp;<\/p>\n<h3>Contents<\/h3>\n<div>\n<ul>\n<li><a href=\"#1\">My First Attempt<\/a><\/li>\n<li><a href=\"#2\">Some Sample Names<\/a><\/li>\n<li><a href=\"#3\">My Solution<\/a><\/li>\n<li><a href=\"#6\">History Lesson<\/a><\/li>\n<li><a href=\"#7\">Using regexprep<\/a><\/li>\n<li><a href=\"#8\">Conclusions<\/a><\/li>\n<\/ul>\n<\/div>\n<h3>My First Attempt<a name=\"1\"><\/a><\/h3>\n<p>Of course I resorted to using MATLAB for the task, despite other options. 
I chose the following requirements.<\/p>\n<div>\n<ul>\n<li>Don't worry about leading _<\/li>\n<li>Don't worry about cell arrays of strings or string matrices (vectors only need apply)<\/li>\n<li>Do worry about multiple consecutive _<\/li>\n<li>Do worry about trailing _<\/li>\n<\/ul>\n<\/div>\n<h3>Some Sample Names<a name=\"2\"><\/a><\/h3>\n<p>I first create a list of some sample names so I have a test suite to try out.<\/p>\n<pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid #c8c8c8;\">names = {<span style=\"color: #a020f0;\">'foo_bar'<\/span>,<span style=\"color: #a020f0;\">'foo_bar_'<\/span>,<span style=\"color: #a020f0;\">'foo__bar'<\/span>, <span style=\"color: #0000ff;\">...<\/span>\r\n    <span style=\"color: #a020f0;\">'foo_bar__'<\/span>, <span style=\"color: #a020f0;\">'foo_3'<\/span>,<span style=\"color: #a020f0;\">'foo_3_'<\/span>,<span style=\"color: #a020f0;\">'foo_3a'<\/span>, <span style=\"color: #0000ff;\">...<\/span>\r\n    <span style=\"color: #a020f0;\">'foo_bar____baz___234___'<\/span>};\r\nallnames = names'<\/pre>\n<pre style=\"font-style: oblique;\">allnames = \r\n    'foo_bar'\r\n    'foo_bar_'\r\n    'foo__bar'\r\n    'foo_bar__'\r\n    'foo_3'\r\n    'foo_3_'\r\n    'foo_3a'\r\n    'foo_bar____baz___234___'\r\n<\/pre>\n<h3>My Solution<a name=\"3\"><\/a><\/h3>\n<p>Let's first try out my solution on these.<\/p>\n<pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid #c8c8c8;\"><span style=\"color: #0000ff;\">for<\/span> name = names\r\n    disp(camelCase(name{1}));\r\n<span style=\"color: #0000ff;\">end<\/span><\/pre>\n<pre style=\"font-style: oblique;\">fooBar\r\nfooBar\r\nfooBar\r\nfooBar\r\nfoo3\r\nfoo3\r\nfoo3a\r\nfooBarBaz234\r\n<\/pre>\n<p>And now let's look at the code.<\/p>\n<pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid #c8c8c8;\">type <span style=\"color: #a020f0;\">camelCase<\/span><\/pre>\n<pre style=\"font-style: oblique;\">function y = camelCase(x)\r\n%camelCase Convert name with 
underscores to camelCase.\r\n\r\n% find the underscores \r\nindall = find(x=='_');\r\n% figure out where consecutive _ are \r\n% and remove all but the last \r\nconsec = diff(indall)==1;\r\nind = indall;\r\nind(consec) = [];\r\n\r\ny = x;\r\ny(min(ind+1,end)) = upper(y(min(ind+1,end)));\r\ny(indall) = '';\r\n<\/pre>\n<p>I first find all the underscores. Then I look for consecutive ones, because I only want the last one in each run: it's the character following it that I want to turn into upper case. That is, <b>if<\/b> a following character exists! So I have to check for that too. I then have an array of indices to convert to upper case (though I allow myself to uppercase <tt>_<\/tt> itself when it's the last character, clamping the index with <tt>min(ind+1,end)<\/tt> so I don't have to lengthen my input array; <tt>upper('_')<\/tt> is the same as <tt>'_'<\/tt>). Finally, I go back and use the original indices pointing to all the instances of <tt>'_'<\/tt> and remove them. Voila!<\/p>\n<h3>History Lesson<a name=\"6\"><\/a><\/h3>\n<p>And then I got some pangs, because I am well aware that MATLAB supports regular expressions. First some history. Did you know that <a href=\"http:\/\/en.wikipedia.org\/wiki\/Stephen_Kleene\">Stephen Kleene<\/a>, an American mathematician, was the inventor of regular expressions? He has also been credited with developing a very approachable proof of G\u00f6del's incompleteness theorems. And some punster then said, \"Kleeneliness is next to G\u00f6deliness\".<\/p>\n<h3>Using regexprep<a name=\"7\"><\/a><\/h3>\n<p>My friend, colleague, and <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/regexp.html\"><tt>regexp<\/tt><\/a> guru, <a href=\"https:\/\/blogs.mathworks.com\/loren\/2006\/04\/05\/regexp-how-tos\/\">Jason Breslau<\/a>, gave me the <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/regexprep.html\"><tt>regexprep<\/tt><\/a> solution to the problem. 
Using the same names as before, here is Jason's magical one-line expression, which produces the same output as my M-file above.<\/p>\n<pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid #c8c8c8;\"><span style=\"color: #0000ff;\">for<\/span> name = names\r\n    disp(regexprep(name{1}, <span style=\"color: #a020f0;\">'_+(\\w?)'<\/span>, <span style=\"color: #a020f0;\">'${upper($1)}'<\/span>));\r\n<span style=\"color: #0000ff;\">end<\/span><\/pre>\n<pre style=\"font-style: oblique;\">fooBar\r\nfooBar\r\nfooBar\r\nfooBar\r\nfoo3\r\nfoo3\r\nfoo3a\r\nfooBarBaz234\r\n<\/pre>\n<p>The pattern <tt>_+(\\w?)<\/tt> matches a run of one or more underscores together with the word character that follows it, if there is one. The dynamic expression <tt>${upper($1)}<\/tt> then replaces the whole match with that captured character converted to upper case, or with nothing when the underscores end the name.<\/p>\n<h3>Conclusions<a name=\"8\"><\/a><\/h3>\n<p>My code is <i>still<\/i> easier for me to understand, and I conclude from this that I should spend some time mastering regular expressions. In addition, the regular expression code requires no temporary variables, some of which could be large if the input string is long enough. It also occurs to me that regular expressions are a topic worth students learning well in college.<\/p>\n<p><script>\/\/ <![CDATA[\nfunction grabCode_045ea1df88224d348f2955463e7b3b56() {\n        \/\/ Remember the title so we can use it in the new page\n        title = document.title;\n\n        \/\/ Break up these strings so that their presence\n        \/\/ in the Javascript doesn't mess up the search for\n        \/\/ the MATLAB code.\n        t1='045ea1df88224d348f2955463e7b3b56 ' + '##### ' + 'SOURCE BEGIN' + ' #####';\n        t2='##### ' + 'SOURCE END' + ' #####' + ' 045ea1df88224d348f2955463e7b3b56';\n    \n        b=document.getElementsByTagName('body')[0];\n        i1=b.innerHTML.indexOf(t1)+t1.length;\n        i2=b.innerHTML.indexOf(t2);\n \n        code_string = b.innerHTML.substring(i1, i2);\n        code_string = code_string.replace(\/REPLACE_WITH_DASH_DASH\/g,'--');\n\n        \/\/ Use \/x3C\/g instead of the less-than character to avoid errors \n        \/\/ in the XML parser.\n        \/\/ Use '\\x26#60;' instead of 
'<' so that the XML parser\n        \/\/ doesn't go ahead and substitute the less-than character. \n        code_string = code_string.replace(\/\\x3C\/g, '\\x26#60;');\n\n        author = 'Loren Shure';\n        copyright = 'Copyright 2008 The MathWorks, Inc.';\n\n        w = window.open();\n        d = w.document;\n        d.write('\n\n\n\n\n\n<pre>\\n');\r\n        d.write(code_string);\r\n\r\n        \/\/ Add author and copyright lines at the bottom if specified.\r\n        if ((author.length > 0) || (copyright.length > 0)) {\r\n            d.writeln('');\r\n            d.writeln('%%');\r\n            if (author.length > 0) {\r\n                d.writeln('% _' + author + '_');\r\n            }\r\n            if (copyright.length > 0) {\r\n                d.writeln('% _' + copyright + '_');\r\n            }\r\n        }\r\n\r\n        d.write('<\/pre>\n\n\n\n\n\n\n\\n');\n      \n      d.title = title + ' (MATLAB code)';\n      d.close();\n      }\n\/\/ ]]><\/script><\/p>\n<p style=\"text-align: right; font-size: xx-small; font-weight: lighter; font-style: italic; color: gray;\"><a><span style=\"font-size: x-small; font-style: italic;\">Get<br \/>\nthe MATLAB code<br \/>\n<noscript>(requires JavaScript)<\/noscript><\/span><\/a><\/p>\n<p>Published with MATLAB\u00ae 7.6<\/p>\n<\/div>\n<p><!--\n045ea1df88224d348f2955463e7b3b56 ##### SOURCE BEGIN #####\n%% A Way to Automate \"Regular\" Renaming\n% Recently someone at MathWorks asked me how he could automate the renaming\n% of a bunch of M-files containing underscores (|'_'|) in the names with\n% derived names that removed the underscores and used\n% <http:\/\/en.wikipedia.org\/wiki\/CamelCase camelCasing> instead.\n% You may have similar name manipulation operations you need to perform.\n%% My First Attempt\n% Of course I resorted to using MATLAB for the task, despite other options.\n% I chose the following requirements.\n%\n% * Don't worry about leading _\n% * Don't worry about cell arrays of strings or string 
matrices (vectors\n% only need apply)\n% * Do worry about multiple consecutive _\n% * Do worry about trailing _\n%\n%% Some Sample Names\n% I first create a list of some sample names so I have a test suite to try\n% out.\nnames = {'foo_bar','foo_bar_','foo__bar', ...\n'foo_bar__', 'foo_3','foo_3_','foo_3a', ...\n'foo_bar____baz___234___'};\nallnames = names'\n%% My Solution\n% Let's first try out my solution on these.\nfor name = names\ndisp(camelCase(name{1}));\nend\n%%\n% And now let's look at the code.\ntype camelCase\n%%\n% I first find all the underscores.  Then I look for consecutive ones since\n% I really only want the last one in each sequence, since it's the\n% following character that I want to turn into upper case.  That is, *if*\n% a following character exists!  So I have to check for that too.  I then\n% have an array of indices to upper case (though I allow myself to\n% uppercase |_| at the end if it's the last character so I don't have to\n% lengthen my input array; |upper('_')| is the same as |'_'|).  Now, I go\n% back and use the original indices pointing to all the instances of |'_'|\n% and remove them.  Voila!\n%% History Lesson\n% And then I got some pangs, because I am well aware that MATLAB supports\n% <https:\/\/www.mathworks.com\/access\/helpdesk\/help\/techdoc\/matlab_prog\/f0-42649.html regular expressions>.\n% First some history.  Did you know that\n% <http:\/\/en.wikipedia.org\/wiki\/Stephen_Kleene Stephen Kleene>, an American\n% mathematician, was the inventor of regular expressions?  He has also been\n% credited with developing a very approachable proof to G\u00f6del's\n% incompleteness theorems.  
And some punster then said,\n% \"Kleeneliness is next to G\u00f6deliness\".\n%% Using regexprep\n% My friend, colleague, and <https:\/\/www.mathworks.com\/help\/matlab\/ref\/regexp.html |regexp|> guru, <https:\/\/blogs.mathworks.com\/loren\/2006\/04\/05\/regexp-how-tos\/ Jason Breslau> gave me the\n% <https:\/\/www.mathworks.com\/help\/matlab\/ref\/regexprep.html |regexprep|> solution to the problem.\n% Using the same names as before, I next show you Jason's magical\n% 1-line expression, producing the same output as my M-file above.\nfor name = names\ndisp(regexprep(name{1}, '_+(\\w?)', '${upper($1)}'));\nend\n%% Conclusions\n% My code is _still_ easier for me to understand, and I conclude from that\n% that I should spend some time trying to master regular expressions.  In\n% addition, the regular expression code requires no temporary variables,\n% some of which could be large if the input string is long enough.\n% It also occurs to me that regular expressions are a topic worthy of\n% students learning well in college.  What do you think?  Let me know\n% <https:\/\/blogs.mathworks.com\/loren\/?p=132#respond here>.\n\n##### SOURCE END ##### 045ea1df88224d348f2955463e7b3b56\n--><\/p>\n","protected":false},"excerpt":{"rendered":"<p>\nRecently someone at MathWorks asked me how he could automate the renaming of a bunch of M-files containing underscores ('_') in the names with derived names that removed the underscores and used... 
<a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/loren\/2008\/03\/27\/a-way-to-automate-regular-renaming\/\">read more >><\/a><\/p>\n","protected":false},"author":39,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[10,15,8,2],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/132"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/users\/39"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/comments?post=132"}],"version-history":[{"count":1,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/132\/revisions"}],"predecessor-version":[{"id":1845,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/132\/revisions\/1845"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/media?parent=132"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/categories?post=132"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/tags?post=132"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}