{"id":2080,"date":"2016-08-01T09:26:20","date_gmt":"2016-08-01T13:26:20","guid":{"rendered":"https:\/\/blogs.mathworks.com\/steve\/?p=2080"},"modified":"2019-11-01T16:44:44","modified_gmt":"2019-11-01T20:44:44","slug":"ineffective-preallocation","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/steve\/2016\/08\/01\/ineffective-preallocation\/","title":{"rendered":"Ineffective preallocation"},"content":{"rendered":"<div class=\"content\"><p>Today I want to explain a MATLAB coding mistake that I have seen even experienced MATLAB users make.<\/p><p>I was looking at some code on the File Exchange recently, and these lines caught my eye:<\/p><pre>cm_data = zeros(m,3);\r\nhsv=rgb2hsv(cm);\r\ncm_data=interp1(linspace(0,1,size(cm,1)),hsv,linspace(0,1,m));\r\ncm_data=hsv2rgb(cm_data);<\/pre><p>Specifically, I was struck by the first and third lines. The first line creates an all-zeros matrix. Then the third line throws away the matrix created by the first line, replacing it with the output of <tt>interp1<\/tt>.<\/p><p>I have seen code like this before. It results from a misunderstanding of the concept of <i>preallocation<\/i> in MATLAB.<\/p><p>In MATLAB, preallocation refers to creating a matrix (usually with <tt>zeros<\/tt>) prior to entering a loop that would otherwise repeatedly resize the matrix.<\/p><p>Here is an example. In the for-loop below, the size of the matrix <tt>x<\/tt> is increased by one element each time through the body of the loop.<\/p><pre class=\"codeinput\">x = 0;\r\n<span class=\"keyword\">for<\/span> k = 2:1000000\r\n   x(k) = x(k-1) + 5;\r\n<span class=\"keyword\">end<\/span>\r\n<\/pre><p>In versions of MATLAB before R2011a, this kind of loop could be quite slow. The reason is that MATLAB would reallocate the memory space for the matrix <tt>x<\/tt> each time through the loop, resulting in $O(N^2)$ execution time overall. To avoid this performance penalty, the usual recommendation was to create the full-size matrix <tt>x<\/tt> before entering the loop, like this:<\/p><pre class=\"codeinput\">x = zeros(1,1000000);\r\n<span class=\"keyword\">for<\/span> k = 2:1000000\r\n   x(k) = x(k-1) + 5;\r\n<span class=\"keyword\">end<\/span>\r\n<\/pre><p>Although automatic array growth got <a href=\"https:\/\/blogs.mathworks.com\/steve\/2011\/05\/16\/automatic-array-growth-gets-a-lot-faster-in-r2011a\/\">much more efficient in MATLAB R2011a<\/a> (it is no longer $O(N^2)$), there is still some performance benefit to preallocation, which is why it is still <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/matlab_prog\/preallocating-arrays.html\">described in the documentation<\/a>.<\/p><p>In the code from the File Exchange, though, the matrix <tt>cm_data<\/tt> is not being grown in a loop. The result of the initial assignment, <tt>cm_data = zeros(m,3)<\/tt>, is just discarded without ever being used.<\/p><p>The MATLAB Editor tries to tell you about this. Do you see the orange squiggle on the first line below?<\/p><p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/steve\/files\/editor-screen-shot.png\" alt=\"\"> <\/p><p>If you hover over the squiggle with the mouse, you see this:<\/p><p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/steve\/files\/editor-screen-shot-short-message.png\" alt=\"\"> <\/p><p>And if you then click on the Details button, you get a full explanation:<\/p><p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/steve\/files\/editor-screen-shot-full-explanation.png\" alt=\"\"> <\/p><p>In summary, don't worry about putting in that call to <tt>zeros<\/tt> unless you are growing a matrix in a loop. If you later replace the variable with a completely new matrix returned by another function call, then the call to <tt>zeros<\/tt> is not helping you, and it might even slow down your code.<\/p><script language=\"JavaScript\"> <!-- \r\n    function grabCode_0b2286802748443f9a83ba9f122ca13e() {\r\n        \/\/ Remember the title so we can use it in the new page\r\n        title = document.title;\r\n\r\n        \/\/ Break up these strings so that their presence\r\n        \/\/ in the Javascript doesn't mess up the search for\r\n        \/\/ the MATLAB code.\r\n        t1='0b2286802748443f9a83ba9f122ca13e ' + '##### ' + 'SOURCE BEGIN' + ' #####';\r\n        t2='##### ' + 'SOURCE END' + ' #####' + ' 0b2286802748443f9a83ba9f122ca13e';\r\n    \r\n        b=document.getElementsByTagName('body')[0];\r\n        i1=b.innerHTML.indexOf(t1)+t1.length;\r\n        i2=b.innerHTML.indexOf(t2);\r\n \r\n        code_string = b.innerHTML.substring(i1, i2);\r\n        code_string = code_string.replace(\/REPLACE_WITH_DASH_DASH\/g,'--');\r\n\r\n        \/\/ Use \/x3C\/g instead of the less-than character to avoid errors \r\n        \/\/ in the XML parser.\r\n        \/\/ Use '\\x26#60;' instead of '<' so that the XML parser\r\n        \/\/ doesn't go ahead and substitute the less-than character. \r\n        code_string = code_string.replace(\/\\x3C\/g, '\\x26#60;');\r\n\r\n        copyright = 'Copyright 2016 The MathWorks, Inc.';\r\n\r\n        w = window.open();\r\n        d = w.document;\r\n        d.write('<pre>\\n');\r\n        d.write(code_string);\r\n\r\n        \/\/ Add copyright line at the bottom if specified.\r\n        if (copyright.length > 0) {\r\n            d.writeln('');\r\n            d.writeln('%%');\r\n            if (copyright.length > 0) {\r\n                d.writeln('% _' + copyright + '_');\r\n            }\r\n        }\r\n\r\n        d.write('<\/pre>\\n');\r\n\r\n        d.title = title + ' (MATLAB code)';\r\n        d.close();\r\n    }   \r\n     --> <\/script><p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br><a href=\"javascript:grabCode_0b2286802748443f9a83ba9f122ca13e()\"><span style=\"font-size: x-small;        font-style: italic;\">Get \r\n      the MATLAB code <noscript>(requires JavaScript)<\/noscript><\/span><\/a><br><br>\r\n      Published with MATLAB&reg; R2016a<br><\/p><\/div><!--\r\n0b2286802748443f9a83ba9f122ca13e ##### SOURCE BEGIN #####\r\n%%\r\n% Today I want to explain a MATLAB coding mistake that I have seen even\r\n% experienced MATLAB users make.\r\n%\r\n% I was looking at some code on the File Exchange recently, and these lines\r\n% caught my eye:\r\n%\r\n%  cm_data = zeros(m,3);\r\n%  hsv=rgb2hsv(cm);\r\n%  cm_data=interp1(linspace(0,1,size(cm,1)),hsv,linspace(0,1,m));\r\n%  cm_data=hsv2rgb(cm_data);\r\n%\r\n% Specifically, I was struck by the first and third lines. The first line\r\n% creates an all-zeros matrix. Then the third line throws away the matrix\r\n% created by the first line, replacing it with the output of |interp1|.\r\n%\r\n% I have seen code like this before. It results from a misunderstanding of\r\n% the concept of _preallocation_ in MATLAB.\r\n%\r\n% In MATLAB, preallocation refers to creating a matrix (usually with\r\n% |zeros|) prior to entering a loop that would otherwise repeatedly resize\r\n% the matrix.\r\n%\r\n% Here is an example. In the for-loop below, the size of the matrix |x| is\r\n% increased by one element each time through the body of the loop.\r\n\r\nx = 0;\r\nfor k = 2:1000000\r\n   x(k) = x(k-1) + 5;\r\nend\r\n\r\n%%\r\n% In versions of MATLAB before R2011a, this kind of loop could be quite\r\n% slow. The reason is that MATLAB would reallocate the memory space for the\r\n% matrix |x| each time through the loop, resulting in $O(N^2)$ execution\r\n% time overall. To avoid this performance penalty, the usual recommendation\r\n% was to create the full-size matrix |x| before entering the loop, like\r\n% this:\r\n\r\nx = zeros(1,1000000);\r\nfor k = 2:1000000\r\n   x(k) = x(k-1) + 5;\r\nend\r\n\r\n%%\r\n% Although automatic array growth got <https:\/\/blogs.mathworks.com\/steve\/2011\/05\/16\/automatic-array-growth-gets-a-lot-faster-in-r2011a\/ \r\n% much more efficient in MATLAB R2011a>\r\n% (it is no longer $O(N^2)$), there is still some performance benefit to\r\n% preallocation, which is why it is still\r\n% <https:\/\/www.mathworks.com\/help\/matlab\/matlab_prog\/preallocating-arrays.html\r\n% described in the documentation>.\r\n%\r\n% In the code from the File Exchange, though, the matrix |cm_data| is not\r\n% being grown in a loop. The result of the initial assignment, |cm_data =\r\n% zeros(m,3)|, is just discarded without ever being used.\r\n%\r\n% The MATLAB Editor tries to tell you about this. Do you see the orange\r\n% squiggle on the first line below?\r\n%\r\n% <<https:\/\/blogs.mathworks.com\/steve\/files\/editor-screen-shot.png>>\r\n%\r\n% If you hover over the squiggle with the mouse, you see this:\r\n%\r\n% <<https:\/\/blogs.mathworks.com\/steve\/files\/editor-screen-shot-short-message.png>>\r\n%\r\n% And if you then click on the Details button, you get a full explanation:\r\n%\r\n% <<https:\/\/blogs.mathworks.com\/steve\/files\/editor-screen-shot-full-explanation.png>>\r\n%\r\n% In summary, don't worry about putting in that call to |zeros| unless you\r\n% are growing a matrix in a loop. If you later replace the variable with a\r\n% completely new matrix returned by another function call, then the call to\r\n% |zeros| is not helping you, and it might even slow down your code.\r\n\r\n##### SOURCE END ##### 0b2286802748443f9a83ba9f122ca13e\r\n-->","protected":false},"excerpt":{"rendered":"<div class=\"overview-image\"><img src=\"https:\/\/blogs.mathworks.com\/steve\/files\/editor-screen-shot-full-explanation.png\" class=\"img-responsive attachment-post-thumbnail size-post-thumbnail wp-post-image\" alt=\"\" decoding=\"async\" loading=\"lazy\" \/><\/div><p>Today I want to explain a MATLAB coding mistake that I have seen even experienced MATLAB users make.I was looking at some code on the File Exchange recently, and these lines caught my eye:cm_data =... <a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/steve\/2016\/08\/01\/ineffective-preallocation\/\">read more >><\/a><\/p>","protected":false},"author":42,"featured_media":2078,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[1165,723,32,957,190,130],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts\/2080"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/users\/42"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/comments?post=2080"}],"version-history":[{"count":1,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts\/2080\/revisions"}],"predecessor-version":[{"id":2081,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts\/2080\/revisions\/2081"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/media\/2078"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/media?parent=2080"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/categories?post=2080"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/tags?post=2080"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}