{"id":371,"date":"2011-05-16T20:38:35","date_gmt":"2011-05-17T00:38:35","guid":{"rendered":"https:\/\/blogs.mathworks.com\/steve\/2011\/05\/16\/automatic-array-growth-gets-a-lot-faster-in-r2011a\/"},"modified":"2019-10-29T16:36:41","modified_gmt":"2019-10-29T20:36:41","slug":"automatic-array-growth-gets-a-lot-faster-in-r2011a","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/steve\/2011\/05\/16\/automatic-array-growth-gets-a-lot-faster-in-r2011a\/","title":{"rendered":"Automatic array growth gets a lot faster in R2011a"},"content":{"rendered":"<div xmlns:mwsh=\"https:\/\/www.mathworks.com\/namespace\/mcode\/v1\/syntaxhighlight.dtd\" class=\"content\">\r\n   <p>One of the most significant MATLAB improvements in the R2011a release is in how certain kinds of array growth operations are\r\n      handled. This is a kind of \"under-the-hood\" change that I think is exciting and deserves some publicity.\r\n   <\/p>\r\n   <p>For those of you who are MATLAB power users and understand preallocation and why it is necessary, I'll tell you the story's\r\n      ending first: MATLAB R2011a performs <b>much<\/b> better than previous MATLAB versions when performing repeated array growth in a loop.\r\n   <\/p>\r\n   <p>For everyone else, let's start with some not-so-hypothetical tech support questions:<\/p>\r\n   <div>\r\n      <ul>\r\n         <li>\"Why is the MATLAB function <tt>foobar<\/tt> so slow?\"\r\n         <\/li>\r\n         <li>\"Why does MATLAB hang if I put a call to <tt>foobar<\/tt> in a loop?\"\r\n         <\/li>\r\n         <li>\"Why does the function <tt>foobar<\/tt> run so much faster at the command line than when I put it in a function?\"\r\n         <\/li>\r\n      <\/ul>\r\n   <\/div>\r\n   <p>The tech support engineer will start digging into the question to see why the unhappy MATLAB user has reached his conclusion:<\/p>\r\n   <p>\"Well, I ran the profiler on my function <tt>flibbergibble<\/tt>, which calls <tt>foobar<\/tt>:\r\n   <\/p>\r\n   <p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2011\/flibbergibble-screenshot.png\"> <\/p>\r\n   <p>\"You can see that <tt>foobar<\/tt> takes about 0.35 ms per call (70.93 \/ 200000), but when I call <tt>foobar<\/tt> 200000 times from the command line in a loop it only takes 8.5e-7 seconds per call:\r\n   <\/p><pre> tic, for k = 1:200000, foobar(k); end, toc\r\n Elapsed time is 0.173287 seconds.<\/pre><p>\"That's more than 400 times faster! Why is <tt>foobar<\/tt> so slow in a function?\"\r\n   <\/p>\r\n   <p>Experienced MATLAB users know that the performance problem has little to do with <tt>foobar(k)<\/tt> on line 4 of the profile report and much to do with <tt>y(k) =<\/tt> on the same line.\r\n   <\/p>\r\n   <p>Let's examine why. The first time through the loop, the variable <tt>y<\/tt> hasn't been assigned to yet, so MATLAB creates it through the assignment to <tt>y(1)<\/tt>. The second time through the loop assigns to <tt>y(2)<\/tt>, which doesn't exist yet. So MATLAB allocates enough new memory for two elements and then copies <tt>y(1)<\/tt> and assigns <tt>y(2)<\/tt> into the new space. The third time through the loop assigns to <tt>y(3)<\/tt>, which doesn't exist yet. So MATLAB allocates enough new memory for three elements and then copies <tt>y(1)<\/tt> and <tt>y(2)<\/tt> and assigns <tt>y(3)<\/tt> into the new space.\r\n   <\/p>\r\n   <p>See the pattern? On the k-th time through the loop, MATLAB has to make a copy of k elements into newly allocated memory. The\r\n      time required to copy k elements is proportional to k. The time required to execute the entire loop is therefore proportional\r\n      to the sum of the integers from 1 to n, which is n*(n+1)\/2.\r\n   <\/p>\r\n   <p>Bottom line: memory copying during the assignment statement causes the time required to execute the loop to be proportional\r\n      to n^2.\r\n   <\/p>\r\n   <p>Let's see some data. I timed <tt>flibbergibble<\/tt> for various values of <tt>n<\/tt> using MATLAB R2010b. Here's the data:\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">n = [100 200 500 1000 2000 10000 20000 40000 50000];\r\nt = [1.2e-4 2.7e-4 8.6e-4 2.4e-3 2.8e-3 5.8e-2 .24 1.32 2.4];\r\n\r\nplot(n,t)\r\ntitle(<span style=\"color: #A020F0\">'n versus t, MATLAB R2010b'<\/span>)<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/steve\/2011\/prealloc_01.png\"> <p>That looks pretty parabola-like. I checked it out with <tt>cftool<\/tt> from the Curve Fitting Toolbox and found that a rough fit to the data is the curve <tt>1.0e-9 * n^2<\/tt>. What if we tried <tt>n = 1000000<\/tt>?  From our curve we could estimate that it would take about 17 minutes:\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">n = 1e6;\r\nt_minutes = 1e-9 * n^2 \/ 60<\/pre><pre style=\"font-style:oblique\">\r\nt_minutes =\r\n\r\n   16.6667\r\n\r\n<\/pre><p>This is what triggers the tech support call where the user thinks that the function <tt>foobar<\/tt> has caused MATLAB to hang. MATLAB isn't hung, it's just working really hard reallocating all that memory.\r\n   <\/p>\r\n   <p>The solution is to <i>preallocate<\/i> space for <tt>y<\/tt> before beginning the loop, like this:\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">type <span style=\"color: #A020F0\">flibbergibble2<\/span><\/pre><pre style=\"font-style:oblique\">\r\nfunction y = flibbergibble2(n)\r\n\r\ny = zeros(1, n);\r\nfor k = 1:n\r\n    y(k) = foobar(k);\r\nend\r\n\r\n<\/pre><p>The code above does not have to reallocate any memory each time through the loop. In MATLAB R2010b on my computer it runs\r\n      in less than a second:\r\n   <\/p><pre> tic, flibbergibble2(1000000); toc\r\n Elapsed time is 0.737451 seconds.<\/pre><p>The big change in MATLAB R2011a is that MATLAB uses a better strategy for reallocating memory as the array grows when it detects\r\n      this growth pattern. As a result, the time required to execute the loop grows only linearly with n.\r\n   <\/p>\r\n   <p>Here's how it takes for <tt>n = 1000000<\/tt> in MATLAB R2011a on my computer:\r\n   <\/p><pre> tic, flibbergibble(1000000); toc\r\n Elapsed time is 1.105256 seconds.<\/pre><p>That's in the neighborhood of a thousand times faster than R2010b. You can see that the preallocation strategy (in <tt>flibbergibble2<\/tt>) is still a bit faster, but the performance penalty for not preallocating has been substantially reduced.\r\n   <\/p>\r\n   <p>If the final size of your output array is trivial to calculate, then I would say go ahead and preallocate. It's just one extra\r\n      code line and it's a little faster. But if the final size is complicated to calculate, then maybe you don't have to work as\r\n      hard you did before to optimize your code. If you are growing a vector, or if the array growth is in the final dimension of\r\n      a matrix or multidimensional array, you can now seriously consider just letting MATLAB do its own \"automatic array growth\"\r\n      thing.\r\n   <\/p><script language=\"JavaScript\">\r\n<!--\r\n\r\n    function grabCode_cc75abd5643d4f8794c6bed3db3c2488() {\r\n        \/\/ Remember the title so we can use it in the new page\r\n        title = document.title;\r\n\r\n        \/\/ Break up these strings so that their presence\r\n        \/\/ in the Javascript doesn't mess up the search for\r\n        \/\/ the MATLAB code.\r\n        t1='cc75abd5643d4f8794c6bed3db3c2488 ' + '##### ' + 'SOURCE BEGIN' + ' #####';\r\n        t2='##### ' + 'SOURCE END' + ' #####' + ' cc75abd5643d4f8794c6bed3db3c2488';\r\n    \r\n        b=document.getElementsByTagName('body')[0];\r\n        i1=b.innerHTML.indexOf(t1)+t1.length;\r\n        i2=b.innerHTML.indexOf(t2);\r\n \r\n        code_string = b.innerHTML.substring(i1, i2);\r\n        code_string = code_string.replace(\/REPLACE_WITH_DASH_DASH\/g,'--');\r\n\r\n        \/\/ Use \/x3C\/g instead of the less-than character to avoid errors \r\n        \/\/ in the XML parser.\r\n        \/\/ Use '\\x26#60;' instead of '<' so that the XML parser\r\n        \/\/ doesn't go ahead and substitute the less-than character. \r\n        code_string = code_string.replace(\/\\x3C\/g, '\\x26#60;');\r\n\r\n        author = '';\r\n        copyright = 'Copyright 2011 The MathWorks, Inc.';\r\n\r\n        w = window.open();\r\n        d = w.document;\r\n        d.write('<pre>\\n');\r\n        d.write(code_string);\r\n\r\n        \/\/ Add author and copyright lines at the bottom if specified.\r\n        if ((author.length > 0) || (copyright.length > 0)) {\r\n            d.writeln('');\r\n            d.writeln('%%');\r\n            if (author.length > 0) {\r\n                d.writeln('% _' + author + '_');\r\n            }\r\n            if (copyright.length > 0) {\r\n                d.writeln('% _' + copyright + '_');\r\n            }\r\n        }\r\n\r\n        d.write('<\/pre>\\n');\r\n      \r\n      d.title = title + ' (MATLAB code)';\r\n      d.close();\r\n      }   \r\n      \r\n-->\r\n<\/script><p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br><a href=\"javascript:grabCode_cc75abd5643d4f8794c6bed3db3c2488()\"><span style=\"font-size: x-small;        font-style: italic;\">Get \r\n            the MATLAB code \r\n            <noscript>(requires JavaScript)<\/noscript><\/span><\/a><br><br>\r\n      Published with MATLAB&reg; 7.12<br><\/p>\r\n<\/div>\r\n<!--\r\ncc75abd5643d4f8794c6bed3db3c2488 ##### SOURCE BEGIN #####\r\n%%\r\n% One of the most significant MATLAB improvements in the R2011a release is\r\n% in how certain kinds of array growth operations are handled. This is a\r\n% kind of \"under-the-hood\" change that I think is exciting and deserves\r\n% some publicity.\r\n%\r\n% For those of you who are MATLAB power users and understand preallocation\r\n% and why it is necessary, I'll tell you the story's ending first: MATLAB\r\n% R2011a performs *much* better than previous MATLAB versions when\r\n% performing repeated array growth in a loop.\r\n%\r\n% For everyone else, let's start with some not-so-hypothetical tech support\r\n% questions:\r\n%\r\n% * \"Why is the MATLAB function |foobar| so slow?\"\r\n% * \"Why does MATLAB hang if I put a call to |foobar| in a loop?\"\r\n% * \"Why does the function |foobar| run so much faster at the command line\r\n% than when I put it in a function?\"\r\n%\r\n% The tech support engineer will start digging into the question to see why\r\n% the unhappy MATLAB user has reached his conclusion:\r\n%\r\n% \"Well, I ran the profiler on my function |flibbergibble|, which calls\r\n% |foobar|:\r\n%\r\n% <<https:\/\/blogs.mathworks.com\/images\/steve\/2011\/flibbergibble-screenshot.png>>\r\n%\r\n% \"You can see that |foobar| takes about 0.35 ms per call (70.93 \/ 200000),\r\n% but when I call |foobar| 200000 times from the command line in a loop it\r\n% only takes 8.5e-7 seconds per call:\r\n%\r\n%   tic, for k = 1:200000, foobar(k); end, toc\r\n%   Elapsed time is 0.173287 seconds.\r\n%\r\n% \"That's more than 400 times faster! Why is |foobar| so slow in a\r\n% function?\" \r\n% \r\n% Experienced MATLAB users know that the performance problem has little to\r\n% do with |foobar(k)| on line 4 of the profile report and much to do with\r\n% |y(k) =| on the same line.\r\n%\r\n% Let's examine why. The first time through the loop, the variable |y|\r\n% hasn't been assigned to yet, so MATLAB creates it through the assignment\r\n% to |y(1)|. The second time through the loop assigns to |y(2)|, which\r\n% doesn't exist yet. So MATLAB allocates enough new memory for two\r\n% elements and then copies |y(1)| and assigns |y(2)| into the new space.\r\n% The third time through the loop assigns to |y(3)|, which doesn't exist\r\n% yet. So MATLAB allocates enough new memory for three elements and then\r\n% copies |y(1)| and |y(2)| and assigns |y(3)| into the new space.\r\n%\r\n% See the pattern? On the k-th time through the loop, MATLAB has to make a\r\n% copy of k elements into newly allocated memory. The time required to copy\r\n% k elements is proportional to k. The time required to execute the entire\r\n% loop is therefore proportional to the sum of the integers from 1 to n,\r\n% which is n*(n+1)\/2.\r\n%\r\n% Bottom line: memory copying during the assignment statement causes the\r\n% time required to execute the loop to be proportional to n^2.\r\n%\r\n% Let's see some data. I timed |flibbergibble| for various values of |n|\r\n% using MATLAB R2010b. Here's the data:\r\n\r\nn = [100 200 500 1000 2000 10000 20000 40000 50000];\r\nt = [1.2e-4 2.7e-4 8.6e-4 2.4e-3 2.8e-3 5.8e-2 .24 1.32 2.4];\r\n\r\nplot(n,t)\r\ntitle('n versus t, MATLAB R2010b')\r\n\r\n%%\r\n% That looks pretty parabola-like. I checked it out with |cftool| from the\r\n% Curve Fitting Toolbox and found that a rough fit to the data is the curve\r\n% |1.0e-9 * n^2|. What if we tried |n = 1000000|?  From our curve we could\r\n% estimate that it would take about 17 minutes:\r\n\r\nn = 1e6;\r\nt_minutes = 1e-9 * n^2 \/ 60\r\n\r\n%%\r\n% This is what triggers the tech support call where the user thinks that\r\n% the function |foobar| has caused MATLAB to hang. MATLAB isn't hung, it's\r\n% just working really hard reallocating all that memory.\r\n%\r\n% The solution is to _preallocate_ space for |y| before beginning the loop,\r\n% like this:\r\n\r\ntype flibbergibble2\r\n\r\n%%\r\n% The code above does not have to reallocate any memory each time through\r\n% the loop. In MATLAB R2010b on my computer it runs in less than a second:\r\n%\r\n%   tic, flibbergibble2(1000000); toc\r\n%   Elapsed time is 0.737451 seconds.\r\n%\r\n% The big change in MATLAB R2011a is that MATLAB uses a better strategy for\r\n% reallocating memory as the array grows when it detects this growth\r\n% pattern. As a result, the time required to execute the loop grows only\r\n% linearly with n.\r\n%\r\n% Here's how it takes for |n = 1000000| in MATLAB R2011a on my computer:\r\n%\r\n%   tic, flibbergibble(1000000); toc\r\n%   Elapsed time is 1.105256 seconds.\r\n%\r\n% That's in the neighborhood of a thousand times faster than R2010b. You\r\n% can see that the preallocation strategy (in |flibbergibble2|) is still a\r\n% bit faster, but the performance penalty for not preallocating has been\r\n% substantially reduced.\r\n%\r\n% If the final size of your output array is trivial to calculate, then I\r\n% would say go ahead and preallocate. It's just one extra code line and\r\n% it's a little faster. But if the final size is complicated to calculate,\r\n% then maybe you don't have to work as hard you did before to optimize your\r\n% code. If you are growing a vector, or if the array growth is in the final\r\n% dimension of a matrix or multidimensional array, you can now seriously\r\n% consider just letting MATLAB do its own \"automatic array growth\" thing.\r\n##### SOURCE END ##### cc75abd5643d4f8794c6bed3db3c2488\r\n-->","protected":false},"excerpt":{"rendered":"<p>\r\n   One of the most significant MATLAB improvements in the R2011a release is in how certain kinds of array growth operations are\r\n      handled. This is a kind of \"under-the-hood\" change that I... <a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/steve\/2011\/05\/16\/automatic-array-growth-gets-a-lot-faster-in-r2011a\/\">read more >><\/a><\/p>","protected":false},"author":42,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[68,803,52,805,130],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts\/371"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/users\/42"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/comments?post=371"}],"version-history":[{"count":1,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts\/371\/revisions"}],"predecessor-version":[{"id":3729,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/posts\/371\/revisions\/3729"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/media?parent=371"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/categories?post=371"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/steve\/wp-json\/wp\/v2\/tags?post=371"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}