Steve on Image Processing

Automatic array growth gets a lot faster in R2011a 10

Posted by Steve Eddins,

One of the most significant MATLAB improvements in the R2011a release is in how certain kinds of array growth operations are handled. This is a kind of "under-the-hood" change that I think is exciting and deserves some publicity.

For those of you who are MATLAB power users and understand preallocation and why it is necessary, I'll tell you the story's ending first: MATLAB R2011a performs much better than previous MATLAB versions when performing repeated array growth in a loop.

For everyone else, let's start with some not-so-hypothetical tech support questions:

  • "Why is the MATLAB function foobar so slow?"
  • "Why does MATLAB hang if I put a call to foobar in a loop?"
  • "Why does the function foobar run so much faster at the command line than when I put it in a function?"

The tech support engineer will start digging into the question to see why the unhappy MATLAB user has reached his conclusion:

"Well, I ran the profiler on my function flibbergibble, which calls foobar:

"You can see that foobar takes about 0.35 ms per call (70.93 / 200000), but when I call foobar 200000 times from the command line in a loop it only takes 8.5e-7 seconds per call:

 tic, for k = 1:200000, foobar(k); end, toc
 Elapsed time is 0.173287 seconds.

"That's more than 400 times faster! Why is foobar so slow in a function?"

Experienced MATLAB users know that the performance problem has little to do with foobar(k) on line 4 of the profile report and much to do with y(k) = on the same line.

Let's examine why. The first time through the loop, the variable y hasn't been assigned to yet, so MATLAB creates it through the assignment to y(1). The second time through the loop assigns to y(2), which doesn't exist yet. So MATLAB allocates enough new memory for two elements and then copies y(1) and assigns y(2) into the new space. The third time through the loop assigns to y(3), which doesn't exist yet. So MATLAB allocates enough new memory for three elements and then copies y(1) and y(2) and assigns y(3) into the new space.

See the pattern? On the k-th time through the loop, MATLAB has to make a copy of k elements into newly allocated memory. The time required to copy k elements is proportional to k. The time required to execute the entire loop is therefore proportional to the sum of the integers from 1 to n, which is n*(n+1)/2.

Bottom line: memory copying during the assignment statement causes the time required to execute the loop to be proportional to n^2.

Let's see some data. I timed flibbergibble for various values of n using MATLAB R2010b. Here's the data:

n = [100 200 500 1000 2000 10000 20000 40000 50000];
t = [1.2e-4 2.7e-4 8.6e-4 2.4e-3 2.8e-3 5.8e-2 .24 1.32 2.4];

plot(n,t)
title('n versus t, MATLAB R2010b')

That looks pretty parabola-like. I checked it out with cftool from the Curve Fitting Toolbox and found that a rough fit to the data is the curve 1.0e-9 * n^2. What if we tried n = 1000000? From our curve we could estimate that it would take about 17 minutes:

n = 1e6;
t_minutes = 1e-9 * n^2 / 60
t_minutes =

   16.6667

This is what triggers the tech support call where the user thinks that the function foobar has caused MATLAB to hang. MATLAB isn't hung, it's just working really hard reallocating all that memory.

The solution is to preallocate space for y before beginning the loop, like this:

type flibbergibble2
function y = flibbergibble2(n)

y = zeros(1, n);
for k = 1:n
    y(k) = foobar(k);
end

The code above does not have to reallocate any memory each time through the loop. In MATLAB R2010b on my computer it runs in less than a second:

 tic, flibbergibble2(1000000); toc
 Elapsed time is 0.737451 seconds.

The big change in MATLAB R2011a is that MATLAB uses a better strategy for reallocating memory as the array grows when it detects this growth pattern. As a result, the time required to execute the loop grows only linearly with n.

Here's how it takes for n = 1000000 in MATLAB R2011a on my computer:

 tic, flibbergibble(1000000); toc
 Elapsed time is 1.105256 seconds.

That's in the neighborhood of a thousand times faster than R2010b. You can see that the preallocation strategy (in flibbergibble2) is still a bit faster, but the performance penalty for not preallocating has been substantially reduced.

If the final size of your output array is trivial to calculate, then I would say go ahead and preallocate. It's just one extra code line and it's a little faster. But if the final size is complicated to calculate, then maybe you don't have to work as hard you did before to optimize your code. If you are growing a vector, or if the array growth is in the final dimension of a matrix or multidimensional array, you can now seriously consider just letting MATLAB do its own "automatic array growth" thing.


Get the MATLAB code

Published with MATLAB® 7.12

10 CommentsOldest to Newest

Hi Steve

Thanks for the post. I have a few questions:

- Does this feature work for cell arrays too?

- Is it tied to for loops, or can this behaviour be triggered anywhere an array is growing?

- Presumably this features works under the hood by growing the actual array in memory by more than 1 element at a time (such as doubling the allocated size each time it runs out of space). This means that upon finishing the loop, the memory allocated for the array will be usually be larger than necessary. Does this allocation automatically get ‘trimmed’ to the actual length once the loop finishes, or at any other time? Or do we potentially have the situation where an array has 2^25+1 elments but it keeps a 2^26 element allocation size?

Thanks Steve, interesting post.

One question always struck me is about pre-allocation of cell arrays. Mlint pointed out that my loop might be growing on each iteration, but when the variable in question is a cell array that holds heterogeneous data, I wondered how sizing it in advance really helped without knowing the size of the data to be put in the cells!

Julian.

1000x faster is quite an achievement! Unfortunately, part of me can’t shake the feeling that it’s a lot of effort to expend to improve MATLAB support of what’s essentially poor coding practice.

Wes and Julian—I will try to post additional details later this week.

Gary—I completely disagree. When you don’t know the the size of the array at the beginning of the loop, it can be extremely cumbersome to implement automatic array growth yourself in order to make the implementation scalable. I know this because I’ve had to do it many times. Now I simply don’t have to worry about it because MATLAB is an even better array-oriented language than it already was.

Not preallocating was regarded as a poor coding practice because MATLAB didn’t handle it well. Now MATLAB handles it well.

Also, the feature works for array growth by repeated concatenation, something I didn’t mention in the post.

This is something that allocators in the C++ STL have done for approximately forever. I’m glad to see that MATLAB now has a reasonable allocation strategy.

Tom—I agree. I have been annoyed that the C++ vector class handled this sort of thing better than previous versions of MATLAB.

It’s great to see this feature added, and long overdue. There are plenty of problems where it’s difficult or impossible to calculate the array size beforehand, especially combinatoric optimization problems such as tree searches. The solution there is of course to use size doubling but this new feature will save the hassle.

Great! I once noticed that setting values in a sparse array to zero, one by one, in a loop is very slow. Has this been fixed? (I haven’t yet upgraded to test it myself).

This is great, there are always instances where the output size is unknown, for example the foobar function there could return a variable number of elements.

The trick I’ve used so far (and with great results, I might say) is to store results in a cell array and defer concatenation until after the loop.

y = cell(1, n);
for k = 1:n
    y{k} = foobar(k);
end
y = cat(1, y{:});

It still doesn’t help in the extreme case of a WHILE loop where you can’t guess the maximum number of iterations.

These postings are the author's and don't necessarily represent the opinions of MathWorks.