I was talking to my long-time colleague, Mike Croucher, who joined MathWorks team recently (yay!). About a bunch of interesting topics, some of which could be good fodder for a blog post. Today I want to talk about for
In my old Fortran programming days (yes, long gone), I learned early on that my code ran faster if I arranged my nested for-loops optimally. In Fortran, if I were computing one element at a time in a matrix, it meant making sure I filled the elements sequentially, which meant looping over each row, 1 column at a time, because that's the way Fortran (and MATLAB) stores arrays. This is not how arrays are stored in C.
I was curious what happens in MATLAB today if I try to run nested loops in different orders. So let's see!
Straight-forward Loop Solution, Inner Loop Rows
Of course I preallocate my output array, Ar.
[nrows,ncolumns] = size(Ar);
Ar(rows,columns) = rows*(rows+columns^3)+17;
Straight-forward Loop Solution, Inner Loop Columns
And now the other order.
[nrows,ncolumns] = size(Ac);
Ac(rows,columns) = rows*(rows+columns^3)+17;
And, of course, the vectorized version.
Av = tmpr .* (tmpr+(1:ncolumns).^3) + 17;
Let's make sure they also produce the same results.
areAnswerEqual = isequal(Ac,Ar,Av)
And what about the execution times?
In MATLAB, using nested for-loops, clearly the order of the loops matters from a timing perspective. So it would be wise to order them knowing that the data is stored column by column, if you can write your algorithm that way. And I did not show what the effects might be if we didn't preallocate the output matrices. You can try that for yourselves.
Despite getting a pretty good speedup by wrangling the nested loops, it's still often preferable, from a speed standpoint, to try to vectorize the code. Notice that I am using implicit expansion
to compute Av
. Also note that you won't always get the same ratio of speed difference between loops and the equivalent vectorized version - it will depend very much on how much work is going on for each computation and how much overhead arises.
When using loops in MATLAB, from a speed viewpoint, it's best if you can set up the loops so the elements are accessed as close to sequentially as possible. You can see from the small example above that running down the columns gives superior performance vs. running across the rows. This is because of the way MATLAB stores arrays.
Have you run into a situation where reordering loops helped you out? Let us know here
Copyright 2021 The MathWorks, Inc.