Conventional wisdom for programming MATLAB used to be that using for loops automatically forced a program to suffer from poor performance. Since MATLAB R13 (version 6.5), MATLAB has taken advantage of some innovations that accelerate many for loops so the code has performance on par with either vectorized code or code written in a lower level language such as C or Fortran. Obviously, details matter here. One thing that most people, even at MathWorks (!) don't appreciate is that the for loop has richer behavior than simply looping over single elements at a time. An informal hallway survey near my office found that even among experienced MATLAB programmers, far fewer than 50% knew about this behavior. Time to come clean.
Exploring the Behavior of 'for'
The MATLAB for statement is both more powerful and more subtle than many people realize because of the way it lets you iterate directly over an array rather than making use of explicit indices or subscripts. Here's an example that displays the logarithms of the positive numbers in a row vector:
x = [1 pi -17 1.3 289 -exp(1) -42];
for pnum = x(x > 0)
    disp('Iterating')
    log(pnum)
end
Iterating
ans =
0
Iterating
ans =
1.1447298858494
Iterating
ans =
0.262364264467491
Iterating
ans =
5.66642668811243
This code is streamlined in comparison to the following, which uses an explicit index ind, with equivalent results (except for the details on the output that I added here).
pnums = x(x > 0);
for ind = 1:numel(pnums)
    disp(['Iterating... Value #',int2str(ind)])
    log(pnums(ind))
end
Iterating... Value #1
ans =
0
Iterating... Value #2
ans =
1.1447298858494
Iterating... Value #3
ans =
0.262364264467491
Iterating... Value #4
ans =
5.66642668811243
Behavior of for
To use for as in the first example, you need to understand that for loops do not iterate over the first dimension of an array, but only over dimensions 2 to the maximum array dimension. See this by transposing the input vector (turning it into a column).
xt = x.'
for pnum = xt(xt > 0)
    disp('Iterating')
    log(pnum)
end
xt =
1
3.14159265358979
-17
1.3
289
-2.71828182845905
-42
Iterating
ans =
0
1.1447298858494
0.262364264467491
5.66642668811243
We see one iteration, with pnum taking on the 4-by-1 value [1 pi 1.3 289]'. In contrast, the version with the explicit index works the same way whatever the shape of x.
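If the orientation of the input is not known in advance, one common idiom (not part of the original example) is to force the values into a row before looping. Here is a minimal sketch; the names vals and count are illustrative only.

```matlab
x = [1 pi -17 1.3 289 -exp(1) -42].';  % column vector on purpose
vals = x(x > 0);                        % logical indexing keeps the column shape
count = 0;                              % number of iterations actually executed
for pnum = vals(:).'                    % (:).' reshapes to a 1-by-N row
    count = count + 1;
    log(pnum);                          % pnum is a scalar on every pass
end
disp(count)
```

With the reshape in place, the loop visits each positive value separately whether x started out as a row or as a column.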
More Examples
This iterates over s.
for s = [1,-2,8,pi,17], disp('Iterating'), disp(s), end
Iterating
1
Iterating
-2
Iterating
8
Iterating
3.14159265358979
Iterating
17
This doesn't iterate, but processes the entire column as one entity.
for s = [1,-2,8,pi,17]', disp('Iterating'), disp(s), end
Iterating
1
-2
8
3.14159265358979
17
What about Higher Dimensions?
This iterates over the 2nd and 3rd dimensions of A.
A = (1:12); A = reshape(A,[2 2 3])
A(:,:,1) =
1 3
2 4
A(:,:,2) =
5 7
6 8
A(:,:,3) =
9 11
10 12
for k = A, disp('iterating'), disp(k), end
iterating
1
2
iterating
3
4
iterating
5
6
iterating
7
8
iterating
9
10
iterating
11
12
The documentation makes it clear that for does not iterate over the first dimension. It does iterate over all of the dimensions of A except for the first (row) dimension. You can predict the number of iterations by evaluating this.
numel(A,1,':')
ans =
6
Returning to the first example, numel(x(x > 0),1,':') evaluates to 4 when x is a row vector and 1 when x is a column vector.
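An alternative way to predict the iteration count, using only size and two-subscript indexing, is sketched below. It assumes the same A as above; predicted and actual are illustrative names.

```matlab
A = reshape(1:12, [2 2 3]);
predicted = size(A(:,:), 2);  % A(:,:) folds dimensions 2..end together: 2-by-6
actual = 0;
for k = A
    actual = actual + 1;      % count the passes through the loop
end
disp([predicted actual])
```

Both numbers should agree, because each pass of the loop consumes one column of the folded 2-by-6 view.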
Possible Uses
Why use this version of for? Suppose you have a large dataset and the vectorized calculations can't take full advantage of functions such as bsxfun. You may have a case where the memory tradeoff for vectorization is too high. But you don't want to make function calls for each array element since the function call overhead can get high as well. A possibly good compromise is to essentially process the data in chunks, perhaps by 'virtual' columns. That way the function call overhead is more limited as is the memory. To get the best outcome for this approach, you should preallocate the output array and assign into it with proper indexing as you loop.
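As a rough sketch of the chunking idea, assume the data length is a multiple of a hypothetical chunk size (chunkLen), and that the per-chunk computation is a simple placeholder (a sum of squares). None of these names come from the original post.

```matlab
data = 1:12;                           % stand-in for a large data vector
chunkLen = 3;                          % assumed chunk size; must divide numel(data)
chunks = reshape(data, chunkLen, []);  % each column is one 'virtual' chunk
out = zeros(1, size(chunks, 2));       % preallocate one result per chunk
k = 0;                                 % manual counter; the loop variable holds data
for chunk = chunks
    k = k + 1;
    out(k) = sum(chunk.^2);            % placeholder per-chunk computation
end
disp(out)
```

The body runs once per chunk rather than once per element, and the output is written into preallocated storage using the manually maintained counter k.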
What Do You Use?
Did you know about this for behavior? Do you ever take advantage of it? What strategies do you use for trading off memory use and function call overhead? Let me know here.
Published with MATLAB® 7.9



I knew about the behavior of for; however, I did NOT know I could use multiple inputs for NUMEL!
What are some of the accelerating innovations you spoke of in the introduction, and how can I craft my code to take advantage of them?
Interesting, I didn’t know that. Though, this is actually quite predictable given the way MATLAB operates on matrices.
The statement “you need to understand that for loops do not iterate over the first dimension of an array, but only over dimensions 2 to the maximum array dimension” is a little hard to grasp, but thinking the following way makes it easier for me:
MATLAB favors doing things over columns and treats each column as a single entity (or input). It takes one column, processes it, goes on to the next column, and so on.
This helps me understand the for loop behaviour, as well as other MATLAB behaviour such as taking the mean of a matrix.
If you take the mean of a matrix, MATLAB treats each column as an entity, one at a time, and produces a row vector consisting of the mean values for each column. The same goes for fft and other functions like sum.
One other tip for remembering this might be as follows:
Think of the elements of a column as members of a set, so if an operation needs to be done, it is done on the whole set. The next column, on the other hand, is a completely different set, so operations on that set are not dependent on the previous column (set). This is kind of like the behaviour of spreadsheets.
PS: Of course, this is all true for matrices that are not single dimensional. It takes the mean of a vector either way, if it is a row or a column.
Jessee-
There’s nothing explicit you can do. MATLAB tries to accelerate the code where it can, depending on the code pattern and calculations. In some cases, code gets converted to machine operations and runs, e.g., a = a+1 can sometimes be done in-place.
–Loren
The speed of modern FOR loops has been demonstrated over and again on the newsgroup. Oftentimes someone will ask for help in vectorizing a FOR loop in order to decrease runtime, only to find out that all proposed vectorizations are slower than either their original loop, or a cleaned up version.
As for the behavior you describe, I learned it from reading a book way back with version 6! Hard to believe it has been so long.
Hi Loren.
Thanks for this clarification. I knew about this behaviour of for (although I did not know you talk about “richer behaviour” in the first paragraph). However, I did not learn this from the documentation, rather I learned this the hard way – by tracing errors that came out as a result of the difference in processing row and column vectors.
Several years ago I actually realized that the
command in MATLAB is actually a
command. At that time I tried to substitute looping over indices of arrays by looping over the array elements where possible. But I found that I need the array element numbers in almost all loops I create. I found myself adding
commands inside the loops over the elements, so that I returned to the usual practice of looping over array indices.
Loren, one question. When you speak about the possible use of this column-wise for behavior to process large data in chunks, do you actually suggest that we should reshape the vector into a matrix to allow for this behaviour? And if we do, how should we allocate the result array? Should it be a vector or a matrix of the same size as the input? How should we properly index into the resulting array if the column-wise for loop does not provide any iteration counter that would allow us to index the right chunk in the resulting array?
-Petr
Very interesting. I learnt something here. Thanks, Loren.
Petr-
I am not sure exactly what you mean, but if I understand correctly, you do not have to reshape your data into a 2-D array. You simply use 2-D indexing to access all the remaining dimensions. If you preallocate the output as an ND array, then index into it with the 2:ndims dimensions collapsed all into the second one, and the code should just work correctly. Or preallocate the output as a 2D array, fill it in the loop, and reshape the output after the loop finishes.
As for the iteration counter, if you don’t have a natural one, you will need to create one.
–Loren
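The reply above can be sketched as follows. This is only an illustration under simple assumptions: the per-column operation is a doubling, and the counter k is created by hand.

```matlab
A = reshape(1:12, [2 2 3]);  % example N-D input
out = zeros(size(A));        % preallocate the output as an N-D array
k = 0;                       % hand-made iteration counter
for col = A
    k = k + 1;
    out(:, k) = 2 * col;     % second subscript addresses dims 2..end collapsed
end
disp(isequal(out, 2*A))
```

Because k never exceeds the number of folded columns, the two-subscript assignment fills the existing N-D array without changing its size.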
This is a very interesting feature that apparently was there and I never knew about before. It will definitely help to write more concise code, but there is an unanswered question for me regarding the results of the calculation in the for loop. Say, I calculate something during each iteration and I want to save the results to an array. When I do it the old-school way, there is an index i and I say
x = linspace(a, b, N);
for i=1:numel(x)
    y(i) = something depending on x(i);
end
Now if I iterate directly on x, how do I know which element of x I am working on, so that I put the output in y in the correct position? Is there a simple way of knowing the location of an element in the iterating vector? I hope the question is clear enough to be answered.
I knew about this feature of “for” loop and I use it very often. Looking back at the comments on this post, I suspect the result is better than your hallway survey ! :)
Thanks for the post.
Thanks for another good column. But, I think that for the purposes of writing clear code, I’ll stick to the basic syntax like:
indx = find(x > 0);
for ii = 1 : length(indx)
    disp('Iterating')
    log(x(indx(ii)))
end
However, I’d like to comment that even though vectorizing may not be faster than using the for loop, I would rather use it since I can eliminate at least 2 lines of code and I think the code is more clear.
Hooman-
You would need to create, update and keep track of your own counter if your for loop variable didn’t provide a useful index.
–Loren
Loren,
Sorry if I did not express myself clearly. Your point was that the behaviour of for can be used for processing e.g. a large vector in chunks. I partially agree, but only in the case when something trivial is to be done with those chunks, e.g. if they should be displayed, as in the following code.
My point was that if we want to do anything a bit more complex, e.g. double the entries in the vector, we have to use some form of indexing. We can use all of the following three code snippets (using the xlong and temp variables from the previous code):
This is for me somewhat non-intuitive since we access the same entries in both matrices, but the way we access them is not the same. For me, it is better to use looping over the indices, which brings the symmetry:
Or, we can use the original vector without reshaping:
% Double each chunk by indexing the original vector directly (no reshape)
doublex = zeros(size(xlong));
for i = 1:nchunks
    doublex((i-1)*chunklength+1:i*chunklength) = ...
        2*xlong((i-1)*chunklength+1:i*chunklength);
end
My point is that the processing of vector entries by chunks is not a very good example of things that can be suitably solved by the column-wise behaviour of for, unless you want to do something trivial… actually I cannot come up with anything other than displaying them.
I hope I have cleared that up. Thanks,
- Petr
I did know of the column-wise behaviour of ‘for’, but again only by bug-finding rather than from the documentation.
A point that is often useful to me, but often not used in code that I see, is that a cell array (typically of strings, which don’t play nicely with normal arrays if of variable length) can also be used like this. Of course, l being a row-vector is important.
l = { 'string1', 'second one', 'the third' };
for s = l,
    disp(s); iscell(s), numel(s), disp(' ');
end
Thanks for this post.
I have a mex function that takes approximately 11 seconds to do one iteration in a for loop. Before reading this post, I was using a row vector to iterate through the function to plot comparisons when given different inputs. Using the row vector took way too much time, whereas when using the column vector approach it only takes 11 seconds regardless of how long my vector is.
I’m learning more Matlab tricks everyday, so it seems, thanks to these blogs and the cssm newsgroup.
-Nathan
Neat.
I had a hunch that this is how it would work, but I wasn’t aware of the importance of dimension. That is why it never worked when I tried it.
Thanks for clearing that up. This is why I read this blog.
Thanks, Loren,
I kind of knew about the for-loop behaviour except I never use multi-dimensional arrays – I just don’t think in more than 2 dimensions so it doesn’t occur to me to use them.
Treating a column vector as a single element used to catch me out quite often. I’ve got a feeling the behaviour changed between versions 2 and 3 when a column vector stopped being iterated over (but my memory might be playing tricks – it was a long time ago!).
I got into the habit of doing
to avoid having to remember if a vector was a row or column.
These days I just make sure I always create row vectors instead of column vectors.
BTW I’d rate the just-in-time speed ups for for-loops as a fantastic achievement. Along with MCOS, it’s one of the two things that has kept me from straying away from Matlab.
Hi Loren! I often need to iterate over a cell array. The iteration variable is always of type cell. The following statement
for k={'a' 'bb'}
    disp([k{:} ' is of type ' class(k)])
end
returns
I’d prefer the iteration variable to take the type of the actual cell entry, such that I don’t have to expand the cell entry by k{:}. I’d like to do
for k={'a' 'bb'}
    disp([k ' is of type ' class(k)])
end
which returns
This would be more convenient in my opinion
-Kusi
Well folks, seems like this at least affirmed suspicions if not outright taught you something.
Kusi-
To get the behavior you want, you need to extract the contents of k since k is a length-1 cell array. So instead of k, use k{1}.
–Loren
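A small sketch of that advice follows. The variable names (names, lens, s) are made up for illustration, and the cell array is assumed to be a row.

```matlab
names = {'a', 'bb'};            % 1-by-N cell array of character vectors
lens = zeros(1, numel(names));  % preallocate one result per element
k = 0;
for c = names
    k = k + 1;
    s = c{1};                   % c is a 1-by-1 cell; c{1} is its contents
    lens(k) = numel(s);
    disp([s ' is of type ' class(s)])
end
```

Inside the loop, c is still a cell, while s is the character vector it holds, so string operations apply to s directly.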
This is a very interesting and helpful post.
Here is a pathological example that demonstrates a semantic difference between iterating over a row or a column:
>> zzz = 1:10;
>> tic; for k=zzz yyy(k,:) = [1,2,3]*k; end; toc
Elapsed time is 0.000791 seconds.
>> tic; for k=zzz' yyy(k,:) = [1,2,3]*k; end; toc
??? Error using ==> mtimes
Inner matrix dimensions must agree.
Similarly, here is an example that shows a performance difference:
>> zzz = 1:1000;
>> tic; for k=zzz zzz = zzz + 1; end; toc
Elapsed time is 0.007849 seconds.
>> tic; for k=zzz' zzz = zzz + 1; end; toc
Elapsed time is 0.000585 seconds.
This indicates that for loops implement a particular automatic vectorization. It looks like programmers still need to be aware of performance pitfalls of non-vectorized for loops, but need not be as meticulous about vectorizing as in the past.
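One way to check what drives the timing difference above is to count how many times each loop body runs. The sketch below uses illustrative counter names, and its reading of the timings is an interpretation based on the column behavior described in the post, not a measurement.

```matlab
zzz = 1:1000;
nRow = 0;
for k = zzz            % row input: one pass per element
    nRow = nRow + 1;
end
nCol = 0;
for k = zzz.'          % column input: a single pass, k is 1000-by-1
    nCol = nCol + 1;
end
disp([nRow nCol])
```

If the counts differ as expected, the column version is faster simply because its body executes once, and the mtimes error in the first example follows from k being a whole column instead of a scalar.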
Thanks for explaining so clearly a very under-utilized aspect of the core Matlab language. I must admit that until now I have seldom used this feature, preferring to index-loop rather than ensuring a row-wise input vector. However, with your post I have “seen the light”. Thank you – it is not often that I learn new tricks in the core language.
Yair
I’ve known that “for” looped over columns of a 2D array for as long as I remember. Looping over successive dimensions of N-D arrays was news to me. I think I actually made use of this behavior once for the 2D case, but I can’t find the example. However, the documentation can be improved. With R2009A, “help for” yields:
“The general form of a FOR statement is:
FOR variable = expr, statement, …, statement END
The columns of the expression are stored one at a time in the variable and then the following statements, up to the
END, are executed.”
Clearly states looping over the columns, but could be more clear about what this means for N-D arrays.
However, typing “doc for” yields:
“for
Execute block of code specified number of times
Syntax
for x=initval:endval, statements, end
for x=initval:stepval:endval, statements, end
Description
for x=initval:endval, statements, end repeatedly executes one or more MATLAB statements in a loop.”
Needless to say, the syntax section is misleading at best because it doesn’t show the general case, and there is no discussion at all about looping over columns.
Another page in the documentation under Program Control Statements does discuss the case where the expr is a 2D array, but doesn’t discuss the N-D case.
Paul-
Thanks for the comments. The documentation has been getting a facelift. I’ll make sure the writer sees your comments as well.
–Loren
I often loop over struct arrays, or over a list of row vectors, and calculate the index on the fly, rather than looping over the index. Basically, I try not to think of for and switch statements as their C equivalents, but as higher-level constructs, that can and should be used to pass around something informative rather than passing indices to the information.
I guess the example below takes a bit of a hit in memory consumption when it constructs the array [Start;Step;Stop], but it’s still smaller than the memory needed for the output string. I find that, compared to looping over the index, it keeps the code cleaner to write it this way:
%% assemble output string
SubSeq= cell(1,nnz(TF));
ix= 1;
for SSS= [Start(TF); Step(TF); Stop(TF)]
    switch SSS(2)
        case 0 % single element
            SubSeq{ix}= sprintf(' %g,', SSS(1));
        case 1 % unit spacing
            SubSeq{ix}= sprintf(' %g:%g,', SSS([1 3]));
        otherwise
            SubSeq{ix}= sprintf(' %g:%g:%g,', SSS);
    end
    ix= ix+1;
end
Str= ['[',SubSeq{:}];
Str(end:end+1)=' ]';
In fact, I so often end up with an array or cell array with one column for each iteration of the for loop, and needing the index only for writing into that, that I’d like to have the syntax extended so I can let Matlab keep track of the index. Perhaps one could specify a destination like so:
and each time the loop ends, the column (or element) in YY corresponding to the position of X in XX would be assigned to whatever ans is at that time. YY could be a numeric array or struct array or cell array, pre-existing or automatically extended… anything that could be on the left-hand side of an assignment, basically. Maybe it helps to think of what I want as:
YY= for X = XX
    f(X)
end
but that syntax doesn’t do much for clarity. I know that I can probably get arrayfun or one of its ilk to do what I want, but I find code using for loops easier to read.
Excellent post. No, I wasn’t aware of that property of for structures.
Thank you!!
Unfortunately, when using this method to loop over cell elements, we get a cell-encapsulated value rather than the expected cell element. This is counter-intuitive and results in errors. Consider this:
data = {1,3,5,7};
for dataValue = data
    disp(dataValue==2);
end
This raises an exception: “Undefined function or method ‘eq’ for input arguments of type ‘cell’.” The reason is that while we would expect dataValue to contain the numeric values 1,3,5 and 7, in reality dataValue gets {1},{3},{5} and {7}.
This in turn is probably due to the fact that internally Matlab probably sets dataValue=data(idx) and for cell arrays this returns a cell. Instead, for cells it should internally set dataValue=data{idx}.
Yair
Yair,
I respectfully disagree that the behavior should be changed. I do understand that this can cause confusion. I think changing it could cause even more. MATLAB for loops index consistently into whatever is given for the index expression, using (). To change it to do something different for different types would be difficult behavior to explain and for users to guess as their expression produced different types.
–Loren
Hi Loren – you are correct that the () indexing is consistent and therefore obviously not a bug. However, numerous Matlab functions act differently on numeric arrays vs. cell arrays, in a manner that would seem “natural” depending on the input type. For example:
ismember(4,1:4)
ismember('qwe',{'qwe','wer','ert'})
I would be interested to learn what an informal corridor survey (as you mentioned above) would tell us about the expectations by Matlab programmers from the following:
for value = {'qwe','wer','ert'}
    disp(value) % cell value or string?
end
I expect most people surveyed would expect a string value, so I think an enhancement request for this is in order here.
All the best,
Yair
Yair—It is reasonable to request an enhancement to provide an easier way to iterate over a set of string values. In my opinion, though, it is pointless to request an enhancement to alter the semantics of for as you describe, because (in my opinion!) there is no chance that will happen.
There’s a simple way to obtain both the element and the index at the same time, without updating a counter. It only makes sense when iterating over the elements of a vector, though. It works similarly to Sven Bossuyt’s proposed syntax. Just concatenate a list of indexes and the original vector.
Forgot the “end” to close the For :)
By the way, great article as usual!
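The concatenation idea described in this comment might look like the sketch below. It is a reconstruction from the prose only, assuming a numeric row vector; the names p, idx, and val are illustrative.

```matlab
x = [10 20 30];          % numeric row vector to iterate over
y = zeros(size(x));      % preallocated output
for p = [1:numel(x); x]  % each column of the 2-by-N array is [index; value]
    idx = p(1);
    val = p(2);
    y(idx) = val^2;      % placeholder computation written back by index
end
disp(y)
```

Since both rows are concatenated into one numeric array, this only suits data that can share a numeric type with the index values.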
After reading the last few comments, I briefly wondered why matlab doesn’t include a ‘foreach’ type command to supplement the ‘for’ command for cases such as the for ‘A = cell_arrayB’ type syntax specified by Yair above. Several difficulties would ensue, including strange behavior if the contents of ‘cell_arrayB’ were modified during the loop (such as removing an individual cell). In particular, the extra complication in dealing with the constraints of the foreach command would make the new command very challenging to use. In addition, since many functions already handle cell arrays natively and process each element in a cell array separately, there would be very little benefit.
I think that João Henriques’ suggestion above is particularly interesting. That ‘for’ syntax is quite a helpful tip!
Thanks again to Loren for sparking these interesting exchanges!
Loren: Is it possible to design a function that will operate on a matrix in an element-by-element basis?
Specifically, suppose I have a 2D matrix of 1s and 0s (e.g. a black and white image). To ‘sharpen’ the image I can construct a ‘BOX’ function that evaluates the sum of the 1s in, say, a 3×3 area around each matrix element and then substitutes a 1 if the sum is, say, greater than 3 and 0 if less than or equal to three. This would use a nested for loop.
Thanks in advance.
Terry Deglow
Terry-
You might want to check out tools such as blockproc in the Image Processing Toolbox.
–Loren
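Besides blockproc, another option in core MATLAB (not mentioned in the reply) is conv2, which can compute the 3-by-3 box sum without an explicit nested loop. This sketch assumes the sum includes the centre pixel and that pixels outside the image count as 0; the threshold of 3 comes from the question.

```matlab
BW = [0 0 0 0;
      0 1 1 0;
      0 1 1 0;
      0 0 0 0];                       % small example image of 0s and 1s
boxSum = conv2(BW, ones(3), 'same');  % sum over each 3-by-3 neighbourhood
sharpened = boxSum > 3;               % 1 where the sum exceeds 3, else 0
disp(sharpened)
```

For this example only the central 2-by-2 block has a neighbourhood sum above 3, so it is the only region set to 1.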
Dear Loren
First off, thank you for your contribution via your blog – it helps a lot.
Okay, so my problem is the following. I have a massive number of x,y coordinates and need to do some calculations on them. I used for loops, but it takes forever. Below is a simple piece of code that explains what I am trying to do:
x = [ 0.2 0.3 0.2 0.1];
y = [ 0.8 0.4 0.5 0.2];
e = [];                      % initialize the result vector
for i = 1:numel(x)
    ind = 0;
    c = [];                  % reset so stale values are not appended
    for j = i+1:numel(x)-1
        ind = ind + 1;
        d = abs(y(i)-y(j));
        c(ind) = d + x(i)+x(j);
    end
    e = [e c];
end
The idea is to end up with a vector e (that is a function of x and y) that I can pass as a function to another function.
So, my idea is to avoid a double for loop by having long (very long) vectors (sacrifice memory for runtime): thus, to calculate d
vectori = [0.8 0.8 0.8 0.4 0.4 0.5];
vectorj = [0.4 0.5 0.2 0.5 0.2 0.2];
d = abs(vectori - vectorj);
The fact that j runs from i+1 to numel(x)-1 makes this also more challenging for me. How can I build vectori and vectorj for this application?
Hope to hear from you soon.
Kind regards,
Barend.
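One possible way to build vectori and vectorj is sketched below. It assumes every pair with i < j is wanted, which matches the example vectors in the question but not the numel(x)-1 upper bound of the inner loop. The names pairs, I, and J are illustrative. Note that the number of pairs grows quadratically with numel(x), so memory may become the limiting factor.

```matlab
x = [0.2 0.3 0.2 0.1];
y = [0.8 0.4 0.5 0.2];
pairs = nchoosek(1:numel(x), 2);  % each row is [i j] with i < j
I = pairs(:, 1).';                % row vector of first indices
J = pairs(:, 2).';                % row vector of second indices
vectori = y(I);
vectorj = y(J);
d = abs(vectori - vectorj);       % all pairwise |y(i) - y(j)|
e = d + x(I) + x(J);              % same formula as in the loop body
disp(e)
```

If the narrower bound in the original loop is really intended, the rows of pairs whose second column equals numel(x) could be dropped before indexing.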