Loren on the Art of MATLAB

How for Works

Posted by Loren Shure,

Last week I touched on the difference between using 1:N and [1:N] as the expression in a for statement. I have gotten enough more questions recently on how for works in MATLAB that it seems time for its own post.

Typical Usage

The most common use of for is when we want to do something a known number of times and we decide not to vectorize the code. In this context, we often know the total number of times, N, and can write code like this:

N = 3;
for ind = 1:N
    disp(ind)
end
     1

     2

     3

for Expression Can Be a Matrix

The for loop uses each column of the expression as the temporary loop variable. When MATLAB was designed, Cleve tells me, he chose : to create row vectors since 1:N is a natural expression and he expected : to be involved frequently in setting up loops.

From the early days of MATLAB, we could also use a matrix (and these days an N-dimensional array) as the loop expression. In these cases, MATLAB iterates on the columns (collapsing all dimensions >1 to be virtual columns).

A = reshape(1:6,3,2);
count = 0;
% display loop counter and transpose of loop expression
for ind = A
    count = count+1;
    disp([count, ind'])
end
     1     1     2     3

     2     4     5     6
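
To make the column-wise iteration concrete, here is a small sketch (the variable names are illustrative, not from the original post). A 2-by-2-by-2 array is handled as if it had been reshaped to 2-by-4, so the loop makes four passes. A column vector, by contrast, is a single column, so the loop body runs exactly once with the whole vector as the loop variable.

```matlab
% N-D expression: the loop walks the columns of reshape(A3, size(A3,1), [])
A3 = reshape(1:8, 2, 2, 2);
nPasses = 0;
for col = A3
    nPasses = nPasses + 1;
    disp(col')                 % each col is a 2-by-1 column
end

% Column-vector expression: one column, so a single pass
v = (1:3)';
nPassesCol = 0;
for col = v
    nPassesCol = nPassesCol + 1;
end

% Transposing gives the usual element-by-element loop
nPassesRow = 0;
for elem = v'
    nPassesRow = nPassesRow + 1;
end
```

If a loop over a vector seems to run only once, checking whether the expression is a column rather than a row is a reasonable first step.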

for Expression can go to Infinity!

For the benefit of those who didn't read last week's post, I repeat an interesting MATLAB tidbit. You can have your for loop go to Infinity in MATLAB if you don't insist on precalculating the vector to iterate over.

for ind = 1:Inf
    disp(ind)
    if ind>=2
        disp('Stopping now')
        break
    end
end
ind
     1

     2

Stopping now

ind =

     2

for Scope

The scope of the loop counter in MATLAB's for loop is, as far as I know, unlike that in other languages. You can reassign values to the counter inside the loop and yet the loop will, in certain ways, proceed as if that reassignment hadn't happened. Let's look at some examples.

for ind = 1:3
    disp(ind)
end
ind
     1

     2

     3


ind =

     3

In this case, we don't try any funny business and we see we can use the final value of ind past the end of the for loop. This was true as well for the case in which we had the loop counter go to Inf.

MATLAB will continue marching through the loop even if, inside, the loop variable gets disturbed, unless you do something like above and break out when a condition is met (or, of course, if there's an error). Watch this example:

for ind = 1:3
    ind
    ind = 10
end
ind
ind =

     1


ind =

    10


ind =

     2


ind =

    10


ind =

     3


ind =

    10


ind =

    10

I reset ind inside the loop, and yet when the loop continues on its next pass, it reverts to using the loop expression to determine whether or not the for loop is complete.
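
A compact way to confirm this without reading through the displayed output is to record what the loop hands out on each pass. This is only a sketch; passes and visited are illustrative names.

```matlab
visited = zeros(1, 3);
passes = 0;
for ind = 1:3
    passes = passes + 1;
    visited(passes) = ind;   % value supplied by the loop expression
    ind = 10;                % overwritten, but only until the next pass
end
% After the loop: passes is 3, visited is [1 2 3], and ind keeps the
% last value assigned inside the body, which is 10.
```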

Uses of for

With the introduction of the JIT/Accelerator in MATLAB 6.5, for loops often no longer exact as large a performance penalty as they did in the past. I can think of situations in which using for makes lots of sense, even though MATLAB is a vector-oriented language.

  • code can't be vectorized
  • takes too much memory if vectorized
  • code with loop is clearer and more maintainable
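
As an illustration of the first bullet, here is a sketch of a nonlinear recurrence, where each value depends on the one before it, so there is no obvious vectorized form. The logistic map and the parameter r are chosen only for illustration.

```matlab
% Logistic map: x(k) depends on x(k-1), so the steps must run in order
r = 3.7;                   % illustrative parameter
nSteps = 10;
x = zeros(1, nSteps);      % preallocate the output
x(1) = 0.5;
for k = 2:nSteps
    x(k) = r * x(k-1) * (1 - x(k-1));
end
```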

for Gotcha

The main blunder people make using for loops is assigning output to an array that is not preallocated before the loop begins. This results in MATLAB constantly needing to reallocate memory for an array that is typically one element larger each time through the loop. If you're not careful, this memory allocation time can overwhelm the time of the calculation. One pattern I have seen that doesn't flagrantly preallocate the output (but does intrinsically) is when you have the loop go in reverse.

N = 7;
for ind = N:-1:1
    B(ind) = ind
    if ind <= N-1
        break;
    end
end
B =

     0     0     0     0     0     0     7


B =

     0     0     0     0     0     6     7
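
For comparison, here is a sketch of the three patterns side by side: growing the array on every pass, preallocating with zeros, and running the loop in reverse. The variable names and the ind^2 computation are illustrative. All three produce the same result; they differ in how often MATLAB has to resize the output.

```matlab
N = 1000;

% Grows by one element per pass (the pattern to avoid)
grown = [];
for ind = 1:N
    grown(ind) = ind^2;
end

% Preallocated up front
prealloc = zeros(1, N);
for ind = 1:N
    prealloc(ind) = ind^2;
end

% Reverse loop: the first assignment, to element N, sizes the array
clear reversed               % make sure no earlier variable is reused
for ind = N:-1:1
    reversed(ind) = ind^2;
end
```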

When, how, and why do you use for loops? Do you know of other perils to watch out for? Post your thoughts here.


Published with MATLAB® 7.2

18 Comments

An excellent discussion of the “for” loop. I would encourage the writers of the MATLAB documentation to include examples that span this range in their discussion of “for”.

Regarding when to use “for,” vs vectorizing, one common place to use the “for” is when performing a mathematical operation on the elements of a pair of vectors, e.g.,

a(1) * b(1)
a(2) * b(2)
.
.
.
a(n) * b(n)

This could also be done with:

a .* b

Since “.*” and its associated operators seem to be unique to MATLAB, many lay users won’t be familiar with them. Thus, even though the “for” loop may be slower, it is preferred because it is easier for many lay users to understand.

Writers of code frequently have to make style choices which may impact the readability and / or speed of the code. Generally, I find that it is better to choose readability over speed.
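
The two forms described in this comment can be checked side by side. This is a minimal sketch with made-up data; cLoop and cVec are illustrative names.

```matlab
a = [1 2 3 4];
b = [10 20 30 40];

% Loop form: multiply matching elements one at a time
n = numel(a);
cLoop = zeros(size(a));      % preallocate the result
for k = 1:n
    cLoop(k) = a(k) * b(k);
end

% Vectorized form: element-wise multiplication
cVec = a .* b;
```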

just to add a bit to the amazing handling of the scope of the loop-counter:
if you run this (slightly contrived) code a few times, either as a script or a function, w/o feature accel on|off, you will see how counters are re-stored into memory from a – yet unknown – place… some miraculous stuff must be happening under the hood…

format debug;
clear i;
rex=@(str) regexp(str,...
'(?< = = )|(\\s)[-]*\\w+',...
'match');
for i=1:5
    % let's peek at the addresses
    str=evalc('i');
    a=rex(str);
    disp(sprintf('+ outer %5d %9s %9s %6s',...
        i,a{[4,10,13]}));
    i=10;
    str=evalc('i');
    a=rex(str);
    disp(sprintf('* outer %5d %9s %9s %6s',...
        i,a{[4,10,13]}));
    clear i; % !
    for i=-500:-499
        str=evalc('i');
        a=rex(str);
        disp(sprintf('- inner %5d %9s %9s %6s',...
            i,a{[4,10,13]}));
    end % inner
end % outer

And this is the output:

% this is a typical output (note format may be off!)
+ outer     1  192df560   241bce8      1
* outer    10  192df560   241bce8     10 % = outer counter
- inner  -500  192df560   241a988   -500
- inner  -499  192df560   241a988   -499
+ outer     2  192df560   241a988      2 % = inner counter
* outer    10  192df560   241a988     10
- inner  -500  192df560   2419298   -500
- inner  -499  192df560   2419298   -499
+ outer     3  192df560   2419298      3
* outer    10  192df560   2419298     10
- inner  -500  192df560   241bce8   -500
- inner  -499  192df560   241bce8   -499
+ outer     4  192df560   241bce8      4
* outer    10  192df560   241bce8     10
- inner  -500  192df560   241a988   -500
- inner  -499  192df560   241a988   -499
+ outer     5  192df560   241a988      5
* outer    10  192df560   241a988     10
- inner  -500  192df560   2419298   -500
- inner  -499  192df560   2419298   -499

it really is AMAZING

us

Loren,

Maybe it is slightly off topic, but since you have mentioned assigning array output in a loop, can you please explain which one of the following two array assignments should be preferred?

Case 1:

for ind = 1:N
a = ……….;
b(ind) = a
end

Case 2:

b = [];
for ind = 1:N
a = ……….;
b = [b a];
end

I understand that pre-allocation is faster, but the vector length is not always known before the loop is executed (for example, the loop may be terminated with a conditional break statement).

Thanks,

Eric

Eric-

Both methods have the same issue with respect to memory allocation. They will allocate each time through the loop. If you know that you will be adding at most one element each time, you can still preallocate, and then just chop off the end afterwards, perhaps. That might be temporarily wasteful of memory, but will cause less thrashing, on average.

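b = zeros(1,N);           % preallocate for the worst case
for ind = 1:N
   b(ind) = something;    % placeholder for the real computation
   if done, break, end    % placeholder for the exit condition
end
b = b(1:ind);             % chop off the unused tail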

–l
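
A runnable variant of this preallocate-then-trim idea might look like the sketch below. The threshold limit, the ind^2 computation, and the counter nUsed are assumptions made for illustration. Because the exit test here runs before the assignment, the sketch tracks the number of filled elements separately instead of trimming to ind.

```matlab
N = 10;
limit = 20;                % illustrative stopping threshold
b = zeros(1, N);           % room for the worst case
nUsed = 0;                 % number of elements actually filled
for ind = 1:N
    val = ind^2;
    if val > limit
        break              % stop before storing a value over the limit
    end
    b(ind) = val;
    nUsed = ind;
end
b = b(1:nUsed);            % trim the unused tail
```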

for Structure fields:
(let me know if there is a shorter way to
do the same ?)

GroupA.n=100;
GroupA.money=25;
GroupA.rank=8;

GroupB.n=155;
GroupB.money=33;
GroupB.rank=9;

for k={'n' 'money'}
All.(k{1})=GroupA.(k{1})+GroupB.(k{1});
end
%rank is an intensive value
All.rank=(GroupA.n*GroupA.rank+GroupB.rank*GroupB.n)/All.n;

All

regards,
Xav

Xav-

I’m not sure how much you might need to generalize this. Ironically, you’d use one line less in this case if you do the formulae explicitly instead of the for loop:

All.n=GroupA.n+GroupB.n;
All.money=GroupA.money+GroupB.money;

but the runtime shouldn’t be significantly different (although not having to create and dereference cells for the for index might be a win here).
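
For readers wondering why the loop version indexes with k{1}: when the for expression is a cell array, each pass receives a 1-by-1 cell (one column of the cell array), not its contents. Here is a self-contained sketch of the same calculation; the names extensive and name are illustrative.

```matlab
GroupA = struct('n', 100, 'money', 25, 'rank', 8);
GroupB = struct('n', 155, 'money', 33, 'rank', 9);

extensive = {'n', 'money'};      % fields that simply add
All = struct();
for k = extensive                % k is a 1-by-1 cell on each pass
    name = k{1};                 % unwrap the field name
    All.(name) = GroupA.(name) + GroupB.(name);
end
% rank is intensive, so take the n-weighted average
All.rank = (GroupA.n*GroupA.rank + GroupB.n*GroupB.rank) / All.n;
```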

A for loop won’t actually go to infinity; on my computer it won’t go past 2.1475e+009. Take the following script:

for x = 1:inf
end
a = x;

for x = 1:a*10
end
b = x;

x = 1;
while x<a*10
x = x + 1;
end
c = x;

x = 1;
while x<inf
x = x + 1;
end

I let it run for a long period of time, and then did a CTRL-C. It was still running on the last while loop and the variable values were:

a = 2.1475e+009
b = 2.1475e+009
c = 2.1475e+010
x = 9.1499e+010

It seems as though there is some constraint on the number of iterations on a for loop, but not a while loop. I can’t explain exactly where the constraint comes from or why it’s there, but I certainly can’t find a way to beat it.

Interestingly, I discovered this because I asserted to someone that, in MATLAB, there are some things that a while loop could do, but not a for loop. Of course, in most languages, both are equivalent though optimized for different uses. Then someone reminded me of the ability to go to infinity. When I tested it though, it didn’t work.

Anyone know why?

Sorry about the weird formatting of the script above. It should be:


for x = 1:inf
end
a = x;
for x = 1:a*10
end
b = x;
x = 1;
while x<a*10
x = x + 1;
end
c = x;
x = 1;
while x<inf
x = x + 1;
end

I get the same limit for FOR loops. While I’m no expert, I hypothesize that the difference occurs between For and While because of variable declaration sizes. Consider these two loops:

for x=0:inf
end

x = 0
while 1
x = x +1
end

The for loop terminates quite early, with x equal to the 2.15e9 constant you speak of. However, the while loop can march on well past it.

My guess is that MATLAB allocates all of the FOR loop values (the vector 1:inf) at once, and that its size reaches the MATLAB limit for the number of elements in a vector. So I assume MATLAB truncates the size of the vector to this limit.

Meanwhile, the while loop only deals with a 1×1 double and will only error out when it reaches the maximum value for a double.

Further evidence for this comes from trying to create the vector 1:2.15e9. This fails because it exceeds the maximum number of elements in a vector. However, 1:2.15e8 does not (though it may crash your computer!). This indicates to me that MATLAB is approaching the limit on the number of elements in a vector when trying to run the 1:inf FOR loop. Thus MATLAB truncates it prematurely (rather than running the expected “infinite” for loop).

Whereas, the while loop can continue on until it overflows the double.

Hmm, very interesting. I think you’re mostly right, but slightly off. See Loren’s post from last week: http://blogs.mathworks.com/loren/2006/07/12/what-are-you-really-measuring/,
She explains the difference between 1:N and [1:N] in a for loop. In short, 1:N is expanded as needed, while [1:N] is expanded at the start so that MATLAB can complete the concatenation the brackets call for. For example:

for x = [1:inf]
end

will immediately produce an error because MATLAB can’t store an infinite length array. This:

for x = 1:inf
end

works though because MATLAB expands it as needed.

However, apparently MATLAB actually does save the entire vector 1:inf somewhere despite the fact that all of the previous elements will never be used. To verify this I ran:

[c, maxsize] = computer

which revealed:

maxsize = 2.1475e+009

which is exactly where my for loops stops.

This seems to confirm our suspicions. It also indicates that the for loop could be further optimized to throw out previous elements when given a vector that could be expanded as needed since there is no way to retrieve the previous elements anyway.

I looked into this a little, and it seems to me that the for loop uses a signed integer as the index. This is why it couldn’t go past 2.15e9, which is the largest signed 32-bit integer (2^31 - 1). In the current version of MATLAB, the limit for the loop is 2^63 - 1 (they’ve updated to 64-bit integers). It actually gives a warning when the loop index reaches this value (after interrupting with Ctrl-C):

Warning: FOR loop index is too large. Truncating to 9223372036854775807.

This totally makes sense, of course. Someone needs to count to keep track of how many iterations have been done. “for ii=1:inf” is not the same as “while 1”!
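
The two limits mentioned in this thread, 2.1475e+009 and 9223372036854775807, match the largest signed 32-bit and 64-bit integers. The sketch below only looks those values up with intmax; it does not show how the for loop itself is implemented, which the comments above can only guess at.

```matlab
limit32 = double(intmax('int32'));   % largest signed 32-bit integer
limit64 = intmax('int64');           % largest signed 64-bit integer
fprintf('%d\n', limit32);            % 2147483647, i.e. about 2.1475e+009
fprintf('%d\n', limit64);            % 9223372036854775807
```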

Thank you for the informative help!

To extend this, how would you “redo” an iteration in a for loop? I have a particularly long for loop repeatedly asking questions to my user, and I want to be able to redo a number if necessary in the middle of the loop.

Ex.

for i = 1:100
%When i = 50, I realize that I want to do 49 again (in the middle of the loop)
end

Any ideas? If there’s a way that I can do it without using for loops, that is also welcomed.

Nolan-

I can’t think of a way to do what you ask with a for loop. You could more easily do it with a while loop however, since in that case, you can adjust the control parameter for the while loop inside the loop. It also depends on whether or not the calculations later in the loop depend on previous iterations. If they do not, then you might be able to vectorize the calculation, avoid the loop totally, then check the conditions afterwards and just recalculate the few elements that need to be adjusted.

–Loren

Nevermind! You can do it with a while loop and then a count variable that you increment (or decrement) based on your preferences.

count = 1;
while (endloop == 0)
if (goforward == 1) 
count = count + 1
end
if (backup == 1) 
count = count - 1
end
end
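
A self-contained version of this idea could look like the following sketch. The names nItems, redoAt, alreadyRedone, and visited are invented for illustration, and the fixed redo rule stands in for whatever the user would decide interactively.

```matlab
nItems = 5;
redoAt = 4;                % illustrative: while on item 4, go back to item 3
alreadyRedone = false;     % only step back once, so the loop terminates
visited = [];              % order in which items were handled
count = 1;
while count <= nItems
    visited(end+1) = count;
    if count == redoAt && ~alreadyRedone
        alreadyRedone = true;
        count = count - 1;   % step back one item
    else
        count = count + 1;   % move on to the next item
    end
end
% visited ends up as [1 2 3 4 3 4 5]
```

Unlike the for loop, the while loop re-reads count on every pass, so changing it inside the body does redirect the iteration.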

Hi Loren,

I have a question regarding the scope of for. You gave an example where you try to change the for loop variable:

for ind = 1:3
ind
ind = 10
end
ind

and yet the for loop continues as if you hadn’t. Is there a way around this? I am checking something with an “if” inside the loop, and in some cases I would like to make the loop run once more, so I would like to do something like this: ind=ind-1
but I would like this to have the desired effect of making the loop run with the same ind value again.

Thanks for your help!
Naama

Naama,

You can create a secondary counter that you increment or not at the relevant places and make your decision based on that derived value. You might also consider the break or continue commands in MATLAB for more control over the loop.

–Loren
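
One possible way to repeat a pass while keeping the for loop is to wrap the body in an inner while loop that repeats until the result is accepted. This is an alternative sketch, not necessarily what the reply above had in mind; attemptsNeeded is made-up data standing in for the real check.

```matlab
attemptsNeeded = [1 3 2];    % illustrative: tries each item needs before it passes
attemptsMade = zeros(1, 3);
for ind = 1:3
    accepted = false;
    while ~accepted          % repeat the same ind until the check passes
        attemptsMade(ind) = attemptsMade(ind) + 1;
        % stand-in for the real check made with "if" inside the loop
        accepted = attemptsMade(ind) >= attemptsNeeded(ind);
    end
end
% attemptsMade ends up equal to attemptsNeeded: [1 3 2]
```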

These postings are the author's and don't necessarily represent the opinions of MathWorks.