What Are You Really Measuring?
I had an interesting encounter with a colleague, Bob, last week. We were talking about timing some calculations, and we realized that the code as written was actually measuring something different from what we intended. Here's a small cautionary tale, along with a neat factoid about vectors and for loops in MATLAB.
The Story
The story Bob and I started off with is less interesting than where we ended up. After looking at some code, I mentioned to Bob that using square brackets when there was no real concatenation was sometimes expensive in MATLAB. You'll see this as a message from mlint. We pared the code in question down to this:
dbtype forsquare
1     %% forsquare
2     for j = 1:ntimes
3         for k=[1:theEnd]
4             k;
5         end
6     end
and set out to compare it with this code:
dbtype forparen
1     %% forparen
2     for j = 1:ntimes
3         for k=(1:theEnd)
4             k;
5         end
6     end
Notice that the only difference between these M-files is the form of the inner for expression: the first uses square brackets and the second uses parentheses to group the expression. We'll now set out to time the two M-files.
Prep
clc, clear
ntimes = 10;
ntotal = 1e7;
theEnd = ntotal/ntimes;
[], measure with tic-toc
tic, forsquare, toc
Elapsed time is 42.060557 seconds.
(), measure with tic-toc
tic, forparen, toc
Elapsed time is 0.173959 seconds.
[], measure with cputime
t = cputime; forsquare, cputime-t
ans = 41.2794
(), measure with cputime
t = cputime; forparen, cputime-t
ans = 0.1702
Explanation
I am timing these on my home laptop with MATLAB R2006a. I have 512 MB of RAM and a 1.70 GHz processor. I am performing the timings with both tic/toc and cputime. tic/toc is easier, but I am not being too careful about what else is running on my machine, so I am also using cputime. As you can see, it doesn't make much difference in this case. The construct with the [] is roughly 240 times slower by either measure. Why is that?
In forsquare, the square brackets [] in the for loop expression make MATLAB construct the full array first, in preparation for concatenation. There is nothing else to concatenate here, so we simply create the full vector, theEnd elements long, on each of the ntimes passes through the outer loop (ntotal elements in all). In forparen, by contrast, the parentheses () are used solely for grouping, and MATLAB recognizes the expression as one it can expand during the execution of the for loop. In forsquare we create a large vector, and that allocation dominates the time of the operation. In case you don't believe me that expressions in for loops are expanded as they are used, when possible, here's some code to look at:
dbtype forinf
1     %% forinf
2     for j = 1:Inf
3         if j > 17
4             break
5         end
6     end
Notice that the end of the for expression is Inf! There's no way I have enough memory to create this vector, and yet the code runs just fine:
tic, forinf, toc
Elapsed time is 0.000061 seconds.
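If you want to check for yourself where the time goes, one option is to time the pieces separately. Here's a minimal sketch along those lines. It is not one of the M-files from the story; the names tBuild, tLoopBuilt, tLoopColon and s are ones I made up for illustration, and the numbers you get will depend on your machine and MATLAB release.

```matlab
theEnd = 1e6;            % same value as theEnd in the Prep section

% 1) Time only the construction of the explicit vector.
tic
v = [1:theEnd];          % brackets kept on purpose, to mirror forsquare
tBuild = toc;

% 2) Time only the iteration over the vector that already exists.
tic
for k = v
    k;
end
tLoopBuilt = toc;

% 3) Time the loop over the plain colon expression.
tic
for k = 1:theEnd
    k;
end
tLoopColon = toc;

% whos reports how much memory the explicit vector occupies.
s = whos('v');
fprintf('build: %g s, loop over built vector: %g s, loop over colon: %g s\n', ...
        tBuild, tLoopBuilt, tLoopColon);
fprintf('explicit vector occupies %d bytes\n', s.bytes);
```

Comparing the three times gives a rough idea of how much of a measurement is vector construction and how much is the loop itself, which is the distinction the story turns on.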
Remarks
Timing is a delicate matter. Be sure you know what exactly you are timing when you make comparisons. Do you have any timing war stories to share?
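One habit that can help with knowing what you are timing is to repeat the measurement and look at the spread rather than trusting a single number. The following is only a sketch under my own assumptions: nreps and t are hypothetical names, and five repetitions is an arbitrary choice. The first pass can include one-time costs, such as reading and parsing an M-file, that later passes do not.

```matlab
nreps  = 5;              % hypothetical number of repetitions
theEnd = 1e6;
t = zeros(1, nreps);     % one elapsed time per repetition

for r = 1:nreps
    tic
    for k = 1:theEnd
        k;
    end
    t(r) = toc;          % elapsed wall-clock time for this repetition
end

fprintf('min: %g s, median: %g s, max: %g s\n', min(t), median(t), max(t));
```

If the minimum and maximum differ a lot, something other than the code under test (other processes, first-run overhead) is probably part of what is being measured.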
Published with MATLAB® 7.2