I recently noticed a change in the way we write some of our product release notes, and I wanted to mention it to you.
In my quarter century at MathWorks doing toolbox and MATLAB development, there have been a few areas of focus that have been remarkably consistent over that entire time. One of those areas is performance. Specifically, computation speed.
If you have used MATLAB for more than five years, it is likely that something you use a lot in MATLAB has been completely reimplemented to make it go faster in our ever-evolving computational environments.
Maybe it was new algorithms, like image resizing or Gaussian filtering. Maybe the memory access patterns were modified to exploit changing memory cache architectures, like image resizing (again), transposition (and permute), conv2, and even the seemingly straightforward sum function.
Possibly the functions you rely upon were modified to adapt to new core libraries, such as LAPACK or FFTW.
Many, many, many functions and operators were completely overhauled when multicore computers became common. Then they were modified again to exploit extended processor instruction sets for instruction word parallelism.
Finally, the very foundations for MATLAB language execution were completely overhauled in 2015 to make everything go faster. Since then, the MATLAB execution engine continues to be refined with almost every release to add new types of optimizations.
The curious thing about all this effort, over so many years, is how ... well ... vague we typically have been in describing performance improvements in our release notes.
For example, here is a snippet from the R2018b Release Notes for the Image Processing Toolbox:
Like I said: it's vague.
It was never our intent to be obscure. It's just that performance measurements are almost always challenging to report with accuracy and precision, and the experiences of individual users will almost always vary, sometimes considerably. Part of our company culture here is that we are allergic to making statements that could be perceived as inaccurate. I think that's what has been behind the history of vague statements about performance improvements in release notes. (OK, I should state this explicitly: this is my personal opinion, and not a statement of what company policy is or has been.)
Well, things are starting to change. Our documentation writers now have a new standard to follow when writing release notes about performance. Here is a sample from R2019b, which was released last month:
The release note describes what operation has been improved, how it was timed, what the times were for specific releases, and details about the computer used to measure the performance.
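As a rough illustration of those ingredients (the operation, how it was timed, the time, and the release and machine), here is a minimal sketch. It is not the code behind any actual release note. The choice of conv2, the input size, and the variable names are all assumptions made for this example, and timeit is used because it reports the median of repeated runs.

```matlab
% Illustrative sketch only: the operation, input size, and variable
% names are assumptions, not taken from any release note.
A = rand(2000);                 % assumed test input
h = ones(5)/25;                 % assumed 5-by-5 averaging kernel

% timeit calls the function handle repeatedly and returns the median time.
t = timeit(@() conv2(A, h, 'same'));

% Report what was timed, on which release, and on what kind of machine.
fprintf('Operation: conv2 on a %d-by-%d matrix\n', size(A,1), size(A,2));
fprintf('Time:      %.4f s (median, via timeit)\n', t);
fprintf('Release:   %s\n', version);
fprintf('Platform:  %s\n', computer);
```

Running the same script in two releases on the same machine gives the pair of times that a release note in the new style would quote.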
Look for more performance changes to be reported with this level of detail in the future. I think this is a great improvement!
Published with MATLAB® R2019b
3 Comments
    function [tloop,tvec] = prodeval(neval,nround)
    % neval ... size of problem
    % nround ... number of rounds
    prodval = 1;
    prodval2 = 1;
    x = ones(1,neval);
    % for-loop product
    tic;
    for k = 1:nround
        for i = 1:neval
            prodval = prodval*x(i);
        end
    end
    tloop = toc;
    % vectorized product
    tic;
    for k = 1:nround
        prodval2 = prodval2*prod(x);
    end
    tvec = toc;
    end

MATLAB R2019b produces the following results:
    >> profile on
    >> [tloop,tvec] = prodeval(1e1,1e8)
    tloop = 61.1934
    tvec = 6.6288

With profiling activated, the vectorized code is significantly faster than the for-loop code. But with the profiler deactivated, the results are completely different:
    >> profile off
    >> [tloop,tvec] = prodeval(1e1,1e8)
    tloop = 0.8389
    tvec = 1.1821

So in this case the profiler produces results that do not correspond to the results with the profiler deactivated.
Michal—I have experimented with your code sample, and I've talked with other MATLAB language developers about your profiler experience.
I ran your code, with and without the profiler on, in both R2015a and R2019b. R2015a is the last version of MATLAB before the new execution engine was introduced. Some conclusions:
- You have a good point about the profiler sometimes skewing the comparison between different implementations.
- The skew is a lot less now than before the new execution engine.
- Vectorization just for the sake of performance is often no longer useful.
- R2019b is a LOT faster than R2015a for your example, for both the looped and the prod versions.
The looped version is 8 times faster in R2019b than in R2015a. The prod version is 36 times faster. As a result of these speedups, the loop and prod versions are equally fast in R2019b.
In R2015a, turning on the profiler slows the loop version down by 320x, and it slows the prod version down by 8x, resulting in a relative measurement skew of 40x.
In R2019b, turning on the profiler slows the loop version down by 50x, and it slows down the prod version by 6x, resulting in a relative measurement skew of 8x.
Other MATLAB language developers tell me your loop code is pretty much the worst case for profiler overhead: a small, tight loop containing only scalar indexing and simple arithmetic. For now, there's no workaround that I know of. It is just a profiler characteristic to be aware of. Perhaps we'll be able to improve upon this in a future release. In the meantime, enjoy the speedups, and don't worry as much about vectorizing everything in sight.
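For comparing two implementations like these without profiler overhead in the picture, one option is to time each with timeit while the profiler is off. The sketch below is only an illustration: the helper loopProd and the input size are hypothetical, chosen for this example so that each call runs long enough for timeit to measure, and they do not reproduce the prodeval runs above.

```matlab
% Illustrative sketch: loopProd and the input size are assumptions.
profile off                       % make sure the profiler adds no overhead

x = ones(1,1e6);                  % assumed size, large enough to time reliably

tLoop = timeit(@() loopProd(x));  % median time of the scalar-indexing loop
tVec  = timeit(@() prod(x));      % median time of the built-in prod

fprintf('loop: %.3g s, prod: %.3g s, loop/prod ratio: %.2f\n', ...
    tLoop, tVec, tLoop/tVec);

function p = loopProd(x)
% Product by scalar indexing, mirroring the inner loop of prodeval.
p = 1;
for i = 1:numel(x)
    p = p*x(i);
end
end
```

Saved as a script file (local functions in scripts require R2016b or later), this prints both times and their ratio. The numbers will vary by machine and release.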