Mike on MATLAB Graphics



Mike on MATLAB Graphics has been retired and will not be updated.

Performance Scaling

Posted by Mike Garrity


Graphics performance is a complex and interesting field. It's one my group has been spending a lot of our time working on, especially as we designed MATLAB's new graphics system. Because the new graphics system is multithreaded and splits work between the CPU and the graphics card, you usually need to do quite a bit of exploration to understand why a particular case has the performance characteristics it does. The balance between the different parts of the system is usually more important than any single component.

There are a lot of different ways to explore the performance of a particular case. We’ll visit several of them here in future posts. Today we’re going to look at how the time it takes to create a chart scales with the number of values we’re plotting and the type of chart we’re using. This type of scaling analysis is a great first step in understanding the performance of any software system, and I generally recommend it as the starting point in figuring out a graphics performance issue.

Let's start with a really simple example. We can measure the time it takes to create area charts of various sizes using the following code.

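s = round(10.^(1:.25:6));   % chart sizes from 10 to 1,000,000 points
nt = numel(s);

t = zeros(1,nt);
for i=1:nt
    np = s(i);
    d = rand(1,np);
    cla            % clear the previous chart
    tic
    area(d);
    drawnow        % flush graphics so the timing includes rendering
    t(i) = toc;
end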

If we plot t, we’ll get something like this:

plot(s,t)
xlabel('# points')
ylabel('Time in seconds')
title('Scaling of area chart')

As you can see, the scaling is roughly linear. That makes sense. As the chart gets larger and more complex, the time it takes to create it gets larger in proportion. When we get all the way to the right side of the chart, we're creating an area chart with a million points, and it takes about 8 seconds.

Now let's look at how different types of charts compare. The following script will do the same sort of measurement for six different types of charts.

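s = round(10.^(1:.25:6));
n = numel(s);
funcs = {'area','stem','bar','scatter','stairs','plot'};
results.count = s;
for i=1:numel(funcs)
  t = zeros(1,n);
  for j=1:n
    np = s(j);
    x = 1:np;
    d = rand(1,np);
    f = str2func(funcs{i});   % turn the chart name into a function handle
    cla                       % clear the previous chart
    tic
    f(x,d);
    drawnow                   % flush graphics so the timing includes rendering
    t(j) = toc;
  end
  results.(funcs{i}) = t;     % one field of timings per chart type
end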

But I actually used a slightly more complicated version, which you can download here. We can load its saved results like this:

load r2014b_scaling_results

Then we can plot the results like this:

hold on
funcs = {'area','stem','bar','scatter','stairs','plot'};
m = {'+','s','^','o','*','p'};
for ix=1:numel(funcs)
    f = funcs{ix};
    x = results.count;
    y = results.(f);
    plot(x,y,'DisplayName',f,'Marker',m{ix})   % one curve per chart type
end
hold off
legend('show')
xlabel('# Points')
title('Performance Scaling')

As you can see, the area chart we looked at first is actually the slowest of the bunch, while the line plot, which is created by the plot command, is the fastest.

If you're following closely, you might have noticed that this chart didn't get exactly the same number for the million point area chart. That's because the script I used in this case does multiple runs and then uses the median value of the times. This is usually a good idea. You'll see small variations in run times depending on where things are in memory and what other processes are running on your computer.
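Here is a minimal sketch of that multiple-run idea. It is not the downloadable script itself; the run count nruns and the chart size np are illustrative choices of mine.

```matlab
% Sketch: time the same chart several times and keep the median.
% nruns and np are illustrative values, not taken from the post's script.
nruns = 5;
np = 1000;
d = rand(1,np);
times = zeros(1,nruns);
for k = 1:nruns
    cla            % start each run from an empty axes
    tic
    area(d);
    drawnow        % flush graphics so the timing includes rendering
    times(k) = toc;
end
tmed = median(times);
fprintf('median of %d runs: %.4f s\n', nruns, tmed);
```

The median is less sensitive than the mean to a single slow run, such as the first one, which may include one-time setup work.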

And if you look really, really closely, you might notice that something interesting is happening down there in the lower left corner. A good way to get a better look at it is to switch our XScale and YScale properties to log.
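One way to do that, assuming the chart is in the current axes:

```matlab
% Switch both axes of the current axes to logarithmic scale.
set(gca,'XScale','log','YScale','log')
```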

That gives us something like this:


Now we can see a number of interesting things.

As we saw earlier, area is the slowest when N is very large, but when N is small it is actually faster than bar, stem, and stairs. It scales differently from the others because of the big polygon it creates.

For a small number of points, bar and stem are very similar in performance, but bar pulls ahead when the number of points gets large. The performance scaling of stem actually involves interactions between the threads in the new multithreaded graphics system. This is a very interesting area that we'll be looking at in an upcoming post.

Also notice that all of the curves are flat on the left side. That's because it costs a certain amount to create the chart and initialize the axes regardless of how large or small the chart is. We refer to that as "startup cost".
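If you want rough numbers for your own machine, something like the following sketch separates the flat part of a curve from the sloped part. The sizes and the thresholds for "small" and "large" are arbitrary choices of mine, and this is only a crude estimate, not how the charts in this post were analyzed.

```matlab
% Rough sketch: estimate startup cost and large-N scaling for one chart type.
% The sizes and the small/large thresholds are arbitrary illustrative choices.
s = round(10.^(1:.5:5));
t = zeros(size(s));
for j = 1:numel(s)
    d = rand(1,s(j));
    cla            % also creates the figure and axes on the first pass
    tic
    plot(d);
    drawnow        % flush graphics so the timing includes rendering
    t(j) = toc;
end

% Startup cost: typical time in the flat region at small N.
startup = median(t(s <= 100));

% Scaling exponent: slope of log10(t) against log10(N) over the largest sizes.
% A slope near 1 would indicate roughly linear scaling.
big = s >= 1e4;
p = polyfit(log10(s(big)), log10(t(big)), 1);
slope = p(1);

fprintf('startup ~ %.4f s, log-log slope at large N ~ %.2f\n', startup, slope);
```

For a fast chart type such as plot, the curve may still be nearly flat at these sizes, in which case the slope will come out close to zero and you would need larger N to see the scaling.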

It's also interesting to compare scaling in different versions of MATLAB. Here is the same chart for R2014a.

As you know, there were a lot of changes to the graphics system in R2014b. Performance scaling was one of the things we worked on improving with the new graphics system. As you can see, we did eliminate the really nasty cases. In R2014a, area and scatter behaved very badly when the amount of data got large. In fact, I locked up my computer trying to get the R2014a number for area at 1,000,000 points! The bad scaling of the old version of area was an artifact of how it handed that large polygon off to the patch object.

We also improved the scaling of bar charts by quite a bit.

On the other hand, the scaling of stem and stairs got a bit worse. You can also see that startup costs have increased a bit in R2014b. We're still working on improving that. In the meantime, there are some workarounds you can use to minimize the impact of startup costs. We'll also talk about those in a future post.


Published with MATLAB® R2014b

4 Comments

Royi replied on : 1 of 4
Hopefully the Graphics System of MATLAB will improve a lot. By the way, I wish there was an easy way to display images in 1:1 mode (100% scaling) in `image`. Currently it is only easy using `imshow`. I know `imshow` is now part of MATLAB and doesn't require image processing, and yet, it would be nice to be able to do so in `image`. Thanks.
mgarrity replied on : 2 of 4
I agree. Now that imshow is in core MATLAB, it'd be really nice to make its functionality easier to get to.
Shana Trollope replied on : 3 of 4
Hello, I have data similar to that shown by Mike. Mine is seismic data plotted on a log-log scale. The outputs show a linear regression, both positive and negative. How do I do a linear regression with residuals? Choosing Tools and Basic Fitting gives me an odd line removed from the data. Does anyone have help or ideas? I have been battling for months now to get it right. Secondly, I have data sets that show multiple but distinct relationships within a single set. Is it possible to get regression relationships for each of them, i.e. draw a line through each and get its equation? Help will be greatly appreciated. I am located in Upington, South Africa, and far removed from colleagues.
mgarrity replied on : 4 of 4
I'm not an expert on fitting, but my guess would be the following. The loglog plot has lots of points over on the right side. Those are factoring heavily in the fit, so it's not near your points over on the left side. In addition, when basic fit draws the curve, it places points on the curve linearly in data space. This means that it's also not generating many points on the left side. This can make it appear to jump around. Once you've created the fit, you can call get(gca,'Children'). That should give you two objects. The Line object is your data, and the FunctionLine object is the one that basic fit is drawing. If you set the Marker property on the function line to something like '*' or '.', you'll see the points it's creating. I would suggest heading over to MATLAB Answers (https://www.mathworks.com/matlabcentral/answers/) and posting your question there. There are lots of people over there who know more about data fitting than I do, and I'm sure they'd be happy to help you.
