{"id":180,"date":"2015-01-20T16:07:49","date_gmt":"2015-01-20T21:07:49","guid":{"rendered":"https:\/\/blogs.mathworks.com\/graphics\/?p=180"},"modified":"2015-01-20T16:07:49","modified_gmt":"2015-01-20T21:07:49","slug":"performance-scaling","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/graphics\/2015\/01\/20\/performance-scaling\/","title":{"rendered":"Performance Scaling"},"content":{"rendered":"<div class=\"content\"><h3>Performance Scaling<\/h3><p>Graphics performance is a complex and interesting field. It's one my group has spent a lot of time on, especially as we designed MATLAB's new graphics system. Because the new graphics system is multithreaded and splits work between the CPU and the graphics card, you usually need to do quite a bit of exploration to understand why a particular case has the performance characteristics it does. The balance between the different parts of the system is usually more important than any single component.<\/p><p>There are a lot of different ways to explore the performance of a particular case. We&#8217;ll visit several of them here in future posts. Today we&#8217;re going to look at how the time it takes to create a chart scales with the number of values we&#8217;re plotting and the type of chart we&#8217;re using. This type of scaling analysis is a great first step in understanding the performance of any software system, and I generally recommend it as the starting point in figuring out a graphics performance issue.<\/p><p>Let's start with a really simple example. 
We can measure the time it takes to create area charts of various sizes using the following code.<\/p><pre class=\"codeinput\">figure\r\naxes\r\ns = round(10.^(1:.25:6));\r\nnt = numel(s);\r\n\r\nt = zeros(1,nt);\r\n<span class=\"keyword\">for<\/span> i=1:nt\r\n    np = s(i);\r\n    d = rand(1,np);\r\n    cla;\r\n    drawnow;\r\n    tic\r\n    area(d);\r\n    drawnow;\r\n    t(i) = toc;\r\n<span class=\"keyword\">end<\/span>\r\n<\/pre><p>If we plot t, we&#8217;ll get something like this:<\/p><pre class=\"codeinput\">cla\r\nplot(s,t)\r\nxlabel(<span class=\"string\">'# points'<\/span>)\r\nylabel(<span class=\"string\">'Time in seconds'<\/span>)\r\ntitle(<span class=\"string\">'Scaling of area chart'<\/span>)\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/graphics\/2015\/performancescaling_02.png\" alt=\"\"> <p>As you can see, the scaling is roughly linear. That makes sense. As the chart gets larger and more complex, the time it takes to create it gets larger in proportion. When we get all the way to the right side of the chart, we're creating an area chart with a million points, and it takes about 8 seconds.<\/p><p>Now let's look at how different types of charts compare. 
The following script will do the same sort of measurement for six different types of charts.<\/p><pre class=\"language-matlab\">figure\r\naxes\r\ns = round(10.^(1:.25:6));\r\nn = numel(s);\r\nfuncs = {<span class=\"string\">'area'<\/span>,<span class=\"string\">'stem'<\/span>,<span class=\"string\">'bar'<\/span>,<span class=\"string\">'scatter'<\/span>,<span class=\"string\">'stairs'<\/span>,<span class=\"string\">'plot'<\/span>};\r\nresults.count = s;\r\n<span class=\"keyword\">for<\/span> i=1:numel(funcs)\r\n  t = zeros(1,n);\r\n  <span class=\"keyword\">for<\/span> j=1:n\r\n    np = s(j);\r\n    x = 1:np;\r\n    d = rand(1,np);\r\n    cla;\r\n    drawnow;\r\n    f = str2func(funcs{i});\r\n    tic\r\n    f(x,d);\r\n    drawnow;\r\n    t(j) = toc;\r\n  <span class=\"keyword\">end<\/span>\r\n  results.(funcs{i}) = t;\r\n<span class=\"keyword\">end<\/span>\r\n<\/pre><p>But I actually used a slightly more complicated version which you can <a href=\"https:\/\/blogs.mathworks.com\/images\/graphics\/2015\/scaling_test.m\">download here<\/a>.<\/p><pre class=\"codeinput\">load <span class=\"string\">r2014b_scaling_results<\/span>\r\n<\/pre><p>Then we can plot the results like this:<\/p><pre class=\"codeinput\">cla\r\nhold <span class=\"string\">on<\/span>\r\nfuncs = {<span class=\"string\">'area'<\/span>,<span class=\"string\">'stem'<\/span>,<span class=\"string\">'bar'<\/span>,<span class=\"string\">'scatter'<\/span>,<span class=\"string\">'stairs'<\/span>,<span class=\"string\">'plot'<\/span>};\r\nm = {<span class=\"string\">'+'<\/span>,<span class=\"string\">'s'<\/span>,<span class=\"string\">'^'<\/span>,<span class=\"string\">'o'<\/span>,<span class=\"string\">'*'<\/span>,<span class=\"string\">'p'<\/span>};\r\n<span class=\"keyword\">for<\/span> ix=1:numel(funcs)\r\n    f = funcs{ix};\r\n    x = results.count;\r\n    y = results.(f);\r\n    plot(x,y,<span class=\"string\">'DisplayName'<\/span>,f,<span class=\"string\">'Marker'<\/span>,m{ix})\r\n<span 
class=\"keyword\">end<\/span>\r\nlegend(<span class=\"string\">'show'<\/span>,<span class=\"string\">'Location'<\/span>,<span class=\"string\">'NorthWest'<\/span>);\r\n\r\nset(gca,<span class=\"string\">'YGrid'<\/span>,<span class=\"string\">'on'<\/span>)\r\nxlabel(<span class=\"string\">'# Points'<\/span>)\r\nylabel(<span class=\"string\">'Seconds'<\/span>)\r\ntitle(<span class=\"string\">'Performance Scaling'<\/span>)\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/graphics\/2015\/performancescaling_03.png\" alt=\"\"> <p>As you can see, the area chart we looked at first is actually the slowest of the bunch, while the line plot created by the plot command is the fastest.<\/p><p>If you're following closely, you might have noticed that this chart doesn't show exactly the same time for the million-point area chart as the first one did. That's because the script I used in this case does multiple runs and then uses the median of the times. This is usually a good idea. You'll see small variations in run times depending on where things are in memory and what other processes are running on your computer.<\/p><p>And if you look really, really closely, you might notice that something interesting is happening down there in the lower-left corner. A good way to get a better look at it is to switch the axes' XScale and YScale properties to log.<\/p><p>That gives us something like this:<\/p><pre class=\"codeinput\">set(gca,<span class=\"string\">'XScale'<\/span>,<span class=\"string\">'log'<\/span>,<span class=\"string\">'YScale'<\/span>,<span class=\"string\">'log'<\/span>)\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/graphics\/2015\/performancescaling_04.png\" alt=\"\"> <p>Now we can see a number of interesting things.<\/p><p>As we saw earlier, area is the slowest when N is very large, but when N is small it is actually faster than bar, stem, and stairs. 
It scales differently from the others because of the big polygon it creates.<\/p><p>For a small number of points, bar and stem are very similar in performance, but bar pulls ahead when the number of points gets large. The performance scaling of stem actually involves interactions between the threads in the new multithreaded graphics system. This is a very interesting area that we'll be looking at in an upcoming post.<\/p><p>Also notice that all of the curves are flat on the left side. That's because it costs a certain amount to create the chart and initialize the axes regardless of how large or small the chart is. We refer to that as \"startup cost\".<\/p><p>It's also interesting to compare scaling in different versions of MATLAB. Here is the same chart for R2014a.<\/p><p><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/images\/graphics\/2015\/performance_scaling_R2014a_results.png\" alt=\"\"> <\/p><p>As you know, there were a lot of changes to the graphics system in R2014b. Performance scaling was one of the things we worked on improving with the new graphics system. As you can see, we did eliminate the really nasty cases. In R2014a, area and scatter behaved very badly when the amount of data got large. In fact, I locked up my computer trying to get the R2014a number for area at 1,000,000 points! The bad scaling of the old version of area was an artifact of how it handed that large polygon off to the patch object.<\/p><p>We also improved the scaling of bar charts by quite a bit.<\/p><p>On the other hand, the scaling of stem and stairs got a bit worse. You can also see that startup costs have increased a bit in R2014b. We're still working on improving that. In the meantime, there are some workarounds you can use to minimize the impact of startup costs. 
We'll also talk about those in a future post.<\/p><p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br>Published with MATLAB&reg; R2014b<br><\/p><\/div>","protected":false},"excerpt":{"rendered":"<div class=\"overview-image\"><img src=\"https:\/\/blogs.mathworks.com\/graphics\/files\/feature_image\/performance_scaling_thumbnail.png\" class=\"img-responsive attachment-post-thumbnail size-post-thumbnail wp-post-image\" alt=\"\" decoding=\"async\" loading=\"lazy\" \/><\/div><p>Performance ScalingGraphics performance is a complex and interesting field. It's one my group has been spending a lot of our time working on, especially as we designed MATLAB's new graphics system.... 
<a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/graphics\/2015\/01\/20\/performance-scaling\/\">read more >><\/a><\/p>","protected":false},"author":89,"featured_media":184,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[8],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/graphics\/wp-json\/wp\/v2\/posts\/180"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/graphics\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/graphics\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/graphics\/wp-json\/wp\/v2\/users\/89"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/graphics\/wp-json\/wp\/v2\/comments?post=180"}],"version-history":[{"count":5,"href":"https:\/\/blogs.mathworks.com\/graphics\/wp-json\/wp\/v2\/posts\/180\/revisions"}],"predecessor-version":[{"id":186,"href":"https:\/\/blogs.mathworks.com\/graphics\/wp-json\/wp\/v2\/posts\/180\/revisions\/186"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/graphics\/wp-json\/wp\/v2\/media\/184"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/graphics\/wp-json\/wp\/v2\/media?parent=180"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/graphics\/wp-json\/wp\/v2\/categories?post=180"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/graphics\/wp-json\/wp\/v2\/tags?post=180"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}