{"id":524,"date":"2016-05-24T16:33:10","date_gmt":"2016-05-24T16:33:10","guid":{"rendered":"https:\/\/blogs.mathworks.com\/developer\/?p=524"},"modified":"2016-05-24T18:05:47","modified_gmt":"2016-05-24T18:05:47","slug":"performance-algorithm-scaling","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/developer\/2016\/05\/24\/performance-algorithm-scaling\/","title":{"rendered":"Performance Review Criteria 3: Scales to the task"},"content":{"rendered":"<div class=\"content\"><!--introduction--><p>To scale or not to scale. That is the question. When talking about algorithmic complexity, the answer to said question is usually an important one. No matter what the constant factors are that affect your algorithm's speed, poor scaling performance severely limits the scope of the problems you can solve. How do you measure your code's runtime complexity? If you don't have a good method for this (or even if you do!) then once again the performance testing framework is on your side. I was <b>amazed<\/b> when I learned of this <b>one weird trick<\/b> to measure the complexity of my algorithms.<\/p><p>The (not so) secret here is that now that <a href=\"https:\/\/blogs.mathworks.com\/developer\/2016\/05\/06\/performance-review-criteria-2-sticks-to-solid-principles\/#2c2fec16-4dec-4277-9c75-0841ddafef3a\">performance tests are tests<\/a>, performance tests can be <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/matlab_prog\/create-basic-parameterized-test.html\">parameterized tests<\/a>! Leveraging this feature we can simply send in the size of the problem as the parameter. Let's see how this is done.<\/p><!--\/introduction--><h3>Contents<\/h3><div><ul><li><a href=\"#432afe20-33cd-4ebe-bcde-ec598dd7431b\">The Code<\/a><\/li><li><a href=\"#492c9b08-26cd-4d5c-a632-85ea853fee9a\">The Sum of all its parts<\/a><\/li><li><a href=\"#85f2e888-49de-46a8-bd45-66b9f0d2eaf8\">Does it fit?<\/a><\/li><\/ul><\/div><h4>The Code<a name=\"432afe20-33cd-4ebe-bcde-ec598dd7431b\"><\/a><\/h4><p>To start, we need some code to scale. Let's look at a simple function that will allow us to simulate some work:<\/p><pre class=\"language-matlab\">\r\n<span class=\"keyword\">function<\/span> doSomeWork(n)\r\n<span class=\"comment\">% Simple function that mimics some constant-time work to be done.<\/span>\r\npause(n);\r\n<span class=\"keyword\">end<\/span>\r\n\r\n<\/pre><p>With this little \"working\" function, we now can play with some algorithms of different time complexity. First, let's look at something that is $\\mathcal{O}(1)$, or constant time:<\/p><pre class=\"language-matlab\">\r\n<span class=\"keyword\">function<\/span> constantTimeFunction(~)\r\ndoSomeWork(.2);\r\n<span class=\"keyword\">end<\/span>\r\n\r\n<\/pre><p>Easy enough. Note how the constantTimeFunction takes in a parameter but simply ignores it. This parameter is intended to be the size of the problem, and this function ignores it because constant time algorithms are the same regardless of size. How can we make something $\\mathcal{O}(n)$, or linear? Put it in a for loop around the size of the problem:<\/p><pre class=\"language-matlab\">\r\n<span class=\"keyword\">function<\/span> linearTimeFunction(n)\r\n<span class=\"keyword\">for<\/span> idx = 1:n\r\n    doSomeWork(0.001);\r\n<span class=\"keyword\">end<\/span>\r\n<span class=\"keyword\">end<\/span>\r\n\r\n<\/pre><p>Now don't worry about the example constants here. They really can be anything I've simply tweaked them to get something reasonable for this demonstration. OK, moving on let's create a $\\mathcal{O}(n^2)$ quadratic time function by using two for-loops around the size:<\/p><pre class=\"language-matlab\">\r\n<span class=\"keyword\">function<\/span> quadraticTimeFunction(n)\r\n<span class=\"keyword\">for<\/span> idx1 = 1:n\r\n    <span class=\"keyword\">for<\/span> idx2 = 1:n\r\n        doSomeWork(0.0001)\r\n    <span class=\"keyword\">end<\/span>\r\n<span class=\"keyword\">end<\/span>\r\n<span class=\"keyword\">end<\/span>\r\n\r\n<\/pre><p>How do we create something that is $\\mathcal{O}(\\log{}n)$? One way is to simply decrease the number of iterations so that as the problem size grows there is less work to do per unit of size. This can look like this:<\/p><pre class=\"language-matlab\">\r\n<span class=\"keyword\">function<\/span> logTimeFunction(n)\r\ndoSomeWork(0.001);\r\n<span class=\"keyword\">while<\/span> n &gt; 1\r\n    n = n\/2;\r\n    doSomeWork(0.01)\r\n<span class=\"keyword\">end<\/span>\r\n<span class=\"keyword\">end<\/span>\r\n\r\n<\/pre><p>Now, finally let's show something with $\\mathcal{O}(2^{n})$, or exponential complexity. Recursion is a nice fallback whenever you want (!) an exponential algorithm.<\/p><pre class=\"language-matlab\">\r\n<span class=\"keyword\">function<\/span> exponentialTimeFunction(n)\r\ndoSomeWork(1e-4);\r\n<span class=\"keyword\">for<\/span> idx = 1:round(n)\r\n    exponentialTimeFunction(idx-5);\r\n<span class=\"keyword\">end<\/span>\r\n<span class=\"keyword\">end<\/span>\r\n\r\n<\/pre><p>Alright, now that we have all of our \"code\", let's see how we can evaluate how it scales. You can do this pretty easily by creating a performance test that uses a <b><tt>problemSize<\/tt><\/b> property as TestParameter. If this property contains the different parameter data in either a cell array or a structure if you'd like to give each point an intent revealing label. For this test lets just create a few points across the first couple decades. Once this parameter is defined, its trivially easy to pass it into the performance test methods and get your results:<\/p><pre class=\"language-matlab\">\r\n<span class=\"keyword\">classdef<\/span> ScalingTest &lt; matlab.perftest.TestCase\r\n       \r\n    \r\n    <span class=\"keyword\">properties<\/span>(TestParameter)\r\n        problemSize = num2cell(round(logspace(0,2,10)));\r\n    <span class=\"keyword\">end<\/span>\r\n    \r\n    \r\n    <span class=\"keyword\">methods<\/span>(Test)\r\n        <span class=\"keyword\">function<\/span> testConstantTime(~, problemSize)\r\n            constantTimeFunction(problemSize);\r\n        <span class=\"keyword\">end<\/span>\r\n        <span class=\"keyword\">function<\/span> testLogTime(~, problemSize)\r\n            logTimeFunction(problemSize);\r\n        <span class=\"keyword\">end<\/span>\r\n        <span class=\"keyword\">function<\/span> testLinearTime(~, problemSize)\r\n            linearTimeFunction(problemSize);\r\n        <span class=\"keyword\">end<\/span>\r\n        <span class=\"keyword\">function<\/span> testQuadraticTime(~, problemSize)\r\n            quadraticTimeFunction(problemSize);\r\n        <span class=\"keyword\">end<\/span>\r\n        <span class=\"keyword\">function<\/span> testExponentialTime(testCase, problemSize)\r\n            testCase.assumeLessThan(problemSize, 50);\r\n            testCase.startMeasuring;\r\n            exponentialTimeFunction(problemSize);\r\n            testCase.stopMeasuring;\r\n        <span class=\"keyword\">end<\/span>\r\n    <span class=\"keyword\">end<\/span>\r\n<span class=\"keyword\">end<\/span>\r\n\r\n<\/pre><p>Let's get our data and look at each algorithm individually. I can do this by calling runperf with the 'Name' option with some wildcards to narrow down on each method of interest, starting with the constant algorithm:<\/p><pre class=\"codeinput\">constantResult = runperf(<span class=\"string\">'ScalingTest'<\/span>, <span class=\"string\">'Name'<\/span>, <span class=\"string\">'*testConstantTime*'<\/span>);\r\n<\/pre><pre class=\"codeoutput\">Running ScalingTest\r\n..........\r\n..........\r\n..........\r\n..........\r\n..........\r\n..........\r\n..........\r\n..........\r\nDone ScalingTest\r\n__________\r\n\r\n<\/pre><p>Let's create a little helper function to get at the minimum value for each size. This will help us quickly pull out the data points. I am going to choose the minimum of each sample because it is Tuesday and Tuesday told me it really likes mins.<\/p><pre class=\"codeinput\">findMins = @(resultArray) arrayfun(@(result) min(result.Samples.MeasuredTime), resultArray);\r\n\r\nax = axes;\r\nylabel(<span class=\"string\">'Execution time (s)'<\/span>);\r\nxlabel(<span class=\"string\">'Problem Size'<\/span>);\r\nhold <span class=\"string\">on<\/span>;\r\ngrid <span class=\"string\">on<\/span>;\r\n\r\nproblemSize = [ScalingTest.problemSize{:}];\r\nplot(ax, problemSize, findMins(constantResult), <span class=\"string\">'LineWidth'<\/span>, 2);\r\nleg = legend({<span class=\"string\">'$\\mathcal{O}(1)$'<\/span>}, <span class=\"string\">'Interpreter'<\/span>, <span class=\"string\">'latex'<\/span>);\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/developer\/files\/2016PerfScalingblog_01.png\" alt=\"\"> <p>This looks all over the place until you realize the scale is off. Adjusting the y axis to start at zero shows its pretty flat.<\/p><pre class=\"codeinput\">ax.YLim(1) = 0;\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/developer\/files\/2016PerfScalingblog_02.png\" alt=\"\"> <p>Alright now we can start adding the more complex algorithms and see how they compare. Next logarithmic:<\/p><pre class=\"codeinput\">logResult = runperf(<span class=\"string\">'ScalingTest'<\/span>, <span class=\"string\">'Name'<\/span>, <span class=\"string\">'*testLogTime*'<\/span>);\r\nplot(ax, problemSize, findMins(logResult), <span class=\"string\">'LineWidth'<\/span>, 2);\r\naxis <span class=\"string\">tight<\/span>;\r\nleg = legend([leg.String <span class=\"string\">'$\\mathcal{O}(\\log{} n)$'<\/span>],<span class=\"string\">'Interpreter'<\/span>,<span class=\"string\">'latex'<\/span>);\r\n<\/pre><pre class=\"codeoutput\">Running ScalingTest\r\n..........\r\n..........\r\n..........\r\n..........\r\n..........\r\n..........\r\n..........\r\n..........\r\n....\r\nDone ScalingTest\r\n__________\r\n\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/developer\/files\/2016PerfScalingblog_03.png\" alt=\"\"> <p>Then linear:<\/p><pre class=\"codeinput\">linearResult = runperf(<span class=\"string\">'ScalingTest'<\/span>, <span class=\"string\">'Name'<\/span>, <span class=\"string\">'*testLinearTime*'<\/span>);\r\nplot(ax, problemSize, findMins(linearResult), <span class=\"string\">'LineWidth'<\/span>, 2);\r\naxis <span class=\"string\">tight<\/span>;\r\nleg = legend([leg.String <span class=\"string\">'$\\mathcal{O}(n)$'<\/span>],<span class=\"string\">'Interpreter'<\/span>,<span class=\"string\">'latex'<\/span>);\r\n<\/pre><pre class=\"codeoutput\">Running ScalingTest\r\n..........\r\n..........\r\n..........\r\n..........\r\n..........\r\n..........\r\n..........\r\n..........\r\n........\r\nDone ScalingTest\r\n__________\r\n\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/developer\/files\/2016PerfScalingblog_04.png\" alt=\"\"> <p>Then quadratic:<\/p><pre class=\"codeinput\">quadraticResult = runperf(<span class=\"string\">'ScalingTest'<\/span>, <span class=\"string\">'Name'<\/span>, <span class=\"string\">'*testQuadraticTime*'<\/span>);\r\nplot(ax, problemSize, findMins(quadraticResult), <span class=\"string\">'LineWidth'<\/span>, 2);\r\naxis <span class=\"string\">tight<\/span>;\r\nleg = legend([leg.String <span class=\"string\">'$\\mathcal{O}(n^2)$'<\/span>],<span class=\"string\">'Interpreter'<\/span>,<span class=\"string\">'latex'<\/span>);\r\n<\/pre><pre class=\"codeoutput\">Running ScalingTest\r\n..........\r\n..........\r\n..........\r\n..........\r\n..........\r\n..........\r\n..........\r\n..........\r\n..........\r\nDone ScalingTest\r\n__________\r\n\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/developer\/files\/2016PerfScalingblog_05.png\" alt=\"\"> <p>...and finally exponential. Note that since the exponential blows up so quickly we needed to limit the problem size by using an assumption. Since we don't want to time the assumption, we are explicitly setting the measurement boundary to exclude the assumption call. For those sizes that fail the assumption, the result's <b><tt>Valid<\/tt><\/b> property will be false, so we can use that to trim the data for only the valid measurements we collected.<\/p><pre class=\"codeinput\">exponentialResult = runperf(<span class=\"string\">'ScalingTest'<\/span>, <span class=\"string\">'Name'<\/span>, <span class=\"string\">'*testExponentialTime*'<\/span>);\r\nvalid = [exponentialResult.Valid];\r\nvalidResult = exponentialResult(valid);\r\nvalidSize = problemSize(valid);\r\n\r\nplot(ax, validSize, findMins(validResult), <span class=\"string\">'LineWidth'<\/span>, 2);\r\n\r\nleg = legend([leg.String <span class=\"string\">'$\\mathcal{O}(2^n)$'<\/span>],<span class=\"string\">'Interpreter'<\/span>,<span class=\"string\">'latex'<\/span>);\r\n<\/pre><pre class=\"codeoutput\">Running ScalingTest\r\n..........\r\n..........\r\n..........\r\n..........\r\n..........\r\n..........\r\n..........\r\n..........\r\n..\r\n================================================================================\r\nScalingTest\/testExponentialTime(problemSize=value9) was filtered.\r\n================================================================================\r\n.\r\n================================================================================\r\nScalingTest\/testExponentialTime(problemSize=value10) was filtered.\r\n================================================================================\r\n.\r\nDone ScalingTest\r\n__________\r\n\r\nFailure Summary:\r\n\r\n     Name                                                  Failed  Incomplete  Reason(s)\r\n    ===================================================================================================\r\n     ScalingTest\/testExponentialTime(problemSize=value9)               X       Filtered by assumption.\r\n    ---------------------------------------------------------------------------------------------------\r\n     ScalingTest\/testExponentialTime(problemSize=value10)              X       Filtered by assumption.\r\n    \r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/developer\/files\/2016PerfScalingblog_06.png\" alt=\"\"> <p>It is now clear that trying to see this on a linear scale is getting ridiculous. Let's go log scale.<\/p><pre class=\"codeinput\">ax.YScale = <span class=\"string\">'log'<\/span>;\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/developer\/files\/2016PerfScalingblog_07.png\" alt=\"\"> <p>There we go! Here is the scale of all of our dummy algorithms. As you can see, the algorithmic complexity is very important. As the problem sizes scale, large constant factors quickly become irrelevant.<\/p><h4>The Sum of all its parts<a name=\"492c9b08-26cd-4d5c-a632-85ea853fee9a\"><\/a><\/h4><p>However, just for kicks lets do some MATLAB fun to find the constant factors. If you can't improve the algorithm's complexity, perhaps you might want to see where the constant factors are in order to see ways to improve the experience for achievable problem sizes. Well, in MATLAB this can be done using the ever beautiful matrix left divide. First lets put all of this code into one function and gather the scalability data<\/p><pre class=\"language-matlab\">\r\n<span class=\"keyword\">function<\/span> complexFunction(n)\r\n\r\nconstantTimeFunction(n);\r\nlogTimeFunction(n);\r\nlinearTimeFunction(n);\r\nquadraticTimeFunction(n);\r\nexponentialTimeFunction(n);\r\n\r\n<span class=\"keyword\">end<\/span>\r\n\r\n<\/pre><p>Now let's write the test and gather the runtime data<\/p><pre class=\"language-matlab\">\r\n<span class=\"keyword\">classdef<\/span> FullComplexityTest &lt; matlab.perftest.TestCase\r\n       \r\n    \r\n    <span class=\"keyword\">properties<\/span>(TestParameter)\r\n        problemSize = {1, 2, 3, 5, 8, 13, 22};\r\n    <span class=\"keyword\">end<\/span>\r\n    \r\n    \r\n    <span class=\"keyword\">methods<\/span>(Test)\r\n        <span class=\"keyword\">function<\/span> testComplexFunction(~, problemSize)\r\n            complexFunction(problemSize);\r\n        <span class=\"keyword\">end<\/span>\r\n    <span class=\"keyword\">end<\/span>\r\n<span class=\"keyword\">end<\/span>\r\n\r\n<\/pre><pre class=\"codeinput\">additiveResult = runperf(<span class=\"string\">'FullComplexityTest'<\/span>);\r\nproblemSize = [FullComplexityTest.problemSize{:}];\r\nexecutionTime = findMins(additiveResult);\r\n\r\n<span class=\"comment\">% Start with a fresh axes<\/span>\r\ndelete(ax)\r\nax = axes;\r\nylabel(<span class=\"string\">'Execution time (s)'<\/span>);\r\nxlabel(<span class=\"string\">'Problem Size'<\/span>);\r\nhold <span class=\"string\">on<\/span>;\r\ngrid <span class=\"string\">on<\/span>;\r\n\r\nplot(ax, problemSize, executionTime, <span class=\"string\">'o'<\/span>);\r\n<\/pre><pre class=\"codeoutput\">Running FullComplexityTest\r\n..........\r\n..........\r\n..........\r\n..........\r\n..........\r\n......\r\nDone FullComplexityTest\r\n__________\r\n\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/developer\/files\/2016PerfScalingblog_08.png\" alt=\"\"> <h4>Does it fit?<a name=\"85f2e888-49de-46a8-bd45-66b9f0d2eaf8\"><\/a><\/h4><p>Now let's fit the data! Note that $y= k_0 + k_1\\log{}n + k_2n + k_3n^2 + k_42^n$<\/p><pre class=\"codeinput\">format <span class=\"string\">long<\/span>\r\nn = problemSize.';\r\n\r\ny = executionTime.';\r\nX = [ones(size(n)), log(n), n, n.^2, 2.^n];\r\nk = X\\y\r\n\r\nN = (0:0.1:22)';\r\nY = [ones(size(N)), log(N), N, N.^2, 2.^N]*k;\r\nplot(ax, N,Y,<span class=\"string\">'-'<\/span>,<span class=\"string\">'LineWidth'<\/span>,2);\r\n<\/pre><pre class=\"codeoutput\">\r\nk =\r\n\r\n   0.203249252542941\r\n   0.024019014853585\r\n  -0.000703911764008\r\n   0.000243945808296\r\n   0.000000018436312\r\n\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/developer\/files\/2016PerfScalingblog_09.png\" alt=\"\"> <p>...and there we go. Lookin' good.<\/p><p>What do you think, do you see yourself analyzing how your MATLAB code scales using this technique?<\/p><script language=\"JavaScript\"> <!-- \r\n    function grabCode_5b96b6d8d90e4ad4bb5531fa27c40e19() {\r\n        \/\/ Remember the title so we can use it in the new page\r\n        title = document.title;\r\n\r\n        \/\/ Break up these strings so that their presence\r\n        \/\/ in the Javascript doesn't mess up the search for\r\n        \/\/ the MATLAB code.\r\n        t1='5b96b6d8d90e4ad4bb5531fa27c40e19 ' + '##### ' + 'SOURCE BEGIN' + ' #####';\r\n        t2='##### ' + 'SOURCE END' + ' #####' + ' 5b96b6d8d90e4ad4bb5531fa27c40e19';\r\n    \r\n        b=document.getElementsByTagName('body')[0];\r\n        i1=b.innerHTML.indexOf(t1)+t1.length;\r\n        i2=b.innerHTML.indexOf(t2);\r\n \r\n        code_string = b.innerHTML.substring(i1, i2);\r\n        code_string = code_string.replace(\/REPLACE_WITH_DASH_DASH\/g,'--');\r\n\r\n        \/\/ Use \/x3C\/g instead of the less-than character to avoid errors \r\n        \/\/ in the XML parser.\r\n        \/\/ Use '\\x26#60;' instead of '<' so that the XML parser\r\n        \/\/ doesn't go ahead and substitute the less-than character. \r\n        code_string = code_string.replace(\/\\x3C\/g, '\\x26#60;');\r\n\r\n        copyright = 'Copyright 2016 The MathWorks, Inc.';\r\n\r\n        w = window.open();\r\n        d = w.document;\r\n        d.write('<pre>\\n');\r\n        d.write(code_string);\r\n\r\n        \/\/ Add copyright line at the bottom if specified.\r\n        if (copyright.length > 0) {\r\n            d.writeln('');\r\n            d.writeln('%%');\r\n            if (copyright.length > 0) {\r\n                d.writeln('% _' + copyright + '_');\r\n            }\r\n        }\r\n\r\n        d.write('<\/pre>\\n');\r\n\r\n        d.title = title + ' (MATLAB code)';\r\n        d.close();\r\n    }   \r\n     --> <\/script><p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br><a href=\"javascript:grabCode_5b96b6d8d90e4ad4bb5531fa27c40e19()\"><span style=\"font-size: x-small;        font-style: italic;\">Get \r\n      the MATLAB code <noscript>(requires JavaScript)<\/noscript><\/span><\/a><br><br>\r\n      Published with MATLAB&reg; R2016a<br><\/p><\/div><!--\r\n5b96b6d8d90e4ad4bb5531fa27c40e19 ##### SOURCE BEGIN #####\r\n%% \r\n%\r\n% To scale or not to scale. That is the question. When talking about\r\n% algorithmic complexity, the answer to said question is usually an\r\n% important one. No matter what the constant factors are that affect your\r\n% algorithm's speed, poor scaling performance severely limits the scope of\r\n% the problems you can solve. How do you measure your code's runtime\r\n% complexity? If you don't have a good method for this (or even if you do!)\r\n% then once again the performance testing framework is on your side. I\r\n% was *amazed* when I learned of this *one weird trick* to measure the\r\n% complexity of my algorithms.\r\n% \r\n% The (not so) secret here is that now that\r\n% <https:\/\/blogs.mathworks.com\/developer\/2016\/05\/06\/performance-review-criteria-2-sticks-to-solid-principles\/#2c2fec16-4dec-4277-9c75-0841ddafef3a\r\n% performance tests are tests>, performance tests can be\r\n% <https:\/\/www.mathworks.com\/help\/matlab\/matlab_prog\/create-basic-parameterized-test.html\r\n% parameterized tests>! Leveraging this feature we can simply send in the\r\n% size of the problem as the parameter. Let's see how this is done.\r\n% \r\n%% The Code \r\n% To start, we need some code to scale. Let's look at a simple function\r\n% that will allow us to simulate some work:\r\n% \r\n% <include>doSomeWork.m<\/include>\r\n%\r\n% With this little \"working\" function, we now can play with some algorithms\r\n% of different time complexity. First, let's look at something that is\r\n% $\\mathcal{O}(1)$, or constant time:\r\n% \r\n% <include>constantTimeFunction<\/include>\r\n%\r\n% Easy enough. Note how the constantTimeFunction takes in a parameter but\r\n% simply ignores it. This parameter is intended to be the size of the\r\n% problem, and this function ignores it because constant time algorithms\r\n% are the same regardless of size. How can we make something\r\n% $\\mathcal{O}(n)$, or linear? Put it in a for loop around the size of the\r\n% problem:\r\n%\r\n% <include>linearTimeFunction<\/include>\r\n%\r\n% Now don't worry about the example constants here. They really can be\r\n% anything I've simply tweaked them to get something reasonable for this\r\n% demonstration. OK, moving on let's create a $\\mathcal{O}(n^2)$ quadratic\r\n% time function by using two for-loops around the size:\r\n%\r\n% <include>quadraticTimeFunction<\/include>\r\n%\r\n% How do we create something that is $\\mathcal{O}(\\log{}n)$? One way is to\r\n% simply decrease the number of iterations so that as the problem size\r\n% grows there is less work to do per unit of size. This can look like this:\r\n%\r\n% <include>logTimeFunction<\/include>\r\n%\r\n% Now, finally let's show something with $\\mathcal{O}(2^{n})$, or\r\n% exponential complexity. Recursion is a nice fallback whenever you want\r\n% (!) an exponential algorithm.\r\n%\r\n% <include>exponentialTimeFunction<\/include>\r\n%\r\n% Alright, now that we have all of our \"code\", let's see how we can\r\n% evaluate how it scales. You can do this pretty easily by creating a\r\n% performance test that uses a *|problemSize|* property as TestParameter.\r\n% If this property contains the different parameter data in either a cell\r\n% array or a structure if you'd like to give each point an intent revealing\r\n% label. For this test lets just create a few points across the first\r\n% couple decades. Once this parameter is defined, its trivially easy to\r\n% pass it into the performance test methods and get your results:\r\n%\r\n% <include>ScalingTest<\/include>\r\n% \r\n% Let's get our data and look at each algorithm individually. I can do this\r\n% by calling runperf with the 'Name' option with some wildcards to narrow\r\n% down on each method of interest, starting with the constant algorithm:\r\nconstantResult = runperf('ScalingTest', 'Name', '*testConstantTime*');\r\n\r\n%%\r\n% Let's create a little helper function to get at the minimum value for each\r\n% size. This will help us quickly pull out the data points. I am going to\r\n% choose the minimum of each sample because it is Tuesday and Tuesday told\r\n% me it really likes mins.\r\nfindMins = @(resultArray) arrayfun(@(result) min(result.Samples.MeasuredTime), resultArray);\r\n\r\nax = axes;\r\nylabel('Execution time (s)');\r\nxlabel('Problem Size');\r\nhold on;\r\ngrid on;\r\n\r\nproblemSize = [ScalingTest.problemSize{:}];\r\nplot(ax, problemSize, findMins(constantResult), 'LineWidth', 2);\r\nleg = legend({'$\\mathcal{O}(1)$'}, 'Interpreter', 'latex');\r\n\r\n%%\r\n% This looks all over the place until you realize the scale is off.\r\n% Adjusting the y axis to start at zero shows its pretty flat.\r\nax.YLim(1) = 0;\r\n\r\n%%\r\n% Alright now we can start adding the more complex algorithms and see how\r\n% they compare. Next logarithmic:\r\nlogResult = runperf('ScalingTest', 'Name', '*testLogTime*');\r\nplot(ax, problemSize, findMins(logResult), 'LineWidth', 2);\r\naxis tight;\r\nleg = legend([leg.String '$\\mathcal{O}(\\log{} n)$'],'Interpreter','latex');\r\n\r\n%%\r\n% Then linear:\r\nlinearResult = runperf('ScalingTest', 'Name', '*testLinearTime*');\r\nplot(ax, problemSize, findMins(linearResult), 'LineWidth', 2);\r\naxis tight;\r\nleg = legend([leg.String '$\\mathcal{O}(n)$'],'Interpreter','latex');\r\n\r\n\r\n%%\r\n% Then quadratic:\r\nquadraticResult = runperf('ScalingTest', 'Name', '*testQuadraticTime*');\r\nplot(ax, problemSize, findMins(quadraticResult), 'LineWidth', 2);\r\naxis tight;\r\nleg = legend([leg.String '$\\mathcal{O}(n^2)$'],'Interpreter','latex');\r\n\r\n\r\n%%\r\n% ...and finally exponential. Note that since the exponential blows up so\r\n% quickly we needed to limit the problem size by using an assumption. Since\r\n% we don't want to time the assumption, we are explicitly setting the\r\n% measurement boundary to exclude the assumption call. For those sizes that\r\n% fail the assumption, the result's *|Valid|* property will be false, so we\r\n% can use that to trim the data for only the valid measurements we collected.\r\n \r\nexponentialResult = runperf('ScalingTest', 'Name', '*testExponentialTime*');\r\nvalid = [exponentialResult.Valid];\r\nvalidResult = exponentialResult(valid);\r\nvalidSize = problemSize(valid);\r\n\r\nplot(ax, validSize, findMins(validResult), 'LineWidth', 2);\r\n\r\nleg = legend([leg.String '$\\mathcal{O}(2^n)$'],'Interpreter','latex');\r\n\r\n%%\r\n% It is now clear that trying to see this on a linear scale is getting\r\n% ridiculous. Let's go log scale.\r\nax.YScale = 'log';\r\n\r\n%%\r\n% There we go! Here is the scale of all of our dummy algorithms. As you can\r\n% see, the algorithmic complexity is very important. As the problem sizes\r\n% scale, large constant factors quickly become irrelevant. \r\n\r\n%% The Sum of all its parts\r\n% However, just for kicks lets do some MATLAB fun to find the constant\r\n% factors. If you can't improve the algorithm's complexity, perhaps you\r\n% might want to see where the constant factors are in order to see ways to\r\n% improve the experience for achievable problem sizes. Well, in MATLAB this\r\n% can be done using the ever beautiful matrix left divide. First lets put\r\n% all of this code into one function and gather the scalability data\r\n%\r\n% <include>complexFunction<\/include>\r\n%\r\n% Now let's write the test and gather the runtime data\r\n%\r\n% <include>FullComplexityTest<\/include>\r\nadditiveResult = runperf('FullComplexityTest');\r\nproblemSize = [FullComplexityTest.problemSize{:}];\r\nexecutionTime = findMins(additiveResult);\r\n\r\n% Start with a fresh axes\r\ndelete(ax)\r\nax = axes;\r\nylabel('Execution time (s)');\r\nxlabel('Problem Size');\r\nhold on;\r\ngrid on;\r\n\r\nplot(ax, problemSize, executionTime, 'o');\r\n\r\n%% Does it fit?\r\n% Now let's fit the data! Note that $y= k_0 + k_1\\log{}n + k_2n + k_3n^2 + k_42^n$\r\nformat long\r\nn = problemSize.';\r\n\r\ny = executionTime.';\r\nX = [ones(size(n)), log(n), n, n.^2, 2.^n];\r\nk = X\\y\r\n\r\nN = (0:0.1:22)';\r\nY = [ones(size(N)), log(N), N, N.^2, 2.^N]*k;\r\nplot(ax, N,Y,'-','LineWidth',2);\r\n\r\n%%\r\n% ...and there we go. Lookin' good.\r\n%\r\n% What do you think, do you see yourself analyzing how your MATLAB code\r\n% scales using this technique? \r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n    \r\n\r\n\r\n##### SOURCE END ##### 5b96b6d8d90e4ad4bb5531fa27c40e19\r\n-->","protected":false},"excerpt":{"rendered":"<div class=\"overview-image\"><img src=\"https:\/\/blogs.mathworks.com\/developer\/files\/2016PerfScalingblog_07.png\" class=\"img-responsive attachment-post-thumbnail size-post-thumbnail wp-post-image\" alt=\"\" decoding=\"async\" loading=\"lazy\" \/><\/div><!--introduction--><p>To scale or not to scale. That is the question. When talking about algorithmic complexity, the answer to said question is usually an important one. No matter what the constant factors are that affect your algorithm's speed, poor scaling performance severely limits the scope of the problems you can solve. How do you measure your code's runtime complexity? If you don't have a good method for this (or even if you do!) then once again the performance testing framework is on your side. I was <b>amazed<\/b> when I learned of this <b>one weird trick<\/b> to measure the complexity of my algorithms.... <a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/developer\/2016\/05\/24\/performance-algorithm-scaling\/\">read more >><\/a><\/p>","protected":false},"author":90,"featured_media":563,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[13,7],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/developer\/wp-json\/wp\/v2\/posts\/524"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/developer\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/developer\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/developer\/wp-json\/wp\/v2\/users\/90"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/developer\/wp-json\/wp\/v2\/comments?post=524"}],"version-history":[{"count":10,"href":"https:\/\/blogs.mathworks.com\/developer\/wp-json\/wp\/v2\/posts\/524\/revisions"}],"predecessor-version":[{"id":570,"href":"https:\/\/blogs.mathworks.com\/developer\/wp-json\/wp\/v2\/posts\/524\/revisions\/570"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/developer\/wp-json\/wp\/v2\/media\/563"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/developer\/wp-json\/wp\/v2\/media?parent=524"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/developer\/wp-json\/wp\/v2\/categories?post=524"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/developer\/wp-json\/wp\/v2\/tags?post=524"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}