{"id":107,"date":"2015-03-13T13:36:16","date_gmt":"2015-03-13T13:36:16","guid":{"rendered":"https:\/\/blogs.mathworks.com\/developer\/?p=107"},"modified":"2015-03-13T18:28:55","modified_gmt":"2015-03-13T18:28:55","slug":"encouragingly-parallel-epilogue","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/developer\/2015\/03\/13\/encouragingly-parallel-epilogue\/","title":{"rendered":"Encouragingly Parallel (Epilogue)"},"content":{"rendered":"<div class=\"content\"><!--introduction--><p>OK, so remember how we just went through the exercise (not just <a href=\"https:\/\/blogs.mathworks.com\/developer\/2015\/02\/20\/encouragingly-parallel-part-1\/\">once<\/a>, but <a href=\"https:\/\/blogs.mathworks.com\/developer\/2015\/03\/03\/encouragingly-parallel-part-2\/\">twice<\/a>) of how you can write a function to help parallelize your test runs? Well all of that still applies if you have not yet upgraded to MATLAB R2015a. However, if you do have R2015a you now get this out of the box because the TestRunner now has a <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/matlab.unittest.testrunner.runinparallel.html\">runInParallel<\/a> method!<\/p><!--\/introduction--><p>One last time, we can create the same representative suite used in the previous posts:<\/p><pre class=\"codeinput\">import <span class=\"string\">matlab.unittest.TestSuite<\/span>;\r\nclassSuite = TestSuite.fromFile(<span class=\"string\">'aClassBasedTest.m'<\/span>);\r\nfcnSuite = TestSuite.fromFile(<span class=\"string\">'aFunctionBasedTest.m'<\/span>);\r\nscriptSuite = TestSuite.fromFile(<span class=\"string\">'aScriptBasedTest.m'<\/span>);\r\n\r\nsuite = [repmat(classSuite, 1, 50), repmat(fcnSuite, 1, 50), repmat(scriptSuite, 1, 50)];\r\n<\/pre><p>Next we create a TestRunner explicitly. 
To achieve the same style of test output we'll need to create the runner as configured with text output.<\/p><pre class=\"codeinput\">import <span class=\"string\">matlab.unittest.TestRunner<\/span>;\r\nrunner = TestRunner.withTextOutput;\r\n<\/pre><p>Now, just run it in parallel!<\/p><pre class=\"codeinput\">tic;\r\nrunner.runInParallel(suite)\r\ntoc;\r\n<\/pre><pre class=\"codeoutput\">Split tests into 48 groups and running them on 16 workers.\r\n-----------------\r\nFinished Group 15\r\n-----------------\r\nRunning aFunctionBasedTest\r\n.......\r\nDone aFunctionBasedTest\r\n__________\r\n\r\n\r\n-----------------\r\nFinished Group 14\r\n-----------------\r\nRunning aFunctionBasedTest\r\n........\r\nDone aFunctionBasedTest\r\n__________\r\n\r\n\r\n&lt;SNIP: Output truncated because you get the idea&gt;\r\n\r\n\r\n-----------------\r\nFinished Group 25\r\n-----------------\r\nRunning aFunctionBasedTest\r\n......\r\nDone aFunctionBasedTest\r\n__________\r\n\r\n\r\n-----------------\r\nFinished Group 47\r\n-----------------\r\nRunning aScriptBasedTest\r\n...\r\nDone aScriptBasedTest\r\n__________\r\n\r\n\r\n-----------------\r\nFinished Group 44\r\n-----------------\r\nRunning aScriptBasedTest\r\n....\r\nDone aScriptBasedTest\r\n__________\r\n\r\n\r\n\r\nans = \r\n\r\n  300x1 TestResult array with properties:\r\n\r\n    Name\r\n    Passed\r\n    Failed\r\n    Incomplete\r\n    Duration\r\n\r\nTotals:\r\n   300 Passed, 0 Failed, 0 Incomplete.\r\n   482.5131 seconds testing time.\r\n\r\nElapsed time is 33.861671 seconds.\r\n<\/pre><p>The result here is for the most part the same as described in the last post. However, it does apply a slightly different heuristic. I need to get something off my chest here and admit to one more possible problem with the algorithm we have developed in the past couple posts. First note that the example test suite we are using is front-loaded with the most expensive tests. 
Remember the class-based tests are first in the array and these tests contain the expensive system test (<a href=\"https:\/\/blogs.mathworks.com\/developer\/2015\/02\/20\/encouragingly-parallel-part-1\/#05de8e27-8e92-4a4c-a34b-674bb46d8ba2\">reminder<\/a>). Since we are scheduling these first we are in great shape, but think about the case where all of these expensive tests happen to fall on the end of the array rather than the front. It would be unfortunate to enjoy the benefits of parallelism for most of the tests only to have the last group(s) get stuck with the long pole. If you think about it, the system is resilient to the early groups being the long pole because in that event we can still utilize the full set of workers and crank away at the other groups while the long pole executes. However, if the long pole is the last job to get scheduled we have most of the pool sitting idle at the end of the run while we wait for the last group.<\/p><p>For this reason, the runInParallel method of the TestRunner tweaks the scheduling a bit by sending larger chunks of the test suite in the early schedules and by gradually reducing the size as the later groups are scheduled. This means we pay a small price in the best case when the expensive tests are in front but have more consistency and pay a lower price in the worst case when the expensive tests are in back. (Note that runInParallel may use a different scheduling algorithm in future releases to allow continued improvements.)<\/p><p>I'll spare you the experiment, but I ran this test suite in these four scenarios:<\/p><div><ol><li>front heavy using runWithParFeval (the previous blog post's algorithm)<\/li><li>back heavy using runWithParFeval<\/li><li>front heavy using runInParallel (R2015a's new method of TestRunner)<\/li><li>back heavy using runInParallel<\/li><\/ol><\/div><p>Remember that the suite has some inherent randomness in its runtime. 
This way we can get a better sense of how arbitrary test suites might behave. However, because of this randomness I ran these scenarios 100 times each to get a better picture of the behavior. I saved these results to a mat file so you can just see the result.<\/p><pre class=\"codeinput\">data = load(<span class=\"string\">'runData'<\/span>);\r\n<\/pre><p>Comparing runWithParFeval with runInParallel for the front-loaded suite, we see that there is not much effect in the best case:<\/p><pre class=\"codeinput\">clf;\r\nhold <span class=\"string\">on<\/span>;\r\nhistogram(data.runInParallelFrontHeavy, <span class=\"string\">'BinWidth'<\/span>, 1);\r\nhistogram(data.runWithParFevalFrontHeavy, <span class=\"string\">'BinWidth'<\/span>, 1);\r\nhold <span class=\"string\">off<\/span>;\r\n\r\n\r\ntitle(<span class=\"string\">'R2015a Algorithm vs. Blog Algorithm (Front Heavy Suite)'<\/span>)\r\nlegend(<span class=\"string\">'R2015a Algorithm'<\/span>, <span class=\"string\">'Blog Algorithm'<\/span>)\r\nxlabel(<span class=\"string\">'Wallclock Time to run full suite in parallel (seconds)'<\/span>)\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/developer\/files\/blog3_01.png\" alt=\"\"> <p>There is a bit more variance in the R2015a algorithm. This is expected. Remember it was really the best case for the blog algorithm to operate on a front heavy suite, but in general we don't have enough information to achieve this in practice (in other words, here we are just getting lucky). 
However, if we look at the case where the expensive suite elements are included at the end of the suite, we can see the R2015a approach does better:<\/p><pre class=\"codeinput\">clf;\r\nhold <span class=\"string\">on<\/span>;\r\nhistogram(data.runInParallelBackHeavy, <span class=\"string\">'BinWidth'<\/span>, 1);\r\nhistogram(data.runWithParFevalBackHeavy, <span class=\"string\">'BinWidth'<\/span>, 1);\r\nhold <span class=\"string\">off<\/span>;\r\n\r\n\r\ntitle(<span class=\"string\">'R2015a Algorithm vs. Blog Algorithm (Back Heavy Suite)'<\/span>)\r\nlegend(<span class=\"string\">'R2015a Algorithm'<\/span>, <span class=\"string\">'Blog Algorithm'<\/span>)\r\nxlabel(<span class=\"string\">'Wallclock Time to run full suite in parallel (seconds)'<\/span>)\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/developer\/files\/blog3_02.png\" alt=\"\"> <p>There you have it. Adjusting the scheduling of the groups to gradually decrease the group size prevents the run from having too large a segment scheduled as the last suite. Also, you can't beat the ease of use: all it takes is a simple call to the new runInParallel method of TestRunner. 
Go check it out and let us know what you think!<\/p><script language=\"JavaScript\"> <!-- \r\n    function grabCode_2ff24e348af7429b9ccd294c670ba90e() {\r\n        \/\/ Remember the title so we can use it in the new page\r\n        title = document.title;\r\n\r\n        \/\/ Break up these strings so that their presence\r\n        \/\/ in the Javascript doesn't mess up the search for\r\n        \/\/ the MATLAB code.\r\n        t1='2ff24e348af7429b9ccd294c670ba90e ' + '##### ' + 'SOURCE BEGIN' + ' #####';\r\n        t2='##### ' + 'SOURCE END' + ' #####' + ' 2ff24e348af7429b9ccd294c670ba90e';\r\n    \r\n        b=document.getElementsByTagName('body')[0];\r\n        i1=b.innerHTML.indexOf(t1)+t1.length;\r\n        i2=b.innerHTML.indexOf(t2);\r\n \r\n        code_string = b.innerHTML.substring(i1, i2);\r\n        code_string = code_string.replace(\/REPLACE_WITH_DASH_DASH\/g,'--');\r\n\r\n        \/\/ Use \/x3C\/g instead of the less-than character to avoid errors \r\n        \/\/ in the XML parser.\r\n        \/\/ Use '\\x26#60;' instead of '<' so that the XML parser\r\n        \/\/ doesn't go ahead and substitute the less-than character. 
\r\n        code_string = code_string.replace(\/\\x3C\/g, '\\x26#60;');\r\n\r\n        copyright = 'Copyright 2015 The MathWorks, Inc.';\r\n\r\n        w = window.open();\r\n        d = w.document;\r\n        d.write('<pre>\\n');\r\n        d.write(code_string);\r\n\r\n        \/\/ Add copyright line at the bottom if specified.\r\n        if (copyright.length > 0) {\r\n            d.writeln('');\r\n            d.writeln('%%');\r\n            if (copyright.length > 0) {\r\n                d.writeln('% _' + copyright + '_');\r\n            }\r\n        }\r\n\r\n        d.write('<\/pre>\\n');\r\n\r\n        d.title = title + ' (MATLAB code)';\r\n        d.close();\r\n    }   \r\n     --> <\/script><p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br><a href=\"javascript:grabCode_2ff24e348af7429b9ccd294c670ba90e()\"><span style=\"font-size: x-small;        font-style: italic;\">Get \r\n      the MATLAB code <noscript>(requires JavaScript)<\/noscript><\/span><\/a><br><br>\r\n      Published with MATLAB&reg; R2015a<br><\/p><\/div><!--\r\n2ff24e348af7429b9ccd294c670ba90e ##### SOURCE BEGIN #####\r\n%% Encouragingly Parallel (Epilogue)\r\n%\r\n% OK, so remember how we just went through the exercise (not just\r\n% <https:\/\/blogs.mathworks.com\/developer\/2015\/02\/20\/encouragingly-parallel-part-1\/\r\n% once>, but\r\n% <https:\/\/blogs.mathworks.com\/developer\/2015\/03\/03\/encouragingly-parallel-part-2\/\r\n% twice>) of how you can write a function to help parallelize your test\r\n% runs? Well all of that still applies if you have not yet upgraded to\r\n% MATLAB R2015a. 
However, if you do have R2015a you now get this out of the\r\n% box because the TestRunner now has a\r\n% <https:\/\/www.mathworks.com\/help\/matlab\/ref\/matlab.unittest.testrunner.runinparallel.html\r\n% runInParallel> method!\r\n% \r\n%%\r\n%\r\n% One last time, we can create the same representative suite used in the previous posts:\r\nimport matlab.unittest.TestSuite;\r\nclassSuite = TestSuite.fromFile('aClassBasedTest.m');\r\nfcnSuite = TestSuite.fromFile('aFunctionBasedTest.m');\r\nscriptSuite = TestSuite.fromFile('aScriptBasedTest.m');\r\n\r\nsuite = [repmat(classSuite, 1, 50), repmat(fcnSuite, 1, 50), repmat(scriptSuite, 1, 50)];\r\n\r\n%%\r\n% Next we create a TestRunner explicitly. To achieve the same style of test\r\n% output we'll need to create the runner as configured with text output.\r\nimport matlab.unittest.TestRunner;\r\nrunner = TestRunner.withTextOutput;\r\n\r\n%%\r\n% Now, just run it in parallel!\r\ntic;\r\nrunner.runInParallel(suite)\r\ntoc;\r\n\r\n%%\r\n% The result here is for the most part the same as described in the last\r\n% post. However, it does apply a slightly different heuristic. I need to\r\n% get something off my chest here and admit to one more possible problem\r\n% with the algorithm we have developed in the past couple posts. First note\r\n% that the example test suite we are using is front-loaded with the most\r\n% expensive tests. Remember the class-based tests are first in the array\r\n% and these tests contain the expensive system test\r\n% (<https:\/\/blogs.mathworks.com\/developer\/2015\/02\/20\/encouragingly-parallel-part-1\/#05de8e27-8e92-4a4c-a34b-674bb46d8ba2\r\n% reminder>). Since we are scheduling these first we are in great shape,\r\n% but think about the case where all of these expensive tests happen to\r\n% fall on the end of the array rather than the front. It would be\r\n% unfortunate to enjoy the benefits of parallelism for most of the tests\r\n% only to have the last group(s) get stuck with the long pole. 
If you think\r\n% about it, the system is resilient to the early groups being the long pole\r\n% because in that event we can still utilize the full set of workers and\r\n% crank away at the other groups while the long pole executes. However, if\r\n% the long pole is the last job to get scheduled we have most of the pool\r\n% sitting idle at the end of the run while we wait for the last group.\r\n%\r\n%%\r\n% For this reason, the runInParallel method of the TestRunner tweaks the\r\n% scheduling a bit by sending larger chunks of the test suite in the early\r\n% schedules and by gradually reducing the size as the later groups are\r\n% scheduled. This means we pay a small price in the best\r\n% case when the expensive tests are in front but have more\r\n% consistency and pay a lower price in the worst case when the expensive\r\n% tests are in back. (Note that runInParallel may use a different\r\n% scheduling algorithm in future releases to allow continued\r\n% improvements.)\r\n%\r\n%%\r\n% I'll spare you the experiment, but I ran this test suite in these four\r\n% scenarios:\r\n% \r\n% # front heavy using runWithParFeval (the previous blog post's algorithm)\r\n% # back heavy using runWithParFeval\r\n% # front heavy using runInParallel (R2015a's new method of TestRunner)\r\n% # back heavy using runInParallel\r\n%\r\n% Remember that the suite has some inherent randomness in its runtime. This\r\n% way we can get a better sense of how arbitrary test suites might behave.\r\n% However, because of this randomness I ran these scenarios 100 times each\r\n% to get a better picture of the behavior. I saved these results to a mat\r\n% file so you can just see the result. 
\r\ndata = load('runData');\r\n\r\n%%\r\n% Comparing runWithParFeval with runInParallel for the front-loaded suite,\r\n% we see that there is not much effect in the best case:\r\nclf;\r\nhold on;\r\nhistogram(data.runInParallelFrontHeavy, 'BinWidth', 1);\r\nhistogram(data.runWithParFevalFrontHeavy, 'BinWidth', 1);\r\nhold off;\r\n\r\n\r\ntitle('R2015a Algorithm vs. Blog Algorithm (Front Heavy Suite)')\r\nlegend('R2015a Algorithm', 'Blog Algorithm')\r\nxlabel('Wallclock Time to run full suite in parallel (seconds)')\r\n\r\n%%\r\n% There is a bit more variance in the R2015a algorithm. This is expected.\r\n% Remember it was really the best case for the blog algorithm to operate on\r\n% a front heavy suite, but in general we don't have enough information to\r\n% achieve this in practice (in other words, here we are just getting\r\n% lucky). However, if we look at the case where the expensive suite\r\n% elements are included at the end of the suite, we can see the R2015a\r\n% approach does better:\r\nclf;\r\nhold on;\r\nhistogram(data.runInParallelBackHeavy, 'BinWidth', 1);\r\nhistogram(data.runWithParFevalBackHeavy, 'BinWidth', 1);\r\nhold off;\r\n\r\n\r\ntitle('R2015a Algorithm vs. Blog Algorithm (Back Heavy Suite)')\r\nlegend('R2015a Algorithm', 'Blog Algorithm')\r\nxlabel('Wallclock Time to run full suite in parallel (seconds)')\r\n\r\n\r\n\r\n%%\r\n% There you have it. Adjusting the scheduling of the groups to gradually\r\n% decrease the group size prevents the run from having too large a\r\n% segment scheduled as the last suite. Also, you can't beat the ease of\r\n% use: all it takes is a simple call to the new runInParallel method of\r\n% TestRunner. 
Go check it out and let us know what you think!\r\n\r\n\r\n\r\n\r\n##### SOURCE END ##### 2ff24e348af7429b9ccd294c670ba90e\r\n-->","protected":false},"excerpt":{"rendered":"<div class=\"overview-image\"><img decoding=\"async\"  class=\"img-responsive\" src=\"https:\/\/blogs.mathworks.com\/developer\/files\/blog3_02.png\" onError=\"this.style.display ='none';\" \/><\/div><!--introduction--><p>OK, so remember how we just went through the exercise (not just <a href=\"https:\/\/blogs.mathworks.com\/developer\/2015\/02\/20\/encouragingly-parallel-part-1\/\">once<\/a>, but <a href=\"https:\/\/blogs.mathworks.com\/developer\/2015\/03\/03\/encouragingly-parallel-part-2\/\">twice<\/a>) of how you can write a function to help parallelize your test runs? Well all of that still applies if you have not yet upgraded to MATLAB R2015a. However, if you do have R2015a you now get this out of the box because the TestRunner now has a <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/matlab.unittest.testrunner.runinparallel.html\">runInParallel<\/a> method!... 
<a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/developer\/2015\/03\/13\/encouragingly-parallel-epilogue\/\">read more >><\/a><\/p>","protected":false},"author":90,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[8,7],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/developer\/wp-json\/wp\/v2\/posts\/107"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/developer\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/developer\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/developer\/wp-json\/wp\/v2\/users\/90"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/developer\/wp-json\/wp\/v2\/comments?post=107"}],"version-history":[{"count":12,"href":"https:\/\/blogs.mathworks.com\/developer\/wp-json\/wp\/v2\/posts\/107\/revisions"}],"predecessor-version":[{"id":124,"href":"https:\/\/blogs.mathworks.com\/developer\/wp-json\/wp\/v2\/posts\/107\/revisions\/124"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/developer\/wp-json\/wp\/v2\/media?parent=107"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/developer\/wp-json\/wp\/v2\/categories?post=107"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/developer\/wp-json\/wp\/v2\/tags?post=107"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}