Encouragingly Parallel (Epilogue)
OK, so remember how we just went through the exercise (not just once, but twice) of writing a function to help parallelize your test runs? Well, all of that still applies if you have not yet upgraded to MATLAB R2015a. However, if you do have R2015a, you now get this out of the box, because the TestRunner now has a runInParallel method!
One last time, we can create the same representative suite used in the previous posts:
import matlab.unittest.TestSuite;
classSuite = TestSuite.fromFile('aClassBasedTest.m');
fcnSuite = TestSuite.fromFile('aFunctionBasedTest.m');
scriptSuite = TestSuite.fromFile('aScriptBasedTest.m');
suite = [repmat(classSuite, 1, 50), repmat(fcnSuite, 1, 50), repmat(scriptSuite, 1, 50)];
Next we create a TestRunner explicitly. To achieve the same style of test output we'll need to create the runner as configured with text output.
import matlab.unittest.TestRunner;
runner = TestRunner.withTextOutput;
Now, just run it in parallel!
tic;
runner.runInParallel(suite)
toc;
Split tests into 48 groups and running them on 16 workers.
-----------------
Finished Group 15
-----------------
Running aFunctionBasedTest
.......
Done aFunctionBasedTest
__________

-----------------
Finished Group 14
-----------------
Running aFunctionBasedTest
........
Done aFunctionBasedTest
__________

<SNIP: Output truncated because you get the idea>

-----------------
Finished Group 25
-----------------
Running aFunctionBasedTest
......
Done aFunctionBasedTest
__________

-----------------
Finished Group 47
-----------------
Running aScriptBasedTest
...
Done aScriptBasedTest
__________

-----------------
Finished Group 44
-----------------
Running aScriptBasedTest
....
Done aScriptBasedTest
__________

ans =
  300x1 TestResult array with properties:
    Name
    Passed
    Failed
    Incomplete
    Duration
Totals:
   300 Passed, 0 Failed, 0 Incomplete.
   482.5131 seconds testing time.

Elapsed time is 33.861671 seconds.
The result here is for the most part the same as described in the last post. However, runInParallel applies a slightly different heuristic. I need to get something off my chest here and admit to one more possible problem with the algorithm we developed in the past couple of posts. First, note that the example test suite we are using is front-loaded with the most expensive tests. Remember, the class-based tests are first in the array, and these tests contain the expensive system test described in the earlier posts. Since we are scheduling these first we are in great shape, but think about the case where all of these expensive tests happen to fall at the end of the array rather than the front. It would be unfortunate to enjoy the benefits of parallelism for most of the tests only to have the last group(s) get stuck with the long pole.
If you think about it, the system is resilient to the early groups being the long pole, because in that event we can still use the full set of workers and crank away at the other groups while the long pole executes. However, if the long pole is the last job to get scheduled, most of the pool sits idle at the end of the run while we wait for that last group.
For this reason, the runInParallel method of the TestRunner tweaks the scheduling a bit by sending larger chunks of the test suite in the early groups and gradually reducing the chunk size as the later groups are scheduled. This means we pay a small price in the best case, when the expensive tests are in front, but we get more consistency and pay a lower price in the worst case, when the expensive tests are in back. (Note that runInParallel may use a different scheduling algorithm in future releases in order to allow continued improvements.)
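To make the idea concrete, here is a toy simulation. It is not the actual runInParallel implementation, which is not specified here and may change. It assumes a tapering rule where each group takes roughly 1/(2*numWorkers) of the remaining tests, a fixed-size split into two groups per worker for comparison, made-up test durations, and a simple model where the next group always goes to the first worker that frees up. All names and numbers are hypothetical.

```matlab
% Toy model only: the tapering rule, durations, and worker count are
% assumptions, not the scheduling that runInParallel actually uses.
numWorkers = 4;
durations  = [ones(1, 36), 10 * ones(1, 4)];  % back-heavy: expensive tests last
numTests   = numel(durations);

% Fixed-size split: two equally sized groups per worker.
fixedSizes = repmat(numTests / (2 * numWorkers), 1, 2 * numWorkers);

% Tapered split: each group takes a fraction of whatever is left,
% so early groups are large and late groups shrink toward one test.
taperedSizes = [];
remaining = numTests;
while remaining > 0
    nextSize = ceil(remaining / (2 * numWorkers));
    taperedSizes(end + 1) = nextSize; %#ok<SAGROW>
    remaining = remaining - nextSize;
end

% Simulate each schedule: hand the next group to whichever worker frees up first.
schedules = {fixedSizes, taperedSizes};
makespan  = zeros(1, numel(schedules));
for s = 1:numel(schedules)
    groupDurations = cellfun(@sum, mat2cell(durations, 1, schedules{s}));
    workerFreeAt = zeros(1, numWorkers);
    for d = groupDurations
        [startTime, idx] = min(workerFreeAt);
        workerFreeAt(idx) = startTime + d;
    end
    makespan(s) = max(workerFreeAt);
end

fprintf('Fixed-size groups: %g, tapered groups: %g (simulated time units)\n', makespan);
```

In this toy case the tapered schedule finishes sooner, because the expensive tests at the end land in small groups that spread across all the workers instead of piling into one big final group.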
I'll spare you the experiment, but I ran this test suite in these four scenarios:
- front heavy using runWithParFeval (the previous blog post's algorithm)
- back heavy using runWithParFeval
- front heavy using runInParallel (R2015a's new method of TestRunner)
- back heavy using runInParallel
Remember that the suite has some inherent randomness in its runtime. This way we can get a better sense of how arbitrary test suites might behave. However, because of this randomness I ran each of these scenarios 100 times to get a better picture of the behavior. I saved these results to a MAT-file so we can just load them and look at the results.
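For the curious, a repeated-timing loop of this kind might look roughly like the sketch below. This is not the script used to produce the saved data. The runOnce handle is a hypothetical placeholder workload so the snippet runs on its own, and the run count is reduced.

```matlab
% Sketch of a repeated-timing loop. runOnce is a stand-in workload so the
% snippet runs anywhere; in the post's setting it would be something like
% @() runner.runInParallel(suite).
numRuns = 5;                   % the post used 100 runs per scenario
runOnce = @() pause(0.01);     % hypothetical placeholder workload
elapsed = zeros(1, numRuns);
for k = 1:numRuns
    t = tic;
    runOnce();
    elapsed(k) = toc(t);       % wall-clock seconds for this run
end
fprintf('mean %.3f s, std %.3f s over %d runs\n', mean(elapsed), std(elapsed), numRuns);
```

A back-heavy variant of the suite could be built by reversing the element order, for example with suite(end:-1:1). That is an assumption for illustration, not necessarily how the original experiment was set up.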
data = load('runData');
Comparing runWithParFeval with runInParallel for the front-loaded suite, we see that there is not much effect in the best case:
clf;
hold on;
histogram(data.runInParallelFrontHeavy, 'BinWidth', 1);
histogram(data.runWithParFevalFrontHeavy, 'BinWidth', 1);
hold off;
title('R2015a Algorithm vs. Blog Algorithm (Front Heavy Suite)')
legend('R2015a Algorithm', 'Blog Algorithm')
xlabel('Wallclock Time to run full suite in parallel (seconds)')
There is a bit more variance in the R2015a algorithm. This is expected. Remember, operating on a front-heavy suite was really the best case for the blog algorithm, but in general we don't have enough information to arrange this in practice (in other words, here we are just getting lucky). However, if we look at the case where the expensive suite elements are included at the end of the suite, we can see the R2015a approach does better:
clf;
hold on;
histogram(data.runInParallelBackHeavy, 'BinWidth', 1);
histogram(data.runWithParFevalBackHeavy, 'BinWidth', 1);
hold off;
title('R2015a Algorithm vs. Blog Algorithm (Back Heavy Suite)')
legend('R2015a Algorithm', 'Blog Algorithm')
xlabel('Wallclock Time to run full suite in parallel (seconds)')
There you have it. Adjusting the scheduling so that the group size gradually decreases keeps the run from having too large a segment scheduled as the last group. Also, you can't beat the ease of use: all it takes is a simple call to the new runInParallel method of TestRunner. Go check it out and let us know what you think!