When I am neck-deep in hardcore TDD Red-Green-Refactor cycles, I am constantly looking for ways to keep my feedback loop as quick as possible. If the testing feedback takes too long, I am liable to start daydreaming and lose my context in the design process (because of course TDD is about design, not testing). Often this means running a fast, small test suite focused on just the change at hand. Sometimes, however, the refactor step touches a few different areas and requires running a more substantial set of tests. How do we minimize the "distractible" time and maximize the design time in these cycles?
Encouragingly, well-written test suites are (almost) embarrassingly parallel. As a principle, each test should be completely independent of the next and thus can be run in any order, on any machine, at any time. Furthermore, test suites in the MATLAB® test framework are just arrays of Test objects, where each element can be run independently. If you have the Parallel Computing Toolbox™, there are a variety of ways these tests can be run in parallel. Awesome, let's dig into the details of how this is done.
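To make that independence concrete, here is a minimal sketch (assuming a test file like the ones created below is on the path): a suite is an ordinary array, and any element or slice of it can be run on its own.

```matlab
import matlab.unittest.TestSuite

% Build a suite from one file; the result is a 1xN array of Test objects.
suite = TestSuite.fromFile('aClassBasedTest.m');

% Any element (or slice) can be run independently, in any order.
resultLast = run(suite(end));
resultFirstTwo = run(suite(1:2));
```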
First we need to establish what the test suite looks like. Of course, the test suite can be anything written using the unit test framework. Typically the time taken to execute the tests corresponds to the time spent actually setting up, exercising, verifying, and tearing down the software under test. For this example, however, let's just add some calls to the pause function to mimic a real test. We can create three simple test files that we can use to build a demonstrative test suite.
Let's use one script-based test with just a couple simple tests:
%% The feature should do something
pause(rand); % 0-1 seconds

%% The feature should do another thing
pause(rand); % 0-1 seconds
...a function-based test with a couple tests and a relatively long file fixture function:
function tests = aFunctionBasedTest
tests = functiontests(localfunctions);

function setupOnce(~)
% Create a fixture that takes a while to build
pause(rand*10); % 0-10 seconds

function testSomeFeature(~)
pause(rand); % 0-1 seconds

function testAnotherFeature(~)
pause(rand); % 0-1 seconds
...and finally a class-based test with one simple test and one relatively long system test:
classdef aClassBasedTest < matlab.unittest.TestCase
    methods(Test)
        function testLongRunningEndToEndWorkflow(~)
            pause(rand*10); % 0-10 seconds
        end
        function testANormalFeature(~)
            pause(rand); % 0-1 seconds
        end
    end
end
Using these simple dummy tests we can create a large representative suite by just using repmat and concatenation:
import matlab.unittest.TestSuite;

classSuite = TestSuite.fromFile('aClassBasedTest.m');
fcnSuite = TestSuite.fromFile('aFunctionBasedTest.m');
scriptSuite = TestSuite.fromFile('aScriptBasedTest.m');
suite = [repmat(classSuite, 1, 50), repmat(fcnSuite, 1, 50), repmat(scriptSuite, 1, 50)];

% Let's run this suite serially to see how long it takes:
tic;
result = run(suite)
toc;
Running aClassBasedTest
.......... .......... .......... .......... ..........
.......... .......... .......... .......... ..........
Done aClassBasedTest
__________

Running aFunctionBasedTest
.......... .......... .......... .......... ..........
.......... .......... .......... .......... ..........
Done aFunctionBasedTest
__________

Running aScriptBasedTest
.......... .......... .......... .......... ..........
.......... .......... .......... .......... ..........
Done aScriptBasedTest
__________

result = 

  1x300 TestResult array with properties:

    Name
    Passed
    Failed
    Incomplete
    Duration

Totals:
   300 Passed, 0 Failed, 0 Incomplete.
   380.6043 seconds testing time.

Elapsed time is 385.422383 seconds.
Somewhere in the neighborhood of six and a half minutes of delay will definitely lose my attention, hopefully to something more productive than cat videos (but no guarantee!). What is the simplest way to run these in parallel? Let's try a simple parfor loop using a parallel pool with 16 workers:
tic;
parfor idx = 1:numel(suite)
    results(idx) = run(suite(idx));
end
results
toc;
Running aClassBasedTest
Running aClassBasedTest
Running aFunctionBasedTest
Running aClassBasedTest
.
Done aFunctionBasedTest
__________

Running aFunctionBasedTest
.
Done aClassBasedTest
__________

Running aClassBasedTest
.
Done aClassBasedTest
__________

Running aClassBasedTest
Running aFunctionBasedTest
Running aFunctionBasedTest
.
Done aFunctionBasedTest
__________

Running aFunctionBasedTest
Running aClassBasedTest
.
Done aClassBasedTest
__________

Running aClassBasedTest
Running aFunctionBasedTest
Running aClassBasedTest
Running aClassBasedTest
.
Done aClassBasedTest
__________

Running aClassBasedTest
.
Done aFunctionBasedTest
__________

Running aFunctionBasedTest
Running aClassBasedTest
.
Done aClassBasedTest
__________

<SNIP: Lengthy output removed to save your scrollwheel finger.>

Running aFunctionBasedTest
.
Done aFunctionBasedTest
__________

results = 

  1x300 TestResult array with properties:

    Name
    Passed
    Failed
    Incomplete
    Duration

Totals:
   300 Passed, 0 Failed, 0 Incomplete.
   838.4606 seconds testing time.

Elapsed time is 81.866555 seconds.
Parallelism FTW! Now the suite runs in something on the order of a minute and a half. That is a much better time, but it's still not good enough for me. Also, what is the deal with the humongous (and unparsable) output? Note that I spared you excessive browser scrolling by actually removing (SNIP!) most of the produced output. You can see, however, that each test element got its own start/end lines, and different workers all printed their output to the command window without any grouping or understanding of what ran where. Do you see the lines where we appear to start aClassBasedTest and finish aFunctionBasedTest? There's no magic test conversion going on here; we are just getting garbled output from the workers.
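As an aside, one way (a sketch, not the only approach) to avoid the interleaved printing is to give each worker a runner with no text output plugins installed, so the workers stay silent and only the TestResult array comes back to the client:

```matlab
import matlab.unittest.TestRunner

tic;
parfor idx = 1:numel(suite)
    % A runner with no plugins produces no command window output.
    runner = TestRunner.withNoPlugins;
    results(idx) = runner.run(suite(idx));
end
results
toc;
```

Of course, silencing the output treats the symptom rather than the cause; the grouping problem discussed next still remains.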
Another interesting tidbit: the overall testing time actually increased significantly. This is not explained by test framework time or client/worker communication overhead, because the Duration property of TestResult only includes the time taken by the actual test content. What is actually happening is that the function-based test, which has an expensive setupOnce function, is not enjoying the efficiency benefit of setting up that fixture once and sharing it across multiple tests. Instead, setupOnce executes for every element of the function-based test on every worker. The benefit of sharing the fixture only applies when a single MATLAB session runs more than one test that uses that fixture. In this case we set the fixture up anew for every Test element, because we send the suite elements to the pool one at a time.
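One way to claw back the shared-fixture benefit, sketched here against the suite built above, is to send the suite to the pool in coarser chunks so each parfor iteration runs all of the elements from a given test file together. The grouping key below (the text before the "/" in each element's Name) is just an illustrative way to identify the file; note this trades some parallelism for fixture sharing:

```matlab
% Group the suite elements by the test file they came from.
testFiles = cellfun(@(n) strtok(n, '/'), {suite.Name}, 'UniformOutput', false);
[~, ~, groupIdx] = unique(testFiles);

% Run each group as one unit so setupOnce runs once per group, not once
% per element.
numGroups = max(groupIdx);
groupResults = cell(1, numGroups);
parfor g = 1:numGroups
    groupResults{g} = run(suite(groupIdx == g));
end
results = [groupResults{:}];
```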
Let's talk next time about how we can improve on this further and tackle these problems. In the meantime, have you used parallelism in your testing workflow? What works for you?
Published with MATLAB® R2014b
6 Comments
A little off-topic question: how can I make my test run silently when using verifyError to check that an error occurs? Right now it displays the entire exception trace.
The approach I use is: testCase.verifyError(@() testfunction, 'fcnid:errorid');
I had no success using the NoPlugin approach.
I am not sure what is going on with your experiences with verifyError. You should not see the stack trace at all when using it so I suspect something is different about this case.
Can you provide more details on what is going on by asking a question here:
This will be a better forum for asking the question where you can show some code snippets and have a bit more formatting options than this comment field.
Add the unittest tag to your question and I will see it right away.
Just found this article, great stuff 😉
Now I am using a basic parfor loop (in R2016b) to execute the set of test suites in parallel, as in the blog post:
% above all_suites are populated
parfor idx = 1:numel(all_suites)
    run_results(idx) = run(runner, all_suites(idx));
end
All tests seem to run fine, but mlint gives a disturbing warning:
The function RUN might make an invalid workspace access inside the PARFOR loop.
Could you please comment on this?
Sorry for the late reply. This is a Code Analyzer message that assumes the call to run is the standard RUN function used to run scripts. Since scripts execute in the context of the workspace from which they are run, the variables created in a script can populate that workspace, conflicting with the transparency requirement of the parfor loop.
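To illustrate the analyzer's concern with a tiny hypothetical sketch (the script name here is made up): if run really were the script-running function, each iteration could inject variables into the loop's workspace, which parfor forbids:

```matlab
% Hypothetical: 'someScript.m' is a script, not a function.
parfor idx = 1:3
    run('someScript')  % a script can create workspace variables,
                       % violating parfor transparency - hence the warning
end
```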
In this case it is actually the run method of the test runner which is getting invoked, so it is fine to suppress the code analyzer warning. Alternatively, you can write your parfor loop as follows:
parfor idx = 1:numel(all_suites)
    run_results(idx) = runner.run(all_suites(idx));
end
In this case, the run method is explicitly invoked as a method of the runner value, so there is no ambiguity between the runner.run method and the global RUN function.
Even better though, read the two blog posts subsequent to this one, as there is a better way to do this, and one of those includes simply using the builtin parallel running features of the framework. Enjoy!
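For reference, in releases that support it, the built-in parallel capability boils down to something like the following (check your release's documentation for the availability of runInParallel; it requires the Parallel Computing Toolbox):

```matlab
import matlab.unittest.TestRunner

runner = TestRunner.withTextOutput;
% Let the framework partition the suite across the parallel pool itself.
results = runInParallel(runner, all_suites);
```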