Encouragingly Parallel
When I am neck deep in hardcore TDD Red-Green-Refactor cycles, I am constantly looking for ways to ensure that my feedback loop is as quick as possible. If the testing feedback takes too long, I am liable to start daydreaming and lose my context in the design process (because of course TDD is about design, not testing). Often this means that I run a fast, small test suite focused on just the change at hand. Sometimes, however, the refactor step touches a few different areas and requires running a more substantial set of tests. How do we minimize the "distractable" time and maximize the design time in these cycles?
Encouragingly, well written test suites are (almost) embarrassingly parallel. As a principle, each test should be completely independent of the next and thus can be run in any order, on any machine, at any time. Furthermore, test suites in the MATLAB® test framework are just arrays of Test objects, where each element can be run independently. If you have the Parallel Computing Toolbox™, there are a variety of ways these tests can be run in parallel. Awesome, let's dig into the details of how this is done.
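To see just how array-like a suite is, here is a minimal sketch (assuming the current folder contains some test files) that slices a suite and runs only part of it:

import matlab.unittest.TestSuite

suite = TestSuite.fromFolder(pwd); % assumes this folder contains test files
firstHalf = suite(1:floor(end/2)); % index it like any other MATLAB array
result = run(firstHalf);           % each element is independently runnable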
The Test Suite
We need to establish what the test suite is as we explore this topic. Of course, the test suite can be anything written using the unit test framework. Typically the time taken to execute the tests corresponds to the time spent actually setting up, exercising, verifying, and tearing down the software under test. For this example, however, why don't we just add some calls to the pause function to mimic a real test? We can create three simple test files to build a demonstrative test suite.
Let's use one script-based test with just a couple simple tests:
%% The feature should do something
pause(rand); % 0-1 seconds

%% The feature should do another thing
pause(rand); % 0-1 seconds
...a function-based test with a couple tests and a relatively long file fixture function:
function tests = aFunctionBasedTest
tests = functiontests(localfunctions);

function setupOnce(~)
% Create a fixture that takes a while to build
pause(rand*10); % 0-10 seconds

function testSomeFeature(~)
pause(rand); % 0-1 seconds

function testAnotherFeature(~)
pause(rand); % 0-1 seconds
...and finally a class-based test with one simple test and one relatively long system test:
classdef aClassBasedTest < matlab.unittest.TestCase
    methods(Test)
        function testLongRunningEndToEndWorkflow(~)
            pause(rand*10); % 0-10 seconds
        end
        function testANormalFeature(~)
            pause(rand); % 0-1 seconds
        end
    end
end
Using these simple dummy tests, we can create a large, representative suite with just repmat and concatenation:
import matlab.unittest.TestSuite;

classSuite = TestSuite.fromFile('aClassBasedTest.m');
fcnSuite = TestSuite.fromFile('aFunctionBasedTest.m');
scriptSuite = TestSuite.fromFile('aScriptBasedTest.m');

suite = [repmat(classSuite, 1, 50), repmat(fcnSuite, 1, 50), repmat(scriptSuite, 1, 50)];

% Let's run this suite serially to see how long it takes:
tic;
result = run(suite)
toc;
Running aClassBasedTest
.......... .......... .......... .......... ..........
.......... .......... .......... .......... ..........
Done aClassBasedTest
__________

Running aFunctionBasedTest
.......... .......... .......... .......... ..........
.......... .......... .......... .......... ..........
Done aFunctionBasedTest
__________

Running aScriptBasedTest
.......... .......... .......... .......... ..........
.......... .......... .......... .......... ..........
Done aScriptBasedTest
__________

result =

  1x300 TestResult array with properties:

    Name
    Passed
    Failed
    Incomplete
    Duration

Totals:
   300 Passed, 0 Failed, 0 Incomplete.
   380.6043 seconds testing time.

Elapsed time is 385.422383 seconds.
Into the Pool
A delay somewhere in the neighborhood of six and a half minutes will definitely lose my attention, hopefully to something more productive than cat videos (but no guarantee!). What is the simplest way I can run these tests in parallel? Let's try a simple parfor loop using a parallel pool of 16 workers.
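If no pool is open, parfor starts the default one for you. To get 16 workers explicitly you can open the pool up front; a minimal sketch, assuming your default cluster profile allows 16 workers:

pool = gcp('nocreate'); % query the current pool without auto-creating one
if isempty(pool)
    parpool(16);        % open a 16-worker pool on the default profile
end

With the pool up, the loop itself is short: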
tic;
parfor idx = 1:numel(suite)
    results(idx) = run(suite(idx));
end
results
toc;
Running aClassBasedTest
Running aClassBasedTest
Running aFunctionBasedTest
Running aClassBasedTest
.
Done aFunctionBasedTest
__________
Running aFunctionBasedTest
.
Done aClassBasedTest
__________
Running aClassBasedTest
.
Done aClassBasedTest
__________
Running aClassBasedTest
Running aFunctionBasedTest
Running aFunctionBasedTest
.
Done aFunctionBasedTest
__________
Running aFunctionBasedTest
Running aClassBasedTest
.
Done aClassBasedTest
__________
Running aClassBasedTest
Running aFunctionBasedTest
Running aClassBasedTest
Running aClassBasedTest
.
Done aClassBasedTest
__________
Running aClassBasedTest
.
Done aFunctionBasedTest
__________
Running aFunctionBasedTest
Running aClassBasedTest
.
Done aClassBasedTest
__________

<SNIP: Lengthy output removed to save your scrollwheel finger.>

Running aFunctionBasedTest
.
Done aFunctionBasedTest
__________

results =

  1x300 TestResult array with properties:

    Name
    Passed
    Failed
    Incomplete
    Duration

Totals:
   300 Passed, 0 Failed, 0 Incomplete.
   838.4606 seconds testing time.

Elapsed time is 81.866555 seconds.
Parallelism FTW! Now we have the suite running on the order of a minute and a half. That is a much better time, but it is still not good enough for me. Also, what is the deal with the humongous (and unparsable) output? Note that I spared you from excessive browser scrolling by actually removing (SNIP!) most of the produced output. You can see, however, that each test element got its own start/end lines, and different workers all printed their output to the command window without any grouping or understanding of what ran where. Do you see the lines that look like we start aClassBasedTest and finish aFunctionBasedTest? There's no magic test conversion going on here; we are just getting garbled, interleaved output from the workers.
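As a stopgap for the noise (though not for the deeper problems), each worker could run with a runner that has no output plugins at all. This is only a sketch, and it assumes the runner serializes cleanly to the workers:

import matlab.unittest.TestRunner
runner = TestRunner.withNoPlugins; % a runner with no text output installed
parfor idx = 1:numel(suite)
    results(idx) = runner.run(suite(idx)); % quiet on every worker
end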
Another interesting tidbit you can see is that the overall testing time actually increased significantly. This is not explained by test framework overhead or client/worker communication, because the Duration property of TestResult only includes the time taken by the actual test content. What is actually happening is that the function-based test, which has an expensive setupOnce function, is no longer enjoying the efficiency benefit of setting up that fixture once and sharing it across multiple tests. The benefits of sharing the fixture only apply when a single MATLAB session runs more than one test that uses it. Because we are sending each suite element to the pool one at a time, the setupOnce function executes anew for every element of the function-based test.
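You can see where the time went directly in the results, since the reported testing time is just the accumulated Duration across elements:

% Durations cover test content (including fixture setup), not framework or
% communication overhead, so the jump from ~380 to ~838 seconds is the
% setupOnce function running over and over:
totalTestTime = sum([results.Duration])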
Let's talk next time about how we can improve on this further and tackle these problems. In the meantime, have you used parallelism in your testing workflow? What works for you?