Sometimes people state that they like using MATLAB because it makes it easy to express their mathematical thoughts. Sometimes there's a follow-on that they then switch to another language for performance. Early in MATLAB's history, that switch was sometimes beneficial, but it is much less obviously so these days. Let's take the example of implicit expansion.
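For readers who haven't used it, here is a minimal sketch of what implicit expansion (introduced in R2016b) does: when an operand has a singleton dimension, MATLAB automatically expands it to match the other operand, with no explicit repmat or bsxfun call needed.

A = magic(3);                  % 3-by-3 matrix
v = [10 20 30];                % 1-by-3 row vector
B = A - v;                     % v is implicitly expanded down the rows of A
Bold = bsxfun(@minus, A, v);   % the pre-R2016b equivalent
isequal(B, Bold)               % true: both forms give the same result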
To maximize the benefits of implicit expansion, it's best to give MATLAB a more complicated, computationally expensive expression to work with while minimizing the need for temporary arrays. MATLAB can then exploit coarse-grained parallelism. The payoff also depends on your computer and the array sizes you are using. And it's worth thinking about future code maintenance and the code's complexity or simplicity.
Here is a demonstration of this simple idea. I'm testing equivalent numerical algorithms for removing the mean of an array and scaling it. Below are three equivalent implementations in MATLAB. The first one uses the well-named bsxfun function. The second one performs the same steps as the first, but uses implicit expansion instead. The third method combines some of the steps into a single expression, so there are fewer statements overall and a possible temporary array is folded into one line to take advantage of the most magic possible. I'm calling this one smart.
n = [300, 1000, 3000, 10000];
alltimes = zeros(length(n), 4);
%alltimes = table("size", [length(n),4], ...
%    "VariableNames", ["n", "bsxfun", "implicit", "smart"]);
for i = 1:length(n)
    runtimes = testRemoveMeanAndScale(n(i));
    alltimes(i,:) = [n(i), runtimes];
end
format short g
alltimesT = array2table(alltimes, "VariableNames", ["n", "bsxfun", "implicit", "smart"])
alltimesT =
  4×4 table

      n        bsxfun       implicit       smart
    _____    __________    __________    __________

      300    0.00024632    0.00014821    7.5306e-05
     1000       0.00322     0.0036559     0.0027499
     3000      0.030063      0.036868      0.027908
    10000        0.3469       0.38712       0.33361
You can see that the timings are not completely consistent from run to run. Still, the smart algorithm outperforms the other two at every size tested, and its advantage holds as the arrays get larger. Implicit expansion and bsxfun are generally on par except at small matrix sizes, where perhaps the extra function call in bsxfun costs enough to be noticeable.
I'm wondering which version of the code you'd prefer, and I'd love to hear your reasons. Let me know in the comments.
%% Test Functions
function runtime = testRemoveMeanAndScale(n)
% This function tests the 3 algorithms we wish to compare w.r.t. speed.
rng(0);
X = rand(n);
mu = rand(1,n);
sigma = randi([0 1],1,n);
runtime(1) = timeit(@() bsxfunRemoveMeanAndScale(X, mu, sigma));
runtime(2) = timeit(@() implicitExpansionRemoveMeanAndScale(X, mu, sigma));
runtime(3) = timeit(@() smartRemoveMeanAndScale(X, mu, sigma));
end % testRemoveMeanAndScale

function X = bsxfunRemoveMeanAndScale(X, mu, sigma)
% Implementation using bsxfun
X = bsxfun(@minus, X, mu);
sigma(sigma==0) = 1;    % avoid dividing by zero
X = bsxfun(@rdivide, X, sigma);
end % bsxfunRemoveMeanAndScale

function X = implicitExpansionRemoveMeanAndScale(X, mu, sigma)
% Use implicit expansion
X = X - mu;
sigma(sigma==0) = 1;    % avoid dividing by zero
X = X ./ sigma;
end % implicitExpansionRemoveMeanAndScale

function X = smartRemoveMeanAndScale(X, mu, sigma)
% Recommended implicit expansion implementation
sigma(sigma==0) = 1;    % avoid dividing by zero
X = (X - mu) ./ sigma;
end % smartRemoveMeanAndScale