Introduction to Functional Programming with Anonymous Functions, Part 1
Tucker McClure is an Application Engineer with The MathWorks. He spends his time helping our customers accelerate their work with the right tools and problem-solving techniques. Today, he'll be discussing how "functional programming" can help create brief and powerful MATLAB code.
The Goal
I use a lot of anonymous functions. They're nice and compact and almost invisible in their simplicity. Plus, if I can write an anonymous function to do something, I don't need to save the function in a file, and that can save me from file clutter on larger projects and from having to send someone a dozen files instead of sending one clean script. However, it seems at first glance like anonymous functions must necessarily be simple. No if... else, while, for, or any other keywords can be used. So how could we possibly write sophisticated programs in anonymous functions? We'll see, and it will involve some ideas from functional programming.
The goal of this introduction is to demonstrate how a few of these techniques can change the way we work in MATLAB, allowing greater brevity and simultaneously increasing readability. There are three parts. In this first part, we'll present creating functions of functions and treating functions as variables (in MATLAB, that means function handles), and from there, we'll move on to implementing conditional statements (like if... else) in anonymous functions. In the next part, we'll add recursion and executing multiple statements inside an anonymous function. In the final part, we'll develop a loop function. But first, if "function handle" or "anonymous function" is new to you, go check out Loren’s great introductions to those ideas in her previous posts!
Minimum and Maximum Example
Let's say we want to write a function to find the minimum and maximum of a set of numbers and store the results in an array. Here's a first pass:
min_and_max = @(x) [min(x), max(x)];
min_and_max([3 4 1 6 2])
ans = 1 6
Our min_and_max function takes in an array that we'll call x, finds the minimum and maximum, and stores the two results in an output array. Clear? Good. But now let's make it more difficult. The min and max functions both return two outputs if desired (both the minimum or maximum and the index at which they occur in the input array). Our simple min_and_max function can't get those secondary outputs! How can we access them? Consider this odd-looking line.
[extrema, indices] = cellfun(@(f) f([3 4 1 6 2]), {@min, @max})
extrema = 1 6
indices = 3 4
Well, that clearly worked. The minimum, 1, occurs at index 3, and the maximum, 6, occurs at index 4. But what is this line actually doing? First, recall how cellfun behaves. The first argument is a function handle. The second argument is a cell array of anything at all. Each element of the cell array is given as an argument to the provided function handle. Most of the time, that cell array is full of data, and each piece of data is passed to the function. However, we could just as easily put function handles in the cell array. Then the first function (@(f) f(...)) acts on all the other functions. So, first @min is passed in for f, and the outputs from min([3 4 1 6 2]) are stored. Then @max is passed in for f, and its outputs are stored.
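To make the role reversal concrete, here is a small sketch that contrasts the usual data-in-the-cell-array call with the functions-in-the-cell-array call described above. The names lengths, data, apply_to_data, and results are made up for this illustration, and @sum is added only to show that any function handle can go in the list.

```matlab
% Ordinary use: the cell array holds data, and one function visits each piece.
lengths = cellfun(@numel, {[1 2 3], 'hello', []});     % [3 5 0]

% Reversed roles: the cell array holds function handles, and the anonymous
% function calls each handle on the same data.
data = [3 4 1 6 2];
apply_to_data = @(f) f(data);
results = cellfun(apply_to_data, {@min, @max, @sum});  % [1 6 16]
```

In both calls cellfun does the same job: it hands each cell to the function in its first argument. The only difference is whether that cell contains data or a function handle.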
Ok, now that we're working with functions of functions, let's remove that hard-coded [3 4 1 6 2] and write a new min_and_max function by simply adding a @(x) out front and changing [3 4 1 6 2] to x.
min_and_max = @(x) cellfun(@(f) f(x), {@min, @max});
We can now use min_and_max for just the extrema, like before, but we can also get the indices too.
y = randi(10, 1, 10)
just_values = min_and_max(y)
[~, just_indices] = min_and_max(y)
[extrema, indices] = min_and_max(y)
y = 2 10 10 5 9 2 5 10 8 10
just_values = 2 10
just_indices = 1 2
extrema = 2 10
indices = 1 2
That might have looked a little funny, but it's pretty easy to think about, right? Now let's make it look a little nicer too.
Map
Above, we're mapping each function to our input x. More generally, we might write a "map" function to map a series of functions to the input values. We'll make val a cell array so we can also send multiple inputs to multiple functions all at once. This is like what we had before, but rearranged a bit.
map = @(val, fcns) cellfun(@(f) f(val{:}), fcns);
Look how simple this makes min_and_max (below), while still accessing both outputs. Not only is it shorter to write than any other version so far, it's easier to read, with hardly anything but a single occurrence of each variable or function name: "Map x to the min and max functions." No problem.
x = [3 4 1 6 2];
[extrema, indices] = map({x}, {@min, @max})
extrema = 1 6
indices = 3 4
Let's try multiple inputs:
map({1, 2}, {@plus, @minus, @times})
ans = 3 -1 2
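The piece that makes multiple inputs work is val{:}, which expands the cell array into a comma-separated argument list. As a rough sketch (the variable names val, a, b, and c are chosen just for this example), the following calls should be equivalent:

```matlab
map = @(val, fcns) cellfun(@(f) f(val{:}), fcns);

val = {10, 3};

% val{:} expands to the comma-separated list 10, 3, so these two calls match.
a = mod(val{:});   % 1
b = mod(10, 3);    % 1

% map performs the same expansion once for every function in the list.
c = map(val, {@mod, @rem, @max});   % [1 1 10]
```

Because every function receives the same argument list, each function handed to map has to accept as many inputs as val contains.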
What if outputs are different sizes? We'll write mapc (as in MAP with Cell array outputs) to handle this; all it needs is an extra argument to cellfun to say that our output isn't uniform in size.
mapc = @(val, fcns) cellfun(@(f) f(val{:}), fcns, 'UniformOutput', false);
Send pi to multiple functions that return differently-sized arrays. The first output is a scalar, the second is a scalar, and the third is a string.
mapc({pi}, {@(x) 2 * x, ...                        % Multiply by 2
            @cos, ...                              % Find cosine
            @(x) sprintf('x is %.5f...', x)})      % Return a string
ans = [6.2832] [-1] 'x is 3.14159...'
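To see why the extra cellfun argument matters, here is a minimal sketch comparing the two versions on functions that return vectors. The variables v, map_failed, sorted, and order are made up for this example; the behavior relies on cellfun requiring scalar results unless 'UniformOutput' is false.

```matlab
map  = @(val, fcns) cellfun(@(f) f(val{:}), fcns);
mapc = @(val, fcns) cellfun(@(f) f(val{:}), fcns, 'UniformOutput', false);

v = [3 1 2];

% map expects a scalar from every function, so sort (which returns a
% vector here) makes cellfun raise an error.
try
    out = map({v}, {@sort});
    map_failed = false;
catch
    map_failed = true;
end

% mapc collects each result in its own cell, and secondary outputs still work.
[sorted, order] = mapc({v}, {@sort, @(a) sort(a, 'descend')});
% sorted is {[1 2 3], [3 2 1]}; order is {[2 3 1], [1 3 2]}
```

A reasonable rule of thumb is to use map when every function returns a scalar and mapc otherwise.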
That takes care of map, which we can now use anywhere to send a set of inputs to numerous functions and collect their multiple outputs with brief and easy-to-read code.
By the way, writing these functions that operate on other functions is part of the "functional programming" style, and we're just scratching the surface. Let's go a little deeper and see how we can write a function to choose which function to apply from a list of functions.
Inline Conditionals
Sometimes an anonymous function might need a condition, like if...else. However, normal MATLAB syntax doesn't allow program flow statements like these in anonymous functions. Hope is not lost. We can implement an "inline if" in a single line:
iif = @(varargin) varargin{2 * find([varargin{1:2:end}], 1, 'first')}();
Alright, that looks decidedly strange, so before we discuss how it works, take a look at how easy it is to use:
[out1, out2, ...] = iif( if this,      then run this, ...
                         else if this, then run this, ...
                         ...
                         else,         then run this );
All the "if this" conditions should evaluate to true or false. The "then run this" action next to the first true condition is executed. None of the other actions are executed! We could use this to make, for example, a safe normalization function to do the following:
- If not all values of x are finite, throw an error.
- Else if all values of x are equal to 0, return zeros.
- Else, return x/norm(x).
This is implemented below. Note the @() out in front of the actions. This means, "don't do this action, but refer to this action". That is, we're passing pieces of code to the iif function as arguments. In this way, we aren't actually doing all three things; we'll only call the action for the single case we need.
normalize = @(x) iif( ~all(isfinite(x)), @() error('Must be finite!'), ...
                      all(x == 0),       @() zeros(size(x)), ...
                      true,              @() x/norm(x) );
Test the nominal condition.
normalize([1 1 0])
ans = 0.70711 0.70711 0
Test the error condition with non-finite inputs.
try normalize([0 inf 2]), catch err, disp(err.message); end
Must be finite!
Test the all-zeros condition.
normalize([0 0 0])
ans = 0 0 0
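One more check is worth a look: only the selected action runs, while every condition is evaluated up front when the argument list is built. The sketch below illustrates this under the iif definition given earlier; safe_first and r are hypothetical names introduced just for this example.

```matlab
iif = @(varargin) varargin{2 * find([varargin{1:2:end}], 1, 'first')}();

% Only the selected action runs, so the error handle is never called here.
r = iif(false, @() error('Never reached'), ...
        true,  @() 'second branch');

% Conditions are computed before iif is called, so keep them cheap and safe,
% and put the risky work (here, indexing x(1)) inside the deferred actions.
safe_first = @(x) iif(isempty(x), @() NaN, ...
                      true,       @() x(1));

first_of_empty = safe_first([]);      % NaN
first_of_pair  = safe_first([7 8]);   % 7
```

Ending the list with a true condition gives a catch-all "else" branch. Without one, an input that matches no condition leaves iif with nothing to call, and it fails with an error.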
Easy to use, right? We've implemented if... else behavior without needing an actual if or else anywhere! So now it's time to see how this thing works.
The iif function takes any number of arguments, thanks to varargin. These arguments will be condition 1 (true or false), action 1 (a function), condition 2, action 2, etc. First, the iif function selects all of the conditions (the odd-numbered items in varargin) via [varargin{1:2:end}]. For our safe normalization function, this returns:
[~all(isfinite(x)), all(x == 0), true]
Next, it finds the index of the first true value in those conditions with find(..., 1, 'first'). E.g., if ~all(isfinite(x)) was false, but all(x == 0) was true, the index would be 2.
The actions to perform are the even-numbered items of varargin, so we just multiply that index by 2 to get the index of the action to perform. Finally, we execute the action by appending () on the end, as in
varargin{...}()
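The same steps can be traced one at a time with intermediate variables. This is only an unrolled sketch of the one-liner; args stands in for varargin, and the other names (conditions, k, action, result) are invented for the illustration.

```matlab
% Hypothetical argument list, as iif would receive it in varargin.
args = {false, @() 'first', true, @() 'second', true, @() 'fallback'};

conditions = [args{1:2:end}];       % [false true true]
k = find(conditions, 1, 'first');   % 2, the first true condition
action = args{2 * k};               % the handle paired with condition 2
result = action();                  % 'second'
```

Note that the concatenation in the first step assumes each condition is a scalar. A vector condition would shift the positions of the conditions after it, so reduce such tests with all or any first, as the normalize example does.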
Did you catch what was happening there? We're passing little pieces of code as inputs to the iif function. Functions as arguments. See why this is called "functional" programming? I'll admit it looks weird at first, but once you've seen it, the pattern is hard to forget.
Ok, that's it for today. Here are the functions we developed today.
iif = @(varargin) varargin{2*find([varargin{1:2:end}], 1, 'first')}();
map = @(val, fcns) cellfun(@(f) f(val{:}), fcns);
mapc = @(val, fcns) cellfun(@(f) f(val{:}), fcns, 'UniformOutput', false);
These can also be found here, implemented as regular MATLAB functions that can be kept on the path.
To Be Continued
In the next installment, we'll build on these to enable recursion and to make anonymous functions that execute multiple statements. In the meantime, there are multiple ways to accomplish an inline if. For instance, I've seen a very brief ternary operator in our forums in the past. I'm curious what other implementations people might suggest for this type of behavior, so please post suggestions here.