Skip to Main Content Skip to Search
File Exchange
MATLAB Newsgroup
Link Exchange
  Blogs  
 Contest 
MathWorks.com

Loren on the Art of MATLAB

February 8th, 2006

Use nested functions to memoize costly functions

Some languages provide ways to store results of costly calculations so if the calculation had been performed before, it need not be repeated when requested later. There is a way to accomplish this in MATLAB as well, using nested functions. Consider this function:
function f = memoize1(F)
%one-arg F, inputs testable with ==

x = [];
y = [];
f = @inner;
    function out = inner(in)
        ind = find(in == x);
        if isempty(ind)
            out = F(in);
            x(end+1) = in;
            y(end+1) = out;
        else
            out = y(ind);
        end
    end
end
Let's see what happens when we use this function, admittedly with not such an expensive function to calculate. Also, you could substitute any other function handle here, including one referring to an anonymous function, provided it has one input and one output.
>> f = memoize1(@sin)
f =
@memoize1/inner
>> x = f(3)
x =
0.1411
>> y = f(7)
y =
0.6570
>> z = f(3)
z =
0.1411
>> f(.2)
ans =
0.1987
>> f(pi/8)
ans =
0.3827
Now let's look into what's happening with our function f using the functions function. This should be used primarily for debugging and gaining insight, not as a programmatic tool.

Get information about the function handle:

>> s = functions(f)
s =
function: 'memoize1/inner'
type: 'nested'
file: 'H:DocumentsLORENv7memoize1.m'
workspace: {[1x1 struct]}

Get info about the nested function including its workspace:

>> s.workspace{1}
ans =

f: @memoize1/inner
F: @sin
x: [3 7 0.2000 0.3927]
y: [0.1411 0.6570  0.1987 0.3827]
So, if we were to put a breakpoint at inner and issued the command f(3) from the prompt, we'd be able to follow the logic about indexing into the saved values rather than doing the calculation again. In addition, if we wanted to quit MATLAB and come back at a later time, we could save our function handle in a MAT-file and reload it later. We would not need to save anything in addition to the handle, since the associated workspace would be also be saved. memoize1 can clearly be improved by vectorizing it (probably using ismember). Here is the latest version of this variant of the code:

function f = memoize2(F)
%one-arg F, inputs testable with ==
%allow nonscalar input.

x = [];
y = [];
f = @inner;
    function out = inner(in)
        out = zeros(size(in));  % preallocate output
        [tf,loc] = ismember(in,x);  % find which in's already computed in x
        ft = ~tf;  % ones to be computed
        out(ft) = F(in(ft));  % get output values for ones not already in
        % place new values in storage
        x = [x in(ft(:).')];
        y = [y reshape(out(ft),1,[])];
        out(tf) = y(loc(tf));  % fill in the rest of the output values
    end
end

Variants of this memoize code pattern might be useful for functions that have recurrence relationships, recursive functions, or refining underlying meshes. It's possible that you would want to have a buffer limit so the size of the stored data in the nested function didn't grow beyond some bound. But you get the idea.

15 Responses to “Use nested functions to memoize costly functions”

  1. A Brook replied on :

    The link to “ismember” leads to “functions”; beware the dangers of cut-and-paste!

  2. loren replied on :

    Fixed, thanks

  3. Markus replied on :

    Hi!

    I am a little bit confused why your function memoize1 does what it does. I was used that I need a “persistent” declaration when I want a function to keep variable values from one call to another. Is there a special behaviour when using function handles to nested functions? It does not seem to be consistent/straightforward to me at the moment.

    Regards
    Markus

  4. Loren replied on :

    With persistent, you only have one instance because all invocations of a function with a persistent variable share the value (assuming regular and not nested functions). With nested functions, each instance of the function handle carries along its own workspace. I recommend you read the user documentation carefully and examine the demos and examples to understand more (especially nesteddemo).

  5. Rod replied on :

    I have tried to use your memoize2 function and it doesn’t work for me. The problem comes from the function ismember.
    Below are three easy tests which should give the same answer:
    >> ismember(10.*(0:0.1:1),10.*(0:0.1:2))
    ans =
    1 1 1 1 1 1 0 0 1 1 1
    >> ismember(0:10,0:20)
    ans =
    1 1 1 1 1 1 1 1 1 1 1
    >> ismember(0:10,10.*(0:0.1:2))
    ans =
    1 1 1 0 1 1 0 0 1 1 1
    Whereas the two sets of data are the same (or should be same) for the three examples I don’t get the same answer!
    Any idea what’s wrong here?
    Thanks

  6. Loren replied on :

    You are seeing effects of numerical roundoff. I recommend you read some of the faqs on the web. You can start off here or with this article by Cleve.

  7. Markus replied on :

    The winning contribution of the last Matlab programming contest is a bad example for using nested functions. You can see the code at http://www.mathworks.com/contest/blockbuster.cgi/view_submission.html?id=31864 or a similar unobfuscated version at http://www.mathworks.com/contest/blockbuster.cgi/view_submission.html?id=31890. The functions solveri and CalculateMoves are nested functions (this is better visible after correct identing). Both functions are quite large and use quite a number of variables. After starting to modify the code I quickly ran into the problem that I erroneously used variables twice and the code did not work any more.

    The use of nested functions requires a lot of discipline and due to that is very error-prone, especially when different people work on the same code.

  8. Alex replied on :

    Hi Loren,
    Calling nested functions which access cell arrays seems to take time proportional to the size of the array, while nested functions using numeric arrays do not. I asked about this in the newsgroup (thread “matlab and copy-on-write” — I thought it was due to a memory copy, but that seems to be incorrect) but have yet to find an explanation or work-around. Any ideas?

  9. Ged replied on :

    Hi Loren,

    > With persistent, you only have one instance

    This is an obvious advantage for nested functions when multiple copies are useful, but how do the two options compare when both are feasible? In terms of speed, etc?

    Instead of memoising, I have lots of data which I run a computationally intensive function on, parts of which are invariant to some aspects of the particular data. I can use either persistent variables or nested functions to store partial results for the invariant parts, speeding up the function for further calls on remaining data. Alternatively, I could use other functions to compute the invariant parts and simply pass these in to a different function.

    I expected the first option (persistence) to be much quicker than the third (passing) since no memory needs to be copied back and forth, but I note on profiling that the persistent line itself seems to be quite a slow part of the code (when repeated over many data). I hadn’t considered the second option (nesting) at all; I wonder if it might be quicker than persistence… If so, why? Persistent arrays seem like a concept that should map efficiently to the lower level implementation, while nested functions keeping track of their own workspaces sounds more complex.

    Many thanks,
    Ged

  10. Ged replied on :

    Hello again,

    Had a quick look at this myself… It seems that with largish arrays (e.g. 1000×1000), the options are much the same, but that with smaller arrays (e.g. 100×100) persistent is slower than nested, with the persistent declaration line itself being one of the slower lines in the profile.

    I can’t think of any good reasons for this, but it looks to me like nested beats persistent in any case, as well as allowing you to have multiple instances.

    Hope that’s of interest to someone (and that maybe Loren or someone else can suggest reasons why persistent isn’t better)

    Ged

  11. Loren replied on :

    Ged-

    Performance aside (though it is a legitimate question), persistent doesn’t easily allow multiple instantiations, for example, 2 gui screens, while nested functions allow this naturally.

    –Loren

  12. David replied on :

    Hi,
    I was just wondering: the function ‘inner’ retains the variable values between runs. When does memory allocated by it get freed?

  13. Loren replied on :

    David-

    inner itself doesn’t retain the values. The function handle to the nested function does. All related get cleared up when the last reference to the nested function is cleared. However, local variables in inner get cleared right after each time it is run. Only x and y are retained with the nested function handle.

    –Loren

  14. Yi Cao replied on :

    Loren

    It is an interesting usage of nested function. Two questions:

    1. After complete all calls and save f to a mat file, then clear whole workspace. Now, load the mat file and check with whos, the function handle, f only occupies 16 bytes. But check with s=functions(f), and s.workspace{1} all information is still there. I worder where the information is stored?

    2. In the inner function, x and y are not pre-allocated. Will this affect the performance when the sizes of x and y becomes large?

    -Yi

  15. Loren replied on :

    Yi-

    The rest of the information is also stored in the mat-file. What you are seeing is the size of the function handle itself, not everything it encapsulates.

    As for pre-allocation, that would be nice, and it might help performance, especially if I chunked it up - but it can’t solve everything as we don’t know how big to make it. Whether or not that slows things down materially depends on the very picky details of how it gets called and how much and how often the array is grown.

    For the example, it was going to make things muddier and harder to understand, I think, to write the chunked memory version.

    –Loren

Leave a Reply


Loren Shure works on design of the MATLAB language at The MathWorks. She writes here about once a week on MATLAB programming and related topics.

  • Brad Phelan: Hi Tim, I understand where you are coming from. It is my one pet annoyance with Matlab, the lack of...
  • Loren: Timothee- Anonymous functions can only be a single (complicated) expression. You might be able to do what you...
  • Timothee: Is there a way to combine multiple commands in anonymous functions? ex1: fun=@(A)([V,D]=eig(A ); A*V-V*D)...
  • Loren: Here’s Cleve’s reply to Etienne: The crucial factor is the number and location of the nonzero...
  • Loren: Tristan- Nested functions can be slower in some cases currently. We know we have some opportunities to...
  • Tristan: Wow! I just tried with a global variable and it’s 5 times slower than with a argument! function...
  • Jon: Loren, I encountered this same problem and I attempted to find the answer by looking at the documentation for...
  • Tristan: “One thing that I have long wondered about is relative speed of nested functions relative to...
  • Etienne Non: Hi! I’m trying to understand why the Matlab function LU.m takes almost 20 times more time to...
  • Loren: Jonathan- The behavior you see is because the variable x has to come into inplaceTest and then a copy is made...

These postings are the author's and don't necessarily represent the opinions of The MathWorks.

Related Topics