Cache Your Nuts to Eat Later
Greg's pick this week is cachedcall by Aslak Grinsted.
Contents
Call That Expensive Computation Once!
Aslak has created an ingenious utility that stores the results to a function call, so that if the same function is called again with the same inputs, it will return the stored results instead of recomputing the answer.
StandardCall FirstCacheCall SecondCacheCall SpeedUp ____________ ______________ _______________ _______ isprime 2.4633 2.4228 0.0036991 99.85 urlread 0.0048197 0.01007 0.0038378 20.372 sim 0.029784 0.034612 0.0038576 87.048
What Is Caching?
The definition is to store a resource for future use. In computing, it generally refers to storing data in memory in a fashion that it can be rapidly accessed at a later time.
In fact developing algorithms for implicitly creating caches is always one of those areas under research in development.
However, in this particular example, the creation of the cache is explicit.
How Is It Done?
The simplified version is, for unique instance of a function call, the results of that function call are stored in a file. A data table is developed to link the unique function calls to specific data files. When a function call is applied, CACHECALL determines if that particular function call situation has been performed before. If so, return the stored results, otherwise call the desired function and store the results for later use.
For me, the part I find most fascinating is Aslak is able to develop unique identifiers for each function call situation based on the function, inputs, classes, etc. He actually ends up leveraging another File Exchange entry DataHash by Jan Simon.
The DataHash uses MATLAB's inherent ability to call Java code to develop his hash tables and unique identifiers.
Doesn't MATLAB Cache Functions Already?
Yes, but generally not based on specific inputs. When a function is called the first time, if it hasn't been used yet in the current MATLAB session, then that function must be loaded into memory. Subsequent calls to the same function require less time because the function is already in memory, but it doesn't store the results of specific computations for future use.
The same thing applies if the function has been edited. It must be reloaded into memory the first time it is called.
Should Every Function Call Be Cached?
Probably not. There is a penalty for the caching process. If the resulting cached results are never used, then that penalty is wasted. Also caching involves writing to disc, which causes the slowdown in the following results.
StandardCall FirstCacheCall SecondCacheCall SpeedUp ____________ ______________ _______________ _______ rand 1.0061 19.194 4.3404 -331.4 triangulation 3.7253e-05 0.0066531 0.0051827 -13812
It probably only makes sense in cases where there are a limited number of input scenarios that will occur multiple times, and takes longer than writing to disc.
That's not always predictable, so it may require some profiling to determine if it's worth the cost of caching.
Why Did I Choose This Entry?
This is a clever method to speed up repetitive computations when there are a relatively finite number of possible inputs.
I'm impressed that Aslak was able to figure out how to identify unique function calls. When I saw this entry, it was one of those "I wish I had thought of that!" moments.
In addition, Aslak does a nice job of architecting the software and documenting his public interface to the caching features.
Could You Use Function Caching?
Have you tried this out? What do you think about it? Should this type of caching be available in MATLAB?
You can leave your comments here.
- Category:
- Picks
Comments
To leave a comment, please click here to sign in to your MathWorks Account or create a new one.