Loren on the Art of MATLAB

Considering Performance in Object-Oriented MATLAB Code

Posted by Loren Shure

I’m pleased to have Dave Foti back for a look at objects and performance. Dave manages the group responsible for object-oriented programming features in MATLAB.

I often get questions along the lines of what performance penalty is paid for using objects, or how fast an object-oriented implementation will perform compared with some other implementation. As with many performance questions, the answer is that it very much depends on what the program is trying to do and what parts of its work are most performance-intensive. I see many cases where most of the real work is inside methods of an object that do math on ordinary matrices. Such applications won’t really see much difference with or without objects. I also realize that many MATLAB users aren’t nearly as concerned with run-time performance as with how long it takes to write the program. However, for those applications where performance matters and objects might be used in performance-critical parts of the application, let’s look at what we can say about how MATLAB works that might be helpful to consider. We’ll also look at what has changed in recent MATLAB versions.

How objects spend their time

Let’s start with some basics – some of the places where objects spend time and how to minimize it. Objects will spend time in four basic places – object construction, property access, method invocation, and object deletion.

Object Construction

Object construction time is mostly spent copying the default values of properties from the class definition to the object and then calling the object’s constructor function(s). Generally speaking, there isn’t much to consider about default values in terms of performance, since the expressions that create the default values are executed once when the class is first used, and each object is then just given a copy of the values from the class. More superclasses generally mean more constructor function calls for each object creation, so this is a factor to consider in performance-critical code.
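
To make the point about default values concrete, here is a minimal sketch. The class name DefaultDemo and its property Stamp are hypothetical and not part of the original post. Because the default expression is evaluated once when the class is first used, every object should receive a copy of the same value.

```matlab
% Save as DefaultDemo.m. DefaultDemo and Stamp are hypothetical names
% used only for illustration.
classdef DefaultDemo
    properties
        % This expression runs once, when the class is first used;
        % each new object then gets a copy of the stored result.
        Stamp = rand(1);
    end
end
```

Creating two objects with a = DefaultDemo; b = DefaultDemo; and comparing them with isequal(a.Stamp, b.Stamp) should report that both carry the same value, which is why an expensive default expression costs little per object.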

Property Access

Property access is one of the most important factors in object performance. While changes over the past several releases have made property performance in R2012a more uniform and closer to struct performance, there is still some additional overhead for properties, and the potential for aliasing with handle objects means that handle objects don’t get the same level of optimization that structs and value objects get from the MATLAB JIT. The simpler a property can be, the faster it will be. For example, making a property observable by listeners (using SetObservable/GetObservable) will turn off many optimizations and make property access slower. Using a set or get function will turn off most optimizations and also introduce the extra time to call the set or get function, which is generally much greater than the time to just access the property. MATLAB doesn’t currently inline functions, including set/get functions, so these are always executed as function calls. MATLAB optimizes property reading separately from property writing, so it is important not to add a get-function just because the property needs a set-function.

Consider the following class:

type SimpleCylinder
classdef SimpleCylinder
    properties
        R
        Height 
    end
    
    methods
        function V = volume(C)
            V = pi .* [C.R].^2 .* [C.Height];
        end
    end
end

We can measure the time to create 1000 cylinders and compute their volumes:

tic
C1 = SimpleCylinder;
for k = 1:1000,
    C1(k).R = 1;
    C1(k).Height = k;
end
V = volume(C1);
toc
Elapsed time is 0.112309 seconds.

Now consider a slightly different version of the above class where the class checks all the property values:

type SlowCylinder
classdef SlowCylinder
    properties
        R
        Height
    end
    methods
        function V = volume(C)
            V = pi .* [C.R].^2 .* [C.Height];
        end
        
        function C = set.R(C, R)
            checkValue(R);
            C.R = R;
        end
        
        function C = set.Height(C, Height)
            checkValue(Height);
            C.Height = Height;
        end
    end
end

function checkValue(x)
    if ~isa(x, 'double') || ~isscalar(x)
        error('value must be a scalar double.');
    end
end


We can measure the same operations on this class:

tic
C2 = SlowCylinder;
for k = 1:1000,
    C2(k).R = 1;
    C2(k).Height = k;
end
A = volume(C2);
toc
Elapsed time is 0.174094 seconds.

Optimizing for Property Usage Inside the Class

If much of the performance-critical code is inside methods of the class, it might make sense to consider using two property definitions for properties accessed in such performance-critical code. One property is private to the class and doesn’t define any set or get functions. A second dependent property is public and passes through to the private property but adds error checking in its set-function. This allows the class to check values coming from outside the class, but not check values inside the class. Set functions always execute, except when setting the default value during object creation, and this allows the class to use its public interface if it is more convenient to do so. Set functions may do convenient transformations or other work in addition to just checking that the input value is legal for the property. However, if a performance-critical method doesn’t need this work, it can be helpful to use two properties.

For example, consider a new version of the cylinder class that checks its inputs but is designed to keep loops inside the class methods and use unchecked properties inside those methods.

type NewCylinder
classdef NewCylinder
    properties(Dependent)
        R
        Height
    end
    properties(Access=private)
        R_
        Height_
    end
    methods
        function C = NewCylinder(R, Height)
            if nargin > 0
                if ~isa(R, 'double') || ~isa(Height, 'double')
                    error('R and Height must be double.');
                end
                
                if ~isequal(size(R), size(Height))
                    error('Dimensions of R and Height must match.');
                end
                for k = numel(R):-1:1
                    C(k).R_ = R(k);
                    C(k).Height_ = Height(k);
                end
            end            
        end
        
        function V = volume(C)
            V = pi .* [C.R_].^2 .* [C.Height_];
        end
        
        function C = set.R(C, R)
            checkValue(R);
            C.R_ = R;
        end

        function R = get.R(C)
            R = C.R_;
        end
        
        function C = set.Height(C, Height)
            checkValue(Height);
            C.Height_ = Height;
        end
        
        function Height = get.Height(C)
            Height = C.Height_;
        end        
    end
end

function checkValue(x)
    if ~isa(x, 'double') || ~isscalar(x)
        error('value must be a scalar double.');
    end
end


Here we measure the same operations as above.

tic
C3 = NewCylinder(ones(1,1000), 1:1000);
A = volume(C3);
toc
Elapsed time is 0.006654 seconds.

Method Invocation

Method invocation using function call notation, e.g. f(obj, data), is generally faster than using obj.f(data). Method invocation, like function calls on structs, cells, and function handles, will not benefit from JIT optimization of the function call and can be many times slower than function calls on purely numeric arguments. Because of the overhead for calling a method, it is always better to have a loop inside of a method rather than outside of a method. Inside the method, if there is a loop, it will be faster if the loop just does indexing operations on the object and makes calls to functions that are passed numbers and strings from the object, rather than method or function calls that take the whole object. If function calls on the object can be factored outside of loops, that will generally improve performance.

Calling a method on an object:

C4 = NewCylinder(10, 20);
tic
for k = 1:1000
    volume(C4);
end
toc
Elapsed time is 0.013509 seconds.

Calling a method on the object vector:

C5 = NewCylinder(ones(1,1000), 1:1000);
tic
volume(C5);
toc
Elapsed time is 0.001903 seconds.
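
The advice about keeping the loop inside the method, and passing only numbers to helper functions, can be sketched as follows. The class DiskSet and the helper diskArea are hypothetical names invented for this illustration; they are not part of the original examples.

```matlab
% Save as DiskSet.m. DiskSet and diskArea are hypothetical names
% used only for illustration.
classdef DiskSet
    properties
        Radii = [1 2 3];
    end
    methods
        function total = totalArea(obj)
            % The loop lives inside the method, so the method-call
            % overhead is paid once rather than once per element.
            total = 0;
            for k = 1:numel(obj.Radii)
                % Only a plain double is passed to the helper,
                % never the whole object.
                total = total + diskArea(obj.Radii(k));
            end
        end
    end
end

function a = diskArea(r)
    % Local helper that works on numeric input only.
    a = pi * r^2;
end
```

A caller would write d = DiskSet; total = totalArea(d); so that a single method call covers all elements.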

Calling a function on a similar struct and struct array. First, calling the function inside a loop:

CS1 = struct('R', 10, 'Height', 20);
tic
for k = 1:1000
    cylinderVolume(CS1);
end
toc
Elapsed time is 0.008510 seconds.

Next, we call the function on a struct array:

CS2 = struct('R', num2cell(ones(1,1000)), ...
             'Height', num2cell(1:1000));
tic
cylinderVolume(CS2);
toc
Elapsed time is 0.000705 seconds.
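
The post does not show the cylinderVolume function used in the struct timings. A plausible version, assuming it mirrors the volume method of the cylinder classes above, might look like this; treat it as a guess at the helper rather than the code that was actually timed.

```matlab
% Save as cylinderVolume.m. This body is an assumption modeled on the
% volume method shown earlier; the original function is not in the post.
function V = cylinderVolume(C)
    % C is a struct or struct array with scalar numeric fields
    % R and Height. The result is a row vector with one volume
    % per element of C.
    V = pi .* [C.R].^2 .* [C.Height];
end
```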

Deleting Handle Objects

MATLAB automatically deletes handle objects when they are no longer in use. MATLAB doesn't use garbage collection to clean up objects periodically but instead destroys objects when they first become unreachable by any program. This means that MATLAB destructors (the delete method) are called more deterministically than in environments using garbage collection, but it also means that MATLAB has to do more work whenever a program potentially changes the reachability of a handle object. For example, when a variable that contains a handle goes out of scope, MATLAB has to determine whether or not that was the last reference to that object. This is not as simple as checking a reference count, since MATLAB has to account for cycles of objects. Changes in R2011b and R2012a have made this process much faster and more uniform. However, there is one aspect of object destruction that we are still working on, and that has to do with recursive destruction. As of R2012a, if a MATLAB object is destroyed, any handle objects referenced by its properties will also be destroyed if no longer reachable, and this can in turn lead to destroying objects in properties of those objects, and so on. This can lead to very deep recursion for something like a very long linked list. Too much recursion can cause MATLAB to run out of system stack space and crash. To avoid such an issue, you can explicitly destroy elements in a list rather than letting MATLAB discover that the whole list can be destroyed.
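
As a small illustration of destruction happening when the last reference disappears, consider the following sketch. The class name DeleteDemo is hypothetical and exists only to make the destructor call visible.

```matlab
% Save as DeleteDemo.m. DeleteDemo is a hypothetical class used only
% to show when the destructor runs.
classdef DeleteDemo < handle
    methods
        function delete(obj) %#ok<INUSD>
            % Called by MATLAB when the object becomes unreachable,
            % or when delete is called explicitly.
            disp('DeleteDemo object destroyed');
        end
    end
end
```

With h = DeleteDemo; h2 = h; the command clear h should print nothing, because h2 still refers to the object. A following clear h2 removes the last reference, and the message should appear at that point rather than at some later collection time.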

Consider a doubly linked list of nodes using this node class:

type dlnode
classdef dlnode < handle
    properties
        Data
    end
    properties(SetAccess = private)
        Next
        Prev
    end
    
    methods
        function node = dlnode(Data)
            node.Data = Data;
        end

        function delete(node)
            disconnect(node);
        end

        function disconnect(node)
            prev = node.Prev;
            next = node.Next;
            if ~isempty(prev)
                prev.Next = next;
            end
            if ~isempty(next)
                next.Prev = prev;
            end
            node.Next = [];
            node.Prev = [];
        end
        
        function insertAfter(newNode, nodeBefore)
            disconnect(newNode);
            newNode.Next = nodeBefore.Next;
            newNode.Prev = nodeBefore;
            if ~isempty(nodeBefore.Next)
                nodeBefore.Next.Prev = newNode;
            end
            nodeBefore.Next = newNode;
        end
        
        function insertBefore(newNode, nodeAfter)
            disconnect(newNode);
            newNode.Next = nodeAfter;
            newNode.Prev = nodeAfter.Prev;
            if ~isempty(nodeAfter.Prev)
                nodeAfter.Prev.Next = newNode;
            end
            nodeAfter.Prev = newNode;
        end       
    end
end

    
    

Create a list of 1000 elements:

top = dlnode(0);
tic
for i = 1:1000
    insertBefore(dlnode(i), top);
    top = top.Prev;
end
toc
Elapsed time is 0.123879 seconds.

Destroy the list explicitly to avoid exhausting the system stack:

tic
while ~isempty(top)
    oldTop = top;
    top = top.Next;
    disconnect(oldTop);
end
toc
Elapsed time is 0.113519 seconds.

Measure time for varying lengths of lists. We expect to see time vary linearly with the number of nodes.

N = [500 2000 5000 10000];
% Create and then tear down a list of each length in N:
CreateTime = [];
TearDownTime = [];
for n = N
    top = dlnode(0);
    tic
    for i = 1:n
        insertBefore(dlnode(i), top);
        top = top.Prev;
    end
    CreateTime = [CreateTime;toc];
    tic
    while ~isempty(top)
        oldTop = top;
        top = top.Next;
        disconnect(oldTop);
    end
    TearDownTime = [TearDownTime; toc];
end
subplot(2,1,1);
plot(N, CreateTime);
title('List Creation Time vs. List Length');
subplot(2,1,2);
plot(N, TearDownTime);
title('List Destruction Time vs. List Length');
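
One way to check the expected linear trend numerically, rather than by eye, is a least-squares line fit. The helper below is a sketch; the name linearityR2 is hypothetical, and the coefficient of determination is only one of several reasonable measures.

```matlab
% Save as linearityR2.m. linearityR2 is a hypothetical helper name.
function r2 = linearityR2(x, y)
    % Fit y = p(1)*x + p(2) by least squares and return R^2.
    % Values close to 1 suggest the data are close to a straight line.
    x = x(:);
    y = y(:);
    p = polyfit(x, y, 1);
    resid = y - polyval(p, x);
    r2 = 1 - sum(resid.^2) / sum((y - mean(y)).^2);
end
```

It could be applied to the measurements above as linearityR2(N, CreateTime) and linearityR2(N, TearDownTime).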

A Look to the Future

We continue to look for opportunities to improve MATLAB object performance and examples from you are very helpful for learning what changes will make an impact on real applications. If you have examples or scenarios you want us to look at, please let me know. Also, if you have your own ideas or best practices, it would be great to share them as well. You can post ideas and comments here.


Published with MATLAB® 7.14

17 Comments

Hi, nice post.

I am wondering if language level support could be added for specifying a variable type and size, to avoid having to write “checkValue” functions and speed performance.

I would say you could borrow from the Julia programming language:

func(var) -> any type what we have now
func(var::double) -> doubles of any size
func(var::double(:,:)) -> doubles of any size 2 dim
func(var::double(:,1)) -> doubles of any size on first dim

Having that type information could also be useful for the JIT as well.

Joan

Hello Loren,

Great topic, I’m happy you brought this up. I wonder if you have any advice/thoughts on an issue I face often with regard to indexing into dataset array objects. There is very significant overhead involved – I imagine it comes down to the subsref method implementation for this class. In practice, I extract the columns of the dataset arrays as double arrays (or other built-in types), do my math, then copy the results to a new column. It causes significant code bloat, but otherwise the performance hit is often a factor of 1000x or more.

Here is a simple example. I realize that this example can be vectorized easily but in my real applications this is not usually possible:

% Prep Data
N = 10000;
col1 = randn(N,1);
col2 = nan(N,1);

ds = dataset({col1,'Col1'},{col2,'Col2'});

% Method 1 – Using dataset subsref
tic;
for i = 1:N
ds.Col2(i) = 2*ds.Col1(i);
end
toc;

% Method 2 – Using double arrays
tic;
col1 = ds.Col1;
col2 = nan(N,1);
for i = 1:N
col2(i) = 2*col1(i);
end
ds.Col2 = col2;
toc;

Is this general method invocation overhead or something specific to the dataset arrays themselves?

Thanks!

A trick I use a lot to improve OO code performance is to replace “dependent” properties that will have large compute times with “transient” properties. (This only works on Handle classes). Basically, the trick is to allow me to have something be compute-on-demand, but only compute once. The added wrinkle is that you need something like a “dirty bit” to set when you change a normal property to tell the transient property that it needs to recompute.

This methodology does have a tendency to set off code analyzer warnings about “a set method for a non-dependent property should not access another property.” On the other hand, it can save absurd amounts of time versus having the properties be “dependent.”

properties
    vec = rand(1000,3);
end
properties (Transient = true)
    magnitude = [];
end
methods
    function value = get.magnitude(obj)
        % Compute on demand, then cache until vec changes.
        if isempty(obj.magnitude)
            obj.magnitude = sqrt(sum(obj.vec.^2,2));
        end
        value = obj.magnitude;
    end
    function set.vec(obj, value)
        obj.vec = value;
        dirty(obj);
    end
    function dirty(obj)
        % Clear the cached value so it is recomputed on next access.
        obj.magnitude = [];
    end
end

For my code (simple classes: no multithreading, no dynamic class changing, etc.) I create local variables at the beginning of every method that uses class properties, just to copy their values into. Then the method’s body uses these variables, and at the end I copy their values back into the class properties.

With 2011a (didn’t check with 2012a yet) this trick can divide your elapsed time by 40.

And it works even better with constants!

owr, dataset arrays are just not as fast as double arrays for that kind of scalar math. We are working on improving the performance, but they are really intended for vectorized operations, and that’s why they have the dot syntax to access entire variables. It’s hard to make concrete suggestions for your problem because as you say, your example is not realistic. However, it is sometimes the case that that kind of loop can be written as a function, and then called with a dataset variable with no extra code, and perhaps more clarity. For example:

ds.Col2 = twice(ds.Col1);

function x = twice(x)
    for i = 1:numel(x)
        x(i) = 2*x(i);
    end
end

Joan,
Thanks for the suggestion. I can see how optional type declarations could be useful as you describe for reducing code and improving performance.

owr,
The difference between your “method 1” and “method 2” is mainly the MATLAB accelerator and JIT. In method 2, the JIT will produce very fast machine code for the loop on double arrays, but in method 1, the fact that dataset overloads subsref means the loop is interpreted and doesn’t benefit from the JIT.

-Dave

Peter, Dave,

Thanks for your thoughts. I have a working methodology that is similar to Alexandre’s comment above. I learned something from this blog though – previously I thought the overhead was due mainly to OOP. It looks like in my situation though it mainly comes down to the use of subsref for the dataset array – or more specifically, that plus the subsref method that I have written for my own class that inherits from dataset. In the end though it is well worth it for me due to the dynamic number/type/names of cols that the dataset array provides. Thanks for your time.

Great post. Will have to reread this post a number of times to digest the huge number of nuggets about OO MATLAB. Dave I would love to see, and would definitely read, a regular blog from someone in your group.

I agree with Daniel – A Mathworks OOP blog would be very exciting! I know there’s some low-hanging fruit in some of my MATLAB “OOPlications” in terms of efficiency gains.

Two Questions: In designing a cohesive solution to plotting general equations, I have one for independent variable and first attempt at plotting, another which supplies the dependent variable and a third for formatting the final plot [by passing the handle for the original plot]. Can you point to other options and techniques?

Secondly, a good design requires avoiding problems, including use of try/catch/finally statements. I have seen just a few examples. Please point me to other examples or tutorials, especially for the Matlab specific error messages that are available.

Thanks in advance.

Dave,

I’m curious about the order-of-magnitude difference between the performance of the SimpleCylinder and NewCylinder examples:

SimpleCylinder (loop outside) : 0.112309 seconds.
SlowCylinder (custom set/get): 0.174094 seconds.
NewCylinder (loop inside constructor): 0.006654 seconds.

I would have expected the SimpleCylinder time to be closer to the NewCylinder time. I assume the discrepancy is because the SimpleCylinder loop goes from 1:1000, whereas the NewCylinder loop goes from numel(R):-1:1? (in the former case, the array size grows continually, whereas in the latter case it is effectively preallocated?)

Thanks,
Gautam

Hi, I often like to have objects embedded in normal properties of other objects (bike.wheel). These child objects themselves have properties (bike.wheel.nSpokes). Is there any additional overhead with nesting objects and retrieving properties, and is the best solution to copy the final property to a standard struct etc. and work on those each time, i.e. x = bike.wheel.rim.thickness; function(x) VS. function(bike.wheel.rim.thickness)

Also, I tried testing this:

f(obj, data) is generally faster than using obj.f(data)

But saw no difference in my test. Why is it supposed to be faster, should I be using f(obj, data) as best practice irrespectively?

Hi Ian,
In R2012a and going forward, there should be almost no overhead for nesting objects in other object properties and no need to pull the property out into a temporary variable before using it. If you repeat the indexed expression many times in a function, it might still be helpful to do the indexing once into a variable that is used in several places.

I should clarify the statement about obj.f(data) being slower than f(obj, data). The overhead is higher for obj.f(data) but this will generally not be noticeable if the function execution time is significant so I can’t say that you will notice a difference for any particular function. The overhead is higher because of the way MATLAB currently interprets syntax involving dots.

I am a machine-learning team leader in my company. We considered using OO in MATLAB for our new system. However, we feel that OO is not mature enough in MATLAB, especially because of bugs that happen when you change a class’s code. Sometimes clearing the instance doesn’t help, and neither does clearing the class or even clearing the whole workspace (‘clear classes’). The only thing that makes the object update is to close MATLAB and reopen it. This is very annoying and makes debugging an impossible mission.
Moreover, if there are handles to functions inside the class (as properties), the saved class takes 100 times more disk space than if you convert them to strings (why?!)
Hanan

We have been able to use Matlab OO quite successfully in our projects, though there definitely have been some hiccups along the way. The issue of needing to restart Matlab frequently when ‘clear classes’ fails, as Hanan mentions, is definitely prevalent. I’m not sure if the improved deletion logic seemingly added in 2012a may help there…I’m still using 2011b for all of my projects.

I agree with the idea that the OO group needs a blog!

I find the inability to override getters of dependent properties in subclasses very frustrating.

classdef superClass
    properties (Access=private)
        state_data
    end
    properties (Dependent=true)
        state
    end
    methods
        function tf = get.state(obj)
            tf = strcmp(obj.state_data, 'Happy');
        end
    end
end

classdef subClass < superClass
    methods
        function tf = get.state(obj)
            tf = strcmp(obj.state_data, 'Sad');
        end
    end
end

(sorry not really performance related)

To work around and get the type of behaviour I want means I have to write specific get methods. Yes I know there is risk in overriding set methods, but leave it to the programmer to Do It Right, rather than enforce rules that Make It Hard

Hi Jan,

I assume your alternative with specific get methods is something like this:

classdef superClass
    % state_data is protected here (not private) so that the
    % subclass getState method below is allowed to read it.
    properties (Access=protected)
        state_data
    end
    properties (Dependent)
        state
    end
    methods
        function tf = get.state(obj)
            tf = getState(obj);
        end
    end
    methods(Access=protected)
        function tf = getState(obj)
            tf = strcmp(obj.state_data, 'Happy');
        end
    end
end

classdef subClass < superClass
    methods(Access=protected)
        function tf = getState(obj)
            tf = strcmp(obj.state_data, 'Sad');
        end
    end
end

I do see your point about not “making it hard” and I agree we could make improvements. Sometimes enforcing rules makes some things easier, but we want to do it in a way that doesn’t make other things harder. For example, if a superclass can guarantee what kinds of values may be returned by that property, it is easier to write general-purpose functions that use the superclass without knowing about all possible subclasses. If changing a property to Dependent and adding a get-function doesn’t open up the property to reimplementation in subclasses, there is one less consideration for the superclass author making this change.

These postings are the author's and don't necessarily represent the opinions of MathWorks.