Loren on the Art of MATLAB

Turn ideas into MATLAB

Evolution of the Function isfield

Today I want to talk a bit about the function isfield and some of the history of its implementation. isfield has been built into MATLAB for a few releases now. We felt it was important to do this after talking to some customers who had struct arrays with a large number (tens of thousands) of fields and found the performance too slow. Let's see how isfield used to look as an M-file and look at some of the studies we did as we investigated how to proceed.

Original Code

Let me first show you the original code.

dbtype isfieldOrig
1     function tf = isfieldOrig(s,fn)
2     %ISFIELDORIG True if field is in structure array.
3     %   F = isfieldOrig(S,'field') returns true if 'field' is the name of a field
4     %   in the structure array S, otherwise it returns false.
5     
6     if isa(s,'struct')  
7       tf = any(strcmp(fieldnames(s),fn));
8     else
9       tf = false;
10    end

The algorithm is straightforward and pretty easy to read: return true only for structures, and only then if the requested field exists. So what's the problem? As I mentioned earlier, this code was slow for structures with a large number of fields. Why? First, fieldnames has to build the list of all the field names in the structure, and then strcmp compares each name to the requested one. If there are lots and lots of fields, there need to be lots and lots of comparisons.
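To get a feel for the cost, here is a rough timing sketch. It is not one of the studies we ran; the field count n and the generated field names are arbitrary choices made for illustration, and a single tic/toc measurement is noisy, so treat the numbers as indicative only. It times the fieldnames-plus-strcmp approach against the built-in isfield on one large structure.

```matlab
% Hypothetical benchmark: n and the field names are arbitrary choices.
n = 20000;
names = cell(1, n);
for k = 1:n
    names{k} = sprintf('f%d', k);
end
s = cell2struct(num2cell(1:n), names, 2);   % 1-by-1 struct with n fields

target = names{n};   % look for the last field

% Approach from isfieldOrig: list every field name, then compare each one.
tic
tfList = any(strcmp(fieldnames(s), target));
tList = toc;

% Built-in isfield on the same structure.
tic
tfBuiltin = isfield(s, target);
tBuiltin = toc;

fprintf('fieldnames + strcmp: %g s, built-in isfield: %g s\n', tList, tBuiltin);
```

The exact timings depend on your release and machine, so run it yourself rather than trusting any particular ratio.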

Speedier with try

So I figured, what if I just try to address the structure field and return my success or failure from that? Here's isfieldTry.

dbtype isfieldTry
1     function tf = isfieldTry(s,f)
2     %ISFIELD True if field is in structure array.
3     %   F = ISFIELDTRY(S,'field') returns true if 'field' is the name of a field
4     %   in the structure array S.
5     %
6     %   See also GETFIELD, SETFIELD, FIELDNAMES.
7     
8     tf = true;
9     try  
10      tmp = s.(f);
11    catch
12        tf = false;
13    end
14    
15    

This approach was much faster for structures with lots of fields. But I missed something when I did this: it might return true for objects, if the class designer allowed users to use dot notation to address some aspects of the object.

Return false for Objects

Next I added some logic to see if the first input is actually a struct. If not, return false straight away.

dbtype isfieldReturnBeforeTry
1     function tf = isfieldReturnBeforeTry(s,f)
2     %ISFIELD True if field is in structure array.
3     %   F = ISFIELDRETURNBEFORETRY(S,'field') returns true if 'field' is the name of a field
4     %   in the structure array S.
5     %
6     %   See also GETFIELD, SETFIELD, FIELDNAMES.
7     
8     tf = false;
9     if ~isstruct(s)
10        return
11    end  
12    try  
13      tmp = s.(f);
14      tf = true;
15    catch
16    end
17    
18    

As you can see, the function is now getting a little bit more complicated than I originally thought would be required.

Speedier with eval, but Ugly

I next saw a user substitute this code. Any guesses what I think about this? Here's one article I've written about eval previously.

dbtype isfieldEval
1     function tf = isfieldEval(s,f)
2     %ISFIELD True if field is in structure array.
3     %   F = ISFIELDEVAL(S,'field') returns true if 'field' is the name of a field
4     %   in the structure array S.
5     %
6     %   See also GETFIELD, SETFIELD, FIELDNAMES.
7     
8     tf = true;
9     eval('tmp = s.(f);','tf = false;');
10    
11    

Correct Behavior with Decent Speed

Well folks, the new M-files still don't have the right behavior in MATLAB even though the answers are correct! If I use a version of the M-file that uses the try-catch syntax, and if the field does not exist, then the state of lasterror will be changed, even though the isfield* collection of functions never actually results in a user error, but rather always returns true or false. Let me show you what I mean.

lasterror reset % reset lasterror, just in case
isfieldTry(17,'fred')
lerr = lasterror
disp(lerr.message)
ans =

     0


lerr = 

       message: 'Attempt to reference field of non-structure array.'
    identifier: 'MATLAB:nonStrucReference'
         stack: [5x1 struct]

Attempt to reference field of non-structure array.

Now suppose I want to use isfield in a larger application where I sometimes need to check for errors. Depending on how I've written the program, I might well get a false positive error message, and I don't want that.
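To make the false positive concrete, here is a minimal sketch of such an application. The structure opts and the field name 'maxIter' are hypothetical, and the try-catch is written inline (mirroring the body of isfieldTry) so that the snippet runs on its own.

```matlab
lasterror('reset');            % start from a clean error state

opts = struct('tol', 1e-6);    % hypothetical options structure

% Inline equivalent of isfieldTry(opts, 'maxIter')
tf = true;
try
    tmp = opts.('maxIter');    % this field does not exist
catch
    tf = false;
end

% Later, the application checks whether anything went wrong.
err = lasterror;
if ~isempty(err.message)
    % Nothing actually failed from the user's point of view,
    % yet the error state says otherwise.
    disp(['Spurious error reported: ' err.message])
end
```

The field test itself gives the right answer (tf is false), but the caught error has leaked into lasterror.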

What I really need to do is to get the state of lasterror before I do any work, and reset its state to the previous error state after I finish. Finally, here's the code to do this.

dbtype isfieldLasterror
1     function tf = isfieldLasterror(s,f)
2     %ISFIELD True if field is in structure array.
3     %   F = ISFIELDLASTERROR(S,'field') returns true if 'field' is the name of a field
4     %   in the structure array S.
5     %
6     %   See also GETFIELD, SETFIELD, FIELDNAMES.
7     
8     tf = false;
9     if ~isa(s,'struct')
10        % non-structs shouldn't have exposed fields unless specific objects
11        % overload isfield for themselves
12        return  
13    end  
14    try
15      l = lasterror;  % capture current error state
16      tmp = s.(f);
17      tf = true;
18    catch
19      lasterror(l);  % reset previous error state
20    end
21    
22    

Result

In the end, we chose to build isfield into MATLAB where we have access to the internal data structures. At that point, it is very quick and easy to ascertain if the first input is a structure. If so, it's a quick hash table lookup to see if the requested fieldname exists and to return the answer without disrupting the error state.
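As a quick check of the behavior described above, the following sketch calls the built-in isfield on a small structure and on a non-structure, then confirms that the error state is untouched. The variable and field names are made up for illustration.

```matlab
lasterror('reset');            % start from a clean error state

s = struct('alpha', 1);        % hypothetical structure with one field

a = isfield(s, 'alpha');       % true: the field exists
b = isfield(s, 'beta');        % false: a structure, but no such field
c = isfield(17, 'fred');       % false: not a structure at all

err = lasterror;
assert(isempty(err.message))   % no error was recorded along the way
```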

Do you have functions that alter a state or property for expediency and then need to reset it? Have you remembered to reset it? I'd love to hear your thoughts.


Published with MATLAB® 7.3
