Evolution of the Function isfield
Today I want to talk a bit about the function isfield and some of the history of its implementation. isfield has been built into MATLAB for a few releases now. We felt it was important to do this after talking to some customers who had struct arrays with a large number (tens of thousands) of fields and found the performance too slow. Let's see how isfield used to look as an M-file and look at some of the studies we did as we investigated how to proceed.
Original Code
Let me first show you the original code.
dbtype isfieldOrig
    function tf = isfieldOrig(s,fn)
    %ISFIELDORIG True if field is in structure array.
    %   F = isfieldOrig(S,'field') returns true if 'field' is the name of a field
    %   in the structure array S, otherwise it returns false.

    if isa(s,'struct')
        tf = any(strcmp(fieldnames(s),fn));
    else
        tf = false;
    end
The algorithm is straightforward and pretty easy to read: only return true for structures, and only then if the requested field exists. So what's the problem? As I mentioned earlier, this code was slow for structures with a large number of fields. Why? First, the code has to get all the fieldnames from the structure, and then it has to compare each fieldname to the requested one. If there are lots and lots of fields, there need to be lots and lots of comparisons.
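To get a feel for that cost, here is a minimal sketch that builds a structure with many fields and times the fieldnames-plus-strcmp lookup. The field count and the field names (f1, f2, ...) are made up for illustration, and timeit only ships with later MATLAB releases (R2013b and up), so treat this as a rough experiment rather than the study we actually ran.

```matlab
% Hypothetical benchmark: the field count and field names are illustrative.
nFields = 20000;
names = arrayfun(@(k) sprintf('f%d', k), 1:nFields, 'UniformOutput', false);
s = cell2struct(num2cell(zeros(nFields, 1)), names, 1);   % 1x1 struct, nFields fields

target = names{end};   % a field name that exists in s

% The original approach: fetch every field name, then compare each one.
scanLookup = @() any(strcmp(fieldnames(s), target));

tf = scanLookup();             % logical result of the linear scan
tScan = timeit(scanLookup);    % typical time for one lookup, in seconds
fprintf('%d fields: %.3g s per lookup\n', nFields, tScan);
```

Changing nFields and rerunning should show how the lookup time varies with the number of fields on your machine.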
Speedier with try
So I figured, what if I just try to address the structure field and return my success or failure from that? Here's isfieldTry.
dbtype isfieldTry
    function tf = isfieldTry(s,f)
    %ISFIELD True if field is in structure array.
    %   F = ISFIELDTRY(S,'field') returns true if 'field' is the name of a field
    %   in the structure array S.
    %
    %   See also GETFIELD, SETFIELD, FIELDNAMES.

    tf = true;
    try
        tmp = s.(f);
    catch
        tf = false;
    end
This approach was much faster for structures with lots of fields. But I missed something when I did this: it might return true for objects, if the class designer allowed users to use dot notation to address some aspects of the object.
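Here is a small sketch of that pitfall. It uses a table purely as a convenient example of a non-struct type that supports dot notation; table postdates this post, and the variable name x is invented for the demonstration. The try block repeats the logic of isfieldTry inline so the snippet runs on its own.

```matlab
% Illustration only: table is a stand-in for any class that supports dot notation.
t = table([1; 2; 3], 'VariableNames', {'x'});

% Same logic as isfieldTry, written inline.
tfTry = true;
try
    tmp = t.('x');   % dynamic dot reference succeeds on a table variable
catch
    tfTry = false;
end

tfStruct = isstruct(t);   % a table is not a struct

fprintf('try-based answer: %d, isstruct: %d\n', tfTry, tfStruct);
```

The try-based check reports success even though t is not a structure, which is not the answer isfield should give.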
Return false for Objects
Next I added some logic to see if the first input is actually a struct. If not, return false straight away.
dbtype isfieldReturnBeforeTry
    function tf = isfieldReturnBeforeTry(s,f)
    %ISFIELD True if field is in structure array.
    %   F = ISFIELDRETURNBEFORETRY(S,'field') returns true if 'field' is the name of a field
    %   in the structure array S.
    %
    %   See also GETFIELD, SETFIELD, FIELDNAMES.

    tf = false;
    if ~isstruct(s)
        return
    end
    try
        tmp = s.(f);
        tf = true;
    catch
    end
As you can see, the function is now getting a little bit more complicated than I originally thought would be required.
Speedier with eval, but Ugly
I next saw a user substitute this code. Any guesses what I think about this? Here's one article I've written about eval previously.
dbtype isfieldEval
    function tf = isfieldEval(s,f)
    %ISFIELD True if field is in structure array.
    %   F = ISFIELDEVAL(S,'field') returns true if 'field' is the name of a field
    %   in the structure array S.
    %
    %   See also GETFIELD, SETFIELD, FIELDNAMES.

    tf = true;
    eval('tmp = s.(f);','tf = false;');
Correct Behavior with Decent Speed
Well folks, the new M-files still don't have the right behavior in MATLAB even though the answers are correct! If I use a version of the M-file that uses the try-catch syntax, and if the field does not exist, then the state of lasterror will be changed, even though the isfield* collection of functions never actually results in a user error, but rather always returns true or false. Let me show you what I mean.
    lasterror reset   % reset lasterror, just in case
    isfieldTry(17,'fred')
    lerr = lasterror
    disp(lerr.message)
    ans =
         0
    lerr =
           message: 'Attempt to reference field of non-structure array.'
        identifier: 'MATLAB:nonStrucReference'
             stack: [5x1 struct]
    Attempt to reference field of non-structure array.
Now suppose that I want to use isfield in a larger application where I sometimes need to check for errors. Depending on how I've written the program, I might well see a false positive: an error message left behind by isfield even though nothing actually went wrong. And I don't want that.
What I really need to do is to get the state of lasterror before I do any work, and reset its state to the previous error state after I finish. Finally, here's the code to do this.
dbtype isfieldLasterror
    function tf = isfieldLasterror(s,f)
    %ISFIELD True if field is in structure array.
    %   F = ISFIELDLASTERROR(S,'field') returns true if 'field' is the name of a field
    %   in the structure array S.
    %
    %   See also GETFIELD, SETFIELD, FIELDNAMES.

    tf = false;
    if ~isa(s,'struct')
        % non-structs shouldn't have exposed fields unless specific objects
        % overload isfield for themselves
        return
    end
    try
        l = lasterror;    % capture current error state
        tmp = s.(f);
        tf = true;
    catch
        lasterror(l);     % reset previous error state
    end
Result
In the end, we chose to build isfield into MATLAB where we have access to the internal data structures. At that point, it is very quick and easy to ascertain if the first input is a structure. If so, it's a quick hash table lookup to see if the requested fieldname exists and to return the answer without disrupting the error state.
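As a rough analogy for that lookup, here is a sketch that uses containers.Map to play the role of a hash table keyed by field name. This is not how MATLAB stores fields internally; containers.Map arrived in a later release than this post, and the names alpha, beta, and gamma are invented for the example. It only shows the idea of asking a hash table whether a key exists instead of scanning a list of names.

```matlab
% Analogy only: containers.Map stands in for an internal field-name hash table.
names = {'alpha', 'beta', 'gamma'};                % hypothetical field names
index = containers.Map(names, 1:numel(names));     % field name -> position

hasBeta  = isKey(index, 'beta');    % key is present
hasDelta = isKey(index, 'delta');   % key is absent

fprintf('beta: %d, delta: %d\n', hasBeta, hasDelta);
```

Neither query touches the error state, and neither needs to walk through every name in the list.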
Do you have functions that alter a state or property for expediency and then need to reset it? Have you remembered to reset it? I'd love to hear your thoughts.
Published with MATLAB® 7.3
Category: Best Practice, Common Errors, Efficiency, Robustness