{"id":56,"date":"2006-09-27T11:49:41","date_gmt":"2006-09-27T16:49:41","guid":{"rendered":"https:\/\/blogs.mathworks.com\/loren\/?p=56"},"modified":"2007-03-08T09:59:06","modified_gmt":"2007-03-08T14:59:06","slug":"evolution-of-the-function-isfield","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/loren\/2006\/09\/27\/evolution-of-the-function-isfield\/","title":{"rendered":"Evolution of the Function isfield"},"content":{"rendered":"<div xmlns:mwsh=\"https:\/\/www.mathworks.com\/namespace\/mcode\/v1\/syntaxhighlight.dtd\" class=\"content\">\r\n   <introduction>\r\n      <p>Today I want to talk a bit about the function <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/isfield.html\"><tt>isfield<\/tt><\/a> and some of the history of its implementation.  <tt>isfield<\/tt> has been built into MATLAB for a few releases now.  We felt it was important to do this after talking to some\r\n         customers who had <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/struct.html\"><tt>struct<\/tt><\/a> arrays with a large number (tens of thousands) of fields and found the performance too slow. 
Let's see how <tt>isfield<\/tt> used to look as an M-file and look at some of the studies we did as we investigated how to proceed.\r\n      <\/p>\r\n   <\/introduction>\r\n   <h3>Contents<\/h3>\r\n   <div>\r\n      <ul>\r\n         <li><a href=\"#1\">Original Code<\/a><\/li>\r\n         <li><a href=\"#3\">Speedier with try<\/a><\/li>\r\n         <li><a href=\"#5\">Return false for Objects<\/a><\/li>\r\n         <li><a href=\"#7\">Speedier with eval, but Ugly<\/a><\/li>\r\n         <li><a href=\"#8\">Correct Behavior with Decent Speed<\/a><\/li>\r\n         <li><a href=\"#11\">Result<\/a><\/li>\r\n      <\/ul>\r\n   <\/div>\r\n   <h3>Original Code<a name=\"1\"><\/a><\/h3>\r\n   <p>Let me first show you the original code.<\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">dbtype <span style=\"color: #A020F0\">isfieldOrig<\/span><\/pre><pre style=\"font-style:oblique\">\r\n1     function tf = isfieldOrig(s,fn)\r\n2     %ISFIELDORIG True if field is in structure array.\r\n3     %   F = isfieldOrig(S,'field') returns true if 'field' is the name of a field\r\n4     %   in the structure array S, otherwise it returns false.\r\n5     \r\n6     if isa(s,'struct')  \r\n7       tf = any(strcmp(fieldnames(s),fn));\r\n8     else\r\n9       tf = false;\r\n10    end\r\n\r\n<\/pre><p>The algorithm is straightforward and pretty easy to read.  It returns <tt>true<\/tt> only for structures, and only if the requested field exists. So what's the problem?  As I mentioned earlier, this code was\r\n      slow for structures with a large number of fields.  Why?  First, it had to get all the <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/fieldnames.html\"><tt>fieldnames<\/tt><\/a> from the structure, and then compare each field name to the requested one.  
If there are lots and lots of fields, there\r\n      need to be lots and lots of comparisons.\r\n   <\/p>\r\n   <h3>Speedier with try<a name=\"3\"><\/a><\/h3>\r\n   <p>So I figured, what if I just <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/try.html\"><tt>try<\/tt><\/a> to address the structure field and return my success or failure from that?  Here's <tt>isfieldTry<\/tt>.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">dbtype <span style=\"color: #A020F0\">isfieldTry<\/span><\/pre><pre style=\"font-style:oblique\">\r\n1     function tf = isfieldTry(s,f)\r\n2     %ISFIELD True if field is in structure array.\r\n3     %   F = ISFIELDTRY(S,'field') returns true if 'field' is the name of a field\r\n4     %   in the structure array S.\r\n5     %\r\n6     %   See also GETFIELD, SETFIELD, FIELDNAMES.\r\n7     \r\n8     tf = true;\r\n9     try  \r\n10      tmp = s.(f);\r\n11    catch\r\n12        tf = false;\r\n13    end\r\n14    \r\n15    \r\n\r\n<\/pre><p>This approach was <b>much<\/b> faster for structures with lots of fields.  But I missed something when I did this: it might return <tt>true<\/tt> for objects, if the class designer allowed users to use dot notation to address some aspects of the object.\r\n   <\/p>\r\n   <h3>Return false for Objects<a name=\"5\"><\/a><\/h3>\r\n   <p>Next I added some logic to see if the first input is actually a <tt>struct<\/tt>.  
If not, return <tt>false<\/tt> straight away.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">dbtype <span style=\"color: #A020F0\">isfieldReturnBeforeTry<\/span><\/pre><pre style=\"font-style:oblique\">\r\n1     function tf = isfieldReturnBeforeTry(s,f)\r\n2     %ISFIELD True if field is in structure array.\r\n3     %   F = ISFIELDRETURNBEFORETRY(S,'field') returns true if 'field' is the name of a field\r\n4     %   in the structure array S.\r\n5     %\r\n6     %   See also GETFIELD, SETFIELD, FIELDNAMES.\r\n7     \r\n8     tf = false;\r\n9     if ~isstruct(s)\r\n10        return\r\n11    end  \r\n12    try  \r\n13      tmp = s.(f);\r\n14      tf = true;\r\n15    catch\r\n16    end\r\n17    \r\n18    \r\n\r\n<\/pre><p>As you can see, the function is now getting a little bit more complicated than I originally thought would be required.<\/p>\r\n   <h3>Speedier with eval, but Ugly<a name=\"7\"><\/a><\/h3>\r\n   <p>I next saw a user substitute this code. Any guesses what I think about this?  
<a href=\"https:\/\/blogs.mathworks.com\/loren\/?p=9\">Here<\/a>'s one article I've written about <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/eval.html\"><tt>eval<\/tt><\/a> previously.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">dbtype <span style=\"color: #A020F0\">isfieldEval<\/span><\/pre><pre style=\"font-style:oblique\">\r\n1     function tf = isfieldEval(s,f)\r\n2     %ISFIELD True if field is in structure array.\r\n3     %   F = ISFIELDEVAL(S,'field') returns true if 'field' is the name of a field\r\n4     %   in the structure array S.\r\n5     %\r\n6     %   See also GETFIELD, SETFIELD, FIELDNAMES.\r\n7     \r\n8     tf = true;\r\n9     eval('tmp = s.(f);','tf = false;');\r\n10    \r\n11    \r\n\r\n<\/pre><h3>Correct Behavior with Decent Speed<a name=\"8\"><\/a><\/h3>\r\n   <p>Well folks, the new M-files still don't have the right behavior in MATLAB <b>even though the answers are correct!<\/b>  If I use a version of the M-file that uses the <tt>try-catch<\/tt> syntax, and if the field does not exist, then the state of <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/ref\/lasterror.html\"><tt>lasterror<\/tt><\/a> will be changed, even though the <tt>isfield*<\/tt> collection of functions never actually results in a user error, but rather always returns <tt>true<\/tt> or <tt>false<\/tt>.  
Let me show you what I mean.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">lasterror <span style=\"color: #A020F0\">reset<\/span> <span style=\"color: #228B22\">% reset lasterror, just in case<\/span>\r\nisfieldTry(17,<span style=\"color: #A020F0\">'fred'<\/span>)\r\nlerr = lasterror\r\ndisp(lerr.message)<\/pre><pre style=\"font-style:oblique\">\r\nans =\r\n\r\n     0\r\n\r\n\r\nlerr = \r\n\r\n       message: 'Attempt to reference field of non-structure array.'\r\n    identifier: 'MATLAB:nonStrucReference'\r\n         stack: [5x1 struct]\r\n\r\nAttempt to reference field of non-structure array.\r\n<\/pre><p>Now suppose that I want to use <tt>isfield<\/tt> in a larger application where I sometimes need to check for errors.  Depending on how I've written the program, I might\r\n      well see a spurious error reported by <tt>lasterror<\/tt>, even though nothing actually went wrong.  And I don't want that.\r\n   <\/p>\r\n   <p>What I really need to do is capture the state of <tt>lasterror<\/tt> before I do any work, and restore that state after I finish. 
Finally, here's the code to do this.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">dbtype <span style=\"color: #A020F0\">isfieldLasterror<\/span><\/pre><pre style=\"font-style:oblique\">\r\n1     function tf = isfieldLasterror(s,f)\r\n2     %ISFIELD True if field is in structure array.\r\n3     %   F = ISFIELDLASTERROR(S,'field') returns true if 'field' is the name of a field\r\n4     %   in the structure array S.\r\n5     %\r\n6     %   See also GETFIELD, SETFIELD, FIELDNAMES.\r\n7     \r\n8     tf = false;\r\n9     if ~isa(s,'struct')\r\n10        % non-structs shouldn't have exposed fields unless specific objects\r\n11        % overload isfield for themselves\r\n12        return  \r\n13    end  \r\n14    try\r\n15      l = lasterror;  % capture current error state\r\n16      tmp = s.(f);\r\n17      tf = true;\r\n18    catch\r\n19      lasterror(l);  % reset previous error state\r\n20    end\r\n21    \r\n22    \r\n\r\n<\/pre><h3>Result<a name=\"11\"><\/a><\/h3>\r\n   <p>In the end, we chose to build <tt>isfield<\/tt> into MATLAB where we have access to the internal data structures.  At that point, it is very quick and easy to ascertain\r\n      if the first input is a structure.  If so, it's a quick <a href=\"http:\/\/en.wikipedia.org\/wiki\/Hash_table\">hash table<\/a> lookup to see if the requested <tt>fieldname<\/tt> exists and to return the answer without disrupting the error state.\r\n   <\/p>\r\n   <p>Do you have functions that alter a state or property for expediency and then need to reset it?  Have you remembered to reset\r\n      it?  
I'd love to hear <a href=\"https:\/\/blogs.mathworks.com\/loren\/2006\/09\/27\/evolution-of-the-function-isfield\/#respond\">your thoughts<\/a>.\r\n   <\/p>\r\n   <p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br>\r\n      Published with MATLAB&reg; 7.3<br><\/p>\r\n<\/div>\r\n","protected":false},"excerpt":{"rendered":"<p>\r\n   \r\n      Today I want to talk a bit about the function isfield and some of the history of its implementation.  isfield has been built into MATLAB for a few releases now.  We felt it was important... 
<a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/loren\/2006\/09\/27\/evolution-of-the-function-isfield\/\">read more >><\/a><\/p>","protected":false},"author":39,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[16,14,10,11],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/56"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/users\/39"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/comments?post=56"}],"version-history":[{"count":0,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/56\/revisions"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/media?parent=56"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/categories?post=56"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/tags?post=56"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}