{"id":168,"date":"2009-01-20T15:28:10","date_gmt":"2009-01-20T15:28:10","guid":{"rendered":"https:\/\/blogs.mathworks.com\/loren\/2009\/01\/20\/more-ways-to-find-matching-data\/"},"modified":"2018-01-08T15:19:51","modified_gmt":"2018-01-08T20:19:51","slug":"more-ways-to-find-matching-data","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/loren\/2009\/01\/20\/more-ways-to-find-matching-data\/","title":{"rendered":"More Ways to Find Matching Data"},"content":{"rendered":"<div xmlns:mwsh=\"https:\/\/www.mathworks.com\/namespace\/mcode\/v1\/syntaxhighlight.dtd\" class=\"content\">\r\n   <introduction>\r\n      <p>Today on the newsgroup, a user wanted help finding when values in a matrix matched some other values (see the <a>post<\/a>). There was already a solution posted when I was reading, but something about this problem kept nagging at me.  So I've invested\r\n         a little bit of time thinking more about the problem.\r\n      <\/p>\r\n   <\/introduction>\r\n   <h3>Contents<\/h3>\r\n   <div>\r\n      <ul>\r\n         <li><a href=\"#1\">Sample Data<\/a><\/li>\r\n         <li><a href=\"#2\">Bruno's Solution - Column-wise Solution<\/a><\/li>\r\n         <li><a href=\"#4\">Find Exact Location Matches<\/a><\/li>\r\n         <li><a href=\"#6\">Find Presence - Row-wise Solution<\/a><\/li>\r\n         <li><a href=\"#9\">Another Solution<\/a><\/li>\r\n         <li><a href=\"#12\">Other Approaches<\/a><\/li>\r\n      <\/ul>\r\n   <\/div>\r\n   <h3>Sample Data<a name=\"1\"><\/a><\/h3>\r\n   <p>Here's sample data, and the user wants to find all the places in <tt>A<\/tt> which have values that match values in <tt>B<\/tt>.  
Simple enough statement.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">A = [11 22 34 56 89\r\n23 44 11 20 66\r\n79 54 32 17 89\r\n11 66 21 45 90]\r\nB = [11 66 44 40 90]<\/pre><pre style=\"font-style:oblique\">A =\r\n    11    22    34    56    89\r\n    23    44    11    20    66\r\n    79    54    32    17    89\r\n    11    66    21    45    90\r\nB =\r\n    11    66    44    40    90\r\n<\/pre><h3>Bruno's Solution - Column-wise Solution<a name=\"2\"><\/a><\/h3>\r\n   <p>As I said, there was already a solution when I was reading.  Here it is.<\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">RESULTS = zeros(size(A));\r\n<span style=\"color: #0000FF\">for<\/span> i = 1: size(B,2)\r\n    RESULTS = RESULTS + ( A == B(1,i) );\r\n<span style=\"color: #0000FF\">end<\/span>\r\nRESULTS<\/pre><pre style=\"font-style:oblique\">RESULTS =\r\n     1     0     0     0     0\r\n     0     1     1     0     1\r\n     0     0     0     0     0\r\n     1     1     0     0     1\r\n<\/pre><p>The idea here is to create the right size output, and cycle through the values in <tt>B<\/tt> (the smaller array for the user's example).  Check to see where a given value in <tt>B<\/tt> matches one in <tt>A<\/tt>, and add a 1 to the <tt>RESULTS<\/tt> when those hits are found.\r\n   <\/p>\r\n   <h3>Find Exact Location Matches<a name=\"4\"><\/a><\/h3>\r\n   <p>To be honest, I misread the question at first, and came up with the following code.  
However, it does <b>not<\/b> solve the problem as stated!\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">C = ~(A-repmat(B,size(A,1),1))<\/pre><pre style=\"font-style:oblique\">C =\r\n     1     0     0     0     0\r\n     0     0     0     0     0\r\n     0     0     0     0     0\r\n     1     1     0     0     1\r\n<\/pre><p>My plan with <tt>repmat<\/tt> was to get the right answer first, and then replace it with a solution using <a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2008b\/techdoc\/ref\/bsxfun.html\"><tt>bsxfun<\/tt><\/a>. But this code (which could be quite costly due to the <a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2008b\/techdoc\/ref\/repmat.html\"><tt>repmat<\/tt><\/a>) only matches each value in <tt>A<\/tt> against the value of <tt>B<\/tt> in the same column.  In other words, it solves the wrong problem.\r\n   <\/p>\r\n   <h3>Find Presence - Row-wise Solution<a name=\"6\"><\/a><\/h3>\r\n   <p>Wising up a tiny bit and realizing that I was solving the wrong problem, I next tried finding matches one row of <tt>A<\/tt> at a time.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">Z = zeros(size(A));\r\n<span style=\"color: #0000FF\">for<\/span> k = 1:size(A,1)\r\n    Z(k,:) = ismember(A(k,:),B);\r\n<span style=\"color: #0000FF\">end<\/span>\r\nZ<\/pre><pre style=\"font-style:oblique\">Z =\r\n     1     0     0     0     0\r\n     0     1     1     0     1\r\n     0     0     0     0     0\r\n     1     1     0     0     1\r\n<\/pre><p>At least this time I get the right answer and am solving the right problem!  
But this would normally take longer than Bruno's\r\n      approach, because the user's premise was that <tt>A<\/tt> was <b>huge<\/b> and <tt>B<\/tt> wasn't nearly so large: this loop runs once per row of <tt>A<\/tt>, while Bruno's runs once per element of <tt>B<\/tt>.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">isequal(Z,RESULTS)<\/pre><pre style=\"font-style:oblique\">ans =\r\n     1\r\n<\/pre><p>I tried this approach to see if I could then convert it to something using <a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2008b\/techdoc\/ref\/arrayfun.html\"><tt>arrayfun<\/tt><\/a>, or perhaps <a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2008b\/techdoc\/ref\/cellfun.html\"><tt>cellfun<\/tt><\/a>.\r\n   <\/p>\r\n   <h3>Another Solution<a name=\"9\"><\/a><\/h3>\r\n   <p>Finally, however, I had some coffee!  And thank goodness.  The answer was in front of me all along, and I was already using\r\n      it: <a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2008b\/techdoc\/ref\/ismember.html\"><tt>ismember<\/tt><\/a>.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">FinalAnswer = ismember(A,B)<\/pre><pre style=\"font-style:oblique\">FinalAnswer =\r\n     1     0     0     0     0\r\n     0     1     1     0     1\r\n     0     0     0     0     0\r\n     1     1     0     0     1\r\n<\/pre><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">isequal(FinalAnswer, RESULTS)<\/pre><pre style=\"font-style:oblique\">ans =\r\n     1\r\n<\/pre><p>In one fell swoop I can compute the entire result, because I don't care where in <tt>B<\/tt> the match occurs. It's enough to say\r\n      that a value in <tt>A<\/tt> matches some value in <tt>B<\/tt>.  Voila!\r\n   <\/p>\r\n   <h3>Other Approaches<a name=\"12\"><\/a><\/h3>\r\n   <p>I just showed you a few approaches for solving this problem.  Do you have similar problems, perhaps ones that don't yield\r\n      as simple a solution?  
Or do you have other approaches to solving this problem that might be useful, especially in more complicated\r\n      situations?  Please share them <a href=\"https:\/\/blogs.mathworks.com\/loren\/?p=168#respond\">here<\/a>.\r\n   <\/p><p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br>\r\n      Published with MATLAB&reg; 7.7<br><\/p>\r\n<\/div>\r\n","protected":false},"excerpt":{"rendered":"<p>\r\n   \r\n      Today on the newsgroup, a user wanted help finding when values in a matrix matched some other values (see the post). There was already a solution posted when I was reading, but something... 
<a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/loren\/2009\/01\/20\/more-ways-to-find-matching-data\/\">read more >><\/a><\/p>","protected":false},"author":39,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[10,15,12],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/168"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/users\/39"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/comments?post=168"}],"version-history":[{"count":1,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/168\/revisions"}],"predecessor-version":[{"id":2564,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/168\/revisions\/2564"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/media?parent=168"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/categories?post=168"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/tags?post=168"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}