More Ways to Find Matching Data

Posted by Loren Shure, January 20, 2009

Today on the newsgroup, a user wanted help finding when values in a matrix matched some other values (see the post). There was already a solution posted when I was reading, but something about this problem kept nagging at me. So I've invested a little bit of time thinking more about the problem.

Sample Data
Bruno's Solution - Column-wise Solution
Find Exact Location Matches
Find Presence - Row-wise Solution
Another Solution
Other Approaches

Sample Data

Here's the sample data. The user wants to find every location in A whose value matches any of the values in B. It's a simple enough statement.

A = [11 22 34 56 89
23 44 11 20 66
79 54 32 17 89
11 66 21 45 90]
B = [11 66 44 40 90]

A =
    11    22    34    56    89
    23    44    11    20    66
    79    54    32    17    89
    11    66    21    45    90
B =
    11    66    44    40    90

Bruno's Solution - Column-wise Solution

As I said, there was already a solution when I was reading. Here it is.

RESULTS = zeros(size(A));
for i = 1: size(B,2)
    RESULTS = RESULTS + ( A == B(1,i) );
end
RESULTS

RESULTS =
     1     0     0     0     0
     0     1     1     0     1
     0     0     0     0     0
     1     1     0     0     1

The idea here is to create an output of the right size and cycle through the values in B (the smaller array in the user's example). For each value in B, check where it matches an element of A, and add 1 to RESULTS at each of those locations.
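One property of the loop above is worth noting: because it adds, an entry of RESULTS would exceed 1 if B contained repeated values. Here is a minimal sketch of a variant that keeps a logical mask instead, assuming a true/false answer is all that is wanted. The name MASK is illustrative and is not from the newsgroup post.

```matlab
A = [11 22 34 56 89
     23 44 11 20 66
     79 54 32 17 89
     11 66 21 45 90];
B = [11 66 44 40 90];

% Start with an all-false mask the same size as A.
MASK = false(size(A));
for i = 1:numel(B)
    % OR in the hits for this value of B, so repeats in B cannot
    % push an entry past true.
    MASK = MASK | (A == B(i));
end
MASK
```

For the sample data, B has no repeats, so MASK marks the same locations as RESULTS.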

Find Exact Location Matches

To be honest, I misread the question at first, and came up with the following code. However, it does not solve the problem as stated!

C = ~(A-repmat(B,size(A,1),1))

C =
     1     0     0     0     0
     0     0     0     0     0
     0     0     0     0     0
     1     1     0     0     1

I tried repmat first so that I could get the right answer and then look for an equivalent solution using bsxfun. But this solution, which could be quite costly because of the repmat, only matches values at specific column locations. In other words, it solves the wrong problem.
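A bsxfun version of this location-by-location comparison might look like the sketch below. It expands B across the rows of A implicitly, so no repmat copy is built. It answers the same column-specific question as the code above, not the user's question, and the name C2 is just for illustration.

```matlab
A = [11 22 34 56 89
     23 44 11 20 66
     79 54 32 17 89
     11 66 21 45 90];
B = [11 66 44 40 90];

% Compare every row of A against B, column by column, letting bsxfun
% do the singleton expansion instead of repmat.
C2 = bsxfun(@eq, A, B)

% Check against the repmat version.
sameAsRepmat = isequal(C2, ~(A - repmat(B, size(A,1), 1)))
```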

Find Presence - Row-wise Solution

Once I realized I was solving the wrong problem, I wised up a tiny amount and next tried finding matches row by row in A.

Z = zeros(size(A));
for k = 1:size(A,1)
    Z(k,:) = ismember(A(k,:),B);
end
Z

Z =
     1     0     0     0     0
     0     1     1     0     1
     0     0     0     0     0
     1     1     0     0     1

At least this time I get the right answer and am solving the right problem! But this would normally take longer than Bruno's approach, because the user's premise was that A was huge and B wasn't nearly so large: this loop runs once per row of A, while Bruno's runs once per element of B.

isequal(Z,RESULTS)

ans =
     1

The reason I tried this approach was to see if I could then convert it to something using arrayfun, or perhaps cellfun.
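An arrayfun conversion of the row loop could look something like this sketch. Each call returns a whole row rather than a scalar, so it assumes 'UniformOutput' is set to false and the pieces are stacked afterwards. The names rows and Z2 are illustrative.

```matlab
A = [11 22 34 56 89
     23 44 11 20 66
     79 54 32 17 89
     11 66 21 45 90];
B = [11 66 44 40 90];

% Call ismember once per row index. Each call returns a 1-by-5 logical
% row, so the results are collected in a cell array.
rows = arrayfun(@(k) ismember(A(k,:), B), (1:size(A,1))', ...
                'UniformOutput', false);

% Stack the rows back into a matrix the same size as A.
Z2 = vertcat(rows{:})
```

This still makes one ismember call per row of A, so it tidies up the loop without changing the amount of work.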

Another Solution

Finally, however, I had some coffee! And thank goodness. The answer was in front of me all along, and I was already using it: ismember.

FinalAnswer = ismember(A,B)

FinalAnswer =
     1     0     0     0     0
     0     1     1     0     1
     0     0     0     0     0
     1     1     0     0     1

isequal(FinalAnswer, RESULTS)

ans =
     1

In one fell swoop I can compute the entire result, because I don't care about matching specific locations. It's enough to say that a value in A matches some value in B. Voila!
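For comparison, the same presence test can also be written with bsxfun, which is where the earlier repmat experiment was headed. This is only a sketch, and the name P is illustrative. It builds a temporary with numel(A) rows and numel(B) columns, so for a huge A the ismember call above is likely the better choice.

```matlab
A = [11 22 34 56 89
     23 44 11 20 66
     79 54 32 17 89
     11 66 21 45 90];
B = [11 66 44 40 90];

% Compare every element of A (as a column) with every element of B
% (as a row), giving a numel(A)-by-numel(B) logical array.
hits = bsxfun(@eq, A(:), B(:).');

% An element of A is present in B if any comparison in its row is true;
% reshape the column of answers back to the shape of A.
P = reshape(any(hits, 2), size(A))

sameAsIsmember = isequal(P, ismember(A, B))
```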

Other Approaches

I just showed you a few approaches for solving this problem. Do you have similar problems, perhaps ones that don't yield as simple a solution? Or do you have other approaches to solving this problem that might be useful, especially in more complicated situations? Please share them here.

Published with MATLAB® 7.7