File Exchange Pick of the Week

MATLAB Puzzler: Removing columns and rows from binary matrices 37

Posted by Doug Hull,

I am trying something new this week: the MATLAB Puzzler.

Rather than just work one of the Practical Examples directly, I am doing it in two parts. The first I will pose the MATLAB puzzler and say how much code it took me to solve it. This is not a contest, so don’t try to solve this in less code (unless you think that would be fun). I give the benchmark so that if you find yourself using 50 lines of code you know there is a simpler way! Just try to solve the puzzle with clear code and sharpen your skills.

In the second video, I explain my thought process in the algorithm and then implement it. I posed this problem as a test to Jiro. With heckling, he was able to solve this puzzler in 14 minutes and the same amount of code.

Here is the code you need for the test matrix:

a=[
0 1 1 0 0
0 0 1 0 1
0 0 1 0 0
0 0 1 0 1
0 0 1 0 1]

The above is a good test matrix, but you can generate more random ones with this:

a = full(round(sprand(5,5,0.8)))


Video Content


Video Content

I was surprised that Jiro and I took very different paths to solving this problem. Post your solutions in the comments area along with how long it took you to solve.

[NOTE] After a week, a wrap-up was posted for this Puzzler. [click here]

37 CommentsOldest to Newest

Here’s my solution:

elimCols = [];
zeroCols = all(~a);
while any(zeroCols)
currCol = find(zeroCols,1,’first’)
elimCols = [elimCols currCol+numel(elimCols)];
a(:,currCol) = [];
a(currCol,:) = [];
zeroCols = all(~a);
end

It took about four minutes.

Dan,

I found a counter-example for the above solution:

a = [
0 1 0 1 0
0 1 0 0 0
1 1 0 0 0
0 1 0 0 1
0 0 0 1 1]

The above solution reports columns [3 2] to be removed, but it is actually [1 3] to be removed.

The keeping track of the indices is tricky.

Doug

my solution:
L = true(1,length(A(1,:))) ;
while (any(sum(A(L,L))==0))&&(sum(L)>1)
L(sum(A(L,:))==0) = false ;
end

display(‘Columns Removed:’)
find(L==false)
A(L,L)

one of the possible solutions (<one min) – if(f) i understand the problem correctly

r=~any(tril(a),1)&(diag(a).’==0)

us

@JN

I found a counter-example for the solution you submitted:

a =[
0 0 0 0 0
1 0 0 1 0
0 1 0 0 0
1 0 0 0 1
1 0 0 0 0];

The solution you submitted misses this (all rows are eliminated).

Doug

JN, This is clever. But it appears to fail when A should reduce to []. Consider:
A = [
0 0 1 0 0
0 0 1 0 0
0 0 0 1 0
0 0 0 0 0
0 1 0 0 0];

This should remove columns 1–5, leaving A=[]; instead, your solution returns:

A(L,L)
Columns Removed:
ans =
1 2 3 5
ans =
0

OK Doug,
I don’t feel bad, but here’s a minor mod to fix (I think the problem you mentioned):

elimCols = [];
zeroCols = all(~a);
while any(zeroCols)
currCol = find(zeroCols,1,’first’)
elimCols = [elimCols currCol+sum(elimCols<=currCol)];
a(:,currCol) = [];
a(currCol,:) = [];
zeroCols = all(~a);
end

Dan
BTW. In my nomenclature, the correct list for this is [3,1], not [1,3], since the order of elimination is relevant to the solution.

Dan,

a =[
1 1 0 0 0
1 0 0 0 0
0 0 0 0 0
1 0 1 0 0
1 0 0 0 0];

I can see that for the next puzzler, I am going to need to make a test harness to send out with the problem. I have already produced next week’s puzzler, but I can do it for the future ones.

Thanks for playing around.

Doug

Hi Doug,

I didn’t understand. From my simulation your example should generate:
0 0 0 0 0
1 0 0 1 0
0 1 0 0 0
1 0 0 0 1
1 0 0 0 0
———
0 0 0 0
1 0 1 0
1 0 0 1
1 0 0 0
——-
0 0 0
1 0 1
1 0 0
—–
0 0
1 0

0
-
[]

What am I missing? Thanks.

Brett, thanks. I did not use the last step: delete entry if 0, keep if 1.

Hey Doug,

thanks for another great idea!
The blogs, esp. yours are great.

Here is my solution:

clear all;
clc;

% a=[ 0 1 1 0 0;
% 0 0 1 0 1;
% 0 0 1 0 0;
% 0 0 1 0 1;
% 0 0 1 0 1]

%a = full(round(sprand(5,5,0.8)))

a=[ 0 1 0 1 0;
0 1 0 0 0;
1 1 0 0 0;
0 1 0 0 1;
0 0 0 1 1]

zerocols=find(any(a)==0);
a(zerocols,:)=NaN;
a(:,zerocols)=NaN;
zerocols=unique([zerocols find(any(a)==0)])
a(zerocols,:)=NaN;
a(:,zerocols)=NaN;
a(isnan(a))=[];
reshape(a,sqrt(numel(a)),sqrt(numel(a)))

Any comments would be appreciated.

All the best
Daniel

JN,

Yes, that is the right progression. Your code was very close, with the exception of the edge case where the entire matrix is eliminated!

Thanks for playing,
Doug

Daniel,

Since you asked for comments,

The code fails for this example:

a = [
0 0 0 0 1
0 0 0 0 0
1 0 0 0 1
1 0 1 0 0
0 0 0 0 0];

You did the right thing by running the elimination a second time, but it is possible that you need to run once for each column. A while loop is the right thing to do here.

Replacing the rows with NaN is a good idea so that you can keep the matrix the same size throughout, that means you do not have to keep track of tricky index math when the matrix is shrinking.

The way you eliminate the NaN at the end works, but it is more difficult to understand than other methods because the reshape operation is a little non-intuitive. Look at my elimination strategy for a different implementation.

Thanks for letting me critique the code and for playing along.

Doug

> Your code works for the test case,
> but not for the general case.
> Doug

doug
apparently i’m missing something (as i said) – as far as i can see it solves all the examples given so far; can you show me a matrix that fails?

recap:
your original mat yields (find(r)): 1 2 4
the second mat yields: 3 (as you say)
brett’s matrix yields: 1 3 4 5 (hence: [])

urs

Urs,

a = [
0 1 0 1 0
0 1 0 0 0
1 1 0 0 0
0 1 0 0 1
0 0 0 1 1]

fails. Your code produces:

r =
0 0 1 0 0

The correct answer is below:

Remove these columns: 1 3

a =
1 0 0
1 0 1
0 1 1

Note that by eliminating the third column and row, the first column then becomes all zeros.

I am not sure I understand how your code would work since this needs to be an iterative solution.

Thanks for playing!
Doug

Hi again,

of course you’re right with the while loop, shame on me.
I don’t know why i have missed that.

What do you think of the adapted code:
zerocols=find(any(a)==0);
while 1
numoldzerocols=numel(find(any(a)==0));
a(zerocols,:)=NaN;
a(:,zerocols)=NaN;
zerocols=unique([zerocols find(any(a)==0)]);
if numel(zerocols)==numoldzerocols
break;
end
end
zerocols
a(isnan(a))=[];
reshape(a,sqrt(numel(a)),sqrt(numel(a)))

What do you think is a more intuitive way of getting rid
of the NaNs? Is there a way to do so without the need of reshaping? Indeed, it seemed pretty straightforward to me.

I don’t want to look at your code yet ;-)
But I will soon…

Thank you for your time!
Daniel

Dan,

I could not find a counter-example for your code. Congratulations on finding a solution.

When you watch the solution video, you will find that I used a very similar strategy. Some changes that I think would make the code clearer:

zerocols=find(any(a)==0);

can be replaced with

zerocols = find( all(~a) )

It just seems more intuitive to parse mentally, at least to me.

Removing the correct rows and columns was done in my code like this:

a(newEmptyCols,:) = [];
a(:,newEmptyCols) = [];

Where newEmptyCols was a list of columns to be eliminated.
This avoids the reshape.

Thanks for playing!
Doug

well, i was obviously missing something… (the iteration bit wasn’t clear [to me] from your video-ex [and the second video does not play!?])…
anyhow, using this approach, the code looks like this

while true
r=~any(tril(a),1)&(diag(a).’==0);
b=a;
b(r,:)=[];
b(:,r)=[];
if isequal(a,b)
break;
end
a=b;
end
disp(b);

which takes care of the example you produced…
thanks for an interesting evening…
urs

Urs,

The use of TRIL seems to throw away the information in the upper triangular part of the matrix, and I am not sure why you would want to do that. This makes the code fail for certain matrices. Try using a few random inputs as mentioned at the top of the post:

a = full(round(sprand(5,5,0.8)))

Has anyone else had problems with the second video? I have tested it and it is working.

Thanks,
Doug

Hey Doug, nice thing you got going here. And Daniel seems like a popular name. I am going to give this a shot

keep = 1:5;
while(sum(all(~a))>0&&~isempty(a))

[R,jb]=rref(a);
a = a(jb,jb);
keep = keep(jb);
end
a
Colsremoved = setdiff(1:5,keep)

Took me about 30 minutes to think of this

Daniel,

I thought of the Puzzler after doing your contest last week at your MATLAB blog, BlinkDagger.
[ Click here ]

a =
0 0 0 0 0
0 0 0 0 1
1 0 1 0 1
0 0 1 1 1
0 0 0 0 0

Sorry, but your code fails on the above input!

Doug

Hi Doug, nice puzzle.

Here is my solution, based on eliminating as one goes along. Please let me know if you see any counter examples or problems.

k=1:length(a);
j=[];
while prod(sum(a))==0 && ~isempty(a)
i=find(sum(a)==0)
j=[j,k(i(1))]
k=k([1:i-1 i+1:end])
a=a([1:i-1 i+1:length(a)],[1:i-1 i+1:length(a)])
end
disp(['Remove these columns: ' int2str(sort(j))])
a

Hey, Doug.
Based on the reply, I’d say the puzzler challenge is a resounding success. People seem to be having a lot of fun with it, and coming up with some truly creative approaches. I like the puzzles!
Brett

Hello everyone,

So as Doug mentioned, my approach was slightly different from his (and I think some of you have used a similar approach as mine). Basically, I kept deleting the rows and columns, but also kept track of the indices along the way. So this was the original solution I had (after 14 minutes of Doug looking over my shoulder):

idx = 1;
szA = size(a, 2);
remCols = 1:szA;
while ~isempty(a) && any(~sum(a))
  id(idx) = find(sum(a)==0, 1, 'first');
  remCols(id(idx))  = [];
  a(id(idx),:) = [];
  a(:,id(idx)) = [];
end
disp(['Remove these columns: ', num2str(setdiff(1:szA, remCols))]);
a

“remCols” keeps track of the remaining columns. And at the end, I take the set difference between all columns and the remaining, and they are the columns that I deleted.

But then after spending about another 15 minutes, I realized that I actually didn’t need to keep “id”. And I also realized that I can do the row and column deletion in one line. So this is a more streamlined code that I ended up with:

szA = size(a, 2);
remCols = 1:szA;
while ~isempty(a) && any(~sum(a))
  id = find(sum(a)==0, 1, 'first');
  remCols(id)  = [];
  a = a(setdiff(1:size(a,2),id), setdiff(1:size(a,2),id));
end
disp(['Remove these columns: ', num2str(setdiff(1:szA, remCols))]);
a

I hope you all enjoyed doing this puzzler. Thanks, Doug, for doing this. We should definitely do more of these.

NOTE: Everyone, you can now use the <pre></pre> tag to display your code using fixed-width font in your comments. There’s also a “preview” button above that you can use to preview your comment. (Javascript is required for this feature)

Tom,

I just noticed that my solution is very much like yours. Nice! I really like your line for deleting the row and column at the same time (I have a similar code):

a=a([1:i-1 i+1:length(a)],[1:i-1 i+1:length(a)])

This really highlights the indexing capability of MATLAB, where you can specify vector indices to extract a submatrix of a matrix.

Steve talks about this in one of his blog posts:

http://blogs.mathworks.com/steve/2008/02/08/linear-indexing/

This reminds me of something that goes on inside x=A\b when A is sparse and unsymmetric. It has a preprocessing step that looks something like this:

k = 1 ;
while (a column in A exists with 1 nonzero) do
    let A(i,j) be the single nonzero in a column j where
        nnz(a(:,j)) == 1
    remove column j and row i from the matrix A
    row i becomes the kth row of U
    column j becomes the kth column of L
    k = k + 1
end

Then repeat, but looking for rows with one entry instead. You can see the effect of this with [L,U,P,Q] = lu(A) for, say the west0479 matrix:

load west0479
[L,U,P,Q]=lu(west0479) ;
spy(L+U)

If you look closely, you can see that the first 25 rows of U have lots of nonzeros in them, but the first 25 columns of L have just one nonzero in them. The first loop (above) iterated 25 times.

If A were a morally or psychologically triangular matrix, then this process would eliminate all of A, leaving L equal to the identity matrix, and U a permuted version of A.

Then the next 108 rows of U have just one entry each but the same columns of L have lots of nonzeros.

The the rest of the matrix is factorized using a sparse LU factorization.

So … how would you do this binary thing efficiently for a huge sparse binary matrix? Note that x=A(i,:) is very slow for a sparse A, but x=A(:,j) is fast (if you want x=A(i,:) for lots of rows i, then do B=A’ ; x=B(:,i) instead).

How’s that for an extension to your puzzler?

Hello,

Here is another solution based on MATLAB’s powerful indexing capabilities:

function res = puzzler(a)
sub = 1:size(a, 1); 
rem = zeros(1, 0);
zer = 0;
while ~isempty(zer)
    zer=find(sum(a(sub, sub), 1)==0);
    rem = [rem sub(zer)];
    sub(zer)=[];
end
disp(sprintf('Removed: %s', mat2str(rem)));
res = a(sub,sub);

(this works “in place” without disturbing the contents of a)

It is an interesting puzzle. Here is my solution. It took about 10 minutes.

while ~isequal(a,diag(double(any(a)))*a)
a=diag(double(any(a)))*a;
end
idx=any(a);
disp(['Removed columns: ',num2str(find(~idx))])
disp(a(idx,idx))

Another solution may be even simpler:

while ~isequal(any(A),any(A,2)')
A(~any(A),:)=0;
end
disp(['Removed columns: ',num2str(find(~any(A)))]);
disp(A(any(A),any(A)));

Yi,

The first algorithm seems to work.

The second fails here:

A = [
0 1 0 0 0
0 0 0 0 0
0 0 0 0 0
1 0 0 1 0
0 0 0 0 1]

What I notice with your code is that it works, but after five minutes, I still do not really understand why. It is very clever, more clever than me!

My goal in writing code is to make people understand it as easily as the computer does. More often than not, the person that will need to understand it is me in a few weeks once I have forgotten all about it. Here is my solution:

a = full(round(sprand(5,5,0.8)))
oldEmptyCols = [];
newEmptyCols = all(~a);
%
while ~isequal(oldEmptyCols, newEmptyCols)
a(newEmptyCols,:) = 0;
oldEmptyCols = newEmptyCols;
newEmptyCols = all(~a);
end
% display results
disp ('---------------')
disp(['Remove these columns: ' num2str(find(newEmptyCols))])
find(newEmptyCols)
%
a(newEmptyCols,:) = [];
a(:,newEmptyCols) = []

Notice that with a good variable naming scheme and use of functions that are pronounceable, this code is very readable, even without comments.

For example, I picked:
~isequal(oldEmptyCols, newEmptyCols)
instead of
~(oldEmptyCols == newEmptyCols)

Little things like this make the code easier to understand when you need to deal with it again later.

Doug

Doug,

Thanks for comments. I see the reason why the second does not work because it missed the cases where there are more rows than columns with all zeros. Here is the modified version.

while ~isequal(any(A) & any(A,2)', any(A'))
    A(~any(A),:)=0;
end
disp(['Removed columns: ',num2str(find(~any(A)))]);
disp(A(any(A),any(A)));

The first code is based on the fact that a diagonl mtrix left-multiply a matrix equivalent to multiply the matrix row by row with the corresponding elements in the diagonal vector. Hence, any(A) get a vector with 1 where the coulmn is not full zero, 0 otherwise. Using this vector to form a diagonl mtrix to left multiply A will set rows of A either all zero (multiply by 0 where corresponding column is all zero) or no change (multiply 1 where corresponding column is not full zero). This is exactly your puzzle required.

In the code, double is used to convert logical variable produced by any(A) to double so that the multiplication can be performed.

Hope this makes the code understandable.

single liner for you, just in case someone looks for this and wants something simpler (just switch the sum direction and col and rows, to delete the rows)

A(:,find(sum(A)~=0))

One fast fix to this to make it faster just use logical indexing and remove the find

remember a sum is simply an or operation on the columns

A(:,sum(A)~=0)

These postings are the author's and don't necessarily represent the opinions of MathWorks.