Doug’s MATLAB Video Tutorials
February 13th, 2009
Puzzler: optimize this
Often when I post code, there is a discussion of how to optimize it. I tend to subscribe to this maxim:
Rules of Optimization:
Rule 1: Don’t do it.
Rule 2 (for experts only): Don’t do it yet.
– M.A. Jackson
The reason for that is many times coders will complicate their code in an effort to optimize it. This optimization is often premature. I recommend waiting until you know it is too slow or too memory intensive before putting in that effort and adding that complexity. Often times, the bottlenecks are not where you would expect at the beginning and readability would be sacrificed for only a marginal gain.
That being said, code optimization can be a fun way to show how clever you are. Here is your chance with a good puzzler brought to us by John D’Errico.
Given an NxN matrix, M, rearrange the columns such that sum(diag(M)) is minimized. I do not know for certain, but I think an exhaustive search is the only way to find the best possible solution. Since the exhaustive search is very time consuming, one does need to trade off search time versus result.
Here is the challenge:
Rewrite findTrace.m so that the calculated score is minimized. Notice the score formula:
sum(trace) + 10*time
This should give plenty of room to explore time versus results trade-offs.
Here is the code to get you started:
n = 15;
%
for i = 1 : 100
mat = rand(n)*5;
%
tic
[matOut, trace] = findTrace(mat);
elapsed(i) = toc;
%
sumTrace(i) = sum(trace);
%
end
%
score = (mean(sumTrace) + mean(elapsed)*10)
function [matOut, trace] = findTrace(matIn)
%
matOut = matIn;
trace = diag(matIn);
I think it would be fun to change n to see which code works best for different size inputs.
As always with these puzzlers, be sure to use the proper tags in the comments.
<pre> <code>
[deltaX, deltaY] = puzzler(inMatrix)
%% All the code so someone can just copy and paste it from the comments.
</code> </pre>
18:57 UTC |
Posted in Level: Advanced, Topic: Puzzler |
Permalink |
You can follow any responses to this entry through the RSS 2.0 feed.
You can skip to the end and leave a response. Pinging is currently not allowed.
Leave a Reply
|
Quick and dirty:
>> I do not know for certain, but I think an exhaustive search is the only way to find the best possible solution
Actually, you are asking to solve an assignment problem http://en.wikipedia.org/wiki/Assignment_problem, for which there exist algorithms more efficient than exhaustive search, e.g. http://en.wikipedia.org/wiki/Hungarian_algorithm
Damn! Google Reader doesn’t show comments…
Solve this puzzler to find the same solution in coments ))
Petter, good work!
Petter’s solution is great, but when I run it in the profiler I noticed the ‘ind2sub’ function takes 90% of the time. Calling ‘findTrace’ 100 times takes 0.193s. So I changed ‘ind2sub’ with ‘find’ and now it takes only 0.045s.
Starting from the Petter’s greedy solution, greedily swap pairs of columns that achieve a lower score:
I find the smallest value in the matrix which has row number rowNr, and place its column in matOut(:,rowNr). I then change all values on that row in matIn to NaN (and repeat). This will automatically sort the columns in matOut giving the smallest trace.
… I was a bit “hasty” in my previous solution ;-) … unintentionaly creating my own matrix. Here is a modified version which is hopefully doing what I wanted it to do:
A problem with the test script is that it doesn’t verify the validity of solutions. To do this one must check that the permutation of columns contains unique indices. Rewriting Jensen’s solution to explicitly generate the permutation, we see that it uses some columns multiple times:
In fact, finding the optimal solution IS possible using linear programming:
This problem is analogous to the “job assignment” problem, for which a polynomial-time algorithm exists (Hungarian method). A number of implementations are on the file exchange, see for example
http://www.mathworks.com/matlabcentral/fileexchange/20652
By “is possible”, I meant tractable, i.e. solvable in polynomial time, rather than the factorial (which is greater than exponential!) time of an exhaustive search. I’m sure there are several methods that can be used to solve it - I use the Linear Interior Point Solver in linprog, but the Simplex method would work just as well (though I found it to be slower). Darren has suggested the Hungarian method, which (using the implementation suggested) is much faster, as it is tailored to this particular type of problem. Good work, Darren!
Interesting. (I’ve just had a chance to read through what has happened over the last couple of days.) My own solution seemed reasonably good, using a greedy algorithm to get a starting point, not too unlike Jensen. Then my code used a swapping scheme to search the space, not too inefficiently.
The greedy algorithm trick is to use sort. If we have
[junk,tags] = sort(A,2);
then if the first column of tags contains all of the numbers from 1:n, then the solution is given by that column. This is extremely fast to compute, and so is well worth checking. Even if not, the sort does much of the work that jensen was doing. You only need to look for the columns that were not assigned positions uniquely by the sort.
Regardless, munkres by Yi Cao is an order of magnitude better than my own search scheme. My thanks to those who pointed it out.
Wow, I set you all loose on this problem right before I went home for the long weekend and come back to find it well taken care of.
I like how the community of readers are able to quickly write, optimize and re-optimize a problem. Petter makes a solid greedy algorithm, then Danilo is able to optimize the bottleneck in it.
Francois is able to make an algorithmic advance by recognizing the problem as one with an existing solution and then Darren points out a file exchange submission that does this already.
This is what I love about the social computing aspect of this site.
Thanks for all the effort!
-Doug
Credit to Francois. None of us saw his post because it only appeared today, despite being posted 3 days earlier.
Oliver,
For some reason, Francois’ comment needed to be “moderated” because the automatic spam filter thought it might be spammy. Usually things are posted immediately, but sometimes things get stuck, like his post.
Doug
Doug: I thought that might’ve been the issue. I guess external links are to be avoided if one wants to post quickly.
Doug,
Just read this blog to know such a puzzler has been going on and to understand why John came to me to ask for the use of the munkres code a few weeks ago. Thanks to Darren who pointed out my code as a solution to the puzzler. Thank you and John to post this puzzler, which shows how useful of the Hungarian method is.
Regards,
Yi