I recently heard Greg Wilson speak about a topic dear to him, Beautiful Code. He and Andy Oran edited a book on this topic in which programmers explain their thought processes. The essay Writing Programs for "The Book", by Brian Hayes, really resonates for me, in part because of the actual problem he discusses, in part because of the eureka moment he describes. Brian wrote his code in Lisp. I will write similar code in MATLAB to illustrate the points.
Contents
Problem Description
Given 3 points in a plane, find out if they lie on a single line, i.e., find out if the points are collinear. Here are the first set of points we'll use for trying out algorithms.
p1 = [1 1]; p2 = [3.5 3.5]; p3 = [-7.2 -7.2];
Method 1 - See if Third Point Obeys Same Slope
In this variant, we compute the straight lines connected 2 pairs of points, and then check that the third point fits the same line.
collinear1(p1,p2,p3)
ans =
1
dbtype collinear11 function tf = collinear1(p1,p2,p3) 2 m = slope(p1,p2); 3 b = intercept(p1,p2); 4 tf = (m*p3(1)+b) == p3(2); 5 end 6 function m = slope(p1,p2) 7 m = (p2(2)-p1(2))/(p2(1)-p1(1)); 8 end 9 function b = intercept(p1,p2) 10 m = slope(p1,p2); 11 b = -p1(2)+m*p1(1); 12 end
Method 2 - Method 1 Modified
There's a flaw in method one in the case where the points are collinear and lie on a vertical line, i.e., where there are repeated x values. Then the denominator we calculate will have the differences of two x values that are the same, resulting in division by 0 and a non-finite slope. First, let's see what happens with the first algorithm and vertically aligned points.
q1 = [0 1]; q2 = [0 3.5]; q3 = [0 -7.2]; collinear1(q1,q2,q3)
ans =
0
The first algorithm says these points are not collinear. Time to patch the algorithm.
collinear2(q1,q2,q3)
ans =
1
dbtype collinear21 function tf = collinear2(p1,p2,p3) 2 m = slope(p1,p2); 3 b = intercept(p1,p2); 4 % put in check for m being finite in case 5 % points are all on y axis 6 if ~isfinite(m) 7 if (p3(1)==p1(1)) 8 tf = true; 9 else 10 tf = false; 11 end 12 else 13 tf = (m*p3(1)+b) == p3(2); 14 end 15 end 16 function m = slope(p1,p2) 17 m = (p2(2)-p1(2))/(p2(1)-p1(1)); 18 end 19 function b = intercept(p1,p2) 20 m = slope(p1,p2); 21 b = -p1(2)+m*p1(1); 22 end
Methods 3 and 4 - Compare Two Slopes Only
The next thing Brian noticed was he really didn't need to see if the 3rd point fit the same line. What he really needed to determine is if two lines had the same slope. If so, all three lines have the same slope (convince yourself that this is true, at least for Euclidean geometry).
Let's try this algorithm on our two sets of collinear points.
collinear3(p1,p2,p3) collinear3(q1,q2,q3)
ans =
1
ans =
0
dbtype collinear31 function tf = collinear3(p1,p2,p3) 2 m1 = slope(p1,p2); 3 m2 = slope(p1,p3); 4 tf = isequal(m1,m2); 5 end 6 function m = slope(p1,p2) 7 q = p2-p1; 8 m = q(2)/q(1); 9 end
Once again we need to modify the code for infinite slopes.
collinear4(q1,q2,q3)
ans =
1
dbtype collinear41 function tf = collinear3(p1,p2,p3) 2 m1 = slope(p1,p2); 3 m2 = slope(p1,p3); 4 if isfinite(m1) 5 tf = isequal(m1,m2); 6 else 7 if ~isfinite(m2) 8 tf = true; 9 else 10 tf = false; 11 end 12 end 13 end 14 function m = slope(p1,p2) 15 q = p2-p1; 16 m = q(2)/q(1); 17 end
Method 5 and 6 - Summing Lengths of Sides of Triangle
The next idea that Brian explores is exploiting the relationship between the sides of the triangle that 3 points define. The longest leg is shorter than the sum of the lengths of the other 2 sides, except when the points are collinear. Then the length of the longest side is equal to the sum of the 2 other lengths. Let's try it out.
collinear5(p1,p2,p3) collinear5(q1,q2,q3)
ans =
0
ans =
1
Let's use the point values that Brian uses.
bh1 = [0 1]; bh2 = [14 5]; bh3 = [8 4]; collinear5(bh1,bh2,bh3);
dbtype collinear51 function tf = collinear5(p1,p2,p3) 2 side(3) = norm(p2-p1); 3 side(2) = norm(p3-p1); 4 side(1) = norm(p3-p2); 5 lengths = sort(side,'descend'); 6 tf = isequal(lengths(1), ... 7 (lengths(2)+lengths(3))); 8 end
So far, so good. Points that are not collinear don't claim to be. But the first set was and the answer came out negative. Let's try another batch of points to be sure.
bh4 = [0 0]; bh5 = [3 3]; bh6 = [5 5]; collinear5(bh1,bh2,bh3) collinear5(bh1*1e5,bh2*1e5,bh3*1e5)
ans =
0
ans =
0
Uh-oh! What's happened? Looking back at collinear5, we see we are no longer doing just addition, subtraction, multiplication, and division. Now we're computing the norm which involves computing a sqrt. As a result, we have more opportunity for numerical roundoff issues, even if the points are confined to be rational numbers (including integers).
collinear6(bh1,bh2,bh3) collinear6(bh1*1e5,bh2*1e5,bh3*1e5)
ans =
0
ans =
0
Version collinear6 no longer does an exact equality comparison but looks for differences on the order of eps for the appropriate lengths.
dbtype collinear61 function tf = collinear6(p1,p2,p3) 2 side(3) = norm(p2-p1); 3 side(2) = norm(p3-p1); 4 side(1) = norm(p3-p2); 5 lengths = sort(side,'descend'); 6 tf = ... 7 abs(lengths(1)-(lengths(2)+lengths(3))) ... 8 <= eps(lengths(1)); 9 end
Method 7 - Eureka, Compute Triangle Area!
Finally, as Brian describes it, he realized there was a completely different way to see if 3 points were collinear, and that was to compute the area of the triangle they define.
collinear7(bh1,bh2,bh3) collinear7(bh1*1e5,bh2*1e5,bh3*1e5)
ans =
0
ans =
0
To compute the area, you could compute base * height / 2, but again this would involve square roots. The trig approach would involve transcendental functions. Another way to think of this is to view any 2 pairs of the points as defining a parallelogram, and the triangle area is half that area. To compute the area, treat the sides as vectors which you translate to the origin. Then the area computation is straight-forward, boiling down to something proportional to the determinant If the result is 0, the points are collinear.
dbtype collinear71 function tf = collinear7(p1,p2,p3) 2 mat = [p1(1)-p3(1) p1(2)-p3(2); ... 3 p2(1)-p3(1) p2(2)-p3(2)]; 4 tf = det(mat) == 0;
This algorithm requires an equality test following 6 arithmetic operations. Pretty simple, very elegant, much less prone to numerical issues than other algorithms. Much nicer than all of the other code above, no accounting for singular behavior necessary.
Note: Added 9 June 2008
If you read the comments for this blog, you will find an even more satisfactory solution relying on svd. svd (and the related rank function) have better numerical attributes than det.
What's Your Recent Aha Moment?
Have you had a similar epiphany recently? We'd love to hear about it here.
Get
the MATLAB code
Published with MATLAB® 7.6



Here’s a one-liner that works for any number of points in any number of dimensions.
collinear8 = @(varargin) rank(cat(1,varargin{:}) - circshift(cat(1,varargin{:}),1)) == 1;
It creates a matrix where each row is a point, cyclically permutes the rows by 1, and then checks whether the rank of the resulting matrix is 1.
collinear8([1,1,1],[3.5,3.5,3.5],[-7.2,-7.2,-7.2])
> 1
collinear8([1,1,1],[3.5,3.5,3.5],[-7.2,-7.2,-7.20000001])
> 0
Slight bug in the explanation: it checks whether the difference between the original matrix and the permutation has rank 1.
Jason-
Nice algorithm! But not as robust as collinear7 from what I see:
>> bh4 = [0 0]; bh5 = [3 3]; bh6 = [5 5]; collinear8(bh1,bh2,bh3) collinear8(bh1*1e5,bh2*1e5,bh3*1e5) ans = 0 ans = 0–Loren
I’d call this less a case of beautiful code than beautiful mathematics. Unfortunately, it’s all too easy now to get credentials in computer science or computer engineering without taking math classes about vectors or matrices.
Loren-
I believe you must be making a mistake somewhere. You’ve assigned bh4,bh5,bh6, and used bh1,bh2,bh3. The latter actually weren’t collinear in your article, and I’m guessing they actually aren’t now.
collinear8([0,0],[3,3],[5,5])
> 1
collinear8([0,0]*1e5,[3,3]*1e5,[5,5]*1e5)
> 1
collinear8([0,1],[14,5],[18,4])
> 0
All these are correct. This algorithm really ought to be stable.
Ought to be robust, that is.
Jason-
My mistake - you are right!
–Loren
Toby-
I think the code is beautiful, perhaps because of the math, but still - no conditional branches, no edge cases, etc.
My $0.02,
–Loren
Maybe you should be pointing your users towards well-understood and thoroughly tested geometric predicates, rather than encouraging them to roll their own, which will likely be inaccurate in the very cases when accuracy matters most. The code that comes to mind is Jonathan Shewchuk’s:
http://www.cs.cmu.edu/~quake/robust.html
though presumably there are many others.
>> rank ([p1 ; p2 ; p3]) ans = 1If rank is 2, then they are not colinear. Don’t use “det”, it’s not reliable. The SVD is the best (which is what “rank” uses).
oops. I just tested if p1, p2, and p3 were colinear with the origin as well (which these 3 points happen to be).
is what you need. Much more numerically reliable than your suggested
for testing for singular matrices, which is what you’re doing (See “help det”).
Mark and Tim-
Thanks for the comments. Yes, I agree people should look at good references and rank is preferable to det.
Thinking about the problem in different ways (fitting a line vs. checking area), and especially to help get algorithmic robustness was my main take-away.
–Loren
Following up on idea of using area of triangle even simpler solution (in Matlab at least) would be using cross product:
~any(cross([p2-p1, 0], [p3-p1, 0]))
Anohter way could be using a dot product
d1 = p2 - p1;
d2 = p3 - p1;
dot(d1,d2)^2/dot(d1,d1)/dot(d2,d2) == 1
It should be emphasized again that the first 5 methods are highly sensitive to numerical errors. It depends on where the numbers are coming from, but if there is any chance of real data being used, it would be poor practice to even consider == or isequal for floating point numbers. It is generally much better to be thinking in terms of tolerances like eps or to use svd. I would therefore consider the point below, for Method 6, to be the FIRST and most important consideration for writing code that works (rather than merely being “beautiful” and not useful):
Uh-oh! What’s happened? Looking back at collinear5, we see we are no longer doing just addition, subtraction, multiplication, and division. Now we’re computing the norm which involves computing a sqrt. As a result, we have more opportunity for numerical roundoff issues, even if the points are confined to be rational numbers (including integers).
It would appear to me that Ilya Rozenfeld’s solution would be the cleanest. Just to help those who might have no background in linear algebra,
using
cos(theta) = [dot(v1,v2) / (||v1||*||v2||) ], and getting one on the right side would mean theta = 0, and the points are collinear. However, since ||v1|| = sqrt(dot(v1,v1)), and sqrt(.) can lead to roundoff error, Ilya Rozenfeld has squared both sides of the equation above, giving
cos^2(theta) = dot(v1,v2)^2 / (sqrt(dot(v1,v1))^2 * sqrt(dot(v2,v2))^2 )
cos^2(theta) = dot(v1,v2)^2 / ( dot(v1,v1)*dot(v2,v2) ).
As long as they are collinear and theta = 0, cos^2(0)=1^2=1.
So, if the right side equals 1, then we have collinearity, which is why Ilya Rozenfeld has written such a solution.
The only thing I fear is numerical precision on the dot product, but this seems to be practical enough for me:
function colOrNot = collinear(A,B,C)
% … code to ensure all column or row vectors
v1 = A-B; v2 = B-C;
colOrNot = ( dot(v1,v2)^2/( dot(v1,v1) * dot(v2,v2) ) ) == 1;
endfunction
> collinear( [0 0], [-3 -3], [6 6] )
ans = 1
> collinear( [1e-8 1e-8]’, [-6+10e-14 -6+10e-14]’, [1e6+5e-21 1e6+5e-21]’ )
ans = 1
—– But I am also subject to:
> eps
ans = 2.22044604925031e-16
> collinear( [0 3], [0 8], [0 -1e21+2e-15] )
ans = 1
> collinear( [0 3], [0 8], [0 -1e22+2e-15] )
ans = 0
JB-
It looks to me like Ilya’s solution and therefore yours are equivalent to the determinant. As Tim and others have pointed out, svd/rank solutions are more robust.
–Loren
Indeed,
> collinear( [0 3],[0 8],[0 -1e21+2e-15] )
ans = 1
> collinear( [0 3],[0 8],[0 -1e22+2e-15] )
ans = 0
> rank( [ [0 3]’ [0 8]’ [0 -1e22+2e-15]’ ] )
ans = 1
Of course, the point of this article is to convey the point “there is more than one way to skin a cat,” which Loren and others have sufficiently exemplified. :)
Ah, and I am at fault for simply testing collinearity with the origin in the example above.
My head’s still spinning trying to figure out how the solution got from ultra-fragile floating point solution to a ultra-elegant integer solution…
My question is: how would this be generalized to N points? In particular I’m interested in detecting collinearity of 4 points, I fail to see how the matrix in Method 7 can be generalized.
Michel-
I am not sure you can extend it. What you can do is run it twice, first with 3 points, and then, if they are collinear, run it again, substituting the 4th point for any of the other points.
However, I think you will see at least one solution (svd/rank) in the comments that extends fully to multiple points.
–Loren