Loren on the Art of MATLAB

Collinearity 19

Posted by Loren Shure,

I recently heard Greg Wilson speak about a topic dear to him, Beautiful Code. He and Andy Oran edited a book on this topic in which programmers explain their thought processes. The essay Writing Programs for "The Book", by Brian Hayes, really resonates for me, in part because of the actual problem he discusses, in part because of the eureka moment he describes. Brian wrote his code in Lisp. I will write similar code in MATLAB to illustrate the points.

Contents

Problem Description

Given 3 points in a plane, find out if they lie on a single line, i.e., find out if the points are collinear. Here are the first set of points we'll use for trying out algorithms.

p1 = [1 1];
p2 = [3.5 3.5];
p3 = [-7.2 -7.2];

Method 1 - See if Third Point Obeys Same Slope

In this variant, we compute the straight lines connected 2 pairs of points, and then check that the third point fits the same line.

collinear1(p1,p2,p3)
ans =
     1
dbtype collinear1
1     function tf = collinear1(p1,p2,p3)
2     m = slope(p1,p2);
3     b = intercept(p1,p2);
4     tf = (m*p3(1)+b) == p3(2);
5     end
6     function m = slope(p1,p2)
7     m = (p2(2)-p1(2))/(p2(1)-p1(1));
8     end
9     function b = intercept(p1,p2)
10    m = slope(p1,p2);
11    b = -p1(2)+m*p1(1);
12    end

Method 2 - Method 1 Modified

There's a flaw in method one in the case where the points are collinear and lie on a vertical line, i.e., where there are repeated x values. Then the denominator we calculate will have the differences of two x values that are the same, resulting in division by 0 and a non-finite slope. First, let's see what happens with the first algorithm and vertically aligned points.

q1 = [0 1];
q2 = [0 3.5];
q3 = [0 -7.2];
collinear1(q1,q2,q3)
ans =
     0

The first algorithm says these points are not collinear. Time to patch the algorithm.

collinear2(q1,q2,q3)
ans =
     1
dbtype collinear2
1     function tf = collinear2(p1,p2,p3)
2     m = slope(p1,p2);
3     b = intercept(p1,p2);
4     % put in check for m being finite in case
5     % points are all on y axis
6     if ~isfinite(m)
7         if (p3(1)==p1(1))
8             tf = true;
9         else
10            tf = false;
11        end
12    else
13        tf = (m*p3(1)+b) == p3(2);
14    end
15    end
16    function m = slope(p1,p2)
17    m = (p2(2)-p1(2))/(p2(1)-p1(1));
18    end
19    function b = intercept(p1,p2)
20    m = slope(p1,p2);
21    b = -p1(2)+m*p1(1);
22    end

Methods 3 and 4 - Compare Two Slopes Only

The next thing Brian noticed was he really didn't need to see if the 3rd point fit the same line. What he really needed to determine is if two lines had the same slope. If so, all three lines have the same slope (convince yourself that this is true, at least for Euclidean geometry).

Let's try this algorithm on our two sets of collinear points.

collinear3(p1,p2,p3)
collinear3(q1,q2,q3)
ans =
     1
ans =
     0
dbtype collinear3
1     function tf = collinear3(p1,p2,p3)
2     m1 = slope(p1,p2);
3     m2 = slope(p1,p3);
4     tf = isequal(m1,m2);
5     end
6     function m = slope(p1,p2)
7     q = p2-p1;
8     m = q(2)/q(1);
9     end

Once again we need to modify the code for infinite slopes.

collinear4(q1,q2,q3)
ans =
     1
dbtype collinear4
1     function tf = collinear3(p1,p2,p3)
2     m1 = slope(p1,p2);
3     m2 = slope(p1,p3);
4     if isfinite(m1)
5         tf = isequal(m1,m2);
6     else
7        if ~isfinite(m2)
8            tf = true;
9        else
10           tf = false;
11       end
12    end
13    end
14    function m = slope(p1,p2)
15    q = p2-p1;
16    m = q(2)/q(1);
17    end

Method 5 and 6 - Summing Lengths of Sides of Triangle

The next idea that Brian explores is exploiting the relationship between the sides of the triangle that 3 points define. The longest leg is shorter than the sum of the lengths of the other 2 sides, except when the points are collinear. Then the length of the longest side is equal to the sum of the 2 other lengths. Let's try it out.

collinear5(p1,p2,p3)
collinear5(q1,q2,q3)
ans =
     0
ans =
     1

Let's use the point values that Brian uses.

bh1 = [0 1];
bh2 = [14 5];
bh3 = [8 4];
collinear5(bh1,bh2,bh3);
dbtype collinear5
1     function tf = collinear5(p1,p2,p3)
2     side(3) = norm(p2-p1);
3     side(2) = norm(p3-p1);
4     side(1) = norm(p3-p2);
5     lengths = sort(side,'descend');
6     tf = isequal(lengths(1), ...
7         (lengths(2)+lengths(3)));
8     end

So far, so good. Points that are not collinear don't claim to be. But the first set was and the answer came out negative. Let's try another batch of points to be sure.

bh4 = [0 0];
bh5 = [3 3];
bh6 = [5 5];
collinear5(bh1,bh2,bh3)
collinear5(bh1*1e5,bh2*1e5,bh3*1e5)
ans =
     0
ans =
     0

Uh-oh! What's happened? Looking back at collinear5, we see we are no longer doing just addition, subtraction, multiplication, and division. Now we're computing the norm which involves computing a sqrt. As a result, we have more opportunity for numerical roundoff issues, even if the points are confined to be rational numbers (including integers).

collinear6(bh1,bh2,bh3)
collinear6(bh1*1e5,bh2*1e5,bh3*1e5)
ans =
     0
ans =
     0

Version collinear6 no longer does an exact equality comparison but looks for differences on the order of eps for the appropriate lengths.

dbtype collinear6
1     function tf = collinear6(p1,p2,p3)
2     side(3) = norm(p2-p1);
3     side(2) = norm(p3-p1);
4     side(1) = norm(p3-p2);
5     lengths = sort(side,'descend');
6     tf = ...
7         abs(lengths(1)-(lengths(2)+lengths(3))) ...
8                              <= eps(lengths(1));
9     end

Method 7 - Eureka, Compute Triangle Area!

Finally, as Brian describes it, he realized there was a completely different way to see if 3 points were collinear, and that was to compute the area of the triangle they define.

collinear7(bh1,bh2,bh3)
collinear7(bh1*1e5,bh2*1e5,bh3*1e5)
ans =
     0
ans =
     0

To compute the area, you could compute base * height / 2, but again this would involve square roots. The trig approach would involve transcendental functions. Another way to think of this is to view any 2 pairs of the points as defining a parallelogram, and the triangle area is half that area. To compute the area, treat the sides as vectors which you translate to the origin. Then the area computation is straight-forward, boiling down to something proportional to the determinant If the result is 0, the points are collinear.

dbtype collinear7
1     function tf = collinear7(p1,p2,p3)
2     mat = [p1(1)-p3(1) p1(2)-p3(2); ...
3            p2(1)-p3(1) p2(2)-p3(2)];
4     tf = det(mat) == 0;

This algorithm requires an equality test following 6 arithmetic operations. Pretty simple, very elegant, much less prone to numerical issues than other algorithms. Much nicer than all of the other code above, no accounting for singular behavior necessary.

Note: Added 9 June 2008

If you read the comments for this blog, you will find an even more satisfactory solution relying on svd. svd (and the related rank function) have better numerical attributes than det.

What's Your Recent Aha Moment?

Have you had a similar epiphany recently? We'd love to hear about it here.


Get the MATLAB code

Published with MATLAB® 7.6

19 CommentsOldest to Newest

Here’s a one-liner that works for any number of points in any number of dimensions.

collinear8 = @(varargin) rank(cat(1,varargin{:}) – circshift(cat(1,varargin{:}),1)) == 1;

It creates a matrix where each row is a point, cyclically permutes the rows by 1, and then checks whether the rank of the resulting matrix is 1.

collinear8([1,1,1],[3.5,3.5,3.5],[-7.2,-7.2,-7.2])
> 1
collinear8([1,1,1],[3.5,3.5,3.5],[-7.2,-7.2,-7.20000001])
> 0

Slight bug in the explanation: it checks whether the difference between the original matrix and the permutation has rank 1.

Jason-

Nice algorithm! But not as robust as collinear7 from what I see:

>> bh4 = [0 0];
bh5 = [3 3];
bh6 = [5 5];
collinear8(bh1,bh2,bh3)
collinear8(bh1*1e5,bh2*1e5,bh3*1e5)
ans =
     0
ans =
     0

–Loren

I’d call this less a case of beautiful code than beautiful mathematics. Unfortunately, it’s all too easy now to get credentials in computer science or computer engineering without taking math classes about vectors or matrices.

Loren-

I believe you must be making a mistake somewhere. You’ve assigned bh4,bh5,bh6, and used bh1,bh2,bh3. The latter actually weren’t collinear in your article, and I’m guessing they actually aren’t now.

collinear8([0,0],[3,3],[5,5])
> 1
collinear8([0,0]*1e5,[3,3]*1e5,[5,5]*1e5)
> 1
collinear8([0,1],[14,5],[18,4])
> 0

All these are correct. This algorithm really ought to be stable.

Jason-

My mistake – you are right!

–Loren

Toby-

I think the code is beautiful, perhaps because of the math, but still – no conditional branches, no edge cases, etc.

My $0.02,
–Loren

Maybe you should be pointing your users towards well-understood and thoroughly tested geometric predicates, rather than encouraging them to roll their own, which will likely be inaccurate in the very cases when accuracy matters most. The code that comes to mind is Jonathan Shewchuk’s:

http://www.cs.cmu.edu/~quake/robust.html

though presumably there are many others.

>> rank ([p1 ; p2 ; p3])

ans =

     1

If rank is 2, then they are not colinear. Don’t use “det”, it’s not reliable. The SVD is the best (which is what “rank” uses).

oops. I just tested if p1, p2, and p3 were colinear with the origin as well (which these 3 points happen to be).

rank ([p2-p1 ; p3-p1]) < 2

is what you need. Much more numerically reliable than your suggested

det ([p2-p1 ; p3-p1]) == 0

for testing for singular matrices, which is what you’re doing (See “help det”).

Mark and Tim-

Thanks for the comments. Yes, I agree people should look at good references and rank is preferable to det.

Thinking about the problem in different ways (fitting a line vs. checking area), and especially to help get algorithmic robustness was my main take-away.

–Loren

Following up on idea of using area of triangle even simpler solution (in Matlab at least) would be using cross product:

~any(cross([p2-p1, 0], [p3-p1, 0]))

Anohter way could be using a dot product

d1 = p2 – p1;
d2 = p3 – p1;
dot(d1,d2)^2/dot(d1,d1)/dot(d2,d2) == 1

It should be emphasized again that the first 5 methods are highly sensitive to numerical errors. It depends on where the numbers are coming from, but if there is any chance of real data being used, it would be poor practice to even consider == or isequal for floating point numbers. It is generally much better to be thinking in terms of tolerances like eps or to use svd. I would therefore consider the point below, for Method 6, to be the FIRST and most important consideration for writing code that works (rather than merely being “beautiful” and not useful):

Uh-oh! What’s happened? Looking back at collinear5, we see we are no longer doing just addition, subtraction, multiplication, and division. Now we’re computing the norm which involves computing a sqrt. As a result, we have more opportunity for numerical roundoff issues, even if the points are confined to be rational numbers (including integers).

It would appear to me that Ilya Rozenfeld’s solution would be the cleanest. Just to help those who might have no background in linear algebra,

using
cos(theta) = [dot(v1,v2) / (||v1||*||v2||) ], and getting one on the right side would mean theta = 0, and the points are collinear. However, since ||v1|| = sqrt(dot(v1,v1)), and sqrt(.) can lead to roundoff error, Ilya Rozenfeld has squared both sides of the equation above, giving
cos^2(theta) = dot(v1,v2)^2 / (sqrt(dot(v1,v1))^2 * sqrt(dot(v2,v2))^2 )
cos^2(theta) = dot(v1,v2)^2 / ( dot(v1,v1)*dot(v2,v2) ).
As long as they are collinear and theta = 0, cos^2(0)=1^2=1.
So, if the right side equals 1, then we have collinearity, which is why Ilya Rozenfeld has written such a solution.

The only thing I fear is numerical precision on the dot product, but this seems to be practical enough for me:
function colOrNot = collinear(A,B,C)
% … code to ensure all column or row vectors
v1 = A-B; v2 = B-C;
colOrNot = ( dot(v1,v2)^2/( dot(v1,v1) * dot(v2,v2) ) ) == 1;
endfunction

> collinear( [0 0], [-3 -3], [6 6] )
ans = 1
> collinear( [1e-8 1e-8]‘, [-6+10e-14 -6+10e-14]‘, [1e6+5e-21 1e6+5e-21]‘ )
ans = 1
—– But I am also subject to:
> eps
ans = 2.22044604925031e-16
> collinear( [0 3], [0 8], [0 -1e21+2e-15] )
ans = 1
> collinear( [0 3], [0 8], [0 -1e22+2e-15] )
ans = 0

JB-

It looks to me like Ilya’s solution and therefore yours are equivalent to the determinant. As Tim and others have pointed out, svd/rank solutions are more robust.

–Loren

Indeed,
> collinear( [0 3],[0 8],[0 -1e21+2e-15] )
ans = 1
> collinear( [0 3],[0 8],[0 -1e22+2e-15] )
ans = 0
> rank( [ [0 3]‘ [0 8]‘ [0 -1e22+2e-15]‘ ] )
ans = 1

Of course, the point of this article is to convey the point “there is more than one way to skin a cat,” which Loren and others have sufficiently exemplified. :)

My head’s still spinning trying to figure out how the solution got from ultra-fragile floating point solution to a ultra-elegant integer solution…

My question is: how would this be generalized to N points? In particular I’m interested in detecting collinearity of 4 points, I fail to see how the matrix in Method 7 can be generalized.

Michel-

I am not sure you can extend it. What you can do is run it twice, first with 3 points, and then, if they are collinear, run it again, substituting the 4th point for any of the other points.

However, I think you will see at least one solution (svd/rank) in the comments that extends fully to multiple points.

–Loren

These postings are the author's and don't necessarily represent the opinions of MathWorks.