Collinearity

Posted by Loren Shure, June 6, 2008

9 views (last 30 days) | 0 Likes | 19 comments

I recently heard Greg Wilson speak about a topic dear to him, Beautiful Code. He and Andy Oran edited a book on this topic in which programmers explain their thought processes. The essay Writing Programs for "The Book", by Brian Hayes, really resonates for me, in part because of the actual problem he discusses, in part because of the eureka moment he describes. Brian wrote his code in Lisp. I will write similar code in MATLAB to illustrate the points.

Problem Description
Method 1 - See if Third Point Obeys Same Slope
Method 2 - Method 1 Modified
Methods 3 and 4 - Compare Two Slopes Only
Method 5 and 6 - Summing Lengths of Sides of Triangle
Method 7 - Eureka, Compute Triangle Area!
Note: Added 9 June 2008
What's Your Recent Aha Moment?

Problem Description

Given 3 points in a plane, find out if they lie on a single line, i.e., find out if the points are collinear. Here are the first set of points we'll use for trying out algorithms.

p1 = [1 1];
p2 = [3.5 3.5];
p3 = [-7.2 -7.2];

Method 1 - See if Third Point Obeys Same Slope

In this variant, we compute the straight lines connected 2 pairs of points, and then check that the third point fits the same line.

collinear1(p1,p2,p3)

ans =
     1

dbtype collinear1

1     function tf = collinear1(p1,p2,p3)
2     m = slope(p1,p2);
3     b = intercept(p1,p2);
4     tf = (m*p3(1)+b) == p3(2);
5     end
6     function m = slope(p1,p2)
7     m = (p2(2)-p1(2))/(p2(1)-p1(1));
8     end
9     function b = intercept(p1,p2)
10    m = slope(p1,p2);
11    b = -p1(2)+m*p1(1);
12    end

Method 2 - Method 1 Modified

There's a flaw in method one in the case where the points are collinear and lie on a vertical line, i.e., where there are repeated x values. Then the denominator we calculate will have the differences of two x values that are the same, resulting in division by 0 and a non-finite slope. First, let's see what happens with the first algorithm and vertically aligned points.

q1 = [0 1];
q2 = [0 3.5];
q3 = [0 -7.2];
collinear1(q1,q2,q3)

ans =
     0

The first algorithm says these points are not collinear. Time to patch the algorithm.

collinear2(q1,q2,q3)

ans =
     1

dbtype collinear2

1     function tf = collinear2(p1,p2,p3)
2     m = slope(p1,p2);
3     b = intercept(p1,p2);
4     % put in check for m being finite in case
5     % points are all on y axis
6     if ~isfinite(m)
7         if (p3(1)==p1(1))
8             tf = true;
9         else
10            tf = false;
11        end
12    else
13        tf = (m*p3(1)+b) == p3(2);
14    end
15    end
16    function m = slope(p1,p2)
17    m = (p2(2)-p1(2))/(p2(1)-p1(1));
18    end
19    function b = intercept(p1,p2)
20    m = slope(p1,p2);
21    b = -p1(2)+m*p1(1);
22    end

Methods 3 and 4 - Compare Two Slopes Only

The next thing Brian noticed was he really didn't need to see if the 3rd point fit the same line. What he really needed to determine is if two lines had the same slope. If so, all three lines have the same slope (convince yourself that this is true, at least for Euclidean geometry).

Let's try this algorithm on our two sets of collinear points.

collinear3(p1,p2,p3)
collinear3(q1,q2,q3)

ans =
     1
ans =
     0

dbtype collinear3

1     function tf = collinear3(p1,p2,p3)
2     m1 = slope(p1,p2);
3     m2 = slope(p1,p3);
4     tf = isequal(m1,m2);
5     end
6     function m = slope(p1,p2)
7     q = p2-p1;
8     m = q(2)/q(1);
9     end

Once again we need to modify the code for infinite slopes.

collinear4(q1,q2,q3)

ans =
     1

dbtype collinear4

1     function tf = collinear3(p1,p2,p3)
2     m1 = slope(p1,p2);
3     m2 = slope(p1,p3);
4     if isfinite(m1)
5         tf = isequal(m1,m2);
6     else
7        if ~isfinite(m2)
8            tf = true;
9        else
10           tf = false;
11       end
12    end
13    end
14    function m = slope(p1,p2)
15    q = p2-p1;
16    m = q(2)/q(1);
17    end

Method 5 and 6 - Summing Lengths of Sides of Triangle

The next idea that Brian explores is exploiting the relationship between the sides of the triangle that 3 points define. The longest leg is shorter than the sum of the lengths of the other 2 sides, except when the points are collinear. Then the length of the longest side is equal to the sum of the 2 other lengths. Let's try it out.

collinear5(p1,p2,p3)
collinear5(q1,q2,q3)

ans =
     0
ans =
     1

Let's use the point values that Brian uses.

bh1 = [0 1];
bh2 = [14 5];
bh3 = [8 4];
collinear5(bh1,bh2,bh3);

dbtype collinear5

1     function tf = collinear5(p1,p2,p3)
2     side(3) = norm(p2-p1);
3     side(2) = norm(p3-p1);
4     side(1) = norm(p3-p2);
5     lengths = sort(side,'descend');
6     tf = isequal(lengths(1), ...
7         (lengths(2)+lengths(3)));
8     end

So far, so good. Points that are not collinear don't claim to be. But the first set was and the answer came out negative. Let's try another batch of points to be sure.

bh4 = [0 0];
bh5 = [3 3];
bh6 = [5 5];
collinear5(bh1,bh2,bh3)
collinear5(bh1*1e5,bh2*1e5,bh3*1e5)

ans =
     0
ans =
     0

Uh-oh! What's happened? Looking back at collinear5, we see we are no longer doing just addition, subtraction, multiplication, and division. Now we're computing the norm which involves computing a sqrt. As a result, we have more opportunity for numerical roundoff issues, even if the points are confined to be rational numbers (including integers).

collinear6(bh1,bh2,bh3)
collinear6(bh1*1e5,bh2*1e5,bh3*1e5)

ans =
     0
ans =
     0

Version collinear6 no longer does an exact equality comparison but looks for differences on the order of eps for the appropriate lengths.

dbtype collinear6

1     function tf = collinear6(p1,p2,p3)
2     side(3) = norm(p2-p1);
3     side(2) = norm(p3-p1);
4     side(1) = norm(p3-p2);
5     lengths = sort(side,'descend');
6     tf = ...
7         abs(lengths(1)-(lengths(2)+lengths(3))) ...
8                              <= eps(lengths(1));
9     end

Method 7 - Eureka, Compute Triangle Area!

Finally, as Brian describes it, he realized there was a completely different way to see if 3 points were collinear, and that was to compute the area of the triangle they define.

collinear7(bh1,bh2,bh3)
collinear7(bh1*1e5,bh2*1e5,bh3*1e5)

ans =
     0
ans =
     0

To compute the area, you could compute base * height / 2, but again this would involve square roots. The trig approach would involve transcendental functions. Another way to think of this is to view any 2 pairs of the points as defining a parallelogram, and the triangle area is half that area. To compute the area, treat the sides as vectors which you translate to the origin. Then the area computation is straight-forward, boiling down to something proportional to the determinant If the result is 0, the points are collinear.

dbtype collinear7

1     function tf = collinear7(p1,p2,p3)
2     mat = [p1(1)-p3(1) p1(2)-p3(2); ...
3            p2(1)-p3(1) p2(2)-p3(2)];
4     tf = det(mat) == 0;

This algorithm requires an equality test following 6 arithmetic operations. Pretty simple, very elegant, much less prone to numerical issues than other algorithms. Much nicer than all of the other code above, no accounting for singular behavior necessary.

Note: Added 9 June 2008

If you read the comments for this blog, you will find an even more satisfactory solution relying on svd. svd (and the related rank function) have better numerical attributes than det.