Loren on the Art of MATLAB

August 25th, 2008

Piecewise Linear Interpolation

John D'Errico is back today to talk about linear interpolation.

Contents

Introduction

You saw in my previous blog that high order polynomials can have some problems. Why not go to the opposite extreme? Use a piecewise version of linear interpolation? I like to call it connect-the-dots, after the child's game of that name. This is really the simplest interpolation of all.

In MATLAB, given a list of points, sampled from some functional relationship in one dimension, how would we perform piecewise linear interpolation? There are really two steps.

For any point u, given a set of (x,y) pairs with a monotonic vector x (by monotonic, I mean that x(k) < x(k+1) ), first find the index k, such that

Second, perform the linear interpolation to predict the value of y at x=u, between the pair of points (x(k),y(k)) and (x(k+1),y(k+1)).

Each data point in the list of points becomes a point where the slope of the piecewise linear interpolant changes to a new value. However, the function is still continuous across those locations. So one might call these locations "knots" because at those points consecutive polynomial segments are tied together. Knots is a common term applied to splines for these locations; breaks is a common alternative name.

Create Some Data to Interpolate

x = linspace(0,2*pi,10)';
y = sin(x);
plot(x,y,'o')
title 'A simple trig function'
xlabel X
ylabel Y

Suppose I want to interpolate this function at some intermediate point, perhaps u == 1.3?

u = 1.3;

histc Solves the Binning Problem

The first step I describe above is what I call binning. histc solves that problem efficiently.

[k,k] = histc(u,x);
k
k =
     2

As an aside, look at the way I took the output from histc. Since I only need the second output from histc but not the first output, rather than clutter up my workspace with a variable that I did not need, the output style [k,k] returns only the information I need.

Next, it seems instructive to dive a little more deeply into binning, so let me offer a few alternatives to histc.

Binning - A Loop With An Explicit Test

Just a test inside a loop suffices.

for k = 1:(length(x)-1)
  if (x(k) <= u) && (u < x(k+1))
    break
  end
end
x
k
x =
            0
      0.69813
       1.3963
       2.0944
       2.7925
       3.4907
       4.1888
       4.8869
       5.5851
       6.2832
k =
     2

Binning - A Semi-vectorized Test

Do the binning with a single vectorized test. This works reasonably as long as I have only a scalar value for u, so I call this only a semi-vectorized solution.

k = sum(u>=x)
k =
     2

When I want to place multiple points into their respective bins, this sum fails.

There are other ways to place points in bins in MATLAB, including a sort, or with hash tables. You can find several different binning methods in bindex on the file exchange. It is useful mainly to those with older MATLAB releases, because histc became available with version 5.3 and later of MATLAB.

Fully Vectorized Binning

Next, I may, and often do, have a list of points to interpolate. This is a common event, where I wish to more finely resample a curve that is sampled only at some short list of points. This time, let me generate 1000 points at which to interpolate the sampled function.

u = linspace(x(1),x(end),1000)';

[k,k] = histc(u,x);

This next line handles the very last point. Recall the definition of a bin as

Note the strict inequality on the left. So that very last point (at x(end)) might be technically said to fall into the n|th bin when I have |n break points.

On the other hand, it is more convenient to put anything that lies exactly at the very last break point into bin (n-1).

By the way, any piecewise interpolation should worry about points that fall fully above or below the domain of the data. Will the code extrapolate or not? Should you extrapolate?

n = length(x);
k(k == n) = n - 1;

Interpolation as a Linear Combination

The final step is to interpolate between two points. Long ago, I recall from high school what was called a point-slope form for a line. If you know a pair of points that a line passes through, as (x(k),y(k)) and (x(k+1),y(k+1)), then the slope of the line is simple to compute. An equation for our line as a function of the parameter u is just:

I can also rewrite this in a way that I like, by defining a parameter t as

Then the interpolant is a simple linear combination of the function values at each end of the interval.

Do the Interpolation and Plot the Result

See how nicely this all worked, in a fully vectorized coding style.

t = (u - x(k))./(x(k+1) - x(k));
yu = (1-t).*y(k) + t.*y(k+1);

plot(x,y,'bo',u,yu,'r-')
xlabel X
ylabel Y
title 'The connect-the-dots interpolant'

Use interp1 Instead

It is always better to use a built-in tool to solve your problem than to do it yourself, so I might just as well have used interp1 to accomplish this interpolation.

yinterp1 = interp1(x,y,u,'linear');

plot(x,y,'bo',u,yinterp1,'r-')
xlabel X
ylabel Y
title 'The connect-the-dots interpolant, using interp1'

In my next blog I'll begin talking about piecewise cubic interpolants. Until then please tell me your ideas here. Are there some associated topics that I should cover?


Get the MATLAB code

Published with MATLAB® 7.6

15 Responses to “Piecewise Linear Interpolation”

  1. Yakov replied on :

    Another option is to use sgolayfilt built in to the signal processing toolbox or a version on matlab central (I also have my own copy to handle non-equally spaced independent variable data). This lets you do an arbitrary degree polynomial interpolation as a convolution step (it turns out to be equivalent to a least square fit) very efficiently.

  2. Pam replied on :

    Dear John,
    The mini lecture is great! I am interested in knowing how to linearly interpolate for higher dimensions, say 3. I understan that MATLAB has interp3 for this task, but given that I am calling this function millions of times I am thinking of coding the basic commands myself. Would you mind talking about linear interpolations for 3 dimensions?
    Thanks!
    Panle

  3. John DErrico replied on :

    Hi Pam,

    Interpolation in a higher number of dimensions is interesting for a few reasons. In fact, I wrote the equivalent of about a 400 page book just on different facets of this problem in 3 or more dimensions. (In my past life at Kodak, as a series of technical reports.)

    My first suggestion is to try to recode your problem to do it in a vectorized way. Don’t loop over interp3, but call it with millions of points at once. Its probably not an option for you though. Perhaps you are doing an optimization, or some other sequential process that precludes a single vectorized call. If the reason that you are interpolating millions of points is image processing, then a vectorized call is very logical.

    I’ll describe the interpolation in higher dimensions. Do you wish to do linear interpolation? This is the simplest. Unfortunately, even linear interpolation is not completely simple, since there are different ways to accomplish a linear interpolation in higher dimensions.

    The problem is easy enough to see in 2-d. Imagine you have a function of two variables, defined on a rectangular grid of points. For some point inside this grid, you wish to interpolate a value. Assume that the point lies inside some particular square in the grid. So first, you need to locate that square region. This is simply accomplished via histc, called twice, once in x and then in y.

    So now, WLOG, we can assume that we have a point inside a unit square. We know the value of some function z(z,y) at the four corners of the unit square. (All of this applies equally in three or more independent variables. Its just messier to explain.) Can we interpolate “linearly” in 2-d across that square? In fact, we cannot do so. There is no pure linear interpolant over four points in a rectangular region in the plane. We can appreciate this fact by the maxim that three points determine a plane. So if we add a fourth point, the problem is no longer solvable in general via a linear interpolation.

    How do interp2, interp3, and interpn get around this, since they claim to provide a “linear” interpolant? They use a tool called a tensor product linear interpolation - a first order spline. It can also be seen in finite element methods, for those of us with some mechanical engineering background. Image processing people tend to call these methods bilinear or trilinear interpolation, depending upon the number of dimensions.

    The ‘linear’ interpolant in the interp(2,3,n) codes is truly linear only if you look at how it performs parallel to one of the axes of interpolation. In other directions, it is a local polynomial function of higher order. Thus in 2-d, we end up with a locally polynomial function with terms like

    f(x,y) = a0 + a1*x + a2*y + a3*x*y

    In 3-d, the polynomial looks like

    f(x,y,z) = a0 + a1*x + a2*y + a3*z + a4*x*y + a5*x*z + a6*y*z + a7*x*y*z

    So we see it is not truly a linear interpolant, but look at the 2-d case. Fix y at a single value y0.

    g(x) = f(x,y0) = (a0+a2*y0) + (a1 + a3*y0)*x

    This is equivalent to an interpolation parallel to the x axis, with y fixed. See that it is truly linear in x.

    There are other ways to accomplish a linear interpolation in multiple dimensions. In fact, the trilinear interpolants I described are a poor way to interpolate if you are using them for image interpolation. This is because of the behavior of a trilinear interpolant on the neutrals in an RGB color space.

    For such color interpolants, I’d recommend using a simplicial interpolant. Break the color cube into 6 tetrahedra, and then do a pure linear interpolation over those tetrahedra. This is the way such interpolations are commonly done in the color world. If you are careful in your choice of the dissection of the cube into tetrahedra, that interpolation can be achieved in a way that handles the neutral colors gracefully.

    All of this so far is merely on the issues of linear interpolation in higher dimensions. I’ll note that the higher order interpolants (cubic spline based interpolants) should generally NOT be used for color table interpolations, as that introduces a whole slew of interesting problems. Plus, they are considerably slower if you are interpolating millions of image pixels through a color table. If you are interpolating images for the purposes of resampling, then those interpolants can be appropriate to produce a smooth result. For example, a bicubic interpolant in photoshop is often a good choice for resampling an image, although a relatively slow one.

    The point is, in order to help you more, I really need to know more about your problem. At least I need to stop here if I am not to write many more pages of text that might not be really necessary. What kind of interpolations are you doing, and what is the purpose of that interpolation? As I’ve stated in these blogs, understanding your goals in a problem are of major importance in solving that problem.

    John

  4. Richard W. Lukanen, Jr. replied on :

    To Loren Shure / John D’Errico

    I have read your blog from August 25, 2008; Piecewise Linear Interpolation. At the end of your blog you mention that you would write about piecewise cubic interpolation. I looked for the piecewise cubic blog but didn’t find any.

    Would you consider continuing that topic in the near future?

    My application for the piecewise cubic interpolation is in motion control applications. I often have a table of x,y data that would have a begin slope and end slope. The interpolation needs to be cubic. The x data represents time (or lineshaft angle) and the y data is position of the servo. Rockwell uses this trajectory planning in their RSLogix controls software. I would like a simple means to achieve the same results that our machine controls produces inside of MatLab applications. I do know for certain that the method they use is piecewise cubic interpolation. As I mentioned, the begin slop and end slope need to be specified and they need not be zero.

  5. John DErrico replied on :

    Hi Richard,

    (I’m sorry about the next blog not appearing yet. This series is on my short list of things to finish, one way or another. I do promise to return to the topic of interpolation as soon as I can.)

    There are several ways to solve your problem. Perhaps the simplest solution is to use the splines toolbox, if you have it. csape is a nice function that offers the ability to fit an interpolatory cubic spline, with supplied end conditions. One of those options is to supply the end point slopes. For example, try this on some random data:

    This generates a cubic spline interpolant through a random set of points, subject to the end conditions that the left hand end point slope is 1, and the right hand end point slope is 2.

    pp = csape(1:10,rand(1,10),’clamped’,[1 2]);

    It may not be obvious to you that I was successful here, so it is instructive to look at the coefficients. The pp form produced stores the coefficients of each cubic segment in pp.coefs.

    pp
    pp =
    form: ‘pp’
    breaks: [1 2 3 4 5 6 7 8 9 10]
    coefs: [9×4 double]
    pieces: 9
    order: 4
    dim: 1

    pp.coefs
    ans =
    0.96601 -1.8925 1 0.26187
    -0.77411 1.0055 0.11298 0.33536
    0.97201 -1.3168 -0.19835 0.67973
    -1.0986 1.5992 0.084018 0.13655
    1.0952 -1.6964 -0.01322 0.72123
    -0.92164 1.5892 -0.12051 0.10676
    0.72333 -1.1758 0.29286 0.65376
    -0.82063 0.99421 0.1113 0.49417
    1.7659 -1.4677 -0.36218 0.77905

    The very first segment “lives” on the interval [1,2]. The corresponding coefficients are

    pp.coefs(1,:)
    ans =
    0.96601 -1.8925 1 0.26187

    See that the linear coefficient is 1! Perhaps I could convince you that this means the first derivative of that segment at its lower knot is exactly 1. It might take even more work to convince you that this next code snippet predicts the right hand endpoint slope of the spline. (It does, as long as the breaks were chosen to have a unit spacing.)

    sum(pp.coefs(end,:)*diag([3 2 1],1))
    ans =
    2

    Perhaps easier is to just use fnder to differentiate the curve, and plot it. fnder will differentiate it, then fnplt does the plot.

    ppd = fnder(pp);
    fnplt(ppd)

    Oh, sorry, you don’t see the nice plot I’ve just displayed. 8-) Instead, just evaluate the derivative function at its end points:

    fnval(ppd,[1 10])
    ans =
    1 2

    As expected, I get [1,2].

    If you do not have the splines toolbox, my next option is to offer you a piece of code that will build an interpolatory cubic spline (alternatively a least squares spline) with a huge variety of applied constraint classes. It is free after all, as are all of the utilities that I offer.

    Please send me an e-mail if that is your preference, and I will send you a zip file containing the utility to do this. Sorry, it is not yet on the file exchange, but it is quite usable.

    John

  6. Alex replied on :

    Dear John,

    I am really interested in simplicial and multilinear interpolation (so N-dimensional interpolation by these two schemes), I would like to use and compare them in Matlab. Is there a built-in function in Matlab for simplicial interpolation in N-dimension?If not, how can I implement it in Matlab?Interpn in Matlab I guess is only for multilinear interpolation.

    Thanks

    Best regards,

    Alex

  7. John DErrico replied on :

    Hi Alex,

    It might be easier to just pump me for knowledge, having already done everything possible with each of these methods.

    I assume that you wish to use these methods for color lookup table interpolation? In how many dimensions? What size tables? What types of tables?

    Yes, interpn is the only tool in matlab. It will do n-d linear, or if you prefer, tensor product linear interpolation.

    A simplicial interpolant is easy enough to write. Almost trivial. (John says this with a mischievous grin on his face.) There are a few tricks to make it easier. So what is your goal in this investigation of yours?

    John

  8. Alex replied on :

    Dear John,

    I would like to use simplicial and multilinear interpolation schemes for multivariate interpolation of general functions with complex values depending from more real variables.

    Thanks

    Best regards,

    Alex

  9. Peter Kabal replied on :

    The Matlab routine interp1 (and interp1q) has some surprises. The arguments are the reference points defined by x and y. The points at which interpolation occurs are given by xi. One expects that if xi=x, then yi should be equal to y. But not always!

    This behaviour breaks some code of mine. I have a simple piecewise linear mapping. Some processing occurs on the mapped data, and then the inverse mapping is done. The inverse mapping is easily implemented by swapping x and y.
    Consider the following code where the forward mapping values are passed back into the inverse mapping (i.e. no modification happens). The value of y2 is a “magic” value which causes problems.

    x = [0 0.0275 0.08];
    y2 = 0.016603773584906 - 3.400058012914542e-016;
    y = [0 y2 0.08];
    
    xi = x;
    
    % Forward mapping
    yi = interp1(x, y, xi)
    yi - y   % not zero for the last point
    
    % Inverse mapping, returns an NaN for the last point
    xi = interp1(y, x, yi)
    

    Running this snippet of code will show that xi contains an NaN. Tracking down the problem shows that interp1q uses a one multiplier version of linear interpolation. In summary form for a scalar xi,

      yi = yleft + (yright-yleft)*u
    

    For the case of the values in the earlier example, u=1 for the last value of xi, but numerical errors cause the result not to be equal to yright. A two multiplier version of linear interpolation would solve the problem.

      yi = (1-u)*yleft + u*yright
    

    Looks like I will have to implement a two multipler version of linear interpolation to avoid the problem!

  10. John DErrico replied on :

    Peter,

    No. Sorry, but no. There is no assurance that simply rewriting the equation for yi in this form will solve your problem. In fact, the problem will surely arise again if you do. The problem is that MATLAB does its arithmetic in a binary floating point arithmetic (true for almost all computational platforms.) Most decimal numbers cannot be exactly represented in floating point arithmetic. One way to convince yourself of that is to look at the hex representation of some simple numbers.

    format hex
    0.1
    ans =
       3fb999999999999a
    
    0.2
    ans =
       3fc999999999999a
    
    0.3
    ans =
       3fd3333333333333
    
    0.4
    ans =
       3fd999999999999a
    
    0.5
    ans =
       3fe0000000000000
    
    0.6
    ans =
       3fe3333333333333
    
    0.7
    ans =
       3fe6666666666666
    
    0.8
    ans =
       3fe999999999999a
    
    0.9
    ans =
       3feccccccccccccd
    

    As it turns out, only 0.5 was exactly represented as a binary floating point number. All of the other members of this sequence were off by a bit or so in the least significant bit. This happens because a number like 0.1 has no representation in a finite number of bits in a binary form.

    So you should NEVER test floating point numbers that are the result of any numerical computation like this for exact equality. Never assume the least significant bit of your result is known exactly.

    That the point is returned as a NaN happens because of these errors in the LSB. Make this comparison instead:

    xi = interp1(y, x, yi)
    xi =
                             0                    0.0275                       NaN
    
    xi = interp1(y, x, yi,[],'extrap')
    xi =
                             0                    0.0275                      0.08
    

    See that when I allow interp1 to extrapolate outside the interval, there is no longer a problem. It only had to extrapolate by a bit or so in the LSB, but that was enough.

    See topic 6.1 in the MATLAB FAQ for more discussion of this, or the newsgroup for infinitely many discussions on this and similar topics.

    http://matlabwiki.mathworks.com/MATLAB_FAQ

  11. Peter Kabal replied on :

    John

    My concern was not entirely about the NaN on the inverse function, but also that the result of the forward transform did not guarantee that the output would be bounded by the limits of the reference points. Extrapolation only help to avoid the NaN in the inverse transformation.

    I will give the code for a simple two multiplier interpolation (as in the main part of your blog).

    function yi = IntLin (x, y, xi)
    % Linear interpolation with a guarantee that if xi=x, then yi=y
    
    [Nc, k] = histc (xi, x);
    Nx = length (x);
    k(k==Nx) = Nx - 1;
    
    t = (xi - x(k)) ./ (x(k+1) - x(k));
    yi = (1-t) .* y(k) + t .* y(k+1);
    

    For the following, let us assume xi is scalar
    1. The histc function and the subsequent fix-up below it will guarantee that xi lies between x(k) and x(k+1) (inclusive).
    2. This guarantees that xi-x(k) is zero or greater. Further any reasonable floating point implementation will give xi-x(k) as a non-decreasing function of xi.
    3. The largest value of the numerator in the calculation of t is x(k+1)-x(k).
    4. Any reasonable implementation of floating point operations will guarantee that t = (xi-x(k)) / (x(k+1)-x(k)) will lie between 0 and 1 (inclusive) and that t will be a non-decreasing function of xi.
    5. Floating point systems (IEEE 754 in particular) can represent integers exactly. Then when t=0 or t=1, (1-t) will be exactly 1 or zero.
    6. Certainly when xi hits the limits we will have 0 times one value + 1 times another value, giving the correct results at the end points.

    The two multiplier version will handle the end points correctly. I cannot at the moment see a way to argue that yi using the two multiplier version changes monotonically as xi goes from x(k) to x(k+1). The one multiplier version (yi = y(k) + (y(k+1)-y(k))*t) is a monotonic function of t, but has the possibility of going beyond the upper bound.

    As a side note: one can see that the authors of linspace realized there was a problem with the boundary value. The routine generates the first N-1 points using a formula similar to the one multiplier version of the interpolation formula. The last point boundary point is appended separately.

  12. ahmed replied on :

    Hi,
    I’m trying to draw unloading curve of a one-dimensional spring mass model, with a lookup force deflection for loading and linearly decreasing curve for unloading curve.
    first of all, is there any way to draw a lookup table with decreasing values?
    Then, i tried to used a gain multiplier instead of lookup table, but it doesn’t work since the integration of the system prevents the displacement from decreasing, actually it makes it increases instead
    Kind regards

  13. John DErrico replied on :

    Hi Ahmed,

    Is this a Simulink question I think? Your question is very confusing, so I am not sure how to respond. You need to work out what is your question. What are you trying to do here? I have said this many times before, but the most important part of solving a difficult problem is in simply stating the problem accurately.

    Regardless of the environment you will use it in, let me talk briefly about lookup tables. A lookup table generally is defined by a list of points, an input and an output for each input. I’ll call the input x (the independent variable) and the output variable y. y is dependent upon x. Now, suppose we will use the lookup table. For a new value, call it X, we wish to determine an appropriate output value Y.

    With one dimensional input, we must identify the pair of points in x that bound the new value X. Then use some scheme to predict a value for Y at that location. Some lookup tables will merely use nearest neighbor interpolation or a variant of it, or linear interpolation, or perhaps a cubic spline interplant. The choice of method would be up to you. But regardless, the table itself is fixed, dependent upon what data you supply. So if your data points represent a decreasing relationship, then so would your lookup table. (Note that some choice of interpolants will not always produce a monotone interpolant even for monotone data. This is why PCHIP was invented, and provided as part of interp1.)

    So please try restating your problem.

    Regards,
    John

  14. Tina replied on :

    Do you know of an interpolation scheme for nonmonotonic functiona? I have three independent variables and two dependent variables. I am using the interpolated variables to solve an energy equation. The process is iterative and I need an algorithm that will not shoot off into space. The code I use now uses Newton-Raphson to find a solution and it is restricted to monotonic data. Any thoughts?

  15. John DErrico replied on :

    Tina,

    If I understand your question correctly, you are trying to solve an interpolation problem where the two dependent variables are given, and you wish to recover the values of three independent variables. If a solution exists, then it will generally be infinitely many solutions, i.e., a 1-manifold of solutions that lie in the R^3 domain space of this problem. Think of this as an iso-line. Is any point on this locus of solutions a valid solution?

    As is often the case, my approach would depend on my goals. If a piecewise linear approximation to the 1-manifold was acceptable, then I might start with a triangulation of the R^3 domain set. Then slice through this twice to reduce that triangulation to a piecewise linear 1-manifold. Now I can do anything I want with that curve. It is quite nice for visualization purposes, and sometimes for other end uses. I would use a set of custom made tools for these slices that I have never gotten around to putting on the File Exchange.

    If my goal is otherwise, I might apply fmincon (rather than Newton-Raphson, which cannot employ constraints) to solve for the closest point to my starting point, such that the dependent variables are properly interpolated (as a pair of nonlinear equality constraints.) Here my interpolation might use griddata or interp2, with a smooth method employed.

    Let me know if I can be of further help, or if my understanding of your question was incorrect.

    John

Leave a Reply

Wrap code fragments inside <pre> tags, like this:

<pre class="code">
a = magic(3);
sum(a)
</pre>

If you have a "<" character in your code, either follow it with a space or replace it with "&lt;" (including the semicolon).


Loren Shure works on design of the MATLAB language at The MathWorks. She writes here about once a week on MATLAB programming and related topics.

  • Jun: I totally can not believe it, Loren. You are really helpful. Thank you so much, MATLAB master!
  • Loren: Wow folks- Always lots of interest when there’s a quickie to try out! I will only make 2 general...
  • Loren: Jun- ismember is your friend here: >> [aa,ind] = ismember(Array2,Arra y1) aa = 1 1 1 1 1 1 1 ind = 1 2 1 4 4 3...
  • Dan: I like the first way better than the second way. Combining the arrays into one and running any is nice, although...
  • James Myatt: How about I = (a == 0 | b == 0); a(I) = []; b(I) = [];
  • Tunc: Hello Loren, love your blog because of such inspiring and challenging comments to such ’small’...
  • Pekka Kumpulainen: Here is my tradeoff. I usually want to keep the original variables as they are most probably...
  • Iain: Followup: Of course, to allow NaNs (counting them as non-zero): mask = (a~=0) & (b~=0); The mask says “a...
  • Matt Fig: I would usually go with something like this: y = a&b; x = a(y); y = b(y); But I was surprised to find...
  • kk: c=all([a;b]) a(c) a(b)

These postings are the author's and don't necessarily represent the opinions of The MathWorks.