I'd like to introduce a new guest blogger - John D'Errico - an applied mathematician, now retired from Eastman Kodak, where he used MATLAB for over 20 years. Since then, MATLAB is still in his blood, so you will often find him answering questions on the newsgroup and writing new utilities to add to MATLAB Central.
I'll assume you have some data points through which you wish to pass a curve, interpolating your data. (Initially, I will only talk about problems with one independent variable.) In these coming blogs, I'll try to show some ways to do exactly this, i.e., find a curve that passes through your data. Along the way I'll try to give some pointers on curve fitting, interpolation, modeling, approximation, etc.
Polynomial regression
A fair question is why I start with a discussion of polynomial regression when we really want to talk about interpolation. Many people confuse interpolation with the approximation produced by a regression model, calling both of these things interpolation. So I'm starting out with some discussion about what interpolation is not. Plus, I want to ensure an understanding of polynomials, since many of the tools for interpolation are polynomial based in some way.
Let us start by creating some data. An exponential is a good place to start, a simple curve shape that is easy to fit.
x = -1:.1:1; y = exp(x);
Plot your data
It is always a good idea to plot your data. In fact, I'll suggest that you should plot everything. Plots are useful, since your eye and brain are splendid at things like pattern recognition. Only you know your data, as the scientist, engineer, or analyst. You will always benefit if you can employ your knowledge of a system as part of the modeling process.
Next, always think in advance about your goals for any model.
- Will you use this model purely for interpolation, i.e., for predictive purposes only?
- Do you need to derive some understanding about the process from your model? Perhaps you need to estimate an upper asymptote of your process. If so, then you may want a model that has such an upper asymptote built into it.
- Will you need to include the model coefficients in some paper that you expect to write? Will you need to use this model in MATLAB, or in some other tool?
- Must the model be simple to evaluate?
- Must the model be efficient to evaluate? You should know whether you will need to evaluate this model millions of times, or just once.
- What do you know about your data? Is there noise in the data? May you ignore that noise, or must you smooth the noise away?
- What do you know about the underlying functional relationship? Is it monotone? Increasing? Decreasing? Positively/negatively curved? Must it pass through a specific point?
- Must the interpolant have specific requirements in terms of continuity or differentiability?
For example, I once had a problem where I knew I had some significant noise in our process, but I chose not to do any smoothing anyway. Any such smoothing would also have smoothed out some potentially important features of our process. Since I could survive with the noise in my interpolant, I chose the lesser evil.
Always know your goals for any such task.
plot(x,y,'bo')
xlabel X
ylabel Y
grid on
title 'Exponential data'
This is a nice, well-behaved function. It is of the form y = f(x). I'll pretend for the moment that I have no idea what was in the underlying functional relationship.
One thing I learned in some long past calculus course was that a Taylor series will provide an approximation for many functions. Polynomials are useful things. They are simple to use, simple to build, simple to work with. And a truncated Taylor series is basically a nice, simple polynomial.
So we will start with a linear polynomial approximation for this curve, built using polyfit. This is a utility provided in MATLAB to estimate a polynomial model using linear regression techniques. We could also use many other tools to build our polynomial model, but polyfit is a useful one, and easy to use.
A linear, or first degree polynomial (many use the words "order" and "degree" interchangeably), might be written mathematically as y(x) = a1*x + a2. In MATLAB we will merely store the coefficients, as a vector [a1,a2]. Note that a polynomial in MATLAB has its coefficients stored with the highest order term first.
P1 = polyfit(x,y,1)
P1 =
1.1140 1.1937
We can evaluate the polynomial with polyval.
yhat = polyval(P1,x);
plot(x,y,'bo',x,yhat,'r-')
xlabel X
ylabel Y
grid on
title 'Linear polynomial fit'
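To make the coefficient ordering concrete, here is a minimal sketch using a made-up polynomial. The vector `p` and the test point `xtest` are illustrative assumptions only, not part of the fit above. It checks that polyval agrees with writing the polynomial out by hand, highest power first.

```matlab
% Hypothetical example: p represents 2*x^2 - 3*x + 1, highest power first.
p = [2 -3 1];
xtest = 2;

% polyval evaluates the polynomial stored in p at xtest.
v1 = polyval(p, xtest);

% The same polynomial, written out term by term.
v2 = 2*xtest^2 - 3*xtest + 1;

% Both values should agree.
disp([v1 v2])
```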
Look at the residuals
In fact, I'll claim the relationship we are modeling is not terribly well represented by a linear model. Depending on your needs for this model, you might have decided differently.
When you build a regression model, look at the residuals.
ALWAYS PLOT EVERYTHING!
Plot your residuals. In general, some good ways to plot the residuals are versus
- the independent variable. Look for patterns. Patterns here in the residuals are often a hint that you should look more deeply.
- the measurement order, just in case there was a problem with your equipment. (I've seen many cases where measurement apparatus was re-calibrated at the end of every day, but an experiment spanned more than one day.)
- the dependent variable. This might help pick out cases of non-uniform variance.
Look at your plots. Think about what you see there. Compare it to your expectations.
Since this data was very simply generated, I'll dispense with some of those plots for brevity. Note that the residuals for this linear fit look vaguely like a quadratic polynomial. This is often the case when there is lack of fit in a polynomial. That lack of fit often looks like the first term we truncated from the Taylor series.
res = y - yhat;
plot(x,res,'bo')
xlabel X
ylabel Residuals
grid on
title 'Residuals for the linear fit'
If the residuals looked vaguely parabolic in shape, then it might make sense to use a second order (quadratic) polynomial for our fit.
P2 = polyfit(x,y,2)
yhat = polyval(P2,x);
plot(x,y,'bo',x,yhat,'r-')
xlabel X
ylabel Y
grid on
title 'Quadratic polynomial fit'
P2 =
0.5402 1.1140 0.9956
Note that the residuals here look vaguely like a cubic polynomial, although they are much smaller in magnitude than the previous fit. Again, at each step as we increase the order of the model, the residuals will often tend to look much like a polynomial of the next higher order.
res = y - yhat;
plot(x,res,'bo')
xlabel X
ylabel Residuals
grid on
title 'Residuals for the quadratic fit'
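The idea that the residuals of one fit look like the next higher term can be checked numerically. The sketch below (variable names `res1` and `Pres` are my own) fits a quadratic to the residuals of the linear fit. Because least squares is linear in y, the result should equal the quadratic fit minus the zero-padded linear fit, up to roundoff.

```matlab
x = -1:.1:1;
y = exp(x);

P1 = polyfit(x,y,1);
P2 = polyfit(x,y,2);

% Residuals left behind by the linear fit.
res1 = y - polyval(P1,x);

% Fit a quadratic to those residuals.
Pres = polyfit(x,res1,2);

% Least squares is linear in y, so Pres should match P2 - [0 P1]
% to within floating point roundoff.
disp(Pres)
disp(P2 - [0 P1])
```

The leading coefficient of `Pres` is the same quadratic coefficient that polyfit reported in P2, which is close to the 1/2 of the x^2 term in the Taylor series.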
Higher order polynomials - are more terms always better?
Polynomial modeling with polyfit is indeed simple and easy to do. In fact, we might decide to just use higher and higher order polynomials, always chasing the term we truncated in the previous model. After all, if a quadratic model is better than a linear one, then why not go to a cubic? Ten terms must be better than nine. Look at a tenth order model.
P10 = polyfit(x,y,10);
disp(P10(1:6))
disp(P10(7:11))
0.0000 0.0000 0.0000 0.0002 0.0014 0.0083
0.0417 0.1667 0.5000 1.0000 1.0000
We can write the Maclaurin series representation for the exponential function as

$$e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$$
We can compare P10 to the coefficients of the known series. How well did we do in the fit?
series = 1./factorial(10:-1:0);
disp(series(1:6))
disp(series(7:11))
0.0000 0.0000 0.0000 0.0002 0.0014 0.0083
0.0417 0.1667 0.5000 1.0000 1.0000
Note that the higher order coefficients deviate somewhat from the known series, although the lower order terms appear to be quite accurate.
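With the default short display format the two sets of coefficients can look identical, so it helps to print them with more digits. This is a small sketch of one way to do that. The table layout and the name `maxdiff` are my own choices.

```matlab
x = -1:.1:1;
y = exp(x);

P10 = polyfit(x,y,10);
series = 1./factorial(10:-1:0);

% Print the coefficients side by side with more digits than disp shows.
fprintf('%2s %15s %15s %12s\n','n','polyfit','series','difference');
for k = 1:11
    % Element k of each vector multiplies x^(11-k).
    fprintf('%2d %15.8e %15.8e %12.3e\n', ...
        11-k, P10(k), series(k), P10(k)-series(k));
end

% Largest absolute disagreement between the two coefficient sets.
maxdiff = max(abs(P10 - series));
```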
yhat = polyval(P10,x);
plot(x,y,'bo',x,yhat,'r-')
xlabel X
ylabel Y
grid on
title 'Tenth order polynomial fit'
The residuals oscillate tightly around zero. In fact, they are so small that this last polynomial begins to approach a true interpolating polynomial. Perhaps if we just add a few more terms we may get there. The numerical issues of floating point arithmetic will often preclude true interpolation down to the least significant bit anyway.
res = y - yhat;
plot(x,res,'bo')
xlabel X
ylabel Residuals
grid on
title 'Residuals for the tenth order fit'
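To see how quickly the residuals shrink as terms are added, one can loop over the order and record the norm of the residuals. This is only a sketch for this particular noise-free data set. The names `maxorder` and `resnorm` are assumptions of mine.

```matlab
x = -1:.1:1;
y = exp(x);

maxorder = 10;
resnorm = zeros(1,maxorder);
for k = 1:maxorder
    % Fit a polynomial of order k and record the size of its residuals.
    P = polyfit(x,y,k);
    resnorm(k) = norm(y - polyval(P,x));
end

% A log scale shows the steady decrease over many orders of magnitude.
semilogy(1:maxorder,resnorm,'bo-')
xlabel 'Polynomial order'
ylabel 'Norm of residuals'
grid on
```

On noisy data the same curve would flatten out once the model starts to chase the noise, so a shrinking residual norm alone is not a reason to keep adding terms.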
What do you do with interpolation?
I'll start talking about true interpolation in my next blog. But remember that interpolation is different from the approximations provided by polyfit or any other regression modeling tool.
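As a quick illustration of that difference, the sketch below compares a quadratic regression with a piecewise linear interpolant built by interp1, both evaluated back at the data points. The choice of interp1 here is mine, purely for contrast. An interpolant reproduces the data, while a regression model generally does not.

```matlab
x = -1:.1:1;
y = exp(x);

% Regression: a quadratic approximation, evaluated at the data.
yreg = polyval(polyfit(x,y,2),x);

% Interpolation: a piecewise linear interpolant, evaluated at the data.
yint = interp1(x,y,x);

regerr = max(abs(y - yreg));
interr = max(abs(y - yint));
fprintf('max |residual|, quadratic regression: %g\n', regerr);
fprintf('max |residual|, linear interpolation: %g\n', interr);
```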
Until that time, please give me your comments on this blog, or ideas for future blog topics on interpolation or modeling in general. Do you have some interesting applications of interpolation? Some interesting problems?
Published with MATLAB® 7.6



John,
Do you have any thoughts in the way of rules of thumb for polynomial fitting with relatively sparse data? I will often have a situation where I’ll have ten samples at each of 10 discrete values of the independent variable. Can one issue a general statement such as “Make sure your polynomial has at most N-3 terms”? (Where N would be the number of discrete values.) Likewise, are there any great rules of thumb if one has to consider the possibility of extrapolation?
Thanks,
Dan
PS. At least as it is being displayed, there is no difference between the series coefficients and the polyfit values.
Dan,
Sorry. It looks like a format difference when the blog was published. The two sets of coefficients truly are a little different, mainly in the higher order terms. Here they are as I saw them:
P10 =
Columns 1 through 6
2.8157e-07 2.8227e-06 2.4795e-05 0.00019835 0.0013889 0.0083334
Columns 7 through 11
0.041667 0.16667 0.5 1 1
ans =
Columns 1 through 6
2.7557e-07 2.7557e-06 2.4802e-05 0.00019841 0.0013889 0.0083333
Columns 7 through 11
0.041667 0.16667 0.5 1 1
Good question about the choice of model order. I’m not sure that a rule of N-k terms, or only 50% as many, or any such arbitrary rule will work, since I am also sure that I could cook up an example where such a chosen rule will fail.
There are some things that you can try however. The extra data that you have is a boon. For example, try this example:
x = -5:5;
x = repmat(-5:5,10,1);
xfit = x(1:5,:);
xval = x(6:10,:);
y = sin(x/4) + randn(size(x))/20;
plot(x,y,'.')
yfit = y(1:5,:);
yval = y(6:10,:);
valerror = zeros(1,10);
for i = 1:10
P = polyfit(xfit,yfit,i);
valres = yval - polyval(P,xval);
valerror(i) = std(valres(:));
end
plot(1:10,valerror)
xlabel 'Polynomial order'
ylabel 'Validation error'
It is fairly logical that a first or second order polynomial would be inadequate for this data, since there is a clear inflection point in the curve. What I did was to split the data in half, using some of the points for the fit, the rest purely for validation purposes. Look at the curve of validation error as a function of order. As I add higher order terms beyond cubic, I start to chase the noise. Those higher order terms are not justified here, given the signal to noise ratio in this data. The minimum validation error came from a cubic model.
My very last step would be to use the entire data set to fit the chosen order model, a cubic in this case.
P = polyfit(x,y,3);
Finally, I’ll be honest and state that my own choice would not be to fit a polynomial model at all for many such problems. I’d use a least squares spline, constrained if necessary to have the behavior I’d expect, here for example to be monotone.
John
Dan,
There are often domain-specific rules of thumb for the model order. For example, when fitting speech waveforms to an autoregressive model, a good approximation for AR model order is 2 * the number of spectral peaks [poles] one expects to find (the idea is that it takes two AR coefficients to capture information about one spectral peak).
A more general approach is to use an information-theoretic criterion: Akaike Information Criterion (AIC) or Minimum Description Length (MDL). See
http://en.wikipedia.org/wiki/Model_selection
These are heuristics for estimating the marginal “bang for the buck” - assume that each additional parameter comes with a certain cost, and assess when this cost outweighs the decrease in variance. The different criteria (AIC, BIC, MDL, etc.) differ in how they assign the cost, their expectation of noise in the parameter estimation, and so on. Since they are heuristics rather than proven optimal methods, knowing which one to apply can be a bit of an art.
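As a rough sketch of how such a criterion might be applied to polynomial order, the code below uses the common least squares form of AIC, n*log(RSS/n) + 2k, where k is the number of fitted coefficients. The test data mirrors John's sin(x/4) example. The seed, the range of orders, and the variable names are assumptions of mine.

```matlab
rng(0);                               % fixed seed so the run is repeatable
x = repmat(-5:5,10,1);
y = sin(x/4) + randn(size(x))/20;
xv = x(:);
yv = y(:);
n = numel(xv);

maxorder = 8;
aic = zeros(1,maxorder);
for m = 1:maxorder
    P = polyfit(xv,yv,m);
    rss = sum((yv - polyval(P,xv)).^2);   % residual sum of squares
    k = m + 1;                            % number of fitted coefficients
    aic(m) = n*log(rss/n) + 2*k;          % least squares form of AIC
end

% The order with the smallest AIC is the one this heuristic prefers.
[minaic,best] = min(aic);
fprintf('Order preferred by AIC: %d\n', best);
```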
Gautam
Wonderful column that introduced me to polyfit. I figured I’d learn it some day, but haven’t had the need.
Hi.
In the blog post I see a question stated as to whether higher orders are better, and looking at the examples one might be led to believe that is so.
At this time one might add a word of caution against extremely high polynomial orders, as they tend to become very moody close to the edges of the measured data and are completely undefined outside the domain of the measured data.
I have to say I am looking forward to the rest of this series, as interpolation is something I always found to be somewhat of a hassle, but I have often seen how I would benefit if I were more familiar with the tools.
Daniel,
High order polynomials are almost never the interpolation tool of choice. But I wanted to start out this way, increasing the order until you run out of room. I’ll give some examples in my next blog that show off polynomial failures.
While I have plans for a few more blogs, I’d love to hear if anyone has specific areas I should cover.
John
John and Gautam,
Thank you both for your responses to my query. I look forward to reading more on the interpolation aspects of Matlab (especially when it regards rapid computation).
Regards,
Dan
I use the thin-plate spline for interpolating velocity fields for a particle image velocimetry (PIV) algorithm. The velocity vectors are calculated at discrete locations - and so the spline is needed to evaluate the velocity field at any arbitrary set of image coordinates. A spline can be fit to each component of the velocity vector (vx, vy). I like to think of this procedure as “surface” interpolation.
Hi John,
Thank you for the blog post. I have developed an image processing program to track the shape of fluctuating protein filaments. The program outputs a bunch of coordinates representing the ‘skeleton’ in sub-pixel resolution.
At the end, I need to fit a function/model to the coordinate so I can analyze the shape; thus the function/model will be treated as the filament itself.
I naturally went to polynomial fit of the data, generally using 10th order. I do notice the fit becomes ‘moody’ at the edge and introduces error. May I know if there’d be a better model, or a proper implementation of polyfit so I can fit the coordinates (I need them fitted very tightly) while not chasing noise? S/N ratio should be pretty high in my case.
For n data points, polyfit (x,y,n-1) is an interpolant. So polyfit can be used for interpolation.
For the concept that high order interpolation is a bad idea, check out
http://autarkaw.wordpress.com/2008/06/14/higher-order-interpolation-is-a-bad-idea/
http://autarkaw.wordpress.com/2008/06/16/a-simple-matlab-program-to-show-that-high-order-interpolation-is-a-bad-idea/
In regards to finding the optimum order of polynomial to use for regression, one finds where
sr(m)/(n-m-1) becomes either a minimum or does not vary much by increasing m any further.
sr(m)= sum of square of residuals for mth order polynomial
n= number of data points
m= order of polynomial (m<n)
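A minimal sketch of that criterion follows, computing sr(m)/(n-m-1) over a range of orders. The test data (a noisy sine, with a fixed seed) and the variable names are my own assumptions, chosen only to have something to fit.

```matlab
rng(0);                               % fixed seed so the run is repeatable
x = linspace(-5,5,60)';
y = sin(x/4) + randn(size(x))/20;
n = numel(x);

maxorder = 8;
crit = zeros(1,maxorder);
for m = 1:maxorder
    P = polyfit(x,y,m);
    sr = sum((y - polyval(P,x)).^2);  % sum of squared residuals, sr(m)
    crit(m) = sr/(n - m - 1);
end

% Look for the order where the curve bottoms out or stops changing much.
plot(1:maxorder,crit,'bo-')
xlabel 'Polynomial order m'
ylabel 'sr(m)/(n-m-1)'
grid on
```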
Yes, you can use the overall sum of squares of residuals to choose an order. Given enough data, the validation trick I suggested should be as useful though, and it will often have an easily identified minimum to pick out.
The next step (my next blog) will point out that polyfit(x,y,n-1) is indeed an interpolant, although it does more work than is necessary. In fact, a quick test for a 14th order polynomial shows that it is roughly 15 times faster to use a simple backslash. (I’ve warmed up vander and polyfit in the test below.)
x = linspace(-1,1,15)';
y = rand(size(x));
tic,polyfit(x,y,14);toc
Elapsed time is 0.009136 seconds.
tic,vander(x)\y;toc
Elapsed time is 0.000605 seconds.
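Beyond the timing, it is easy to confirm that the two approaches produce the same interpolating polynomial. The sketch below uses a smaller, deterministic data set of my own choosing (8 points of cos(3x)) so that the comparison is not clouded by conditioning problems.

```matlab
x = linspace(-1,1,8)';
y = cos(3*x);
n = numel(x);

% polyfit with order n-1 passes through all n points.
p1 = polyfit(x,y,n-1);      % row vector, highest power first

% The same coefficients from the Vandermonde system, solved with backslash.
p2 = vander(x)\y;           % column vector, same ordering

% Both should reproduce the data and agree with each other to roundoff.
interperr = max(abs(polyval(p1,x) - y));
coefdiff = max(abs(p1(:) - p2));
fprintf('max interpolation error: %g\n', interperr);
fprintf('max coefficient difference: %g\n', coefdiff);
```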
Campion,
I’d suggest avoiding the use of high order polynomials here. While they are indeed simple and easy to use, they can be difficult too as you have found.
Can you use a least squares spline instead? They are often much more well behaved, and they can be controlled nicely in terms of monotonicity, curvature, etc., at least if you have knowledge of their expected behavior.
John
I remember two problems. One concerning interpolation the other one approximation. The interpolation problem was the question how to connect points in the plane by clothoidal splines. I found an article or maybe diploma theses on this at the internet. But this is years ago and the paper work has gone. For this think of the problem to build a track for a road or a railway. The spline should fulfill the condition that the curvature is linear. Is there always or sometimes a solution for cubic splines ? I don´t remember the mathematics and I´m not a mathematician.
Hello John,
I have enjoyed gleaning information from you on this topic for years. I look forward to future blogs, but I would like to ask if you can recommend a good book on this subject.
To let you know what kind of book I have in mind, I did double major in Physics/Mathematics, so an applied book at upper-division level or graduate level would be great. The only books I have found are Springer publishing books (the yellow ones) that were too specialized. I would like a book that briefly covered the basics then went on to advanced applied topics. Thanks.
Matt,
The classic is de Boor, “A Practical Guide to Splines”. It is quite readable, and covers some very useful material. It was the first book I read on the subject, and still the first I’d recommend.
A book that I always thought quite readable, even for an advanced undergrad, was Lancaster and Salkauskas. “Curve and Surface Fitting, an Introduction”.
john
Sorry. I missed the question from Campion. 10th order polynomial interpolation/approximation will indeed be “moody” at the ends. This is why I strongly recommend least squares splines instead. A least squares spline is much easier to control than is a high order polynomial, and a cubic spline tends to be often the best choice, not TOO flexible, but just capable enough to fit almost any curve shape of interest. Splines are also nice things to handle for modeling.
For example, the second derivative of a cubic spline can be no more complex than piecewise linear, so it is quite easy to constrain a spline to be positively or negatively curved over a domain.
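To illustrate the point about the second derivative, here is a sketch that differentiates an interpolating cubic spline twice by working directly on its piecewise polynomial coefficients. It uses spline, unmkpp, and mkpp from base MATLAB. The data and the variable names are assumptions for the example, and this is an interpolating spline rather than the least squares spline discussed above.

```matlab
x = linspace(0,2*pi,9);
y = sin(x);

pp = spline(x,y);                 % cubic spline in pp form
[breaks,coefs] = unmkpp(pp);

% Each piece is c1*s^3 + c2*s^2 + c3*s + c4 with s = x - breaks(i),
% so its second derivative is the linear piece 6*c1*s + 2*c2.
d2coefs = [6*coefs(:,1), 2*coefs(:,2)];
pp2 = mkpp(breaks,d2coefs);

% Check continuity of the second derivative at the interior breaks:
% the value at the right end of piece i versus the start of piece i+1.
h = diff(breaks(:));
leftval = 6*coefs(1:end-1,1).*h(1:end-1) + 2*coefs(1:end-1,2);
rightval = 2*coefs(2:end,2);
jump = max(abs(leftval - rightval));

xx = linspace(0,2*pi,200);
plot(xx,ppval(pp2,xx),'r-',xx,-sin(xx),'b--')
legend('Spline second derivative','True second derivative')
grid on
```

Since each piece of the second derivative is linear, requiring it to be nonnegative at the breaks is enough to make the whole spline positively curved, which is why such a constraint is easy to impose.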
Not quite so easy is to constrain a cubic spline to be monotonic, but that too is possible in a linear regression context. Other constraints too are possible. In total, splines can form a splendid way to approximate data where you do not have a mechanistic or physical model for the process, yet you need something that behaves itself.
The next blogs I do will begin to discuss splines, starting with baby steps as piecewise linear interpolants, then next into the realm of cubic Hermite functions and cubic splines. Perhaps in a future blog I could even spend some time on least squares approximation by splines.
John
That’s truly wonderful and helpful.
I am working with some real time series data and plan to use interpolation to account for missing data from radar scattering space observations. However, I am afraid that significant frequency components will be lost if I use interpolation. Also I would like to know what the effect of this interpolation will be when I take the fft of the interpolated data.
I am not sure whether you had already dealt with this kind of interpolation before I sent this comment. Even so, I really loved reading it, given the site.
Thx a lot
Abiyu
Hi Abiyu,
I’m afraid that I can’t help you too much here. You can look at the MTF for an interpolation method to learn what it will do to the frequencies of interest. See pages 73+ in Vollmerhausen & Driggers, “Analysis of Sampled Imaging Systems” for example. (You can find a large part of this book on Google Books.)
John
I’m trying to use polyfit to fit a nearly vertical straight line and a very poor fit results.
This is not mentioned in the polyfit documentation as a known issue.
Can you comment on this?
Thanks,
Jennifer
Jennifer,
I’d not expect this to be mentioned as an “issue”. A vertical line has an infinite slope. It has no finite value for the slope. Since polyfit estimates a model of the form y = a1*x + a2, what coefficients would you wish polyfit to return? For example,
x = 2 + randn(5,1)*1e-8;
y = randn(5,1);
[x,y]
ans =
1.99999999813291 0.11393131352081
2.00000000725791 1.06676821135919
1.99999999411683 0.0592814605236053
2.00000002183186 -0.095648405483669
1.99999999863604 -0.832349463650022
plot(x,y,'o')
axis equal
This is a vertical line, to within any rational limits. The equation of the line is essentially x=2. That is, y is not a function of x at all. So when you try to fit a line using polyfit,
P = polyfit(x,y,1)
P =
10177980.6056492 -20355961.1895639
The coefficients are garbage, purely so. But we should expect that! Garbage in, garbage out. When you ask polyfit to fit the model it is designed to fit on this set of data, why would you expect differently?
Having said that, there are several alternatives. One can swap the axes, fitting a model of the form x = b1*y + b2, still using polyfit.
P = polyfit(y,x,1)
P =
2.71191329645078e-09 2.0000000038259
Again, as expected. The model suggests that x = 2, independent of the value of y.
You might also choose to use an orthogonal regression code. There should be several such tools to be found by a quick search on the file exchange.
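For readers who want to see the idea, here is a minimal sketch of an orthogonal (total least squares) line fit using the singular value decomposition. This is not one of the File Exchange tools, just the basic technique. The data values are made up to mimic the nearly vertical example above.

```matlab
% Made-up, nearly vertical data, similar in spirit to the example above.
x = 2 + 1e-8*[1; -2; 0.5; 3; -1];
y = [0.1; 1.0; 0.05; -0.1; -0.8];

% Center the data on its centroid.
c = [mean(x), mean(y)];
A = [x - c(1), y - c(2)];

% The first right singular vector points along the best fitting line,
% in the sense of minimizing perpendicular distances to the line.
[U,S,V] = svd(A,0);
direction = V(:,1);     % unit vector along the line
normal = V(:,2);        % unit normal to the line

% The fitted line is normal(1)*(x - c(1)) + normal(2)*(y - c(2)) = 0.
disp(direction')
disp(c)
```

Because the line is described by a point and a direction rather than by a slope, a vertical line causes no trouble here.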
John
Hello Jon,
Is there any easy way to interpolate a general function f from R^3 into R^3 given a list of N points that the function maps? I.e. I have a list of N “fixed points” where the behavior of the function f is known: f([x_1,y_1,z_1])=[u1,v1,w1], f([x_2,y_2,z_2])=[u2,v2,w2], …, f([x_N,y_N,z_N])=[uN,vN,wN] and I want to estimate f([x,y,z]) for any point [x,y,z] in R^3.
Thanks,
Geoff
There are several schemes for doing this type of interpolation. Griddata3 (or griddatan) provides one way. Here we treat each output independently as a 3-1 variable mapping of the form f(x,y,z). In that case, griddata forms a tessellation of the scattered domain (x,y,z). Then any point must lie inside one of these simplexes, or along a possibly shared facet thereof. Linear interpolation is then done within the simplex. Think of this as building three independent higher dimensional surfaces to form the mapping.
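A small sketch of the first scheme, using griddatan once per output component, might look like the following. The scattered points, the linear test map (`A`, `b`), and the query point `q` are all assumptions made up for the example. A linear map is used only because linear interpolation within a simplex should reproduce it, which makes the result easy to check.

```matlab
rng(0);                                 % fixed seed so the run is repeatable

% Scattered points in the unit cube. The 8 corners are included so that
% any query point inside the cube lies within the convex hull.
[cx,cy,cz] = ndgrid([0 1]);
P = [cx(:) cy(:) cz(:); rand(40,3)];

% A made-up linear map from R^3 to R^3, used to generate "known" values.
A = [1 2 0; 0 1 -1; 3 0 1];
b = [0.5 -1 2];
F = P*A' + repmat(b,size(P,1),1);       % row i is f(P(i,:)) = [u v w]

% Interpolate each output component independently at a query point.
q = [0.3 0.6 0.45];
fq = zeros(1,3);
for k = 1:3
    fq(k) = griddatan(P,F(:,k),q);
end

disp(fq)
disp(q*A' + b)                          % exact value of the test map at q
```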
The second scheme is to consider this as a deformation of a domain in three dimensions. To do this, we might perhaps use a finite element code to solve for the deformation of an elastic body in three dimensions. Any single point in the domain then maps to a corresponding point in the range space.
Both of these schemes have their associated merits. A problem with the elastic body solver is what I like to call unwanted baggage in the metaphorical model. Think of it like this. Suppose I were to compress a piece of rubber along one axis. When I do so, my expectation is that the rubber will bulge out into the other dimensions. This is expected behavior for an elastic body. (A parameter in a physical model is Poisson’s ratio, which helps us to quantify how a given material behaves under deformation.) But suppose I were to do a similar operation on a set of data points in some three dimensional domain? Here a simple scaling along one axis need not be accompanied by a bulge in the other directions. It turns out that if you do use an elastic body solver here, then I’ll argue that it probably makes sense to use a Poisson’s ratio of zero.
My point is that use of a metaphorical model such as the deformations of an elastic body to interpolate data may make some sense. A cubic spline is my favorite member of the family of metaphorical models, wherein we use a mathematical model for one physical system to predict the behavior of a completely unrelated physical system. But at the same time any such metaphorical model carries along baggage. That baggage is the set of sometimes strange predictions that we will see if we are less than careful in our usage of a metaphor.
Climbing down now from my metaphorical soapbox…
John
Hi John, thanks for these blogs on interpolation- they’ve been very helpful so far.
However, I am having trouble working out how to find the polynomial fit to a 3D surface, when instead of having 3 vectors for x, y and z I simply have a single matrix, Z, of size [m x n]. The Z values represent the elevations/surface points. Obviously I can view this with surf(Z) and see my real surface with no problems. I want to find the function Z = f(x,y) that best fits my surface.
I have seen your polyfitn function but I cannot work out how to prepare my matrix Z for input. Do you have any suggestions?
Many thanks,
Gareth
Hi Gareth,
There are two issues here. One is to build the independent variables for the fit itself. (This is the easy part, but even there we will trip over issues to deal with.) The second issue is what model one will choose.
I’ll use the MATLAB standard here to define x and y. So if we have an mxn matrix Z, then we have m points in y, and n in x. We can simply choose to define x and y as living on the integers 1:n and 1:m respectively. Don’t forget that we have an entire matrix though! The MESHGRID function will allow us to build arrays of the right size and shape for x and y. For example, given a 3×5 array for Z, we might do this:
m = 3;
n = 5;
[ x, y ] = meshgrid(1:n,1:m)
x =
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
y =
1 1 1 1 1
2 2 2 2 2
3 3 3 3 3
Here x and y are now both mxn matrices, of the same size as Z. How might we build the polynomial model itself? For this, suppose we simply wanted a linear model in both x and y? We can use either backslash to do the work, or we might use regression codes like polyfitn or regress from the statistics toolbox. Or, if you have the curvefitting toolbox, it now allows you to do fits in two independent variables. I’ll show you how to do it using some of those alternatives.
Before I do any actual examples, I’ll need some actual data. Ok, made up data.
Z = -3 + x + 2.5*y + randn(m,n)/3
Z =
0.53798 1.4681 2.0546 3.2694 3.9688
3.3556 3.7226 5.2381 6.286 6.5197
5.5198 6.5981 8.0412 8.918 9.6904
surf(x,y,Z)
My model will be of the general form

$$Z(x,y) = a_0 + a_1 x + a_2 y$$
First, solve this with backslash. For a model that is linear in x and y, this is easy.
coef = [ones(numel(x),1),x(:),y(:)] \ Z(:)
coef =
-3.3027
0.94052
2.7469
Not too bad. I had dumped some noise into the surface when I built it, so you should not expect perfect estimates of the numbers.
My own POLYFITN (from the file exchange) also does this nicely enough of course, along with some useful statistics on the model and the parameters.
model = polyfitn([x(:),y(:)],Z(:),'constant x y')
model =
ModelTerms: [3x2 double]
Coefficients: [-3.3027 0.94052 2.7469]
ParameterVar: [0.057808 0.0025134 0.0075402]
ParameterStd: [0.24043 0.050134 0.086834]
R2: 0.99121
RMSE: 0.2456
VarNames: {'x' 'y'}
Or if you have the statistics toolbox, REGRESS is a very good choice. Here we see 95% confidence intervals returned on the parameter estimates.
[B,BINT] = regress(Z(:),[ones(numel(x),1),x(:),y(:)])
B =
-3.3027
0.94052
2.7469
BINT =
-3.8266 -2.7789
0.83129 1.0498
2.5577 2.9361
Having estimated a linear model so easily, one might choose to expand to higher order models. At some point, you will see numerical conditioning problems. You might be forced to work with orthogonal polynomials, or at the very least, to consider centering and scaling the variables to lie in the interval [-1,1]. This will greatly reduce the potential for singular or nearly singular matrices.
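The effect of scaling on conditioning is easy to see in one variable. This sketch (the data range, the fifth order model, and the variable names are assumptions of mine) compares the condition number of the regression matrix before and after mapping x into [-1,1].

```matlab
x = (1:20)';

% Map x linearly onto the interval [-1,1].
xs = 2*(x - min(x))/(max(x) - min(x)) - 1;

% Regression matrices for a fifth order polynomial, highest power first.
Araw = [x.^5, x.^4, x.^3, x.^2, x, ones(size(x))];
Ascaled = [xs.^5, xs.^4, xs.^3, xs.^2, xs, ones(size(xs))];

condraw = cond(Araw);
condscaled = cond(Ascaled);
fprintf('Condition number, raw x:    %g\n', condraw);
fprintf('Condition number, scaled x: %g\n', condscaled);
```

Remember that coefficients estimated on the scaled variable apply to xs, not to x, so new points must be transformed the same way before the model is evaluated.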
Even so, what happens when we try to estimate a higher order model here? Admittedly, we don’t have much data to estimate that model.
model = polyfitn([x(:),y(:)],Z(:),'constant x y x^2 x*y y^2')
model =
ModelTerms: [6x2 double]
Coefficients: [-3.0534 0.99008 2.5181 -0.041565 0.099913 -0.017752]
ParameterVar: [0.49589 0.076522 0.37543 0.0016708 0.0035087 0.021052]
ParameterStd: [0.7042 0.27663 0.61272 0.040875 0.059234 0.14509]
R2: 0.99386
RMSE: 0.20519
VarNames: {'x' 'y'}
Yes, we can estimate a fully quadratic model on this data. In fact though, we know the true underlying relationship to have been linear. So we might hope to see the higher order terms turn out to be statistically insignificant. Look at the ratio of the parameter coefficients to the parameter standard deviations. The absolute value of that ratio should be compared to a Student’s t statistic, here with 15-6=9 degrees of freedom.
model.Coefficients./model.ParameterStd
ans =
-4.336 3.5791 4.1097 -1.0169 1.6867 -0.12235
The critical level for the Student’s t is given here by TINV, from the stats toolbox, or from a set of tables.
tinv(0.975,9)
ans =
2.2622
So it would appear that the x^2, x*y, and y^2 terms are not significantly different from zero in this model, based on this very limited amount of data. I won’t go any further along these lines. Better is to refer you to any competent book on regression analysis. I like Draper and Smith, as that is what I used many years ago.
Higher order polynomial models are not in general something I recommend though. Linear or quadratic models often are useful. They provide information about the parameters, about the relative importance of the variables in a model. But when people start throwing fifth or tenth order models in several dimensions at their data, I’ll claim this is silly. Many dozens of coefficients will mean nothing to them anyway. Worse, high order polynomial models do terrible things when they are used to extrapolate. They can even do strange and nasty things between your data points. (I call this “intrapolation”.)
In this event, I’ll always recommend a spline type model for your system. For example, my own gridfit from the file exchange is one such member of the general class of spline-like models. Or use the splines toolbox here. These tools do not give you a model that you can simply write down. But why does that matter, since an adequate polynomial model might have had dozens of nonsensical coefficients for the fit?
Finally, it seems that I often get requests from people wanting to know how to find the nonlinear “function” that fits their surface. The nonlinear modeling question is an entirely different one, worth several books worth of writing. Having written virtually a small book here, I’ll stop now.
HTH,
John
Hi John,
Wow! thanks so much for such a detailed reply- Your examples really help- and I can see where I was going wrong, the biggest problem was using meshgrid correctly for my scenario and getting the syntax right for polyfitn and the backslash.
polyfitn will definitely be the most useful to me to begin with I think- as I want to experiment with fitting to a surface that looks like a peak/impulse, so I’m looking at at least order 2 to get a parabola shape function.
Thanks again.
Gareth