{"id":3177,"date":"2018-12-19T10:16:55","date_gmt":"2018-12-19T15:16:55","guid":{"rendered":"https:\/\/blogs.mathworks.com\/loren\/?p=3177"},"modified":"2018-12-19T10:16:55","modified_gmt":"2018-12-19T15:16:55","slug":"a-couple-of-topics-in-curve-fitting","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/loren\/2018\/12\/19\/a-couple-of-topics-in-curve-fitting\/","title":{"rendered":"A Couple of Topics in Curve Fitting"},"content":{"rendered":"<div class=\"content\"><!--introduction--><pre>               \"All the noise, noise, noise, NOISE!\"<\/pre><pre>                       -- The Grinch<\/pre><p>Today's guest blogger is Tom Lane. Tom has been a MathWorks developer since 1999, working primarily on the <a href=\"https:\/\/www.mathworks.com\/products\/statistics.html\">Statistics and Machine Learning Toolbox<\/a>. He'd like to share with you a couple of issues that MATLAB users repeatedly encounter.<\/p><!--\/introduction--><h3>Contents<\/h3><div><ul><li><a href=\"#98a4abb2-6cc6-4312-b441-abe3fec3ef65\">Curve Fitting and Transformations<\/a><\/li><li><a href=\"#2448f25b-1c5c-4a85-a384-601aed1ff9df\">Curve Fitting vs. Distribution Fitting<\/a><\/li><li><a href=\"#168d2830-7fc8-4ad7-9766-73cccf93f251\">Conclusion<\/a><\/li><\/ul><\/div><h4>Curve Fitting and Transformations<a name=\"98a4abb2-6cc6-4312-b441-abe3fec3ef65\"><\/a><\/h4><p>The topic for today is curve fitting. Let's look at a simple exponential function:<\/p><pre class=\"codeinput\">rng <span class=\"string\">default<\/span>\r\nx = rand(10,1);\r\ny = 10*exp(-5*x);\r\n<\/pre><p>We can plot this, but many of the values are smooshed up against the <tt>X<\/tt> axis. 
The semilogy function can help with that, and also turn the relationship into a straight line.<\/p><pre class=\"codeinput\">subplot(1,2,1);\r\nplot(x,y,<span class=\"string\">'x'<\/span>);\r\nsubplot(1,2,2);\r\nsemilogy(x,y,<span class=\"string\">'x'<\/span>);  <span class=\"comment\">% log(y) = -5*x + log(10)<\/span>\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"http:\/\/blogs.mathworks.com\/images\/loren\/2018\/cf_topicsLS_01.png\" alt=\"\"> <p>Suppose we have the <tt>X<\/tt> and <tt>Y<\/tt> values, and we can see or guess the functional form, but we don't know the constant values 10 and 5. We can estimate them from the data. We can do this using either the original data shown at the left, or the <tt>log(Y)<\/tt> transformed data shown at the right.<\/p><pre class=\"codeinput\">p1 = fitnlm(x,y,<span class=\"string\">'y ~ b1*exp(b2*x)'<\/span>,[1 1])\r\np2 = polyfit(x,log(y),1); p2(2) = exp(p2(2))\r\n<\/pre><pre class=\"codeoutput\">p1 = \r\nNonlinear regression model:\r\n    y ~ b1*exp(b2*x)\r\n\r\nEstimated Coefficients:\r\n          Estimate    SE    tStat    pValue\r\n          ________    __    _____    ______\r\n    b1       10       0      Inf       0   \r\n    b2       -5       0     -Inf       0   \r\n\r\nNumber of observations: 10, Error degrees of freedom: 8\r\nRoot Mean Squared Error: 0\r\nR-Squared: 1,  Adjusted R-Squared 1\r\nF-statistic vs. zero model: Inf, p-value = 0\r\np2 =\r\n   -5.0000   10.0000\r\n<\/pre><p>Both fits give the same coefficients.<\/p><p>Some time ago, a MATLAB user reported that he was fitting this curve to his own data, and getting different parameter estimates from the ones given by other software. They weren't dramatically different, but larger than could be attributed to rounding error. They were different enough to raise suspicion. 
If we add a little noise to <tt>log(y)<\/tt>, we can reproduce what the user saw.<\/p><pre class=\"codeinput\">y = exp(log(y) + randn(size(y))\/10);\r\np1 = fitnlm(x,y,<span class=\"string\">'y ~ b1*exp(b2*x)'<\/span>,[1 1])\r\np2 = polyfit(x,log(y),1); p2(2) = exp(p2(2))\r\nxx = linspace(0,1)';\r\nsubplot(1,1,1)\r\nplot(x,y,<span class=\"string\">'x'<\/span>,  xx,predict(p1,xx),<span class=\"string\">'r-'<\/span>)\r\n<\/pre><pre class=\"codeoutput\">p1 = \r\nNonlinear regression model:\r\n    y ~ b1*exp(b2*x)\r\n\r\nEstimated Coefficients:\r\n          Estimate      SE        tStat       pValue  \r\n          ________    _______    _______    __________\r\n    b1     10.015     0.34113      29.36    1.9624e-09\r\n    b2    -4.8687     0.23021    -21.149     2.625e-08\r\n\r\nNumber of observations: 10, Error degrees of freedom: 8\r\nRoot Mean Squared Error: 0.141\r\nR-Squared: 0.997,  Adjusted R-Squared 0.996\r\nF-statistic vs. zero model: 1.91e+03, p-value = 1.91e-11\r\np2 =\r\n   -4.8871   10.0008\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"http:\/\/blogs.mathworks.com\/images\/loren\/2018\/cf_topicsLS_02.png\" alt=\"\"> <p>Which estimates to believe? Well, it turns out that once you add noise, these models are no longer equivalent. Adding noise to the original data is one thing. Adding noise to the log data gives noise values that, back on the original scale, grow with the value of <tt>y<\/tt>, because exp(log(y) + e) = y*exp(e). 
We can see the difference between the two more easily if we generate a larger set of data.<\/p><pre class=\"codeinput\">rng <span class=\"string\">default<\/span>\r\nX = rand(100,1);\r\nY = 10*exp(-5*X);\r\nsubplot(1,2,1);\r\nY1 = Y + randn(size(Y))\/5;\r\nplot(X,Y1,<span class=\"string\">'x'<\/span>, xx,10*exp(-5*xx),<span class=\"string\">'r-'<\/span>);\r\nsubplot(1,2,2);\r\nY2 = exp(log(Y) + randn(size(Y))\/10);\r\nplot(X,Y2,<span class=\"string\">'x'<\/span>, xx,10*exp(-5*xx),<span class=\"string\">'r-'<\/span>);\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"http:\/\/blogs.mathworks.com\/images\/loren\/2018\/cf_topicsLS_03.png\" alt=\"\"> <p>It's hard to see at the top of the plots, but near <tt>y=0<\/tt> we can see that the noise is larger on the left than on the right. On the left the noise is additive. On the right, the noise is multiplicative.<\/p><p>Which model is correct or appropriate? We'd have to understand the data to decide that. One clue is that if negative values are plausible when the curve approaches zero, then an additive model may be appropriate. If the noise is more plausibly described in terms like +\/- 10%, then the multiplicative model may be appropriate.<\/p><p>Now, not all models are easily transformed this way. A multiplicative model may still be appropriate even when no such simple transformation exists. 
Fortunately, the <tt><a href=\"https:\/\/www.mathworks.com\/help\/stats\/fitnlm.html\">fitnlm<\/a><\/tt> function from the Statistics and Machine Learning Toolbox has a feature that lets you specify the so-called \"error model\" directly.<\/p><pre class=\"codeinput\">p1 = fitnlm(x,y,<span class=\"string\">'y ~ b1*exp(b2*x)'<\/span>,[1 1],<span class=\"string\">'ErrorModel'<\/span>,<span class=\"string\">'proportional'<\/span>)\r\n<\/pre><pre class=\"codeoutput\">p1 = \r\nNonlinear regression model:\r\n    y ~ b1*exp(b2*x)\r\n\r\nEstimated Coefficients:\r\n          Estimate      SE       tStat       pValue  \r\n          ________    ______    _______    __________\r\n    b1     9.9973     0.8188      12.21    1.8787e-06\r\n    b2    -4.8771     0.1162    -41.973     1.144e-10\r\n\r\nNumber of observations: 10, Error degrees of freedom: 8\r\nRoot Mean Squared Error: 0.121\r\nR-Squared: 0.974,  Adjusted R-Squared 0.97\r\nF-statistic vs. zero model: 344, p-value = 1.75e-08\r\n<\/pre><p>This model isn't exactly the same as the ones before. It's modeling additive noise, but with a scale factor that increases with the size of the fitted function. But rest assured, it takes into account the different noise magnitudes, so it may be useful for data having that characteristic.<\/p><h4>Curve Fitting vs. Distribution Fitting<a name=\"2448f25b-1c5c-4a85-a384-601aed1ff9df\"><\/a><\/h4><p>Now that I have your attention, though, I'd like to address another topic related to curve fitting. 
This time let's consider the Weibull curve.<\/p><pre class=\"codeinput\">rng <span class=\"string\">default<\/span>\r\nx = wblrnd(5,2,100,1);\r\nsubplot(1,1,1)\r\nhistogram(x, <span class=\"string\">'BinEdges'<\/span>,0:14, <span class=\"string\">'Normalization'<\/span>,<span class=\"string\">'pdf'<\/span>)\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"http:\/\/blogs.mathworks.com\/images\/loren\/2018\/cf_topicsLS_04.png\" alt=\"\"> <p>The Weibull density has this form:<\/p><pre class=\"codeinput\">X = linspace(0,20);\r\nA = 5; B = 2;\r\nY = (B\/A) * (X\/A).^(B-1) .* exp(-(X\/A).^B);\r\nhold <span class=\"string\">on<\/span>\r\nplot(X,Y)\r\nhold <span class=\"string\">off<\/span>\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"http:\/\/blogs.mathworks.com\/images\/loren\/2018\/cf_topicsLS_05.png\" alt=\"\"> <p>Suppose we don't know the parameters <tt>A<\/tt> and <tt>B<\/tt>. Once again, there are two ways of estimating them. First, we could get the bin centers and bin heights, and use curve fitting to estimate the parameters.<\/p><pre class=\"codeinput\">heights = histcounts(x, <span class=\"string\">'BinEdges'<\/span>,0:14, <span class=\"string\">'Normalization'<\/span>,<span class=\"string\">'pdf'<\/span>);\r\ncenters = (0.5:1:13.5)';\r\nfitnlm(centers,heights,@(params,x)wblpdf(x,params(1),params(2)),[2 2])\r\n<\/pre><pre class=\"codeoutput\">ans = \r\nNonlinear regression model:\r\n    y ~ wblpdf(x,params1,params2)\r\n\r\nEstimated Coefficients:\r\n               Estimate      SE       tStat       pValue  \r\n               ________    _______    ______    __________\r\n    params1     4.6535     0.18633    24.975    1.0287e-11\r\n    params2     1.9091     0.10716    17.815    5.3598e-10\r\n\r\nNumber of observations: 14, Error degrees of freedom: 12\r\nRoot Mean Squared Error: 0.0182\r\nR-Squared: 0.937,  Adjusted R-Squared 0.932\r\nF-statistic vs. 
zero model: 197, p-value = 6.62e-10\r\n<\/pre><p>The alternative is not to treat this like a curve fitting problem, but to treat it like a distribution fitting problem instead. After all, there is no need for us to artificially bin the data before fitting. Let's just fit the data as we have it.<\/p><pre class=\"codeinput\">fitdist(x,<span class=\"string\">'weibull'<\/span>)\r\n<\/pre><pre class=\"codeoutput\">ans = \r\n  WeibullDistribution\r\n\r\n  Weibull distribution\r\n    A = 4.76817   [4.29128, 5.29806]\r\n    B = 1.96219   [1.68206, 2.28898]\r\n\r\n<\/pre><p>It's much simpler to call the distribution fitting function than to set this up as a curve fitting function. But in case that doesn't convince you, I would like to introduce the concept of statistical efficiency. Notice that the distribution fitting parameters are closer in this case to the known values 5 and 2. Is that just a coincidence? A method is statistically more efficient than another if it can get the same accuracy using less data. Let's have a contest. We will fit both of these models 1000 times, collect the estimates of <tt>A<\/tt>, and see which one is more variable.<\/p><pre class=\"codeinput\">AA = zeros(1000,2);\r\n<span class=\"keyword\">for<\/span> j=1:1000\r\n    x = wblrnd(5,2,100,1);\r\n    heights = histcounts(x, <span class=\"string\">'BinEdges'<\/span>,0:14, <span class=\"string\">'Normalization'<\/span>,<span class=\"string\">'pdf'<\/span>);\r\n    f = fitnlm(centers,heights,@(params,x)wblpdf(x,params(1),params(2)),[2 2]);\r\n    p = fitdist(x,<span class=\"string\">'weibull'<\/span>);\r\n\r\n    AA(j,1) = f.Coefficients.Estimate(1);\r\n    AA(j,2) = p.A;\r\n<span class=\"keyword\">end<\/span>\r\nmean_AA = mean(AA)\r\nstd_AA = std(AA)\r\n<\/pre><pre class=\"codeoutput\">mean_AA =\r\n    5.0145    4.9973\r\nstd_AA =\r\n    0.3028    0.2566\r\n<\/pre><p>We have a winner! 
Both values have a mean that is close to the known values of 5, but the values produced by distribution fitting are less variable.<\/p><h4>Conclusion<a name=\"168d2830-7fc8-4ad7-9766-73cccf93f251\"><\/a><\/h4><p>When you approach a curve fitting problem, I recommend you consider a few things:<\/p><div><ol><li>Is this a curve fitting problem at all? If it involves the relationship between two variables, it's curve fitting. If it involves the distribution of a single variable, try to approach it as a distribution fitting problem.<\/li><li>Is the function one that can transform to a simpler form, such as a linear relationship? If so, consider doing that to make the fitting process more efficient.<\/li><li>However, consider the noise. Does the noise seem additive or multiplicative?  Does the noise vary with the magnitude of <tt>Y<\/tt>? Are negative <tt>Y<\/tt> values plausible? These questions can help you decide whether to fit on the original or transformed scale, and whether to specify an error model.<\/li><\/ol><\/div><p>What criteria do you use to choose a model for fitting your data?  
Let us know <a href=\"https:\/\/blogs.mathworks.com\/loren\/?p=3177#respond\">here<\/a>.<\/p><p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br>Published with MATLAB&reg; R2018b<br><\/p><\/div>","protected":false},"excerpt":{"rendered":"<div class=\"overview-image\"><img decoding=\"async\"  class=\"img-responsive\" src=\"http:\/\/blogs.mathworks.com\/images\/loren\/2018\/cf_topicsLS_05.png\" onError=\"this.style.display ='none';\" \/><\/div><!--introduction--><pre>               \"All the noise, noise, noise, NOISE!\"<\/pre><pre>                       -- The Grinch<\/pre><p>Today's guest blogger is Tom Lane. Tom has been a MathWorks developer since 1999, working primarily on the <a href=\"https:\/\/www.mathworks.com\/products\/statistics.html\">Statistics and Machine Learning Toolbox<\/a>. He'd like to share with you a couple of issues that MATLAB users repeatedly encounter.... <a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/loren\/2018\/12\/19\/a-couple-of-topics-in-curve-fitting\/\">read more >><\/a><\/p>","protected":false},"author":39,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[47,48],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/3177"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/users\/39"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/comments?post=3177"}],"version-history":[{"count":3,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/3177\/revisions"}],"predecessor-version":[{"id":3185,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/3177\/revisions\/3185"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/medi
a?parent=3177"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/categories?post=3177"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/tags?post=3177"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}