Loren on the Art of MATLAB

July 8th, 2008

HELP – I Got the Wrong Answer for My Optimization!

How often I have heard this!?! It turns out there are a bunch of reasons why you might not get the answer you expect when performing an optimization task. These possibilities include the answer not being unique, the optimization method used isn't global and therefore returns a local optimal solution, and the problem, as stated, isn't the problem you really want to solve. Let me tell you more about a recent case.

Contents

What We Heard

We heard from a user that fmincon, from Optimization Toolbox, gets the wrong answer for the problem posed in myOptFun. In fact, the optimization gets stuck at the initial point.

The Function

Let's take a quick look at the function. The user told us that it was a simple paraboloid of revolution. I'll calculate z values on a grid and display as a surface.

[xx,yy] = meshgrid(-2:.1:2);
zz = xx.^2+yy.^2;
surfc(xx,yy,zz)
shading interp
colorbar
view(-37.5,50)

Run the Example

First I do some housekeeping.

opt = optimset('display', 'final');
warning off optim:fmincon:SwitchingToMediumScale

Next, I'll run the example using the user's objective function.

lb = [-5;-5];
ub = [5;5];
x0 = [1;1];
[x fval exitflag output lamb grad hess] = ...
           fmincon(@myOptFun,x0,[],[],[],[],lb,ub,[],opt);
Optimization terminated: first-order optimality measure less
 than options.TolFun and maximum constraint violation is less
 than options.TolCon.
No active inequalities.

According the user, the optimal answer is 0 and can be found at (0,0). Instead, look what we find.

x
x =
     1
     1
fval
fval =
     2

Why Did the Optimization Not Find Zero?

The source of the confusion is in myOptFun. Now let me show you the code.

type myOptFun
function f = myOptFun(X)
formule = 'x.*x+y.*y';
formule = strrep(formule,'x',num2str(X(1)));
formule = strrep(formule,'y',num2str(X(2)));
f = eval(formule);

It's the use of num2str specifically. As the documentation says, the default format for num2str is '%11.4g' which means I am calculating results to 4 significant figures. Calling num2str(1), the result is 1 and the same is true for num2str(1+1e-5). In other words the accuracy of the customer's objective function cannot be expected to be better than 1e-4. However, the default tolerances for TolFun and TolX are 1e-6. More importantly DiffMinChange ("minimum change in variables for finite difference derivatives") is 1e-8. When the first finite difference calculations are done, the approximate first-order derivatives of the customer's objective function are zero and fmincon stops.

How to Fix the Optimization

I have several options for obtaining a "better" result. I can

  • increase the accuracy of the objective function
  • adjust the relevant tolerances to be larger to accomodate the resolution of the objective function
  • provide exact first-order derivatives (TolFun and TolX still need to be less than 1e-4).

Change the Objective Function

Let's try the same optimization with a modified objective function.

[xFixed fvalFixed exitflag output lamb grad hess] = ...
           fmincon(@myOptFunFixed,x0,[],[],[],[],lb,ub,[],opt);
Optimization terminated: first-order optimality measure less
 than options.TolFun and maximum constraint violation is less
 than options.TolCon.
No active inequalities.
xFixed
xFixed =
  1.0e-015 *
    0.4441
   -0.2220
fvalFixed
fvalFixed =
  2.4652e-031

Have You Hit a Similar Snag?

If you have hit a similar snag, it would be great if you could share it here.


Get the MATLAB code

Published with MATLAB® 7.6

16 Responses to “HELP – I Got the Wrong Answer for My Optimization!”

  1. Arthur replied on :

    Can you post the code for myOptFunFixed, so we can see how you decided to rewrite the function?

    I’ve had similar problems: I was trying to minimize a complex function that was just a little noisy. My function was noisy because it required solving a nonlinear equation and approximating an integral. Solving them to many decimal places was very time-consuming, but relaxing the solvers was causing problems with the optimization.

    That’s when I discovered DiffMinChange, which let me ensure that the finite differencing was larger than the magnitude of the noise.

  2. Loren replied on :

    Sure, Arthur, sorry for the oversight! Here it is:

    function f = myOptFunFixed(X)
    f = X(1,:).^2+X(2,:).^2;
    

    –Loren

  3. Dan K replied on :

    Loren,
    Thank you once again for a post on a topic of great interest. Two thoughts: First, why on earth would somebody want to use the code you described as the original. Won’t using the string replacements and eval vastly increase processing time? Second, in answer to your question,the problem which I had a great deal of difficulty with was a case where I had a huge number of local minima. I was trying to fit a noisy sinusoid, with unknown frequency, amplitudes, phase, and offset. Of course for any test frequency you get a local minima when you adjust the phase. I finally ended up having to drop the direct approach, and set up a genetic optimizer with a godawful mutation function (which occasionally would calculate the frequency change to effect exactly 2 pi equivalent phase jump).
    Dan

  4. Brian replied on :

    I am in awe of the original code. I can’t understand how someone ends up thinking that’s the way to go. It’s totally missing a basic idea of Matlab (and programming in general): you can assign numbers to variables.

  5. Loren replied on :

    Brian-

    I wonder if the user had the numbers in strings from reading in a text file in some way. In any case, their “solution” certainly led to difficulties on multiple levels.

    –Loren

  6. Stuart Kozola replied on :

    Loren is probably right that the user was reading in a text file. I’ve run into a similar problem when reading in input/output files from an external program. The rounding and truncation that occurs in the file i/o operations can cause the same problem as the strrep function did in this example. In my case, I used DiffMinChange (fmincon) to keep the solver getting stuck.

    -Stu

  7. Ben replied on :

    It’s worth pointing out that fmincon (R2007b) can get stuck at the initial point even for “nice” functions, like

    >>fmincon(@cos,0,[],[],[],[],-1,1)

    which returns 0 the local maximum. Not that we’d do that in practice though :)

  8. Loren replied on :

    Ben-

    What you mention is not about the quality of the objective function itself. In fact, when I run the problem in R2008b, I get a warning before I get my answer:

    Warning: Trust-region-reflective method does not currently solve this type of problem,
     using active-set (line search) instead.
    > In fmincon at 437
    Optimization terminated: first-order optimality measure less
     than options.TolFun and maximum constraint violation is less
     than options.TolCon.
    

    which hints at the answer (and the problem as posed, perhaps) needing scrutiny.

    –loren

  9. lishuqin replied on :

    ti have the problem:Maximum number of function evaluations exceeded;
    increase OPTIONS.MaxFunEvals.
    my function as follows:

    function f=gear(x)
    f=1*0.7408*x(1)^3*x(2)^3*x(3)*[1+(1-x(3))^2+(1-x(3))]+0*[-1.88+4.858/x(2)]
    function [g,q]=mycon(x)
    g(1)=160057/[x(1)^3*x(2)^3*(1-0.5*x(3))^2*x(3)]^(-0.5)-664;
    g(2)=398121*x(1)^(-3)*x(2)^(-2)*(1-0.5*x(3))^(-2)*x(3)^(-1)-480;
    g(3)=358848*x(1)^(-3)*x(2)^(-2)*(1-0.5*x(3))^(-2)*x(3)^(-1)-464;
    g(4)=-0.48+4.858/x(2);
    g(5)=4-x(1);
    g(6)=0.25-x(3);
    g(7)=x(3)-0.33;
    g(8)=x(2)-40;
    g(9)=x(1)-10;
    g(10)=11-x(2);
    q=[];

    options=optimset(‘largescale’,'off’);x0=[4,14,0.3];[x,fval]=fmincon(@gear,x0,[ ],[ ],[ ],[ ],[ ],[ ],@mycon,options);please give me help ,thanks.

  10. Loren replied on :

    Lishuqin-

    The message is telling you what the issue is (and how to change it). If that doesn’t help, please contact technical support – link at the right.

    –Loren

  11. shankar replied on :

    Hi, did anyone figure out what Ben asked? Why does matlab get stuck at intitial point for minimizing even cosine. I am having a similiar problem with my objective function getting stuck at initial point. as soon as i perturb the initial point ever so slightly, it evolves very well. but unfortunately i fear that the direction of the perturbation might be influencing the evolution of the solution.

  12. Alan Weiss replied on :

    When your initial point is stationary (i.e., has gradient = 0 or satisfies the KKT conditions), the solver has no way to know whether the point is a local minimum, local maximum, or is simply an indefinite stationary point. The issue is addressed in these sections of the documentation:
    http://www.mathworks.com/access/helpdesk/help/toolbox/optim/ug/br44i40.html#br44j7o
    http://www.mathworks.com/access/helpdesk/help/toolbox/optim/ug/br44i40.html#br44i5e
    http://www.mathworks.com/access/helpdesk/help/toolbox/optim/ug/br44i40.html#brhkghv-65

    The final link is particularly relevant, in the it explains the difference between a local and a global solution, and why the initial point is so important.

  13. James Allison replied on :

    Regarding the answer to Ben’s comment: fmincon is a gradient-based algorithm. It’s search direction depends on the gradient of the function at a point, and it stops when first-order conditions are satisfied, i.e., the gradient is zero if no constraints are active. In this case Ben is starting fmincon at a point where the gradient is zero, so of course the algorithm doesn’t move. Another term like this is a ‘stationary point’. This is a property of gradient-based algorithms, and not just MATLAB’s fmincon. It’s expected behavior. In cases like this you might want to try a multi-start approach, or an initial search with a gradient-free algorithm. The new global optimization toolbox can handle problems like this very well (multiple stationary points, noisy responses, etc.).

  14. shankar replied on :

    Thanks alan and james. i understand what you are saying. i am especially surprised as there seems to be no code which can make a point descend from a maxima or even a saddle point. the only way i could get anything out of my matlab code is to perturb the initial point slightly. but then since i am looking at the subsequent evolution of the problem, i fear i might be getting the wrong or right answer depending upon how i perturb the initial point, however so slightly (its a stability/bifurcation problem). strange indeed that numerical guys have not figured out this issue.

  15. James replied on :

    Shankar: while algorithms in the optimization toolbox may not be able to descend from a maximum or a saddle point, gradient-free algorithms in the Global Optimization Toolbox certainly can, such as ga or patternsearch. It’s a matter of choosing an algorithm appropriate for your problem.

    If you don’t know a lot about what the problem space looks like, and have some indication that multiple stationary points may exist, then a hybrid approach can be helpful, i.e., starting off with a gradient-free method with relatively loose tolerances, and then using the result as a starting point for a gradient-based method to identify a precise solution. The new GlobalSearch and MultiStart algorithms may also be very helpful.

  16. shankar replied on :

    Hi james, great information. Thanks. in fact what you suggested is what i kind of did. wrote down the hessian at the initial point, and minimizing it gives us the bifurcation modes. However nonlinear an energy be, usually its hessian’s minimization gives a linear eqn. for bifurcation. Since the bifurcation problem is an eigenvalue, discretizing gives a simple AX=tX matrix eigenvalue problem. Choosing smallest eigenvalue of A, and its corresponding eigenvector, I used that eigenvector multiplied by a (any) small number to perturb the intial point. As soon as it is perturbed, fmincon works very well. Also if the perturbation is not correct direction, ideally it should fall back into the maxima.


MathWorks
Loren Shure works on design of the MATLAB language at MathWorks. She writes here about once a week on MATLAB programming and related topics.

These postings are the author's and don't necessarily represent the opinions of The MathWorks.