{"id":3067,"date":"2018-09-25T07:11:03","date_gmt":"2018-09-25T12:11:03","guid":{"rendered":"https:\/\/blogs.mathworks.com\/loren\/?p=3067"},"modified":"2018-09-21T00:35:50","modified_gmt":"2018-09-21T05:35:50","slug":"surrogate-optimization-for-expensive-computations","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/loren\/2018\/09\/25\/surrogate-optimization-for-expensive-computations\/","title":{"rendered":"Surrogate Optimization for Expensive Computations"},"content":{"rendered":"<p>I'd like to introduce this week's guest blogger <a href=\"https:\/\/www.mathworks.com\/matlabcentral\/profile\/authors\/1033975-alan-weiss\">Alan Weiss<\/a>. Alan writes documentation for mathematical toolboxes here at MathWorks. <\/p>\r\n      <p>Hi, folks. This post is about a new solver in Global Optimization Toolbox, <a href=\"https:\/\/www.mathworks.com\/help\/gads\/surrogateopt.html\"> <inline style=\"font-family: monospace, monospace; font-size: inherit;\">surrogateopt<\/inline><\/a>. This solver differs from many of our previous solvers in the following ways: <\/p>\r\n      <ul type=\"square\" style=\"list-style-type:square\">\r\n         <li>It is best suited to optimizing expensive functions, meaning functions that are time-consuming to evaluate. Typically, you\r\n            use \r\n            <inline style=\"font-family: monospace, monospace; font-size: inherit;\">surrogateopt<\/inline> to optimize a simulation or ODE or other expensive objective function.\r\n         <\/li>\r\n         <li>It is inherently a global solver. The longer you let it run, the more it searches for a global solution.<\/li>\r\n         <li>It requires no start point, just problem bounds. 
Of course, you can specify start points if you like.<\/li>\r\n      <\/ul>\r\n      <p>The solver is closest in concept to a Statistics and Machine Learning Toolbox&#x2122; solver, <a href=\"https:\/\/www.mathworks.com\/help\/stats\/bayesopt.html\"> <inline style=\"font-family: monospace, monospace; font-size: inherit;\">bayesopt<\/inline><\/a>. Both of these solvers work by internally creating a model of the objective function, and using the internal model to attempt to find better points to evaluate. They each attempt to find a global solution, but they use different techniques to do so. They both have overhead (take noticeable time) in choosing evaluation points. And they both work best on low-dimensional problems. <inline style=\"font-family: monospace, monospace; font-size: inherit;\">surrogateopt<\/inline> is a bit easier to use, largely because <inline style=\"font-family: monospace, monospace; font-size: inherit;\">bayesopt<\/inline> requires you to create problem variables one at a time. But <inline style=\"font-family: monospace, monospace; font-size: inherit;\">bayesopt<\/inline> has several compelling features, too, including many excellent plots. <\/p>\r\n      <h3>Function to Minimize<\/h3>\r\n      <p>Let's try to minimize a simple function of two variables:<\/p>\r\n      <p><img decoding=\"async\" src=\"https:\/\/blogs.mathworks.com\/loren\/files\/loren_1.png\">, <\/p>\r\n      <p>where <img decoding=\"async\" src=\"https:\/\/blogs.mathworks.com\/loren\/files\/loren_2.png\"> and <img decoding=\"async\" src=\"https:\/\/blogs.mathworks.com\/loren\/files\/loren_3.png\">. 
<\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">x0 = [1,2];\r\nx1 = [-3,-5];\r\nfun = @(x)sum((x\/5).^2,2)-4*sech(sum((x-x0).^2,2))-6*sech(sum((x-x1).^2,2));\r\n[X,Y] = meshgrid(linspace(-10,10));\r\nZ = fun([X(:),Y(:)]);\r\nZ = reshape(Z,size(X));\r\nsurf(X,Y,Z)\r\nview([53,3])\r\n<\/pre><img decoding=\"async\" src=\"https:\/\/blogs.mathworks.com\/loren\/files\/loren_4.png\"><p>You can see that there are three local minima, the global minimum near [\u20133,\u20135], one that is nearly as good near [1,2], and a poor one near [0,0]. You can see this because MATLAB\u00ae evaluated 10,000 points for the plot. The value of <inline style=\"font-family: monospace, monospace; font-size: inherit;\">surrogateopt<\/inline> is to search for a global minimum using relatively few function evaluations. <\/p>\r\n      <h3>Try \r\n         <inline style=\"font-family: monospace, monospace; font-size: inherit;\">surrogateopt<\/inline>\r\n      <\/h3>\r\n      <p>To search for a global minimum, set finite bounds and call <inline style=\"font-family: monospace, monospace; font-size: inherit;\">surrogateopt<\/inline>. Because <inline style=\"font-family: monospace, monospace; font-size: inherit;\">surrogateopt<\/inline> is designed for objective functions that take a long time to compute, use <inline style=\"font-family: monospace, monospace; font-size: inherit;\">slowfun<\/inline>, which is a version of <inline style=\"font-family: monospace, monospace; font-size: inherit;\">fun<\/inline> that has an extra 0.5 s delay in every function evaluation. 
<\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">type <span style=\"color:rgb(160, 32, 240);\">slowfun<\/span>\r\n<\/pre><pre class=\"output\" style=\"font-family:monospace;border:none;background-color:white;color:rgba(64, 64, 64, 1);\">function y = slowfun(x,x0,x1)\r\n\r\ny = sum((x\/5).^2,2)-4*sech(sum((x-x0).^2,2))-6*sech(sum((x-x1).^2,2));\r\npause(0.5)\r\n<\/pre><p>Set bounds and run the optimization of <inline style=\"font-family: monospace, monospace; font-size: inherit;\">slowfun<\/inline>. Time the results using tic\/toc. <\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">fun = @(x)slowfun(x,x0,x1);\r\nlb = [-10,-10];\r\nub = -lb;\r\nrng <span style=\"color:rgb(160, 32, 240);\">default<\/span> <span style=\"color:rgb(34, 139, 34);\">% for reproducibility<\/span>\r\ntic\r\n[xsol,fval] = surrogateopt(fun,lb,ub)\r\n<\/pre><img decoding=\"async\" src=\"https:\/\/blogs.mathworks.com\/loren\/files\/loren_5.png\"><pre class=\"output\" style=\"font-family:monospace;border:none;background-color:white;color:rgba(64, 64, 64, 1);\">Surrogateopt stopped because it exceeded the function evaluation limit set by \r\n'options.MaxFunctionEvaluations'.\r\n<\/pre><pre class=\"output\" style=\"font-family:monospace;border:none;background-color:white;color:rgba(64, 64, 64, 1);\">xsol = <em>1\u00d72<\/em>\r\n\r\n    0.8779    1.7580\r\n\r\n<\/pre><pre class=\"output\" style=\"font-family:monospace;border:none;background-color:white;color:rgba(64, 64, 64, 1);\">fval = -3.8348\r\n<\/pre><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: 
monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">toc\r\n<\/pre><pre class=\"output\" style=\"font-family:monospace;border:none;background-color:white;color:rgba(64, 64, 64, 1);\">Elapsed time is 114.326822 seconds.\r\n<\/pre><p> <inline style=\"font-family: monospace, monospace; font-size: inherit;\">surrogateopt<\/inline> found the second-best solution near [1,2]. The optimization took 114 seconds, of which 100 seconds were due to the artificial 0.5 s pause in each function evaluation. <\/p>\r\n      <h3>Speed Optimization via Parallel Computing<\/h3>\r\n      <p>To try to find a better solution, run the solver again. To run faster, set the <inline style=\"font-family: monospace, monospace; font-size: inherit;\">UseParallel<\/inline> option to <inline style=\"font-family: monospace, monospace; font-size: inherit;\">true<\/inline>. (Parallel surrogate optimization requires a Parallel Computing Toolbox&#x2122; license.) This timing result comes from a 6-core computer. 
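Conceptually, the parallel speedup comes from evaluating several candidate points at once while each expensive evaluation mostly waits. Here is a minimal Python sketch of that idea, using an assumed `slow_objective` with the same artificial 0.5 s delay; this illustrates the principle only, not the surrogateopt implementation:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_objective(x):
    """Stand-in for an expensive objective: cheap math plus a 0.5 s delay."""
    time.sleep(0.5)
    return sum((xi / 5) ** 2 for xi in x)

candidates = [(1, 2), (-3, -5), (0, 0), (4, 4), (-2, 1), (3, -3)]

# Serial: six evaluations run back to back, roughly 3 s of wall-clock time.
t0 = time.perf_counter()
serial = [slow_objective(x) for x in candidates]
serial_time = time.perf_counter() - t0

# Parallel: the delay-bound evaluations overlap, roughly 0.5 s on 6 workers.
t0 = time.perf_counter()
with ThreadPoolExecutor(max_workers=6) as pool:
    parallel = list(pool.map(slow_objective, candidates))
parallel_time = time.perf_counter() - t0

assert serial == parallel and parallel_time < serial_time
```

With six workers the wall-clock ratio approaches six, which is why a 6-core run finishes several times faster than the serial run while returning the same objective values.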
<\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">options = optimoptions(<span style=\"color:rgb(160, 32, 240);\">'surrogateopt'<\/span>,<span style=\"color:rgb(160, 32, 240);\">'UseParallel'<\/span>,true);\r\ntic\r\n[xsol,fval] = surrogateopt(fun,lb,ub,options)\r\n<\/pre><img decoding=\"async\" src=\"https:\/\/blogs.mathworks.com\/loren\/files\/loren_6.png\"><pre class=\"output\" style=\"font-family:monospace;border:none;background-color:white;color:rgba(64, 64, 64, 1);\">Surrogateopt stopped because it exceeded the function evaluation limit set by \r\n'options.MaxFunctionEvaluations'.\r\n<\/pre><pre class=\"output\" style=\"font-family:monospace;border:none;background-color:white;color:rgba(64, 64, 64, 1);\">xsol = <em>1\u00d72<\/em>\r\n\r\n   -2.8278   -4.7159\r\n\r\n<\/pre><pre class=\"output\" style=\"font-family:monospace;border:none;background-color:white;color:rgba(64, 64, 64, 1);\">fval = -4.7542\r\n<\/pre><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">toc\r\n<\/pre><pre class=\"output\" style=\"font-family:monospace;border:none;background-color:white;color:rgba(64, 64, 64, 1);\">Elapsed time is 20.130099 seconds.\r\n<\/pre><p>This time, <inline style=\"font-family: monospace, monospace; font-size: inherit;\">surrogateopt<\/inline> found the global minimum. If you run this example on a computer, your results can differ, because <inline style=\"font-family: monospace, monospace; font-size: inherit;\">surrogateopt<\/inline> does not run reproducibly in parallel. The optimization took 20 seconds, which is almost six times faster than the serial optimization. 
<\/p>\r\n      <h3>How \r\n         <inline style=\"font-family: monospace, monospace; font-size: inherit;\">surrogateopt<\/inline> Works\r\n      <\/h3>\r\n      <p>The <inline style=\"font-family: monospace, monospace; font-size: inherit;\">surrogateopt<\/inline> algorithm alternates between two phases: <\/p>\r\n      <ul type=\"square\" style=\"list-style-type:square\">\r\n         <li><b>Construct Surrogate<\/b>. In this phase, the algorithm takes a fixed number of quasirandom points within the problem bounds and evaluates the objective function at those points. It then fits these values with a smooth function, called the surrogate, defined over the entire bounded region. In subsequent phases, the surrogate interpolates the objective values at every point where the objective has been evaluated. The interpolation function is a radial basis function, because these functions are computationally inexpensive to create and evaluate. Radial basis functions also make it inexpensive to add new points to the surrogate.\r\n         <\/li>\r\n      <\/ul>\r\n      <p><img decoding=\"async\" src=\"https:\/\/blogs.mathworks.com\/loren\/files\/loren_7.png\"><\/p>\r\n      <ul type=\"square\" style=\"list-style-type:square\">\r\n         <li><b>Search for Minimum<\/b>. This phase is more complicated than the <b>Construct Surrogate<\/b> phase, and is detailed below.\r\n         <\/li>\r\n      <\/ul>\r\n      <p>In theory, there are two main considerations in searching for a minimum of the objective: refining an existing point to get a better value of the objective function, and searching in places that have not yet been evaluated in hopes of finding a better global minimum. 
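The <b>Construct Surrogate</b> phase can be sketched in a few lines. The following Python fragment is an illustrative toy, not MathWorks code (the solver's actual kernel, point set, and bookkeeping differ): it fits a cubic radial basis function through objective values at random sample points and checks that the surrogate interpolates them:

```python
import numpy as np

def sech(z):
    """Numerically safe sech for z >= 0 (avoids cosh overflow)."""
    return 2 * np.exp(-z) / (1 + np.exp(-2 * z))

def objective(X):
    """The blog's two-variable test function, evaluated row-wise."""
    x0, x1 = np.array([1, 2]), np.array([-3, -5])
    return (np.sum((X / 5) ** 2, axis=1)
            - 4 * sech(np.sum((X - x0) ** 2, axis=1))
            - 6 * sech(np.sum((X - x1) ** 2, axis=1)))

rng = np.random.default_rng(0)
pts = rng.uniform(-10, 10, size=(20, 2))    # quasirandom in the real solver
vals = objective(pts)

# Cubic RBF surrogate: s(x) = sum_j w_j * ||x - p_j||^3.  Solving this
# linear system makes s pass through the evaluated values.
dists = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
weights = np.linalg.solve(dists ** 3, vals)

def surrogate(x):
    return weights @ np.linalg.norm(pts - x, axis=1) ** 3

# The surrogate reproduces the objective at every evaluated point and is
# cheap to evaluate anywhere else in the bounded region.
assert np.allclose([surrogate(p) for p in pts], vals, atol=1e-6)
```

Adding a new evaluated point only enlarges this linear system by one row and column, which is why updating the surrogate stays inexpensive.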
<inline style=\"font-family: monospace, monospace; font-size: inherit;\">surrogateopt<\/inline> balances these considerations by constructing a merit function that explicitly gives them cyclical weights. The weights follow the cycle 0.3, 0.5, 0.7, 0.95, where a high weight puts more emphasis on refining the current best point. <\/p>\r\n      <p>After constructing a merit function, <inline style=\"font-family: monospace, monospace; font-size: inherit;\">surrogateopt<\/inline> evaluates the merit function at hundreds or thousands of sample points near the <b><em>incumbent<\/em><\/b>, meaning the point with the lowest objective function value among points evaluated since the start of the most recent surrogate construction phase. These sample points are distributed according to a multidimensional normal distribution, centered on the incumbent and truncated to stay within bounds, with the standard deviation in each coordinate proportional to the width of the bound interval in that coordinate. The standard deviations are multiplied by a <b><em>scale<\/em><\/b> <inline style=\"font-family: monospace, monospace; font-size: inherit;\">sigma<\/inline> that the solver adjusts according to a rule described below. The solver discards from consideration all sample points that lie within <inline style=\"font-family: monospace, monospace; font-size: inherit;\">options.MinSurrogateDistance<\/inline> of any point where <inline style=\"font-family: monospace, monospace; font-size: inherit;\">surrogateopt<\/inline> has already evaluated the objective function. If it discards all sample points (because all are too close to evaluated points), this signals the solver to <b><em>reset<\/em><\/b> the surrogate, meaning return to the <b>Construct Surrogate<\/b> phase. 
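The sampling step just described can be sketched in Python. All the numbers below are illustrative assumptions: `min_dist` stands in for the minimum sample distance option, and clipping approximates the truncation to bounds (the real solver truncates the distribution rather than piling mass on the boundary):

```python
import numpy as np

rng = np.random.default_rng(1)
lb, ub = np.array([-10.0, -10.0]), np.array([10.0, 10.0])
incumbent = np.array([1.0, 2.0])
evaluated = np.array([[1.0, 2.0], [-3.0, -5.0], [0.0, 0.0]])
sigma = 0.2        # scale, adjusted by the success/failure rule
min_dist = 0.1     # plays the role of the minimum sample distance

# Standard deviation per coordinate proportional to the bound widths.
std = sigma * (ub - lb)
samples = rng.normal(incumbent, std, size=(1000, 2))
samples = np.clip(samples, lb, ub)   # stay within bounds (clip, for brevity)

# Discard candidates within min_dist of any already-evaluated point.
d = np.linalg.norm(samples[:, None, :] - evaluated[None, :, :], axis=2)
keep = samples[d.min(axis=1) >= min_dist]

# If keep were empty, the solver would reset the surrogate instead.
assert len(keep) > 0
assert np.all(keep >= lb) and np.all(keep <= ub)
```

The surviving candidates are then ranked by the merit function, which blends the surrogate value with a reward for distance from evaluated points according to the cyclic weights.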
<\/p>\r\n      <p><img decoding=\"async\" src=\"https:\/\/blogs.mathworks.com\/loren\/files\/loren_8.png\"><\/p>\r\n      <p>After evaluating the merit function on the sample points, the solver chooses the point with lowest merit function value, and evaluates the objective function at that point. This point is called an <b><em>adaptive point<\/em><\/b>. The algorithm updates the surrogate (interpolation) to include this point. <\/p>\r\n      <ul type=\"square\" style=\"list-style-type:square\">\r\n         <li>If the objective function value at the adaptive point is sufficiently lower than the value at the incumbent, then the adaptive\r\n            point becomes the incumbent. This is termed a successful search.\r\n         <\/li>\r\n         <li>If the objective function value at the adaptive point is not sufficiently lower than the value at the incumbent, then the\r\n            search is termed unsuccessful.\r\n         <\/li>\r\n      <\/ul>\r\n      <p>The scale <inline style=\"font-family: monospace, monospace; font-size: inherit;\">sigma<\/inline> updates as follows. 
<\/p>\r\n      <ul type=\"square\" style=\"list-style-type:square\">\r\n         <li>If there have been three successful searches since the last scale change, then the scale is doubled: \r\n            <inline style=\"font-family: monospace, monospace; font-size: inherit;\">sigma<\/inline> becomes \r\n            <inline style=\"font-family: monospace, monospace; font-size: inherit;\">2*sigma<\/inline>.\r\n         <\/li>\r\n         <li>If there have been \r\n            <inline style=\"font-family: monospace, monospace; font-size: inherit;\">max(5,nvar)<\/inline> unsuccessful searches since the last scale change, where \r\n            <inline style=\"font-family: monospace, monospace; font-size: inherit;\">nvar<\/inline> is the number of problem dimensions, then the scale is halved: \r\n            <inline style=\"font-family: monospace, monospace; font-size: inherit;\">sigma<\/inline> becomes \r\n            <inline style=\"font-family: monospace, monospace; font-size: inherit;\">sigma\/2<\/inline>.\r\n         <\/li>\r\n      <\/ul>\r\n      <p>Generally, <inline style=\"font-family: monospace, monospace; font-size: inherit;\">surrogateopt<\/inline> continues alternating between the two phases until it reaches an iteration limit or time limit. Unlike most other solvers, <inline style=\"font-family: monospace, monospace; font-size: inherit;\">surrogateopt<\/inline> has no notion of convergence. <\/p>\r\n      <p>For details, see the <a href=\"https:\/\/www.mathworks.com\/help\/gads\/surrogate-optimization-algorithm.html\">Surrogate Optimization Algorithm<\/a> section in the documentation. <\/p>\r\n      <h3>Monitor Optimization Using \r\n         <inline style=\"font-family: monospace, monospace; font-size: inherit;\">surrogateoptplot<\/inline>\r\n      <\/h3>\r\n      <p>You can optionally use the <inline style=\"font-family: monospace, monospace; font-size: inherit;\">surrogateoptplot<\/inline> plot function while performing a surrogate optimization. 
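The scale-update rule above (double sigma after three successful searches, halve it after max(5,nvar) unsuccessful ones, counting from the last scale change) can be sketched as a small state machine. This is a hypothetical reconstruction from the description, not the solver's code:

```python
def update_sigma(sigma, outcomes, nvar):
    """Apply the described scale updates for a sequence of search outcomes
    (True = successful search, False = unsuccessful).

    Both counters reset whenever the scale changes, since the thresholds
    count searches "since the last scale change".
    """
    successes = failures = 0
    for ok in outcomes:
        if ok:
            successes += 1
        else:
            failures += 1
        if successes == 3:
            sigma *= 2
            successes = failures = 0
        elif failures == max(5, nvar):
            sigma /= 2
            successes = failures = 0
    return sigma

# Three successes in a row double the scale ...
assert update_sigma(0.2, [True, True, True], nvar=2) == 0.4
# ... while max(5, 2) = 5 failures halve it.
assert update_sigma(0.2, [False] * 5, nvar=2) == 0.1
```

A shrinking scale concentrates the samples near the incumbent, which is what eventually triggers a surrogate reset once every candidate lands too close to an evaluated point.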
This plot function shows you the types of points where the algorithm evaluates the objective function: quasirandom (named Random Samples in the legend), incumbent, and adaptive. You can see where the algorithm performs a surrogate reset, meaning it could not produce a new adaptive point (this happens when the scale becomes too small). <\/p>\r\n      <p>Run the same problem as before, while using the <inline style=\"font-family: monospace, monospace; font-size: inherit;\">surrogateoptplot<\/inline> plot function. Turn off parallel computing to have reproducible results. <\/p><pre class=\"matlab-code\" id=\"matlabcode\" style=\"background-color: #F7F7F7;font-family: monospace;font-weight:normal;border-style: solid; border-width: 1px ;border-color:#E9E9E9;padding-top:5px;padding-bottom:5px;line-height:150%;\">options.PlotFcn = <span style=\"color:rgb(160, 32, 240);\">'surrogateoptplot'<\/span>;\r\noptions.UseParallel = false;\r\nrng(4) <span style=\"color:rgb(34, 139, 34);\">% For reproducibility<\/span>\r\n[xsol,fval] = surrogateopt(fun,lb,ub,options)\r\n<\/pre><img decoding=\"async\" src=\"https:\/\/blogs.mathworks.com\/loren\/files\/loren_9.png\"><pre class=\"output\" style=\"font-family:monospace;border:none;background-color:white;color:rgba(64, 64, 64, 1);\">Surrogateopt stopped because it exceeded the function evaluation limit set by \r\n'options.MaxFunctionEvaluations'.\r\n<\/pre><pre class=\"output\" style=\"font-family:monospace;border:none;background-color:white;color:rgba(64, 64, 64, 1);\">xsol = <em>1\u00d72<\/em>\r\n\r\n   -2.7998   -4.7109\r\n\r\n<\/pre><pre class=\"output\" style=\"font-family:monospace;border:none;background-color:white;color:rgba(64, 64, 64, 1);\">fval = -4.7532\r\n<\/pre><p>Here is how to interpret the plot. Starting from the left, follow the green line, which represents the best function value found. The solver reaches the vicinity of the second-best point at around evaluation number 30. 
Around evaluation number 70, there is a surrogate reset. After this, follow the dark blue incumbent line, which represents the best objective function found since the previous surrogate reset. This line approaches the second-best point after evaluation number 100. There is another surrogate reset before evaluation number 140. Just before evaluation number 160, the solver reaches the vicinity of the global optimum. The solver continues to refine this solution until the end of the evaluations, number 200. Notice that after evaluation number 190, all of the adaptive samples are at or near the global optimum, showing the shrinking scale. There is a similar discussion of interpreting the <inline style=\"font-family: monospace, monospace; font-size: inherit;\">surrogateoptplot<\/inline> display in the documentation topic <a href=\"https:\/\/www.mathworks.com\/help\/gads\/interpret-surrogateoptplot.html\">Interpret surrogateoptplot<\/a>. <\/p>\r\n      <h3>Closing Thoughts<\/h3>\r\n      <p>I hope that this brief description gives you some understanding of the <inline style=\"font-family: monospace, monospace; font-size: inherit;\">surrogateopt<\/inline> solver. If you have an expensive function to optimize, keep this solver in mind. While currently the solver does not accept any constraints except for bounds, there are techniques that allow you to try the solver even on nonlinearly constrained problems. See <a href=\"https:\/\/www.mathworks.com\/help\/gads\/solve-nonlinearly-constrained-problem-using-surrogateopt.html\">Surrogate Optimization with Nonlinear Constraint<\/a>. <\/p>\r\n      <p>Did you enjoy seeing how to use surrogate optimization? Do you have problems that might be addressed by this new solver? Tell us about it <a href=\"https:\/\/blogs.mathworks.com\/loren\/?p=3067#respond\">here<\/a>. 
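One common technique for trying a bounds-only solver on a constrained problem is a penalty reformulation: add a term to the objective that punishes constraint violation. A hypothetical Python sketch, where the objective, constraint, and weight `mu` are all made up for illustration:

```python
def objective(x):
    """Toy objective (assumed for illustration)."""
    return (x[0] - 1) ** 2 + (x[1] + 2) ** 2

def constraint(x):
    """Toy nonlinear constraint, feasible when constraint(x) <= 0."""
    return x[0] + x[1] - 1

def penalized(x, mu=100.0):
    """Bounds-only solvers can minimize this instead: violations of the
    constraint raise the apparent objective by mu * violation^2."""
    return objective(x) + mu * max(constraint(x), 0.0) ** 2

# The penalty leaves feasible points unchanged ...
assert penalized([0.0, 0.0]) == objective([0.0, 0.0])
# ... and makes infeasible points look worse to the solver.
assert penalized([2.0, 2.0]) > objective([2.0, 2.0])
```

Passing `penalized` as the objective lets a bounds-only global search steer away from infeasible regions, at the cost of choosing a sensible penalty weight.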
<\/p>","protected":false},"excerpt":{"rendered":"<div class=\"overview-image\"><img decoding=\"async\"  class=\"img-responsive\" src=\"https:\/\/blogs.mathworks.com\/loren\/files\/loren_9.png\" onError=\"this.style.display ='none';\" \/><\/div><p>I'd like to introduce this week's guest blogger Alan Weiss. Alan writes documentation for mathematical toolboxes here at MathWorks. \r\n      Hi, folks. This post is about a new solver in Global... <a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/loren\/2018\/09\/25\/surrogate-optimization-for-expensive-computations\/\">read more >><\/a><\/p>","protected":false},"author":39,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[60,1],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/3067"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/users\/39"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/comments?post=3067"}],"version-history":[{"count":2,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/3067\/revisions"}],"predecessor-version":[{"id":3073,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/3067\/revisions\/3073"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/media?parent=3067"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/categories?post=3067"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/tags?post=3067"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}