{"id":3021,"date":"2018-08-20T11:09:53","date_gmt":"2018-08-20T16:09:53","guid":{"rendered":"https:\/\/blogs.mathworks.com\/loren\/?p=3021"},"modified":"2018-08-23T09:20:53","modified_gmt":"2018-08-23T14:20:53","slug":"optimizing-to-survive","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/loren\/2018\/08\/20\/optimizing-to-survive\/","title":{"rendered":"Optimizing to Survive"},"content":{"rendered":"\r\n<div class=\"content\"><!--introduction--><p>Today's guest blogger is Mary Fenelon, who is the product marketing manager for Optimization and Math here at MathWorks. In today's post she describes how she uses optimization to try to best the rest of the product marketing team.<\/p><p>The National Football League 2018 season will soon be_ (is?)_ upon us. Several of us in the product marketing team here at MathWorks have played in an office league for the last couple of years. I don't follow the NFL very closely so I didn't think about participating until one of my fellow product marketing managers challenged me to do so by remarking that this looked like an optimization problem. That's all it took!<\/p><!--\/introduction--><h3>Contents<\/h3><div><ul><li><a href=\"#ab7296fd-5bed-4ac5-8bf1-ae324674a5c1\">Rules<\/a><\/li><li><a href=\"#c0d96096-7ce1-4bf2-aa77-b63474c4bc1a\">My Strategy<\/a><\/li><li><a href=\"#cb112425-16a2-409c-9869-7daf9da214f2\">Data<\/a><\/li><li><a href=\"#b3f7139b-ab8e-4c65-95d9-18abd04cc0c4\">Linear Assignment Model<\/a><\/li><li><a href=\"#89ba6fd9-77ab-4488-8199-1d0f9e069201\">Extending the Model<\/a><\/li><li><a href=\"#d24fd0ff-6a03-464d-950d-bbad2bfd7cdc\">Results<\/a><\/li><\/ul><\/div><h4>Rules<a name=\"ab7296fd-5bed-4ac5-8bf1-ae324674a5c1\"><\/a><\/h4><div><ol><li>This is a knockout or survivor style pool. Every week, your goal is to pick the winning team for one game. You can only choose a team once for the entire season. If your team wins or ties you move on to the next week and pick from the remaining teams.  As long as you haven&#8217;t picked a losing team you can choose from the entire group of NFL teams.<\/li><li>If you pick a losing team you are not out of the knockout pool.  You are now restricted to picking from the teams that had a non-winning record last season. You can try to keep your streak alive, but now you have to do it with a more limited set of teams.<\/li><li>Whoever is alive for the longest number of weeks wins the knockout pool. The longer you keep the knockout streak alive the bigger the knockout pool grows. It will start at 50% of the pool, and grow onward.  The pot will grow until the last person has two losses.  If two or more people tie, they will either split the pool or determine a final winner by a joust in the parking lot.<\/li><li>After your second loss you can still win the secondary pool, which is whoever picks the most winners throughout the season.  Once you&#8217;ve lost twice you can go back to picking any team as long as you haven&#8217;t already picked them.  Whoever wins the knockout pool is ineligible for this prize unless there&#8217;s unforeseen circumstances (if everyone is knocked out in week 2, for example).<\/li><\/ol><\/div><h4>My Strategy<a name=\"c0d96096-7ce1-4bf2-aa77-b63474c4bc1a\"><\/a><\/h4><p>I could choose a team each week without considering that it would be better to choose that team further out. That myopic strategy could leave me with poor choices at the end of the season. Instead, my first strategy decision is to run an optimization each week choosing teams for all of the remaining weeks of the season but use only the choice for the current week. That way, I benefit from updated information as the season progresses while still looking at the whole schedule.<\/p><p>Coming up with a set of choices can be modeled as a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Assignment_problem\">linear assignment problem<\/a>. The goal in an assignment problem is to assign workers to tasks so that the cost of completing all the tasks is minimized. Each worker can only do one task. For the NFL pool, the workers are the teams and the tasks are the weeks.<\/p><p>What to use for the costs of each team-to-week assignment? This choice is the second place where the art and science of modeling comes into play. Art, to think of measures, and science, to validate that the chosen measure gives the desired result. My choice is to use the win probabilities produced by <a href=\"https:\/\/fivethirtyeight.com\/\">fivethirtyeight.com<\/a>. <a href=\"https:\/\/fivethirtyeight.com\/features\/how-our-2017-nfl-predictions-work\/\">This article<\/a> explains how they built their predictive model using the entire history of the NFL as data. With that measure, the objective is to maximize the sum of the win probabilities of the chosen teams over the entire season.<\/p><p>Linear assignment problems can by solved by the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Hungarian_algorithm\">Hungarian algorithm<\/a>. <a href=\"https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/20652-hungarian-algorithm-for-linear-assignment-problems-v2-3\">This implementation<\/a> on the MATLAB File Exchange is a popular one. These problems can also be solved by the functions in <a href=\"https:\/\/www.mathworks.com\/products\/optimization.html\">Optimization Toolbox<\/a>. Formulating the problem as an optimization problem gives me the option to specify additional constraints, for instance, to follow the rule that after one loss, I'm restricted to choose from the losers bracket.<\/p><h4>Data<a name=\"cb112425-16a2-409c-9869-7daf9da214f2\"><\/a><\/h4><p><b>Season Data<\/b><\/p><p>For this post, I'll generate some data instead of using actual NFL data. I'll use about half the teams and half as many games as the NFL.<\/p><pre class=\"codeinput\">nSeasonWeeks = 8;\r\nteams = [<span class=\"string\">\"Aardvarks\"<\/span>,<span class=\"string\">\"Bats\"<\/span>,<span class=\"string\">\"Cheetahs\"<\/span>,<span class=\"string\">\"Dragons\"<\/span>,<span class=\"string\">\"Emus\"<\/span>,<span class=\"string\">\"Foxes\"<\/span>,<span class=\"string\">\"Giraffes\"<\/span>,<span class=\"string\">\"Hippos\"<\/span>,<span class=\"string\">\"Ibexes\"<\/span>,<span class=\"string\">\"Jackals\"<\/span>,<span class=\"keyword\">...<\/span>\r\n    <span class=\"string\">\"Koalas\"<\/span>,<span class=\"string\">\"Lemurs\"<\/span>,<span class=\"string\">\"Monkeys\"<\/span>,<span class=\"string\">\"Newts\"<\/span>,<span class=\"string\">\"Otters\"<\/span>,<span class=\"string\">\"Porcupines\"<\/span>]';\r\nnTeams = numel(teams);\r\n<\/pre><p>Designate some teams as losers from the previous season for the losers bracket. Set the random seed to reproduce the same bracket each time I run the script.<\/p><pre class=\"codeinput\">rng(19);\r\nnLosers = floor(nTeams\/2);\r\nidx = randperm(nTeams);\r\nlosers = teams(idx(1:nLosers));\r\n<\/pre><p><b>Weekly Data<\/b><\/p><p>I'll rerun the model each week, keeping track of the week of the season and which teams I already picked.<\/p><p>Each week I increment the week counter.<\/p><pre class=\"codeinput\">thisWeek = 2;\r\n<\/pre><p>Each week I add the team that was chosen the previous week<\/p><pre class=\"codeinput\">previousPicks = [<span class=\"string\">\"Aardvarks\"<\/span>];\r\n<\/pre><p>Generate the labels for the remaining weeks:<\/p><pre class=\"codeinput\">weeks = <span class=\"string\">\"week\"<\/span> + string(thisWeek:nSeasonWeeks)';\r\nnWeeksLeft = nSeasonWeeks - thisWeek + 1;\r\n<\/pre><p>Get the probabilities.<\/p><p>Here's where I would scrape a web page for predictions. Instead, I'll use a random matrix for the win probabiliites. The random values won't show the patterns one would see in real data, but they are sufficient for showing how to set up and run the optimization model. One step towards more realistic probabilities is to first generate team pairings for the games and a game schedule. This is an optimization problem in itself; for an example see <a href=\"https:\/\/pubsonline.informs.org\/stoken\/default+domain\/EC-SportsOA\/full\/10.1287\/opre.46.1.1\">scheduling the ACC basketball conference<\/a>.<\/p><pre class=\"codeinput\">winProbs = rand(nTeams,nWeeksLeft);\r\n<\/pre><h4>Linear Assignment Model<a name=\"b3f7139b-ab8e-4c65-95d9-18abd04cc0c4\"><\/a><\/h4><p>I will use the <a href=\"https:\/\/www.mathworks.com\/help\/optim\/ug\/problem-based-workflow.html\">problem-based workflow<\/a> for optimization introduced in R2017b. This workflow simplifies the process of specifying and solving linear and mixed-integer linear programs, that is, optimization problems with linear constraints and objectives and with variables that can take on continuous or discrete values.<\/p><p><b>Variables<\/b><\/p><p>First, define a two dimensional <a href=\"https:\/\/www.mathworks.com\/help\/optim\/ug\/optim.problemdef.optimizationvariable.optimvar.html\">optimvar<\/a>, <tt>x<\/tt>, indexed by teams and weeks. The optimization solver will compute values for an <tt>optimvar<\/tt> include in an <a href=\"https:\/\/www.mathworks.com\/help\/optim\/ug\/optim.problemdef.optimizationproblem.optimproblem.html\">optimproblem<\/a>. I make this a binary variable, that is a variable that only takes on the values of 0 and 1 with this statement:<\/p><pre class=\"codeinput\">x = optimvar(<span class=\"string\">'x'<\/span>,teams,weeks,<span class=\"string\">'LowerBound'<\/span>,0,<span class=\"string\">'UpperBound'<\/span>,1,<span class=\"string\">'Type'<\/span>,<span class=\"string\">'integer'<\/span>);\r\n<\/pre><p>In my model, a value of 1 for <tt>x(i,j)<\/tt> will mean that team <tt>i<\/tt> is chosen for week <tt>j<\/tt> and a value of 0 means it is not chosen.<\/p><p>To eliminate a team that's already been picked, set the upper bounds of variables corresponding to those teams to 0.<\/p><pre class=\"codeinput\">x.UpperBound(previousPicks,weeks) = 0;\r\n<\/pre><p><b>Optimization Problem and Objective<\/b><\/p><p>Next, define the optimization problem and the objective to maximize the sum of the win probabilities of the chosen teams:<\/p><pre class=\"codeinput\">p = optimproblem;\r\np.ObjectiveSense = <span class=\"string\">'maximize'<\/span>;\r\np.Objective = sum(sum(winProbs.*x));\r\n<\/pre><p><b>Constraints<\/b><\/p><p>The first constraint statement generates one constraint for each team: a team can be chosen at most once. The result is an <a href=\"https:\/\/www.mathworks.com\/help\/optim\/ug\/optim.problemdef.optimizationconstraint.optimconstr.html\">optimconstr<\/a>. Show the first constraint of each set to check the formulation as it's built.<\/p><pre class=\"codeinput\">p.Constraints.eachTeamAtMostOnce = sum(x,2) &lt;= 1;\r\nshowconstr(p.Constraints.eachTeamAtMostOnce(1))\r\n<\/pre><pre class=\"codeoutput\">\r\n  x('Aardvarks', 'week2') + x('Aardvarks', 'week3')\r\n+ x('Aardvarks', 'week4') + x('Aardvarks', 'week5')\r\n+ x('Aardvarks', 'week6') + x('Aardvarks', 'week7')\r\n+ x('Aardvarks', 'week8') &lt;= 1\r\n\r\n<\/pre><p>The second constraint statement generates one constraint for each week: a team must be choosen.<\/p><pre class=\"codeinput\">p.Constraints.mustPickOne = sum(x,1) == 1;\r\nshowconstr(p.Constraints.mustPickOne(1))\r\n<\/pre><pre class=\"codeoutput\">\r\n  x('Aardvarks', 'week2') + x('Bats', 'week2') + x('Cheetahs', 'week2')\r\n+ x('Dragons', 'week2') + x('Emus', 'week2') + x('Foxes', 'week2')\r\n+ x('Giraffes', 'week2') + x('Hippos', 'week2') + x('Ibexes', 'week2')\r\n+ x('Jackals', 'week2') + x('Koalas', 'week2') + x('Lemurs', 'week2')\r\n+ x('Monkeys', 'week2') + x('Newts', 'week2') + x('Otters', 'week2')\r\n+ x('Porcupines', 'week2') == 1\r\n\r\n<\/pre><p><b>Solve<\/b><\/p><p>Now solve the optimization problem. The <a href=\"https:\/\/www.mathworks.com\/help\/optim\/ug\/optim.problemdef.optimizationproblem.solve.html\">solve<\/a> function will call the mixed-integer linear programming solver, <a href=\"https:\/\/www.mathworks.com\/help\/optim\/ug\/intlinprog.html\">intlinprog<\/a>,| b|ecause some of the optimization variables are of type integer<\/p><pre class=\"codeinput\">soln = solve(p);\r\n<\/pre><pre class=\"codeoutput\">LP:                Optimal objective value is -6.661074.                                            \r\n\r\n\r\nOptimal solution found.\r\n\r\nIntlinprog stopped at the root node because the objective value is within a gap\r\ntolerance of the optimal value, options.AbsoluteGapTolerance = 0 (the default\r\nvalue). The intcon variables are integer within tolerance,\r\noptions.IntegerTolerance = 1e-05 (the default value).\r\n\r\n<\/pre><p><b>Tabulate choices<\/b><\/p><p>Optimization Toolbox solvers use floating point arithmetic so it's possible that the solution values may not be integral. I want to use the values as logical indexes, so I will round them just to be sure that they are.<\/p><pre class=\"codeinput\">picks = round(soln.x);\r\n[teampicks,weekpicks] = find(picks);\r\nprobpicks = find(picks);\r\ntable(weeks(weekpicks),teams(teampicks),winProbs(probpicks),<span class=\"keyword\">...<\/span>\r\n     <span class=\"string\">'VariableNames'<\/span>,{<span class=\"string\">'Week'<\/span>,<span class=\"string\">'Team'<\/span>,<span class=\"string\">'WinProb'<\/span>})\r\n<\/pre><pre class=\"codeoutput\">ans =\r\n  7&times;3 table\r\n     Week        Team       WinProb\r\n    _______    _________    _______\r\n    \"week2\"    \"Foxes\"      0.94616\r\n    \"week3\"    \"Lemurs\"      0.9943\r\n    \"week4\"    \"Bats\"       0.97137\r\n    \"week5\"    \"Koalas\"     0.99744\r\n    \"week6\"    \"Dragons\"    0.91183\r\n    \"week7\"    \"Ibexes\"     0.98344\r\n    \"week8\"    \"Hippos\"     0.85654\r\n<\/pre><h4>Extending the Model<a name=\"89ba6fd9-77ab-4488-8199-1d0f9e069201\"><\/a><\/h4><p>The win probabilities will be updated on the results of the prior games as the season progresses. Maybe I don't want to value the probabilities for the end of the season as highly as those in the current week. This can be done by applying a discount factor. In the <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/live-scripts-and-functions.html\">Live Script<\/a> version of this post, I set the discount factor with a <a href=\"https:\/\/www.mathworks.com\/help\/matlab\/matlab_prog\/add-interactive-controls-to-a-live-script.html\">Live Control<\/a> to make it easy to experiment with different values.<\/p><pre class=\"codeinput\">discountFactor =0.05\r\nscalePerWeek = (1-discountFactor).^(0:nWeeksLeft-1);\r\ndProbs = winProbs.*scalePerWeek;\r\np.Objective = sum(sum(x.*dProbs));\r\nsoln = solve(p);\r\n[teampicks,weekpicks] = find(picks);\r\nprobpicks = find(picks);\r\ntable(weeks(weekpicks),teams(teampicks),winProbs(probpicks),<span class=\"keyword\">...<\/span>\r\n     <span class=\"string\">'VariableNames'<\/span>,{<span class=\"string\">'Week'<\/span>,<span class=\"string\">'Team'<\/span>,<span class=\"string\">'WinProb'<\/span>})\r\n<\/pre><pre class=\"codeoutput\">discountFactor =\r\n         0.05\r\nLP:                Optimal objective value is -5.755877.                                            \r\n\r\n\r\nOptimal solution found.\r\n\r\nIntlinprog stopped at the root node because the objective value is within a gap\r\ntolerance of the optimal value, options.AbsoluteGapTolerance = 0 (the default\r\nvalue). The intcon variables are integer within tolerance,\r\noptions.IntegerTolerance = 1e-05 (the default value).\r\n\r\nans =\r\n  7&times;3 table\r\n     Week        Team       WinProb\r\n    _______    _________    _______\r\n    \"week2\"    \"Foxes\"      0.94616\r\n    \"week3\"    \"Lemurs\"      0.9943\r\n    \"week4\"    \"Bats\"       0.97137\r\n    \"week5\"    \"Koalas\"     0.99744\r\n    \"week6\"    \"Dragons\"    0.91183\r\n    \"week7\"    \"Ibexes\"     0.98344\r\n    \"week8\"    \"Hippos\"     0.85654\r\n<\/pre><p>Applying a discount factor of 5% didn't change the results.<\/p><p>I might want to only choose winning teams when I have no losses so that I can pick the best of the losing teams when I have to pick among them.<\/p><pre class=\"codeinput\">onlyWinners = false;\r\n<span class=\"keyword\">if<\/span> onlyWinners\r\n    p.Constraints.onlyWinners = sum(x(losers,weeks),1) == 0;\r\n\r\n<span class=\"keyword\">end<\/span>\r\n<\/pre><p>With one loss, I must pick from the losers bracket.<\/p><pre class=\"codeinput\">onlyLosers = true;\r\n<span class=\"keyword\">if<\/span> onlyLosers\r\n    p.Constraints.onlyLosers = sum(x(losers,weeks),1) &gt;= 1;\r\n    showconstr(p.Constraints.onlyLosers(1))\r\n<span class=\"keyword\">end<\/span>\r\n<\/pre><pre class=\"codeoutput\">\r\n  x('Aardvarks', 'week2') + x('Cheetahs', 'week2') + x('Dragons', 'week2')\r\n+ x('Emus', 'week2') + x('Foxes', 'week2') + x('Koalas', 'week2')\r\n+ x('Newts', 'week2') + x('Otters', 'week2') &gt;= 1\r\n\r\n<\/pre><pre class=\"codeinput\">soln = solve(p);\r\n[teampicks,weekpicks] = find(picks);\r\nprobpicks = find(picks);\r\ntable(weeks(weekpicks),teams(teampicks),winProbs(probpicks),<span class=\"keyword\">...<\/span>\r\n     <span class=\"string\">'VariableNames'<\/span>,{<span class=\"string\">'Week'<\/span>,<span class=\"string\">'Team'<\/span>,<span class=\"string\">'WinProb'<\/span>})\r\n<\/pre><pre class=\"codeoutput\">LP:                Optimal objective value is -4.838951.                                            \r\n\r\n\r\nOptimal solution found.\r\n\r\nIntlinprog stopped at the root node because the objective value is within a gap\r\ntolerance of the optimal value, options.AbsoluteGapTolerance = 0 (the default\r\nvalue). The intcon variables are integer within tolerance,\r\noptions.IntegerTolerance = 1e-05 (the default value).\r\n\r\nans =\r\n  7&times;3 table\r\n     Week        Team       WinProb\r\n    _______    _________    _______\r\n    \"week2\"    \"Foxes\"      0.94616\r\n    \"week3\"    \"Lemurs\"      0.9943\r\n    \"week4\"    \"Bats\"       0.97137\r\n    \"week5\"    \"Koalas\"     0.99744\r\n    \"week6\"    \"Dragons\"    0.91183\r\n    \"week7\"    \"Ibexes\"     0.98344\r\n    \"week8\"    \"Hippos\"     0.85654\r\n<\/pre><h4>Results<a name=\"d24fd0ff-6a03-464d-950d-bbad2bfd7cdc\"><\/a><\/h4><p>So how did I do? In both years, I was knocked out fairly early but finished at the top of the secondary pool.<\/p><p>Unfortunately, one player had no losses in both years so there was no secondary pool award. That player has not revealed his strategy but there is speculation that he uses an optimization model.  A player with just one loss did reveal his strategy: picking the best team from fivethirtyeight.com each week. That's the myopic strategy I didn't want to use! I can see why it makes sense, though. There are about twice as many teams as games so as long as you can pick from the entire league, there should be some good picks available each week. A greedy strategy also makes sense when the predictions are changing during the season.<\/p><p>I'm not willing to give up on placing well in the secondary pool so I'll keep my assignment model. I can make it greedier by using a higher discount factor. Maybe I should also deal with the uncertainty in the probabilities by using data from more than one source or by using a stochastic optimization approach. Now that I have two years worth of data I can do some model validation before deciding on this year's strategy. It's time to get busy!<\/p><p>Do you have a better approach? Let us know your thoughts <a href=\"https:\/\/blogs.mathworks.com\/loren\/?p=3021#respond\">here<\/a>.<\/p><script language=\"JavaScript\"> <!-- \r\n    function grabCode_eae73635b2774ac09bffe05bb2159741() {\r\n        \/\/ Remember the title so we can use it in the new page\r\n        title = document.title;\r\n\r\n        \/\/ Break up these strings so that their presence\r\n        \/\/ in the Javascript doesn't mess up the search for\r\n        \/\/ the MATLAB code.\r\n        t1='eae73635b2774ac09bffe05bb2159741 ' + '##### ' + 'SOURCE BEGIN' + ' #####';\r\n        t2='##### ' + 'SOURCE END' + ' #####' + ' eae73635b2774ac09bffe05bb2159741';\r\n    \r\n        b=document.getElementsByTagName('body')[0];\r\n        i1=b.innerHTML.indexOf(t1)+t1.length;\r\n        i2=b.innerHTML.indexOf(t2);\r\n \r\n        code_string = b.innerHTML.substring(i1, i2);\r\n        code_string = code_string.replace(\/REPLACE_WITH_DASH_DASH\/g,'--');\r\n\r\n        \/\/ Use \/x3C\/g instead of the less-than character to avoid errors \r\n        \/\/ in the XML parser.\r\n        \/\/ Use '\\x26#60;' instead of '<' so that the XML parser\r\n        \/\/ doesn't go ahead and substitute the less-than character. \r\n        code_string = code_string.replace(\/\\x3C\/g, '\\x26#60;');\r\n\r\n        copyright = 'Copyright 2018 The MathWorks, Inc.';\r\n\r\n        w = window.open();\r\n        d = w.document;\r\n        d.write('<pre>\\n');\r\n        d.write(code_string);\r\n\r\n        \/\/ Add copyright line at the bottom if specified.\r\n        if (copyright.length > 0) {\r\n            d.writeln('');\r\n            d.writeln('%%');\r\n            if (copyright.length > 0) {\r\n                d.writeln('% _' + copyright + '_');\r\n            }\r\n        }\r\n\r\n        d.write('<\/pre>\\n');\r\n\r\n        d.title = title + ' (MATLAB code)';\r\n        d.close();\r\n    }   \r\n     --> <\/script><p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br><a href=\"javascript:grabCode_eae73635b2774ac09bffe05bb2159741()\"><span style=\"font-size: x-small;        font-style: italic;\">Get \r\n      the MATLAB code <noscript>(requires JavaScript)<\/noscript><\/span><\/a><br><br>\r\n      Published with MATLAB&reg; R2018a<br><\/p><\/div><!--\r\neae73635b2774ac09bffe05bb2159741 ##### SOURCE BEGIN #####\r\n%% Optimizing to Survive\r\n% Today's guest blogger is Mary Fenelon, who is the product marketing manager \r\n% for Optimization and Math here at MathWorks. In today's post she describes how \r\n% she uses optimization to try to best the rest of the product marketing team.\r\n% \r\n% The National Football League 2018 season will soon be_ (is?)_ upon us. \r\n% Several of us in the product marketing team here at MathWorks have played in \r\n% an office league for the last couple of years. I don't follow the NFL very closely \r\n% so I didn't think about participating until one of my fellow product marketing \r\n% managers challenged me to do so by remarking that this looked like an optimization \r\n% problem. That's all it took!\r\n%% Rules\r\n% # This is a knockout or survivor style pool. Every week, your goal is to pick \r\n% the winning team for one game. You can only choose a team once for the entire \r\n% season. If your team wins or ties you move on to the next week and pick from \r\n% the remaining teams.  As long as you haven\u00e2\u20ac&#x2122;t picked a losing team you can choose \r\n% from the entire group of NFL teams.\r\n% # If you pick a losing team you are not out of the knockout pool.  You are \r\n% now restricted to picking from the teams that had a non-winning record last \r\n% season. You can try to keep your streak alive, but now you have to do it with \r\n% a more limited set of teams. \r\n% # Whoever is alive for the longest number of weeks wins the knockout pool.  \r\n% The longer you keep the knockout streak alive the bigger the knockout pool grows.  \r\n% It will start at 50% of the pool, and grow onward.  The pot will grow until \r\n% the last person has two losses.  If two or more people tie, they will either \r\n% split the pool or determine a final winner by a joust in the parking lot.\r\n% # After your second loss you can still win the secondary pool, which is whoever \r\n% picks the most winners throughout the season.  Once you\u00e2\u20ac&#x2122;ve lost twice you can \r\n% go back to picking any team as long as you haven\u00e2\u20ac&#x2122;t already picked them.  Whoever \r\n% wins the knockout pool is ineligible for this prize unless there\u00e2\u20ac&#x2122;s unforeseen \r\n% circumstances (if everyone is knocked out in week 2, for example).\r\n%% My Strategy\r\n% I could choose a team each week without considering that it would be better \r\n% to choose that team further out. That myopic strategy could leave me with poor \r\n% choices at the end of the season. Instead, my first strategy decision is to \r\n% run an optimization each week choosing teams for all of the remaining weeks \r\n% of the season but use only the choice for the current week. That way, I benefit \r\n% from updated information as the season progresses while still looking at the \r\n% whole schedule.\r\n% \r\n% Coming up with a set of choices can be modeled as a <https:\/\/en.wikipedia.org\/wiki\/Assignment_problem \r\n% linear assignment problem>. The goal in an assignment problem is to assign workers \r\n% to tasks so that the cost of completing all the tasks is minimized. Each worker \r\n% can only do one task. For the NFL pool, the workers are the teams and the tasks \r\n% are the weeks.\r\n% \r\n% What to use for the costs of each team-to-week assignment? This choice \r\n% is the second place where the art and science of modeling comes into play. Art, \r\n% to think of measures, and science, to validate that the chosen measure gives \r\n% the desired result. My choice is to use the win probabilities produced by <https:\/\/fivethirtyeight.com\/ \r\n% fivethirtyeight.com>. <https:\/\/fivethirtyeight.com\/features\/how-our-2017-nfl-predictions-work\/ \r\n% This article> explains how they built their predictive model using the entire \r\n% history of the NFL as data. With that measure, the objective is to maximize \r\n% the sum of the win probabilities of the chosen teams over the entire season.\r\n% \r\n% Linear assignment problems can by solved by the <https:\/\/en.wikipedia.org\/wiki\/Hungarian_algorithm \r\n% Hungarian algorithm>. <https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/20652-hungarian-algorithm-for-linear-assignment-problems-v2-3 \r\n% This implementation> on the MATLAB File Exchange is a popular one. These problems  \r\n% can also be solved by the functions in <https:\/\/www.mathworks.com\/products\/optimization.html \r\n% Optimization Toolbox>. Formulating the problem as an optimization problem gives \r\n% me the option to specify additional constraints, for instance, to follow the \r\n% rule that after one loss, I'm restricted to choose from the losers bracket.\r\n%% Data\r\n% *Season Data*\r\n% \r\n% For this post, I'll generate some data instead of using actual NFL data.  \r\n% I'll use about half the teams and half as many games as the NFL. \r\n%%\r\nnSeasonWeeks = 8;\r\nteams = [\"Aardvarks\",\"Bats\",\"Cheetahs\",\"Dragons\",\"Emus\",\"Foxes\",\"Giraffes\",\"Hippos\",\"Ibexes\",\"Jackals\",...\r\n    \"Koalas\",\"Lemurs\",\"Monkeys\",\"Newts\",\"Otters\",\"Porcupines\"]';\r\nnTeams = numel(teams);\r\n%% \r\n% Designate some teams as losers from the previous season for the losers \r\n% bracket. Set the random seed to reproduce the same bracket each time I run the \r\n% script.\r\n\r\nrng(19);\r\nnLosers = floor(nTeams\/2);\r\nidx = randperm(nTeams);\r\nlosers = teams(idx(1:nLosers));\r\n%% \r\n% *Weekly Data*\r\n% \r\n% I'll rerun the model each week, keeping track of the week of the season \r\n% and which teams I already picked.\r\n% \r\n% Each week I increment the week counter.\r\n%%\r\nthisWeek = 2;\r\n%% \r\n% Each week I add the team that was chosen the previous week\r\n\r\npreviousPicks = [\"Aardvarks\"];\r\n%% \r\n% Generate the labels for the remaining weeks:\r\n\r\nweeks = \"week\" + string(thisWeek:nSeasonWeeks)';\r\nnWeeksLeft = nSeasonWeeks - thisWeek + 1;\r\n%% \r\n% Get the probabilities.\r\n% \r\n% Here's where I would scrape a web page for predictions. Instead, I'll use \r\n% a random matrix for the win probabiliites. The random values won't show the \r\n% patterns one would see in real data, but they are sufficient for showing how \r\n% to set up and run the optimization model. One step towards more realistic probabilities \r\n% is to first generate team pairings for the games and a game schedule. This is \r\n% an optimization problem in itself; for an example see <https:\/\/pubsonline.informs.org\/stoken\/default+domain\/EC-SportsOA\/full\/10.1287\/opre.46.1.1 \r\n% scheduling the ACC basketball conference>.\r\n\r\nwinProbs = rand(nTeams,nWeeksLeft);\r\n%% Linear Assignment Model\r\n% I will use the <https:\/\/www.mathworks.com\/help\/optim\/ug\/problem-based-workflow.html \r\n% problem-based workflow> for optimization introduced in R2017b. This workflow \r\n% simplifies the process of specifying and solving linear and mixed-integer linear \r\n% programs, that is, optimization problems with linear constraints and objectives \r\n% and with variables that can take on continuous or discrete values.\r\n% \r\n% *Variables*\r\n% \r\n% First, define a two dimensional <https:\/\/www.mathworks.com\/help\/optim\/ug\/optim.problemdef.optimizationvariable.optimvar.html \r\n% optimvar>, |x|, indexed by teams and weeks. The optimization solver will compute \r\n% values for an |optimvar| include in an <https:\/\/www.mathworks.com\/help\/optim\/ug\/optim.problemdef.optimizationproblem.optimproblem.html \r\n% optimproblem>. I make this a binary variable, that is a variable that only takes \r\n% on the values of 0 and 1 with this statement:\r\n%%\r\nx = optimvar('x',teams,weeks,'LowerBound',0,'UpperBound',1,'Type','integer');\r\n%% \r\n% In my model, a value of 1 for |x(i,j)| will mean that team |i| is chosen \r\n% for week |j| and a value of 0 means it is not chosen.\r\n% \r\n% To eliminate a team that's already been picked, set the upper bounds of \r\n% variables corresponding to those teams to 0.\r\n\r\nx.UpperBound(previousPicks,weeks) = 0;\r\n%% \r\n% *Optimization Problem and Objective*\r\n% \r\n% Next, define the optimization problem and the objective to maximize the \r\n% sum of the win probabilities of the chosen teams:\r\n%%\r\np = optimproblem;\r\np.ObjectiveSense = 'maximize';\r\np.Objective = sum(sum(winProbs.*x));\r\n%% \r\n% *Constraints*\r\n% \r\n% The first constraint statement generates one constraint for each team: \r\n% a team can be chosen at most once. The result is an <https:\/\/www.mathworks.com\/help\/optim\/ug\/optim.problemdef.optimizationconstraint.optimconstr.html \r\n% optimconstr>. Show the first constraint of each set to check the formulation \r\n% as it's built.\r\n%%\r\np.Constraints.eachTeamAtMostOnce = sum(x,2) <= 1; \r\nshowconstr(p.Constraints.eachTeamAtMostOnce(1))\r\n%% \r\n% The second constraint statement generates one constraint for each week: \r\n% a team must be choosen.\r\n\r\np.Constraints.mustPickOne = sum(x,1) == 1;\r\nshowconstr(p.Constraints.mustPickOne(1))\r\n%% \r\n% *Solve*\r\n% \r\n% Now solve the optimization problem. The <https:\/\/www.mathworks.com\/help\/optim\/ug\/optim.problemdef.optimizationproblem.solve.html \r\n% solve> function will call the mixed-integer linear programming solver, <https:\/\/www.mathworks.com\/help\/optim\/ug\/intlinprog.html \r\n% intlinprog>,| b|ecause some of the optimization variables are of type integer\r\n%%\r\nsoln = solve(p);\r\n%% \r\n% \r\n% \r\n% *Tabulate choices*\r\n% \r\n% Optimization Toolbox solvers use floating point arithmetic so it's possible \r\n% that the solution values may not be integral. I want to use the values as logical \r\n% indexes, so I will round them just to be sure that they are.\r\n%%\r\npicks = round(soln.x);\r\n[teampicks,weekpicks] = find(picks);\r\nprobpicks = find(picks);\r\ntable(weeks(weekpicks),teams(teampicks),winProbs(probpicks),...\r\n     'VariableNames',{'Week','Team','WinProb'})\r\n%% Extending the Model\r\n% The win probabilities will be updated on the results of the prior games as \r\n% the season progresses. Maybe I don't want to value the probabilities for the \r\n% end of the season as highly as those in the current week. This can be done by \r\n% applying a discount factor. In the <https:\/\/www.mathworks.com\/help\/matlab\/live-scripts-and-functions.html \r\n% Live Script> version of this post, I set the discount factor with a <https:\/\/www.mathworks.com\/help\/matlab\/matlab_prog\/add-interactive-controls-to-a-live-script.html \r\n% Live Control> to make it easy to experiment with different values.\r\n%%\r\ndiscountFactor =0.05\r\nscalePerWeek = (1-discountFactor).^(0:nWeeksLeft-1);\r\ndProbs = winProbs.*scalePerWeek;\r\np.Objective = sum(sum(x.*dProbs));\r\nsoln = solve(p);\r\n[teampicks,weekpicks] = find(picks);\r\nprobpicks = find(picks);\r\ntable(weeks(weekpicks),teams(teampicks),winProbs(probpicks),...\r\n     'VariableNames',{'Week','Team','WinProb'})\r\n%% \r\n% Applying a discount factor of 5% didn't change the results.\r\n% \r\n% I might want to only choose winning teams when I have no losses so that \r\n% I can pick the best of the losing teams when I have to pick among them.\r\n%%\r\nonlyWinners = false;\r\nif onlyWinners\r\n    p.Constraints.onlyWinners = sum(x(losers,weeks),1) == 0;\r\n    \r\nend\r\n\r\n%% \r\n% With one loss, I must pick from the losers bracket.\r\n\r\nonlyLosers = true;\r\nif onlyLosers\r\n    p.Constraints.onlyLosers = sum(x(losers,weeks),1) >= 1;\r\n    showconstr(p.Constraints.onlyLosers(1))\r\nend\r\n%%\r\nsoln = solve(p);\r\n[teampicks,weekpicks] = find(picks);\r\nprobpicks = find(picks);\r\ntable(weeks(weekpicks),teams(teampicks),winProbs(probpicks),...\r\n     'VariableNames',{'Week','Team','WinProb'})\r\n%% Results\r\n% So how did I do? In both years, I was knocked out fairly early but finished \r\n% at the top of the secondary pool. \r\n% \r\n% Unfortunately, one player had no losses in both years so there was no secondary \r\n% pool award. That player has not revealed his strategy but there is speculation \r\n% that he uses an optimization model.  A player with just one loss did reveal \r\n% his strategy: picking the best team from fivethirtyeight.com each week. That's \r\n% the myopic strategy I didn't want to use! I can see why it makes sense, though. \r\n% There are about twice as many teams as games so as long as you can pick from \r\n% the entire league, there should be some good picks available each week. A greedy \r\n% strategy also makes sense when the predictions are changing during the season.\r\n% \r\n% I'm not willing to give up on placing well in the secondary pool so I'll \r\n% keep my assignment model. I can make it greedier by using a higher discount \r\n% factor. Maybe I should also deal with the uncertainty in the probabilities by \r\n% using data from more than one source or by using a stochastic optimization approach.  \r\n% Now that I have two years worth of data I can do some model validation before \r\n% deciding on this year's strategy. It's time to get busy!\r\n% \r\n% Do you have a better approach? Let us know your thoughts\r\n% <https:\/\/blogs.mathworks.com\/loren\/?p=3021#respond here>.\r\n##### SOURCE END ##### eae73635b2774ac09bffe05bb2159741\r\n-->","protected":false},"excerpt":{"rendered":"<!--introduction--><p>Today's guest blogger is Mary Fenelon, who is the product marketing manager for Optimization and Math here at MathWorks. In today's post she describes how she uses optimization to try to best the rest of the product marketing team.... <a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/loren\/2018\/08\/20\/optimizing-to-survive\/\">read more >><\/a><\/p>","protected":false},"author":39,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[60],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/3021"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/users\/39"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/comments?post=3021"}],"version-history":[{"count":5,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/3021\/revisions"}],"predecessor-version":[{"id":3037,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/3021\/revisions\/3037"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/media?parent=3021"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/categories?post=3021"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/tags?post=3021"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}