{"id":2762,"date":"2014-07-11T14:27:56","date_gmt":"2014-07-11T19:27:56","guid":{"rendered":"https:\/\/blogs.mathworks.com\/community\/?p=2762"},"modified":"2014-07-11T14:27:56","modified_gmt":"2014-07-11T19:27:56","slug":"weather-prediction-how-far-can-you-go","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/community\/2014\/07\/11\/weather-prediction-how-far-can-you-go\/","title":{"rendered":"Weather Prediction &#8211; How Far Can You Go?"},"content":{"rendered":"<div class=\"content\"><p>This summer my mother-in-law is renting a house on a lake in New Hampshire. Looking at the calendar, my wife said: \"The ten-day forecast makes it look like it's going to be pretty hot up at the lake next week.\" This led to a more general discussion of the merits of ten-day forecasts.<\/p>\r\n\r\n<p>\r\n<a href=\"https:\/\/www.flickr.com\/photos\/glass_house\/4123408445\"><img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/blogs.mathworks.com\/community\/files\/lake.jpg\" alt=\"lake\" width=\"400\" height=\"241\" class=\"alignnone size-full wp-image-2776\" \/><\/a>\r\n<\/p>\r\n\r\n<p>It's funny how we can make decisions based on long-term predictions of weather even though we rarely go back and verify that the forecast was any good. Somehow the fact that the forecast exists at all gives it value. I'm left pondering this question: how much should we trust a ten-day prediction? As it happens, I have some data that can help here. For some time, I have been collecting it on <a href=\"https:\/\/www.mathworks.com\/matlabcentral\/trendy\/plots\">Trendy<\/a>: the ten-day forecast for Natick, Massachusetts (hometown of MathWorks). So let's run some numbers.<\/p><p>Here's the trend: <a href=\"https:\/\/www.mathworks.com\/matlabcentral\/trendy\/trends\/1655\">Ten Day Forecast Highs for Natick, MA<\/a>.<\/p><p>Once a day this trend collects ten data points: today's high temperature and the predicted high temperatures for the next nine days. 
In MATLAB, we'll be working with a matrix with one row for each day and ten columns.<\/p><p>Let's get the data into MATLAB so we can play around with it. I can retrieve (and so can you) the data from Trendy as a JSON object using the following call:<\/p><p><a href=\"https:\/\/www.mathworks.com\/matlabcentral\/trendy\/trends\/1655\/trend_data.json\">https:\/\/www.mathworks.com\/matlabcentral\/trendy\/trends\/1655\/trend_data.json<\/a><\/p><p>In order to read this into MATLAB, I'm going to use <a href=\"https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/42236-parse-json-text\">Joe Hicklin's JSON parser<\/a>.<\/p><pre class=\"codeinput\">url = <span class=\"string\">'https:\/\/www.mathworks.com\/matlabcentral\/trendy\/trends\/1655\/trend_data.json'<\/span>;\r\njson = urlread(url);\r\nraw = JSON.parse(json);\r\n<\/pre><pre class=\"codeinput\">t = zeros(length(raw),1);\r\nd = zeros(length(raw),10);\r\n<span class=\"keyword\">for<\/span> i = 1:length(raw)\r\n    t(i) = raw{i}{1};\r\n    predictions = raw{i}{2};\r\n    <span class=\"keyword\">for<\/span> j = 1:10\r\n        d(i,j) = str2num(predictions{j});\r\n    <span class=\"keyword\">end<\/span>\r\n<span class=\"keyword\">end<\/span>\r\n\r\nfirstTenRows = d(1:10,:)\r\n<\/pre><pre class=\"codeoutput\">\r\nfirstTenRows =\r\n\r\n    44    51    59    66    46    48    53    50    51    54\r\n    50    58    63    49    48    52    48    46    57    54\r\n    59    61    49    47    53    49    48    43    41    48\r\n    62    49    48    54    49    39    39    44    47    46\r\n    49    48    54    50    40    38    39    47    51    54\r\n    47    55    50    39    40    48    52    53    53    53\r\n    54    50    40    39    48    53    54    52    52    50\r\n    49    40    38    50    55    54    56    56    53    49\r\n    40    39    50    56    54    52    56    54    47    43\r\n    39    50    55    55    55    59    58    40    41    46\r\n\r\n<\/pre><p>Now I have a temperature prediction matrix 
that's structured like this.<\/p>\r\n\r\n<p>\r\n<img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/blogs.mathworks.com\/community\/files\/weather-predictions.png\" alt=\"weather-predictions\" width=\"689\" height=\"370\" class=\"alignnone size-full wp-image-2777\" \/>\r\n<\/p>\r\n\r\n<p>I want to re-order this matrix so that each row shows the prediction trajectory for a single day in time. That means picking off the anti-diagonals highlighted in the diagram above. So let's write some code that does this shift. I'm going to end up with two new matrices, d1 and d2.<\/p><pre class=\"codeinput\">rowIndex = 1:10;\r\ncolIndex = 10:-1:1;\r\nsz = size(d);\r\n\r\nlen = (size(d,1)-10);\r\nd1 = zeros(len,10);\r\nd2 = zeros(len,10);\r\nt1 = zeros(len,1);\r\n<span class=\"keyword\">for<\/span> i = 1:len\r\n    ind = sub2ind(sz,rowIndex+i-1,colIndex);\r\n    trend = d(ind);\r\n    d1(i,:) = trend;\r\n    d2(i,:) = trend-trend(end);\r\n    t1(i) = t(i+9);\r\n<span class=\"keyword\">end<\/span>\r\n\r\nfirstTenRows = d1(1:10,:)\r\n<\/pre><pre class=\"codeoutput\">\r\nfirstTenRows =\r\n\r\n    54    57    43    39    38    40    39    38    39    39\r\n    54    41    44    39    48    48    50    50    50    49\r\n    48    47    47    52    53    55    56    55    54    57\r\n    46    51    53    54    54    54    55    55    57    58\r\n    54    53    52    56    52    55    57    57    60    58\r\n    53    52    56    56    59    59    61    63    63    64\r\n    50    53    54    58    45    42    45    44    44    43\r\n    49    47    40    39    37    42    43    43    43    43\r\n    43    41    41    42    44    48    47    49    48    47\r\n    46    44    48    49    46    52    51    50    49    48\r\n\r\n<\/pre><p>In d1, each row is the evolving temperature prediction for a single day. 
So when we plot the first row of d1, we're getting the predictive arc for November 13th, 2013.<\/p><pre class=\"codeinput\">i = 1;\r\nplot(-9:0,d1(i,:))\r\ntitle(sprintf(<span class=\"string\">'Predicted Temperature for %s'<\/span>,datestr(t1(i),1)))\r\nxlabel(<span class=\"string\">'Time of Prediction (Offset in Days)'<\/span>)\r\nylabel(<span class=\"string\">'Predicted Temperature (Deg. F)'<\/span>)\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/community\/files\/weather_prediction_01.png\" alt=\"\"> <p>In d2, we subtract the last value in each row from every entry in that row. Since this last value is the final (and presumably correct) temperature, this difference gives us the predictive error across the ten days. Here's the error for the November 13th prediction.<\/p><pre class=\"codeinput\">i = 1;\r\nplot(-9:0,d2(i,:))\r\ntitle(sprintf(<span class=\"string\">'Error in Predicted Temperature for %s'<\/span>,datestr(t1(i),1)))\r\nxlabel(<span class=\"string\">'Time of Prediction (Offset in Days)'<\/span>)\r\nylabel(<span class=\"string\">'Prediction Error (Deg. F)'<\/span>)\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/community\/files\/weather_prediction_02.png\" alt=\"\"> <p>Notice that it shrinks to zero over time (the final point is zero by construction, since we subtract the last value from itself). That's good! Our predictions get more accurate as we approach the actual day in question. But the early predictions were off by as much as 18 degrees. Is that good or bad? You tell me.<\/p><p>Now let's look at all the days.<\/p><pre class=\"codeinput\">plot(-9:0,d2',<span class=\"string\">'Color'<\/span>,[0.5 0.5 1])\r\ntitle(<span class=\"string\">'Error in Predicted Temperature'<\/span>)\r\nxlabel(<span class=\"string\">'Time of Prediction (Offset in Days)'<\/span>)\r\nylabel(<span class=\"string\">'Prediction Error (Deg. 
F)'<\/span>)\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/community\/files\/weather_prediction_03.png\" alt=\"\"> <p>It's hard to get a sense of the error distribution. So let's finish with a histogram of the absolute value of the error. Out of 240 measurements in this data set, the median error for a ten-day prediction is six degrees.<\/p><pre class=\"codeinput\">hist(abs(d2(:,1)),1:25)\r\ntitle(<span class=\"string\">'Histogram of Error in the Ten-Day Forecast'<\/span>)\r\nxlabel(<span class=\"string\">'Error (deg. F)'<\/span>)\r\nylabel(<span class=\"string\">'Number of Samples'<\/span>)\r\n<\/pre><img decoding=\"async\" vspace=\"5\" hspace=\"5\" src=\"https:\/\/blogs.mathworks.com\/community\/files\/weather_prediction_04.png\" alt=\"\"> <p>That seems pretty good. Most of the time that error is going to be less than seven or so degrees Fahrenheit (or four degrees Celsius). I probably don't need to pack a sweater for the weekend at the lake.<\/p><p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br>Published with MATLAB&reg; R2014a<br><\/p><\/div>","protected":false},"excerpt":{"rendered":"<div class=\"overview-image\"><img decoding=\"async\"  class=\"img-responsive\" src=\"https:\/\/blogs.mathworks.com\/community\/files\/lake.jpg\" onError=\"this.style.display ='none';\" \/><\/div><p>This summer my mother-in-law is renting a house on a lake in New Hampshire. Looking at the calendar, my wife said: \"The ten-day forecast makes it look like it's going to be pretty hot up at the lake... 
<a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/community\/2014\/07\/11\/weather-prediction-how-far-can-you-go\/\">read more >><\/a><\/p>","protected":false},"author":69,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/community\/wp-json\/wp\/v2\/posts\/2762"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/community\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/community\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/community\/wp-json\/wp\/v2\/users\/69"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/community\/wp-json\/wp\/v2\/comments?post=2762"}],"version-history":[{"count":7,"href":"https:\/\/blogs.mathworks.com\/community\/wp-json\/wp\/v2\/posts\/2762\/revisions"}],"predecessor-version":[{"id":3075,"href":"https:\/\/blogs.mathworks.com\/community\/wp-json\/wp\/v2\/posts\/2762\/revisions\/3075"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/community\/wp-json\/wp\/v2\/media?parent=2762"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/community\/wp-json\/wp\/v2\/categories?post=2762"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/community\/wp-json\/wp\/v2\/tags?post=2762"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}