{"id":161,"date":"2008-11-13T15:33:26","date_gmt":"2008-11-13T20:33:26","guid":{"rendered":"https:\/\/blogs.mathworks.com\/loren\/2008\/11\/13\/new-ways-with-random-numbers-part-ii\/"},"modified":"2016-08-02T14:42:33","modified_gmt":"2016-08-02T19:42:33","slug":"new-ways-with-random-numbers-part-ii","status":"publish","type":"post","link":"https:\/\/blogs.mathworks.com\/loren\/2008\/11\/13\/new-ways-with-random-numbers-part-ii\/","title":{"rendered":"New Ways With Random Numbers, Part II"},"content":{"rendered":"<div xmlns:mwsh=\"https:\/\/www.mathworks.com\/namespace\/mcode\/v1\/syntaxhighlight.dtd\" class=\"content\">\r\n   <introduction>\r\n      <p>Once again we're going to hear from guest blogger Peter Perkins, who is a statistical software developer here at The MathWorks.\r\n      <\/p>\r\n   <\/introduction>\r\n   <h3>Contents<\/h3>\r\n   <div>\r\n      <ul>\r\n         <li><a href=\"#2\">Using Seeds to Get Different Results<\/a><\/li>\r\n         <li><a href=\"#12\">Using Multiple Independent Streams<\/a><\/li>\r\n         <li><a href=\"#20\">Using Substreams to Get Different Results<\/a><\/li>\r\n         <li><a href=\"#25\">Being a Good Random Citizen of MATLAB Nation<\/a><\/li>\r\n         <li><a href=\"#29\">Other Situations?<\/a><\/li>\r\n      <\/ul>\r\n   <\/div>\r\n   <p><a href=\"https:\/\/blogs.mathworks.com\/loren\/?p=159\">Last time<\/a>, I showed a few ways to reproduce the values coming from MATLAB's random number generators. But sometimes what's needed is\r\n      the \"opposite of reproducibility\" - you want to make sure that different runs of a simulation use <i>different<\/i> random inputs, because you want to be able to assume that the different runs are statistically independent.  That's today's\r\n      subject.\r\n   <\/p>\r\n   <h3>Using Seeds to Get Different Results<a name=\"2\"><\/a><\/h3>\r\n   <p>Remember <a href=\"https:\/\/blogs.mathworks.com\/loren\/?p=159#9\">seeds<\/a> from my last post?  I talked about reproducibility.  If you create a <a href=\"https:\/\/blogs.mathworks.com\/loren\/?p=159#3\">random number stream<\/a> with the same seed, you'll get the same sequence of values every time.  But just as important, if you create different streams\r\n      using different seeds, you'll get different sequences.\r\n   <\/p>\r\n   <p>When MATLAB starts, it creates the default stream with commands equivalent to<\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">stream0 = RandStream(<span style=\"color: #A020F0\">'mt19937ar'<\/span>,<span style=\"color: #A020F0\">'Seed'<\/span>,0)<\/pre><pre style=\"font-style:oblique\">stream0 = \r\nmt19937ar random stream\r\n             Seed: 0\r\n         RandnAlg: Ziggurat\r\n<\/pre><p>We saw that last week.  To get a <i>different<\/i> sequence of random values, I can create a new stream with a different seed, such as\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">stream1 = RandStream(<span style=\"color: #A020F0\">'mt19937ar'<\/span>,<span style=\"color: #A020F0\">'Seed'<\/span>,1)<\/pre><pre style=\"font-style:oblique\">stream1 = \r\nmt19937ar random stream\r\n             Seed: 1\r\n         RandnAlg: Ziggurat\r\n<\/pre><p>Next I'll make <i>it<\/i> the default stream.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">RandStream.setDefaultStream(stream1);<\/pre><p>By setting <tt>stream1<\/tt> as the default stream, I've made it so that <tt>rand<\/tt>, <tt>randn<\/tt>, and <tt>randi<\/tt> will use that stream from now on, and will draw values from a different sequence than they did when using <tt>stream0<\/tt>.  Just to confirm that I've changed the default stream,\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">stream1 == RandStream.getDefaultStream()<\/pre><pre style=\"font-style:oblique\">ans =\r\n     1\r\n<\/pre><p>In practice, I might have created the new stream on a second processor, to run in parallel with the first, or created it in\r\n      a second MATLAB session on the same processor.  It doesn't matter, either way works the same.\r\n   <\/p>\r\n   <p>One common use of seeds is to get different random values each time you start up MATLAB without having to think about it every\r\n      time.  To do that, you can base the seed on the system clock.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">mystream = RandStream(<span style=\"color: #A020F0\">'mt19937ar'<\/span>,<span style=\"color: #A020F0\">'Seed'<\/span>,sum(100*clock))<\/pre><pre style=\"font-style:oblique\">mystream = \r\nmt19937ar random stream\r\n             Seed: 210226\r\n         RandnAlg: Ziggurat\r\n<\/pre><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">RandStream.setDefaultStream(mystream);<\/pre><p>Bear in mind that you won't be able to reproduce your results later on unless you hang onto the value of <tt>sum(100*clock)<\/tt>.  Doh!\r\n   <\/p>\r\n   <p>You might wonder what value you should use for a seed, and if you will somehow add \"extra randomness\" into MATLAB by using\r\n      the system clock.  The answer is, <i>you won't<\/i>.  In fact, taking this strategy to the extreme and reseeding a generator before each call will very likely do <i>bad things<\/i> to the statistical properties of the sequence of values you end up with. There's nothing wrong with choosing widely-spaced\r\n      seeds, but the's no real advantage.\r\n   <\/p>\r\n   <p>Remember, seeding a stream is not something you want to do a lot of. It's most useful if you use it as an initialization step,\r\n      perhaps at MATLAB startup, perhaps before running a simulation.\r\n   <\/p>\r\n   <h3>Using Multiple Independent Streams<a name=\"12\"><\/a><\/h3>\r\n   <p>A reasonable question to ask at this point is, \"If you can't tell me how far apart or what in order the different seeds are\r\n      in the sequence of random values, how do I know that two simulation runs won't end up reusing the same random values?\"  The\r\n      answer is, \"You don't.\"  Practically speaking, the state space of the Mersenne Twister used in the example above is so ridiculously\r\n      much larger (2^19937 elements) than the number of possible seeds (2^32) that the chances of overlap in different simulation\r\n      runs are pretty remote unless you use a large number of different seeds.  But there is a more subtle potential problem - not\r\n      enough is known about the statistical (pseudo)independence of sequences of values corresponding to different \"seeds\" of the\r\n      Mersenne Twister generator.\r\n   <\/p>\r\n   <p>There is a better way available in MATLAB R2008b: two new generator algorithms support <i>multiple independent<\/i> streams. That is, you can create separate streams that are guaranteed to not overlap, and for which tests that demonstrate\r\n      (pseudo)independence of the values not only within each stream, but between streams, have been carried out.  Notably, the\r\n      Mersenne Twister (<tt>'mt19937ar'<\/tt>) does not support multiple streams.\r\n   <\/p>\r\n   <p>The two generator types that support multiple independent streams are the Combined Multiple Recursive (<tt>'mrg32k3a'<\/tt>) and the Multiplicative Lagged Fibonacci (<tt>'mlfg6331_64'<\/tt>) generators (the doc has <a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2008b\/techdoc\/math\/brrztpq.html\">references<\/a>.) I'll create one of the former, and make it the default stream.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">cmrg1 = RandStream.create(<span style=\"color: #A020F0\">'mrg32k3a'<\/span>,<span style=\"color: #A020F0\">'NumStreams'<\/span>,2,<span style=\"color: #A020F0\">'StreamIndices'<\/span>,1);\r\nRandStream.setDefaultStream(cmrg1)<\/pre><p>The first parameter, <tt>NumStreams<\/tt>, says how many independent streams I plan on using, in total.  <tt>StreamIndices<\/tt> says which one (or which ones) I actually want to create right now.  Note that I've used the <a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2008b\/techdoc\/ref\/randstream.create.html\"><tt>RandStream.create<\/tt><\/a> factory method, which allows for creating multiple streams as a group, and that the display for this stream\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">cmrg1<\/pre><pre style=\"font-style:oblique\">cmrg1 = \r\nmrg32k3a random stream (current default)\r\n      StreamIndex: 1\r\n       NumStreams: 2\r\n             Seed: 0\r\n         RandnAlg: Ziggurat\r\n<\/pre><p>indicates that it is part of a larger group (even though I have not yet created the second stream).<\/p>\r\n   <p>Let's say I've made one run of a simulation using the above as the default stream, represented by these 100,000 random values.<\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">x1 = rand(100000,1);<\/pre><p>Now I want to make a second, independent simulation run. If the runs are short enough, I might just run one after another\r\n      in the same MATLAB session, without modifying the state of the default stream between runs. Since successive values coming\r\n      from a stream are statistically independent, there's nothing at all wrong with that.  On the other hand, if I want to shut\r\n      down MATLAB between runs, or I want to run on two machines in parallel, that isn't going to work.\r\n   <\/p>\r\n   <p>What I can do, though, is to create the second of my two independent streams, and make <i>it<\/i> the default stream.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">cmrg2 = RandStream.create(<span style=\"color: #A020F0\">'mrg32k3a'<\/span>,<span style=\"color: #A020F0\">'NumStreams'<\/span>,2,<span style=\"color: #A020F0\">'StreamIndices'<\/span>,2)\r\nRandStream.setDefaultStream(cmrg2)\r\nx2 = rand(100000,1);<\/pre><pre style=\"font-style:oblique\">cmrg2 = \r\nmrg32k3a random stream\r\n      StreamIndex: 2\r\n       NumStreams: 2\r\n             Seed: 0\r\n         RandnAlg: Ziggurat\r\n<\/pre><p>Let's look at the correlation between the two streams.<\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">corrcoef(x1,x2)<\/pre><pre style=\"font-style:oblique\">ans =\r\n            1    0.0015578\r\n    0.0015578            1\r\n<\/pre><p>That isn't by any means <i>proof<\/i> of statistical independence, but it is reassuring.  In fact, much more thorough tests have been used to demonstrate between-stream\r\n      independence of these generators.\r\n   <\/p>\r\n   <p>As long as the streams are created correctly, it doesn't matter where or when I make the second simulation run, the results\r\n      will still be independent from the first run.  <tt>'mrg32k3a'<\/tt> and <tt>'mlfg6331_64'<\/tt> both support a <i>huge<\/i> number of parallel independent streams, and each stream is itself very, very long, so you can use multiple independent streams\r\n      for very large simulations.\r\n   <\/p>\r\n   <h3>Using Substreams to Get Different Results<a name=\"20\"><\/a><\/h3>\r\n   <p>Remember <a href=\"https:\/\/blogs.mathworks.com\/loren\/?p=159#15\">substreams<\/a> from my last post? I promised to explain why they're different than seeds, so here it is.  Unlike seeds, which are located\r\n      <i>who-knows-where<\/i>, <i>who-knows-how-far-apart<\/i> along the sequence of random numbers, the spacing between substreams is known, so any chance of overlap can be eliminated.\r\n       And like independent parallel streams, research has been done to demonstrate statistical independence across substreams.\r\n       In short, they are a more controlled way to do many of the same things that seeds have traditionally been used for, and a\r\n      more lightweight solution than parallel streams.\r\n   <\/p>\r\n   <p>Substreams provide a quick and easy way to ensure that you get <i>different<\/i> results from the same code at different times.  For example, if I run this loop (from the last post)\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">defaultStream = RandStream.getDefaultStream();\r\n<span style=\"color: #0000FF\">for<\/span> i = 1:5\r\n    defaultStream.Substream = i;\r\n    z = rand(1,i)\r\n<span style=\"color: #0000FF\">end<\/span><\/pre><pre style=\"font-style:oblique\">z =\r\n      0.32457\r\nz =\r\n      0.25369       0.3805\r\nz =\r\n      0.22759      0.34154      0.61441\r\nz =\r\n    0.0054898      0.12139      0.70173      0.15851\r\nz =\r\n      0.51784      0.37283      0.54639      0.14839      0.89321\r\n<\/pre><p>and then run off to the Tahiti, and then come back (maybe), I can run <i>this<\/i> loop\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\"><span style=\"color: #0000FF\">for<\/span> i = 6:10\r\n    defaultStream.Substream = i;\r\n    z = rand(1,11-i)\r\n<span style=\"color: #0000FF\">end<\/span><\/pre><pre style=\"font-style:oblique\">z =\r\n      0.48889      0.46817      0.62735      0.14538      0.84446\r\nz =\r\n      0.66568      0.11152     0.035575     0.013499\r\nz =\r\n      0.60589      0.28911      0.74664\r\nz =\r\n      0.24981      0.78979\r\nz =\r\n      0.25271\r\n<\/pre><p>and be certain that I've generated random values independently of the first set of 5 iterations, and in fact have gotten exactly\r\n      the same results as if I had never gone to Tahiti (though not gotten as good of a tan):\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\"><span style=\"color: #0000FF\">for<\/span> i = 1:10\r\n    defaultStream.Substream = i;\r\n    z = rand(1,min(i,11-i));\r\n<span style=\"color: #0000FF\">end<\/span><\/pre><p>As always, I might run the second loop at a different time than the first, but on the same machine, or run it at the same\r\n      time on a different machine. As long as the stream is set up correctly, it makes no difference.\r\n   <\/p>\r\n   <h3>Being a Good Random Citizen of MATLAB Nation<a name=\"25\"><\/a><\/h3>\r\n   <p>It's important to remember that replacing the default stream, or changing its state, either by resetting it, by setting its\r\n      state directly, or by positioning it to a new substream, affect all subsequent calls to <tt>rand<\/tt>, <tt>randn<\/tt>, and <tt>randi<\/tt>, and are things you should be careful about doing. If code you write is used by someone else who has no idea that you've\r\n      done one of those things, your code may unintentionally invalidate the results of their simulation by affecting the statistical\r\n      properties of the random numbers they use, or make it difficult for them to reproduce their results when they need to.\r\n   <\/p>\r\n   <p>In general, I recommend leaving control of the default stream up to the end user as much as possible.  If you are your own\r\n      end user, it might be a good idea to execute \"stream control\" statements only at MATLAB startup, or only right before running\r\n      a simulation.\r\n   <\/p>\r\n   <p>In code that will be called by others, it's safe to generate values using the default stream, with no effect on the statistical\r\n      properties of other random numbers generated in other parts of the code.  But if possible, don't otherwise change the default\r\n      stream's state or configuration by doing things like resetting or replacing the default stream.  It's safest if those kinds\r\n      of operations are done only by, or at the specific request of, the end user, so that it's clear when they are happening.\r\n   <\/p>\r\n   <p>When that's not possible, if you need to somehow modify the default stream's state, consider \"cleaning up after yourself\"\r\n      by saving and restoring the state before and after your code does whatever it needs to do.  For example,\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">defaultStream = RandStream.getDefaultStream();\r\nsavedState = defaultStream.State;\r\n\r\n<span style=\"color: #228B22\">% some use of the default stream and its state<\/span>\r\n\r\ndefaultStream.State = savedState;<\/pre><p>makes it as if your code did not modify the default stream at all.<\/p>\r\n   <p>Alternatively, consider using a \"private stream\", one that you create and draw values from without affecting the default stream.\r\n       For example.\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">mystream = RandStream(<span style=\"color: #A020F0\">'mt19937ar'<\/span>,<span style=\"color: #A020F0\">'Seed'<\/span>,sum(100*clock));\r\nx = rand(mystream,100,1);<\/pre><p>generates values completely separately from the default stream.  That call to <tt>rand<\/tt> (the <a href=\"https:\/\/www.mathworks.com\/help\/releases\/R2008b\/techdoc\/ref\/randstream.rand.html\">method<\/a> not the function) uses your private stream, not the default stream.  Another possibility is to make the private stream the\r\n      default, but only temporarily.  For example,\r\n   <\/p><pre style=\"background: #F9F7F3; padding: 10px; border: 1px solid rgb(200,200,200)\">mystream = RandStream(<span style=\"color: #A020F0\">'mt19937ar'<\/span>,<span style=\"color: #A020F0\">'Seed'<\/span>,sum(100*clock));\r\nsavedStream = RandStream.getDefaultStream();\r\nRandStream.setDefaultStream(mystream);\r\n\r\n<span style=\"color: #228B22\">% some use of the (new) default stream and its state<\/span>\r\n\r\n\r\nRandStream.setDefaultStream(savedStream);<\/pre><p>allows your code to use your stream as the default, but also puts things back just as they were originally.<\/p>\r\n   <h3>Other Situations?<a name=\"29\"><\/a><\/h3>\r\n   <p>Do the examples I've covered leave out any situations involving reproducibility of random numbers, or \"the opposite or reproducibility\",\r\n      that you've run into?  Let me know <a href=\"https:\/\/blogs.mathworks.com\/loren\/?p=161#respond\">here<\/a>.\r\n   <\/p><script language=\"JavaScript\">\r\n<!--\r\n\r\n    function grabCode_4d6e2c8d5302449683dba7ff1786de39() {\r\n        \/\/ Remember the title so we can use it in the new page\r\n        title = document.title;\r\n\r\n        \/\/ Break up these strings so that their presence\r\n        \/\/ in the Javascript doesn't mess up the search for\r\n        \/\/ the MATLAB code.\r\n        t1='4d6e2c8d5302449683dba7ff1786de39 ' + '##### ' + 'SOURCE BEGIN' + ' #####';\r\n        t2='##### ' + 'SOURCE END' + ' #####' + ' 4d6e2c8d5302449683dba7ff1786de39';\r\n    \r\n        b=document.getElementsByTagName('body')[0];\r\n        i1=b.innerHTML.indexOf(t1)+t1.length;\r\n        i2=b.innerHTML.indexOf(t2);\r\n \r\n        code_string = b.innerHTML.substring(i1, i2);\r\n        code_string = code_string.replace(\/REPLACE_WITH_DASH_DASH\/g,'--');\r\n\r\n        \/\/ Use \/x3C\/g instead of the less-than character to avoid errors \r\n        \/\/ in the XML parser.\r\n        \/\/ Use '\\x26#60;' instead of '<' so that the XML parser\r\n        \/\/ doesn't go ahead and substitute the less-than character. \r\n        code_string = code_string.replace(\/\\x3C\/g, '\\x26#60;');\r\n\r\n        author = 'Loren Shure';\r\n        copyright = 'Copyright 2008 The MathWorks, Inc.';\r\n\r\n        w = window.open();\r\n        d = w.document;\r\n        d.write('<pre>\\n');\r\n        d.write(code_string);\r\n\r\n        \/\/ Add author and copyright lines at the bottom if specified.\r\n        if ((author.length > 0) || (copyright.length > 0)) {\r\n            d.writeln('');\r\n            d.writeln('%%');\r\n            if (author.length > 0) {\r\n                d.writeln('% _' + author + '_');\r\n            }\r\n            if (copyright.length > 0) {\r\n                d.writeln('% _' + copyright + '_');\r\n            }\r\n        }\r\n\r\n        d.write('<\/pre>\\n');\r\n      \r\n      d.title = title + ' (MATLAB code)';\r\n      d.close();\r\n      }   \r\n      \r\n-->\r\n<\/script><p style=\"text-align: right; font-size: xx-small; font-weight:lighter;   font-style: italic; color: gray\"><br><a href=\"javascript:grabCode_4d6e2c8d5302449683dba7ff1786de39()\"><span style=\"font-size: x-small;        font-style: italic;\">Get \r\n            the MATLAB code \r\n            <noscript>(requires JavaScript)<\/noscript><\/span><\/a><br><br>\r\n      Published with MATLAB&reg; 7.7<br><\/p>\r\n<\/div>\r\n<!--\r\n4d6e2c8d5302449683dba7ff1786de39 ##### SOURCE BEGIN #####\r\n%% New Ways With Random Numbers, Part II\r\n% Once again we're going to hear from guest blogger\r\n% <https:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/loadAuthor.do?objectId=1093801&objectType=author Peter Perkins>,\r\n% who is a statistical software developer here at The MathWorks.\r\n\r\n%%\r\n% <https:\/\/blogs.mathworks.com\/loren\/?p=159 Last time>, I showed a few ways\r\n% to reproduce the values coming from MATLAB's random number generators.\r\n% But sometimes what's needed is the \"opposite of reproducibility\" - you\r\n% want to make sure that different runs of a simulation use _different_\r\n% random inputs, because you want to be able to assume that the different\r\n% runs are statistically independent.  That's today's subject.\r\n\r\n\r\n%% Using Seeds to Get Different Results\r\n% Remember <https:\/\/blogs.mathworks.com\/loren\/?p=159#9 seeds> from my last\r\n% post?  I talked about reproducibility.  If you create a \r\n% <https:\/\/blogs.mathworks.com\/loren\/?p=159#3 random number stream> with\r\n% the same seed, you'll get the same sequence of values every time.  But just\r\n% as important, if you create different streams using different seeds, you'll\r\n% get different sequences.\r\n\r\n%%\r\n% When MATLAB starts, it creates the default stream with commands\r\n% equivalent to\r\nstream0 = RandStream('mt19937ar','Seed',0)\r\n%%\r\n% We saw that last week.  To get a _different_ sequence of random\r\n% values, I can create a new stream with a different seed, such as\r\nstream1 = RandStream('mt19937ar','Seed',1)\r\n%%\r\n% Next I'll make _it_ the default stream.\r\nRandStream.setDefaultStream(stream1);\r\n%%\r\n% By setting |stream1| as the default stream, I've made it so that |rand|,\r\n% |randn|, and |randi| will use that stream from now on, and will draw values\r\n% from a different sequence than they did when using |stream0|.  Just to\r\n% confirm that I've changed the default stream,\r\nstream1 == RandStream.getDefaultStream()\r\n%%\r\n% In practice, I might have created the new stream on a second processor, to\r\n% run in parallel with the first, or created it in a second MATLAB session on\r\n% the same processor.  It doesn't matter, either way works the same.\r\n\r\n%%\r\n% One common use of seeds is to get different random values each time you\r\n% start up MATLAB without having to think about it every time.  To do that,\r\n% you can base the seed on the system clock.\r\nmystream = RandStream('mt19937ar','Seed',sum(100*clock))\r\n%%\r\nRandStream.setDefaultStream(mystream);\r\n%%\r\n% Bear in mind that you won't be able to reproduce your results later on\r\n% unless you hang onto the value of |sum(100*clock)|.  Doh!\r\n\r\n%%\r\n% You might wonder what value you should use for a seed, and if you will\r\n% somehow add \"extra randomness\" into MATLAB by using the system clock.  The\r\n% answer is, _you won't_.  In fact, taking this strategy to the extreme and\r\n% reseeding a generator before each call will very likely do _bad things_ to\r\n% the statistical properties of the sequence of values you end up with.\r\n% There's nothing wrong with choosing widely-spaced seeds, but the's no real\r\n% advantage.\r\n%\r\n% Remember, seeding a stream is not something you want to do a lot of. It's\r\n% most useful if you use it as an initialization step, perhaps at MATLAB\r\n% startup, perhaps before running a simulation.\r\n\r\n\r\n%% Using Multiple Independent Streams\r\n% A reasonable question to ask at this point is, \"If you can't tell me how far\r\n% apart or what in order the different seeds are in the sequence of random\r\n% values, how do I know that two simulation runs won't end up reusing the same\r\n% random values?\"  The answer is, \"You don't.\"  Practically speaking, the\r\n% state space of the Mersenne Twister used in the example above is so\r\n% ridiculously much larger (2^19937 elements) than the number of possible\r\n% seeds (2^32) that the chances of overlap in different simulation runs are\r\n% pretty remote unless you use a large number of different seeds.  But there\r\n% is a more subtle potential problem - not enough is known about the\r\n% statistical (pseudo)independence of sequences of values corresponding to\r\n% different \"seeds\" of the Mersenne Twister generator.\r\n%\r\n% There is a better way available in MATLAB R2008b: two new generator\r\n% algorithms support _multiple independent_ streams. That is, you can create\r\n% separate streams that are guaranteed to not overlap, and for which tests\r\n% that demonstrate (pseudo)independence of the values not only within each\r\n% stream, but between streams, have been carried out.  Notably, the Mersenne\r\n% Twister (|'mt19937ar'|) does not support multiple streams.\r\n%\r\n% The two generator types that support multiple independent streams are the\r\n% Combined Multiple Recursive (|'mrg32k3a'|) and the Multiplicative\r\n% Lagged Fibonacci (|'mlfg6331_64'|) generators (the doc has\r\n% <https:\/\/www.mathworks.com\/help\/releases\/R2008b\/techdoc\/math\/brrztpq.html references>.)\r\n% I'll create one of the former, and make it the default stream.\r\ncmrg1 = RandStream.create('mrg32k3a','NumStreams',2,'StreamIndices',1);\r\nRandStream.setDefaultStream(cmrg1)\r\n%%\r\n% The first parameter, |NumStreams|, says how many independent streams I plan\r\n% on using, in total.  |StreamIndices| says which one (or which ones) I\r\n% actually want to create right now.  Note that I've used the\r\n% <https:\/\/www.mathworks.com\/help\/releases\/R2008b\/techdoc\/ref\/randstream.create.html |RandStream.create|>\r\n% factory method, which allows for creating multiple streams as a group, and\r\n% that the display for this stream\r\ncmrg1\r\n%%\r\n% indicates that it is part of a larger group (even though I have not yet\r\n% created the second stream).\r\n\r\n%%\r\n% Let's say I've made one run of a simulation using the above as the default\r\n% stream, represented by these 100,000 random values.\r\nx1 = rand(100000,1);\r\n%%\r\n% Now I want to make a second, independent simulation run. If the runs are\r\n% short enough, I might just run one after another in the same MATLAB session,\r\n% without modifying the state of the default stream between runs. Since\r\n% successive values coming from a stream are statistically independent,\r\n% there's nothing at all wrong with that.  On the other hand, if I want to\r\n% shut down MATLAB between runs, or I want to run on two machines in parallel,\r\n% that isn't going to work.\r\n%\r\n% What I can do, though, is to create the second of my two independent\r\n% streams, and make _it_ the default stream.\r\ncmrg2 = RandStream.create('mrg32k3a','NumStreams',2,'StreamIndices',2)\r\nRandStream.setDefaultStream(cmrg2)\r\nx2 = rand(100000,1);\r\n%%\r\n% Let's look at the correlation between the two streams.\r\ncorrcoef(x1,x2)\r\n%%\r\n% That isn't by any means _proof_ of statistical independence, but it is\r\n% reassuring.  In fact, much more thorough tests have been used to demonstrate\r\n% between-stream independence of these generators.\r\n\r\n%%\r\n% As long as the streams are created correctly, it doesn't matter where or\r\n% when I make the second simulation run, the results will still be independent\r\n% from the first run.  |'mrg32k3a'| and |'mlfg6331_64'| both support a _huge_\r\n% number of parallel independent streams, and each stream is itself very, very\r\n% long, so you can use multiple independent streams for very large simulations. \r\n\r\n\r\n%% Using Substreams to Get Different Results\r\n% Remember <https:\/\/blogs.mathworks.com\/loren\/?p=159#15 substreams> from my\r\n% last post? I promised to explain\r\n% why they're different than seeds, so here it is.  Unlike seeds, which are\r\n% located _who-knows-where_, _who-knows-how-far-apart_ along the sequence of\r\n% random numbers, the spacing between substreams is known, so any chance of\r\n% overlap can be eliminated.  And like independent parallel streams, research\r\n% has been done to demonstrate statistical independence across substreams.  In\r\n% short, they are a more controlled way to do many of the same things that seeds\r\n% have traditionally been used for, and a more lightweight solution than\r\n% parallel streams.\r\n\r\n%%\r\n% Substreams provide a quick and easy way to ensure that you get _different_\r\n% results from the same code at different times.  For example, if I run this\r\n% loop (from the last post)\r\ndefaultStream = RandStream.getDefaultStream();\r\nfor i = 1:5\r\n    defaultStream.Substream = i;\r\n    z = rand(1,i)\r\nend\r\n%%\r\n% and then run off to the Tahiti, and then come back (maybe), I can run _this_\r\n% loop\r\nfor i = 6:10\r\n    defaultStream.Substream = i;\r\n    z = rand(1,11-i)\r\nend\r\n%%\r\n% and be certain that I've generated random values independently of the\r\n% first set of 5 iterations, and in fact have gotten exactly the same results\r\n% as if I had never gone to Tahiti (though not gotten as good of a tan):\r\nfor i = 1:10\r\n    defaultStream.Substream = i;\r\n    z = rand(1,min(i,11-i));\r\nend\r\n%%\r\n% As always, I might run the second loop at a different time than the first,\r\n% but on the same machine, or run it at the same time on a different machine.\r\n% As long as the stream is set up correctly, it makes no difference.\r\n\r\n\r\n%% Being a Good Random Citizen of MATLAB Nation\r\n% It's important to remember that replacing the default stream, or changing\r\n% its state, either by resetting it, by setting its state directly, or by\r\n% positioning it to a new substream, affect all subsequent calls to |rand|,\r\n% |randn|, and |randi|, and are things you should be careful about doing. If\r\n% code you write is used by someone else who has no idea that you've done one\r\n% of those things, your code may unintentionally invalidate the results of\r\n% their simulation by affecting the statistical properties of the random\r\n% numbers they use, or make it difficult for them to reproduce their results\r\n% when they need to.\r\n%\r\n% In general, I recommend leaving control of the default stream up to the end\r\n% user as much as possible.  If you are your own end user, it might be a good\r\n% idea to execute \"stream control\" statements only at MATLAB startup, or only\r\n% right before running a simulation.\r\n%\r\n% In code that will be called by others, it's safe to generate values using\r\n% the default stream, with no effect on the statistical properties of other\r\n% random numbers generated in other parts of the code.  But if possible, don't\r\n% otherwise change the default stream's state or configuration by doing things\r\n% like resetting or replacing the default stream.  It's safest if those kinds\r\n% of operations are done only by, or at the specific request of, the end user,\r\n% so that it's clear when they are happening.\r\n%\r\n% When that's not possible, if you need to somehow modify the default stream's\r\n% state, consider \"cleaning up after yourself\" by saving and restoring the\r\n% state before and after your code does whatever it needs to do.  For\r\n% example,\r\ndefaultStream = RandStream.getDefaultStream();\r\nsavedState = defaultStream.State;\r\n\r\n% some use of the default stream and its state\r\n\r\ndefaultStream.State = savedState;\r\n%%\r\n% makes it as if your code did not modify the default stream at all.\r\n%\r\n% Alternatively, consider using a \"private stream\", one that you create and\r\n% draw values from without affecting the default stream.  For example.\r\nmystream = RandStream('mt19937ar','Seed',sum(100*clock));\r\nx = rand(mystream,100,1);\r\n%%\r\n% generates values completely separately from the default stream.  That call to\r\n% |rand| (the\r\n% <https:\/\/www.mathworks.com\/help\/releases\/R2008b\/techdoc\/ref\/randstream.rand.html method>\r\n% not the function) uses your private stream, not the default stream.  Another possibility\r\n% is to make the private stream the default, but only temporarily.  For\r\n% example,\r\nmystream = RandStream('mt19937ar','Seed',sum(100*clock));\r\nsavedStream = RandStream.getDefaultStream();\r\nRandStream.setDefaultStream(mystream);\r\n\r\n% some use of the (new) default stream and its state\r\n\r\nRandStream.setDefaultStream(savedStream);\r\n%%\r\n% allows your code to use your stream as the default, but also puts things\r\n% back just as they were originally.\r\n\r\n\r\n%% Other Situations?\r\n% Do the examples I've covered leave out any situations involving\r\n% reproducibility of random numbers, or \"the opposite or reproducibility\",\r\n% that you've run into?  Let me know \r\n% <https:\/\/blogs.mathworks.com\/loren\/?p=161#respond here>.\r\n\r\n##### SOURCE END ##### 4d6e2c8d5302449683dba7ff1786de39\r\n-->","protected":false},"excerpt":{"rendered":"<p>\r\n   \r\n      Once again we're going to hear from guest blogger Peter Perkins, who is a statistical software developer here at The MathWorks.\r\n      \r\n   \r\n   Contents\r\n   \r\n    ... <a class=\"read-more\" href=\"https:\/\/blogs.mathworks.com\/loren\/2008\/11\/13\/new-ways-with-random-numbers-part-ii\/\">read more >><\/a><\/p>","protected":false},"author":39,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[16,6,28],"tags":[],"_links":{"self":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/161"}],"collection":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/users\/39"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/comments?post=161"}],"version-history":[{"count":1,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/161\/revisions"}],"predecessor-version":[{"id":1894,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/posts\/161\/revisions\/1894"}],"wp:attachment":[{"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/media?parent=161"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/categories?post=161"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.mathworks.com\/loren\/wp-json\/wp\/v2\/tags?post=161"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}