<?xml version="1.0" encoding="UTF-8"?><!-- generator="wordpress/2.3.1" -->
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	>
<channel>
	<title>Comments on: New Ways With Random Numbers, Part II</title>
	<link>http://blogs.mathworks.com/loren/2008/11/13/new-ways-with-random-numbers-part-ii/</link>
	<description>Loren Shure  works on design of the MATLAB language at &#60;a href="http://www.mathworks.com/"&#62;The MathWorks&#60;/a&#62;. She writes here about once a week on MATLAB programming and related topics. &#60;br&#62;&#60;br&#62;&#60;a href="/images/loren-full.jpg"&#62;&#60;img src="/images/loren.jpg"&#62;&#60;/a&#62;</description>
	<pubDate>Mon, 23 Nov 2009 00:33:09 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.3.1</generator>
		<item>
		<title>By: OysterEngineer</title>
		<link>http://blogs.mathworks.com/loren/2008/11/13/new-ways-with-random-numbers-part-ii/#comment-30745</link>
		<dc:creator>OysterEngineer</dc:creator>
		<pubDate>Sun, 08 Nov 2009 02:12:31 +0000</pubDate>
		<guid>http://blogs.mathworks.com/loren/2008/11/13/new-ways-with-random-numbers-part-ii/#comment-30745</guid>
		<description>Clearly this is a complex topic &#38; this pair of blog posts shows that The MathWorks are masters of the topic.  I'm happy to see that you have provided the capability that various users need.

However, I am stumbling over your choice to provide the capability via this "new kind of object."  From my view, you've made the end user's job more complex by requiring that he master this new object, with its obtuse syntax, rather than just providing clear options &#38; syntax in the appropriate functions.  As I read the first page of documentation on this class, I was struck that the syntax in the 1st example required 27 characters!  That doesn't look very user friendly to me.

Next the documentation pretty quickly is talking about handles, which as I've said before, is confusing to most lay users out here.

I accept that you developers think these objects &#38; classes are powerful &#38; attractive ways to provide capability.  But, most of us out here are confused by the topic &#38; would rather have a function with clear syntax.</description>
		<content:encoded><![CDATA[<p>Clearly this is a complex topic &amp; this pair of blog posts shows that The MathWorks are masters of the topic.  I&#8217;m happy to see that you have provided the capability that various users need.</p>
<p>However, I am stumbling over your choice to provide the capability via this &#8220;new kind of object.&#8221;  From my view, you&#8217;ve made the end user&#8217;s job more complex by requiring that he master this new object, with its obtuse syntax, rather than just providing clear options &amp; syntax in the appropriate functions.  As I read the first page of documentation on this class, I was struck that the syntax in the 1st example required 27 characters!  That doesn&#8217;t look very user friendly to me.</p>
<p>Next the documentation pretty quickly is talking about handles, which as I&#8217;ve said before, is confusing to most lay users out here.</p>
<p>I accept that you developers think these objects &amp; classes are powerful &amp; attractive ways to provide capability.  But, most of us out here are confused by the topic &amp; would rather have a function with clear syntax.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Peter Perkins</title>
		<link>http://blogs.mathworks.com/loren/2008/11/13/new-ways-with-random-numbers-part-ii/#comment-30616</link>
		<dc:creator>Peter Perkins</dc:creator>
		<pubDate>Thu, 17 Sep 2009 13:25:59 +0000</pubDate>
		<guid>http://blogs.mathworks.com/loren/2008/11/13/new-ways-with-random-numbers-part-ii/#comment-30616</guid>
		<description>Sure.  You can either save/read the default stream's state to/from a mat file, if you'll never change the generator type, or save/read the default stream itself, if you want to be more general.  That part is straightforward.

But if you have concurrent sessions, you're obviously going to have to figure out how you want them to interact or not, and figure out some way for them to not step on each other.</description>
		<content:encoded><![CDATA[<p>Sure.  You can either save/read the default stream&#8217;s state to/from a mat file, if you&#8217;ll never change the generator type, or save/read the default stream itself, if you want to be more general.  That part is straightforward.</p>
<p>But if you have concurrent sessions, you&#8217;re obviously going to have to figure out how you want them to interact or not, and figure out some way for them to not step on each other.</p>
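<p>A minimal sketch of the save/restore idea, not a definitive recipe. It assumes a release that provides RandStream.getGlobalStream (R2008b through R2010b named it RandStream.getDefaultStream), and the file name rngstate.mat is hypothetical. It saves only the state, so it is valid only while the generator type stays the same.</p>

```matlab
% Sketch only: persist the global stream's state between sessions.
% Assumption: RandStream.getGlobalStream is available; in R2008b-R2010b
% the equivalent call was RandStream.getDefaultStream.
stateFile = fullfile(tempdir, 'rngstate.mat');   % hypothetical location

% --- what finish.m might do ---
s = RandStream.getGlobalStream;
savedState = s.State;            % state only; generator type must not change
save(stateFile, 'savedState');
x1 = rand(1, 5);                 % values drawn after the snapshot

% --- what startup.m might do ---
if exist(stateFile, 'file')
    tmp = load(stateFile);
    s = RandStream.getGlobalStream;
    s.State = tmp.savedState;    % restore the snapshot
end
x2 = rand(1, 5);                 % same values as x1, since the state was restored
```

<p>With concurrent sessions, each session would need its own state file (or some locking scheme); this sketch does not address that.</p>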
]]></content:encoded>
	</item>
	<item>
		<title>By: Karthik</title>
		<link>http://blogs.mathworks.com/loren/2008/11/13/new-ways-with-random-numbers-part-ii/#comment-30615</link>
		<dc:creator>Karthik</dc:creator>
		<pubDate>Thu, 17 Sep 2009 08:27:10 +0000</pubDate>
		<guid>http://blogs.mathworks.com/loren/2008/11/13/new-ways-with-random-numbers-part-ii/#comment-30615</guid>
		<description>I am thinking of saving the state of the random generator at the end of a MATLAB session, and loading the state in the next MATLAB session. Can I do this by adding code to startup.m and finish.m? Further, what will be the behaviour, if I have concurrent MATLAB sessions?

Thanks,
Karthik</description>
		<content:encoded><![CDATA[<p>I am thinking of saving the state of the random generator at the end of a MATLAB session, and loading the state in the next MATLAB session. Can I do this by adding code to startup.m and finish.m? Further, what will be the behaviour, if I have concurrent MATLAB sessions?</p>
<p>Thanks,<br />
Karthik</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Peter Perkins</title>
		<link>http://blogs.mathworks.com/loren/2008/11/13/new-ways-with-random-numbers-part-ii/#comment-30064</link>
		<dc:creator>Peter Perkins</dc:creator>
		<pubDate>Mon, 02 Mar 2009 14:36:00 +0000</pubDate>
		<guid>http://blogs.mathworks.com/loren/2008/11/13/new-ways-with-random-numbers-part-ii/#comment-30064</guid>
		<description>Vladimir, if you really think you need parallel streams, and are not running R2008b, you could look at a submission I made to MATLAB Central File Exchange a few years ago:

http://www.mathworks.com/matlabcentral/fileexchange/6445

It uses the Mersenne Twister with different seeds.  And that, I believe, answers your P.S. too:  the "standard" MT mt19937ar is not designed for parallel streams, but because the state space is so large and the seeding algorithm considered good enough, lots of people use it as a de facto parallel generator by choosing different seeds.  I don't know how much formal testing this method has received.  I believe the paper you're referring to is a slightly different MT than the mt19937ar, and my recollection is that the implementation is still a work in progress.  By the way, there's nothing preventing you from using the FEX code as a starting point and implementing your own MT dynamic creator.

But before you spend a lot of time on any of this, I'd like to point out that you may not need multiple streams at all.  If all your code does is to call rand, i.e., you use the MT as a serial generator, you are getting values that are as "independent" as anything you'll get with parallel streams, regardless of the fact that you use the values in different places.  Parallel streams are often a convenience for various reasons, and make repeatability easier in many cases, but think hard whether you really need them.  Also, using sum(100*clock) to seed the MT is a good trick for not having to think up new seeds every time you need one, but I don't know of any evidence that this (apparent) "added randomness" provides anything that using 1, 2, ... wouldn't already give you.</description>
		<content:encoded><![CDATA[<p>Vladimir, if you really think you need parallel streams, and are not running R2008b, you could look at a submission I made to MATLAB Central File Exchange a few years ago:</p>
<p><a href="http://www.mathworks.com/matlabcentral/fileexchange/6445" rel="nofollow">http://www.mathworks.com/matlabcentral/fileexchange/6445</a></p>
<p>It uses the Mersenne Twister with different seeds.  And that, I believe, answers your P.S. too:  the &#8220;standard&#8221; MT mt19937ar is not designed for parallel streams, but because the state space is so large and the seeding algorithm considered good enough, lots of people use it as a de facto parallel generator by choosing different seeds.  I don&#8217;t know how much formal testing this method has received.  I believe the paper you&#8217;re referring to is a slightly different MT than the mt19937ar, and my recollection is that the implementation is still a work in progress.  By the way, there&#8217;s nothing preventing you from using the FEX code as a starting point and implementing your own MT dynamic creator.</p>
<p>But before you spend a lot of time on any of this, I&#8217;d like to point out that you may not need multiple streams at all.  If all your code does is to call rand, i.e., you use the MT as a serial generator, you are getting values that are as &#8220;independent&#8221; as anything you&#8217;ll get with parallel streams, regardless of the fact that you use the values in different places.  Parallel streams are often a convenience for various reasons, and make repeatability easier in many cases, but think hard whether you really need them.  Also, using sum(100*clock) to seed the MT is a good trick for not having to think up new seeds every time you need one, but I don&#8217;t know of any evidence that this (apparent) &#8220;added randomness&#8221; provides anything that using 1, 2, &#8230; wouldn&#8217;t already give you.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Vladimir</title>
		<link>http://blogs.mathworks.com/loren/2008/11/13/new-ways-with-random-numbers-part-ii/#comment-30063</link>
		<dc:creator>Vladimir</dc:creator>
		<pubDate>Sun, 01 Mar 2009 20:55:01 +0000</pubDate>
		<guid>http://blogs.mathworks.com/loren/2008/11/13/new-ways-with-random-numbers-part-ii/#comment-30063</guid>
		<description>Hello, Loren

Are there any possibilities to run multiple "streams" in R2008a (probably, some third-party scripts)?

I'm going to provide some statistical calculations on different cluster nodes and so need to obtain independent random numbers on each node.
As I understand, running
  rand('twister',sum(100*clock))
before simulation on the node is better than doing nothing, but won't give me statistical independence?

P.S. Thank you for such an interesting and extremely useful blog.

P.P.S. BTW, why doesn't 'mt19937ar' in MATLAB support multiple streams? In Intel MKL, for instance, Mersenne Twister does. And also Matsumoto has shown this possibility in the article "Dynamic Creation of Pseudorandom Number Generators"

Best, 
Vladimir</description>
		<content:encoded><![CDATA[<p>Hello, Loren</p>
<p>Are there any possibilities to run multiple &#8220;streams&#8221; in R2008a (probably, some third-party scripts)?</p>
<p>I&#8217;m going to provide some statistical calculations on different cluster nodes and so need to obtain independent random numbers on each node.<br />
As I understand, running<br />
  rand('twister',sum(100*clock))<br />
before simulation on the node is better than doing nothing, but won&#8217;t give me statistical independence?</p>
<p>P.S. Thank you for such an interesting and extremely useful blog.</p>
<p>P.P.S. BTW, why doesn&#8217;t &#8216;mt19937ar&#8217; in MATLAB support multiple streams? In Intel MKL, for instance, Mersenne Twister does. And also Matsumoto has shown this possibility in the article &#8220;Dynamic Creation of Pseudorandom Number Generators&#8221;</p>
<p>Best,<br />
Vladimir</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Loren</title>
		<link>http://blogs.mathworks.com/loren/2008/11/13/new-ways-with-random-numbers-part-ii/#comment-29997</link>
		<dc:creator>Loren</dc:creator>
		<pubDate>Wed, 28 Jan 2009 15:42:00 +0000</pubDate>
		<guid>http://blogs.mathworks.com/loren/2008/11/13/new-ways-with-random-numbers-part-ii/#comment-29997</guid>
		<description>Ben-

You need R2008b for these features.

--Loren</description>
		<content:encoded><![CDATA[<p>Ben-</p>
<p>You need R2008b for these features.</p>
<p>&#8211;Loren</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ben</title>
		<link>http://blogs.mathworks.com/loren/2008/11/13/new-ways-with-random-numbers-part-ii/#comment-29996</link>
		<dc:creator>Ben</dc:creator>
		<pubDate>Wed, 28 Jan 2009 15:30:32 +0000</pubDate>
		<guid>http://blogs.mathworks.com/loren/2008/11/13/new-ways-with-random-numbers-part-ii/#comment-29996</guid>
		<description>I am using MATLAB v7.4. None of this stuff seems to work for me.  Am I missing a toolbox?  Wrong version?  How would one do similar types of things in a previous version?</description>
		<content:encoded><![CDATA[<p>I am using MATLAB v7.4. None of this stuff seems to work for me.  Am I missing a toolbox?  Wrong version?  How would one do similar types of things in a previous version?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Kieran Parsons</title>
		<link>http://blogs.mathworks.com/loren/2008/11/13/new-ways-with-random-numbers-part-ii/#comment-29885</link>
		<dc:creator>Kieran Parsons</dc:creator>
		<pubDate>Fri, 21 Nov 2008 19:42:06 +0000</pubDate>
		<guid>http://blogs.mathworks.com/loren/2008/11/13/new-ways-with-random-numbers-part-ii/#comment-29885</guid>
		<description>Thanks Peter for your quick reply. I know that the code as written will produce the same output each time. Later I mentioned seeding outside the function.

The problem with "using up" random numbers is the following. Imagine instead of only the 1 function that I provided I instead have 2 (one produces A+B (or A, B) and the other A+B+C (or A, B or C)). Again for reproducibility I need to ensure that I can repeat either case with only A output. By using up the values I have ensured that the 2 cases will never be able to produce the same output. This may sound esoteric, but it is a real situation in my sims (imagine a functional block similar to a Simulink block with A/B/C being noise sources). I may have one version with 2 noise sources and another more advanced version with the same 2 noise sources + another one. If in the latter version I turn off the last noise source (the equivalent of C) the output should be *exactly* the same as the former version.

I'll use multiple streams with one seed I think. Better than independent streams with different seeds.</description>
		<content:encoded><![CDATA[<p>Thanks Peter for your quick reply. I know that the code as written will produce the same output each time. Later I mentioned seeding outside the function.</p>
<p>The problem with &#8220;using up&#8221; random numbers is the following. Imagine instead of only the 1 function that I provided I instead have 2 (one produces A+B (or A, B) and the other A+B+C (or A, B or C)). Again for reproducibility I need to ensure that I can repeat either case with only A output. By using up the values I have ensured that the 2 cases will never be able to produce the same output. This may sound esoteric, but it is a real situation in my sims (imagine a functional block similar to a Simulink block with A/B/C being noise sources). I may have one version with 2 noise sources and another more advanced version with the same 2 noise sources + another one. If in the latter version I turn off the last noise source (the equivalent of C) the output should be *exactly* the same as the former version.</p>
<p>I&#8217;ll use multiple streams with one seed I think. Better than independent streams with different seeds.</p>
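<p>A rough sketch of the multiple-streams-with-one-seed approach, under assumptions: it uses the mrg32k3a generator (which supports multiple streams) rather than mt19937ar, only two sources, and re-creates the streams to mimic separate simulation runs. The variable names are illustrative only.</p>

```matlab
% Sketch: one stream per noise source, all created from a single seed,
% so switching a source off never shifts the values another source sees.
n_rows = 10;

% Run 1: sources A and B both active.
[sA, sB] = RandStream.create('mrg32k3a', 'NumStreams', 2, 'Seed', 0);
outAB = randn(sA, n_rows, 1) + randn(sB, n_rows, 1);

% Run 2: same seed, only A active; stream B is simply never drawn from.
[sA, sB] = RandStream.create('mrg32k3a', 'NumStreams', 2, 'Seed', 0);
outA = randn(sA, n_rows, 1);

% Run 3: same seed, only B active.
[sA, sB] = RandStream.create('mrg32k3a', 'NumStreams', 2, 'Seed', 0);
outB = randn(sB, n_rows, 1);

% A's values do not depend on whether B was drawn, so the runs add up.
sameSum = isequal(outA + outB, outAB);
```

<p>In a real simulation the streams would be created once, outside the function, and passed in, so that successive calls keep advancing each source independently.</p>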
]]></content:encoded>
	</item>
	<item>
		<title>By: Peter Perkins</title>
		<link>http://blogs.mathworks.com/loren/2008/11/13/new-ways-with-random-numbers-part-ii/#comment-29884</link>
		<dc:creator>Peter Perkins</dc:creator>
		<pubDate>Fri, 21 Nov 2008 18:12:49 +0000</pubDate>
		<guid>http://blogs.mathworks.com/loren/2008/11/13/new-ways-with-random-numbers-part-ii/#comment-29884</guid>
		<description>Kieran, as you've written your code, gen_randn_sum isn't a random number generator at all -- given the same input, it will always return the same output regardless of how many times it's called.  Is that really what you meant?

One interpretation of what you might have meant is that you want gen_randn_sum to "use up" the same random values regardless of what actual (useA, useB)'s are passed in.  To do that, you don't need multiple streams at all, just "burn off" the random values you don't actually use.  You could use parallel streams, but there's no need to.

As far as parallel streams vs. substreams, they are very similar ideas, and for the mrg32k3a generator, they are exactly the same thing:  this generator uses what's called sequence splitting -- imagine a very big circle broken up into hours (except 2^63 of them) and minutes (except 2^70-something per hour).  Those are the streams and substreams.  The mlfg6331 generator uses what's called parameterization for parallel streams (imagine a bunch of "parallel" circles going the long way around a doughnut), and sequence splitting for substreams.

The operational difference is that you can create several parallel streams, and draw from each one independently at the same time with no bookkeeping.  It'd be hard to do that with substreams.  The latter are more aimed at using values from one substream, then the next, and so on.</description>
		<content:encoded><![CDATA[<p>Kieran, as you&#8217;ve written your code, gen_randn_sum isn&#8217;t a random number generator at all &#8212; given the same input, it will always return the same output regardless of how many times it&#8217;s called.  Is that really what you meant?</p>
<p>One interpretation of what you might have meant is that you want gen_randn_sum to &#8220;use up&#8221; the same random values regardless of what actual (useA, useB)&#8217;s are passed in.  To do that, you don&#8217;t need multiple streams at all, just &#8220;burn off&#8221; the random values you don&#8217;t actually use.  You could use parallel streams, but there&#8217;s no need to.</p>
<p>As far as parallel streams vs. substreams, they are very similar ideas, and for the mrg32k3a generator, they are exactly the same thing:  this generator uses what&#8217;s called sequence splitting &#8212; imagine a very big circle broken up into hours (except 2^63 of them) and minutes (except 2^70-something per hour).  Those are the streams and substreams.  The mlfg6331 generator uses what&#8217;s called parameterization for parallel streams (imagine a bunch of &#8220;parallel&#8221; circles going the long way around a doughnut), and sequence splitting for substreams.</p>
<p>The operational difference is that you can create several parallel streams, and draw from each one independently at the same time with no bookkeeping.  It&#8217;d be hard to do that with substreams.  The latter are more aimed at using values from one substream, then the next, and so on.</p>
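<p>The operational difference can be sketched roughly as follows, assuming the mrg32k3a generator and the RandStream.create and Substream interfaces; the seed and sizes are arbitrary.</p>

```matlab
% Parallel streams: three streams created together from one seed.
[sA, sB, sC] = RandStream.create('mrg32k3a', 'NumStreams', 3, 'Seed', 0);
a = randn(sA, 4, 1);   % each stream can be drawn from at any time,
b = randn(sB, 4, 1);   % in any order, without affecting the others
c = randn(sC, 4, 1);

% Substreams: a single stream, repositioned before each block of draws.
s = RandStream('mrg32k3a', 'Seed', 0);
s.Substream = 1;
u1 = rand(s, 4, 1);
s.Substream = 2;
u2 = rand(s, 4, 1);
s.Substream = 1;       % setting the index jumps to the start of that substream
u1again = rand(s, 4, 1);
```

<p>Here u1again repeats u1 because assigning the Substream property positions the stream at the beginning of that substream. That is the bookkeeping substreams require and parallel streams avoid.</p>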
]]></content:encoded>
	</item>
	<item>
		<title>By: Kieran Parsons</title>
		<link>http://blogs.mathworks.com/loren/2008/11/13/new-ways-with-random-numbers-part-ii/#comment-29880</link>
		<dc:creator>Kieran Parsons</dc:creator>
		<pubDate>Fri, 21 Nov 2008 14:20:29 +0000</pubDate>
		<guid>http://blogs.mathworks.com/loren/2008/11/13/new-ways-with-random-numbers-part-ii/#comment-29880</guid>
		<description>Great blog entry, but I am still somewhat confused about the difference between multiple streams and substreams. Should I think of multiple streams as entirely independent and substreams as less so? Is there a speed advantage of using one vs the other?

Here's an example of something I would like to do. Say I have a function whose output is the sum of 2 randn calls (ignore why for now) and designate them A and B. For one simulation run A may be active and B uncalled (output=A), for another B may be active (output=B), for another A&#38;B may be active (output=A+B). An important point is that if A is not active then its randn is *never called*. Here's some example code:

function out = gen_randn_sum(useA, useB)
stream0 = RandStream('mt19937ar','Seed',0);
n_rows = 10;
outA = zeros(n_rows, 1);
outB = zeros(n_rows, 1);
if useA
    outA = randn(stream0, n_rows, 1);
end
if useB
    outB = randn(stream0, n_rows, 1);
end 
out = outA + outB;
end

In this case gen_randn_sum(1, 0) + gen_randn_sum(0, 1) is not equal to gen_randn_sum(1, 1). But I do want them to be equal as I do not want the A-only case to generate different values when B is also used. So A &#38; B need to be independent. But should I use multiple streams with different seeds or one stream/one seed with multiple substreams?

To bound the problem imagine that A &#38; B can be A-J (ie 10 randn sources), n_rows is 1024, and the function can be called 1M times (with the seeding done outside the function only once at the beginning of the 1M calls). Speed is a major factor as the function is called so many times.

Thanks.</description>
		<content:encoded><![CDATA[<p>Great blog entry, but I am still somewhat confused about the difference between multiple streams and substreams. Should I think of multiple streams as entirely independent and substreams as less so? Is there a speed advantage of using one vs the other?</p>
<p>Here&#8217;s an example of something I would like to do. Say I have a function whose output is the sum of 2 randn calls (ignore why for now) and designate them A and B. For one simulation run A may be active and B uncalled (output=A), for another B may be active (output=B), for another A&amp;B may be active (output=A+B). An important point is that if A is not active then its randn is *never called*. Here&#8217;s some example code:</p>
<p>function out = gen_randn_sum(useA, useB)<br />
stream0 = RandStream('mt19937ar','Seed',0);<br />
n_rows = 10;<br />
outA = zeros(n_rows, 1);<br />
outB = zeros(n_rows, 1);<br />
if useA<br />
    outA = randn(stream0, n_rows, 1);<br />
end<br />
if useB<br />
    outB = randn(stream0, n_rows, 1);<br />
end<br />
out = outA + outB;<br />
end</p>
<p>In this case gen_randn_sum(1, 0) + gen_randn_sum(0, 1) is not equal to gen_randn_sum(1, 1). But I do want them to be equal as I do not want the A-only case to generate different values when B is also used. So A &amp; B need to be independent. But should I use multiple streams with different seeds or one stream/one seed with multiple substreams?</p>
<p>To bound the problem imagine that A &amp; B can be A-J (ie 10 randn sources), n_rows is 1024, and the function can be called 1M times (with the seeding done outside the function only once at the beginning of the 1M calls). Speed is a major factor as the function is called so many times.</p>
<p>Thanks.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
