<?xml version="1.0" encoding="UTF-8"?><!-- generator="wordpress/2.3.1" -->
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	>
<channel>
	<title>Comments on: Another Lesson in Floating Point</title>
	<link>http://blogs.mathworks.com/loren/2008/05/20/another-lesson-in-floating-point/</link>
	<description>Loren Shure  works on design of the MATLAB language at &#60;a href="http://www.mathworks.com/"&#62;The MathWorks&#60;/a&#62;. She writes here about once a week on MATLAB programming and related topics. &#60;br&#62;&#60;br&#62;&#60;a href="/images/loren-full.jpg"&#62;&#60;img src="/images/loren.jpg"&#62;&#60;/a&#62;</description>
	<pubDate>Sun, 22 Nov 2009 23:29:42 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.3.1</generator>
		<item>
		<title>By: Ljubomir Josifovski</title>
		<link>http://blogs.mathworks.com/loren/2008/05/20/another-lesson-in-floating-point/#comment-29543</link>
		<dc:creator>Ljubomir Josifovski</dc:creator>
		<pubDate>Wed, 25 Jun 2008 17:10:53 +0000</pubDate>
		<guid>http://blogs.mathworks.com/loren/2008/05/20/another-lesson-in-floating-point/#comment-29543</guid>
		<description>Thanks guys, most helpful. All clear now wrt my example; format hex separates presentation from content nicely. :-)
Ljubomir</description>
		<content:encoded><![CDATA[<p>Thanks guys, most helpful. All clear now wrt my example; format hex separates presentation from content nicely. :-)<br />
Ljubomir</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mike Hosea</title>
		<link>http://blogs.mathworks.com/loren/2008/05/20/another-lesson-in-floating-point/#comment-29537</link>
		<dc:creator>Mike Hosea</dc:creator>
		<pubDate>Tue, 24 Jun 2008 23:30:00 +0000</pubDate>
		<guid>http://blogs.mathworks.com/loren/2008/05/20/another-lesson-in-floating-point/#comment-29537</guid>
		<description>Sorry for the delay, Ljubomir.  I've been away.  My decimal example was formulated to show the essential principle without the distractions of binary floating point.  At any rate, I agree with Peter's exposition.  You may round up or down when losing mantissa bits in the conversion from double to single.  When converting back to double, the mantissa, such as it is, is padded with zeros.  The phenomenon is obfuscated, however, by the conversion of the binary form to decimal form for display.</description>
		<content:encoded><![CDATA[<p>Sorry for the delay, Ljubomir.  I&#8217;ve been away.  My decimal example was formulated to show the essential principle without the distractions of binary floating point.  At any rate, I agree with Peter&#8217;s exposition.  You may round up or down when losing mantissa bits in the conversion from double to single.  When converting back to double, the mantissa, such as it is, is padded with zeros.  The phenomenon is obfuscated, however, by the conversion of the binary form to decimal form for display.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Peter Perkins</title>
		<link>http://blogs.mathworks.com/loren/2008/05/20/another-lesson-in-floating-point/#comment-29530</link>
		<dc:creator>Peter Perkins</dc:creator>
		<pubDate>Fri, 20 Jun 2008 15:10:15 +0000</pubDate>
		<guid>http://blogs.mathworks.com/loren/2008/05/20/another-lesson-in-floating-point/#comment-29530</guid>
		<description>Ljubomir, using format hex will show you what's really going on:

&#62;&#62; format hex
&#62;&#62; .4
ans =
    3fd999999999999a
&#62;&#62; qsingle = single(0.4)
qsingle =
    3ecccccd
&#62;&#62; qdouble = double(qsingle)
qdouble =
    3fd99999a0000000

.4 as a double is represented by a 0 sign bit, 01111111101 (1023-biased) exponent, and [1]10011001...10011010 mantissa, where the [1] is implied.  Cast that to single and you get 0 sign bit, 01111101 (127-biased) exponent, and [1]10011001100110011001101 mantissa, which is the 52 bit d.p. mantissa, rounded to 23 bits.  Cast that back to double and you get the exact same mantissa, but zero extended.

Everything else is just an artifact of printing a binary number in base 10.</description>
		<content:encoded><![CDATA[<p>Ljubomir, using format hex will show you what&#8217;s really going on:</p>
<p>&gt;&gt; format hex<br />
&gt;&gt; .4<br />
ans =<br />
    3fd999999999999a<br />
&gt;&gt; qsingle = single(0.4)<br />
qsingle =<br />
    3ecccccd<br />
&gt;&gt; qdouble = double(qsingle)<br />
qdouble =<br />
    3fd99999a0000000</p>
<p>.4 as a double is represented by a 0 sign bit, 01111111101 (1023-biased) exponent, and [1]10011001&#8230;10011010 mantissa, where the [1] is implied.  Cast that to single and you get 0 sign bit, 01111101 (127-biased) exponent, and [1]10011001100110011001101 mantissa, which is the 52 bit d.p. mantissa, rounded to 23 bits.  Cast that back to double and you get the exact same mantissa, but zero extended.</p>
<p>Everything else is just an artifact of printing a binary number in base 10.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ljubomir Josifovski</title>
		<link>http://blogs.mathworks.com/loren/2008/05/20/another-lesson-in-floating-point/#comment-29529</link>
		<dc:creator>Ljubomir Josifovski</dc:creator>
		<pubDate>Fri, 20 Jun 2008 08:56:04 +0000</pubDate>
		<guid>http://blogs.mathworks.com/loren/2008/05/20/another-lesson-in-floating-point/#comment-29529</guid>
		<description>Thanks Loren. I understand that maybe "...the “extra” digits are truncated for the single variable (as fewer bits for the mantissa)" whereas they may appear in the double. That's fine. What puzzles me is that the *conversion* of single-&#62;double yields this, i.e. it happens on extending the (already truncated) single into a double. I take it I wrongly assume that promotion single-&#62;double involves only extending the mantissa with extra zero bits (which, I take it, cannot add extra digits)?
Ljubomir</description>
		<content:encoded><![CDATA[<p>Thanks Loren. I understand that maybe &#8220;&#8230;the “extra” digits are truncated for the single variable (as fewer bits for the mantissa)&#8221; whereas they may appear in the double. That&#8217;s fine. What puzzles me is that the *conversion* of single-&gt;double yields this, i.e. it happens on extending the (already truncated) single into a double. I take it I wrongly assume that promotion single-&gt;double involves only extending the mantissa with extra zero bits (which, I take it, cannot add extra digits)?<br />
Ljubomir</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Loren</title>
		<link>http://blogs.mathworks.com/loren/2008/05/20/another-lesson-in-floating-point/#comment-29515</link>
		<dc:creator>Loren</dc:creator>
		<pubDate>Tue, 17 Jun 2008 17:42:02 +0000</pubDate>
		<guid>http://blogs.mathworks.com/loren/2008/05/20/another-lesson-in-floating-point/#comment-29515</guid>
		<description>Ljubomir-

It's not just printing.  Converting to a representation that can't exactly hold the number causes some loss of real information.  That's what happens when you go from the mathematical number 0.4 to holding it in double precision, with 53 bits for the mantissa.  Single has even fewer bits to hold the information so even more information gets "lost".  Then, converting back to double, the computer takes the now approximate single value and doesn't get the double value exactly as it was (which may not have been exact in the first place).

With 3 digits, we can't accurately represent 4 digits decimal.  It's the same issue.  Not printing - no place in memory!

--Loren</description>
		<content:encoded><![CDATA[<p>Ljubomir-</p>
<p>It&#8217;s not just printing.  Converting to a representation that can&#8217;t exactly hold the number causes some loss of real information.  That&#8217;s what happens when you go from the mathematical number 0.4 to holding it in double precision, with 53 bits for the mantissa.  Single has even fewer bits to hold the information so even more information gets &#8220;lost&#8221;.  Then, converting back to double, the computer takes the now approximate single value and doesn&#8217;t get the double value exactly as it was (which may not have been exact in the first place).</p>
<p>With 3 digits, we can&#8217;t accurately represent 4 digits decimal.  It&#8217;s the same issue.  Not printing - no place in memory!</p>
<p>&#8211;Loren</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ljubomir Josifovski</title>
		<link>http://blogs.mathworks.com/loren/2008/05/20/another-lesson-in-floating-point/#comment-29509</link>
		<dc:creator>Ljubomir Josifovski</dc:creator>
		<pubDate>Tue, 17 Jun 2008 08:22:55 +0000</pubDate>
		<guid>http://blogs.mathworks.com/loren/2008/05/20/another-lesson-in-floating-point/#comment-29509</guid>
		<description>Hm, in your example both the 2-digit and 3-digit numbers are 200. In mine the single and the double differ. The analogy would be if you got something other than 200x10^0 when converting 20x10^1 to 3 digits. I don't understand how the single -&#62; double mantissa extension (the extra bits were set to 0s?) yielded the difference. Maybe it is just a printing issue in the command window (singles have fewer significant digits printed)?</description>
		<content:encoded><![CDATA[<p>Hm, in your example both the 2-digit and 3-digit numbers are 200. In mine the single and the double differ. The analogy would be if you got something other than 200&#215;10^0 when converting 20&#215;10^1 to 3 digits. I don&#8217;t understand how the single -&gt; double mantissa extension (the extra bits were set to 0s?) yielded the difference. Maybe it is just a printing issue in the command window (singles have fewer significant digits printed)?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mike Hosea</title>
		<link>http://blogs.mathworks.com/loren/2008/05/20/another-lesson-in-floating-point/#comment-29503</link>
		<dc:creator>Mike Hosea</dc:creator>
		<pubDate>Fri, 13 Jun 2008 16:30:43 +0000</pubDate>
		<guid>http://blogs.mathworks.com/loren/2008/05/20/another-lesson-in-floating-point/#comment-29503</guid>
		<description>The extra digits are the representation error of the single precision number as compared to the double.  Suppose we represent 199.2 in 3-digit decimal floating point (and I'll normalize my mantissas however I want).  We get 199x10^0.  The loss of 0.2 here is representation error.  Something like this also happens when you represent 4/10 as a double--double(4/10) is not exactly 4/10.  Back to the example, now convert this 3-digit result to 2-digit decimal floating point.  This time we round up: 20x10^1.  Converting that back to 3-digit we get 200x10^0, not the 199x10^0 we had before.  So here we picked up +1 by converting from 3-digits to 2-digits and then back to 3-digits.</description>
		<content:encoded><![CDATA[<p>The extra digits are the representation error of the single precision number as compared to the double.  Suppose we represent 199.2 in 3-digit decimal floating point (and I&#8217;ll normalize my mantissas however I want).  We get 199&#215;10^0.  The loss of 0.2 here is representation error.  Something like this also happens when you represent 4/10 as a double&#8211;double(4/10) is not exactly 4/10.  Back to the example, now convert this 3-digit result to 2-digit decimal floating point.  This time we round up: 20&#215;10^1.  Converting that back to 3-digit we get 200&#215;10^0, not the 199&#215;10^0 we had before.  So here we picked up +1 by converting from 3-digits to 2-digits and then back to 3-digits.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ljubomir Josifovski</title>
		<link>http://blogs.mathworks.com/loren/2008/05/20/another-lesson-in-floating-point/#comment-29502</link>
		<dc:creator>Ljubomir Josifovski</dc:creator>
		<pubDate>Fri, 13 Jun 2008 14:47:29 +0000</pubDate>
		<guid>http://blogs.mathworks.com/loren/2008/05/20/another-lesson-in-floating-point/#comment-29502</guid>
		<description>A colleague encountered this recently:

  &#62;&#62; clear all
  &#62;&#62; qsingle = single(0.4)
  qsingle =
              0.4
  &#62;&#62; qdouble = double(qsingle)
  qdouble =
         0.400000005960464
  &#62;&#62; whos
    Name         Size            Bytes  Class     Attributes
    qdouble      1x1                 8  double              
    qsingle      1x1                 4  single              

I presume the "extra" digits are truncated for the single variable (as fewer bits for the mantissa).

Ljubomir</description>
		<content:encoded><![CDATA[<p>A colleague encountered this recently:</p>
<p>  &gt;&gt; clear all<br />
  &gt;&gt; qsingle = single(0.4)<br />
  qsingle =<br />
              0.4<br />
  &gt;&gt; qdouble = double(qsingle)<br />
  qdouble =<br />
         0.400000005960464<br />
  &gt;&gt; whos<br />
    Name         Size            Bytes  Class     Attributes<br />
    qdouble      1&#215;1                 8  double<br />
    qsingle      1&#215;1                 4  single              </p>
<p>I presume the &#8220;extra&#8221; digits are truncated for the single variable (as fewer bits for the mantissa).</p>
<p>Ljubomir</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tim Davis</title>
		<link>http://blogs.mathworks.com/loren/2008/05/20/another-lesson-in-floating-point/#comment-29215</link>
		<dc:creator>Tim Davis</dc:creator>
		<pubDate>Thu, 22 May 2008 01:22:33 +0000</pubDate>
		<guid>http://blogs.mathworks.com/loren/2008/05/20/another-lesson-in-floating-point/#comment-29215</guid>
		<description>Another example of checking for floating point equality with zero is gallery('house',...).  If norm(x) is already zero, H*x does not need to do anything, so H=I is chosen.  Try:

edit private/house.m

It checks for exact zero, not eps, because H is computed safely if norm(x) is O(eps).

Dave:  the check for abs(S-c)&#60;eps only works if the value c you're checking against satisfies abs(c)&#60;.5 or so.  If S is larger, then that won't work.  You should use something like abs(S-c)&#60;4*eps(c), for example, where "4" is something small.  Then if you compare S with c=34 instead of c=0.34, it will still work.  Otherwise you might as well be testing "if(S==c)".  Type

help eps

for more details.</description>
		<content:encoded><![CDATA[<p>Another example of checking for floating point equality with zero is gallery('house',...).  If norm(x) is already zero, H*x does not need to do anything, so H=I is chosen.  Try:</p>
<p>edit private/house.m</p>
<p>It checks for exact zero, not eps, because H is computed safely if norm(x) is O(eps).</p>
<p>Dave:  the check for abs(S-c)&lt;eps only works if the value c you&#8217;re checking against satisfies abs(c)&lt;.5 or so.  If S is larger, then that won&#8217;t work.  You should use something like abs(S-c)&lt;4*eps(c), for example, where &#8220;4&#8221; is something small.  Then if you compare S with c=34 instead of c=0.34, it will still work.  Otherwise you might as well be testing &#8220;if(S==c)&#8221;.  Type</p>
<p>help eps</p>
<p>for more details.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tim Davis</title>
		<link>http://blogs.mathworks.com/loren/2008/05/20/another-lesson-in-floating-point/#comment-29196</link>
		<dc:creator>Tim Davis</dc:creator>
		<pubDate>Tue, 20 May 2008 20:37:35 +0000</pubDate>
		<guid>http://blogs.mathworks.com/loren/2008/05/20/another-lesson-in-floating-point/#comment-29196</guid>
		<description>Checking for equality is OK in certain circumstances: with "flints" (p = -10:10, for example), or for values that you know can be computed without error (such as 1./4.).  But you have to know everything you've done to a number ... which includes built-in functions for which you can't see the code.  So it's not always possible.

The other place where checking for equality makes sense is when an entry is exactly zero.  That happens all the time in sparse matrix computations ... those entries get purged by MATLAB.  Roundoff error can differ on different machines, so nnz(A) can vary from machine to machine even though the code and the input data are the same.</description>
		<content:encoded><![CDATA[<p>Checking for equality is OK in certain circumstances: with &#8220;flints&#8221; (p = -10:10, for example), or for values that you know can be computed without error (such as 1./4.).  But you have to know everything you&#8217;ve done to a number &#8230; which includes built-in functions for which you can&#8217;t see the code.  So it&#8217;s not always possible.</p>
<p>The other place where checking for equality makes sense is when an entry is exactly zero.  That happens all the time in sparse matrix computations &#8230; those entries get purged by MATLAB.  Roundoff error can differ on different machines, so nnz(A) can vary from machine to machine even though the code and the input data are the same.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
