Bank Format and Metric Socket Wrenches
A report about a possible bug in format bank and a visit to a local hardware store made me realize that doing decimal arithmetic with binary floating point numbers is like tightening a European bolt with an American socket wrench.
Contents
Tax Time
It's mid-April and so those of us who file United States income taxes have that chore to do. Years ago, I used MATLAB to help with my taxes. I had a program named form1040.m that had one statement for each line on the tax form. I just had to enter my income and deductions. Then MATLAB would do all the arithmetic.
If we're really meticulous about our financial records, we keep track of things to the nearest penny. So line 28 of form1040 might have been something like
interest = 48.35
interest = 48.3500
I didn't like those trailing zeros in the output. So I introduced
format bank
into MATLAB. Now
interest
interest = 48.35
is printed with just two digits after the decimal point.
format bank has turned out to be useful more broadly and is still in today's MATLAB.
Breaking Ties
We recently had a user ask about the rounding rule employed by format bank. What if a value fails halfway between two possible outputs? Which is chosen and why?
Here's the example that prompted the user's query. Start with
format short
so we can see four decimal places.
x = (5:10:45)'/1000 y = 1+x z = 2+x
x = 0.0050 0.0150 0.0250 0.0350 0.0450 y = 1.0050 1.0150 1.0250 1.0350 1.0450 z = 2.0050 2.0150 2.0250 2.0350 2.0450
These values appear to fall halfway between pairs of two-digit decimal fractions. Let's see how the ties are broken.
format bank
x
y
z
x = 0.01 0.01 0.03 0.04 0.04 y = 1.00 1.01 1.02 1.03 1.04 z = 2.00 2.02 2.02 2.04 2.04
Look at the last digits. What mysterious hand is at work here? Three of the x's, x(1), x(3), and x(4), have been rounded up. None of the y's have been rounded up. Two of the z's, z(2) and z(4), have been rounded up.
Email circulated internally at MathWorks for a few days after this was reported suggesting various explanations. Is it a bug in format bank? In the I/O library? Does it depend upon which compiler was used to build MATLAB? Do we see the same behavior on the PC and the Mac? Has it always been this way? These are the usual questions that we ask ourselves when we see curious behavior.
Do you know what's happening?
None of the above.
Well, none of the suspects I just mentioned is the culprit. The fact is none of the numbers fall exactly on a midpoint. Think binary, not decimal. A value like 0.005 expressed as a decimal fraction cannot be represented exactly as a binary floating point number. Decimal fractions fall in between the binary fractions, and rarely fall precisely half way.
Symbolic Toolbox
To understand what is going on, the Symbolic Toolbox function
sym(x,'e')
is your friend. The 'e' flag is provided for this purpose. The help entry says
'e' stands for 'estimate error'. The 'r' form is supplemented by a term involving the variable 'eps' which estimates the difference between the theoretical rational expression and its actual floating point value. For example, sym(3*pi/4,'e') is 3*pi/4-103*eps/249.
To see how this works for the situation encountered here.
symx = sym(x,'e') symy = sym(y,'e') symz = sym(z,'e')
symx = eps/2133 + 1/200 3/200 - eps/400 eps/160 + 1/40 (3*eps)/200 + 7/200 9/200 - (3*eps)/400 symy = 201/200 - (12*eps)/25 203/200 - (11*eps)/25 41/40 - (2*eps)/5 207/200 - (9*eps)/25 209/200 - (8*eps)/25 symz = 401/200 - (12*eps)/25 (14*eps)/25 + 403/200 81/40 - (2*eps)/5 (16*eps)/25 + 407/200 409/200 - (8*eps)/25
The output is not as clear as I would like to see it, but I can pick off the sign of the error terms and find
x + - + + - y - - - - - z - + - + -
Or, I can compute the signs with
format short
sigx = sign(double(symx - x))
sigy = sign(double(symy - y))
sigz = sign(double(symz - z))
sigx = 1 -1 1 1 -1 sigy = -1 -1 -1 -1 -1 sigz = -1 1 -1 1 -1
The sign of the error term tells us whether the floating point numbers stored in x, y and z are larger or smaller than the anticipated decimal fractions. After this initial input, there is essentially no more roundoff error. format bank will round up or down from the decimal value accordingly. Again, it is just doing its job on the input it is given.
Socket Wrenches
Photo credit: http://toolguyd.com/
A high-end socket wrench set includes both metric (left) and fractional inch (right) sizes. Again, think decimal and binary. Metric sizes of nuts and bolts and wrenches are specified in decimal fractions, while the denominators in fractional inch sizes are powers of two.
Conversion charts between the metric and fractional inch standards abound on the internet. Here is a link to one of them: Wrench Conversion Chart.
But we can easily compute our own conversion chart. And in the process compute delta, the relative error made when the closest binary wrench is used on a decimal bolt.
make_chart
Inch Metric delta 1/64 0.016 1/32 0.031 3/64 0.047 1mm 0.039 -0.191 1/16 0.063 5/64 0.078 2mm 0.079 0.008 3/32 0.094 7/64 0.109 1/8 0.125 3mm 0.118 -0.058 9/64 0.141 5/32 0.156 4mm 0.157 0.008 11/64 0.172 3/16 0.188 13/64 0.203 5mm 0.197 -0.032 7/32 0.219 15/64 0.234 6mm 0.236 0.008 1/4 0.250 9/32 0.281 7mm 0.276 -0.021 5/16 0.313 8mm 0.315 0.008 11/32 0.344 9mm 0.354 0.030 3/8 0.375 13/32 0.406 10mm 0.394 -0.032 7/16 0.438 15/32 0.469 12mm 0.472 0.008 1/2 0.500 9/16 0.563 14mm 0.551 -0.021 5/8 0.625 16mm 0.630 0.008 11/16 0.688 18mm 0.709 0.030 3/4 0.750 13/16 0.813 20mm 0.787 -0.032 7/8 0.875 22mm 0.866 -0.010 15/16 0.938 24mm 0.945 0.008 1 1.000
Let's plot those relative errors. Except for the small sizes, where this set doesn't have enough wrenches, the relative error is only a few percent. But that's still enough to produce a damaging fit on a tight nut.
bar(k,d) axis([0 25 -.05 .05]) xlabel('millimeters') ylabel('delta')
You might notice that my conversion chart, like all such charts, and like the wrenches themselves, has a little bit of floating point character. The spacing of the entries is not uniform. The spacing between the binary values is 1/64, then 1/32, then 1/16. The spacing of the metric values is 1mm at the top of the chart and 2mm later on.
10 millimeters
Suppose we want to tighten a 10mm nut and all we have are binary wrenches. The diameter of the nut in inches is
meter = 39.370079; d = 10*meter/1000
d = 0.3937
Consulting our chart, we see that a 13/32 wrench is the best fit, but it's a little too large. 10mm lies between these two binary values.
[floor(32*d) ceil(32*d)] b1 = 12/32; b2 = 13/32; [b1 d b2]'
ans = 12 13 ans = 0.3750 0.3937 0.4063
The fraction of the interval is
f = (d - b1)/(b2 - b1)
f = 0.5984
10mm is about 60% of the way from 12/32 inches to 13/32 inches.
One-tenth
Now let's turn to floating point numbers. What happens when execute this MATLAB statement?
x = 1/10
x = 0.1000
One is divided by ten and the closest floating point number is stored in x. The same value is produced by the statement
x = 0.1
x = 0.1000
The resulting x lies in the interval between 1/16 and 1/8. The floating point numbers in this interval are uniformly spaced with a separation of
e = eps(1/16)
e = 1.3878e-17
This is 2^-56. Let
e = sym(e)
e = 1/72057594037927936
This value e plays the role that 1/64 plays for my wrenches.
The result of x = 1/10 lies between these two binary fractions.
b1 = floor(1/(10*e))*e b2 = ceil(1/(10*e))*e
b1 = 7205759403792793/72057594037927936 b2 = 3602879701896397/36028797018963968
The tiny interval of length e is
c = [b1 1/10 b2]'
c = 7205759403792793/72057594037927936 1/10 3602879701896397/36028797018963968
In decimal
vpa(c)
ans = 0.099999999999999991673327315311326 0.1 0.10000000000000000555111512312578
Where in this interval does 1/10 lie?
f = double((1/10 - b1)/e)
f = 0.6000
So 1/10 is about 60% of the way from b1 to b2 and so is closer to b2 than to b1. The two statements
x = 1/10 x = double(b2)
x = 0.1000 x = 0.1000
store exactly the same value in x.
The 13/32 wrench is the closest tool in the binary collection to the 10mm nut.
评论
要发表评论,请点击 此处 登录到您的 MathWorks 帐户或创建一个新帐户。