The Pentium Papers — My First MATLAB Central Contribution

MATLAB Central is celebrating its 15th birthday this fall. In honor of the occasion, MathWorks bloggers are reminiscing about their first involvement with the Web site. My first contribution to the File Exchange was not MATLAB software, but rather a collection of documents that I called the Pentium Papers. I saved this material in November and December of 1994 when I was deeply involved in the Intel Pentium Floating Point Division Affair.

Pentium key chain. Image thanks to Thomas Johansson, thomas@cpucollection.se.

Bad companies are destroyed by crises;
Good companies survive them;
Great companies are improved by them.

–Andy Grove, Chairman, Intel Corp., December 1994

Internet in 1994

As I said in my blog post in 2013, “The Pentium division bug episode in the fall of 1994 was a defining moment for the MathWorks, for the Internet, for Intel Corporation, and for me personally.”

In the fall of 1994 the Internet was not anything like it is today. The World Wide Web was in its infancy. All of the connections were text-based. The Mosaic graphical browser was only a few months old and few people knew about it. Internet Explorer did not yet exist and Google was four years in the future. We did have email. Some of us used FTP, File Transfer Protocol. And the social media of the day were the text-only news groups, which had names like comp.soft-sys.matlab and comp.soft-sys.intel.

But calling the news groups “social media” is a stretch. The term “social media” didn’t exist in 1994. And almost all of the participants in the news groups were geeks and nerds in universities and industrial research labs.

The Division Bug

Thomas Nicely, a Professor of Mathematics at Lynchburg College in Virginia, did research on prime numbers, especially twin primes. For his computational experiments he employed several IBM PCs running Intel 486 processors. In the summer of 1994 he added a PC with the new Pentium chip. Much to his surprise, he found the Pentium gave a different result for the reciprocal of one of his primes.

On October 30, Nicely emailed several other Pentium users that he had discovered a bug in the Pentium’s floating point arithmetic. The news soon reached Terje Mathisen, a computer jock at Norsk Hydro in Oslo, Norway, who had written about the accuracy of Intel’s transcendental functions. Mathisen confirmed the bug and, on November 3, posted a test program to comp.soft-sys.intel. A day later Andreas Kaiser in Germany used Mathisen’s program to post a list of numbers whose reciprocals were being computed to what appeared to be only single precision accuracy.

Tim Coe designed floating point hardware for an aerospace contractor in California. He didn’t have access to a Pentium machine. But from Kaiser’s list of erroneous reciprocals he was able to reverse engineer Intel’s division algorithm. He expected that certain divisions, involving quantities with specific bit patterns, would produce results that were much less accurate than even single precision. He drove to a local computer store, ran a calculator program on a Pentium in the showroom, and confirmed his prediction.

My Initial Involvement

I first heard about the FDIV bug (FDIV is the mnemonic for Floating point Division on x86 processors) in early November from an email list for floating point arithmetic. It appeared to be a single/double precision hardware glitch. Annoying, but not surprising. I started to follow comp.soft-sys.intel anyway. But when Coe posted his results, and when MathWorks tech support got a couple of queries asking how this affected MATLAB, I got seriously interested.

On November 15, 1994, I made the first of what would become several postings to both comp.soft-sys.intel and comp.soft-sys.matlab, summarizing what I knew. I pointed out that the relative error in one of Coe’s examples is $6.1 \cdot 10^{-5}$. This is ten orders of magnitude larger than what we expect from MATLAB, or any other scientific computation using IEEE double precision. The error might not occur very often, but when it does, it can be very significant.
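To make the size of that error concrete, here is a small Python sketch. It assumes that the example in question is the widely circulated division 4195835/3145727 and that the flawed Pentium returned 1.33373906890203759. Neither the operands nor the flawed quotient are given in the text above, so treat both as assumptions.

```python
import sys

# Assumed operands of the widely circulated example (not given in the text above).
x = 4195835.0
y = 3145727.0

# Assumed quotient reported from a flawed Pentium. It is hard-coded here
# because it cannot be reproduced on correctly working hardware.
flawed = 1.33373906890203759

correct = x / y                        # correctly rounded IEEE double quotient
rel_err = abs(correct - flawed) / correct
eps = sys.float_info.epsilon           # spacing of doubles near 1.0

print(f"correct quotient : {correct:.17g}")
print(f"flawed quotient  : {flawed:.17g}")
print(f"relative error   : {rel_err:.2e}")
print(f"double eps       : {eps:.2e}")
```

Under those assumptions the flawed quotient agrees with the correct one to only about four significant digits, which is worse than single precision would give.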

Within the next week the Net became very active. Several participants contacted newspapers and TV stations. Two engineers at the Jet Propulsion Laboratory convinced JPL to issue a press release announcing that they were no longer purchasing Pentium-based PCs. Reporters seeking more information found my posting and redistributed it.

On November 22 CNN sent a TV crew to MathWorks, interviewed me, and then led off their evening Moneyline with a story about Intel’s troubles. I spent the next day fielding phone calls from other reporters. A number of newspapers, including the New York Times and the San Jose Mercury News, ran stories on the 24th. The headline of the front page story in the Boston Globe was “Sorry, Wrong Number”.

Intel’s Initial Reaction

As a corporation, Intel did not know how to react and handled the situation badly. They had little experience dealing with the public. Their customers were computer manufacturers, not individual computer users. They had no experience whatsoever dealing with the Internet. Their first response came in the form of a canned statement that could be faxed back to anyone contacting tech support via fax. This FAXBACK document was soon posted to the Net, but not by Intel. I’ve included a copy in the Pentium Papers.

Intel’s statement said that they had already discovered the FDIV “flaw” themselves and had fixed it in recent releases of the Pentium chip. They claimed that it would occur only once in 9 billion divide operations and that the average spreadsheet user would encounter it only once in 27,000 years of use. They provided an 800 telephone number to call for anyone doing “prime number generation or other complex mathematics”. They would interview callers and offer to replace the chip for anyone they thought really required error-free divisions.
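The two figures in that statement imply a usage rate that the statement, as summarized here, does not spell out. A quick Python calculation recovers it, assuming a 365-day year (the variable names are only illustrative):

```python
divides_per_error = 9e9       # Intel's claim: one error per 9 billion divides
years_per_error = 27_000      # Intel's claim for the average spreadsheet user
days_per_year = 365           # assumption made for this sketch

divides_per_year = divides_per_error / years_per_error
divides_per_day = divides_per_year / days_per_year

print(f"implied divides per year: {divides_per_year:,.0f}")
print(f"implied divides per day : {divides_per_day:,.0f}")
```

In other words, the 27,000-year figure holds only for someone doing on the order of a thousand divisions a day. A heavier workload shortens the interval proportionally.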

Needless to say, this aggressive non-apology only inflamed the criticism of Intel on the Net. Nobody believed their claims about the frequency of occurrence. The frequencies came from a study Intel had made, but initially refused to release. As far as I was concerned, the problem was not the likelihood of encountering the FDIV bug, it was the fact that we had to worry about it at all.

Software Workaround

Tim Coe, Terje Mathisen and I devised a scheme where a Pentium FDIV hardware instruction could be replaced by fewer than a dozen lines of software that ensured all divisions were done correctly. The idea is that the divisor of each prospective division operation would be checked for the presence of certain bit patterns in the floating point fraction that made it “at risk”. When an at-risk divisor and the corresponding dividend are both scaled by 15/16, the quotient remains unchanged, but the operation can then be done safely by the FDIV instruction.
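The check-and-scale idea can be sketched in Python. This is not the original workaround code. The specific at-risk test below (leading fraction bits 0001, 0100, 0111, 1010, or 1101 followed by six 1 bits) is an assumption based on commonly published descriptions of the flaw, not something stated above. The function names are hypothetical, and the operands are the widely circulated example 4195835/3145727. On a correctly working processor the guard changes nothing, so the sketch only demonstrates the bit-pattern check and the scaling identity.

```python
import struct
from fractions import Fraction

# Assumed at-risk leading fraction bits (the four bits after the implicit 1).
AT_RISK_TOP4 = {0b0001, 0b0100, 0b0111, 0b1010, 0b1101}

def fraction_bits(d: float) -> int:
    """Return the 52 stored fraction bits of an IEEE double."""
    (raw,) = struct.unpack(">Q", struct.pack(">d", d))
    return raw & ((1 << 52) - 1)

def is_at_risk(divisor: float) -> bool:
    """Assumed test: an at-risk 4-bit prefix followed by six 1 bits."""
    frac = fraction_bits(divisor)
    top4 = frac >> 48                  # fraction bits 1..4
    next6 = (frac >> 42) & 0b111111    # fraction bits 5..10
    return top4 in AT_RISK_TOP4 and next6 == 0b111111

def guarded_divide(x: float, y: float) -> float:
    """Scale both operands by 15/16 when the divisor looks at risk."""
    if is_at_risk(y):
        # In plain doubles this scaling can itself round for operands that
        # use nearly all 53 significand bits; it is exact for the demo below.
        x = x * 15.0 / 16.0
        y = y * 15.0 / 16.0
    return x / y

x, y = 4195835.0, 3145727.0
xs, ys = x * 15.0 / 16.0, y * 15.0 / 16.0

print("divisor at risk        :", is_at_risk(y))
print("scaled divisor at risk :", is_at_risk(ys))
# Exact rational arithmetic confirms the quotient is unchanged by the scaling.
print("quotient unchanged     :", Fraction(x) / Fraction(y) == Fraction(xs) / Fraction(ys))
print("guarded == plain       :", guarded_divide(x, y) == x / y)
```

Multiplying by 15/16 is attractive because it shifts the leading bits of the divisor away from the suspect patterns while leaving the mathematical quotient untouched.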

We wanted to make our workaround widely available. Intel contacted us and, along with Peter Tang, a computer scientist who was then at Argonne and who has been a consultant to Intel, we began to work, via conference calls, with a group at Intel. It was our intention to provide the workaround to compiler writers and major software vendors, and to announce its availability on the Net. The workaround macro would replace FDIV in all PC software being developed. (It was more complicated — functions like “mod” and “rem” and a few transcendental functions like “atan” that had Pentium hardware support were also involved.)

Pentium Aware MATLAB

MATLAB was the proof-of-concept for the software workaround. We built and released a special “Pentium Aware” MATLAB. Its documentation says

MATLAB detects, and optionally repairs,
erroneous arithmetic results produced by Intel’s Pentium processor.
Erroneous results are infrequent, but can occur in many MATLAB operations
and functions. … When an erroneous result is detected, MATLAB prints a
message. … Options exist to suppress the printing of the messages,
count the number of occurrences, and suppress the corrections
altogether.

Our public relations firm had sent out a press release with the headline

THE MATHWORKS DEVELOPS FIX FOR THE INTEL PENTIUM(tm) FLOATING POINT ERROR

At the time, MathWorks had just reached its 10th anniversary. The company name was barely known in the industries we served, and completely unheard of by the general public. So when a press release arrived saying this obscure little company in Massachusetts had fixed the Pentium bug, it created quite a stir. I got dozens more phone calls.

I now have a folder of hundreds of press clippings from all over the world that resulted from the Pentium affair.

IBM’s Study

On December 12th IBM issued its own study. IBM had several reasons to get involved. They had invented, or at least named, the IBM PC, and one division was selling PCs employing the Pentium. Another division was developing and manufacturing its own chip, the PowerPC.

The IBM study claimed that typical spreadsheet calculations were likely to generate numbers with the “at risk” bit patterns and so FDIV errors were much more likely to occur than Intel claimed. The study also cleverly multiplied their predicted likelihood of an individual spreadsheet user encountering an error by an estimated total number of spreadsheet users worldwide to conclude “on a typical day a large number of people are making mistakes in their computation without realizing it.”

IBM announced they were suspending production of Pentium-based PCs.

Intel Surrenders

Within hours of the IBM announcement, Intel’s stock price dropped 10 points. A week later Intel issued an apology and announced a no-questions-asked return policy on Pentium chips. They set up a network of service centers to handle the replacements and allocated $475 million to pay for the replacement program.

Months later, very few actual requests for replacements had been made. We had learned that encountering the FDIV error was, in fact, very unlikely. But more important, for most people, Intel’s apology was enough.

Pentium Papers

As all this was happening, I saved some of the postings from the comp.soft-sys.intel news groups, a few newspaper stories and a few other contemporary documents. I would respond to email requests for information with “If you have access to the Internet, you can download my Pentium Papers using anonymous ftp from ftp.mathworks.com.”

As the MathWorks Web site developed there was usually a note somewhere about how to get to the Pentium Papers. When we started MATLAB Central File Exchange I moved the collection there as my first contribution.

Footnote

If you open the .txt files in the Pentium Papers with Microsoft Word, it will break up the long lines and produce a readable file.

Published with MATLAB® R2016a
