Read Digital Edition


ADS BY GOOGLE
Top Three Links You Must Click On


Sun engineer's response to 'How Sun can pull out of its slump'
Consistency is key

Dear Mr. Murphy,

We appreciate the high estimation in which you hold Sun Microsystems, as expressed in your LinuxWorld article of 1/29/03. We would like to address a small part of the article in which you call attention to what you see as a problem with SPARC and scientific computing, quoted below; in fact, although there is something to be learned from your square-root summation example, it is not what you conclude. More follows the quotation:

  • The actual problem is that SPARC, like other RISC CPUs, defaults to the IEEE 754 64-bit floating-point specification. Therefore, it doesn't get the right answers in cases requiring extensive iterative computation. If, for example, you get it to sum the square roots of the first billion positive integers using the GCC compiler and default libraries, the result is off by 0.18.

    Although completely meaningless in business processing, that error is devastating if you're trying to simulate the three-body problem over a long period. You can get around this by using extended precision libraries, but that cuts performance significantly.

  • The self-inflicted version of this problem is that SPARC has an answer to floating-point performance problems in its SIMD instruction set, but Sun doesn't go out of its way to tell anyone about it and hasn't given the research community the support needed to motivate its use. The main problem is that researchers tend to develop for the box on their desk, and these days that's more likely to be an Intel box running FreeBSD or Linux than a SPARC. As a result, when they do run that code on a SPARC, there's nothing there to take advantage of the short-array capabilities provided in the hardware, and they get neither the performance nor the accuracy that the machine is capable of."

The main points we want to make are:

  • The error as measured by the more suitable unit of ulps is not really large and it is further controllable by improving the summation algorithm.
  • The SPARC results are uniform across systems while the IA32 results vary from somewhat better to much worse; uniformity is a virtue. Note that the results on other RISC systems match Sun's and also note that high performance scientific computing is still much more practiced on the RISC systems exemplified in the table.
  • The variability of IA32 results stems from improper management of extended precision registers vis-a-vis memory precision, a subtlety that vitiates significantly the potential real advantage of extended precision intermediate computation. Note that IA32 does indeed comply with IEEE754; its differences with SPARC or other RISCs is within IEEE754 latitude.
  • The example is typical of kernels in the solution of differential equations, like the three-body problem to which you refer, which will overwhelm eventually any finite-bounded precision representation, so better numerical approaches must be used anyway.
  • The SIMD instruction set, VIS, available in successively improved versions in UltraSPARC processors, does not as yet implement floating-point arithmetic. Until recently this disadvantage with respect to the IA32 SIMD sets, SSE and SSE2, was not material since the latter was addressing a more egregious floating-point performance issue. VIS is used to accelerate some FP codes on SPARC by doing logical and bit-wise operations on FP registers. Note that SSE and SSE2 implement SPARC-style pure double and float precision FP arithmetic; there's no use of extended precision.

Next is a more detailed discussion. For a good foundation in floating-point computing we recommend the ACM article by David Goldberg "What Every Computer Scientist Should Know about Floating-Point Computing" especially with the addendum by Douglas Priest that addresses IA32. Also of interest might be Joe Darcy's JavaOne talk or even better, but requiring jdc registration, is What Everybody Using the Java Programming Language Should Know About Floating-Point Arithmetic

I) The Code and Compilation

The source code at the heart of the discussion:

        ------- sqr.c -------
        #include 
        #include 
        main()
        {
        register i;
        double f=0.0;
        double sqrt();

for(i=0;i<=1000000000;i++) { f+=sqrt( (double) i); } printf("Finished %20.14f\n",f); } ------- sqr.c -------

Investigators were asked to compile this program with the following command:

cc -o sqr -O sqr.c -lm

and to run the executable as follows:

time ./sqr

II) The Results: Correct versus IA32 versus SPARC

RUN-NameResultDouble Prec Rep
1) Correct: 21081851083600.37596259382529338=0x42b32c803ebb5060
2) SPARC-quad: 21081851083600.37500000000000=0x42b32c803ebb5060
3) Correct+1ULP: 21081851083600.3789=0x42b32c803ebb5061
4) IA32: 21081851083600.38281250000000=0x42b32c803ebb5062
5) SPARC double: 21081851083600.55859375000000=0x42b32c803ebb508f

The table below presents results from several sources/runs in order of accuracy.

Entry number 1 is the result from the Euler-Maclaurin formula as presented by Murphy. The third column of the table gives the IEEE-754 floating-point double-precision hexadecimal representation of the result.

Entry number 3 presents not the result of a run, but rather serves to illustrate the error in the decimal representation when the hexadecimal number varies by 1 bit, or unit in the last place, denoted ULP.

The best computed results are in error by 2 ULPs and are represented by entry number 4.

The SPARC results which are in error by 47 ULPs, are shown as entry number 5.

Entry number 2, shows a result that in decimal representation, is in error by -.00096259382529338. However, observe that the hex representation shows an error of 0 ULPs, an exact answer. The reason for this apparent contradiction is that IEEE-754 double precision represents only 16- or 17-decimal digits. Additional digits are used in rounding when converting from decimal to the double precision format. Entry number 2 presents the double-precision results when the sum is kept in IEEE-754 128-bit format.

I) Magnitude of the Errors

RUN-NameDecimal ErrorULP Errorerr/correct
SPARC-quad-0.0009625938252933800
Correct+1ULP0.0029374061747066211.39333E-16
BEST0.0068499061747066223.24920E-16
SPARC-double0.182631156174706624786.6296E-16

Consider now the errors encountered in the survey, based on the table in section II. Consider #1 as perfect. Errors can be considered in terms of decimal representation, and ULPs.

When considering the absolute error alone, SPARC results appear bad. However, considering the ratio error/correct, the variance between results, in fact, is discovered to be very small.

About Gregory V. Tarsy
Gregory V. Tarsy works in the Floating Point and Numerical Computing Group at Sun Microsystems.

In order to post a comment you need to be registered and logged in.

Register | Sign-in

Reader Feedback: Page 1 of 1

  Subscribe to our RSS feeds now and receive the next article instantly!
In It? Reprint It! Contact advertising(at)sys-con.com to order your reprints!
Subscribe to the World's Most Powerful Newsletters

ADS BY GOOGLE
We talk a lot about social media on Marketing Trenches. And for good reason – Social media seems to...
Intel has put out its promised beta SDK for Windows (C and C++) and Moblin (C) developers working on...
InformationWeek stumbled on a Microsoft patent application dating back to 2006 deceptively titled “M...
Berlin-based ThinPrint AG, the printer virtualization house, thinks it’s got a cloud solution for th...
IBM has acquired Guardium, a seven-year-old subsidiary of Israel’s Log-On Software transplanted to M...
But on the web, access to services is implicit in the fact that the business is offering the service...
Behaving like it’s got a future, Sun Monday put out what it calls a significant new version of Virtu...
Oracle has offered to cordon off MySQL inside a combined Oracle-Sun to get the European Commission t...
The second set of charges filed last week against Indian outsourcer Satyam Computer Services founder...
Gartner told Reuters that it overestimated how many PCs Acer shipped in the last seven quarters by a...
Office Web Apps, Microsoft’s answer to Google Apps, are supposed to be out sometime in June along wi...
Gartner thinks the server business has stopped sliding into the abyss. Third-quarter sales weren’t a...
Gartner is buying ~$40 million-a-year AMR Research Inc for close to $64 million in cash. AMD special...
Singed by user reaction to its plans to up the price of its support contracts, SAP Tuesday postponed...
Apparently Google Gears ain’t gonna stick around that long. Google Apps will eventually get their of...
Oracle seems to have divided the open source ranks over the MySQL delay it’s having closing its acqu...
We hear – well, you know how people talk – that Oracle has been quietly meeting with the European Co...
The Korean government is going to sink around $172 million into cloud computing next year under a st...
In response to Opera’s complaints Microsoft has reportedly modified the proposed ballot screen that’...
Microsoft has sold the Folio and NXT businesses it got when it bought Fast Search and Transfer, the ...