Archive

Posts Tagged ‘error’

Quality comparison of floating-point maths libraries

April 11th, 2011 No comments

What is the best way to compare the quality of floating-point math libraries (e.g., sin, cos and log)? The traditional approach for evaluating the quality of an algorithm implementing a mathematical function is based on mathematics; methods have been developed to calculate the maximum error between the calculated and the actual value. The answer produced by this approach does not say anything about how frequently this maximum error will occur, only that it occurs at least once.

The log (natural logarithm) is probably the most frequently used mathematical function and I decided to compare a few implementations; R statistical package version 2.11.1 and glibc (libm version 2.11.2) both running under Suse 11.3 on an AMD Athlon 64 X2, and Cygwin version 1.7.1 under Windows XP SP 2 on another AMD Athlon 64 X2.

The algorithm often used to implement mathematical functions involves evaluating a polynomial expression (e.g., Chebyshev polynomials) within a small range of values (various methods are used to map the argument into this range and then scale the calculated result). I decided to initially treat the implementations under test as black boxes and did not know the ranges they used; a range of 0.1 to 1.0 seemed like a good idea and I generated all single precision floating-point values between these two bounds (all 28,521,267 of them, with each adjacent pair still having 2^29 double precision values between them).

#include <math.h>
#include <stdio.h>
 
int main(int argc, char *argv[])
{
float val = 0.1,
       max_val = 1.0;
 
while (val < max_val)
   {
   printf("%12.10f\n", val);
   val=nextafterf(val, 1.1);
   }
}

This list of 28 million values was fed as input to three programs:

  • bc, which was used to generate the list of assumed to be correct logarithm of these 28 million values. R supports 64-bit IEEE compliant floating-point values, as do the C compilers/libraries used, and the number of decimal digits supported in this representation is 15. To provide greater accuracy to compare against the values generated by bc contained 17 digits, an extra two decimal digits over the IEEE values.
    scale=17
    while ((val=read() > 0))
       l(val)
  • A C program.
    #include <math.h>
    #include <stdio.h>
     
    int main(int argc, char *argv[])
    {
    double val;
     
    while (!feof(stdin))
       {
       scanf("%lf", &val);
       printf("%17.15f\n", log(val));
       }
    }
  • A R program.
    base_range=file("stdin", open="r")
     
    base_val=as.numeric(readLines(base_range, n=1))
    while (length(base_val) != 0)
       {
       cat(format(log(base_val), nsmall=15), file=stdout())
       cat("\n", file=stdout())
     
       base_val=as.numeric(readLines(base_range, n=1))
       }

The output of the C and R programs was then compared against the output from bc; which unfortunately creates a dependency on the accuracy of the C & R binary to decimal output routines (the subsequent comparison process gets around the decimal-to-binary input problem by reading the log values as strings and comparing the last few digits of each respective value). Accurate floating-point I/O needs something like hexadecimal floating constants.

Plotting the number of computed values of log that differ by a given amount from the value computed by bc we get (values whose error is below -50 will be rounded down and those above 50 rounded up, ignoring the issue of round to even):

Error in evaluation of log

The results (raw data for R, Linux C and Cygwin C) show that around 5.6% of values are off by one in the last (15th) digit (Cygwin was slightly worse at 5.7%). The results for R/Linux C were almost identical and a quick check of the R source tree showed that R calls the C library function to evaluate log (it is a bit worrying that R is dependent on the host maths library, they should think about replacing this dependency by something like MPFR tout suite; even though the 64-bit glibc library is of very high quality it still has an environmental dependency).

Being off by one in the last decimal place is unlikely to keep many people awake at night. But if we want a measure of quality, is percentage of inaccurate values a useful measure of math library quality? Provided it is coupled with the amount of inaccuracy I think this is a useful measure.

Share

CPUs also exhibit hardware faults

October 31st, 2009 No comments

The cpu is the one element of a computing platform that people rarely treat as a source of error caused by physically malfunction, i.e., randomly flipping a bit in a register or instruction pipeline. I once worked on a compiler for the Mototola 88000 using a test platform that contained alpha silicon (i.e., not yet saleable components where some of the instructions were known not to work; the generated assembler code was piped through a sed script that mapped these instructions into an alternative instruction sequence that did work) and the cpus in a few of the hardware updates turned out to be temperature sensitive; some of the instructions changed their behavior when they got too hot. People who write compilers using alpha silicon learn to expect this sort of thing.

Quite a bit has been published on faults in other hardware components. Some of the best recent empirical hardware fault data and analysis I have seen is that published by Google engineers on hard disc and dram memory fault occurrences in their server farms. They might have a problem publishing such results for the cpus they use because these commodity items generally don’t have the ability to report any detailed fault data, they just die or one of the programs being executed crashes.

As device fabrication continues to shrink erroneous behavior caused by cosmic ray impact will become more and more common. Housing a computer farm at a high altitude might not be a good idea (at 7500 ft cosmic ray-induced neutrons that can lead to soft errors are 6.4 times more common than at sea level).

IBM’s Power4 chip (“Power4 System Design for High Reliability” by Bossen, Tendler and Reick) is one of the few that provides error checking of cache contents, while IBM’s System z9 is one of the very few that provide parity checking on the cpu registers (“Enhanced I/O subsystem recovery and availability on the IBM System z9″ by Oakes et al).

One solution to the problem of unreliable cpu behavior is for the compiler to insert consistency checks into the generated code. Two such checking methods are:

  • ‘Signature Analysis’ which performs consistency checks between signatures calculated at compile time and runtime. A signature is associated with every basic block with the current signature being derived from the execution history. This technique can detect spurious changes to the flow of control caused by a hardware glitch.
  • ‘Error Detection by Duplicated Instructions’ generates code which duplicates the behavior of some instruction sequence and compares the result calculated by both sequences, i.e., a source language construct is executed twice and an error raised if the results are different. The parallel instruction sequences use different sets of registers on the same cpu and ideally the instructions are scheduled to exploit instruction level parallelism

At the moment cosmic-ray induced hardware faults are probably very small change compared to faults in the code. Will code quality increase to the point where cosmic-ray faults become an issue or will devices get so small that they have to be lead lined to prevent background radiation corrupting them? Let the race begin.

Share

Monte Carlo arithmetic operations

December 5th, 2008 No comments

Working out whether software based calculations involving floating-point values delivers a sensible answer requires lots of mathematical sophistication and in practice is often impractical or intractable. The vast majority of developers make no effort, indeed most don’t even know why the effort is needed. Various ‘end-user’ solutions have been proposed, e.g., interval arithmetic.

One interesting solution is to perturb the result of floating-point operations and measure the effect on the final answer. Any calculation that is sensitive to small random changes in the result of an operation (there is randomness present in any operation that operates on values that can only be represented to a finite precision) will produce answers that depend on the direction and magnitude of the perturbation. Comparing the answers from several program executions provides a measure of one kind of error present in the calculation.

Monte Carlo arithmetic is a proposed extension of floating-point arithmetic that operates by randomly selecting how round-off errors occur (the proposer provides sample code).

With computing power continuing to increase, running a program several times is often a viable option (we don’t all number crunch for cpu days). Most of the transistors on a modern CPU chip are devoted to memory cache, using a few of these to support Monte Carlo arithmetic instructions is entirely practical. Perhaps when vendors get over supporting the base-10 radix required by the latest IEEE 754R standard and are looking for something new to attract customers they will provide a mechanism that makes it practical to obtain estimates of some of the error in floating-point calculations.

Share
www.wenn.com
www.tinynibbles.com