Sunday, 6 November 2011

How reliable are floating-point comparisons in the C programming language?

Floating-point numbers are the “black art” of computer programming. One reason is that there is no
optimal way to represent an arbitrary number. The Institute of Electrical and Electronics Engineers
(IEEE) has developed a standard for the representation of floating-point numbers (IEEE 754), but you
cannot guarantee that every machine you use will conform to it.
Even if your machine does conform to the standard, there are deeper issues. It can be shown mathematically
that there are infinitely many real numbers between any two distinct numbers. For the computer to
distinguish between two numbers, the bits that represent them must differ. Distinguishing infinitely many
numbers would therefore take infinitely many bit patterns, and an unbounded number of bits. Because the
computer must represent a large range of numbers in a small number of bits (usually 32 or 64), it can only
approximate most numbers.
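The approximation is easy to observe. The following is a minimal sketch, assuming a typical implementation in which float carries fewer significant bits than double. The helper names float_storage_is_lossy and show_float_rounding are hypothetical and not part of any library. It checks whether a value survives a round trip through a float.

```c
#include <stdio.h>

/* Returns 1 if storing value in a float changes it, 0 if it survives
   exactly. (Hypothetical helper, for illustration only.) */
int float_storage_is_lossy(double value)
{
    float narrowed = (float)value;      /* round to float precision */
    return (double)narrowed != value;   /* compare after widening back */
}

/* Prints the value as given and as it is actually stored in a float. */
void show_float_rounding(double value)
{
    printf("%.17g stored as float is %.17g\n",
           value, (double)(float)value);
}
```

Calling float_storage_is_lossy(0.007) should report a loss, while values such as 0.5 or 70.0, which are sums of a few powers of two, should come back unchanged.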
Because floating-point numbers are so tricky to deal with, it’s generally bad practice to compare a
floating-point number for equality with anything. Inequalities are much safer. If, for instance, you want
to step through a range of numbers in small increments, you might be tempted to write this:
 
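#include <stdio.h>

const float first = 0.0f;
const float last = 70.0f;
const float small = 0.007f;

int main(void)
{
    float f;

    /* Fragile: f != last may never become false, so the second
       test stops the loop once f has overshot last by 1.0. */
    for (f = first; f != last && f < last + 1.0; f += small)
        ;
    printf("f is now %g\n", f);
    return 0;
}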

However, 0.007 has no exact binary representation, so the value stored in small is only an approximation,
and each addition can introduce further rounding error. As a result, f might never be exactly equal to
last (it might go from being just under it to being just over it), and the loop would go past the value
last. The inequality f < last + 1.0 has been added to prevent the program from running on for a very long
time if this happens. If you run this program and the value printed for f is 71 or more, this is what has
happened.

A safer way to write this loop is to use the inequality f < last to test for the loop ending, as in this example:

float f;

/* Stops as soon as f reaches or passes last; no exact match is needed. */
for (f = first; f < last; f += small)
    ;

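To see how far accumulated rounding error pushes this loop, you can count its iterations. This is only an illustrative sketch: count_steps is a hypothetical helper, and it assumes step is positive and large enough that adding it always changes f.

```c
#include <stdio.h>

/* Runs the "f < last" loop and returns how many times f was incremented.
   If final_f is not NULL, the value of f on exit is stored there.
   (Hypothetical helper; assumes step > 0 and not vanishingly small.) */
long count_steps(float first, float last, float step, float *final_f)
{
    float f;
    long steps = 0;

    for (f = first; f < last; f += step)
        steps++;

    if (final_f != NULL)
        *final_f = f;
    return steps;
}
```

For first = 0, last = 70, and step = 0.007, exact arithmetic would give 10000 iterations. The count returned on a given machine may be off by a few, which is why the loop should not depend on f landing exactly on last.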
You could even precompute the number of times the loop should be executed and use an integer to count
iterations of the loop, as in this example:

float f;
/* Round to nearest: plain truncation could turn 9999.9997 into 9999. */
int count = (int)((last - first) / small + 0.5f);

for (f = first; count-- > 0; f += small)
    ;
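A further variation, not part of the loops shown here, is to compute each value from the integer counter rather than by repeated addition, so that rounding error does not build up from one iteration to the next. The sketch below assumes step > 0 and last >= first. The names step_count and nth_value are hypothetical helpers.

```c
#include <stdio.h>

/* Number of steps of size step between first and last, rounded to the
   nearest integer so a quotient just under a whole number is not
   truncated downward. Assumes step > 0 and last >= first. */
int step_count(float first, float last, float step)
{
    return (int)((last - first) / step + 0.5f);
}

/* Value at step i, computed directly from i instead of by accumulating
   step, so each result carries only a single rounding error. */
float nth_value(float first, float step, int i)
{
    return first + (float)i * step;
}
```

A loop that runs an integer i from 0 up to step_count(first, last, small) can then call nth_value(first, small, i) to obtain each f.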

Cross Reference:

II.11: Are there any problems with performing mathematical operations on different variable
types?
