User Defined Types  «Prev

# Floating-point Numbers

Floating-point numbers are rational numbers, that is, numbers with decimal points, such as 76.5 or -1.0.
In this course, we do not care how a particular language implements floating-point values. The same is true for the integer and string data types. All that is important is that these are data types that can contain an integer or a text string.

In computing, floating point describes a method of representing an approximation of a real number in a way that can support a wide range of values. The numbers are, in general, represented approximately to a fixed number of significant digits (the significand) and scaled using an exponent. The base for the scaling is normally 2, 10 or 16. The typical number that can be represented exactly is of the form:

## Floating-point Representation

The idea of floating-point representation over intrinsically integer fixed-point numbers, which consist purely of significand, is that expanding it with the exponent component achieves greater range.
For instance, to represent large values, e.g. distances between galaxies, there is no need to keep all 39 decimal places down to femtometre-resolution (employed in particle physics). Assuming that the best resolution is in light years, only the 9 most significant decimal digits matter, whereas the remaining 30 digits carry pure noise, and thus can be safely dropped. This represents a savings of 100 bits of computer data storage. Instead of these 100 bits, much fewer are used to represent the scale (the exponent), e.g. 8 bits or 2 decimal digits. Given that one number can encode both astronomic and subatomic distances with the same nine digits of accuracy, but because a 9-digit number is 100 times less accurate than the 11 digits reserved for scale, this is considered a trade-off exchanging range for precision. The example of using scaling to extend the dynamic range reveals another contrast with fixed-point numbers: Floating-point values are not uniformly spaced. Small values, close to zero, can be represented with much higher resolution (e.g. one femtometre) than large ones because a greater scale must be selected for encoding significantly larger values.
That is, floating-point numbers cannot represent point coordinates with atomic accuracy at galactic distances, only close to the origin. The term floating point refers to the fact that a number's radix point (decimal point) can "float". That is, the decimal can be placed anywhere relative to the significant digits of the number. This position is indicated as the exponent component in the internal representation, and floating point can thus be thought of as a computer realization of scientific notation.

### IEEE 754 Standard

Over the years, a variety of floating-point representations have been used in computers. However, since the 1990s, the most commonly encountered representation is that defined by the IEEE 754 Standard.
The speed of floating-point operations, commonly referred to in performance measurements as FLOPS, is an important characteristic of a computer system, especially in software that performs large-scale mathematical calculations.