ffllooaatt -- C Keyword

Data type

Floating-point numbers are a subset of the real numbers.  Each has a
built-in radix point (or ``decimal point'') that shifts, or ``floats'', as
the value of the number changes.  A floating-point number consists of the
following: one sign bit, which indicates whether the number is positive or
negative; bits that encode the number's exponent; and bits that encode the
number's fraction, or the number upon which the exponent works.  In
general, the magnitude of the number encoded depends upon the number of
bits in the exponent, whereas its precision depends upon the number of
bits in the fraction.

The ranges of values that can be held by a COHERENT float are set in header
file float.h.

The exponent often uses a bias.  This is a value that is subtracted from
the stored exponent to yield the power of two by which the fraction is
multiplied.

Floating-point numbers come in two levels of precision: single precision,
called floats; and double precision, called doubles.  With most
microprocessors, sizeof(float) returns four, which indicates that it is
four chars (bytes) long, and sizeof(double) returns eight.

Several formats are used to encode floats, including IEEE, DECVAX, and BCD
(binary coded decimal).

The  following   describes  DECVAX,  IEEE,   and  BCD  formats,   for  your
information.

_D_E_C_V_A_X _F_o_r_m_a_t
The 32 bits in a float consist of one sign bit, an eight-bit exponent, and
a 24-bit fraction, of which 23 bits are stored (see the ``hidden bit'',
described below).  Note that in this diagram, `s' indicates ``sign'', `e'
indicates ``exponent'', and `f' indicates ``fraction'':

     =============
     | seee eeee |Byte 4
     |===========|
     | efff ffff |Byte 3
     |===========|
     | ffff ffff |Byte 2
     |===========|
     | ffff ffff |Byte 1
     =============

The exponent has a bias of 129.

If the  sign bit is  set to one,  the number is  negative; if it  is set to
zero, then  the number is positive.   If the number is  all zeroes, then it
equals  zero;  an  exponent  and  fraction  of zero  plus  a  sign  of  one
(``negative zero'')  is by  definition not a  number.  All other  forms are
numeric values.

The most  significant bit in the  fraction is always set to  one and is not
stored.  It is usually called the ``hidden bit''.

The format for doubles simply adds another 32 fraction bits to the end of
the float representation, as follows:

     =============
     | seee eeee |Byte 8
     |===========|
     | efff ffff |Byte 7
     |===========|
     | ffff ffff |Byte 6
     |===========|
     | ffff ffff |Byte 5
     |===========|
     | ffff ffff |Byte 4
     |===========|
     | ffff ffff |Byte 3
     |===========|
     | ffff ffff |Byte 2
     |===========|
     | ffff ffff |Byte 1
     =============

_I_E_E_E _F_o_r_m_a_t
The IEEE encoding of a float uses the same layout as the DECVAX format.
Note, however, that the exponent has a bias of 127, rather than 129.

Unlike the DECVAX format, IEEE format assigns special values to several
floating-point numbers.  Note that in the following description, a tiny
exponent is one that is all zeroes, and a huge exponent is one that is all
ones:

-> A tiny exponent  with a fraction of zero equals  zero, regardless of the
   setting of the sign bit.

-> A huge exponent with a fraction of zero equals infinity; the sign bit
   determines whether it is positive or negative infinity.

-> A  tiny exponent  with a  fraction greater than  zero is  a denormalized
   number, i.e., a number that is less than the least normalized number.

-> A huge exponent with a fraction greater than zero is, by definition, not
   a number.  These values can be used to handle special conditions.

An IEEE double, unlike DECVAX format, increases the number of exponent
bits.  It consists of a sign bit, an 11-bit exponent, and a 53-bit
fraction, of which 52 bits are stored and one is the hidden bit, as
follows:

    =============
    | seee eeee |   Byte 8
    |===========|
    | eeee ffff |   Byte 7
    |===========|
    | ffff ffff |   Byte 6
    |===========|
    | ffff ffff |   Byte 5
    |===========|
    | ffff ffff |   Byte 4
    |===========|
    | ffff ffff |   Byte 3
    |===========|
    | ffff ffff |   Byte 2
    |===========|
    | ffff ffff |   Byte 1
    =============

The exponent has a bias of 1,023.  The rules of encoding are the same as
for floats.

_B_C_D _F_o_r_m_a_t
The BCD  format (``binary coded decimal'',  also called ``packed decimal'')
is used to eliminate rounding errors  that alter the worth of an account by
a fraction of  a cent.  It consists of a  sign, an exponent, and a chain of
four-bit numbers, each of which is  defined to hold the values zero through
nine.

A BCD float has a sign bit, seven bits of exponent, and six four-bit
digits.  In the following diagrams, `d' indicates ``digit'':

    =============
    | seee eeee |   Byte 4
    |===========|
    | dddd dddd |   Byte 3
    |===========|
    | dddd dddd |   Byte 2
    |===========|
    | dddd dddd |   Byte 1
    =============

A BCD double has a sign bit, 11 bits of exponent, and 13 four-bit digits,
as follows:

    =============
    | seee eeee |   Byte 8
    |===========|
    | eeee dddd |   Byte 7
    |===========|
    | dddd dddd |   Byte 6
    |===========|
    | dddd dddd |   Byte 5
    |===========|
    | dddd dddd |   Byte 4
    |===========|
    | dddd dddd |   Byte 3
    |===========|
    | dddd dddd |   Byte 2
    |===========|
    | dddd dddd |   Byte 1
    =============

Storing the hexadecimal values A through F in a digit yields unpredictable
results.

The following rules apply when handling BCD numbers:

-> A tiny exponent with a fraction of zero equals zero.

-> A tiny exponent with a non-zero fraction indicates a denormalized
   number.

-> A huge exponent with a fraction of zero indicates infinity.

-> A huge exponent with a non-zero fraction is, by definition, not a
   number; these non-numbers are used to indicate errors.

_C_O_H_E_R_E_N_T _F_l_o_a_t_i_n_g _P_o_i_n_t
COHERENT 286 uses DECVAX floating-point format.  COHERENT 386 uses IEEE
floating-point format.  Please note that this does not mean that the
COHERENT-386 floating-point software fully implements the IEEE standard;
for example, it does not support denormals.

To allow you to convert binary data from one floating-point format to
another, COHERENT comes with four functions that convert DECVAX-format
floating-point numbers to IEEE format, and vice versa.  They are as
follows:

ddeeccvvaaxx_dd()
        Convert an IEEE ddoouubbllee to DECVAX format.

ddeeccvvaaxx_ff()
        Convert an IEEE ffllooaatt to DECVAX format.

iieeeeee_dd()
        Convert a DECVAX ddoouubbllee to IEEE format.

iieeeeee_ff()
        Convert a DECVAX ffllooaatt to IEEE format.

For details, see their respective entries in the Lexicon.

_S_e_e _A_l_s_o
CC kkeeyywwoorrddss, ddaattaa ffoorrmmaattss, ddeeccvvaaxx_dd, ddeeccvvaaxx_ff, ddoouubbllee, eeccvvtt(), eemm8877, ffccvvtt(),
ffllooaatt, ffllooaatt.hh, ggccvvtt(), iieeeeee_dd, iieeeeee_ff
_T_h_e _A_r_t _o_f _C_o_m_p_u_t_e_r _P_r_o_g_r_a_m_m_i_n_g, vol. 2, page 180_f_f
ANSI Standard, section 6.1.2.5

_N_o_t_e_s
The COHERENT-386 preprocessor implicitly defines the macro _IEEE, whereas
the COHERENT-286 preprocessor implicitly defines the macro _DECVAX.  These
can be used to conditionally include code that applies to a specific
edition of COHERENT.  If you are writing code that uses floating-point
numbers intensively and you want to compile the code under both editions
of COHERENT, you can write code of the form:

    #ifdef _DECVAX
        ...
    #elif defined(_IEEE)
        ...
    #endif

The  C preprocessor  under each  edition of COHERENT  will ensure  that the
correct code is included for compilation.

The C compiler for release 4.2 of COHERENT (and subsequent releases) can
generate code to use a numeric coprocessor.  For information on the
differences between hardware floating-point arithmetic and its software
counterpart, and how to include such code in your programs, see the Lexicon
entry for cc.
