Numerical Computing with Modern Fortran
Richard J. Hanson and Tim Hopkins
Chapter 9: IEEE Arithmetic Features and Exception Handling
Source Code:
There is a single executable that may be generated by the makefile:
- book_test uses set_precision.f90, dnrm2_testmod.f90, dnrm2_support.f90, dnrm2_ieee.f90.
The INTRINSIC modules IEEE_ARITHMETIC and IEEE_EXCEPTIONS must be available with the compiler. See the directory gfortran for support for the gfortran compiler, which did not have these modules available at the time of publication, November 15th 2013.
Sample output from book_test
This is an output sample, using INTRINSIC modules and the Intel compiler XE 14.0.1.139.
Test No: 1 -- n <= 0 signifies null array
  0.00000000D+00 0.00000000D+00 0.00000000D+00
  Values Expected, Returned, Relative Error with Expected
Test No: 2 -- Illegal value of incx
  0.00000000D+00 0.00000000D+00 0.00000000D+00
  Values Expected, Returned, Relative Error with Expected
Test No: 3
  Returned expected infinite result
Test No: 4
  Returned expected infinite result
Test No: 5
  Returned expected NaN result
Test No: 6
  Returned expected NaN result
Test No: 7 -- Obvious overflow
  Returned expected infinite result
Test No: 8 -- Boundary value causing overflow
  Returned expected infinite result
Test No: 9 -- Boundary value not causing overflow
  0.17976931+309 0.17976931+309 0.00000000D+00
  Values Expected, Returned, Relative Error with Expected
  inexact flag incorrectly set. Expected: F
Test No: 10 -- No underflow when dealing with very small values
  0.44501477-307 0.44501477-307 0.00000000D+00
  Values Expected, Returned, Relative Error with Expected
Test No: 11 -- Simple data for representable result
  0.50000000D+01 0.50000000D+01 0.00000000D+00
  Values Expected, Returned, Relative Error with Expected
Test No: 12 -- Simple data for representable result (exact)
  0.92438690D+08 0.92438690D+08 0.00000000D+00
  Values Expected, Returned, Relative Error with Expected
Test No: 13 -- Simple data for representable result (inexact)
  0.11633466D+09 0.11633466D+09 0.00000000D+00
  Values Expected, Returned, Relative Error with Expected
Test No: 14 -- n=1: Trivial case
  0.30000000D+01 0.30000000D+01 0.00000000D+00
  Values Expected, Returned, Relative Error with Expected
Test No: 15 -- n=1: Trivial case (boundary value)
  0.17976931+309 0.17976931+309 0.00000000D+00
  Values Expected, Returned, Relative Error with Expected
Test No: 16 -- n=1: Trivial case (boundary value)
  0.22250739-307 0.22250739-307 0.00000000D+00
  Values Expected, Returned, Relative Error with Expected
Test No: 17 n=1: +Inf
  Returned expected infinite result
Test No: 18 n=1: -Inf
  Returned expected infinite result
Test No: 19 n=1: sNaN
  Returned expected NaN result
Test No: 20 n=1: qNaN
  Returned expected NaN result
Test No: 21 -- incx = 2, exact result
  0.70000000D+01 0.70000000D+01 0.00000000D+00
  Values Expected, Returned, Relative Error with Expected
Test No: 22 -- incx = 3, inexact result
  0.53851648D+01 0.53851648D+01 0.00000000D+00
  Values Expected, Returned, Relative Error with Expected
Test No: 23 -- incx = 2, Inf and qNan present in input vector but should not be accessed
  0.70000000D+01 0.70000000D+01 0.00000000D+00
  Values Expected, Returned, Relative Error with Expected
Discussion of results obtained above
Note that Test 9 signals INEXACT for a special case, although we have specified that it should not. Here are some thoughts.
The test uses the vector (maxreal, xvalue), where xvalue is chosen so that its square is just too small to register in the sum of squares, i.e., it is treated as zero. The result is then maxreal to the available precision. If xvalue is multiplied by 2 (for example), the square and add generates an overflow condition.
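The boundary described here can be explored with exact rational arithmetic. The following is a minimal Python sketch, not the book's Fortran test code: the name norm_rounds_to_maxreal and the particular xvalue (maxreal scaled by 2**-27) are hypothetical choices made for illustration, and IEEE double precision with round-to-nearest is assumed. It asks whether the exact norm of (maxreal, xvalue) lies below the midpoint between maxreal and 2**1024, which is the point at which rounding produces overflow.

```python
import sys
from fractions import Fraction

MAXREAL = sys.float_info.max      # largest finite double: 2**1024 - 2**971
HALF_ULP = Fraction(2) ** 970     # half the gap between MAXREAL and 2**1024


def norm_rounds_to_maxreal(xvalue):
    """Return True if the exact 2-norm of (MAXREAL, xvalue) rounds to MAXREAL.

    Return False if it rounds past MAXREAL, i.e. overflows. Only finite
    inputs are supported, because Fraction cannot represent Inf or NaN.
    """
    exact_sq = Fraction(MAXREAL) ** 2 + Fraction(xvalue) ** 2
    # A tie at the midpoint rounds to 2**1024 (overflow), so compare strictly.
    midpoint = Fraction(MAXREAL) + HALF_ULP
    return exact_sq < midpoint ** 2


# Hypothetical xvalue, not taken from the book: an exact power-of-two scaling.
xvalue = MAXREAL * 2.0 ** -27

print(norm_rounds_to_maxreal(xvalue))        # True: result is MAXREAL
print(norm_rounds_to_maxreal(2.0 * xvalue))  # False: doubling xvalue overflows
```

Note that squaring either component directly in double precision would itself overflow, which is why the check is done on exact rationals rather than on floating-point values.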
Looking online at what appear to be reasonable discussions, one can find a writer using the definition:
"The rounded result of a valid operation is different from the infinitely precise result", in which case the inexact flag should be set.
Another uses the definition: "An operation produces a result that cannot be represented with infinite precision", in which case one could make a case for not setting it.
Finally, the standard says something like "The inexact exception need not be set correctly, however the other exceptions must be set as specified." Presumably this is intended to allow the implementor to choose whether or not to set this flag for efficiency reasons.
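To make the first definition concrete, one can compare the returned norm with the infinitely precise one. The sketch below is an illustration in Python rather than part of the book's test suite: the helper name is_inexact and the sample vectors are assumptions, and only finite inputs are handled. It squares the returned value exactly and compares it with the exact sum of squares of the components.

```python
import sys
from fractions import Fraction


def is_inexact(returned, components):
    """Return True if `returned` differs from the exact 2-norm of `components`.

    The comparison is done on squares in exact rational arithmetic, so no
    rounding occurs in the check itself. Inputs must be finite.
    """
    exact_sq = sum(Fraction(c) ** 2 for c in components)
    return Fraction(returned) ** 2 != exact_sq


maxreal = sys.float_info.max
xvalue = maxreal * 2.0 ** -27   # hypothetical small component, not the book's

print(is_inexact(5.0, [3.0, 4.0]))              # False: 5 is the exact norm
print(is_inexact(maxreal, [maxreal, xvalue]))   # True: xvalue's square is lost
```

Under this reading, any nonzero xvalue makes a returned value of maxreal differ from the infinitely precise result, so the first definition would have the flag set for a vector of the kind used in Test 9.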