Annotated Floating-Point Bibliography
prepared for CS279, Spring 1992
by Deyu Judy Hu

[ 1] Alefeld, G., and J. Herzberger. Introduction to Interval Computations. New York: Academic Press, 1983.
     A systematic introduction to the concepts and applications of interval analysis.

[ 2] Atkins, D.E. "Higher-Radix Division Using Estimates of the Divisor and Partial Remainders." IEEE Trans. Comput., vol. C-17(1968): 925-934.
     Reviews the theory of the SRT division technique and develops analytic expressions for determining the number of bits of divisor and partial remainder which must be inspected for a given radix.

[ 3] Bailey, David H. "A Portable High Performance Multiprecision Package." To appear in ACM Transactions on Mathematical Software.
     Describes the MPFUN multiprecision package of Fortran routines that perform a variety of arithmetic operations and transcendental functions on floating point numbers of arbitrarily high precision. Includes a discussion of some advanced multiprecision multiplication algorithms employed.

[ 4] Bailey, David H. "The Computation of Pi to 29,360,000 Decimal Digits Using Borweins' Quartically Convergent Algorithm." Mathematics of Computation, 50.181(1988): 283-296.
     See pp. 287-289 for more discussion on multi-precision multiplication algorithms including Complex Fourier Transform and Prime Modulus Transform methods.

[ 5] Bleher, J.H., et al. "ACRITH: High-Accuracy Arithmetic -- An Advanced Tool for Numerical Computation." In 7th Symposium on Computer Arithmetic, ed. Kai Hwang, 318-321. Silver Spring, MD: IEEE Computer Society Press, 1985.
     The IBM ACRITH package of numerical software is introduced; its application environment and underlying 20-instruction facility are described. See the critique [50] by Kahan & LeBlanc.

[ 6] Bleher, J.H., et al. "FORTRAN-SC, A Study of a FORTRAN Extension for Engineering/Scientific Computation with Access to ACRITH." Computing, 39(1987): 93-110.
     Describes the language FORTRAN-SC and the features intended to facilitate engineering and scientific computation as well as the use of the ACRITH subroutine library.

[ 7] Booth, A.D. "A Signed Binary Multiplication Technique." Q. J. Mech. Appl. Math., 4(1951): 236-240.
     Describes a two's complement multiplication algorithm that avoids the need to correct the product when either input is negative.

[ 8] Brent, R.P. "A Fortran Multiple Precision Arithmetic Package." ACM Transactions on Mathematical Software, 4(1978): 57-70.
     Describes the MP package of Fortran subroutines for performing multi-precision floating point arithmetic, and presents some of the algorithms.

[ 9] Cheng, T.C., and I.T. Ho. "Storage-Efficient Representation of Decimal Data." Communications of the ACM 18.1(1975): 49-52.
     Presents a simple binary encoding scheme of decimal digits which achieves high storage efficiency.

[10] Clenshaw, C.W., and F.W.J. Olver. "Beyond Floating Point." Journal of the Association for Computing Machinery, 31(1984): 319-328.
     A new number system -- level-index arithmetic -- is proposed for computer arithmetic based on iterated exponential functions to eradicate over/underflow. See also Demmel's critique [17].

[11] Cody, W.J., et al. "A Proposed Radix- and Word-length-independent Standard for Floating-point Arithmetic." IEEE Micro 4.4(1984): 86-100.
     The most readable and precise exposition of the IEEE 854 standard to date; some implementation problems and how to overcome them are also discussed.

[12] Coonen, J.T. "An Implementation Guide to a Proposed Standard for Floating-point Arithmetic." Computer 13.1(1980): 68-79.
     Provides reasonable algorithms for arithmetic operations and exception handling that implement the IEEE standard.

[13] Corbett, R.R. "Enhanced Arithmetic for Fortran." ACM Signum Newsletter, 18.1(1983): 24-28; ACM SIGPLAN Newsletter, 17.12 (1982): 41-48.
     The problems of interpreting mixed-precision arithmetic expressions are introduced, a strategy -- widest need evaluation -- is discussed, and its application to Fortran is presented.

[14] Dadda, L. "Some Schemes for Parallel Multipliers." Alta Freq. 34(1965): 349-356.
     Describes the parallel multiplication process in more detail than Wallace's paper [87].

[15] Dekker, T.J. "A Floating-Point Technique for Extending the Available Precision." Numerische Mathematik, 18(1971): 224-242.
     A technique is described for expressing multilength floating-point arithmetic in terms of singlelength floating-point arithmetic. See also D. Priest's article [72].

[16] Demmel, James. "Underflow and the Reliability of Numerical Software." SIAM Journal on Scientific and Statistical Computing 5 (1984): 887-919.
     Examines the effects of different underflow mechanisms (mainly store zero and gradual underflow) on the reliability of numerical software.

[17] Demmel, James. "On Error Analysis in Arithmetic with Varying Relative Precision." Proceedings of the IEEE 8th Symposium on Computer Arithmetic, ed. M.J. Irwin and R. Stefanelli, 148-152. Washington, D.C.: IEEE Computer Society Press, 1987.
     Illustrates that nonconventional floating-point representations proposed by Clenshaw/Olver and Iri/Matsui require extra effort in error analysis; they are not shortcuts to writing reliable numerical code.

[18] Dodrill, James D., et al. "A Low Latency Implementation of a Convergence Divide Algorithm." ??
     A low-latency implementation of the floating-point division operation which uses the Goldschmidt algorithm is presented. The final rounding algorithm is explained.

[19] Dongarra, J.J., et al. LINPACK User's Guide. Philadelphia, PA: SIAM Press, 1979.
     LINPACK has since been superseded by LAPACK, q.v.

[20] Farnum, Charles. "Compiler Support for Floating-Point Computation." Software--Practice and Experience, 18.7 (1988): 701-709.
     Addresses a number of issues that compiler writers should know about in order to produce good floating-point code.

[21] Fateman, R.J. "The MACSYMA Big-Floating-Point Arithmetic System." In Proceedings of the 1976 ACM Symposium on Symbolic and Algebraic Computation, ed. R.D. Jenks. New York: ACM, 1976.
     Describes the bigfloat data type and arithmetic system in MACSYMA.

[22] Goldberg, David. "What Every Computer Scientist Should Know About Floating-Point Arithmetic." ACM Computing Surveys 23.1 (1991): 5-48.
     A tutorial on the IEEE 754/854 standards and other software and hardware issues of floating point arithmetic that have a direct impact on computer system designers. A version augmented by D. Priest's comments is available on-line at ???????

[23] Goldberg, D. "Computer Arithmetic." Appendix A of Computer Architecture: A Quantitative Approach, by J.L. Hennessy and D.A. Patterson. San Mateo, CA: Morgan Kaufmann Publishers, 1990.
     An introduction to algorithms for fixed and floating point arithmetic, with emphasis on the IEEE 754 standard.

[24] Goldberg, I.B. "27 Bits Are Not Enough for 8-digit Accuracy." Communications of the ACM 10(1967): 105-106.
     A general formula tells how many bits are enough for p-digit accuracy.

[25] Griewank, Andreas. "On Automatic Differentiation." In Mathematical Programming: Recent Developments and Applications, ed. M. Iri and K. Tanabe, 83-109. Dordrecht, Holland: Kluwer Academic Publishers, 1989.
     Compares several techniques for evaluating gradients and shows that, for numerical purposes, the reverse mode of automatic differentiation is superior to other methods. Several software implementations of automatic differentiation are also discussed.

[26] Griewank, Andreas. "Direct Calculation of Newton Steps without Accumulating Jacobians." In Large-Scale Numerical Optimization, 115-137. Philadelphia, PA: SIAM, 1990.
     Shows that in calculating the roots of vector functions, direct computation of an extended linear system without accumulating Jacobians can be superior in performance.

[27] Hamada, H. "A New Real Number Representation and Its Operation." Proceedings of the IEEE 8th Symposium on Computer Arithmetic, ed. M.J. Irwin and R. Stefanelli, 153-157. Washington, D.C.: IEEE Computer Society Press, 1987.
     An internal representation called URR is proposed for real numbers. A variable length exponent part is used to "eliminate" overflow and underflow. See Demmel's critique [17].

[28] Hwang, K. Computer Arithmetic: Principles, Architecture, and Design. John Wiley & Sons, 1979.

[29] Hull, T.E., et al. "Numerical Turing." SIGNUM Newsletter 20.3(1985): 26-33.
     Describes the Numerical Turing language with emphasis on its new features: clean decimal arithmetic and complete precision control of variables and operations.

[30] IBM High-Accuracy Arithmetic Subroutine Library (ACRITH): General Information Manual. GC 33-6163-02, 3rd ed. (April 1986)
     Briefly describes the basic ideas underlying ACRITH's arithmetic and those special features that distinguish ACRITH from other mathematical subroutine libraries.

[31] IBM High-Accuracy Arithmetic Subroutine Library (ACRITH): Program Description and User's Guide. SC33-6164-02, 3rd ed. (April 1986)
     The ACRITH subroutine library and its underlying high-accuracy arithmetic facility are discussed in more detail.

[32] IBM System/370 RPQ: High Accuracy Arithmetic. SA22-7093-0, (Jan. 1984)
     Describes the IBM System/370 RPQ high-accuracy-arithmetic facility.

[33] IEEE 1987. IEEE Standard 754-1985 for Binary Floating-Point Arithmetic, IEEE. Reprinted in SIGPLAN 22, 2, 9-25.
     ?????????

[34] Jansen, Paul, and Peter Weidner. "High-Accuracy Arithmetic Software -- Some Tests of the ACRITH Problem-Solving Routines." ACM Transactions on Mathematical Software 12.1 (1986): 62-70.
     Test results for most of the routines in the ACRITH package are presented, and comments on the quality and reliability of these routines are given.

[35] Kahan, W. "Interval arithmetic options in the proposed IEEE floating point arithmetic standard." In Interval Mathematics, ed. Karl L.E. Nickel, 99-128. New York: Academic Press, 1980.
     Exposes common misconceptions about computation, interval arithmetic, and the proposed IEEE standard. Explains some controversial features of that standard.

[36] Kahan, W. "Analysis and Refutation." Lecture notes, UC Berkeley, spring 1992.

[37] Kahan, W. "Analysis and Refutation of the LCAS." Lecture notes, UC Berkeley, spring 1992. Published in the ACM's SIGNUM Newsletter 26.3(1991): 2-15, and SIGPLAN Bulletin 27.1(1992): 61-74.
     Points out serious flaws in LCAS and analyses why it will discourage the promulgation of portable software.

[38] Kahan, W. "Contracted Multiply-Adds." Lecture notes, 1992.
     Discusses the problems that arise with the introduction of various contracted multiply-adds, and suggests some solutions.

[39] Kahan, W. "Software Sqrt(x) for the Proposed IEEE Floating-point Standard." Lecture notes, 1980.
     Provides an algorithm to implement the square root operation in software for the IEEE standard. Includes comments on whether to implement sqrt in software or hardware, and which algorithm to use under different hardware assumptions.

[40] Kahan, W. "The Occasional Futility of Higher-Precision Arithmetic." Lecture notes, 1989.
     Includes a program which shows that merely carrying enough precision cannot be the way to guarantee adequately accurate final results.

[41] Kahan, W. "Multiplicative Iteration for Reciprocals." Lecture notes, 1991.

[42] Kahan, W. "Computing a Real Cube Root." Lecture notes, 1991.

[43] Kahan, W. "How to Round a Rough Quotient." Lecture notes, UCB, 1992.
     Shows different ways to correctly round a computed rough quotient.

[44] Kahan, W. "Checking Whether Floating-point Division is Correctly Rounded." Lecture notes.
     Shows how to generate hard cases for testing the accuracy of approximate techniques for division.

[45] Kahan, W. "A Computer Program with Almost No Significance." Lecture notes, UCB, 1988.
     A little program computes the correct result only on computers that implement the IEEE 754 arithmetic standard; what makes it act this way is explained.

[46] Kahan, W. "To Test Whether Binary Floating-Point Multiplication is Correctly Rounded." Lecture notes, UCB, 1988.
     Discusses how to efficiently generate test cases to test whether binary floating-point multiplication is correctly rounded.

[47] Kahan, W. "On the Monotonicity of Some Computed Functions." Lecture notes, 1985.
     Techniques are introduced to help decide whether roundoff errors will abrogate the monotonicity of a function when it is computed. They are applied to several expressions that arise during the calculation of elementary transcendental functions.

[48] Kahan, W. "Elementary Inequalities among Elementary Functions." Lecture notes, 1985.
     Discusses those relations among elementary functions that can be taken for granted, and those that might be violated by roundoff.

[49] Kahan, W. "Elementary Functions from Kernels." Lecture notes, 1985.
     Includes formulas that calculate elementary transcendental functions economically from kernel functions: exp, log, tan, and arctan.

[50] Kahan, W., and E. LeBlanc. "Anomalies in the IBM ACRITH Package." In 7th IEEE Symposium on Computer Arithmetic, ed. Kai Hwang, 322-331. Silver Spring, MD: IEEE Computer Society Press, 1985.
     Questions the reliability of ACRITH and its underlying methodology to manage extra-precise arithmetic.

[51] Kahan, W. "How CRAY's Arithmetic Hurts Scientific Computation and What Might be Done About It." Lecture notes.

[52] Kahan, W., and J.T. Coonen. "The near orthogonality of syntax, semantics, and diagnostics in numerical programming environments." In The Relationship between Numerical Computation and Programming Languages, North-Holland Publishing Company, 1982, 103-115.
     Shows that language syntax, arithmetic semantics, and execution-time diagnostics are approximately independent features of the numerical programming environment, and suggests that each should be dealt with by experts in that area.

[53] Kahan, W. "Branch Cuts for Complex Elementary Functions, or Much Ado About Nothing's Sign Bit." In The State of the Art in Numerical Analysis, ed. A. Iserles and M.J.D. Powell, 165-211. New York: Oxford University Press, 1987.
     Discusses the impact of signed and unsigned zeros on complex arithmetic, and supplies formulas to compute complex-valued inverse elementary functions correctly and accurately on their slitted domains.

[54] Kahan, W. "Augmenting a Programming Language with Complex Arithmetic." Lecture notes, 1991, UCB.

[55] Kahan, W. "Invalid Operations Deemed Advantageous." Lecture notes, 1992, UCB.

[56] Kahan, W. "A Survey of Error Analysis." In Information Processing 71, 1214-1239. Amsterdam: North-Holland, 1972.

[57] Knuth, Donald E. The Art of Computer Programming. Vol.2, Seminumerical Algorithms. 2nd ed. Reading, MA: Addison-Wesley, 1981. (Is the 3rd edition out yet?)
     Includes two chapters about floating-point arithmetic and multi-precision arithmetic, and many references to earlier literature.

[58] Koblitz, Neal. P-adic Numbers, P-adic Analysis, and Zeta-Functions. 2nd ed. New York: Springer-Verlag, 1984.
     An introductory text on P-adic analysis. The basic ideas of P-adic analysis are developed, and some applications are presented.

[59] Kuck, D.J., et al. "Analysis of Rounding Methods in Floating-Point Arithmetic." IEEE Trans. Comput., C-26(1977): 643-650.
     A survey of several floating-point rounding methods, including truncation, rough rounding, jamming, optimal rounding, and ROM rounding, concentrating upon average error (as if it were all that mattered) instead of worst-case error.

[60] Kulisch, Ulrich W., and Willard L. Miranker. Computer Arithmetic in Theory and Practice. New York: Academic Press, 1981.
     As the title suggests, the book deals with both the theory and implementation of computer arithmetic, taking for granted the alleged advantages of a "super-accumulator".

[61] Kulisch, Ulrich W., and Willard L. Miranker, ed. A New Approach to Scientific Computation. New York: Academic Press, 1983.
     Includes a collection of papers about the concepts of a new theory of computer arithmetic including a "super-accumulator" (which forms the theoretical basis of ACRITH), and its software and hardware implementation.

[62] Kulisch, U., et al. PASCAL-SC, A PASCAL Extension for Scientific Computation, Information Manual and Floppy Disk, B.G. Teubner, Stuttgart, and John Wiley & Sons, New York, 1987.

[63] Lang, Serge. Algebraic Number Theory. New York: Springer-Verlag, 1986.
     An exposition of classical algebraic and analytic number theory; includes a chapter on p-adic analysis.

[64] Lee, Corinna. "Multistep Gradual Rounding." IEEE Transactions on Computers, 38.4 (1989): 595-600.
     The motivation and problems of multistep rounding are explained. A new scheme called multistep gradual rounding which overcomes those problems is discussed.

[65] Linnainmaa, S. "Combatting the Effects of Underflow and Overflow in Determining Real Roots of Polynomials." IEEE Floating Point Subcommittee Working Document P754/80-2.23, 1980.

[66] Lu, Paul Y., and Kevin Dawallu. "A VLSI Module for IEEE Floating-point Multiplication/Division/Square Root." ??
     A hardware implementation of division and square root using the Newton-Raphson method and a rounding algorithm are presented.

[67] Matsui, S., and M. Iri. "An Overflow/Underflow-Free Floating Point Representation of Numbers." J. of Inf. Proc. 4.3 (1981): 123-133.
     See Demmel's critique [17].

[68] Matula, D.W. "In-and-out Conversions." Comm. ACM 11(1968), 47-50.
     Discusses how many significant digits are needed to retain maximum accuracy during radix conversion for all combinations of rounding and truncation conversions.

[69] Matula, D.W., and P. Kornerup. "Finite Precision Rational Arithmetic: Slash Number Systems." IEEE TC, vol. C-34, no.1 (1985): 3-18.
     Specifies the fixed-slash and floating-slash number systems and the exact rational and approximate real arithmetic they support.

[70] Moore, R.E. "Methods and Applications of Interval Analysis." SIAM Studies in Applied Mathematics, Philadelphia: SIAM, 1979.
     A survey/text of the principal methods and applications of interval analysis.

[71] Pichat, M. "Correction d'une somme en arithmetique a virgule flottante." Numerische Mathematik 19 (1972): 400-406.

[72] Priest, Douglas M. "Algorithms for Arbitrary Precision Floating Point Arithmetic." In 1991 IEEE 10th Symposium on Computer Arithmetic, ed. P. Kornerup and D.W. Matula, 132-143. Washington: IEEE Computer Society Press, 1991.
     Presents techniques to perform computation of high accuracy using floating-point operations of limited precision. Points out that it is cost, not precision, that limits accuracy.

[73] Priest, Douglas. "VEGE: An Automatic Test Vector Generator for Validation of Floating Point Arithmetic." Department of Mathematics, UC Berkeley, 1991.
     Describes the design and implementation of VEGE, a software system for automatically generating floating point arithmetic test data.

[74] Priest, Douglas. "Strategies for Testing Floating Point Arithmetic." UCB, 1992.
     Surveys several strategies for testing floating point arithmetic, including both basic arithmetic operations and elementary transcendental functions.

[75] Rall, L. B. "Differentiation in PASCAL-SC: Type GRADIENT." ACM Transactions on Mathematical Software 10(1984): 161-184.
     Shows how automatic differentiation can be carried out in a modern computer language which permits user-defined operators and data types.

[76] Reiser, J.F., and D.E. Knuth. "Evading the Drift in Floating-point Addition." Inf. Process. Lett. 3.3(1975): 84-87.
     Presents a proof that various unbiased forms of rounding, called stable rounding, will eliminate drift.

[77] Rice, John R. Numerical Methods, Software, and Analysis: IMSL Reference Edition. New York: McGraw-Hill Book Company, 1983.
     Includes an introduction to the IMSL Library.

[78] Robertson, J.E. "A New Class of Digital Division Methods." IRE Transactions on Electronic Computers, vol. EC-7, no.9 (1958): 218-222.
     Describes the SRT division method and presents examples of radix-4 and radix-10 dividers.

[79] Smith, B.T., et al. Matrix Eigensystem Routines -- Eispack Guide. Heidelberg: Springer-Verlag, 1976.

[80] Sorensen, D.C., and Ping Tak Peter Tang. "On the Orthogonality of Eigenvectors Computed by Divide-and-conquer Techniques." SIAM Journal on Numerical Analysis 28(1991): 1752-1775.

[81] Sterbenz, Pat H. Floating-Point Computation. Englewood Cliffs, NJ: Prentice-Hall, Inc., 1974.
     A good textbook on floating-point computation. Includes a detailed discussion of the arithmetics of IBM's 7094 and /360, and CDC's 6600.

[82] Strassen, V. "Gaussian elimination is not optimal." Numerische Mathematik, 13(1969): 354-356.
     An algorithm which computes the product of two square matrices and tries to minimize the total number of arithmetic operations is given.

[83] Sweeney, D.W. "An Analysis of Floating Point Addition." IBM Systems Journal, 4.1 (1965): 31-42.
     Provides a detailed, statistical analysis of floating point addition with different radices.

[84] Taylor, G.S. "Radix 16 SRT Dividers with Overlapped Quotient Selection Stages." In 1985 IEEE 7th Symposium on Computer Arithmetic, ed. Kai Hwang, 64-71. Silver Spring, MD: IEEE Computer Society Press, 1985.
     Several SRT division methods and their implementation alternatives are compared for performance and costs.

[85] Tocher, T.D. "Techniques of Multiplication and Division for Automatic Binary Computers." Quarterly Journal of Mechanics and Applied Mathematics, vol.11, part 3 (1958): 364-384.
     A survey of some possible schemes for fast multiplication. Applies mathematical analysis to the problem of making fast multipliers and dividers.

[86] Vuillemin, Jean. "Exact Real Computer Arithmetic with Continued Fractions." In Proceedings of the 1988 ACM Conference on LISP and Functional Programming, 14-27. New York: ACM Press, 1988.
     Introduces a representation of the computable real numbers by continued fractions and the general algorithms for performing arithmetic operations on them.

[87] Wallace, C.S. "A Suggestion for a Fast Multiplier." IEEE Trans. Electron. Comput. EC-13 (1964): 14-17.
     Introduces the notion of a fully parallel multiplier implemented with carry-save adders.

[88] Wijngaarden, A. Van. "Numerical Analysis as an Independent Science." BIT 6(1966), 66-81.
     Tries to define computer arithmetic upon real numbers in 32 axioms.

[89] Wilkinson, J.H. Rounding Errors in Algebraic Processes. Englewood Cliffs, NJ: Prentice-Hall, Inc., 1963.
     Shows how to apply error analysis of the individual arithmetic operations to the error analysis of large-scale problems.

[90] Williams, T.E., and M. Horowitz. "SRT Division Diagrams and their Usage in Designing Custom Integrated Circuits for Division." Technical Report No. CSL-TR-87-326, Stanford University, 1986.
     Describes the construction and analysis of several diagrams which depict SRT division algorithms.

[91] Williams, T.E., and M.A. Horowitz. "A 160nS 54bit CMOS Division Implementation using Self-Timing and Symmetrically Overlapped SRT Stages." In Proc. 10th IEEE Symposium on Computer Arithmetic, ed. P. Kornerup and D.W. Matula, 210-217. Washington: IEEE Computer Society Press, 1991.
     Describes several performance enhancements used to make this self-timed SRT division implementation one of the fastest to date.