Annotated Floating-Point Bibliography  prepared for  CS279,  Spring 1992
                               by  Deyu Judy Hu


[ 1] Alefeld, G., and J. Herzberger.  Introduction to Interval
    Computations.  New York:  Academic Press, 1983.
    A systematic introduction to the concepts and applications of interval
    analysis.

[ 2] Atkins, D.E.  "Higher-Radix Division Using Estimates of the Divisor
    and Partial Remainders." IEEE Trans.  Comput., vol. C-17(1968):
    925-934.
    Reviews the theory of the SRT division technique and develops analytic
    expressions for determining the number of bits of divisor and partial
    remainder that must be inspected for a given radix.

[ 3] Bailey, David H.  "A Portable High Performance Multiprecision
    Package." to appear in ACM Transactions on Mathematical Software.
    Describes the MPFUN multiprecision package of Fortran routines that
    perform a variety of arithmetic operations and transcendental functions
    on floating point numbers of arbitrarily high precision.  Includes a
    discussion of some advanced multiprecision multiplication algorithms
    employed.

[ 4] Bailey, David H.  "The Computation of Pi to 29,360,000 Decimal Digits
    Using Borweins' Quartically Convergent Algorithm." Mathematics of
    Computation, 50.181(1988):  283-296.
    See pp.  287-289 for more discussion on multi-precision multiplication
    algorithms including Complex Fourier Transform and Prime Modulus
    Transform methods.

[ 5] Bleher, J.H., et al.  "ACRITH: High-Accuracy Arithmetic -- An Advanced
    Tool for Numerical Computation." In 7th Symposium on Computer
    Arithmetic, ed. Kai Hwang, 318-321.  Silver Spring, MD:  IEEE Computer
    Society Press, 1985.
    Introduces the IBM ACRITH package of numerical software and describes
    its application environment and underlying 20-instruction facility.
    See the critique [50] by Kahan & LeBlanc.

[ 6] Bleher, J.H., et al.  "FORTRAN-SC, A Study of a FORTRAN Extension for
    Engineering/Scientific Computation with Access to ACRITH." Computing,
    39(1987):  93-110.
    Describes the language FORTRAN-SC and its features intended to
    facilitate engineering and scientific computation as well as the use of
    the ACRITH subroutine library.

[ 7] Booth, A.D.  "A Signed Binary Multiplication Technique."  Q. J. Mech.
    Appl. Math., 4(1951):  236-240.
    Describes a two's complement multiplication algorithm that avoids the
    need to correct the product when either input is negative.

[ 8] Brent, R.P.  "A Fortran Multiple Precision Arithmetic Package."  ACM
    Transactions on Mathematical Software, 4(1978):  57-70.
    Describes the MP package of Fortran subroutines for performing
    multi-precision floating point arithmetic, and presents some of the
    algorithms used.

[ 9] Cheng, T.C., and I.T. Ho.  "Storage-Efficient Representation of Decimal
    Data." Communications of the ACM 18.1(1975):  49-52.
    Presents a simple binary encoding scheme of decimal digits which
    achieves high storage efficiency.

[10] Clenshaw, C.W., and F.W.J. Olver.  "Beyond Floating Point."  Journal
    of the Association for Computing Machinery, 31(1984):  319-328.
    A new number system -- level-index arithmetic -- is proposed for
    computer arithmetic based on iterated exponential functions to
    eradicate over/underflow.  See also Demmel's critique [17].

[11] Cody, W.J., et al.  "A Proposed Radix- and Word-length-independent
    Standard for Floating-point Arithmetic." IEEE Micro 4.4(1984):  86-100.
    The most readable and precise exposition of the IEEE 854 standard to
    date; some implementation problems and how to overcome them are also
    discussed.

[12] Coonen, J.T.  "An Implementation Guide to a Proposed Standard for
    Floating-point Arithmetic." Computer 13.1(1980):  68-79.
    Provides reasonable algorithms for arithmetic operations and exception
    handling that implement the IEEE standard.

[13] Corbett, R.R.  "Enhanced Arithmetic for Fortran."  ACM SIGNUM
    Newsletter, 18.1(1983):  24-28; ACM SIGPLAN Newsletter, 17.12 (1982):
    41-48.
    The problems of interpreting mixed-precision arithmetic expressions are
    introduced, a strategy -- widest need evaluation -- is discussed, and
    its application to Fortran is presented.

[14] Dadda, L.  "Some Schemes for Parallel Multipliers." Alta Freq.
    34(1965): 349-356.
    Describes the parallel multiplication process in more detail than
    Wallace's paper [87].

[15] Dekker, T.J.  "A Floating-Point Technique for Extending the Available
    Precision." Numerische Mathematik, 18(1971):  224-242.
    A technique is described for expressing multilength floating-point
    arithmetic in terms of singlelength floating-point arithmetic.  See
    also  D. Priest's  article [72].

[16] Demmel, James.  "Underflow and the Reliability of Numerical Software."
    SIAM Journal on Scientific and Statistical Computing 5 (1984):
    887-919.
    Examines the effects of different underflow mechanisms (mainly store
    zero and gradual underflow) on the reliability of numerical software.

[17] Demmel, James.  "On Error Analysis in Arithmetic with Varying Relative
    Precision." Proceedings of the IEEE 8th Symposium on Computer
    Arithmetic, ed.  M.J.  Irwin, and R.  Stefanelli, 148-152.  Washington,
    D.C.:  IEEE Computer Society Press, 1987.
    Illustrates that nonconventional floating-point representations
    proposed by Clenshaw/Olver and Iri/Matsui require extra effort in error
    analysis; they are not shortcuts to writing reliable numerical code.

[18] Dodrill, James D., et al.  "A Low Latency Implementation of a
    Convergence Divide Algorithm." ??
    A low latency implementation of the floating-point division operation
    which uses the Goldschmidt algorithm is presented.  The final rounding
    algorithm is explained.

[19] Dongarra, J.J., et al.  LINPACK User's Guide.  Philadelphia, PA:  SIAM
    Press, 1979.
    LINPACK has since been superseded by LAPACK, q.v.

[20] Farnum, Charles.  "Compiler Support for Floating-Point Computation."
    Software--Practice and Experience, 18.7 (1988):701-709.
    Addresses a number of issues that compiler writers should understand in
    order to produce good floating-point code.

[21] Fateman, R.J.  "The MACSYMA Big-Floating-Point Arithmetic System."  in
    Proceedings of the 1976 ACM Symposium on Symbolic and Algebraic
    Computation, ed.  R.D.  Jenks.  New York:  ACM, 1976.
    Describes the bigfloat data type and arithmetic system in MACSYMA.

[22] Goldberg, David.  "What Every Computer Scientist Should Know About
    Floating-Point Arithmetic." ACM Computing Surveys 23.1 (1991):  5-48.
    A tutorial on the IEEE 754/854 standards and other software and
    hardware issues of floating point arithmetic that have a direct impact
    on computer system designers.  A version augmented by D. Priest's
    comments is available on-line at  ???????

[23] Goldberg, D.  "Computer Arithmetic."  Appendix A of Computer
    Architecture:  A Quantitative Approach, by J.L.  Hennessy and D.A.
    Patterson.  San Mateo, CA:  Morgan Kaufmann Publishers, 1990.
    An introduction to algorithms for fixed and floating point arithmetic,
    with emphasis on the IEEE 754 standard.

[24] Goldberg, I.B.  "27 Bits Are Not Enough for 8-digit Accuracy."
    Communications of the ACM 10(1967):  105-106.
    A general formula tells how many bits are enough for p-digit accuracy.

[25] Griewank, Andreas.  "On Automatic Differentiation."  in Mathematical
    Programming:  Recent Developments and Applications, ed. M. Iri and K.
    Tanabe, 83-109.  Dordrecht, Holland:  Kluwer Academic Publishers, 1989.
    Compares several techniques for evaluating gradients and shows that,
    for numerical purposes, the reverse mode of automatic differentiation
    is superior to other methods.  Several software implementations of
    automatic differentiation are also discussed.

[26] Griewank, Andreas.  "Direct Calculation of Newton Steps without
    Accumulating Jacobians." in Large-Scale Numerical Optimization,
    115-137.  Philadelphia, PA:  SIAM, 1990.
    Shows that in calculating the roots of vector functions, direct
    computation of an extended linear system without accumulating Jacobians
    can be superior in performance.

[27] Hamada, H. "A New Real Number Representation and Its Operation."
    Proceedings of the IEEE 8th Symposium on Computer Arithmetic, ed.  M.J.
    Irwin, and R.  Stefanelli, 153-157.  Washington, D.C.:  IEEE Computer
    Society Press, 1987.
    An internal representation called URR is proposed for real numbers.  A
    variable length exponent part is used to  "eliminate"  overflow and
    underflow.  See  Demmel's  critique [17].

[28] Hwang, K.  Computer Arithmetic: Principles, Architecture, and Design.
    John Wiley & Sons, 1979.

[29] Hull, T.E., et al.  "Numerical Turing."  SIGNUM Newsletter
    20.3(1985): 26-33.
    Describes the Numerical Turing language with emphasis on its new
    features:  clean decimal arithmetic and complete precision control of
    variables and operations.

[30] IBM High-Accuracy Arithmetic Subroutine Library (ACRITH): General
    Information Manual.  GC 33-6163-02, 3rd ed.  (April 1986)
    Briefly describes the basic ideas underlying ACRITH's arithmetic and
    those special features that distinguish ACRITH from other mathematical
    subroutine libraries.

[31] IBM High-Accuracy Arithmetic Subroutine Library (ACRITH): Program
    Description and User's Guide.  SC33-6164-02, 3rd ed.  (April 1986)
    The ACRITH subroutine library and its underlying high-accuracy
    arithmetic facility are discussed in more detail.

[32] IBM System/370 RPQ: High Accuracy Arithmetic. SA22-7093-0, (Jan. 1984)
    Describes the IBM System/370 RPQ high-accuracy-arithmetic facility.

[33] IEEE 1987.  IEEE Standard 754-1985 for Binary Floating-Point
    Arithmetic, IEEE.
    Reprinted in SIGPLAN 22, 2, 9-25.  ?????????

[34] Jansen, Paul, and Peter Weidner.  "High-Accuracy Arithmetic Software
    -- Some Tests of the ACRITH Problem-Solving Routines." ACM Transactions
    on Mathematical Software 12.1 (1986):  62-70.
    Presents test results for most of the routines in the ACRITH package
    and comments on the quality and reliability of these routines.

[35] Kahan, W.  "Interval arithmetic options in the proposed IEEE floating
    point arithmetic standard." in Interval Mathematics, ed.  Karl L.E.
    Nickel, 99-128.  New York:  Academic Press, 1980.
    Exposes common misconceptions about computation, interval arithmetic,
    and the proposed IEEE standard.  Explains some controversial features
    of that standard.

[36] Kahan, W.  "Analysis and Refutation."  Lecture notes, UC Berkeley,
    spring 1992.

[37] Kahan, W.  "Analysis and Refutation of the LCAS."  Lecture notes, UC
    Berkeley, spring 1992.  Published in the ACM's SIGNUM Newsletter
    26.3(1991):2-15, and SIGPLAN Bulletin 27.1(1992):61-74.
    Points out serious flaws in LCAS and analyses why it will discourage
    the promulgation of portable software.

[38] Kahan, W.  "Contracted Multiply-Adds."   Lecture notes, 1992.
    Discusses the problems that arise with the introduction of various
    contracted multiply-adds, and suggests some solutions.

[39] Kahan, W.  "Software Sqrt(x) for the Proposed IEEE Floating-point
    Standard." Lecture notes, 1980.
    Provides an algorithm to implement the square root operation in software for
    the IEEE standard.  Includes comments on whether to implement sqrt in
    software or hardware, and which algorithm to use under different
    hardware assumptions.

[40] Kahan, W.  "The Occasional Futility of Higher-Precision Arithmetic."
    Lecture notes, 1989.
    Includes a program which shows that merely carrying enough precision
    cannot be the way to guarantee adequately accurate final results.

[41] Kahan, W.  "Multiplicative Iteration for Reciprocals."  Lecture notes,
    1991.

[42] Kahan, W.  "Computing a Real Cube Root." Lecture notes, 1991.

[43] Kahan, W.  "How to Round a Rough Quotient."  Lecture notes, UCB, 1992.
    Shows different ways to correctly round a computed rough quotient.

[44] Kahan, W.  "Checking Whether Floating-point Division is Correctly
    Rounded."  Lecture notes.
    Shows how to generate hard cases for testing the accuracy of
    approximate techniques for division.

[45] Kahan, W.  "A Computer Program with Almost No Significance."  Lecture
    notes, UCB, 1988.
    A little program computes the correct result only on computers that
    implement the IEEE 754 arithmetic standard; what makes it act this way
    is explained.

[46] Kahan, W.  "To Test Whether Binary Floating-Point Multiplication is
    Correctly Rounded." Lecture notes, UCB, 1988.
    Discusses how to generate test cases efficiently to check whether binary
    floating-point multiplication is correctly rounded.

[47] Kahan, W.  "On the Monotonicity of Some Computed Functions." Lecture
    notes, 1985.
    Techniques are introduced to help decide whether roundoff errors will
    abrogate the monotonicity of a function when it's computed.  They are
    applied to several expressions that arise during the calculation of
    elementary transcendental functions.

[48] Kahan, W.  "Elementary Inequalities among Elementary Functions."
    Lecture notes, 1985.
    Discusses those relations among elementary functions that can be taken
    for granted, and those that might be violated by roundoff.

[49] Kahan, W.  "Elementary Functions from Kernels."  Lecture notes, 1985.
    Includes formulas that calculate elementary transcendental functions
    economically from kernel functions:  exp, log, tan, and arctan.

[50] Kahan, W., and E. LeBlanc.  "Anomalies in the IBM ACRITH Package."  In
    7th IEEE Symposium on Computer Arithmetic, ed.  Kai Hwang, 322-331.
    Silver Spring, MD:  IEEE Computer Society Press, 1985.
    Questions the reliability of ACRITH and its underlying methodology to
    manage extra-precise arithmetic.

[51] Kahan, W.  "How CRAY's Arithmetic Hurts Scientific Computation and What
    Might be Done About It." Lecture notes.

[52] Kahan, W. and J.T. Coonen.  "The near orthogonality of syntax,
    semantics, and diagnostics in numerical programming environments." in
    The Relationship between Numerical Computation and Programming
    Languages, North-Holland Publishing Company, 1982, 103-115.
    Shows that language syntax, arithmetic semantics, and execution-time
    diagnostics are approximately independent features of the numerical
    programming environment, and suggests that each should be dealt with by
    experts in that area.

[53] Kahan, W.  "Branch Cuts for Complex Elementary Functions, or Much Ado
    About Nothing's Sign Bit." in The State of the Art in Numerical
    Analysis, ed. A. Iserles, and M.J.D. Powell, 165-211.  New York:  Oxford
    University Press, 1987.
    Discusses the impact of signed and unsigned zeros on complex
    arithmetic, and supplies formulas to compute complex-valued inverse
    elementary functions correctly and accurately on their slitted domains.

[54] Kahan, W.  "Augmenting a Programming Language with Complex Arithmetic."
    Lecture notes, 1991, UCB.

[55] Kahan, W.  "Invalid Operations Deemed Advantageous."  Lecture notes,
    1992, UCB.

[56] Kahan, W.  "A Survey of Error Analysis." in Information Processing 71,
    1214-1239.  Amsterdam:  North-Holland, 1972.

[57] Knuth, Donald E.  The Art of Computer Programming.  Vol.2,
    Seminumerical Algorithms.  2nd ed.  Reading, MA:  Addison-Wesley, 1981.
    (Is the  3rd  edition out yet?)
    Includes two chapters about floating-point arithmetic and multi-
    precision arithmetic, and many references to earlier literature.

[58] Koblitz, Neal.  P-adic Numbers, P-adic Analysis, and Zeta-Functions.
    2nd ed.  New York:  Springer-Verlag, 1984.
    An introductory text on P-adic analysis.  The basic ideas of P-adic
    analysis are developed, and some applications are presented.

[59] Kuck, D.J., et al.  "Analysis of Rounding Methods in Floating-Point
    Arithmetic." IEEE Trans.  Comput., C-26(1977):  643-650.
    A survey of several floating-point rounding methods, including
    truncation, rough rounding, jamming, optimal rounding, and ROM
    rounding, concentrating upon average error (as if it were all that
    mattered) instead of worst-case error.

[60] Kulisch, Ulrich W., and Willard L. Miranker.  Computer Arithmetic in
    Theory and Practice.  New York:  Academic Press, 1981.
    As the title suggests, the book deals with both the theory and
    implementation of computer arithmetic taking for granted the alleged
    advantages of a  "super-accumulator".

[61] Kulisch, Ulrich W., and Willard L. Miranker, eds.  A New Approach to
    Scientific Computation.  New York:  Academic Press, 1983.
    Includes a collection of papers about the concepts of a new theory of
    computer arithmetic including a "super-accumulator" (which forms the
    theoretical basis of ACRITH), and its software and hardware
    implementation.

[62] Kulisch, U. et al.  PASCAL-SC, A PASCAL Extension for Scientific
    Computation, Information Manual and Floppy Disk.  Stuttgart:  B.G.
    Teubner, and New York:  John Wiley & Sons, 1987.

[63] Lang, Serge.  Algebraic Number Theory.  New York: Springer-Verlag, 1986.
    An exposition of classical algebraic and analytic number theory;
    includes a chapter on p-adic analysis.

[64] Lee, Corinna.  "Multistep Gradual Rounding."  IEEE Transactions on
    Computers, 38.4 (1989):  595-600.
    The motivation and problems of multistep rounding are explained.  A new
    scheme called multistep gradual rounding which overcomes those problems
    is discussed.

[65] Linnainmaa, S.  "Combatting the Effects of Underflow and Overflow in
    Determining Real Roots of Polynomials." IEEE Floating Point
    Subcommittee Working Document P754/80-2.23, 1980.

[66] Lu, Paul Y., and Kevin Dawallu.  "A VLSI Module for IEEE Floating-
    point Multiplication/Division/Square Root." ??
    A hardware implementation of division and square root using the
    Newton-Raphson method, together with a rounding algorithm, is presented.

[67] Matsui, S., and M. Iri.  "An Overflow/Underflow-Free Floating Point
    Representation of Numbers." J. of Inf. Proc. 4.3 (1981):  123-133.
    See  Demmel's  critique [17].

[68] Matula, D.W.  "In-and-out Conversions."  Comm. ACM 11(1968):  47-50.
    Discusses how many significant digits are needed to retain maximum
    accuracy during radix conversion for all combinations of rounding and
    truncation conversions.

[69] Matula, D.W., and P. Kornerup.  "Finite Precision Rational Arithmetic:
    Slash Number Systems." IEEE Trans.  Comput., vol. C-34, no. 1 (1985):  3-18.
    Specifies the fixed-slash and floating-slash number systems and the
    exact rational and approximate real arithmetic they support.

[70] Moore, R.E.  "Methods and Applications of Interval Analysis."  SIAM
    Studies in Applied Mathematics, Philadelphia:  SIAM, 1979.
    A survey/text of the principal methods and applications of interval
    analysis.

[71] Pichat, M.  "Correction d'une somme en arithmetique a virgule
    flottante." Numerische Mathematik 19 (1972):  400-406.

[72] Priest, Douglas M.  "Algorithms for Arbitrary Precision Floating Point
    Arithmetic."  in 1991 IEEE 10th Symposium on Computer Arithmetic, ed.
    P. Kornerup, and D.W. Matula, 132-143.  Washington:  IEEE Computer
    Society Press, 1991.
    Presents techniques to perform computation of high accuracy using
    floating-point operations of limited precision.  Points out that it is
    cost, not precision, that limits accuracy.

[73] Priest, Douglas.  "VEGE: An Automatic Test Vector Generator for
    Validation of Floating Point Arithmetic." Department of Mathematics, UC
    Berkeley, 1991.
    Describes the design and implementation of VEGE, a software system for
    automatically generating floating point arithmetic test data.

[74] Priest, Douglas.  "Strategies for Testing Floating Point Arithmetic."
    UCB, 1992.
    Surveys several strategies for testing floating point arithmetic,
    including both basic arithmetic operations and elementary
    transcendental functions.

[75] Rall, L. B.  "Differentiation in PASCAL-SC: Type GRADIENT."  ACM
    Transactions on Mathematical Software 10(1984):  161-184.
    Shows how automatic differentiation can be carried out in a modern
    computer language which permits user-defined operators and data types.

[76] Reiser, J.F., and D.E. Knuth.  "Evading the Drift in Floating-point
    Addition." Inf.  Process.  Lett.  3.3(1975):  84-87.
    Presents a proof that various unbiased forms of rounding, called stable
    rounding, will eliminate drift.

[77] Rice, John R.  Numerical Methods, Software, and Analysis: IMSL
    Reference Edition.  New York:  McGraw-Hill Book Company, 1983.
    Includes an introduction to the IMSL Library.

[78] Robertson, J.E.  "A New Class of Digital Division Methods."  IRE
    Transactions on Electronic Computers, vol. EC-7, no.9 (1958): 218-222.
    Describes the SRT division method and presents example radix-4 and
    radix-10 dividers.

[79] Smith, B.T., et al.  Matrix Eigensystem Routines -- EISPACK Guide.
    Heidelberg:  Springer-Verlag, 1976.

[80] Sorensen, D.C., and Ping Tak Peter Tang.  "On the Orthogonality of
    Eigenvectors Computed by Divide-and-conquer Techniques." SIAM Journal
    on Numerical Analysis 28(1991):1752-1775.

[81] Sterbenz, Pat H.  Floating-Point Computation.  Englewood Cliffs, NJ:
    Prentice-Hall, Inc., 1974.
    A good textbook on floating-point computation.  Includes a detailed
    discussion of the arithmetics of IBM's 7094 and /360, and CDC's 6600.

[82] Strassen, V. "Gaussian elimination is not optimal." Numerische
    Mathematik, 13(1969):  354-356.
    Gives an algorithm that computes the product of two square matrices
    while reducing the total number of arithmetic operations.

[83] Sweeney, D.W.  "An Analysis of Floating Point Addition."  IBM Systems
    Journal, 4.1 (1965):  31-42.
    Provides a detailed, statistical analysis of floating point addition
    with different radices.

[84] Taylor, G.S.  "Radix 16 SRT Dividers with Overlapped Quotient
    Selection Stages." In 1985 IEEE 7th Symposium on Computer Arithmetic,
    ed. Kai Hwang, 64-71.  Silver Spring, MD:  IEEE Computer Society Press,
    1985.
    Several SRT division methods and their implementation alternatives are
    compared for performance and costs.

[85] Tocher, K.D.  "Techniques of Multiplication and Division for Automatic
    Binary Computers." Quarterly Journal of Mechanics and Applied
    Mathematics, vol.11, part 3 (1958):  364-384.
    A survey of some possible schemes for fast multiplication.  Applies
    mathematical analysis to the problem of making fast multipliers and
    dividers.

[86] Vuillemin, Jean.  "Exact Real Computer Arithmetic with Continued
    Fractions." in Proceedings of the 1988 ACM Conference on LISP and
    Functional Programming, 14-27.  New York:  ACM Press, 1988.
    Introduces a representation of the computable real numbers by continued
    fractions and the general algorithms for performing arithmetic
    operations on them.

[87] Wallace, C.S.  "A Suggestion for a Fast Multiplier."  IEEE Trans.
    Electron. Comput. EC-13 (1964):14-17.
    Introduces the notion of a fully parallel multiplier implemented with
    carry-save adders.

[88] Wijngaarden, A. van.  "Numerical Analysis as an Independent Science."
    BIT 6(1966):  66-81.
    Tries to define computer arithmetic upon real numbers in 32 axioms.

[89] Wilkinson, J.H.  Rounding Errors in Algebraic Processes.  Englewood
    Cliffs, NJ:  Prentice-Hall, Inc., 1963.
    Shows how to apply error analysis of the individual arithmetic
    operations to the error analysis of large-scale problems.

[90] Williams, T.E., and M. Horowitz.  "SRT Division Diagrams and their
    Usage in Designing Custom Integrated Circuits for Division." Technical
    Report No.  CSL-TR-87-326, Stanford University, 1986.
    Describes the construction and analysis of several diagrams which
    depict SRT division algorithms.

[91] Williams, T.E., and M.A. Horowitz.  "A 160nS 54bit CMOS Division
    Implementation using Self-Timing and Symmetrically Overlapped SRT
    Stages." In Proc. 10th IEEE Symposium on Computer Arithmetic, ed. P.
    Kornerup and D.W. Matula, 210-217.  Washington:  IEEE Computer
    Society Press, 1991.
    Describes several performance enhancements used to make this
    self-timed SRT division implementation one of the fastest to date.