\documentclass{article}
\usepackage{fullpage}
\usepackage[pdftex]{graphicx}
\title{DRAFT!! Comments on Unrestricted Algorithms for Bessel Functions in Computer Algebra:
Arbitrary Precision, The Backwards Recurrence, Taylor Series, Hermite Interpolation}
\author{Richard J. Fateman\\
University of California\\
Berkeley, CA 94720-1776}
\begin{document}
\maketitle
\begin{abstract}
We explore various ways of implementing ``unrestricted algorithms''
\cite{brent}
% http://maths.anu.edu.au/~brent/pd/rpb052a.pdf
for approximating
Bessel (J) functions. An unrestricted algorithm for a function $f(x)$
provides a result to any requested precision in the answer. The
emphasis is on higher-than-normal precision with the precision
specified as an extra argument to the function. That is, the
precision is specified at runtime. We require that the algorithm
provide at least the requested number of correct digits, contrary to
some existing codes which provide only ``absolute error'' near
critical points. We use $J_0$ of real nonnegative argument as an
example, although much of the reasoning generalizes to other Bessel
functions or related functions.
Since it is plausible that there will be requests for additional
values of $J_0$ at the same (high) precision at a collection of nearby
arguments, we consider implementations that cache certain reusable
key constants (namely zeros of $J_0$ near the argument values).
\end{abstract}
\section{Introduction}
The methods we look at for computing Bessel functions (the notation
for Bessel functions of the first kind is $J_\nu(x)$) include: (a) the
now-traditional algorithm using a three-term formula relating Bessel
functions of different order, using the defining recurrence in the
``backwards'' direction; (b) Taylor series expanded at zero or,
especially, at other locations; (c) two-point Hermite interpolation, a
kind of generalization of Taylor series; and (d) asymptotic formulas
for large argument. To qualify as an ``unrestricted'' algorithm, the
procedure must in essence be a function not only of the obvious
arguments $\nu$ and $x$ but also of a runtime-specified precision
$p$. This is the number of digits to which the answer is correct.
Conventional fixed-degree rational function approximation is suitable
only if the precision requirement is limited and can therefore be
accounted for when the approximation formula is derived. Even so, such
formulas tend to have large {\em relative} error near zeros of the function.
While we are using Bessel functions as the first ``less familiar''
function in relatively common use, we do not require the reader to be
especially familiar with Bessel functions (or for that
matter, Hermite interpolation).
In particular, some of the methods used here can be applied to other
scientific functions of substantial interest.
See Gautschi \cite{recurrence} for a survey of recurrence
methods, and Corless \cite{corlessairy} for a discussion of a similar
sort involving computer algebra systems, arbitrary precision,
recurrence methods and Airy functions. The standard concise
reference for other methods, especially asymptotic series is the
Digital Library of Mathematical Functions (DLMF)\footnote{DLMF http://dlmf.nist.gov/10.17}
A Bessel function $J_n(x)$ can be visualized as a kind of
sinusoidal wave of decreasing amplitude as $x$ increases,
asymptotically approaching zero. Also as $x$ increases, the spacing
between the zero-crossings approaches a constant, namely
$\pi$. Details and plots of $J_0$, $J_1$, as well as relations to
differential equations, integrals, and other functions can be found
online in the DLMF and in reference books e.g. Watson\cite{watson}.
%%The first three Bessel J functions are plotted here:
%%\begin{figure*}
%% \centerline{
%% \mbox{\includegraphics[width=3.00in]{c:/temp/besselplot.eps}}
%% }
%% \caption{J[0], J[1], J[2]}
%% \label{besselplot}
%% \end{figure*}
The environment of a computer algebra system\footnote{Our
illustrations are given in the free, open-source Maxima system, a
descendant of the early Macsyma system.} eases the burden of
developing unrestricted algorithms. With support for representation
of, and arithmetic on, arbitrary-precision floating-point numbers with
effectively unlimited exponent range, it is often possible to
construct fairly naive unrestricted algorithms. One method is to use
brute-force expansion of Taylor series using increasingly higher
precisions, adding additional terms until some termination condition
is reached. There are two problems with this: (a) It is displeasing
aesthetically (and sometimes practically) to do what appears to be far
more work than necessary to get the right answer, and Taylor series
expanded at zero can present substantial work; (b) Determining a
correct termination condition may be more subtle than appears at first
glance. The NumGFun package (in Maple) \cite{mezzarobba} provides a
general framework for Taylor series expansions of ``D-finite''
functions. These include Bessel functions.\footnote{The initial
emphasis here seems to be on harnessing theoretically fast but
general algorithms, using exact arithmetic when possible to find
isolated numerical values to some required absolute error bound.
Most likely this won't be economical unless thousands or millions of
digits are required, and would not be a fast approach for
evaluations of the same function at many different points. Future
research could address finding good approximations. (personal
communication, 9/8/2012 MM)}
Among the existing Bessel function programs is the MPFR library, {\tt
http://www.mpfr.org}, an open-source implementation whose algorithm
is also documented and ``proved''\footnote{{\tt
www.mpfr.org/algorithms.pdf}}. It uses a Taylor series expanded
around zero, but when the value of the argument is sufficiently large
(compared to the index and the required precision) it switches to an
asymptotic formula.
MPFR attempts to sum the terms correctly by boosting the working
precision relative to the precision $p$ expected in the result.
When we first tried to test this, we
tested on $z_{40}= 953131103962007545291/23793523728624063229.$
We fell into a testing trap here, because this number is
not exactly representable as a binary float. Testing various
programs in different precisions
therefore added a variability in results because they {\em rounded the
input to different values!} A safer number
is the nearby
$w_{40}=1477895056151579973991/2^{65}$
which rounds to the same (binary) floatingpoint quantity in any
precision
sufficient to represent the explicit integer numerator.
Examining the coefficients
in the Taylor series shows why substantially higher
working precision is necessary: there are
rather large terms (e.g.\ the 23rd term is of magnitude $10^{15}$).
Such terms {\em must mostly cancel since the first nonzero digit in the answer
is 22 digits to the right of the decimal point}. For 90 correct decimal
digits, it appears that about 127 decimals must be carried as
intermediate data.
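The scale of this cancellation is easy to reproduce. The following
sketch (ours, in Python rather than the paper's Maxima; the function
name is hypothetical) sums the Taylor series for $J_0$ at $x=40$ using
exact rational arithmetic, which plays the role of sufficiently high
working precision, and records the largest term passed through:

```python
from fractions import Fraction

def j0taylorexact(x, nterms):
    """Sum the Taylor series J_0(x) = sum_k (-1)^k (x^2/4)^k / (k!)^2
    exactly for rational x, recording the largest term encountered."""
    u = Fraction(x) ** 2 / 4
    term, total, largest = Fraction(1), Fraction(0), 0.0
    for k in range(nterms):
        total += term
        largest = max(largest, abs(float(term)))
        term = -term * u / ((k + 1) ** 2)   # ratio of successive terms
    return float(total), largest

val, largest = j0taylorexact(40, 120)
# the partial sums pass through terms of magnitude about 10**15,
# yet the final value is only about 0.00737
```

Any floating-point summation must therefore carry enough extra digits
to absorb the roughly fifteen decimal orders of magnitude that cancel.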
By contrast, if we consider the same problems but
using expansion near
the approximate zero at $z_{40}$ rather than around zero,
it takes rather few terms (about 30) to get a result
that is accurate to 90 decimal places.
The algorithm in Mathematica 8.0 was apparently changed (perhaps as a
result of our prompting): Mathematica 10.0 appears to provide a
correctly rounded result in {\em any}
software-implemented floating-point precision.
We also tried Maple 7 (we do not have the latest version of Maple)
where Maple allows one to compute highprecision Bessel functions by
setting a global variable, {\tt Digits}. Maple also allows one to
view the source code for part of the implementation (sans comments) by
{\tt interface(verboseproc=2);}. Some detective work reveals that in
this case it seems the code eventually calls {\tt
evalf/hypergeometric/kernel}, which is a builtin procedure without
provided source code. Ultimately, with {\tt Digits:=90},
{\tt BesselJ(0,w40)} provides 90 correct decimal digits.
All of the algorithms we have encountered in programs for arbitrary-precision
Bessel functions seem capable of producing an arbitrary
number of digits; however, (a) some seem unaware of just how many of
the digits are correct for particular arguments, and (b) we suspect
that systems may choose algorithms that are far slower than they need be.
Since the precision specification for a call to an unrestricted algorithm is
revealed at runtime, the analysis and implementation of such
algorithms is usually different from the traditional case of a
predetermined precision. The error
analysis for such a routine is aimed at finding an implementation
that uses the least work to
produce just barely the number of digits required for any input.
The programmer for an ordinary library (fixed-precision) routine
is given the bit representation and precision for numbers in advance,
which is an advantage in designing a program
to produce that number of bits {\em with the least amount of
time and space}.
It is tempting to try to construct an unrestricted algorithm by using
some convergent iteration until it is accurate enough, perhaps
coordinated with some {\em a priori} bounds, as is done by MPFR.
Unfortunately one cannot rely on the most tempting of conditions:
assuming that if iterations $n$ and $n+1$ agree in $n$ digits, that
those digits are correct. In fact such a proposed ``converged'' answer
may not be correct at all, but an artifact of roundoff and/or
coincidence.
In other words, if both iterations were performed in higher precision
the disparity would be evident.
The ACM Algorithm 236 for Bessel functions, which is otherwise
perfectly adaptable to arbitrary precision and uses the previously
mentioned ``backwards recurrence'' \cite{gautschi}, can exhibit this
kind of false convergence near the zerocrossings of $J_n$. The
computation can be boosted to higher precision {\em if one can tell
that that is needed}. Interestingly, the version of Algorithm 236
directly translated to the computer algebra system Maxima can even
give partially symbolic answers; e.g., $J_{7/3}(2)$ is approximately
$690899216853/(582246954016\,\Gamma(1/3))$.
\section {Taylor series near zero}
For Bessel functions in the region close to the origin an easy and
certainly {\em unrestricted} algorithm is one that starts with the
Taylor series expansion at zero for $J_0$ and adds enough terms and
enough precision so that the next nonzero remaining term does not
affect the result. Near zero the successive terms drop substantially
and they can be added in sequence. The coefficient of $x^{2k}$ has
magnitude $1/(4^k (k!)^2)$. (It seems to be slightly advantageous to
use the Taylor series for $H(\nu,x) = J_\nu(\sqrt{x})$ which has all
nonzero coefficients; $H(0,0.25)$ computes $J_0(0.5)$.) Ten terms of
this series are good for errors less than $10^{-21}$ for $x< 1.202$,
which is half the distance to the first zero of $J_0$. If $x$ is
closer to that zero (at about 2.4048), it is more efficient to expand
at that point. Setting aside that efficiency argument, and neglecting
the relative error at that first zero, the series gives an
unrestricted algorithm: add up the terms $x^k/(4^k (k!)^2)$ in
$H(0,x)$ (of alternating sign) until they do not affect the answer.
As indicated earlier, summing this series suffers from numerical
cancellation near a zero $c$ of $J_0$ unless the expansion is about
the point $c$.
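As a concrete (and deliberately naive) sketch of this
add-until-it-doesn't-matter loop, here is a double-precision Python
version; the function name and the tolerance handling are ours:

```python
def j0series(x, tol=1e-17):
    """J_0 via its Taylor series about 0: sum_k (-1)^k (x^2/4)^k / (k!)^2,
    adding terms until they no longer affect the answer.
    Adequate only for small x; near a zero of J_0 the relative
    error suffers from the cancellation just described."""
    u = x * x / 4.0
    term, total, k = 1.0, 0.0, 0
    while abs(term) > tol * max(abs(total), 1.0):
        total += term
        k += 1
        term *= -u / (k * k)   # ratio of successive terms
    return total
```

For example, {\tt j0series(0.5)} reproduces
$J_0(0.5)=0.938469807240813\ldots$ to nearly full double precision.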
%The tactic in MPFR's implementation is to use the series about 0
%except when an asymptotic formula can be shown to have the required
%accuracy. The Taylor series can in principle always be used, but it
%will not be efficient far from $x=0$. The terms in the series grow
%quite large (possibly overflowing a fixedrange format) before
%eventually dropping down. Carrying sufficient extra precision
%(proportional to the log of the largest term) can be
%expensive.
One can avoid cancellation by an even more
expensive tactic, which is to use exact rational coefficients, accumulating the sum
exactly until subsequent terms are below the specified threshold.
This can require furious highprecision integer and rational number
computation.
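The exact tactic is short to state in any language with built-in
rationals. A Python sketch (the name is ours), applied to the series
for $J_0$ itself:

```python
from fractions import Fraction

def j0exact(x, nterms):
    """Partial sum of the J_0 Taylor series at a rational point x,
    accumulated exactly: no roundoff at all until any final
    conversion to a float."""
    u = Fraction(x) ** 2 / 4
    term, total = Fraction(1), Fraction(0)
    for k in range(nterms):
        total += term
        term = -term * u / ((k + 1) ** 2)
    return total
```

As the text warns, the numerators and denominators grow furiously with
the number of terms, though the result rounds correctly to the float
value.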
As indicated earlier, these efficiency concerns (many terms, high precision)
legislate against expansions in a Taylor series about zero for large
$x$, but do not necessarily detract when the series expansion
is shifted so that it is centered on a nearby zero of $J_0$.
\section{Asymptotic Formula, large $x$}
For points far from $x=0$ the computation of a Taylor series can be
compared to the usual asymptotic formula (DLMF 10.17.3). This is
presented later, and shows that one can provide considerable accuracy
with small effort. Unfortunately the asymptotic formula evaluated
near a zero will also require computation in higher precision to achieve
the same relative error.
The relevant DLMF formula 10.17.3 looks like this.
Where
\[a_{k}(\nu)=\frac{(4\nu^{2}-1^{2})(4\nu^{2}-3^{2})\cdots(4\nu^{2}-(2k-1)^{2})}{k!\,8^{k}},\]
and
\[\omega=z-\frac{1}{2}\nu\pi-\frac{1}{4}\pi,\]
then for $\nu$ fixed as $z\rightarrow \infty$,
\[J_{\nu}\left(z\right)\sim\left(\frac{2}{\pi z}\right)^{\frac{1}{2}}\left(\cos\omega\sum _{k=0}^{\infty}(-1)^{k}\frac{a_{2k}(\nu)}{z^{2k}}-\sin\omega\sum _{k=0}^{\infty}(-1)^{k}\frac{a_{2k+1}(\nu)}{z^{2k+1}}\right)\]
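For $\nu=0$ the DLMF 10.17.3 expansion collapses to a short program.
The following Python sketch (our naming; the truncation length is left
to the caller) evaluates it in double precision:

```python
import math

def j0asym(z, nterms):
    """Truncated DLMF 10.17.3 for nu = 0:
    J_0(z) ~ sqrt(2/(pi z)) (cos(w) S_even - sin(w) S_odd),
    w = z - pi/4, with a_0 = 1 and a_k = a_{k-1} * (-(2k-1)^2) / (8k).
    Valid only for large z; accuracy is limited by the truncated tails."""
    a = [1.0]
    for k in range(1, 2 * nterms):
        a.append(a[-1] * (-(2 * k - 1) ** 2) / (8.0 * k))
    w = z - math.pi / 4.0
    even = sum((-1) ** k * a[2 * k] / z ** (2 * k) for k in range(nterms))
    odd = sum((-1) ** k * a[2 * k + 1] / z ** (2 * k + 1) for k in range(nterms))
    return math.sqrt(2.0 / (math.pi * z)) * (math.cos(w) * even - math.sin(w) * odd)
```

At $z=10$, five terms of each sum already give roughly eight correct
digits, and no amount of extra floating-point precision would extend
that: the truncation, not roundoff, limits the accuracy.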
%(If the demand is for merely ``low'' precision  say 18 decimal digits 
%there are compact rational approximations to recommend.)
There is a question as to the bound on the accuracy of this asymptotic
formula as a function of $z$. That is, for a relatively small $z$, as one
includes more terms in the summations, a particular value is
approached. It just might not be the required $p$-digit correct value,
and no amount of higher precision $p+q$ arithmetic will change this.
There is a substantial published literature on Bessel functions in applied
mathematics journals, a literature which is only briefly touched upon
in our direct references. Each of our references has further bibliographic links.
The classic volume by Watson \cite{watson} predates the
explosion in computational power and consequent analysis unleashed
upon many aspects of the Bessel functions and their applications.
Some of the literature consists of theorems concerning Bessel function
properties and/or properties of programs to compute related values.
Some of the numerical literature includes recommendations for methods,
but theoretical error bounds are generally inadequate in that they
may not account for properties of computer arithmetic (finite representation,
roundoff, exponent overflow). These issues require additional analysis;
we have found that testing can illuminate the occasionally
overconservative theoretical bounds.
%Indeed,
%is there any aspect of Bessel functions that has been entirely unexplored?
Note that our program fragments, while executable, do not take
advantage of every algorithmic tweak possible. Instead their central
thrust is to indicate the efficient processes that are possible, and
still retain some semblance of readability.
\section{The 3term Recurrence (Miller's algorithm)}
Using a 3term defining recurrence in reverse to compute values of
Bessel functions is generally credited to J.C.P. Miller
\cite{miller}. There are a number of useful modifications to the
basic idea, as illustrated by Gautschi \cite{gautschi}. To see a
relatively simple explanation, refer to section 9.12, Numerical
Methods, Bessel Functions of Integer Order, \cite{as}, or the more
recent online version of this publication, which has a somewhat more
abstract explanation, sections iii and iv of the Digital Library of
Mathematical Functions ({\tt http://dlmf.nist.gov/3.6}), henceforth
DLMF. These sections cover, respectively, Miller's algorithm and a
variant, Olver's algorithm \cite{olver}. A minimal version of
Miller's backwards recurrence program is quite short, and is given
below. It becomes more complicated if one imposes the (reasonable)
additional requirement that the program determine {\em a priori} a
good value for the number of iterations ({\tt nmax}, below) needed to
meet some accuracy criterion \cite{olver}. In fact the program
computes not just $J_0(x)$, but returns an array with values for all
the integerindex Bessel functions up to $J_{n}(x)$, though the ones
that matter must all be checked for suitable convergence, and the
highest numbered ones will not be accurate. One of the tricky points
in a program like this disappears if it can be executed in an
arbitraryprecision floatingpoint setting where exponent overflow is
avoided. The terms in the iteration can grow exponentially until they
are divided by the normalization parameter $\lambda$ in the
formulation ({\tt lam} in the program.) While there are other
techniques to avoid this additional computation (cf.\cite{zaitsev}),
this is simply not a concern.
The five-line program is based on the recurrence:
$$J_k(x) ~=~ {{2(k+1)}\over{x}}\, J_{k+1}(x) - J_{k+2}(x).$$
This is inadequate without a few additional observations, namely that
for large enough $k$, $J_k(x)$ will be close to zero ($x>0$) and
$J_{k-1}(x)$ will be not quite so small: we set this arbitrarily to some
value e.g. to 1, and correct it later.
Running the recurrence backward to 0 we end up with values
for $J_0, \cdots, J_n$, all scaled by a common unknown factor; that
factor is determined from the normalization
$J_0(x)+2J_2(x)+2J_4(x)+\cdots=1$. Carried out symbolically, assuming
that $J_n(x)$ for $n\ge 10$ is negligible, the formula is:
$${{5\,x^8-1920\,x^6+169344\,x^4-4128768\,x^2+18579456}\over{x^8+96\,
x^6+8064\,x^4+516096\,x^2+18579456}}.$$
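The backward recurrence and normalization just described can be
sketched numerically in a few lines of Python (the name and the
caller-chosen {\tt nmax} are ours):

```python
def besseljlist(x, nmax):
    """Miller's backward recurrence: start from J_{nmax+1} = 0 and the
    arbitrary seed J_{nmax} = 1, run J_{k-1} = (2k/x) J_k - J_{k+1}
    down to k = 1, then rescale using J_0 + 2 J_2 + 2 J_4 + ... = 1.
    Returns approximations to [J_0(x), ..., J_nmax(x)]; the
    highest-numbered entries are not accurate."""
    j = [0.0] * (nmax + 2)
    j[nmax] = 1.0
    for k in range(nmax, 0, -1):
        j[k - 1] = (2.0 * k / x) * j[k] - j[k + 1]
    lam = j[0] + 2.0 * sum(j[2 : nmax + 1 : 2])
    return [v / lam for v in j[: nmax + 1]]
```

With {\tt nmax} chosen generously (e.g.\ 20 for $x=1/2$) the
low-order entries come out accurate to nearly full working precision.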
To compute $J_0(40.0)$ to 40 decimal places, experimentation shows it
is adequate to carry 45 decimal digits and compute {\tt
bj(40.0b0,117)[1]}. Implementing the decision process to figure
these parameter settings for $x=40.0$ (that is, the floating point
precision should be boosted to 45 or more and {\tt nterms} should be
117 or more) adds somewhat to the 5line program. Several methods are
discussed by Olver and Sookne \cite{olver}. One cited method is to
make a table to specify {\tt nmax} for a range of $x$, order, and
precision. Algorithm 236 \cite {gautschi} uses an approximation
involving the polylogarithm. An idea credited to W. Kahan which only
slightly overestimates {\tt nmax} seems to work well. On this
example, it recommends {\tt nmax=119} instead of 117. It is about 4
lines:
\begin{verbatim}
/* findn returns a recommended value for nmax in the bj routine,
given the argument x and d=number of decimal digits */
findn(x,d):=block([r:1, y:[1,0],lim:2*10^d],
 while abs(first(y)) < lim do
   (y: [2*r/x*first(y) - second(y), first(y)], r: r+1),
 r)
\end{verbatim}
The asymptotic formula raises a separate numerical issue: it needs
$\sin$ and $\cos$ of a large argument. Computing $\cos(x-\pi/4)$ by
explicitly subtracting $\pi/4$ from $x>2^p$ in
precision $p$ is unacceptable,
but the algebraically equivalent $\sin(x)+\cos(x)$ (scaled by
$1/\sqrt{2}$) is less so. This does not address the concern
that either formulation has a problem when the overall result is near
zero.
At such a point most of the significant figures in the subexpressions
cancel, and therefore the precision of the whole calculation must be boosted.
Considering $P(n, z)$ and $Q(n, z)$ together, the
ratio of the term of index $k+1$ to the term of index $k$
is about $k/(2z)$. The series starts to diverge when $k > 2z$. At that point, the $k$th
term is about $e^{-2z}$, so for precision $p$, if $z > (p \log 2)/2$ the asymptotic
formula is credible, so long as the result is not near a zero.
\medskip
We turn to other approaches in the subsequent sections. First we find
a mechanism to compute an important subpart of these other approaches,
namely accurate and inexpensive derivatives.
{\section {Derivatives of Bessel Functions}
We will need derivatives of Bessel functions at specified points
for use in evaluation of Taylor series and Hermite interpolation,
methods described later.
Here we describe how, as a consequence of various identities, the
formula for each derivative of $J_0$ at its $k$th zero $z_k$ can be
rephrased entirely as a rational function of $z_k$ multiplied by
$J_1(z_k)$. (See equation (9) in Wills \cite{wills}.) Compared to
Wills, the formulas become even simpler because at our points of
interest we know that $J_0(z_k)=0$. We combine this computation with
dividing the $n$th derivative by $n!$:
{$$
J_0(x)=0;~~J_0'(x)=-J_1(x);~~ J_0''(x)/2 = J_1(x)/(2\, x)
$$
$$
\frac{J_0^{(k)}(x)}{k!}~=~ {\frac{1}{x}}
\Big({\frac{1-k}{k}}\,\frac{J_0^{(k-1)}(x)}{(k-1)!} - \frac{1}{k (k-1)}\,\frac{J_0^{(k-3)}(x)}{(k-3)!}\Big)
-
\frac{1}{k (k-1)}\,\frac{J_0^{(k-2)}(x)}{(k-2)!}.
$$
}
It might seem at first that reducing the evaluation of $J_0(x)$ to evaluation
of $J_1(z_k)$ is not much of an advantage, but note that we are
accessing the value of $J_1$ only at {\em precomputed zeros of}
$J_0$, not at $x$.
Furthermore, while the fully accurate direct evaluation of $J_0$ near its
zeros is delicate, such points do not represent difficulty for
evaluating $J_1$.
As an example, the first six values ($k=1,\ldots,6$) of
$(1/k!)\, (d^k/dz^k)J_0(z)$, where $J_0(z)=0$, are
{\em exactly}
$$\left[ -1 , {{1}\over{2\,z}} , {{z^2-2}\over{6\,z^2}} , -{{z^2-3
}\over{12\,z^3}} , -{{z^4-7\,z^2+24}\over{120\,z^4}} , {{z^4-11\,z^2
+40}\over{240\,z^5}} \right] $$
times $J_1(z)$.
There are many ways of arranging these formulas. For example,
if we first set $y=1/z$ we can state those formulas using
only polynomials with rational constants (here we omit division by $n!$) as
$$\left[ -1 ,~ y ,~ 1-2\,y^2 ,~ 6\,y^3-2\,y ,~ -24\,y^4+7\,y^2-1 ,~ 120\,y
^5-33\,y^3+3\,y \right]. $$
Another rearrangement would use Horner's rule. If we are dealing
with simple numbers, we need not worry about ``expression swell'' nor
be too concerned about roundoff, and so can use the recurrence above as
shown in the Maxima program below.
\begin{verbatim}
/*[J0(x), J0'(x), J0''(x)/2, ..., J0^(n)(x)/n!].
We require as input n, x, and a sufficiently accurate value for J1(x) */
J0Ders(n,x,J1x):= /* all the derivatives of J0 at x*/
block([xinv:1/x, ans:[1/(2*x),-1,0]],
for k:2 thru n-1 do
push(-((ans[1]*k^2+ans[3])*xinv+ans[2])/(k^2+k), ans),
reverse(J1x*ans))
\end{verbatim}
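The same recurrence transcribes directly into other languages. A
Python sketch (our naming), with the same convention that {\tt x} is a
zero of $J_0$ and an accurate value of $J_1(x)$ is supplied:

```python
def j0ders(n, x, j1x):
    """Normalized derivatives [J0(x), J0'(x), J0''(x)/2!, ..., J0^(n)(x)/n!]
    at a zero x of J0, given j1x = J1(x).  Internally c[k] holds the
    k-th normalized derivative divided by J1(x): c[0] = 0, c[1] = -1,
    c[2] = 1/(2x), and then
      c_k = ((1-k)/k c_{k-1} - c_{k-3}/(k(k-1)))/x - c_{k-2}/(k(k-1))."""
    c = [0.0, -1.0, 1.0 / (2.0 * x)]
    for k in range(3, n + 1):
        c.append(((1.0 - k) / k * c[k - 1] - c[k - 3] / (k * (k - 1))) / x
                 - c[k - 2] / (k * (k - 1)))
    return [j1x * v for v in c]
```

Checking the output against the closed forms listed earlier (e.g.\
the third normalized derivative $(z^2-2)/(6z^2)$ times $J_1(z)$) is a
useful sanity test.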
Note that the value of the $n$th derivative is a simple function of
the values of the $n-1$, $n-2$, and $n-3$ derivatives, so the
algebra for making and displaying the
explicit formula is probably unnecessary. We can compute numeric terms
as needed, 4 multiplies and 3 adds. A convenient data
structure is a list rather than an array; each additional derivative
refers to only the top 3 elements. Also note that in Maxima the type
of the arithmetic is based on the type of the values of {\tt x} and
{\tt J1x}. If either of these is a machine float, the whole
calculation will be done in machine floats. If neither is a machine
float but either is a bigfloat, the whole calculation will be done in
bigfloats of precision equal to the global floatingpoint
precision. If both are rational numbers or integers, the entire
calculation will be done exactly. If one or both are symbolic, the
entire calculation will be symbolic. If $x$ is a rational form, the
result will be in canonical simplified rational form.
We are not quite done with this program unless we can determine that
the answers it produces are accurate: Is the sequence, as computed,
stable? If we need $k$ terms accurate to $d$ decimal digits, how many
digits of $J_1(x)$ do we need, and what should the working precision be?
Experiments suggest that at zeros near $z=100.0$ or 1000 or even near
10,000, ordinary machine double-floats provide coefficients that
are good for about 16 decimal digits all the way out to $n=100$,
although the coefficients underflow to zero before that point. Using
arbitrary-precision software set at 25-decimal precision, roundoff
eventually degrades the results after about 150 coefficients (when the
coefficients look like $10^{-264}$). We suspect people do not really need
264-digit Bessel values, but there you are.
One variation in the use of this program will produce a result that
has either zero or one rounding, and so can be as precise as the input
allows. If {\tt x} and {\tt J1x} are both given as rationals
({\tt J1x} being a rational approximation to its presumably irrational
value), the answer will be a list of rationals, and there will be no roundoff at
all. The numbers will probably be presented with uncomfortably large
numerators and denominators. If only {\tt x} is given as a rational,
the bulk of the calculation will be done exactly, and then there will
be one rounding when multiplying by {\tt J1x} which is presumably a
bigfloat or float. Alternatively, binding {\tt J1x} to 1 and
subsequently multiplying the result of the summation by $J_1(x)$ would
work. This modification essentially removes the arithmetic error, but
would be timeconsuming so we would not want to do this except at
``tablebuilding'' time.
While appealing, this does not eliminate {\em all} sources of error:
since {\tt x} is a rational number it cannot be exactly a root of
$J_0(x)=0$ unless $x=0$, and so we are computing something subtly
different from our goal. We can finesse this problem if we know that
the chosen rational value for {\tt x} is so accurate that the residual
$\epsilon=J_0(x)$ is far less than $10^{-d} J_1(x)$, where $d$ is the
bigfloat precision. We should also assure ourselves that the rational
iteration in the derivative program is not especially sensitive to
slight changes in the value of {\tt x}. Fortunately we know that
$f(x+\epsilon)$ can be approximated by $f(x)+\epsilon f'(x)$, and we
have approximations of $f'(x)$ easily available in the list we are
computing! Thus we can multiply the $n$th normalized derivative by
$n$ to give the rate of change with error of the previous normalized
derivative. If the (rational) value for {\tt x} is known to be good
to $d$ decimal digits, we have a value for $\epsilon = 1/10^d$. Note
that this could be used with bigfloats (of sufficiently high
precision) but it certainly works because computing with a rational
value for {\tt x} removes all roundoff error!
\begin{verbatim}
/*[J0(x), J0'(x), J0''(x)/2, ..., J0^(n)(x)/n!].
We require as input n, x, and J1(x). One or zero roundoffs. */
J0DersExact(n,x,J1x):= /* all the derivatives of J0 at x*/
block([xi:1/x, ans:[1/(2*x),-1,0]],
for k:3 thru n do
push(xi*((1-k)/k*first(ans) - 1/(k*(k-1))*third(ans))
 - second(ans)/(k*(k-1)),
ans),
reverse(ans) *J1x )
\end{verbatim}
%hmm. write another short program?
\begin{verbatim}
/* error in nth deriv in list m; m[1]= function, m[k] = (k-1)th deriv/(k-1)!,
eps = known bound of error in x, a rational approximation to a zero of J_0 */
error_in_der(x,eps,m,n):= eps*n*m[n+1]
\end{verbatim}
%fpprec:50
%J0Ders(100,c1,j1c1)[100];
%J0Ders(100,bfloat(c1),bfloat(j1c1))[100];
%JxDers(100,bfloat(c1),bfloat(j1c1),40)[100];
%bfloat(JxDers(100,rationalize(c1),rationalize(j1c1))[100]);
}
%test(n):=[J0Ders(n,c1,j1c1)[n],J0Ders(n,bfloat(c1),bfloat(j1c1))[n]];
\section {Taylor series in summary}
Given a list of coefficients appropriate for a Taylor series for $f$, namely
$\{f(x_0),~f'(x_0),~f''(x_0)/2, \ldots, ~f^{(n)}(x_0)/n!\}$, here is a
Taylor series evaluation program for $f(v)$:
\begin{verbatim}
(tayser(m,len,x0,v):=
block([q:vx0],
tayser1(m,index):= if index>len then 0
else first(m)+q*tayser1(rest(m),index+1),
tayser1(m,1)))
\end{verbatim}
This program adds the small terms together first.
To use this program for Bessel functions we can use a zero of $J_0$
for {\tt x0} and, for {\tt m}, a list of coefficients from the
derivative program of the previous section.
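The same evaluation can be written iteratively; here is a Python
sketch (ours) in which Horner's rule over the reversed coefficient
list adds the small, high-order terms first, just as the recursive
Maxima version does:

```python
def tayser(coeffs, x0, v):
    """Evaluate a truncated Taylor series with normalized coefficients
    coeffs = [f(x0), f'(x0), f''(x0)/2!, ...] at the point v, by
    Horner's rule starting from the innermost (smallest) term."""
    q = v - x0
    total = 0.0
    for c in reversed(coeffs):
        total = total * q + c
    return total
```

For instance, feeding it the coefficients $1/k!$ of $e^x$ at $x_0=0$
reproduces $e$ at $v=1$ to near full double precision.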
It might be helpful to see how rapidly the coefficients drop.
Below we have printed the magnitudes (first 5 digits) of the coefficients
around the zero at c0=2.404825557695770 to demonstrate how many terms
in the sum are likely to be needed. (These coefficients will be
multiplied by $(x-x_0)^n$.) The 40th coefficient is approximately $1.1\times 10^{-32}.$
\begin{verbatim}
[ 0, 0.519, 1.0793E-1, 5.6601E-2, 8.6576E-3,
2.1942E-3, 2.6437E-4, 4.3729E-5, 4.3388E-6, 5.3049E-7,
4.4700E-8, 4.3265E-9,
3.1666E-10, 2.5338E-11, 1.6387E-12,
1.1169E-13, 6.4695E-15, 3.8392E-16, 2.0132E-17, 1.0576E-18,
5.0661E-20, 2.3875E-21, 1.0538E-22, 4.4341E-24, 1.5938E-25,
1.6803E-26, 4.1271E-27, 1.5342E-27, 6.1900E-28, 2.4874E-28,
1.0001E-28, 4.0255E-29, 1.6219E-29, 6.5415E-30, 2.6406E-30,
1.0668E-30, 4.3137E-31, 1.7455E-31, 7.0683E-32, 2.8641E-32,
1.1613E-32].
\end{verbatim}
Around the 100th zero at about 313.37 the coefficients drop faster; the 40th
coefficient is then about 3.49E-51.
As shown in the previous section on derivatives, the coefficients
could be stored as formulas, or computed as needed by the program {\tt
J0Ders} or stored for some set of zeros. If more precision is
needed, or expansion around a zero larger than had been anticipated is
needed, the coefficients can be recomputed and stored (again). An
accurate value of $J_1(c)$ is also needed: this can be computed by the
recurrence program. As one proceeds down the sum, the coefficients
toward the end need not be computed or stored or operated on to full
precision. If we have arranged to sum the terms from the smallest to
the largest, a 50-digit correct result can be obtained even if the
initial 10 terms are added together in double-float precision; only
the last 10 terms or so require 50-digit arithmetic.
There is no penalty in asking for the value at a large argument if the
data on a nearby zero has been pretabulated. There is still an issue
of accurate computation (lost digits) that has to be treated in the
same way as the previous section, and the number of terms to be used
can be computed at runtime. A program that adds terms ``until it doesn't
matter'' is a bit tricky: the coefficients may not be strictly
monotonically decreasing in absolute value. Encountering a zero coefficient
doesn't mean that all subsequent terms are zero.
Instead one can use an {\em a priori} error bound on the Taylor
series. The Taylor series computation could require higher precision
in the coefficient list, as well as more terms in that list to meet a
required relative error bound.
\begin{verbatim}
(tayser2(m,x0,v,precision_goal):=
[[ messy details omitted ... call tayser() with estimated length;
call tayser() with more terms and higher precision;
if they agree to precision_goal, return the value]]
)
\end{verbatim}
%\begin{verbatim}
% [0.5952, 0.07365, 0.02299, 0.002093, 3.156508*10^4,
% 2.263594*10^5, 2.228387*10^6, 1.315939*10^7, 9.576043*10^9, 4.802473*10^10,
% 2.766556*10^11, 1.205134*10^12, 5.739488*10^14, 2.209186*10^15, 8.962283*10^17,
% 3.08951*10^18, 1.091216*10^19, 3.405685*10^21, 1.064852*10^22, 3.035842*10^24,
% 8.513661*10^26, 2.23379*10^27, 5.678368*10^29, 1.37986*10^30, 3.207326*10^32,
% 7.257745*10^34, 1.553841*10^35, 3.289748*10^37, 6.527678*10^39, 1.2984*10^40,
% 2.400566*10^42, 4.5024*10^44, 7.792434*10^46, 1.3826*10^47, 2.249143*10^49,
% 3.786146*10^51, 5.809952*10^53, 9.303546*10^55, 1.351046*10^56, 2.062873*10^58]
%\end{verbatim}
%These numbers are the coefficients in a (truncated) polynomial to be
%evaluated at the point $(xc0)$. This can be done using Horner's Rule,
%rather than the sum formula {\tt gensum}. Finally we must multiply by
%a fullprecision value of $J_1(c0)= 0.519147497289466788...$
%The tables can be precomputed for some precision and some set of
%zeros. If higher argument values are needed, more zeros can be
%computed to high precision, the coefficients computed, and key values
%of $J_1$ stored.}
Remaining to be seen are two comparisons. For arguments that are
large enough and precision requirements that are modest, is the Taylor
series just too much work? Reasonable alternatives are Hermite Interpolation
and asymptotic series.
\section {Hermite Interpolation}
{There are many variations on techniques for interpolation, where some
information about function values at existing points is used to
determine or approximate function values at additional points. For
example, linear interpolation assumes that a function $f$ varies
approximately linearly between two tabulated points $x_0$ and $x_1$
thus the value of $f$ at an intermediate point lies on the straight
line between $(x_0,f(x_0))$ and $(x_1,f(x_1))$. More elaborate
interpolation using more points fits a more elaborate curve to the
tabulated points usually resulting in a more accurate result.
Hermite interpolation differs in that derivatives of $f$ are also
given at some or all of the points. The spacing may not be even. In
our case we are mostly going to look at Hermite interpolation with
exactly 2 points, but since we have neat ways of computing
derivatives, we exploit this information.}
Since Hermite interpolation may be less familiar than (say) Taylor series, we discuss it
informally first.
Consider this problem: I'm thinking of a function $f$.
I know its derivatives at zero. Here are the first 9.
(The zeroth derivative is $f(0)$.) In Maxima, we can compute the list of derivatives this way:
\begin{verbatim}
s0: makelist(at(diff(f(x),x,n),x=0),n,0,8);
\end{verbatim}
which returns {\tt [0,1,0,-1,0,1,0,-1,0]}.
Similarly, I know its derivatives at $\pi$.
\begin{verbatim}
spi: makelist(at(diff(f(x),x,n),x=%pi),n,0,8);
\end{verbatim}
returns {\tt [0,-1,0,1,0,-1,0,1,0]}.
What is the value of $f(1/2)$ approximately?\footnote{If you have guessed that
the function looks like $\sin(x)$ I won't deny it.}
I can incorporate the values at 0 and $\pi$, and any number of
derivatives, into a calculation, via Hermite
interpolation, to estimate the value of $f(q)$.
The 2-point Hermite interpolation function {\tt hi}
(defined below) has the expected property that, given this data, the error
vanishes at 0 and $\pi$. It also uses the derivatives at the two points. The
approximation can be improved by using more derivatives and does not
depend on knowledge of function values at new intermediate points.
\begin{verbatim}
hi(s0,spi,9,0,bfloat(%pi),1.5b0)
\end{verbatim}
returns 0.99749498660{\it 35}.. while {\tt sin(1.5b0)} returns
0.99749498660405... \footnote{We use the convention that
digits that are not correct are typeset in italics.}
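To make the mechanics concrete, the same experiment can be sketched in
Python (an illustrative translation; the function name {\tt hermite2} and its
calling conventions are ours, not those of the Maxima program {\tt hi} given
below):

```python
from math import factorial, sin, pi

def hermite2(x0, d0, x1, d1, v):
    """Two-point Hermite interpolation via Newton divided differences.
    d0, d1 are lists of derivatives [f, f', f'', ...] at x0 and x1.
    Returns the interpolating polynomial evaluated at v."""
    z = [x0] * len(d0) + [x1] * len(d1)      # repeated (confluent) nodes
    n = len(z)
    F = [[0.0] * n for _ in range(n)]        # divided-difference table
    for i in range(n):
        F[i][0] = d0[0] if z[i] == x0 else d1[0]
    for j in range(1, n):
        for i in range(n - j):
            if z[i + j] == z[i]:             # confluent entry: f^(j)(z)/j!
                d = d0 if z[i] == x0 else d1
                F[i][j] = d[j] / factorial(j)
            else:
                F[i][j] = (F[i + 1][j - 1] - F[i][j - 1]) / (z[i + j] - z[i])
    # evaluate the Newton form by Horner's rule
    acc = F[0][n - 1]
    for i in range(n - 2, -1, -1):
        acc = acc * (v - z[i]) + F[0][i]
    return acc

# derivatives of sin at 0 and at pi, orders 0..8, as in the text
s0  = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.0]
spi = [0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0]
print(hermite2(0.0, s0, pi, spi, 1.5))   # close to sin(1.5) = 0.997494986...
```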
It is natural to wonder how efficient, in terms of error,
Hermite interpolation might be.
If you use a total of 10 data items from each of two points,
the error bound is similar to that of a Taylor series of 20 terms:
proportional to $1/20!$ times a 20th derivative evaluated at some point
in the interval.
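For the record, the remainder in two-point Hermite interpolation with $m$
data items at each endpoint (a degree $2m-1$ interpolant $H$) is, for some
$\xi$ in the interval (a standard result, stated here for orientation):
$$f(x)-H(x)=\frac{f^{(2m)}(\xi)}{(2m)!}\,(x-x_0)^m(x-x_1)^m,$$
so for $m=10$ the factor $(x-x_0)^{10}(x-x_1)^{10}/20!$ plays the role of
$x^{20}/20!$ in the Taylor remainder.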
Our scheme, then, is: given a point of interest $v$, choose two nearby
points: those zeros of the Bessel function $J_0$ which bracket $v$.
If the point $v$ is ``very'' close to one of the zeros we can actually
do better with the Taylor series. We would need to quantify ``very,'' though.
If we use Hermite interpolation, we find it is easy (i.e. in modest
working precision) to evaluate $J_0(v)$
accurately (i.e. with low relative error) even in the places where
$J_0(x)$ is difficult (i.e. requires much higher working precision)
to evaluate by recurrence. Hermite interpolation is a good choice for
points lying roughly midway between zeros of $J_0$.
If we are given some modest restriction, say that we will need no more
than 100 decimal digits of numerical values, and arguments less than,
say, 300, we can tabulate the first (say) 100 zeros of $J_0$ and
prospectively evaluate the needed derivatives, perhaps caching them.
Given $z$, some zero of $J_0$, $\left. d^n/dx^n\,(J_0(x))\right|_{x=z}$ is a {\em
simple rational function} of $z$ times $J_1(z)$. In fact, that
rational function is a polynomial divided by $z^n$. These polynomial
functions are easily tabulated, and are not delicate to compute at
those points $z$. This is convenient since these coefficients are
good irrespective of future perhaps more demanding precision
requirements.
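As a sketch of that tabulation (Python, for illustration; the recurrence is
the one used by the {\tt HH} routines later in these notes, and the function
names here are ours), generate the normalized derivatives at a zero $z$ of
$J_0$ and sum the resulting Taylor series:

```python
def j0_normalized_derivs(z, n):
    """a[k] = (d/dx)^k J0(x) / (k! * J1(z)) at x = z, valid when z is a
    positive zero of J0.  Derived from the ODE  x*y'' + y' + x*y = 0."""
    a = [0.0, -1.0, 1.0 / (2.0 * z)]
    for k in range(3, n + 1):
        a.append((1.0 / z) * ((1 - k) / k * a[k - 1] - a[k - 3] / (k * (k - 1)))
                 - a[k - 2] / (k * (k - 1)))
    return a

def j0_series(x, terms=40):
    """J0 by its power series at 0 (adequate for moderate x)."""
    t, s = 1.0, 1.0
    for k in range(1, terms):
        t *= -(x * x / 4.0) / (k * k)
        s += t
    return s

# first zero of J0, and the value of J1 there quoted in the text
c0, j1c0 = 2.404825557695773, 0.519147497289466788
a = j0_normalized_derivs(c0, 30)
t = 0.1
taylor = j1c0 * sum(ak * t**k for k, ak in enumerate(a))
print(taylor, j0_series(c0 + t))   # the two values should agree closely
```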
There are other types of interpolations. For those familiar with the technology
we note: the Chebyshev approximation will not match the endpoints, which
we really would like to see if we are going to maintain low relative error
near zeros. (We can match the endpoints
in the Chebyshev approximation, compromising the minimax property). The Hermite approximation
will not be so uniformly close, but will match the endpoints and also be
rather good beyond them. As noted, Taylor series (of the same degree) will be
even better than the Hermite approximation sufficiently
close to the point
of expansion, becoming much worse further away.
For comparison we have tried each on the expression $\sin(\pi x)$ between $-1$ and 1 (see fHI, fCHI, fTS below).
A simple way of computing an Hermite Interpolation formula is via
an implementation of divided differences. This takes only a few lines
of code in Maxima. The program assumes two points, two lists of
normalized derivatives, where the nth derivative is divided by $n!$:
\begin{verbatim}
/* hermite interp. at 2 points, use len-1 derivs.
m0 and m1 are each available lists of normalized derivatives at x0 and x1.
(nth derivative divided by n!)
Evaluate the function at v which can be any numeric type or symbolic. */
hi(m0,m1,len,x0,x1,v) :=
 block([del:x1-x0, v0:v-x0, v1:v-x1],
/* Compute a divided difference tableau using derivatives.*/
kill(dd),
dd[K,L]:= if K=0 then m1[L] else if L=0 then m0[K] else
            (dd[K-1,L]-dd[K,L-1])/del,
/* auxiliary function for hermite interpolation, start it at hi2(1,0).
This computes the Newton interpolation polynomial. It
refers to external values for dd[], m0, m1, len, v0 and v1 */
hi2(K,L):= dd[K,L]+
(if K=L then 0
else if (K=1) or (x<=0) or (nmax<0) then return(Japlusalarm(x,a,nmax,d)),
epsilon: 0.5*10^(-d),
for n: 0 thru nmax do Japprox[n]:0,
sum: (x/2)^a/gamma(1+a),
d1:2.3026*d+1.3863,
if nmax> 0 then r:nmax*gt(0.5*d1/nmax) else r:0,
s: 1.3591*x*gt(0.73576*d1/x),
nu: 1+entier(max(r,s)),
L0,
m:0, L:1, limit: entier(nu/2),
L1,
m:m+1,
L:L*(m+a)/(m+1),
if m< limit then go(L1),
n:2*m, r:0,s:0,
L2,
r:1/(2*(a+n)/x-r),
if entier(n/2) # n/2 then lambda:0 else
(L:L*(n+2)/(n+2*a),
lambda:L*(n+a)),
s:r*(lambda+s), if n<=nmax then rr[n-1]:r,
n:n-1,
if n>=1 then go(L2),
J[0]:sum/(1+s),
for n:0 thru nmax-1 do J[n+1]:rr[n]*J[n],
enough:true,
for n:0 thru nmax do
if abs((J[n]-Japprox[n])/J[n]) >= epsilon then enough:false,
if enough then return(J[nmax]), /* We return one; Alg. 236 returns ALL of J values.. */
for m:0 thru nmax do Japprox[m]:J[m],
nu:nu+5, go(L0)))
\end{verbatim}
Kahan's estimation method, from the Olver/Sookne paper. Consider
$$y_{r-1}-\frac{2r}{x}\,y_r+y_{r+1}=0,$$
or
\begin{verbatim}
(kill(y), y[0](x):=0,
 y[1](x):=1,
 y[r](x):= (2*(r-1)/x)*y[r-1](x)-y[r-2](x),
 findn(x,d):=block([i:2], while abs(y[i](x))<2*10^d do i:i+1, i))$
"For fixed-point computation to D decimals, N is the least
integer for which y[N+1] >= (2*10^d)*beta."
findn(x,d):=block([r:1, y:[1,0], lim:2*10^d],
  while abs(first(y)) < lim do
    (push(2*(r-1)/x*first(y)-second(y),y), r:r+1),
  r-1)$
/* Compute A: SS(n,x) for n>=k, then
   A[k] is diff(J_0(z),z,k)/J[1](x) evaluated at z=x,
   iff x is a positive zero of J_0 ! */
SS(n,x):=
 block([arr:make_array(any,n+1)],
  arr[0]:0, arr[1]:-1, arr[2]:1/x,
  for k:3 thru n do arr[k]: 1/x*((1-k)*arr[k-1] +(2-k)*arr[k-3]) - arr[k-2],
  arr)$
/* Compute A: HH(n,z) for n>=k, then
   A[k] is (1/k!)*diff(J_0(z),z,k) evaluated at z=x,
   iff x is a positive zero of J_0 ! */
HH(n,x,J1x):=
 block([arr:make_array(any,n+1), xi:1/x],
  arr[0]:0, arr[1]:-J1x, arr[2]:J1x*xi/2,
  for k:3 thru n do arr[k]:
    xi*((1-k)/k*arr[k-1] - 1/(k*(k-1))*arr[k-3]) - arr[k-2]/(k*(k-1)),
  arr)$
/* returns a LIST. [note index origin is 1 rather than 0]
   A[k+1] is (1/k!)*diff(J_0(z),z,k)/J[1](x) evaluated at z=x.
*/
HH(n,x):=
 block([xi:1/x, ans:[1/(2*x),-1,0]],
  for k:3 thru n do ans:
    cons(xi*((1-k)/k*first(ans)-1/(k*(k-1))*third(ans))-second(ans)/(k*(k-1)),
         ans),
  reverse(ans));
HH(n,x,J1x):= /* all the normalized derivatives using J1 */
 block([xi:1/x, ans:[J1x/(2*x),-J1x,0]],
  for k:3 thru n do ans:
    cons(xi*((1-k)/k*first(ans) - 1/(k*(k-1))*third(ans))
          - second(ans)/(k*(k-1)),
         ans),
  reverse(ans));
if load(basic)....
push(c,l)::=buildq([c,l],l:cons(c,l))$
/* If you think you will be needing random access to derivatives,
   store them in an array of len-1 derivatives at the ith zero of J[0] this way */
cacheJder(i,len,ci,j1ci):=
  fillarray(make_array(any,len),J0Ders(len,ci,j1ci))$
use it as, for example, CD[9]:cacheJder(9,100,c9,j1c9).
/* version of alg 236, bessel functions, using lists instead. Works. */
push(c,l)::=buildq([c,l],l:cons(c,l))$
bj(x,nmax):=block([lam:0, r:[1,0], xi:1/x, li:0],
  for k:nmax-2 step -1 thru 0 do
    (push(2*(k+1)*xi*first(r)-second(r), r),
     if evenp(k) then lam:lam+first(r)),
  r/(2*lam-first(r)))$
a:bj(3.5,40)$
a[1] is bessel_j(0,3.5)
a[2] is bessel_j(1,3.5)
a[21] is bessel_j(20,3.5) to about 14 decimals.
/* array version of above */
bj(x,nmax):=block([lam,r],
  r: make_array(any, nmax+1),
  r[nmax]:0, r[nmax-1]:1,
  for k:nmax-2 step -1 thru 0 do r[k]:2*(k+1)/x*r[k+1]-r[k+2],
  lam: r[0]+ 2*sum(r[2*i],i,1, entier((nmax-1)/2)),
  print([1/lam,r]),
  for k:nmax step -1 thru 0 do r[k]:r[k]/lam, r)$
\end{verbatim}
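For illustration, the backward (Miller) recurrence with the even-order
normalization, as in {\tt bj} above, can be sketched in Python (function
name ours; {\tt nmax} is supplied by the caller rather than estimated by a
{\tt findn}-style routine):

```python
def bessel_j_list(x, nmax):
    """Miller's backward recurrence for J_0(x)..J_nmax(x), real x > 0.
    Run the three-term recurrence downward from a zero start, then
    normalize with the identity  J0 + 2*(J2 + J4 + ...) = 1."""
    r = [0.0] * (nmax + 1)
    r[nmax], r[nmax - 1] = 0.0, 1.0
    for k in range(nmax - 2, -1, -1):
        r[k] = 2.0 * (k + 1) / x * r[k + 1] - r[k + 2]
    norm = r[0] + 2.0 * sum(r[2 * i] for i in range(1, (nmax - 1) // 2 + 1))
    return [v / norm for v in r]

js = bessel_j_list(3.5, 40)
print(js[0])   # J0(3.5); js[k] is J_k(3.5)
```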
P.~L. Richman, ``Automatic Error Analysis for Determining Precision,'' {\it Comm. ACM} 15 (1972) 813--817.
Table: how good is the asymptotic formula? That is, when does the
truncation error make you hit a wall, regardless of the precision
being carried? For $J_0(x)$ it looks something like this:
\begin{verbatim}
  x   precision  lim   delta in asymptotic J[0](x)
  5       30       5   2*10^-6
 10       30      10   3*10^-11   hits a wall
 50       30      14   5*10^-31   FULL PRECISION
100       30      10   5*10^-31
124*
200       30       7   3*10^-31
300       30       7   2*10^-30

  5       40       4   3*10^-6    note, lim=20 gives error 10^-5
 10       40      10   3*10^-11   hits a wall
 50       40      25   4*10^-41   FULL PRECISION
100       40      15   4*10^-41
300       40      10   1*10^-40

100      100     100   2*10^-90   hits a wall
100      120     100   2*10^-90
\end{verbatim}
*124, actually 124.8793089132329460452591283971119168563..., is a zero of
$J_0$.  The asymptotic version at x=300 is 20X faster for precision 40.
So it looks like we can sort of say things like this:
for $\nu=0$ and $x>50$ and precision $<40$ we can
use the asymptotic formula with 25 terms.
\begin{verbatim}
q(z,lim):=J0(z)-j(0,z,lim,bfloat(%pi))$

j(n,z,lim,pi):=block([w:z-1/4*(2*n+1)*pi],
  sqrt(2/(pi*z))*(cos(w)*gg(n,lim,z,0)-sin(w)*gg(n,lim,z,1)))$

gg(n,lim,z,r):=sum((-1)^k*ww(n,2*k+r)/z^(2*k+r),k,0,lim)$

ww(n,m):=gamma(n+m+1/2)/(m!*gamma(n-m+1/2))/2^m$

push(c,l)::=buildq([c,l],l:cons(c,l))$

J0(x):=block([b,r,nmax,fpprec:fpprec+3],
  b:bjn(x,0,nmax:findn(x,fpprec-3)),
  r:ceiling(-(log(abs(b)))/log(10.0)-1),
  if r < 3 then b
  else (b:block([fpprec:fpprec+r],bjn(x,0,nmax)),bfloat(b)))$

findn(x,d):=block([r:1,y:[1,0],lim:2*10^d],
  while abs(first(y)) < lim do
    (push(2*(r-1)/x*first(y)-second(y),y),r:r+1),
  r-1)$

fpprec:40$
makelist(q(100.0b0,i),i,3,30);
\end{verbatim}
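A Python rendering of the same asymptotic formula (function names ours; note
that {\tt math.gamma} accepts the negative non-integer arguments that arise
in {\tt ww}):

```python
from math import gamma, factorial, cos, sin, pi, sqrt

def hankel(n, m):
    """Hankel's symbol (n,m) divided by 2^m, as in the Maxima ww()."""
    return gamma(n + m + 0.5) / (factorial(m) * gamma(n - m + 0.5)) / 2.0**m

def j_asympt(n, z, lim):
    """Asymptotic expansion of J_n(z) for large z, truncated after lim terms
    of each of the cosine and sine sums."""
    w = z - (2 * n + 1) * pi / 4.0
    P = sum((-1)**k * hankel(n, 2 * k) / z**(2 * k) for k in range(lim + 1))
    Q = sum((-1)**k * hankel(n, 2 * k + 1) / z**(2 * k + 1) for k in range(lim + 1))
    return sqrt(2.0 / (pi * z)) * (cos(w) * P - sin(w) * Q)

print(j_asympt(0, 30.0, 10))   # J0(30), good to roughly double precision
```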
Simulation of mpfr bessel, to compute J[n](z):
\begin{verbatim}
(Jm(n,z,w):= /* w is number of bits? */
 block([fpprec:w,x,y,u,t,s,k,z2],
  z2:z^2,
  x:z^n, y:-z^2/4, u:n!, t:(x/u)/2^n, s:t,
  k:1,
  while true do
   (if (abs(t)
\end{verbatim}
A requested precision pz > p, the global precision, will
use that precision pz for computation.  The result will be presented in
precision p.
(Is some of this done by the Lisp interface I'm using?)
\begin{verbatim}
(Jm(n,z,w):= /* w is number of bits? */
 block([fpprec:round(w*0.30103),x,y,u,t,s,k,z2],
  z2:z^2,
  x:z^n, y:-z^2/4, u:n!, t:(x/u)/2^n, s:t,
  k:1,
  while true do
   (if (abs(t)
\end{verbatim}
The asymptotic expansion is used when z > p*log(2)/2,
or z > 0.3466*p;
then for p=100, the cutoff would be z=34.
Let's try z=34.
................
Set q to 663514145/21658974, about 30.6346064.
Convert to bigfloat to 600 bits.
Compute Jn to 180 bits.
\begin{verbatim}
2.0075262443912719548607574952509881879988214540556339627*10^-16      mpfr
2.00752624439127195486075749525098818799882145405563396318764*10^-16  mma 8
\end{verbatim}
//////// argggh. mpfr seems to work in high precision if conversion to
bigfloat is of high enough precision.
Now compare to J0 at 31.0:
\begin{verbatim}
5.1208145304542248799820491048897838203433609629728165006*10^-2      mpfr
5.12081453045422487998204910488978382034336096297281649982555*10^-2  mma
\end{verbatim}
\begin{verbatim}
Jm(0, 34.0b0, 100);
3.04211910217926496285821621189b-2    <- wrong
\end{verbatim}
It is especially bad near a zero.  Let
\begin{verbatim}
c: 553786423/16395943$
fpprec:100$
Jm(0,bfloat(c),100);
4.90127491293600033329098262567b-16    <- wrong
\end{verbatim}
The mpfr code itself gives this:
\begin{verbatim}
4.8947887218241289899665567789565*10^-16    <- wrong
\end{verbatim}
\begin{verbatim}
(j0 (into 553786422/16395943))
8.3738368423171871495055697386111*10^-9    <- wrong
\end{verbatim}
To 120 bits:
\begin{verbatim}
8.3738368423171871495081488472489365269*10^-9    <- wrong
\end{verbatim}
Diddle the 2nd digit:
\begin{verbatim}
(j0 (into 56378643/16395943))
-3.7089824688980036762744909301333*10^-1
\end{verbatim}
All digits correct, except the last one should be rounded up to 4.
Sensitive near a zero.
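One way to see that the difficulty is cancellation rather than the series
itself: evaluate the defining series in exact rational arithmetic (a Python
sketch of our own construction, using the near-zero rational $c$ above). The
exact partial sum comes out of order $10^{-16}$ even though individual terms
reach roughly $10^{13}$:

```python
from fractions import Fraction

def j0_exact(x, terms):
    """Partial sum of J0's power series, in exact rational arithmetic,
    so there is no rounding and hence no catastrophic cancellation."""
    t = Fraction(1)          # current term
    s = Fraction(1)          # running sum
    q = -x * x / 4           # term ratio numerator: t_k = t_{k-1}*q/k^2
    for k in range(1, terms):
        t = t * q / (k * k)
        s += t
    return s

c = Fraction(553786423, 16395943)    # the near-zero argument from the text
val = j0_exact(c, 90)                # 90 terms: truncation error is negligible
print(float(val))                    # tiny: c is extremely close to a zero of J0
```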
\begin{verbatim}
(Jm(n,z,w):= /* w is number of bits, not decimal digits */
 block([?fpprec:w,x,y,u,t,s,k,z2],
  z2:z^2,
  x:z^n, y:-z^2/4, u:n!, t:(x/u)/2^n, s:t,
  k:1,
  while true do
   (if (abs(t) 2*k*(k+n)) do
    (t:t*y, t:t/k, t:t/(k+n), s:(s+t), k:k+1),
  s)$
\end{verbatim}
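A Python sketch of the same Taylor-series loop (the function name and the
specific stopping test are our own choices, standing in for the w-bit
criterion):

```python
from math import factorial

def jn_taylor(n, z, bits=53):
    """J_n(z) by its Taylor series:
       sum_{k>=0} (-1)^k (z/2)^(n+2k) / (k! (n+k)!)."""
    t = (z / 2.0)**n / factorial(n)   # k = 0 term
    s = t
    y = -z * z / 4.0                  # successive terms: t *= y/(k*(k+n))
    k = 1
    while True:
        t = t * y / (k * (k + n))
        s += t
        # stop once terms are decreasing in magnitude and below target accuracy
        if z * z < 2.0 * k * (k + n) and abs(t) < abs(s) * 2.0**(-bits):
            return s
        k += 1

print(jn_taylor(0, 1.0))   # about 0.7651976865579666
```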