So how did I end up creating (more likely reinventing) a rootfinding algorithm?

I was creating a series of graphs illustrating regions of optimality for various multiprocessor mutual-exclusion algorithms. Since some of the cost functions involved a very slowly converging Poisson series (more than 10,000 terms required for convergence in some cases), I needed to use greater precision in order to check for roundoff errors.

So I converted the version of Brent's method in "Numerical Recipes" by Press et al. to Macsyma. I chose Macsyma because it can solve for several of the breakeven points analytically, and because Macsyma's bfloat type allows arbitrarily large user-specified precision for the breakeven points that must be solved for numerically.

Unfortunately, I am very used to coding in C, so I coded one of the assignment statements as "e=d" instead of the correct "e:d". Of course, in C, the former is an assignment statement while the latter is a syntax error (unless preceded by a valid "?" clause). In Macsyma, both are legal: the first merely asserts equality, and is more or less analogous to C's "e==d", while the latter is assignment. So, my initial implementation of Brent's method omitted a critical assignment statement.

Surprisingly, the resulting algorithm converged to roots in all cases that I tried. Unfortunately, it would often converge to meaningless roots that lay outside the interval of interest. Since I needed to locate thousands of roots, even occasional failures to find the desired root were completely unacceptable. I reviewed my code and compared it to the FORTRAN, PASCAL, and C versions of the "Numerical Recipes" code many, many times, and consistently failed to see the "=" that should have been a ":".

Of course, at this point I should have asked someone else to look at the code. They undoubtedly would have instantly spotted the error, and saved me considerable time and trouble. However, this effort was part of a late-at-night hobby, when I was often not as alert as I might be and when reviewers were not readily available.

So, instead, I became quite annoyed with the rootfinding method, and, filled with the certainty that I could do much better on my own, I hammered out a fairly straightforward algorithm.

It was obvious that Brent's method was keeping a minimum of information about prior function evaluations. So I considered ways of maintaining and efficiently using more history. I attacked what appeared to me to be weaknesses in Brent's algorithm:

- At most three points are kept from one iteration to the next, and all of these are needed to do the inverse quadratic interpolation step. However, one can easily picture cases where three points are needed *on the same side of the root* for good convergence, and where a fourth point is needed to keep track of the other side of the interval known to contain the root.
- Bisection/interpolation decisions are made based only on the current iteration's rate of convergence. One can easily picture cases where more laissez-faire convergence criteria should be used over a small number of iterations, with more draconian criteria in use longer term.
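For reference, the inverse quadratic interpolation step that Brent's method performs with its three retained points can be sketched as follows. This is a Python illustration of the standard formula (fit x as a quadratic in y through three points and evaluate at y = 0), not code from either implementation:

```python
def inverse_quadratic(a, fa, b, fb, c, fc):
    """Inverse quadratic interpolation: fit x as a quadratic in y
    through (fa, a), (fb, b), (fc, c), and return its value at y = 0,
    i.e. the estimated root.  Assumes fa, fb, fc are pairwise distinct."""
    # Lagrange form of the interpolating polynomial x(y), evaluated at y = 0.
    return (a * fb * fc / ((fa - fb) * (fa - fc))
            + b * fa * fc / ((fb - fa) * (fb - fc))
            + c * fa * fb / ((fc - fa) * (fc - fb)))
```

For a linear function such as f(x) = x - 2, the step recovers the root exactly; for well-behaved nonlinear functions it typically converges much faster than bisection once the bracket is small.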

For the second weakness, I maintain a table of past lengths of the zero-crossing interval. Instead of making an interpolation/bisection decision based only on the current iteration, this decision is made based on the convergence over a specified number of iterations. For the functions tested, the best results were obtained by requiring that the convergence rate be at least a factor of 1.9 per iteration, measured over eight iterations. If this long-term rate of convergence is maintained, interpolation is used; otherwise, bisection is used. This means that interpolation is used unconditionally for the first eight iterations, until the history table fills.
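The decision rule just described might be sketched like this in Python. This is my own reconstruction from the description above; the function names, the deque-based history table, and the exact exponent are assumptions, not the original Macsyma code:

```python
from collections import deque

def make_history_checker(table_size=8, min_ratio=1.9):
    """Track the last `table_size` bracket widths; recommend interpolation
    only while the bracket has shrunk by at least `min_ratio` per iteration,
    measured across the whole table.  (Hypothetical reconstruction of the
    rule described in the text.)"""
    history = deque(maxlen=table_size)

    def use_interpolation(width):
        history.append(width)
        if len(history) < table_size:
            return True  # table not yet full: interpolate unconditionally
        oldest = history[0]
        # Oldest and newest widths span (table_size - 1) iterations, so
        # require oldest/width >= min_ratio ** (table_size - 1).
        return oldest >= width * min_ratio ** (table_size - 1)

    return use_interpolation
```

With a small table this is easy to trace: widths 8, 4, 2 fed to a size-3 table with ratio 2.0 keep recommending interpolation, but a follow-on width of 1.5 (less than a factor-of-2 shrink) flips the recommendation to bisection.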

The resulting algorithm performs surprisingly well. On the functions tested at 20-decimal-digit accuracy, "Bent" required no more than one more evaluation than Brent, and in some cases required more than 50 *fewer* evaluations. "Bent" required no more than five additional evaluations compared to bisection, while Brent required more than 60 additional evaluations in some cases.

At 50-decimal-digit accuracy, "Bent" required at most 54 more evaluations
than did Brent, but in a couple of cases required more than 100 *fewer*
evaluations. "Bent" required no more than five additional evaluations
compared to bisection, while Brent required more than two hundred additional
evaluations in one case.

"Bent" uses more complex decision criteria, so each iteration is more expensive. This means that Brent can perform better for inexpensive functions. However, for expensive, ill-behaved functions, "Bent" can give much better results.

Bad choices of history-table size and convergence ratio can cause "Bent" to perform very poorly. For two extreme examples, using a ratio of 2.0 and a table size of one would force bisection, while using a ratio of 1.0 or an extremely large table size would allow the extremely slow convergence that pure interpolation can exhibit.

The functions tested are as follows:

         Interval     Function
case1    [0,3]        (x-1.b0)*(1.b0+(x-1.b0)^2)
case2    [0.9,2]      x^2-1.b0
case3    [0,3]        -1.b0+x*(3.b0+x*(-3.b0+x))
case4    [0,3]        (x-1.b0)*exp(-1.b0/(x-1.b0)^2)
func1    [0,100]      -cos(x)+1.b-2*x
func2    [0,2]        sin(x)-cos(x)
func3    [0,2]        x*x*x - 2.b0
func4    [0,2]        x*x*x*x*x*x - x - 1.b0
func5    [-0.2,0.5]   ((x^2 / 120.b0) + 50.b0 * x / 3.b0) * x^2

Cases 1-4 are taken from "An Introduction to Numerical Analysis" by Atkinson, pp. 77-78 (Wiley, ISBN 0-471-02985-8). Func1-5 are examples that I made up. Since "Bent" performs better relative to Brent on the textbook cases than on my made-up examples, one can argue that, once again, I should have kept my imagination on a short leash and stuck with the textbooks! Nonetheless, my rationale for each of func1-5 was as follows:

- func1 has lots of zero-crossings both within and outside of the interval. The idea was to find a simple function that caused (my broken version of) Brent to converge outside of the interval.
- func2 is a smooth but relatively expensive function.
- func3 is smooth and cheap.
- func4 has very large third derivatives.
- func5 has a very large third derivative combined with a root of its fourth derivative.
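To make the functions concrete, here is a Python transcription of the five made-up examples (with Macsyma's bigfloat literals such as 1.b0 rendered as ordinary doubles), together with a plain bisection routine of the kind used as the baseline. This is an illustrative double-precision sketch, not the original 20- or 50-digit Macsyma harness:

```python
import math

# func1-func5 from the table above, each paired with its bracketing interval.
# Transcribed to double precision for illustration only.
functions = {
    "func1": (lambda x: -math.cos(x) + 1e-2 * x, 0.0, 100.0),
    "func2": (lambda x: math.sin(x) - math.cos(x), 0.0, 2.0),
    "func3": (lambda x: x**3 - 2.0, 0.0, 2.0),
    "func4": (lambda x: x**6 - x - 1.0, 0.0, 2.0),
    "func5": (lambda x: ((x**2 / 120.0) + 50.0 * x / 3.0) * x**2, -0.2, 0.5),
}

def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection: repeatedly halve the interval known to bracket
    a sign change until it is narrower than tol."""
    flo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) * flo <= 0.0:
            hi = mid            # sign change in [lo, mid]
        else:
            lo, flo = mid, f(mid)  # sign change in [mid, hi]
    return 0.5 * (lo + hi)
```

For example, bisecting func3 on [0,2] yields the cube root of 2, and func2 yields pi/4; each halving gains one bit of accuracy, which is why the bisection columns in the table below scale so directly with the requested precision.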

The performance of "Bent" compared to a debugged version of Brent's algorithm and to bisection, in terms of number of function evaluations, is as follows:

        20 Digits Precision        50 Digits Precision
        Bisect  Brent  Bent        Bisect  Brent  Bent
case1     37     10     10           137     12     11
case2     36      8      8           136     10     10
case3     37     72     32           137    182     64
case4     37     12     11           137     12     11
func1     42     15     14           142     17     16
func2     37      8      8           136      9     11
func3     37      9     10           136     11     12
func4     37     13     12           136     15     69
func5     35     98     40           135    401    140
total    335    245    145          1232    669    344

The numbers in the "total" row must be taken with a grain of salt, since they weight the ill-behaved functions very heavily.

For the curious, source code for Brent, "Bent", and the code used to compare them is available:

- bisection
- Brent's algorithm
- "Bent"
- Code to compare Brent and "Bent"
- Code to evaluate bisection
- Functions used to compare Brent and "Bent"

I suppose that my buggy implementation of Brent's algorithm might be of some interest, but I have long since deleted it.

So, let this be a lesson to you! If the textbook versions of the algorithms are not working, it is probably because of your implementation! However, I cannot complain too much. Coming up with "Bent" was quite an interesting experience, and I learned a lot about rootfinding while designing and implementing it.