Where I Write About Anything And Everything
Root finding methods are an important tool to have in any mathematics library. For those who don't know, root finding methods are, by definition, methods that find the zeros (or roots) of some arbitrary function. That is, given a function and some initial parameters, these methods will find a solution that, when plugged back into the function, yields zero.
This might sound like a rather specific case, but this problem of finding zeros can be generalized to more exotic problems. For example, computing the inverse of a given function, or finding the intersection point between two curves, can both be obtained using root finding. Furthermore, despite their practical applications, root finding methods can generate beautiful fractals when applied to the complex plane.
The problem with root finding, then, is choosing a method, because there are a lot of them, and as is typical in computer science, different methods tend to fulfill different needs. From the research I have gathered, there appear to be two distinct groups of root finding methods: bracketing methods and non-bracketing methods.
Bracketing methods, as the name suggests, require a search bracket to be specified in order to function. The bracket in question must contain at least one zero. The primary advantage of bracketing methods is that they are guaranteed to converge to one of the zeros inside the bracket. The disadvantage is that they tend to converge rather slowly. They also require the end user to have some knowledge of the function being inverted.
The simplest root finding method is also a bracketing method, known as Bisection. This method divides the search space in half, keeping whichever half is known to contain a root as the new search space. It repeats this process until the size of the search space becomes insignificant. That's it! It's extraordinarily simple and easy to implement, but it's also the slowest to converge. In practice, though, this is not as big an issue as it sounds. Unless you are developing a real-time system, or need to invert several different functions, there is nothing wrong with Bisection.
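To make the idea concrete, here is a minimal Python sketch of Bisection. This is my own illustration, not the Vulpine Core Library implementation; the name `bisection` and the tolerance defaults are my choices.

```python
def bisection(f, a, b, tol=1e-12, max_iter=200):
    """Find a root of f in [a, b], assuming f(a) and f(b) have opposite signs."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        m = (a + b) / 2.0
        fm = f(m)
        # Keep whichever half of the bracket still contains a sign change.
        if fa * fm <= 0:
            b, fb = m, fm
        else:
            a, fa = m, fm
        if (b - a) < tol:
            break
    return (a + b) / 2.0
```

For example, `bisection(lambda x: x * x - 2.0, 0.0, 2.0)` homes in on the square root of two, since that is where x² − 2 crosses zero.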
False Position is another bracketing method, which tries to improve upon Bisection. Instead of always dividing the search space at the midpoint, it tries to be a bit more intelligent in choosing where to split, using the point where the straight line through the bracket endpoints crosses zero. This leads to much faster convergence than Bisection in most cases; however, it is possible to find ill-behaved functions where False Position actually performs worse. Others have proposed alterations, or hybrid methods, to try to alleviate this problem. However, even when False Position is at its best, it is still not as fast as the non-bracketing methods.
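A minimal sketch of the idea, under the same assumptions as before (my own naming and tolerances, not the library's code):

```python
def false_position(f, a, b, tol=1e-12, max_iter=200):
    """Find a root of f in [a, b] by repeatedly splitting at the secant intercept."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    c = a
    for _ in range(max_iter):
        # Split at the x-intercept of the line through (a, fa) and (b, fb).
        c = b - fb * (b - a) / (fb - fa)
        fc = f(c)
        if abs(fc) < tol:
            break
        # Re-bracket so the sign change is preserved.
        if fa * fc < 0:
            b, fb = c, fc
        else:
            a, fa = c, fc
    return c
```

Note that only one endpoint may ever move for some functions, which is exactly the slow, ill-behaved case mentioned above.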
Ridder's Method is yet another bracketing method, which takes this idea even further. Where False Position uses a straight line to determine where to subdivide the search bracket, Ridder's Method uses an exponential curve. This makes Ridder's Method converge very fast, reaching speeds comparable to non-bracketing methods while still guaranteeing convergence. It also worked reasonably well for all the functions I tested it with; however, there could be some function for which it also converges slowly, much like False Position.
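For the curious, here is a rough Python sketch of one common formulation of Ridder's Method (again my own illustration, not the Vulpine Core Library code; names and tolerances are assumptions):

```python
import math

def ridders(f, a, b, tol=1e-12, max_iter=100):
    """Find a root of f in [a, b] using Ridder's exponential interpolation."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        m = 0.5 * (a + b)
        fm = f(m)
        # The exponential factor that "straightens" the function before
        # applying a false-position-style step.
        s = math.sqrt(fm * fm - fa * fb)
        if s == 0.0:
            return m
        x = m + (m - a) * (math.copysign(1.0, fa - fb) * fm / s)
        fx = f(x)
        if abs(fx) < tol or abs(b - a) < tol:
            return x
        # Re-bracket using whichever pair of points keeps the sign change.
        if fm * fx < 0:
            a, fa, b, fb = m, fm, x, fx
        elif fa * fx < 0:
            b, fb = x, fx
        else:
            a, fa = x, fx
    return x
```

Each pass through the loop costs two function evaluations (at the midpoint and at the interpolated point), which matters for the iteration counts discussed later.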
Non-bracketing methods do not use a search bracket to find roots, instead they rely upon an initial guess or guesses to find the root. (Note that the guesses need not bracket the root). They also sometimes rely on additional information about the function to be inverted. Such methods usually converge extremely fast, when they do converge. Unfortunately, unlike the bracketed methods, convergence is not guaranteed. They can also be extremely sensitive to initial conditions, the initial guesses given at the start of the algorithm. It is this extreme sensitivity, however, that gives rise to beautiful root-finding fractals.
Newton's Method is perhaps the definitive root finding method, next to Bisection, and it is a non-bracketing method. It requires only a single initial guess, but involves the derivative of the function in its computation. This is most useful when the derivative can be worked out ahead of time and is simple to compute, which is especially true when inverting integrals. Although it is rather sensitive to initial conditions, it almost always converges, which leads to some rather interesting fractals, as seen in Image-1.
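The method itself is only a few lines; this is a minimal sketch with my own naming, where `df` is the derivative supplied by the caller:

```python
def newton(f, df, x0, tol=1e-12, max_iter=100):
    """Find a root of f near x0, given its derivative df."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        dfx = df(x)
        if dfx == 0.0:
            raise ZeroDivisionError("derivative vanished; try another guess")
        # Follow the tangent line at x down to where it crosses zero.
        step = fx / dfx
        x -= step
        if abs(step) < tol:
            break
    return x
```

For instance, `newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, 1.0)` converges to the square root of two in a handful of iterations.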
The Secant Method tries to improve upon Newton's Method by removing the derivative requirement. Instead, it requires the user to provide an additional guess for the root to make up for the lost information. The Secant Method is about what you would expect to get if you fed an approximation of the derivative into Newton's Method; however, the Secant Method improves upon this approximation with each iteration, so it is always better to use the Secant Method when the derivative is unknown.
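Sketched in the same style as the Newton example above (my own naming and tolerances):

```python
def secant(f, x0, x1, tol=1e-12, max_iter=100):
    """Find a root of f starting from two initial guesses x0 and x1."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if f1 == f0:
            break  # flat secant line; cannot take another step
        # Same update as Newton's Method, with the derivative replaced by
        # the slope of the secant through the two most recent points.
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        x0, f0 = x1, f1
        x1, f1 = x2, f(x2)
        if abs(x1 - x0) < tol:
            break
    return x1
```

Each new iterate replaces the older of the two points, which is how the derivative approximation improves as the method runs.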
Muller's Algorithm is a further improvement to the Secant Method, in much the same way that Ridder's Method improves upon False Position. It requires three initial guesses, instead of two, and supposedly converges even faster. Unfortunately, I could not get it to converge, despite my best efforts. My best guess is that while it converges faster for some inputs, it is even more sensitive to initial conditions, failing to converge in most cases. For this reason and others, I personally did not find it to be of practical use.
The Brent-Dekker Method is a unique case, in that it tries to be the best of both worlds, keeping the guaranteed convergence of bracketing methods while achieving speeds similar to, or even exceeding, those of non-bracketing methods. The implementation is exceedingly complicated (too much to discuss here), but apart from that it appears to have no downsides. It is also the root finding method of choice for numerical packages such as Matlab, SciPy, Boost, and R.
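Since the implementation is too involved to reproduce here, in practice you would usually call an existing implementation rather than write your own. SciPy, for example, exposes Brent's method as `scipy.optimize.brentq` (the function and interval below are just an illustration):

```python
# Requires SciPy (pip install scipy).
from scipy.optimize import brentq

# Find the root of x^3 - x - 2 on [1, 2]; f(1) = -2 and f(2) = 4, so the
# interval brackets a sign change, which brentq requires.
root = brentq(lambda x: x**3 - x - 2.0, 1.0, 2.0)
```

Like the other bracketing methods, it demands a sign change over the interval, but converges far faster than Bisection on well-behaved functions.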
Thus far, I have implemented all of the above methods (excluding Muller's Algorithm) in the Vulpine Core Library. I picked out six different functions of varying complexity with which to test how quickly each method converged. I explicitly excluded Newton's Method from these tests, due to the difficulty of computing the derivative. For the Secant Method, the initial guesses are the endpoints of the function's range; for all other methods, the search bracket is the entire range. If you would like to check the veracity of my work, you can find the source code I used to test all of these methods on GitHub.
I have combined the results from all six runs into a single chart for convenience (see Chart-1). Here you can see the relative error of each algorithm plotted against the number of iterations for which the algorithm has been running.
The X-coordinates indicate the number of iterations for which the algorithm has been running, and can be thought of as a measurement of time. For the purposes of these tests, one iteration corresponds to a single function evaluation. Thus, algorithms like Ridder's Method, which performs two function evaluations per loop, are only evaluated at every even number of iterations.
The Y-coordinates correspond to the negative logarithm of the output error. I used the relative error between successive iterations to evaluate how fast each method was converging. Because these values are very small (near zero), I scaled them with the negative log so that more of the detail is visible. Thus, the steepness of the curve indicates how quickly the corresponding algorithm converges.
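For concreteness, the metric described above can be sketched as follows. This is a hypothetical helper of my own devising, not the actual test harness; the name `scaled_error` and the tiny floor value (there to avoid division by zero and log of zero) are assumptions.

```python
import math

def scaled_error(prev, curr, floor=1e-300):
    """Relative change between successive estimates, on a -log10 scale.

    Larger return values mean smaller error, so a steep upward curve on
    the chart corresponds to fast convergence.
    """
    rel = abs(curr - prev) / max(abs(curr), floor)
    return -math.log10(max(rel, floor))
```

For example, two successive estimates of 1.0 and 1.001 differ by roughly one part in a thousand, so the scaled error comes out near 3 (that is, about three correct digits).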
From these tests one can tell that the bracketing methods tend to converge in logarithmic time, while the non-bracketing methods (Secant and Brent) tend to converge much faster in log-logarithmic time. The real oddity though is Ridder's method, which seems to converge in logarithmic time for most functions, similar to the other bracketing methods, but in a few cases a log-logarithmic curve produced a better fit for the data.
The distribution of the data showed that the most stable algorithm, by far, was Bisection, despite also being the slowest. This makes sense, given that bisection always divides the search space in half at every iteration. Ridder's method also achieved a very high R-value, however due to its aforementioned unusual behavior, I am somewhat dubious about this measurement.
The next most stable algorithm was the Brent-Dekker Method. It seemed to perform extremely well for all of the functions in my test suite, while the Secant Method, which achieved comparable results for some functions, tended to vary much more dramatically. False Position performed similarly, tending to vary greatly in performance based on the function being inverted.
From these results, and my research done so far, I can conclude that the Brent-Dekker Method is the best general-purpose root finding algorithm you could ask for, if you don't mind the complexity. It seems like those Matlab and R guys were really on to something! Ridder's Method also seems to be a good choice; however, there are just a few too many oddities about it for me to wholeheartedly recommend it.
On the other hand, if speed is not a critical issue, there is nothing wrong with good old Bisection. It's safe, reliable, easy to implement, and just works.