Binary Splitting

By Alexander Yee

 

(Last updated: August 31, 2018)

 

 

 

Back To:

 

y-cruncher can compute a variety of constants and simple functions. Most of these involve an infinite series of some sort. In order to efficiently sum up these series, it uses a technique called Binary Splitting.

 

This page won't reiterate the Binary Splitting tutorial in the above link. Instead, this page will cover y-cruncher's uses and implementations of the method.

 

As of version 0.7.6, y-cruncher has 4 different Binary Splitting implementations:

In the past, there were separate implementations for every single constant and algorithm. But multiple refactorings and cleanups have consolidated most of them under the same generic implementation.

 

 

 

CommonP2B3 - The Standard 3-Variable Recursion

 

This is the standard 3-variable Binary Splitting recursion that appears almost everywhere. The reason for this is because it covers the bulk of hypergeometric series while still being relatively simple. The vast majority of the popular constants can be computed with this type of series.

 

The name "CommonP2B3" is what is used in y-cruncher's source code. "B3" stands for the number of variables in the recursion. "P2" is the number of variables needed to obtain the final result.

 

 

Scope:

 

Mathematically, the CommonP2B3 recursion in y-cruncher evaluates the following series:

where P(k), Q(k), and R(k) are arbitrary functions of k that return a real integer. The only restrictions are that they are easy to evaluate and the integer produced is reasonably small. In practice, all 3 of them will be polynomials with word-sized integer coefficients. Furthermore, Q(k), and R(k) will usually completely factorize.

 

Given that P(k), Q(k), and R(k) can be near arbitrary polynomials, this series is extremely general and allows summation of a large portion of hypergeometric functions.

 

As of y-cruncher v0.7.7, the CommonP2B3 recursion is used for all of the following series:

Because of the generality of this pattern, there is some interest in exposing it as a feature in manner similar to PiFast's "user constants". This would allow y-cruncher to compute a lot more than just the hard-coded constants. Ideally, the user only needs to input the 3 polynomials for the series and some finishing sequences. y-cruncher should do the rest.

 

But this is easier said than done. While it's not difficult to automatically determine things like rate of convergence or the number of terms needed to reach X precision, it is much more difficult to sanitize user-input and enforce restrictions that if violated, can cause the program to break or fail in subtle and unexpected ways.

 

 

The Recursion:

 

The Binary Splitting recursion for the series is as follows:

 

 

 

Restrictions:

 

To actually use this recursion to sum up an infinite series, the following restrictions will usually apply:

In practice, both Q(k) and R(k) will have no zeros for k >= 1. So both polynomials will be at their asymptotic behavior for the range that the series cares about.

 

 

Convergence:

 

If the goal is to sum up an infinite series, we must understand the convergence behavior.

 

Assuming P(k), Q(k), and R(k) are all polynomials with real coefficients, we can enumerate all the possibilities:

All of the above assume that Q(k) and R(k) are non-zero for all integer k > 0.

y-cruncher will blow up on any of these cases even if they are mathematically sound.

 

 

Computational Speed:

 

Evaluating the computational speed of the series will depend on whether the series is linearly or superlinearly convergent. Superlinearly convergent series are less common and harder to analyze. So instead we'll focus on linearly convergent series.

 

Assuming P(k), Q(k), and R(k) are well-behaved polynomials with small and real integer coefficients, the relative asymptotic cost to summing up the series is given by:

Where:

 

Things to note:

Both the convergence rate and the polynomial degree are equally important. While many academic papers focus on finding fast converging series in terms of the number of bits or digits per term, they often do not weigh it against the cost of each individual term when estimating the actual speed of the series.

 

 


Optimizations

 

Since the polynomial degree D is so important to the speed of the series, there is a lot to gain by reducing it. One way to that is to remove common factors. Fundumentally, Binary Splitting is just a way of symbolically summing up a series of rationals. So it make sense to remove common factors between the numerators and denominators.

 

For the CommonP2B3 series, factors that are common to all 3 variables (P, Q, and R) can be removed. This applies to both constants as well as polynomial factors. Removal of polynomial factors will provide proportional speedups. Removal of constants will provide speedups for small computations where k doesn't get very large.

 

Here, we will discuss 3 ways to remove common factors:

  1. Remove at the recursion end-points.
  2. Remove across adjacent terms.
  3. Remove globally.

 

Removal of Factors at End-Points:

It's best to remove these factors at the recursion end-points so you don't waste any computation on them. Let's use the Lima-Guillera formula for Catalan's Constant as an example.

 

Original series:

Unpack the binomial into its factorials.

Unpack the factorials and exponentials into products:

Rearrange the formula to fit it into the CommonP2B3 series form.

 

 

Remove factors common to all 3 variables:

 

And just like that, we've cut out a third of the cost. This optimization is almost always done because it's easy to do and has no strings attached.

 

Removal of Factors across Adjacent Terms:

Sometimes, factors that cannot be removed at the end-points can be removed by combining adjacent terms. Let's do that to the example above:

 

 

Now we can see that the term (2k+1)3 is common to all 3 parameters. Once we remove it and do an index substitution, we reach:

 

 

The cost now drops to 8.656, which is another third faster from before. Note that since this can be applied to arbitrary pairs of adjacent terms, you effectively have an "infinite chain" of common factors that can be removed. So while you can merge an arbitrary number of adjacent terms, you will get increasingly complicated polynomials at the recursion end-points.

 

If such cases arise, it is worth examining the formula itself to see if anything can be optimized. If we can "shift" the index of Q so that it "lines up" with R...

 

Start with the original formula with the binomials unpacked:

Observe that the (2k+1)3 term can be merged into the ((2k)!)3:

Unpack the factorials:

Rearrange the formula to fit it into the CommonP2B3 series form.

 

Remove factors common to all 3 variables:

 

Now we reach our final solution at a cost of 5.771. By merging away the (2k+1)3 term, it shifts the entire Q variable so that the common factors which were originally in adjacent terms, are now in the same term. This allows them to be eliminated by end-point factorization.

 

Adjacent term factorization is not always applicable. Catalan Lima-Guillera happens to be one of the few. But it does illustrate how slight modifications a formula that's taken fresh out of a publication can lead to large differences in performance.

 

Removal of Factors Globally:

In most cases, there are many more factors that cannot be eliminated by either of the methods above. This is because the common factors come from different values of k which are "far apart". Eliminating these factors is more complicated.

 

One of such approaches is to keep the prime-factorizations of Q(k) and R(k) along with their normal representations. Then at each stage of the recursion, compute the GCD from the prime factorization and remove it by division. This method is colloquially called, "GCD Factorization".

 

When applied to Pi Chudnovsky, speedups of 20 - 30% have been reported.

 

y-cruncher currently does not do GCD factorization for various reasons. It's been on the radar for years, but has been put off repeatedly.

 

Special Cases:

This last optimization doesn't have anything to do with removing factors. It's about removing entire recursion variables!

For certain inputs, the 3-variable recursion is unnecessary and it's possible to do away with one or more variables.

 

exp(1) and exp(-1) both fit the CommonP2B3 pattern, however R(k) = 1. This completely eliminates the need for the variable thereby reducing both its complexity and increasing its speed. Therefore y-cruncher has a specialized implementation for it.

 

For BBP-type formula in base 2, Q(k) becomes a power-of-two multiple of R(k). This power-of-two can be absorbed into the P(k) polynomial as the floating-point exponent, thereby eliminating the need for the Q(k) variable. This special case is handled in the next section.

 

 

Binary BBP - A 2-Variable Recursion for BBP-type Formula

 

As mentioned in the previous section, BBP-type formula can be evaluated faster than plugging it into the CommonP2B3 series. This "BinaryBBP" pattern is similar to the CommonP2B3 pattern, but with only 2 variables and the types of the variables have been relaxed from integers to scaled integers. (aka floating-point with no rounding)

 

 

Scope:

 

Mathematically, the BinaryBBP recursion in y-cruncher evaluates the following series:

where P(k) and Q(k) are arbitrary functions of k that return a real integer and r is a positive integer. Again, the only restrictions are that they are easy to evaluate and the integer produced is reasonably small. In practice, P(k) will be a constant and Q(k) will be a power of a 1st order polynomial with small coefficients.

 

As of y-cruncher v0.7.7, the only series that uses this pattern is Catalan Huvent - which has been superceded by the newer and faster formulas. So in a way, BinaryBBP isn't used anymore. But since it's written in the same generic manner as CommonP2B3, it's here to stay for now in case y-cruncher decides to pick up any of the numerous BBP-type formula out there for various things.

 

 

The Recursion:

 

The Binary Splitting recursion for the series is as follows:

 

 

All numbers will need to be represented as scaled integers with a base 2 exponent. So the addition in the P variable will be a shift-add.

 

 

Restrictions:

Not too many rules here since there is a lot less room for things to break.

 

 

Convergence:

 

Provided the restrictions above are followed, the BinaryBBP series is always linearly convergent.

 

 

Computational Speed:

 

Again assuming P(k), and Q(k) are well-behaved polynomials with small and real integer coefficients, the relative asymptotic cost to summing up the series is given by:

Where:

Things to note:

 

Optimizations:

 

Since the BinaryBBP series is simpler than the CommonP2B3 series, there's a lot less room for optimizations. Removal of common factors applies to the BinaryBBP series just as it does with the CommonP2B3. But in practice, P(k) will be a constant. So there really isn't any thing that can be optimized.

 

The BinaryBBP series is itself an optimization of the more general CommonP2B3 series.