y-cruncher - Language and Algorithms

By Alexander J. Yee

 

(Last updated: June 14, 2013)

 

 

Back To:

Source Code (as of v0.6.2):

 

General Information:

Libraries and Dependencies:

Compilers:

Other Internal Requirements:

 

Overall Design:

 

Formulas:

y-cruncher has two algorithms for each major constant that it can compute - one for computation, and one for verification.

 

All complexities shown assume multiplication to be O(n log(n)). It is slightly higher than that, but for all practical purposes, O(n log(n)) is close enough.

 

Square Root of n and Golden Ratio

1st order Newton's Method - Runtime Complexity: O(n log(n))
Note that the final radix conversion from binary to decimal has a complexity of O(n log(n)2).
e

Taylor Series of exp(1): O(n log(n)2)


Taylor Series of exp(-1): O(n log(n)2)


 
Pi

Chudnovsky Formula: O(n log(n)3)


Ramanujan's Formula: O(n log(n)3)


 
Log(n)

Machin-Like Formulas: O(n log(n)3)

Starting from v0.6.1, y-cruncher is able to generate and utilize machin-like formulas for the log of any small integer.
Zeta(3) - Apery's Constant

Amdeberhan-Zeilberger Formula 2: O(n log(n)3)



Amdeberhan-Zeilberger Formula 1: O(n log(n)3)



Catalan's Constant

Lupas Formula: O(n log(n)3)


Huvent's Formula (v0.6.2 and later): O(n log(n)3)



rewritten as

 

Ramanujan's Formula (v0.5.5 and earlier): O(n log(n)3)



Euler-Mascheroni Constant

Brent-McMillan with Refinement: O(n log(n)3)



                          


Brent-McMillan (alone): O(n log(n)3)



                 

Note that both of these formulas are essentially the same.
Therefore, in order for two computations to be independent enough to qualify as a verified record, they MUST be done using different n.

For simplicity of implementation, y-cruncher only uses n that are powers of two - which serves a dual purpose of allowing the use of (the easily computed) Log(2), as well as lending itself to shift optimizations.

 

Arithmetic Algorithms:

 

Multiplication:

y-cruncher's set of algorithms for multiplication is pretty standard at the low-end, but unconventional at the high-end.

 

Standard libraries like GMP avoid the use of floating-point FFTs due to the difficulty of proving bounds on round-off error. But y-cruncher fully exploits floating-point algorithms and ensures their accuracy by means of conservative parameter selection and algorithmic error-detection. Although this is not proven to be correct for all inputs, the probability that y-cruncher's multiplication returns an undetected incorrect value is less than the probability of a silent hardware error.

 

So in that sense, y-cruncher is arguably more reliable than other implementations that lack algorithmic error-detection.

 

Algorithms used in y-cruncher v0.6.1: (Thresholds are approximate and vary between the different tunings)

Algorithm Complexity Threshold (decimal digits)
x86 x64
Basecase (grade-school multiplication) - -
Karatsuba 130 300
Floating-Point Fast Fourier Transform (FFT)
w = # bits in mantissa
500 700
Hybridized Number-Theoretic Transform (Hybrid NTT) Better than FFT, but worse than SSA 4,800,000 100,000
Vector-Scalable/Very-Stupid Transform (VST) Pending analysis ~41,000,000,000

Notes:

 

Algorithms not used by y-cruncher:

 

Several popular algorithms that are used in other bignum programs are not used by y-cruncher.

Algorithm Complexity Comments
Toom-Cook 3-way

Toom-3 and higher is never optimal. Floating-point FFT completely dominates the entire range.

Schönhage-Strassen Algorithm
(SSA)

This is the flagship algorithm of GMP. However, it is not used in y-cruncher because the Hybrid NTT beats it on the large sizes, while Floating-point FFT utterly destroys it on the small end.

Number-Theoretic Transform
(NTT over multiple primes)
~ O(n log(n))

This is the large-size algorithm used in many Pi programs including: PiFast, QuickPi, TachusPi, etc...

y-cruncher doesn't use it because the Hybrid NTT beats it for small sizes and the VST algorithm (most likely*) beats it on the large sizes.

 

*Currently, I don't have a working optimized implementation of NTT over primes to confirm that it is actually slower than VST. However, estimates based on operation counts and micro-benchmarks are favoring VST - and becoming more so as current processors are trending towards increased SIMD floating-point throughput.

 

NTT over primes is a thorny algorithm to implement because it needs significant amounts of micro-optimization and platform-specific code to be efficient.

Nevertheless it's a very popular algorithm and I'm not sure how everyone else handles it. A basic implementation is easy, but there's a lot to gain by going all the way down to inline-assembly - which Visual C++ does not support on x64...

 

In any case, 64-bit NTT over primes will be necessary if y-cruncher is to go above 90 trillion digits - as even the VST algorithm fails above that size.

 

 

 

Inverse and Roots:

Reciprocals, division, and square roots are all done using 1st order Newton's Method.

 

Log(n):

Prior to v0.6.1, only log(2) and log(10) were supported using hard-coded machin-like formulas. A generic log(n) was needed for Ramanujan's formula for Catalan's constant. That was implemented using Arithmetic-Geometric Mean (AGM).

 

In v0.6.1, Ramanujan's formula for Catalan's constant was removed - thereby removing the need for a generic log(n). Instead, v0.6.1 will support computations of log(n) for any small integer n. This is done using a formula generator that generates (at run-time) machin-like formulas for arbitrary small integer n.

 

Generation of machin-like formulas for log(n) is done using table-lookup along with a branch-and-bound search on several argument reduction formulas.

 

 

Radix Conversion:

 

Series Summation:

Series summation is done using standard Binary Splitting techniques with the following catches:

This series summation scheme (including the skewed splitting and backwards summing) has been the same in all versions of y-cruncher to date. All of this is expected to change when GCD factorization is to be incorporated.