ycruncher  Language and Algorithms
By Alexander J. Yee
(Last updated:
December 31, 2013)
Back To:
Implementation (as of v0.6.3):
General Information:
 ycruncher is written in C++11. But significant portions of it will compile in C99.
 Some inline assembly is used.
 Intel SSE and AVX compiler intrinsics are used.
 As of December 22, 2013:
 ycruncher has about 200,000 lines of active handwritten code for a total size of approximately 6.7 MB.
 If inactive and autogenerated code is included, then the figures are 225,000 lines for about 7.5 MB.
 The external Digit Viewer has about 12,000 lines of code. But only 1,100 lines is for itself. The rest is shared with ycruncher.
 ycruncher itself is closed source, but the Digit Viewer is opensourced on my GitHub. It includes all the necessary dependencies to compile.
Libraries and Dependencies:
 Windows: WinAPI
 Linux: POSIX
 ycruncher has no other 3rd party dependencies.
Compilers:
 All Windows binaries are compiled using Microsoft Visual Studio 2012.
 All Linux binaries are compiled using GCC.
 The Intel Compiler is used in some versions of ycruncher. But it is not used in v0.6.2  v0.6.3.
Other Internal Requirements:
 Littleendian addressing
 Signfill (arithmetic) rightshift
 IEEE doubleprecision floatingpoint
Overall Design:
ycruncher's design is split into 3 major layers.
 Interface + Math Layer: (24% of total code: 47,000 lines, 214 files)
This is where all the constants and functions are implemented. This is done largely using very simple objectoriented programming structured in a mathematical way that most resembles the formulas/algorithms that are involved.
 Object Layer: (2% of total code: 4,000 lines, 13 files)
This is where all the "number" objects are defined. (Large integer, large floatingpoint, etc...)
This layer is just a simple class library that wraps the modules layer into a usable interface for bignum arithmetic.
 Modules Layer: (74% of total code: 149,000 lines, 720 files)
The Modules layer implements all the core operations needed by the upper layers. This is where most of the performance critical code is. Most of the code bloat in this module is due to performance optimizations.
Formulas:
ycruncher has two algorithms for each major constant that it can compute  one for computation, and one for verification.
All complexities shown assume multiplication to be O(n log(n)). It is slightly higher than that, but for all practical purposes, O(n log(n)) is close enough.
 Square Root of n and Golden Ratio
 1st order Newton's Method  Runtime Complexity: O(n log(n))
Note that the final radix conversion from binary to decimal has a complexity of O(n log(n)^{2}).
 e
 Taylor Series of exp(1): O(n log(n)^{2})
 Taylor Series of exp(1): O(n log(n)^{2})
 Pi
 Chudnovsky Formula: O(n log(n)^{3})
 Ramanujan's Formula: O(n log(n)^{3})
 Log(n)
 MachinLike Formulas: O(n log(n)^{3})
 Starting from v0.6.1, ycruncher is able to generate and utilize machinlike formulas for the log of any small integer.

 Zeta(3)  Apery's Constant
 AmdeberhanZeilberger Formula 2: O(n log(n)^{3})
 AmdeberhanZeilberger Formula 1: O(n log(n)^{3})
 Lemniscate
 Gauss Formula: O(n log(n)^{3})
 Sebah's Formula: O(n log(n)^{3})
Both of these formulas were pulled from: http://numbers.computation.free.fr/Constants/PiProgram/userconstants.html
Note that the AGMbased algorithms are probably faster. But ycruncher currently uses these seriesbased formulas because:
 It lacks an implementation of sqrt(N) for arbitrary N. It only supports N being a small integer.
 There already is a series summation framework that can be easily applied to the ArcSinlemn() function.
 Catalan's Constant
 Lupas Formula: O(n log(n)^{3})
 Huvent's Formula: O(n log(n)^{3})
ycruncher uses the following rearrangement of Huvent's formula:

 EulerMascheroni Constant
 BrentMcMillan with Refinement: O(n log(n)^{3})
 BrentMcMillan (alone): O(n log(n)^{3})
 Note that both of these formulas are essentially the same.
Therefore, in order for two computations to be independent enough to qualify as a verified record, they MUST be done using different n.
For simplicity of implementation, ycruncher only uses n that are powers of two  which serves a dual purpose of allowing the use of (the easily computed) Log(2), as well as lending itself to shift optimizations.
Arithmetic Algorithms:
Multiplication:
Like most bignum programs, ycruncher uses different multiplication algorithms for different sizes:
As of v0.6.3, ycruncher uses:
*Standard libraries like GMP avoid the use of floatingpoint FFTs due to the difficulty of proving bounds on roundoff error. But with modern SIMD, the performance of floatingpoint is through the roof. So rather than throwing all this computational power away, ycruncher utilizes it to the maximum extent.
ycruncher fully exploits floatingpoint algorithms and ensures their accuracy by means of conservative parameter selection and algorithmic errordetection. Although this is not proven to be correct for all inputs, the probability that ycruncher's multiplication returns an undetected incorrect value is less than the probability of a silent hardware error. So in that sense, ycruncher is arguably more reliable than other implementations that lack algorithmic errordetection.
**The last two of these algorithms (Hybrid NTT and VST) are homebrew algorithms that were developed between 2008  2012. Both of them are currently unpublished.
The following popular algorithms are currently not used by ycruncher:
With modern SIMD, the floatingpoint FFT asserts complete dominance over everything from 500 decimal digits up to the point where it no longer fits in CPU cache.
This completely displaces 3way ToomCook as well as a good portion of the sizes that are normally covered by SSA.
Of the three of these, the NTT over primes algorithm is the only one that ycruncher may conceivably pick up in the near future.
Inverse and Roots:
Reciprocals, division, and square roots are all done using 1st order Newton's Method.
 Prior to v0.6.1, their implementations were very basic and unoptimized.
 Starting v0.6.1, they take advantage of middleproduct and the reuse of redundant FFTs.
Log(n):
Prior to v0.6.1, only log(2) and log(10) were supported using hardcoded machinlike formulas. A generic log(n) was needed for Ramanujan's formula for Catalan's constant. That was implemented using ArithmeticGeometric Mean (AGM).
In v0.6.1, Ramanujan's formula for Catalan's constant was removed  thereby removing the need for a generic log(n). Instead, v0.6.1 supports the computation of log(n) for any small integer n. This is done using a formula generator that generates (at runtime) machinlike formulas for arbitrary small integer n.
Generation of machinlike formulas for log(n) is done using tablelookup along with a branchandbound search on several argument reduction formulas.
Radix Conversion:
 Prior v0.5.3, conversions were done using the standard Divide and Conquer algorithm.
 Starting from v0.5.3, they are done using Scaled Remainder Trees. This is the same approach that was used in the December 2009 world record for Pi.
 Starting from v0.6.1, the Scaled Remainder Tree algorithm is combined with the middleproduct and reuse of redundant FFTs.
Series Summation:
Series summation is done using standard Binary Splitting techniques with the following catches:
 Precision is tightly controlled to keep operands from becoming unnecessarily large.
 The cutting points for the binary splitting recursions are usually significantly skewed in order to reduce memory usage and improve loadbalance.
 Series summation is partially done backwards. The lowest terms (terms with the largest index) are summed first. The motivation and details for this is quite intricate and goes handandhand with the skewing. So I won't go into that.
This series summation scheme (including the skewed splitting and backwards summing) has been the same in all versions of ycruncher to date. All of this is expected to change when GCD factorization is to be incorporated.