Pi Computed to 22.4 Trillion Digits: (November 15, 2016)
I woke up this morning to what was quite possibly one of the bigger surprises I've ever seen. Peter Trueb, who had previously set records for the Lemniscate and Euler-Mascheroni Constants, had sent me an email with details of a fully verified computation of Pi to 22.4 trillion digits.
The exact number of digits is 22,459,157,718,361, which is precisely 10^12 * Pi^e rounded down. This smashes the previous record of 13.3 trillion digits set by "houkouonchi" back in 2014. The computation took 105 days from July to November. It was interrupted 3 times, but otherwise went through without any major issues.
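The digit count is easy to check independently. Here is a minimal sketch using only Python's standard `decimal` module: it computes Pi with Machin's formula, raises it to the power e, and floors 10^12 times the result. The helper names (`pi_decimal`, `record_digit_count`) are made up for this illustration and have nothing to do with y-cruncher's internals.

```python
from decimal import Decimal, getcontext, ROUND_FLOOR

def pi_decimal(digits):
    """Pi via Machin's formula: pi = 16*atan(1/5) - 4*atan(1/239)."""
    getcontext().prec = digits + 10          # working precision with guard digits
    eps = Decimal(10) ** -(digits + 8)

    def arctan_inv(n):
        # atan(1/n) = sum over k of (-1)^k / ((2k+1) * n^(2k+1))
        n = Decimal(n)
        n2 = n * n
        term = 1 / n                         # holds (-1)^k / n^(2k+1)
        total = term
        k = 1
        while abs(term) > eps:
            term = -term / n2
            total += term / (2 * k + 1)
            k += 1
        return total

    return 16 * arctan_inv(5) - 4 * arctan_inv(239)

def record_digit_count():
    """floor(10^12 * Pi^e), computed as exp(e * ln(Pi))."""
    pi = pi_decimal(40)
    e = Decimal(1).exp()
    pi_to_e = (e * pi.ln()).exp()
    scaled = pi_to_e * Decimal(10) ** 12
    return int(scaled.to_integral_value(rounding=ROUND_FLOOR))

print(record_digit_count())
```

Forty digits of working precision is far more than needed here, since only the first 14 significant digits of Pi^e matter.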
The hardware that was used was:
The 3.5 month run-time for 22 trillion digits is quite remarkable. Even though there have been several years of hardware and software improvements since the previous records, computations of this size have generally stagnated due to the inability of disk storage to keep up with Moore's Law in both size and performance.
Other notable and interesting facts:
On the software side, this is the first Pi record in 2 years. Since then, y-cruncher has gone through many changes from multiple refactorings, AVX2, the new parallel computing frameworks, new implementations of the large FFT algorithms, etc... - none of which had ever been tested at such large sizes. So this computation can be seen as somewhat of a validation of 2 years of work.
This is the first time that y-cruncher has been used to set a Pi record completely without my knowledge. In the past, I've always been made aware of the computations in order to provide technical support. But this time, everything from the computation to the necessary verification steps was done entirely by Peter Trueb and his sponsors. I took no part in it at all other than to maintain this website along with all the downloads and documentation.
Knights Landing Xeon Phi with AVX512: (October 10, 2016)
After more than 2 years of waiting, y-cruncher with AVX512 has finally been tested on native hardware. David Carver was kind enough to test drive an internal version of y-cruncher v0.7.1 which has the AVX512-CD binary enabled. Here it is compared to some more conventional machines:
|Processors:||Core i7 5960X||2 x Xeon E5-2696 v4||Xeon Phi 7250||Xeon Phi 7250|
|Processor Speed:||4.0 GHz (OC)||2.2 GHz||1.4 GHz||1.4 GHz|
|Binary:||AVX2||AVX2 + ADX||AVX2 + ADX||AVX512-CD|
The AVX512-CD binary uses AVX512 Foundation and Conflict-Detection instructions. It has been in development since early 2014, but has never been run on native hardware until now. Now it has been confirmed to work well enough to do a Pi benchmark.
Performance-wise, Knights Landing falls short of the highest-end Haswell-E and Broadwell-E systems. Furthermore, the AVX2 -> AVX512 scaling is a lackluster 34%. For now, the reason remains unknown. But it's currently hypothesized to be either memory bandwidth or Amdahl's Law.
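To get a feel for what the Amdahl's Law hypothesis would imply, here is a rough back-of-the-envelope sketch. It assumes (and this is purely an assumption, not something measured) that code which benefits from AVX512 runs exactly twice as fast as its AVX2 counterpart, and that everything else runs at the same speed. The function names are made up for this illustration.

```python
def amdahl_speedup(p, s):
    """Overall speedup when a fraction p of the run-time is sped up by a factor s."""
    return 1.0 / ((1.0 - p) + p / s)

def implied_fraction(speedup, s):
    """Invert Amdahl's Law: the fraction p that would produce the observed speedup."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / s)

# Observed AVX2 -> AVX512 scaling from the benchmark: 34%, i.e. a 1.34x speedup.
observed = 1.34

# Assumption (hypothetical): an ideal 2x gain on the part that benefits from AVX512.
ideal_vector_gain = 2.0

p = implied_fraction(observed, ideal_vector_gain)
print(f"implied fraction of run-time that benefits: {p:.3f}")
print(f"check: {amdahl_speedup(p, ideal_vector_gain):.2f}x")
```

Under those assumptions, only about half of the AVX2 run-time would be benefiting from the wider vectors. This does not distinguish between the two hypotheses: a memory bandwidth bottleneck would look the same from the outside.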
It's worth noting that y-cruncher is completely untuned for the Knights Landing architecture. Nearly all optimizations and tuning settings are the same as the desktop chips. So there's likely more performance left to be squeezed out. But due to the cost of Xeon Phi systems along with the general inaccessibility to consumers, it will be a while before y-cruncher has any properly tuned binaries for Knights Landing (if ever).
The AVX512-CD binary (for both Windows and Linux) is available upon request to anyone who sends me a Knights Landing benchmark. But for now, I'm hesitant to formally release it since it hasn't been sufficiently tested. (A pi benchmark has very poor test-coverage of the entire program.)
In addition to the AVX512-CD binary, y-cruncher also has AVX512-DQ and AVX512-IFMA binaries for Skylake Purley and Cannonlake. But assuming Intel sticks with its policy of massive delays, it will be quite a while before either of them sees the light of day.
y-cruncher v0.7.1: (May 16, 2016)
This is a semi-unplanned release to address a number of critical issues with the HWBOT integration. (Most notably the reference clock skew issue.)
Other than that, there are few other user-visible features. Most of the changes since v0.6.9 are internal refactorings. Some of these were large (and dangerous) enough that it probably would've been better to wait a few more months before releasing v0.7.1. So if anything breaks, let me know.
While this version wasn't intended to have many new features, all that refactoring did lend itself to some opportunistic stuff such as large pages and Unicode support.
GUI Benchmark Wrapper and HWBOT Integration: (April 3, 2016)
I get asked these two questions a lot:
#1 never happened because I suck at UI programming and I didn't want that mixed in with performance critical code.
#2 never happened because the HWBOT benchmark API wasn't ready.
Well, both are finally happening... More details here: http://forum.hwbot.org/showthread.php?t=155079
Pi Day and some Spin-off Projects: (March 14, 2016)
Anyone who has been following my GitHub profile for the past year will know that I've been working on a library that exposes the compute-engine of y-cruncher. Well, that's finally done and pushed out the door. (It was actually completed in January, but I waited until now following my usual "wait several months for Q/A".)
In any case, the spin-off project consists of two components:
YMP stands for "y-cruncher Multi-Precision Library". For the most part, it's just another bignum library - except that it supports SIMD and parallelized large multiplication.
Number Factory is largely a test app for the YMP library. It implements much of the same functionality as y-cruncher, albeit more cleanly and less efficiently.
The two can be found on my GitHub: https://github.com/Mysticial/NumberFactory
Documentation for the library can be found here: http://www.numberworld.org/ymp/v1.0/
For now, the project is entirely experimental and is available only for 64-bit Windows with Visual Studio 2015. It is far from mature and there are no plans to support Linux in the near future. But at the very least, it will let people code things up that utilize y-cruncher's parallel large multiplication.