News (2016)

Pi Computed to 22.4 Trillion Digits: (November 15, 2016) - permalink

I woke up this morning to what was quite possibly one of the bigger surprises I've ever seen. Peter Trueb, who had previously set records for the Lemniscate and Euler-Mascheroni Constants, had sent me an email with details for a fully verified computation of Pi to 22.4 trillion digits.

The exact number of digits is 22,459,157,718,361, which is precisely 10¹² * Pi^e rounded down. This smashes the previous record of 13.3 trillion digits set by "houkouonchi" back in 2014. The computation took 105 days from July to November. It was interrupted 3 times, but otherwise went through without any major issues.
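For anyone who wants to check the digit count, here is a minimal sketch using only Python's standard `decimal` module. The hard-coded value of Pi, the 50-digit working precision, and the function name `record_digit_count` are my own choices for illustration, not anything taken from y-cruncher.

```python
from decimal import Decimal, getcontext

# 50 significant digits is far more than needed for a 14-digit result.
getcontext().prec = 50

# Pi to 50 decimal places (hard-coded; the decimal module has no built-in Pi).
PI = Decimal("3.14159265358979323846264338327950288419716939937510")
# e computed as exp(1) at the working precision.
E = Decimal(1).exp()


def record_digit_count() -> int:
    """Return floor(10^12 * Pi^e), using Pi^e = exp(e * ln(Pi))."""
    pi_to_e = (E * PI.ln()).exp()
    # int() truncates toward zero, which is "rounded down" for a positive value.
    return int(pi_to_e * 10**12)


print(f"{record_digit_count():,}")
```

If the arithmetic is right, this should print the same figure quoted above.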

The hardware that was used was:

Processors: 4 x Xeon E7-8890 v3 @ 2.50 GHz (72 cores, 144 threads)
Memory: 1.25 TB DDR4
Storage: 20 x 6 TB 7200 RPM Seagate

Peter has prepared a blog with more details here. The sponsor, Dectris, has also posted a news article.

The 3.5 month run-time for 22 trillion digits is quite remarkable. Even though there have been several years of hardware and software improvements since the previous records, computations of this size have generally stagnated due to the inability of disk storage to keep up with Moore's Law in both size and performance.

Other notable and interesting facts:

This is the first time that a quad-socket computer was used for a Pi record with y-cruncher.
This is the first time more than 1 terabyte of memory was used for a Pi record with y-cruncher.
The storage configuration could sustain a bandwidth of 4 GB/s. That's significantly more than any of the previous records.
The total amount of disk I/O for this computation was approximately 8 PB read and 7 PB written. That's "PB" as in petabytes.
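To put those I/O numbers in perspective, here is a rough back-of-the-envelope sketch. It assumes decimal units (1 PB = 10^15 bytes, 1 GB = 10^9 bytes), treats the approximate totals above as exact, and averages over the full 105 days including the interruptions. The function name is hypothetical.

```python
SECONDS_PER_DAY = 86_400
PB = 10**15  # assumed decimal petabyte
GB = 10**9   # assumed decimal gigabyte


def average_io_rate_gb_s(read_pb: float, written_pb: float, days: float) -> float:
    """Average combined read+write traffic in GB/s over the whole run."""
    total_bytes = (read_pb + written_pb) * PB
    return total_bytes / (days * SECONDS_PER_DAY) / GB


avg = average_io_rate_gb_s(read_pb=8, written_pb=7, days=105)
print(f"average disk traffic: {avg:.2f} GB/s")
print(f"share of the 4 GB/s sustained bandwidth: {avg / 4:.0%}")
```

Under those assumptions, the array was moving data at a sizable fraction of its sustained bandwidth, averaged over the entire computation.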

On the software side, this is the first Pi record in 2 years. Since then, y-cruncher has gone through many changes from multiple refactorings, AVX2, the new parallel computing frameworks, new implementations of the large FFT algorithms, etc... - none of which had ever been tested at such large sizes. So this computation can be seen as somewhat of a validation of 2 years of work.

This is the first time that y-cruncher has been used to set a Pi record completely without my knowledge. In the past, I've always been made aware of the computations in order to provide technical support. But this time, everything from the computation to the necessary verification steps was done entirely by Peter Trueb and his sponsors. I took no part in it at all other than to maintain this website along with all the downloads and documentation.

Knights Landing Xeon Phi with AVX512: (October 10, 2016) - permalink

After more than 2 years of waiting, y-cruncher with AVX512 has finally been tested on native hardware. David Carver was kind enough to test drive an internal version of y-cruncher v0.7.1 which has the AVX512-CD binary enabled. Here it is compared to some more conventional machines:

Processors:	Core i7 5960X	2 x Xeon E5-2696 v4	Xeon Phi 7250	Xeon Phi 7250
Generation:	Haswell	Broadwell	Knights Landing	Knights Landing
Cores/Threads:	8/16	44/88	68/272	68/272
Processor Speed:	4.0 GHz (OC)	2.2 GHz	1.4 GHz	1.4 GHz
Binary:	AVX2	AVX2 + ADX	AVX2 + ADX	AVX512-CD
1,000,000,000 digits (seconds):	62.652	31.260	56.028	41.844
10,000,000,000 digits (seconds):	850.720	417.322	-	504.873

The AVX512-CD binary uses AVX512 Foundation and Conflict-Detection instructions. It has been in development since early 2014, but has never been run on native hardware until now. Now it has been confirmed to work well enough to do a Pi benchmark.

Performance-wise, Knights Landing falls short of the highest-end Haswell-EP and Broadwell-EP systems. Furthermore, the AVX2 -> AVX512 scaling is a lackluster 34%. For now, the reason remains unknown. But it's currently hypothesized to be either memory bandwidth or Amdahl's Law.
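The 34% figure comes straight from the 1,000,000,000-digit times on the Xeon Phi 7250 (56.028 vs. 41.844). The sketch below reproduces it and then, purely as an illustration of the Amdahl's Law hypothesis, asks what fraction of the run-time would have to benefit if doubling the vector width gave an ideal 2x on that fraction. The 2x ideal gain is an assumption of mine, not a measurement, and the function names are hypothetical.

```python
def speedup(t_old: float, t_new: float) -> float:
    """Speedup factor when the run-time drops from t_old to t_new."""
    return t_old / t_new


def amdahl_fraction(observed: float, ideal_gain: float) -> float:
    """Solve Amdahl's Law, S = 1 / ((1 - p) + p / k), for the fraction p.

    observed   -- overall speedup S
    ideal_gain -- speedup k of the part that benefits (assumed, not measured)
    """
    return (1 - 1 / observed) / (1 - 1 / ideal_gain)


# Xeon Phi 7250 times for 1,000,000,000 digits: AVX2 + ADX vs. AVX512-CD.
s = speedup(56.028, 41.844)
p = amdahl_fraction(s, ideal_gain=2.0)

print(f"AVX2 -> AVX512 scaling: {s - 1:.0%}")
print(f"implied vector-bound fraction (assuming ideal 2x): {p:.0%}")
```

Under that assumption, only about half of the run-time would be scaling with vector width. A memory bandwidth bottleneck would produce the same observed speedup, so this calculation can't tell the two hypotheses apart.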

It's worth noting that y-cruncher is completely untuned for the Knights Landing architecture. Nearly all optimizations and tuning settings are the same as the desktop chips. So there's likely more performance left to be squeezed out. But due to the cost of Xeon Phi systems along with the general inaccessibility to consumers, it will be a while before y-cruncher has any properly tuned binaries for Knights Landing (if ever).

The AVX512-CD binary (for both Windows and Linux) is available upon request to anyone who sends me a Knights Landing benchmark. But for now, I'm hesitant to formally release it since it hasn't been sufficiently tested. (A Pi benchmark has very poor test coverage of the entire program.)

In addition to the AVX512-CD binary, y-cruncher also has AVX512-DQ and AVX512-IFMA binaries for Skylake Purley and Cannonlake. But assuming Intel sticks with its policy of massive delays, it will be quite a while before either of them sees the light of day.

y-cruncher v0.7.1: (May 16, 2016) - permalink

This is a semi-unplanned release to address a number of critical issues with the HWBOT integration. (Most notably the reference clock skew issue.)

Other than that, there are few other user-visible features. Most of the changes since v0.6.9 are internal refactorings. Some of these were large (and dangerous) enough that it probably would've been better to wait a few more months before releasing v0.7.1. So if anything breaks, let me know.

While this version wasn't intended to have many new features, all that refactoring did lend itself to some opportunistic additions such as large pages and Unicode support.

Full list of changes here.

GUI Benchmark Wrapper and HWBOT Integration: (April 3, 2016) - permalink

I get asked these two questions a lot:

1. Why don't you add a GUI for y-cruncher?
2. Why isn't y-cruncher on HWBOT?

#1 never happened because I suck at UI programming and I didn't want that mixed in with performance critical code.

#2 never happened because the HWBOT benchmark API wasn't ready.

Well, both are finally happening... More details here: http://forum.hwbot.org/showthread.php?t=155079

Pi Day and some Spin-off Projects: (March 14, 2016) - permalink

Anyone who has been following my GitHub profile for the past year will know that I've been working on a library that exposes the compute-engine of y-cruncher. Well, that's finally done and pushed out the door. (It was actually completed in January, but I waited until now following my usual "wait several months for Q/A".)

In any case, the spin-off project consists of two components:

YMP is a dynamically linked version of y-cruncher's arithmetic library.
Number Factory is an open-sourced collection of mini-programs that use the YMP library.

YMP stands for "y-cruncher Multi-Precision Library". For the most part, it's just another bignum library - except that it supports SIMD and parallelized large multiplication.

Number Factory is largely a test app for the YMP library. It implements much of the same functionality as y-cruncher, albeit more cleanly and less efficiently.

The two can be found on my GitHub: https://github.com/Mysticial/NumberFactory

Documentation for the library can be found here: http://www.numberworld.org/ymp/v1.0/

For now, the project is entirely experimental and is available only for 64-bit Windows with Visual Studio 2015. It is far from mature and there are no plans to support Linux in the near future. But at the very least, it will let people code things up that utilize y-cruncher's parallel large multiplication.