y-cruncher - A Multi-Threaded Pi-Program

From a high-school project that went a little too far...

By Alexander J. Yee

(Last updated: October 13, 2022)

 


 

The first scalable multi-threaded Pi-benchmark for multi-core systems...

 

How fast can your computer compute Pi?

 

y-cruncher is a program that can compute Pi and other constants to trillions of digits.

It is the first of its kind that is multi-threaded and scalable to multi-core systems. Ever since its launch in 2009, it has become a common benchmarking and stress-testing application for overclockers and hardware enthusiasts.

 

y-cruncher has been used to set several world records for the most digits of Pi ever computed.

 

Current Release:

Windows: Version 0.7.10 Build 9513 (Released: August 31, 2022)

Linux: Version 0.7.10 Build 9513 (Released: August 31, 2022)

 

Official Mersenneforum Subforum (new).

Official HWBOT forum thread.

 

News:

 

Zen4's AVX512: (September 26, 2022)

 

Now that the embargos have lifted, I have published my breakdown of Zen4's AVX512 over on Mersenneforum.

So if you're a SIMD programmer or just curious about architecture in general, this might be worth a read.

 

 

 

Version 0.7.10 and AMD Zen4: (August 31, 2022)

 

Zen4 is set to be AMD's first processor to support AVX512. You know what that means - a new y-cruncher binary for it!

 

AMD has graciously provided me a pre-release sample of their Ryzen 9 7950X. And using that, I'm able to produce a Zen4-optimized binary - well ahead of launch and in time for the hardware reviewers to pick up.

 

Since most information about Zen4 is still under embargo, I cannot say anything about it at this time. If you happen to have access to a Zen4 system, feel free to try out this new release.

 

If you are a hardware reviewer who uses y-cruncher as one of your benchmarks, you will need to grab this latest version of y-cruncher to get the best results on Zen4.

 

The existing Intel-optimized AVX512 binaries for Skylake and Tiger Lake do not run optimally on Zen4, so you will need the new binary. Fortunately, the performance of the other binaries remains unchanged in v0.7.10. So Zen4 benchmarks on v0.7.10 can be directly compared with those of other processors using y-cruncher v0.7.9. Thus you do not need to redo your benchmarks for competing processors if they are already done with v0.7.9.

 

Overall, this was a very fun project which I enjoyed. Being pre-release meant that all the usual optimization and architectural resources that I usually rely on do not exist yet. So I had to do all the reverse engineering myself to figure out enough of the architecture that I could optimize for it. Unless someone beats me to it (via leaks), I intend to publish my findings as soon as is allowed.

 

AMD's support for AVX512 may be the trigger that finally breaks AVX512's chicken-egg problem. For the better part of the last decade, nobody used AVX512 because of poor support. And since nobody used it, it received poor support. Now with AMD's backing, adoption of AVX512 may finally start to increase and perhaps put Intel at a competitive disadvantage until they bring it back to the consumer market.

 

 

I mentioned earlier this year that Zen4 and Sapphire Rapids X were the other two chips I wanted to test and optimize for. Now with Zen4 fulfilled (for now), that leaves Sapphire Rapids - which looks like it's having its fair share of delays. So I obviously have no timeline for that and I may end up skipping it if it ends up being cost prohibitive. Just in case anyone from Intel is paying attention...

 

 

 

100 Trillion Digits of Pi: (June 8, 2022)

 

I'm glad to announce that Google has reclaimed the Pi world record by computing 100 trillion digits of Pi!

 

This computation took 158 days from October 14 to March 21. Like last time, it was run on the Google Cloud platform, but with newer and improved hardware for both compute and storage.

 

Hardware Specs:

For more details check out Google's blog here. If you are interested in the digits, you can download them from here.

 

 

Older News

 

Records Set by y-cruncher:

y-cruncher has been used to perform a number of world-record-sized computations.

 

Blue: Current World Record

Green: Former World Record

Red: Unverified computation. Does not qualify as a world record until verified using an alternate formula.

Date Announced | Date Completed | Source | Who | Constant | Decimal Digits | Time | Computer
July 17, 2022 | July 15, 2022 |  | Seungmin Kim | Lemniscate | 1,200,000,000,100 | Compute: 32.2 days; Verify: 46.5 days | 2 x Intel Xeon Gold 6140 @ 2.30 GHz, 377 GB
June 8, 2022 | March 21, 2022 |  | Emma Haruka Iwao | Pi | 100,000,000,000,000 | Compute: 158 days; Verify: 12.6 hours; Validation File | 128 vCPU Intel Ice Lake (GCP), 864 GB, 663 TB storage
March 14, 2022 | March 9, 2022 |  | Seungmin Kim | Catalan's Constant | 1,200,000,000,100 | Compute: 48.6 days; Verify: 47.3 days | 2 x Intel Xeon Gold 6140 @ 2.30 GHz; 2 x Intel Xeon E5-2680 v3 @ 2.50 GHz
January 5, 2022 | November 12, 2021 |  | Tizian Hanselmann | Square Root of 2 | 10,000,000,001,000 | Compute: 18.4 days; Verify: 18.5 days | Intel Xeon E7-4870 @ 2.4 GHz, 896 GB
October 4, 2021 | September 30, 2021 |  | Chris Danneil | Zeta(5) | 200,000,000,000 | Compute: 28.2 days; Verify: 33.0 days | Intel Xeon E5-268v4 @ 2.1 GHz, 256 GB
October 4, 2021 | September 9, 2021 |  | William Echols | Log(2) | 1,500,000,000,000 | Compute: 98.9 days; Verify: 61.7 days | 2 x Intel Xeon E5-2690 v3 @ 2.6 GHz, 256 GB
August 17, 2021 | August 14, 2021 | Source | UAS Grisons | Pi | 62,831,853,071,796 | Compute: 108 days; Verify: 34.4 hours | AMD Epyc 7542 @ 2.9 GHz, 1 TB, 34 + 4 Hard Drives
February 14, 2021 | February 12, 2021 |  | Clifford Spielman | Golden Ratio | 10,000,000,000,000 | Compute: 14.3 days; Verify: 7.40 days | AMD Threadripper 3995WX @ 2.7 GHz, 512 GB
December 5, 2020 | November 22, 2020 |  | David Christle | e | 31,415,926,535,897 | Compute: 53.8 days; Verify: 46.0 days | 2 x Intel Xeon E5-2680 v2 @ 2.8 GHz, 252 GB
September 13, 2020 | September 6, 2020 |  | Seungmin Kim | Log(10) | 1,200,000,000,100 | Compute: 14.5 days; Verify: 22.5 days | 2 x Intel Xeon E5-2699 v3 @ 2.3 GHz, 756 GB; 2 x Intel Xeon Gold 5220 @ 2.2 GHz, 754 GB
August 9, 2020 | July 26, 2020 |  | Seungmin Kim | Zeta(3) - Apery's Constant | 1,200,000,000,100 | Compute: 31.7 days; Verify: 32.6 days | 2 x Intel Xeon E5-2670 v3 @ 2.3 GHz, 503 GB; 2 x Intel Xeon Gold 5220 @ 2.2 GHz, 754 GB
August 9, 2020 | July 23, 2020 |  | Andrew Sun | Gamma(1/3) | 500,000,001,337 | Compute: 17.3 days; Verify: 4.07 days | 2 x Intel Xeon E5-2690 v4 @ 2.6 GHz, 315 GB
June 28, 2020 | June 22, 2020 |  | Seungmin Kim | Zeta(3) - Apery's Constant | 1,200,000,000,000 | Compute: 31.7 days; Not Verified | 2 x Xeon E5-2670 v3 @ 2.3 GHz, 503 GB
June 28, 2020 | May 27, 2020 |  | Andrew Sun | Gamma(1/4) | 500,000,000,000 | Compute: 20.0 days; Verify: 14.9 days | 2 x Intel Xeon E5-2690 v4 @ 2.6 GHz, 315 GB
May 28, 2020 | May 26, 2020 |  | Seungmin Kim, Ian Cutress | Euler-Mascheroni Constant | 600,000,000,100 | Compute: 145 days; Verify: 104 days | 2 x Intel Xeon Gold 6140 @ 2.3 GHz, 187 GB; Intel Xeon 8280 @ 2.7 GHz, 768 GB
January 29, 2020 | January 29, 2020 | Blog | Timothy Mullican | Pi | 50,000,000,000,000 | Compute: 303 days; Verify: 17.2 hours; Validation File | 4 x Intel Xeon E7-4880 v2 @ 2.5 GHz, 315 GB, 48 Hard Drives
December 4, 2019 | November 13, 2019 |  | Christophe Patris de Broe & Alexandre Gouy & Cyril Hsu | Golden Ratio | 20,000,000,000,000 | Compute: 6.94 days; Not Verified | 2 x Intel Xeon Platinum 8268 @ 2.9 GHz, 768 GB
October 21, 2019 | October 17, 2019 |  | Marco Julian Hummel | Gamma(1/3) | 274,877,906,944 | Compute: 11.2 days; Verify: 30.7 days | 2 x Intel Xeon E5-2651 v2 @ 1.8 GHz, 192 GB
March 14, 2019 | January 21, 2019 | Blogs 1 + 2 | Emma Haruka Iwao | Pi | 31,415,926,535,897 | Compute: 121 days; Verify: 20.0 hours; Validation File | 2 x Undisclosed Intel Xeon @ 2.00 GHz, > 1.40 TB DDR4, > 240 TB SSD
August 24, 2017 | August 23, 2017 |  | Ron Watkins | Euler-Mascheroni Constant | 477,511,832,674 | Compute: 34.4 days; Verify: 141 days | 4 x Xeon E5-4660 v3 @ 2.1 GHz - 1 TB; 2 x Xeon X5690 @ 3.47 GHz - 128 GB
November 15, 2016 | November 11, 2016 | Blog, Sponsor | Peter Trueb | Pi | 22,459,157,718,361 | Compute: 105 days; Verify: 28 hours; Validation File | 4 x Xeon E7-8890 v3 @ 2.50 GHz, 1.25 TB DDR4, 20 x 6 TB 7200 RPM Seagate
June 28, 2016 | June 19, 2016 |  | Ron Watkins | Square Root of 2 | 10,000,000,000,000 | Compute: 18.8 days; Verify: 25.2 days | 2 x Xeon X5690 @ 3.47 GHz, 141 GB
October 8, 2014 | October 7, 2014 |  | Sandon Van Ness (houkouonchi) | Pi | 13,300,000,000,000 | Compute: 208 days; Verify: 182 hours; Validation File | 2 x Xeon E5-4650L @ 2.6 GHz, 192 GB DDR3 @ 1333 MHz, 24 x 4 TB + 30 x 3 TB
December 28, 2013 | December 28, 2013 | Source | Shigeru Kondo | Pi | 12,100,000,000,050 | Compute: 94 days; Verify: 46 hours | 2 x Xeon E5-2690 @ 2.9 GHz, 128 GB DDR3 @ 1600 MHz, 24 x 3 TB

See the complete list including other notably large computations. If you want to set a record yourself, the rules are in that link.
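
The legend above says a computation only counts once it has been verified with an alternate formula. As a rough illustration of that idea (not the formulas or code y-cruncher itself uses), the Python sketch below computes Pi twice with two independent textbook arctangent identities, Machin's and Gauss's, and checks that the digits agree. The function names and the guard-digit count are assumptions made for this example.

```python
def arctan_inv(x: int, scale: int) -> int:
    """Approximate scale * arctan(1/x) with the alternating Taylor series,
    using only integer arithmetic (fixed-point with `scale` as the unit)."""
    term = scale // x          # scale / x^(2k+1), starting with k = 0
    total = term
    x_squared = x * x
    n = 1
    sign = -1
    while term:
        term //= x_squared     # next odd power of 1/x
        n += 2
        total += sign * (term // n)
        sign = -sign
    return total


def pi_machin(digits: int, guard: int = 10) -> int:
    """Pi via Machin: pi = 16*arctan(1/5) - 4*arctan(1/239)."""
    scale = 10 ** (digits + guard)
    value = 16 * arctan_inv(5, scale) - 4 * arctan_inv(239, scale)
    return value // 10 ** guard    # drop the guard digits


def pi_gauss(digits: int, guard: int = 10) -> int:
    """Pi via Gauss: pi/4 = 12*arctan(1/18) + 8*arctan(1/57) - 5*arctan(1/239)."""
    scale = 10 ** (digits + guard)
    value = 4 * (12 * arctan_inv(18, scale)
                 + 8 * arctan_inv(57, scale)
                 - 5 * arctan_inv(239, scale))
    return value // 10 ** guard


if __name__ == "__main__":
    first = pi_machin(50)
    second = pi_gauss(50)
    print(first)
    print("formulas agree:", first == second)
```

Because the two formulas share almost no intermediate work, a hardware or software error in one run is very unlikely to produce the same wrong digits in the other, which is the point of requiring a second formula.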

 

 

Features:

 

The main computational features of y-cruncher are:

 

Download:

Sample Screenshot: 1 trillion digits of Pi

Core i7 5960X @ 4.0 GHz - 64 GB DDR4 @ 2400 MHz - 16 HDs

 

Latest Releases: (August 31, 2022)

Downloading any of these files constitutes acceptance of the license agreement.

OS Download Link Size

Windows

y-cruncher v0.7.10.9513.zip

45.1 MB

Linux (Static)

y-cruncher v0.7.10.9513-static.tar.xz

34.1 MB

Linux (Dynamic)

y-cruncher v0.7.10.9513-dynamic.tar.xz

28.0 MB

 

 

 

 

 

 

 

 

The Linux version comes in both statically and dynamically linked versions. The static version should work on most Linux distributions, but lacks Cilk Plus and NUMA binding. The dynamic version supports all features, but is less portable due to the DLL dependency hell.

 

The Windows download comes bundled with the HWBOT submitter which allows benchmarks to be submitted to HWBOT.

 

System Requirements:

Windows:

Linux:

All Systems:

Very old systems that don't meet these requirements may be able to run older versions of y-cruncher. Support goes all the way back to even before Windows XP.

 

Version History:

 

Other Downloads (for C++ programmers):

 

Advanced Documentation:

 

 

Benchmarks:

Comparison Chart: (Last updated: October 12, 2022)

 

Computations of Pi to various sizes. All times in seconds. All computations done entirely in RAM.

The timings include the time needed to convert the digits to decimal representation, but not the time needed to write out the digits to disk.

 

Blue: Benchmarks are up-to-date with the latest version of y-cruncher.

Green: Benchmarks were done with an old version of y-cruncher that is comparable in performance with the current release.

Red: Benchmarks are significantly out-of-date due to being run with an old version of y-cruncher that is no longer comparable with the current release.

Purple: Benchmarks are from unreleased internal builds that are not speed comparable with the current release.

 

 

Laptops + Low-Power:

Processor(s): Core i7 8565U Core i7 9750H Core i7 1065G7 Core i7 1065G7 Core i7 1165G7 Core i9 11900KB
Generation: Intel Kaby Lake R Intel Coffee Lake Intel Ice Lake Intel Ice Lake Intel Tiger Lake Intel Tiger Lake
Cores/Threads: 4/8 6/12 4/8 4/8 4/8 8/16
Processor Speed: 2.3 - 4.6 GHz 3.1 - 3.9 GHz 2.1 - 3.0 GHz (25W) ??? 3.6 - 4.0 GHz (45W) 3.3 - 4.5 GHz
Memory: 8 GB 16 GB - 2666 MT/s 16 GB @ 3200 MT/s 16 GB @ 3733 MT/s 32 GB @ 3200 MT/s 32 GB @ 3200 MT/s
Version: v0.7.8 (14-BDW) v0.7.8 (14-BDW) v0.7.7 (18-CNL) v0.7.8 (18-CNL) v0.7.8 (18-CNL) v0.7.8 (18-CNL)
Instruction Set: x64 AVX2 + ADX x64 AVX2 + ADX x64 AVX512-DQ x64 AVX512-VBMI x64 AVX512-VBMI x64 AVX512-VBMI x64 AVX512-VBMI
25,000,000 1.584 1.463 1.596 1.243 1.046 0.868 0.575
50,000,000 3.523 3.123 3.552 2.718 2.318 1.825 1.301
100,000,000 7.837 6.585 7.903 5.870 5.150 3.888 2.866
250,000,000 23.336 18.378 22.520 16.531 14.745 10.813 7.997
500,000,000 54.403 40.028 50.683 37.778 32.856 23.769 17.560
1,000,000,000 127.177 87.298 114.539 85.426 73.918 52.930 38.525
2,500,000,000   251.637 332.819 249.579 222.515 156.696 111.600
5,000,000,000           348.512 247.368
10,000,000,000              
Credit:   ji lcpd ji lcpd Gnyueh ji lcpd ji lcpd
Processor(s): Core i7 3630QM Core i7 4610M Core i3 8121U (Windows*) Core i7 6560U Core i7 6700HQ
Generation: Intel Ivy Bridge Intel Haswell Intel Cannon Lake Intel Skylake Intel Skylake
Cores/Threads: 4/8 2/4 2/4 4/8 4/8
Processor Speed: 3.2 GHz 3.0 GHz 2.6 - 3.0 GHz 2.6 - 3.0 GHz 2.4 - 2.9 GHz 2.21 GHz 2.6 GHz ?
Memory: 16 GB - 1600 MT/s 8 GB 8 GB 8 GB 16 GB
Version: v0.7.8 (11-SNB) v0.7.8 (13-HSW) v0.7.8 (14-BDW) v0.7.8 (17-SKX) v0.7.8 (18-CNL) v0.7.8 (14-BDW) v0.7.8 (14-BDW)
Instruction Set: x64 AVX x64 AVX2 x64 AVX2 + ADX x64 AVX512-DQ x64 AVX512-VBMI x64 AVX2 + ADX x64 AVX2 + ADX
25,000,000 3.688 3.372 2.984 2.685 2.189 3.165 2.140
50,000,000 8.460 7.634 6.795 6.079 4.879 7.212 4.634
100,000,000 18.817 17.037 15.173 13.654 10.725 16.008 10.168
250,000,000 56.097 48.912 45.801 41.775 31.268 46.491 29.298
500,000,000 129.173 109.82 106.192 97.429 75.072 105.889 65.509
1,000,000,000 302.003 244.751 241.978 222.909 170.715 233.860 144.395
2,500,000,000 848.475           429.748
5,000,000,000              
10,000,000,000              
Credit: Oliver Kruse Marco Julian Hummel       Sebastien Davies Marco Julian Hummel

 

 

Mainstream Desktops:

Processor(s): Ryzen 7 1800X Ryzen 7 3700X Ryzen 7 5800X3D Core i9 11900K Ryzen 9 3950X Ryzen 9 5950X Core i9 12900K Ryzen 9 7950X
Generation: AMD Zen AMD Zen 2 AMD Zen 3 Intel Rocket Lake AMD Zen 2 AMD Zen 3 Intel Alder Lake AMD Zen 4
Cores/Threads: 8/16 8/16 8/16 8/16 16/32 16/32 8/16 + 8/8 16/32
Processor Speed: 3.7 GHz 4.3 GHz   5.3 GHz     5.1 GHz  
Memory: 64 GB - 2866 MT/s 64 GB - 3600 MT/s 32 GB 64 GB - 3733 MT/s 64 GB - 3200 MT/s 64 GB 64 GB - 6400 MT/s 128 GB - 4400 MT/s
Program Version: v0.7.8 (17-ZN1) v0.7.8 (17-ZN1) v0.7.9 (20-ZN3) v0.7.8 (18-CNL) v0.7.8 (17-ZN1) v0.7.8 (19-ZN2) v0.7.9 (14-BDW) v0.7.10 (22-ZN4)
Instruction Set: x64 AVX2 + ADX x64 AVX2 + ADX x64 AVX2 + ADX x64 AVX512-VBMI x64 AVX2 + ADX x64 AVX2 + ADX x64 AVX2 + ADX x64 AVX512-GFNI
25,000,000 1.247 0.681 0.580 0.461 0.701 0.559 0.399 0.434
50,000,000 2.623 1.427 1.220 1.027 1.422 1.171 0.744 0.901
100,000,000 5.655 2.975 2.693 2.223 2.870 2.403 1.622 1.875
250,000,000 16.053 8.135 7.435 6.350 7.279 6.162 4.529 4.953
500,000,000 35.607 17.666 16.713 14.133 15.037 12.725 9.930 10.620
1,000,000,000 78.961 39.007 36.338 31.235 32.306 27.277 21.550 23.431
2,500,000,000 226.557 112.849 104.964 91.316 91.836 77.652 61.097 65.561
5,000,000,000 498.824 250.157 232.368 202.208 204.570 171.633 134.698 144.664
10,000,000,000 1,084.855 549.067   447.487 460.548 382.39   317.490
25,000,000,000               914.088
Credit:   Sebastien Davies Marc Beste O-EtaIXVII Marc Beste Sebastien Davies 曾 铮
Processor(s): FX-8350 Core i7 4770K Core i7 7700K
Generation: AMD Piledriver Intel Haswell Intel Kaby Lake
Cores/Threads: 8/8 4/8 4/8
Processor Speed: 4.0 GHz 4.0 GHz (OC) 4.9 GHz (OC)
Memory: 32 GB - 1600 MT/s 32 GB - 2133 MT/s 64 GB - 3200 MT/s
Program Version: v0.7.8 (11-BD1) v0.7.8 (13-HSW) v0.7.8 (14-BDW)
Instruction Set: x64 AVX + XOP x64 AVX2 x64 AVX2 + ADX
25,000,000 3.070 1.482 1.149
50,000,000 6.845 3.396 2.489
100,000,000 15.130 7.385 5.482
250,000,000 43.077 20.610 15.419
500,000,000 96.327 45.964 33.986
1,000,000,000 214.870 101.692 74.021
2,500,000,000 623.521 292.899 209.412
5,000,000,000 1,384.689 643.534 451.414
10,000,000,000     966.710
Credit:     Oliver Kruse

 

 

 

High-End Desktops:

Processor(s): Core i9 9980XE Core i9 10980XE Threadripper 3955WX Threadripper 3970X Threadripper 3990X
Generation: Intel Skylake X Intel Cascade Lake X AMD Zen 2 AMD Zen 2 AMD Zen 2
Cores/Threads: 18/18 18/36 16/32 32/64 64/128
Processor Speed:   2.8 GHz 3.9 GHz 3.7 GHz 4.0 GHz (OC)
Memory: 128 GB - 3600 MT/s 128 GB - 3600 MT/s 512 GB - 3200 MT/s 64 GB 256 GB - 3200 MT/s
Program Version: v0.7.8 (17-SKX) v0.7.8 (17-SKX) v0.7.8 (19-ZN2) v0.7.8 (17-ZN1) v0.7.8 (19-ZN2) v0.7.8 (19-ZN2)
Instruction Set: x64 AVX512-DQ x64 AVX512-DQ x64 AVX2 + ADX x64 AVX2 + ADX x64 AVX2 + ADX x64 AVX2 + ADX
25,000,000 0.262 0.394 0.531 0.458 0.431 0.694
50,000,000 0.556 0.837 1.070 0.910 0.890 1.011
100,000,000 1.203 1.814 2.015 1.763 1.734 2.203
250,000,000 3.492 4.885 5.072 4.258 4.070 5.618
500,000,000 7.703 10.198 10.423 8.699 7.896 10.474
1,000,000,000 16.923 21.633 22.386 18.427 15.939 21.872
2,500,000,000 48.896 59.536 63.115 50.338 44.058 54.171
5,000,000,000 108.179 129.954 138.574 109.537 95.645 116.739
10,000,000,000 236.789 282.091 304.391 237.537 208.484 245.579
25,000,000,000 675.992 799.543 846.597     682.885
50,000,000,000     1,960.563     1672.776
100,000,000,000     4,450.366      
Credit: Shigeru Kondo ji lcpd Michael Makovi Tainus Bennet Huch
Processor(s): Core i7 5960X Threadripper 1950X Core i9 7940X
Generation: Intel Haswell AMD Threadripper Intel Skylake X
Cores/Threads: 8/16 16/32 14/28
Processor Speed: 4.0 GHz (OC) 3.5 - 3.7 GHz 3.8 GHz 3.6 GHz (2.8 GHz cache)
Memory: 64 GB - 2133 MT/s 128 GB - 2933 MT/s 128 GB - 3466 MT/s
Program Version: v0.7.8 (13-HSW) v0.7.8 (17-ZN1) v0.7.8 (14-BDW) v0.7.8 (17-SKX)
Instruction Set: x64 AVX2 x64 AVX2 + ADX x64 AVX2 + ADX x64 AVX512-DQ
25,000,000 0.853 0.721 0.503 0.482
50,000,000 1.769 1.500 1.090 0.994
100,000,000 3.828 3.173 2.407 1.974
250,000,000 10.807 8.666 6.575 5.145
500,000,000 23.523 18.926 14.026 10.791
1,000,000,000 51.930 41.762 29.989 22.974
2,500,000,000 149.081 119.06 84.231 64.586
5,000,000,000 326.022 264.191 189.510 143.051
10,000,000,000 713.146 572.900 410.372 316.622
25,000,000,000   1642.184 1,185.891 903.586
50,000,000,000        
Credit:   Oliver Kruse    

 

*All-core non-AVX/AVX/AVX512 CPU frequency.

 

 

Multi-Processor Workstation/Servers:

 

Due to high core counts and the effect of NUMA (Non-Uniform Memory Access), performance on multi-processor systems is extremely sensitive to various settings. Therefore, these benchmarks may not be entirely representative of what the hardware is capable of.

Processor(s): Xeon Platinum 8124M Xeon Gold 6148 Xeon Platinum 8175M Xeon Platinum 8275CL Epyc 7742 Epyc 7B12 Epyc 7742
Generation: Intel Skylake Purley Intel Skylake Purley Intel Skylake Purley Intel Cascade Lake AMD Rome AMD Rome AMD Rome
Sockets/Cores/Threads: 2/36/72 2/40/40 2/48/96 2/48/96 2/128/256 2/112/224 2/128/256
Processor Speed: 3.0 GHz 2.4 GHz 2.5 GHz 3.0 GHz   2.25 GHz 2.25 GHz
Memory: 137 GB - ?? 188 GB - ?? ~756 GB - ?? 192 GB ~504 GB ~882 GB 2 TB
Program Version: v0.7.5 (17-SKX) v0.7.6 (17-SKX) v0.7.6 (17-SKX) v0.7.8 (17-SKX) v0.7.7 (17-ZN1) v0.7.8 (19-ZN2) v0.7.8 (19-ZN2)
Instruction Set: x64 AVX512-DQ x64 AVX512-DQ x64 AVX512-DQ x64 AVX512-DQ x64 AVX2 + ADX x64 AVX2 + ADX x64 AVX2 + ADX
25,000,000 0.540 0.329 0.294 0.283 0.534 0.439 0.513
50,000,000 0.981 0.683 0.617 0.544 1.027 0.838 0.920
100,000,000 1.905 1.456 1.305 1.169 2.298 1.796 1.887
250,000,000 5.085 3.737 3.591 3.125 5.854 4.509 4.650
500,000,000 10.372 7.750 7.293 6.309 10.502 8.196 8.066
1,000,000,000 21.217 16.550 15.041 13.042 17.836 14.252 13.246
2,500,000,000 55.701 45.693 39.329 34.028 35.485 30.592 27.011
5,000,000,000 118.151 99.078 83.601 71.777 62.432 58.405 49.940
10,000,000,000 247.928 212.984 176.695 153.169 115.543 116.900 98.156
25,000,000,000   599.653 491.988 425.442 307.995 314.907 258.081
50,000,000,000     1,081.181   690.662 741.633 598.716
100,000,000,000           1715.123 1,370.714
250,000,000,000             3,872.397
Credit: Jacob Coleman Oliver Kruse newalex Xinyu Miao Carsten Spille Greg Hogan Song Pengei
Processor(s): Xeon E5-2683 v3 Xeon E7-8880 v3 Xeon E5-2687W v4 Xeon E5-2686 v4 Xeon E5-2696 v4 Epyc 7601 Xeon Gold 6130F
Generation: Intel Haswell Intel Haswell Intel Broadwell Intel Broadwell Intel Broadwell AMD Naples Intel Skylake Purley
Sockets/Cores/Threads: 2/28/56 4/64/128 2/24/48 2/36/72 2/44/88 2/64/128 2/32/64
Processor Speed: 2.03 GHz 2.3 GHz 3.0 GHz 2.3 GHz 2.2 GHz 2.2 GHz 2.1 GHz
Memory: 128 GB - ??? 2 TB - ??? 64 GB 504 GB - ??? 768 GB - ??? 256 GB - ?? 256 GB - ??
Program Version: v0.6.9 (13-HSW) v0.7.1 (13-HSW) v0.7.6 (14-BDW) v0.7.7 (14-BDW) v0.7.1 (14-BDW) v0.7.3 (17-ZN1) v0.7.3 (17-SKX)
Instruction Set: x64 AVX2 x64 AVX2 x64 AVX2 + ADX x64 AVX2 + ADX x64 AVX2 + ADX x64 AVX2 + ADX x64 AVX512-DQ
25,000,000 0.907 1.176 0.490 0.494 0.715 2.459 1.150
50,000,000 1.745 2.321 1.072 0.982 1.344 4.347 1.883
100,000,000 3.317 4.217 2.303 2.193 2.673 6.996 3.341
250,000,000 8.339 8.781 6.196 6.044 6.853 14.258 7.731
500,000,000 17.708 15.879 13.046 12.582 14.538 24.930 15.346
1,000,000,000 37.311 32.078 27.763 26.852 31.260 47.837 31.301
2,500,000,000 102.131 78.251 76.202 73.596 84.271 111.139 82.871
5,000,000,000 218.917 164.157 165.046 160.094 192.889 228.252 179.488
10,000,000,000 471.802 346.307 356.487 346.305 417.322 482.777 387.530
25,000,000,000 1,511.852 957.966 1,006.131 980.784 1,186.881 1,184.144 1,063.850
50,000,000,000   2,096.169 2,202.558 2,156.854 2,601.476    
100,000,000,000   4,442.742     6,037.704    
250,000,000,000   17,428.450          
Credit: Shigeru Kondo Jacob Coleman Cameron Giesbrecht newalex "yoyo" Dave Graham

 

 

Fastest Times:

The full chart of rankings for each size can be found here:

These fastest times may include unreleased betas.


Got a faster time? Let me know: a-yee@u.northwestern.edu

Note that I usually do not respond to these emails. I simply put them into the charts which I update periodically (typically within 2 weeks).

 

 

Performance Tips:

 

Decimal Digits of Pi - Times in Seconds

Core i9 7940X @ 3.7 GHz AVX512

Memory Frequency: 2666 MT/s 3466 MT/s
25,000,000 0.839 0.758
50,000,000 1.424 1.338
100,000,000 2.701 2.425
250,000,000 6.489 5.877
500,000,000 13.307 11.917
1,000,000,000 27.913 24.915
2,500,000,000 76.837 68.322
5,000,000,000 168.058 148.737
10,000,000,000 365.047 322.115
25,000,000,000 1,037.527 916.039

High core count Skylake X processors are known to be heavily bottlenecked by memory bandwidth.

Memory Bandwidth:

 

Because of the memory-intensive nature of computing Pi and other constants, y-cruncher needs a lot of memory bandwidth to perform well. In fact, the program has been noticeably memory bound on nearly all high-end desktops since 2012 as well as the majority of multi-socket systems since at least 2006.
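
To put a number on this, the sketch below takes the Core i9 7940X timings listed earlier in this section for 2666 MT/s and 3466 MT/s memory and computes how much faster each run became. The data is copied from that table; the helper names are just for this example.

```python
# (decimal digits, seconds at 2666 MT/s, seconds at 3466 MT/s)
# Values copied from the Core i9 7940X memory-frequency table.
TIMINGS = [
    (25_000_000, 0.839, 0.758),
    (50_000_000, 1.424, 1.338),
    (100_000_000, 2.701, 2.425),
    (250_000_000, 6.489, 5.877),
    (500_000_000, 13.307, 11.917),
    (1_000_000_000, 27.913, 24.915),
    (2_500_000_000, 76.837, 68.322),
    (5_000_000_000, 168.058, 148.737),
    (10_000_000_000, 365.047, 322.115),
    (25_000_000_000, 1037.527, 916.039),
]


def percent_faster(baseline_seconds: float, improved_seconds: float) -> float:
    """Throughput gain of the improved run over the baseline, in percent."""
    return (baseline_seconds / improved_seconds - 1.0) * 100.0


def summarize(timings):
    """Return (digits, percent gain) for every row of the table."""
    return [(digits, percent_faster(slow, fast)) for digits, slow, fast in timings]


if __name__ == "__main__":
    memory_gain = (3466 / 2666 - 1.0) * 100.0
    print(f"Memory transfer rate increase: {memory_gain:.1f}%")
    for digits, gain in summarize(TIMINGS):
        print(f"{digits:>14,} digits: {gain:5.1f}% faster")
```

A 30% higher memory transfer rate with no change to the CPU buys roughly 6% to 13% more speed in these rows, with the gain generally growing at the larger sizes. That is consistent with the computation being limited by memory bandwidth rather than by the cores.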

 

Recommendations:

Don't be surprised if y-cruncher exposes instabilities that other applications and stress-tests do not. y-cruncher is unusual in that it simultaneously places a heavy load on both the CPU and the entire memory subsystem.

 

 

 

Parallel Performance:

 

y-cruncher has a lot of settings for tuning parallel performance. By default, it makes a best effort to analyze the hardware and pick the best settings. But because the combinations of processor topologies are virtually unlimited, y-cruncher cannot always pick the best settings for everything. So sometimes the best performance can only be achieved with manual settings.

*These are advanced settings that cannot be changed if you're using the benchmark option in the console UI. To change them, you will need to either run benchmark mode from the command line or use the custom compute menu.

 

Load imbalance is a fairly common problem in y-cruncher. The usual causes are:

  1. The number of logical cores is not a power-of-two.
  2. The cores are not homogeneous. Common reasons include:
    • The cores are clocked at different speeds.
    • The cores have access to different amounts of memory bandwidth due to an imbalanced NUMA topology.
    • The cores are different generation cores hidden behind a virtual machine.
  3. CPU-intensive background processes are interfering with y-cruncher's ability to use all the hardware. This applies to all forms of system jitter.
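
To see why a core count that is not a power of two can hurt, here is a deliberately simplified toy model. It assumes the work is split by repeated halving into equal tasks (the next power of two at or above the core count) and that tasks are handed out in rounds with no work stealing. This is an assumption for illustration only; y-cruncher's real scheduling is more sophisticated and configurable.

```python
import math


def static_split_efficiency(cores: int) -> float:
    """Fraction of total core-time spent doing useful work in the toy model.

    Assumption: work is halved repeatedly until there are at least `cores`
    equal tasks, then tasks run in rounds of `cores` at a time.
    """
    tasks = 1 << (cores - 1).bit_length()   # next power of two >= cores
    rounds = math.ceil(tasks / cores)       # rounds needed to run every task
    return tasks / (rounds * cores)


if __name__ == "__main__":
    for cores in (4, 6, 8, 12, 16):
        print(f"{cores:>2} cores: {static_split_efficiency(cores):.0%} busy")
```

In this model 8 cores stay fully busy, while 6 cores need two rounds to finish 8 tasks and so sit idle a third of the time. Finer-grained task splitting or dynamic scheduling shrinks that gap, which is the kind of trade-off the manual parallelism settings control.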

 

 

Large Pages:

 

Large pages did not matter much in the past, but they do now in the post-Spectre/Meltdown world. Mitigations for the Meltdown vulnerability can cause a noticeable performance drop for y-cruncher (up to 5% has been observed). It turns out that turning on large pages can mitigate the penalty for this mitigation. (pun intended)

 

Refer to the memory allocation guide on how to turn on large pages.
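
Before following that guide on Linux, it can help to see what the kernel currently reports. The sketch below is a small, assumption-laden helper (the function name is made up for this example) that reads the huge-page fields from `/proc/meminfo`. It only inspects the system state; it does not enable anything, and the memory allocation guide remains the reference for the actual setup.

```python
from pathlib import Path


def parse_huge_page_info(meminfo_text: str) -> dict:
    """Extract huge-page related fields from /proc/meminfo-style text."""
    info = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if key.startswith("HugePages_") or key in ("Hugepagesize", "AnonHugePages"):
            fields = rest.split()
            if fields:
                # Values look like "2048 kB" or "0"; keep the leading integer.
                info[key] = int(fields[0])
    return info


if __name__ == "__main__":
    path = Path("/proc/meminfo")
    if path.exists():
        for key, value in parse_huge_page_info(path.read_text()).items():
            print(f"{key}: {value}")
    else:
        print("No /proc/meminfo here; this check only applies to Linux.")
```

If `HugePages_Total` is 0, no explicit huge pages have been reserved on that machine.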

 

 

Swap Mode:

 

This is probably one of the most complicated features in y-cruncher.

 

 

Known Issues:

 

Everything in this section is in the process of being re-verified and moved to: https://github.com/Mysticial/y-cruncher/issues

 

 

Performance Issues:


Algorithms and Developments:

 

FAQ:

 

Pi and other Constants:

 

Program Usage:

 

Hardware and Overclocking:

 

Academia:

 

Programming:

 

Other:

 

Links:

Here's some interesting sites dedicated to the computation of Pi and other constants:

 

Questions or Comments

Contact me via e-mail. I'm pretty good with responding unless it gets caught in my school's junk mail filter.

You can also find me on Twitter as @Mysticial.