Release Date |
Version |
|
Changes |
|
| April 6, 2011 |
0.5.5 Build 9180
(fix 2)
Windows Only |
|
- Fixes:
- The x64 AVX ~ Hina binary is now compatible with non-Intel processors.
- As a side-effect, the x64 AVX ~ Hina binary is about 1% faster on Intel processors as well.
- The contact email address has been changed to a-yee@u.northwestern.edu.
|
|
February 20, 2011 |
0.5.5 Build 9179
(fix 1)
Windows + Linux |
|
- Fixes:
- Fixed a major bug in the ArcCoth code that may cause incorrect computation of all dependent constants:
- Log(2)
- Log(10)
- Euler-Mascheroni Constant
- This bug has probably been present in y-cruncher since v0.4.1.
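As a rough illustration of how a logarithm can depend on ArcCoth, here is a minimal sketch that computes Log(2) from a Machin-like combination of arccoth series using integer fixed-point arithmetic. The identity used (18 arccoth(26) - 2 arccoth(4801) + 8 arccoth(8749)) is a well-known one chosen for illustration only; this changelog does not say which formula y-cruncher uses, and the function names are hypothetical.

```python
def arccoth_fixed(x: int, scale: int) -> int:
    """Approximate arccoth(x) * scale for an integer x > 1.

    Uses the series arccoth(x) = sum over k >= 0 of 1 / ((2k+1) * x^(2k+1)),
    with every term truncated to an integer.
    """
    total = 0
    power = scale // x        # scale / x^(2k+1), starting at k = 0
    x_squared = x * x
    k = 0
    while power:
        total += power // (2 * k + 1)
        power //= x_squared
        k += 1
    return total


def ln2_fixed(digits: int, guard: int = 10) -> int:
    """Return Log(2) truncated to `digits` decimal places, as an integer."""
    scale = 10 ** (digits + guard)   # guard digits absorb truncation error
    value = (18 * arccoth_fixed(26, scale)
             - 2 * arccoth_fixed(4801, scale)
             + 8 * arccoth_fixed(8749, scale))
    return value // 10 ** guard


print(ln2_fixed(30))
```

Because every term of the combination goes through the same arccoth routine, a defect in that routine would affect any constant built on top of it.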
|
|
February 1, 2011
February 3, 2011 |
0.5.5 Alpha
Build 9178
Windows + Linux |
|
- New Features:
- Support for the new Advanced Vector Extensions (AVX) instruction set.
- Changes:
- All Windows binaries with SSE/AVX are now compiled using the Intel Compiler 11.1.
- All Linux binaries are now compiled using GCC 4.4.5. Furthermore, all binaries are compiled as C code, not C++.
- Minor changes in speed due to rewritten code.
- Optimizations:
- x64 AVX ~ Hina:
- Specially tuned for the Intel Sandy Bridge Core i7 Processor line.
- ~10% faster than x64 SSE4.1 on Sandy Bridge Core i7.
- Requires Windows 7 Service Pack 1 or later.
- The final output of digits at the end of each computation is now faster.
Note that this has no effect on benchmarks since outputting digits to disk does not count towards computation time.
- The built-in Digit Viewer is now faster.
- Fixes:
- The Linux binaries are now statically linked.
|
|
| August 28, 2010 |
0.5.4 Build 9157
(fix 1)
Linux Only |
|
- Optimizations (Linux):
- Re-tuned I/O. This may or may not be faster than before. Note that raw I/Os are still not used because the current implementation is slower than using straightforward buffered I/Os. (This is in contrast to Windows, where raw I/Os are faster than buffered I/Os.)
- Slightly improved speed. Added "-ffast-math" to the compile options.
- New Features:
- Colored console output has been added.
- CPU brand detection has been added.
- CPU frequency detection has been added.
- Memory detection has been added.
- Automatic memory selection has been enabled in Advanced Swap Mode.
|
|
| August 16, 2010 |
0.5.4 Build 9150
(fix 1)
Linux Only |
|
- New Features:
- This is the first Linux release. It is slower than the Windows version and does not support all of its features, but it's a start.
|
|
| August 5, 2010 |
0.5.4 Build 9148
(fix 1) |
|
- Fixes:
- Fixed a bug that would cause all computations longer than 24.8 days to trigger a "Sanity Check Error".
- Fixed a bug that would cause a "Write Error" when using more than 10 drives in Advanced Swap Mode.
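The 24.8-day threshold is numerically close to the point where a signed 32-bit millisecond counter overflows. The changelog does not state the cause of the bug, so the following is only a worked calculation under that assumption; the function name is hypothetical.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000


def days_until_wrap(bits: int, signed: bool = True) -> float:
    """Days until a millisecond counter of the given width overflows."""
    limit = 2 ** (bits - 1) if signed else 2 ** bits
    return limit / MS_PER_DAY


# Assumption: elapsed time held in a signed 32-bit millisecond counter.
print(f"{days_until_wrap(32):.3f} days")
```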
|
|
| August 2, 2010 |
0.5.4 Alpha
Build 9146 |
|
- New Features:
- Checkpointing: Advanced Swap Mode computations can now be interrupted and restarted at certain checkpoints. This allows large computations to survive events such as power outages and unrecoverable computational errors. It also allows computations to be paused and restarted.
- Improvements:
- Better Error-Correction: The program is now better able to recover from computational errors. Some computational errors that were uncorrectable in previous versions are now correctable in v0.5.4.
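The checkpointing idea described above can be sketched as follows. This is a generic illustration and not y-cruncher's implementation: the file format, the function names, and the toy "work" loop are all assumptions. The state is written to a temporary file and then renamed into place, so an interruption during the write never leaves a half-written checkpoint.

```python
import json
import os
import tempfile


def save_checkpoint(path: str, state: dict) -> None:
    """Write the state to a temporary file, then atomically swap it in."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
        handle.flush()
        os.fsync(handle.fileno())
    os.replace(tmp_path, path)  # atomic when both paths share a filesystem


def load_checkpoint(path: str) -> dict:
    """Return the saved state, or a fresh state if no checkpoint exists."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except FileNotFoundError:
        return {"step": 0, "total": 0}


def run(path: str, steps: int, stop_at=None) -> int:
    """Resume from the last checkpoint and run until `steps` are done."""
    state = load_checkpoint(path)
    while state["step"] < steps:
        if stop_at is not None and state["step"] >= stop_at:
            raise RuntimeError("simulated power outage")
        state["total"] += state["step"] ** 2  # stand-in for real work
        state["step"] += 1
        save_checkpoint(path, state)
    return state["total"]
```

Calling `run()` again with the same path after an interruption continues from the last completed step rather than from the beginning.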
|
|
| May 13, 2010 |
0.5.3 Build 9134b
(fix 2) |
|
- Fixes:
- Fixed a memory leak at the end of each computation. This affects Batch Mode the most because it runs many computations in succession.
|
|
| April 26, 2010 |
0.5.3 Build 9133b
(fix 1) |
|
- Fixes:
- Fixed a bug in the Compute + Verify option for Euler's Constant.
|
|
| April 15, 2010 |
0.5.3 Alpha
Build 9132 |
|
- Changes:
- The Stress Test feature will now run in below normal priority to increase system responsiveness.
- When an error is detected in the stress test feature, both threads will stop after completing (or failing) their current tests.
- These changes are thanks to a number of requests that I have received.
- When switching to Advanced Swap Mode, the program will now choose a default memory setting based on the amount of total and available physical memory that is in the system.
The ability to set the memory usage in Advanced Swap Mode was less than obvious in v0.5.2. This resulted in some users using the default lowest memory setting when it could have been a lot faster to use more memory.
- In the Benchmark feature, benchmark sizes that require more memory than there is available are faded out. (Though they can still be run.)
- The "Validation.txt" files that the program outputs can now be customized with your name/screenname (i.e. a way to identify that the benchmark was done by you).
- Fixes:
- Though technically not a fix, this version adds a check that detects a bug in Windows where thread creation will sometimes return a normal return value when in fact the thread fails to be created due to insufficient memory.
Previously, this would result in silent errors that would cause a computation to give incorrect digits or trigger other redundancy checks later in the program.
- The sensitivity of the cheat-detection has been slightly decreased as it had been giving a lot of false positives on certain motherboards with less precise hardware timers.
- Fixed an integer-overflow bug in the 32-bit binaries that would occur when writing decimal digits at the end of a computation that is larger than ~41 billion digits.
- Fixed a possible stack-corruption bug for computations larger than 500 billion digits.
- Fixed some minor bugs in the interface.
- Optimizations:
- Algorithmic change in the final Base Conversion for all constants. The new algorithm is a partial implementation of the Scaled Remainder Tree method that was used in the current world record for Pi.
- This switch provides a near 2x speed up for the conversion - or about 10% for Pi computations.
- The rest of the algorithm will be put off to a later version and is expected to give another 30 - 40% speedup for the conversion.
- As a side-effect of the new conversion, the memory requirements for square roots, Golden Ratio, e, and Pi have decreased slightly.
- New Redundancy Checks:
- The speedups brought on by the new conversion algorithm open up an opportunity to add some new redundancy checks that increase the reliability of the program without decreasing performance.
- A verification has been added to the Base Conversion at the end of each computation.
Though somewhat expensive, this verification is done after writing the decimal digits to disk and does not count towards the "Computation Time" parameter. Therefore it does not really count as a performance penalty.
This verification is needed because of a change in algorithm for the conversion. (see "Optimizations")
Unlike the old algorithm, this new algorithm is not sufficiently self-verifying. Therefore, a verification is needed to catch any computation errors that fail to propagate to the last 100 digits. (since only the last 51 - 100 digits are checked to see if a computation is correct)
This verification also guarantees that the entire base conversion has been done correctly with a certainty of 2^61. (An error has a 1 in 2^61 chance of not getting caught.)
- A verification has also been added to the "Final Multiply" for Advanced Swap Mode Pi computations using the Chudnovsky algorithm. This ensures that any computational error that fails to propagate to the last few hexadecimal digits will be caught with a certainty of 2^61. This comes with a slight performance hit.
- A number of new and extremely aggressive redundancy checks have been added to:
- The "series" for all applicable constants except for Euler's Constant.
(Redundancy checks for Euler's Constant will be included in a later version.)
In the future, the program will also attempt to correct for errors as well.
- Newton's Method for Division and Square Roots.
- Within the new Base Conversion algorithm. (This will actually attempt to correct for errors too.)
- Note that the first of these does come with a noticeable (but small) performance hit.
- Also note that these redundancy checks (although aggressive) still in no way guarantee that a computation that finishes will finish with the correct results. Verification of the digits will always require a separate and independent algorithm (or comparison against known pre-computed results).
|
|
| March 10, 2010 |
0.5.2 Alpha 3
Build 9082 |
|
- Fixes:
- Removed two hidden line-feed characters that were present in the validation files.
These were unintentional and were causing validation problems because they are non-standard and were being messed up by various text editors and viewers.
- Fixed an issue that would prevent the program from being able to perform arithmetic above ~20 trillion digits. As of v0.5.2, only Square Roots and Golden Ratio are unlocked beyond 10 trillion digits. Neither of them uses full-size arithmetic, so they will not actually fail until ~28 trillion digits.
|
|
| March 4, 2010 |
0.5.2 Alpha 3
Build 9074 |
|
- Fixes:
- Fixed an issue where the program would halt with an assertion error when it tried to print a line longer than 78 characters.
- Only computations larger than 10 trillion digits will be large enough to trigger this.
|
|
| March 3, 2010 |
0.5.2 Alpha 3
Build 9072 |
|
- Reliability Update:
- y-cruncher now uses raw, non-buffered I/Os. This serves to bypass a number of MAJOR memory issues arising from sub-optimal OS buffering.
- The primary purpose of this is to fix one MAJOR issue when handling extremely large files.
When creating a large file for non-sequential writes, Windows will attempt to cache a "small" percentage of the file. What exactly is it caching? I have no idea, my guess is that it's trying to cache the portion of MFT that maps the file.
The problem arises when that "small" percentage is not that "small" anymore when the swap files are terabytes large...
In one of my test runs, a 2.7 trillion digit multiplication failed when the program attempted to do non-sequential writes to four 1 TB swap files (total 4 TB large). The result was that the system cache exploded which immediately triggered Windows Error Code 1450 because of insufficient virtual memory. Because of the work-around that was added in 0.5.2.9040, the program was able to continue after increasing the virtual memory size. However, it continued to thrash virtual memory for several hours before the program was terminated manually. The thrashing simply showed no signs of stopping.
After diagnosing the cause of the system cache spike (which was more than 10 GB large), it was determined that it was due to the OS's stupid caching schemes. (of course it was never really designed for this kind of use...)
The only true work-around to the problem was to completely avoid OS buffering by using raw I/Os.
|
|
| February 26, 2010 |
0.5.2 Alpha 2
Build 9040 |
|
- Reliability Update:
- Added a work-around for an issue where page-thrashing can cause a Windows Error Code: 1450 (ERROR_NO_SYSTEM_RESOURCES).
- This is actually not a bug in the program. It is an issue in Windows. In Advanced Swap Mode, there may be long periods of time where y-cruncher does not use all of its allocated memory. As a result, Windows will page out some (or all) of the unused portion. However, when y-cruncher finally does need to use it, Windows will thrash the pagefile like crazy. The resulting stall can sometimes be enough for the OS to fail an I/O with an error code 1450.
- This "work-around" isn't really a work-around at all. Instead of terminating the program when it encounters an I/O error, it simply pauses and retries until it either completes successfully, or the user decides to kill the program because something else is clearly wrong. This may also give other types of I/O failures another chance in case of a random failure of some sort.
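The pause-and-retry behavior described above can be sketched like this. It is a generic illustration, not y-cruncher's code: the function name, the delay, and the use of Python's `OSError` to stand in for a failed I/O are assumptions.

```python
import time


def retry_io(operation, delay_seconds=5.0, sleep=time.sleep, log=print):
    """Call operation() until it succeeds, pausing after each OSError.

    There is deliberately no attempt limit: the loop ends only on success
    or when the user kills the process.
    """
    attempt = 1
    while True:
        try:
            return operation()
        except OSError as error:
            log(f"I/O error on attempt {attempt}: {error}; retrying")
            sleep(delay_seconds)
            attempt += 1
```

A caller would wrap each read or write in a function and pass it in, for example `retry_io(lambda: handle.write(block))`, where `handle` and `block` are whatever file object and buffer the caller already has.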
|
|
| February 25, 2010 |
0.5.2 Alpha 2
Build 9037 |
|
- Fixes:
- Fixed an issue that may prevent extremely large computations from working properly when a low memory cap is selected.
- This is an issue in the 5-step convolution algorithm for squaring. This does not affect 5-step convolution for multiplication. Not all cases of squaring via 5-step convolution are affected. Only when the memory selection is very low does it occur. (3.3 trillion digits using less than 8GB of ram will trigger this.)
- When this issue is triggered, one of two things may happen:
- The program will be tricked into using 3-step convolution - which may result in extreme performance degradation.
- The program will terminate with an error stating that there is insufficient memory.
|
|
| February 23, 2010 |
0.5.2 Alpha
Build 9025 |
|
- Fixes:
- MAJOR fix to Advanced Swap Mode. This version fixes an issue that was causing a major performance degradation in Advanced Swap Mode.
- This was caused by slight differences between the public releases and the private betas.
Generally speaking, only the internal builds of the program are tested. Those internal builds have extra code in them that displays detailed debugging information. None of this code is compiled in the public releases.
It just unfortunately turns out that there was some "required" code that was accidentally put with the debugging code. So it was not compiled in the public versions for v0.5.2.9021.
This build fixes those errors.
|
|
| February 23, 2010 |
0.5.2 Alpha
Build 9021 |
|
- New Features:
- Advanced Swap Mode:
- Yeah, it's about time... This is probably the only thing that matters in this version. :P
- Allows large computations to be done using very little ram.
- Full support for multiple hard drives: Although this may seem redundant with Raid 0, it allows for unlimited drives. This serves to overcome the limitations imposed by Raid 0. (which is usually limited to 4 - 6 drives)
Total bandwidth scales linearly with the # of drives, but is bottlenecked by the slowest drive. This is potentially better than Raid 0 in some cases. You will need to play with the settings to achieve the optimal combination of Raid 0 and the multi-hard drive setting in y-cruncher.
(For example: 3 x 4-way Raid 0 vs. 4 x 3-way raid 0.)
- Note that this is a very primitive version of Advanced Swap Mode. It has also yet to be burn-in tested so it's potentially very buggy. And lastly, it isn't supported for all constants and algorithms yet.
The x86 versions now use 64-bit indexing for Advanced Swap Mode. So they should be clear for computations greater than 20 or 41 billion digits (which are the respective limits for signed and unsigned 32-bit indexing).
However, I have yet to test them above those sizes so they may still fail if there are any remaining 32-bit indexes that should have been converted to 64-bit.
The x64 versions have always used 64-bit indexing for everything, so they are clear for all sizes up to the theoretical limit of the program.
Future improvements should include:
- Reduced total disk memory usage.
- Reduced # of disk I/Os.
- Checkpointing and crash-recovery.
- Support for all constants and algorithms.
- A 3rd algorithm for Catalan's Constant. The current secondary algorithm is extremely I/O bound due to its use of the AGM (Arithmetic Geometric Mean).
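The 20 and 41 billion digit figures quoted above for 32-bit indexing can be reproduced with a short calculation, assuming each index addresses one 32-bit word of the number. That word size is an assumption made here for illustration; the changelog does not say what unit is indexed.

```python
import math

# A 32-bit word carries 32 * log10(2), or about 9.63, decimal digits.
DIGITS_PER_32BIT_WORD = 32 * math.log10(2)


def max_decimal_digits(index_bits: int) -> float:
    """Largest number size, in decimal digits, addressable by the index."""
    return (2 ** index_bits) * DIGITS_PER_32BIT_WORD


for label, bits in (("signed 32-bit index", 31), ("unsigned 32-bit index", 32)):
    print(f"{label}: {max_decimal_digits(bits) / 1e9:.1f} billion digits")
```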
- New Validation Scheme:
- The validation now provides much more detail than in the previous versions.
- All computations are now validated.
- All constants. Not just Pi.
- All algorithms.
- All computation modes.
- Even failed computations and benchmarks are validated. They will simply be marked as "failed" in the validation certificate.
- The validator is now easier to use. Just upload the file. No more manually entering fields.
- Only the benchmark feature for Pi will be able to auto-verify the computed digits. But the last 50-100 digits that are computed will be included in the validation so that they can be verified using external sources.
- Fixes:
- Memory estimation is more accurate. (Previously, it would underestimate actual memory usage by as much as 50MB.)
- The Stress Test feature will no longer over-shoot the target memory usage by about 50MB. (Same bug as above.)
- When entering a write path or a swapfile path, the program will now actually check to see if the path is valid and writable.
- Fixed the timers in the "Compute + Verify" option for e.
- Limits:
- Advanced Swap Mode without raising the limits of the program is kinda useless:
- The limits for e, and Pi have been raised to 10 trillion digits.
- The limits for Log(2), Log(10), and Zeta(3) have been raised to 1 trillion digits.
- The limits for Catalan's Constant, and Euler's Constant have been raised to 250 billion digits.
- Note that these limits are well above the current world records for each respective constant. (At the time of this writing.)
So feel free to attempt a world record if you have the resources. But bear in mind that the program has NOT been tested at these sizes. So there is no guarantee that it will function correctly.
I can no longer afford to tie up my machines for extended periods of time, so I can't do any more long-running computations on my own machines.
- The x86 versions are all limited to 80 billion digits. This is because computations above that size become extremely inefficient without using more than 2GB of ram (which is the limit for x86).
Internally, x86 versions are capable of performing much larger computations than a mere 80 billion digits. But it would be completely impractical to do so without the use of SSDs (Solid State Disks) - which is not recommended anyway because of write-wear.
- As with the previous few releases, Square Roots and Golden Ratio have no limit.
They are capped at 90 trillion digits to give a couple orders of safety margin before reaching the precision limit of 64-bit floating-point - which is the true theoretical limit of the program.
- Optimizations:
- All x64 binaries are now a bit faster. (16 register tuning)
- Prior to this version, the vast majority of performance-critical code was written on x86 and tuned for 8 GP and SSE registers - which is sub-optimal for x64. The x64 binaries for this version are better tuned for 16 registers (GP and SSE).
- 5 - 12% faster on AMD K10.
- 2 - 6% faster on Core 2.
- 2 - 5% faster on Core i7.
- The speed of the x86 binaries is unchanged since v0.4.4.
- Other:
- This version is not speed consistent with v0.4.3 - v0.4.4 (which have become semi-standardized). Furthermore, this version will likely be the first in a series of successive optimizations. Therefore the use of v0.5.x for competitive benchmarking should be held back until the speed of the program stabilizes.
- Advanced Swap Mode opens the possibility for hard drive benchmarking. But this will be heavily biased towards machines with a lot of ram and a lot of hard drives (or SSDs) running in parallel.
(Which could turn into a competition of who has the "most" hardware - rather than who has the "best" or "best tweaked" hardware...)
|
|
| January 6, 2010 |
0.4.4 Build 7762b
(fix 2) |
|
- Fixes:
- Fixed the benchmark validator.
|
|
| December 2, 2009 |
0.4.4 Build 7760
(fix 1) |
|
- New Features:
- The last 50 - 100 digits are printed out at the end of a computation.
- The Compute + Verify modes for all constants that support it will now actually compare the digits from the computation and verification runs to see if they do indeed match. This auto-compare already existed in v0.1.0 - v0.2.1, but was taken out completely from v0.3.1 onwards. This release re-enables this feature. But for the sake of efficiency and ease of implementation, it only compares the last few digits of the two runs to determine if the computations match (whereas v0.1.0 - v0.2.1 compared ALL the digits).
- Fixes:
- Fixed a bug in the CPU consumption and utilization %'s.
- Fixed some minor bugs in the x86 binaries.
- Changes:
- The CPU consumption and utilization measurements no longer include the time needed to write digits to disk. They now only measure the actual computation time. Writing digits to a slow disk had the effect of drastically lowering utilization and efficiency %'s, leading some to believe that the program is a lot less efficient than it really is. Enabling vs. disabling hexadecimal output also had a huge effect on the measurements.
|
|
| November 18, 2009 |
0.4.4 Build 7748 |
|
- New Features:
- New specially tuned binary for AMD K10 Processors.
- Start and end dates have been added to computations. (Useful for those extra long computations.)
- CPU utilization and multi-core efficiency statistics have been added.
- Added an "Advanced Options" section. The benchmark validator has been moved there.
- Users who are running an x86 OS on an x64 SSE3 capable system will be informed.
- Added detection support for AVX and FMA instruction sets.
- Optimizations:
- x64 SSE3 ~ Kasumi:
(Credit to Raymond Chan.)
- Specially tuned for Phenom II X4.
- 0.5 - 2 % faster than v0.4.3 (x64 SSE3) on Phenom II X4.
- Slightly faster Log(2), Log(10), and Euler's Constant.
|
|
|
0.4.3 Build 7681 |
|
- New Features:
- Colored Console Output:
- Slightly less dull-looking than previous versions. :)
- Automatic Version Detection:
- Launch Executable. It will automatically choose the best version of the program to run.
- Validated Batch Benchmarks:
- Standard and SuperPi-sized batch benchmarks now provide validation.
- Stronger Anti-Tampering Protection:
- Binaries that have been tampered with will not run.
- Helps guard against validation cracking via modding the executables.
- Fixes:
- Fixed a bug in the Digit Compare feature.
- Optimizations:
- All Binaries:
- All SSE binaries are now compiled using the Intel Compiler.
- Numerous internal optimizations.
- Status refreshing has been capped to once/second to reduce printing overhead for small benchmarks.
- x64 SSE4.1 ~ Ushio:
- Specially tuned for Core i7.
- 5 - 18% faster than v0.4.2 (x64 SSE3) on Core i7.
- x64 SSE4.1 ~ Nagisa:
- Specially tuned for Harpertown.
- 0 - 12% faster than v0.4.2 (x64 SSE3) on 2x Harpertown.
- x64 SSE3:
- Re-tuned for a smaller cache.
(Previously tuned for 3MB cache/thread.)
- Speed up vs. v0.4.2 (x64 SSE3) varies by processor. (Typically around 5 - 10%)
- x86 SSE3:
- Re-tuned for a smaller cache.
(Previously tuned for 2MB cache/thread.)
- Much improved multi-core efficiency for small computations.
- 15 - 50% faster depending on computation size.
- Single-threaded timings are now competitive with PiFast 4.3.
- x86:
- Re-tuned for a smaller cache.
(Previously tuned for 2MB cache/thread.)
- Much improved multi-core efficiency for small computations.
- 10 - 40% faster depending on computation size.
- Overall:
- This is the first release dedicated primarily to optimizations. There are few functional changes.
- Note that the binaries have gotten a lot larger since v0.4.2. This is because the Intel Compiler does more aggressive optimizations than the Visual Studio Compiler.
|
|
|
0.4.2 Build 7438 |
|
- New Features:
- Batch mode option for running automated benchmarks.
- Stress Testing option for stability checking and burn-in testing.
- Fixes:
- Corrected the name for the secondary formula for Catalan's Constant.
- Corrected some spelling errors.
|
|
|
0.4.1 Build 7412 (fix 1) |
|
- Fixes:
- Fixed a major bug in the Basic Swap mode for the x64 binaries.
|
|
|
0.4.1 Build 7409 |
|
- Fixes:
- Fixed an issue where a "Sanity Check Error" would sometimes occur for extremely fast benchmarks that take less than a few seconds.
|
|
|
0.4.1 Build 7408 |
|
- New Constants:
- Square Root of any small integer
- Golden Ratio
- e
- New Features:
- Added "SuperPi" sized benchmarks:
- 1M, 2M, 4M, etc... up to 128G.
- Existing benchmarks have been extended to 100b.
- To satisfy those who have access to server-racks and super-computers... Don't even try these on a desktop... :)
- Digit Compare is back and with full support for compressed digits.
- Compute and Verify is back for the constants that benefit from reusing steps.
- e
- Log(2) and Log(10)
- Euler's Constant
- Fixes:
- Fixed a bug in the secondary formula for Euler's Constant where it would sometimes terminate with an "Allocation Failure" even when there is plenty of memory.
- Fixed a bunch of bugs in the x86 binaries...
- Optimizations:
- Minor speed-ups in a few random places.
- Limits:
- Thread limit has been increased from 64 to 256 threads.
- 200 billion digits for Square Roots, Golden Ratio, e, and Pi.
- Other:
- A lot of code has been rewritten and re-tuned in preparation for some future features. So there may be some minor speed differences for all computations.
- Dropped support for x64 without SSE3.
|
|
|
0.3.2 Alpha
Build 6953 (fix 1) |
|
- Fixes:
- Fixed a major bug in the digit viewer where it may incorrectly view compressed decimal digits in .ycd files larger than ~2 GB.
|
|
|
0.3.2 Alpha
Build 6945 |
|
- New Features:
- Added a single-threaded mode for Benchmarks.
- Minor improvements to benchmark validation.
- Benchmark validation is now slightly more resistant to cheating.
- Optimizations:
- Computations now require less memory. (~20% for Pi, less so for other constants)
- The 2.5b, 5b, and 10b benchmarks will just barely fit into 12GB, 24GB, and 48GB of ram respectively - perfect for triple channel Nehalem systems.
- The % complete status now has a bit more resolution.
- Fixes:
- The error-correction feature has been fixed. In benchmark mode, errors will automatically fail a benchmark even if the error is recoverable.
- Some inconsistencies with the reported cpu frequency have been fixed. Note that the incorrect readings on multiplier-jacked CPUs have NOT been fixed yet.
|
|
|
0.3.1 Alpha
Build 6897 |
|
- New Features:
- Benchmark Validation and Anti-cheat protection.
- Pre-set sizes for validated benchmarks:
- x86: 25m, 50m, 100m, and 250m
- x64: 25m, 50m, 100m, 250m, 500m, 1b, 2.5b, 5b, and 10b
- The larger benchmarks will require a LOT of ram. New Challenge!
- How high can you overclock while maintaining a full ram configuration?
- How high can you overclock a fully-loaded workstation?
- Benchmark computations will be verified against known digits to ensure that they are correct.
- Anti-clock tampering protection.
- Timings now use hardware clocks - which are more accurate and cannot be tampered with via system clock.
- Try to tamper with the clocks (there's more than one), and it will fail validation.
- Validation Checksum
- Checksums are computed from Benchmark Time, CPU frequency, CPU type... (among other things).
- Protects against output tampering.
- Protects against system substitution. (transferring the output of a valid benchmark from a faster computer to a slower one)
- New Layout for Option Selection
- The program starts with a set of default options - which can be changed manually. This avoids all the option selection from the previous versions.
- Auto-detect # of threads.
- Shows estimated disk usage for swap computations.
- Output path can now be specified.
- Compressed Digit Format
- Hexadecimal digits will compress to 50% of text-file size.
- Decimal digits will compress to roughly 42% of text-file size.
- Compressed digits can be read directly by the new digit viewer.
- Compressed digits can be split into smaller files and accessed individually by the digit viewer.
- This feature is already present in the new Digit Viewer. Version 0.3.1 fully integrates it.
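The compression ratios quoted above are what one would get from packing 16 hexadecimal digits (8/16 = 50%) or 19 decimal digits (8/19, about 42%) into each 64-bit word instead of storing one byte per digit. The sketch below illustrates that packing under those assumptions; the byte order, the lack of a file header, and the function names are hypothetical and do not describe the actual .ycd layout.

```python
import struct

DIGITS_PER_WORD = 19  # 10**19 is the largest power of ten below 2**64


def pack_decimal(digits: str) -> bytes:
    """Pack decimal digits into little-endian 64-bit words, 19 per word."""
    if len(digits) % DIGITS_PER_WORD:
        raise ValueError("this sketch needs a multiple of 19 digits")
    words = [int(digits[i:i + DIGITS_PER_WORD])
             for i in range(0, len(digits), DIGITS_PER_WORD)]
    return struct.pack(f"<{len(words)}Q", *words)


def unpack_decimal(blob: bytes) -> str:
    """Inverse of pack_decimal; restores leading zeros within each word."""
    words = struct.unpack(f"<{len(blob) // 8}Q", blob)
    return "".join(f"{word:019d}" for word in words)


def pack_hex(digits: str) -> bytes:
    """Pack hexadecimal digits two per byte (needs an even digit count)."""
    return bytes.fromhex(digits)


decimal_text = "14159265358979323846264338327950288419"  # 38 digits of Pi
packed = pack_decimal(decimal_text)
print(len(packed) / len(decimal_text))  # ratio of packed to text size
```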
- Euler's Constant can now be computed to any # of digits. (They were locked to specific sizes in the previous versions.)
- Compute and Verify + File Compare have been temporarily disabled as they need to be updated to support the new compressed digit format.
- Overall:
- This release consists of mostly interface changes. No optimizations. No bug fixes.
|
|
|
0.2.1 Alpha
Build 6841 |
|
- New Constants:
- Log(10)
- Zeta(3) - Apery's Constant
- Catalan's Constant
- Fixes:
- Fixed a pagefile thrashing problem when writing digits at the end of a large computation that used all the ram in a computer.
- FFT settings have been pulled back to more conservative levels. This comes at a slight speed penalty, but is necessary to ensure reliability.
- Fixed some errors that were caused by the program being a little bit too aggressive with multi-threading. This also comes at a slight speed penalty.
- Added an extra redundancy check for base conversions. (see below)*
- Optimizations:
- Faster "Compute and Verify" for Log(2) via a better pair of Machin Formulas.
- Improved multi-core efficiency. Barely noticeable on dual-core but an obvious improvement on 8-core. (Nearly 10% improvement in some cases on 8-core.)
- Basic Swap Mode is now a bit faster and requires only half the memory from before.
- Size Limits:
- x86
- 466 million digits for Euler's Constant
- 840 million digits for all other constants
- x64
- 29.8 billion digits for Euler's Constant
- 31 billion digits for all other constants
- Overall:
- This second release (as well as the next few) consists mainly of new features and bug fixes. There won't be much in the way of optimizations. Therefore, the next few releases won't be much faster (and maybe even a bit slower if certain bug fixes necessitate it). I'll make up for it when I start doing optimizations.
*This extra redundancy check is needed to close a small weakness in the method that y-cruncher uses to verify its base conversions.
In order to understand the following paragraph, you must be familiar with radix conversions on floating point numbers.
For a record size computation to qualify as a new world record, it must be verified.
y-cruncher performs a base conversion on a number by first normalizing it to an integer, and then base converting the integer.
The current method of verifying a base conversion is to do it twice using different cutting parameters and apply a modular hash check on the final (integer) base conversion. However, I have found that the powering stage of the normalization step goes through much of the same arithmetic even with different cutting parameters. This opens up a weakness. Since the base conversion is done twice, any hardware errors will be caught. However, if there is a bug (programmer or compiler error) that affected the normalization, it may result in the same incorrect answer for both conversions because of the "shared" arithmetic (and thus pass final verification).
To close this weakness, I have added a modular hash check to the powering stage of the normalization. All existing records that have been set prior to this change should still be fine because y-cruncher already has redundancy checks built into its multiplication. And of course, the digits agree with previous records. |
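A modular hash check of the kind mentioned above can be sketched as follows. The idea is that a large multiplication or power can be cheaply re-checked by repeating it on small residues. The modulus 2^61 - 1 and the function names are assumptions made for this illustration, not details confirmed by the text.

```python
MODULUS = 2 ** 61 - 1  # a Mersenne prime; an assumption for this sketch


def product_passes_check(a: int, b: int, claimed: int) -> bool:
    """True if claimed is congruent to a * b modulo MODULUS."""
    return (a % MODULUS) * (b % MODULUS) % MODULUS == claimed % MODULUS


def power_passes_check(base: int, exponent: int, claimed: int) -> bool:
    """True if claimed is congruent to base ** exponent modulo MODULUS."""
    return pow(base % MODULUS, exponent, MODULUS) == claimed % MODULUS


a = 3 ** 500
b = 7 ** 400
good = a * b
bad = good ^ (1 << 123)  # simulate a single flipped bit in the result

print(product_passes_check(a, b, good))
print(product_passes_check(a, b, bad))
```

A wrong result slips through only if it happens to share the residue of the correct one, which for an effectively random error happens about once in MODULUS tries.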
|
|
0.1.0 Alpha
Build 6013 |
|
- Initial Constants:
- Pi
- Log(2)
- Euler-Mascheroni Constant
- Initial Features:
- Versions: x86, x86 SSE3, x64, and x64 SSE3
- Basic Swap Mode
- Multi-Threading
- Multi-Hard Drive
- Semi-Fault Tolerance
- Size Limits:
- x86
- 233 million digits for Euler's Constant
- 420 million digits for all other constants
- x64
- 7.4 billion digits for Euler's Constant
- 10 billion digits for all other constants
|