y-cruncher Multi-Precision Library

A Multi-threaded Multi-Precision Arithmetic Library

(Powered by y-cruncher)

(Last updated: March 3, 2016)

Shortcuts:

Numberworld Home

YMP Library:

Large Number Objects:

Low-Level Programming:

Twiddle Factor/Lookup Tables

Unlike most other bignum libraries, YMP's large multiplication requires the use of precomputed twiddle factor tables. Since nearly all non-trivial functions reduce to large multiplication, pretty much everything requires these tables.

In YMP we use the term, "lookup tables" since twiddle factors aren't the only users of the precomputation framework.

Why does YMP need them?

Space-time tradeoff.

As of 2015, all practical large multiplication algorithms that run in quasi-linear time are based on Fast Fourier Transforms (FFTs). And FFT-based algorithms require twiddle factors. Since twiddle factors are generally expensive to compute, they need to be precomputed and cached to achieve an acceptable level of efficiency.

Floating-point FFT twiddle factors require calls to sin/cos.
Number-theoretic FFT twiddle factors require power-modulus.

There is one exception. The Schönhage–Strassen algorithm (SSA) is a number-theoretic FFT whose twiddle factors are powers of two. For this reason, they are trivial and do not need to be cached. This algorithm is precisely the one that used in mainstream libraries such as GMP. Which is why they don't need twiddle factor tables.

Unfortunately, the SSA algorithm is not (and never was) the fastest algorithm for large multiplication. Therefore, twiddle factors are a necessary evil to utilize the more powerful algorithms that are further down the space-time trade-off curve.

Lookup Table Class:

In y-cruncher/YMP, the LookupTable class is the object that holds these tables. LookupTable is an incomplete type with one relevant parameter, cbitlen.

"cbitlen" stands for "Convolution Bit-Length". It is the size of the table.

A table with cbitlen = x will be able to handle any large multiplication of length N x M where N + M <= x bits.

Anything that uses large multiplication requires this table. Using a table of insufficient size is undefined behavior. (Most of the time, it'll just throw an exception.)

Larger tables require more memory than smaller tables. But the table will max out at a few hundred megabytes. A "maxed out" table can handle any size up to the limit of YMP (10¹⁸ bits). But it will still only require a few hundred megabytes. This is by design. On the current generation of processors, using larger tables provides little benefit for increased memory usage and cache pollution.

Table Management:

The lookup table is a read-only object that can be shared across multiple threads. Therefore, YMP keeps a global instance for convenience. This global table starts with a size of zero and will automatically resize itself in a thread-safe manner when requested at a larger size.

All functions in YMP that require a lookup table have overloads that allow the table to be passed in manually rather than defaulting to the global instance. This is useful for applications that wish to manage the table on their own. It is also useful for NUMA systems where it may be best to have multiple tables - one per NUMA node.

Global Table Management:

Headers: ymp/LowLevel.h
Sources: none

Get Lookup Table:

const LookupTable& get_global_table<u32_t> (uiL_t cwordlen = -1);

const LookupTable& get_global_table<u64_t> (uiL_t cwordlen = -1);

const LookupTable& get_global_table_bits (uiL_t cbitlen = -1);

Description:

Returns a lookup that is large enough to handle a convolution size of at least cwordlen word or cbitlen bits long. These functions are thread-safe.

If the parameter is -1 (or omitted), the table will be set large enough to handle any multiplication size that YMP supports. Depending on the target processor, the resulting "maxed out" table may consume several hundred megabytes. This may be the way to go since most machines nowadays have plenty of ram.

The lookup table is cached internally.

If it is large enough for the requested size, it is simply returned. This is done in an entirely lock-free manner.

If it is not large enough, it is resized in a thread-safe manner. Resizing of the internal table is done in with an exponentially growing factor.

Calling this function ensures that all subsequent calls with a size less than or equal to the call will return immediately without a resizing.

The reference that is returned by this function is not guaranteed to be the same between calls. But every reference that is returned will be valid for the lifetime of the library.

Get Lookup Table Size:

uiL_t get_global_table_size_cbitlen ();

Description:

Returns the size of the global lookup table. The number returned is cbitlen, the largest convolution size measured in bits.

Manual Table Management (currently unavailable):

Use these if you wish to manually manage tables.

Make Lookup Table:

dll_uptr<LookupTable> MakeLookupTable (uiL_t cbitlen = -1);

Description:

Construct the precomputed twiddle-factor/lookup table that is required by nearly all large multiply routines.

The parameter specifies that the table should be large enough to accommodate any large multiply whose product is at most cbitlen bits long.

If the parameter is -1 (or omitted), the table will be set large enough to handle any multiplication size that YMP supports. Depending on the target processor, the resulting "maxed out" table may consume several hundred megabytes. This may be the way to go since most machines nowadays have plenty of ram.

This function returns a smart pointer for the sake of RAII safety. It uses a custom smart pointer because std::unique_ptr is not guaranteed to be compatible across the DLL boundary.

y-cruncher Multi-Precision Library A Multi-threaded Multi-Precision Arithmetic Library (Powered by y-cruncher)

y-cruncher Multi-Precision Library

A Multi-threaded Multi-Precision Arithmetic Library

(Powered by y-cruncher)

y-cruncher Multi-Precision Library

A Multi-threaded Multi-Precision Arithmetic Library

(Powered by y-cruncher)