y-cruncher Multi-Precision Library

A Multi-threaded Multi-Precision Arithmetic Library

(Powered by y-cruncher)

(Last updated: March 3, 2016)

 


Multi-Threading and Parallelism

 

In the multicore era, it's parallelize or be left behind. It's really that simple. This page lists all the functionality related to parallelism.

 

Parallelism in y-cruncher and YMP is done using standard parallel constructs that are powered by a global framework.

This approach emphasizes a separation of concerns where:

For historical reasons, the y-cruncher project does not "natively" use any existing framework such as TBB or Cilk Plus. Instead, it has an additional layer of abstraction that allows the framework to be configurable.

 

All parallelism inside the YMP library is done using the parallel constructs on this page.

 

 

In practice, it is a bad idea to mix frameworks within the same application. Since every framework thinks it's the only one, using multiple frameworks can easily lead to problems ranging from simple oversubscription to outright undefined behavior.

 

YMP provides two methods to avoid the multiple-framework problem:

 

Currently, YMP's parallel framework can only be set to one of five built-in frameworks, some of which are unavailable on certain targets.

In all seriousness, the only useful frameworks are the Windows Thread Pool and Cilk Plus. But all of the built-in frameworks should integrate well with any other parallel framework except for OpenMP. If the client application uses Cilk Plus and YMP is configured to also use Cilk Plus, they will share the same Cilk Plus run-time.

 

The future plan is to expose the ParallelFramework object so that users can write custom frameworks. The current roadblock is DLL/C-ABI compatibility.

 

Parallel Constructs:

The constructs in this section are the ones used internally by both y-cruncher and YMP. While these constructs are more primitive and restrictive than more mainstream tools, they share the same parallel framework as YMP. So use these for better integration with YMP.

 


Run two Actions in Parallel:

void RunInParallel(BasicAction& a0, BasicAction& a1);

Description:

Run both actions a0 and a1 assuming that they are independent. The function will not return until both actions are completed.

Depending on the parallel framework, these actions may be run in parallel. It is safe to call this function recursively.
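As a rough illustration of the two-action overload, the sketch below sums the two halves of an array as independent actions. The BasicAction and RunInParallel() definitions in it are sequential stand-ins written only so the sketch compiles without YMP (they behave like the "none" framework); SumAction and parallel_sum() are hypothetical names that are not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// --- Stand-ins (assumptions): in real code these come from the YMP headers. ---
class BasicAction{
public:
    virtual void run() = 0;
    virtual ~BasicAction() = default;
};
// Sequential stand-in, similar to what the "none" framework would do.
void RunInParallel(BasicAction& a0, BasicAction& a1){
    a0.run();
    a1.run();
}
// --- End of stand-ins ---

// Hypothetical action: sums data[begin, end) into its own result slot.
// Each action writes only to its own member, so the two are independent.
class SumAction : public BasicAction{
    const std::vector<long long>& m_data;
    std::size_t m_begin;
    std::size_t m_end;
public:
    long long result = 0;

    SumAction(const std::vector<long long>& data, std::size_t begin, std::size_t end)
        : m_data(data), m_begin(begin), m_end(end)
    {
        assert(begin <= end && end <= data.size());
    }
    void run() override{
        result = std::accumulate(m_data.begin() + m_begin, m_data.begin() + m_end, 0LL);
    }
};

// Hypothetical helper: splits the input in half and runs both halves as actions.
long long parallel_sum(const std::vector<long long>& data){
    std::size_t mid = data.size() / 2;
    SumAction lo(data, 0, mid);
    SumAction hi(data, mid, data.size());
    RunInParallel(lo, hi);      // Returns only after both actions have completed.
    return lo.result + hi.result;
}
```

The results are read only after RunInParallel() returns, which is the point where both actions are guaranteed to be finished.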

 

Exception propagation is currently not defined. Most frameworks will simply terminate the program on an unhandled exception.
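Since exceptions escaping an action are not defined, one defensive option is to catch everything inside run() and rethrow on the calling thread once both actions are done. The following is a minimal sketch of that idea, not something YMP provides: GuardedAction and run_both_checked() are hypothetical names, and BasicAction and RunInParallel() are sequential stand-ins so the sketch compiles on its own.

```cpp
#include <exception>
#include <stdexcept>
#include <utility>

// --- Stand-ins (assumptions): in real code these come from the YMP headers. ---
class BasicAction{
public:
    virtual void run() = 0;
    virtual ~BasicAction() = default;
};
// Sequential stand-in, similar to what the "none" framework would do.
void RunInParallel(BasicAction& a0, BasicAction& a1){
    a0.run();
    a1.run();
}
// --- End of stand-ins ---

// Hypothetical wrapper: runs a callable and records any exception
// instead of letting it escape run().
template <typename Fn>
class GuardedAction : public BasicAction{
    Fn m_fn;
public:
    std::exception_ptr error;

    explicit GuardedAction(Fn fn) : m_fn(std::move(fn)) {}
    void run() override{
        try{
            m_fn();
        }catch (...){
            error = std::current_exception();
        }
    }
};

// Hypothetical helper: runs both callables, then rethrows the first
// recorded exception (if any) on the calling thread.
template <typename F0, typename F1>
void run_both_checked(F0 f0, F1 f1){
    GuardedAction<F0> a0(std::move(f0));
    GuardedAction<F1> a1(std::move(f1));
    RunInParallel(a0, a1);
    if (a0.error) std::rethrow_exception(a0.error);
    if (a1.error) std::rethrow_exception(a1.error);
}
```

If both actions throw, this sketch reports only the first one and silently drops the second; whether that is acceptable depends on the application.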

 


Run a Range of Actions in Parallel:

void RunInParallel(IndexAction& action, upL_t si, upL_t ei);

Description:

Runs the specified action for all index parameters in the range [si, ei) assuming independence. The function will not return until all actions are completed.

Depending on the parallel framework, these actions may be run in parallel. It is safe to call this function recursively.

 

Exception propagation is currently not defined. Most frameworks will simply terminate the program on an unhandled exception.
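A small sketch of the range overload follows. It assumes upL_t is an unsigned pointer-sized integer and uses sequential stand-ins for IndexAction and RunInParallel() so that it compiles without YMP; SquareAction and squares() are hypothetical. The per-index work is trivially small here purely to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// --- Stand-ins (assumptions): in real code these come from the YMP headers. ---
using upL_t = std::size_t;      // Assumed: unsigned pointer-sized integer.
class IndexAction{
public:
    virtual void run(upL_t index) = 0;
    virtual ~IndexAction() = default;
};
// Sequential stand-in: calls the action once per index in [si, ei).
void RunInParallel(IndexAction& action, upL_t si, upL_t ei){
    for (upL_t i = si; i < ei; i++){
        action.run(i);
    }
}
// --- End of stand-ins ---

// Hypothetical action: index i writes only out[i], so all indices are independent.
class SquareAction : public IndexAction{
    std::vector<unsigned long long>& m_out;
public:
    explicit SquareAction(std::vector<unsigned long long>& out) : m_out(out) {}
    void run(upL_t index) override{
        assert(index < m_out.size());
        m_out[index] = static_cast<unsigned long long>(index) * index;
    }
};

// Hypothetical helper: fills a vector with the squares of 0 .. n-1.
std::vector<unsigned long long> squares(upL_t n){
    std::vector<unsigned long long> out(n);
    SquareAction action(out);
    RunInParallel(action, 0, n);    // Half-open range: n itself is not visited.
    return out;
}
```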

Performance Note:

Unlike parallel-for constructs from other parallel libraries, this one doesn't try to be smart about grouping iterations together.

 

What you write is what you get. If you tell it to run an action for a billion different indices, it will dispatch a billion actions which (depending on the framework) may result in a billion context switches and/or threads.
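Because iterations are not grouped automatically, the caller can do the grouping by letting each index stand for a block of elements. The sketch below shows one way to do that under stated assumptions: upL_t is taken to be an unsigned pointer-sized integer, IndexAction and RunInParallel() are sequential stand-ins, and BlockedScale, block_count() and scale_in_blocks() are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// --- Stand-ins (assumptions): in real code these come from the YMP headers. ---
using upL_t = std::size_t;      // Assumed: unsigned pointer-sized integer.
class IndexAction{
public:
    virtual void run(upL_t index) = 0;
    virtual ~IndexAction() = default;
};
// Sequential stand-in: calls the action once per index in [si, ei).
void RunInParallel(IndexAction& action, upL_t si, upL_t ei){
    for (upL_t i = si; i < ei; i++){
        action.run(i);
    }
}
// --- End of stand-ins ---

// Number of blocks needed to cover n elements (rounded up).
upL_t block_count(upL_t n, upL_t block_size){
    assert(block_size > 0);
    return (n + block_size - 1) / block_size;
}

// Hypothetical action: the index is a block number, not an element number.
// Each dispatched task doubles one whole block of elements.
class BlockedScale : public IndexAction{
    std::vector<double>& m_data;
    upL_t m_block_size;
public:
    BlockedScale(std::vector<double>& data, upL_t block_size)
        : m_data(data), m_block_size(block_size) {}
    void run(upL_t block) override{
        upL_t begin = block * m_block_size;
        upL_t end = std::min<upL_t>(begin + m_block_size, m_data.size());
        for (upL_t i = begin; i < end; i++){
            m_data[i] *= 2.0;
        }
    }
};

// Dispatches block_count() tasks instead of one task per element.
void scale_in_blocks(std::vector<double>& data, upL_t block_size){
    BlockedScale action(data, block_size);
    RunInParallel(action, 0, block_count(data.size(), block_size));
}
```

With 10 elements and a block size of 4, this dispatches 3 tasks rather than 10. A sensible block size depends on the per-element cost and the framework in use.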

 


Single Action Object:

class BasicAction{

public:

    virtual void run() = 0;

};

Description:

The base class for a simple action that takes no parameters. This class is non-copyable and non-movable.

 

This action class is most suitable for binary recursive fork-join parallelism.

Performance Note:

Invoking an action object is an expensive operation that may involve multiple memory allocations, context switches, or even the creation of a new thread.

Use standard parallel programming common sense. Avoid using action objects for very small tasks.
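One common way to keep tasks from getting too small in binary fork-join code is a sequential cutoff: recurse with RunInParallel() while the problem is large, and fall back to a plain loop below a threshold. The sketch below illustrates this with hypothetical names (RecursiveSum, sum_with_cutoff) and sequential stand-ins for BasicAction and RunInParallel(); the cutoff value is something to tune, not a recommendation.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// --- Stand-ins (assumptions): in real code these come from the YMP headers. ---
class BasicAction{
public:
    virtual void run() = 0;
    virtual ~BasicAction() = default;
};
// Sequential stand-in, similar to what the "none" framework would do.
void RunInParallel(BasicAction& a0, BasicAction& a1){
    a0.run();
    a1.run();
}
// --- End of stand-ins ---

// Hypothetical action: sums data[0, len), forking only above the cutoff.
class RecursiveSum : public BasicAction{
    const long long* m_data;
    std::size_t m_len;
    std::size_t m_cutoff;
public:
    long long result = 0;

    RecursiveSum(const long long* data, std::size_t len, std::size_t cutoff)
        : m_data(data), m_len(len)
        , m_cutoff(cutoff < 1 ? 1 : cutoff)   // A cutoff of 0 would never terminate.
    {}
    void run() override{
        // Small enough: do the work inline instead of dispatching more actions.
        if (m_len <= m_cutoff){
            result = std::accumulate(m_data, m_data + m_len, 0LL);
            return;
        }
        // Otherwise split in half and recurse through RunInParallel().
        std::size_t half = m_len / 2;
        RecursiveSum lo(m_data, half, m_cutoff);
        RecursiveSum hi(m_data + half, m_len - half, m_cutoff);
        RunInParallel(lo, hi);
        result = lo.result + hi.result;
    }
};

// Hypothetical entry point.
long long sum_with_cutoff(const std::vector<long long>& data, std::size_t cutoff){
    RecursiveSum root(data.data(), data.size(), cutoff);
    root.run();
    return root.result;
}
```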

 


Multi Action Object:

class IndexAction{

public:

    virtual void run(upL_t index) = 0;

};

Description:

The base class for an action that takes an index parameter. This class is non-copyable and non-movable.

 

This action class is most suitable for the parallel-for style of parallelism.

Performance Note:

Invoking an action object is an expensive operation that may involve multiple memory allocations, context switches, or even the creation of a new thread.

Use standard parallel programming common sense. Avoid using action objects for very small tasks.

 

Parallel Framework Management:

 

Global Parallel Framework:

ParallelFramework* GetParallelFramework();

void SetParallelFramework(ParallelFramework* framework);

Description:

Gets and sets the parallel framework for the entire library.

 

This framework is a global value that is shared by all threads. Therefore, these functions are not thread-safe.

It is undefined behavior to call SetParallelFramework() while any other library function is running.
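For callers that want to switch frameworks temporarily, a scope guard built on the getter and setter is one possible pattern. The sketch below is only an illustration: ScopedFramework is a hypothetical helper, and ParallelFramework, GetParallelFramework() and SetParallelFramework() are replaced by trivial stand-ins so it compiles without YMP (in the library, ParallelFramework is opaque).

```cpp
// --- Stand-ins (assumptions): in real code these come from the YMP headers. ---
struct ParallelFramework{ const char* name; };  // Opaque in YMP; given a body only for this sketch.
namespace{
    ParallelFramework* g_framework = nullptr;   // Stand-in for the library's global setting.
}
ParallelFramework* GetParallelFramework(){ return g_framework; }
void SetParallelFramework(ParallelFramework* framework){ g_framework = framework; }
// --- End of stand-ins ---

// Hypothetical RAII helper: installs a framework and restores the previous
// one on scope exit. Both the constructor and the destructor call
// SetParallelFramework(), so the object must be created and destroyed only
// while no other library function is running.
class ScopedFramework{
    ParallelFramework* m_previous;
public:
    explicit ScopedFramework(ParallelFramework* framework)
        : m_previous(GetParallelFramework())
    {
        SetParallelFramework(framework);
    }
    ~ScopedFramework(){
        SetParallelFramework(m_previous);
    }
    ScopedFramework(const ScopedFramework&) = delete;
    ScopedFramework& operator=(const ScopedFramework&) = delete;
};
```

The guard does nothing to make the switch thread-safe. It only makes the restore hard to forget, so it belongs around a whole phase of work (for example, near program start-up), not inside code that other threads may be running concurrently.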

 


Get Framework Name:

const char* GetFrameworkName(const ParallelFramework* framework);

Description:

Get the name of the specified framework.

 

This function will likely go away when a future version of YMP exposes the full definition of the ParallelFramework object. If and when that happens, it will be replaced with a member getter function of the object.

 


Built-in Parallel Framework:

ParallelFramework* GetBuiltInFrameworkByName(const char* name);

Description:

y-cruncher and YMP support a number of built-in frameworks. This function can be used to get a pointer to them by name.

If no framework exists for the given name, it returns null.

 

Different systems have different sets of frameworks:

Framework Name | Description                                          | Availability
"none"         | Disable all parallelism and sequentialize all tasks. | All systems
"spawn"        | Spawn a new thread for every task.                   | All systems
"cppasync"     | Use C++11's std::async().                            | All systems
"winpool"      | Use the built-in Windows thread pool.                | Windows only
"cilk"         | Use Intel's Cilk Plus.                               | All Linux systems + Windows systems with 256-bit AVX
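
Since availability differs by system and the lookup returns null for unknown names, a caller will usually want a fallback order. The sketch below tries a list of names and takes the first one that exists. pick_framework() is a hypothetical helper, and the stand-in GetBuiltInFrameworkByName() simply pretends that only "none", "spawn" and "cppasync" are available; the real function and its results come from YMP.

```cpp
#include <cstring>
#include <initializer_list>

// --- Stand-ins (assumptions): in real code these come from the YMP headers. ---
struct ParallelFramework{ const char* name; };  // Opaque in YMP; given a body only for this sketch.

// Pretends this system only has "none", "spawn" and "cppasync".
ParallelFramework* GetBuiltInFrameworkByName(const char* name){
    static ParallelFramework table[] = {{"none"}, {"spawn"}, {"cppasync"}};
    for (ParallelFramework& framework : table){
        if (std::strcmp(framework.name, name) == 0){
            return &framework;
        }
    }
    return nullptr;
}
const char* GetFrameworkName(const ParallelFramework* framework){
    return framework->name;
}
// --- End of stand-ins ---

// Hypothetical helper: returns the first framework in the preference list
// that exists on this system, or null if none of them do.
ParallelFramework* pick_framework(std::initializer_list<const char*> preferred){
    for (const char* name : preferred){
        if (ParallelFramework* framework = GetBuiltInFrameworkByName(name)){
            return framework;
        }
    }
    return nullptr;
}
```

A call such as pick_framework({"cilk", "winpool", "cppasync", "none"}) prefers the two frameworks described above as the useful ones and degrades to the portable ones. The result would then be passed to SetParallelFramework() before any other library work starts.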

More details about the parallel frameworks can be found here.