cyqlone develop
Fast, parallel and vectorized solver for linear systems with optimal control structure.
cyqlone::parallel::Context< SC > Struct Template Reference

#include <cyqlone/parallel.hpp>

Detailed Description

template<class SC>
struct cyqlone::parallel::Context< SC >

Thread context for parallel execution.

Each thread has a unique thread index, and can synchronize and communicate with other threads in the same shared context.

See also
SharedContext

Definition at line 64 of file parallel.hpp.

Public Types

using shared_context_type = SC
using arrival_token = typename shared_context_type::barrier_type::arrival_token

Public Member Functions

bool is_master () const
 Check if this thread is the master thread (thread index 0).
arrival_token arrive ()
 Arrive at the barrier and obtain a token that can be used to wait for completion of the current barrier phase.
void wait (arrival_token &&token)
 Await a token returned by arrive(), waiting for the barrier phase to complete.
void arrive_and_wait ()
 Arrive at the barrier and wait for the barrier phase to complete.
void arrive_and_wait (int line)
 Debug version of arrive_and_wait() that performs a sanity check to ensure that all threads are arriving at the same line of code.
template<class T>
T broadcast (T x, index_t src=0)
 Broadcast a value x from the thread with index src to all threads.
template<class F, class... Args>
auto call_broadcast (F &&f, Args &&...args) -> std::invoke_result_t< F, Args... >
 Call a function f with the given args on a single thread and broadcast the return value to all threads.
template<class T, class F>
auto arrive_reduce (T x, F func)
 Perform a reduction of x across all threads using the given binary function func.
template<class T>
T wait_reduce (shared_context_type::barrier_type::template arrival_token_typed< T > &&token)
 Wait for the reduction initiated by arrive_reduce() to complete and obtain the reduced value.
template<class T, class F>
T reduce (T x, F func)
 Perform a reduction of x across all threads using the given binary function func, and wait for the result.
template<class T>
T reduce (T x)
 Reduction with std::plus, i.e., summation across all threads.
template<class F>
void run_single_sync (F &&f)
 Wait for all threads to reach this point, then run the given function on a single thread before releasing all threads again.

Public Attributes

shared_context_type & shared
const index_t index
const index_t num_thr = shared.num_thr

Friends

constexpr bool operator== (const Context &a, const Context &b)

Member Typedef Documentation

◆ shared_context_type

template<class SC>
using cyqlone::parallel::Context< SC >::shared_context_type = SC

Definition at line 65 of file parallel.hpp.

◆ arrival_token

template<class SC>
using cyqlone::parallel::Context< SC >::arrival_token = typename shared_context_type::barrier_type::arrival_token

Definition at line 73 of file parallel.hpp.

Member Function Documentation

◆ is_master()

template<class SC>
bool cyqlone::parallel::Context< SC >::is_master ( ) const
inline nodiscard

Check if this thread is the master thread (thread index 0).

Useful for selecting the thread that performs operations such as printing to the console, which should be done by a single thread and do not require synchronization.

Definition at line 86 of file parallel.hpp.

◆ arrive()

template<class SC>
arrival_token cyqlone::parallel::Context< SC >::arrive ( )
inline nodiscard

Arrive at the barrier and obtain a token that can be used to wait for completion of the current barrier phase.

Note
The token must be awaited using wait() before any subsequent call to arrive().

Definition at line 91 of file parallel.hpp.

◆ wait()

template<class SC>
void cyqlone::parallel::Context< SC >::wait ( arrival_token && token)
inline

Await a token returned by arrive(), waiting for the barrier phase to complete.

Definition at line 100 of file parallel.hpp.

◆ arrive_and_wait() [1/2]

template<class SC>
void cyqlone::parallel::Context< SC >::arrive_and_wait ( )
inline

Arrive at the barrier and wait for the barrier phase to complete.

This is a convenience wrapper around arrive() and wait() for the common case where the thread does not have other work to do while waiting.

Definition at line 112 of file parallel.hpp.

◆ arrive_and_wait() [2/2]

template<class SC>
void cyqlone::parallel::Context< SC >::arrive_and_wait ( int line)
inline

Debug version of arrive_and_wait() that performs a sanity check to ensure that all threads are arriving at the same line of code.

The line parameter should be the same for all threads arriving at the same barrier. The value is only verified in debug builds; in release builds, this overload is equivalent to arrive_and_wait().

Definition at line 122 of file parallel.hpp.

◆ broadcast()

template<class SC>
template<class T>
T cyqlone::parallel::Context< SC >::broadcast ( T x,
index_t src = 0 )
inline

Broadcast a value x from the thread with index src to all threads.

Definition at line 131 of file parallel.hpp.

◆ call_broadcast()

template<class SC>
template<class F, class... Args>
auto cyqlone::parallel::Context< SC >::call_broadcast ( F && f,
Args &&... args ) -> std::invoke_result_t< F, Args... >
inline

Call a function f with the given args on a single thread and broadcast the return value to all threads.

Definition at line 139 of file parallel.hpp.

◆ arrive_reduce()

template<class SC>
template<class T, class F>
auto cyqlone::parallel::Context< SC >::arrive_reduce ( T x,
F func )
inline nodiscard

Perform a reduction of x across all threads using the given binary function func.

Returns a token that can be used to wait for the reduction to complete and obtain the reduced value.

Definition at line 154 of file parallel.hpp.

◆ wait_reduce()

template<class SC>
template<class T>
T cyqlone::parallel::Context< SC >::wait_reduce ( shared_context_type::barrier_type::template arrival_token_typed< T > && token)
inline

Wait for the reduction initiated by arrive_reduce() to complete and obtain the reduced value.

Definition at line 162 of file parallel.hpp.

◆ reduce() [1/2]

template<class SC>
template<class T, class F>
T cyqlone::parallel::Context< SC >::reduce ( T x,
F func )
inline

Perform a reduction of x across all threads using the given binary function func, and wait for the result.

Definition at line 169 of file parallel.hpp.

◆ reduce() [2/2]

template<class SC>
template<class T>
T cyqlone::parallel::Context< SC >::reduce ( T x)
inline

Reduction with std::plus, i.e., summation across all threads.

See also
reduce(T,F)

Definition at line 176 of file parallel.hpp.

◆ run_single_sync()

template<class SC>
template<class F>
void cyqlone::parallel::Context< SC >::run_single_sync ( F && f)
inline

Wait for all threads to reach this point, then run the given function on a single thread before releasing all threads again.

Changes made by all threads before reaching this point are visible during the call to f, and changes made by f are visible to all threads after this function returns.

Definition at line 184 of file parallel.hpp.

◆ operator==

template<class SC>
bool operator== ( const Context< SC > & a,
const Context< SC > & b )
friend

Definition at line 79 of file parallel.hpp.

Member Data Documentation

◆ shared

template<class SC>
shared_context_type& cyqlone::parallel::Context< SC >::shared

Definition at line 76 of file parallel.hpp.

◆ index

template<class SC>
const index_t cyqlone::parallel::Context< SC >::index

Definition at line 77 of file parallel.hpp.

◆ num_thr

template<class SC>
const index_t cyqlone::parallel::Context< SC >::num_thr = shared.num_thr

Definition at line 77 of file parallel.hpp.


The documentation for this struct was generated from the following file: