Taskflow  3.2.0-Master-Branch
tf::syclFlow Class Reference

class for building a SYCL task dependency graph

#include <syclflow.hpp>

Public Member Functions

 syclFlow (sycl::queue &queue)
 constructs a standalone syclFlow from the given queue
 
 ~syclFlow ()=default
 destroys the syclFlow
 
bool empty () const
 queries the emptiness of the graph
 
size_t num_tasks () const
 queries the number of tasks
 
void dump (std::ostream &os) const
 dumps the syclFlow graph into a DOT format through an output stream
 
void clear ()
 clears the associated graph
 
template<typename F , std::enable_if_t< std::is_invocable_r_v< void, F, sycl::handler & >, void > * = nullptr>
syclTask on (F &&func)
 creates a task that launches the given command group function object
 
template<typename F , std::enable_if_t< std::is_invocable_r_v< void, F, sycl::handler & >, void > * = nullptr>
void on (syclTask task, F &&func)
 updates the task to the given command group function object
 
syclTask memcpy (void *tgt, const void *src, size_t bytes)
 creates a memcpy task that copies untyped data in bytes
 
syclTask memset (void *ptr, int value, size_t bytes)
 creates a memset task that fills untyped data with a byte value
 
template<typename T >
syclTask fill (void *ptr, const T &pattern, size_t count)
 creates a fill task that fills typed data with the given value
 
template<typename T , std::enable_if_t<!std::is_same_v< T, void >, void > * = nullptr>
syclTask copy (T *target, const T *source, size_t count)
 creates a copy task that copies typed data from a source to a target memory block
 
template<typename... ArgsT>
syclTask parallel_for (ArgsT &&... args)
 creates a kernel task
 
template<typename F >
syclTask single_task (F &&func)
 invokes a SYCL kernel function using only one thread
 
template<typename I , typename C >
syclTask for_each (I first, I last, C &&callable)
 applies a callable to each dereferenced element of the data array
 
template<typename I , typename C >
syclTask for_each_index (I first, I last, I step, C &&callable)
 applies a callable to each index in the range with the step size
 
template<typename I , typename C , typename... S>
syclTask transform (I first, I last, C &&callable, S... srcs)
 applies a callable to a source range and stores the result in a target range
 
template<typename I , typename T , typename C >
syclTask reduce (I first, I last, T *result, C &&op)
 performs parallel reduction over a range of items
 
template<typename I , typename T , typename C >
syclTask uninitialized_reduce (I first, I last, T *result, C &&op)
 similar to tf::syclFlow::reduce but does not assume any initial value to reduce
 
template<typename P >
void offload_until (P &&predicate)
 offloads the syclFlow onto a GPU and repeatedly runs it until the predicate becomes true
 
void offload_n (size_t N)
 offloads the syclFlow and executes it the given number of times
 
void offload ()
 offloads the syclFlow and executes it once
 
void memcpy (syclTask task, void *tgt, const void *src, size_t bytes)
 rebinds the task to a memcpy task
 
void memset (syclTask task, void *ptr, int value, size_t bytes)
 rebinds the task to a memset task
 
template<typename T >
void fill (syclTask task, void *ptr, const T &pattern, size_t count)
 rebinds the task to a fill task
 
template<typename T , std::enable_if_t<!std::is_same_v< T, void >, void > * = nullptr>
void copy (syclTask task, T *target, const T *source, size_t count)
 rebinds the task to a copy task
 
template<typename... ArgsT>
void parallel_for (syclTask task, ArgsT &&... args)
 rebinds the task to a parallel-for kernel task
 
template<typename F >
void single_task (syclTask task, F &&func)
 rebinds the task to a single-threaded kernel task
 

Friends

class Executor
 

Detailed Description

class for building a SYCL task dependency graph

Constructor & Destructor Documentation

◆ syclFlow()

tf::syclFlow::syclFlow ( sycl::queue &  queue)
inline

constructs a standalone syclFlow from the given queue

A standalone syclFlow does not go through any taskflow and can be run by the caller thread using explicit offload methods (e.g., tf::syclFlow::offload).

Member Function Documentation

◆ copy() [1/2]

template<typename T , std::enable_if_t<!std::is_same_v< T, void >, void > * >
void tf::syclFlow::copy ( syclTask  task,
T *  target,
const T *  source,
size_t  count 
)

rebinds the task to a copy task

Similar to tf::syclFlow::copy but operates on an existing task.

◆ copy() [2/2]

template<typename T , std::enable_if_t<!std::is_same_v< T, void >, void > * >
syclTask tf::syclFlow::copy ( T *  target,
const T *  source,
size_t  count 
)

creates a copy task that copies typed data from a source to a target memory block

Template Parameters
T  trivially copyable value type
Parameters
target  pointer to the target memory block
source  pointer to the source memory block
count  number of items of type T to copy

Creates a task that copies count items of type T from a source memory location to a target memory location.
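Because count is measured in items rather than bytes, the amount of data moved is count * sizeof(T). The host-only sketch below illustrates that relationship in standard C++. copy_reference is a hypothetical helper introduced for illustration, not part of Taskflow, and it runs sequentially on the CPU rather than on a SYCL device.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical host-side reference (not a Taskflow API): copies `count`
// items of type T, i.e. count * sizeof(T) bytes, from source to target.
template <typename T>
void copy_reference(T* target, const T* source, std::size_t count) {
  static_assert(std::is_trivially_copyable_v<T>,
                "T must be trivially copyable");
  assert(target != nullptr && source != nullptr);
  std::memcpy(target, source, count * sizeof(T));
}
```

Calling copy_reference(dst, src, 3) on int arrays moves three integers and leaves any further elements of dst untouched.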

◆ fill() [1/2]

template<typename T >
void tf::syclFlow::fill ( syclTask  task,
void *  ptr,
const T &  pattern,
size_t  count 
)

rebinds the task to a fill task

Similar to tf::syclFlow::fill but operates on an existing task.

◆ fill() [2/2]

template<typename T >
syclTask tf::syclFlow::fill ( void *  ptr,
const T &  pattern,
size_t  count 
)

creates a fill task that fills typed data with the given value

Template Parameters
T  trivially copyable value type
Parameters
ptr  pointer to the memory to fill
pattern  pattern value to fill into the memory
count  number of items of type T to fill

Creates a task that fills the specified memory with the specified value.
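As a rough illustration of the intended effect, the host-only sketch below writes count copies of pattern into a buffer. fill_reference is a hypothetical name used only for this example; it is a sequential CPU stand-in and assumes ptr points to storage for at least count objects of type T.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical host-side reference (not a Taskflow API): writes `count`
// copies of `pattern` into the memory starting at `ptr`.
template <typename T>
void fill_reference(void* ptr, const T& pattern, std::size_t count) {
  static_assert(std::is_trivially_copyable_v<T>,
                "T must be trivially copyable");
  assert(ptr != nullptr || count == 0);
  T* typed = static_cast<T*>(ptr);  // assumes ptr refers to an array of T
  for (std::size_t i = 0; i < count; ++i) {
    typed[i] = pattern;
  }
}
```

Note that count is a number of items, not bytes, which distinguishes fill from memset.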

◆ for_each()

template<typename I , typename C >
syclTask tf::syclFlow::for_each ( I  first,
I  last,
C &&  callable 
)

applies a callable to each dereferenced element of the data array

Template Parameters
I  iterator type
C  callable type
Parameters
first  iterator to the beginning (inclusive)
last  iterator to the end (exclusive)
callable  a callable object to apply to the dereferenced iterator
Returns
a tf::syclTask handle

This method is equivalent to the parallel execution of the following loop on a GPU:

for(auto itr = first; itr != last; itr++) {
  callable(*itr);
}

◆ for_each_index()

template<typename I , typename C >
syclTask tf::syclFlow::for_each_index ( I  first,
I  last,
I  step,
C &&  callable 
)

applies a callable to each index in the range with the step size

Template Parameters
I  index type
C  callable type
Parameters
first  beginning index (inclusive)
last  ending index (exclusive)
step  step size
callable  the callable to apply to each index in the range
Returns
a tf::syclTask handle

This method is equivalent to the parallel execution of the following loop on a GPU:

// step is positive [first, last)
for(auto i=first; i<last; i+=step) {
  callable(i);
}
// step is negative [first, last)
for(auto i=first; i>last; i+=step) {
  callable(i);
}
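To make the iteration space concrete, the sketch below combines the two loops above into one sequential helper and records which indices are visited. Both for_each_index_reference and visited_indices are hypothetical names for this illustration; the code runs on the host and says nothing about how the work is distributed on a device. It assumes a non-zero step.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sequential reference (not a Taskflow API): visits the same
// indices as the two loops above, choosing the loop by the sign of `step`.
template <typename I, typename C>
void for_each_index_reference(I first, I last, I step, C&& callable) {
  assert(step != 0);  // assumption: a zero step would never terminate
  if (step > 0) {
    for (I i = first; i < last; i += step) callable(i);
  } else {
    for (I i = first; i > last; i += step) callable(i);
  }
}

// Collects the visited indices so the iteration space can be inspected.
inline std::vector<int> visited_indices(int first, int last, int step) {
  std::vector<int> out;
  for_each_index_reference(first, last, step,
                           [&](int i) { out.push_back(i); });
  return out;
}
```

With these definitions, visited_indices(0, 10, 3) visits 0, 3, 6, 9, and visited_indices(10, 0, -4) visits 10, 6, 2. In both directions first is included and last is excluded.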

◆ memcpy() [1/2]

void tf::syclFlow::memcpy ( syclTask  task,
void *  tgt,
const void *  src,
size_t  bytes 
)
inline

rebinds the task to a memcpy task

Similar to tf::syclFlow::memcpy but operates on an existing task.

◆ memcpy() [2/2]

syclTask tf::syclFlow::memcpy ( void *  tgt,
const void *  src,
size_t  bytes 
)
inline

creates a memcpy task that copies untyped data in bytes

Parameters
tgt  pointer to the target memory block
src  pointer to the source memory block
bytes  number of bytes to copy
Returns
a tf::syclTask handle

A memcpy task transfers bytes of data from a source location src to a target location tgt. Both src and tgt may be either host or USM pointers.

◆ memset() [1/2]

void tf::syclFlow::memset ( syclTask  task,
void *  ptr,
int  value,
size_t  bytes 
)
inline

rebinds the task to a memset task

Similar to tf::syclFlow::memset but operates on an existing task.

◆ memset() [2/2]

syclTask tf::syclFlow::memset ( void *  ptr,
int  value,
size_t  bytes 
)
inline

creates a memset task that fills untyped data with a byte value

Parameters
ptr  pointer to the destination device memory area
value  value to set for each byte of the specified memory
bytes  number of bytes to set
Returns
a tf::syclTask handle

Fills bytes of memory beginning at address ptr with value. ptr must be a USM allocation. value is interpreted as an unsigned char.
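The byte-wise interpretation of value can be illustrated with std::memset, which follows the same convention of converting its int argument to unsigned char. The sketch below is a host-only analogue under that assumption; memset_reference is a hypothetical wrapper and does not touch USM memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Host-side analogue using std::memset, which follows the same convention:
// `value` is converted to unsigned char before being written to each byte.
inline void memset_reference(void* ptr, int value, std::size_t bytes) {
  assert(ptr != nullptr);
  std::memset(ptr, value, bytes);
}
```

Passing 0x1FF as value therefore stores 0xFF in every byte, since only the low eight bits survive the conversion.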

◆ offload_n()

void tf::syclFlow::offload_n ( size_t  N)
inline

offloads the syclFlow and executes it the given number of times

Parameters
N  number of executions

◆ offload_until()

template<typename P >
void tf::syclFlow::offload_until ( P &&  predicate)

offloads the syclFlow onto a GPU and repeatedly runs it until the predicate becomes true

Template Parameters
P  predicate type (a callable that takes no arguments and returns a boolean)
Parameters
predicate  a predicate that returns true to stop

Repeatedly executes the present syclFlow through the given queue object until the predicate returns true.

By default, if users do not offload the syclFlow, the executor will offload it once.
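The repeat-until behavior can be modeled on the host as a plain loop. The sketch below is not the Taskflow implementation: run_until and run_n are hypothetical names, run stands in for one execution of the graph, and the code assumes the predicate takes no arguments and is evaluated before every run. It also shows one way a fixed number of executions, as in offload_n, could be expressed through a counting predicate.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical host-side model (not the Taskflow implementation): `run`
// stands in for one execution of the graph. Assumes the predicate takes no
// arguments and is evaluated before every run.
template <typename P, typename R>
void run_until(P&& predicate, R&& run) {
  while (!predicate()) {
    run();
  }
}

// Running a fixed number of times, expressed through a counting predicate.
template <typename R>
void run_n(std::size_t n, R&& run) {
  run_until([remaining = n]() mutable { return remaining-- == 0; },
            std::forward<R>(run));
}
```

Under this model a predicate that is already true on the first check results in zero executions.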

◆ on() [1/2]

template<typename F , std::enable_if_t< std::is_invocable_r_v< void, F, sycl::handler & >, void > * >
syclTask tf::syclFlow::on ( F &&  func)

creates a task that launches the given command group function object

Template Parameters
F  type of the command group function object
Parameters
func  command group function object, invocable with a sycl::handler& argument

Creates a task that is associated with the given command group function object. In SYCL, each command group function object is given a unique command group handler object to perform all the necessary work required to correctly process data on a device using a kernel.

◆ on() [2/2]

template<typename F , std::enable_if_t< std::is_invocable_r_v< void, F, sycl::handler & >, void > * >
void tf::syclFlow::on ( syclTask  task,
F &&  func 
)

updates the task to the given command group function object

Similar to tf::syclFlow::on but operates on an existing task.

◆ parallel_for() [1/2]

template<typename... ArgsT>
syclTask tf::syclFlow::parallel_for ( ArgsT &&...  args)

creates a kernel task

Template Parameters
ArgsT  argument types
Parameters
args  arguments to forward to the parallel_for methods defined in the handler object

Creates a kernel task from a parallel_for method through the handler object associated with a command group.

◆ parallel_for() [2/2]

template<typename... ArgsT>
void tf::syclFlow::parallel_for ( syclTask  task,
ArgsT &&...  args 
)

rebinds the task to a parallel-for kernel task

Similar to tf::syclFlow::parallel_for but operates on an existing task.

◆ reduce()

template<typename I , typename T , typename C >
syclTask tf::syclFlow::reduce ( I  first,
I  last,
T *  result,
C &&  op 
)

performs parallel reduction over a range of items

Template Parameters
I  input iterator type
T  value type
C  callable type
Parameters
first  iterator to the beginning (inclusive)
last  iterator to the end (exclusive)
result  pointer to the result with an initialized value
op  binary reduction operator
Returns
a tf::syclTask handle

This method is equivalent to the parallel execution of the following loop on a SYCL device:

while (first != last) {
  *result = op(*result, *first++);
}
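The loop above can be wrapped into a small sequential helper to show that the value already stored in *result takes part in the reduction. reduce_reference is a hypothetical name for this host-only sketch. A parallel implementation is free to regroup the operands, so agreement with this sequential result should only be expected for operators that are associative.

```cpp
#include <cassert>

// Hypothetical sequential reference (not a Taskflow API): folds the range
// into *result, so the value already stored in *result participates.
template <typename I, typename T, typename C>
void reduce_reference(I first, I last, T* result, C&& op) {
  assert(result != nullptr);
  while (first != last) {
    *result = op(*result, *first++);
  }
}
```

For example, reducing {1, 2, 3, 4} with addition into a result initialized to 10 leaves 20 in the result.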

◆ single_task() [1/2]

template<typename F >
syclTask tf::syclFlow::single_task ( F &&  func)

invokes a SYCL kernel function using only one thread

Template Parameters
F  kernel function type
Parameters
func  kernel function

Creates a task that launches the given function object using only one kernel thread.

◆ single_task() [2/2]

template<typename F >
void tf::syclFlow::single_task ( syclTask  task,
F &&  func 
)

rebinds the task to a single-threaded kernel task

Similar to tf::syclFlow::single_task but operates on an existing task.

◆ transform()

template<typename I , typename C , typename... S>
syclTask tf::syclFlow::transform ( I  first,
I  last,
C &&  callable,
S...  srcs 
)

applies a callable to a source range and stores the result in a target range

Template Parameters
I  iterator type
C  callable type
S  source types
Parameters
first  iterator to the beginning of the target range (inclusive)
last  iterator to the end of the target range (exclusive)
callable  the callable to apply to each element of the source ranges
srcs  iterators to the source ranges
Returns
a tf::syclTask handle

This method is equivalent to the parallel execution of the following loop on a SYCL device:

while (first != last) {
  *first++ = callable(*src1++, *src2++, *src3++, ...);
}
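The ellipsis in the loop above stands for an arbitrary number of source ranges. A minimal host-only sketch of the same semantics, written with a parameter pack, might look as follows. transform_reference is a hypothetical name, and the code assumes every source range holds at least as many elements as the target range [first, last).

```cpp
#include <cassert>

// Hypothetical sequential reference (not a Taskflow API): each output element
// is produced by calling `callable` on the current element of every source.
template <typename I, typename C, typename... S>
void transform_reference(I first, I last, C&& callable, S... srcs) {
  while (first != last) {
    // Dereference and advance every source iterator in lockstep.
    *first++ = callable((*srcs++)...);
  }
}
```

With two int arrays a and b as sources and a callable that adds its arguments, the target range receives the element-wise sums.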

◆ uninitialized_reduce()

template<typename I , typename T , typename C >
syclTask tf::syclFlow::uninitialized_reduce ( I  first,
I  last,
T *  result,
C &&  op 
)

similar to tf::syclFlow::reduce but does not assume any initial value to reduce

This method is equivalent to the parallel execution of the following loop on a SYCL device:

*result = *first++; // no initial value participates in the loop
while (first != last) {
  *result = op(*result, *first++);
}
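The practical difference from tf::syclFlow::reduce is that the previous content of *result is overwritten rather than folded in. The host-only sketch below wraps the loop above to show this. uninitialized_reduce_reference is a hypothetical name, and the non-empty-range check is an assumption of this sketch, since the first element is needed as the seed.

```cpp
#include <cassert>

// Hypothetical sequential reference (not a Taskflow API): the first element
// seeds *result, so whatever *result held before the call is overwritten.
template <typename I, typename T, typename C>
void uninitialized_reduce_reference(I first, I last, T* result, C&& op) {
  assert(result != nullptr);
  assert(first != last);  // assumption: the seed comes from the first element
  *result = *first++;
  while (first != last) {
    *result = op(*result, *first++);
  }
}
```

Reducing {1, 2, 3, 4} with addition yields 10 regardless of what the result held beforehand, which makes this form convenient for operators such as maximum that have no obvious identity value.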

The documentation for this class was generated from the following file: