Taskflow  3.2.0-Master-Branch
Single Task

tf::cudaFlow provides a template method, tf::cudaFlow::single_task, for creating a task to run the given callable using a single kernel thread.

Include the Header

To create a single-threaded task, include the header file taskflow/cuda/algorithm/for_each.hpp.

Run a Task with a Single Thread

You can create a task that runs a kernel function just once, i.e., using one GPU thread. This is handy for initializing one or a few global variables that do not need multiple threads and will be used by multiple kernels afterwards. The following example creates a single-task kernel that sets gpu_variable to 1.

int* gpu_variable;
cudaMalloc(&gpu_variable, sizeof(int));

tf::Task task = taskflow.emplace([&] (tf::cudaFlow& cf) {
  // create a single task to set the gpu_variable to 1
  tf::cudaTask set_par = cf.single_task([gpu_variable] __device__ () {
    *gpu_variable = 1;
  });

  // create two kernel tasks that need access to gpu_variable
  tf::cudaTask kernel1 = cf.kernel(grid1, block1, shm1, my_kernel_1, ...);
  tf::cudaTask kernel2 = cf.kernel(grid2, block2, shm2, my_kernel_2, ...);

  // set_par must finish before either kernel reads gpu_variable
  set_par.precede(kernel1, kernel2);
});

Since the callable runs on the GPU, it must be declared with a __device__ specifier.
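To illustrate what "a single kernel thread" and the __device__ requirement mean in practice, the following plain-CUDA sketch launches a callable on a grid of one block with one thread. It does not use Taskflow and is not taken from Taskflow's implementation; the names SetToOne and run_once are hypothetical and introduced only for this illustration. A functor is used instead of a lambda so that the sketch compiles without extended-lambda support.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical functor (not part of Taskflow): writes 1 to an int in device memory.
// operator() is marked __device__ because it executes on the GPU.
struct SetToOne {
  int* target;
  __device__ void operator()() const { *target = 1; }
};

// Hypothetical helper kernel (not part of Taskflow): invokes the callable once.
template <typename C>
__global__ void run_once(C callable) {
  callable();
}

int main() {
  int* gpu_variable = nullptr;
  if (cudaMalloc(&gpu_variable, sizeof(int)) != cudaSuccess) {
    std::fprintf(stderr, "cudaMalloc failed\n");
    return 1;
  }

  // One block containing one thread: the callable runs exactly once.
  run_once<<<1, 1>>>(SetToOne{gpu_variable});

  // Copy the result back; this copy waits for the kernel on the default stream.
  int host_value = 0;
  cudaMemcpy(&host_value, gpu_variable, sizeof(int), cudaMemcpyDeviceToHost);
  std::printf("gpu_variable = %d\n", host_value);

  cudaFree(gpu_variable);
  return 0;
}
```

When the callable is written as a lambda, as in the cudaFlow example, nvcc generally needs its extended-lambda option (--extended-lambda) to accept the __device__ annotation on the lambda.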

Miscellaneous Items

The single-task algorithm is also available as tf::cudaFlowCapturer::single_task.
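A minimal sketch of the capturer variant is shown below. It assumes a Taskflow 3.2-era API in which taskflow.emplace accepts a callable taking tf::cudaFlowCapturer&, that the for_each.hpp header pulls in the cudaFlow definitions it depends on, and that the file is compiled with nvcc with extended lambdas enabled. The variable names are chosen for illustration only.

```cuda
#include <cstdio>
#include <taskflow/taskflow.hpp>
#include <taskflow/cuda/algorithm/for_each.hpp>  // header named in this page

int main() {
  tf::Executor executor;
  tf::Taskflow taskflow;

  int* gpu_variable = nullptr;
  cudaMalloc(&gpu_variable, sizeof(int));

  // Assumption: emplace accepts a tf::cudaFlowCapturer& callable (3.2-era API).
  taskflow.emplace([&] (tf::cudaFlowCapturer& capturer) {
    // Same callable as in the cudaFlow example, run by one GPU thread.
    capturer.single_task([gpu_variable] __device__ () {
      *gpu_variable = 1;
    });
  });

  // Run the taskflow and block until the GPU work has completed.
  executor.run(taskflow).wait();

  // Copy the value back to the host to inspect it.
  int host_value = 0;
  cudaMemcpy(&host_value, gpu_variable, sizeof(int), cudaMemcpyDeviceToHost);
  std::printf("gpu_variable = %d\n", host_value);

  cudaFree(gpu_variable);
  return 0;
}
```

The call has the same shape as tf::cudaFlow::single_task, so in this sketch the only difference between the two variants is the parameter type of the callable passed to emplace.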