Defining Kernels#

In SYCL, functions that are executed on a SYCL device are referred to as SYCL kernel functions. A kernel containing such a SYCL kernel function is enqueued on a device queue in order to be executed on that particular device.

The return type of the SYCL kernel function is void, and all memory accesses between host and device are through accessors or through USM pointers.

There are two ways of defining kernels:

  • as named function objects;

  • as lambda functions.

A backend may also provide interoperability interfaces for defining kernels.

See also

SYCL Specification Section 4.12

Defining kernels as named function objects#

A kernel can be defined as a named function object type. These function objects provide the same functionality as any C++ function object, with the restriction that they need to follow SYCL rules to be device copyable (see SYCL Specification Section 3.13.1).

The kernel function can be templated via templating the kernel function object type. For details on restrictions for kernel naming, please refer to SYCL Specification Section 5.2.

The operator() member function must be const-qualified, and it may take different parameters depending on the data accesses defined for the specific kernel. If the operator() function writes to any of the member variables, the behavior is undefined.

The following example defines a SYCL kernel function, RandomFiller, which initializes a sycl::buffer with a random number.

 1#include <sycl/sycl.hpp>
 2
 3#include <iostream>
 4#include <random>
 5
 6class RandomFiller {
 7public:
 8  RandomFiller(sycl::accessor<int> ptr) : ptr_{ptr} {
 9    std::random_device hwRand;
10    std::uniform_int_distribution<> r{1, 100};
11    randomNum_ = r(hwRand);
12  }
13  void operator()(sycl::item<1> item) const {
14    ptr_[item.get_id()] = get_random();
15  }
16  int get_random() const { return randomNum_; }
17
18private:
19  sycl::accessor<int> ptr_;
20  int randomNum_;
21};
22
23void workFunction(sycl::buffer<int, 1> &b, sycl::queue &q,
24                  const sycl::range<1> r) {
25  q.submit([&](sycl::handler &cgh) {
26    sycl::accessor ptr{b, cgh};
27    RandomFiller filler{ptr};
28
29    cgh.parallel_for(r, filler);
30  });
31}
32
33int main() {
34  sycl::range<1> buffRange(10);
35  sycl::buffer<int> myBuff(buffRange);
36  sycl::queue myQueue;
37
38  workFunction(myBuff, myQueue, buffRange);
39
40  auto acc = myBuff.get_host_access();
41
42  for (auto a : acc) {
43    std::cout << a << ' ';
44  }
45
46  return 0;
47}

Possible output:

81 81 81 81 81 81 81 81 81 81

The random number is generated during the construction of the function object while processing the command group.

The operator() member function of the function object receives an sycl::item object. This member function will be called for each work-item of the execution range. The value of the random number will be assigned to each element of the sycl::buffer.

In this case, the sycl::accessor and the scalar random number are members of the function object and therefore will be arguments to the device kernel. Usual restrictions of passing arguments to kernels apply.

Defining kernels as lambda functions#

In C++, function objects can be defined using lambda functions. Kernels may be defined as lambda functions in SYCL.

The name of a lambda function in SYCL may optionally be specified by passing it as a template parameter to the invoking member function, and in that case, the lambda name is a C++ typename which must be forward declarable at namespace scope.

If the lambda function relies on template arguments, then if specified, the name of the lambda function must contain those template arguments which must also be forward declarable at namespace scope.

The class used for the name of a lambda function is only used for naming purposes and is not required to be defined. For details on restrictions for kernel naming, please refer to SYCL Specification Section 5.2.

The kernel function for the lambda function is the lambda function itself. The kernel lambda must use copy for all of its captures (i.e. [=]), and the lambda must not use the mutable specifier.

// Explicit kernel names can be optionally forward declared at namespace scope
class MyKernel;

{
  ...

  myQueue.submit([&](sycl::handler& h) {
    // Explicitly name kernel with previously forward declared type
    h.single_task<MyKernel>([=] {
      // [kernel code]
    });

    // Explicitly name kernel without forward declaring type at
    // namespace scope.  Must still be forward declarable at
    // namespace scope, even if not declared at that scope
    h.single_task<class MyOtherKernel>([=] {
      // [kernel code]
    });
  });

  ...
}

Explicit lambda naming is shown in the following code example, including an illegal case that uses a class within the kernel name which is not forward declarable (std::complex).

 1#include <sycl/sycl.hpp>
 2
 3// Explicit kernel names can be optionally forward declared at namespace scope
 4class MyForwardDeclName;
 5
 6template <typename T> class MyTemplatedKernelName;
 7
 8// Define and launch templated kernel
 9template <typename T> void templatedFunction() {
10  sycl::queue myQueue;
11
12  // Launch A: No explicit kernel name
13  myQueue.submit([&](sycl::handler &h) {
14    h.single_task([=] {
15      // [kernel code that depends on type T]
16    });
17  });
18
19  // Launch B: Name the kernel when invoking (this is optional)
20  myQueue.submit([&](sycl::handler &h) {
21    h.single_task<MyTemplatedKernelName<T>>([=] {
22      // The provided kernel name (MyTemplatedKernelName<T>) depends on T
23      // because the kernel does.  T must also be forward declarable at
24      // namespace scope.
25
26      // [kernel code that depends on type T]
27    });
28  });
29}
30
31int main() {
32  sycl::queue myQueue;
33
34  myQueue.submit([&](sycl::handler &h) {
35    // Declare MyKernel within this kernel invocation.  Legal because
36    // forward declaration at namespace scope is optional
37    h.single_task<class MyKernel>([=] {
38      // [kernel code]
39    });
40  });
41
42  myQueue.submit([&](sycl::handler &h) {
43    // Use kernel name that was forward declared at namespace scope
44    h.single_task<MyForwardDeclName>([=] {
45      // [kernel code]
46    });
47  });
48
49  templatedFunction<int>(); // OK
50
51  templatedFunction<std::complex<float>>(); // Launch A is OK, Launch B illegal
52  // because std::complex is not forward declarable according to C++, and was
53  // used in an explicit kernel name which must be forward declarable.
54}

sycl::is_device_copyable#

namespace sycl {

  template<typename T>
  struct is_device_copyable;

  template<typename T>
  inline constexpr bool is_device_copyable_v = is_device_copyable<T>::value;

} // namespace sycl

sycl::is_device_copyable is a user specializable class template to indicate that a type T is device copyable (see SYCL Specification Section 3.13.1).

Requirements:

  • sycl::is_device_copyable must meet the Cpp17UnaryTrait requirements.

  • If sycl::is_device_copyable is specialized such that sycl::is_device_copyable<T> == true on a T that does not satisfy all the requirements of a device copyable type, the results are unspecified.

If the application defines a type UDT that satisfies the requirements of a device copyable type but the type is not implicitly device copyable as defined in that section, then the application must provide a specialization of sycl::is_device_copyable that derives from std:true_type in order to use that type in a context that requires a device copyable type. Such a specialization can be declared like this:

template<>
struct sycl::is_device_copyable<UDT> : std::true_type {};

It is legal to provide this specialization even if the implementation does not define SYCL_DEVICE_COPYABLE to 1, but the type cannot be used as a device copyable type in that case and the specialization is ignored.

Rules for parameter passing to kernels#

A SYCL application passes parameters to a kernel in different ways depending on whether the kernel is a named function object or a lambda function.

If the kernel is a named function object, the operator() member function (or other member functions that it calls) may reference member variables inside the same named function object. Any such member variables become parameters to the kernel.

If the kernel is a lambda function, any variables captured by the lambda become parameters to the kernel.

Regardless of how the parameter is passed, the following rules define the allowable types for a kernel parameter:

  • Any device copyable (see SYCL Specification Section 3.13.1) type is a legal parameter type.

  • The following SYCL types are legal parameter types:

  • An array of element types T is a legal parameter type if T is a legal parameter type.

  • A class type S with a non-static member variable of type T is a legal parameter type if T is a legal parameter type and if S would otherwise be a legal parameter type aside from this member variable.

  • A class type S with a non-virtual base class of type T is a legal parameter type if T is a legal parameter type and if S would otherwise be a legal parameter type aside from this base class.

Note

Pointer types are trivially copyable, so they may be passed as kernel parameters. However, only the pointer value itself is passed to the kernel. Dereferencing the pointer on the kernel results in undefined behavior unless the pointer points to an address within a USM memory region that is accessible on the device.

Reference types are not trivially copyable, so they may not be passed as kernel parameters.

Note

The sycl::reducer class is a special type of kernel parameter which is passed to a kernel in a different way. See Reduction Variables for the description how this parameter type is used.