Defining Kernels#
In SYCL, functions that are executed on a SYCL device are referred to as SYCL kernel functions. A kernel containing such a SYCL kernel function is enqueued on a device queue in order to be executed on that particular device.
The return type of the SYCL kernel function is void
, and all
memory accesses between host and device are through
accessors or through
USM pointers.
There are two ways of defining kernels:
as named function objects;
as lambda functions.
A backend may also provide interoperability interfaces for defining kernels.
See also
SYCL Specification Section 4.12
Defining kernels as named function objects#
A kernel can be defined as a named function object type. These function objects provide the same functionality as any C++ function object, with the restriction that they need to follow SYCL rules to be device copyable (see SYCL Specification Section 3.13.1).
The kernel function can be templated via templating the kernel function object type. For details on restrictions for kernel naming, please refer to SYCL Specification Section 5.2.
The operator()
member function must be const
-qualified,
and it may take different parameters depending on the data
accesses defined for the specific kernel. If the operator()
function writes to any of the member variables, the behavior
is undefined.
The following example defines a SYCL kernel function,
RandomFiller
, which initializes a sycl::buffer with a random
number.
1#include <sycl/sycl.hpp>
2
3#include <iostream>
4#include <random>
5
6class RandomFiller {
7public:
8 RandomFiller(sycl::accessor<int> ptr) : ptr_{ptr} {
9 std::random_device hwRand;
10 std::uniform_int_distribution<> r{1, 100};
11 randomNum_ = r(hwRand);
12 }
13 void operator()(sycl::item<1> item) const {
14 ptr_[item.get_id()] = get_random();
15 }
16 int get_random() const { return randomNum_; }
17
18private:
19 sycl::accessor<int> ptr_;
20 int randomNum_;
21};
22
23void workFunction(sycl::buffer<int, 1> &b, sycl::queue &q,
24 const sycl::range<1> r) {
25 q.submit([&](sycl::handler &cgh) {
26 sycl::accessor ptr{b, cgh};
27 RandomFiller filler{ptr};
28
29 cgh.parallel_for(r, filler);
30 });
31}
32
33int main() {
34 sycl::range<1> buffRange(10);
35 sycl::buffer<int> myBuff(buffRange);
36 sycl::queue myQueue;
37
38 workFunction(myBuff, myQueue, buffRange);
39
40 auto acc = myBuff.get_host_access();
41
42 for (auto a : acc) {
43 std::cout << a << ' ';
44 }
45
46 return 0;
47}
Possible output:
81 81 81 81 81 81 81 81 81 81
The random number is generated during the construction of the function object while processing the command group.
The operator()
member function of the function object receives
an sycl::item object. This member function will be called for each
work-item of the execution range. The value of the random number will
be assigned to each element of the sycl::buffer.
In this case, the sycl::accessor and the scalar random number are members of the function object and therefore will be arguments to the device kernel. Usual restrictions of passing arguments to kernels apply.
Defining kernels as lambda functions#
In C++, function objects can be defined using lambda functions. Kernels may be defined as lambda functions in SYCL.
The name of a lambda function in SYCL may optionally be specified by passing it as a template parameter to the invoking member function, and in that case, the lambda name is a C++ typename which must be forward declarable at namespace scope.
If the lambda function relies on template arguments, then if specified, the name of the lambda function must contain those template arguments which must also be forward declarable at namespace scope.
The class used for the name of a lambda function is only used for naming purposes and is not required to be defined. For details on restrictions for kernel naming, please refer to SYCL Specification Section 5.2.
The kernel function for the lambda function is the lambda function itself.
The kernel lambda must use copy for all of its captures (i.e. [=]
),
and the lambda must not use the mutable
specifier.
// Explicit kernel names can be optionally forward declared at namespace scope
class MyKernel;
{
...
myQueue.submit([&](sycl::handler& h) {
// Explicitly name kernel with previously forward declared type
h.single_task<MyKernel>([=] {
// [kernel code]
});
// Explicitly name kernel without forward declaring type at
// namespace scope. Must still be forward declarable at
// namespace scope, even if not declared at that scope
h.single_task<class MyOtherKernel>([=] {
// [kernel code]
});
});
...
}
Explicit lambda naming is shown in the following code example,
including an illegal case that uses a class within the kernel
name which is not forward declarable (std::complex
).
1#include <sycl/sycl.hpp>
2
3// Explicit kernel names can be optionally forward declared at namespace scope
4class MyForwardDeclName;
5
6template <typename T> class MyTemplatedKernelName;
7
8// Define and launch templated kernel
9template <typename T> void templatedFunction() {
10 sycl::queue myQueue;
11
12 // Launch A: No explicit kernel name
13 myQueue.submit([&](sycl::handler &h) {
14 h.single_task([=] {
15 // [kernel code that depends on type T]
16 });
17 });
18
19 // Launch B: Name the kernel when invoking (this is optional)
20 myQueue.submit([&](sycl::handler &h) {
21 h.single_task<MyTemplatedKernelName<T>>([=] {
22 // The provided kernel name (MyTemplatedKernelName<T>) depends on T
23 // because the kernel does. T must also be forward declarable at
24 // namespace scope.
25
26 // [kernel code that depends on type T]
27 });
28 });
29}
30
31int main() {
32 sycl::queue myQueue;
33
34 myQueue.submit([&](sycl::handler &h) {
35 // Declare MyKernel within this kernel invocation. Legal because
36 // forward declaration at namespace scope is optional
37 h.single_task<class MyKernel>([=] {
38 // [kernel code]
39 });
40 });
41
42 myQueue.submit([&](sycl::handler &h) {
43 // Use kernel name that was forward declared at namespace scope
44 h.single_task<MyForwardDeclName>([=] {
45 // [kernel code]
46 });
47 });
48
49 templatedFunction<int>(); // OK
50
51 templatedFunction<std::complex<float>>(); // Launch A is OK, Launch B illegal
52 // because std::complex is not forward declarable according to C++, and was
53 // used in an explicit kernel name which must be forward declarable.
54}
sycl::is_device_copyable
#
namespace sycl {
template<typename T>
struct is_device_copyable;
template<typename T>
inline constexpr bool is_device_copyable_v = is_device_copyable<T>::value;
} // namespace sycl
sycl::is_device_copyable
is a user specializable class
template to indicate that a type T
is device copyable
(see SYCL Specification Section 3.13.1).
Requirements:
sycl::is_device_copyable
must meet theCpp17UnaryTrait
requirements.If
sycl::is_device_copyable
is specialized such thatsycl::is_device_copyable<T> == true
on aT
that does not satisfy all the requirements of a device copyable type, the results are unspecified.
If the application defines a type UDT
that satisfies the
requirements of a device copyable type but the type is not
implicitly device copyable as defined in that section,
then the application must provide a specialization of
sycl::is_device_copyable
that derives from std:true_type
in order to use that type in a context that requires a device
copyable type. Such a specialization can be declared like this:
template<>
struct sycl::is_device_copyable<UDT> : std::true_type {};
It is legal to provide this specialization even if the
implementation does not define SYCL_DEVICE_COPYABLE
to 1
,
but the type cannot be used as a device copyable type in that
case and the specialization is ignored.
Rules for parameter passing to kernels#
A SYCL application passes parameters to a kernel in different ways depending on whether the kernel is a named function object or a lambda function.
If the kernel is a named function object, the
operator()
member function (or other member
functions that it calls) may reference member
variables inside the same named function object.
Any such member variables become parameters to
the kernel.
If the kernel is a lambda function, any variables captured by the lambda become parameters to the kernel.
Regardless of how the parameter is passed, the following rules define the allowable types for a kernel parameter:
Any device copyable (see SYCL Specification Section 3.13.1) type is a legal parameter type.
The following SYCL types are legal parameter types:
sycl::accessor when templated with
sycl::target::device
;sycl::accessor when templated with any of the deprecated parameters:
sycl::target::global_buffer
,sycl::target::constant_buffer
, orsycl::target::local
;sycl::unsampled_image_accessor when templated with
sycl::image_target::device
;sycl::sampled_image_accessor when templated with
sycl::image_target::device
;sycl::marray<T, NumElements>
whenT
is device copyable;sycl::vec<T, NumElements>
.
An array of element types
T
is a legal parameter type ifT
is a legal parameter type.A class type
S
with a non-static member variable of typeT
is a legal parameter type ifT
is a legal parameter type and ifS
would otherwise be a legal parameter type aside from this member variable.A class type
S
with a non-virtual base class of typeT
is a legal parameter type ifT
is a legal parameter type and ifS
would otherwise be a legal parameter type aside from this base class.
Note
Pointer types are trivially copyable, so they may be passed as kernel parameters. However, only the pointer value itself is passed to the kernel. Dereferencing the pointer on the kernel results in undefined behavior unless the pointer points to an address within a USM memory region that is accessible on the device.
Reference types are not trivially copyable, so they may not be passed as kernel parameters.
Note
The sycl::reducer
class is a special type of kernel parameter
which is passed to a kernel in a different way. See Reduction Variables
for the description how this parameter type is used.