Specialization Constants#
Device code can make use of specialization constants which represent constants whose values can be set dynamically during execution of the SYCL application. The values of these constants are fixed when a SYCL kernel function is invoked, and they do not change during the execution of the kernel. However, the application is able to set a new value for a specialization constant each time a kernel is invoked, so the values can be tuned differently for each invocation.
There are two methods for an application to use specialization constants. One method requires creating a sycl::kernel_bundle object and the other does not.
The syntax for both methods is mostly the same. Both methods declare specialization constants in the same way, and kernels read their values in the same way. The main difference is whether their values are set via sycl::handler::set_specialization_constant() or via sycl::kernel_bundle::set_specialization_constant(). These two methods are incompatible with one another, so they may not both be used by the same command group.
Warning
Implementations that support online compilation of kernel bundles will likely implement both methods of specialization constants using kernel bundles. Therefore, applications should expect that there is some overhead associated with invoking a kernel with new values for its specialization constants. A typical implementation records the values of specialization constants set via sycl::handler::set_specialization_constant() and remembers these values until a kernel is invoked (e.g. via sycl::handler::parallel_for()). At this point, the implementation determines the bundle that contains the invoked kernel. If that bundle has already been compiled for the handler’s device and compiled with the correct values for the specialization constants, the kernel is scheduled for invocation. Otherwise, the implementation compiles the bundle before scheduling the kernel for invocation. Therefore, applications that frequently change the values of specialization constants may see an overhead associated with recompilation of the kernel’s bundle.
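The caching behavior described in this warning can be modeled in plain C++. The sketch below is not SYCL code and does not reflect any particular implementation: `bundle_cache_model`, its `int` device identifier, and the use of a single `int` as the specialization value are all hypothetical stand-ins. It only illustrates why repeated invocations with the same values are cheap, while each previously unseen value costs another compilation.

```cpp
#include <cassert>
#include <set>
#include <utility>

// Hypothetical model of a runtime's bundle cache (not a SYCL API).
// A "compiled bundle" is identified by the device it was built for and
// by the specialization-constant value it was built with.
class bundle_cache_model {
 public:
  // Models invoking a kernel: compile only if no matching bundle exists.
  // Returns true when a compilation was needed.
  bool invoke(int device, int spec_value) {
    const bool inserted = compiled_.emplace(device, spec_value).second;
    if (inserted) ++compile_count_;
    return inserted;
  }

  int compile_count() const { return compile_count_; }

 private:
  std::set<std::pair<int, int>> compiled_;
  int compile_count_ = 0;
};

// Invokes the same modeled kernel four times on one device and reports
// how many compilations the model performed.
inline int demo_compile_count() {
  bundle_cache_model cache;
  cache.invoke(0, 16);  // first use of value 16: compiles
  cache.invoke(0, 16);  // same device and value: cache hit
  cache.invoke(0, 32);  // new value: compiles again
  cache.invoke(0, 16);  // value 16 is still cached: cache hit
  assert(cache.compile_count() == 2);
  return cache.compile_count();
}
```

In this model, an application that alternates between a small set of values pays for each value once, whereas one that passes a new value on every invocation compiles every time. Whether a real implementation keeps older bundles cached is implementation-specific.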
See also
SYCL Specification Section 4.9.5
sycl::specialization_id#
template <class... Args>
explicit constexpr specialization_id(Args&&... args);
Constraints: Available only when std::is_constructible_v<T, Args...> evaluates to true.
Effects: Constructs a specialization_id containing an instance of T initialized with args..., which represents the specialization constant’s default value.
Special member functions#
specialization_id(const specialization_id& rhs) = delete; // (1)
specialization_id(specialization_id&& rhs) = delete; // (2)
specialization_id& operator=(const specialization_id& rhs) = delete; // (3)
specialization_id& operator=(specialization_id&& rhs) = delete; // (4)
1. Deleted copy constructor.
2. Deleted move constructor.
3. Deleted copy assignment operator.
4. Deleted move assignment operator.
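The constructor constraint and the deleted special member functions can be checked with standard type traits. The sketch below uses `spec_id_model`, a hypothetical stand-in written in plain C++17 so that it compiles without a SYCL toolchain; it is not the real sycl::specialization_id, and the names `width_id` and `default_value()` are assumptions made for illustration.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for sycl::specialization_id<T> (not the real class).
template <class T>
class spec_id_model {
 public:
  using value_type = T;

  // Participates in overload resolution only when T is constructible
  // from Args..., mirroring the documented constraint.
  template <class... Args,
            class = std::enable_if_t<std::is_constructible_v<T, Args...>>>
  explicit constexpr spec_id_model(Args&&... args)
      : default_value_(std::forward<Args>(args)...) {}

  spec_id_model(const spec_id_model&) = delete;             // (1)
  spec_id_model(spec_id_model&&) = delete;                  // (2)
  spec_id_model& operator=(const spec_id_model&) = delete;  // (3)
  spec_id_model& operator=(spec_id_model&&) = delete;       // (4)

  constexpr const T& default_value() const { return default_value_; }

 private:
  T default_value_;
};

// The constructor arguments become the default value.
constexpr spec_id_model<int> width_id{4};
static_assert(width_id.default_value() == 4);

// An int cannot be built from a string literal, so no constructor exists.
static_assert(!std::is_constructible_v<spec_id_model<int>, const char*>);

// Objects of this type can be neither copied nor moved.
static_assert(!std::is_copy_constructible_v<spec_id_model<int>>);
static_assert(!std::is_move_constructible_v<spec_id_model<int>>);
static_assert(!std::is_copy_assignable_v<spec_id_model<int>>);
static_assert(!std::is_move_assignable_v<spec_id_model<int>>);
```

Because the object cannot be copied or moved, its address stays fixed, which fits the way the member functions described on this page identify a specialization constant by the address of its specialization_id variable.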
Declaring a specialization constant#
Specialization constants must be declared using the specialization_id class with the following restrictions:
- the template parameter T must be a device copyable type;
- the specialization_id variable must be declared as constexpr;
- the specialization_id variable must be declared in either namespace scope or in class scope;
- if the specialization_id variable is declared in class scope, it must have public accessibility when referenced from namespace scope;
- the specialization_id variable may not be shadowed by another identifier X which has the same name and is declared in an inline namespace, such that the specialization_id variable is no longer accessible after the declaration of X;
- if the specialization_id variable is declared in a namespace, none of the enclosing namespace names N may be shadowed by another identifier X which has the same name as N and is declared in an inline namespace, such that N is no longer accessible after the declaration of X.
Warning
The expectation is that some implementations may conceptually insert code at the end of a translation unit which references each specialization_id variable that is declared in that translation unit. The restrictions listed above make this possible by ensuring that these variables are accessible at the end of the translation unit.
Setting and getting the value of a specialization constant#
If the application uses specialization constants without creating a sycl::kernel_bundle object, it can set and get their values from command group scope by calling member functions of the handler class. These member functions have a template parameter SpecName whose value must be a reference to a variable of type specialization_id, which defines the type and default value of the specialization constant.
When not using a kernel bundle, the value of a specialization constant that is used in a kernel invoked from a command group is affected by calls to set its value from that same command group, but it is not affected by calls from other command groups even if those calls are from another invocation of the same command group function object.
sycl::handler::set_specialization_constant#
template <auto& SpecName>
void set_specialization_constant(
typename std::remove_reference_t<decltype(SpecName)>::value_type value);
Effects: Sets the value of the specialization constant whose address is SpecName for this handler’s command group. If the specialization constant’s value was previously set in this same command group, the value is overwritten.
This function may be called even if the specialization constant SpecName is not used by the kernel that is invoked by this handler’s command group. Doing so has no effect on the invoked kernel.
Throws: An exception with the sycl::errc::invalid error code if a kernel bundle has been bound to the sycl::handler via use_kernel_bundle().
sycl::handler::get_specialization_constant#
template <auto& SpecName>
typename std::remove_reference_t<decltype(SpecName)>::value_type
get_specialization_constant();
Returns: The value of the specialization constant whose address is SpecName for this handler’s command group. If the value was previously set in this handler’s command group, that value is returned. Otherwise, the specialization constant’s default value is returned.
Throws: An exception with the sycl::errc::invalid error code if a kernel bundle has been bound to the sycl::handler via use_kernel_bundle().
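The set and get rules above (later values overwrite earlier ones, unset constants yield the default, and each command group has its own values) can be modeled in plain C++17. In the sketch below, `spec_id_model`, `handler_model`, and `tile_id` are hypothetical stand-ins rather than SYCL classes, and the `std::map` of `std::any` is only one possible way to store values. The member function signatures deliberately mirror the `template <auto& SpecName>` form shown above, to show how the value type is recovered from the referenced variable.

```cpp
#include <any>
#include <cassert>
#include <map>
#include <type_traits>

// Hypothetical stand-in for sycl::specialization_id<T>.
template <class T>
struct spec_id_model {
  using value_type = T;
  explicit constexpr spec_id_model(T v) : default_value(v) {}
  spec_id_model(const spec_id_model&) = delete;
  spec_id_model& operator=(const spec_id_model&) = delete;
  T default_value;
};

// Hypothetical stand-in for the per-command-group state of sycl::handler.
class handler_model {
 public:
  template <auto& SpecName>
  void set_specialization_constant(
      typename std::remove_reference_t<decltype(SpecName)>::value_type value) {
    values_[&SpecName] = value;  // keyed by address; overwrites earlier value
  }

  template <auto& SpecName>
  typename std::remove_reference_t<decltype(SpecName)>::value_type
  get_specialization_constant() const {
    using T = typename std::remove_reference_t<decltype(SpecName)>::value_type;
    const auto it = values_.find(&SpecName);
    if (it == values_.end()) return SpecName.default_value;  // never set
    return std::any_cast<T>(it->second);
  }

 private:
  std::map<const void*, std::any> values_;
};

// Declared constexpr at namespace scope, like a real specialization_id.
constexpr spec_id_model<int> tile_id{16};

// Two handler_model objects stand for two separate command groups.
inline bool demo_isolated_command_groups() {
  handler_model first;
  handler_model second;
  first.set_specialization_constant<tile_id>(32);
  first.set_specialization_constant<tile_id>(64);  // overwrites 32
  return first.get_specialization_constant<tile_id>() == 64 &&
         second.get_specialization_constant<tile_id>() == 16;
}
```

The model omits the sycl::errc::invalid exception that the real member functions throw once a kernel bundle has been bound to the handler.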
Reading the value of a specialization constant from device code#
In order to read the value of a specialization constant from device code, the SYCL kernel function must be declared to take an object of type kernel_handler as its last parameter. The SYCL runtime constructs this object, which has a member function for reading the specialization constant’s value.
sycl::kernel_handler#
class kernel_handler;
A synopsis of this class is shown below.
namespace sycl {
class kernel_handler {
 public:
  template <auto& SpecName>
  typename std::remove_reference_t<decltype(SpecName)>::value_type
  get_specialization_constant();
};
} // namespace sycl
Member functions#
sycl::kernel_handler::get_specialization_constant#
template<auto& SpecName>
typename std::remove_reference_t<decltype(SpecName)>::value_type
get_specialization_constant();
Returns: The value of the specialization constant whose address is SpecName. For a kernel invoked from a command group that was not bound to a kernel bundle, the value is the same as what would have been returned if sycl::handler::get_specialization_constant() was called immediately before invoking the kernel. For a kernel invoked from a command group that was bound to a kernel bundle, the value is the same as what would be returned if sycl::kernel_bundle::get_specialization_constant() was called on the bound bundle.
Example 1#
The following example performs a convolution and uses specialization constants to set the values of the coefficients.
#include <sycl/sycl.hpp>

#include <array>
#include <iostream>

using coeff_t = std::array<std::array<float, 3>, 3>;

// Identify the specialization constant.
constexpr sycl::specialization_id<coeff_t> coeff_id;

// Function to get coefficients (in this case, a simple example)
coeff_t get_coefficients() {
  // Read coefficients from somewhere or generate them
  return {{{1.0f, 2.0f, 1.0f}, {0.0f, 0.0f, 0.0f}, {-1.0f, -2.0f, -1.0f}}};
}

// Convolution kernel function
void do_conv(sycl::buffer<float, 2> &in, sycl::buffer<float, 2> &out,
             coeff_t coefficients) {
  sycl::property_list properties{sycl::property::queue::enable_profiling()};
  auto myQueue = sycl::queue(sycl::default_selector_v, properties);

  myQueue.submit([&](sycl::handler &cgh) {
    sycl::accessor in_acc{in, cgh, sycl::read_only};
    sycl::accessor out_acc{out, cgh, sycl::write_only};

    // Set the coefficients of the convolution as a specialization constant.
    // This will build a specific kernel with the coefficients available as
    // literals.
    cgh.set_specialization_constant<coeff_id>(coefficients);

    cgh.parallel_for<class Convolution>(
        in.get_range(), [=](sycl::item<2> item_id, sycl::kernel_handler h) {
          float acc = 0;
          coeff_t coeff = h.get_specialization_constant<coeff_id>();
          // Use signed indices so that the lower-bound checks are meaningful.
          const int rows = static_cast<int>(in_acc.get_range()[0]);
          const int cols = static_cast<int>(in_acc.get_range()[1]);
          const int row = static_cast<int>(item_id[0]);
          const int col = static_cast<int>(item_id[1]);
          for (int i = -1; i <= 1; i++) {
            if (row + i < 0 || row + i >= rows)
              continue;
            for (int j = -1; j <= 1; j++) {
              if (col + j < 0 || col + j >= cols)
                continue;
              // The underlying JIT can see all the values of the coeff array.
              acc += coeff[i + 1][j + 1] * in_acc[row + i][col + j];
            }
          }
          out_acc[item_id] = acc;
        });
  });

  myQueue.wait();
}

void printResults(sycl::buffer<float, 2> &input_buf,
                  sycl::buffer<float, 2> &output_buf,
                  const coeff_t &coefficients, size_t width, size_t height) {
  // Host accessors give the host read access to the buffer contents.
  sycl::host_accessor accessorIn{input_buf, sycl::read_only};
  sycl::host_accessor accessorOut{output_buf, sycl::read_only};

  std::cout << "Input:" << std::endl;
  for (size_t y = 0; y < height; ++y) {
    for (size_t x = 0; x < width; ++x) {
      std::cout << accessorIn[x][y] << " ";
    }
    std::cout << std::endl;
  }

  std::cout << "\nOutput after convolution:" << std::endl;
  for (size_t y = 0; y < height; ++y) {
    for (size_t x = 0; x < width; ++x) {
      std::cout << accessorOut[x][y] << " ";
    }
    std::cout << std::endl;
  }

  std::cout << "\nCoefficients:" << std::endl;
  for (const auto &row : coefficients) {
    for (const auto &value : row) {
      std::cout << value << " ";
    }
    std::cout << std::endl;
  }
}

int main() {
  constexpr size_t width = 10;
  constexpr size_t height = 10;

  // Create input and output buffers.
  // Note: the buffers are constructed from a range only, so their contents
  // are uninitialized and the input values are indeterminate.
  sycl::buffer<float, 2> input_buf(sycl::range<2>(width, height));
  sycl::buffer<float, 2> output_buf(sycl::range<2>(width, height));

  // Get coefficients
  coeff_t coefficients = get_coefficients();

  // Call convolution function
  do_conv(input_buf, output_buf, coefficients);

  // Print input, output, and coefficients using printResults function
  printResults(input_buf, output_buf, coefficients, width, height);

  return 0;
}
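Output (the example never initializes input_buf, so the input values are indeterminate and will differ from run to run):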
Input:
0 0 9.69007e-38 9.69894e-38 0 0 9.69061e-38 9.69894e-38 0 0
0 0 0 0 0 0 0 0 0 0
0 9.68996e-38 9.69894e-38 0 0 9.6905e-38 9.69894e-38 0 0 9.69104e-38
0 0 0 0 0 0 0 0 0 0
9.68986e-38 9.69894e-38 0 0 9.69039e-38 9.69894e-38 0 0 9.69093e-38 9.69894e-38
0 0 0 0 0 0 0 0 0 0
9.69894e-38 0 0 9.69029e-38 9.69894e-38 0 0 9.69082e-38 9.69894e-38 0
0 0 0 0 0 0 0 0 0 0
0 0 9.69018e-38 9.69894e-38 0 0 9.69072e-38 9.69894e-38 0 0
0 0 0 0 0 0 0 0 0 0
Output after convolution:
0 -1.93801e-37 -1.93979e-37 1.93801e-37 1.93979e-37 -1.93812e-37 -1.93979e-37 1.93812e-37 1.93979e-37 0
-9.68996e-38 -1.9389e-37 -8.97728e-41 1.9389e-37 8.43918e-41 -1.93895e-37 -8.43918e-41 1.93895e-37 7.90108e-41 0
-1.93799e-37 -1.93979e-37 1.93799e-37 1.93979e-37 -1.9381e-37 -1.93979e-37 1.9381e-37 1.93979e-37 -1.93821e-37 0
-1.93889e-37 -9.0849e-41 1.93889e-37 8.5468e-41 -1.93894e-37 -8.5468e-41 1.93894e-37 8.0087e-41 -1.939e-37 9.69093e-38
-1.93979e-37 1.93797e-37 1.93979e-37 -1.93808e-37 -1.93979e-37 1.93808e-37 1.93979e-37 -1.93819e-37 -1.93979e-37 1.93819e-37
-9.69894e-38 1.93888e-37 8.65442e-41 -1.93893e-37 -8.65442e-41 1.93893e-37 8.11632e-41 -1.93899e-37 -8.11632e-41 1.93899e-37
0 1.93979e-37 -1.93806e-37 -1.93979e-37 1.93806e-37 1.93979e-37 -1.93816e-37 -1.93979e-37 1.93816e-37 1.93979e-37
0 8.76204e-41 -1.93892e-37 -8.76204e-41 1.93892e-37 8.22394e-41 -1.93898e-37 -8.22394e-41 1.93898e-37 9.69894e-38
0 -1.93804e-37 -1.93979e-37 1.93804e-37 1.93979e-37 -1.93814e-37 -1.93979e-37 1.93814e-37 1.93979e-37 0
0 -9.69018e-38 -9.69894e-38 9.69018e-38 9.69894e-38 -9.69072e-38 -9.69894e-38 9.69072e-38 9.69894e-38 0
Coefficients:
1 2 1
0 0 0
-1 -2 -1