Specialization Constants#

Device code can make use of specialization constants which represent constants whose values can be set dynamically during execution of the SYCL application. The values of these constants are fixed when a SYCL kernel function is invoked, and they do not change during the execution of the kernel. However, the application is able to set a new value for a specialization constant each time a kernel is invoked, so the values can be tuned differently for each invocation.

There are two methods for an application to use specialization constants. One method requires creating a sycl::kernel_bundle object and the other does not. The syntax for both methods is mostly the same. Both methods declare specialization constants in the same way, and kernels read their values in the same way. The main difference is whether their values are set via sycl::handler::set_specialization_constant() or via sycl::kernel_bundle::set_specialization_constant(). These two methods are incompatible with one another, so they may not both be used by the same command group.

Warning

Implementations that support online compilation of kernel bundles will likely implement both methods of specialization constants using kernel bundles, so applications should expect some overhead when invoking a kernel with new values for its specialization constants. A typical implementation records the values set via sycl::handler::set_specialization_constant() and remembers them until a kernel is invoked (e.g. via sycl::handler::parallel_for()). At that point, the implementation determines the bundle that contains the invoked kernel. If that bundle has already been compiled for the handler’s device with the correct values for the specialization constants, the kernel is scheduled for invocation. Otherwise, the implementation compiles the bundle before scheduling the kernel. Applications that frequently change the values of specialization constants may therefore see an overhead from recompiling the kernel’s bundle.
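The caching behaviour described in this warning can be pictured with a small host-only toy model. It is not SYCL code and does not reflect any particular implementation: the class name BundleCacheModel and its members are hypothetical, and a "compilation" is just a counter increment. It only shows why reusing a previously seen value is cheap while a new value is not.

```cpp
#include <cstddef>
#include <set>
#include <utility>

// Toy model (NOT SYCL, names are hypothetical): a bundle is "compiled" once
// for each distinct (device, specialization value) pair it is invoked with.
class BundleCacheModel {
 public:
  // Returns true if this invocation had to trigger a simulated compilation.
  bool invoke(int device_id, int spec_value) {
    // insert() reports whether the key was newly added to the set.
    const bool is_new = built_.insert({device_id, spec_value}).second;
    if (is_new) {
      ++compile_count_;  // stands in for an expensive online compilation
    }
    return is_new;
  }

  std::size_t compile_count() const { return compile_count_; }

 private:
  std::set<std::pair<int, int>> built_;
  std::size_t compile_count_ = 0;
};
```

In this model, invoking with the values 4, 4, 8, 4 on the same device compiles twice, whereas invoking with a fresh value on every call compiles every time.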

See also

SYCL Specification Section 4.9.5

sycl::specialization_id#

template <class... Args>
explicit constexpr specialization_id(Args&&... args);

Constraints: Available only when std::is_constructible_v<T, Args...> evaluates to true.

Effects: Constructs a specialization_id containing an instance of T initialized with args..., which represents the specialization constant’s default value.

Special member functions#

specialization_id(const specialization_id& rhs) = delete;            // (1)

specialization_id(specialization_id&& rhs) = delete;                 // (2)

specialization_id& operator=(const specialization_id& rhs) = delete; // (3)

specialization_id& operator=(specialization_id&& rhs) = delete;      // (4)

1. Deleted copy constructor.

2. Deleted move constructor.

3. Deleted copy assignment operator.

4. Deleted move assignment operator.
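To make the constructor constraint and the deleted special member functions concrete, here is a minimal sketch in standard C++17. The class spec_id_sketch, its default_value() accessor, and the Window type are hypothetical stand-ins written for illustration; they are not the SYCL class and say nothing about how an implementation stores the default value.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-in, NOT sycl::specialization_id.
template <typename T>
class spec_id_sketch {
 public:
  using value_type = T;

  // Available only when T can be constructed from Args...
  template <class... Args,
            std::enable_if_t<std::is_constructible_v<T, Args...>, int> = 0>
  explicit constexpr spec_id_sketch(Args&&... args)
      : default_value_(std::forward<Args>(args)...) {}

  // The identifier can be neither copied nor moved.
  spec_id_sketch(const spec_id_sketch&) = delete;
  spec_id_sketch(spec_id_sketch&&) = delete;
  spec_id_sketch& operator=(const spec_id_sketch&) = delete;
  spec_id_sketch& operator=(spec_id_sketch&&) = delete;

  // Accessor added for this sketch only, so the default can be inspected.
  constexpr const T& default_value() const { return default_value_; }

 private:
  T default_value_;
};

// Example value type with a two-argument constexpr constructor.
struct Window {
  int w;
  int h;
  constexpr Window(int w_, int h_) : w(w_), h(h_) {}
};

constexpr spec_id_sketch<int> answer_id{42};       // T initialized from one argument
constexpr spec_id_sketch<Window> window_id{3, 5};  // T initialized from two arguments

static_assert(answer_id.default_value() == 42);
static_assert(window_id.default_value().h == 5);
// Arguments that cannot construct T are rejected by the constraint.
static_assert(!std::is_constructible_v<spec_id_sketch<Window>, const char*>);
// Copy and move operations are deleted.
static_assert(!std::is_copy_constructible_v<spec_id_sketch<int>>);
static_assert(!std::is_move_assignable_v<spec_id_sketch<int>>);
```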

Declaring a specialization constant#

Specialization constants must be declared using the specialization_id class with the following restrictions:

  • the template parameter T must be a device copyable type;

  • the specialization_id variable must be declared as constexpr;

  • the specialization_id variable must be declared in either namespace scope or in class scope;

  • if the specialization_id variable is declared in class scope, it must have public accessibility when referenced from namespace scope;

  • the specialization_id variable may not be shadowed by another identifier X which has the same name and is declared in an inline namespace, such that the specialization_id variable is no longer accessible after the declaration of X;

  • if the specialization_id variable is declared in a namespace, none of the enclosing namespace names N may be shadowed by another identifier X which has the same name as N and is declared in an inline namespace, such that N is no longer accessible after the declaration of X.

Warning

The expectation is that some implementations may conceptually insert code at the end of a translation unit which references each specialization_id variable that is declared in that translation unit. The restrictions listed above make this possible by ensuring that these variables are accessible at the end of the translation unit.

Setting and getting the value of a specialization constant#

If the application uses specialization constants without creating a sycl::kernel_bundle object, it can set and get their values from command group scope by calling member functions of the handler class. These member functions have a template parameter SpecName whose value must be a reference to a variable of type specialization_id, which defines the type and default value of the specialization constant.

When not using a kernel bundle, the value of a specialization constant that is used in a kernel invoked from a command group is affected by calls to set its value from that same command group, but it is not affected by calls from other command groups even if those calls are from another invocation of the same command group function object.

sycl::handler::set_specialization_constant#

template <auto& SpecName>
void set_specialization_constant(
    typename std::remove_reference_t<decltype(SpecName)>::value_type value);

Effects: Sets the value of the specialization constant whose address is SpecName for this handler’s command group. If the specialization constant’s value was previously set in this same command group, the value is overwritten.

This function may be called even if the specialization constant SpecName is not used by the kernel that is invoked by this handler’s command group. Doing so has no effect on the invoked kernel.

Throws:

An exception with the sycl::errc::invalid error code if a kernel bundle has been bound to the sycl::handler via use_kernel_bundle().
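The shape of this signature, a reference non-type template parameter whose value type is recovered with std::remove_reference_t, can be tried out in standard C++17 without a SYCL toolchain. In the sketch below, id_sketch, spec_value_t, address_of, and the three identifiers are hypothetical names introduced only for illustration.

```cpp
#include <type_traits>

// Hypothetical stand-in for an identifier type; NOT the SYCL class.
template <typename T>
struct id_sketch {
  using value_type = T;
};

constexpr id_sketch<float> gain_id{};
constexpr id_sketch<int> taps_id{};
constexpr id_sketch<int> other_taps_id{};

// decltype(SpecName) is a reference type, so the reference is removed
// before the nested value_type can be named.
template <auto& SpecName>
using spec_value_t =
    typename std::remove_reference_t<decltype(SpecName)>::value_type;

static_assert(std::is_same_v<spec_value_t<gain_id>, float>);
static_assert(std::is_same_v<spec_value_t<taps_id>, int>);

// The template argument is the object itself, so two identifiers of the
// same type still name two different constants.
template <auto& SpecName>
const void* address_of() {
  return &SpecName;
}
```

This is why the wording refers to the constant "whose address is SpecName": in the sketch, taps_id and other_taps_id have the same type but address_of returns a different pointer for each.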

sycl::handler::get_specialization_constant#

template <auto& SpecName>
typename std::remove_reference_t<decltype(SpecName)>::value_type
get_specialization_constant();

Returns: The value of the specialization constant whose address is SpecName for this handler’s command group. If the value was previously set in this handler’s command group, that value is returned. Otherwise, the specialization constant’s default value is returned.

Throws:

An exception with the sycl::errc::invalid error code if a kernel bundle has been bound to the sycl::handler via use_kernel_bundle().

Reading the value of a specialization constant from device code#

In order to read the value of a specialization constant from device code, the SYCL kernel function must be declared to take an object of type kernel_handler as its last parameter. The SYCL runtime constructs this object, which has a member function for reading the specialization constant’s value.

sycl::kernel_handler#

class kernel_handler;

A synopsis of this class is shown below.

namespace sycl {

class kernel_handler {
 public:
  template <auto& SpecName>
  typename std::remove_reference_t<decltype(SpecName)>::value_type
  get_specialization_constant();
};

} // namespace sycl

Member functions#

sycl::kernel_handler::get_specialization_constant#

template<auto& SpecName>
typename std::remove_reference_t<decltype(SpecName)>::value_type
get_specialization_constant();

Returns: The value of the specialization constant whose address is SpecName. For a kernel invoked from a command group that was not bound to a kernel bundle, the value is the same as what would have been returned if sycl::handler::get_specialization_constant() was called immediately before invoking the kernel. For a kernel invoked from a command group that was bound to a kernel bundle, the value is the same as what would be returned if sycl::kernel_bundle::get_specialization_constant() was called on the bound bundle.

Example 1#

The following example performs a convolution and uses specialization constants to set the values of the coefficients.

#include <sycl/sycl.hpp>

#include <array>
#include <iostream>

using coeff_t = std::array<std::array<float, 3>, 3>;

// Identify the specialization constant.
constexpr sycl::specialization_id<coeff_t> coeff_id;

// Function to get coefficients (in this case, a simple example)
coeff_t get_coefficients() {
  // Read coefficients from somewhere or generate them
  return {{{1.0f, 2.0f, 1.0f}, {0.0f, 0.0f, 0.0f}, {-1.0f, -2.0f, -1.0f}}};
}

// Submit the convolution kernel and wait for it to finish.
void do_conv(sycl::buffer<float, 2> &in, sycl::buffer<float, 2> &out,
             coeff_t coefficients) {
  sycl::queue myQueue{sycl::default_selector_v};

  myQueue.submit([&](sycl::handler &cgh) {
    sycl::accessor in_acc{in, cgh, sycl::read_only};
    sycl::accessor out_acc{out, cgh, sycl::write_only};

    // Set the coefficients of the convolution as a specialization constant.
    // This lets the implementation build a kernel specialized for these
    // values, as if they were literals.
    cgh.set_specialization_constant<coeff_id>(coefficients);

    cgh.parallel_for<class Convolution>(
        in.get_range(), [=](sycl::item<2> item_id, sycl::kernel_handler h) {
          float acc = 0;
          coeff_t coeff = h.get_specialization_constant<coeff_id>();
          // Use signed indices so that the "< 0" border checks are meaningful.
          const int rows = static_cast<int>(in_acc.get_range()[0]);
          const int cols = static_cast<int>(in_acc.get_range()[1]);
          for (int i = -1; i <= 1; i++) {
            const int row = static_cast<int>(item_id[0]) + i;
            if (row < 0 || row >= rows)
              continue;
            for (int j = -1; j <= 1; j++) {
              const int col = static_cast<int>(item_id[1]) + j;
              if (col < 0 || col >= cols)
                continue;
              // The underlying JIT can see all the values of the array
              // returned by get_specialization_constant().
              acc += coeff[i + 1][j + 1] * in_acc[row][col];
            }
          }
          out_acc[item_id] = acc;
        });
  });

  myQueue.wait();
}

void printResults(sycl::buffer<float, 2> &input_buf,
                  sycl::buffer<float, 2> &output_buf,
                  const coeff_t &coefficients, size_t width, size_t height) {
  // Host accessors give the host read access to the buffer contents.
  sycl::host_accessor accessorIn{input_buf, sycl::read_only};
  sycl::host_accessor accessorOut{output_buf, sycl::read_only};

  std::cout << "Input:" << std::endl;
  for (size_t y = 0; y < height; ++y) {
    for (size_t x = 0; x < width; ++x) {
      std::cout << accessorIn[x][y] << " ";
    }
    std::cout << std::endl;
  }

  std::cout << "\nOutput after convolution:" << std::endl;
  for (size_t y = 0; y < height; ++y) {
    for (size_t x = 0; x < width; ++x) {
      std::cout << accessorOut[x][y] << " ";
    }
    std::cout << std::endl;
  }

  std::cout << "\nCoefficients:" << std::endl;
  for (const auto &row : coefficients) {
    for (const auto &value : row) {
      std::cout << value << " ";
    }
    std::cout << std::endl;
  }
}

int main() {
  constexpr size_t width = 10;
  constexpr size_t height = 10;

  // Create input and output buffers. The input buffer is never initialized
  // in this example, so the printed values are indeterminate.
  sycl::buffer<float, 2> input_buf(sycl::range<2>(width, height));
  sycl::buffer<float, 2> output_buf(sycl::range<2>(width, height));

  // Get coefficients
  coeff_t coefficients = get_coefficients();

  // Call convolution function
  do_conv(input_buf, output_buf, coefficients);

  // Print input, output, and coefficients using printResults function
  printResults(input_buf, output_buf, coefficients, width, height);

  return 0;
}

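Output (the example never initializes the input buffer, so the input and output values shown here are indeterminate and vary between runs):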

Input:
0 0 9.69007e-38 9.69894e-38 0 0 9.69061e-38 9.69894e-38 0 0
0 0 0 0 0 0 0 0 0 0
0 9.68996e-38 9.69894e-38 0 0 9.6905e-38 9.69894e-38 0 0 9.69104e-38
0 0 0 0 0 0 0 0 0 0
9.68986e-38 9.69894e-38 0 0 9.69039e-38 9.69894e-38 0 0 9.69093e-38 9.69894e-38
0 0 0 0 0 0 0 0 0 0
9.69894e-38 0 0 9.69029e-38 9.69894e-38 0 0 9.69082e-38 9.69894e-38 0
0 0 0 0 0 0 0 0 0 0
0 0 9.69018e-38 9.69894e-38 0 0 9.69072e-38 9.69894e-38 0 0
0 0 0 0 0 0 0 0 0 0

Output after convolution:
0 -1.93801e-37 -1.93979e-37 1.93801e-37 1.93979e-37 -1.93812e-37 -1.93979e-37 1.93812e-37 1.93979e-37 0
-9.68996e-38 -1.9389e-37 -8.97728e-41 1.9389e-37 8.43918e-41 -1.93895e-37 -8.43918e-41 1.93895e-37 7.90108e-41 0
-1.93799e-37 -1.93979e-37 1.93799e-37 1.93979e-37 -1.9381e-37 -1.93979e-37 1.9381e-37 1.93979e-37 -1.93821e-37 0
-1.93889e-37 -9.0849e-41 1.93889e-37 8.5468e-41 -1.93894e-37 -8.5468e-41 1.93894e-37 8.0087e-41 -1.939e-37 9.69093e-38
-1.93979e-37 1.93797e-37 1.93979e-37 -1.93808e-37 -1.93979e-37 1.93808e-37 1.93979e-37 -1.93819e-37 -1.93979e-37 1.93819e-37
-9.69894e-38 1.93888e-37 8.65442e-41 -1.93893e-37 -8.65442e-41 1.93893e-37 8.11632e-41 -1.93899e-37 -8.11632e-41 1.93899e-37
0 1.93979e-37 -1.93806e-37 -1.93979e-37 1.93806e-37 1.93979e-37 -1.93816e-37 -1.93979e-37 1.93816e-37 1.93979e-37
0 8.76204e-41 -1.93892e-37 -8.76204e-41 1.93892e-37 8.22394e-41 -1.93898e-37 -8.22394e-41 1.93898e-37 9.69894e-38
0 -1.93804e-37 -1.93979e-37 1.93804e-37 1.93979e-37 -1.93814e-37 -1.93979e-37 1.93814e-37 1.93979e-37 0
0 -9.69018e-38 -9.69894e-38 9.69018e-38 9.69894e-38 -9.69072e-38 -9.69894e-38 9.69072e-38 9.69894e-38 0

Coefficients:
1 2 1
0 0 0
-1 -2 -1