sycl::private_memory

template <typename T, int Dimensions = 1>
class private_memory;

Warning

Based on developer and implementation feedback, the hierarchical data parallel kernel feature is undergoing improvements to better align with the frameworks and patterns prevalent in modern programming.

Because this is a key part of the SYCL API and changes to it are expected, we temporarily recommend that new code not use this feature until the updated API is finalized in a near-future version of the SYCL specification. At that point, the updated feature will be recommended for new code.

Existing code that uses this feature will continue to be supported by conformant implementations of this specification.

The sycl::private_memory class constructs one private object of type T for every work-item in a given group. Each access takes the sycl::h_item of the current work-item and returns the object that belongs to that work-item.

Wrapping data in sycl::private_memory guarantees that it is stored in private, per-work-item memory within a hierarchical parallel_for invocation.

Private memory is allocated per underlying work-item, not per iteration of the parallel_for_work_item loop.

The number of instances of a private memory object is only under direct control if a work-group size is passed to the parallel_for_work_group call. If the underlying work-group size is chosen by the runtime, the number of private memory instances is opaque to the program.

Explicit private memory declarations should therefore be used with care and with a full understanding of which instances of a parallel_for_work_item loop will share the same underlying variable.
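The sharing described above can be illustrated with a small host-only model. This is a sketch in plain C++, not SYCL code: PrivateMemoryModel, owner_of, and count_iterations_per_slot are hypothetical names introduced here. The strided mapping of loop iterations onto work-items is an assumption made for illustration, and a real implementation may distribute iterations differently. The model assumes a parallel_for_work_item loop that runs more iterations than there are underlying work-items.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-only model (NOT the SYCL API): one slot per underlying work-item,
// mirroring how a private_memory object holds one T per work-item.
template <typename T>
class PrivateMemoryModel {
public:
  // Note: std::vector value-initializes its elements (an int starts at 0).
  // This is a convenience of the model only.
  explicit PrivateMemoryModel(std::size_t physicalSize) : slots_(physicalSize) {
    assert(physicalSize > 0);
  }

  // Access the slot that belongs to one underlying work-item.
  T& at(std::size_t physicalId) { return slots_.at(physicalId); }

  std::size_t instances() const { return slots_.size(); }

private:
  std::vector<T> slots_;
};

// ASSUMPTION: loop iterations are spread over the underlying work-items in a
// strided fashion. Real implementations may choose a different mapping.
inline std::size_t owner_of(std::size_t logicalId, std::size_t physicalSize) {
  assert(physicalSize > 0);
  return logicalId % physicalSize;
}

// Model one work-item loop with `logicalSize` iterations over a work-group of
// `physicalSize` work-items, and count how many iterations touched each slot.
inline std::vector<int> count_iterations_per_slot(std::size_t physicalSize,
                                                  std::size_t logicalSize) {
  PrivateMemoryModel<int> counter(physicalSize);
  for (std::size_t logicalId = 0; logicalId < logicalSize; ++logicalId) {
    ++counter.at(owner_of(logicalId, physicalSize));
  }

  std::vector<int> result;
  for (std::size_t p = 0; p < counter.instances(); ++p) {
    result.push_back(counter.at(p));
  }
  return result;
}
```

In this model, count_iterations_per_slot(4, 8) returns a count of 2 for each of the 4 slots: two loop iterations shared every private instance. When the iteration count equals the work-group size, each slot is touched exactly once.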

See also

Parallel for hierarchical invoke

SYCL Specification Section 4.7.5

(constructors)

private_memory(const sycl::group<Dimensions>&);

Place an object of type T in the underlying private memory of each work-item. The type T must be default constructible.

The constructor of T is called once for each work-item.

Member functions

T& operator()(const sycl::h_item<Dimensions>& id);

Retrieve a reference to the object that belongs to the work-item identified by id.

Example 1

Example usage of sycl::private_memory:

#include <sycl/sycl.hpp>

#include <iostream>

int main() {
  sycl::queue myQueue;

  myQueue.submit([&](sycl::handler &cgh) {
    sycl::stream out(256, 256, cgh);

    // Issue 8 work-groups of 8 work-items each
    cgh.parallel_for_work_group(
        sycl::range<3>(2, 2, 2), sycl::range<3>(2, 2, 2),
        [=](sycl::group<3> myGroup) {
          //[work-group code]

          // this variable is shared between work-items
          int myLocal;

          // this variable will be instantiated for each work-item separately
          sycl::private_memory<int, 3> myPrivate(myGroup);

          // Issue parallel work-items.  The number issued per work-group is
          // determined by the work-group size range of parallel_for_work_group.
          // In this case, 8 work-items will execute the parallel_for_work_item
          // body for each of the 8 work-groups, resulting in 64 executions
          // in total.
          myGroup.parallel_for_work_item([&](sycl::h_item<3> myItem) {
            //[work-item code]
            sycl::id<3> myItemId = myItem.get_local_id();
            myPrivate(myItem) = myItemId[0] + myItemId[1] + myItemId[2];
          });

          // Implicit work-group barrier

          // Carry private value across loops
          myGroup.parallel_for_work_item([&](sycl::h_item<3> myItem) {
            //[work-item code]
            out << myPrivate(myItem) << ' ';
          });
          out << sycl::stream_manipulator::endl;

          //[work-group code]
        });
  });

  // Wait for the kernel to complete before main() returns
  myQueue.wait();
}

Output example (the order in which values appear may vary between implementations):

0 1 1 2 1 2 2 3
0 1 1 2 1 2 2 3
0 1 1 2 1 2 2 3
0 1 1 2 1 2 2 3
0 1 1 2 1 2 2 3
0 1 1 2 1 2 2 3
0 1 1 2 1 2 2 3
0 1 1 2 1 2 2 3