Name Strings
SPV_NV_tensor_addressing
Contact
To report problems with this extension, please open a new issue at:
Contributors
- Jeff Bolz, NVIDIA
- Karthik Vaidyanathan, NVIDIA
Notice
Copyright (c) 2024 NVIDIA Corp.
Status
- Complete
Version
Last Modified Date | 2024-09-18 |
Revision | 1 |
Dependencies
This extension is written against the SPIR-V Specification, Version 1.6, Revision 3, Unified.
This extension requires SPIR-V 1.6.
Overview
This extension adds tensor layout and view types, which initially can be used with SPV_NV_cooperative_matrix2. It is written as a separate extension so that it can potentially be used with other extensions in the future.
Extension Name
To use this extension within a SPIR-V module, the following OpExtension must be present in the module:
OpExtension "SPV_NV_tensor_addressing"
Modifications to the SPIR-V Specification, Version 1.6
2.2 Terms
Add new terms to section 2.2.2 Types:
Tensor Layout: An opaque collection of values manipulated by OpTensorLayout instructions, and used for tensor addressing calculations when loading and storing cooperative matrices.
Tensor View: An opaque collection of values manipulated by OpTensorView instructions, and used for tensor addressing calculations when loading and storing cooperative matrices.
Add Tensor Layout and Tensor View to the list of Opaque Types.
3.31 Capabilities
Modify Section 3.31, "Capability", adding these rows to the Capability table:
Capability | | Enabling Capabilities |
---|---|---|
5439 | TensorAddressingNV | |
3.X Tensor Layout and View
Tensor layout and tensor view types are representations of the mapping between matrix coordinates and tensor memory layout. They each have a number of dimensions in the range [1,5], with dimension 0 being the outermost dimension and the last dimension being the innermost. These types have the following logical state:
struct tensorLayoutNV<uint32_t Dim,
                      TensorClampMode Mode = TensorClampModeUndefined>
{
    static constexpr uint32_t LDim = Dim;
    static constexpr TensorClampMode clampMode = Mode;
    uint32_t blockSize[LDim];
    uint32_t layoutDimension[LDim];
    uint32_t stride[LDim];
    int32_t offset[LDim];
    uint32_t span[LDim];
    uint32_t clampValue;
};
struct tensorViewNV<uint32_t Dim, bool hasDimensions, uint32_t p0, ..., uint32_t p<Dim-1>>
{
    static constexpr uint32_t VDim = Dim;
    static constexpr bool hasDim = hasDimensions;
    static constexpr uint32_t permutation[VDim] = {p0, ..., p<Dim-1>};
    uint32_t viewDimension[VDim];
    uint32_t viewStride[VDim];
    uint32_t clipRowOffset, clipRowSpan, clipColOffset, clipColSpan;
};
A tensor layout represents the layout of values in memory (number of dimensions and size), along with a region being accessed (offset and span).
---------------------------------------------------------------------------
|                            layoutDimension1                             |
|                                                                         |
|                                                                         |
|                                                                         |
|                                                                         |
|                                                                         |
|                                                                         |
|                                                                         |
|                                span1                                    |
|                          -----------------                              |
|                          |               |                              |
|                          |               |                              |
|                          |     slice     | span0                        |
|                          |               |              layoutDimension0|
|                          |               |                              |
|      offset1             |               |                              |
|  --------------->        -----------------                              |
|                                                                         |
|                          ^                                              |
|                          |                                              |
|                          |                                              |
|                          | offset0                                      |
|                          |                                              |
|                          |                                              |
|                          |                                              |
|                          |                                              |
---------------------------------------------------------------------------
Figure: A 2D tensor layout, and a slice selecting a region within it.
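As an illustration of the slice in the figure, the following C++ sketch checks that a slice lies inside its layout and computes a linear element index for a coordinate within the slice. Everything here is hypothetical: the names LayoutSketch, sliceInBounds and elementIndex are not part of this extension, strides are assumed to be in units of elements, and the index formula is only one plausible interpretation, since this extension leaves the actual addressing calculations to other extensions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of part of the logical tensor layout state.
// Not an API defined by this extension.
template <std::size_t Dim>
struct LayoutSketch {
    std::array<uint32_t, Dim> layoutDimension;
    std::array<uint32_t, Dim> stride;  // assumed to be in elements
    std::array<int32_t, Dim> offset;
    std::array<uint32_t, Dim> span;
};

// True when [offset, offset + span) fits inside the layout in every dimension.
template <std::size_t Dim>
bool sliceInBounds(const LayoutSketch<Dim>& l) {
    for (std::size_t i = 0; i < Dim; ++i) {
        if (l.offset[i] < 0) return false;
        const uint64_t end = uint64_t(l.offset[i]) + l.span[i];
        if (end > l.layoutDimension[i]) return false;
    }
    return true;
}

// Assumed addressing rule (not specified by this extension):
// index = sum over i of (offset[i] + coord[i]) * stride[i].
template <std::size_t Dim>
int64_t elementIndex(const LayoutSketch<Dim>& l,
                     const std::array<uint32_t, Dim>& coord) {
    int64_t index = 0;
    for (std::size_t i = 0; i < Dim; ++i) {
        index += (int64_t(l.offset[i]) + int64_t(coord[i])) * int64_t(l.stride[i]);
    }
    return index;
}
```

For a 16x32 layout with packed strides {32, 1}, offset {4, 8} and span {4, 6}, slice coordinate (0, 0) maps to element 4 * 32 + 8 under these assumptions.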
A tensor view allows reinterpreting the dimensions of the region being accessed, including changing the number of dimensions, reordering the dimensions as they are loaded or stored, and clipping the region of the matrix that is loaded or stored. Often the span will have the same number of elements as the matrix, but in some more advanced uses that may not be the case.
How the addressing calculations are performed is left to other extensions to define.
Unlike some other ML APIs, tensor layouts and views only describe addressing calculations and never involve making copies of tensors. For this reason, the functionality is slightly more limited (e.g. there’s no way to slice, then permute, then slice again).
OpTensorLayout and OpTensorView instructions operate by copying the existing object state, updating the requested state, and returning the result as a new object. Some of these instructions initialize multiple related pieces of state, setting some to common default values, so the order of the operations matters.
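The copy-and-update behavior, and why ordering matters, can be sketched in C++ as follows. This is a minimal model, not the extension's definition: LayoutState, withDimension and withStride are hypothetical names, and the defaults chosen when dimensions are set (packed strides, zero offset, span equal to the dimension) are assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical value type standing in for the opaque layout state.
template <std::size_t Dim>
struct LayoutState {
    std::array<uint32_t, Dim> layoutDimension{};
    std::array<uint32_t, Dim> stride{};
    std::array<int32_t, Dim> offset{};
    std::array<uint32_t, Dim> span{};
};

// Returns a copy with new dimensions. Related state is reset to assumed
// defaults: packed strides, zero offset, span covering the whole dimension.
template <std::size_t Dim>
LayoutState<Dim> withDimension(LayoutState<Dim> s,
                               const std::array<uint32_t, Dim>& dims) {
    s.layoutDimension = dims;
    uint32_t packed = 1;
    for (std::size_t i = Dim; i-- > 0;) {  // the last dimension is innermost
        s.stride[i] = packed;
        packed *= dims[i];
        s.offset[i] = 0;
        s.span[i] = dims[i];
    }
    return s;
}

// Returns a copy with only the strides replaced.
template <std::size_t Dim>
LayoutState<Dim> withStride(LayoutState<Dim> s,
                            const std::array<uint32_t, Dim>& strides) {
    s.stride = strides;
    return s;
}
```

In this model, setting dimensions and then strides keeps the custom strides, while setting strides and then dimensions overwrites them with the packed defaults. The input object is never modified in either case.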
3.X Tensor Clamp Mode
New section in 3 "Binary Form".
Tensor Clamp Mode | | Enabling Capabilities |
---|---|---|
0 | Undefined | |
1 | Constant | |
2 | ClampToEdge | |
3 | Repeat | |
4 | RepeatMirrored | |
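The enumerant values in the table can be mirrored in C++ as shown below. The enum values come from the table; the resolveCoord helper is purely an assumption, modeled on texture-sampler style wrapping, because this extension does not define how each mode treats out-of-range coordinates. The function name and the -1 sentinel are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Values taken from the Tensor Clamp Mode table.
enum class TensorClampMode : uint32_t {
    Undefined = 0,
    Constant = 1,
    ClampToEdge = 2,
    Repeat = 3,
    RepeatMirrored = 4,
};

// Assumed interpretation, for illustration only. Maps coordinate c into
// [0, dim) for the wrapping modes; returns -1 when this sketch produces no
// in-range coordinate (Undefined and Constant). Requires dim > 0.
int64_t resolveCoord(TensorClampMode mode, int64_t c, int64_t dim) {
    if (c >= 0 && c < dim) return c;  // already in range
    switch (mode) {
    case TensorClampMode::ClampToEdge:
        return std::min(std::max<int64_t>(c, 0), dim - 1);
    case TensorClampMode::Repeat:
        return ((c % dim) + dim) % dim;  // non-negative modulo
    case TensorClampMode::RepeatMirrored: {
        const int64_t period = 2 * dim;
        const int64_t m = ((c % period) + period) % period;
        return m < dim ? m : period - 1 - m;  // reflect the second half
    }
    default:
        return -1;  // Undefined / Constant: no element is addressed here
    }
}
```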
3.49.6 Type-Declaration Instructions
3.X Tensor Layout and View Instructions
New section in 3 "Binary Form".
Issues
Revision History
Rev | Date | Author | Changes |
---|---|---|---|
1 | 2024-09-18 | Jeff Bolz | Initial revision of SPV_NV_tensor_addressing |