Name Strings

SPV_VALVE_mixed_float_dot_product

Contact

To report problems with this extension, please open a new issue at:

Contributors

  • Georg Lehmann, Valve

Notice

Copyright (c) 2026, Valve Corporation

Status

  • Complete

Version

Last Modified Date: 2026-02-04

Revision: 1

Dependencies

This extension is written against the SPIR-V Specification, Version 1.6 Revision 6.

This extension requires SPIR-V 1.0.

If DotProductBFloat16AccVALVE is used, SPV_KHR_bfloat16 is required.

If DotProductFloat8AccFloat32VALVE is used, SPV_EXT_float8 is required.

Overview

This extension introduces dot product operations on low-precision floating-point inputs, with accumulation at the same or a higher precision. The accepted input types are constrained by capabilities, of which this extension introduces four:

  • 2 component vector of 16bit float inputs with 32bit accumulation

  • 2 component vector of 16bit float inputs with 16bit accumulation

  • 2 component vector of bfloat16 inputs with 32bit or bfloat16 accumulation

  • 4 component vector of 8bit float inputs with 32bit accumulation

Extension Name

To use this extension within a SPIR-V module, the following OpExtension must be present in the module:

OpExtension "SPV_VALVE_mixed_float_dot_product"

Modifications to the SPIR-V Specification, Version 1.6

Capabilities

Modify Section 3.2.30, "Capability", adding these rows to the Capability table (these capabilities enable specific input types):

Capability                                                                 Implicitly declares

6912  DotProductFloat16AccFloat32VALVE                                     Float16
      Uses 16bit floating point dot product with 32bit accumulation

6913  DotProductFloat16AccFloat16VALVE                                     Float16
      Uses 16bit floating point dot product with 16bit accumulation

6914  DotProductBFloat16AccVALVE                                           BFloat16TypeKHR
      Uses 16bit BFloat16 dot product with BFloat16 or 32bit accumulation

6915  DotProductFloat8AccFloat32VALVE                                      Float8EXT
      Uses 8bit E4M3 or E5M2 dot product with 32bit accumulation

Instructions

Add the following new instructions:

OpFDot2MixAcc32VALVE

Floating point dot product of Vector 1 and Vector 2 and addition of the result with Accumulator.

Result Type must be a scalar floating-point type using the IEEE 754 encoding. The component width must be 32 bits.

Vector 1 and Vector 2 must have the same type.

Vector 1 and Vector 2 must be vectors of 2 components of floating point types using the IEEE 754 encoding (enabled by DotProductFloat16AccFloat32VALVE) or the BFloat16KHR encoding (enabled by DotProductBFloat16AccVALVE). The component width must be 16 bits.

The type of Accumulator must be the same as Result Type.

All components of the input vectors are converted to Result Type. The converted vectors are then multiplied component-wise and all components of the vector resulting from the component-wise multiplication are added together. Finally, the resulting sum is added to the input accumulator.
The exact order and precision of these operations are implementation-defined.

Capability: DotProductFloat16AccFloat32VALVE, DotProductBFloat16AccVALVE

Word Count: 6

Opcode: 6916

Operands: <id> Result Type, Result <id>, <id> Vector 1, <id> Vector 2, <id> Accumulator
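
The semantics of OpFDot2MixAcc32VALVE can be illustrated with a small reference model. The Python sketch below is not part of this specification. It assumes IEEE 754 binary16 inputs (the bfloat16 case is not modeled) and rounds to binary32 after every operation, which is only one of the evaluation strategies permitted by the implementation-defined order and precision. The names f16, f32 and fdot2_mix_acc32 are hypothetical.

```python
import struct


def f16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 binary16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def f32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def fdot2_mix_acc32(v1, v2, acc):
    """Hypothetical model: binary16 inputs, binary32 accumulator and result."""
    assert len(v1) == 2 and len(v2) == 2
    # Convert every input component to the 32-bit result type.
    # binary16 -> binary32 conversion is exact.
    a = [f32(f16(x)) for x in v1]
    b = [f32(f16(x)) for x in v2]
    # Component-wise multiply (assumption: round each product to binary32).
    products = [f32(x * y) for x, y in zip(a, b)]
    # Add the products together, then add the accumulator
    # (assumption: this particular order, rounding after each step).
    total = f32(products[0] + products[1])
    return f32(total + f32(acc))
```

In this model, fdot2_mix_acc32([1.0, 2.0], [3.0, 4.0], 0.5) evaluates to 11.5. With both vectors set to (1 + 2^-10, 0) and a zero accumulator, the result is 1 + 2^-9 + 2^-20, which keeps a low-order product bit that a 16-bit result could not represent.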

OpFDot2MixAcc16VALVE

Floating point dot product of Vector 1 and Vector 2 and addition of the result with Accumulator.

Result Type must be a scalar floating-point type and must be the same as the component type of Vector 1 and Vector 2.

Vector 1 and Vector 2 must have the same type.

Vector 1 and Vector 2 must be vectors of 2 components of floating point types using the IEEE 754 encoding (enabled by DotProductFloat16AccFloat16VALVE) or the BFloat16KHR encoding (enabled by DotProductBFloat16AccVALVE). The component width must be 16 bits.

The type of Accumulator must be the same as Result Type.

The vectors are multiplied component-wise and all components of the vector resulting from the component-wise multiplication are added together. Finally, the resulting sum is added to the input accumulator.
The exact order and precision of these operations are implementation-defined.

Capability: DotProductFloat16AccFloat16VALVE, DotProductBFloat16AccVALVE

Word Count: 6

Opcode: 6917

Operands: <id> Result Type, Result <id>, <id> Vector 1, <id> Vector 2, <id> Accumulator
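
As an illustration of OpFDot2MixAcc16VALVE, the Python sketch below models the IEEE 754 binary16 case only. It is not part of this specification. It assumes that every intermediate value is rounded to binary16, which is one permitted choice among several, since order and precision are implementation-defined. The names f16 and fdot2_mix_acc16 are hypothetical.

```python
import struct


def f16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 binary16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def fdot2_mix_acc16(v1, v2, acc):
    """Hypothetical model: binary16 inputs, accumulator and result."""
    assert len(v1) == 2 and len(v2) == 2
    a = [f16(x) for x in v1]
    b = [f16(x) for x in v2]
    # Component-wise multiply (assumption: round each product to binary16).
    products = [f16(x * y) for x, y in zip(a, b)]
    # Add the products together, then add the accumulator
    # (assumption: this particular order, rounding after each step).
    total = f16(products[0] + products[1])
    return f16(total + f16(acc))
```

With both vectors set to (1 + 2^-10, 0) and a zero accumulator, this model returns 1 + 2^-9, because the exact product 1 + 2^-9 + 2^-20 is not representable in binary16.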

OpFDot4MixAcc32VALVE

Floating point dot product of Vector 1 and Vector 2 and addition of the result with Accumulator.

Result Type must be a scalar floating-point type using the IEEE 754 encoding. The component width must be 32 bits.

Vector 1 and Vector 2 must be vectors of 4 components of floating point types using the Float8E4M3EXT or Float8E5M2EXT encoding. The component width must be 8 bits.

The type of Accumulator must be the same as Result Type.

All components of the input vectors are converted to Result Type. The converted vectors are then multiplied component-wise and all components of the vector resulting from the component-wise multiplication are added together. Finally, the resulting sum is added to the input accumulator.
The exact order and precision of these operations are implementation-defined.

Capability: DotProductFloat8AccFloat32VALVE

Word Count: 6

Opcode: 6918

Operands: <id> Result Type, Result <id>, <id> Vector 1, <id> Vector 2, <id> Accumulator
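
The following Python sketch illustrates OpFDot4MixAcc32VALVE for E4M3 inputs only and is not part of this specification. It assumes that Float8E4M3EXT follows the OCP FP8 E4M3 layout (1 sign bit, 4 exponent bits, 3 mantissa bits, bias 7, no infinities), and it rounds to binary32 after every operation, which is only one permitted evaluation strategy. E5M2 inputs are not modeled. The names decode_e4m3, f32 and fdot4_mix_acc32 are hypothetical.

```python
import math
import struct


def f32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def decode_e4m3(byte: int) -> float:
    """Decode one byte, assuming the OCP FP8 E4M3 layout (bias 7)."""
    sign = -1.0 if byte & 0x80 else 1.0
    exponent = (byte >> 3) & 0xF
    mantissa = byte & 0x7
    if exponent == 0xF and mantissa == 0x7:
        return math.nan  # assumed layout has no infinities; this is NaN
    if exponent == 0:
        # Subnormal: (mantissa / 8) * 2^(1 - 7) = mantissa * 2^-9
        return sign * mantissa * 2.0 ** -9
    return sign * (1.0 + mantissa / 8.0) * 2.0 ** (exponent - 7)


def fdot4_mix_acc32(v1, v2, acc):
    """Hypothetical model: four E4M3 bytes per vector, binary32 accumulator."""
    assert len(v1) == 4 and len(v2) == 4
    # Convert every input component to the 32-bit result type (exact).
    a = [f32(decode_e4m3(x)) for x in v1]
    b = [f32(decode_e4m3(x)) for x in v2]
    # Component-wise multiply, then sum left to right, then add the
    # accumulator (assumption: this order, rounding after each step).
    total = 0.0
    for x, y in zip(a, b):
        total = f32(total + f32(x * y))
    return f32(total + f32(acc))
```

Under these assumptions, the bytes 0x38, 0x40, 0x3C and 0xB8 decode to 1.0, 2.0, 1.5 and -1.0, so fdot4_mix_acc32([0x38, 0x40, 0x3C, 0xB8], [0x38, 0x38, 0x40, 0x40], 1.0) evaluates to 5.0.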

Issues

  1. How to define precision?

    Leave it implementation defined, like cooperative matrix multiply-add. Precision varies across supported hardware.
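
    To illustrate why a single result cannot be mandated, the Python sketch below evaluates the same 2-component dot product with 32-bit accumulation in two different orders. It is not part of this specification, both orders are merely examples of what an implementation might do, and the function names are hypothetical. All input values used are exactly representable as 16-bit floats.

```python
import struct


def f32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dot2_products_first(a, b, acc):
    """Sum the two products first, then add the accumulator."""
    p0 = f32(a[0] * b[0])
    p1 = f32(a[1] * b[1])
    return f32(f32(p0 + p1) + acc)


def dot2_accumulator_first(a, b, acc):
    """Add the accumulator to the second product first, then the first product."""
    p0 = f32(a[0] * b[0])
    p1 = f32(a[1] * b[1])
    return f32(p0 + f32(p1 + acc))


a = [4096.0, 1.0]
b = [4096.0, 1.0]
products_first = dot2_products_first(a, b, 1.0)
accumulator_first = dot2_accumulator_first(a, b, 1.0)
```

    Here products_first is 16777216.0 and accumulator_first is 16777218.0. The first order rounds 2^24 + 1 down to 2^24 before the accumulator is added, while the second order adds 2 to 2^24 exactly.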

Revision History

Rev  Date        Author         Changes

1    2026-02-04  Georg Lehmann  Initial revision