SYCL and the SYCL logo are trademarks of the Khronos Group Inc.
## Coalesced Global Memory
## Learning Objectives
* Learn about coalesced global memory access
* Learn about the performance impact
* Learn about row-major vs column-major
* Learn about SoA vs AoS
#### Coalesced global memory
* Reading from and writing to global memory is generally very expensive.
* It often involves copying data across an off-chip bus.
* This means you generally want to avoid unnecessary accesses.
* Memory access operations are performed in chunks.
* This means accessing data that is physically close together in memory is more efficient.
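
The effect of chunked access can be sketched in plain C++ by counting how many chunks a group of adjacent work-items touches. This is only an illustrative model: the function name `chunks_touched` is hypothetical, the buffer is assumed to start on a chunk boundary, and the chunk size is a parameter because the real value depends on the hardware.

```cpp
#include <cstddef>
#include <set>

// Hypothetical helper (not part of SYCL): counts how many distinct memory
// chunks are touched when work-item i of a group reads element i * stride.
// Assumes the buffer starts on a chunk boundary.
std::size_t chunks_touched(std::size_t group_size, std::size_t stride,
                           std::size_t elem_bytes, std::size_t chunk_bytes) {
  std::set<std::size_t> chunks;
  for (std::size_t i = 0; i < group_size; ++i) {
    // Byte offset of the element read by work-item i.
    std::size_t byte_offset = i * stride * elem_bytes;
    // Index of the chunk that contains that byte.
    chunks.insert(byte_offset / chunk_bytes);
  }
  return chunks.size();
}
```

With 16 work-items reading 4-byte elements and an assumed 64-byte chunk, a stride of 1 touches a single chunk, whereas a stride of 16 touches 16 separate chunks for the same amount of useful data.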
#### Coalesced global memory
![SYCL](../common-revealjs/images/coalesced_global_memory_1.png "SYCL")
#### Coalesced global memory
![SYCL](../common-revealjs/images/coalesced_global_memory_2.png "SYCL")
#### Coalesced global memory
![SYCL](../common-revealjs/images/coalesced_global_memory_3.png "SYCL")
#### Coalesced global memory
![SYCL](../common-revealjs/images/coalesced_global_memory_4.png "SYCL")
#### Coalesced global memory
![SYCL](../common-revealjs/images/coalesced_global_memory_5.png "SYCL")
#### Coalesced global memory
![SYCL](../common-revealjs/images/coalesced_global_memory_6.png "SYCL")
#### Row-major vs Column-major
* Coalescing global memory access is particularly important when working in multiple dimensions.
* This is because you have to convert a position in 2D space to an address in linear memory.
* There are two ways to do this, generally referred to as row-major and column-major.
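
The two conversions can be written as a pair of small functions. This is a minimal sketch in plain C++; the names `row_major_index` and `col_major_index` are hypothetical and chosen only for illustration.

```cpp
#include <cstddef>

// Row-major: elements of the same row are contiguous, so moving one
// column to the right advances the linear index by 1.
constexpr std::size_t row_major_index(std::size_t row, std::size_t col,
                                      std::size_t width) {
  return row * width + col;
}

// Column-major: elements of the same column are contiguous, so moving one
// row down advances the linear index by 1.
constexpr std::size_t col_major_index(std::size_t row, std::size_t col,
                                      std::size_t height) {
  return col * height + row;
}
```

Which form coalesces depends on which dimension adjacent work-items vary in: the index that changes between neighbouring work-items should be the one that advances the linear address by 1. SYCL's own linearization of multi-dimensional ids treats the last dimension as the fastest-varying one, which corresponds to the row-major form here.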
#### Row-major vs Column-major
![SYCL](../common-revealjs/images/row_col_1.png "SYCL")
![SYCL](../common-revealjs/images/row_col_2.png "SYCL")
![SYCL](../common-revealjs/images/row_col_3.png "SYCL")
#### AoS vs SoA
* Coalescing is also a factor when composing data structures.
* It's often instinctive to have a struct representing a collection of data and then have an array of these, often referred to as Array of Structs (AoS).
* But for data parallel architectures such as a GPU, it's more efficient to have sequential elements of the same type stored contiguously in memory, often referred to as Struct of Arrays (SoA).
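
The difference between the two layouts can be sketched in plain C++. The types `ParticleAoS` and `ParticlesSoA` and the helper `to_soa` are hypothetical names used only for this illustration, and the four `float` fields are an arbitrary choice.

```cpp
#include <cstddef>
#include <vector>

// Array of Structs: the fields of one element sit next to each other,
// so the x values of consecutive elements are sizeof(ParticleAoS) apart.
struct ParticleAoS {
  float x, y, z, w;
};

// Struct of Arrays: each field has its own contiguous array,
// so the x values of consecutive elements are sizeof(float) apart.
struct ParticlesSoA {
  std::vector<float> x, y, z, w;
};

// Byte distance between the x fields of two consecutive elements.
constexpr std::size_t aos_x_stride = sizeof(ParticleAoS);
constexpr std::size_t soa_x_stride = sizeof(float);

// Converts an AoS container into the equivalent SoA layout.
ParticlesSoA to_soa(const std::vector<ParticleAoS>& aos) {
  ParticlesSoA soa;
  soa.x.reserve(aos.size());
  soa.y.reserve(aos.size());
  soa.z.reserve(aos.size());
  soa.w.reserve(aos.size());
  for (const ParticleAoS& p : aos) {
    soa.x.push_back(p.x);
    soa.y.push_back(p.y);
    soa.z.push_back(p.z);
    soa.w.push_back(p.w);
  }
  return soa;
}
```

If adjacent work-items each read only `x`, the AoS layout spreads those reads across four times as many bytes as the SoA layout, so fewer of them fall into the same memory chunk.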
#### AoS vs SoA
![SYCL](../common-revealjs/images/soa_vs_aos_1.png "SYCL")
#### AoS vs SoA
![SYCL](../common-revealjs/images/soa_vs_aos_2.png "SYCL")
#### AoS vs SoA
![SYCL](../common-revealjs/images/soa_vs_aos_3.png "SYCL")
#### AoS vs SoA
![SYCL](../common-revealjs/images/soa_vs_aos_4.png "SYCL")
#### AoS vs SoA
![SYCL](../common-revealjs/images/soa_vs_aos_5.png "SYCL")
#### AoS vs SoA
![SYCL](../common-revealjs/images/soa_vs_aos_6.png "SYCL")
#### AoS vs SoA
![SYCL](../common-revealjs/images/soa_vs_aos_7.png "SYCL")
#### AoS vs SoA
![SYCL](../common-revealjs/images/soa_vs_aos_8.png "SYCL")
#### AoS vs SoA
![SYCL](../common-revealjs/images/soa_vs_aos_9.png "SYCL")
#### AoS vs SoA
![SYCL](../common-revealjs/images/soa_vs_aos_10.png "SYCL")
#### Coalesced image convolution performance
![SYCL](../common-revealjs/images/image_convolution_performance_coalesced.png "SYCL")
## Questions
#### Exercise
Code_Exercises/Coalesced_Global_Memory/source
Try inverting the dimensions when calculating the linear address in memory and measure the performance.