Types of shared memory
Fastest form of memory on the GPU. Is only accessible by individual threads and has the lifetime of a thread. We don’t need to deal with it directly (but we can).
Can be as fast as a register when there are no bank conflicts (when threads read from the same address). Accessible by any thread of the block from which it was created. Has the lifetime of the block.
Potentially 150x slower than register or shared memory because of un-coalesced reads and writes. Accessible from either the host or device. Has the lifetime of the application. Read-only global memory is called constant memory.
Resides in global memory and can be 150x slower than register/shared memory. Is only accessible by the thread. Has the lifetime of the thread.