Posts Tagged "cuda"
Twelve Attempts at an FP4 Kernel
A worklog of NVFP4 kernels, failed experiments, and one stubborn memory bus
Honey, I Tiled the Tensors
Shapes, Strides, Swizzles and Suffering! - An intro to Layout Algebra