Ecosyste.ms: Timeline
Browse the timeline of events for every public repo on GitHub. Data updated hourly from GH Archive.
DefTruth pushed 1 commit to opt-hgemm-mma DefTruth/CUDA-Learn-Notes
- Update hgemm_wmma_stage.cu 9aaf1b5
bear-zd opened a pull request on DefTruth/CUDA-Learn-Notes
[Mat][Trans] Add f32x4_shared/bcf row/col first kernel.
DefTruth created a branch on DefTruth/CUDA-Learn-Notes
opt-hgemm-mma - 🎉 Modern CUDA Learn Notes with PyTorch: fp32/tf32, fp16/bf16, fp8/int8, flash_attn, rope, sgemm, sgemv, warp/block reduce, dot, elementwise, softmax, layernorm, rmsnorm.
DefTruth created a comment on a pull request on DefTruth/CUDA-Learn-Notes
@bear-zd Added a code style guide, #50
DefTruth pushed 1 commit to main DefTruth/CUDA-Learn-Notes
- [Docs][Contribute] Add How to contribute Notes (#90) 82b94c5
DefTruth closed a pull request on DefTruth/CUDA-Learn-Notes
[Docs][Contribute] Add How to contribute Notes
DefTruth opened a pull request on DefTruth/CUDA-Learn-Notes
[Docs][Contribute] Add How to contribute Notes
DefTruth created a branch on DefTruth/CUDA-Learn-Notes
add-contribute-notes - 🎉 Modern CUDA Learn Notes with PyTorch: fp32/tf32, fp16/bf16, fp8/int8, flash_attn, rope, sgemm, sgemv, warp/block reduce, dot, elementwise, softmax, layernorm, rmsnorm.
DefTruth pushed 1 commit to main DefTruth/CUDA-Learn-Notes
- [Mat][Trans] Add f32/f32x4 row/col first kernel (#89) * [Mat Transpose] Add mat transpose f32/x4_packed kernel with ... 293bc83