Ecosyste.ms: Timeline

Browse the timeline of events for every public repo on GitHub. Data updated hourly from GH Archive.

DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to opt-hgemm-mma DefTruth/CUDA-Learn-Notes
  • Update hgemm_wmma_stage.cu 9aaf1b5

DefTruth pushed 1 commit to opt-hgemm-mma DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to opt-hgemm-mma DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to opt-hgemm-mma DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to opt-hgemm-mma DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to opt-hgemm-mma DefTruth/CUDA-Learn-Notes

cyleex starred DefTruth/CUDA-Learn-Notes
DefTruth created a branch on DefTruth/CUDA-Learn-Notes

opt-hgemm-mma - 🎉 Modern CUDA Learn Notes with PyTorch: fp32/tf32, fp16/bf16, fp8/int8, flash_attn, rope, sgemm, sgemv, warp/block reduce, dot, elementwise, softmax, layernorm, rmsnorm.

DefTruth deleted a branch DefTruth/CUDA-Learn-Notes

opt-hgemm-mma

DefTruth deleted a branch DefTruth/CUDA-Learn-Notes

add-contribute-notes

DefTruth created a comment on a pull request on DefTruth/CUDA-Learn-Notes
@bear-zd Added a description of the code conventions, #50

DefTruth pushed 1 commit to main DefTruth/CUDA-Learn-Notes
  • [Docs][Contribute] Add How to contribute Notes (#90) 82b94c5

DefTruth pushed 1 commit to add-contribute-notes DefTruth/CUDA-Learn-Notes

DefTruth created a branch on DefTruth/CUDA-Learn-Notes

add-contribute-notes

DefTruth pushed 1 commit to main DefTruth/CUDA-Learn-Notes
  • [Mat][Trans] Add f32/f32x4 row/col first kernel (#89) * [Mat Transpose] Add mat transpose f32/x4_packed kernel with ... 293bc83

DefTruth closed a pull request on DefTruth/CUDA-Learn-Notes
[Mat][Trans] Add f32/f32x4 row/col first kernel
Take a seat.
DefTruth created a review on a pull request on DefTruth/CUDA-Learn-Notes
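Thanks for your contribution! It can be merged once the following points are addressed:
  • Remove the code that is not ready yet; the shared version can be merged once it is completed in the next PR
  • Use the asymmetric style for { }
  • Keep each line of code within 100 characters where possible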

zzhong44 starred DefTruth/CUDA-Learn-Notes
bear-zd created a comment on a pull request on DefTruth/CUDA-Learn-Notes
Not fully tested. Some errors may remain, since the matrix values used cannot confirm that the kernel is correct.

bear-zd closed a pull request on DefTruth/CUDA-Learn-Notes
[Mat Transpose] Add mat transpose f32/x4_packed kernel.
[Mat Transpose] Add mat transpose f32/x4_packed kernel with col first or row first.
wynneyin starred DefTruth/CUDA-Learn-Notes
DefTruth created a comment on a pull request on DefTruth/CUDA-Learn-Notes
LGTM

bear-zd opened a pull request on DefTruth/CUDA-Learn-Notes
[Mat Transpose] Add mat transpose f32/x4_packed kernel.
[Mat Transpose] Add mat transpose f32/x4_packed kernel with col first or row first.
linxin2429 starred DefTruth/CUDA-Learn-Notes