Ecosyste.ms: Timeline

Browse the timeline of events for every public repo on GitHub. Data updated hourly from GH Archive.

DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to opt-hgemm-mma-2 DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to opt-hgemm-mma-2 DefTruth/CUDA-Learn-Notes

DefTruth pushed 3 commits to opt-hgemm-mma-2 DefTruth/CUDA-Learn-Notes
  • [Mat][Trans] Add f32x4_shared/bcf row/col first kernel. (#91) * [Mat][Trans] Add f32x4_shared/bcf row/col first kern... 2f854e8
  • Merge branch 'main' of github.com:DefTruth/CUDA-Learn-Notes into opt-hgemm-mma-2 3ec6c6b
  • rename mat_transpose->mat-transpose ffb18a5

Bin-ze starred DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to opt-hgemm-mma-2 DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to opt-hgemm-mma-2 DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to opt-hgemm-mma-2 DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to opt-hgemm-mma-2 DefTruth/CUDA-Learn-Notes
  • Update hgemm_wmma_stage.cu 5f7935f

DefTruth created a comment on a pull request on DefTruth/CUDA-Learn-Notes
merged

DefTruth pushed 1 commit to main DefTruth/CUDA-Learn-Notes
  • [Mat][Trans] Add f32x4_shared/bcf row/col first kernel. (#91) * [Mat][Trans] Add f32x4_shared/bcf row/col first kern... 2f854e8

DefTruth created a review on a pull request on DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to opt-hgemm-mma-2 DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to opt-hgemm-mma-2 DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to opt-hgemm-mma-2 DefTruth/CUDA-Learn-Notes
  • Update sgemm_wmma_tf32_stage.cu d75a69c

bear-zd created a review comment on a pull request on DefTruth/CUDA-Learn-Notes
Or I could just change them to M and N directly.

bear-zd created a review on a pull request on DefTruth/CUDA-Learn-Notes

Lonepic starred DefTruth/CUDA-Learn-Notes

DefTruth created a comment on a pull request on DefTruth/CUDA-Learn-Notes
LGTM~

YOUYANY starred DefTruth/CUDA-Learn-Notes

DefTruth created a review comment on a pull request on DefTruth/CUDA-Learn-Notes
Here, given the matrix semantics, S and K would be better renamed to M and K.

DefTruth created a review on a pull request on DefTruth/CUDA-Learn-Notes

DefTruth created a branch on DefTruth/CUDA-Learn-Notes

opt-hgemm-mma-2 - 🎉 Modern CUDA Learn Notes with PyTorch: fp32/tf32, fp16/bf16, fp8/int8, flash_attn, rope, sgemm, sgemv, warp/block reduce, dot, elementwise, softmax, layernorm, rmsnorm.

DefTruth deleted a branch DefTruth/CUDA-Learn-Notes

opt-hgemm-mma

DefTruth pushed 1 commit to opt-hgemm-mma DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to opt-hgemm-mma DefTruth/CUDA-Learn-Notes
