Ecosyste.ms: Timeline

Browse the timeline of events for every public repo on GitHub. Data updated hourly from GH Archive.

DefTruth/CUDA-Learn-Notes

DefTruth created a branch on DefTruth/CUDA-Learn-Notes

hgemm-col-major-2 - 🎉 Modern CUDA Learn Notes with PyTorch: fp32/tf32, fp16/bf16, fp8/int8, flash_attn, rope, sgemm, sgemv, warp/block reduce, dot, elementwise, softmax, layernorm, rmsnorm.

DefTruth pushed 1 commit to main DefTruth/CUDA-Learn-Notes
  • [HGEMM] collective store via warp shfl&reg reuse (#101) * Update hgemm_mma_stage.cu * Update hgemm.py * Update... 6c89595

DefTruth pushed 1 commit to hgemm-col-major DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to hgemm-col-major DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to hgemm-col-major DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to hgemm-col-major DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to hgemm-col-major DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to hgemm-col-major DefTruth/CUDA-Learn-Notes
  • Update hgemm_mma_stage_col_major.cu 07c2f7b

DefTruth pushed 1 commit to hgemm-col-major DefTruth/CUDA-Learn-Notes
  • Create hgemm_mma_stage_col_major.cu bb1f626

DefTruth pushed 1 commit to hgemm-col-major DefTruth/CUDA-Learn-Notes
  • Update hgemm_mma_stage.cu 4dfad47

DefTruth pushed 1 commit to hgemm-col-major DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to hgemm-col-major DefTruth/CUDA-Learn-Notes
  • Update hgemm_mma_stage.cu 45f1c4d

chengy-sysu starred DefTruth/CUDA-Learn-Notes
christinewyt starred DefTruth/CUDA-Learn-Notes
DefTruth deleted a branch DefTruth/CUDA-Learn-Notes

opt-hgemm-mma-col-major

DefTruth created a branch on DefTruth/CUDA-Learn-Notes

hgemm-col-major - 🎉 Modern CUDA Learn Notes with PyTorch: fp32/tf32, fp16/bf16, fp8/int8, flash_attn, rope, sgemm, sgemv, warp/block reduce, dot, elementwise, softmax, layernorm, rmsnorm.

DefTruth pushed 1 commit to main DefTruth/CUDA-Learn-Notes
  • [HGEMM] ldmatrix.x4.trans with reg double buffers (#100) * Update hgemm_mma_stage.cu * Update hgemm_mma_stage.cu ... bcd12bd

DefTruth pushed 1 commit to opt-hgemm-mma-col-major DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to opt-hgemm-mma-col-major DefTruth/CUDA-Learn-Notes
  • Update hgemm_mma_stage.cu f23b810

DefTruth deleted a branch DefTruth/CUDA-Learn-Notes

opt-hgemm-mma

DefTruth created a branch on DefTruth/CUDA-Learn-Notes

opt-hgemm-mma-col-major - 🎉 Modern CUDA Learn Notes with PyTorch: fp32/tf32, fp16/bf16, fp8/int8, flash_attn, rope, sgemm, sgemv, warp/block reduce, dot, elementwise, softmax, layernorm, rmsnorm.

DefTruth pushed 1 commit to main DefTruth/CUDA-Learn-Notes
  • [HGEMM] HGEMM MMA with Reg Double Buffers (#99) * Update hgemm_mma_stage.cu * Update hgemm_mma.cu * Update hge... 8e869ef

DefTruth pushed 1 commit to opt-hgemm-mma DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to opt-hgemm-mma DefTruth/CUDA-Learn-Notes
  • Update hgemm_mma_stage.cu 1236e78

DefTruth pushed 1 commit to opt-hgemm-mma DefTruth/CUDA-Learn-Notes
