Ecosyste.ms: Timeline

Browse the timeline of events for every public repo on GitHub. Data updated hourly from GH Archive.

DefTruth/CUDA-Learn-Notes

DefTruth published a release on DefTruth/CUDA-Learn-Notes
v2.4.11 HGEMM Thread Block Swizzle
## What's Changed
* [Docs] Update README.md by @DefTruth in https://github.com/DefTruth/CUDA-Learn-Notes/pull/81
* [HGEMM] HGEMM WMMA Thread Block Swizzle by @DefTruth in https://github.com/DefTr...
DefTruth created a tag on DefTruth/CUDA-Learn-Notes

v2.4.11 - 🎉 Modern CUDA Learn Notes with PyTorch: fp32/tf32, fp16/bf16, fp8/int8, flash_attn, rope, sgemm, sgemv, warp/block reduce, dot, elementwise, softmax, layernorm, rmsnorm.

DefTruth deleted a branch on DefTruth/CUDA-Learn-Notes

opt-sgemm-swizzle

DefTruth pushed 1 commit to main on DefTruth/CUDA-Learn-Notes
  • [HGEMM] make thread block swizzle stride as N/4 (#83) bc3d78e
      * Update hgemm.py
      * Update README.md
      * Update README.md

DefTruth pushed 1 commit to opt-sgemm-swizzle on DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to opt-sgemm-swizzle on DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to opt-sgemm-swizzle on DefTruth/CUDA-Learn-Notes

DefTruth deleted a branch on DefTruth/CUDA-Learn-Notes

opt-hgemm-mma

RaphaelDu starred DefTruth/CUDA-Learn-Notes
nainiu888 starred DefTruth/CUDA-Learn-Notes
yiwei-sun starred DefTruth/CUDA-Learn-Notes