Ecosyste.ms: Timeline

Browse the timeline of events for every public repo on GitHub. Data updated hourly from GH Archive.

DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to opt-hgemm-mma DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to opt-hgemm-mma DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to opt-hgemm-mma DefTruth/CUDA-Learn-Notes

XuZhang99 starred DefTruth/CUDA-Learn-Notes
DefTruth pushed 1 commit to opt-hgemm-mma DefTruth/CUDA-Learn-Notes

Zessay starred DefTruth/CUDA-Learn-Notes
DefTruth published a release on DefTruth/CUDA-Learn-Notes
HGEMM Up to 113 TFLOPS on L20
## What's Changed
* [Mat][Trans] Add f32/f32x4 row/col first kernel by @bear-zd in https://github.com/DefTruth/CUDA-Learn-Notes/pull/89
* [Docs][Contribute] Add How to contribute Notes by @DefTru...
DefTruth created a tag on DefTruth/CUDA-Learn-Notes

v2.4.13 - 🎉 Modern CUDA Learn Notes with PyTorch: fp32/tf32, fp16/bf16, fp8/int8, flash_attn, rope, sgemm, sgemv, warp/block reduce, dot, elementwise, softmax, layernorm, rmsnorm.

DefTruth created a branch on DefTruth/CUDA-Learn-Notes

opt-hgemm-mma - 🎉 Modern CUDA Learn Notes with PyTorch: fp32/tf32, fp16/bf16, fp8/int8, flash_attn, rope, sgemm, sgemv, warp/block reduce, dot, elementwise, softmax, layernorm, rmsnorm.

DefTruth deleted a branch on DefTruth/CUDA-Learn-Notes

opt-hgemm-mma

DefTruth pushed 1 commit to main DefTruth/CUDA-Learn-Notes
  • [HGEMM] Update HGEMM WMMA Benchmark (#97) * Update README.md * Update hgemm.py * Update README.md * Update ... 0aeb450

DefTruth closed a pull request on DefTruth/CUDA-Learn-Notes
[HGEMM] Update HGEMM WMMA Benchmark
DefTruth opened a pull request on DefTruth/CUDA-Learn-Notes
[HGEMM] Update HGEMM WMMA Benchmark
DefTruth pushed 1 commit to opt-hgemm-mma DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to opt-hgemm-mma DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to opt-hgemm-mma DefTruth/CUDA-Learn-Notes

DefTruth pushed 1 commit to opt-hgemm-mma DefTruth/CUDA-Learn-Notes

ce107 starred DefTruth/CUDA-Learn-Notes
liu654042016 starred DefTruth/CUDA-Learn-Notes
Li-dongyang starred DefTruth/CUDA-Learn-Notes
Luoxiaogan starred DefTruth/CUDA-Learn-Notes
DefTruth created a branch on DefTruth/CUDA-Learn-Notes

opt-hgemm-mma - 🎉 Modern CUDA Learn Notes with PyTorch: fp32/tf32, fp16/bf16, fp8/int8, flash_attn, rope, sgemm, sgemv, warp/block reduce, dot, elementwise, softmax, layernorm, rmsnorm.

DefTruth deleted a branch on DefTruth/CUDA-Learn-Notes

opt-hgemm-mma

DefTruth pushed 1 commit to main DefTruth/CUDA-Learn-Notes
  • [HGEMM] Refactor HGEMM WMMA 161616 kernels (#96) * update hgemm benchmark option * update hgemm benchmark option ... 2e9e997

DefTruth pushed 5 commits to opt-hgemm-mma DefTruth/CUDA-Learn-Notes
  • update hgemm benchmark option d57932a
  • update hgemm benchmark option 2984f19
  • update hgemm benchmark option f899cd7
  • Merge branch 'main' of github.com:DefTruth/CUDA-Learn-Notes into opt-hgemm-mma 465a960
  • remove un-need hgemm codes c29384d

flowerinheart starred DefTruth/CUDA-Learn-Notes