Ecosyste.ms: Timeline
Browse the timeline of events for every public repo on GitHub. Data updated hourly from GH Archive.
YangWang92 created a comment on an issue on microsoft/VPTQ
I think it is a bug. Could you help by opening a pull request to fix it? You would then be a contributor to the project. Thanks!
YangWang92 created a comment on an issue on microsoft/VPTQ
Let me check, thanks for your feedback!
YangWang92 created a comment on an issue on deepseek-ai/DeepSeek-V3
I found that they have some configurations https://github.com/deepseek-ai/DeepSeek-V3/tree/main/inference/configs, which include 16B/236B/671B.
YangWang92 created a comment on a pull request on deepseek-ai/DeepSeek-V3
I've uploaded the converted bf16 model here for everyone to use freely: https://huggingface.co/collections/opensourcerelease/deepseek-v3-bf16-676d7fa1b3f500d39f8f559b
YangWang92 pushed 1 commit to patch-1 YangWang92/DeepSeek-V3
- Add CUDA cache clearing in memory management. Added torch.cuda.empty_cache() to free up unused memory on the GPU. 65d8f5f
YangWang92 pushed 1 commit to patch-1 YangWang92/DeepSeek-V3
- sort filename to reduce memory costs e6e66fd
YangWang92 opened a pull request on deepseek-ai/DeepSeek-V3
handle missing scale_inv_name
Fixed an issue where `weight` and `weight_scale_inv` (e.g. `model.layers.39.mlp.experts.92.gate_proj.weight` and `model.layers.39.mlp.experts.92.gate_proj.weight_scale_inv`) were not in the same Sa...
YangWang92 pushed 1 commit to patch-1 YangWang92/DeepSeek-V3
- handle missing scale_inv_name Fixed an issue where `weight` and `weight_scale_inv` (e.g. `model.layers.39.mlp.expert... 1e3a836
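The situation this commit describes can be illustrated with a small sketch. It is not the code from the pull request. It assumes that the truncated description refers to a weight and its `weight_scale_inv` tensor being stored in different shard files, and that a tensor-name-to-file index is available as a plain dictionary. The function name `classify_weights`, the shard file names, and the two `model.layers.0` tensor names are hypothetical and used only for illustration.

```python
from typing import Dict, List, Tuple


def classify_weights(weight_map: Dict[str, str]) -> Tuple[List[str], List[str], List[str]]:
    """Group weights by where their `<name>_scale_inv` companion is stored.

    `weight_map` maps tensor names to shard file names (an assumed layout).
    Returns three lists of weight names: scale in the same file, scale in
    another file, and no scale tensor at all.
    """
    same_file: List[str] = []
    other_file: List[str] = []
    missing: List[str] = []
    for name, shard in sorted(weight_map.items()):
        if not name.endswith(".weight"):
            continue  # skip the scale tensors themselves and anything else
        scale_shard = weight_map.get(f"{name}_scale_inv")
        if scale_shard is None:
            missing.append(name)      # no scale: leave the weight untouched
        elif scale_shard == shard:
            same_file.append(name)    # both tensors load from one file
        else:
            other_file.append(name)   # the scale needs a second file lookup
    return same_file, other_file, missing


# Hypothetical index: only the gate_proj names come from the description above.
weight_map = {
    "model.layers.39.mlp.experts.92.gate_proj.weight": "shard-a.safetensors",
    "model.layers.39.mlp.experts.92.gate_proj.weight_scale_inv": "shard-b.safetensors",
    "model.layers.0.self_attn.o_proj.weight": "shard-a.safetensors",
    "model.layers.0.self_attn.o_proj.weight_scale_inv": "shard-a.safetensors",
    "model.layers.0.input_layernorm.weight": "shard-a.safetensors",
}

same_file, other_file, missing = classify_weights(weight_map)
```

A converter that only looks for the scale tensor in the file currently being read would fail on the `other_file` group, so separating the three cases up front makes it explicit which weights need a lookup in a second file and which have no scale at all.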
YangWang92 pushed 25 commits to master VPTQ/hessian_collector
- for m300 d97a8e2
- update setting aa3aae5
- collect qwen 27e61f4
- fix qwen image size 9547622
- add qwen vlm df05cca
- fix input dev fdd4b13
- set text length 0aceb13
- fix for llama 3.2 bcec2f7
- hack vlm layer 9e765db
- set mem 3e694bd
- fix llama3.2 truncate e690188
- set max length 5d3ed5c
- add llm sample 42fd742
- add start 57463cd
- Merge branch 'm300' of https://github.com/VPTQ/hessian_collector into m300 ac2a972
- fix range 793b283
- set generate tokens d1baff0
- update qwen setting bae141a
- fix llm hessian 9434824
- add cli data 3bf1599
- and 5 more ...
YangWang92 pushed 5 commits to m300 VPTQ/hessian_collector
YangWang92 pushed 1 commit to main microsoft/VPTQ
- fix: a small bug fix for the initialization of the residual index tensor. (#147) * Fixed a small bug in the initiali... 170770c