Ecosyste.ms: Timeline
Browse the timeline of events for every public repo on GitHub. Data updated hourly from GH Archive.
YangWang92 created a comment on an issue on microsoft/VPTQ
Yes, you can directly replace the model name and Hessian matrix to quantize different models. Additionally, here is a quick guide on setting the quantization parameters: https://github.com/microsof...
YangWang92 created a comment on an issue on microsoft/VPTQ
Try this one; you can download the Hessian matrix from here: https://huggingface.co/collections/VPTQ-community/hessian-and-invhessian-checkpoints-66fd249a104850d17b23fd8b . ```bash CUDA_VISIBLE_DEVIC...
YangWang92 created a comment on an issue on microsoft/VPTQ
Glad it's resolved! I'll close this issue. You are welcome to use our VPTQ project, and if you have any further requirements, please feel free to mention them.
YangWang92 closed an issue on microsoft/VPTQ
Sometimes models load very slowly
I honestly don't know what the exact trigger condition is. Before 0.0.4, this situation usually occurred when I was not running in the vptq root directory, and this problem is usually accompanied by a...
YangWang92 created a comment on an issue on microsoft/VPTQ
I'm wondering if it could be an issue with the model cache or the file system? It doesn't seem like there should be such a difference. I'll check about it some more.
YangWang92 created a comment on an issue on microsoft/VPTQ
> [@YangWang92](https://github.com/YangWang92) Thanks for your reply! I'm sorry that I didn't provide enough information. I'm testing on A6000, and the problem I'm facing is slow model loading (for...
YangWang92 created a comment on an issue on microsoft/VPTQ
I’m guessing you might need to run the same model repeatedly with different input or context? In that case, you probably only need to load the model once and perform inference on new inputs repeate...
YangWang92 created a comment on an issue on microsoft/VPTQ
May I ask if the issue is with slow inference speed or slow model loading? Also, could you share what hardware you are using for testing? Thank you!
YangWang92 pushed 1 commit to main YangWang92/VPTQ
- Update model_base.py (#124) fix quantization config for layers 139a380