Ecosyste.ms: Timeline
Browse the timeline of events for every public repo on GitHub. Data updated hourly from GH Archive.
mfuntowicz pushed 2 commits to trtllm-executor-thread huggingface/text-generation-inference
mfuntowicz pushed 2 commits to feat-backend-llamacpp huggingface/text-generation-inference
mottoslo created a comment on an issue on huggingface/text-generation-inference
gentle ping @drbh is this issue being handled internally ? any feedback would be great !
drbh pushed 1 commit to pr-2634-ci-branch huggingface/text-generation-inference
- fix: adjust default when json tool choice is 193ad66
SMAntony opened an issue on huggingface/text-generation-inference
Distributed Inference failing for Llama-3.1-70b-Instruct
### System Info text-generation-inference docker: sha-5e0fb46 (latest) OS: Ubuntu 22.04 Model: meta-llama/Llama-3.1-70B-Instruct GPU Used: 4 `nvidia-smi`: ``` +-----------------------------...
sywangyi closed a pull request on huggingface/text-generation-inference
add gptq and awq int4 support in intel platform
# What does this PR do? <!-- Congratulations! You've made it this far! You're not quite done yet though. Once merged, your PR is going to appear in the release notes with the title you set, ...
nimishbongale created a comment on an issue on huggingface/text-generation-inference
Same issue!
danieldk pushed 1 commit to main huggingface/text-generation-inference
- Make handling of FP8 scales more consistent (#2666) Change `fp8_quantize` so that we can pass around reciprocals ever... 5e0fb46