Ecosyste.ms: Timeline

Browse the timeline of events for every public repo on GitHub. Data updated hourly from GH Archive.

huggingface/text-generation-inference

Johnno1011 closed an issue on huggingface/text-generation-inference
llama3.1 /v1/chat/completions template not found
### System Info
text generation inference v2.3.1
meta-llama/Meta-Llama-3.1-70B-Instruct
### Information
- [X] Docker
- [ ] The CLI directly
### Tasks
- [X] An officially supported command
- [...
Narsil pushed 1 commit to gpt_awq_4 huggingface/text-generation-inference
  • Upgrading the tests (TP>1 fix changes to use different kernels.) 8673bb0

sywangyi created a review comment on a pull request on huggingface/text-generation-inference
https://github.com/huggingface/text-generation-inference/blob/main/server/text_generation_server/layers/gptq/__init__.py#L134 — this line sets `use_exllama` to false, since on the Intel platform, exllam...
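
The point this review thread is making — the exllama GPTQ kernels are unavailable on Intel hardware, so the code must force the fallback path — can be sketched as follows. This is an illustrative sketch only, not TGI's actual code; the function name and parameters are hypothetical.

```python
# Illustrative sketch (hypothetical names, not TGI's actual code): decide
# whether the exllama GPTQ kernels can be used for a given setup.
def should_use_exllama(platform: str, bits: int) -> bool:
    # The exllama kernels target CUDA/ROCm GPUs; on an Intel (IPEX)
    # platform they are unavailable, so force the fallback path.
    if platform in ("cpu", "xpu"):
        return False
    # The exllama kernels only support 4-bit quantization.
    return bits == 4
```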

sywangyi created a review on a pull request on huggingface/text-generation-inference

trainerbox created a comment on an issue on huggingface/text-generation-inference
TGI is not compatible with Apple silicon hardware. https://github.com/huggingface/text-generation-inference Hardware support: [Nvidia] [AMD] [Inferentia] [Gaudi] [Google TPU] Additional ref...

mfuntowicz pushed 5 commits to trtllm-executor-thread huggingface/text-generation-inference
  • misc(cuda): require 12.6 1c3e71e
  • chore(cmake): use correct policy for download_timestamp 09e8803
  • feat(looper): check engine and executorWorker paths exist before creating the backend 55adb74
  • chore(cmake): download timestamp should be before URL 0745f2b
  • feat(looper): minor optimizations to avoid growing too much the containers f45e180

Narsil pushed 1 commit to gpt_awq_4 huggingface/text-generation-inference
  • Revert change after rebase. 5ca6da1

Narsil pushed 1 commit to gpt_awq_4 huggingface/text-generation-inference

Narsil pushed 16 commits to gpt_awq_4 huggingface/text-generation-inference
  • Fixing linters. (#2650) cf04a43
  • Use flashinfer for Gemma 2. ce7e356
  • Rollback to `ChatRequest` for Vertex AI Chat instead of `VertexChat` (#2651) As spotted by @philschmid, the payload ... ffe05cc
  • Fp8 e4m3_fnuz support for rocm (#2588) * (feat) fp8 fnuz support for rocm * (review comments) Fix compression_con... 704a58c
  • feat: prefill chunking (#2600) * wip * rollback * refactor to use prefix/postfix namming + fix all_input_ids_t... a6a0c97
  • Support `e4m3fn` KV cache (#2655) * Support `e4m3fn` KV cache * Make check more obvious 5bbe1ce
  • Simplify the `attention` function (#2609) * Simplify the `attention` function - Use one definition rather than mu... 59ea38c
  • fix tgi-entrypoint wrapper in docker file: exec instead of spawning a child process (#2663) tgi-entrypoint: exec ins... 1b97e08
  • fix: prefer inplace softmax to avoid copy (#2661) * fix: prefer inplace softmax to avoid copy * Update server/tex... 5f32dea
  • Break cycle between the attention implementations and KV cache (#2627) 8ec5755
  • add gptq and awq int4 support in intel platform Signed-off-by: Wang, Yi A <[email protected]> 61fe28e
  • fix ci failure Signed-off-by: Wang, Yi A <[email protected]> dd3fb81
  • set kv cache dtype Signed-off-by: Wang, Yi A <[email protected]> 645369b
  • refine the code according to the review command Signed-off-by: Wang, Yi A <[email protected]> f36c9a6
  • Simplifying conditionals + reverting integration tests values. 3e12402
  • Unused import ba7197c

Narsil pushed 1 commit to gpt_awq_4 huggingface/text-generation-inference
  • Simplifying conditionals + reverting integration tests values. cf7a957

Narsil created a review comment on a pull request on huggingface/text-generation-inference
Wouldn't keeping `use_exllama` and simply fixing the TP (with `- g_idx[0]`) in the conditional fix the issues on IPEX?

Narsil created a review on a pull request on huggingface/text-generation-inference

Narsil opened a pull request on huggingface/text-generation-inference
CI job. Gpt awq 4
# What does this PR do? <!-- Congratulations! You've made it this far! You're not quite done yet though. Once merged, your PR is going to appear in the release notes with the title you set, ...
Narsil created a branch on huggingface/text-generation-inference

gpt_awq_4 - Large Language Model Text Generation Inference

edwin-19 starred huggingface/text-generation-inference
muscionig created a comment on an issue on huggingface/text-generation-inference
Hi @Johnno1011, I think this might help. I noticed that your `model-id` is set to `meta-llama/Meta-Llama-3.1-70B-Instruct`. While working with this model on the HF Hub I faced a similar issu...

default-anton starred huggingface/text-generation-inference
Bihan created a comment on an issue on huggingface/text-generation-inference
@danieldk Deployed TGI with neuralmagic/Meta-Llama-3-70B-Instruct-FP8 and it worked.

Nov05 starred huggingface/text-generation-inference
Johnno1011 created a comment on an issue on huggingface/text-generation-inference
Yeah interesting point, I have tried fiddling with this... I got myself a new copy of the model and removed the caching (so that it downloads directly into the container) but this still happens. No...

Narsil created a review comment on a pull request on huggingface/text-generation-inference
Isn't this `no_tool` with `snake_case`? This should mean a rename of this property or `None`, no? I don't think `schema(rename)` implies a serde `rename`.

Narsil created a review on a pull request on huggingface/text-generation-inference

Narsil created a review comment on a pull request on huggingface/text-generation-inference
`vec![]` ?

Narsil created a review on a pull request on huggingface/text-generation-inference

danieldk pushed 1 commit to main huggingface/text-generation-inference
  • Break cycle between the attention implementations and KV cache (#2627) 8ec5755

drbh deleted a branch huggingface/text-generation-inference

prefer-inplace-softmax-for-prefill-logprobs

drbh pushed 1 commit to main huggingface/text-generation-inference
  • fix: prefer inplace softmax to avoid copy (#2661) * fix: prefer inplace softmax to avoid copy * Update server/tex... 5f32dea

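The idea behind the "prefer inplace softmax to avoid copy" commit can be illustrated with a minimal sketch (assuming NumPy; this is not the TGI code itself): computing the softmax in place reuses the logits buffer instead of allocating a second tensor of the same size.

```python
import numpy as np

def softmax_inplace(x: np.ndarray) -> np.ndarray:
    # Subtract the row max first for numerical stability; every step
    # writes back into `x`, so no extra array of the same size is allocated.
    x -= x.max(axis=-1, keepdims=True)
    np.exp(x, out=x)
    x /= x.sum(axis=-1, keepdims=True)
    return x
```

For large prefill logits (batch × vocabulary size), avoiding that one temporary copy is a meaningful memory saving.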
Narsil created a review on a pull request on huggingface/text-generation-inference

Narsil created a review on a pull request on huggingface/text-generation-inference

nikhil-weamai created a comment on an issue on huggingface/text-generation-inference
Please use the endpoint URL like this: **https://xxx.cloud/v1/chat/completions**. After that, you will get the token count in the response.
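
The shape of that exchange can be sketched as follows (a minimal sketch: the helper names are hypothetical, and the canned response only mirrors the `usage` field of an OpenAI-compatible chat completions reply):

```python
# Hypothetical helpers: build a chat completions payload and pull the
# token counts out of an OpenAI-compatible response body.
def build_chat_request(model: str, prompt: str) -> dict:
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def extract_token_counts(body: dict) -> tuple[int, int]:
    usage = body["usage"]
    return usage["prompt_tokens"], usage["completion_tokens"]

# Canned fragment of the kind /v1/chat/completions returns:
sample = {"usage": {"prompt_tokens": 12, "completion_tokens": 34, "total_tokens": 46}}
assert extract_token_counts(sample) == (12, 34)
```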
