Ecosyste.ms: Timeline

Browse the timeline of events for every public repo on GitHub. Data updated hourly from GH Archive.

huggingface/text-generation-inference

drbh pushed 1 commit to pr-2634-ci-branch huggingface/text-generation-inference
  • fix: simplify naming, tool choice default and improve test 6837c5b

View on GitHub

danieldk created a review comment on a pull request on huggingface/text-generation-inference
Yeah, that's a fair point. I think `in {torch.float8_e5m2, torch.float8_e4m3}` makes it easier to grep for all the places where one of these float types is used. So I'll update the PR.
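
For context, the pattern under discussion looks roughly like the sketch below. It is not taken from the PR and assumes a torch build that exposes the float8 dtypes (where the e4m3 variant is spelled `torch.float8_e4m3fn`):

```python
import torch

def is_float8(dtype: torch.dtype) -> bool:
    # A single set-membership test is easy to grep for, unlike
    # `dtype == torch.float8_e5m2 or dtype == torch.float8_e4m3fn`
    # comparisons scattered across the codebase.
    return dtype in {torch.float8_e5m2, torch.float8_e4m3fn}

assert is_float8(torch.float8_e5m2)
assert not is_float8(torch.float16)
```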

View on GitHub

danieldk created a review on a pull request on huggingface/text-generation-inference

View on GitHub

drbh created a review comment on a pull request on huggingface/text-generation-inference
oops 😬, removed in the latest commit, thanks!

View on GitHub

drbh created a review on a pull request on huggingface/text-generation-inference

View on GitHub

drbh created a review comment on a pull request on huggingface/text-generation-inference
agreed, just updated to choose `get_n_day_weather_forecast` instead
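
The tool in question follows the OpenAI-style function-calling schema. As a rough client-side illustration (not taken from the PR), the sketch below forces `tool_choice` to `get_n_day_weather_forecast` against TGI's OpenAI-compatible `/v1/chat/completions` endpoint; the base URL, model name, and parameter schema are assumptions:

```python
from openai import OpenAI

# Assumes a TGI instance serving the Messages API locally; the api_key is a placeholder.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="-")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_n_day_weather_forecast",
            "description": "Get an N-day weather forecast for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"},
                    "num_days": {"type": "integer"},
                },
                "required": ["location", "num_days"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="tgi",
    messages=[{"role": "user", "content": "What's the weather in Paris this week?"}],
    tools=tools,
    # Pin the model to this specific tool rather than letting it choose.
    tool_choice={"type": "function", "function": {"name": "get_n_day_weather_forecast"}},
)
print(response.choices[0].message.tool_calls)
```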

View on GitHub

drbh created a review on a pull request on huggingface/text-generation-inference

View on GitHub

drbh created a review comment on a pull request on huggingface/text-generation-inference
oh yeah, that's much better, updated in the latest commit

View on GitHub

drbh created a review on a pull request on huggingface/text-generation-inference

View on GitHub

drbh created a review comment on a pull request on huggingface/text-generation-inference
got it, that makes sense. I've changed the name to `ToolChoice` in the latest commit. (it's much cleaner 👍)

View on GitHub

drbh created a review on a pull request on huggingface/text-generation-inference

View on GitHub

drbh created a review on a pull request on huggingface/text-generation-inference
nice! looks good to me ✨

View on GitHub

danieldk pushed 22 commits to maintenance/simplify-attention huggingface/text-generation-inference
  • enable mllama in intel platform (#2610) Signed-off-by: Wang, Yi A <[email protected]> 57f9685
  • Upgrade minor rust version (Fixes rust build compilation cache) (#2617) * Upgrade minor rust version (Fixes rust bui... 8b295aa
  • Add support for fused MoE Marlin for AWQ (#2616) * Add support for fused MoE Marlin for AWQ This uses the updated... 6414248
  • nix: move back to the tgi-nix main branch (#2620) 6db3bcb
  • CI (2599): Update ToolType input schema (#2601) * Update ToolType input schema * lint * fix: run formatter ... 8ad20da
  • nix: add black and isort to the closure (#2619) To make sure that everything is formatted with the same black versio... 9ed0c85
  • AMD CI (#2589) * Only run 1 valid test. * TRying the tailscale action quickly. * ? * bash spaces. * Remo... 43f39f6
  • feat: allow tool calling to respond without a tool (#2614) * feat: process token stream before returning to client ... e36dfaa
  • Update documentation to most recent stable version of TGI. (#2625) Update to most recent stable version of TGI. d912f0b
  • Intel ci (#2630) * Intel CI ? * Let's try non sharded gemma. * Snapshot rename * Apparently container can b... 3dbdf63
  • Fixing intel Supports windowing. (#2637) 0c47884
  • Small fixes for supported models (#2471) * Small improvements for docs * Update _toctree.yml * Updating the do... ce28ee8
  • Cpu perf (#2596) * break when there's nothing to read Signed-off-by: Wang, Yi A <[email protected]> * Differ... 3ea82d0
  • Clarify gated description and quicktour (#2631) Update quicktour.md 51f5401
  • update ipex to fix incorrect output of mllama in cpu (#2640) Signed-off-by: Wang, Yi A <[email protected]> 7a82ddc
  • feat: enable pytorch xpu support for non-attention models (#2561) XPU backend is available natively (without IPEX) i... 58848cb
  • Fixing linters. (#2650) cf04a43
  • Use flashinfer for Gemma 2. ce7e356
  • Rollback to `ChatRequest` for Vertex AI Chat instead of `VertexChat` (#2651) As spotted by @philschmid, the payload ... ffe05cc
  • Fp8 e4m3_fnuz support for rocm (#2588) * (feat) fp8 fnuz support for rocm * (review comments) Fix compression_con... 704a58c
  • and 2 more ...

View on GitHub

Narsil created a comment on an issue on huggingface/text-generation-inference
Can you please include the necessary information requested when you create an issue?

View on GitHub

Narsil created a review comment on a pull request on huggingface/text-generation-inference
Fair enough. From my understanding they are actually the same dtypes, with only side effects on edge values (which we won't check against). I don't even think torch has plans for explicit support h...

View on GitHub

Narsil created a review on a pull request on huggingface/text-generation-inference

View on GitHub

danieldk created a review comment on a pull request on huggingface/text-generation-inference
We might have other variants in the future as well (`e4m3fnuz`, `e5m2uz`).
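
If more float8 variants do land (for example the ROCm `fnuz` flavours mentioned in the commit list above), one defensive pattern is to build the set from whatever the installed torch actually exposes. A minimal sketch, not TGI code; the `FP8_DTYPES` constant and `is_fp8` helper are illustrative:

```python
import torch

# Collect whichever fp8 dtypes this torch build exposes; older builds may
# lack the fnuz variants, so look them up by name instead of hard references.
_FP8_NAMES = ("float8_e4m3fn", "float8_e5m2", "float8_e4m3fnuz", "float8_e5m2fnuz")
FP8_DTYPES = {getattr(torch, name) for name in _FP8_NAMES if hasattr(torch, name)}

def is_fp8(dtype: torch.dtype) -> bool:
    # One grep-able place to extend when new variants appear.
    return dtype in FP8_DTYPES
```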

View on GitHub

danieldk created a review on a pull request on huggingface/text-generation-inference

View on GitHub

Narsil created a comment on an issue on huggingface/text-generation-inference
We're not entirely sure this is really the way to go. Typical deployments have multiple replicas. With CPU/disk kv-cache you need to use sticky sessions if you don't want to reproduce n-times th...

View on GitHub

mht-sharma created a comment on an issue on huggingface/text-generation-inference
Fixed in https://github.com/huggingface/text-generation-inference/pull/2579

View on GitHub

mht-sharma closed an issue on huggingface/text-generation-inference
tgi server launch fails with latest-rocm docker image.
### System Info text-generation-launcher 2.2.1-dev0 ### Information - [X] Docker - [X] The CLI directly ### Tasks - [X] An officially supported command - [ ] My own modifications...

mht-sharma created a comment on an issue on huggingface/text-generation-inference
Fixed in https://github.com/huggingface/text-generation-inference/pull/2579

View on GitHub

mht-sharma closed an issue on huggingface/text-generation-inference
TGI fails to run Llama 3.1-405B with AMD 8 X MI300x
### System Info Runtime environment: ``` Target: x86_64-unknown-linux-gnu Cargo version: 1.79.0 Commit sha: a379d5536bb2de55154dc09c3a1f24ce58cb7df5 Docker label: sha-a379d55-rocm nvidia-s...

Johnno1011 opened an issue on huggingface/text-generation-inference
llama3.1 /v1/chat/completions template not found
### System Info text generation inference v2.3.1 meta-llama/Meta-Llama-3.1-70B-Instruct ### Information - [X] Docker - [ ] The CLI directly ### Tasks - [X] An officially supported command - [...

Narsil deleted a branch huggingface/text-generation-inference

feat/prefix_chunking

Narsil pushed 1 commit to main huggingface/text-generation-inference
  • feat: prefill chunking (#2600) * wip * rollback * refactor to use prefix/postfix namming + fix all_input_ids_t... a6a0c97

View on GitHub

Narsil closed a pull request on huggingface/text-generation-inference
feat: prefill chunking

Narsil created a review on a pull request on huggingface/text-generation-inference

View on GitHub

Narsil created a review comment on a pull request on huggingface/text-generation-inference
Actually, a better `serde(rename)` on the whole struct should fix every field (both the schema and the actual JSON parsing)

View on GitHub
