Ecosyste.ms: Timeline

Browse the timeline of events for every public repo on GitHub. Data updated hourly from GH Archive.

huggingface/text-generation-inference

drbh pushed 1 commit to pr-2634-ci-branch huggingface/text-generation-inference
  • fix: simplify naming, tool choice default and improve test 6837c5b

View on GitHub

danieldk created a review comment on a pull request on huggingface/text-generation-inference
Yeah, that's a fair point. I think `in {torch.float8_e5m2, torch.float8_e4m3}` makes it easier to grep for all the places where one of these float types is used. So I'll update the PR.
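
For context, the pattern under discussion looks roughly like the sketch below. It is not taken from the PR and assumes a torch build that exposes the float8 dtypes (where the e4m3 variant is spelled `torch.float8_e4m3fn`):

```python
import torch

def is_float8(dtype: torch.dtype) -> bool:
    # A single set-membership test is easy to grep for, unlike
    # `dtype == torch.float8_e5m2 or dtype == torch.float8_e4m3fn`
    # comparisons scattered across the codebase.
    return dtype in {torch.float8_e5m2, torch.float8_e4m3fn}

assert is_float8(torch.float8_e5m2)
assert not is_float8(torch.float16)
```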

View on GitHub

danieldk created a review on a pull request on huggingface/text-generation-inference

View on GitHub

drbh created a review comment on a pull request on huggingface/text-generation-inference
oops 😬, removed in the latest commit, thanks!

View on GitHub

drbh created a review on a pull request on huggingface/text-generation-inference

View on GitHub

drbh created a review comment on a pull request on huggingface/text-generation-inference
agreed, just updated to choose `get_n_day_weather_forecast` instead
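
The tool in question follows the OpenAI-style function-calling schema. As a rough client-side illustration (not taken from the PR), the sketch below forces `tool_choice` to `get_n_day_weather_forecast` against TGI's OpenAI-compatible `/v1/chat/completions` endpoint; the base URL, model name, and parameter schema are assumptions:

```python
from openai import OpenAI

# Assumes a TGI instance serving the Messages API locally; the api_key is a placeholder.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="-")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_n_day_weather_forecast",
            "description": "Get an N-day weather forecast for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"},
                    "num_days": {"type": "integer"},
                },
                "required": ["location", "num_days"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="tgi",
    messages=[{"role": "user", "content": "What's the weather in Paris this week?"}],
    tools=tools,
    # Pin the model to this specific tool rather than letting it choose.
    tool_choice={"type": "function", "function": {"name": "get_n_day_weather_forecast"}},
)
print(response.choices[0].message.tool_calls)
```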

View on GitHub

drbh created a review on a pull request on huggingface/text-generation-inference

View on GitHub

drbh created a review comment on a pull request on huggingface/text-generation-inference
oh yeah, that's much better, updated in the latest commit

View on GitHub

drbh created a review on a pull request on huggingface/text-generation-inference

View on GitHub

drbh created a review comment on a pull request on huggingface/text-generation-inference
got it, that makes sense. I've changed the name to `ToolChoice` in the latest commit. (it's much cleaner 👍)

View on GitHub

drbh created a review on a pull request on huggingface/text-generation-inference

View on GitHub

drbh created a review on a pull request on huggingface/text-generation-inference
nice! looks good to me ✨

View on GitHub

danieldk pushed 22 commits to maintenance/simplify-attention huggingface/text-generation-inference
  • enable mllama in intel platform (#2610) Signed-off-by: Wang, Yi A <[email protected]> 57f9685
  • Upgrade minor rust version (Fixes rust build compilation cache) (#2617) * Upgrade minor rust version (Fixes rust bui... 8b295aa
  • Add support for fused MoE Marlin for AWQ (#2616) * Add support for fused MoE Marlin for AWQ This uses the updated... 6414248
  • nix: move back to the tgi-nix main branch (#2620) 6db3bcb
  • CI (2599): Update ToolType input schema (#2601) * Update ToolType input schema * lint * fix: run formatter ... 8ad20da
  • nix: add black and isort to the closure (#2619) To make sure that everything is formatted with the same black versio... 9ed0c85
  • AMD CI (#2589) * Only run 1 valid test. * TRying the tailscale action quickly. * ? * bash spaces. * Remo... 43f39f6
  • feat: allow tool calling to respond without a tool (#2614) * feat: process token stream before returning to client ... e36dfaa
  • Update documentation to most recent stable version of TGI. (#2625) Update to most recent stable version of TGI. d912f0b
  • Intel ci (#2630) * Intel CI ? * Let's try non sharded gemma. * Snapshot rename * Apparently container can b... 3dbdf63
  • Fixing intel Supports windowing. (#2637) 0c47884
  • Small fixes for supported models (#2471) * Small improvements for docs * Update _toctree.yml * Updating the do... ce28ee8
  • Cpu perf (#2596) * break when there's nothing to read Signed-off-by: Wang, Yi A <[email protected]> * Differ... 3ea82d0
  • Clarify gated description and quicktour (#2631) Update quicktour.md 51f5401
  • update ipex to fix incorrect output of mllama in cpu (#2640) Signed-off-by: Wang, Yi A <[email protected]> 7a82ddc
  • feat: enable pytorch xpu support for non-attention models (#2561) XPU backend is available natively (without IPEX) i... 58848cb
  • Fixing linters. (#2650) cf04a43
  • Use flashinfer for Gemma 2. ce7e356
  • Rollback to `ChatRequest` for Vertex AI Chat instead of `VertexChat` (#2651) As spotted by @philschmid, the payload ... ffe05cc
  • Fp8 e4m3_fnuz support for rocm (#2588) * (feat) fp8 fnuz support for rocm * (review comments) Fix compression_con... 704a58c
  • and 2 more ...

View on GitHub

Narsil created a comment on an issue on huggingface/text-generation-inference
Can you please include the necessary information requested when you create an issue?

View on GitHub

Narsil created a review comment on a pull request on huggingface/text-generation-inference
Fair enough. From my understanding they are actually the same dtypes, with only side effects on edge values (which we won't check against). I don't even think torch has plans for explicit support h...

View on GitHub

Narsil created a review on a pull request on huggingface/text-generation-inference

View on GitHub

danieldk created a review comment on a pull request on huggingface/text-generation-inference
We might have other variants in the future as well (`e4m3fnuz`, `e5m2uz`).
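
If more float8 variants do land (for example the ROCm `fnuz` flavours mentioned in the commit list above), one defensive pattern is to build the set from whatever the installed torch actually exposes. A minimal sketch, not TGI code; the `FP8_DTYPES` constant and `is_fp8` helper are illustrative:

```python
import torch

# Collect whichever fp8 dtypes this torch build exposes; older builds may
# lack the fnuz variants, so look them up by name instead of hard references.
_FP8_NAMES = ("float8_e4m3fn", "float8_e5m2", "float8_e4m3fnuz", "float8_e5m2fnuz")
FP8_DTYPES = {getattr(torch, name) for name in _FP8_NAMES if hasattr(torch, name)}

def is_fp8(dtype: torch.dtype) -> bool:
    # One grep-able place to extend when new variants appear.
    return dtype in FP8_DTYPES
```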

View on GitHub

danieldk created a review on a pull request on huggingface/text-generation-inference

View on GitHub

Narsil created a comment on an issue on huggingface/text-generation-inference
We're not entirely sure this is really the way to go. Typical deployments have multiple replicas. With CPU/disk kv-cache you need to use sticky sessions if you don't want to reproduce n-times th...

View on GitHub

mht-sharma created a comment on an issue on huggingface/text-generation-inference
Fixed in https://github.com/huggingface/text-generation-inference/pull/2579

View on GitHub

mht-sharma closed an issue on huggingface/text-generation-inference
tgi server launch fails with latest-rocm docker image.
### System Info text-generation-launcher 2.2.1-dev0 ### Information - [X] Docker - [X] The CLI directly ### Tasks - [X] An officially supported command - [ ] My own modifications...

mht-sharma created a comment on an issue on huggingface/text-generation-inference
Fixed in https://github.com/huggingface/text-generation-inference/pull/2579

View on GitHub

mht-sharma closed an issue on huggingface/text-generation-inference
TGI fails to run Llama 3.1-405B with AMD 8 X MI300x
### System Info Runtime environment: ``` Target: x86_64-unknown-linux-gnu Cargo version: 1.79.0 Commit sha: a379d5536bb2de55154dc09c3a1f24ce58cb7df5 Docker label: sha-a379d55-rocm nvidia-s...

Johnno1011 opened an issue on huggingface/text-generation-inference
llama3.1 /v1/chat/completions template not found
### System Info text generation inference v2.3.1 meta-llama/Meta-Llama-3.1-70B-Instruct ### Information - [X] Docker - [ ] The CLI directly ### Tasks - [X] An officially supported command - [...

Narsil deleted a branch huggingface/text-generation-inference

feat/prefix_chunking

Narsil pushed 1 commit to main huggingface/text-generation-inference
  • feat: prefill chunking (#2600) * wip * rollback * refactor to use prefix/postfix namming + fix all_input_ids_t... a6a0c97

View on GitHub

Narsil closed a pull request on huggingface/text-generation-inference
feat: prefill chunking

Narsil created a review on a pull request on huggingface/text-generation-inference

View on GitHub

Narsil created a review comment on a pull request on huggingface/text-generation-inference
Actually, a better `serde(rename)` on the whole struct should fix every field (both the schema and the actual JSON parsing)

View on GitHub
