Ecosyste.ms: Timeline
Browse the timeline of events for every public repo on GitHub. Data updated hourly from GH Archive.
mht-sharma created a review on a pull request on huggingface/text-generation-inference
LGTM! Thanks for the PR @danieldk. This will help me enable FP8 KV cache on ROCm next.
Narsil pushed 1 commit to auto_length huggingface/text-generation-inference
- Fix integration mt0 (transformers update). e3db525
danieldk pushed 1 commit to feature/cc89-cutlass-w8a8 huggingface/text-generation-inference
- Remove fbgemm fb24d7a
danieldk pushed 9 commits to feature/fp8-kv-cache-scale huggingface/text-generation-inference
- Test Marlin MoE with `desc_act=true` (#2622) Update the Mixtral GPTQ test to use a model with `desc_act=true` and `... 7f54b73
- break when there's nothing to read (#2582) Signed-off-by: Wang, Yi A <[email protected]> 058d306
- Add `impureWithCuda` dev shell (#2677) * Add `impureWithCuda` dev shell This shell is handy when developing some ... 9c9ef37
- Make moe-kernels and marlin-kernels mandatory in CUDA installs (#2632) f58eb70
- feat: natively support Granite models (#2682) * feat: natively support Granite models * Update doc 03c9388
- hotfix: fix flashllama 27ff187
- feat: allow any supported payload on /invocations (#2683) * feat: allow any supported payload on /invocations * u... 41c2623
- Add support for FP8 KV cache scales Since FP8 only has limited dynamic range, we can scale keys/values before storin... ba4ac96
- Update FP8 KV cache test to use checkpoint with scales 1f18cb6
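The commit above describes scaling keys/values before storing them, since FP8 only has limited dynamic range. A minimal sketch of per-tensor scaling, assuming the FP8 E4M3 format (maximum finite value 448); the helper names are hypothetical and this is not TGI's actual implementation:

```python
# Hypothetical per-tensor FP8 KV-cache scaling sketch (not TGI's real code).
# FP8 E4M3 tops out at 448, so values are scaled into that range before
# storage and rescaled with the same factor on load.

FP8_E4M3_MAX = 448.0

def compute_kv_scale(values):
    """Scale factor so the largest |value| maps onto the FP8 max."""
    amax = max(abs(v) for v in values)
    return amax / FP8_E4M3_MAX if amax > 0 else 1.0

def quantize(values, scale):
    # Clamp into the representable range; real FP8 rounding is omitted here.
    return [max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, v / scale)) for v in values]

def dequantize(stored, scale):
    return [v * scale for v in stored]

keys = [0.5, -2.0, 900.0, -448.0]
scale = compute_kv_scale(keys)      # 900 / 448
stored = quantize(keys, scale)      # all entries now within [-448, 448]
restored = dequantize(stored, scale)
```

Without the scale, the 900.0 entry would saturate at the FP8 maximum; with it, values round-trip up to quantization error.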
danieldk pushed 8 commits to feature/cc89-cutlass-w8a8 huggingface/text-generation-inference
- Make moe-kernels and marlin-kernels mandatory in CUDA installs (#2632) f58eb70
- feat: natively support Granite models (#2682) * feat: natively support Granite models * Update doc 03c9388
- hotfix: fix flashllama 27ff187
- feat: allow any supported payload on /invocations (#2683) * feat: allow any supported payload on /invocations * u... 41c2623
- Add support for FP8 KV cache scales Since FP8 only has limited dynamic range, we can scale keys/values before storin... 14a5053
- WIP 56135ba
- scale upper bound as tensor for cutlass gemm 6f87b7f
- Remove fbgemm 14bffd8
danieldk pushed 2 commits to feature/cc89-cutlass-w8a8 huggingface/text-generation-inference
Narsil pushed 1 commit to auto_length huggingface/text-generation-inference
- Revert doc text. cacaba6
mfuntowicz pushed 1 commit to feat-backend-llamacpp huggingface/text-generation-inference
- feat(backend): wip Rust binding c4862be
Narsil pushed 1 commit to auto_length huggingface/text-generation-inference
- Updating logic + non flash. 6994fa1
Narsil pushed 1 commit to auto_length huggingface/text-generation-inference
- Much simpler logic after the overhead. 1053451
mfuntowicz pushed 1 commit to feat-backend-llamacpp huggingface/text-generation-inference
- chore(backend): minor formatting e23566e
lp-noel created a comment on an issue on huggingface/text-generation-inference
Same error here with Qwen2.5 on 4 GPUs, can this be re-opened?
mfuntowicz pushed 3 commits to trtllm-stop-words huggingface/text-generation-inference
sidharthrajaram closed a pull request on huggingface/text-generation-inference
Support OpenAI Structured Output by adding json_schema as an alias for JSON Grammar
# What does this PR do? ### tl;dr Supports `"json_schema"` as a type for `response_format`, in addition to the existing aliases `"json_object"` and `"json"`. This aligns TGI with the OpenAI...
HuggingFaceDocBuilderDev created a comment on a pull request on huggingface/text-generation-inference
The docs for this PR live [here](https://moon-ci-docs.huggingface.co/docs/text-generation-inference/pr_2683). All of your documentation changes will be reflected on that endpoint. The docs are avai...
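The `json_schema` alias described in sidharthrajaram's PR above could be exercised with a request payload along these lines; the exact field names (`value` for the schema, the endpoint shape) are assumptions for illustration, not confirmed by the feed:

```python
# Hypothetical chat-completion payload using the `json_schema` alias for
# response_format. Field names other than "type" are illustrative only.
import json

payload = {
    "model": "tgi",
    "messages": [{"role": "user", "content": "Give me a user record."}],
    "response_format": {
        "type": "json_schema",  # new alias alongside "json_object" / "json"
        "value": {              # assumed key carrying the JSON Schema grammar
            "type": "object",
            "properties": {"name": {"type": "string"}},
            "required": ["name"],
        },
    },
}
body = json.dumps(payload)
```

The point of the alias is that clients already built against OpenAI's structured-output API can send `"json_schema"` without renaming the type for TGI.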
HuggingFaceDocBuilderDev created a comment on a pull request on huggingface/text-generation-inference
The docs for this PR live [here](https://moon-ci-docs.huggingface.co/docs/text-generation-inference/pr_2682). All of your documentation changes will be reflected on that endpoint. The docs are avai...
Narsil pushed 1 commit to auto_length huggingface/text-generation-inference
- QuantLinear is rocm compatible. 849d882
danieldk pushed 2 commits to feature/cc89-cutlass-w8a8 huggingface/text-generation-inference
danieldk pushed 1 commit to main huggingface/text-generation-inference
- Make moe-kernels and marlin-kernels mandatory in CUDA installs (#2632) f58eb70