Ecosyste.ms: Timeline
Browse the timeline of events for every public repo on GitHub. Data updated hourly from GH Archive.
sam-ulrich1 created a comment on an issue on huggingface/text-generation-inference
After disabling prefix caching I seem to be getting the same response across different machines
sam-ulrich1 created a comment on an issue on huggingface/text-generation-inference
To disable prefix caching you have to set both `USE_PREFIX_CACHING=0` AND `PREFIX_CACHING=0` in v2.3.1
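For reference, a minimal sketch of launching the v2.3.1 Docker image with both variables set, as the comment describes; the model id, port, and volume below are placeholders, not taken from the thread:

```sh
# Illustrative launch with prefix caching disabled via both env vars
# (USE_PREFIX_CACHING and PREFIX_CACHING, per the comment above).
# Model id, port, and volume are placeholder values.
docker run --gpus all -p 8080:80 \
  -v "$PWD/data:/data" \
  -e USE_PREFIX_CACHING=0 \
  -e PREFIX_CACHING=0 \
  ghcr.io/huggingface/text-generation-inference:2.3.1 \
  --model-id <model-id>
```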
mfuntowicz pushed 6 commits to trtllm-stop-words huggingface/text-generation-inference
- chore(trtllm): create specific parallelconfig factory and logging init methods 75e4466
- chore(trtllm): define a macro for SizeType cast ea82247
- chore(trtllm): use GetParallelConfig b999c04
- chore(trtllm): minor refactoring 98dcde0
- chore(trtllm): validate there are enough GPUs on the system for the desired model 1b56a33
- chore(trtllm): ensure max throughput scheduling policy is selected 4a0f05e
sam-ulrich1 created a comment on an issue on huggingface/text-generation-inference
This also prevents you from using `ATTENTION=paged`, since prefix caching is always true, which crashes the model shards on launch
sam-ulrich1 created a comment on an issue on huggingface/text-generation-inference
Waiting on #2676 to validate whether this is a prefix caching issue, but I have confirmed with `LOG_LEVEL=debug` that the exact same params and input render different results even with the seed set
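One way to check that claim is to send the same fixed-seed request twice to TGI's `/generate` endpoint and diff the responses; the host, prompt, and parameter values below are illustrative:

```sh
# Send an identical fixed-seed request twice; with deterministic
# generation the two response bodies should match.
REQ='{"inputs":"What is deep learning?","parameters":{"max_new_tokens":64,"do_sample":true,"temperature":0.7,"seed":42}}'
curl -s http://localhost:8080/generate -H 'Content-Type: application/json' -d "$REQ" > a.json
curl -s http://localhost:8080/generate -H 'Content-Type: application/json' -d "$REQ" > b.json
diff a.json b.json
```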
sam-ulrich1 opened an issue on huggingface/text-generation-inference
PREFIX_CACHING=0 does not disable prefix caching in v2.3.1
### System Info Ubuntu 20.04 Host, Docker image v2.3.1 ### Information - [X] Docker - [ ] The CLI directly ### Tasks - [X] An officially supported command - [ ] My own modifications ### Repro...
sam-ulrich1 created a comment on an issue on huggingface/text-generation-inference
This unfortunately did not work for me on the docker image
sam-ulrich1 created a comment on an issue on huggingface/text-generation-inference
Awesome, thank you
sam-ulrich1 closed an issue on huggingface/text-generation-inference
Optionally log input tokens/prompt
### Feature request Optionally log the input prompt/tokens for improved debugging. ### Motivation I am currently attempting to debug why in a prod env I am getting garbage but when replicating t...
sam-ulrich1 created a comment on an issue on huggingface/text-generation-inference
Sweet I'll give that a try
mfuntowicz pushed 6 commits to trtllm-stop-words huggingface/text-generation-inference
- chore(rebase): fix invalid references d73401a
- feat(trtllm): rewrite health to not account for current state 9afcb48
- chore(looper): cleanup a bit more b3d27e6
- feat(post_processing): max_new_tokens is const evaluated now 582551d
- chore(ffi):formatting 56cad9f
- feat(trtllm): add stop words handling # Conflicts: # backends/trtllm/lib/backend.cpp 8b8daac
mfuntowicz pushed 1 commit to trtllm-executor-thread huggingface/text-generation-inference
- chore(rebase): fix invalid references d73401a
mfuntowicz pushed 1 commit to trtllm-stop-words huggingface/text-generation-inference
- chore(rebase): fix invalid references 2c8ecdb
danieldk pushed 2 commits to feature/fp8-kv-cache-scale huggingface/text-generation-inference
claudioMontanari created a comment on an issue on huggingface/text-generation-inference
You should be able to disable prefix caching by starting the server with `PREFIX_CACHING=0`. That's how I got the `llama 3.2 vision` models to work.
claudioMontanari created a comment on an issue on huggingface/text-generation-inference
Prompts should be logged (as well as other info) if you start the server with `LOG_LEVEL=debug text-generation-launcher ...`. Hope this helps!
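Spelled out for the Docker image (tag and model id are placeholders), the same debug logging can be enabled through the environment:

```sh
# LOG_LEVEL=debug makes the launcher log request details,
# including the input prompt, per the comment above.
docker run --gpus all -p 8080:80 \
  -e LOG_LEVEL=debug \
  ghcr.io/huggingface/text-generation-inference:2.3.1 \
  --model-id <model-id>
```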
james-deee created a comment on an issue on huggingface/text-generation-inference
This is so bizarre that this is closed. You can absolutely positively (pun intended) send a temperature of `0.0` to these models. Why in the world is this restricted here?
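For context: TGI validates that `temperature` is strictly positive, so the usual way to get the deterministic behavior expected from `temperature=0.0` is greedy decoding, i.e. leaving sampling off. A sketch against the `/generate` endpoint (host and prompt are illustrative):

```sh
# Greedy decoding: omit temperature and set do_sample to false,
# which matches the temperature-0 limit in effect.
curl -s http://localhost:8080/generate \
  -H 'Content-Type: application/json' \
  -d '{"inputs":"2+2=","parameters":{"max_new_tokens":8,"do_sample":false}}'
```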
mfuntowicz pushed 6 commits to trtllm-stop-words huggingface/text-generation-inference
- Revert "chore(trtllm): remove unused method" This reverts commit 31747163 f5b9ee3
- feat(trtllm): rewrite health to not account for current state 7a14185
- chore(looper): cleanup a bit more 2ab1a8b
- feat(post_processing): max_new_tokens is const evaluated now e4beada
- chore(ffi):formatting 983ecf1
- feat(trtllm): add stop words handling # Conflicts: # backends/trtllm/lib/backend.cpp f631742
mfuntowicz pushed 1 commit to trtllm-executor-thread huggingface/text-generation-inference
- Revert "chore(trtllm): remove unused method" This reverts commit 31747163 f5b9ee3
sywangyi created a comment on a pull request on huggingface/text-generation-inference
I am also curious about why the thread keeps looping while the process is closed.
sywangyi created a comment on a pull request on huggingface/text-generation-inference
docker --version
Docker version 26.1.3, build b72abbb

lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 224
On-lin...
Narsil pushed 1 commit to auto_length huggingface/text-generation-inference
- Remove generated files. a31db04
Narsil pushed 1 commit to main huggingface/text-generation-inference
- break when there's nothing to read (#2582) Signed-off-by: Wang, Yi A <[email protected]> 058d306