Ecosyste.ms: Timeline

Browse the timeline of events for every public repo on GitHub. Data updated hourly from GH Archive.

huggingface/text-generation-inference

Narsil pushed 1 commit to auto_length huggingface/text-generation-inference
  • Trying to fix non chunking targets. 0a01dde

sywangyi created a comment on a pull request on huggingface/text-generation-inference
I think we use the same OS, since it's Ubuntu 22.04; see https://github.com/huggingface/text-generation-inference/blob/main/Dockerfile_intel#L127

Narsil created a comment on a pull request on huggingface/text-generation-inference
I think it's OS-dependent more than CPU-dependent, no? The doc definitely says that behavior varies based on the target.

sidharthrajaram opened a pull request on huggingface/text-generation-inference
Support OpenAI Structured Output by adding json_schema as an alias for JSON Grammar
# What does this PR do? ### tl;dr Supports `"json_schema"` as a type for `response_format`, in addition to the existing aliases `"json_object"` and `"json"`. This aligns TGI with the OpenAI...
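
A minimal sketch of the request shape this PR would enable, assuming a TGI server at localhost:8080 exposing the OpenAI-compatible `/v1/chat/completions` route; the layout of the schema payload below follows OpenAI's convention and is an assumption, not taken from the PR:

```python
import requests

# Sketch only: the "json_schema" type is what the PR proposes; the schema
# payload layout below is an assumption based on OpenAI's convention.
payload = {
    "model": "tgi",
    "messages": [{"role": "user", "content": "Extract the city: 'I live in Paris.'"}],
    "response_format": {
        "type": "json_schema",  # new alias, alongside "json_object" and "json"
        "value": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}
resp = requests.post("http://localhost:8080/v1/chat/completions", json=payload, timeout=60)
print(resp.json()["choices"][0]["message"]["content"])
```
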
Noblezhong starred huggingface/text-generation-inference

mfuntowicz pushed 1 commit to feat-backend-llamacpp huggingface/text-generation-inference
  • feat(backend): use llama_token as TokenId type 1f9c456

mfuntowicz pushed 1 commit to feat-backend-llamacpp huggingface/text-generation-inference
  • feat(backend): add some initial decoding steps 9376250

mfuntowicz pushed 3 commits to trtllm-stop-words huggingface/text-generation-inference
  • chore(router): minor refactorings 56106b4
  • feat(docker): build with-slurm ompi ba2618e
  • feat(docker): add python3.10 dev to runtime deps 5f81550

Timenumber starred huggingface/text-generation-inference

Simon-Stone created a comment on an issue on huggingface/text-generation-inference
I am running into the same issue: I can get Llama 3.1 to respond with a tool call using the Messages API, but I cannot seem to make it respond to a tool call result. If I manually convert it to Lla...
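
For context, a hedged sketch of the round trip the commenter describes, using the OpenAI-style message shapes that TGI's Messages API mirrors; the endpoint, tool name, and call id are illustrative:

```python
import requests

# Illustrative only: the message shapes follow the OpenAI convention that
# TGI's Messages API mirrors; "get_weather" and the tool_call id are made up.
messages = [
    {"role": "user", "content": "What's the weather in Paris?"},
    {  # the assistant's earlier tool call, echoed back verbatim
        "role": "assistant",
        "tool_calls": [{
            "id": "call_0",
            "type": "function",
            "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
        }],
    },
    {  # the tool result the model should now respond to
        "role": "tool",
        "tool_call_id": "call_0",
        "content": '{"temperature_c": 18, "condition": "cloudy"}',
    },
]
resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={"model": "tgi", "messages": messages},
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```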

drbh pushed 1 commit to return-streaming-error-in-openai-client-compatible-format huggingface/text-generation-inference
  • fix: propagate completions error events to stream 85701ea

mfuntowicz pushed 2 commits to feat-backend-llamacpp huggingface/text-generation-inference
  • feat(backend): correctly load llama.cpp model from llama api and not gpt2 196aedd
  • feat(backend): tell cmake to build llama-common and link to it fde042a

alvarobartt created a comment on a pull request on huggingface/text-generation-inference
> I wonder if it would be good to have separate sections in the Getting Started bar for AWS and GCP. It is an important feature that seems very hidden. Agree, mentioned above too; I believe that...

nbroad1881 created a review comment on a pull request on huggingface/text-generation-inference
I would say: > This will modify the `/invocations` route to accept [Messages dictionaries](#openai-messages-api). See the example below on how to deploy Llama with the new Messages API.
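
A hedged sketch of what that route change means for callers, with localhost standing in for a deployed SageMaker endpoint; the payload simply mirrors the Messages API format:

```python
import requests

# Illustrative only: localhost stands in for a deployed SageMaker endpoint;
# with the change discussed, /invocations accepts a Messages dictionary.
payload = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is deep learning?"},
    ],
    "max_tokens": 128,
}
resp = requests.post("http://localhost:8080/invocations", json=payload, timeout=60)
print(resp.json())
```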

nbroad1881 created a review on a pull request on huggingface/text-generation-inference

nbroad1881 created a comment on a pull request on huggingface/text-generation-inference
I wonder if it would be good to have separate sections in the Getting Started bar for AWS and GCP. It is an important feature that seems very hidden. ![image](https://github.com/user-attachm...

nbroad1881 created a comment on a pull request on huggingface/text-generation-inference
Looks great to me! Thanks for doing this so quickly. Sorry I responded so slowly.

danieldk deleted a branch on huggingface/text-generation-inference
  • impure-with-cuda

danieldk pushed 1 commit to main huggingface/text-generation-inference
  • Add `impureWithCuda` dev shell (#2677) * Add `impureWithCuda` dev shell This shell is handy when developing some ... 9c9ef37

danieldk closed a pull request on huggingface/text-generation-inference
Add `impureWithCuda` dev shell
# What does this PR do? ...

mfuntowicz opened a pull request on huggingface/text-generation-inference
Add support for stop words in TRTLLM
This PR attempts to read the `eos_token_ids` in `generation_config.json` (if present) and creates a list (`std::list`, required by trtllm) to cache those values. The list is then forwarded to TRTL...
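
The PR itself lives in the C++ TRTLLM backend (hence the `std::list`), but the config-reading step it describes can be sketched in Python; the directory layout and the key normalization below are assumptions (Hugging Face configs typically spell the field `eos_token_id`, as an int or a list):

```python
import json
from pathlib import Path

# Sketch of the lookup the PR describes: generation_config.json is optional,
# and the "eos_token_id" field may be a single int or a list, so normalize
# to a list of stop-token ids.
def read_stop_token_ids(model_dir: str) -> list[int]:
    config_path = Path(model_dir) / "generation_config.json"
    if not config_path.exists():
        return []
    config = json.loads(config_path.read_text())
    eos = config.get("eos_token_id", [])
    return [eos] if isinstance(eos, int) else list(eos)

print(read_stop_token_ids("./llama-3.1-8b"))  # e.g. [128001, 128008, 128009]
```
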
danieldk pushed 1 commit to impure-with-cuda huggingface/text-generation-inference

danieldk opened a pull request on huggingface/text-generation-inference
Add `impureWithCuda` dev shell
# What does this PR do? ...

danieldk created a branch on huggingface/text-generation-inference
  • impure-with-cuda

mfuntowicz pushed 13 commits to trtllm-stop-words huggingface/text-generation-inference
  • chore(router): add python dependency 18b473b
  • feat(trtllm): rewrite health to not account for current state 04c6f51
  • chore(looper): cleanup a bit more cdac4b0
  • feat(post_processing): max_new_tokens is const evaluated now 9ac26ed
  • chore(ffi):formatting c1a43a6
  • feat(trtllm): add stop words handling # Conflicts: # backends/trtllm/lib/backend.cpp 421a175
  • chore(trtllm): create specific parallelconfig factory and logging init methods 7217caf
  • chore(trtllm): define a macro for SizeType cast d5c8bdc
  • chore(trtllm): use GetParallelConfig 60a08a2
  • chore(trtllm): minor refactoring 848b8ad
  • chore(trtllm): validate there are enough GPus on the system for the desired model a6ac274
  • chore(trtllm): ensure max throughput scheduling policy is selected 47d8c53
  • chore(trtllm): minor fix 84f3bf9

mfuntowicz pushed 1 commit to trtllm-executor-thread huggingface/text-generation-inference
  • chore(router): add python dependency 18b473b

mfuntowicz pushed 1 commit to trtllm-stop-words huggingface/text-generation-inference

CheanBotum starred huggingface/text-generation-inference