Ecosyste.ms: Timeline
Browse the timeline of events for every public repo on GitHub. Data updated hourly from GH Archive.
sixsixcoder pushed 1 commit to glm-4 sixsixcoder/vllm
- Update vllm/model_executor/models/glm.py Co-authored-by: Isotr0py <[email protected]> 5028019
sixsixcoder created a comment on an issue on THUDM/GLM-4
> Understood, I can deploy the glm-4-9b-chat version for now. Also, I'd like to confirm: the -hf version only improves compatibility with future transformers upgrades, with no other changes to model-performance metrics, right? Yes, the -hf version will be compatible with future versions of transformers; model performance is unchanged.
sixsixcoder created a comment on a pull request on vllm-project/vllm
Thank you for your reply. I have updated the code according to your suggestion. Can it be merged?
sixsixcoder created a comment on an issue on THUDM/GLM-4
This is a vLLM version issue; the latest vLLM has only just added support for the `GlmForCausalLM` architecture. For now, you can run the script using this PR: https://github.com/vllm-project/vllm/pull/10561
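A minimal sketch of trying out that pull request before it is merged (an assumed workflow, not something stated in the comment; the `pull/10561/head` ref is GitHub's generic read-only PR ref, and the local branch name `glm4-pr` is hypothetical):

```shell
# Option 1: install vLLM directly from the PR's head ref
# (adjust to your CUDA / torch environment as needed).
pip install "git+https://github.com/vllm-project/vllm.git@refs/pull/10561/head"

# Option 2: fetch the PR into a local branch and install in editable mode.
git clone https://github.com/vllm-project/vllm.git
cd vllm
git fetch origin pull/10561/head:glm4-pr
git checkout glm4-pr
pip install -e .
```

Either route gives you a build that includes the `GlmForCausalLM` support ahead of the next release.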
sixsixcoder created a comment on an issue on THUDM/GLM-4-Voice
In this software/hardware environment: ``` GPU: A800-SXM4-80GB; CUDA 12.1; torch 2.4.0; torchaudio 2.4.0; transformers 4.45.2; Python 3.10; VRAM: 80 GB; precision: BF16; number of GPUs: 1; top_p = 1.0; temperature = 1.0; max_new_tokens = 256 ``` ...
sixsixcoder created a comment on an issue on THUDM/GLM-4-Voice
> What level of human-likeness does the current version reach, e.g. control of intonation, pauses, and filler words? And how exactly would fine-tuning work? I couldn't find any relevant documentation in the repository. > > I understand that the corpus needed for context-based fine-tuning would be fairly complex, so I'm considering the text side first: initially I only want to fine-tune text generation (e.g. SFT on customer-service data). Is there a way to LoRA-fine-tune the current glm-4-voice-9b so that it aligns text output to my own data while retaining its speech understanding and synthesis capabilities? Thanks...
sixsixcoder created a comment on an issue on THUDM/GLM-4-Voice
> I tried to run the training after preparing the appropriate json data and yaml configuration file, but my attempt failed, can I ask for the missing configuration yaml file? There is no publish...
sixsixcoder pushed 45 commits to glm-4 sixsixcoder/vllm
- [Benchmark] Add new H100 machine (#10547) aed0748
- [9/N] torch.compile LLM usage (#10552) Signed-off-by: youkaichao <[email protected]> 33e0a25
- [Minor] Fix line-too-long (#10563) Signed-off-by: Woosuk Kwon <[email protected]> 446c780
- [platforms] absorb worker cls difference into platforms folder (#10555) Signed-off-by: youkaichao <youkaichao@gmail.... a111d01
- [Bugfix] Fix Phi-3 BNB quantization with tensor parallel (#9948) Signed-off-by: Isotr0py <[email protected]> b6374e0
- Remove token-adding chat embedding params (#10551) Signed-off-by: Noam Gat <[email protected]> 11fcf0e
- [bugfix] fix full graph tests (#10581) Signed-off-by: youkaichao <[email protected]> db100c5
- [torch.compile] support all attention backends (#10558) Signed-off-by: youkaichao <[email protected]> eebad39
- [v1] Refactor KVCacheManager for more hash input than token ids (#10507) Signed-off-by: rickyx <[email protected]>... 97814fb
- support bitsandbytes quantization with qwen model (#10549) Signed-off-by: Ubuntu <[email protected]> 948c859
- [Core] remove temporary local variables in LLMEngine.__init__ (#10577) Signed-off-by: Russell Bryant <rbryant@redhat... 28598f3
- [V1] EngineCore supports profiling (#10564) Signed-off-by: Abatom <[email protected]> d345f40
- [bugfix] fix cpu tests (#10585) Signed-off-by: youkaichao <[email protected]> d559979
- [Bugfix][Frontend] Update Llama Chat Templates to also support Non-Tool use (#10164) Signed-off-by: Travis Johnson <... 9195dbd
- [Core] Fix broken log configuration (#10458) Signed-off-by: Russell Bryant <[email protected]> ebda519
- [Misc] Add pynccl wrappers for all_gather and reduce_scatter (#9432) 978b397
- [core] gemma2 full context length support (#10584) Signed-off-by: youkaichao <[email protected]> 4aba6e3
- [Bugfix] Internal Server Error when tool_choice is incorrect. (#10567) Signed-off-by: Varun Shenoy <varun.vinayak.sh... 7d8ffb3
- [Model] Fix Baichuan BNB online quantization (#10572) Signed-off-by: Chen Wu <[email protected]> cfea9c0
- Update default max_num_batch_tokens for chunked prefill to 2048 (#10544) 02a43f8
- and 25 more ...
sixsixcoder pushed 1 commit to main sixsixcoder/vllm
- [misc] move functions to config.py (#10624) Signed-off-by: youkaichao <[email protected]> 05d1f8c
sixsixcoder pushed 3 commits to main sixsixcoder/vllm
- [torch.compile] force inductor threads (#10620) Signed-off-by: Jee Jee Li <[email protected]> 7c2134b
- [torch.compile] add warning for unsupported models (#10622) Signed-off-by: youkaichao <[email protected]> 6581378
- [misc] add torch.compile compatibility check (#10618) Signed-off-by: youkaichao <[email protected]> 25d806e
sixsixcoder pushed 3 commits to main sixsixcoder/vllm
- [Refactor][MISC] del redundant code in ParallelConfig.postinit (#10614) Signed-off-by: MengqingCao <[email protected]> 7ea3cd7
- [torch.compile] support encoder based models (#10613) Signed-off-by: youkaichao <[email protected]> 571841b
- [Doc] Add encoder-based models to Supported Models page (#10616) Signed-off-by: DarkLight1337 <[email protected]... a30a605
sixsixcoder pushed 1 commit to main sixsixcoder/vllm
- Support Cross encoder models (#10400) Signed-off-by: Max de Bayser <[email protected]> Signed-off-by: Max de Ba... 214efc2
sixsixcoder pushed 34 commits to main sixsixcoder/vllm
- [Benchmark] Add new H100 machine (#10547) aed0748
- [9/N] torch.compile LLM usage (#10552) Signed-off-by: youkaichao <[email protected]> 33e0a25
- [Minor] Fix line-too-long (#10563) Signed-off-by: Woosuk Kwon <[email protected]> 446c780
- [platforms] absorb worker cls difference into platforms folder (#10555) Signed-off-by: youkaichao <youkaichao@gmail.... a111d01
- [Bugfix] Fix Phi-3 BNB quantization with tensor parallel (#9948) Signed-off-by: Isotr0py <[email protected]> b6374e0
- Remove token-adding chat embedding params (#10551) Signed-off-by: Noam Gat <[email protected]> 11fcf0e
- [bugfix] fix full graph tests (#10581) Signed-off-by: youkaichao <[email protected]> db100c5
- [torch.compile] support all attention backends (#10558) Signed-off-by: youkaichao <[email protected]> eebad39
- [v1] Refactor KVCacheManager for more hash input than token ids (#10507) Signed-off-by: rickyx <[email protected]>... 97814fb
- support bitsandbytes quantization with qwen model (#10549) Signed-off-by: Ubuntu <[email protected]> 948c859
- [Core] remove temporary local variables in LLMEngine.__init__ (#10577) Signed-off-by: Russell Bryant <rbryant@redhat... 28598f3
- [V1] EngineCore supports profiling (#10564) Signed-off-by: Abatom <[email protected]> d345f40
- [bugfix] fix cpu tests (#10585) Signed-off-by: youkaichao <[email protected]> d559979
- [Bugfix][Frontend] Update Llama Chat Templates to also support Non-Tool use (#10164) Signed-off-by: Travis Johnson <... 9195dbd
- [Core] Fix broken log configuration (#10458) Signed-off-by: Russell Bryant <[email protected]> ebda519
- [Misc] Add pynccl wrappers for all_gather and reduce_scatter (#9432) 978b397
- [core] gemma2 full context length support (#10584) Signed-off-by: youkaichao <[email protected]> 4aba6e3
- [Bugfix] Internal Server Error when tool_choice is incorrect. (#10567) Signed-off-by: Varun Shenoy <varun.vinayak.sh... 7d8ffb3
- [Model] Fix Baichuan BNB online quantization (#10572) Signed-off-by: Chen Wu <[email protected]> cfea9c0
- Update default max_num_batch_tokens for chunked prefill to 2048 (#10544) 02a43f8
- and 14 more ...
sixsixcoder opened a pull request on vllm-project/vllm
[Model] Added GLM-4 series model support vllm==0.6.4
# Overview This update adds [GLM-4](https://huggingface.co/THUDM/glm-4-9b-chat-hf) series text model support for vllm==0.6.4, which is different from GLM-4v https://github.com/vllm-project/vllm/pull/9...
sixsixcoder pushed 1 commit to glm-4 sixsixcoder/vllm
- Added GLM-4 series model support vllm==0.6.4 4f84afe
sixsixcoder pushed 71 commits to glm-4 sixsixcoder/vllm
- [4/N][torch.compile] clean up set_torch_compile_backend (#10401) Signed-off-by: youkaichao <[email protected]> 51bb12d
- [VLM] Report multi_modal_placeholders in output (#10407) Signed-off-by: Linkun Chen <[email protected]> c7dec92
- [Model] Remove redundant softmax when using PoolingType.STEP (#10415) 01aae1c
- [Model][LoRA]LoRA support added for glm-4v (#10418) Signed-off-by: B-201 <[email protected]> 5be4e52
- [Model] Remove transformers attention porting in VITs (#10414) Signed-off-by: Isotr0py <[email protected]> e7ebb66
- [Doc] Update doc for LoRA support in GLM-4V (#10425) Signed-off-by: B-201 <[email protected]> 4186be8
- [5/N][torch.compile] torch.jit.script --> torch.compile (#10406) Signed-off-by: youkaichao <[email protected]> 7851b45
- [Doc] Add documentation for Structured Outputs (#9943) Signed-off-by: ismael-dm <[email protected]> 31894a2
- Fix open_collective value in FUNDING.yml (#10426) Signed-off-by: Andrew Nesbitt <[email protected]> 4f686d1
- [Model][Bugfix] Support TP for PixtralHF ViT (#10405) Signed-off-by: mgoin <[email protected]> 281cc4b
- [Hardware][XPU] AWQ/GPTQ support for xpu backend (#10107) Signed-off-by: yan ma <[email protected]> 6b2d25e
- [Kernel] Explicitly specify other value in tl.load calls (#9014) Signed-off-by: Angus Wang <[email protected]> c2170a5
- [Kernel] Initial Machete W4A8 support + Refactors (#9855) Signed-off-by: Lucas Wilkinson <[email protected]> 96d999f
- [3/N][torch.compile] consolidate custom op logging (#10399) Signed-off-by: youkaichao <[email protected]> a03ea40
- [ci][bugfix] fix kernel tests (#10431) Signed-off-by: youkaichao <[email protected]> 2298e69
- [misc] partial prefix & random input generation benchmark (#9929) Signed-off-by: rickyx <[email protected]> 90a6c75
- [ci/build] Have dependabot ignore all patch update (#10436) We have too many dependencies and all patch updates can ... 284203f
- [Bugfix]Fix Phi-3 BNB online quantization (#10417) Signed-off-by: Jee Jee Li <[email protected]> 7eb719d
- [Platform][Refactor] Extract func `get_default_attn_backend` to `Platform` (#10358) Signed-off-by: Mengqing Cao <cmq... 8c1fb50
- Add openai.beta.chat.completions.parse example to structured_outputs.rst (#10433) 74f8c2c
- and 51 more ...
sixsixcoder pushed 72 commits to main sixsixcoder/vllm
- [Bugfix] Ignore ray reinit error when current platform is ROCm or XPU (#10375) Signed-off-by: Hollow Man <hollowman@... 47826ca
- [4/N][torch.compile] clean up set_torch_compile_backend (#10401) Signed-off-by: youkaichao <[email protected]> 51bb12d
- [VLM] Report multi_modal_placeholders in output (#10407) Signed-off-by: Linkun Chen <[email protected]> c7dec92
- [Model] Remove redundant softmax when using PoolingType.STEP (#10415) 01aae1c
- [Model][LoRA]LoRA support added for glm-4v (#10418) Signed-off-by: B-201 <[email protected]> 5be4e52
- [Model] Remove transformers attention porting in VITs (#10414) Signed-off-by: Isotr0py <[email protected]> e7ebb66
- [Doc] Update doc for LoRA support in GLM-4V (#10425) Signed-off-by: B-201 <[email protected]> 4186be8
- [5/N][torch.compile] torch.jit.script --> torch.compile (#10406) Signed-off-by: youkaichao <[email protected]> 7851b45
- [Doc] Add documentation for Structured Outputs (#9943) Signed-off-by: ismael-dm <[email protected]> 31894a2
- Fix open_collective value in FUNDING.yml (#10426) Signed-off-by: Andrew Nesbitt <[email protected]> 4f686d1
- [Model][Bugfix] Support TP for PixtralHF ViT (#10405) Signed-off-by: mgoin <[email protected]> 281cc4b
- [Hardware][XPU] AWQ/GPTQ support for xpu backend (#10107) Signed-off-by: yan ma <[email protected]> 6b2d25e
- [Kernel] Explicitly specify other value in tl.load calls (#9014) Signed-off-by: Angus Wang <[email protected]> c2170a5
- [Kernel] Initial Machete W4A8 support + Refactors (#9855) Signed-off-by: Lucas Wilkinson <[email protected]> 96d999f
- [3/N][torch.compile] consolidate custom op logging (#10399) Signed-off-by: youkaichao <[email protected]> a03ea40
- [ci][bugfix] fix kernel tests (#10431) Signed-off-by: youkaichao <[email protected]> 2298e69
- [misc] partial prefix & random input generation benchmark (#9929) Signed-off-by: rickyx <[email protected]> 90a6c75
- [ci/build] Have dependabot ignore all patch update (#10436) We have too many dependencies and all patch updates can ... 284203f
- [Bugfix]Fix Phi-3 BNB online quantization (#10417) Signed-off-by: Jee Jee Li <[email protected]> 7eb719d
- [Platform][Refactor] Extract func `get_default_attn_backend` to `Platform` (#10358) Signed-off-by: Mengqing Cao <cmq... 8c1fb50
- and 52 more ...
sixsixcoder created a comment on an issue on THUDM/GLM-4-Voice
Thank you for your interest. A technical report will be released later with the detailed technical specifics.
sixsixcoder created a comment on an issue on THUDM/GLM-4-Voice
Long-text speech capability is currently limited; please watch for the upcoming technical report. At present, it performs best on short text.
sixsixcoder created a comment on an issue on THUDM/GLM-4-Voice
The demo currently provided does not yet support mixed text + speech input; this will be supported in the future.
sixsixcoder created a comment on an issue on THUDM/GLM-4-Voice
GLM-4-Voice supports tone conversion; you can try fine-tuning and constructing suitable context to achieve this effect.
sixsixcoder created a comment on an issue on THUDM/GLM-4-Voice
Yes, it is supported, but this repository only provides a simple usage demo; RAG requires developers to do their own secondary development.
sixsixcoder created a branch on sixsixcoder/GithubToLark
main - Receive information from GitHub and push it to Lark