Ecosyste.ms: Timeline
Browse the timeline of events for every public repo on GitHub. Data updated hourly from GH Archive.
sixsixcoder pushed 1 commit to glm-4 sixsixcoder/vllm
- Update vllm/model_executor/models/glm.py Co-authored-by: Isotr0py <[email protected]> 5028019
sixsixcoder created a comment on an issue on THUDM/GLM-4
> Understood, I can deploy the glm-4-9b-chat version for now. Also, I'd like to confirm: the -hf version only improves compatibility with future transformers upgrades, with no other changes to model-performance metrics, right? Yes, the -hf version will be compatible with future versions of transformers; model performance is unchanged.
sixsixcoder created a comment on a pull request on vllm-project/vllm
Thank you for your reply. I have updated the code according to your suggestion. Can it be merged?
sixsixcoder created a comment on an issue on THUDM/GLM-4
This is a vLLM version issue; the latest vLLM has only just added support for the `GlmForCausalLM` architecture. For now, you can run the script using this PR: https://github.com/vllm-project/vllm/pull/10561
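A minimal sketch of trying out that pull request before it is merged (an assumed workflow, not something stated in the comment; the `pull/10561/head` ref is GitHub's generic read-only PR ref, and the local branch name `glm4-pr` is hypothetical):

```shell
# Option 1: install vLLM directly from the PR's head ref
# (adjust to your CUDA / torch environment as needed).
pip install "git+https://github.com/vllm-project/vllm.git@refs/pull/10561/head"

# Option 2: fetch the PR into a local branch and install in editable mode.
git clone https://github.com/vllm-project/vllm.git
cd vllm
git fetch origin pull/10561/head:glm4-pr
git checkout glm4-pr
pip install -e .
```

Either route gives you a build that includes the `GlmForCausalLM` support ahead of the next release.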
sixsixcoder created a comment on an issue on THUDM/GLM-4-Voice
In this software/hardware environment: ``` GPU: A800-SXM4-80GB; CUDA 12.1; torch 2.4.0; torchaudio 2.4.0; transformers 4.45.2; Python 3.10; VRAM: 80 GB; precision: BF16; number of GPUs: 1; top_p = 1.0; temperature = 1.0; max_new_tokens = 256 ``` ...
sixsixcoder created a comment on an issue on THUDM/GLM-4-Voice
> What level of human-likeness does the current version reach, e.g. control of intonation, pauses, and filler words? And how exactly would fine-tuning work? I couldn't find any relevant documentation in the repository. > > I understand that the corpus needed for context-based fine-tuning would be fairly complex, so I'm considering the text side first: initially I only want to fine-tune text generation (e.g. SFT on customer-service data). Is there a way to LoRA-fine-tune the current glm-4-voice-9b so that it aligns text output to my own data while retaining its speech understanding and synthesis capabilities? Thanks...
sixsixcoder created a comment on an issue on THUDM/GLM-4-Voice
> I tried to run the training after preparing the appropriate json data and yaml configuration file, but my attempt failed, can I ask for the missing configuration yaml file? There is no publish...
sixsixcoder pushed 45 commits to glm-4 sixsixcoder/vllm
- [Benchmark] Add new H100 machine (#10547) aed0748
- [9/N] torch.compile LLM usage (#10552) Signed-off-by: youkaichao <[email protected]> 33e0a25
- [Minor] Fix line-too-long (#10563) Signed-off-by: Woosuk Kwon <[email protected]> 446c780
- [platforms] absorb worker cls difference into platforms folder (#10555) Signed-off-by: youkaichao <youkaichao@gmail.... a111d01
- [Bugfix] Fix Phi-3 BNB quantization with tensor parallel (#9948) Signed-off-by: Isotr0py <[email protected]> b6374e0
- Remove token-adding chat embedding params (#10551) Signed-off-by: Noam Gat <[email protected]> 11fcf0e
- [bugfix] fix full graph tests (#10581) Signed-off-by: youkaichao <[email protected]> db100c5
- [torch.compile] support all attention backends (#10558) Signed-off-by: youkaichao <[email protected]> eebad39
- [v1] Refactor KVCacheManager for more hash input than token ids (#10507) Signed-off-by: rickyx <[email protected]>... 97814fb
- support bitsandbytes quantization with qwen model (#10549) Signed-off-by: Ubuntu <[email protected]> 948c859
- [Core] remove temporary local variables in LLMEngine.__init__ (#10577) Signed-off-by: Russell Bryant <rbryant@redhat... 28598f3
- [V1] EngineCore supports profiling (#10564) Signed-off-by: Abatom <[email protected]> d345f40
- [bugfix] fix cpu tests (#10585) Signed-off-by: youkaichao <[email protected]> d559979
- [Bugfix][Frontend] Update Llama Chat Templates to also support Non-Tool use (#10164) Signed-off-by: Travis Johnson <... 9195dbd
- [Core] Fix broken log configuration (#10458) Signed-off-by: Russell Bryant <[email protected]> ebda519
- [Misc] Add pynccl wrappers for all_gather and reduce_scatter (#9432) 978b397
- [core] gemma2 full context length support (#10584) Signed-off-by: youkaichao <[email protected]> 4aba6e3
- [Bugfix] Internal Server Error when tool_choice is incorrect. (#10567) Signed-off-by: Varun Shenoy <varun.vinayak.sh... 7d8ffb3
- [Model] Fix Baichuan BNB online quantization (#10572) Signed-off-by: Chen Wu <[email protected]> cfea9c0
- Update default max_num_batch_tokens for chunked prefill to 2048 (#10544) 02a43f8
- and 25 more ...
sixsixcoder pushed 1 commit to main sixsixcoder/vllm
- [misc] move functions to config.py (#10624) Signed-off-by: youkaichao <[email protected]> 05d1f8c
sixsixcoder pushed 3 commits to main sixsixcoder/vllm
- [torch.compile] force inductor threads (#10620) Signed-off-by: Jee Jee Li <[email protected]> 7c2134b
- [torch.compile] add warning for unsupported models (#10622) Signed-off-by: youkaichao <[email protected]> 6581378
- [misc] add torch.compile compatibility check (#10618) Signed-off-by: youkaichao <[email protected]> 25d806e
sixsixcoder pushed 3 commits to main sixsixcoder/vllm
- [Refactor][MISC] del redundant code in ParallelConfig.postinit (#10614) Signed-off-by: MengqingCao <[email protected]> 7ea3cd7
- [torch.compile] support encoder based models (#10613) Signed-off-by: youkaichao <[email protected]> 571841b
- [Doc] Add encoder-based models to Supported Models page (#10616) Signed-off-by: DarkLight1337 <[email protected]... a30a605
sixsixcoder pushed 1 commit to main sixsixcoder/vllm
- Support Cross encoder models (#10400) Signed-off-by: Max de Bayser <[email protected]> Signed-off-by: Max de Ba... 214efc2
sixsixcoder pushed 34 commits to main sixsixcoder/vllm
- [Benchmark] Add new H100 machine (#10547) aed0748
- [9/N] torch.compile LLM usage (#10552) Signed-off-by: youkaichao <[email protected]> 33e0a25
- [Minor] Fix line-too-long (#10563) Signed-off-by: Woosuk Kwon <[email protected]> 446c780
- [platforms] absorb worker cls difference into platforms folder (#10555) Signed-off-by: youkaichao <youkaichao@gmail.... a111d01
- [Bugfix] Fix Phi-3 BNB quantization with tensor parallel (#9948) Signed-off-by: Isotr0py <[email protected]> b6374e0
- Remove token-adding chat embedding params (#10551) Signed-off-by: Noam Gat <[email protected]> 11fcf0e
- [bugfix] fix full graph tests (#10581) Signed-off-by: youkaichao <[email protected]> db100c5
- [torch.compile] support all attention backends (#10558) Signed-off-by: youkaichao <[email protected]> eebad39
- [v1] Refactor KVCacheManager for more hash input than token ids (#10507) Signed-off-by: rickyx <[email protected]>... 97814fb
- support bitsandbytes quantization with qwen model (#10549) Signed-off-by: Ubuntu <[email protected]> 948c859
- [Core] remove temporary local variables in LLMEngine.__init__ (#10577) Signed-off-by: Russell Bryant <rbryant@redhat... 28598f3
- [V1] EngineCore supports profiling (#10564) Signed-off-by: Abatom <[email protected]> d345f40
- [bugfix] fix cpu tests (#10585) Signed-off-by: youkaichao <[email protected]> d559979
- [Bugfix][Frontend] Update Llama Chat Templates to also support Non-Tool use (#10164) Signed-off-by: Travis Johnson <... 9195dbd
- [Core] Fix broken log configuration (#10458) Signed-off-by: Russell Bryant <[email protected]> ebda519
- [Misc] Add pynccl wrappers for all_gather and reduce_scatter (#9432) 978b397
- [core] gemma2 full context length support (#10584) Signed-off-by: youkaichao <[email protected]> 4aba6e3
- [Bugfix] Internal Server Error when tool_choice is incorrect. (#10567) Signed-off-by: Varun Shenoy <varun.vinayak.sh... 7d8ffb3
- [Model] Fix Baichuan BNB online quantization (#10572) Signed-off-by: Chen Wu <[email protected]> cfea9c0
- Update default max_num_batch_tokens for chunked prefill to 2048 (#10544) 02a43f8
- and 14 more ...
sixsixcoder opened a pull request on vllm-project/vllm
[Model] Added GLM-4 series model support vllm==0.6.4
# Overview This update adds [GLM-4](https://huggingface.co/THUDM/glm-4-9b-chat-hf) series text model support for vllm==0.6.4, which is different from GLM-4v https://github.com/vllm-project/vllm/pull/9...
sixsixcoder pushed 1 commit to glm-4 sixsixcoder/vllm
- Added GLM-4 series model support vllm==0.6.4 4f84afe
sixsixcoder pushed 71 commits to glm-4 sixsixcoder/vllm
- [4/N][torch.compile] clean up set_torch_compile_backend (#10401) Signed-off-by: youkaichao <[email protected]> 51bb12d
- [VLM] Report multi_modal_placeholders in output (#10407) Signed-off-by: Linkun Chen <[email protected]> c7dec92
- [Model] Remove redundant softmax when using PoolingType.STEP (#10415) 01aae1c
- [Model][LoRA]LoRA support added for glm-4v (#10418) Signed-off-by: B-201 <[email protected]> 5be4e52
- [Model] Remove transformers attention porting in VITs (#10414) Signed-off-by: Isotr0py <[email protected]> e7ebb66
- [Doc] Update doc for LoRA support in GLM-4V (#10425) Signed-off-by: B-201 <[email protected]> 4186be8
- [5/N][torch.compile] torch.jit.script --> torch.compile (#10406) Signed-off-by: youkaichao <[email protected]> 7851b45
- [Doc] Add documentation for Structured Outputs (#9943) Signed-off-by: ismael-dm <[email protected]> 31894a2
- Fix open_collective value in FUNDING.yml (#10426) Signed-off-by: Andrew Nesbitt <[email protected]> 4f686d1
- [Model][Bugfix] Support TP for PixtralHF ViT (#10405) Signed-off-by: mgoin <[email protected]> 281cc4b
- [Hardware][XPU] AWQ/GPTQ support for xpu backend (#10107) Signed-off-by: yan ma <[email protected]> 6b2d25e
- [Kernel] Explicitly specify other value in tl.load calls (#9014) Signed-off-by: Angus Wang <[email protected]> c2170a5
- [Kernel] Initial Machete W4A8 support + Refactors (#9855) Signed-off-by: Lucas Wilkinson <[email protected]> 96d999f
- [3/N][torch.compile] consolidate custom op logging (#10399) Signed-off-by: youkaichao <[email protected]> a03ea40
- [ci][bugfix] fix kernel tests (#10431) Signed-off-by: youkaichao <[email protected]> 2298e69
- [misc] partial prefix & random input generation benchmark (#9929) Signed-off-by: rickyx <[email protected]> 90a6c75
- [ci/build] Have dependabot ignore all patch update (#10436) We have too many dependencies and all patch updates can ... 284203f
- [Bugfix]Fix Phi-3 BNB online quantization (#10417) Signed-off-by: Jee Jee Li <[email protected]> 7eb719d
- [Platform][Refactor] Extract func `get_default_attn_backend` to `Platform` (#10358) Signed-off-by: Mengqing Cao <cmq... 8c1fb50
- Add openai.beta.chat.completions.parse example to structured_outputs.rst (#10433) 74f8c2c
- and 51 more ...
sixsixcoder pushed 72 commits to main sixsixcoder/vllm
- [Bugfix] Ignore ray reinit error when current platform is ROCm or XPU (#10375) Signed-off-by: Hollow Man <hollowman@... 47826ca
- [4/N][torch.compile] clean up set_torch_compile_backend (#10401) Signed-off-by: youkaichao <[email protected]> 51bb12d
- [VLM] Report multi_modal_placeholders in output (#10407) Signed-off-by: Linkun Chen <[email protected]> c7dec92
- [Model] Remove redundant softmax when using PoolingType.STEP (#10415) 01aae1c
- [Model][LoRA]LoRA support added for glm-4v (#10418) Signed-off-by: B-201 <[email protected]> 5be4e52
- [Model] Remove transformers attention porting in VITs (#10414) Signed-off-by: Isotr0py <[email protected]> e7ebb66
- [Doc] Update doc for LoRA support in GLM-4V (#10425) Signed-off-by: B-201 <[email protected]> 4186be8
- [5/N][torch.compile] torch.jit.script --> torch.compile (#10406) Signed-off-by: youkaichao <[email protected]> 7851b45
- [Doc] Add documentation for Structured Outputs (#9943) Signed-off-by: ismael-dm <[email protected]> 31894a2
- Fix open_collective value in FUNDING.yml (#10426) Signed-off-by: Andrew Nesbitt <[email protected]> 4f686d1
- [Model][Bugfix] Support TP for PixtralHF ViT (#10405) Signed-off-by: mgoin <[email protected]> 281cc4b
- [Hardware][XPU] AWQ/GPTQ support for xpu backend (#10107) Signed-off-by: yan ma <[email protected]> 6b2d25e
- [Kernel] Explicitly specify other value in tl.load calls (#9014) Signed-off-by: Angus Wang <[email protected]> c2170a5
- [Kernel] Initial Machete W4A8 support + Refactors (#9855) Signed-off-by: Lucas Wilkinson <[email protected]> 96d999f
- [3/N][torch.compile] consolidate custom op logging (#10399) Signed-off-by: youkaichao <[email protected]> a03ea40
- [ci][bugfix] fix kernel tests (#10431) Signed-off-by: youkaichao <[email protected]> 2298e69
- [misc] partial prefix & random input generation benchmark (#9929) Signed-off-by: rickyx <[email protected]> 90a6c75
- [ci/build] Have dependabot ignore all patch update (#10436) We have too many dependencies and all patch updates can ... 284203f
- [Bugfix]Fix Phi-3 BNB online quantization (#10417) Signed-off-by: Jee Jee Li <[email protected]> 7eb719d
- [Platform][Refactor] Extract func `get_default_attn_backend` to `Platform` (#10358) Signed-off-by: Mengqing Cao <cmq... 8c1fb50
- and 52 more ...
sixsixcoder created a comment on an issue on THUDM/GLM-4-Voice
Thank you for your interest. A technical report will be released later with the detailed technical specifics.
sixsixcoder created a comment on an issue on THUDM/GLM-4-Voice
Long-text speech capability is currently limited; please watch for the upcoming technical report. At present, it performs best on short text.
sixsixcoder created a comment on an issue on THUDM/GLM-4-Voice
The demo currently provided does not yet support mixed text + speech input; this will be supported in the future.
sixsixcoder created a comment on an issue on THUDM/GLM-4-Voice
GLM-4-Voice supports tone conversion; you can try fine-tuning and constructing suitable context to achieve this effect.
sixsixcoder created a comment on an issue on THUDM/GLM-4-Voice
Yes, it is supported, but this repository only provides a simple usage demo; RAG requires developers to do their own secondary development.
sixsixcoder created a branch on sixsixcoder/GithubToLark
main - Receive information from GitHub and push it to Lark