Ecosyste.ms: Timeline
Browse the timeline of events for every public repo on GitHub. Data updated hourly from GH Archive.
lpscr opened a pull request on SWivid/F5-TTS
quick fix device always cpu !
@SWivid something i miss here also video action all working great https://github.com/user-attachments/assets/ebf25212-2cc5-4570-b5ff-41b63a6f0f96
SWivid pushed 6 commits to main SWivid/F5-TTS
SWivid closed a pull request on SWivid/F5-TTS
new update gradio finetune
@SWivid just another great update, and fixed some stuff. First, added creating the vocab from the dataset, and you can see ![image](https://github.com/user-attachments/assets/9784504d-b772-4369-b275-5e8dc...
lpscr created a comment on a pull request on SWivid/F5-TTS
@SWivid update add mixed_precision option ![image](https://github.com/user-attachments/assets/880ef837-45c1-4c5f-9cb5-c92c9e85e8d6)
SWivid created a comment on a pull request on SWivid/F5-TTS
> if possible to merge 2 models weight . so not need the dataset English , Chinese ...
not sure if will work. a more possible solution is to do llama-adapter finetuning.
SWivid created a comment on an issue on SWivid/F5-TTS
@huutuongtu I would recommend a smaller model size. And just train longer, at least 200k updates though, to hear something reasonable, cuz we have no phoneme-level force-alignment (if you're int...
SWivid created a comment on an issue on SWivid/F5-TTS
> oral-to-oral llm chat app
Yes, in this way, a streamable tts is better √
lpscr created a comment on a pull request on SWivid/F5-TTS
> > Will the model then be able to speak three languages (including the newly finetuned one)? > > but you need to have Chinese and English included in finetuning dataset to avoid catastrophic fo...
SWivid pushed 1 commit to dev SWivid/F5-TTS
- finish eval dependencies; update infer_gradio with chat feature ba4b04b