SWivid/F5-TTS Events in 2024 - Ecosyste.ms: Timeline

@SWivid Also, about the new vocab from the dataset: I think it would be good to extend it with the vocab for Emilia_ZH_EN_pinyin, for example, so that if there are new symbols not in it, they can be added when training from the scrat...

WGS-note created a comment on an issue on SWivid/F5-TTS

October 24, 2024 9:07am

Thank you!

jpgallegoar created a comment on a pull request on SWivid/F5-TTS

October 24, 2024 9:06am

@SWivid This speeds up inference a lot, good idea.

jpgallegoar pushed 3 commits to main SWivid/F5-TTS

October 24, 2024 9:05am

Added audio hashing for faster inference 3d5be2a
Added audio hashing for faster inference a44435c
Merge pull request #249 from jpgallegoar/main Added audio caching for faster inference 40b2c85

jpgallegoar closed a pull request on SWivid/F5-TTS

October 24, 2024 9:05am

Added audio caching for faster inference

Added audio caching in utils_infer so if the same audio is used twice, it will not be transcribed again.
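
The idea described in this pull request can be sketched roughly as follows. This is only an illustration, not the code from the pull request: the names `_ref_text_cache`, `audio_digest`, and `transcribe_cached` are hypothetical, the cache is a plain in-memory dictionary, and the choice of MD5 over the raw audio bytes as the cache key is an assumption.

```python
import hashlib
from typing import Callable, Dict

# Hypothetical in-memory cache: audio digest -> transcription text.
_ref_text_cache: Dict[str, str] = {}


def audio_digest(audio_bytes: bytes) -> str:
    """Return a stable hex digest identifying the audio content (assumed key scheme)."""
    return hashlib.md5(audio_bytes).hexdigest()


def transcribe_cached(audio_bytes: bytes, transcribe: Callable[[bytes], str]) -> str:
    """Transcribe audio, reusing the stored text when the same bytes were seen before."""
    key = audio_digest(audio_bytes)
    if key not in _ref_text_cache:
        # Only the first request for this audio pays for transcription.
        _ref_text_cache[key] = transcribe(audio_bytes)
    return _ref_text_cache[key]
```

Here `transcribe` stands in for whatever speech recognition call is actually used; passing it as a parameter keeps the sketch independent of any particular ASR library.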

lpscr opened a pull request on SWivid/F5-TTS

October 24, 2024 9:04am

new update gradio finetune

Just another great update. Fixed some stuff. First, added creating the vocab from the dataset, and you can see ![image](https://github.com/user-attachments/assets/9784504d-b772-4369-b275-5e8dc9dd7d19) s...

jpgallegoar closed a pull request on SWivid/F5-TTS

October 24, 2024 9:04am

Added audio caching for faster inference

Added audio caching in utils_infer so if the same audio is used twice, it will not be transcribed again.

jpgallegoar opened a pull request on SWivid/F5-TTS

October 24, 2024 9:02am

Added audio caching for faster inference

Added audio caching in utils_infer so if the same audio is used twice, it will not be transcribed again.

RobinWitch starred SWivid/F5-TTS

October 24, 2024 8:56am

huutuongtu opened an issue on SWivid/F5-TTS

October 24, 2024 8:55am

Model generates audio with random content during inference

Hi, thanks for the great work. I trained the model on my own dataset (Vietnamese, 1 speaker, 1.5 hours, just for testing) for about 70k steps. In [this comment,](https://github.com/SWivid/F5-TTS/is...

qq547276542 starred SWivid/F5-TTS

October 24, 2024 8:49am

nomand starred SWivid/F5-TTS

October 24, 2024 8:49am

gzw518 starred SWivid/F5-TTS

October 24, 2024 8:48am

Banane-tech starred SWivid/F5-TTS

October 24, 2024 8:40am

teneous starred SWivid/F5-TTS

October 24, 2024 8:32am

kmn1024 created a comment on an issue on SWivid/F5-TTS

October 24, 2024 8:26am

Could the following happen? Suppose we have these examples with these frame lengths and durations:

ex1: 100, 2.0
ex2: 50, 1.0
ex3: 10, 0.2
ex4: 100, 2.0

Suppose frames_threshold is 101. The...
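
Since the comment is cut off, the scenario it goes on to describe is not known. As a rough aid for reading the numbers, here is a minimal sketch of one possible packing rule: greedy batching in the listed order, starting a new batch whenever adding the next example would exceed the frame threshold. The function name `greedy_frame_batches` is hypothetical, and the rule itself is an assumption, not necessarily how the repository's sampler works (it may sort or shuffle first, for instance).

```python
from typing import List, Sequence


def greedy_frame_batches(frame_lengths: Sequence[int], frames_threshold: int) -> List[List[int]]:
    """Group example indices into batches whose total frames stay within the threshold.

    Assumed rule: keep the given order and close the current batch as soon as the
    next example would push it over frames_threshold.
    """
    batches: List[List[int]] = []
    current: List[int] = []
    current_frames = 0
    for index, frames in enumerate(frame_lengths):
        if current and current_frames + frames > frames_threshold:
            batches.append(current)
            current, current_frames = [], 0
        current.append(index)
        current_frames += frames
    if current:
        batches.append(current)
    return batches


# Frame lengths of ex1..ex4 from the comment, with frames_threshold = 101.
batches = greedy_frame_batches([100, 50, 10, 100], 101)
print(batches)  # [[0], [1, 2], [3]]
```

Under this assumed rule, ex1 and ex4 each fill a batch alone, while ex2 and ex3 share one batch of only 60 frames, so batch sizes in frames and in seconds can differ noticeably.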

yyf110100 starred SWivid/F5-TTS

October 24, 2024 8:20am

trunks970 created a comment on an issue on SWivid/F5-TTS

October 24, 2024 8:19am

> Use lower case as we suggested in readme, or you are telling model to read letter by letter. Also check if reference audio uploaded correctly, will show waveform if so

done both unfortunately n...

kmn1024 created a comment on an issue on SWivid/F5-TTS

October 24, 2024 8:19am

What I mean to suggest is, suppose we have these examples with these durations: ex1, 100; ex2, 1; ex2, 1; ex2, 1

SWivid created a comment on a pull request on SWivid/F5-TTS

October 24, 2024 8:09am

@lpscr The `dev` branch is from this repo. The updates are mainly for fair comparison with other models trained on different training sets. As the training sets are of different sizes, academic comparison general...

dxhedi starred SWivid/F5-TTS

October 24, 2024 8:07am

ygyuan starred SWivid/F5-TTS

October 24, 2024 8:04am

boltomli starred SWivid/F5-TTS

October 24, 2024 8:03am

jpgallegoar created a comment on a pull request on SWivid/F5-TTS

October 24, 2024 8:01am

Thank you for the inputs, I also thought about the extra loading time. I will look into it and update.

SWivid created a comment on a pull request on SWivid/F5-TTS

October 24, 2024 8:00am

@jpgallegoar Hi, very cool chat! I just tested it; it may need a few tweaks with your help: 1. Will the ref_text be saved, so that the ASR pipeline is not called to transcribe again if ref_audio has not chang...
