1) Using inference-cli with a large text splits the generation into multiple batches, but a random sound gets added at the start of the next batch.
2) Generation reads hyphenated compound word ...
Hi, awesome work!
I was wondering if we could use this method to build a powerful singing model as well?
Either text-to-singing or direct voice conversion?
Thanks
@lpscr Hi, we are currently converting the repo to a package-compatible form.
Maybe we could do local development temporarily and just wait for the above-mentioned key update (in order to have less ...