I thought it would be fine, because the basic structure of chunk inference was done by @RootingInLoad.
We could take whichever is best for the user experience and update the credits properly.
@kmn1024 @SWivid I ran it without the constraint, and here are the results.
```
Epoch 1/10: 67%|█████████████████████████████████████████████████████████████████████████▏ ...
```
In the several test cases I tried, it always emits non-existent characters at the beginning of the audio.
[zh_prompt.zip](https://github.com/user-attachments/files/17404857/zh_prompt.zip)
Given the model's...
@SWivid something I missed
Here is also a video of it in action, all working great.
https://github.com/user-attachments/assets/ebf25212-2cc5-4570-b5ff-41b63a6f0f96
BTW: when you have free time, please test it.