>>59197052
>no emotion
>shitty enunciation
>change in accent
>stress syllables are incorrect for the accent
>speaking cadence is stilted and it is not like an actual human speaking
>fails to capture any breathiness, pauses, swallowing, or mouth sounds
>will not be able to capture the cuteness of a vtuber woman in her mid-20s
Try again AI bros, you'll get it soon enough. Next time hire some linguists and sound engineers to consult on how to fix your algorithm.
>>59197478
That speech-to-text live translator is bad, and anyone thinking otherwise is retarded. Fubuki, Patra, Nazuna, and several indie JP 2views have used the same on-screen overlay, and it's absolutely incomprehensible 97% of the time. The remaining 3% might let you follow along with basic conversational topics by flashing general topic markers, but you will have no understanding of what they're actually saying about them unless you have some knowledge of Japanese grammar and vocabulary.
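For anyone wondering why chained ASR + MT falls apart this badly: errors compound multiplicatively across pipeline stages, so even decent per-stage accuracy collapses end to end. Rough sketch (the per-stage numbers are made-up assumptions for illustration, not measurements of any real overlay):

```python
def chained_accuracy(stage_accuracies):
    """Probability a sentence survives every pipeline stage intact,
    assuming stage errors are independent."""
    acc = 1.0
    for a in stage_accuracies:
        acc *= a
    return acc

# Hypothetical per-stage numbers (assumed, not benchmarked):
asr_acc = 0.80  # ASR transcribes 80% of Japanese sentences correctly
mt_acc = 0.70   # MT renders 70% of *correct* transcripts usably

# Chained, the overlay only gets a sentence fully right about half the time:
print(round(chained_accuracy([asr_acc, mt_acc]), 2))  # -> 0.56
```

And that's being generous: fast casual vtuber speech, slang, and crosstalk push the ASR stage well below numbers like these, which is how you end up at "incomprehensible most of the time."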
It is still faster to do your reps than to hope these programmers can come up with anything of note in the next 15+ years. It's really fucking laughable at this current stage. Until you can show me voice samples of outputs that sound 100% like HanaKana, Miyuki Sawashiro, or Mizuki Nana... hell, even then there's still nothing to write home about, because the translations will be straight-up incorrect or horribly tonally inaccurate in some way. This is a curated demo and it's still barely scratching the surface of what full language conversion should be. I personally wouldn't be happy with this outcome, and people should expect more from AI if it's supposedly so good. Just being understood isn't enough if your software fails to capture speaking habits and quirks, which are a strength that vtubers can leverage way more than traditional streamers.