>>2932251
This is probably doable. Not in the sense of AGI, but I can imagine retrofitting GPT-3 to generate answers from question input, then some GAN model that learns a representation of Ina's voice and converts the generated text to speech.
The real problem would be manually annotating/transcribing all those clips into a usable dataset, which I don't think is feasible unless you use Mechanical Turk or somehow get all the takodachis to put in the labor to create the annotations.
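
For a sense of what the annotation step would have to produce, here's a minimal sketch assuming the voice model trains on a pipe-delimited `clip_id|transcript` manifest (similar in spirit to the LJSpeech metadata layout). The function name `build_manifest` and the clip IDs are made up for illustration, not taken from any real tool or dataset.

```python
def build_manifest(transcripts):
    """Turn {clip_id: transcript} into 'clip_id|transcript' lines.

    Assumption: one line per audio clip, pipe-delimited.
    Clips whose transcript is empty after cleanup are skipped.
    """
    lines = []
    for clip_id in sorted(transcripts):
        # Collapse runs of whitespace left over from manual typing.
        text = " ".join(transcripts[clip_id].split())
        if not text:
            continue  # nothing usable was transcribed for this clip
        # Pipes would break the delimiter, so replace them with spaces.
        lines.append(f"{clip_id}|{text.replace('|', ' ')}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical clip IDs and text, only to show the output shape.
    example = {
        "clip_0002": "  hello   there ",
        "clip_0001": "wah",
        "clip_0003": "   ",
    }
    print(build_manifest(example))
```

Every line in that file is one clip somebody had to listen to and type out by hand, which is where the labor problem comes from.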