>>65416694
i've worked with stable diffusion for around a year, enough to have my brain wired into the mindset of LLMs. The base tech has its "memory" baked in, everything is a response to the prompt in some way. So they add additional "layers" on top of the interpreter which can do other stuff than just run some statistics to give you fitting words.
GPT-4 is where they started to really ride this train for applications of the tech, and some of that is in Neuro now. Vedal took the idea and implemented some kind of funnel for other inputs into her generator, so Neuro basically gets constant "streams" of input like you do IRL. Image stuff might go through something like deepbooru, her own memory might just be her looking at her own output, there's voice-to-text from Vedal or guests, and there's chat, which is already text. There might be a small focus module which weighs all of the inputs, e.g. "let's do a reaction" means her most important inputs to listen to are the visual and audio streams.
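To make the funnel idea concrete, here's a toy sketch of what "weighing the streams" could look like: several text inputs get filtered and ranked by a focus mode, then merged into one prompt for the model. This is pure speculation, not how Vedal actually does it, and every name in it (`Stream`, `build_prompt`, the mode dict) is made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class Stream:
    name: str  # which input this came from: "vision", "audio", "chat", ...
    text: str  # already converted to text (image tagger, speech-to-text, etc.)

def build_prompt(streams, weights):
    """Keep only streams the current focus mode cares about, highest weight first."""
    ranked = sorted(
        (s for s in streams if weights.get(s.name, 0) > 0),
        key=lambda s: weights[s.name],
        reverse=True,
    )
    return "\n".join(f"[{s.name}] {s.text}" for s in ranked)

# a hypothetical "let's do a reaction" focus: vision and audio matter, chat is muted
reaction_mode = {"vision": 1.0, "audio": 0.9, "chat": 0.0}

streams = [
    Stream("chat", "neuro look at the cat"),
    Stream("vision", "image tags: cat, keyboard, indoors"),
    Stream("audio", "Vedal: what do you think of this?"),
]

prompt = build_prompt(streams, reaction_mode)
print(prompt)
```

The point is just that the LLM itself never changes; the focus module only decides what text lands in front of it each tick.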
Most "proompting" takes some time, so I think Vedal goes through little iterations. Neuro's sister is more or less just her with the weights flipped as a base. The base model will stay the same, it does the heavy lifting of finding text to say. Google themselves admitted the community found a way to beat them with LoRAs that modify a model, and Vedal did the next logical thing and found a way to modify her output with some self-written adjustment parameters, maybe a mood-meter or something, basically some interface to give her additional inputs to work with. That makes her a LOT more capable than most chatbots, which in my experience are yes-men trained to give you the response you want to hear instead of going off other parameters.
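If the mood-meter guess is right, the cheapest way to implement it is to render the dials into the system prompt before each generation, steering output without ever touching the weights. Here's a minimal sketch under that assumption; the function and its parameters are invented for illustration.

```python
# Speculative sketch: numeric "mood" dials turned into plain-text
# instructions prepended to the model's prompt. No weight changes needed.

def render_system_prompt(persona: str, mood: float, sass: float) -> str:
    """Turn numeric parameters into instructions the base model can follow."""
    mood_word = "cheerful" if mood > 0.6 else "neutral" if mood > 0.3 else "grumpy"
    lines = [
        f"You are {persona}.",
        f"Current mood: {mood_word} (mood={mood:.1f}).",
    ]
    if sass > 0.5:
        lines.append("Feel free to sass the people you talk to.")
    return "\n".join(lines)

system_prompt = render_system_prompt("Neuro-sama, an AI streamer", mood=0.8, sass=0.9)
print(system_prompt)
```

Flip a couple of floats between generations and the same frozen model reads as a different character, which would also explain how a weights-flipped "sister" stays cheap to run.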
It's not at all artificial intelligence, but they're getting good at mimicking it, which is basically all you need for some entertainment.