>>94869658
Okay, so the whole tech pipeline goes like this:
first order of business, you need a camera and a piece of software that converts the real-time video feed of your face into facial data, i.e. a stream of parametrized information on what your eyes, nose, head, etc. are doing and where everything is positioned.
It's possible to do the video-feed-to-facial-data step on the PC side in VTube Studio with a webcam, but most serious VTubers use an iPhone's built-in face tracking (ARKit) because it's better.
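To make "a stream of parametrized information" concrete, here's a toy sketch of what one frame of facial data might look like. The field names and ranges are made up for illustration; real trackers like ARKit emit dozens of named "blendshape" values, each normalized to 0..1, plus head rotation:

```python
from dataclasses import dataclass

# Hypothetical shape of one frame of facial data. Field names are
# illustrative, not any real tracker's actual schema.
@dataclass
class FaceFrame:
    eye_open_left: float   # 0 = closed, 1 = fully open
    eye_open_right: float
    mouth_open: float
    head_yaw: float        # head turn, left/right
    head_pitch: float      # head tilt, up/down

# One frame: eyes mostly open, mouth nearly closed, head roughly centered
frame = FaceFrame(0.9, 0.85, 0.1, 0.05, -0.02)
print(frame.eye_open_right)  # → 0.85
```

The tracker spits out one of these every frame, and everything downstream is just math on these numbers.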
Next: a VTuber model is essentially a bunch of PNGs stacked on top of each other. "Movement" of the model is really movement, deformation, and opacity changes in those layers; think moving stuff around in Photoshop, but in real time at 60 fps. Every single PNG layer has a bunch of parameters, e.g. the right eye might have a parameter called "open/closed" that goes from 0 to 1.
Taking a bunch of PNGs (which are the pieces of a model: eyes, ears, and everything else separate) and parameterizing them for movement is called rigging, and it's part of making a model.
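A minimal sketch of the "stacked PNGs driven by parameters" idea, with every name made up and a single scalar standing in for an image buffer (real rigs also bind position, rotation, and mesh deformation to parameters, not just opacity):

```python
# Each "layer" is a pixel value plus the name of the parameter that
# drives its opacity (None = always fully opaque).
def composite(layers, params):
    """Alpha-blend layers bottom-to-top; each layer's opacity comes
    from the parameter it is bound to (1.0 if unbound)."""
    out = 0.0  # stand-in for the canvas; a scalar keeps it simple
    for layer in layers:
        alpha = params.get(layer["param"], 1.0)
        out = out * (1 - alpha) + layer["pixel"] * alpha
    return out

layers = [
    {"pixel": 0.2, "param": None},             # face base, always drawn
    {"pixel": 1.0, "param": "eye_open_right"}, # open-eye layer on top
]
print(composite(layers, {"eye_open_right": 1.0}))  # eye open → 1.0
print(composite(layers, {"eye_open_right": 0.0}))  # eye closed → 0.2
```

Run that every frame at 60 fps with fresh parameter values and you've got the core of the rendering loop.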
Conversion from raw facial data to model parameters is called tracking, and it happens in real time. A draw engine then renders the model and sends the resulting video stream to OBS.
When I say "Hololive uses an in-house solution," I mean that tracking and rendering are done inside a proprietary piece of software made by Cover that's not available to outsiders. Indies, on the other hand, usually use VTube Studio, which is a free (albeit, iirc, also proprietary) program you can download off Steam.