I'm still in the early stages of testing out the best approach here. I messed with Deforum, and it's a very nice way to script the settings, but its temporal consistency doesn't seem to be even as good as batch img2img with the right settings. I still have other ideas for getting temporal consistency, but they're more labor-intensive.
https://litter.catbox.moe/oxf83w.webm