>>69837652
Cool model.
>very fried and not performing that well
It's easy to fuck up when finetuning because it trains everything, including things that you shouldn't touch. If the LR is too high, the text encoder, time embeds and conv layers will overfit / break easily.
With lora/lyco you train fewer parameters, so there's less opportunity to fuck up, and using --scale_weight_norms or train_norm helps prevent frying/overfitting.
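For reference, a lycoris run with both of those looks something like this (kohya sd-scripts train_network.py + LyCORIS; the algo, dims and paths here are just placeholders, tune them for your own setup):

accelerate launch train_network.py \
  --pretrained_model_name_or_path=base_model.safetensors \
  --train_data_dir=/path/to/dataset \
  --output_dir=/path/to/output \
  --network_module=lycoris.kohya \
  --network_dim=32 --network_alpha=16 \
  --network_args "algo=lokr" "train_norm=True" \
  --scale_weight_norms=1.0 \
  --learning_rate=1e-4 --max_train_epochs=10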
>How did you train your hll lycoris so that it retains old knowledge of artists while gaining new ones from a freshly collected dataset update?
For hll6.3-fluff I usually resumed from a previous version (a9 is resumed from a7, a9-eps is resumed from a9 for EF, etc.) because retraining from scratch would take too much time (1 epoch on 900k images takes >24 hours). So I just added the new images and resumed training for 1-2 epochs on a dataset with both the new and the old images.
The base was trained for 12-15 epochs on around 300k images, I don't remember exactly.
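Resuming is basically the same command pointed at the previous weights and the updated dataset, roughly (placeholder paths again, flag names from kohya sd-scripts):

accelerate launch train_network.py \
  --pretrained_model_name_or_path=base_model.safetensors \
  --network_module=lycoris.kohya \
  --network_weights=previous_version.safetensors \
  --train_data_dir=/path/to/old_plus_new_images \
  --output_dir=/path/to/output \
  --learning_rate=1e-4 --max_train_epochs=2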
>Should it be just continued training from what it already knows or is it better to do a fresh start?
If it's too fried, it's better to restart. I don't think you can recover a 'fried' model easily, though I heard that using a very high LR for a short time and then training normally may fix it sometimes.
But if he trained the TE, you have to understand that training the text encoder always breaks it a little and makes it dumber, and if it's broken too much you can't fix it. CLIP is shit, but it was trained on billions of entries on hundreds of GPUs; you can't reproduce that with a small dataset.
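If you're training a lora/lyco and don't want to risk the TE, either skip it entirely or give it a much lower LR than the unet; with the kohya command above that's just adding

--network_train_unet_only

or

--unet_lr=1e-4 --text_encoder_lr=1e-5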
Maybe he should wait for the new type of lora (DoRA) to be implemented in the lycoris repo and in a1111. It should have similar performance to full finetuning, and lora/dora is faster than a full finetune. Finetuning only makes sense if you are doing a significant change like retraining a 512x512 model to 1024x1024, not if you are just adding characters and artists.
https://arxiv.org/abs/2402.09353
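The rough idea from the paper, in simplified notation: the pretrained weight gets split into magnitude and direction, and the low-rank update is applied to the direction part only, i.e.

W' = m * (W0 + B@A) / ||W0 + B@A||_c

where W0 is the frozen pretrained weight, B@A is the usual lora update, ||.||_c is the per-column norm, and m is a trained magnitude vector. Learning magnitude and direction separately is why it's supposed to track full finetuning more closely than plain lora.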