>>72235213
AdamW8bit, batch size 1, 800 steps, with other settings from https://rentry.org/CCC_Training
Still looks like shit, character recognition is low while everything else is clearly fucked. Does it just not like fp8? I have scheduler cycles at 30 while there were 20 epochs, is that it?
At this point I would have saved time just using the 1.5 hour settings
Or perhaps I'm using the wrong model to train on. I assumed using 3.0 base was the way to go. It does look a lot better on the samples given by the trainer using 3.0 base, but it still doesn't look good.
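For the scheduler cycles question, here's a rough sketch of what 30 cycles over 800 steps would do to the learning rate. This assumes the "scheduler cycles" setting is the cycle count of a cosine-with-hard-restarts schedule, which the post doesn't actually confirm, and that the 800 steps are split evenly across the 20 epochs. The function name and the zero warmup are made up for illustration; the formula is the usual cosine-with-restarts multiplier, not pulled from any particular trainer.

```python
import math

def cosine_restarts_factor(step, total_steps, num_cycles, warmup_steps=0):
    """LR multiplier in [0, 1] for a cosine schedule with hard restarts.

    Hypothetical helper for illustration; the trainer's real scheduler
    may differ.
    """
    if step < warmup_steps:
        # Linear warmup from 0 up to the base learning rate.
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    if progress >= 1.0:
        return 0.0
    # The fractional part of (num_cycles * progress) is the position
    # inside the current cycle; each cycle decays from 1 toward 0.
    cycle_pos = (num_cycles * progress) % 1.0
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * cycle_pos)))

total_steps = 800   # from the post
num_cycles = 30     # from the post
num_epochs = 20     # from the post

steps_per_cycle = total_steps / num_cycles   # length of one LR cycle
steps_per_epoch = total_steps / num_epochs   # assumes an even split

print(f"steps per cycle: {steps_per_cycle:.1f}")
print(f"steps per epoch: {steps_per_epoch:.1f}")
for step in (0, 10, 20, 30, 40):
    factor = cosine_restarts_factor(step, total_steps, num_cycles)
    print(f"step {step:3d}: lr multiplier {factor:.3f}")
```

If those assumptions hold, the LR restarts more often than once per epoch, so the restarts never line up with epoch boundaries. Whether that's what is wrecking the output is a separate question; the sketch only shows the mismatch.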