>>84092001
>>84094853
SpeeD timestep schedule didn't beat lognorm(0,1) either, sigh.. Trained the model for 5000 steps with SpeeD, then stopped and continued from the 4000-step checkpoint using lognorm(0,1). The lognorm version just trains better, and the output clarity surprisingly resembles Flux (though with worse prompt comprehension), while models trained with SpeeD but without Change-Aware Weighting look a lot more like the constant-schedule ones.
>>84064303
>mean=-0.3, std=1
I've noticed that you're using lognorm(-0.3,1), but did you actually compare it to lognorm(0,1)? In theory, for diffusion specifically, a slight bias towards later timesteps should be beneficial, but is that actually so?
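For anyone following along, here's a minimal sketch of what I assume "lognorm(mean, std)" means here, i.e. the SD3-style logit-normal timestep sampler; sample_timesteps and its arguments are just illustrative names, and whether a negative mean counts as "later" timesteps depends on which end of [0,1] your schedule treats as pure noise.

import torch

def sample_timesteps(batch_size, mean=0.0, std=1.0, generator=None):
    # Logit-normal ("lognorm") timestep sampling: draw u ~ N(mean, std),
    # then squash through a sigmoid so t lands in (0, 1).
    # mean=0.0 concentrates samples around t=0.5; a negative mean
    # (e.g. -0.3) shifts mass toward smaller t.
    u = torch.randn(batch_size, generator=generator) * std + mean
    return torch.sigmoid(u)

# Quick comparison of the two parameterizations being discussed.
t_default = sample_timesteps(100_000, mean=0.0, std=1.0)
t_shifted = sample_timesteps(100_000, mean=-0.3, std=1.0)
print(t_default.mean().item(), t_shifted.mean().item())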