Found what was wrong. I had mixed precision set to fp16 in the accelerate config, but bf16 in the training script, so some values got destroyed during conversion.
It's so broken that supermerger shows negative numbers for similarity, which should not be possible.
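
For anyone wondering why mixing the two matters, here is a rough sketch in plain Python of how differently fp16 and bf16 treat the same number. It is not what accelerate or the training script does internally, and the post doesn't say exactly which conversion mangled the weights. The helpers `to_fp16` and `to_bf16` are made up for illustration, and the bf16 one simply truncates a float32 instead of rounding the way real hardware usually does.

```python
import math
import struct

def to_fp16(x: float) -> float:
    """Round x to IEEE half precision (fp16); out-of-range values become +/-inf."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses to pack values beyond the fp16 range; treat that as overflow to inf
        return math.copysign(math.inf, x)

def to_bf16(x: float) -> float:
    """Simulate bfloat16 by keeping only the top 16 bits of the float32 encoding.

    Assumption: plain truncation, chosen to keep the sketch short.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

# Same inputs, two 16-bit formats: fp16 has more mantissa bits but a much
# narrower exponent range, bf16 keeps float32's range with a coarser mantissa.
for value in (70000.0, 1e-8, 1.001):
    print(value, "fp16:", to_fp16(value), "bf16:", to_bf16(value))
```

Large values overflow to inf in fp16 but survive in bf16, tiny values flush to zero in fp16, and bf16 loses fine detail near 1.0 that fp16 keeps. So a value that is fine in one format can come out wrecked in the other.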