Training on swarm status: I'm only using a 1B-parameter model, so it's kind of dumb, but you can tell it retained some knowledge. I can't get it to stop yapping, though. I think the finetuning fucked up the generation of end tokens or something. I can set a max number of tokens for it to generate, but it still tries to write a long 10-paragraph answer and just gets hard cut off at that token count.
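For what it's worth, one common cause of this (a guess, not confirmed for this run) is that the finetuning examples never ended with the end-of-sequence token, so the model never learned when to stop. A rough sketch of the two usual workarounds, in plain Python over token id lists: make sure every training example ends in EOS, and cut generated output at the first EOS if one shows up. `EOS_ID`, `append_eos`, and `truncate_at_eos` are made-up names for illustration, and the id value is a placeholder, so check your own tokenizer.

```python
from typing import List

# Hypothetical end-of-sequence token id; the real value depends on the tokenizer.
EOS_ID = 2


def append_eos(token_ids: List[int], eos_id: int = EOS_ID) -> List[int]:
    """Return a copy of a training example that ends with exactly one EOS token."""
    if token_ids and token_ids[-1] == eos_id:
        return list(token_ids)
    return list(token_ids) + [eos_id]


def truncate_at_eos(generated: List[int], eos_id: int = EOS_ID) -> List[int]:
    """Drop the first EOS token and everything after it from generated ids."""
    if eos_id in generated:
        return generated[: generated.index(eos_id)]
    return list(generated)


if __name__ == "__main__":
    example = [5, 6, 7]
    print(append_eos(example))            # training example now ends with EOS
    print(truncate_at_eos([5, 6, 2, 9]))  # anything after EOS is discarded
```

If the training data already has EOS at the end, the other thing worth checking is whether the EOS positions are being masked out of the loss, which can happen when the pad token is set to the same id as EOS.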