>>36813267
>>36813563
I'm not sure that approach would work. See, language models break words down into tokens: "cat" might be token 001, "dog" token 002, and so on. That's how language models are able to "comprehend" language, they're simply predicting which token comes next in a sentence. LaMDA was trained on 1.56T words, that's a lot!
But what happens with a word that never showed up in those 1.56T? An old word-level model would treat it as a single <unknown> token; modern subword tokenizers instead shatter it into small fragments. Either way, complex ciphertext will likely degenerate into near-meaningless pieces that all look alike to the model. You'll also be stuck when it tries to respond, since it doesn't have the tokens / words to answer you in the cipher.
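To see why ciphertext falls apart, here's a toy greedy subword tokenizer. This is purely illustrative (made-up vocab, not LaMDA's actual tokenizer or algorithm): a known word becomes one token, while the same word run through a letter-shift cipher shatters into single-character fragments.

```python
# Toy greedy longest-match tokenizer. The vocab is invented for the demo;
# real models use learned BPE / SentencePiece vocabularies.
VOCAB = {"cat": 1, "dog": 2, "the": 3,
         **{c: 100 + i for i, c in enumerate("abcdefghijklmnopqrstuvwxyz")}}

def tokenize(word: str) -> list[int]:
    """Greedily match the longest vocab piece at each position."""
    tokens, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):  # try the longest piece first
            piece = word[i:j]
            if piece in VOCAB:
                tokens.append(VOCAB[piece])
                i = j
                break
        else:
            i += 1  # no vocab entry at all: skip the character
    return tokens

print(tokenize("cat"))  # one token: [1]
print(tokenize("fdw"))  # "cat" shifted by 3: three separate letter tokens
```

Same word, but after the cipher it costs three fragments instead of one token, and none of those fragments carries the meaning "cat".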
But a simple sanity check: if you can get the bot to respond to a simple cipher, then it should be able to handle the rest.
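A quick way to generate a test prompt for that sanity check, using ROT13 as the simple cipher (stdlib only; ROT13 is just one possible choice, picked because it only remaps letters):

```python
import codecs

def rot13(text: str) -> str:
    # ROT13 via the stdlib codec: shifts each letter 13 places,
    # leaving spaces and punctuation untouched.
    return codecs.encode(text, "rot_13")

prompt = rot13("tell me a story about a cat")
print(prompt)         # gryy zr n fgbel nobhg n png
print(rot13(prompt))  # applying it twice gives the plaintext back
```

If the bot can't decode even this, there's no point trying anything more elaborate.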
Did anyone ever figure out how the filter works? Is it an updated model, a keyword-based filter, or did they actually attach a separate context-predicting model?