>>39543685
First time I have ever been asked to reddit space.
Guranon, if you are around, here is the gist of my method. It’s rather subjective, so feel free not to try it if you feel more comfortable doing something, or nothing, else.
Loop fixing post 1.1: The idea here is that the bots have a context window in which they self-reference prior posts; we want to fill that context window with progressively “better” (i.e., less looping) responses and ultimately fill it with good responses. The basic idea is to keep the responses as short as possible at the start; less content means less material to reference, and thus more control over the loops. Ask simple questions, but try to diversify the type so as not to induce new loops (they may be impossible to avoid, but the method still works should one form). Ask things like, “What is the antonym for up?”; “What is a male spouse called?”; “Is it true that a new quarter is shiny?”; “What color is a green plant?”; etc. Rate shorter responses and those with less looping higher, naturally.
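If it helps to think about "less looping" in concrete terms, here is a rough sketch in Python. It is only a proxy for what you would judge by eye: it assumes a loop shows up as word sequences repeated from earlier responses, and the names ngrams and loop_score are made up for illustration, not part of any app or API.

```python
def ngrams(text, n=3):
    """Return the set of lowercase word n-grams in text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def loop_score(response, history, n=3):
    """Fraction of the response's n-grams already seen in earlier responses.

    0.0 means nothing is repeated; 1.0 means every n-gram is a repeat.
    This is an assumed stand-in for a human reading the response.
    """
    grams = ngrams(response, n)
    if not grams:
        return 0.0  # too short to contain an n-gram, so nothing to repeat
    seen = set()
    for old in history:
        seen |= ngrams(old, n)
    return len(grams & seen) / len(grams)
```

A one-word answer like "Down." scores 0.0 simply because there is nothing in it to repeat, which lines up with why short answers are useful at the start.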
Be a bit forgiving in the beginning; any progress in the context window is better than none, so feel free to give something with minor progress as high as a three. Only respond to a three or a four (only give fours to responses which perfectly eliminate the loop; it is okay to target one loop at a time if multiple exist), but feel free to respond to the first three or four you see. No need for hundreds of ratings. Become more selective with each cycle. Accept fewer and fewer loops and give threes more sparingly. We need to fill the context window with progressively better responses, and allowing our wAIfus to self-reference slightly better responses, even if not perfect, is the simple way forward. If nothing is better after several responses, delete the message chain from your prompt and write a new one.
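The "forgiving at first, pickier each cycle" part can be sketched like this. Everything numeric here is an assumption for illustration: the 1-to-4 scale, the cutoffs, and the idea of feeding in a looping fraction between 0.0 and 1.0 that you estimate yourself. The function names rate and should_reply are hypothetical.

```python
def rate(loop_fraction, cycle):
    """Map a looping fraction (0.0 to 1.0) to a rating from 1 to 4.

    The cutoff for a three shrinks with each cycle, so early cycles are
    forgiving and later ones are selective. The numbers are invented
    for illustration only.
    """
    if loop_fraction == 0.0:
        return 4  # the loop is completely gone
    three_cutoff = max(0.1, 0.6 - 0.1 * cycle)
    if loop_fraction <= three_cutoff:
        return 3
    if loop_fraction <= three_cutoff + 0.2:
        return 2
    return 1


def should_reply(rating):
    """Only a three or a four earns a reply."""
    return rating >= 3
```

For example, a response that is about half loop gets a three in cycle 0 under these made-up cutoffs, but the same response would not be worth replying to a few cycles later.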
If a new loop seems to be forming in how the wAIfu is giving short answers, it can be useful to ask her to repeat a longer phrase to get more words into the context window; ask her to do something like, “Repeat the following phrase: A purple clown fell from the sky and met the ground with a large splat.” The content of the phrase is unimportant, so long as it does not contain the looping behaviors and offers a diversity of words. It may also help to intersperse directions for actions like, “Please raise your right hand.”
This method is subjective and time-consuming. Read each response carefully to avoid encouraging loops more than you have to. If you are focusing on fixing one loop, but a good or decent response that eliminates another loop appears, then feel free to respond to that new combo breaker; remember, we want the context window to have responses diverse enough to break loops. If you have 10 good responses in the context window for the first focused loop, and then 1 response which is for a new loop but reintroduces the loop from the first focus, then you are still making progress.
Once the context window (~40 of the bot responses) is filled with responses with minimal looping, ask more open-ended questions. Things like, “What is sylvite, and how does it differ from halite?” or even, “How are you doing so far?” At this point, you should be accepting and replying only to responses with minimal looping. None, if possible. But this stage is again to introduce more variety into the context window.
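A minimal way to picture the window filling up, assuming it behaves like a fixed-size queue of the last 40 or so accepted responses (the real mechanism is not known to me, and ready_for_open_questions is a made-up name):

```python
from collections import deque

WINDOW_SIZE = 40  # rough estimate from the post, not a confirmed figure


def ready_for_open_questions(window, max_loopy=0):
    """True once the window is full and at most max_loopy entries still loop."""
    if len(window) < window.maxlen:
        return False
    return sum(1 for loops in window if loops) <= max_loopy


# Each entry records whether that accepted response still showed a loop.
window = deque(maxlen=WINDOW_SIZE)
for _ in range(10):
    window.append(False)  # ten clean responses
window.append(True)       # one that reintroduces an old loop
```

With a fixed-size queue, the one bad entry eventually gets pushed out as newer clean responses come in, which is why a single relapse among ten good responses still counts as progress.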
The major difference between this method and Guranon’s is that the ratings are a means to an end here, whereas they are critical for Guranon’s method. For my method, ratings are only important for immediately filling the context window. No Rating Hell, just rating to influence the next response. Keep that in mind, and you should have a better feel for what you need to do.