In chats that run long enough on ChatGPT, you'll see it begin to confuse prompts with responses, and eventually confuse both with its system prompt. I suspect this problem is widespread across LLMs.
In Gemini chat I find that you should avoid continuing a conversation after a wrong or seriously flawed answer. Rather than sending a new message, it's better to edit the previous prompt so that it comes up with a better answer in the first place.
The key with Gemini is to migrate to a new chat once it makes a single dumb mistake. It's a very strong model, but once it steps in the mud, you'll lose your mind trying to recover it.
Delete the bad response, ask it for a summary or to update [context].md, then start a new instance.
Makes me wonder whether, during training, LLMs are asked to tell whether they've written something themselves or not. It should be quite easy: ask the LLM to produce many continuations of a prompt, mix them with many others produced by humans, and then ask the LLM to tell them apart. This should be possible by introspecting on the hidden layers and comparing with the provided continuation. I believe Anthropic has already demonstrated that models have partially developed this capability, but it seems straightforward and useful to train it explicitly.
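For what it's worth, the procedure you describe can be sketched in a few lines. This is a toy illustration, not a real LLM: synthetic feature vectors stand in for hidden-layer activations (with a small, assumed mean shift between model-written and human-written continuations), and a tiny logistic-regression probe is trained to tell the two classes apart.

```python
import random
import math

random.seed(0)

DIM = 8  # dimensionality of the stand-in "hidden state"

def synth_features(is_model: bool) -> list[float]:
    # Assumption for the toy: model-written text produces activations
    # with a slight mean shift relative to human-written text.
    shift = 0.5 if is_model else -0.5
    return [random.gauss(shift, 1.0) for _ in range(DIM)]

# Mixed dataset: half "model" continuations (label 1), half "human" (label 0).
data = [(synth_features(m), 1.0 if m else 0.0)
        for m in [True, False] * 200]
random.shuffle(data)

# Train a logistic-regression probe with plain stochastic gradient descent.
w = [0.0] * DIM
b = 0.0
lr = 0.1
for _ in range(50):
    for x, y in data:
        z = sum(wi * xi for wi, xi in zip(w, x)) + b
        p = 1.0 / (1.0 + math.exp(-z))       # predicted P(model-written)
        g = p - y                            # gradient of log loss w.r.t. z
        w = [wi - lr * g * xi for wi, xi in zip(w, x)]
        b -= lr * g

# Evaluate: how often does the probe identify the author correctly?
correct = sum((sum(wi * xi for wi, xi in zip(w, x)) + b > 0) == (y == 1.0)
              for x, y in data)
accuracy = correct / len(data)
```

With even a modest activation shift the probe separates the two classes well above chance, which is the whole training signal the comment proposes; in a real setup the features would be actual hidden states and the "probe" could be the model itself.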
Isn't that something different? If I prompt an LLM to identify the speaker, that's different from keeping track of the speaker while processing a different prompt.
At work, where LLM-based tooling is being pushed haaard, I'm amazed every day that developers don't know, let alone intuit as second nature, this and other emergent behaviors of LLMs. But seeing that same lack here on HN, under an article on the front page, boggles my mind. The future really is unevenly distributed.
Author here. Interesting to hear. I generally start a new chat for each interaction, so I've never noticed this in the chat interfaces, only with Claude via Claude Code. But I guess my sessions there do get much longer, so maybe I'm wrong that it's a harness bug.
Yes, and with very long chats, you'll see it even forget how to do things like make tool calls, or how to respond at all. I've had ChatGPT reply with raw JSON, regurgitate an earlier prompt, reply with a single newline, regurgitate information from a completely different chat, reply in a foreign language, and more.
Things get really wacky as it approaches decoherence.
Yeah, the raw JSON (in my case) is the result of a failed tool call: it was trying to generate an image. With thinking models, you can observe the degeneration of its understanding of image tool calls over the lifetime of a chat. It eventually puzzles over where images are supposed to be emitted, how it's supposed to write text, whether it's allowed to provide commentary, and eventually it gets all of it wrong. The same happens with file citations (in projects) and web search calls.
It makes sense. It's all probabilistic, and it all gets fuzzy as garbage accumulates in the context. User messages and the system prompt go through the same network of math as the model's thinking and responses.