Long-context reasoning remains challenging for LLMs: performance degrades as context grows, and inputs beyond the context window are difficult to handle effectively. MemAgent introduced an RNN-like chunk-by-chunk memory workflow, but two practical limitations remain: memory can explode due to indiscriminate updates, and the workflow lacks an early-exit mechanism.
We propose GRU-Mem, a gated recurrent memory framework with two text-controlled gates: an update gate (UG) that decides whether the memory should be updated at each step, and an exit gate (EG) that decides whether the recurrent loop should terminate once sufficient evidence has been collected. The model is trained end-to-end with explicit rewards for update and exit behaviors.
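The gated recurrent loop described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `model_step` is a hypothetical stand-in for a single LLM call that emits a candidate memory plus the two text-controlled gate decisions, and the toy heuristic inside it exists only to make the sketch runnable.

```python
def model_step(memory: str, chunk: str) -> tuple[str, bool, bool]:
    """Hypothetical stand-in for one LLM call over (memory, chunk).

    Returns (candidate_memory, update_gate, exit_gate). A real system
    would parse these gate decisions from the model's text output.
    """
    # Toy heuristic: update when the chunk contains "evidence";
    # exit once two evidence snippets have been accumulated.
    update = "evidence" in chunk
    candidate = (memory + " | " + chunk) if update else memory
    exit_loop = candidate.count("evidence") >= 2
    return candidate, update, exit_loop

def gru_mem(chunks: list[str]) -> str:
    """Chunk-by-chunk recurrence with update (UG) and exit (EG) gates."""
    memory = ""
    for chunk in chunks:
        candidate, update_gate, exit_gate = model_step(memory, chunk)
        if update_gate:      # UG: overwrite memory only when useful,
            memory = candidate  # preventing indiscriminate growth
        if exit_gate:        # EG: stop early once evidence suffices,
            break            # skipping the remaining chunks
    return memory
```

The two gates directly target the two limitations named above: the UG suppresses indiscriminate memory updates, and the EG provides the early-exit path that vanilla MemAgent lacks.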
Across diverse long-context tasks, GRU-Mem improves both effectiveness and efficiency over vanilla MemAgent, achieving up to a 400% inference speedup in selected settings.