To Build Lifelong AI, Teach It To Forget
To keep learning over time, AI systems must forget some information. A new monograph explains why that tradeoff is essential.
Based on research by Yueyang Liu (Rice Business), Saurabh Kumar (Stanford), Henrik Marklund (Stanford), Ashish Rao (Stanford), Yifan Zhu (Stanford), Hong Jun Jeon (Stanford), and Benjamin Van Roy (Stanford)
Key takeaways:
- A field-defining framework establishes a baseline for researchers to approach continual learning as a unified problem.
- To build lifelong AI, we must stop treating “catastrophic forgetting” purely as a flaw and start thinking of it as a necessary feature that lets systems operate within real-world computational limits.
- With limited capacity freed up by discarding obsolete data, agents must continuously explore their ever-changing environments, prioritizing “durable” knowledge over fleeting trends to maximize long-term success.
If you try to recall your absolute favorite meal from when you were 10 years old, you might come up blank. But that memory lapse isn’t a failure of your brain. That detail is probably just not useful to your life anymore.
For years, computer scientists have viewed artificial intelligence through a much stricter lens. When an AI system learns new tasks or information, it overwrites some of what it learned before, a phenomenon known as “catastrophic forgetting.” Historically, the field has treated that as a flaw and tried to patch it with workarounds like storing large amounts of old data.
But in a new monograph published in Foundations and Trends in Machine Learning, Yueyang Liu, assistant professor of operations management at Rice Business, and her Stanford University co-authors argue that forgetting is not simply a bug. In real-world settings, AI systems face hard limits on memory and computing power. They cannot, and should not, try to remember everything.
How AI continues to learn about you
Traditional machine learning often assumes that training eventually ends. In these models, the system learns a stable target — e.g., identifying cats in a photo — and is effectively done. Under that logic, forgetting looks like failure.
Liu and her co-authors start from a different premise: the world keeps changing, so useful AI must keep adapting. A recommendation system for music or movies, for example, cannot assume your tastes are fixed. If it stops learning, it gets worse. To stay useful, it has to keep exploring.
“As humans, we do not hold on to every detail from the past,” Liu says. “A continually learning system has to make similar choices about what is worth keeping.”
By the time a music app user turns 25, it may not matter much who their favorite singer was at 15. What matters more are the patterns likely to stay useful going forward.
The researchers call this “durable” knowledge: information that remains relevant over time. To preserve room for that knowledge, the system has to discard obsolete or low-value data. In this framework, forgetting is not a breakdown. It is part of how constrained intelligence stays adaptable.
How constrained AI survives
To test that idea, Liu and her colleagues ran simulations examining how different algorithms perform over time with limited computational resources.
One experiment used a modified version of Permuted MNIST, a common benchmark based on classifying handwritten digits. Here the environment kept changing, forcing the AI to face a stream of shifting tasks rather than focus on a stable assignment.
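For readers who want a concrete picture, here is a minimal sketch of a Permuted-MNIST-style task stream. In the standard benchmark, each task scrambles an image's pixels with its own fixed permutation. The function names, the fixed task length and the choice to draw a fresh permutation for every task are assumptions made for illustration; the article does not describe how the researchers modified the benchmark.

```python
import random


def make_permutation(num_pixels, rng):
    """Return a random reordering of pixel indices; one permutation defines one task."""
    perm = list(range(num_pixels))
    rng.shuffle(perm)
    return perm


def apply_permutation(image, perm):
    """Rearrange a flattened image's pixels according to perm."""
    return [image[i] for i in perm]


def task_stream(images, labels, num_pixels, task_length, num_steps, seed=0):
    """Yield (permuted_image, label, task_id) one step at a time.

    Assumption: the task changes every `task_length` steps, and each new
    task draws a fresh permutation (hypothetical schedule, not from the paper).
    """
    rng = random.Random(seed)
    perm = None
    for step in range(num_steps):
        if step % task_length == 0:
            perm = make_permutation(num_pixels, rng)
        idx = rng.randrange(len(images))
        yield apply_permutation(images[idx], perm), labels[idx], step // task_length


# Toy usage: one 4-pixel "image", with a new permutation every 3 steps.
for pixels, label, task_id in task_stream([[1, 2, 3, 4]], [7], 4, 3, 6):
    print(task_id, pixels, label)
```

The digit labels never change, only the pixel order does, so an agent has to keep relearning how inputs map to the same ten classes.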
The researchers compared three agents: a large memory agent that could retain 1 million past samples; a small memory agent limited to 1,000 samples; and a “reset” agent whose neural network was periodically wiped clean.
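The difference between these agents can be sketched in a few lines. The capacities below come from the article; everything else is an assumption for illustration, including the `ReplayMemory` class, the first-in-first-out eviction rule and the `should_reset` schedule. The article does not say how the agents chose which samples to drop or how often the reset agent was wiped.

```python
from collections import deque


class ReplayMemory:
    """Bounded memory of past samples.

    Assumption: eviction is first-in, first-out, so once the memory is
    full, adding a new sample forgets the oldest one.
    """

    def __init__(self, capacity):
        self.samples = deque(maxlen=capacity)

    def add(self, sample):
        self.samples.append(sample)

    def __len__(self):
        return len(self.samples)


# Capacities taken from the article's description of the two memory agents.
large_memory = ReplayMemory(capacity=1_000_000)
small_memory = ReplayMemory(capacity=1_000)


def should_reset(step, reset_period):
    """True on steps where a reset agent would reinitialize its network.

    `reset_period` is a hypothetical parameter; the article only says the
    network was wiped periodically.
    """
    return step > 0 and step % reset_period == 0


# Feed both memories the same stream of 1,500 samples.
for sample_id in range(1_500):
    large_memory.add(sample_id)
    small_memory.add(sample_id)

print(len(large_memory), len(small_memory))
```

After 1,500 samples the large memory still holds all of them, while the small memory has already discarded the oldest 500.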
The results challenge some of the field’s usual assumptions. On tasks that never returned, the small memory agent performed just as well as the large memory agent, showing no lasting benefit to hanging on to one-off data. Even the reset agent could keep up when recurring tasks lasted long enough for it to relearn what it needed.
Most strikingly, when the researchers imposed stricter computing limits, larger-memory systems became less flexible. They suffered a “loss of plasticity,” growing too rigid to absorb new information well.
“If a system tries to preserve everything, it can lose the capacity to adapt,” Liu says. “Under real constraints, selective forgetting is often what keeps learning possible.”
Additional simulations showed something similar in recommendation-like environments. Agents that prioritized long-lasting patterns over fleeting trends earned stronger rewards over time.
How to think about continual learning
The research offers a more unified way to think about continual learning, a field that has often been split into narrower problems like memory retention, fast relearning or computational efficiency. Liu and her co-authors argue that these tradeoffs belong inside a single framework: maximizing long-term performance under real resource limits.
That shift also changes the meaning of catastrophic forgetting. Forgetting all past information is not ideal. But forgetting nonrecurring or low-value information may be exactly what helps a system keep learning over a long lifetime.
The framework does not solve every practical challenge. Exact mathematical solutions remain difficult in messy real-world environments, and additional capabilities still require memory and computing power that many systems do not have. But the paper establishes a clearer baseline for future work.
For researchers building lifelong AI, that means the goal may be less like perfect recall and more like good judgment. A smart system does not need to remember every detail from years ago. It needs to keep what still matters and let the rest go, whether that is an old data point or your favorite meal when you were 10.
Written by Scott Pett
“Continual Learning as Computationally Constrained Reinforcement Learning,” Foundations and Trends in Machine Learning (2025).