What if your AI agents could get better at their jobs while you sleep? That is no longer a hypothetical. In May 2026, Anthropic announced a feature called “dreaming” for Claude Managed Agents: a scheduled memory-curation process that reviews an agent’s past sessions, identifies recurring patterns, merges duplicate information, removes outdated entries, and surfaces cross-session insights that no single conversation could detect on its own. Self-improving AI agents have moved from research concept to production reality, and the early results are extraordinary.
Harvey, the AI-powered legal research platform, reported that task completion rates increased roughly 6x after deploying dreaming. Wisedocs, which uses AI to review complex medical documents, cut review time by 50 percent. These are not incremental gains. They are step-change improvements driven by agents that continuously refine their own knowledge base with no human intervention.
In this post, we break down exactly how dreaming works, what the real-world results tell us about the future of enterprise AI, and how you can apply these principles to your own agentic workflows today.
What Are Self-Improving AI Agents and How Does Anthropic’s Dreaming Work?
Traditional AI agents are stateless or only retain memory within a single session. When the conversation ends, the learning stops. Anthropic’s dreaming feature changes that fundamental limitation by adding a layer of asynchronous, offline reflection.
Here is how it works: after an agent completes its sessions throughout the day, a scheduled “dreaming” process runs in the background. It reviews all stored memory from recent sessions, identifies patterns such as repeated mistakes, team preferences, and converging workflows across multiple agents, then curates the memory store. Outdated information is pruned. Contradictions are resolved. Shared insights from a team of agents are surfaced and made available to the whole group.
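Anthropic has not published dreaming's internals, but the description above maps onto a recognizable pattern: an offline pass that deduplicates, prunes, and promotes. Here is a minimal Python sketch of that mental model; the data shapes, thresholds, and function names are all assumptions, not Anthropic's actual implementation.

```python
from datetime import datetime, timedelta

def dream(memory_entries, now=None, max_age_days=90, insight_threshold=3):
    """Illustrative offline curation pass over an agent's memory store.

    - prunes entries older than max_age_days
    - merges duplicate entries (same normalized text)
    - surfaces insights that recur across several independent sessions
    """
    now = now or datetime.now()
    cutoff = now - timedelta(days=max_age_days)

    merged = {}
    for entry in memory_entries:
        # Prune outdated information.
        if entry["timestamp"] < cutoff:
            continue
        key = entry["text"].strip().lower()
        if key in merged:
            # Merge duplicates, tracking which sessions reported them.
            merged[key]["sessions"].add(entry["session_id"])
        else:
            merged[key] = {"text": entry["text"],
                           "sessions": {entry["session_id"]}}

    curated = list(merged.values())
    # A note seen in enough independent sessions becomes a shared insight
    # available to the whole team of agents.
    insights = [e for e in curated if len(e["sessions"]) >= insight_threshold]
    return curated, insights
```

A real system would also resolve contradictions between entries and use semantic rather than exact-text matching, but the shape of the loop is the same: review everything, keep less, promote what recurs.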
Dreaming launched alongside two other Claude Managed Agents features that moved from research preview into public beta: “outcomes” and “multi-agent orchestration.” Outcomes lets you define a success rubric for the agent, and a separate grader agent evaluates output against that rubric independently, so it is not influenced by the main agent’s reasoning. When something falls short, the grader pinpoints the gap and the agent takes another pass. In Anthropic’s internal benchmarks, outcomes improved task success rates by up to 10 percentage points over standard prompting loops, with the largest gains on the hardest tasks.
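The outcomes loop described above is essentially a generate-grade-retry cycle with an independent evaluator. The sketch below illustrates that control flow; the function names and the shape of the grader's return value are hypothetical, not the Claude Managed Agents API.

```python
def run_with_outcomes(task, agent, grader, rubric, max_attempts=3):
    """Illustrative outcomes loop: a separate grader evaluates each
    attempt against a rubric, and identified gaps are fed back to the
    agent for another pass."""
    feedback = None
    output = None
    for _ in range(max_attempts):
        output = agent(task, feedback)
        # The grader sees only the output and the rubric, never the
        # main agent's reasoning, so its judgment stays independent.
        passed, gaps = grader(output, rubric)
        if passed:
            return output
        feedback = gaps  # pinpointed gaps guide the next attempt
    return output
```

The key design choice is separation: because the grader cannot see the agent's chain of reasoning, it cannot be talked into accepting a flawed answer.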
Together, memory, dreaming, and outcomes form a system where agents not only complete tasks but learn how to complete them better over time. That is a qualitative shift in what AI agents can do.
Real-World Results: What Dreaming Is Already Delivering for Enterprise AI Agents
The most striking aspect of Anthropic’s dreaming announcement is not the feature itself, but the production results from early adopters. In two verticals known for high-complexity, high-stakes document work, the gains are dramatic.
Harvey, the AI legal research assistant used by major law firms, saw a 6x increase in task completion rates after enabling dreaming. Legal research involves nuanced, multi-step reasoning across large corpora of case law, statutes, and precedent. An agent that has dreamed on prior sessions arrives at each new task with a refined understanding of the firm’s preferences, common pitfalls, and jurisdiction-specific nuances that would take a human associate months to absorb.
Wisedocs, which automates medical document review for insurance and healthcare providers, cut document review time in half. Medical documents are notoriously variable in structure, terminology, and format. Dreaming allows agents to build a richer, continuously updated mental model of document patterns, reducing the time spent on disambiguation and re-processing.
These results align with broader enterprise AI trends. According to Gartner’s 2026 prediction, 40 percent of enterprise applications will feature task-specific AI agents by end of year, up from less than 5 percent in 2025. As adoption scales, the gap between agents that can self-improve and those that cannot will become a significant competitive differentiator. Businesses that deploy multi-agent AI systems with shared learning will compound their advantages over time in ways that static deployments simply cannot match.
How to Apply Self-Improving AI Agents in Your Business Workflows
You do not need to be a Fortune 500 company to benefit from the principles behind dreaming. Here are practical steps you can take today to build more adaptive agentic workflows, whether or not you are using Claude Managed Agents directly.
Start with persistent memory. If you are building on any modern AI agent platform, enable session-persistent memory. Agents that can recall prior interactions, user preferences, and past errors are a significant step above purely stateless systems. Platforms like n8n, Lindy, and direct Claude API integrations all support memory configurations. See our guide on building AI agent workflows with persistent memory in n8n for a hands-on starting point.
Define your success criteria explicitly. The “outcomes” companion feature to dreaming works because success is precisely defined. Before deploying any agent, write a clear rubric: what does a correct output look like? What are the common failure modes? Even without an automated grader, having this rubric lets you evaluate agent performance and feed improvements back into your prompts and memory stores.
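A rubric is most useful when each criterion is named and individually checkable, so a failed evaluation tells you exactly which gap to fix. The criteria below are illustrative placeholders; substitute checks that match your own task.

```python
# A rubric expressed as named, checkable criteria (examples are
# illustrative; tailor them to your workflow).
rubric = {
    "cites_sources": lambda out: "[source:" in out,
    "under_word_limit": lambda out: len(out.split()) <= 200,
    "no_hedging_filler": lambda out: "it depends" not in out.lower(),
}

def evaluate(output, rubric):
    """Return the names of failed criteria, so gaps are explicit."""
    return [name for name, check in rubric.items() if not check(output)]
```

Even applied by hand once a week, a checklist like this turns "the agent seems off" into "the agent stopped citing sources," which is something you can actually fix in a prompt or memory store.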
Schedule regular memory audits. Whether your platform supports automated dreaming or not, you can implement a manual version: periodically review what your agents have logged, identify repeated errors, and update your system prompts or knowledge base accordingly. This creates a human-in-the-loop version of the same self-improvement loop.
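The manual audit can be as simple as counting recurring error messages in your agent logs and flagging anything frequent enough to deserve a prompt or knowledge-base update. The log format and threshold here are assumptions for illustration.

```python
from collections import Counter

def audit_errors(log_lines, threshold=3):
    """Manual 'dreaming': surface error patterns that recur often enough
    to warrant updating prompts or the knowledge base."""
    errors = Counter(
        line.split("ERROR:", 1)[1].strip()
        for line in log_lines
        if "ERROR:" in line
    )
    return {msg: count for msg, count in errors.items() if count >= threshold}
```

Running something like this weekly, then acting on the top offenders, closes the same loop dreaming automates: review, identify repeats, update.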
Use multi-agent architectures for complex tasks. Dreaming works best when multiple agents share a knowledge pool. If you are currently relying on a single generalist agent, consider breaking your workflow into specialized agents that can share learnings across tasks. This is a core driver of the strongest AI agent ROI results seen in 2026.
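The shared-knowledge-pool idea above can be sketched in a few lines: specialized agents publish what they learn to a common store, and every other agent can look it up. The class and method names are hypothetical, not a real framework's API.

```python
class SharedKnowledge:
    """A shared pool that specialized agents write to and read from,
    so one agent's learning benefits the whole team (illustrative)."""

    def __init__(self):
        self.notes = {}

    def publish(self, agent_name, topic, insight):
        """An agent records a learning under a topic."""
        self.notes.setdefault(topic, []).append((agent_name, insight))

    def lookup(self, topic):
        """Any agent retrieves everything the team knows about a topic."""
        return [insight for _, insight in self.notes.get(topic, [])]
```

With a pool like this, an intake agent's discovery ("Client X wants tables, not prose") reaches the drafting agent without either one talking to the other, which is exactly the cross-agent learning dreaming is designed to amplify.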
What Dreaming Signals About the Next Phase of AI Agents
Anthropic’s dreaming feature is significant not just for its immediate utility, but for what it signals about the trajectory of the entire AI agent category.
For most of the past three years, the primary axis of AI agent competition has been capability: which model reasons better, which can use more tools, which handles longer contexts. Dreaming introduces a new axis: adaptability over time. An agent that improves with use is fundamentally different from one that does not. The analogy is a new employee who learns from feedback versus one who makes the same mistakes indefinitely. Over a six-month deployment horizon, the compounding difference becomes enormous.
This shift also has governance implications. As agents accumulate refined knowledge across sessions, the question of what they have learned, and whether that learning is accurate, safe, and auditable, becomes critical. Enterprises building on self-improving AI will need memory governance frameworks alongside their agent governance policies. This connects directly to the growing conversation around responsible agentic AI deployment that analysts and vendors alike are actively grappling with in 2026.
The platforms that solve continuous learning, accuracy, and auditability together will define the next generation of enterprise AI infrastructure. Dreaming is the opening move in what will become a much bigger category.
Key Takeaways
Self-improving AI agents are here and already delivering production results: 6x task completion gains at Harvey and 50 percent time reductions at Wisedocs. Anthropic’s dreaming feature, which curates agent memory between sessions and surfaces cross-session patterns, represents a fundamental shift from static to adaptive AI agents. The practical implication for businesses is clear: start building with persistent memory, define your success criteria explicitly, and architect for multi-agent knowledge sharing now, before the gap between adaptive and static deployments becomes a competitive disadvantage.
Whether you are just getting started with AI automation or scaling an existing agentic stack, BigAIAgent.tech covers the tools, frameworks, and strategies you need to stay ahead. Which matters more to your business right now: agents that learn from the past, or agents that coordinate better across tasks? Share your thoughts in the comments below.