Dr-DCI: Scaling Agentic Search via Dynamic Workspace Expansion

🔑 Enhanced Key Takeaways

• Dr-DCI builds upon the concept of Direct Corpus Interaction (DCI), where an agent directly searches a raw corpus using general-purpose terminal tools like grep or file reads, bypassing traditional embedding models or vector indexes for retrieval.
• The BrowseComp-Plus benchmark, on which Dr-DCI achieved 73.3% accuracy, is a static, curated dataset of approximately 100,000 human-verified web documents, designed to provide a fair and reproducible environment for evaluating deep-research AI agents by disentangling retrieval and reasoning components.
• Agentic search, which Dr-DCI enhances, represents a paradigm shift in AI, moving beyond single-shot LLM outputs to systems that pursue goals, make decisions, and act autonomously, often involving multi-step planning, tool use, memory, and reflection.

🛠️ Technical Deep Dive

Dr-DCI is a retriever-steered framework that dynamically expands a local workspace for agentic operations, implying an initial retrieval step to identify a relevant subset of the corpus for the agent to interact with.
The underlying Direct Corpus Interaction (DCI) allows agents to search the raw corpus directly using general-purpose terminal tools (e.g., grep, file reads, shell commands, lightweight scripts) without relying on embedding models, vector indexes, or retrieval APIs.
This direct interaction capability enables the handling of exact lexical constraints, sparse clue conjunctions, local context checks, and multi-step hypothesis refinement, which are often challenging for conventional semantic retrievers.
DCI requires no offline indexing, making it adaptable to evolving local corpora.
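
To make the idea of a sparse clue conjunction concrete, here is a minimal Python sketch of a grep-style scan that keeps only the files containing every clue. It is an illustration under stated assumptions, not the DCI authors' tooling: the function name `grep_conjunction`, the `*.txt` file layout, and the sample documents are all hypothetical.

```python
import re
import tempfile
from pathlib import Path


def grep_conjunction(corpus_dir, terms):
    """Return {relative path: matching lines} for files containing ALL terms.

    Files are scanned at query time, so no index is built beforehand.
    Matching is case-insensitive and literal (terms are regex-escaped).
    """
    patterns = [re.compile(re.escape(t), re.IGNORECASE) for t in terms]
    root = Path(corpus_dir)
    hits = {}
    for path in sorted(root.rglob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="replace")
        # Conjunction: every clue must appear somewhere in the document.
        if all(p.search(text) for p in patterns):
            hits[str(path.relative_to(root))] = [
                line.strip()
                for line in text.splitlines()
                if any(p.search(line) for p in patterns)
            ]
    return hits


def demo():
    """Build a two-file throwaway corpus (hypothetical data) and query it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "a.txt").write_text("Founded in 1987.\nBased in Oslo.\n", encoding="utf-8")
        (root / "b.txt").write_text("Founded in 1987.\nBased in Lima.\n", encoding="utf-8")
        return grep_conjunction(root, ["1987", "oslo"])


if __name__ == "__main__":
    print(demo())
```

Because the scan reads the files as they are at query time, a document added or edited a moment earlier is searchable immediately, which is the property the no-offline-indexing point refers to.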
The dynamic workspace expansion in Dr-DCI likely optimizes the DCI approach by focusing the agent's interaction on a more manageable and highly relevant subset of documents, thereby reducing the computational burden of interacting with a very large raw corpus directly at every step.
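
The source does not spell out how the workspace grows, so the following Python sketch shows only one plausible reading: a retriever proposes documents, the agent runs exact lexical checks inside the current workspace, and the workspace is enlarged only when the check fails. Everything here is an assumption for illustration, including the word-overlap scorer standing in for a real retriever, the function names, the parameters `k` and `max_rounds`, and the toy corpus.

```python
def overlap_score(query, text):
    """Toy stand-in for a retriever: number of query words present in text."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def expand_workspace(corpus, workspace, query, k):
    """Add up to k best-scoring documents that are not yet in the workspace."""
    candidates = [doc_id for doc_id in corpus if doc_id not in workspace]
    candidates.sort(key=lambda d: (-overlap_score(query, corpus[d]), d))
    added = [d for d in candidates[:k] if overlap_score(query, corpus[d]) > 0]
    workspace.update(added)
    return added


def search_with_expansion(corpus, query, clue, k=1, max_rounds=3):
    """Check an exact clue inside a workspace that grows only when needed.

    Returns (matching doc ids, rounds used, final workspace size).
    """
    workspace = set()
    rounds_used = 0
    for rounds_used in range(1, max_rounds + 1):
        added = expand_workspace(corpus, workspace, query, k)
        # Exact lexical check restricted to the current workspace.
        found = sorted(d for d in workspace if clue.lower() in corpus[d].lower())
        if found:
            return found, rounds_used, len(workspace)
        if not added:
            break  # the retriever has nothing more to offer
    return [], rounds_used, len(workspace)


if __name__ == "__main__":
    # Hypothetical three-document corpus.
    corpus = {
        "d1": "solar panel efficiency report 2021",
        "d2": "solar panel maker founded in Oslo",
        "d3": "banana bread recipe",
    }
    print(search_with_expansion(corpus, "solar panel efficiency", "Oslo"))
```

In this toy run the first round pulls in only `d1`, the clue is not found, and a second round adds `d2`, where it is. The unrelated `d3` never enters the workspace, which is the kind of saving the paragraph above attributes to expansion.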

🔮 Future Implications (AI analysis grounded in cited sources)

Dr-DCI could significantly reduce the infrastructure costs associated with large-scale agentic search.

By reducing tool usage, wall time, and computational costs, and not requiring offline indexing or vector databases, Dr-DCI offers a more resource-efficient approach to agentic search over vast document collections.

Dynamic workspace expansion could become a standard component in future agentic AI architectures.

Its ability to combine scalability with precision, and its strong performance on benchmarks like BrowseComp-Plus, suggests it addresses a critical bottleneck in current agentic systems interacting with large corpora.

Dr-DCI's methodology could accelerate research into more robust and generalizable tool-use agents.

By providing a more efficient and precise way for agents to interact with information, it frees up computational resources and simplifies the interaction interface, allowing researchers to focus on higher-level agentic reasoning and planning.

⏳ Timeline

2025-08-08

BrowseComp-Plus dataset released on Hugging Face, featuring ~100K web documents.

2025-08-09

BrowseComp-Plus paper submitted/published, introducing it as a new benchmark for Deep-Research systems.

2026-01-04

BrowseComp-Plus: A Fair and Disentangled Evaluation Benchmark for Deep Search Agents paper published.

2026-05-03

"Beyond Semantic Similarity: Rethinking Retrieval for Agentic Search via Direct Corpus Interaction" (DCI) paper published on arXiv, outlining the foundational DCI approach.

2026-06-08

"Dr-DCI: Scaling Direct Corpus Interaction via Dynamic Workspace Expansion" paper announced on arXiv.
