Cursor 3 Glass: Engineering Breakdown of Parallel Agent Architecture
Core Question: Cursor 3 redefines the IDE from an “AI-assisted editing tool” to an “agent orchestration platform.” What are the engineering decisions behind this shift? Why is the worktree the core abstraction of this architecture?
Tags: Cursor 3, Agent Architecture, Parallel Agents, Worktree, AI, Coding
1. Why the Form of IDE Must Change
Cursor 3, codenamed “Glass,” was released on April 2, 2026. Its headline change is not a new model but a new shape.
The Composer panel, which had defined Cursor since 2024, has been removed. In its place is a full-screen Agents Window in which multiple agents can be launched simultaneously, each assigned a different task and each maintaining its own context and worktree.
This redesign is based on a clear product assertion: most code in the future will be written by AI agents; the role of human engineers is to orchestrate these agents. Glass is Cursor’s bet on this future.
However, this is not just a UI change. It requires a complete redesign of the underlying task scheduling, context isolation, and result merging mechanisms. This article breaks down the core engineering decisions of this architecture.
2. Core Abstraction: Why Worktree
2.1 What Worktree Is Not
Cursor 2 also supported worktrees, but used them very differently. There, a worktree represented a completely independent parallel context: each branch had its own copy of the code and its own dialogue history, and engineers had to decide manually which branch’s changes were “correct” before merging back to the main branch. It was a parallelism tool aimed at humans.
Cursor 3’s worktree has a different design goal. Its core question is: when multiple agents modify the same codebase simultaneously, how can their changes be merged systematically without producing a pile of conflicting branches?
2.2 PR-Oriented Worktree
Cursor 3’s answer is described in official community posts as a “PR-oriented worktree.”
Each agent operates on its own worktree, and each worktree corresponds to an independent PR or task. Once the agents finish, a Merge Agent coordinates the merge.
This means parallelism no longer means “simultaneously modifying the same code” but rather “breaking a task down into independent subtasks, each executed on its own code snapshot.” When the decomposition holds, merging is mostly a matter of resolving logical conflicts between the agents’ diffs rather than untangling git-level conflicts between long-lived branches.
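To make "independent code snapshots" concrete, here is a minimal sketch that creates one git worktree per agent using the standard `git worktree add` command. The branch prefix `agent/`, the sibling-directory layout, and both function names are illustrative assumptions; the source does not describe how Cursor names or places its worktrees.

```python
import subprocess
from pathlib import Path


def worktree_add_command(repo: Path, name: str, base: str = "HEAD") -> list[str]:
    """Build the `git worktree add` command for one agent's checkout.

    The branch name and directory layout are assumptions for illustration.
    """
    worktree_path = repo.parent / f"{repo.name}-{name}"
    return [
        "git", "-C", str(repo),
        "worktree", "add",
        "-b", f"agent/{name}",  # new branch dedicated to this agent
        str(worktree_path),     # separate working directory on disk
        base,                   # commit-ish the snapshot starts from
    ]


def create_worktrees(repo: Path, names: list[str], base: str = "HEAD") -> list[Path]:
    """Create one worktree per name; requires git and an existing repository."""
    paths = []
    for name in names:
        cmd = worktree_add_command(repo, name, base)
        subprocess.run(cmd, check=True)
        paths.append(Path(cmd[-2]))  # second-to-last argument is the worktree path
    return paths
```

All worktrees share one object database, so each agent gets its own working directory and branch without a full clone of the repository.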
The pseudocode for task allocation logic is roughly:
    # Task allocation logic for Cursor 3 Agents Window (conceptual model)
    # clone_to_worktree and create_agent are placeholder helpers, not a real Cursor API.
    def launch_parallel_refactor(main_task, subtasks, agents_config):
        # 1. Clone the main repository into N worktrees
        worktrees = [clone_to_worktree(main_task.repo, f"wt-{i}")
                     for i in range(len(subtasks))]

        # 2. Each agent executes an independent subtask in an independent worktree
        futures = []
        for task, wt in zip(subtasks, worktrees):
            agent = create_agent(config=agents_config[task.type])
            # Agent receives: subtask description + worktree path + unique instructions
            futures.append(agent.execute(task, worktree=wt))

        # 3. Wait for all agents to complete
        results = [f.result() for f in futures]

        # 4. Merge Agent reads diffs from each worktree and coordinates the merge
        merge_agent = create_agent(type="merge")
        merge_result = merge_agent.merge([r.diff for r in results])
        return merge_result
The isolation granularity of worktrees is repository-level, not file-level. This means agents do not see each other’s partial modifications, reducing mutual interference. However, this design also means: if the task itself is not decomposable (i.e., there are strong dependencies between subtasks), forcing parallelism can make things worse.
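One way to check decomposability before launching agents is to write down the dependencies between subtasks and see how many can actually run at the same time. The sketch below uses the standard library's `graphlib.TopologicalSorter`; the function name `parallel_waves` and the example task names are assumptions, not anything Cursor exposes.

```python
from graphlib import TopologicalSorter


def parallel_waves(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group subtasks into waves; every task in a wave has all its dependencies met."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Each key maps a subtask to the set of subtasks it depends on.
independent = {"endpoint": set(), "audit_log": set(), "tests": set()}
chained = {"endpoint": set(), "audit_log": {"endpoint"}, "tests": {"audit_log"}}

print(parallel_waves(independent))  # a single wave containing all three tasks
print(parallel_waves(chained))      # three waves with one task each
```

If the result is one wave per task, as in the chained case, parallel agents buy nothing: the work is serial however many worktrees are created.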
3. The Architecture of Agents Window
3.1 From Sidebar to Full-Screen Workspace
The Composer’s physical form was a panel on the right side of the IDE. It coexisted with the main editing area, allowing engineers to code while interacting with agents.
The Agents Window is full-screen. Its form is closer to “an IDE specifically designed for agent collaboration,” rather than a chat window attached to a code editor.
This signifies a shift in the interaction model:
| Dimension | Composer 2 (Cursor 2.x) | Agents Window (Cursor 3) |
|---|---|---|
| Interaction Mode | Human-machine collaboration (human writes, AI assists) | Human-agent collaboration (human orchestrates, agent executes) |
| Context Window | Shared main editor context | Each agent has an independent worktree context |
| Task Granularity | Single file or small scope modifications | Large tasks across multiple files and modules |
| Feedback Loop | Real-time (human sees agent changes) | Asynchronous (agents run in the background, results reviewed later) |
| UI Form | Sidebar panel (coexisting with human-operated editor) | Full-screen workspace (human is an observer, not an operator) |
The core of this transformation is an inversion of control: in Composer, humans are the subjects and AI is the tool; in the Agents Window, agents are the subjects and humans are the reviewers.
3.2 Three Operating Environments
The Agents Window supports three agent operating environments:
- Local: Agents operate on the local file system, suitable for scenarios requiring direct access to the local development environment.
- Cloud Sandbox: Agents run in isolated cloud environments, suitable for long-running tasks or tasks requiring special hardware resources.
- Remote SSH: Agents operate on remote servers, suitable for accessing production environments or special deployment targets.
This three-environment design addresses a practical issue in the Deep Agent paradigm: different task types require different execution contexts. Quick file operations suit the local environment, time-consuming tasks suit the cloud sandbox, and deployment-related tasks suit the SSH environment.
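The rule of thumb above can be written as a small routing function. This is a sketch under stated assumptions: the `Task` fields, the `choose_environment` name, and the 10-minute "long-running" threshold are all hypothetical, and Cursor's actual selection is made by the user in the Agents Window.

```python
from dataclasses import dataclass
from enum import Enum


class Environment(Enum):
    LOCAL = "local"
    CLOUD_SANDBOX = "cloud_sandbox"
    REMOTE_SSH = "remote_ssh"


@dataclass(frozen=True)
class Task:
    name: str
    expected_minutes: float
    touches_deployment: bool = False
    needs_special_hardware: bool = False


def choose_environment(task: Task, long_running_minutes: float = 10.0) -> Environment:
    """Map a task to an environment using the three rules of thumb from the text."""
    if task.touches_deployment:
        return Environment.REMOTE_SSH
    if task.needs_special_hardware or task.expected_minutes >= long_running_minutes:
        return Environment.CLOUD_SANDBOX
    return Environment.LOCAL


print(choose_environment(Task("rename helper", expected_minutes=1)))
print(choose_environment(Task("full test suite", expected_minutes=45)))
print(choose_environment(Task("staging rollout", expected_minutes=5, touches_deployment=True)))
```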
4. Composer 2: The Reasoning Foundation of Glass
4.1 Technical Specifications
Glass is not a new model, but its Agents Window heavily relies on Composer 2 as the core reasoning engine.
Composer 2 was released on March 19, 2026, and is Cursor’s first proprietary reasoning model, priced at $0.50/M input / $2.50/M output tokens. Here are key benchmark data (from the official technical report, March 2026):
| Model | CursorBench | Terminal-Bench 2.0 | SWE-bench Multilingual |
|---|---|---|---|
| Composer 2 | 61.3 | 61.7 | 73.7 |
| Composer 1.5 | 44.2 | 47.9 | 65.9 |
| Composer 1 | 38.0 | 40.0 | 56.9 |
Terminal-Bench 2.0 results use the official Harbor evaluation framework and are averaged over 5 runs. Note that Composer 2’s Terminal-Bench 2.0 score (61.7) is lower than the officially reported score for Claude Code, although scores obtained with different harnesses are not directly comparable.
4.2 /best-of-n: Engineering Significance of Model Selection
Cursor 3 introduces the /best-of-n mechanism, which essentially runs multiple models/configurations in parallel for the same task and selects the best result.
This is not a new concept (self-consistency and best-of-N sampling are established LLM techniques), but building it into the IDE turns model selection into a configurable engineering parameter, rather than something developers do by manually switching between tools.
The engineering implications of /best-of-n are:
- For simple tasks: several runs of a cheaper model can still cost less in total than a single Frontier model run.
- For critical tasks: a Frontier model with N candidates raises the chance that at least one candidate is correct (a pass@N-style gain rather than a pass@1 gain).
- The trade-off: token consumption grows roughly linearly with N, since every candidate pays for its own input and output.
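The trade-off in the list above can be put into numbers. The sketch below uses the Composer 2 prices quoted earlier ($0.50/M input, $2.50/M output); the token counts and the 60% single-run success rate are made-up inputs. The probability formula assumes the N candidates succeed independently and that the selector always picks a correct candidate when one exists, so it is an upper bound on the real gain.

```python
INPUT_PRICE_PER_M = 0.50   # USD per million input tokens (Composer 2, as quoted in the text)
OUTPUT_PRICE_PER_M = 2.50  # USD per million output tokens (Composer 2, as quoted in the text)


def run_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a single candidate run."""
    return (input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000


def best_of_n_cost(n: int, input_tokens: int, output_tokens: int) -> float:
    """Every candidate re-reads the prompt and writes its own output, so cost scales with n."""
    return n * run_cost(input_tokens, output_tokens)


def success_probability(p_single: float, n: int) -> float:
    """Chance that at least one of n independent candidates is correct."""
    return 1.0 - (1.0 - p_single) ** n


# Hypothetical task: 200k input tokens and 20k output tokens per candidate.
for n in (1, 3, 5):
    cost = best_of_n_cost(n, 200_000, 20_000)
    chance = success_probability(0.6, n)
    print(f"n={n}: ${cost:.2f}, success chance {chance:.1%}")
```

Cost rises linearly with N while the success chance saturates, which is why a small N is usually where the return is.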
5. Design Trade-offs: Engineering Choices of Glass
5.1 The Real Cost of Parallel Agents
An analysis from dev.to provides a specific case: adding idempotency, audit log, and tests to a FastAPI refund endpoint.
Composer 2 Serial Workflow (about 6 minutes, ~$0.20):
Agent completes in sequence: API endpoint → audit log → tests
Single agent, shared context
Agents Window Parallel Workflow (about 3 minutes, ~$0.30):
Agent 1: API endpoint (worktree A)
Agent 2: audit log (worktree B)
Agent 3: tests (worktree C)
→ Merge Agent merges
Parallelism cuts wall-clock time by about 50%, but token cost rises by about 50% ($0.20 to $0.30). In Max Mode, this multiplier effect is even more pronounced.
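The arithmetic behind that comparison, using only the figures quoted in the case above (6 minutes and $0.20 serial, 3 minutes and $0.30 parallel); the function name `compare` is just for illustration:

```python
def compare(serial_minutes: float, serial_usd: float,
            parallel_minutes: float, parallel_usd: float) -> tuple[float, float, float]:
    """Return (fraction of wall time saved, fractional cost increase, USD per minute saved)."""
    time_saved = 1 - parallel_minutes / serial_minutes
    cost_increase = parallel_usd / serial_usd - 1
    usd_per_minute_saved = (parallel_usd - serial_usd) / (serial_minutes - parallel_minutes)
    return time_saved, cost_increase, usd_per_minute_saved


time_saved, cost_increase, usd_per_minute = compare(6.0, 0.20, 3.0, 0.30)
print(f"wall time saved: {time_saved:.0%}")
print(f"cost increase: {cost_increase:.0%}")
print(f"extra cost per minute saved: ${usd_per_minute:.3f}")
```

Framing the result as dollars per minute saved makes it easier to compare against what an engineer's waiting time is actually worth on a given task.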
Key Insight: The Agents Window is a UI that magnifies token spend. Every agent tab consumes tokens. If a team does not build awareness of concurrent resource usage, token consumption can far outstrip the actual benefit.
5.2 Local Value of Design Mode
The Design Mode introduced in Glass deserves separate evaluation. Its core capability is: directly annotating UI elements in a rendered browser page, passing the annotation results as structured instructions to agents.
This addresses a long-standing pain point in front-end development: visual adjustment requests (“move this button 8px to the right”) are difficult to express accurately in traditional text-based agent dialogues. Design Mode provides a more natural input channel.
However, the applicability of Design Mode is limited to front-end visual tasks; it offers little help for back-end logic, data processing, or system architecture tasks. This is a localized optimization for specific scenarios, not a general capability.
5.3 Pricing Structure of Max Mode
Starting March 16, 2026, the Frontier models (GPT-5.4, Sonnet-4.6, Opus-4.6, GPT-5.3 Codex) were moved to the Max Mode tier of Pro/Pro+/Ultra subscriptions. The billing method for Max Mode adds a multiplier to the standard request rate.
This means Glass’s Agents Window stacks directly on top of the Max Mode billing model: total cost scales as per-agent token usage × Max multiplier × number of parallel agents. This combination is worth monitoring closely in the 2026 AI coding cost structure.
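A rough budgeting sketch for this stacking effect follows. The source does not state the actual Max Mode multiplier, so the value 2.0 below is a placeholder, as are the function names and the 20-tasks-per-day workload; the $0.10 per-agent base cost is simply the $0.30 parallel run from section 5.1 divided across its three agents.

```python
def max_mode_task_cost(base_cost_per_agent: float, agents: int, max_multiplier: float) -> float:
    """Cost of one parallel task: each agent's base cost scaled by the Max Mode multiplier."""
    if agents < 1 or max_multiplier <= 0:
        raise ValueError("agents must be >= 1 and max_multiplier must be positive")
    return base_cost_per_agent * agents * max_multiplier


def weekly_projection(task_cost: float, tasks_per_day: int, workdays: int = 5) -> float:
    """Project one engineer's weekly spend from a per-task cost."""
    return task_cost * tasks_per_day * workdays


HYPOTHETICAL_MULTIPLIER = 2.0  # placeholder, not a published Cursor figure

task_cost = max_mode_task_cost(0.10, agents=3, max_multiplier=HYPOTHETICAL_MULTIPLIER)
print(f"per task: ${task_cost:.2f}")
print(f"per week at 20 tasks/day: ${weekly_projection(task_cost, 20):.2f}")
```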
6. Engineering Applicability Assessment
Scenarios Where Glass is Applicable
In the following scenarios, Glass’s Agents Window provides real engineering value:
- Large multi-file refactoring: renaming a domain concept across 20+ files, decomposed into multiple subtasks for parallel processing.
- Multi-package monorepo operations: Changes between packages are relatively independent, yielding high parallel benefits.
- Implementation and tests in parallel: one agent handles the implementation while another writes the tests, which tends to produce few logical conflicts at merge time.
- Long-running background tasks: Cloud agents execute test suites or type checks while local work continues.
Scenarios Where Glass is Not Applicable or Cost-Effective
- Single file simple modifications: Forcing parallelism will only increase merging costs and token consumption.
- Strongly coupled multi-file changes: If multiple agents need to understand the same file’s context, parallelism is a net loss.
- Budget-constrained teams: Costs of parallel agents in Max Mode may far exceed actual time savings.
- Complex logic requiring deep code understanding: Current models still have limitations in cross-module context understanding, and parallel agents may amplify understanding errors.
Team Adoption Recommendations
- Start with three concurrent agents, not six. Three parallel agents are a reasonable upper limit for most decomposable tasks.
- Check Max Mode bills on Fridays. If token consumption shows abnormal growth, review the week’s agent concurrency usage.
- Establish admission criteria for parallel tasks: a task must decompose into context-independent subtasks with clearly defined boundaries before it is run with parallel agents.
- Do not disable Frontier models. Use Frontier models for core complex logic, while Composer or other cheaper models handle simple tasks.
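The admission criterion and the three-agent starting limit from the list above could be enforced with a simple pre-flight check. This sketch treats "no shared files between subtasks" as a proxy for context independence, which is an assumption: two subtasks can touch disjoint files and still be logically coupled. Function names and file paths are hypothetical.

```python
from itertools import combinations


def overlapping_files(subtasks: dict[str, set[str]]) -> dict[tuple[str, str], set[str]]:
    """Return the files shared by each pair of subtasks; an empty dict means no overlap."""
    conflicts = {}
    for (a, files_a), (b, files_b) in combinations(sorted(subtasks.items()), 2):
        shared = files_a & files_b
        if shared:
            conflicts[(a, b)] = shared
    return conflicts


def admit_for_parallel(subtasks: dict[str, set[str]], max_agents: int = 3) -> bool:
    """Admit a task only if it fits the agent limit and no two subtasks share a file."""
    return len(subtasks) <= max_agents and not overlapping_files(subtasks)


plan = {
    "endpoint": {"app/refunds.py"},
    "audit_log": {"app/audit.py"},
    "tests": {"tests/test_refunds.py"},
}
coupled_plan = {
    "endpoint": {"app/refunds.py", "app/models.py"},
    "audit_log": {"app/audit.py", "app/models.py"},
}

print(admit_for_parallel(plan))
print(overlapping_files(coupled_plan))
```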
7. First-Hand Resources
- Composer 2 Technical Report (arXiv:2603.24477)
- Cursor 3 Official Changelog
- Cursor 3 Official Release Blog
- Cursor 3: Worktrees & Best-of-N Official Discussion
- Terminal-Bench 2.0 Official Evaluation Framework (Harbor)