Cursor 3 Glass: Engineering Breakdown of Parallel Agent Architecture

Core Question: Cursor 3 redefines the IDE from an “AI-assisted editing tool” to an “agent orchestration platform.” What are the engineering decisions behind this shift? Why is the worktree the core abstraction of this architecture?

Tags: Cursor 3, Agent Architecture, Parallel Agents, Worktree, AI, Coding

1. Why the Form of the IDE Must Change

Cursor 3, codenamed “Glass,” was released on April 2, 2026. Its headline change is not a new model but a new shape.

The Composer, which had defined Cursor since 2024, has been removed. In its place is a full-screen Agents Window: multiple agents can be launched simultaneously, each assigned a different task and each maintaining its own context and worktree.

This redesign is based on a clear product assertion: most code in the future will be written by AI agents; the role of human engineers is to orchestrate these agents. Glass is Cursor’s bet on this future.

However, this is not just a UI change. It requires a complete redesign of the underlying task scheduling, context isolation, and result merging mechanisms. This article breaks down the core engineering decisions of this architecture.

2. Core Abstraction: Why Worktree

2.1 What Worktree Is Not

Cursor 2 also supported worktrees, but used them very differently. There, a worktree represented a “completely independent parallel context”: each branch had its own copy of the code and its own dialogue history, and engineers had to decide manually which branch’s changes were “correct” before merging back to the main branch. It was a parallelism tool aimed at humans.

Cursor 3’s worktree design goal is different. Its core question is: when multiple agents modify the same codebase simultaneously, how can their changes be systematically merged without creating a bunch of conflicting branches?

2.2 PR-Oriented Worktree

Cursor 3’s answer is described in official community posts as a “PR-oriented worktree.”

Each agent operates on its own worktree,
each worktree corresponds to an independent PR/task.
Once completed, a Merge Agent coordinates the merge.

In other words, parallelism no longer means “simultaneously modifying the same code” but rather “breaking a task down into independent subtasks, each executed on its own code snapshot.” During merging, the focus is on resolving logical conflicts between subtasks rather than on git-level branch merges.

The pseudocode for task allocation logic is roughly:

# Task allocation logic for Cursor 3 Agents Window (conceptual model)
def launch_parallel_refactor(main_task, subtasks, agents_config):
    # 1. Create one worktree per subtask from the main repository
    #    (a worktree is a separate checkout of the same repo, not a full clone)
    worktrees = [clone_to_worktree(main_task.repo, f"wt-{i}")
                 for i in range(len(subtasks))]

    # 2. Each agent executes an independent subtask in an independent worktree
    futures = []
    for task, wt in zip(subtasks, worktrees):
        agent = create_agent(config=agents_config[task.type])
        # Agent receives: subtask description + worktree path + unique instructions
        futures.append(agent.execute(task, worktree=wt))

    # 3. Wait for all agents to complete
    results = [f.result() for f in futures]

    # 4. Merge Agent reads diffs from each worktree and coordinates the merge
    merge_agent = create_agent(type="merge")
    merge_result = merge_agent.merge([r.diff for r in results])

    return merge_result

The isolation granularity of worktrees is repository-level, not file-level. This means agents do not see each other’s partial modifications, reducing mutual interference. However, this design also means: if the task itself is not decomposable (i.e., there are strong dependencies between subtasks), forcing parallelism can make things worse.
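The repository-level isolation described above can be approximated with plain git worktrees. The following is a minimal sketch, not Cursor's actual implementation: the helper names, the agent/<name> branch naming, the sibling-directory layout, and the run_agent callable are all assumptions made for illustration. Only the git worktree add command itself is standard git.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def worktree_add_command(repo: Path, name: str, base: str = "HEAD") -> list[str]:
    """Build the `git worktree add` command for one subtask.

    The worktree is placed next to the repository and gets its own branch
    (hypothetical naming scheme), so each agent sees an isolated checkout.
    """
    path = repo.parent / f"{repo.name}-{name}"
    return ["git", "-C", str(repo), "worktree", "add",
            "-b", f"agent/{name}", str(path), base]


def create_worktree(repo: Path, name: str) -> Path:
    """Run the command (requires git and an existing repo) and return the new path."""
    cmd = worktree_add_command(repo, name)
    subprocess.run(cmd, check=True, capture_output=True)
    return Path(cmd[-2])


def run_in_parallel(subtasks: list[str], run_agent, max_workers: int = 3) -> list:
    """Fan subtasks out to `run_agent` (a hypothetical callable standing in
    for a real agent) and collect the results in subtask order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_agent, subtasks))
```

Because each worktree has its own branch and directory, an agent's half-finished edits never appear in another agent's checkout. That is the property the paragraph above relies on.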

3. The Architecture of Agents Window

3.1 From Sidebar Panel to Full-Screen Workspace

The Composer’s physical form was a panel on the right side of the IDE. It coexisted with the main editing area, allowing engineers to code while interacting with agents.

The Agents Window is full-screen. Its form is closer to “an IDE specifically designed for agent collaboration,” rather than a chat window attached to a code editor.

This signifies a shift in the interaction model:

Dimension	Composer 2 (Cursor 2.x)	Agents Window (Cursor 3)
Interaction Mode	Human-machine collaboration (human writes, AI assists)	Human-agent collaboration (human orchestrates, agent executes)
Context Window	Shared main editor context	Each agent has an independent worktree context
Task Granularity	Single file or small scope modifications	Large tasks across multiple files and modules
Feedback Loop	Real-time (human sees agent changes)	Asynchronous (agents run in the background, results reviewed later)
UI Form	Sidebar panel (coexisting with human-operated editor)	Full-screen workspace (human is an observer, not an operator)

The core of this transformation is a reversal of control: in Composer, the human is the primary actor and the AI is the tool; in the Agents Window, agents are the primary actors and humans are the reviewers.

3.2 Three Operating Environments

The Agents Window supports three agent operating environments:

Local: Agents operate on the local file system, suitable for scenarios requiring direct access to the local development environment.
Cloud Sandbox: Agents run in isolated cloud environments, suitable for long-running tasks or tasks requiring special hardware resources.
Remote SSH: Agents operate on remote servers, suitable for accessing production environments or special deployment targets.

This three-environment design addresses a practical issue in the Deep Agent paradigm: different task types require different execution contexts. Quick file operations are suitable for local; time-consuming tasks are suitable for the cloud; deployment-related tasks are suitable for the SSH environment.
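The routing rule in the paragraph above can be written down as a small decision function. This is a sketch under stated assumptions: the TaskProfile fields, the Environment enum, and the 10-minute threshold for "time-consuming" are hypothetical and are not part of any Cursor API.

```python
from dataclasses import dataclass
from enum import Enum


class Environment(Enum):
    LOCAL = "local"
    CLOUD_SANDBOX = "cloud_sandbox"
    REMOTE_SSH = "remote_ssh"


@dataclass
class TaskProfile:
    """Hypothetical description of a task, used only for routing."""
    needs_deploy_target: bool = False
    needs_special_hardware: bool = False
    expected_minutes: float = 1.0


def choose_environment(task: TaskProfile,
                       long_running_minutes: float = 10.0) -> Environment:
    """Pick an execution environment following the article's rule of thumb."""
    # Deployment-related work goes to the server it targets.
    if task.needs_deploy_target:
        return Environment.REMOTE_SSH
    # Long-running or hardware-bound work goes to an isolated cloud sandbox.
    if task.needs_special_hardware or task.expected_minutes >= long_running_minutes:
        return Environment.CLOUD_SANDBOX
    # Everything else is a quick operation on the local file system.
    return Environment.LOCAL
```

The order of the checks matters: a deployment task that is also long-running still goes to SSH, because the deployment target is a hard requirement while running time is only a preference.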

4. Composer 2: The Reasoning Foundation of Glass

4.1 Technical Specifications

Glass is not a new model, but its Agents Window heavily relies on Composer 2 as the core reasoning engine.

Composer 2 was released on March 19, 2026, and is Cursor’s first proprietary reasoning model, priced at $0.50 per million input tokens and $2.50 per million output tokens. Key benchmark results (from the official technical report, March 2026):

Model	CursorBench	Terminal-Bench 2.0	SWE-bench Multilingual
Composer 2	61.3	61.7	73.7
Composer 1.5	44.2	47.9	65.9
Composer 1	38.0	40.0	56.9

Terminal-Bench 2.0 scores use the official Harbor evaluation framework and are averaged over 5 iterations. Notably, Composer 2’s Terminal-Bench 2.0 score (61.7) is lower than the official score reported for Claude Code, although results from different harnesses cannot be compared directly.

4.2 /best-of-n: Engineering Significance of Model Selection

Cursor 3 introduces the /best-of-n mechanism, which essentially runs multiple models/configurations in parallel for the same task and selects the best result.

This is not a new concept (self-consistency and best-of-N sampling are established LLM techniques), but building it into the IDE makes model selection a configurable engineering parameter rather than something developers do by manually switching between tools.
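The general best-of-N pattern is easy to sketch. The source does not describe how Cursor ranks candidates, so the score function below is an assumption (in practice it might be the number of passing tests), and the runners are hypothetical callables standing in for different models or configurations.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def best_of_n(task: str,
              runners: Sequence[Callable[[str], T]],
              score: Callable[[T], float]) -> T:
    """Run every runner on the same task in parallel and return the
    highest-scoring candidate (the first one wins on ties)."""
    if not runners:
        raise ValueError("need at least one runner")
    with ThreadPoolExecutor(max_workers=len(runners)) as pool:
        candidates = list(pool.map(lambda run: run(task), runners))
    return max(candidates, key=score)
```

The quality of the result depends entirely on the scoring function. With a weak scorer, running more candidates only adds cost.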

The engineering implications of /best-of-n are:

For simple tasks: running a cheaper model several times can still cost less in total than a single Frontier model run.
For critical tasks: a Frontier model with N attempts raises the chance that at least one attempt succeeds (closer to pass@N than pass@1).
The trade-off is that token consumption grows roughly linearly with N (N inputs + N outputs).
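A short worked calculation makes these trade-offs concrete. The Composer 2 prices come from Section 4.1. The token counts, the Frontier price of $5/$25 per million tokens, and the per-attempt success rate are hypothetical values chosen only for illustration. The success formula also assumes attempts are independent, which real model runs only approximate.

```python
def run_cost(input_tokens: int, output_tokens: int,
             in_price_per_m: float, out_price_per_m: float) -> float:
    """Dollar cost of one run, given per-million-token prices."""
    return (input_tokens * in_price_per_m + output_tokens * out_price_per_m) / 1_000_000


def best_of_n_cost(n: int, input_tokens: int, output_tokens: int,
                   in_price_per_m: float, out_price_per_m: float) -> float:
    """N runs pay for N inputs and N outputs."""
    return n * run_cost(input_tokens, output_tokens, in_price_per_m, out_price_per_m)


def success_at_least_once(p_single: float, n: int) -> float:
    """Probability that at least one of n independent attempts succeeds."""
    return 1 - (1 - p_single) ** n


# Hypothetical workload: 40k input tokens and 8k output tokens per run.
cheap_x4 = best_of_n_cost(4, 40_000, 8_000, 0.50, 2.50)   # Composer 2 pricing
frontier_x1 = run_cost(40_000, 8_000, 5.00, 25.00)        # hypothetical Frontier pricing
```

Under these assumptions, four Composer 2 runs cost about $0.16 against about $0.40 for one Frontier run, and three independent attempts at a 50% success rate reach 87.5%. The comparison flips as soon as the price gap narrows or N grows, so the inputs deserve more scrutiny than the formula.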

5. Design Trade-offs: Engineering Choices of Glass

5.1 The Real Cost of Parallel Agents

An analysis from dev.to provides a specific case: adding idempotency, an audit log, and tests to a FastAPI refund endpoint.

Composer 2 Serial Workflow (about 6 minutes, ~$0.20):

Agent completes in sequence: API endpoint → audit log → tests
Single agent, shared context

Agents Window Parallel Workflow (about 3 minutes, ~$0.30):

Agent 1: API endpoint (worktree A)
Agent 2: audit log (worktree B)
Agent 3: tests (worktree C)
→ Merge Agent merges

Parallelism cuts wall-clock time by about 50%, but cost rises by about 50% (from ~$0.20 to ~$0.30) as token consumption grows. In Max Mode, this multiplier effect is even more pronounced.
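The figures from this case can be turned into a few comparable ratios. The function below is an illustrative helper, not taken from the cited analysis. It uses only the four numbers quoted above (6 minutes and ~$0.20 serial, 3 minutes and ~$0.30 parallel).

```python
def parallel_tradeoff(serial_min: float, serial_cost: float,
                      parallel_min: float, parallel_cost: float) -> dict[str, float]:
    """Summarize what parallelism bought and what it cost."""
    minutes_saved = serial_min - parallel_min
    extra_cost = parallel_cost - serial_cost
    return {
        "speedup": serial_min / parallel_min,
        "time_saved_pct": 100 * minutes_saved / serial_min,
        "cost_increase_pct": 100 * extra_cost / serial_cost,
        # Price paid for each minute of wall-clock time saved.
        "extra_cost_per_minute_saved": (
            extra_cost / minutes_saved if minutes_saved > 0 else float("inf")
        ),
    }


refund_case = parallel_tradeoff(6, 0.20, 3, 0.30)
```

For the refund endpoint this gives a 2x speedup, about 50% more cost, and roughly $0.03 per minute saved. Whether that price is worth paying depends on whether an engineer is actually blocked waiting for the result.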

Key Insight: The Agents Window is effectively a token budget magnifier. Each agent tab consumes tokens. If teams do not build awareness of concurrent resource usage, token costs can far exceed the actual benefits.

5.2 Local Value of Design Mode

The Design Mode introduced in Glass deserves separate evaluation. Its core capability is: directly annotating UI elements in a rendered browser page, passing the annotation results as structured instructions to agents.

This addresses a long-standing pain point in front-end development: visual adjustment requests (“move this button 8px to the right”) are difficult to express accurately in traditional text-based agent dialogues. Design Mode provides a more natural input channel.

However, the applicability of Design Mode is limited to front-end visual tasks; it offers little help for back-end logic, data processing, or system architecture tasks. This is a localized optimization for specific scenarios, not a general capability.

5.3 Pricing Structure of Max Mode

Starting March 16, 2026, the Frontier models (GPT-5.4, Sonnet-4.6, Opus-4.6, GPT-5.3 Codex) were moved to the Max Mode tier of Pro/Pro+/Ultra subscriptions. The billing method for Max Mode adds a multiplier to the standard request rate.

This means Glass’s Agents Window stacks directly on top of the Max Mode billing model: total cost scales roughly as per-agent cost × Max multiplier × number of parallel agents. This combination is worth monitoring closely in the 2026 AI coding cost structure.
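The compounding effect can be sketched as a simple multiplicative model. The source does not state the actual Max Mode multiplier, so the 2.0 used below is a placeholder, as is the $0.10 per-agent cost. The model also assumes every agent costs about the same, which real workloads will not match exactly.

```python
def parallel_max_mode_cost(cost_per_agent: float, num_agents: int,
                           max_multiplier: float = 1.0) -> float:
    """Rough cost of one parallel task: per-agent cost x multiplier x agents.

    max_multiplier=1.0 models standard billing; any other value is a
    hypothetical Max Mode multiplier.
    """
    if num_agents < 1:
        raise ValueError("num_agents must be at least 1")
    if max_multiplier <= 0:
        raise ValueError("max_multiplier must be positive")
    return cost_per_agent * num_agents * max_multiplier


standard_single = parallel_max_mode_cost(0.10, 1)       # one agent, standard rate
max_three = parallel_max_mode_cost(0.10, 3, 2.0)        # three agents, placeholder 2x
```

With these placeholder numbers, three agents in Max Mode cost six times as much as a single standard-rate agent. The two factors multiply rather than add, which is why agent count and model tier are best budgeted together.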

6. Engineering Applicability Assessment

Scenarios Where Glass is Applicable

In the following scenarios, Glass’s Agents Window provides real engineering value:

Large multi-file refactoring: Renaming a domain concept across 20+ files, decomposed into multiple subtasks for parallel processing.
Multi-package monorepo operations: Changes between packages are relatively independent, yielding high parallel benefits.
Code + test dual-line parallelism: One agent is responsible for implementation, while another agent handles testing, resulting in fewer logical conflicts during merging.
Long-running background tasks: Cloud agents execute test suites or type checks while local work continues.

Scenarios Where Glass is Not Applicable or Cost-Effective

Single file simple modifications: Forcing parallelism will only increase merging costs and token consumption.
Strongly coupled multi-file changes: If multiple agents need to understand the same file’s context, parallelism becomes a net loss.
Budget-constrained teams: Costs of parallel agents in Max Mode may far exceed actual time savings.
Complex logic requiring deep code understanding: Current models still have limitations in cross-module context understanding, and parallel agents may amplify understanding errors.

Team Adoption Recommendations

Start with three concurrent agents, not six. Three parallel agents are a reasonable upper limit for most decomposable tasks.
Check Max Mode bills on Fridays. If token consumption shows abnormal growth, review the week’s agent concurrency usage.
Establish admission criteria for parallel tasks: A task must be decomposable into subtasks with clearly defined boundaries and independent contexts before parallel agents are used.
Do not disable Frontier models. Use Frontier models for core complex logic, while Composer or other cheaper models handle simple tasks.
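One way to make the admission criteria mechanical is to check that the subtasks touch disjoint sets of files. This is a sketch with assumptions: the mapping from subtask to files has to come from the team's own planning step, the file paths in the usage example are hypothetical, and disjoint files are only a proxy for context independence, since two subtasks can still be coupled through a shared interface.

```python
from itertools import combinations


def overlapping_files(subtasks: dict[str, set[str]]) -> dict[tuple[str, str], set[str]]:
    """Return the files shared by each pair of subtasks (empty if none overlap)."""
    conflicts = {}
    for (name_a, files_a), (name_b, files_b) in combinations(subtasks.items(), 2):
        shared = files_a & files_b
        if shared:
            conflicts[(name_a, name_b)] = shared
    return conflicts


def admit_for_parallel(subtasks: dict[str, set[str]], max_agents: int = 3) -> bool:
    """Admit a task only if it splits into 2..max_agents file-disjoint subtasks."""
    return 1 < len(subtasks) <= max_agents and not overlapping_files(subtasks)


# Hypothetical plan for the refund-endpoint case from Section 5.1.
refund_plan = {
    "endpoint": {"app/refunds.py"},
    "audit_log": {"app/audit.py"},
    "tests": {"tests/test_refunds.py"},
}
```

The default of three agents mirrors the first recommendation above. A plan that fails the check is usually better run serially by a single agent than split anyway.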

7. First-Hand Resources

Composer 2 Technical Report (arXiv:2603.24477)
Cursor 3 Official Changelog
Cursor 3 Official Release Blog
Cursor 3: Worktrees & Best-of-N Official Discussion
Terminal-Bench 2.0 Official Evaluation Framework (Harbor)