Description
Problem
PR #991 adds isolatedContext to new_page, enabling agents to create pages with isolated cookies, storage, and sessions. However, all subsequent tools (take_screenshot, take_snapshot, navigate_page, evaluate_script, emulate, click, fill, press_key, etc.) resolve their target page via the global getSelectedPage(). This means agents must call select_page before each operation, which introduces a race condition in multi-agent workflows.
The internal mutex serializes individual tool calls, but it doesn't prevent interleaving between two sequential calls from the same agent:
| Step | Mutex processes | #selectedPage after | Correct? |
|---|---|---|---|
| 1 | Agent A: select_page(2) | Page 2 | Yes |
| 2 | Agent B: select_page(3) | Page 3 | Yes |
| 3 | Agent A: take_screenshot() | Page 3 (captures wrong page) | No, wanted Page 2 |
Each call is atomic, but Agent A has no way to hold the mutex across both calls. Agent B's select_page mutates the shared #selectedPage state in between.
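The interleaving in the table can be replayed with a minimal sketch. `SharedServer` and its methods are hypothetical stand-ins for the server's shared selection state, not the actual implementation; each method call models one tool call that the mutex runs atomically.

```typescript
// Hypothetical model of the shared selection state (not the real server code).
class SharedServer {
  // Mirrors the single, global #selectedPage described above.
  private selectedPage = 1;

  // Models one atomic select_page tool call.
  selectPage(pageId: number): void {
    this.selectedPage = pageId;
  }

  // Models one atomic take_screenshot tool call; it always targets
  // whatever page is selected at the moment it runs.
  takeScreenshot(): string {
    return `screenshot of page ${this.selectedPage}`;
  }
}

// Replays steps 1-3 from the table in the order the mutex processes them.
function replayInterleaving(): string {
  const server = new SharedServer();
  server.selectPage(2); // Step 1: Agent A selects page 2
  server.selectPage(3); // Step 2: Agent B selects page 3
  return server.takeScreenshot(); // Step 3: Agent A expects page 2
}

const raceResult = replayInterleaving();
console.log(raceResult); // prints "screenshot of page 3"
```

Every call succeeds individually, yet Agent A's screenshot targets Agent B's page, because the selection lives in state that outlasts any single call.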
Use cases
(These are workflows I am actually running, not hypothetical ones.)
Parallel site exploration for specification writing. A team of agents explores a large e-commerce site to produce functional specifications for each domain (navigation, product listing, product detail, checkout, account, etc.). A single agent doing this sequentially takes hours. With 3-5 parallel agents, each assigned a domain and operating in its own isolated context, the same work completes in a fraction of the time. Each agent needs to navigate, take snapshots, fill forms, and interact with elements independently.
Cross-session cache and state testing. Two agents investigate caching issues by browsing the same site simultaneously: one logged in as a premium user, the other as a guest. Isolated contexts give them separate cookies and sessions, but they need to take screenshots, evaluate scripts, and compare DOM state in parallel to catch bugs where one user sees another's cached data.
Responsive design auditing. Multiple agents emulate different viewports (mobile, tablet, desktop) on the same site in parallel. Each agent operates in its own isolated context with a different viewport configuration, taking screenshots and snapshots across the same set of pages to compare layout differences.
Multi-locale testing. Agents browse separate locale-specific versions of a site (en-US, de-DE, ja-JP) in parallel, each in its own isolated context with different language/geo settings, verifying that translations, currency formatting, and regional content are correct.
Why separate MCP server instances don't solve this
Agent frameworks like Claude Code configure the MCP server once (in .mcp.json), and the framework launches a single server process. When the user spawns a team of parallel agents (for site exploration, cross-role testing, parallel crawling, etc.), all agents share that single MCP server. The user has no mechanism to give each agent its own server instance.
Proposed solution
Initially I proposed routing tools via the isolatedContext name from new_page. After discussion with @OrKoN, pageId routing is a better fit: it works for same-context multi-page scenarios, doesn't require isolated contexts, and the concept already exists in list_pages/select_page.
Add an optional pageId parameter to all page-operating tools. When provided, the tool resolves the target page directly, bypassing getSelectedPage(). When omitted, behavior is unchanged.
// Single-agent (unchanged):
take_screenshot()
// Multi-agent (atomic, no select_page needed):
take_screenshot(pageId: 2)

Agents get the page ID from the new_page / list_pages response and pass it to every subsequent call. No new concepts to track, and select_page is no longer needed in multi-agent flows.
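A rough sketch of the resolution rule, assuming a simple in-memory registry: `PageRegistry`, `PageHandle`, and `resolvePage` are illustrative names, not the identifiers used in the PR. An explicit `pageId` is looked up directly, and an omitted one falls back to the selected page, so existing single-agent callers see no change.

```typescript
// Hypothetical stand-in for a page object tracked by the server.
interface PageHandle {
  id: number;
  url: string;
}

// Hypothetical registry; the real server's bookkeeping may differ.
class PageRegistry {
  private pages = new Map<number, PageHandle>();
  private selectedPageId: number | undefined = undefined;

  // Registers a page; the first page added becomes the selected one.
  addPage(page: PageHandle): void {
    this.pages.set(page.id, page);
    if (this.selectedPageId === undefined) {
      this.selectedPageId = page.id;
    }
  }

  // Models select_page: mutates the shared selection.
  selectPage(id: number): void {
    if (!this.pages.has(id)) {
      throw new Error(`No page with id ${id}`);
    }
    this.selectedPageId = id;
  }

  // Explicit pageId bypasses the shared selection; omitted pageId
  // keeps the existing "use the selected page" behavior.
  resolvePage(pageId?: number): PageHandle {
    const targetId = pageId ?? this.selectedPageId;
    if (targetId === undefined) {
      throw new Error("No page selected");
    }
    const page = this.pages.get(targetId);
    if (page === undefined) {
      throw new Error(`No page with id ${targetId}`);
    }
    return page;
  }
}

const registry = new PageRegistry();
registry.addPage({ id: 1, url: "https://example.com/" });
registry.addPage({ id: 2, url: "https://example.com/a" });
registry.addPage({ id: 3, url: "https://example.com/b" });

registry.selectPage(2); // Agent A selects page 2
registry.selectPage(3); // Agent B selects page 3 in between

const explicitTarget = registry.resolvePage(2); // Agent A passes pageId
const implicitTarget = registry.resolvePage(); // legacy call, no pageId
console.log(explicitTarget.id, implicitTarget.id);
```

With the explicit ID, Agent A's call resolves to page 2 regardless of what Agent B selected, while the call without `pageId` still follows the shared selection.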
Implementation
PR at #1022 (feat/isolated-context-page-routing branch).