Developer guide

Hack on brow-use

How the repo is laid out, how to build it, and how to add new tools, commands, or extension capabilities.

Repository layout

Path	Contents
mcp/	MCP server entry (`index.ts`) and the WebSocket bridge (`crx-client.ts`) that talks to the extension.
tool/	One file per MCP tool — browser tools, file-only tools, pure-compute tools. Each tool exports a `Tool` conforming to `tool.ts`.
extension/	Chrome MV3 extension: `background.ts`, `content.ts`, popup & sidepanel, manifest. Built with Vite.
plugin/	Plugin manifest plus `commands/*.md` — every `/bu:<name>` command is a markdown file Claude executes verbatim.
viewer/	React + Vite app for browsing recorded runs (trace + sidecar + reasoning + aria log on a timeline).
scripts/	Build & dev helpers: `extract-trace.ts`, `watch-extension.ts`, `generate-icons.ts`.
config/	Default app template (`app.json`). User apps live in `.brow-use/apps.json` at runtime.
output/	Generated artifacts (page/, workflow/, trace/, docs/, exploration/, reasoning/, results/). Created on first run.
Makefile	One-shot tasks: `make build`, `make install`, `make extract SESSION=<id>`.

Build & install

Prerequisites

Node ≥ 20 (matches the @types/node dev dependency).
Chrome (for session mode and extension dev).
Claude Code with plugin support enabled.

One-time setup

git clone <repo> brow-use && cd brow-use/spike
npm install
make build              # builds dist/mcp/ and dist/extension/
make install            # registers the marketplace and installs the bu plugin into Claude Code

Reinstall after changes

The Makefile collapses the uninstall/install dance:

make reinstall

For pure MCP-server iteration without going through Claude Code, run the server directly:

make dev-mcp            # npx tsx mcp/index.ts

Running tests

Tests live next to the code they cover (tool/foo.ts → tool/foo.test.ts) and run via Node's built-in test runner (node:test):

npm test

Most tool tests use real fixtures from output/ (existing trace zips, aria logs) so they exercise the same paths the live commands do. They call t.skip() automatically when fixtures aren't present, so a fresh clone passes without manual setup.
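
The skip-when-missing behaviour can be sketched roughly as follows. This is a minimal illustration using node:test, not a copy of an existing test; the fixture path and the hasFixture helper are hypothetical.

```typescript
import { test } from 'node:test'
import assert from 'node:assert/strict'
import { existsSync, statSync } from 'node:fs'
import { join } from 'node:path'

// Hypothetical fixture location; a real test points at whichever artifact it needs.
const fixture = join('output', 'trace', 'example-session.zip')

// Hypothetical helper: true when the fixture exists as a regular file.
function hasFixture(path: string): boolean {
  return existsSync(path) && statSync(path).isFile()
}

test('trace fixture is non-empty', (t) => {
  if (!hasFixture(fixture)) {
    // skip() only marks the test as skipped; it does not stop execution, hence the return.
    t.skip('fixture not present; record a run to generate it')
    return
  }
  assert.ok(statSync(fixture).size > 0)
})
```

On a fresh clone the fixture is absent, so the test reports as skipped rather than failed.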

Anatomy of a tool

Every MCP tool is a TypeScript module exporting a single Tool. The interface lives in tool/tool.ts:

interface Tool {
  name: string
  description: string
  inputSchema: { type: 'object'; properties: Record<string, unknown>; required?: string[] }
  execute(input: Record<string, unknown>, ctx: ToolContext): Promise<string | ToolResultContent[]>
}

Three kinds of tools, each routed differently in mcp/index.ts:

Browser tools — drive the page (navigate, click, type, get_accessibility_tree, snapshot…). In session mode, the call is forwarded to the extension over WebSocket; in default mode, it runs against a Playwright Page.
File-only tools — read or write files in output/ or .brow-use/ (write_page_object, write_feature_doc, read_observed_edges, record_run). Always run on the Node side.
Pure-compute tools — neither browser nor disk (compare_fingerprint). Pure functions exposed as MCP tools because the agent calls them.
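
To make the routing concrete, here is a small sketch of how a tool name could map to an execution target. The set contents come from the examples above; classifyTool, routeFor, and the assumption that pure-compute tools run in the Node process are illustrative and may not match mcp/index.ts.

```typescript
type ToolKind = 'browser' | 'file-only' | 'pure-compute'
type Route = 'node' | 'extension' | 'playwright'

// Tool names taken from the examples in this guide; the real sets live in mcp/index.ts.
const fileOnlyTools = new Set([
  'write_page_object',
  'write_feature_doc',
  'read_observed_edges',
  'record_run',
])
const pureComputeTools = new Set(['compare_fingerprint'])

// Hypothetical helper: anything not listed in a set is treated as a browser tool.
function classifyTool(name: string): ToolKind {
  if (fileOnlyTools.has(name)) return 'file-only'
  if (pureComputeTools.has(name)) return 'pure-compute'
  return 'browser'
}

// Hypothetical helper: only browser tools depend on the mode.
function routeFor(name: string, sessionMode: boolean): Route {
  if (classifyTool(name) !== 'browser') return 'node'
  return sessionMode ? 'extension' : 'playwright'
}
```

For example, routeFor('click', true) yields 'extension', while routeFor('record_run', true) stays on 'node' regardless of mode.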

Adding a new tool

Create tool/<name>.ts exporting a Tool.
Import and register it in mcp/index.ts (browserTools array).
If file-only, add the tool name to the fileOnlyTools set in the same file. If pure-compute, add it to pureComputeTools.
Add a tool/<name>.test.ts — the existing tests are the best reference for fixture handling and skip behaviour.

If your tool needs to be callable from a slash command, also list it in the allowed-tools frontmatter of the relevant plugin/commands/*.md file as MCP(bu/<name>).
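
As a starting point, a new tool module might look like the sketch below. The Tool interface is repeated from tool/tool.ts so the example stands alone; the ToolContext and ToolResultContent shapes are placeholders, and word_count is a hypothetical tool invented for illustration.

```typescript
// Placeholder shapes: the real ToolContext and ToolResultContent are defined in tool/tool.ts.
type ToolContext = Record<string, unknown>
type ToolResultContent = { type: 'text'; text: string }

interface Tool {
  name: string
  description: string
  inputSchema: { type: 'object'; properties: Record<string, unknown>; required?: string[] }
  execute(input: Record<string, unknown>, ctx: ToolContext): Promise<string | ToolResultContent[]>
}

// Hypothetical pure-compute tool: counts whitespace-separated words.
export const wordCount: Tool = {
  name: 'word_count',
  description: 'Count whitespace-separated words in a string.',
  inputSchema: {
    type: 'object',
    properties: { text: { type: 'string' } },
    required: ['text'],
  },
  async execute(input) {
    // Validate at the system boundary: input arrives untyped from the MCP client.
    if (typeof input.text !== 'string') throw new Error('text must be a string')
    const words = input.text.trim().split(/\s+/).filter(Boolean)
    return String(words.length)
  },
}
```

In the repo the module would import Tool from tool/tool.ts instead of redeclaring it, and a pure-compute tool like this one would also be added to pureComputeTools.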

Anatomy of a command

Slash commands are markdown files in plugin/commands/. Each one starts with YAML frontmatter and is followed by free-form instructions Claude reads as the agent's brief for the duration of the command. The frontmatter looks like this:

---
disable-model-invocation: true
description: One-line summary that shows up in Claude Code's command picker
allowed-tools: Read, Glob, MCP(bu/health_check), MCP(bu/navigate), MCP(bu/click)
---

allowed-tools is the security boundary. The agent cannot call MCP tools that aren't listed here, even if they're registered on the server. disable-model-invocation: true means the command runs only when the user explicitly invokes it.
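
Because a tool that is registered but not listed is silently unavailable to the command, a small check can catch mismatches early. The sketch below is a hypothetical helper, not part of the repo; it assumes allowed-tools is a single comma-separated line, as in the example above, and does not attempt full YAML parsing.

```typescript
// Extracts the allowed-tools entries from a command file's frontmatter.
// Simplified: expects a single-line, comma-separated value rather than arbitrary YAML.
function parseAllowedTools(markdown: string): string[] {
  const frontmatter = markdown.match(/^---\n([\s\S]*?)\n---/)
  if (!frontmatter) return []
  const body = frontmatter[1] ?? ''
  const line = body.split('\n').find((l) => l.startsWith('allowed-tools:'))
  if (!line) return []
  return line
    .slice('allowed-tools:'.length)
    .split(',')
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0)
}

// True when the command lists the tool in the MCP(bu/<name>) form used in this guide.
function allowsTool(markdown: string, toolName: string): boolean {
  return parseAllowedTools(markdown).includes(`MCP(bu/${toolName})`)
}

const example = [
  '---',
  'disable-model-invocation: true',
  'description: Example command',
  'allowed-tools: Read, Glob, MCP(bu/health_check), MCP(bu/navigate), MCP(bu/click)',
  '---',
  'Body of the command.',
].join('\n')

const canClick = allowsTool(example, 'click')
const canSnapshot = allowsTool(example, 'snapshot')
```

Here canClick is true and canSnapshot is false, since snapshot is not in the list.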

Adding a new command

Create plugin/commands/<name>.md.
Decide which MCP tools it needs and list them in allowed-tools.
Write the body as instructions to Claude — preflight, inputs, execution, outputs, failure modes. Use the existing commands (explore.md, document.md, do.md) as references.
make reinstall picks up the new command.

Working on the extension

Build & load

npm run build:extension

Then in Chrome:

Open chrome://extensions
Enable Developer mode (top-right toggle)
Load unpacked → select dist/extension/

This is a one-time step; Chrome remembers the extension across restarts.

Watch mode

Start Chrome with remote debugging so the watcher can trigger reloads over CDP:

npm run chrome             # opens Chrome with --remote-debugging-port=9222
npm run dev:extension      # rebuilds + reloads on every save in extension/

If Chrome was already open, quit it first — --remote-debugging-port is a startup flag and is ignored on a running instance.
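
For orientation, this is roughly how a script can locate an extension's service worker through the debugging port. It is a sketch under assumptions: it uses Chrome's /json/list HTTP endpoint and filters on target type and URL scheme, and it does not show how scripts/watch-extension.ts actually triggers the reload.

```typescript
// Subset of the fields Chrome returns for each target on its debugging endpoint.
interface DebugTarget {
  type: string
  url: string
  webSocketDebuggerUrl?: string
}

// Keeps only extension service workers, identified by target type and URL scheme.
function findExtensionWorkers(targets: DebugTarget[]): DebugTarget[] {
  return targets.filter(
    (target) => target.type === 'service_worker' && target.url.startsWith('chrome-extension://'),
  )
}

// Queries a Chrome instance started with --remote-debugging-port (9222 as in npm run chrome).
async function listExtensionWorkers(port = 9222): Promise<DebugTarget[]> {
  const res = await fetch(`http://127.0.0.1:${port}/json/list`)
  if (!res.ok) throw new Error(`Chrome debugging endpoint returned ${res.status}`)
  return findExtensionWorkers((await res.json()) as DebugTarget[])
}
```

If listExtensionWorkers() rejects with a connection error, Chrome was most likely started without the debugging flag.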

Adding an extension command

The extension's background.ts dispatches incoming WebSocket messages to handlers backed by playwright-crx. Adding a new command is three steps:

Add a case to the dispatcher in extension/background.ts.
Add a corresponding wrapper in mcp/crx-client.ts (or just call crxClient.execute(name, args) directly from the tool).
Implement the tool on the Node side (so the same tool works in default Playwright mode too).
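
The first step can be pictured with the sketch below. The message and reply shapes, the scroll_to_top command, and the stand-in return values are all hypothetical; in the extension each case would call into playwright-crx, and the real wire format is whatever background.ts and crx-client.ts agree on.

```typescript
// Hypothetical wire shapes for illustration only.
interface CommandMessage {
  id: number
  name: string
  args: Record<string, unknown>
}
interface CommandReply {
  id: number
  ok: boolean
  result?: unknown
  error?: string
}

// Stand-in dispatcher: each case returns a string where the extension would drive the page.
async function runCommand(name: string, args: Record<string, unknown>): Promise<unknown> {
  switch (name) {
    case 'navigate':
      return `navigated to ${String(args.url)}`
    // A new command gets its own case here.
    case 'scroll_to_top':
      return 'scrolled'
    default:
      throw new Error(`unknown command: ${name}`)
  }
}

// Wraps the outcome in a reply carrying the request id so the caller can match it up.
async function dispatch(msg: CommandMessage): Promise<CommandReply> {
  try {
    return { id: msg.id, ok: true, result: await runCommand(msg.name, msg.args) }
  } catch (err) {
    return { id: msg.id, ok: false, error: err instanceof Error ? err.message : String(err) }
  }
}
```

Returning an error reply for unknown names, rather than throwing, keeps one bad message from taking down the connection.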

For full extension docs see Chrome extension and Session mode.

Working on the viewer

The viewer is a React + Vite app that turns a recorded run into a navigable timeline. It reads from a small JSON database produced by viewer/ingest.ts.

npm run viewer:ingest      # ingests every run from .brow-use/runs.json
npm run viewer:dev         # starts vite dev server

The ingester correlates the trace zip's action stream with the aria log, the reasoning log, and per-step screenshots, then emits a single JSON file the React app loads on the client. Re-run viewer:ingest after a new /bu:explore or /bu:run-instruction run to refresh the data.
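
One plausible way to correlate two of these streams is by timestamp: attach to each action the most recent aria entry recorded at or before it. The sketch below illustrates that idea only; the TraceAction and AriaEntry shapes are hypothetical, and viewer/ingest.ts may join the data differently.

```typescript
// Hypothetical record shapes; the real trace and aria log formats are not shown in this guide.
interface TraceAction {
  name: string
  startTime: number
}
interface AriaEntry {
  timestamp: number
  tree: string
}
interface TimelineStep {
  action: TraceAction
  aria: AriaEntry | undefined
}

// Pairs each action with the latest aria entry whose timestamp is <= the action's start time.
function correlate(actions: TraceAction[], ariaLog: AriaEntry[]): TimelineStep[] {
  const entries = [...ariaLog].sort((a, b) => a.timestamp - b.timestamp)
  const ordered = [...actions].sort((a, b) => a.startTime - b.startTime)
  return ordered.map((action) => {
    let match: AriaEntry | undefined
    for (const entry of entries) {
      if (entry.timestamp > action.startTime) break
      match = entry
    }
    return { action, aria: match }
  })
}
```

Actions that precede every aria entry end up with aria left undefined, which a timeline view can render as "no snapshot yet".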

Coding conventions

File naming: kebab-case for TS/JS files, one module per concept.
Folders: singular by component name (tool/, repository/, domain/), grouped by responsibility per the layered conventions in CLAUDE.md.
Comments: default to none. Add only when the why is non-obvious.
Defensive code: avoid null/undefined guards in internal code. Validate at the system boundary (tool inputs, network responses) only.
Tests: live next to source as *.test.ts. Use real artifacts from output/ via fs.symlinkSync when possible — fixture-free tests are unrealistic.
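
The symlink convention for tests can be sketched as below. linkFixture is a hypothetical helper, and the stand-in artifact is created on the spot so the example runs anywhere; a real test would pass a path under output/. Creating symlinks may require extra privileges on Windows.

```typescript
import { mkdtempSync, readFileSync, symlinkSync, writeFileSync } from 'node:fs'
import { tmpdir } from 'node:os'
import { join, resolve } from 'node:path'

// Links an existing artifact into a fresh temp directory and returns the link path.
function linkFixture(artifactPath: string, linkName: string): string {
  const dir = mkdtempSync(join(tmpdir(), 'bu-fixture-'))
  const link = join(dir, linkName)
  // Resolve to an absolute target so the link works regardless of the current directory.
  symlinkSync(resolve(artifactPath), link)
  return link
}

// Usage with a stand-in artifact written to a temp directory.
const artifact = join(mkdtempSync(join(tmpdir(), 'bu-artifact-')), 'aria.log')
writeFileSync(artifact, 'stand-in aria log')
const linked = linkFixture(artifact, 'aria.log')
const contents = readFileSync(linked, 'utf8')
```

Linking instead of copying keeps the test reading the exact bytes a live command produced, without duplicating large trace files.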

Common dev workflows

Iterating on a tool

Edit tool/<name>.ts → run npm test for fast feedback. When ready end-to-end, restart Claude Code (or run make reinstall) so the MCP server is rebuilt.

Iterating on a command

Edit the markdown file in plugin/commands/. Restart Claude Code so the new file is picked up. No build required for command changes — they're plain markdown.

Iterating on the extension

npm run chrome then npm run dev:extension in a separate terminal. Save → reload happens automatically. The extension's WebSocket auto-reconnects every 3s so the MCP server can come and go independently.
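
The reconnect behaviour amounts to a fixed-delay retry loop, sketched below. keepConnected and its parameters are hypothetical names, and the scheduler is injectable only so the loop can be exercised without real timers; the extension's actual implementation may differ.

```typescript
// Opens a connection and calls onClose when it drops or fails to open.
type Connect = (onClose: () => void) => void
// Runs fn after delayMs; defaults to setTimeout.
type Schedule = (fn: () => void, delayMs: number) => void

// Retries the connection with a fixed delay (3s here, matching the interval described above).
function keepConnected(
  connect: Connect,
  delayMs = 3000,
  schedule: Schedule = (fn, ms) => {
    setTimeout(fn, ms)
  },
): void {
  const attempt = (): void => {
    connect(() => schedule(attempt, delayMs))
  }
  attempt()
}

// In the extension, connect could wrap a WebSocket (the URL is a placeholder):
// keepConnected((onClose) => {
//   const ws = new WebSocket('ws://localhost:<port>')
//   ws.onclose = onClose
// })
```

A fixed delay is enough here because there is a single local client; backoff matters more when many clients hit a shared server.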

Iterating on the viewer

npm run viewer:dev. Re-run npm run viewer:ingest when the underlying run data changes.

Where to look next

Architecture

Layer model, mode comparison, and the run database schema.

Chrome extension internals

Permissions, supported commands, connection lifecycle.

Session mode internals

WebSocket bridge, mode switching, tracing in CRX mode.

User guide

Use the plugin end-to-end before touching the source.