agent-browser

Browser automation CLI designed for AI agents. Compact text output minimizes context usage. 100% native Rust.

What is agent-browser?

agent-browser is a high-performance, headless browser automation CLI built specifically for AI agents (such as Claude Code, Cursor, or Gemini). Unlike traditional automation tools that return heavy JSON or HTML, it returns a compact, text-based accessibility tree to maximize LLM context efficiency and reduce token costs.

Key Features

Agent-first Design: Uses compact text output instead of bulky JSON/DOM, saving thousands of tokens per request.
Ref-based Interaction: Assigns unique tags (e.g., @e1, @e2) to elements in a snapshot, allowing AI to interact with specific elements deterministically.
Native Rust Performance: Built entirely in Rust for instant command parsing and execution.
Session Management: Supports multiple isolated browser instances with separate authentication states.
Comprehensive Command Set: Includes over 50 commands covering navigation, form filling, screenshots, network monitoring, and storage.
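
As a rough illustration of the ref-based interaction described above, the sketch below parses a snapshot into a lookup table keyed by ref. The snapshot text format, the `parse_snapshot` and `resolve` helpers, and the sample elements are all hypothetical, invented for this example; the tool's actual output format may differ.

```python
import re

# Hypothetical snapshot text (assumed format, not the tool's confirmed output).
SNAPSHOT = '''\
@e1 heading "Example Domain"
@e2 link "More information"
@e3 textbox "Email"
@e4 button "Submit"
'''

# One line per element: ref tag, role, quoted accessible name.
LINE_RE = re.compile(r'^(@e\d+)\s+(\w+)\s+"([^"]*)"$')


def parse_snapshot(text):
    """Map each ref tag to its (role, name) pair."""
    refs = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            ref, role, name = match.groups()
            refs[ref] = (role, name)
    return refs


def resolve(refs, ref):
    """Look up a ref exactly; unknown refs raise instead of guessing."""
    if ref not in refs:
        raise KeyError(f"unknown ref {ref!r}; take a fresh snapshot")
    return refs[ref]


refs = parse_snapshot(SNAPSHOT)
print(resolve(refs, "@e4"))  # ('button', 'Submit')
```

Because each ref is an exact key, a lookup either finds the intended element or fails loudly, which is the deterministic behavior the feature list describes.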

Use Cases

AI Agent Integration: Providing LLMs with a "browser-as-a-tool" to perform web research or data entry.
Web Automation: Running scripts for automated testing or repetitive web tasks via shell commands.
Context-Constrained Environments: Navigating complex web pages within limited token windows.
Cross-Platform Automation: Deploying automation tasks across macOS, Linux, and Windows using native binaries.
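
To make the shell-driven use cases above more concrete, here is a minimal sketch of a script that assembles a short command sequence and runs it in dry-run mode by default. The subcommand names `open`, `snapshot`, and `click`, along with the helpers `build_plan` and `run_plan`, are assumptions for illustration and are not confirmed by this page; check the tool's own help output before relying on them.

```python
import shutil
import subprocess

BINARY = "agent-browser"


def build_plan(url, ref):
    """Return argv lists for an open -> snapshot -> click sequence.

    The subcommand names are assumptions, not taken from this page.
    """
    return [
        [BINARY, "open", url],
        [BINARY, "snapshot"],
        [BINARY, "click", ref],
    ]


def run_plan(plan, dry_run=True):
    """Print each command; execute only if dry_run is False and the binary is on PATH."""
    outputs = []
    for argv in plan:
        print(" ".join(argv))
        if not dry_run and shutil.which(argv[0]):
            done = subprocess.run(argv, capture_output=True, text=True, check=True)
            outputs.append(done.stdout)
    return outputs


plan = build_plan("https://example.com", "@e2")
run_plan(plan)  # dry run: prints the three commands, executes nothing
```

Keeping the plan as plain argument lists avoids shell quoting problems and lets the same sequence be logged, reviewed, or replayed.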

FAQ / Why use "Refs"?

The tool uses a ref-based snapshot system for three reasons:

Efficiency: Text snapshots use ~200-400 tokens, compared to ~3000-5000 for a full DOM.
Precision: Refs point to exact elements, so the AI never has to guess selectors.
Speed: No DOM re-querying is required between commands.
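
The token figures above can be turned into a quick back-of-the-envelope calculation. The sketch below uses only the ranges quoted in this section; the 8,000-token budget and the helper names are hypothetical, chosen purely for illustration.

```python
# Token ranges quoted in this section (approximate).
SNAPSHOT_TOKENS = (200, 400)
DOM_TOKENS = (3000, 5000)

# Hypothetical context budget reserved for page content.
BUDGET = 8000


def savings_range(snapshot, dom):
    """Smallest and largest per-page saving implied by the two ranges."""
    low = dom[0] - snapshot[1]   # smallest DOM minus largest snapshot
    high = dom[1] - snapshot[0]  # largest DOM minus smallest snapshot
    return low, high


def pages_in_budget(budget, tokens_per_page):
    """Whole pages that fit in the budget at a given per-page cost."""
    return budget // tokens_per_page


print(savings_range(SNAPSHOT_TOKENS, DOM_TOKENS))   # (2600, 4800)
print(pages_in_budget(BUDGET, SNAPSHOT_TOKENS[1]))  # 20
print(pages_in_budget(BUDGET, DOM_TOKENS[1]))       # 1
```

Under these assumptions, a snapshot saves roughly 2,600 to 4,800 tokens per page, and even at its upper bound about twenty snapshots fit in the space of one or two full DOM dumps.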

Similar to agent-browser

Browser Use

Agents at scale. Undetectable browsers. The API for any website.

Automation

pinchtab

High-performance browser automation bridge and multi-instance orchestrator with advanced stealth injection and real-time dashboard.

Automation

lightpanda

The first browser for machines, not humans. 10x faster. 10x less memory. Instant startup.

Automation

chrome-devtools-mcp

chrome-devtools-mcp lets your coding agent (such as Gemini, Claude, Cursor or Copilot) control and inspect a live Chrome browser.

Automation

playwright

Playwright is a framework for web browser automation that allows testing across Chromium, Firefox, and WebKit with a single API.

Productivity, Developer Tools

n8n

Flexible AI workflow automation platform for technical teams. Build multi-step AI agents, integrate 500+ apps, and deploy on-prem or in the cloud with full control.

Automation