agent-browser

Browser automation CLI designed for AI agents. Compact text output minimizes context usage. 100% native Rust.

What is agent-browser?

agent-browser is a high-performance, headless browser automation CLI built for AI agents such as Claude Code, Cursor, or Gemini. Where traditional automation tools return heavy JSON or HTML, it returns a compact, text-based accessibility tree, which keeps LLM context usage and token costs low.

Key Features

  • Agent-first Design: Uses compact text output instead of bulky JSON/DOM, saving thousands of tokens per request.
  • Ref-based Interaction: Assigns unique tags (e.g., @e1, @e2) to elements in a snapshot, allowing AI to interact with specific elements deterministically.
  • Native Rust Performance: Built entirely in Rust for instant command parsing and execution.
  • Session Management: Supports multiple isolated browser instances with separate authentication states.
  • Comprehensive Command Set: Includes over 50 commands covering navigation, form filling, screenshots, network monitoring, and storage.
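
To make the ref-based interaction idea concrete, here is a minimal Python sketch of how a snapshot might tag elements and how a later command could resolve a tag back to its element. It is not agent-browser's implementation (the tool itself is written in Rust): the `Node` class, the `take_snapshot` and `resolve` helpers, and the `[@e1]` line format are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """One node of a simplified accessibility tree (hypothetical model)."""
    role: str
    name: str


def take_snapshot(nodes):
    """Return (text, refs): a compact text snapshot and a ref -> node map.

    The line format is an assumption for this sketch, not the tool's output.
    """
    lines = []
    refs = {}
    for index, node in enumerate(nodes, start=1):
        ref = f"@e{index}"
        refs[ref] = node
        lines.append(f'{node.role} "{node.name}" [{ref}]')
    return "\n".join(lines), refs


def resolve(refs, ref):
    """Look up a ref from the latest snapshot; unknown refs raise KeyError."""
    if ref not in refs:
        raise KeyError(f"unknown ref {ref}; take a new snapshot first")
    return refs[ref]


page = [
    Node("heading", "Sign in"),
    Node("textbox", "Email"),
    Node("textbox", "Password"),
    Node("button", "Submit"),
]
snapshot_text, snapshot_refs = take_snapshot(page)
print(snapshot_text)

# A later "click @e4" style command only needs a dictionary lookup.
target = resolve(snapshot_refs, "@e4")
print(f"would click: {target.role} {target.name!r}")
```

Because each ref maps to exactly one element of the snapshot, an agent can name its target directly instead of composing a CSS selector or XPath expression.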

Use Cases

  • AI Agent Integration: Providing LLMs with a "browser-as-a-tool" to perform web research or data entry.
  • Web Automation: Running scripts for automated testing or repetitive web tasks via shell commands.
  • Context-Constrained Environments: Navigating complex web pages within limited token windows.
  • Cross-Platform Automation: Deploying automation tasks across macOS, Linux, and Windows using native binaries.

FAQ / Why use "Refs"?

The tool uses a ref-based snapshot system for three reasons:

  1. Efficiency: Text snapshots use ~200-400 tokens, compared to ~3000-5000 for a full DOM.
  2. Precision: Refs point to exact elements, so the AI does not have to guess selectors.
  3. Speed: Commands reuse refs from the latest snapshot, so no DOM re-querying is required between commands.
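
The efficiency figures above can be turned into a rough savings estimate. The short Python calculation below uses only the two token ranges quoted in the list; the `savings_range` helper is a hypothetical name, and the result is back-of-the-envelope arithmetic rather than a measurement.

```python
SNAPSHOT_TOKENS = (200, 400)    # text snapshot: low and high estimates
FULL_DOM_TOKENS = (3000, 5000)  # full DOM: low and high estimates


def savings_range(small, large):
    """Return (min_saved, max_saved, min_ratio, max_ratio) for two ranges.

    The smallest saving pairs the largest snapshot with the smallest DOM;
    the largest saving pairs the smallest snapshot with the largest DOM.
    """
    min_saved = large[0] - small[1]
    max_saved = large[1] - small[0]
    min_ratio = large[0] / small[1]
    max_ratio = large[1] / small[0]
    return min_saved, max_saved, min_ratio, max_ratio


min_saved, max_saved, min_ratio, max_ratio = savings_range(
    SNAPSHOT_TOKENS, FULL_DOM_TOKENS
)
print(f"tokens saved per snapshot: {min_saved} to {max_saved}")
print(f"size ratio: {min_ratio:.1f}x to {max_ratio:.1f}x")
# prints:
# tokens saved per snapshot: 2600 to 4800
# size ratio: 7.5x to 25.0x
```

Under these figures, a full DOM costs between 7.5 and 25 times as many tokens as a text snapshot of the same page.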

Similar to agent-browser

  • Browser Use (Automation): Agents at scale. Undetectable browsers. The API for any website.
  • pinchtab (Automation): High-performance browser automation bridge and multi-instance orchestrator with advanced stealth injection and real-time dashboard.
  • lightpanda (Automation): The first browser for machines, not humans. 10x faster. 10x less memory. Instant startup.
  • chrome-devtools-mcp (Automation): chrome-devtools-mcp lets your coding agent (such as Gemini, Claude, Cursor or Copilot) control and inspect a live Chrome browser.
  • playwright (Productivity, Developer Tools): Playwright is a framework for Web-browser automation that allows testing across Chromium, Firefox, and WebKit with a single API.
  • n8n (Automation): Flexible AI workflow automation platform for technical teams. Build multi-step AI agents, integrate 500+ apps, and deploy on-prem or in the cloud with full control.