agent-browser
Browser automation CLI designed for AI agents. Compact text output minimizes context usage. 100% native Rust.

What is agent-browser?
agent-browser is a high-performance, headless browser automation CLI built specifically for AI agents (such as Claude Code, Cursor, or Gemini). Unlike traditional automation tools that return heavy JSON or HTML, it returns a compact, text-based accessibility tree, which keeps LLM context usage and token costs low.
Key Features
- Agent-first Design: Uses compact text output instead of bulky JSON/DOM, saving thousands of tokens per request.
- Ref-based Interaction: Assigns unique tags (e.g., @e1, @e2) to elements in a snapshot, allowing AI to interact with specific elements deterministically.
- Native Rust Performance: Built entirely in Rust for instant command parsing and execution.
- Session Management: Supports multiple isolated browser instances with separate authentication states.
- Comprehensive Command Set: Includes over 50 commands covering navigation, form filling, screenshots, network monitoring, and storage.
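The ref-based interaction described above can be sketched in Python. This is a minimal illustration, not the tool's implementation: the snapshot text and its `[ref=eN]` line layout are assumptions made for the example, and `parse_refs` and `find_ref` are hypothetical helpers.

```python
import re

# Hypothetical snapshot text; the real output format may differ.
SNAPSHOT = """\
- heading "Example Domain" [ref=e1]
- textbox "Email" [ref=e2]
- button "Submit" [ref=e3]
"""

# Matches lines shaped like: - role "name" [ref=eN]
LINE_RE = re.compile(r'^\s*-\s+(\w+)\s+"([^"]*)"\s+\[ref=(e\d+)\]')


def parse_refs(snapshot: str) -> dict[str, tuple[str, str]]:
    """Map each ref tag (e.g. '@e2') to its (role, name) pair."""
    refs = {}
    for line in snapshot.splitlines():
        m = LINE_RE.match(line)
        if m:
            role, name, ref = m.groups()
            refs["@" + ref] = (role, name)
    return refs


def find_ref(refs, role, name):
    """Return the tag of the first element with this role and name, else None."""
    for tag, (r, n) in refs.items():
        if r == role and n == name:
            return tag
    return None


refs = parse_refs(SNAPSHOT)
print(find_ref(refs, "button", "Submit"))
```

Once a tag is resolved this way, an agent can pass it straight to the next command instead of composing a CSS or XPath selector.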
Use Cases
- AI Agent Integration: Providing LLMs with a "browser-as-a-tool" to perform web research or data entry.
- Web Automation: Running scripts for automated testing or repetitive web tasks via shell commands.
- Context-Constrained Environments: Navigating complex web pages within limited token windows.
- Cross-Platform Automation: Deploying automation tasks across macOS, Linux, and Windows using native binaries.
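As a rough sketch of the "browser-as-a-tool" pattern, the snippet below builds argument lists for a sequence of CLI calls. The subcommand names (`open`, `snapshot`, `fill`, `click`), the refs, and the URL are assumptions chosen for illustration; check the tool's own help output for the actual command set. `build_command` is a hypothetical helper.

```python
import shlex

BINARY = "agent-browser"  # assumed to be installed and on PATH


def build_command(action: str, *args: str) -> list[str]:
    """Build an argv list for one CLI call; no shell is involved."""
    if not action:
        raise ValueError("action must be non-empty")
    return [BINARY, action, *args]


# Hypothetical workflow: open a page, take a snapshot, then act on refs
# that the snapshot would have assigned.
steps = [
    build_command("open", "https://example.com"),
    build_command("snapshot"),
    build_command("fill", "@e2", "user@example.com"),
    build_command("click", "@e3"),
]

for argv in steps:
    # To execute for real, pass argv to
    # subprocess.run(argv, capture_output=True, text=True, check=True).
    print(shlex.join(argv))
```

Building argv lists rather than shell strings avoids quoting problems when an agent supplies arbitrary form values.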
FAQ / Why use "Refs"?
The tool uses a ref-based snapshot system for three reasons:
- Efficiency: Text snapshots use ~200-400 tokens, compared to ~3000-5000 for a full DOM.
- Precision: Refs point to exact elements, so the AI does not have to guess selectors.
- Speed: No DOM re-querying is required between commands.
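To put the efficiency figures in perspective, the arithmetic below scales the quoted per-snapshot ranges to a multi-step session. The step count of 10 is a hypothetical value, and the sketch assumes one page observation per step.

```python
# Per-observation token ranges quoted in the list above (low, high).
SNAPSHOT_TOKENS = (200, 400)
DOM_TOKENS = (3000, 5000)

STEPS = 10  # hypothetical number of page observations in one session


def session_cost(per_step, steps):
    """Scale a (low, high) per-step token range to a whole session."""
    low, high = per_step
    return (low * steps, high * steps)


snap = session_cost(SNAPSHOT_TOKENS, STEPS)  # (2000, 4000)
dom = session_cost(DOM_TOKENS, STEPS)        # (30000, 50000)

# Smallest saving: cheapest DOM case vs. most expensive snapshot case.
min_saving = dom[0] - snap[1]  # 30000 - 4000 = 26000
# Largest saving: most expensive DOM case vs. cheapest snapshot case.
max_saving = dom[1] - snap[0]  # 50000 - 2000 = 48000

print(f"snapshots: {snap[0]}-{snap[1]} tokens")
print(f"full DOM:  {dom[0]}-{dom[1]} tokens")
print(f"saved:     {min_saving}-{max_saving} tokens")
```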





