Browser-act is built for models, not for humans writing browser scripts by hand. The interface is designed around one question: how can an AI agent understand the browser state and choose the next safe action?
Compact state output
Agents do not need the full DOM or verbose JSON for every step. The `state` command returns a compact, indexed view of the page.
The `*` marker highlights elements that are new or changed since the previous `state` call, which helps the agent focus on what changed.
| State field | How the agent uses it |
|---|---|
| `url` | Confirms the current page before taking action |
| `title` | Checks whether navigation reached the expected screen |
| `[N]` index | Provides the target for commands such as `click 4` or `input 2` |
| `*` marker | Points attention to elements that changed after the last action |
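
As a rough illustration of how an agent-side helper might consume this view, the sketch below parses a compact state dump and builds a `click N` command only for indexes that are present. The exact text that `state` prints is not shown on this page, so the `url:` / `title:` lines and the `[N]` / `*[N]` element lines in `SAMPLE` are assumptions made for the example, as are the names `parse_state`, `State`, `Element`, and `click_command`.

```python
import re
from dataclasses import dataclass, field

# Hypothetical sample: the real `state` output format may differ.
SAMPLE = """\
url: https://example.com/login
title: Sign in
[1] input "Email"
[2] input "Password"
*[4] button "Sign in"
"""

@dataclass
class Element:
    index: int
    label: str
    changed: bool  # True when the line carried the * marker

@dataclass
class State:
    url: str = ""
    title: str = ""
    elements: dict = field(default_factory=dict)  # index -> Element

# Optional "*" marker, then "[N]", then the rest of the line as a label.
ELEMENT_RE = re.compile(r"^(\*?)\[(\d+)\]\s*(.*)$")

def parse_state(text: str) -> State:
    state = State()
    for line in text.splitlines():
        line = line.strip()
        match = ELEMENT_RE.match(line)
        if match:
            star, index, label = match.groups()
            state.elements[int(index)] = Element(int(index), label, star == "*")
        elif line.startswith("url:"):
            state.url = line[len("url:"):].strip()
        elif line.startswith("title:"):
            state.title = line[len("title:"):].strip()
    return state

def click_command(state: State, index: int) -> str:
    """Build a `click N` command, refusing indexes absent from the latest state."""
    if index not in state.elements:
        raise KeyError(f"index {index} is not in the current state; call state again")
    return f"click {index}"

state = parse_state(SAMPLE)
# Indexes of elements flagged as new or changed since the previous call.
changed = [e.index for e in state.elements.values() if e.changed]
```

With this sample, `changed` holds only index 4, and `click_command(state, 3)` raises because index 3 is not in the parsed view. Checking the index against the most recent state is one way to avoid acting on a stale view.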
Indexed interaction
Agents act by element index, using commands such as `click 4` or `input 2`.
No selector guessing
Agents do not need to generate XPath or CSS selectors for routine actions.
Same view, same actions
The agent acts on the same indexed elements it sees in `state`.
Refresh after change
Calling `state` again refreshes indexes after navigation or page updates.
Semantic browser descriptions
Each browser has a `desc` field that tells the agent what the browser is for.
Match tasks
Use `desc` to match a new request to an existing browser.
Avoid duplicates
Reuse known browsers instead of creating new ones for the same job.
Improve over time
Append useful context when a browser becomes associated with more workflows.
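
The matching idea above could be approximated with a simple word-overlap score. This is a minimal sketch, not Browser-act's matching logic: the `Browser` class, the `name` field, the `min_overlap` threshold, and the sample descriptions are all hypothetical, and a real agent would more likely let the model compare the request against each `desc` directly.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Set

@dataclass
class Browser:
    name: str  # hypothetical identifier
    desc: str  # free-text description of what the browser is for

def _words(text: str) -> Set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def match_browser(task: str, browsers: List[Browser], min_overlap: int = 2) -> Optional[Browser]:
    """Return the browser whose desc shares the most words with the task, if any."""
    task_words = _words(task)
    best, best_score = None, 0
    for browser in browsers:
        score = len(task_words & _words(browser.desc))
        if score > best_score:
            best, best_score = browser, score
    return best if best_score >= min_overlap else None

def append_context(browser: Browser, note: str) -> None:
    """Append a note to desc unless the same text is already present."""
    if note.lower() not in browser.desc.lower():
        browser.desc = f"{browser.desc}; {note}"

browsers = [
    Browser("b1", "Work email inbox and calendar"),
    Browser("b2", "Online store admin for order tracking and refunds"),
]
picked = match_browser("Check order tracking for yesterday's refund request", browsers)
unmatched = match_browser("Book a flight to Tokyo", browsers)
if picked is not None:
    append_context(picked, "refund request follow-ups")
```

Here the first request reuses `b2` instead of creating a duplicate, the second returns `None` so the agent knows no existing browser fits, and `append_context` enriches the description for future matches. The overlap score is deliberately naive (it counts common words such as "for"), so treat it as a placeholder for whatever matching the Skill layer actually performs.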
Browser selection priority
When multiple browsers exist, the agent should follow a defined selection order. This selection logic belongs to the Skill layer. After a user chooses, the agent should update `desc` so future tasks can match more directly.
Safety by default
Browser automation can affect real accounts and real data. Browser-act uses confirmation rules to keep the user in control.
Confirmation gates
Agents should ask before sensitive operations:
| Operation | Why confirmation matters |
|---|---|
| Create any browser | Creates a new automation endpoint |
| Delete a browser | Destroys persistent browser state |
| Import a profile | Copies login state into a managed browser |
| Change proxy settings | Changes network identity |
| Change privacy mode | Changes fingerprint and persistence behavior |
| Change `confirm_before_use` | Changes safety behavior |
| Open a `confirm_before_use` browser | Uses a browser marked as sensitive |
[!WARNING] These confirmation gates are agent instructions, not a replacement for platform-level security. Their effectiveness depends on the agent runtime and model following the Skill instructions.
Confirmation rules:
- prior approval does not carry over to a new sensitive operation
- every sensitive operation needs its own confirmation
- strong wording in the user’s original prompt does not replace confirmation
- the agent should explain what it will do before doing it
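
The rules above can be sketched as a small gate that asks before every sensitive call and never caches an approval. This is an illustration of the policy only: the operation names in `SENSITIVE_OPERATIONS` are hypothetical identifiers chosen to mirror the table rows, and `run_operation`, `ask_user`, and `scripted_user` are not part of Browser-act.

```python
from typing import Callable, List

# Hypothetical operation names mirroring the rows of the confirmation table.
SENSITIVE_OPERATIONS = {
    "create_browser",
    "delete_browser",
    "import_profile",
    "change_proxy",
    "change_privacy_mode",
    "change_confirm_before_use",
    "open_confirm_before_use_browser",
}

def run_operation(
    operation: str,
    explanation: str,
    action: Callable[[], str],
    ask_user: Callable[[str], bool],
) -> str:
    """Run `action`, asking first when the operation is sensitive.

    No approval is stored, so every sensitive call triggers a fresh prompt
    that explains what is about to happen.
    """
    if operation in SENSITIVE_OPERATIONS:
        if not ask_user(f"About to {explanation}. Proceed?"):
            return "cancelled"
    return action()

# Scripted stand-in for a real user: approves once, then declines.
answers = iter([True, False])
prompts: List[str] = []

def scripted_user(prompt: str) -> bool:
    prompts.append(prompt)
    return next(answers)

first = run_operation(
    "delete_browser", "delete browser 'b1' and its saved state",
    lambda: "deleted", scripted_user,
)
second = run_operation(
    "delete_browser", "delete browser 'b2' and its saved state",
    lambda: "deleted", scripted_user,
)
```

The first deletion runs after approval, while the second is cancelled even though the same operation was approved a moment earlier, which is the "prior approval does not carry over" rule in practice. As the warning notes, a gate like this only helps if the agent runtime actually routes sensitive operations through it.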
Local data handling
| Data | Location | Leaves the machine? |
|---|---|---|
| Cookies | Browser-act local storage | No |
| Login sessions | Isolated browser profile | No |
| Page content | In memory during the task | No |
| Screenshots | Local file system when saved | No |
| Network captures | Memory or local HAR files | No |
| Browser profiles | Isolated local directories | No |
The exception is `solve-captcha`, which sends CAPTCHA challenge images to the Browser-act cloud for solving. That request should not include cookies, page content, or full URLs.
Advanced capabilities
Browser-act includes more than simple click and input commands:
Network capture and HAR
Find API endpoints, debug auth flows, capture XHR-loaded data, and analyze page loading.
JavaScript evaluation
Run `eval` for complex extraction or page-local operations.
Cookie import and export
Move session state between browser types, machines, or CI jobs.
Offline mode
Test forms and button flows without making real network requests.
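
To make the "find API endpoints" use case concrete, the sketch below scans a HAR capture for JSON responses. It relies only on standard HAR 1.2 fields (`log.entries`, `request.method`, `request.url`, `response.status`, `response.content.mimeType`); how Browser-act names or saves its HAR files is not covered here, so the inline `sample_har` data and the function name `list_json_endpoints` are assumptions for illustration.

```python
from urllib.parse import urlsplit

def list_json_endpoints(har: dict) -> list:
    """Return unique (method, url-without-query, status) tuples for JSON responses."""
    seen = set()
    endpoints = []
    for entry in har.get("log", {}).get("entries", []):
        request, response = entry["request"], entry["response"]
        mime = response.get("content", {}).get("mimeType", "")
        if "json" not in mime.lower():
            continue  # skip HTML, images, scripts, and other non-JSON traffic
        parts = urlsplit(request["url"])
        key = (
            request["method"],
            f"{parts.scheme}://{parts.netloc}{parts.path}",  # query string dropped
            response["status"],
        )
        if key not in seen:
            seen.add(key)
            endpoints.append(key)
    return endpoints

def _entry(method: str, url: str, status: int, mime: str) -> dict:
    """Build a minimal HAR entry carrying only the fields used above."""
    return {
        "request": {"method": method, "url": url},
        "response": {"status": status, "content": {"mimeType": mime}},
    }

# Tiny inline capture; a real one would be loaded from a saved HAR file with json.load.
sample_har = {
    "log": {
        "entries": [
            _entry("GET", "https://example.com/", 200, "text/html"),
            _entry("GET", "https://example.com/api/items?page=1", 200, "application/json"),
            _entry("GET", "https://example.com/api/items?page=2", 200, "application/json"),
            _entry("POST", "https://example.com/api/login", 401, "application/json; charset=utf-8"),
        ]
    }
}
endpoints = list_json_endpoints(sample_har)
```

For this sample, the two paginated requests collapse into a single `/api/items` entry and the failed login shows up with its 401 status, which is often the first clue when debugging an auth flow.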
Learn more
Command Reference
Open the full Browser-act CLI command index.
Anti-detection & Blocking
Understand blocking, CAPTCHA handling, and handoff.
Concurrency & Isolation
Run parallel agent work without mixing state.

