ToShop Docs (Beta)

Browser

A real automated browser your agent can drive — navigate, click, type, screenshot, execute JS.

ToShop ships with a managed automated browser. Your agent can drive it end-to-end: navigate to a URL, read the page's accessibility tree, click elements by reference, fill forms, take screenshots, and even run JavaScript.

It's a real browser, not a web fetch

The browser_* tools spin up a full browser instance — JavaScript runs, login state persists, network requests are intercepted. It's not a stripped-down HTTP client.

Three web-shaped capabilities

browser_*

Managed automated browser. Full DOM, JS, cookies, sessions.

Use when: your agent needs to interact with the page: click, type, scrape rendered content, log in.

Cost: higher latency, more context (each snapshot is large).

web_fetch

HTTP-only fetch, no rendering. Returns markdown-extracted page text.

Use when: your agent only needs raw content and the page renders server-side.

Cost: lowest. Fast, with little context used.

open_url

Hands a URL to the system default browser.

Use when: your agent's recommendation includes a link you want to read next.

Cost: none for your agent, which stops being involved once the link opens.

The browser_* toolkit

Name                Description
browser_navigate    Load a URL in the automated browser.
browser_snapshot    Return the page's accessibility tree (interactive elements with refs like @e5, @e12).
browser_screenshot  Capture a full-page or viewport image.
browser_click       Click an element by accessibility ref.
browser_type        Type into a form field.
browser_press       Send a keyboard key (Enter, Tab, Esc, etc.).
browser_scroll      Scroll the page.
browser_evaluate    Run JavaScript in the page context.
browser_wait_for    Block until text appears, a selector matches, the URL changes, or a load state is reached.
browser_network     Inspect or record network requests (HAR-style).
browser_download    Track downloads triggered by page actions.
browser_tabs        List, focus, or close tabs.
browser_advanced    PDF export, file upload, dialog handling, hover, drag, select option, fill.

Typical flow

Navigate

browser_navigate("https://example.com")

Snapshot the page

browser_snapshot()

Returns the accessibility tree with refs your agent can target.

Decide which element to interact with

Your agent reads the tree and picks an element.

Act

browser_click(ref="@e5")
browser_type(ref="@e12", text="...")

Wait for the page to settle

browser_wait_for(state="networkidle")

Verify visually if needed

browser_screenshot()
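
The steps above can be sketched as one sequence. This is a minimal illustration, not ToShop's actual client API: call_tool is a hypothetical stand-in that only records calls so the example runs on its own, and the url parameter name is an assumption (the ref and state parameter names come from the calls shown above).

```python
from typing import Any, Callable

# Hypothetical dispatcher type. A real agent would hand these calls to the
# tool runtime; this sketch only records them.
ToolCaller = Callable[..., dict]


def make_recording_caller(log: list) -> ToolCaller:
    """Build a stub call_tool that appends (name, params) to log."""
    def call_tool(name: str, **params: Any) -> dict:
        log.append((name, params))
        return {"tool": name, "ok": True}
    return call_tool


def click_and_confirm(call_tool: ToolCaller, url: str, ref: str) -> dict:
    """Navigate, snapshot, click one ref, wait for the page, then screenshot."""
    call_tool("browser_navigate", url=url)   # "url" is an assumed parameter name
    call_tool("browser_snapshot")            # the agent would read refs from this
    call_tool("browser_click", ref=ref)
    call_tool("browser_wait_for", state="networkidle")
    return call_tool("browser_screenshot")


calls: list = []
result = click_and_confirm(make_recording_caller(calls), "https://example.com", "@e5")
print([name for name, _ in calls])
```

In a real task the ref would be chosen from the snapshot result rather than passed in up front; it is fixed here only to keep the sketch short.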

Snapshot vs Screenshot

Snapshot — to interact

The accessibility tree gives stable refs your agent can click or type into. The refs survive layout changes and are cheap to keep using.

Screenshot — to verify

Use a screenshot to show the user the page, to verify a result visually, or to handle cases such as a CAPTCHA where the tree isn't enough.

Your agent uses both: snapshot to act, screenshot to confirm.
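
To illustrate what "snapshot to act" can look like in practice, here is a small sketch that pulls refs out of snapshot text. The snapshot line format in the sample is invented for this example and may not match what browser_snapshot actually returns; only the ref shape (@e5, @e12) comes from this page.

```python
import re

# Refs are described on this page as looking like @e5 or @e12.
REF_PATTERN = re.compile(r"@e\d+")


def index_refs(snapshot_text: str) -> dict[str, str]:
    """Map each ref to the rest of the line it first appears on."""
    refs: dict[str, str] = {}
    for line in snapshot_text.splitlines():
        match = REF_PATTERN.search(line)
        if match is None:
            continue  # non-interactive line with no ref
        label = (line[:match.start()] + line[match.end():]).strip()
        refs.setdefault(match.group(), label)
    return refs


# Hypothetical snapshot text, for illustration only.
sample = '''button "Sign in" @e5
textbox "Email" @e12
heading "Welcome"
'''

refs = index_refs(sample)
print(refs)
```

An agent could then choose a ref by its label, for example the entry whose label mentions "Email", and pass it to browser_type.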

Browser sessions

A browser session persists across multiple browser_* calls within a task — so your agent can log in once and operate inside the authenticated state. Sessions end when the task ends, unless you've set up a persistent profile.

Persistent profiles

Configure under Settings → Browser → Profiles — useful for "always-signed-in to Shopify admin" patterns. Profiles survive across tasks.

Permissions and safety

When not to use the browser

Don't escalate prematurely

If your agent only needs the page's content (article text, JSON API response, static HTML), web_fetch is faster, lighter, and uses less context. Your agent picks web_fetch by default and escalates to browser_* only when JavaScript, login, or interaction is needed.

If you are going to read the page next, prefer open_url to open it in your normal browser (with your cookies, extensions, etc.).
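
The escalation rule above can be written as a small decision function. This is only a sketch of the guidance on this page, not how ToShop selects tools internally; the function name, the flag names, and the order in which the checks are applied are all assumptions.

```python
def choose_web_tool(
    *,
    needs_javascript: bool = False,
    needs_login: bool = False,
    needs_interaction: bool = False,
    user_will_read: bool = False,
) -> str:
    """Pick the lightest capability that covers the stated needs."""
    if user_will_read:
        # The user reads the page themselves, in their normal browser.
        return "open_url"
    if needs_javascript or needs_login or needs_interaction:
        # Escalate only when rendering, auth state, or clicks are required.
        return "browser_*"
    # Default: plain content over HTTP, lowest cost.
    return "web_fetch"


print(choose_web_tool())
print(choose_web_tool(needs_login=True))
print(choose_web_tool(user_will_read=True))
```

Checking user_will_read first is a design choice of this sketch: if the page is meant for the user, the agent does not need to load it at all.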

Related

  • System Tools — clipboard, screenshot of the desktop (not the browser).
  • Search — when you want a result, not a specific URL.
  • Skills — agent-browser skill composes these into larger workflows.
