Stagehand

convex-stagehand

AI-powered browser automation for Convex applications. Extract data, perform actions, and automate workflows using natural language - no Playwright knowledge required.

Features

Simple API - Describe what you want in plain English
Type-safe - Full TypeScript support with Zod schemas
Session management - Reuse browser sessions across multiple operations
Agent mode - Autonomous multi-step task execution
Powered by Stagehand - Uses the Stagehand REST API

Quick Start

1. Install the Component

npm install github:browserbase/convex-stagehand zod

2. Configure Convex

Add the component to your convex/convex.config.ts:

import { defineApp } from "convex/server";
import stagehand from "convex-stagehand/convex.config";

const app = defineApp();
app.use(stagehand, { name: "stagehand" });

export default app;

3. Set Up Environment Variables

Add these to your Convex Dashboard → Settings → Environment Variables:

Variable	Description
`BROWSERBASE_API_KEY`	Your Browserbase API key
`BROWSERBASE_PROJECT_ID`	Your Browserbase project ID
`MODEL_API_KEY`	Your LLM provider API key (OpenAI, Anthropic, etc.)
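
The examples in this README read these variables with a non-null assertion (`!`), which hides a missing value until a request fails. A minimal sketch of a fail-fast check follows; `requireEnv` and `readStagehandEnv` are hypothetical helpers, not part of convex-stagehand, and the option names mirror the constructor options used elsewhere in this README.

```typescript
// Hypothetical helpers (not part of convex-stagehand).
type EnvMap = Record<string, string | undefined>;

// Return the variable's value, or throw a descriptive error when it is
// missing or blank.
function requireEnv(env: EnvMap, key: string): string {
  const value = env[key];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${key}`);
  }
  return value;
}

// Collect the three variables from the table above into one config object.
function readStagehandEnv(env: EnvMap) {
  return {
    browserbaseApiKey: requireEnv(env, "BROWSERBASE_API_KEY"),
    browserbaseProjectId: requireEnv(env, "BROWSERBASE_PROJECT_ID"),
    modelApiKey: requireEnv(env, "MODEL_API_KEY"),
  };
}
```

Under these assumptions, the result of `readStagehandEnv(process.env)` could be passed as the second argument to `new Stagehand(components.stagehand, ...)`.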

4. Use the Component

import { action } from "./_generated/server";
import { Stagehand } from "convex-stagehand";
import { components } from "./_generated/api";
import { z } from "zod";

const stagehand = new Stagehand(components.stagehand, {
  browserbaseApiKey: process.env.BROWSERBASE_API_KEY!,
  browserbaseProjectId: process.env.BROWSERBASE_PROJECT_ID!,
  modelApiKey: process.env.MODEL_API_KEY!,
});

export const scrapeHackerNews = action({
  handler: async (ctx) => {
    return await stagehand.extract(ctx, {
      url: "https://news.ycombinator.com",
      instruction: "Extract the top 5 stories with title, score, and link",
      schema: z.object({
        stories: z.array(z.object({
          title: z.string(),
          score: z.string(),
          link: z.string(),
        }))
      })
    });
  }
});

API Reference

`startSession(ctx, args)`

Start a new browser session. Returns session info for use with other operations.

const session = await stagehand.startSession(ctx, {
  url: "https://example.com",
  browserbaseSessionId: "optional-existing-session-id",
  options: {
    timeout: 30000,
    waitUntil: "networkidle",
    domSettleTimeoutMs: 2000,
    selfHeal: true,
    systemPrompt: "Custom system prompt for the session",
  }
});
// { sessionId: "...", browserbaseSessionId: "...", cdpUrl: "wss://..." }

Parameters:

url - The URL to navigate to
browserbaseSessionId - Optional: Resume an existing Browserbase session
options.timeout - Navigation timeout in milliseconds
options.waitUntil - When to consider navigation complete: "load", "domcontentloaded", or "networkidle"
options.domSettleTimeoutMs - Timeout in milliseconds for the DOM to settle before the page is considered loaded
options.selfHeal - Enable self-healing capabilities for more robust automation
options.systemPrompt - Custom system prompt to guide the AI's behavior during the session

Returns:

{
  sessionId: string;           // Use with other operations
  browserbaseSessionId?: string; // Store to resume later
  cdpUrl?: string;             // For advanced Playwright/Puppeteer usage
}

`endSession(ctx, args)`

End a browser session.

await stagehand.endSession(ctx, { sessionId: session.sessionId });

Parameters:

sessionId - The session to end

Returns: { success: boolean }

`extract(ctx, args)`

Extract structured data from a web page using AI.

// Without session (creates and destroys its own)
const data = await stagehand.extract(ctx, {
  url: "https://example.com",
  instruction: "Extract all product names and prices",
  schema: z.object({
    products: z.array(z.object({
      name: z.string(),
      price: z.string(),
    }))
  }),
});

// With existing session (reuses session, doesn't end it)
const data = await stagehand.extract(ctx, {
  sessionId: session.sessionId,
  instruction: "Extract all product names and prices",
  schema: z.object({ ... }),
});

Parameters:

sessionId - Optional: Use an existing session
url - The URL to navigate to (required if no sessionId)
instruction - Natural language description of what to extract
schema - Zod schema defining the expected output structure
options.timeout - Navigation timeout in milliseconds
options.waitUntil - When to consider navigation complete: "load", "domcontentloaded", or "networkidle"

Returns: Data matching your Zod schema

`act(ctx, args)`

Execute browser actions using natural language.

// Without session
const result = await stagehand.act(ctx, {
  url: "https://example.com/login",
  action: "Click the login button and wait for the page to load",
});

// With existing session
const result = await stagehand.act(ctx, {
  sessionId: session.sessionId,
  action: "Fill in the email field with 'user@example.com'",
});

Parameters:

sessionId - Optional: Use an existing session
url - The URL to navigate to (required if no sessionId)
action - Natural language description of the action to perform
options.timeout - Navigation timeout in milliseconds
options.waitUntil - When to consider navigation complete

Returns:

{
  success: boolean;
  message: string;
  actionDescription: string;
}
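
Because the result carries a `success` flag, a caller may want to stop a multi-step flow as soon as one action fails. The sketch below assumes a failed action is reported as `success: false` rather than thrown; `assertActSucceeded` is a hypothetical helper, and `ActResult` simply restates the shape listed above.

```typescript
// Shape restated from the "Returns" listing above.
interface ActResult {
  success: boolean;
  message: string;
  actionDescription: string;
}

// Hypothetical guard: convert an unsuccessful result into an exception so
// later steps do not run against a page in an unexpected state.
function assertActSucceeded(result: ActResult): ActResult {
  if (!result.success) {
    throw new Error(
      `act failed (${result.actionDescription}): ${result.message}`,
    );
  }
  return result;
}
```

A call site could then read `assertActSucceeded(await stagehand.act(ctx, {...}))`.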

`observe(ctx, args)`

Find available actions on a web page.

const actions = await stagehand.observe(ctx, {
  url: "https://example.com",
  instruction: "Find all clickable navigation links",
});
// [{ description: "Home link", selector: "a.nav-home", method: "click" }, ...]

Parameters:

sessionId - Optional: Use an existing session
url - The URL to navigate to (required if no sessionId)
instruction - Natural language description of what actions to find
options.timeout - Navigation timeout in milliseconds
options.waitUntil - When to consider navigation complete

Returns:

Array<{
  description: string;
  selector: string;
  method: string;
  arguments?: string[];
}>
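
The returned array can be narrowed in ordinary TypeScript before deciding what to do next. This is a small sketch under the assumption that `method` holds values such as "click", as in the sample output above; `filterByMethod` and `findByDescription` are hypothetical helpers.

```typescript
// Shape restated from the "Returns" listing above.
interface ObservedAction {
  description: string;
  selector: string;
  method: string;
  arguments?: string[];
}

// Keep only actions with the given method, compared case-insensitively.
function filterByMethod(
  actions: ObservedAction[],
  method: string,
): ObservedAction[] {
  const wanted = method.toLowerCase();
  return actions.filter((a) => a.method.toLowerCase() === wanted);
}

// Return the first action whose description contains the keyword, if any.
function findByDescription(
  actions: ObservedAction[],
  keyword: string,
): ObservedAction | undefined {
  const needle = keyword.toLowerCase();
  return actions.find((a) => a.description.toLowerCase().includes(needle));
}
```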

`agent(ctx, args)`

Execute autonomous multi-step browser automation using an AI agent. The agent interprets the instruction and decides what actions to take.

// Agent creates its own session
const result = await stagehand.agent(ctx, {
  url: "https://google.com",
  instruction: "Search for 'convex database' and extract the top 3 results with title and URL",
  options: { maxSteps: 10 },
});

// Agent with existing session
const result = await stagehand.agent(ctx, {
  sessionId: session.sessionId,
  instruction: "Fill out the contact form and submit",
  options: { maxSteps: 5 },
});

Parameters:

sessionId - Optional: Use an existing session
url - The URL to navigate to (required if no sessionId)
instruction - Natural language description of the task to complete
options.cua - Enable Computer Use Agent mode
options.maxSteps - Maximum steps the agent can take
options.systemPrompt - Custom system prompt for the agent
options.timeout - Navigation timeout in milliseconds
options.waitUntil - When to consider navigation complete

Returns:

{
  actions: Array<{
    type: string;
    action?: string;
    reasoning?: string;
    timeMs?: number;
  }>;
  completed: boolean;
  message: string;
  success: boolean;
}
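
For logging or monitoring, the `actions` array can be condensed into a few numbers. The sketch below is illustrative only: `summarizeAgentResult` and `AgentSummary` are hypothetical names, and treating a run as good only when both `completed` and `success` are true is an assumption about how the two flags relate.

```typescript
// Shapes restated from the "Returns" listing above.
interface AgentAction {
  type: string;
  action?: string;
  reasoning?: string;
  timeMs?: number;
}

interface AgentResult {
  actions: AgentAction[];
  completed: boolean;
  message: string;
  success: boolean;
}

interface AgentSummary {
  steps: number;
  totalTimeMs: number;
  countsByType: Record<string, number>;
  ok: boolean;
}

// Count steps per action type and add up the reported durations.
// Actions without a timeMs value contribute 0 to the total.
function summarizeAgentResult(result: AgentResult): AgentSummary {
  const countsByType: Record<string, number> = {};
  let totalTimeMs = 0;
  for (const step of result.actions) {
    countsByType[step.type] = (countsByType[step.type] ?? 0) + 1;
    totalTimeMs += step.timeMs ?? 0;
  }
  return {
    steps: result.actions.length,
    totalTimeMs,
    countsByType,
    ok: result.completed && result.success,
  };
}
```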

Examples

Simple extraction (automatic session)

const news = await stagehand.extract(ctx, {
  url: "https://news.ycombinator.com",
  instruction: "Get the top 10 stories with title, points, and comment count",
  schema: z.object({
    stories: z.array(z.object({
      title: z.string(),
      points: z.string(),
      comments: z.string(),
    }))
  })
});
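
The schema above asks for `points` and `comments` as strings, so numeric work needs a conversion step afterwards. A minimal sketch, assuming the model returns text such as "123 points" or "45 comments" (the exact wording is not guaranteed); `parseCount` and `normalizeStories` are hypothetical helpers.

```typescript
// Pull the first run of digits out of a string, ignoring thousands
// separators. Returns null when the text contains no digits.
function parseCount(text: string): number | null {
  const match = text.replace(/,/g, "").match(/\d+/);
  return match ? Number.parseInt(match[0], 10) : null;
}

interface RawStory {
  title: string;
  points: string;
  comments: string;
}

interface NormalizedStory {
  title: string;
  points: number | null;
  comments: number | null;
}

// Convert the string fields of each extracted story into numbers.
function normalizeStories(stories: RawStory[]): NormalizedStory[] {
  return stories.map((story) => ({
    title: story.title,
    points: parseCount(story.points),
    comments: parseCount(story.comments),
  }));
}
```

With the example above, this would be called as `normalizeStories(news.stories)`.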

Manual session management

Use session management when you need to perform multiple operations while preserving browser state (cookies, login, etc.):

// Start a session
const session = await stagehand.startSession(ctx, {
  url: "https://google.com"
});

// Perform multiple operations in the same session
await stagehand.act(ctx, {
  sessionId: session.sessionId,
  action: "Search for 'convex database'"
});

const data = await stagehand.extract(ctx, {
  sessionId: session.sessionId,
  instruction: "Extract the top 3 results",
  schema: z.object({
    results: z.array(z.object({
      title: z.string(),
      url: z.string(),
    }))
  })
});

// End the session when done
await stagehand.endSession(ctx, { sessionId: session.sessionId });
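
In the sequence above, an error thrown by `act` or `extract` would skip the final `endSession` call and leave the browser session open. One way to guard against that is a `try`/`finally` wrapper. The sketch below uses a hypothetical `withSession` helper and a minimal `SessionClient` interface that covers only the two calls it needs; it assumes the real `Stagehand` client is structurally compatible with that interface.

```typescript
// Minimal structural types covering only what this sketch uses.
interface SessionInfo {
  sessionId: string;
}

interface SessionClient<Ctx> {
  startSession(ctx: Ctx, args: { url: string }): Promise<SessionInfo>;
  endSession(
    ctx: Ctx,
    args: { sessionId: string },
  ): Promise<{ success: boolean }>;
}

// Hypothetical helper: run `work` inside a fresh session and always end the
// session afterwards, even when `work` throws.
async function withSession<Ctx, T>(
  client: SessionClient<Ctx>,
  ctx: Ctx,
  url: string,
  work: (sessionId: string) => Promise<T>,
): Promise<T> {
  const session = await client.startSession(ctx, { url });
  try {
    return await work(session.sessionId);
  } finally {
    await client.endSession(ctx, { sessionId: session.sessionId });
  }
}
```

Under these assumptions, the `act` and `extract` calls from the example would move into the `work` callback and receive the `sessionId` as its argument.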

Autonomous agent

Let the AI agent figure out how to complete a complex task:

const result = await stagehand.agent(ctx, {
  url: "https://www.google.com",
  instruction: "Search for 'best pizza in NYC', click on the first result, and extract the restaurant name and address",
  options: { maxSteps: 10 }
});

console.log(result.message); // Summary of what the agent did
console.log(result.actions); // Detailed log of each action taken

Resume session across Convex actions

Store the browserbaseSessionId to resume sessions across different Convex action calls:

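// `v` is Convex's validator builder; `action`, `z`, and `stagehand` are set up as in the Quick Start
import { v } from "convex/values";

// Action 1: Start session and return browserbaseSessionId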
export const startBrowsing = action({
  handler: async (ctx) => {
    const session = await stagehand.startSession(ctx, {
      url: "https://example.com/login"
    });
    // Store browserbaseSessionId in your database
    return session.browserbaseSessionId;
  }
});

// Action 2: Resume session later
export const continueBrowsing = action({
  args: { browserbaseSessionId: v.string() },
  handler: async (ctx, args) => {
    const session = await stagehand.startSession(ctx, {
      url: "https://example.com/dashboard",
      browserbaseSessionId: args.browserbaseSessionId,
    });
    // Continue using the same browser instance
    return await stagehand.extract(ctx, {
      sessionId: session.sessionId,
      instruction: "Extract user data",
      schema: z.object({ ... }),
    });
  }
});

Configuration Options

AI Model

By default, the component uses openai/gpt-4o. You can use any model that the Vercel AI SDK supports, provided it also supports structured outputs:

const stagehand = new Stagehand(components.stagehand, {
  browserbaseApiKey: process.env.BROWSERBASE_API_KEY!,
  browserbaseProjectId: process.env.BROWSERBASE_PROJECT_ID!,
  modelApiKey: process.env.ANTHROPIC_API_KEY!, // Use Anthropic
  modelName: "anthropic/claude-3-5-sonnet-20241022",
});

For the full list of supported models and providers, see the Stagehand Models documentation.
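
Both model names shown in this README follow a `provider/model` pattern. If an app switches models through configuration, the provider prefix can be used to pick the matching API key. This is a rough sketch: `splitModelName`, `keyEnvFor`, and the `OPENAI_API_KEY` entry in the mapping are assumptions for illustration, not something the component defines.

```typescript
// Split "provider/model" at the first slash.
function splitModelName(modelName: string): { provider: string; model: string } {
  const slash = modelName.indexOf("/");
  if (slash <= 0 || slash === modelName.length - 1) {
    throw new Error(`Expected "provider/model", got "${modelName}"`);
  }
  return {
    provider: modelName.slice(0, slash),
    model: modelName.slice(slash + 1),
  };
}

// Hypothetical mapping from provider prefix to the environment variable
// that holds its key. Adjust to whatever names your deployment uses.
const KEY_ENV_BY_PROVIDER: Record<string, string> = {
  openai: "OPENAI_API_KEY",
  anthropic: "ANTHROPIC_API_KEY",
};

// Look up the environment variable name for a given model name.
function keyEnvFor(modelName: string): string {
  const { provider } = splitModelName(modelName);
  const envName = KEY_ENV_BY_PROVIDER[provider];
  if (envName === undefined) {
    throw new Error(`No API key variable configured for provider "${provider}"`);
  }
  return envName;
}
```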

Requirements

Browserbase account and API key
LLM provider API key (see supported models)
Convex 1.29.3 or later

How It Works

This component uses the Stagehand REST API to power browser automation. Each operation:

Starts a cloud browser session via Browserbase (or reuses an existing one)
Navigates to the target URL
Uses AI to understand the page and perform the requested operation
Ends the session if the operation created it, then returns the results

With session management, you control when sessions start and end, allowing you to maintain browser state across multiple operations.
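
The lifecycle described here can be sketched as a single function. The `Backend` interface below is a hypothetical stand-in for the Stagehand REST API, and its `start`, `run`, and `end` methods are invented for illustration; the component's actual implementation may differ. The point of the sketch is the ownership rule: a session is ended only by the operation that created it.

```typescript
// Hypothetical stand-in for the remote API.
interface Backend {
  start(url: string): Promise<string>; // opens a session at url, returns its id
  run(sessionId: string, instruction: string): Promise<string>;
  end(sessionId: string): Promise<void>;
}

interface OperationArgs {
  sessionId?: string;
  url?: string;
  instruction: string;
}

async function runOperation(
  backend: Backend,
  args: OperationArgs,
): Promise<string> {
  // The operation owns the session only when the caller did not supply one.
  const ownsSession = args.sessionId === undefined;
  let sessionId: string;
  if (args.sessionId !== undefined) {
    sessionId = args.sessionId;
  } else {
    if (args.url === undefined) {
      throw new Error("url is required when no sessionId is given");
    }
    sessionId = await backend.start(args.url);
  }
  try {
    return await backend.run(sessionId, args.instruction);
  } finally {
    // Caller-provided sessions are left open for later operations.
    if (ownsSession) {
      await backend.end(sessionId);
    }
  }
}
```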

Development

Component Structure

The component exposes its API through Convex's component system. All functions are in a single lib.ts module:

component.lib.<function>

For example:

component.lib.startSession - Start a browser session
component.lib.endSession - End a browser session
component.lib.extract - Extract data from web pages
component.lib.act - Perform browser actions
component.lib.observe - Find interactive elements
component.lib.agent - Autonomous multi-step automation

The Stagehand client class wraps these internal paths to provide a clean user API:

// User calls:
stagehand.extract(ctx, {...})

// Internally calls:
ctx.runAction(component.lib.extract, {...})
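
The wrapping pattern can be illustrated with simplified stand-in types. In the sketch below, `ActionRef`, `ActionCtx`, `ComponentApi`, and `ThinClient` are all hypothetical; real Convex function references and `ctx.runAction` are richer than this, and merging the constructor config into each call's arguments is an assumption about how credentials reach the component.

```typescript
// Simplified stand-ins for Convex's function references and action context.
type ActionRef = string;

interface ActionCtx {
  runAction(ref: ActionRef, args: Record<string, unknown>): Promise<unknown>;
}

interface ComponentApi {
  lib: { extract: ActionRef; act: ActionRef };
}

// A thin client: each public method forwards to one component function.
class ThinClient {
  private readonly component: ComponentApi;
  private readonly config: Record<string, unknown>;

  constructor(component: ComponentApi, config: Record<string, unknown>) {
    this.component = component;
    this.config = config;
  }

  extract(ctx: ActionCtx, args: Record<string, unknown>): Promise<unknown> {
    // Assumption: config values are merged into every call's arguments.
    return ctx.runAction(this.component.lib.extract, { ...this.config, ...args });
  }

  act(ctx: ActionCtx, args: Record<string, unknown>): Promise<unknown> {
    return ctx.runAction(this.component.lib.act, { ...this.config, ...args });
  }
}
```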

Building the Component

To build the component locally:

# Install dependencies
npm install

# Build with Convex codegen (generates component API)
npm run build:codegen

# Or just build TypeScript
npm run build:esm

The component requires a Convex deployment to generate proper component API types (_generated/component.ts).

Example App

Check out the full example app in the example/ directory:

git clone https://github.com/browserbase/convex-stagehand
cd convex-stagehand/example
npm install
npm run dev

The example includes:

HackerNews story extraction with AI
Type-safe data extraction using Zod schemas
Database persistence with Convex
Real-time updates and automatic refresh

License

MIT

