Firecrawl Scrape

Gitmaxd/convex-firecrawl-scrape

Category

AI
npm install convex-firecrawl-scrape

Scrape any URL and get clean markdown, HTML, screenshots, or structured JSON - with durable caching and reactive queries.

const { jobId } = await scrape({ url: "https://example.com" });
// Status updates reactively as the scrape completes
const status = useQuery(api.firecrawl.getStatus, { id: jobId });
  • Durable caching with configurable TTL (default 30 days)
  • Reactive status updates via Convex subscriptions
  • Multiple output formats: markdown, HTML, raw HTML, screenshots, links, images, AI summaries
  • JSON extraction via schema-based LLM processing
  • Built-in SSRF protection blocks private IPs and localhost
  • Secure by default with required auth wrapper

Live Demo | Example Code

Play with the example:

git clone https://github.com/gitmaxd/convex-firecrawl-scrape.git
cd convex-firecrawl-scrape
npm install
npm run dev

Prerequisite: Convex

You'll need an existing Convex project. Convex is a hosted backend platform with a database, serverless functions, and more. Learn more here.

Run npm create convex or follow any of the quickstarts to set one up.

Installation

npm install convex-firecrawl-scrape

Install the component in your convex/convex.config.ts:

// convex/convex.config.ts
import { defineApp } from "convex/server";
import firecrawlScrape from "convex-firecrawl-scrape/convex.config.js";

const app = defineApp();
app.use(firecrawlScrape);
export default app;

Set your Firecrawl API key:

npx convex env set FIRECRAWL_API_KEY your_api_key_here

Get your API key at firecrawl.dev.

Usage

Always use exposeApi() to expose component functionality. This wrapper enforces authentication and controls API key access.

// convex/firecrawl.ts
import { exposeApi } from "convex-firecrawl-scrape";
import { components } from "./_generated/api";

export const { scrape, getCached, getStatus, getContent, invalidate } =
  exposeApi(components.firecrawlScrape, {
    auth: async (ctx, operation) => {
      const identity = await ctx.auth.getUserIdentity();
      if (!identity) throw new Error("Unauthorized");
      return process.env.FIRECRAWL_API_KEY!;
    },
  });

React Integration

import { useMutation, useQuery } from "convex/react";
import { api } from "../convex/_generated/api";
import { useState } from "react";

function ScrapeButton({ url }: { url: string }) {
  const [jobId, setJobId] = useState<string | null>(null);
  const scrape = useMutation(api.firecrawl.scrape);
  const status = useQuery(
    api.firecrawl.getStatus,
    jobId ? { id: jobId } : "skip",
  );
  const content = useQuery(
    api.firecrawl.getContent,
    jobId && status?.status === "completed" ? { id: jobId } : "skip",
  );

  return (
    <div>
      <button
        onClick={async () => setJobId((await scrape({ url })).jobId)}
        disabled={status?.status === "scraping"}
      >
        {status?.status === "scraping" ? "Scraping..." : "Scrape"}
      </button>
      {status?.status === "completed" && <pre>{content?.markdown}</pre>}
      {status?.status === "failed" && <p>Error: {status.error}</p>}
    </div>
  );
}

Output Formats

const { jobId } = await scrape({
  url: "https://example.com",
  options: {
    formats: ["markdown", "html", "links", "images", "screenshot"],
    storeScreenshot: true,
  },
});

Format | Description
markdown | Clean markdown content (default)
html | Cleaned HTML
rawHtml | Original HTML source
links | URLs found on the page
images | Image URLs found on the page
summary | AI-generated page summary
screenshot | Screenshot URL (use storeScreenshot: true to persist)

JSON Extraction

Extract structured data using a JSON schema:

const { jobId } = await scrape({
  url: "https://example.com/product",
  options: {
    extractionSchema: {
      type: "object",
      properties: {
        name: { type: "string" },
        price: { type: "number" },
      },
      required: ["name", "price"],
    },
  },
});

const content = await getContent({ id: jobId });
console.log(content.extractedJson); // { name: "Widget", price: 99.99 }
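
Because extraction is LLM-based, it can be worth checking the result before using it. The sketch below is one way to do that and is not part of the component: `ObjectSchema` and `missingRequired` are hypothetical names, and it only checks the schema's `required` keys, not value types.

```typescript
// Hypothetical minimal view of the JSON schema passed as extractionSchema.
interface ObjectSchema {
  type: "object";
  properties: Record<string, { type: string }>;
  required?: string[];
}

// Returns the required keys that are absent from the extracted value.
// An empty array means every required key is present.
function missingRequired(schema: ObjectSchema, value: unknown): string[] {
  const required = schema.required ?? [];
  if (typeof value !== "object" || value === null) return required;
  const record = value as Record<string, unknown>;
  return required.filter((key) => record[key] === undefined);
}

const productSchema: ObjectSchema = {
  type: "object",
  properties: { name: { type: "string" }, price: { type: "number" } },
  required: ["name", "price"],
};
```

Calling `missingRequired(productSchema, content.extractedJson)` before reading fields gives a list of missing keys to log or act on.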

Cache Management

Cached results use superset matching: a cache entry with ["markdown", "screenshot"] satisfies a request for ["markdown"].

// Check cache
const cached = await getCached({ url: "https://example.com" });

// Force refresh
const { jobId } = await scrape({ url, options: { force: true } });

// Invalidate cache
await invalidate({ url: "https://example.com" });
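
The superset rule and the TTL can be illustrated with a small standalone sketch. This is not the component's implementation: `CacheEntry`, `coversFormats`, and `isCacheHit` are hypothetical names, and the entry shape is an assumption made for the example.

```typescript
// Hypothetical shape of a cache entry; the component's real schema may differ.
interface CacheEntry {
  formats: string[];
  cachedAt: number; // epoch milliseconds
}

// 30 days, matching the default TTL stated in the feature list.
const DEFAULT_TTL_MS = 30 * 24 * 60 * 60 * 1000;

// True when every requested format is already present in the cached entry.
function coversFormats(cached: string[], requested: string[]): boolean {
  const have = new Set(cached);
  return requested.every((format) => have.has(format));
}

// True when the entry is unexpired and covers the requested formats.
function isCacheHit(
  entry: CacheEntry,
  requested: string[],
  now: number,
  ttlMs: number = DEFAULT_TTL_MS,
): boolean {
  return now - entry.cachedAt < ttlMs && coversFormats(entry.formats, requested);
}
```

Under this reading, an entry cached with ["markdown", "screenshot"] serves a later ["markdown"] request, while an entry cached with only ["markdown"] does not serve a request that also asks for a screenshot.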

Proxy Options

For anti-bot protected sites:

const { jobId } = await scrape({
  url: "https://protected-site.com",
  options: {
    proxy: "stealth", // Residential proxy
    waitFor: 3000, // Wait for dynamic content
  },
});

Security

Always use exposeApi() - never expose component functions directly to clients. Server-side code can call component internals directly, but doing so bypasses authentication. The exposeApi() wrapper ensures:

  • Authentication before any operation
  • API key controlled by your callback, not callers
  • Operation-specific authorization support
// ❌ DANGEROUS - bypasses auth
export const scrape = components.firecrawlScrape.lib.startScrape;

// ✅ SAFE - auth enforced
export const { scrape } = exposeApi(components.firecrawlScrape, { auth: ... });
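
Operation-specific authorization could look roughly like the sketch below. The values passed as `operation` are not documented here, so the `Operation` union is an assumption that mirrors the function names exposeApi() returns; `Identity`, `ADMIN_SUBJECTS`, and `authorize` are hypothetical names for illustration.

```typescript
// ASSUMPTION: operation names mirror the functions returned by exposeApi().
type Operation = "scrape" | "getCached" | "getStatus" | "getContent" | "invalidate";

// Minimal stand-in for the identity returned by ctx.auth.getUserIdentity().
interface Identity {
  subject: string;
}

// Hypothetical allowlist of users permitted to invalidate cache entries.
const ADMIN_SUBJECTS = new Set<string>(["user_admin_1"]);

// Throws when the caller is not signed in, or lacks rights for the operation.
function authorize(identity: Identity | null, operation: Operation): void {
  if (!identity) throw new Error("Unauthorized");
  if (operation === "invalidate" && !ADMIN_SUBJECTS.has(identity.subject)) {
    throw new Error("Forbidden: only admins may invalidate the cache");
  }
}
```

If the operation values match this assumption, the auth callback would call `authorize(identity, operation)` before returning the API key, so any signed-in user can scrape and read while only listed users can invalidate.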

SSRF Protection: Built-in validation blocks localhost, private IPs, and non-HTTP schemes.
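
To make those three checks concrete, here is a minimal sketch of this kind of URL validation. It is not the component's actual validator: `isSafeUrl` and its helpers are hypothetical, and the sketch only inspects the literal hostname, so it does not cover DNS names that resolve to private addresses or IPv4-mapped IPv6 addresses.

```typescript
// Illustrative pre-flight URL check; NOT the component's built-in validator.

// Loopback, private, link-local, and "this network" IPv4 ranges.
function isPrivateIPv4(host: string): boolean {
  const match = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!match) return false;
  const a = Number(match[1]);
  const b = Number(match[2]);
  return (
    a === 0 ||
    a === 10 ||
    a === 127 ||
    (a === 169 && b === 254) ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}

// URL.hostname keeps the brackets around IPv6 literals, e.g. "[::1]".
function isPrivateIPv6(host: string): boolean {
  if (!host.startsWith("[")) return false;
  const addr = host.slice(1, -1).toLowerCase();
  return (
    addr === "::1" ||
    addr === "::" ||
    /^f[cd][0-9a-f]{2}:/.test(addr) || // unique local, fc00::/7
    /^fe[89ab][0-9a-f]:/.test(addr) // link-local, fe80::/10
  );
}

function isSafeUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;
  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost")) return false;
  return !isPrivateIPv4(host) && !isPrivateIPv6(host);
}
```

A check like this is useful in your own code for early user feedback, but it complements the built-in protection rather than replacing it.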

For domain allowlists, rate limiting, and detailed security guidance, see docs/SECURITY.md.

Error Handling

const status = await getStatus({ id: jobId });
if (status?.status === "failed") {
  console.error(status.error, status.errorCode);
  // errorCode is the HTTP status from Firecrawl (e.g., 402, 403, 429, 500)
}
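
One way to act on errorCode is to map it to a follow-up action, as in the sketch below. The mapping relies on the standard HTTP meanings of these status codes (402 Payment Required, 403 Forbidden, 429 Too Many Requests) and is an assumption about how Firecrawl uses them; `ErrorAction` and `classifyScrapeError` are hypothetical names.

```typescript
type ErrorAction = "retry" | "check-billing" | "check-access" | "give-up";

// Maps an HTTP status reported in status.errorCode to a suggested next step.
function classifyScrapeError(errorCode: number | undefined): ErrorAction {
  if (errorCode === undefined) return "give-up"; // no HTTP status available
  if (errorCode === 429 || errorCode >= 500) return "retry"; // rate limit or server error
  if (errorCode === 402) return "check-billing";
  if (errorCode === 401 || errorCode === 403) return "check-access";
  return "give-up";
}
```

For the "retry" case, calling scrape again with force: true after a delay is a reasonable starting point, since a normal call may be served from cache.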

Found a bug? Feature request? File it here.
