Overview

When you add a description parameter to any POST /screenshots/web request, the API passes a full-page screenshot to a vision model (Claude by default). The model returns a CSS selector for the described element. Playwright then clips that element and returns a cropped screenshot.

This means you can capture any visually distinct section — pricing tables, nav bars, charts, forms, footers — without inspecting DOM selectors yourself.

TipAI targeting is included on the FREE plan. You don't need to upgrade to use it. The vision model call adds ~1–2 seconds to capture time.

How it works

// request
description: "pricing table"

1. Describe it

Claude Vision

2. AI Locates

confidence: 0.94

3. Exact crop

You add description to your request — A plain-English description of the element — e.g. "the pricing comparison table" or "the main navigation bar".
API takes a full-page screenshot — Playwright renders the full page. This screenshot is passed to the vision model as a base64 image.
Vision model returns a CSS selector — Claude (or your configured model) analyzes the screenshot and returns the most precise CSS selector, plus a confidence score.
Playwright clips and returns the crop — The API evaluates the selector, measures the bounding box, and returns a precisely cropped screenshot of just that element.

Quick start

Add description to any web screenshot request. Everything else is identical to a standard capture.

const job = await client.screenshots.web({
  url: "https://stripe.com/pricing",
  description: "the pricing comparison table",
  format: "png",
});

const result = await client.jobs.waitForResult(job.jobId);

// AI metadata returned alongside the screenshot
console.log(result.metadata.aiSelector);    // ".pricing-table"
console.log(result.metadata.aiConfidence);  // 0.94

Parameters

AI targeting adds two new fields to the standard POST /screenshots/web request body. All other web screenshot parameters still apply.

Parameter	Type	Description
`description`required	string	Natural-language description of the element to capture. Describe what you see, not an implementation detail. Examples: `"the pricing table"`, `"the top navigation bar"`, `"the monthly revenue chart"`. Max 500 characters.
`element`	string	CSS selector fallback. If vision model confidence is below the threshold (default 0.70), the API falls back to this selector instead of returning the full page. Optional.
`aiModel`	string	Override the vision model for this request. Defaults to `anthropic/claude-opus-4-5`. Accepted: `anthropic/claude-sonnet-4-6`, `openai/gpt-4o`, `google/gemini-2.0-flash-001`.
`aiConfidenceThreshold`	number	Minimum confidence score (0–1) to accept the AI selector. Below this threshold the request falls back to `element` or full-page. Defaults to `0.70`.

Response fields

When AI targeting is used, the metadata object in the job result includes additional fields.

GET /jobs/:id/result

{
  "jobId": "job_web_7a91bcd3",
  "status": "completed",
  "screenshotUrl": "https://cdn.screenshotfreeapi.com/...",
  "metadata": {
    "aiSelector":    ".pricing-table",
    "aiConfidence":  0.94,
    "aiUsed":        true,
    "processingMs":  4210,
    "pageTitle":     "Pricing — Stripe"
  }
}

Parameter	Type	Description
`metadata.aiSelector`	string	The CSS selector the vision model selected. Useful for debugging or reuse as an element parameter.
`metadata.aiConfidence`	number	Confidence score from 0 to 1. Above 0.90 is highly reliable. Below 0.70 triggers the fallback.
`metadata.aiUsed`	boolean	true when the AI selector was applied to the crop. false when the fallback (element or full-page) was used instead.

Confidence & fallback

Every AI targeting response includes a confidence score. The API uses this score to decide whether to use the AI selector or fall back to your explicit selector or full-page mode.

0.90 – 1.00High confidenceAI selects and crops the element

0.70 – 0.89ModerateAI selects; result included in metadata for review

< 0.70Low — fallbackFalls back to element selector or full page

Supply both description and element for belt-and-suspenders reliability. If the AI is confident, you get the crop. If not, the explicit selector takes over.

const job = await client.screenshots.web({
  url: "https://example.com",
  // AI targeting + CSS fallback: if AI confidence < threshold,
  // falls back to this explicit selector automatically
  description: "the hero banner at the top",
  element: ".hero",
  format: "png",
});

Writing good descriptions

The quality of the description directly affects confidence. These patterns consistently produce high-confidence results:

Describe what you see

"the blue hero banner"

"#hero-section"

Name the content type

"the pricing comparison table"

"a table"

Use positional hints

"the navigation bar at the top"

"nav"

Reference visible text

"the Get started for free button"

"a button"

Describe the data shown

"the monthly revenue bar chart"

"a chart"

Name the form by its action

"the sign-up form"

"a form"

NoteAvoid referencing element IDs or class names in your description — those are implementation details the vision model doesn't see. Describe what a human would notice visually.

More examples

A range of descriptions that work reliably across real-world sites:

description patterns

// All of these descriptions work well in practice:

// Pricing tables
{ description: "the pricing comparison table" }

// Navigation
{ description: "the top navigation bar" }

// CTAs
{ description: "the main call-to-action button" }

// Charts and data viz
{ description: "the monthly revenue chart" }

// Reviews
{ description: "the customer testimonials section" }

// Forms
{ description: "the sign-up form" }

// Footers
{ description: "the footer with links and copyright" }

Vision models

AI targeting routes through OpenRouter so you can swap models per request via aiModel.

Parameter	Type	Description
`anthropic/claude-opus-4-5`	default	Best accuracy for complex or heavily styled layouts. Slower — adds ~2 s.
`anthropic/claude-sonnet-4-6`	string	Faster, still excellent for most layouts. Adds ~1 s. Good choice for high-volume pipelines.
`openai/gpt-4o`	string	Strong alternative to Claude. Especially reliable on forms and e-commerce pages.
`google/gemini-2.0-flash-001`	string	Fastest and cheapest. Lower accuracy on dense or dynamic layouts. Use for simple, static pages.

WarningModel availability depends on OpenRouter routing. If a model is unavailable, the API falls back to claude-opus-4-5 automatically. The job result includes the model actually used in metadata.aiModel.

Limits & plan notes

Parameter	Type	Description
`description max length`	500 chars	Descriptions longer than 500 characters are truncated before being sent to the vision model.
`AI latency overhead`	1–3 s	Vision model call adds 1–3 seconds to total job time on top of the Playwright render.
`Included in FREE plan`	yes	AI targeting is not a paid add-on. Every plan includes it, subject to your screenshot quota.
`Quota counting`	1 per job	An AI-targeted job counts as 1 screenshot against your monthly quota, same as any other capture.
`Rate limiting`	same	No separate rate limit for AI jobs. Requests count against your plan's requests-per-minute limit.

AI-Powered Element Targeting

AI Targeting