AI Feature · Free tier

AI-Powered Element Targeting

Describe any section of a page in plain English. Vision AI locates the exact element and returns a pixel-perfect crop — no CSS selectors, no DOM inspection required.

  • Natural-language element description
  • Confidence scoring with automatic CSS fallback
  • 4 vision models via OpenRouter
  • Included on every plan, including free

AI Targeting

Describe the section you want. Vision models find the selector. You get a pixel-perfect crop.

Default vision model: Claude
Average confidence score: 0.94

description parameter

    await client.screenshots.web({
      url: "https://stripe.com/pricing",
      description: "the pricing comparison table",
      format: "png",
    });

Overview

When you add a description parameter to any POST /screenshots/web request, the API passes a full-page screenshot to a vision model (Claude by default). The model returns a CSS selector for the described element. Playwright then clips that element and returns a cropped screenshot.

This means you can capture any visually distinct section — pricing tables, nav bars, charts, forms, footers — without inspecting DOM selectors yourself.

Tip: AI targeting is included on the FREE plan. You don't need to upgrade to use it. The vision model call adds ~1–2 seconds to capture time.

How it works

Diagram: (1) Describe it, e.g. description: "pricing table"; (2) AI locates the element with Claude Vision, e.g. confidence: 0.94; (3) Exact crop.

  1. You add description to your request. This is a plain-English description of the element, e.g. "the pricing comparison table" or "the main navigation bar".
  2. The API takes a full-page screenshot. Playwright renders the full page, and the screenshot is passed to the vision model as a base64 image.
  3. The vision model returns a CSS selector. Claude (or your configured model) analyzes the screenshot and returns the most precise CSS selector, plus a confidence score.
  4. Playwright clips and returns the crop. The API evaluates the selector, measures the bounding box, and returns a cropped screenshot of just that element.
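
Step 4 can be pictured as a small geometry step. The sketch below is an illustration under stated assumptions, not the service's actual implementation: it assumes the element's bounding box arrives as `{ x, y, width, height }` (the shape Playwright's `locator.boundingBox()` returns) and that the crop is clamped to the rendered page and snapped to whole pixels. `Rect`, `PageSize`, and `toClipRect` are hypothetical names.

```typescript
// Hypothetical sketch of the crop step; not the service's real code.
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface PageSize {
  width: number;
  height: number;
}

// Clamp an element's bounding box to the rendered page and snap it to
// whole pixels. Returns null when nothing of the element is on the page.
function toClipRect(box: Rect, page: PageSize): Rect | null {
  const left = Math.max(0, Math.floor(box.x));
  const top = Math.max(0, Math.floor(box.y));
  const right = Math.min(page.width, Math.ceil(box.x + box.width));
  const bottom = Math.min(page.height, Math.ceil(box.y + box.height));
  if (right <= left || bottom <= top) {
    return null;
  }
  return { x: left, y: top, width: right - left, height: bottom - top };
}
```

A rectangle of this shape is what Playwright's `page.screenshot({ clip })` option accepts, which is why the sketch keeps the same four fields.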

Quick start

Add description to any web screenshot request. Everything else is identical to a standard capture.

    const job = await client.screenshots.web({
      url: "https://stripe.com/pricing",
      description: "the pricing comparison table",
      format: "png",
    });

    const result = await client.jobs.waitForResult(job.jobId);

    // AI metadata returned alongside the screenshot
    console.log(result.metadata.aiSelector);   // ".pricing-table"
    console.log(result.metadata.aiConfidence); // 0.94

Parameters

AI targeting uses the following fields in the standard POST /screenshots/web request body. Only description is required. All other web screenshot parameters still apply.

| Parameter | Type | Description |
| --- | --- | --- |
| description (required) | string | Natural-language description of the element to capture. Describe what you see, not an implementation detail. Examples: "the pricing table", "the top navigation bar", "the monthly revenue chart". Max 500 characters. |
| element | string | Optional CSS selector fallback. If vision model confidence is below the threshold (default 0.70), the API falls back to this selector instead of returning the full page. |
| aiModel | string | Override the vision model for this request. Defaults to anthropic/claude-opus-4-5. Also accepted: anthropic/claude-sonnet-4-6, openai/gpt-4o, google/gemini-2.0-flash-001. |
| aiConfidenceThreshold | number | Minimum confidence score (0–1) to accept the AI selector. Below this threshold the request falls back to element or full-page. Defaults to 0.70. |
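
The constraints in the parameter descriptions can be checked on the client before a request is sent. The following is a minimal sketch, not part of any official SDK: `AiTargetingParams` and `checkAiTargetingParams` are hypothetical names, and treating an empty description as a problem is an assumption based on the field being required.

```typescript
// Model IDs as listed for the aiModel parameter.
const ACCEPTED_MODELS: string[] = [
  "anthropic/claude-opus-4-5",
  "anthropic/claude-sonnet-4-6",
  "openai/gpt-4o",
  "google/gemini-2.0-flash-001",
];

// Hypothetical type covering only the AI targeting fields.
interface AiTargetingParams {
  description: string;
  element?: string;
  aiModel?: string;
  aiConfidenceThreshold?: number;
}

// Returns a list of human-readable problems; an empty list means the
// fields match the documented constraints.
function checkAiTargetingParams(params: AiTargetingParams): string[] {
  const problems: string[] = [];
  if (params.description.trim().length === 0) {
    problems.push("description must not be empty"); // assumption: required means non-empty
  }
  if (params.description.length > 500) {
    problems.push("description is over 500 characters");
  }
  if (params.aiModel !== undefined && ACCEPTED_MODELS.indexOf(params.aiModel) === -1) {
    problems.push("aiModel is not one of the documented model IDs");
  }
  const threshold = params.aiConfidenceThreshold;
  // The negated range check also rejects NaN.
  if (threshold !== undefined && !(threshold >= 0 && threshold <= 1)) {
    problems.push("aiConfidenceThreshold must be between 0 and 1");
  }
  return problems;
}
```

Returning a list rather than throwing lets a caller log every issue at once, which tends to be more convenient in batch pipelines.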

Response fields

When AI targeting is used, the metadata object in the job result includes additional fields.

GET /jobs/:id/result

    {
      "jobId": "job_web_7a91bcd3",
      "status": "completed",
      "screenshotUrl": "https://cdn.screenshotfreeapi.com/...",
      "metadata": {
        "aiSelector": ".pricing-table",
        "aiConfidence": 0.94,
        "aiUsed": true,
        "processingMs": 4210,
        "pageTitle": "Pricing — Stripe"
      }
    }

| Field | Type | Description |
| --- | --- | --- |
| metadata.aiSelector | string | The CSS selector the vision model selected. Useful for debugging or for reuse as an element parameter. |
| metadata.aiConfidence | number | Confidence score from 0 to 1. Above 0.90 is highly reliable. A score below the threshold (0.70 by default) triggers the fallback. |
| metadata.aiUsed | boolean | true when the AI selector was applied to the crop. false when the fallback (element or full-page) was used instead. |
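
One way to act on these fields is to carry a trusted selector forward into later captures of the same page. The sketch below is a hypothetical helper, not an SDK function: `AiMetadata`, `WebScreenshotRequest`, and `buildFollowUpRequest` are made-up names, and the 0.9 reuse cutoff is an assumption taken from the "highly reliable" range.

```typescript
// Subset of the metadata fields described for AI-targeted jobs.
interface AiMetadata {
  aiSelector?: string;
  aiConfidence?: number;
  aiUsed?: boolean;
}

// Hypothetical request shape with only the fields used here.
interface WebScreenshotRequest {
  url: string;
  format: string;
  description: string;
  element?: string;
}

// Build the next request for the same page. When the previous job used a
// high-confidence AI selector, pass it along as the explicit `element`
// fallback; otherwise send the description on its own.
function buildFollowUpRequest(
  url: string,
  description: string,
  previous?: AiMetadata,
  minConfidence: number = 0.9 // assumption: reuse only highly reliable selectors
): WebScreenshotRequest {
  const request: WebScreenshotRequest = { url: url, format: "png", description: description };
  if (
    previous !== undefined &&
    previous.aiUsed === true &&
    typeof previous.aiSelector === "string" &&
    typeof previous.aiConfidence === "number" &&
    previous.aiConfidence >= minConfidence
  ) {
    request.element = previous.aiSelector;
  }
  return request;
}
```

Checking `aiUsed` first matters: when it is false the crop came from the fallback, so the reported selector was not the one applied.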

Confidence & fallback

Every AI targeting response includes a confidence score. The API uses this score to decide whether to use the AI selector or fall back to your explicit selector or full-page mode.

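| Confidence | Level | Behavior |
| --- | --- | --- |
| 0.90 – 1.00 | High confidence | AI selects and crops the element |
| 0.70 – 0.89 | Moderate | AI selects the element; the score is included in metadata for review |
| < 0.70 | Low (fallback) | Falls back to the element selector or full page |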

Supply both description and element for belt-and-suspenders reliability. If the AI is confident, you get the crop. If not, the explicit selector takes over.

    const job = await client.screenshots.web({
      url: "https://example.com",
      // AI targeting + CSS fallback: if AI confidence < threshold,
      // falls back to this explicit selector automatically
      description: "the hero banner at the top",
      element: ".hero",
      format: "png",
    });
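
The fallback rule described in this section can be written as a small decision function. This is a sketch of the documented behavior as read from this page, not the service's source code: `CaptureMode` and `resolveCaptureMode` are hypothetical names, and a score exactly equal to the threshold is assumed to be accepted, since only scores below it are described as falling back.

```typescript
type CaptureMode = "ai" | "element" | "fullPage";

// Decide which capture path applies for a given confidence score.
// `element` is the optional explicit CSS selector from the request.
function resolveCaptureMode(
  confidence: number,
  element?: string,
  threshold: number = 0.7 // documented default for aiConfidenceThreshold
): CaptureMode {
  if (confidence >= threshold) {
    return "ai"; // the AI selector is used for the crop
  }
  // Below the threshold: prefer the explicit selector, else full page.
  return element !== undefined && element.length > 0 ? "element" : "fullPage";
}
```

For example, a score of 0.55 with `element: ".hero"` resolves to the explicit selector, while the same score with no `element` resolves to a full-page capture.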

Writing good descriptions

The quality of the description directly affects confidence. These patterns consistently produce high-confidence results:

Describe what you see

  Use: "the blue hero banner"
  Avoid: "#hero-section"

Name the content type

  Use: "the pricing comparison table"
  Avoid: "a table"

Use positional hints

  Use: "the navigation bar at the top"
  Avoid: "nav"

Reference visible text

  Use: "the Get started for free button"
  Avoid: "a button"

Describe the data shown

  Use: "the monthly revenue bar chart"
  Avoid: "a chart"

Name the form by its action

  Use: "the sign-up form"
  Avoid: "a form"

Note: Avoid referencing element IDs or class names in your description. Those are implementation details the vision model doesn't see. Describe what a human would notice visually.
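
These guidelines lend themselves to a quick client-side check before a description is sent. The heuristics below are illustrative assumptions, not rules the API enforces: `lintDescription` is a hypothetical helper, and the regular expressions only catch the obvious cases shown in the examples above.

```typescript
// Return warnings for descriptions that are likely to score poorly.
function lintDescription(description: string): string[] {
  const warnings: string[] = [];
  const text = description.trim();

  // "#hero-section" or ".pricing-table": selectors are invisible to the model.
  if (/(^|\s)[#.][A-Za-z_][\w-]*/.test(text)) {
    warnings.push("looks like a CSS selector; describe what is visible instead");
  }

  // A bare word ("nav") or article plus one word ("a table") is too generic.
  if (/^(?:an?\s+)?[A-Za-z]+$/i.test(text)) {
    warnings.push("too generic; add content type, position, or visible text");
  }

  // Longer descriptions are documented as being truncated.
  if (text.length > 500) {
    warnings.push("longer than 500 characters; the extra text would be truncated");
  }

  return warnings;
}
```

A description such as "the pricing comparison table" passes with no warnings, while "#hero-section", "a table", and "nav" each produce one.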

More examples

A range of descriptions that work reliably across real-world sites:

description patterns

    // All of these descriptions work well in practice:

    // Pricing tables
    { description: "the pricing comparison table" }

    // Navigation
    { description: "the top navigation bar" }

    // CTAs
    { description: "the main call-to-action button" }

    // Charts and data viz
    { description: "the monthly revenue chart" }

    // Reviews
    { description: "the customer testimonials section" }

    // Forms
    { description: "the sign-up form" }

    // Footers
    { description: "the footer with links and copyright" }

Vision models

AI targeting routes through OpenRouter so you can swap models per request via aiModel.

| Model | Description |
| --- | --- |
| anthropic/claude-opus-4-5 (default) | Best accuracy for complex or heavily styled layouts. Slower: adds ~2 s. |
| anthropic/claude-sonnet-4-6 | Faster, still excellent for most layouts. Adds ~1 s. Good choice for high-volume pipelines. |
| openai/gpt-4o | Strong alternative to Claude. Especially reliable on forms and e-commerce pages. |
| google/gemini-2.0-flash-001 | Fastest and cheapest. Lower accuracy on dense or dynamic layouts. Use for simple, static pages. |

Warning: Model availability depends on OpenRouter routing. If a model is unavailable, the API falls back to claude-opus-4-5 automatically. The job result includes the model actually used in metadata.aiModel.
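
Because the model can change silently, a pipeline may want to detect when the reported model differs from the one it asked for. The following is a minimal sketch under stated assumptions: `modelWasSubstituted` is a hypothetical helper, and it assumes metadata.aiModel uses the same ID format as the aiModel request parameter, which this page does not confirm.

```typescript
// Default model ID as documented for the aiModel parameter.
const DEFAULT_AI_MODEL = "anthropic/claude-opus-4-5";

interface ModelMetadata {
  aiModel?: string; // model actually used, as reported in the job result
}

// True when the job result reports a different model than the one requested
// (or than the default, when no aiModel was sent).
// Assumption: both values use the same "vendor/model" ID format.
function modelWasSubstituted(
  requestedModel: string | undefined,
  metadata: ModelMetadata
): boolean {
  const expected = requestedModel !== undefined ? requestedModel : DEFAULT_AI_MODEL;
  return metadata.aiModel !== undefined && metadata.aiModel !== expected;
}
```

If the result carries no aiModel value the helper returns false, since there is nothing to compare against.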

Limits & plan notes

| Limit | Value | Notes |
| --- | --- | --- |
| description max length | 500 chars | Descriptions longer than 500 characters are truncated before being sent to the vision model. |
| AI latency overhead | 1–3 s | The vision model call adds 1–3 seconds to total job time on top of the Playwright render. |
| Included in FREE plan | yes | AI targeting is not a paid add-on. Every plan includes it, subject to your screenshot quota. |
| Quota counting | 1 per job | An AI-targeted job counts as 1 screenshot against your monthly quota, same as any other capture. |
| Rate limiting | same | No separate rate limit for AI jobs. Requests count against your plan's requests-per-minute limit. |