Overview
When you add a description parameter to any POST /screenshots/web request, the API passes a full-page screenshot to a vision model (Claude by default). The model returns a CSS selector for the described element. Playwright then clips that element and returns a cropped screenshot.
This means you can capture any visually distinct section — pricing tables, nav bars, charts, forms, footers — without inspecting DOM selectors yourself.
How it works
1. Describe it
Claude Vision
2. AI Locates
confidence: 0.94
3. Exact crop
- You add description to your request — A plain-English description of the element — e.g. "the pricing comparison table" or "the main navigation bar".
- API takes a full-page screenshot — Playwright renders the full page. This screenshot is passed to the vision model as a base64 image.
- Vision model returns a CSS selector — Claude (or your configured model) analyzes the screenshot and returns the most precise CSS selector, plus a confidence score.
- Playwright clips and returns the crop — The API evaluates the selector, measures the bounding box, and returns a precisely cropped screenshot of just that element.
Quick start
Add description to any web screenshot request. Everything else is identical to a standard capture.
1const job = await client.screenshots.web({
2 url: "https://stripe.com/pricing",
3 description: "the pricing comparison table",
4 format: "png",
5});
6
7const result = await client.jobs.waitForResult(job.jobId);
8
9// AI metadata returned alongside the screenshot
10console.log(result.metadata.aiSelector); // ".pricing-table"
11console.log(result.metadata.aiConfidence); // 0.94Parameters
AI targeting adds two new fields to the standard POST /screenshots/web request body. All other web screenshot parameters still apply.
| Parameter | Type | Description |
|---|---|---|
descriptionrequired | string | Natural-language description of the element to capture. Describe what you see, not an implementation detail. Examples: "the pricing table", "the top navigation bar", "the monthly revenue chart". Max 500 characters. |
element | string | CSS selector fallback. If vision model confidence is below the threshold (default 0.70), the API falls back to this selector instead of returning the full page. Optional. |
aiModel | string | Override the vision model for this request. Defaults to anthropic/claude-opus-4-5. Accepted: anthropic/claude-sonnet-4-6, openai/gpt-4o, google/gemini-2.0-flash-001. |
aiConfidenceThreshold | number | Minimum confidence score (0–1) to accept the AI selector. Below this threshold the request falls back to element or full-page. Defaults to 0.70. |
Response fields
When AI targeting is used, the metadata object in the job result includes additional fields.
1{
2 "jobId": "job_web_7a91bcd3",
3 "status": "completed",
4 "screenshotUrl": "https://cdn.screenshotfreeapi.com/...",
5 "metadata": {
6 "aiSelector": ".pricing-table",
7 "aiConfidence": 0.94,
8 "aiUsed": true,
9 "processingMs": 4210,
10 "pageTitle": "Pricing — Stripe"
11 }
12}| Parameter | Type | Description |
|---|---|---|
metadata.aiSelector | string | The CSS selector the vision model selected. Useful for debugging or reuse as an element parameter. |
metadata.aiConfidence | number | Confidence score from 0 to 1. Above 0.90 is highly reliable. Below 0.70 triggers the fallback. |
metadata.aiUsed | boolean | true when the AI selector was applied to the crop. false when the fallback (element or full-page) was used instead. |
Confidence & fallback
Every AI targeting response includes a confidence score. The API uses this score to decide whether to use the AI selector or fall back to your explicit selector or full-page mode.
0.90 – 1.00High confidenceAI selects and crops the element0.70 – 0.89ModerateAI selects; result included in metadata for review< 0.70Low — fallbackFalls back to element selector or full pageSupply both description and element for belt-and-suspenders reliability. If the AI is confident, you get the crop. If not, the explicit selector takes over.
1const job = await client.screenshots.web({
2 url: "https://example.com",
3 // AI targeting + CSS fallback: if AI confidence < threshold,
4 // falls back to this explicit selector automatically
5 description: "the hero banner at the top",
6 element: ".hero",
7 format: "png",
8});Writing good descriptions
The quality of the description directly affects confidence. These patterns consistently produce high-confidence results:
Describe what you see
"the blue hero banner""#hero-section"Name the content type
"the pricing comparison table""a table"Use positional hints
"the navigation bar at the top""nav"Reference visible text
"the Get started for free button""a button"Describe the data shown
"the monthly revenue bar chart""a chart"Name the form by its action
"the sign-up form""a form"More examples
A range of descriptions that work reliably across real-world sites:
1// All of these descriptions work well in practice:
2
3// Pricing tables
4{ description: "the pricing comparison table" }
5
6// Navigation
7{ description: "the top navigation bar" }
8
9// CTAs
10{ description: "the main call-to-action button" }
11
12// Charts and data viz
13{ description: "the monthly revenue chart" }
14
15// Reviews
16{ description: "the customer testimonials section" }
17
18// Forms
19{ description: "the sign-up form" }
20
21// Footers
22{ description: "the footer with links and copyright" }Vision models
AI targeting routes through OpenRouter so you can swap models per request via aiModel.
| Parameter | Type | Description |
|---|---|---|
anthropic/claude-opus-4-5 | default | Best accuracy for complex or heavily styled layouts. Slower — adds ~2 s. |
anthropic/claude-sonnet-4-6 | string | Faster, still excellent for most layouts. Adds ~1 s. Good choice for high-volume pipelines. |
openai/gpt-4o | string | Strong alternative to Claude. Especially reliable on forms and e-commerce pages. |
google/gemini-2.0-flash-001 | string | Fastest and cheapest. Lower accuracy on dense or dynamic layouts. Use for simple, static pages. |
claude-opus-4-5 automatically. The job result includes the model actually used in metadata.aiModel.Limits & plan notes
| Parameter | Type | Description |
|---|---|---|
description max length | 500 chars | Descriptions longer than 500 characters are truncated before being sent to the vision model. |
AI latency overhead | 1–3 s | Vision model call adds 1–3 seconds to total job time on top of the Playwright render. |
Included in FREE plan | yes | AI targeting is not a paid add-on. Every plan includes it, subject to your screenshot quota. |
Quota counting | 1 per job | An AI-targeted job counts as 1 screenshot against your monthly quota, same as any other capture. |
Rate limiting | same | No separate rate limit for AI jobs. Requests count against your plan's requests-per-minute limit. |