AI Screenshot Targeting: Capture Any Page Element by Description

Learn how AI screenshot element targeting works in ScreenshotFreeAPI — describe any page element in plain English and the vision AI finds and crops it automatically.

CSS selectors are fragile. A designer renames a class from pricing-table__wrapper to pricing-section__container, your screenshot job silently captures nothing, and you don't notice until a customer complains. ScreenshotFreeAPI's AI element targeting addresses this: you describe what you want in plain English in the description field, and a vision AI model finds the element for you, even if the underlying HTML changes.

How AI Element Targeting Works

When you include a description field in your screenshot request, the ScreenshotFreeAPI worker runs a two-phase pipeline. First, it captures a full-page screenshot of the target URL using Playwright. Second, it sends that screenshot, together with a structured element map (bounding boxes and text content extracted from the DOM), to Claude (model claude-opus-4-5) via OpenRouter in a multimodal vision prompt. The model reasons over the image and the element map together, identifies the element that best matches your description, and returns a CSS selector and bounding box coordinates. The worker then crops the full-page screenshot to those coordinates and returns the cropped image as the job result.
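To make the crop step concrete, here is a rough sketch of how a bounding box could be turned into a crop. It uses ImageMagick's geometry syntax as a stand-in; the worker's actual cropping code is not described in this post, and the crop_geometry helper, the argument order, and the sample coordinates are all assumptions made for illustration.

```bash
# Hypothetical helper: turns a bounding box into an ImageMagick crop geometry.
# Args: x y width height (pixels, relative to the full-page screenshot).
# ImageMagick expects geometry in the form WIDTHxHEIGHT+X+Y.
crop_geometry() {
  printf '%sx%s+%s+%s\n' "$3" "$4" "$1" "$2"
}

crop_geometry 240 1200 960 480   # prints 960x480+240+1200

# With ImageMagick installed, the crop itself would look like this
# (use `magick` instead of `convert` on ImageMagick 7):
# convert full-page.png -crop "$(crop_geometry 240 1200 960 480)" +repage cropped.png
```

The same idea applies to any image library: the crop is fully determined by the four bounding box values the model returns.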

The response includes a confidence score between 0 and 1. Set "fallbackToFullPage": true on your request to have the API return the full-page screenshot automatically whenever confidence falls below your threshold (0.7 by default).

Use Case 1: Pricing Table

Competitive intelligence teams frequently screenshot competitor pricing pages to track price changes. Rather than maintaining CSS selectors for each competitor's unique DOM structure, you describe the element once and the AI adapts.

bash
curl -X POST https://api.screenshotfreeapi.com/screenshots/web \
  -H "Authorization: Bearer sfa_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://competitor.com/pricing",
    "description": "the pricing table showing all plans and monthly prices",
    "format": "png",
    "fallbackToFullPage": true,
    "webhookUrl": "https://yourapp.com/hooks/screenshot"
  }'

# Result includes confidence metadata:
# {
#   "screenshots": [{ "url": "...", "width": 960, "height": 480 }],
#   "ai": {
#     "description": "the pricing table showing all plans and monthly prices",
#     "detectedSelector": ".pricing-section .plan-grid",
#     "confidence": 0.94,
#     "fallbackUsed": false
#   }
# }

Use Case 2: Hero Section

Design teams use hero section screenshots to create mood boards and track how marketing pages evolve over time. The hero section is visually obvious, but its markup varies widely between sites: some use <section id="hero">, others <div class="above-fold">, others a full-bleed <header>.

bash
curl -X POST https://api.screenshotfreeapi.com/screenshots/web \
  -H "Authorization: Bearer sfa_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "description": "the hero section with headline and call-to-action button",
    "format": "webp",
    "viewport": { "width": 1440, "height": 900 }
  }'

Use Case 3: Navigation Bar

Capturing navigation bars across multiple sites is common when auditing IA (information architecture) patterns or producing documentation screenshots. Navigation bars are notoriously inconsistent in their HTML structure but visually distinctive.

bash
curl -X POST https://api.screenshotfreeapi.com/screenshots/web \
  -H "Authorization: Bearer sfa_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "description": "the top navigation bar with logo and menu links",
    "format": "png",
    "viewport": { "width": 1440, "height": 900 }
  }'
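When auditing several sites, the same request can be generated in a loop. The sketch below assumes a hypothetical nav_payload helper and placeholder site URLs. It only prints the JSON bodies, and the commented-out lines show how each one could be piped to the same endpoint used above.

```bash
# Hypothetical helper: builds the JSON body for one navigation-bar capture.
# Assumes the URL contains no characters that need JSON escaping
# (double quotes or backslashes).
nav_payload() {
  printf '{"url":"%s","description":"%s","format":"png","viewport":{"width":1440,"height":900}}' \
    "$1" "the top navigation bar with logo and menu links"
}

# Placeholder sites; replace with the pages you are auditing.
for site in "https://example.com" "https://example.org"; do
  nav_payload "$site"
  echo
  # To submit instead of printing, pipe the body to curl (-d @- reads stdin):
  # nav_payload "$site" | curl -X POST https://api.screenshotfreeapi.com/screenshots/web
  #   -H "Authorization: Bearer sfa_YOUR_KEY" -H "Content-Type: application/json" -d @-
done
```

Keeping the description in one place means every site in the audit is targeted with exactly the same wording, which makes confidence scores easier to compare.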

Use Case 4: Product Card

E-commerce teams screenshot product cards for catalog automation, price tracking, and availability monitoring. Product card layouts differ between platforms (Shopify, WooCommerce, custom), but describing "the first product card showing product image, name, and price" works reliably across all of them.

bash
curl -X POST https://api.screenshotfreeapi.com/screenshots/web \
  -H "Authorization: Bearer sfa_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://shop.example.com/products",
    "description": "the first product card showing product image, name, and price",
    "format": "jpeg",
    "quality": 90,
    "fallbackToFullPage": false
  }'

Confidence Scores and Fallback Behavior

Every AI-targeted screenshot result includes a confidence value. Values above 0.85 indicate high certainty — the model clearly identified the element and its bounding box is precise. Values between 0.7 and 0.85 indicate moderate confidence — the element was found but may be slightly over- or under-cropped. Values below 0.7 indicate ambiguity — the description matched multiple candidates or nothing clearly.

  • Set "fallbackToFullPage": true to automatically return a full-page screenshot when confidence is below the threshold.
  • Use the confidence value in your application logic to decide whether to flag the result for human review.
  • Refine your description if you receive low-confidence results — more specific descriptions produce higher confidence. For example, "the pricing table" → "the pricing comparison table with three columns showing Starter, Business, and Enterprise plans".
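As a minimal sketch of the application logic mentioned above, the function below maps a confidence value to the three bands described in this section. The function name and the band labels are illustrative, not part of the API; what you do with each band (accept, review, retry with a sharper description) is up to your pipeline.

```bash
# Hypothetical helper: maps a confidence score to the bands described above.
#   above 0.85      -> high
#   0.7 up to 0.85  -> moderate
#   below 0.7       -> low
# awk is used because plain shell arithmetic cannot compare decimals.
confidence_band() {
  awk -v c="$1" 'BEGIN {
    if (c > 0.85)      print "high"
    else if (c >= 0.7) print "moderate"
    else               print "low"
  }'
}

confidence_band 0.94   # prints high
confidence_band 0.78   # prints moderate
confidence_band 0.41   # prints low
```

A typical policy would be to accept "high" results as-is, queue "moderate" results for a spot check, and send "low" results for human review or a retry.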

Tip

AI element targeting calls consume 2 screenshot credits per request (one for the full-page capture, one for the AI targeting pipeline). Budget accordingly when processing large batches.
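For a quick budget estimate, the arithmetic is simply two credits per AI-targeted request. The helper name and the batch size below are made up for illustration.

```bash
# Hypothetical helper: credits consumed by a batch of AI-targeted requests,
# at 2 credits each (full-page capture + AI targeting pipeline).
ai_batch_credits() {
  echo $(( $1 * 2 ))
}

ai_batch_credits 5000   # prints 10000
```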

Try AI element targeting free — no CSS selectors, no DOM archaeology, just describe what you want.

Priya Nair

Senior Engineer at ScreenshotFreeAPI