API Reference

Scrape any website. Even the ones that fight back.

Base URL: https://api.sessemi.com

Sessemi is a scraping API that handles anti-bot protection automatically. Send a URL, get back clean HTML. Challenges from Cloudflare and DataDome are detected and solved transparently, with support for more vendors shipping soon. Your response comes back with success: true and clean content ready to parse.

Authentication

All requests require an API key passed via the X-API-Key header.

curl -X POST https://api.sessemi.com/scrape \
  -H "X-API-Key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'

You can also pass the key as a query parameter: ?key=your_api_key

Quick Start

Scrape a simple site (datacenter proxy, 1 credit):

curl -X POST https://api.sessemi.com/scrape \
  -H "X-API-Key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com"
  }'

Scrape a protected site with challenge solving (residential proxy, 10 credits):

curl -X POST https://api.sessemi.com/scrape \
  -H "X-API-Key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "pool": "residential"
  }'

Response:

{
  "success": true,
  "url": "https://example.com",
  "resolved_url": "https://example.com/",
  "html": "<!doctype html>...",
  "html_size": 48210,
  "status_code": 200,
  "challenge_type": "solved",
  "challenge_provider": "cloudflare",
  "pool": "residential",
  "solved": true,
  "credits_charged": 10,
  "credits_remaining": 2490,
  "duration_ms": 8420,
  "cookies": [...]
}

POST /scrape

Scrape a URL. Challenges are detected and solved automatically. Fingerprint management is always on.

Request Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| url (required) | string | Target URL to scrape. |
| pool | string | Proxy pool: datacenter (default, 1 credit) or residential (5 credits, or 10 with solve). Datacenter is fast and cheap — use it for sites without anti-bot protection. Residential uses premium IPs with higher trust scores — use it for protected sites. |
| solve | boolean | Enable challenge solving (Cloudflare, DataDome). Adds +5 credits to the pool base rate. Enabled by default with pool: "residential". When a challenge is detected, Sessemi solves it automatically and returns the page content. Set solve: false explicitly to disable — challenges will then be returned as challenge_unsolved. |
| country | string | Proxy country code (e.g. FR, DE, US). 240+ countries supported. Routes through a residential proxy in that country with proprietary fingerprint management. Only with pool: "residential" or stealth: true. |
| session | string | Session ID (Pro only). Pins requests to a persistent environment — cookies, localStorage, and JS state carry across requests. Use for authenticated scraping (login → scrape) or multi-step JS interactions. Expires after 5 minutes idle or 10 minutes total. Any string works — created automatically on first use. Not needed for anti-bot bypass or pagination, which Sessemi handles automatically. |
| screenshot | boolean | Include a base64-encoded PNG screenshot in the response. +1 credit. |
| wait_for | string | CSS selector to wait for before returning HTML. Comma-separated for OR logic: .products, .no-results |
| wait_for_js | string | JS expression that must return truthy before returning HTML. Example: window.__DATA__ !== undefined |
| wait_timeout | integer | Max seconds to wait for wait_for / wait_for_js. Default: 10. |
| script | string | JavaScript to execute after page load. Result returned in script_result. See JS Execution. |
| warmup | boolean | Prime the session by loading the site's homepage before the target URL. Reduces the chance of blocks on deep links. |
| render | boolean | Enable JavaScript rendering. Use for JS-rendered SPAs (React, Vue, Angular) where you need the full DOM. When omitted, Sessemi automatically chooses the fastest method for each request. |
| timeout | integer | Request timeout in seconds. Default: 90. Challenge solving on protected sites can take 15–30 seconds — setting this too low may cause solve attempts to fail. For JS-heavy or protected sites, keep the default or increase it. |
| retry | integer | Number of automatic retries on failure. Max: 5. See Retries. |
| retry_on | string[] | Failure types that trigger a retry: server_error, challenge_timeout, navigate_failed, empty_page |
| stealth | boolean | The recommended flag for protected sites. Automatically selects residential proxy, challenge solving, retries, and JS rendering. One toggle, maximum success rate. 10 credits/request. Explicit user values for pool, retry, etc. are not overridden — stealth only fills in defaults. |
| block_resources | boolean | Block images, fonts, media, and tracker scripts during JS rendering. Dramatically reduces page load time (5–10×) on heavy sites. The DOM stays intact — <img src> attributes are preserved. If screenshot: true, images stay loaded; fonts, media, and trackers are always blocked. |
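The stealth merge rule above (defaults filled in, explicit values never overridden) can be illustrated with a small client-side sketch. This mimics, rather than reproduces, server behavior; the retry count of 2 is an assumption, since the docs list retries among stealth defaults without giving a number.

```python
# Illustrative sketch of the stealth merge rule: stealth fills in
# defaults but never overrides explicitly supplied parameters.
STEALTH_DEFAULTS = {
    "pool": "residential",  # premium IPs
    "solve": True,          # challenge solving
    "render": True,         # JS rendering
    "retry": 2,             # assumed value; docs don't state the count
}

def apply_stealth(params: dict) -> dict:
    """Return the effective parameters after stealth fills in defaults."""
    if not params.get("stealth"):
        return dict(params)
    merged = dict(STEALTH_DEFAULTS)
    merged.update(params)  # explicit user values win
    return merged
```

So a payload like {"stealth": true, "pool": "datacenter"} keeps the datacenter pool while still gaining solving, rendering, and retries.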

Response Fields

| Field | Type | Description |
| --- | --- | --- |
| success | boolean | Whether the scrape succeeded. |
| url | string | The requested URL. |
| resolved_url | string | Final URL after redirects. |
| html | string | Full rendered HTML of the page. |
| html_size | integer | Size of HTML in bytes. |
| status_code | integer | HTTP status code of the page. |
| challenge_type | string | Challenge result: clear, solved, timeout, needs_human |
| challenge_provider | string | Detected provider: cloudflare, akamai, datadome, none |
| duration_ms | integer | Total request duration in milliseconds. |
| cookies | array | Cookies set by the page (name, value, domain, path). |
| screenshot | string | Base64-encoded PNG. Only present if requested. |
| script_result | any | Return value of custom JS. Only present if script was provided. |
| failure_type | string | On failure: server_error, challenge_timeout, challenge_unsolved, navigate_failed, blocked, burned |
| wait_for_match | string | Which wait condition matched: css, js, timeout |
| pool | string | Proxy pool used: datacenter or residential |
| error | string | Error message on failure. |
| warning | string | Non-fatal advisory. Present when the request succeeded but something may need attention (e.g. datacenter solve). |
| solved | boolean | Whether an anti-bot challenge was detected and successfully solved. Only true when challenge_type is "solved". |
| credits_charged | integer | Credits consumed by this request. |
| credits_remaining | integer | Credits left in your billing cycle. |
| json | string | Response body when Content-Type is application/json. Present instead of html for JSON API endpoints. |
| response_headers | object | HTTP response headers from the target. Available on direct HTTP requests; omitted when JS rendering is used. |
| stealth | boolean | true when stealth mode was active for this request. |
| queued_ms | integer | Time spent waiting in the queue, in milliseconds. 0 when resources were immediately available. |
| retry_count | integer | Number of retries that were performed. 0 if the first attempt succeeded. |
| user_agent | string | The User-Agent string used for this request. |
| challenge_details | string | Additional challenge info, e.g. type=slider, type=managed. |

Async / Batch

For long-running scrapes or batch jobs, use async mode. Submit requests with ?async=true — the API returns immediately with a task ID. Poll GET /tasks/{id} for results.

This eliminates HTTP timeout issues and enables batch scraping: submit multiple URLs and collect results as they finish. All parameters (stealth, country, retry, etc.) work identically.

Submit async request

curl -X POST "https://api.sessemi.com/scrape?async=true" \
  -H "X-API-Key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "stealth": true
  }'

Response (HTTP 202):

{
  "task_id": "049ebae94d32d35a3956cfc9167fce5b",
  "status": "queued",
  "poll": "/tasks/049ebae94d32d35a3956cfc9167fce5b"
}

Poll for results

curl https://api.sessemi.com/tasks/049ebae94d32d35a3956cfc9167fce5b \
  -H "X-API-Key: your_api_key"

Response when complete:

{
  "task_id": "049ebae94d32d35a3956cfc9167fce5b",
  "status": "done",
  "created_at": "2026-03-24T20:45:00Z",
  "started_at": "2026-03-24T20:45:00Z",
  "completed_at": "2026-03-24T20:45:02Z",
  "http_status": 200,
  "result": {
    "success": true,
    "url": "https://example.com",
    "html": "<!doctype html>...",
    "html_size": 252360,
    ...
  }
}

List recent tasks

curl https://api.sessemi.com/tasks \
  -H "X-API-Key: your_api_key"

Task lifecycle

| Status | Description |
| --- | --- |
| queued | Task accepted, waiting for capacity |
| running | Scrape in progress |
| done | Scrape completed successfully. Result in result field. |
| failed | Scrape failed (blocked, timeout, etc.). Error details in result field. |

Notes

Batch pattern: For large jobs, submit all URLs with ?async=true, then poll GET /tasks periodically to collect results. Each URL runs as an independent task with its own 5-minute budget.
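The batch pattern above can be sketched as a small polling loop. In this sketch, `submit` and `fetch_task` stand in for the two HTTP calls (POST /scrape?async=true and GET /tasks/{id}) and are passed in as callables, so the loop itself (timing, bookkeeping, the overall budget) is shown without any network code:

```python
import time

def run_batch(urls, submit, fetch_task, poll_interval=2.0, budget=300.0):
    """Submit every URL as an async task, then poll until all finish.

    submit(payload) -> task_id and fetch_task(task_id) -> task dict are
    injected; budget mirrors the 5-minute per-task limit from the docs.
    """
    tasks = {submit({"url": u, "stealth": True}): None for u in urls}
    deadline = time.monotonic() + budget
    while any(v is None for v in tasks.values()) and time.monotonic() < deadline:
        for task_id, result in list(tasks.items()):
            if result is not None:
                continue  # already collected
            task = fetch_task(task_id)
            if task["status"] in ("done", "failed"):
                tasks[task_id] = task
        if any(v is None for v in tasks.values()):
            time.sleep(poll_interval)
    return tasks
```

Tasks still pending when the budget runs out remain None, so the caller can tell timeouts apart from tasks that finished with status "failed".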

POST /screenshot

Navigate to a URL and return a full-page PNG screenshot. Returns raw image/png bytes, not JSON.

| Parameter | Type | Description |
| --- | --- | --- |
| url (required) | string | URL to screenshot. |
| timeout | integer | Timeout in seconds. Default: 30. |

curl -X POST https://api.sessemi.com/screenshot \
  -H "X-API-Key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}' \
  -o screenshot.png

Sessions

Pro plan only. Sessions are not available on the free trial. Upgrade to Pro to use persistent sessions.

Sessions let you make multiple requests that share the same environment — cookies, localStorage, and JS state carry across requests. Pass any string as the session parameter and it's created automatically on first use. Sessions expire after 5 minutes of inactivity, or 10 minutes total — whichever comes first.

Do you need sessions? Probably not. Anti-bot bypass, proxy management, and IP rotation are all automatic — no session parameter needed. Sessions are for authenticated scraping (login via JS → scrape protected pages) and multi-step interactions (click "Load more" → scrape expanded content). Pagination, crawling, and batch scraping don't need sessions.

{
  "url": "https://example.com/page-1",
  "session": "my-crawl-123"
}

All subsequent requests with "session": "my-crawl-123" share the same cookies, IP, and fingerprint. No setup or teardown needed.

GET /me

Returns your account info: tier, credits remaining, and limits.

curl https://api.sessemi.com/me \
  -H "X-API-Key: your_api_key"

Challenge Solving

When a site presents an anti-bot challenge (Cloudflare Turnstile, DataDome slider), Sessemi detects and solves it automatically. Your response comes back with success: true and the page content — the challenge is handled transparently.

Solving is optimized for batches. When you scrape multiple pages from the same domain, the first challenge may take longer to solve, but subsequent pages are typically much faster.

Enabling solve

Solving is enabled by default on residential proxy requests. You can also enable it on datacenter with solve: true.

{
  "url": "https://protected-site.com",
  "pool": "residential"
}

Which pool for solving? Residential proxies have the highest success rate for challenge solving. Datacenter proxies with solve: true work but are more likely to be flagged by anti-bot providers. Use datacenter + solve as a budget option (6 credits vs 10) for lighter protection.

To disable solving on residential (e.g. for sites with no anti-bot protection that just need a residential IP), set solve: false explicitly. Challenges will be returned as challenge_unsolved. Residential without solve costs 5 credits instead of 10.

Supported providers

| Provider | Challenge Types | Method |
| --- | --- | --- |
| Cloudflare | JS Challenge, Managed Challenge, Turnstile | JavaScript execution + proof-of-work computation |
| DataDome | Device Check, Slider CAPTCHA | Device attestation + automated slider interaction |

The response includes challenge_provider and challenge_type so you can see what happened.

Fingerprint Management

Every request uses a realistic device fingerprint matched to real-world configurations. This is always on. No configuration needed.

Proxies

Requests are routed through our proxy pools automatically based on the pool parameter. Datacenter is the default — fast and cheap. Use pool: "residential" when sites block datacenter IPs. Sessemi manages all proxy infrastructure: TLS fingerprinting, session stickiness, geo-targeting, and rotation.

Wait Conditions

Wait for dynamic content to render before returning HTML.

CSS Selector

Wait until a CSS selector is present in the DOM. Use comma-separated selectors for OR logic — the first match wins.

{
  "url": "https://example.com/products",
  "wait_for": ".product-card, .no-results",
  "wait_timeout": 15
}

JavaScript Expression

Wait until a JS expression returns a truthy value. Checked every 200ms.

{
  "url": "https://example.com/app",
  "wait_for_js": "window.__DATA__ && window.__DATA__.products.length > 0"
}

Scraping Tips

Heavy Pages

Some pages (news portals, ad-heavy homepages) load dozens of scripts and trackers. Combined with the higher latency of residential proxies, this can cause timeouts. Use wait_for to grab the content you need as soon as it appears, without waiting for every ad script to finish:

{
  "url": "https://heavy-site.com",
  "pool": "residential",
  "country": "JP",
  "wait_for": "#main-content",
  "wait_timeout": 15
}

Lazy-Loaded Images

Many sites defer image loading until the user scrolls. These images often use data-src instead of src until they enter the viewport. If you need the real image URLs, extract from data-src (or data-lazy-src, data-original) in your parsing logic, or trigger lazy loading with a script:

{
  "url": "https://example.com/products",
  "script": "window.scrollTo(0, document.body.scrollHeight); return true",
  "wait_for": ".product-card img[src*='cdn']",
  "wait_timeout": 10
}
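Once the HTML is back, extraction can apply the data-src fallback described above. A minimal sketch using only Python's stdlib html.parser; the attribute list matches the ones named in this section, and inline data: placeholders (a common trick in the real src) are skipped:

```python
from html.parser import HTMLParser

# Preference order: real src first, then common lazy-loading attributes
LAZY_ATTRS = ("src", "data-src", "data-lazy-src", "data-original")

class ImageExtractor(HTMLParser):
    """Collect image URLs, falling back to lazy-load attributes."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        for name in LAZY_ATTRS:
            value = attr_map.get(name)
            # Skip inline placeholder images some sites put in src
            if value and not value.startswith("data:"):
                self.urls.append(value)
                break

def image_urls(html: str) -> list:
    parser = ImageExtractor()
    parser.feed(html)
    return parser.urls
```

This runs on the html field of any /scrape response; extend LAZY_ATTRS if a target uses a different attribute name.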

JS-Rendered Content (SPAs)

Sites built with React, Vue, or Angular render content with JavaScript after the initial page load. The HTML shell loads quickly but the actual products/listings appear later. Always use wait_for or wait_for_js for these sites:

{
  "url": "https://spa-site.com/catalog",
  "wait_for_js": "document.querySelectorAll('.product-card').length >= 10",
  "wait_timeout": 15
}

Tip: Server-rendered sites (most traditional e-commerce, news, and catalog pages) return full HTML immediately — all element attributes (href, src, etc.) are present without needing wait_for. Use it only when you see empty or placeholder content in your results.

JS Execution

Run custom JavaScript after the page loads. The return value is included in script_result.

{
  "url": "https://example.com/products",
  "script": "return [...document.querySelectorAll('.price')].map(e => e.textContent)"
}

Script-only mode: Send script + session without a url to run JS on the current page of an existing session. Useful for pagination, clicking "load more", or extracting data after interactions.

Retries

Automatic retries with fresh proxies on failure. Set retry to the number of attempts (max 5). Each retry uses a new proxy.

{
  "url": "https://example.com",
  "retry": 2,
  "retry_on": ["challenge_timeout", "server_error"]
}

Default retry_on (when retries are set but types aren't specified): server_error, challenge_timeout, navigate_failed, empty_page.

Credit note: Only successful retry attempts are billed. Failed retries are free (see Scrape Failed Protection below).

Credit Pricing

Credits are charged based on the proxy pool and features you select.

| Configuration | Free | Starter | Pro |
| --- | --- | --- | --- |
| Datacenter | 1 | 1 | 1 |
| Datacenter + solve | 16 | 6 | 3 |
| Residential | 10 | 5 | 3 |
| Residential + solve | 25 | 10 | 5 |
| Screenshot addon | +1 | +1 | +1 |

Simple pricing: Datacenter is the default — 1 credit per request on any plan. Add pool: "residential" for premium IPs with challenge solving enabled by default. Add solve: false on residential to skip solving and pay just the residential base rate. Failed requests are free.
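The pricing above can be transcribed into a small client-side estimator, handy for budgeting a batch job before submitting it. Illustrative only; the credits_charged field in each response is authoritative:

```python
# (pool, solve) -> credits per plan, transcribed from the pricing table
CREDIT_TABLE = {
    ("datacenter", False):  {"free": 1,  "starter": 1,  "pro": 1},
    ("datacenter", True):   {"free": 16, "starter": 6,  "pro": 3},
    ("residential", False): {"free": 10, "starter": 5,  "pro": 3},
    ("residential", True):  {"free": 25, "starter": 10, "pro": 5},
}

def estimate_credits(plan, pool="datacenter", solve=None, screenshot=False):
    """Estimate credits for one request under the given plan and options."""
    if solve is None:
        solve = pool == "residential"  # solving is on by default on residential
    credits = CREDIT_TABLE[(pool, solve)][plan]
    if screenshot:
        credits += 1  # screenshot addon
    return credits
```

For example, a Starter-plan residential scrape with default solving estimates at 10 credits; adding solve: false drops it to 5.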

Scrape Failed Protection

We only charge for successful requests. If a scrape fails, you are not billed — regardless of the proxy pool or features used.

| Outcome | Failure Type | Billed? | Why |
| --- | --- | --- | --- |
| Success (200, 404) | - | Yes | You got the content you requested. |
| Blocked | blocked | No | Target site rejected the request. You got nothing useful. |
| Challenge timeout | challenge_timeout | No | Anti-bot challenge was not solved in time. |
| Challenge unsolved | challenge_unsolved | No | Anti-bot challenge detected but solving was not enabled. Add solve: true or use pool: "residential". |
| Navigate failed | navigate_failed | No | Page could not be loaded (DNS, timeout, crash). |
| Server error | server_error | No | Target returned HTTP 5xx. Not our fault or yours. |
| Empty page | empty_page | No | Page loaded but returned no usable content. |
| Session burned | burned | No | Request fingerprint was flagged. Automatically retried with a fresh identity. |

Fairness Policy

To prevent abuse, we monitor failure rates per account. If your failure rate exceeds 30% over a rolling 1-hour window (minimum 20 requests), the Scrape Failed Protection is temporarily disabled and all requests are billed — including failures.

This policy exists to prevent intentional scraping of unreachable or blocked targets to consume proxy bandwidth without cost. Normal usage is never affected.

Tip: If you're seeing a high failure rate, check that your target URLs are reachable and that you're using the right proxy pool. Residential proxies with the correct country dramatically reduce blocks on geo-restricted sites.
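If you want to track your own standing against this policy, the rolling window is easy to mirror client-side. A sketch using the documented thresholds (30% over 1 hour, minimum 20 requests); nothing here queries the API:

```python
from collections import deque
import time

class FailureRateMonitor:
    """Rolling failure-rate tracker mirroring the fairness-policy window."""

    def __init__(self, window_s=3600, threshold=0.30, min_requests=20):
        self.window_s = window_s
        self.threshold = threshold
        self.min_requests = min_requests
        self.events = deque()  # (timestamp, success) pairs

    def record(self, success, now=None):
        now = time.monotonic() if now is None else now
        self.events.append((now, success))
        cutoff = now - self.window_s
        while self.events and self.events[0][0] < cutoff:
            self.events.popleft()  # drop events older than the window

    def at_risk(self):
        if len(self.events) < self.min_requests:
            return False  # below the 20-request minimum
        failures = sum(1 for _, ok in self.events if not ok)
        return failures / len(self.events) > self.threshold
```

Call record(resp["success"]) after each scrape; when at_risk() flips to true, pause and investigate before the protection is suspended.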

Error Codes

| Code | Description |
| --- | --- |
| 200 | Success. |
| 400 | Bad request — missing URL or invalid parameter. |
| 401 | Missing or invalid API key. |
| 402 | Insufficient credits. Response includes credits_remaining. |
| 429 | Rate limit exceeded or all capacity in use. Retry after the Retry-After header value. |
| 500 | Unexpected internal error. Retry the request. |

On failure, the response includes "success": false with an error message and failure_type for programmatic handling.
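A dispatch on these codes might look like the sketch below. It works on the HTTP status plus the parsed response body; no request is actually made, and the retryable failure_type values follow the Retries section:

```python
RETRYABLE_STATUS = {429, 500}
RETRYABLE_FAILURES = {"server_error", "challenge_timeout", "navigate_failed", "empty_page"}

def classify(status_code, body):
    """Return 'ok', 'retry', or 'fatal' for a /scrape response."""
    if status_code == 200 and body.get("success"):
        return "ok"
    if status_code in RETRYABLE_STATUS:
        return "retry"  # back off first; honor Retry-After on 429
    if status_code in (400, 401, 402):
        return "fatal"  # fix the request, key, or credits
    if body.get("failure_type") in RETRYABLE_FAILURES:
        return "retry"  # transient on the target side
    return "fatal"
```

Pairing this with the API's own retry parameter is usually enough; client-side retries matter mostly for 429s and 500s, which the server-side retry cannot cover.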

Rate Limits

Requests are processed in order. If all capacity is in use, your request is queued for up to 5 minutes before timing out.

| Plan | Credits / Month | Rate Limit | Sessions |
| --- | --- | --- | --- |
| Free | 500 | 2 req/min | No |
| Starter (€20/mo) | 5,000 | 60 req/min | No |
| Pro (€100/mo) | 50,000 | Unlimited | Yes |

Need more? Additional credits and capacity available on request.