API Reference
Scrape any website. Even the ones that fight back.
Sessemi is a scraping API that handles anti-bot protection automatically. Send a URL, get back clean HTML. Challenges from Cloudflare and DataDome are detected and solved transparently, with more vendors shipping soon. Your response comes back with success: true and clean content ready to parse.
Authentication
All requests require an API key passed via the X-API-Key header.
```bash
curl -X POST https://api.sessemi.com/scrape \
  -H "X-API-Key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'
```
You can also pass the key as a query parameter: ?key=your_api_key
Quick Start
Scrape a simple site (datacenter proxy, 1 credit):
```bash
curl -X POST https://api.sessemi.com/scrape \
  -H "X-API-Key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com"
  }'
```
Scrape a protected site with challenge solving (residential proxy, 10 credits):
```bash
curl -X POST https://api.sessemi.com/scrape \
  -H "X-API-Key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "pool": "residential"
  }'
```
Response:
```json
{
  "success": true,
  "url": "https://example.com",
  "resolved_url": "https://example.com/",
  "html": "<!doctype html>...",
  "html_size": 48210,
  "status_code": 200,
  "challenge_type": "solved",
  "challenge_provider": "cloudflare",
  "pool": "residential",
  "solved": true,
  "credits_charged": 10,
  "credits_remaining": 2490,
  "duration_ms": 8420,
  "cookies": [...]
}
```
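The curl calls above translate directly to any HTTP client. A minimal Python sketch of assembling the same request — the helper name `build_scrape_request` is our own, not part of the API:

```python
import json

API_URL = "https://api.sessemi.com/scrape"

def build_scrape_request(url, pool="datacenter", **extra):
    """Assemble the headers and JSON body for a POST /scrape call."""
    headers = {"X-API-Key": "your_api_key", "Content-Type": "application/json"}
    payload = {"url": url, "pool": pool, **extra}
    return headers, json.dumps(payload)

headers, body = build_scrape_request("https://example.com", pool="residential")
# Send with any HTTP client, e.g.:
# resp = requests.post(API_URL, headers=headers, data=body).json()
# if resp["success"]: parse resp["html"]
```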
POST /scrape
Scrape a URL. Challenges are detected and solved automatically. Fingerprint management is always on.
Request Parameters
| Parameter | Type | Description |
|---|---|---|
| url required | string | Target URL to scrape. |
| pool | string | Proxy pool: datacenter (default, 1 credit) or residential (5 credits, or 10 with solve). Datacenter is fast and cheap — use for sites without anti-bot protection. Residential uses premium IPs with higher trust scores — use for protected sites. |
| solve | boolean | Enable challenge solving (Cloudflare, DataDome). Adds +5 credits to the pool base rate. Enabled by default with pool: "residential". When a challenge is detected, Sessemi solves it automatically and returns the page content. Set solve: false explicitly to disable — challenges will be returned as challenge_unsolved. |
| country | string | Proxy country code (e.g. FR, DE, US). 240+ countries supported. Routes through a residential proxy in that country with proprietary fingerprint management. Only with pool: "residential" or stealth: true. |
| session | string | Session ID (Pro only). Pins requests to a persistent environment — cookies, localStorage, and JS state carry across requests. Use for authenticated scraping (login → scrape) or multi-step JS interactions. Expires after 5 minutes idle or 10 minutes total. Any string works — created automatically on first use. Not needed for anti-bot bypass or pagination — Sessemi handles anti-bot bypass automatically. |
| screenshot | boolean | Include a base64-encoded PNG screenshot in the response. +1 credit |
| wait_for | string | CSS selector to wait for before returning HTML. Comma-separated for OR logic: .products, .no-results |
| wait_for_js | string | JS expression that must return truthy before returning HTML. Example: window.__DATA__ !== undefined |
| wait_timeout | integer | Max seconds to wait for wait_for / wait_for_js. Default: 10. |
| script | string | JavaScript to execute after page load. Result returned in script_result. See JS Execution. |
| warmup | boolean | Prime the session by loading the site's homepage before the target URL. Reduces the chance of blocks on deep links. |
| render | boolean | Enable JavaScript rendering. Use for JS-rendered SPAs (React, Vue, Angular) where you need the full DOM. When omitted, Sessemi automatically chooses the fastest method for each request. |
| timeout | integer | Request timeout in seconds. Default: 90. Challenge solving on protected sites can take 15–30 seconds — setting this too low may cause solve attempts to fail. For JS-heavy or protected sites, keep the default or increase it. |
| retry | integer | Number of automatic retries on failure. Max: 5. See Retries. |
| retry_on | string[] | Failure types that trigger retry: server_error, challenge_timeout, navigate_failed, empty_page |
| stealth | boolean | The recommended flag for protected sites. Automatically selects residential proxy, challenge solving, retries, and JS rendering. One toggle, maximum success rate. 10 credits/request. Explicit user values for pool, retry, etc. are not overridden — stealth only fills in defaults. |
| block_resources | boolean | Block images, fonts, media, and tracker scripts during JS rendering. Dramatically reduces page load time (5–10×) on heavy sites. DOM stays intact — <img src> attributes are preserved. If screenshot: true, images stay loaded. Fonts, media, and trackers are always blocked. |
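The `stealth` row above says explicit user values are never overridden — stealth only fills in defaults. A sketch of that merge logic as we understand it (the default set, and in particular the internal retry count, is an assumption inferred from the table, not a documented contract):

```python
# Assumed stealth defaults, inferred from the parameter table above.
# The exact retry count Sessemi applies internally is not documented.
STEALTH_DEFAULTS = {"pool": "residential", "solve": True, "render": True, "retry": 2}

def apply_stealth_defaults(params):
    """Fill in stealth defaults without overriding explicit user values."""
    if not params.get("stealth"):
        return dict(params)
    merged = dict(STEALTH_DEFAULTS)
    merged.update(params)  # explicit user keys win over stealth defaults
    return merged
```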
Response Fields
| Field | Type | Description |
|---|---|---|
| success | boolean | Whether the scrape succeeded. |
| url | string | The requested URL. |
| resolved_url | string | Final URL after redirects. |
| html | string | Full rendered HTML of the page. |
| html_size | integer | Size of HTML in bytes. |
| status_code | integer | HTTP status code of the page. |
| challenge_type | string | Challenge result: clear, solved, timeout, needs_human |
| challenge_provider | string | Detected provider: cloudflare, akamai, datadome, none |
| duration_ms | integer | Total request duration in milliseconds. |
| cookies | array | Cookies set by the page (name, value, domain, path). |
| screenshot | string | Base64-encoded PNG. Only present if requested. |
| script_result | any | Return value of custom JS. Only present if script was provided. |
| failure_type | string | On failure: server_error, challenge_timeout, challenge_unsolved, navigate_failed, blocked, burned |
| wait_for_match | string | Which wait condition matched: css, js, timeout |
| pool | string | Proxy pool used: datacenter or residential |
| error | string | Error message on failure. |
| warning | string | Non-fatal advisory. Present when the request succeeded but something may need attention (e.g. datacenter solve). |
| solved | boolean | Whether an anti-bot challenge was detected and successfully solved. Only true when challenge_type is "solved". |
| credits_charged | integer | Credits consumed by this request. |
| credits_remaining | integer | Credits left in your billing cycle. |
| json | string | Response body when Content-Type is application/json. Present instead of html for JSON API endpoints. |
| response_headers | object | HTTP response headers from the target. Available on direct HTTP requests; omitted when JS rendering is used. |
| stealth | boolean | true when stealth mode was active for this request. |
| queued_ms | integer | Time spent waiting in the queue, in milliseconds. 0 when resources were immediately available. |
| retry_count | integer | Number of retries that were performed. 0 if the first attempt succeeded. |
| user_agent | string | The User-Agent string used for this request. |
| challenge_details | string | Additional challenge info, e.g. type=slider, type=managed. |
Async / Batch
For long-running scrapes or batch jobs, use async mode. Submit requests with ?async=true — the API returns immediately with a task ID. Poll GET /tasks/{id} for results.
This eliminates HTTP timeout issues and enables batch scraping: submit multiple URLs and collect results as they finish. All parameters (stealth, country, retry, etc.) work identically.
Submit async request
```bash
curl -X POST "https://api.sessemi.com/scrape?async=true" \
  -H "X-API-Key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "stealth": true
  }'
```
Response (HTTP 202):
```json
{
  "task_id": "049ebae94d32d35a3956cfc9167fce5b",
  "status": "queued",
  "poll": "/tasks/049ebae94d32d35a3956cfc9167fce5b"
}
```
Poll for results
```bash
curl https://api.sessemi.com/tasks/049ebae94d32d35a3956cfc9167fce5b \
  -H "X-API-Key: your_api_key"
```
Response when complete:
```json
{
  "task_id": "049ebae94d32d35a3956cfc9167fce5b",
  "status": "done",
  "created_at": "2026-03-24T20:45:00Z",
  "started_at": "2026-03-24T20:45:00Z",
  "completed_at": "2026-03-24T20:45:02Z",
  "http_status": 200,
  "result": {
    "success": true,
    "url": "https://example.com",
    "html": "<!doctype html>...",
    "html_size": 252360,
    ...
  }
}
```
List recent tasks
```bash
curl https://api.sessemi.com/tasks \
  -H "X-API-Key: your_api_key"
```
Task lifecycle
| Status | Description |
|---|---|
| queued | Task accepted, waiting for capacity. |
| running | Scrape in progress. |
| done | Scrape completed successfully. Result in result field. |
| failed | Scrape failed (blocked, timeout, etc.). Error details in result field. |
Notes
- Tasks are stored server-side — results survive page reloads and client disconnects. Poll from any client.
- Each task has a 5-minute execution budget. If the scrape (including retries) doesn't complete within 5 minutes, the task is marked failed.
- Completed tasks are automatically deleted after 1 hour. Poll promptly or store results on your end.
- ?async=true is not compatible with session (sessions require a persistent connection).
- Billing works identically — credits are charged when the scrape completes.
- For batch scraping, submit each URL with ?async=true, then poll GET /tasks periodically to collect results. Each URL runs as an independent task with its own 5-minute budget.
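The submit-then-poll flow above can be sketched in Python. The HTTP call is injected as a callable (`fetch(method, path, body)` returning parsed JSON) so the control flow is clear; `run_batch` and `fetch` are our own names, not part of the API:

```python
import time

def run_batch(fetch, urls, interval=2.0, budget=300):
    """Submit each URL as an async task, then poll /tasks/{id} until all finish."""
    tasks = {}
    for url in urls:
        resp = fetch("POST", "/scrape?async=true", {"url": url, "stealth": True})
        tasks[resp["task_id"]] = None  # None = still pending
    deadline = time.time() + budget
    while any(v is None for v in tasks.values()) and time.time() < deadline:
        for task_id, done in list(tasks.items()):
            if done is None:
                status = fetch("GET", f"/tasks/{task_id}", None)
                if status["status"] in ("done", "failed"):
                    tasks[task_id] = status
        if any(v is None for v in tasks.values()):
            time.sleep(interval)  # mirror the server's polling model
    return tasks
```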
POST /screenshot
Navigate to a URL and return a full-page PNG screenshot. Returns raw image/png bytes, not JSON.
| Parameter | Type | Description |
|---|---|---|
| url required | string | URL to screenshot. |
| timeout | integer | Timeout in seconds. Default: 30. |
```bash
curl -X POST https://api.sessemi.com/screenshot \
  -H "X-API-Key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}' \
  -o screenshot.png
```
Sessions
Sessions let you make multiple requests that share the same environment — cookies, localStorage, and JS state carry across requests. Pass any string as the session parameter and it's created automatically on first use. Sessions expire after 5 minutes of inactivity, or 10 minutes total — whichever comes first.
```json
{
  "url": "https://example.com/page-1",
  "session": "my-crawl-123"
}
```
All subsequent requests with "session": "my-crawl-123" share the same cookies, IP, and fingerprint. No setup or teardown needed.
GET /me
Returns your account info: tier, credits remaining, and limits.
```bash
curl https://api.sessemi.com/me \
  -H "X-API-Key: your_api_key"
```
Challenge Solving
When a site presents an anti-bot challenge (Cloudflare Turnstile, DataDome slider), Sessemi detects and solves it automatically. Your response comes back with success: true and the page content — the challenge is handled transparently.
Solving is optimized for batches. When you scrape multiple pages from the same domain, the first challenge may take longer to solve, but subsequent pages are typically much faster.
Enabling solve
Solving is enabled by default on residential proxy requests. You can also enable it on datacenter with solve: true.
```json
{
  "url": "https://protected-site.com",
  "pool": "residential"
}
```
Datacenter requests with solve: true work, but datacenter IPs are more likely to be flagged by anti-bot providers. Use datacenter + solve as a budget option (6 credits vs 10) for lighter protection.
To disable solving on residential (e.g. for sites with no anti-bot protection that just need a residential IP), set solve: false explicitly. Challenges will be returned as challenge_unsolved. Residential without solve costs 5 credits instead of 10.
Supported providers
| Provider | Challenge Types | Method |
|---|---|---|
| Cloudflare | JS Challenge, Managed Challenge, Turnstile | JavaScript execution + proof-of-work computation |
| DataDome | Device Check, Slider CAPTCHA | Device attestation + automated slider interaction |
The response includes challenge_provider and challenge_type so you can see what happened.
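A small sketch of inspecting those two fields in a response — `summarize_challenge` is a hypothetical helper, not part of the API:

```python
def summarize_challenge(resp):
    """Human-readable summary of what the solver did, from a /scrape response."""
    ctype = resp.get("challenge_type", "clear")
    if ctype == "clear":
        return "no challenge encountered"
    provider = resp.get("challenge_provider", "unknown")
    if ctype == "solved":
        return f"{provider} challenge solved"
    return f"{provider} challenge failed: {ctype}"  # timeout / needs_human
```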
Fingerprint Management
Every request uses a realistic device fingerprint matched to real-world configurations. This is always on. No configuration needed.
Proxies
Requests are routed through our proxy pools automatically based on the pool parameter. Datacenter is the default — fast and cheap. Use pool: "residential" when sites block datacenter IPs. Sessemi manages all proxy infrastructure: TLS fingerprinting, session stickiness, geo-targeting, and rotation.
Wait Conditions
Wait for dynamic content to render before returning HTML.
CSS Selector
Wait until a CSS selector is present in the DOM. Use comma-separated selectors for OR logic — the first match wins.
```json
{
  "url": "https://example.com/products",
  "wait_for": ".product-card, .no-results",
  "wait_timeout": 15
}
```
JavaScript Expression
Wait until a JS expression returns a truthy value. Checked every 200ms.
```json
{
  "url": "https://example.com/app",
  "wait_for_js": "window.__DATA__ && window.__DATA__.products.length > 0"
}
```
Scraping Tips
Heavy Pages
Some pages (news portals, ad-heavy homepages) load dozens of scripts and trackers. Through residential proxies with higher latency, this can cause timeouts. Use wait_for to grab the content you need as soon as it appears, without waiting for every ad script to finish:
```json
{
  "url": "https://heavy-site.com",
  "pool": "residential",
  "country": "JP",
  "wait_for": "#main-content",
  "wait_timeout": 15
}
```
Lazy-Loaded Images
Many sites defer image loading until the user scrolls. These images often use data-src instead of src until they enter the viewport. If you need the real image URLs, extract from data-src (or data-lazy-src, data-original) in your parsing logic, or trigger lazy loading with a script:
```json
{
  "url": "https://example.com/products",
  "script": "window.scrollTo(0, document.body.scrollHeight); return true",
  "wait_for": ".product-card img[src*='cdn']",
  "wait_timeout": 10
}
```
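On the parsing side, extracting real image URLs with a data-src fallback can be sketched with Python's stdlib HTML parser (the attribute preference order follows the list above; `extract_image_urls` is our own helper):

```python
from html.parser import HTMLParser

# Prefer lazy-load attributes over src; plain src is the last resort.
LAZY_ATTRS = ("data-src", "data-lazy-src", "data-original", "src")

class ImageExtractor(HTMLParser):
    """Collect image URLs, preferring lazy-load attributes over src."""
    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attrs = dict(attrs)
        for name in LAZY_ATTRS:
            if attrs.get(name):
                self.urls.append(attrs[name])
                break

def extract_image_urls(html):
    parser = ImageExtractor()
    parser.feed(html)
    return parser.urls
```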
JS-Rendered Content (SPAs)
Sites built with React, Vue, or Angular render content with JavaScript after the initial page load. The HTML shell loads quickly but the actual products/listings appear later. Always use wait_for or wait_for_js for these sites:
```json
{
  "url": "https://spa-site.com/catalog",
  "wait_for_js": "document.querySelectorAll('.product-card').length >= 10",
  "wait_timeout": 15
}
```
For server-rendered sites, the initial HTML already contains the data you need — attributes (href, src, etc.) are present without needing wait_for. Use it only when you see empty or placeholder content in your results.
JS Execution
Run custom JavaScript after the page loads. The return value is included in script_result.
```json
{
  "url": "https://example.com/products",
  "script": "return [...document.querySelectorAll('.price')].map(e => e.textContent)"
}
```
Script-only mode: Send script + session without a url to run JS on the current page of an existing session. Useful for pagination, clicking "load more", or extracting data after interactions.
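A pagination sequence using script-only mode might look like the following sketch. The `.next-page` selector and the `paginate_payloads` helper are hypothetical examples, not part of the API:

```python
def paginate_payloads(start_url, session_id, pages):
    """First request loads the page; follow-ups click 'next' in the same session."""
    yield {"url": start_url, "session": session_id}
    for _ in range(pages - 1):
        yield {
            "session": session_id,  # no url: runs against the session's current page
            "script": "document.querySelector('.next-page').click(); "
                      "return document.documentElement.outerHTML",
        }
```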
Retries
Automatic retries with fresh proxies on failure. Set retry to the number of attempts (max 5). Each retry uses a new proxy.
```json
{
  "url": "https://example.com",
  "retry": 2,
  "retry_on": ["challenge_timeout", "server_error"]
}
```
Default retry_on (when retries are set but types aren't specified): server_error, challenge_timeout, navigate_failed, empty_page.
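The retry decision described above reduces to a small predicate — a sketch of the client-visible logic, with `should_retry` as our own name:

```python
DEFAULT_RETRY_ON = ("server_error", "challenge_timeout", "navigate_failed", "empty_page")

def should_retry(failure_type, attempt, retry=0, retry_on=None):
    """True if a failed attempt number `attempt` (0-based) should be retried."""
    if attempt >= retry:          # retry budget (max 5) exhausted
        return False
    return failure_type in (retry_on or DEFAULT_RETRY_ON)
```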
Credit Pricing
Credits are charged based on the proxy pool and features you select.
| Configuration | Free | Starter | Pro |
|---|---|---|---|
| Datacenter | 1 | 1 | 1 |
| Datacenter + solve | 16 | 6 | 3 |
| Residential | 10 | 5 | 3 |
| Residential + solve | 25 | 10 | 5 |
| Screenshot addon | +1 | +1 | +1 |
Use pool: "residential" for premium IPs with challenge solving enabled by default. Add solve: false on residential to skip solving and pay just the residential base rate. Failed requests are free.
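The pricing table translates into a simple lookup. A sketch, with the values transcribed from the table above (`credit_cost` is our own helper; the residential-implies-solve default follows the solve parameter's documented behavior):

```python
# (pool, solve) -> credits, transcribed from the pricing table above.
COSTS = {
    "free":    {("datacenter", False): 1, ("datacenter", True): 16,
                ("residential", False): 10, ("residential", True): 25},
    "starter": {("datacenter", False): 1, ("datacenter", True): 6,
                ("residential", False): 5, ("residential", True): 10},
    "pro":     {("datacenter", False): 1, ("datacenter", True): 3,
                ("residential", False): 3, ("residential", True): 5},
}

def credit_cost(plan, pool="datacenter", solve=None, screenshot=False):
    """Credits charged for one successful request."""
    if solve is None:
        solve = pool == "residential"  # solving is on by default for residential
    return COSTS[plan][(pool, solve)] + (1 if screenshot else 0)
```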
Scrape Failed Protection
We only charge for successful requests. If a scrape fails, you are not billed — regardless of the proxy pool or features used.
| Outcome | Failure Type | Billed? | Why |
|---|---|---|---|
| Success (200, 404) | — | Yes | You got the content you requested. |
| Blocked | blocked | No | Target site rejected the request. You got nothing useful. |
| Challenge timeout | challenge_timeout | No | Anti-bot challenge was not solved in time. |
| Challenge unsolved | challenge_unsolved | No | Anti-bot challenge detected but solving was not enabled. Add solve: true or use pool: "residential". |
| Navigate failed | navigate_failed | No | Page could not be loaded (DNS, timeout, crash). |
| Server error | server_error | No | Target returned HTTP 5xx. Not our fault or yours. |
| Empty page | empty_page | No | Page loaded but returned no usable content. |
| Session burned | burned | No | Request fingerprint was flagged. Automatically retried with a fresh identity. |
Fairness Policy
To prevent abuse, we monitor failure rates per account. If your failure rate exceeds 30% over a rolling 1-hour window (minimum 20 requests), the Scrape Failed Protection is temporarily disabled and all requests are billed — including failures.
This policy exists to prevent intentional scraping of unreachable or blocked targets to consume proxy bandwidth without cost. Normal usage is never affected.
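The threshold logic above can be stated precisely — a sketch of the check as described (`protection_active` is our own name; the rolling-window bookkeeping is simplified to a list of outcomes from the last hour):

```python
def protection_active(outcomes):
    """outcomes: success booleans for your requests in the rolling 1-hour window.
    Protection stays on below the 20-request minimum, or while the failure
    rate is at or below 30%."""
    if len(outcomes) < 20:
        return True
    failures = sum(1 for ok in outcomes if not ok)
    return failures / len(outcomes) <= 0.30
```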
Tip: combining stealth with country dramatically reduces blocks on geo-restricted sites.
Error Codes
On failure, the response includes "success": false with an error message and failure_type for programmatic handling. If you run out of credits, check credits_remaining; when rate limited, back off for the Retry-After header value.
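A sketch of programmatic failure handling around that shape — `check_response` is a hypothetical client-side helper, not part of the API:

```python
def check_response(resp):
    """Return the response on success; raise with failure_type on failure."""
    if resp.get("success"):
        return resp
    failure = resp.get("failure_type", "unknown")
    raise RuntimeError(f"{failure}: {resp.get('error', '')}")
```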
Rate Limits
Requests are processed in order. If all capacity is in use, your request is queued for up to 5 minutes before timing out.
| Plan | Credits / Month | Rate Limit | Sessions |
|---|---|---|---|
| Free | 500 | 2 req/min | No |
| Starter (€20/mo) | 5,000 | 60 req/min | No |
| Pro (€100/mo) | 50,000 | Unlimited | Yes |