v1.0
API Reference
A single REST API for extracting structured data from any URL. Every endpoint returns JSON; all except /v1/html, which takes a query parameter, accept a JSON request body.
Authentication
All requests require your API key as a Bearer token in the Authorization header. Keys are prefixed with cfai_. Generate keys in the dashboard.
Authorization: Bearer cfai_your_key_here
Never expose your key in client-side code or public repositories.
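One way to keep the key out of source code is to read it from an environment variable on the server side. The sketch below assumes a variable named CFAI_API_KEY; that name and both helper functions are illustrative and not part of the API.

```python
import os

API_KEY_PREFIX = "cfai_"  # prefix stated in the Authentication section


def key_from_env(var_name: str = "CFAI_API_KEY") -> str:
    """Read the key from the environment. The variable name is an assumption."""
    key = os.environ.get(var_name, "")
    if not key.startswith(API_KEY_PREFIX):
        raise RuntimeError(
            f"{var_name} is unset or does not start with {API_KEY_PREFIX!r}"
        )
    return key


def auth_headers(api_key: str) -> dict:
    """Headers for an authenticated JSON request."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Calling auth_headers(key_from_env()) then yields the same two headers the curl examples pass with -H.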
/v1/markdown
5 credits
Fetches the target URL via headless browser, extracts the main content block, and strips boilerplate (navigation, ads, footers, cookie banners). Returns structured Markdown with word count and metadata.
Request body
| field | type | req | description |
|---|---|---|---|
| url | string | yes | Full URL to scrape. Must include protocol (https://). |
| options.includeImages | boolean | no | Embed image alt-text in Markdown output. Default: false. |
| options.timeout | number | no | Timeout in ms. Default: 30000. Max: 60000. |
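The fields above could be assembled from Python roughly as follows. This is a sketch using only the standard library; it assumes the dotted names (options.includeImages, options.timeout) map to a nested "options" object in the JSON body, which the table implies but does not state. The function names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://contextforai.ashutoshswamy.in"
MAX_TIMEOUT_MS = 60000  # documented maximum for options.timeout


def build_markdown_request(api_key, url, include_images=False, timeout_ms=30000):
    """Build (but do not send) a POST request for /v1/markdown."""
    # Accepting http:// as well as https:// is an assumption.
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must include the protocol, e.g. https://")
    if not 0 < timeout_ms <= MAX_TIMEOUT_MS:
        raise ValueError(f"timeout must be between 1 and {MAX_TIMEOUT_MS} ms")
    body = {
        "url": url,
        # Assumption: dotted option names are sent as a nested object.
        "options": {"includeImages": include_images, "timeout": timeout_ms},
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/markdown",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout_s=70.0):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout_s) as resp:
        return json.load(resp)
```

Keeping the builder separate from send() makes it possible to inspect or log the request before any credits are spent.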
Example
curl -X POST https://contextforai.ashutoshswamy.in/v1/markdown \
-H "Authorization: Bearer cfai_your_key_here" \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com"}'
Response
{
  "url": "https://example.com",
  "title": "Example Domain",
  "markdown": "# Example Domain\n\nThis domain is for use in illustrative examples.",
  "wordCount": 17,
  "extractedAt": "2026-07-01T10:00:00Z"
}
/v1/html
5 credits
Fetches the target URL and returns the raw HTML as-is: no Readability parsing, no boilerplate stripping. Useful when you want to run your own extraction downstream. The only GET endpoint; pass the target as a query parameter.
Query parameters
| field | type | req | description |
|---|---|---|---|
| url | string | yes | Full URL to fetch, as a query parameter. Must include protocol (https://). |
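Because the target is itself a URL, any ? or & it contains would otherwise be read as part of the API call, so it is safer to percent-encode it. A minimal sketch with the standard library follows; build_html_request is a hypothetical helper, and it assumes the service applies standard query-string decoding.

```python
import urllib.request
from urllib.parse import urlencode

BASE_URL = "https://contextforai.ashutoshswamy.in"


def build_html_request(api_key, target_url):
    """Build (but do not send) a GET request for /v1/html."""
    # urlencode percent-encodes reserved characters in the target URL.
    query = urlencode({"url": target_url})
    return urllib.request.Request(
        f"{BASE_URL}/v1/html?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )
```

For a target such as https://example.com/search?q=cats&page=2, the whole string travels as the single url parameter instead of leaking page=2 into the API request.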
Example
curl "https://contextforai.ashutoshswamy.in/v1/html?url=https://example.com" \
-H "Authorization: Bearer cfai_your_key_here"
Response
{
  "url": "https://example.com",
  "finalUrl": "https://example.com",
  "html": "<!doctype html><html>...</html>",
  "length": 1256,
  "extractedAt": "2026-07-01T10:00:00Z"
}
/v1/json
5 credits
Parses the page and returns a structured JSON object with all extractable metadata: title, description, OpenGraph tags, canonical URL, heading hierarchy, and up to 100 outbound links.
Request body
| field | type | req | description |
|---|---|---|---|
| url | string | yes | Full URL to scrape. Must include protocol (https://). |
| options.maxLinks | number | no | Max outbound links to return. Default: 100. |
| options.timeout | number | no | Timeout in ms. Default: 30000. Max: 60000. |
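Once a response is decoded, the heading hierarchy and link list are easy to post-process. The sketch below works on a dict shaped like the Response example for this endpoint; both helper names are hypothetical, and the assumption is that every heading carries "level" and "text" and every link carries "href".

```python
from urllib.parse import urlparse


def heading_outline(page: dict) -> list:
    """Indent each heading by its level to form a plain-text outline."""
    return [
        "  " * (h["level"] - 1) + h["text"]
        for h in page.get("headings", [])
    ]


def external_links(page: dict) -> list:
    """Return hrefs whose host differs from the host of the scraped page."""
    own_host = urlparse(page["url"]).netloc
    return [
        link["href"]
        for link in page.get("links", [])
        if urlparse(link["href"]).netloc not in ("", own_host)
    ]
```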
Example
curl -X POST https://contextforai.ashutoshswamy.in/v1/json \
-H "Authorization: Bearer cfai_your_key_here" \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com"}'
Response
{
  "url": "https://example.com",
  "title": "Example Domain",
  "description": "This domain is for use in illustrative examples.",
  "canonical": "https://example.com",
  "metadata": {
    "og:title": "Example Domain",
    "og:type": "website"
  },
  "headings": [{ "level": 1, "text": "Example Domain" }],
  "links": [
    { "text": "More information", "href": "https://www.iana.org/domains/example" }
  ],
  "extractedAt": "2026-07-01T10:00:00Z"
}
/v1/sitemap
5 credits
Attempts to fetch and parse sitemap.xml (and sitemap index files). If no sitemap is found, performs a shallow crawl of the homepage to discover linked pages. Useful for bulk-scraping or building crawl queues.
Request body
| field | type | req | description |
|---|---|---|---|
| url | string | yes | Root URL of the site. Must include protocol (https://). |
| options.maxPages | number | no | Max URLs to return. Default: 500. |
| options.timeout | number | no | Timeout in ms. Default: 30000. Max: 60000. |
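For the crawl-queue use case mentioned above, the returned pages can be ranked and de-duplicated before they are fed to another endpoint. This is a sketch over a dict shaped like the Response example for this endpoint. It assumes priority and lastmod may be missing (for instance when the list comes from the homepage crawl fallback), which the source does not confirm; crawl_queue is a hypothetical helper.

```python
from typing import Optional


def crawl_queue(sitemap: dict, limit: Optional[int] = None) -> list:
    """Order page URLs by priority, then by most recent lastmod."""
    ranked = sorted(
        sitemap.get("pages", []),
        # Missing values sort last; ISO dates compare correctly as strings.
        key=lambda p: (p.get("priority") or 0.0, p.get("lastmod") or ""),
        reverse=True,
    )
    seen = set()
    queue = []
    for page in ranked:
        if page["url"] not in seen:  # drop duplicate URLs
            seen.add(page["url"])
            queue.append(page["url"])
    return queue[:limit]
```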
Example
curl -X POST https://contextforai.ashutoshswamy.in/v1/sitemap \
-H "Authorization: Bearer cfai_your_key_here" \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com"}'
Response
{
  "url": "https://example.com",
  "source": "sitemap.xml",
  "count": 48,
  "pages": [
    { "url": "https://example.com/", "lastmod": "2026-07-01", "priority": 1.0 },
    { "url": "https://example.com/about", "lastmod": "2026-06-15", "priority": 0.8 }
  ],
  "extractedAt": "2026-07-01T10:00:00Z"
}
/v1/fonts
5 credits
Fetches the page and its stylesheets, then extracts all font families from font-family CSS declarations and Google Fonts <link> tags. Returns an ordered list of up to 10 unique font families.
Request body
| field | type | req | description |
|---|---|---|---|
| url | string | yes | Full URL of the site. Must include protocol (https://). |
| options.timeout | number | no | Timeout in ms. Default: 30000. Max: 60000. |
Example
curl -X POST https://contextforai.ashutoshswamy.in/v1/fonts \
-H "Authorization: Bearer cfai_your_key_here" \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com"}'
Response
{
  "url": "https://example.com",
  "fonts": ["Inter", "Geist Sans", "Geist Mono"],
  "count": 3,
  "extractedAt": "2026-07-01T10:00:00Z"
}
/v1/screenshot
10 credits
Renders the page in headless Chromium at a default viewport width of 1280px and captures a screenshot. Returns a CDN-hosted image URL valid for 24 hours.
Request body
| field | type | req | description |
|---|---|---|---|
| url | string | yes | Full URL to screenshot. Must include protocol (https://). |
| options.fullPage | boolean | no | Capture the full scrollable page. Default: false (viewport only). |
| options.format | string | no | "png" or "jpeg". Default: "png". |
| options.width | number | no | Viewport width in px. Default: 1280. Max: 1920. |
| options.timeout | number | no | Timeout in ms. Default: 30000. Max: 60000. |
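Since the image URL is only valid for 24 hours, a client that stores responses may want to check expiresAt before reusing one. The sketch below validates the documented option limits locally and tests expiry; it assumes timestamps always use the UTC "Z" form shown in the Response example, and both function names are hypothetical.

```python
from datetime import datetime, timezone


def screenshot_options(full_page=False, fmt="png", width=1280):
    """Validate options against the documented limits before sending."""
    if fmt not in ("png", "jpeg"):
        raise ValueError('format must be "png" or "jpeg"')
    if not 0 < width <= 1920:
        raise ValueError("width must be between 1 and 1920 px")
    return {"fullPage": full_page, "format": fmt, "width": width}


def is_expired(response: dict, now=None) -> bool:
    """True once the hosted image URL has passed its expiresAt time."""
    # fromisoformat on older Python versions rejects a trailing "Z".
    expires = datetime.fromisoformat(response["expiresAt"].replace("Z", "+00:00"))
    now = now or datetime.now(timezone.utc)
    return now >= expires
```

Downloading the image soon after the call, rather than storing imageUrl, avoids the expiry question entirely.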
Example
curl -X POST https://contextforai.ashutoshswamy.in/v1/screenshot \
-H "Authorization: Bearer cfai_your_key_here" \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com"}'
Response
{
  "url": "https://example.com",
  "imageUrl": "https://contextforai.ashutoshswamy.in/screenshots/abc123.png",
  "dimensions": { "width": 1280, "height": 800 },
  "format": "png",
  "fullPage": false,
  "expiresAt": "2026-07-02T10:00:00Z",
  "extractedAt": "2026-07-01T10:00:00Z"
}
/v1/branding
10 credits
Extracts branding assets from the page: best-match logo image URL, favicon, and dominant color palette (up to 3 colors) ranked by frequency in CSS. Neutral colors (black, white, near-black, near-white) are filtered out.
Request body
| field | type | req | description |
|---|---|---|---|
| url | string | yes | Full URL of the site. Must include protocol (https://). |
| options.timeout | number | no | Timeout in ms. Default: 30000. Max: 60000. |
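A common next step is turning the returned palette into something a stylesheet or image library can consume. The sketch below converts a "colors" list shaped like the Response example for this endpoint into CSS custom properties and RGB tuples. The --brand- variable prefix and both helper names are assumptions, as is the six-digit #RRGGBB form of every hex value.

```python
def hex_to_rgb(hex_color: str) -> tuple:
    """Convert "#RRGGBB" to an (r, g, b) tuple of ints."""
    value = hex_color.lstrip("#")
    if len(value) != 6:
        raise ValueError(f"expected #RRGGBB, got {hex_color!r}")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def css_variables(branding: dict) -> str:
    """Render the palette as a :root block of CSS custom properties."""
    lines = [
        f"  --brand-{color['name']}: {color['hex']};"
        for color in branding.get("colors", [])
    ]
    return ":root {\n" + "\n".join(lines) + "\n}"
```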
Example
curl -X POST https://contextforai.ashutoshswamy.in/v1/branding \
-H "Authorization: Bearer cfai_your_key_here" \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com"}'
Response
{
  "url": "https://example.com",
  "logo": "https://example.com/logo.svg",
  "colors": [
    { "name": "primary", "hex": "#1A73E8" },
    { "name": "secondary", "hex": "#34A853" },
    { "name": "accent", "hex": "#EA4335" }
  ]
}
Error codes
All errors return a JSON body with error and message fields. Errors are never charged.
{
  "error": "unauthorized",
  "message": "Missing or invalid API key."
}
| status | error | description |
|---|---|---|
| 400 | bad_request | Missing or invalid request body. Check that url is present and valid. |
| 401 | unauthorized | Missing or invalid API key. Ensure the Authorization header is set correctly. |
| 402 | insufficient_credits | Account has no remaining credits. Top up in the dashboard. |
| 422 | unprocessable_url | The URL returned a non-200 status, timed out, or could not be rendered. |
| 429 | rate_limited | Too many requests. See Rate limits for per-plan, per-endpoint limits. |
| 500 | internal_error | Something went wrong on our end. Retrying after 5 seconds usually resolves it. |
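The table above suggests a simple client-side policy: retry on 429 and 500, and surface everything else to the caller. The sketch below is one possible shape for that. ApiError, parse_error, and retry_delay are hypothetical names; handling Retry-After only as whole seconds and falling back to 60 seconds when it is absent are assumptions.

```python
import json
from typing import Optional


class ApiError(Exception):
    """An error response carrying the documented error and message fields."""

    def __init__(self, status: int, error: str, message: str):
        super().__init__(f"{status} {error}: {message}")
        self.status = status
        self.error = error
        self.message = message


def parse_error(status: int, body: bytes) -> ApiError:
    """Decode an error body, tolerating bodies that are not valid JSON."""
    try:
        payload = json.loads(body)
    except ValueError:
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    return ApiError(status, payload.get("error", "unknown"), payload.get("message", ""))


def retry_delay(err: ApiError, retry_after: Optional[str] = None) -> Optional[float]:
    """Seconds to wait before retrying, or None if a retry will not help."""
    if err.status == 429:
        # Assumption: Retry-After is given in whole seconds.
        if retry_after and retry_after.isdigit():
            return float(retry_after)
        return 60.0
    if err.status == 500:
        return 5.0  # the table suggests retrying after 5 seconds
    return None
```

Statuses 400, 401, 402, and 422 return None here because repeating the same request would fail the same way.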
Rate limits
Limits are per API key, per minute, and reset on a rolling 60-second window. Heavier endpoints — /v1/screenshot and /v1/branding — run a headless browser and allow half the plan's rate. Exceeding a limit returns 429 with a Retry-After header.
| plan | standard endpoints | screenshot / branding |
|---|---|---|
| free | 20 req/min | 10 req/min |
| starter | 60 req/min | 30 req/min |
| growth | 120 req/min | 60 req/min |
| pro | 180 req/min | 90 req/min |
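To stay under these limits without waiting for a 429, a client can keep its own rolling 60-second window. The sketch below mirrors the table. It assumes the standard and heavy endpoint groups are counted separately, so one limiter per group would be needed; the source does not say whether they share a counter. The class and function names are hypothetical.

```python
import time
from collections import deque

PLAN_LIMITS = {"free": 20, "starter": 60, "growth": 120, "pro": 180}
HEAVY_ENDPOINTS = {"/v1/screenshot", "/v1/branding"}


def limit_for(plan: str, endpoint: str) -> int:
    """Requests per minute for a plan; heavy endpoints get half the rate."""
    base = PLAN_LIMITS[plan]
    return base // 2 if endpoint in HEAVY_ENDPOINTS else base


class RollingWindowLimiter:
    """Tracks send times and reports how long to wait before the next call."""

    def __init__(self, max_requests: int, window_s: float = 60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_s = window_s
        self.clock = clock
        self._sent = deque()

    def wait_time(self) -> float:
        now = self.clock()
        # Forget requests that have left the rolling window.
        while self._sent and now - self._sent[0] >= self.window_s:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window_s - (now - self._sent[0])

    def record(self) -> None:
        """Call right after sending a request."""
        self._sent.append(self.clock())
```

A caller would sleep for wait_time() seconds, send the request, then call record(). The injectable clock exists only to make the limiter testable.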
Credits & billing
Credits are consumed per successful request. /v1/markdown, /v1/html, /v1/json, /v1/sitemap, and /v1/fonts cost 5 credits each. /v1/screenshot and /v1/branding cost 10 credits each.
Requests that return a 4xx or 5xx error are never charged. Cached responses (same URL + endpoint within 1 hour) cost 0 credits.
New accounts receive 500 free credits with no credit card required. The free plan renews 500 credits every 30 days for as long as the account stays on it. On monthly billing, every plan (free and paid) resets the credit balance to that plan's monthly allowance every 30 days from the date the credits were granted; unused credits don't roll over. Yearly plans instead grant the full 12 months of credits upfront at a 20% discount off the monthly price, and that balance doesn't reset for a year. One-time top-ups expire 30 days after purchase. Add credits or view usage in the billing dashboard.
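The charging rules in this section can be captured in a few lines, which helps when budgeting a batch job. This is a sketch based only on the costs and exemptions stated here; the function names are hypothetical, and whether a given response was served from cache is passed in as an assumption, since the source does not show how a cache hit is reported.

```python
ENDPOINT_COST = {
    "/v1/markdown": 5,
    "/v1/html": 5,
    "/v1/json": 5,
    "/v1/sitemap": 5,
    "/v1/fonts": 5,
    "/v1/screenshot": 10,
    "/v1/branding": 10,
}


def credits_charged(endpoint: str, status: int = 200, cached: bool = False) -> int:
    """Credits for one call: errors (4xx/5xx) and cached responses are free."""
    if status >= 400 or cached:
        return 0
    return ENDPOINT_COST[endpoint]


def estimate_batch(calls) -> int:
    """Total credits for an iterable of (endpoint, status, cached) tuples."""
    return sum(credits_charged(e, s, c) for e, s, c in calls)


def max_calls(balance: int, endpoint: str) -> int:
    """How many uncached, successful calls a balance covers."""
    return balance // ENDPOINT_COST[endpoint]
```

Under these rules the 500 free credits cover 100 calls to a 5-credit endpoint or 50 calls to /v1/screenshot or /v1/branding.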