v1.0

API Reference

A single REST API for extracting structured data from any URL. POST endpoints accept a JSON body, the one GET endpoint (/v1/html) takes a query parameter, and every endpoint returns JSON.

Authentication

All requests require your API key as a Bearer token in the Authorization header. Keys are prefixed with cfai_. Generate keys in the dashboard.

header
Authorization: Bearer cfai_your_key_here

Never expose your key in client-side code or public repositories.
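
One way to keep the key out of source code is to read it from the environment and build the header in a single place. The sketch below is illustrative only: the environment variable name CFAI_API_KEY and both function names are assumptions, not part of the API.

```python
import os

API_KEY_PREFIX = "cfai_"  # key prefix stated in this reference


def auth_headers(api_key: str) -> dict[str, str]:
    """Build the Authorization header, rejecting obviously malformed keys."""
    if not api_key.startswith(API_KEY_PREFIX):
        raise ValueError(f"API key should start with {API_KEY_PREFIX!r}")
    return {"Authorization": f"Bearer {api_key}"}


def load_api_key(env_var: str = "CFAI_API_KEY") -> str:
    """Read the key from the environment (CFAI_API_KEY is a hypothetical name)."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set {env_var} before calling the API")
    return key
```

Typical use would be auth_headers(load_api_key()), with the variable set on the server or in a secrets manager rather than in the repository.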

POST

/v1/markdown

5 credits

Fetches the target URL via headless browser, extracts the main content block, and strips boilerplate — navigation, ads, footers, cookie banners. Returns structured Markdown with word count and metadata.

Request body

| field | type | req | description |
| --- | --- | --- | --- |
| url | string | yes | Full URL to scrape. Must include protocol (https://). |
| options.includeImages | boolean | no | Embed image alt-text in Markdown output. Default: false. |
| options.timeout | number | no | Timeout in ms. Default: 30000. Max: 60000. |

Example

curl -X POST https://contextforai.ashutoshswamy.in/v1/markdown \
  -H "Authorization: Bearer cfai_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'

Response

200 OK
{
  "url": "https://example.com",
  "title": "Example Domain",
  "markdown": "# Example Domain\n\nThis domain is for use in illustrative examples.",
  "wordCount": 17,
  "extractedAt": "2026-07-01T10:00:00Z"
}
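
The same call can be made from Python using only the standard library. This is a minimal sketch: it assumes the dotted field names in the table (options.includeImages, options.timeout) map to a nested "options" object in the JSON body, and the function names are hypothetical.

```python
import json
import urllib.request
from urllib.parse import urlsplit

BASE_URL = "https://contextforai.ashutoshswamy.in"


def build_markdown_request(api_key, url, include_images=False, timeout_ms=30000):
    """Build (but do not send) a POST /v1/markdown request."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        raise ValueError("url must include a protocol, e.g. https://")
    if not 0 < timeout_ms <= 60000:  # documented maximum is 60000 ms
        raise ValueError("timeout must be between 1 and 60000 ms")
    # Assumption: "options.*" fields are sent as a nested object.
    body = {
        "url": url,
        "options": {"includeImages": include_images, "timeout": timeout_ms},
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/markdown",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def fetch_markdown(api_key, url, **options):
    """Send the request and return the decoded JSON response (network I/O)."""
    request = build_markdown_request(api_key, url, **options)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Separating request construction from sending keeps the validation testable without touching the network.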
GET

/v1/html

5 credits

Fetches the target URL and returns the raw HTML as-is — no Readability parsing, no boilerplate stripping. Useful when you want to run your own extraction downstream. The only GET endpoint; pass the target as a query parameter.

Query parameters

| field | type | req | description |
| --- | --- | --- | --- |
| url | string | yes | Full URL to fetch, as a query parameter. Must include protocol (https://). |

Example

curl -G "https://contextforai.ashutoshswamy.in/v1/html" \
  --data-urlencode "url=https://example.com" \
  -H "Authorization: Bearer cfai_your_key_here"

Response

200 OK
{
  "url": "https://example.com",
  "finalUrl": "https://example.com",
  "html": "<!doctype html><html>...</html>",
  "length": 1256,
  "extractedAt": "2026-07-01T10:00:00Z"
}
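
Because the target travels as a query parameter, it is safest to percent-encode it, particularly when the target URL carries its own query string. A small sketch follows; the helper name is hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://contextforai.ashutoshswamy.in"


def html_endpoint_url(target_url: str) -> str:
    """Return the GET /v1/html URL with the target percent-encoded."""
    # urlencode escapes ':', '/', '?', '&' and '=' so the target's own
    # query string is not mistaken for parameters of this API.
    return f"{BASE_URL}/v1/html?{urlencode({'url': target_url})}"
```

For example, a target of https://example.com/search?q=a&b=1 is sent as a single url parameter instead of being split at the ampersand.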
POST

/v1/json

5 credits

Parses the page and returns a structured JSON object with all extractable metadata: title, description, OpenGraph tags, canonical URL, heading hierarchy, and up to 100 outbound links.

Request body

| field | type | req | description |
| --- | --- | --- | --- |
| url | string | yes | Full URL to scrape. Must include protocol (https://). |
| options.maxLinks | number | no | Max outbound links to return. Default: 100. |
| options.timeout | number | no | Timeout in ms. Default: 30000. Max: 60000. |

Example

curl -X POST https://contextforai.ashutoshswamy.in/v1/json \
  -H "Authorization: Bearer cfai_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'

Response

200 OK
{
  "url": "https://example.com",
  "title": "Example Domain",
  "description": "This domain is for use in illustrative examples.",
  "canonical": "https://example.com",
  "metadata": {
    "og:title": "Example Domain",
    "og:type": "website"
  },
  "headings": [{ "level": 1, "text": "Example Domain" }],
  "links": [
    { "text": "More information", "href": "https://www.iana.org/domains/example" }
  ],
  "extractedAt": "2026-07-01T10:00:00Z"
}
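
As an illustration of consuming this response, the sketch below turns the headings array into an indented outline and picks out links that point to other hosts. It relies only on the fields shown in the example response; the function names are hypothetical.

```python
from urllib.parse import urlsplit


def heading_outline(headings):
    """Indent each heading by two spaces per level below 1."""
    return "\n".join("  " * (h["level"] - 1) + h["text"] for h in headings)


def external_links(page):
    """Return hrefs whose host differs from the scraped page's host."""
    host = urlsplit(page["url"]).hostname
    return [
        link["href"]
        for link in page["links"]
        if urlsplit(link["href"]).hostname != host
    ]
```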
POST

/v1/sitemap

5 credits

Attempts to fetch and parse sitemap.xml (and sitemap index files). If no sitemap is found, performs a shallow crawl of the homepage to discover linked pages. Useful for bulk-scraping or building crawl queues.

Request body

| field | type | req | description |
| --- | --- | --- | --- |
| url | string | yes | Root URL of the site. Must include protocol (https://). |
| options.maxPages | number | no | Max URLs to return. Default: 500. |
| options.timeout | number | no | Timeout in ms. Default: 30000. Max: 60000. |

Example

curl -X POST https://contextforai.ashutoshswamy.in/v1/sitemap \
  -H "Authorization: Bearer cfai_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'

Response

200 OK
{
  "url": "https://example.com",
  "source": "sitemap.xml",
  "count": 48,
  "pages": [
    { "url": "https://example.com/", "lastmod": "2026-07-01", "priority": 1.0 },
    { "url": "https://example.com/about", "lastmod": "2026-06-15", "priority": 0.8 }
  ],
  "extractedAt": "2026-07-01T10:00:00Z"
}
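
To show how this response might feed a crawl queue, the sketch below orders pages by priority and optionally keeps only those modified on or after a given date. It assumes lastmod and priority can be absent for some pages (for instance when the list comes from the fallback crawl); the function name is hypothetical.

```python
def crawl_queue(sitemap, modified_since=None):
    """Return page URLs, highest priority first.

    modified_since is an ISO date string such as "2026-06-20"; ISO dates
    compare correctly as plain strings.
    """
    pages = sitemap["pages"]
    if modified_since is not None:
        # Assumption: pages without lastmod are skipped when filtering.
        pages = [p for p in pages if p.get("lastmod", "") >= modified_since]
    # 0.5 is the default priority defined by the sitemap protocol.
    ordered = sorted(pages, key=lambda p: p.get("priority", 0.5), reverse=True)
    return [p["url"] for p in ordered]
```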
POST

/v1/fonts

5 credits

Fetches the page and its stylesheets, then extracts all font families from font-family CSS declarations and Google Fonts <link> tags. Returns an ordered list of up to 10 unique font families.

Request body

| field | type | req | description |
| --- | --- | --- | --- |
| url | string | yes | Full URL of the site. Must include protocol (https://). |
| options.timeout | number | no | Timeout in ms. Default: 30000. Max: 60000. |

Example

curl -X POST https://contextforai.ashutoshswamy.in/v1/fonts \
  -H "Authorization: Bearer cfai_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'

Response

200 OK
{
  "url": "https://example.com",
  "fonts": ["Inter", "Geist Sans", "Geist Mono"],
  "count": 3,
  "extractedAt": "2026-07-01T10:00:00Z"
}
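
One possible use of the fonts array is rebuilding a CSS font-family value. The sketch below quotes names that contain spaces and appends a generic fallback; both the function name and the choice of sans-serif as fallback are assumptions.

```python
def css_font_stack(fonts, fallback="sans-serif"):
    """Join font names into a CSS font-family value."""
    # Names containing spaces are quoted, as is conventional in CSS.
    quoted = [f'"{name}"' if " " in name else name for name in fonts]
    return ", ".join(quoted + [fallback])
```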
POST

/v1/screenshot

10 credits

Renders the page in headless Chromium (1280px viewport width by default) and captures a screenshot. Returns a CDN-hosted image URL valid for 24 hours.

Request body

| field | type | req | description |
| --- | --- | --- | --- |
| url | string | yes | Full URL to screenshot. Must include protocol (https://). |
| options.fullPage | boolean | no | Capture the full scrollable page. Default: false (viewport only). |
| options.format | string | no | "png" or "jpeg". Default: "png". |
| options.width | number | no | Viewport width in px. Default: 1280. Max: 1920. |
| options.timeout | number | no | Timeout in ms. Default: 30000. Max: 60000. |

Example

curl -X POST https://contextforai.ashutoshswamy.in/v1/screenshot \
  -H "Authorization: Bearer cfai_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'

Response

200 OK
{
  "url": "https://example.com",
  "imageUrl": "https://contextforai.ashutoshswamy.in/screenshots/abc123.png",
  "dimensions": { "width": 1280, "height": 800 },
  "format": "png",
  "fullPage": false,
  "expiresAt": "2026-07-02T10:00:00Z",
  "extractedAt": "2026-07-01T10:00:00Z"
}
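
Since the image URL expires, a client that stores it may want to check expiresAt before reuse. The sketch below validates the documented option limits client-side and tests expiry; it assumes the options are sent as a nested object, and the function names are hypothetical.

```python
from datetime import datetime, timezone


def screenshot_body(url, full_page=False, image_format="png", width=1280):
    """Build the JSON body for POST /v1/screenshot with basic validation."""
    if image_format not in ("png", "jpeg"):
        raise ValueError('format must be "png" or "jpeg"')
    if not 0 < width <= 1920:  # documented maximum width
        raise ValueError("width must be between 1 and 1920 px")
    # Assumption: "options.*" fields are sent as a nested object.
    return {
        "url": url,
        "options": {"fullPage": full_page, "format": image_format, "width": width},
    }


def is_expired(response, now=None):
    """True once the hosted image URL has passed its expiresAt time."""
    # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
    expires = datetime.fromisoformat(response["expiresAt"].replace("Z", "+00:00"))
    now = now or datetime.now(timezone.utc)
    return now >= expires
```

If the URL has expired, requesting a fresh screenshot is the straightforward recovery; downloading the image soon after the call avoids the issue entirely.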
POST

/v1/branding

10 credits

Extracts branding assets from the page: best-match logo image URL, favicon, and dominant color palette (up to 3 colors) ranked by frequency in CSS. Neutral colors (black, white, near-black, near-white) are filtered out.

Request body

| field | type | req | description |
| --- | --- | --- | --- |
| url | string | yes | Full URL of the site. Must include protocol (https://). |
| options.timeout | number | no | Timeout in ms. Default: 30000. Max: 60000. |

Example

curl -X POST https://contextforai.ashutoshswamy.in/v1/branding \
  -H "Authorization: Bearer cfai_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'

Response

200 OK
{
  "url": "https://example.com",
  "logo": "https://example.com/logo.svg",
  "colors": [
    { "name": "primary",   "hex": "#1A73E8" },
    { "name": "secondary", "hex": "#34A853" },
    { "name": "accent",    "hex": "#EA4335" }
  ]
}
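
As a usage illustration, the colors array converts readily into CSS custom properties or RGB tuples. The sketch below assumes each color has the name and hex fields shown in the example; the --brand- prefix and function names are hypothetical.

```python
def hex_to_rgb(hex_color):
    """Convert "#RRGGBB" to an (r, g, b) tuple of integers."""
    value = hex_color.lstrip("#")
    if len(value) != 6:
        raise ValueError("expected a 6-digit hex color")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def css_variables(branding):
    """Render the palette as a :root block of CSS custom properties."""
    lines = [
        f"  --brand-{color['name']}: {color['hex'].lower()};"
        for color in branding["colors"]
    ]
    return ":root {\n" + "\n".join(lines) + "\n}"
```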

Error codes

All errors return a JSON body with error and message fields. Errors are never charged.

error shape
{
  "error": "unauthorized",
  "message": "Missing or invalid API key."
}

| status | error | description |
| --- | --- | --- |
| 400 | bad_request | Missing or invalid request body. Check that url is present and valid. |
| 401 | unauthorized | Missing or invalid API key. Ensure the Authorization header is set correctly. |
| 402 | insufficient_credits | Account has no remaining credits. Top up in the dashboard. |
| 422 | unprocessable_url | The URL returned a non-200 status, timed out, or could not be rendered. |
| 429 | rate_limited | Too many requests. See Rate limits for per-plan, per-endpoint limits. |
| 500 | internal_error | Something went wrong on our end. Retrying after 5 seconds usually resolves it. |
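
A client might wrap the error shape above in an exception type so callers can branch on it. In the sketch below, treating 429 and 500 as retryable is a design assumption drawn from the descriptions in the table, not a rule stated by the API; the class and function names are hypothetical.

```python
import json

RETRYABLE_STATUSES = {429, 500}  # assumption: the only statuses worth retrying


class ApiError(Exception):
    """Carries the status code plus the error and message fields."""

    def __init__(self, status, error, message):
        super().__init__(f"{status} {error}: {message}")
        self.status = status
        self.error = error
        self.message = message

    @property
    def retryable(self):
        return self.status in RETRYABLE_STATUSES


def parse_error(status, body):
    """Turn an error response body (bytes or str) into an ApiError."""
    try:
        payload = json.loads(body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    return ApiError(status, payload.get("error", "unknown"), payload.get("message", ""))
```

With urllib, the status and body would typically come from the code attribute and read() method of a caught urllib.error.HTTPError.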

Rate limits

Limits apply per API key and are measured over a rolling 60-second window. The heavier endpoints, /v1/screenshot and /v1/branding, run a headless browser and allow half the plan's rate. Exceeding a limit returns 429 with a Retry-After header.

| plan | standard endpoints | screenshot / branding |
| --- | --- | --- |
| free | 20 req/min | 10 req/min |
| starter | 60 req/min | 30 req/min |
| growth | 120 req/min | 60 req/min |
| pro | 180 req/min | 90 req/min |
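
The table and the Retry-After behavior can be sketched in code as follows. This reference does not say whether Retry-After is given in seconds or as an HTTP date, so the helper accepts both forms (HTTP permits either); the 5-second fallback and all names are assumptions.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

PLAN_LIMITS = {"free": 20, "starter": 60, "growth": 120, "pro": 180}  # req/min
HEAVY_ENDPOINTS = {"/v1/screenshot", "/v1/branding"}


def limit_for(plan, endpoint):
    """Requests per minute for a plan; heavy endpoints get half the rate."""
    base = PLAN_LIMITS[plan]
    return base // 2 if endpoint in HEAVY_ENDPOINTS else base


def retry_delay(retry_after, now=None, default=5.0):
    """Seconds to wait before retrying, from a Retry-After header value."""
    if retry_after is None:
        return default
    try:
        return max(0.0, float(retry_after))  # delay-seconds form
    except ValueError:
        pass
    try:
        when = parsedate_to_datetime(retry_after)  # HTTP-date form
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

A caller would sleep for retry_delay(...) seconds after a 429 before sending the request again.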

Credits & billing

Credits are consumed per successful request. /v1/markdown, /v1/html, /v1/json, /v1/sitemap, and /v1/fonts cost 5 credits each. /v1/screenshot and /v1/branding cost 10 credits each.

Requests that return a 4xx or 5xx error are never charged. Cached responses (same URL + endpoint within 1 hour) cost 0 credits.
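
For budgeting, the per-endpoint prices can be put in a lookup table. The sketch below counts only billable calls, meaning successful and uncached ones, since errors and cache hits cost nothing; the function names are hypothetical.

```python
CREDIT_COST = {
    "/v1/markdown": 5,
    "/v1/html": 5,
    "/v1/json": 5,
    "/v1/sitemap": 5,
    "/v1/fonts": 5,
    "/v1/screenshot": 10,
    "/v1/branding": 10,
}


def estimate_credits(billable_calls):
    """Total credits for a mapping of endpoint -> number of billable calls."""
    return sum(CREDIT_COST[endpoint] * count for endpoint, count in billable_calls.items())


def calls_affordable(balance, endpoint):
    """How many billable calls to one endpoint a credit balance covers."""
    return balance // CREDIT_COST[endpoint]
```

For example, 10 Markdown extractions plus 3 screenshots come to 10 × 5 + 3 × 10 = 80 credits.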

New accounts receive 500 free credits with no credit card required. Monthly plans, free and paid, reset your credit balance to the plan's monthly allowance every 30 days from the date it was granted, so the free plan returns to 500 credits each cycle for as long as the account stays on it. Unused credits don't roll over. Yearly plans grant the full 12 months of credits upfront at a 20% discount off the monthly price, and the balance doesn't reset for a year. One-time top-ups expire 30 days after purchase. Add credits or view usage in the billing dashboard.