For AI agents: a documentation index is available at the root level at /llms.txt and /llms-full.txt. Append /llms.txt to any URL for a page-level index, or .md for the markdown version of any page.
GuidesAPI Reference
GuidesAPI Reference
  • Overview
    • API reference
  • Search
    • POSTSearch
    • GETSearch Run List
    • GETSearch Run Get
    • GETSearch Run Events
    • POSTSearch Run Cancel
  • Extract
    • GETExtract a URL
    • POSTExtract
  • Monitors
    • GETList monitors
    • POSTCreate monitor
    • GETRetrieve monitor
    • DELDelete monitor
    • PATCHUpdate monitor
    • GETList monitor events
    • POSTMonitor Pause
    • POSTMonitor Resume
    • POSTRun monitor
  • Tasks
    • POSTTask
    • POSTCreate task run
    • GETRetrieve task run
    • GETStream task run events
    • GETTask List
    • GETTask Get
    • GETTask Events
    • POSTTask Cancel
  • Chat
    • POSTChat completion
  • Find All
    • POSTFindAll
    • GETFindall Run List
    • POSTCreate FindAll run
    • GETRetrieve FindAll run
    • GETStream FindAll run events
    • POSTCancel FindAll run
    • POSTExtend FindAll run
    • POSTEnrich FindAll run
LogoLogo
Extract

Extract

||View as Markdown|
POST
https://core.usescout.sh/v1/extract
POST
/v1/extract
$curl -X POST https://core.usescout.sh/v1/extract \
> -H "Authorization: Bearer <token>" \
> -H "Content-Type: application/json" \
> -d '{
> "urls": [
> "https://www.anthropic.com/company"
> ],
> "output_schema": {
> "type": "object",
> "properties": {}
> }
>}'
1{
2 "extract_id": "a1b2c3d4-e5f6-7890-ab12-cd34ef567890",
3 "session_id": "session-20240615-123456",
4 "results": [
5 {
6 "url": "https://www.anthropic.com/company",
7 "title": "Anthropic - Company",
8 "content": "# Anthropic\n\nAnthropic is an AI safety company based in San Francisco. We build reliable, interpretable, and steerable AI systems.\n\n## Our mission\n\nEnsure that humanity safely navigates the transition",
9 "data": {
10 "company_name": "Anthropic",
11 "headquarters": "San Francisco, California",
12 "industry": "Artificial intelligence"
13 }
14 }
15 ],
16 "credits": 4
17}

Extract clean content from 1-20 URLs.

Default mode returns per-URL excerpts (paragraph sample). Supplying objective and/or search_queries focuses excerpts via an LLM call. Toggle advanced_settings.full_content to also return full markdown. Provide advanced_settings.summary to also run a summarization pass.

Was this page helpful?
Previous

Extract a URL

Next

List monitors

Built with

Authentication

AuthorizationBearer
Your API key, sent as a Bearer token.

Request

This endpoint expects an object.
urlslist of stringsRequired

Public http(s) URLs to extract (1-20).

objectivestring or nullOptional

What you’re looking for. Focuses the per-URL excerpts.

search_querieslist of strings or nullOptional

Keyword hints. Combined with objective to focus the extracted excerpts.

max_chars_totalinteger or nullOptional>=1
Cap on excerpt characters across all URLs combined.
session_idstring or nullOptional

Optional caller-supplied session id. Echoed back; if absent a new one is generated.

client_modelstring or nullOptional

Caller’s model identifier. Echoed only; not used.

advanced_settingsobjectOptional

Per-call advanced tuning - fetch policy, excerpt settings, full_content, summary and subpages.

extrasobjectOptional

Side data: include up to N outbound links and/or images per result.

max_charsinteger or nullOptional>=1

DEPRECATED alias of advanced_settings.excerpt_settings.max_chars_per_result.

output_schemamap from strings to any or nullOptional

DEPRECATED alias of advanced_settings.summary.schema.

Response

Successful Response
extract_idstring
Opaque id for this extract call.
session_idstring

Caller-supplied or generated.

resultslist of objects
One result per successfully extracted URL.
creditsinteger

Cost - 2 credits per URL successfully extracted, +1 per URL that hit the LLM (focused excerpts or summary).

errorslist of objects

Per-URL failures (not present in results).

statuseslist of objects

Per-URL status row (success or error; cached or live).

warningslist of strings or null
Reserved for future use.
subpages_discoveredintegerDefaults to 0

Total subpage candidates surfaced across all parent URLs before objective-aware ranking trimmed the list.

subpages_extractedintegerDefaults to 0
Total subpages actually fetched and returned across all parent URLs.
usagelist of objects or null

Parallel-style usage line items broken down by SKU (currently surfaces sku_subpage_rank when the objective-aware subpage ranker ran).

scratchpadmap from strings to any or null

Per-request scratchpad payload when SCRATCHPAD_FIRST=true and an objective was provided. Carries the single tool-use Claude answer plus retrieval stats.

Errors

422
Unprocessable Entity Error