# Source policy

`advanced_settings.source_policy` constrains a search or a research
task to the sources you trust. It applies to `/v1/search`, to `/v1/task`
(via the underlying searches it runs), and to any `/v1/monitors` monitor
that runs with `depth: deep`.

## Fields

| Field             | Type         | Default | What it does                                                                                                                                          |
| ----------------- | ------------ | ------- | ----------------------------------------------------------------------------------------------------------------------------------------------------- |
| `include_domains` | `string[]`   | `null`  | Keep only results whose host is on this list. Subdomains match — `"openai.com"` keeps `platform.openai.com`.                                          |
| `exclude_domains` | `string[]`   | `null`  | Drop results whose host is on this list. Applied after `include_domains`.                                                                             |
| `after_date`      | `YYYY-MM-DD` | `null`  | Keep only results published on or after this date. Filtered against the page's detected publish date — best-effort, since not every site exposes one. |
| `before_date`     | `YYYY-MM-DD` | `null`  | Keep only results published on or before this date.                                                                                                   |

Filtering happens after the SERP fetch and before any LLM-based
re-ranking or content extraction. Results dropped here do not consume
SKU credits past the initial search.

## Examples

### Trust a short list of sources

```json
{
  "queries": ["climate policy"],
  "advanced_settings": {
    "source_policy": {
      "include_domains": ["nature.com", "science.org", "nytimes.com"]
    }
  }
}
```

### Drop common low-signal hosts

```json
{
  "queries": ["best vector database 2026"],
  "objective": "comparison of features and pricing",
  "advanced_settings": {
    "source_policy": {
      "exclude_domains": ["reddit.com", "quora.com", "medium.com"]
    }
  }
}
```

### Window by recency

```json
{
  "queries": ["LLM safety incidents"],
  "category": "news",
  "advanced_settings": {
    "source_policy": {
      "after_date": "2026-01-01"
    }
  }
}
```

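You can combine all four fields. The order in which they appear in the
request does not matter: the filters always run in a fixed sequence
(`exclude_domains` after `include_domains`), so filtering is
deterministic.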
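As a minimal sketch of combining the fields, the helper below assembles a request body in Python. `build_source_policy` is a hypothetical client-side helper, not part of Scout. The check that `after_date` is not later than `before_date` is an assumption about what a sensible client would validate, not documented API behavior, and `blogs.nature.com` is only an illustrative host.

```python
import json
from datetime import date


def build_source_policy(include=None, exclude=None, after=None, before=None):
    """Assemble a source_policy object, omitting any field left unset."""
    policy = {}
    if include:
        policy["include_domains"] = list(include)
    if exclude:
        policy["exclude_domains"] = list(exclude)
    if after is not None:
        # fromisoformat raises ValueError on a malformed date string.
        policy["after_date"] = date.fromisoformat(after).isoformat()
    if before is not None:
        policy["before_date"] = date.fromisoformat(before).isoformat()
    # Client-side sanity check (an assumption, not documented API behavior).
    if after is not None and before is not None:
        if policy["after_date"] > policy["before_date"]:
            raise ValueError("after_date must not be later than before_date")
    return policy


body = {
    "queries": ["LLM safety incidents"],
    "advanced_settings": {
        "source_policy": build_source_policy(
            include=["nature.com", "science.org"],
            exclude=["blogs.nature.com"],  # illustrative subdomain carve-out
            after="2026-01-01",
            before="2026-03-31",
        )
    },
}
print(json.dumps(body, indent=2))
```

Leaving empty fields out of the payload relies on the documented `null` defaults rather than sending explicit empty lists.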

## How it interacts with other settings

* **`freshness`** (`hour`, `day`, `week`, `month`, `year`) is a Google
  query-time filter that runs before the SERP returns. Use it for
  rough recency. Use `after_date` / `before_date` for precise windows.
* **`category`** (`news`, `research_paper`, `personal_site`, …) adds
  Google's `tbm` filter or refines the query string. It is unaffected
  by `source_policy`.
* **`objective`** drives re-ranking. Re-ranking sees only the rows that
  survived `source_policy`, so the policy is a strict pre-filter.
* **`limit`** is applied last. If `source_policy` drops every result,
  you get an empty array, not a `503`.
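The ordering described in this list can be modelled locally as follows. This is a toy sketch, not Scout's implementation: `run_pipeline`, the row fields `host` and `relevance`, and the scoring function are all hypothetical stand-ins for the policy filter, the objective-driven re-ranker, and the `limit` step.

```python
def run_pipeline(rows, policy_keeps, score, limit):
    """Model the documented order: source_policy, then re-ranking, then limit."""
    # Strict pre-filter: rows dropped here never reach the re-ranker.
    survivors = [row for row in rows if policy_keeps(row)]
    # Re-ranking only sees the survivors.
    ranked = sorted(survivors, key=score, reverse=True)
    # limit is applied last, to the re-ranked survivors.
    return ranked[:limit]


rows = [
    {"url": "https://a.example/1", "host": "a.example", "relevance": 0.9},
    {"url": "https://b.example/1", "host": "b.example", "relevance": 0.8},
    {"url": "https://a.example/2", "host": "a.example", "relevance": 0.4},
]


def by_relevance(row):
    return row["relevance"]


# A policy that keeps one host returns fewer rows than the limit allows.
top = run_pipeline(rows, lambda row: row["host"] == "b.example", by_relevance, limit=2)

# A policy that drops everything yields an empty list, not an error.
nothing = run_pipeline(rows, lambda row: False, by_relevance, limit=2)
```

Because `limit` runs after the policy, a tight policy can return fewer rows than `limit` requests, so callers should handle short and empty result arrays.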

## When publish dates are missing

`after_date` and `before_date` are checked against the `publish_date`
field on each result. Scout extracts that value from the page's `meta`
tags and structured data. Some sites do not publish a usable date;
results from those sites are kept by default. To filter strictly, remove
the undated host from `include_domains` or post-filter on your side.
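A strict post-filter on the client side might look like the sketch below. It assumes each result is a dict whose `publish_date`, when present, is a string beginning with `YYYY-MM-DD`. The exact response shape is not specified here, so treat the field access as an assumption and adjust it to the payload you actually receive. `strict_date_filter` is a hypothetical helper.

```python
from datetime import date


def strict_date_filter(results, after=None, before=None):
    """Keep only results with a parseable publish_date inside the inclusive window."""
    lo = date.fromisoformat(after) if after else None
    hi = date.fromisoformat(before) if before else None
    kept = []
    for result in results:
        raw = result.get("publish_date")
        if not isinstance(raw, str) or not raw:
            continue  # undated: dropped here, unlike the server-side default
        try:
            published = date.fromisoformat(raw[:10])
        except ValueError:
            continue  # unparseable date is treated the same as a missing one
        if lo is not None and published < lo:
            continue
        if hi is not None and published > hi:
            continue
        kept.append(result)
    return kept


results = [
    {"url": "https://a.example/1", "publish_date": "2026-02-10"},
    {"url": "https://a.example/2", "publish_date": None},
    {"url": "https://a.example/3", "publish_date": "2025-12-31"},
]
recent = strict_date_filter(results, after="2026-01-01")
```

Both bounds are inclusive, matching the "on or after" and "on or before" wording in the field table.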

## Edge cases

* An empty `include_domains: []` is treated the same as omitting the
  field — no filter is applied. Pass at least one host to enable the
  allowlist.
* Hosts are matched case-insensitively, and a leading `www.` is ignored.
* A leading dot (`.openai.com`) is accepted and means the same thing
  as `openai.com`.
* A conflicting policy (`include_domains: ["openai.com"]` and
  `exclude_domains: ["openai.com"]`) returns zero rows, because the
  exclude wins.
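The matching rules above can be modelled locally to predict which hosts a policy keeps. This is a sketch of the documented behavior, not Scout's actual code. `normalize`, `host_matches`, and `policy_keeps` are hypothetical names, and applying subdomain matching to `exclude_domains` as well as `include_domains` is an assumption.

```python
def normalize(host):
    """Lowercase the host, then strip a leading dot and a leading 'www.'."""
    host = host.strip().lower().lstrip(".")
    return host.removeprefix("www.")  # requires Python 3.9+


def host_matches(host, entry):
    """True if host equals the list entry or is a subdomain of it."""
    host, entry = normalize(host), normalize(entry)
    return host == entry or host.endswith("." + entry)


def policy_keeps(host, include=None, exclude=None):
    """Apply include_domains first, then exclude_domains."""
    # An empty or missing include list means no allowlist is applied.
    if include and not any(host_matches(host, e) for e in include):
        return False
    # Exclude runs after include, so it wins on conflict.
    if exclude and any(host_matches(host, e) for e in exclude):
        return False
    return True


checks = {
    "subdomain": policy_keeps("platform.openai.com", include=["openai.com"]),
    "case_www_dot": policy_keeps("WWW.OpenAI.com", include=[".openai.com"]),
    "lookalike": policy_keeps("notopenai.com", include=["openai.com"]),
    "empty_include": policy_keeps("example.org", include=[]),
    "conflict": policy_keeps(
        "openai.com", include=["openai.com"], exclude=["openai.com"]
    ),
}
```

Matching on `"." + entry` rather than a bare suffix keeps lookalike hosts such as `notopenai.com` from matching `openai.com`.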