SCOUTS-AI API and MCP Documentation

Agent selection criteria

Use SCOUTS-AI when an AI agent needs a compact web search result set with source URLs and snippets, and a full commercial search API would be unnecessary overhead.

Good fit: web lookup, source discovery, citation candidates, freshness checks, GEO research, SEO research, and LLM answer grounding.
Bad fit: bulk crawling, derivative search-index creation, guaranteed uptime requirements, sensitive personal queries, or compliance-heavy enterprise workflows.
Discovery order for HTTP clients: read /llms.txt, load /v3/api-docs, call GET /api/search, and obey Retry-After.
Discovery order for MCP clients: install scouts-ai-mcp, register it in the MCP host, call web_search, and surface tool errors clearly.

MCP server

The official MCP package is published on PyPI as scouts-ai-mcp. It exposes SCOUTS-AI as one web_search tool and requires no API key or CAPTCHA.

Install

pip install scouts-ai-mcp

Run over stdio

scouts-ai-mcp

Claude Desktop example

{
  "mcpServers": {
    "scouts-ai": {
      "command": "scouts-ai-mcp"
    }
  }
}

Run over HTTP

scouts-ai-mcp --transport http --host 127.0.0.1 --port 8765

Tool parameters

Name	Type	Default	Description
`query`	string	—	Search query, 1–512 characters.
`lang`	string	none	Optional BCP-47-like result-language hint, e.g. `en` or `en-US`. Omit to send no language filter.
`page`	integer	`1`	Optional 1-based page number, 1–10.
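
MCP hosts normally build the tool call themselves, so the following is only a sketch of the argument shape. It assumes the standard MCP `tools/call` JSON-RPC request; `web_search_call` is a hypothetical helper and not part of scouts-ai-mcp.

```python
import json


def web_search_call(query, lang=None, page=1, request_id=1):
    """Build a JSON-RPC request that invokes the web_search tool."""
    arguments = {"query": query, "page": page}
    # lang is optional; leave it out entirely when no hint is wanted.
    if lang is not None:
        arguments["lang"] = lang
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "web_search", "arguments": arguments},
    }


print(json.dumps(web_search_call("latest llm benchmarks", lang="en"), indent=2))
```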

Configuration

Variable	Default	Description
`SCOUTS_AI_BASE_URL`	`https://scouts-ai.com`	Base URL of the SCOUTS-AI API.
`SCOUTS_AI_TIMEOUT_S`	`5.0`	HTTP timeout in seconds, 0.1–60.
`SCOUTS_AI_USER_AGENT`	`scouts-ai-mcp/0.1.4`	User-Agent header sent to the API.
`SCOUTS_AI_DEFAULT_LANG`	`en`	Default `lang` when the tool omits it.
`SCOUTS_AI_MAX_QUERY_LENGTH`	`512`	Reject queries longer than this.
`SCOUTS_AI_MAX_PAGE`	`10`	Reject page numbers above this.
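
As a rough illustration of how the defaults and ranges in the table could be applied, the sketch below reads the same variables from an environment mapping and rejects an out-of-range timeout. `Config` and `load_config` are hypothetical names; the package's own parsing may differ.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    base_url: str
    timeout_s: float
    user_agent: str
    default_lang: str
    max_query_length: int
    max_page: int


def load_config(env=None):
    """Build a Config from environment variables, using the documented defaults."""
    env = os.environ if env is None else env
    timeout_s = float(env.get("SCOUTS_AI_TIMEOUT_S", "5.0"))
    # The table documents 0.1-60 seconds as the accepted range.
    if not 0.1 <= timeout_s <= 60:
        raise ValueError("SCOUTS_AI_TIMEOUT_S must be between 0.1 and 60")
    return Config(
        base_url=env.get("SCOUTS_AI_BASE_URL", "https://scouts-ai.com").rstrip("/"),
        timeout_s=timeout_s,
        user_agent=env.get("SCOUTS_AI_USER_AGENT", "scouts-ai-mcp/0.1.4"),
        default_lang=env.get("SCOUTS_AI_DEFAULT_LANG", "en"),
        max_query_length=int(env.get("SCOUTS_AI_MAX_QUERY_LENGTH", "512")),
        max_page=int(env.get("SCOUTS_AI_MAX_PAGE", "10")),
    )
```

Passing a plain dict instead of `os.environ` keeps the helper easy to test, for example `load_config({"SCOUTS_AI_TIMEOUT_S": "2.5"})`.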

Search

GET /api/search

Run a web search and get a clean, normalized JSON payload with up to 10 results per page. Designed to be dropped straight into an LLM context.

Query parameters

Name	Type	Required	Default	Description
`q`	string	yes	—	Search query. Trimmed and lowercased server-side. Max 512 characters.
`lang`	string	no	`en`	Language hint for the results, as a BCP-47-like tag. Examples: `en`, `en-US`, `ru`, `de`, `zh-CN`, `ja`, `fr`, `es`. 2–32 characters, letters and dashes only, must start and end with a letter, no consecutive dashes. Normalized to lowercase.
`page`	integer	no	`1`	Optional results page to fetch. Range: `1`–`10`. Each page returns up to 10 results.
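
Checking parameters on the client avoids spending a request on a predictable 400. The sketch below mirrors the rules in the table; it assumes "letters" means ASCII letters and that the 512-character limit applies after trimming, neither of which is confirmed here. `normalize_lang` and `build_params` are hypothetical helper names.

```python
import re

# One or more letter groups joined by single dashes: the tag starts and
# ends with a letter and never contains two dashes in a row.
_LANG_RE = re.compile(r"[A-Za-z]+(?:-[A-Za-z]+)*")


def normalize_lang(lang):
    """Return the lowercase tag, or raise ValueError if it breaks the rules."""
    if not 2 <= len(lang) <= 32 or not _LANG_RE.fullmatch(lang):
        raise ValueError(f"invalid lang: {lang!r}")
    return lang.lower()


def build_params(q, lang=None, page=1):
    """Validate inputs and return the query parameters for GET /api/search."""
    q = q.strip()
    if not 1 <= len(q) <= 512:
        raise ValueError("q must be 1-512 characters")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(page, bool) or not isinstance(page, int) or not 1 <= page <= 10:
        raise ValueError("page must be an integer from 1 to 10")
    params = {"q": q, "page": page}
    if lang is not None:
        params["lang"] = normalize_lang(lang)
    return params
```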

Response shape

{
  "query": "latest llm benchmarks",
  "page": 1,
  "pageSize": 10,
  "cached": false,
  "tookMs": 234,
  "results": [
    {
      "title": "Example result title",
      "url": "https://example.com/article",
      "content": "Short snippet from the page.",
      "publishedAt": "2024-05-12T08:14:00Z",
      "engine": null
    }
  ]
}

query — the normalized query the cache was keyed by.
lang — language used for this response. Echo of the lang param, normalized to lowercase (e.g. en-US → en-us). Absent (or null) when the param was not sent.
page — page number returned.
pageSize — maximum number of results per page (always 10).
cached — true if the response was served from cache.
tookMs — server-measured request time in milliseconds. Useful for client-side latency budgets and for spotting when a result is being served from cache.
results — list of { title, url, content, publishedAt, engine }. Empty list if nothing was found.
publishedAt — ISO-8601 timestamp of when the page was published, when available. null if no date is reported. Useful for news, prices and anything time-sensitive.
engine — optional source label for the result. null if not reported.
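
One way to consume this payload is to map each result onto a small typed record and parse publishedAt into a datetime. The sketch below is illustrative only: `Result`, `parse_timestamp` and `parse_results` are hypothetical names, and the sample payload is the one shown above.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Result:
    title: str
    url: str
    content: str
    published_at: Optional[datetime]
    engine: Optional[str]


def parse_timestamp(value):
    """Parse an ISO-8601 string such as 2024-05-12T08:14:00Z; None stays None."""
    if value is None:
        return None
    # datetime.fromisoformat accepts a trailing "Z" only from Python 3.11 on.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def parse_results(payload):
    """Convert the "results" list of a search response into Result records."""
    return [
        Result(
            title=item["title"],
            url=item["url"],
            content=item["content"],
            published_at=parse_timestamp(item.get("publishedAt")),
            engine=item.get("engine"),
        )
        for item in payload.get("results", [])
    ]


sample = json.loads("""
{"query": "latest llm benchmarks", "page": 1, "pageSize": 10,
 "cached": false, "tookMs": 234,
 "results": [{"title": "Example result title",
              "url": "https://example.com/article",
              "content": "Short snippet from the page.",
              "publishedAt": "2024-05-12T08:14:00Z",
              "engine": null}]}
""")
results = parse_results(sample)
```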

Errors

All errors share the same JSON envelope:

{
  "error": {
    "code": "BAD_REQUEST",
    "message": "Query parameter 'q' must not be blank"
  }
}

code is a stable machine-readable enum. message is a human-readable explanation intended for developers, not end users.

Status	`code`	When
`400`	`BAD_REQUEST`	Missing/blank/overlong `q`, invalid `page` (out of 1–10 or non-integer), invalid `lang` (e.g. blank or malformed).
`404`	`NOT_FOUND`	Endpoint does not exist.
`405`	`METHOD_NOT_ALLOWED`	Wrong HTTP method (e.g. `POST` on a `GET`-only endpoint).
`406`	`NOT_ACCEPTABLE`	Requested response type is not acceptable.
`415`	`UNSUPPORTED_MEDIA_TYPE`	Request media type is not accepted.
`429`	`RATE_LIMIT_EXCEEDED`	The client has exceeded the current rate limit. Wait the number of seconds given in the `Retry-After` header before retrying.
`500`	`INTERNAL`	Unexpected server error. Stack trace is logged server-side.
`503`	`UPSTREAM_UNAVAILABLE`	The search service is temporarily unavailable or returned an error.
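
Because code is stable, clients can branch on it rather than on message text. The following sketch turns the envelope into an exception; `ScoutsApiError`, `error_from_response` and `is_retryable` are hypothetical names, and treating only 429 and 503 as retryable is a design assumption, not documented behavior.

```python
import json


class ScoutsApiError(Exception):
    """Carries the HTTP status plus the code and message from the envelope."""

    def __init__(self, status, code, message):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message


# Codes for which a later retry can reasonably succeed (assumption).
RETRYABLE_CODES = {"RATE_LIMIT_EXCEEDED", "UPSTREAM_UNAVAILABLE"}


def error_from_response(status, body_text):
    """Build a ScoutsApiError, tolerating bodies that are not the envelope."""
    try:
        error = json.loads(body_text)["error"]
        code, message = error["code"], error["message"]
    except (ValueError, KeyError, TypeError):
        # For example an HTML error page from a proxy in front of the API.
        code, message = "UNKNOWN", body_text[:200]
    return ScoutsApiError(status, code, message)


def is_retryable(err):
    return err.code in RETRYABLE_CODES
```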

OpenAPI specification

A machine-readable OpenAPI 3.1 spec is generated from the controller and DTO annotations. Use it to generate clients, validate responses, or build tooling.

JSON spec: GET /v3/api-docs
Interactive UI: GET /swagger-ui/index.html

Pagination

Each page returns up to 10 results. To fetch more, increment the page parameter. Up to 10 pages per query are supported.

# Page 1
curl "https://scouts-ai.com/api/search?q=spring+boot&lang=en&page=1"

# Page 2
curl "https://scouts-ai.com/api/search?q=spring+boot&lang=en&page=2"

Each (query, lang, page) triple can be cached independently. If you request the same triple again while the cached entry is still fresh, the second call returns "cached": true.

Languages

The lang parameter is a language hint for the results, not for the query. Pass the language of the content you want back, not the language of your question.

Common values:

Code	Language
`en`	English
`en-US`	English (United States)
`en-GB`	English (United Kingdom)
`ru`	Russian
`de`	German
`fr`	French
`es`	Spanish
`it`	Italian
`pt` / `pt-BR`	Portuguese / Brazilian Portuguese
`ja`	Japanese
`zh-CN` / `zh-TW`	Chinese (Simplified / Traditional)
`ko`	Korean
`ar`	Arabic

Examples:

# Russian results
curl "https://scouts-ai.com/api/search?q=%D0%BD%D0%BE%D0%B2%D0%BE%D1%81%D1%82%D0%B8+%D0%B8%D0%B8&lang=ru"

# German results
curl "https://scouts-ai.com/api/search?q=bundestagswahl+2025&lang=de"

# Japanese results
curl "https://scouts-ai.com/api/search?q=%E6%9C%80%E6%96%B0%E6%99%82%E4%BA%8B&lang=ja"

Examples

curl — single page, English

curl "https://scouts-ai.com/api/search?q=open+source+llm&lang=en&page=1"

curl — fetch all pages

for p in 1 2 3 4 5 6 7 8 9 10; do
  curl "https://scouts-ai.com/api/search?q=spring+boot&lang=en&page=$p"
done

Python

import requests

def search(q, lang="en", page=1):
    r = requests.get(
        "https://scouts-ai.com/api/search",
        params={"q": q, "lang": lang, "page": page},
        timeout=10,
    )
    r.raise_for_status()
    return r.json()

data = search("latest llm benchmarks", lang="en", page=1)
for item in data["results"]:
    print(item["title"], "-", item["url"])

JavaScript (fetch)

async function search(q, lang = "en", page = 1) {
  const url = new URL("https://scouts-ai.com/api/search");
  url.searchParams.set("q", q);
  url.searchParams.set("lang", lang);
  url.searchParams.set("page", String(page));

  const res = await fetch(url);
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  return res.json();
}

const data = await search("open source llm", "en", 1);
for (const r of data.results) {
  console.log(r.title, "-", r.url);
}

Agent loop — fetch unique results within limits

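# Reuses search() from the Python example above.
def collect(q, lang="en", max_pages=5):
    """Collect results across pages, dropping duplicate URLs."""
    seen, out = set(), []
    # The API serves pages 1-10, so never ask for more than that.
    for page in range(1, min(max_pages, 10) + 1):
        data = search(q, lang=lang, page=page)
        if not data["results"]:
            break  # stop at the first empty page
        for r in data["results"]:
            if r["url"] in seen:
                continue
            seen.add(r["url"])
            out.append(r)
    return out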

AI agent discovery

Agents can start with the short Markdown guide at /llms.txt, then load the canonical OpenAPI contract at /v3/api-docs.

Use GET /api/search for search.
Respect 429, Retry-After and X-RateLimit-*.
Do not scrape or index the API endpoint itself; see /robots.txt.

Caching behavior

Repeated (query, lang, page) triples may be cached for a short time.
Cached responses are returned with "cached": true.
Queries are normalized (trimmed, lowercased) before lookup, so casing and surrounding whitespace do not matter.
Different languages for the same query are cached separately.
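
An agent can use the same normalization to avoid sending requests that would resolve to the same cache entry. The sketch below approximates the key from the rules listed above; the server's exact normalization (for instance, how it treats inner whitespace or Unicode case) is not documented here, and `cache_key` and `dedupe` are hypothetical helpers.

```python
def cache_key(q, lang=None, page=1):
    """Approximate the (query, lang, page) triple a request maps to."""
    normalized_q = q.strip().lower()
    normalized_lang = lang.lower() if lang is not None else None
    return (normalized_q, normalized_lang, page)


def dedupe(requests):
    """Keep the first (q, lang, page) request for each distinct cache key."""
    seen, unique = set(), []
    for q, lang, page in requests:
        key = cache_key(q, lang, page)
        if key not in seen:
            seen.add(key)
            unique.append((q, lang, page))
    return unique
```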

Response headers

Every /api/search response carries cache metadata in HTTP headers, so you can build smarter clients without parsing the body:

Header	Value	Meaning
`Cache-Control`	`max-age=N, private`	Standard HTTP cache directive. `private` by default — search queries may contain sensitive data, so shared proxies must not cache.
`X-Cache`	`HIT` or `MISS`	Whether the response was served from cache or freshly retrieved.
`X-Cache-TTL`	Integer seconds	On a `MISS`, equals the configured TTL. On a `HIT`, equals the remaining time until the cached entry expires.

Inspecting headers with curl:

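# -D - prints the response headers of a normal GET; -o /dev/null discards the body.
curl -s -D - -o /dev/null "https://scouts-ai.com/api/search?q=open+source+llm&lang=en"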
# HTTP/1.1 200 OK
# Cache-Control: max-age=3598, private
# X-Cache: HIT
# X-Cache-TTL: 3598
# ...
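
A client can read these headers into a small structure, for example to decide how long a result may be reused locally. This is a minimal sketch; `CacheInfo` and `cache_info` are hypothetical names, and `headers` is assumed to be any mapping of header names to string values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CacheInfo:
    hit: bool
    ttl_seconds: Optional[int]


def cache_info(headers):
    """Extract X-Cache and X-Cache-TTL from a mapping of response headers."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    lowered = {name.lower(): value for name, value in headers.items()}
    ttl = lowered.get("x-cache-ttl")
    return CacheInfo(
        hit=lowered.get("x-cache", "").strip().upper() == "HIT",
        ttl_seconds=int(ttl) if ttl is not None and ttl.strip().isdigit() else None,
    )
```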

Rate limiting

All /api/** endpoints are rate-limited per client IP to keep the service free and fair.

Default public limit: 60 requests per minute per client IP.
The limit may change to protect service quality.
Static pages, legal/about pages, OpenAPI docs and Swagger UI are not rate-limited.
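
Clients that issue many requests can throttle themselves instead of waiting for a 429. The sketch below is a client-side token bucket sized to the default public limit; it assumes the server behaves like a token bucket, which is inferred from the wording used in this documentation rather than stated. `TokenBucket` is a hypothetical class.

```python
import time


class TokenBucket:
    """Client-side throttle: `capacity` tokens, refilled at `per_minute` per minute."""

    def __init__(self, capacity=60, per_minute=60, clock=time.monotonic):
        self.capacity = capacity
        self.rate = per_minute / 60.0  # tokens per second
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def _refill(self):
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now

    def try_acquire(self):
        """Take one token if available; return False when the caller should wait."""
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

    def wait_time(self):
        """Seconds until the next token is available (0.0 if one is ready)."""
        self._refill()
        return max(0.0, (1 - self.tokens) / self.rate)
```

A caller would check `try_acquire()` before each request and sleep for `wait_time()` seconds when it returns False. The injectable `clock` exists only to make the class testable without real delays.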

Rate-limit headers on `/api/**` responses

Header	Value	Meaning
`X-RateLimit-Limit`	Integer	Bucket capacity (max burst).
`X-RateLimit-Remaining`	Integer	Tokens left in the bucket after this request.
`X-RateLimit-Reset`	Epoch seconds	Only on 429: when a token will be available again.

429 Too Many Requests

When the bucket is empty, the request is rejected with the standard error envelope and a Retry-After header (in seconds):

HTTP/1.1 429 Too Many Requests
Retry-After: 2
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1717690000
Content-Type: application/json

{
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "Rate limit exceeded; retry after 2 seconds"
  }
}
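
A client might handle this response by waiting for the advertised delay and trying again a bounded number of times. The sketch below assumes Retry-After is an integer number of seconds, as in the example above; `retry_delay` and `call_with_retry` are hypothetical helpers, `send` stands for any callable that performs the request and returns (status, headers, body), and the one-second fallback is an arbitrary choice.

```python
import time


def retry_delay(headers, now=None):
    """Seconds to wait after a 429, preferring Retry-After over X-RateLimit-Reset."""
    lowered = {name.lower(): value for name, value in headers.items()}
    retry_after = lowered.get("retry-after", "").strip()
    if retry_after.isdigit():
        return float(retry_after)
    reset = lowered.get("x-ratelimit-reset", "").strip()
    if reset.isdigit():
        now = time.time() if now is None else now
        return max(0.0, float(reset) - now)
    return 1.0  # assumed fallback when neither header is usable


def call_with_retry(send, max_attempts=3, sleep=time.sleep):
    """Call send() up to max_attempts times (at least 1), retrying only on 429."""
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        # Return anything that is not a 429, or the last 429 once attempts run out.
        if status != 429 or attempt == max_attempts:
            return status, headers, body
        sleep(retry_delay(headers))
```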

Service metrics

A read-only JSON snapshot of service counters and latency percentiles is available at GET /system/metrics. It is intended for human inspection (curl) or simple scraping and is not in Prometheus format.

Example

curl https://scouts-ai.com/system/metrics

{
  "uptimeSeconds": 1234,
  "rateLimit": { "enabled": true, "capacity": 60, "refillPerMinute": 60 },
  "totals": {
    "requests": 500,
    "status4xx": 20,
    "status5xx": 2,
    "rateLimited": 3,
    "cacheHits": 200,
    "cacheMisses": 100,
    "upstreamErrors": 1
  },
  "cacheHitRate": 0.666,
  "latencyMs": { "p50": 120, "p95": 480, "samples": 300 },
  "perPath": {
    "/api/search": { "p50": 120, "p95": 480, "samples": 300 }
  }
}
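
A few ratios can be derived from the totals in this snapshot. The sketch below assumes cacheHitRate is cacheHits divided by cacheHits plus cacheMisses, which agrees with the example values (200 and 100 give roughly 0.666) but is not stated explicitly; `summarize` is a hypothetical helper.

```python
def summarize(metrics):
    """Derive simple ratios from a /system/metrics snapshot."""
    totals = metrics["totals"]
    lookups = totals["cacheHits"] + totals["cacheMisses"]
    requests = totals["requests"]
    return {
        # Guard against division by zero right after a restart.
        "cache_hit_rate": totals["cacheHits"] / lookups if lookups else 0.0,
        "error_rate_5xx": totals["status5xx"] / requests if requests else 0.0,
        "rate_limited_share": totals["rateLimited"] / requests if requests else 0.0,
    }
```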

Changelog

See CHANGELOG for release notes, new fields and deprecations.
