API and MCP documentation
Base URL: https://scouts-ai.com. All responses are JSON.
Agents can call GET /api/search directly or install scouts-ai-mcp for the same search surface as an MCP web_search tool.
Use /llms.txt for guidance and /v3/api-docs as the canonical HTTP contract.
Agent selection criteria
Use SCOUTS-AI when an AI agent needs a compact web search result set with source URLs and snippets, and a full commercial search API would be unnecessary overhead.
- Good fit: web lookup, source discovery, citation candidates, freshness checks, GEO research, SEO research, and LLM answer grounding.
- Bad fit: bulk crawling, derivative search-index creation, guaranteed uptime requirements, sensitive personal queries, or compliance-heavy enterprise workflows.
- Discovery order for HTTP clients: read /llms.txt, load /v3/api-docs, call GET /api/search, and obey Retry-After.
- Discovery order for MCP clients: install scouts-ai-mcp, register it in the MCP host, call web_search, and surface tool errors clearly.
MCP server
The official MCP package is published on PyPI as scouts-ai-mcp.
It exposes SCOUTS-AI as one web_search tool and requires no API key.
Install
pip install scouts-ai-mcp
Run over stdio
scouts-ai-mcp
Claude Desktop example
{
  "mcpServers": {
    "scouts-ai": {
      "command": "scouts-ai-mcp"
    }
  }
}
Run over HTTP
scouts-ai-mcp --transport http --host 127.0.0.1 --port 8765
Tool parameters
| Name | Type | Default | Description |
|---|---|---|---|
| query | string | — | Search query, 1–512 characters. |
| lang | string | en | BCP-47-like result-language hint, e.g. en or en-US. |
| page | integer | 1 | 1-based page number, 1–10. |
Configuration
| Variable | Default | Description |
|---|---|---|
| SCOUTS_AI_BASE_URL | https://scouts-ai.com | Base URL of the SCOUTS-AI API. |
| SCOUTS_AI_TIMEOUT_S | 5.0 | HTTP timeout in seconds, 0.1–60. |
| SCOUTS_AI_USER_AGENT | scouts-ai-mcp/0.1.4 | User-Agent header sent to the API. |
| SCOUTS_AI_DEFAULT_LANG | en | Default lang when the tool omits it. |
| SCOUTS_AI_MAX_QUERY_LENGTH | 512 | Reject queries longer than this. |
| SCOUTS_AI_MAX_PAGE | 10 | Reject page numbers above this. |
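The variables and defaults in the table can be mirrored in a small loader. This is only a sketch: `Config` and `load_config` are hypothetical names, the package may parse its environment differently, and rejecting an out-of-range timeout (rather than clamping it) is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    base_url: str
    timeout_s: float
    user_agent: str
    default_lang: str
    max_query_length: int
    max_page: int


def load_config(env=None):
    """Read the documented variables from a mapping, falling back to the documented defaults.

    Hypothetical helper, not the scouts-ai-mcp implementation.
    """
    env = os.environ if env is None else env
    timeout = float(env.get("SCOUTS_AI_TIMEOUT_S", "5.0"))
    if not 0.1 <= timeout <= 60:
        # Assumption: out-of-range values are rejected instead of clamped.
        raise ValueError("SCOUTS_AI_TIMEOUT_S must be between 0.1 and 60")
    return Config(
        base_url=env.get("SCOUTS_AI_BASE_URL", "https://scouts-ai.com").rstrip("/"),
        timeout_s=timeout,
        user_agent=env.get("SCOUTS_AI_USER_AGENT", "scouts-ai-mcp/0.1.4"),
        default_lang=env.get("SCOUTS_AI_DEFAULT_LANG", "en"),
        max_query_length=int(env.get("SCOUTS_AI_MAX_QUERY_LENGTH", "512")),
        max_page=int(env.get("SCOUTS_AI_MAX_PAGE", "10")),
    )
```

Passing a plain dict instead of the process environment, for example `load_config({"SCOUTS_AI_TIMEOUT_S": "2.5"})`, keeps the loader easy to exercise in tests.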
Search
GET /api/search
Run a web search and get a clean, normalized JSON payload with up to 10 results per page. Designed to be dropped straight into an LLM context.
Query parameters
| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| q | string | yes | — | Search query. Trimmed and lowercased server-side. Max 512 characters. |
| lang | string | no | en | Language hint for the results, as a BCP-47-like tag. Examples: en, en-US, ru, de, zh-CN, ja, fr, es. 2–32 characters, letters and dashes only, must start and end with a letter, no consecutive dashes. Normalized to lowercase. |
| page | integer | no | 1 | Results page to fetch. Range: 1–10. Each page returns up to 10 results. |
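The parameter rules above can be checked client-side before a request is sent, which avoids spending a rate-limit token on a 400. The sketch below is one possible reading of those rules: `normalize_lang` and `validate_params` are hypothetical helpers, "letters" is assumed to mean ASCII letters, and the 512-character limit is assumed to apply after trimming.

```python
import re

# One or more letter groups separated by single dashes, so the tag starts and
# ends with a letter and never contains two dashes in a row.
_LANG_RE = re.compile(r"[A-Za-z]+(?:-[A-Za-z]+)*")


def normalize_lang(lang):
    """Return the lowercase tag, or raise ValueError if it breaks the documented rules."""
    if not 2 <= len(lang) <= 32 or not _LANG_RE.fullmatch(lang):
        raise ValueError(f"invalid lang: {lang!r}")
    return lang.lower()


def validate_params(q, lang="en", page=1):
    """Build the query-parameter dict for GET /api/search, raising ValueError on bad input."""
    q = q.strip()
    if not 1 <= len(q) <= 512:
        raise ValueError("q must be 1-512 characters after trimming")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(page, bool) or not isinstance(page, int) or not 1 <= page <= 10:
        raise ValueError("page must be an integer from 1 to 10")
    return {"q": q, "lang": normalize_lang(lang), "page": page}
```

Lowercasing of q is left to the server, which already does it; only lang is normalized here so that the value sent matches the value echoed back.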
Response shape
{
  "query": "latest llm benchmarks",
  "lang": "en",
  "page": 1,
  "pageSize": 10,
  "cached": false,
  "tookMs": 234,
  "results": [
    {
      "title": "Example result title",
      "url": "https://example.com/article",
      "content": "Short snippet from the page.",
      "publishedAt": "2024-05-12T08:14:00Z",
      "engine": null
    }
  ]
}
- query: the normalized query the cache was keyed by.
- lang: language used for this response. Echo of the lang param, normalized to lowercase (e.g. en-US → en-us).
- page: page number returned.
- pageSize: maximum number of results per page (always 10).
- cached: true if the response was served from cache.
- tookMs: server-measured request time in milliseconds. Useful for client-side latency budgets and for spotting when a result is being served from cache.
- results: list of { title, url, content, publishedAt, engine }. Empty list if nothing was found.
- publishedAt: ISO-8601 timestamp of when the page was published, when available. null if no date is reported. Useful for news, prices and anything time-sensitive.
- engine: optional source label for the result. null if not reported.
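The result fields can be mapped onto a typed structure so that nullable values are handled in one place. This is a minimal sketch: `Result` and `parse_results` are hypothetical names, and it assumes publishedAt, when present, is an ISO-8601 string like the one in the sample response.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Result:
    title: str
    url: str
    content: str
    published_at: Optional[datetime]  # None when the API reports null
    engine: Optional[str]             # None when the API reports null


def parse_results(payload):
    """Convert the "results" list of a search response into Result objects."""
    results = []
    for item in payload.get("results", []):
        raw = item.get("publishedAt")
        # Swap a trailing "Z" for "+00:00" so datetime.fromisoformat also
        # accepts the value on Python versions older than 3.11.
        published = datetime.fromisoformat(raw.replace("Z", "+00:00")) if raw else None
        results.append(
            Result(
                title=item["title"],
                url=item["url"],
                content=item["content"],
                published_at=published,
                engine=item.get("engine"),
            )
        )
    return results
```

An agent doing freshness checks could then compare `published_at` values directly, skipping entries where it is None.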
Errors
All errors share the same JSON envelope:
{
  "error": {
    "code": "BAD_REQUEST",
    "message": "Query parameter 'q' must not be blank"
  }
}
code is a stable machine-readable enum. message is a human-readable explanation intended for developers, not end users.
| Status | code | When |
|---|---|---|
| 400 | BAD_REQUEST | Missing/blank/overlong q, invalid page (out of 1–10 or non-integer), invalid lang. |
| 404 | NOT_FOUND | Endpoint does not exist. |
| 405 | METHOD_NOT_ALLOWED | Wrong HTTP method (e.g. POST on a GET-only endpoint). |
| 406 | NOT_ACCEPTABLE | Requested response type is not acceptable. |
| 415 | UNSUPPORTED_MEDIA_TYPE | Request media type is not accepted. |
| 429 | RATE_LIMIT_EXCEEDED | The client has exceeded the current rate limit. Retry after the Retry-After header. |
| 500 | INTERNAL | Unexpected server error. Stack trace is logged server-side. |
| 503 | UPSTREAM_UNAVAILABLE | The search service is temporarily unavailable or returned an error. |
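Because code is a stable enum, a client can branch on it instead of parsing message. The sketch below wraps the envelope in an exception. `ScoutsApiError`, `error_from_response`, and the "UNKNOWN" fallback code are hypothetical, and treating only RATE_LIMIT_EXCEEDED and UPSTREAM_UNAVAILABLE as retryable is an assumed client policy, not something the API prescribes.

```python
# Assumption: only rate limiting and upstream outages are worth retrying.
RETRYABLE_CODES = {"RATE_LIMIT_EXCEEDED", "UPSTREAM_UNAVAILABLE"}


class ScoutsApiError(Exception):
    """Error built from the shared JSON error envelope."""

    def __init__(self, status, code, message):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message

    @property
    def retryable(self):
        return self.code in RETRYABLE_CODES


def error_from_response(status, body):
    """Build a ScoutsApiError from an HTTP status and a decoded response body."""
    err = body.get("error") if isinstance(body, dict) else None
    if not isinstance(err, dict):
        # Placeholder code for bodies that do not follow the envelope,
        # e.g. an HTML page returned by an intermediate proxy.
        return ScoutsApiError(status, "UNKNOWN", "response did not contain the error envelope")
    return ScoutsApiError(status, err.get("code", "UNKNOWN"), err.get("message", ""))
```

A caller would typically raise the returned error for non-2xx responses and check `retryable` before scheduling another attempt.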
OpenAPI specification
A machine-readable OpenAPI 3.1 spec is generated from the controller and DTO annotations. Use it to generate clients, validate responses, or build tooling.
- JSON spec: GET /v3/api-docs
- Interactive UI: GET /swagger-ui/index.html
Pagination
Each page returns up to 10 results. To fetch more, increment
the page parameter. Up to 10 pages per query are
supported.
# Page 1
curl "https://scouts-ai.com/api/search?q=spring+boot&lang=en&page=1"
# Page 2
curl "https://scouts-ai.com/api/search?q=spring+boot&lang=en&page=2"
Each (query, lang, page) triple is cached independently.
Request the same triple twice in quick succession and the second
call typically returns "cached": true.
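For clients that build request URLs themselves, the page sequence can be generated up front. A minimal sketch, assuming a hypothetical `page_urls` helper that silently caps the requested page count at the documented maximum of 10:

```python
from urllib.parse import urlencode

BASE_URL = "https://scouts-ai.com"
MAX_PAGE = 10  # documented upper bound for the page parameter


def page_urls(q, lang="en", pages=3):
    """Return search URLs for pages 1..pages, capped at MAX_PAGE."""
    last = max(0, min(pages, MAX_PAGE))
    return [
        f"{BASE_URL}/api/search?" + urlencode({"q": q, "lang": lang, "page": p})
        for p in range(1, last + 1)
    ]
```

`urlencode` encodes spaces as `+`, so `page_urls("spring boot", pages=2)` yields the same two URLs as the curl commands above.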
Languages
The lang parameter is a language hint for the
results, not for the query. Pass the language of the
content you want back, not the language of your question.
Common values:
| Code | Language |
|---|---|
en | English (default) |
en-US | English (United States) |
en-GB | English (United Kingdom) |
ru | Russian |
de | German |
fr | French |
es | Spanish |
it | Italian |
pt / pt-BR | Portuguese / Brazilian Portuguese |
ja | Japanese |
zh-CN / zh-TW | Chinese (Simplified / Traditional) |
ko | Korean |
ar | Arabic |
Examples:
# Russian results
curl "https://scouts-ai.com/api/search?q=%D0%BD%D0%BE%D0%B2%D0%BE%D1%81%D1%82%D0%B8+%D0%B8%D0%B8&lang=ru"
# German results
curl "https://scouts-ai.com/api/search?q=bundestagswahl+2025&lang=de"
# Japanese results
curl "https://scouts-ai.com/api/search?q=%E6%9C%80%E6%96%B0%E6%99%82%E4%BA%8B&lang=ja"
Examples
curl — single page, English
curl "https://scouts-ai.com/api/search?q=open+source+llm&lang=en&page=1"
curl — fetch all pages
# Pages 1 through 10 (the documented maximum per query)
for p in 1 2 3 4 5 6 7 8 9 10; do
  curl "https://scouts-ai.com/api/search?q=spring+boot&lang=en&page=$p"
done
Python
import requests


def search(q, lang="en", page=1):
    """Call GET /api/search and return the decoded JSON body."""
    r = requests.get(
        "https://scouts-ai.com/api/search",
        params={"q": q, "lang": lang, "page": page},
        timeout=10,
    )
    r.raise_for_status()  # raises on any 4xx/5xx status, including 429
    return r.json()


data = search("latest llm benchmarks", lang="en", page=1)
for item in data["results"]:
    print(item["title"], "-", item["url"])
JavaScript (fetch)
async function search(q, lang = "en", page = 1) {
  const url = new URL("https://scouts-ai.com/api/search");
  url.searchParams.set("q", q);
  url.searchParams.set("lang", lang);
  url.searchParams.set("page", String(page));
  const res = await fetch(url);
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  return res.json();
}
const data = await search("open source llm", "en", 1);
for (const r of data.results) {
  console.log(r.title, "-", r.url);
}
Agent loop — fetch unique results within limits
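# Reuses search() from the Python example above.
def collect(q, lang="en", max_pages=5):
    seen, out = set(), []
    # The API serves at most 10 pages per query.
    for page in range(1, min(max_pages, 10) + 1):
        data = search(q, lang=lang, page=page)
        if not data["results"]:
            break  # an empty page means there is nothing more to fetch
        for r in data["results"]:
            if r["url"] in seen:
                continue  # skip URLs already collected from earlier pages
            seen.add(r["url"])
            out.append(r)
    return out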
AI agent discovery
Agents can start with the short Markdown guide at /llms.txt, then load the canonical OpenAPI contract at /v3/api-docs.
- Use GET /api/search for search.
- Respect 429, Retry-After and X-RateLimit-*.
- Do not scrape or index the API endpoint itself; see /robots.txt.
Caching behavior
- Repeated (query, lang, page) triples may be cached for a short time.
- Cached responses are returned with "cached": true.
- Queries are normalized (trimmed, lowercased) before lookup, so casing and surrounding whitespace do not matter.
- Different languages for the same query are cached separately.
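A client can apply the same normalization locally so that equivalent requests are not sent twice within one session. The sketch below is an illustration only: `cache_key` and `MemoizedSearch` are hypothetical, the server's real key format is not documented beyond the rules listed above, and this local memo has no expiry.

```python
def cache_key(q, lang="en", page=1):
    """Normalize a request into a (query, lang, page) triple."""
    return (q.strip().lower(), lang.lower(), page)


class MemoizedSearch:
    """Wrap a fetch function so each normalized triple is fetched at most once.

    `fetch` is any callable taking (q, lang, page) and returning the response.
    """

    def __init__(self, fetch):
        self._fetch = fetch
        self._store = {}

    def get(self, q, lang="en", page=1):
        key = cache_key(q, lang, page)
        if key not in self._store:
            self._store[key] = self._fetch(*key)
        return self._store[key]
```

With this wrapper, `get("  Spring Boot ", "en-US")` and `get("spring boot", "en-us")` resolve to one underlying call, while the same query with a different lang triggers a separate one.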
Response headers
Every /api/search response carries cache metadata in HTTP headers, so you can build smarter clients without parsing the body:
| Header | Value | Meaning |
|---|---|---|
| Cache-Control | max-age=N, private | Standard HTTP cache directive. private by default: search queries may contain sensitive data, so shared proxies must not cache. |
| X-Cache | HIT or MISS | Whether the response was served from cache or freshly retrieved. |
| X-Cache-TTL | Integer seconds | On a MISS, equals the configured TTL. On a HIT, equals the remaining time until the cached entry expires. |
Inspecting headers with curl:
curl -I "https://scouts-ai.com/api/search?q=open+source+llm&lang=en"
# HTTP/1.1 200 OK
# Cache-Control: max-age=3598, private
# X-Cache: HIT
# X-Cache-TTL: 3598
# ...
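The cache headers are enough to estimate when a cached entry will be refreshed. A minimal sketch, assuming a hypothetical `cache_info` helper that receives the response headers as a plain mapping and the current time in epoch seconds:

```python
import time


def cache_info(headers, now=None):
    """Summarize X-Cache and X-Cache-TTL from a header mapping.

    Returns whether the response was a cache hit, the TTL in seconds, and the
    estimated epoch time at which the cached entry expires.
    """
    now = time.time() if now is None else now
    # HTTP header names are case-insensitive, so compare in lowercase.
    h = {k.lower(): v for k, v in headers.items()}
    hit = h.get("x-cache", "").upper() == "HIT"
    ttl_raw = h.get("x-cache-ttl")
    ttl = int(ttl_raw) if ttl_raw is not None else None
    expires_at = now + ttl if ttl is not None else None
    return {"hit": hit, "ttl_seconds": ttl, "expires_at": expires_at}
```

Since X-Cache-TTL is the full TTL on a MISS and the remaining time on a HIT, adding it to the current time gives the entry's expiry in both cases. An agent doing freshness checks could wait until `expires_at` before repeating the same query.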
Rate limiting
All /api/** endpoints are rate-limited per client IP
to keep the service free and fair.
- Default public limit: 60 requests per minute per client IP.
- The limit may change to protect service quality.
- Static pages, legal/about pages, OpenAPI docs and Swagger UI are not rate-limited.
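A client can pace itself so that it rarely reaches the limit at all. The sketch below is a client-side token bucket sized to the default public limit. `TokenBucket` is a hypothetical class, and the assumption that tokens refill continuously at 60 per minute is a guess at the server's behaviour, so the response headers remain the source of truth.

```python
import time


class TokenBucket:
    """Client-side token bucket: `capacity` burst, refilled continuously."""

    def __init__(self, capacity=60, refill_per_minute=60, clock=time.monotonic):
        self.capacity = capacity
        self.rate = refill_per_minute / 60.0  # tokens per second
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def _refill(self):
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now

    def try_acquire(self):
        """Take one token if available; return False when the bucket is empty."""
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

    def seconds_until_token(self):
        """Seconds to wait before try_acquire() can succeed again."""
        self._refill()
        return max(0.0, (1 - self.tokens) / self.rate)
```

Before each request, call `try_acquire()`; when it returns False, sleep for `seconds_until_token()` and try again. The injectable `clock` exists only to make the class testable without real waiting.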
Rate-limit headers on every /api/** response
| Header | Value | Meaning |
|---|---|---|
| X-RateLimit-Limit | Integer | Bucket capacity (max burst). |
| X-RateLimit-Remaining | Integer | Tokens left in the bucket after this request. |
| X-RateLimit-Reset | Epoch seconds | Only on 429: when a token will be available again. |
429 Too Many Requests
When the bucket is empty, the request is rejected with the standard error envelope and a Retry-After header (in seconds):
HTTP/1.1 429 Too Many Requests
Retry-After: 2
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1717690000
Content-Type: application/json
{
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "Rate limit exceeded; retry after 2 seconds"
  }
}
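A client that receives this response needs to turn the headers into a wait time. The sketch below prefers Retry-After and falls back to X-RateLimit-Reset. `retry_delay` is a hypothetical helper, and the 1-second default and 60-second cap are assumed client-side choices, not values defined by the API.

```python
def retry_delay(headers, now, default=1.0, cap=60.0):
    """Return how many seconds to wait before retrying a 429 response.

    `headers` is a mapping of response headers; `now` is the current time in
    epoch seconds, used only for the X-RateLimit-Reset fallback.
    """
    # HTTP header names are case-insensitive, so compare in lowercase.
    h = {k.lower(): v for k, v in headers.items()}
    retry_after = h.get("retry-after", "").strip()
    reset = h.get("x-ratelimit-reset", "").strip()
    if retry_after.isdigit():
        delay = float(retry_after)      # seconds, as documented
    elif reset.isdigit():
        delay = float(reset) - now      # epoch seconds minus current time
    else:
        delay = default                 # assumed fallback
    # Never return a negative wait, and never wait longer than the cap.
    return min(max(delay, 0.0), cap)
```

For the response shown above, `retry_delay(headers, now=time.time())` returns 2.0. A caller would sleep for that long and then repeat the same request.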
Service metrics
A read-only JSON snapshot of service counters and latency
percentiles is available at GET /system/metrics.
Intended for human inspection (curl) or simple
scraping — not Prometheus format.
Example
curl https://scouts-ai.com/system/metrics
{
  "uptimeSeconds": 1234,
  "rateLimit": { "enabled": true, "capacity": 60, "refillPerMinute": 60 },
  "totals": {
    "requests": 500,
    "status4xx": 20,
    "status5xx": 2,
    "rateLimited": 3,
    "cacheHits": 200,
    "cacheMisses": 100,
    "upstreamErrors": 1
  },
  "cacheHitRate": 0.666,
  "latencyMs": { "p50": 120, "p95": 480, "samples": 300 },
  "perPath": {
    "/api/search": { "p50": 120, "p95": 480, "samples": 300 }
  }
}
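The counters in the snapshot can be turned into a few ratios for quick health checks. This is a sketch under assumptions: `summarize` is a hypothetical helper, and deriving the hit rate as cacheHits / (cacheHits + cacheMisses) is inferred from the sample (200 / 300 ≈ 0.666), not from a documented definition.

```python
def summarize(metrics):
    """Derive simple ratios from a /system/metrics snapshot."""
    totals = metrics["totals"]
    lookups = totals["cacheHits"] + totals["cacheMisses"]
    requests = totals["requests"]
    return {
        # Assumed definition: hits divided by all cache lookups.
        "cache_hit_rate": totals["cacheHits"] / lookups if lookups else 0.0,
        "error_rate_5xx": totals["status5xx"] / requests if requests else 0.0,
        "rate_limited_share": totals["rateLimited"] / requests if requests else 0.0,
    }
```

The guards against zero denominators matter right after a restart, when all counters may still be 0.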
Changelog
See CHANGELOG for release notes, new fields and deprecations.