# HTTP Contract: Content Fetch Route

**Feature**: `003-kme-content-fetch`
**File**: `specs/003-kme-content-fetch/contracts/http-content-fetch.md`

This document defines the HTTP request/response contract for the content-fetch route exposed by the KME Content Adapter proxy.

---

## Route

```
GET {proxy-base-url}?kmeURL={encoded-article-url}
```

The proxy detects the content-fetch route when:

- The incoming URL does **not** end in `/sitemap.xml`, AND
- The query string contains a `kmeURL` parameter (present, regardless of value)

Requests without `kmeURL` (and not a sitemap request) are routed to the existing auth-check passthrough (returns 200 "Authorized").

---

## Request

### Method

`GET`

### Query Parameters

| Parameter | Required | Description |
|-----------|----------|-------------|
| `kmeURL` | Yes | The verbatim `vkm:url` value from the KME Search API response. Must be a well-formed absolute `http` or `https` URL. Percent-encoded characters are decoded once (standard URL decoding) — double-encoding must not occur. |

### Headers

None required on the inbound request. The proxy adds its own `Authorization` header on the upstream request.

### Example Request

```
GET /?kmeURL=https%3A%2F%2Fcontent.kme.example%2Farticles%2F123 HTTP/1.1
Host: proxy.example.com
```

---

## Responses

### 200 OK — Article HTML Body

The article was successfully fetched and `vkm:articleBody` was extracted.

```
HTTP/1.1 200 OK
Content-Type: text/html

Article content here...

```

| Field | Value |
|-------|-------|
| Status | `200` |
| `Content-Type` | `text/html` |
| Body | Raw HTML string from `vkm:articleBody` field of the KME Content Service JSON-LD response. Not sanitised or transformed. |

---

### 400 Bad Request — Invalid `kmeURL`

Returned when `kmeURL` is absent, empty, whitespace-only, or not a well-formed absolute http/https URL. No upstream request is made.

```
HTTP/1.1 400 Bad Request
Content-Type: text/plain

Bad Request: kmeURL parameter is required
```

```
HTTP/1.1 400 Bad Request
Content-Type: text/plain

Bad Request: kmeURL must be a well-formed absolute http/https URL
```

| Trigger | Response body |
|---------|---------------|
| `kmeURL` absent, empty, or whitespace | `Bad Request: kmeURL parameter is required` |
| `kmeURL` present but malformed or non-http/https | `Bad Request: kmeURL must be a well-formed absolute http/https URL` |

---

### 404 Not Found — Article Not Found

Returned when the upstream KME Content Service returns a 4xx response for the article URL, or when the upstream response does not contain a non-empty `vkm:articleBody`.

```
HTTP/1.1 404 Not Found
Content-Type: text/plain

Not Found: article not found at upstream
```

```
HTTP/1.1 404 Not Found
Content-Type: text/plain

Not Found: article body not present in upstream response
```

| Trigger | Response body |
|---------|---------------|
| Upstream 4xx HTTP response | `Not Found: article not found at upstream` |
| `vkm:articleBody` absent, null, or empty string | `Not Found: article body not present in upstream response` |

---

### 500 Internal Server Error — Proxy Configuration Error

Returned when a required OIDC setting is missing from `kme_CSA_settings`. Indicates a proxy deployment/configuration issue.
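As a rough illustration of the missing-setting check described above, the sketch below builds the 500 response body from a settings object. It is not the proxy's actual code: `findMissingField` and `configErrorBody` are hypothetical names, the list of required fields is passed in by the caller because this contract only names `tokenUrl`, and treating an empty string as "missing" is an assumption.

```typescript
// Hypothetical helper: returns the first required field that is absent,
// null, or an empty string (the empty-string rule is an assumption).
function findMissingField(
  settings: Record<string, unknown>,
  requiredFields: readonly string[],
): string | null {
  for (const field of requiredFields) {
    const value = settings[field];
    if (value === undefined || value === null || value === "") {
      return field;
    }
  }
  return null;
}

// Builds the plain-text 500 body in the format shown in this contract,
// or returns null when every required field is present.
function configErrorBody(
  settings: Record<string, unknown>,
  requiredFields: readonly string[],
): string | null {
  const missing = findMissingField(settings, requiredFields);
  return missing === null
    ? null
    : `Configuration error: missing required field: ${missing}`;
}
```

Calling `configErrorBody({}, ["tokenUrl"])` yields the body used in the example response for this status.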
```
HTTP/1.1 500 Internal Server Error
Content-Type: text/plain

Configuration error: missing required field: tokenUrl
```

---

### 502 Bad Gateway — Upstream or Token Failure

Returned for any upstream connectivity, protocol, or data error, and for token acquisition failure.

```
HTTP/1.1 502 Bad Gateway
Content-Type: text/plain

Bad Gateway: token acquisition failed
```

| Trigger | Response body |
|---------|---------------|
| OIDC token acquisition failure | `Bad Gateway: token acquisition failed` |
| Upstream request timeout (`ECONNABORTED`/`ERR_CANCELED`) | `Bad Gateway: upstream request timed out` |
| Upstream 5xx HTTP response | `Bad Gateway: upstream error HTTP {status}` |
| Network-level error (no HTTP response) | `Bad Gateway: {error message}` |
| Upstream response body is not valid JSON | `Bad Gateway: unparseable response from upstream` |
| Upstream response body is not an object | `Bad Gateway: unexpected response from upstream` |

---

## Upstream Request (Proxy → KME Content Service)

The proxy makes a single GET request to the verbatim `kmeURL` value.
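As a rough illustration, that request can be described as a plain options object. The sketch below is not the proxy's actual code: `buildUpstreamRequest`, `UpstreamRequest`, and the `timeoutMs` field name are hypothetical, and the ID token is assumed to have been obtained already (for example from `getValidToken()`). The header format and timeout value are the ones specified in this section.

```typescript
// Hypothetical shape for the upstream call; not tied to any HTTP client.
interface UpstreamRequest {
  method: "GET";
  url: string;
  headers: { Authorization: string };
  timeoutMs: number;
}

// Timeout taken from this contract: 10 000 ms.
const UPSTREAM_TIMEOUT_MS = 10000;

function buildUpstreamRequest(kmeURL: string, idToken: string): UpstreamRequest {
  return {
    method: "GET",
    // Passed through verbatim: no manipulation, no re-encoding.
    url: kmeURL,
    headers: { Authorization: `OIDC_id_token ${idToken}` },
    timeoutMs: UPSTREAM_TIMEOUT_MS,
  };
}
```

Keeping the URL untouched in one small function makes the "verbatim" requirement easy to check in a unit test.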
```
GET {kmeURL} HTTP/1.1
Authorization: OIDC_id_token {id_token}
```

| Field | Value |
|-------|-------|
| Method | `GET` |
| URL | Verbatim value of `kmeURL` query parameter — no manipulation, no re-encoding |
| `Authorization` | `OIDC_id_token {id_token}` where `id_token` is from `getValidToken()` |
| Timeout | 10 000 ms (10 seconds) |

---

## Error Mapping Summary

```
kmeURL absent/empty                 → 400
kmeURL malformed / non-http(s)      → 400
Missing OIDC config                 → 500
Token acquisition failure           → 502
Upstream 4xx                        → 404
Upstream 5xx                        → 502
Upstream timeout                    → 502
Network error                       → 502
Unparseable response body           → 502
vkm:articleBody absent/null/empty   → 404
Success                             → 200 text/html
```

---

## Non-regression: Existing Routes

This feature does not change the behaviour of existing routes:

| Route | Behaviour |
|-------|-----------|
| URL ends in `/sitemap.xml` | Sitemap flow (unchanged) |
| No `kmeURL`, not sitemap | Auth-check passthrough → 200 "Authorized" (unchanged) |
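
The routing rules and the two 400 cases in this contract can be sketched with the standard `URL` API. This is a minimal illustration, not the proxy's implementation: `classifyRoute`, `validateKmeUrl`, and the `Route` type are hypothetical names, the placeholder base URL exists only so relative request paths can be parsed, and "ends in `/sitemap.xml`" is assumed to refer to the URL path rather than the full URL including its query string.

```typescript
type Route = "sitemap" | "content-fetch" | "auth-check";

// Decide which flow handles the request. The sitemap check comes first,
// so a sitemap URL that also carries kmeURL still takes the sitemap flow.
function classifyRoute(rawUrl: string): Route {
  const url = new URL(rawUrl, "http://placeholder.invalid");
  if (url.pathname.endsWith("/sitemap.xml")) {
    return "sitemap";
  }
  // has() is true whenever the parameter is present, even with an empty value.
  if (url.searchParams.has("kmeURL")) {
    return "content-fetch";
  }
  return "auth-check";
}

const REQUIRED_MSG = "Bad Request: kmeURL parameter is required";
const MALFORMED_MSG =
  "Bad Request: kmeURL must be a well-formed absolute http/https URL";

// Returns the 400 response body for an invalid value, or null when valid.
// The value is expected to be already decoded once (as searchParams.get does).
function validateKmeUrl(value: string | null): string | null {
  if (value === null || value.trim() === "") {
    return REQUIRED_MSG;
  }
  let parsed: URL;
  try {
    parsed = new URL(value); // throws for relative or unparseable input
  } catch {
    return MALFORMED_MSG;
  }
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    return MALFORMED_MSG;
  }
  return null;
}
```

A handler following this sketch would call `classifyRoute` first and run `validateKmeUrl` only on the content-fetch branch, which leaves the sitemap and auth-check flows untouched.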