- Refactor kmeContentSourceAdapter.js into getValidToken(), oidcAuthFlow(), and sitemapFlow(); add sitemap generation using hydra:member response structure
- Add searchApiBaseUrl, tenant, proxyBaseUrl fields to kme_CSA_settings.json and kme_CSA_settings.json.example
- Add 17 unit tests for sitemap flow and non-sitemap routing regression
- Add 5 contract tests for sitemap endpoint (proxy-http.test.js)
- Add [Unreleased] sitemap entry to CHANGELOG.md
- Add full specs/002-sitemap-generation/ artifact directory (spec, plan, tasks, data-model, contracts, research, quickstart, checklist)
- Update constitution.md: add redis as permitted global, refresh kme_CSA_settings references
- Update copilot-instructions.md SPECKIT marker to sitemap plan
# Contract: Sitemap Endpoint

**Feature**: `002-sitemap-generation`
**Endpoint type**: HTTP GET
**Introduced in**: `002-sitemap-generation`

---

## Overview

The `kme-content-adapter` proxy exposes a single new HTTP endpoint: `GET /sitemap.xml` (or
any URL whose path ends with `/sitemap.xml`). This contract governs the complete observable
behaviour of that endpoint from the consumer's perspective.

---

## Endpoint

```
GET <proxy-base-url>/sitemap.xml
```

The adapter detects sitemap requests by checking whether `req.url` ends with `/sitemap.xml`.
The full path prefix (if any) is determined by how the reverse proxy routes requests to this
adapter.
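
The suffix check described above can be sketched roughly as follows, assuming a Node.js-style `req` object whose `url` property holds the request path plus any query string. `isSitemapRequest` and `selectFlow` are hypothetical names used only for illustration; the returned flow names mirror the adapter's `sitemapFlow()` and `oidcAuthFlow()` split. One consequence of a strict suffix match is that a URL with a trailing query string, such as `/sitemap.xml?x=1`, would not be treated as a sitemap request; whether the adapter strips the query first is not stated in this contract.

```javascript
// Hypothetical helper: true when the request URL ends with "/sitemap.xml".
// This is a plain suffix match, so any path prefix added by the reverse proxy is accepted.
function isSitemapRequest(req) {
  return typeof req.url === 'string' && req.url.endsWith('/sitemap.xml');
}

// Hypothetical router: returns the name of the flow that should handle the request.
function selectFlow(req) {
  return isSitemapRequest(req) ? 'sitemapFlow' : 'oidcAuthFlow';
}

console.log(selectFlow({ url: '/kme/sitemap.xml' })); // sitemapFlow
console.log(selectFlow({ url: '/?kmeURL=abc' }));     // oidcAuthFlow
```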

---

## Request

### Method
`GET`

### Headers
No special request headers required. The adapter uses its own internally cached OIDC token
to authenticate the upstream call to the KME Knowledge Search Service.

### Body
None.

---

## Responses

### 200 OK — Sitemap generated successfully

**Condition**: The KME Knowledge Search Service returned a 2xx response and the sitemap was
built without errors.

**Headers**:
```
Content-Type: application/xml
```

**Body**: A well-formed XML Sitemap document conforming to
[https://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd](https://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>{proxyBaseUrl}?kmeURL={encodeURIComponent(vkmUrl)}</loc>
  </url>
  <!-- one <url> element per knowledge item with a non-empty vkm:url -->
</urlset>
```

**Empty-result variant** (search service returns zero items):
```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"/>
```
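
As a rough illustration of how the two response bodies above could be produced, the sketch below builds the XML from an array of search result items. The names `buildSitemapXml` and `escapeXml` are hypothetical. XML-escaping the `<loc>` value and emitting the self-closing `<urlset/>` form when every item lacks a `vkm:url` are assumptions of this sketch, not behaviour stated in the contract.

```javascript
const SITEMAP_NS = 'http://www.sitemaps.org/schemas/sitemap/0.9';
const XML_ENTITIES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&apos;' };

// Assumption: escape XML special characters so the <loc> text stays well-formed.
function escapeXml(text) {
  return text.replace(/[&<>"']/g, (ch) => XML_ENTITIES[ch]);
}

// Hypothetical builder: one <url> element per item with a non-empty vkm:url.
function buildSitemapXml(items, proxyBaseUrl) {
  const header = '<?xml version="1.0" encoding="UTF-8"?>\n';
  const entries = items
    .map((item) => item['vkm:url'])
    .filter((vkmUrl) => typeof vkmUrl === 'string' && vkmUrl.length > 0)
    .map((vkmUrl) => {
      const loc = `${proxyBaseUrl}?kmeURL=${encodeURIComponent(vkmUrl)}`;
      return `  <url>\n    <loc>${escapeXml(loc)}</loc>\n  </url>`;
    });

  if (entries.length === 0) {
    // Empty-result variant: a self-closing <urlset/>.
    return `${header}<urlset xmlns="${SITEMAP_NS}"/>\n`;
  }
  return `${header}<urlset xmlns="${SITEMAP_NS}">\n${entries.join('\n')}\n</urlset>\n`;
}
```

In a handler, `items` would presumably come from the `hydra:member` array of the search service response, and the result would be sent with `Content-Type: application/xml`.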

### 500 Internal Server Error — Missing configuration

**Condition**: One or more required settings fields (`searchApiBaseUrl`, `tenant`,
`proxyBaseUrl`) are absent from `kme_CSA_settings`.

**Headers**:
```
Content-Type: text/plain
```

**Body**:
```
Configuration error: missing required field: <fieldName>
```
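
A minimal sketch of the configuration check behind this response, assuming the settings have already been parsed into a plain object. `checkSitemapSettings` is a hypothetical name. Treating an empty string as absent and reporting only the first missing field are assumptions of this sketch; the contract specifies only the single `<fieldName>` placeholder.

```javascript
const REQUIRED_SITEMAP_FIELDS = ['searchApiBaseUrl', 'tenant', 'proxyBaseUrl'];

// Hypothetical validator: returns null when all fields are present,
// otherwise a description of the 500 response defined by the contract.
function checkSitemapSettings(settings) {
  const missing = REQUIRED_SITEMAP_FIELDS.find((field) => {
    const value = settings == null ? undefined : settings[field];
    return value === undefined || value === null || value === ''; // assumption: '' counts as absent
  });
  if (missing === undefined) {
    return null;
  }
  return {
    status: 500,
    headers: { 'Content-Type': 'text/plain' },
    body: `Configuration error: missing required field: ${missing}`,
  };
}
```

For example, calling `checkSitemapSettings({ searchApiBaseUrl: 'https://search.example.com', tenant: 'acme' })` with these illustrative values would describe a 500 response naming `proxyBaseUrl`.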

### 502 Bad Gateway — Upstream search service error

**Condition**: The KME Knowledge Search Service returned a non-2xx HTTP response.

**Headers**:
```
Content-Type: text/plain
```

**Body**:
```
Search service error: HTTP <status>
```

### 504 Gateway Timeout — Upstream search service timeout

**Condition**: The KME Knowledge Search Service connection timed out (>10 000 ms).

**Headers**:
```
Content-Type: text/plain
```

**Body**:
```
Search service timeout
```
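
The 502 and 504 conditions above could be mapped along the following lines. This is a sketch only: it assumes a `fetch`-compatible function is passed in as `fetchImpl` (which also makes it easy to stub), uses `AbortController` to enforce the 10 000 ms limit, and rethrows any error that is not a timeout because the contract does not define a response for those. `querySearchService` and `plainText` are hypothetical names.

```javascript
const SEARCH_TIMEOUT_MS = 10000;

// Hypothetical helper: shape of a text/plain error response.
function plainText(status, body) {
  return { status, headers: { 'Content-Type': 'text/plain' }, body };
}

// Resolves to { response } on a 2xx upstream reply, or { failure } for the
// 502 and 504 cases described in the contract.
async function querySearchService(fetchImpl, url, timeoutMs = SEARCH_TIMEOUT_MS) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetchImpl(url, { signal: controller.signal });
    if (!response.ok) {
      return { failure: plainText(502, `Search service error: HTTP ${response.status}`) };
    }
    return { response };
  } catch (err) {
    if (err && err.name === 'AbortError') {
      return { failure: plainText(504, 'Search service timeout') };
    }
    throw err; // not covered by this contract
  } finally {
    clearTimeout(timer);
  }
}
```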

---

## `<loc>` URL Format

Each `<loc>` element is constructed as:

```
{proxyBaseUrl}?kmeURL={encodeURIComponent(item['vkm:url'])}
```

Where:
- `proxyBaseUrl` is taken from `kme_CSA_settings.proxyBaseUrl` (e.g., `https://adapter.example.com`)
- `item['vkm:url']` is the raw `vkm:url` value from the search service result
- `encodeURIComponent` percent-encodes the value so it is safe as a query parameter

**Example**:
```
https://adapter.example.com?kmeURL=https%3A%2F%2Fkme.example.com%2Fknowledge%2Farticle-123
```
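
To make the encoding step concrete, the sketch below reproduces the example above and then shows the reverse direction: a consumer parsing the `<loc>` value with the standard `URL` API recovers the original `vkm:url` unchanged. `buildLoc` is a hypothetical helper name, and the input values are the illustrative ones from this section.

```javascript
// Hypothetical helper implementing the documented <loc> format.
function buildLoc(proxyBaseUrl, vkmUrl) {
  return `${proxyBaseUrl}?kmeURL=${encodeURIComponent(vkmUrl)}`;
}

const vkmUrl = 'https://kme.example.com/knowledge/article-123';
const loc = buildLoc('https://adapter.example.com', vkmUrl);
console.log(loc);
// https://adapter.example.com?kmeURL=https%3A%2F%2Fkme.example.com%2Fknowledge%2Farticle-123

// Round trip: URLSearchParams percent-decodes the parameter value.
const recovered = new URL(loc).searchParams.get('kmeURL');
console.log(recovered === vkmUrl); // true
```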

---

## Authentication to Upstream (internal, not exposed to consumer)

The adapter authenticates to the KME Knowledge Search Service using:

```
Authorization: OIDC_id_token <token>
```

Where `<token>` is the `id_token` from the OIDC token service, cached in Redis at
`authorization.token`. Token refresh uses the same stampede-guarded fetch already present
in the existing OIDC auth flow.
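
For readers unfamiliar with the idea of a stampede guard, the sketch below shows one common in-process form of it: concurrent callers share a single pending token fetch instead of each starting their own. This is an assumption for illustration and not a description of the adapter's actual guard; it leaves out the Redis cache at `authorization.token` entirely. `getTokenOnce`, `buildUpstreamAuthHeaders`, and the `fetchToken` callback are hypothetical names.

```javascript
// Hypothetical single-flight guard: at most one token fetch is in flight at a time.
let pendingTokenFetch = null;

function getTokenOnce(fetchToken) {
  if (pendingTokenFetch === null) {
    pendingTokenFetch = Promise.resolve()
      .then(fetchToken)
      .finally(() => {
        pendingTokenFetch = null; // allow a fresh fetch after this one settles
      });
  }
  return pendingTokenFetch;
}

// Builds the upstream header in the format given by the contract.
function buildUpstreamAuthHeaders(idToken) {
  return { Authorization: `OIDC_id_token ${idToken}` };
}
```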

---

## Existing Endpoint Behaviour (unchanged)

All requests whose URL does **not** end in `/sitemap.xml` continue to use the existing OIDC
authentication flow with no change in response behaviour:

| Condition | Response |
|---|---|
| Valid cached OIDC token | `200 Authorized` (`text/plain`) |
| No cached token — fetch succeeds | `200 Authorized` (`text/plain`) |
| Token service unreachable | `401 Unauthorized: <error>` (`text/plain`) |

---

## Non-Functional Constraints

| Constraint | Value | Source |
|---|---|---|
| Search API timeout | 10 000 ms | Spec assumption |
| Max response time (normal conditions) | < 5 000 ms | SC-001 |
| Max response time (error scenarios) | < 10 000 ms | SC-005 |
| Pagination | Not supported (v1) | Spec assumption |
| Multi-tenant | Not supported (v1) | Spec assumption |

---

## Sitemap Protocol Compliance

The returned XML must validate against the Sitemaps XSD:
`https://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd`

Required elements per entry (v1 scope):
- `<loc>` — mandatory

Optional elements **not included** in v1:
- `<lastmod>` — out of scope
- `<changefreq>` — out of scope
- `<priority>` — out of scope