kme_content_adapter/specs/002-sitemap-generation/contracts/sitemap-endpoint.md
Peter.Morton 50b87297d2 feat(002): add sitemap generation feature
- Refactor kmeContentSourceAdapter.js into getValidToken(), oidcAuthFlow(),
  and sitemapFlow(); add sitemap generation using hydra:member response structure
- Add searchApiBaseUrl, tenant, proxyBaseUrl fields to kme_CSA_settings.json
  and kme_CSA_settings.json.example
- Add 17 unit tests for sitemap flow and non-sitemap routing regression
- Add 5 contract tests for sitemap endpoint (proxy-http.test.js)
- Add [Unreleased] sitemap entry to CHANGELOG.md
- Add full specs/002-sitemap-generation/ artifact directory
  (spec, plan, tasks, data-model, contracts, research, quickstart, checklist)
- Update constitution.md: add redis as permitted global, refresh
  kme_CSA_settings references
- Update copilot-instructions.md SPECKIT marker to sitemap plan
2026-04-22 22:08:08 -05:00


Contract: Sitemap Endpoint

Feature: 002-sitemap-generation
Endpoint type: HTTP GET
Introduced in: 002-sitemap-generation


Overview

The kme-content-adapter proxy exposes a single new HTTP endpoint: GET /sitemap.xml (or any URL whose path ends with /sitemap.xml). This contract governs the complete observable behaviour of that endpoint from the consumer's perspective.


Endpoint

GET <proxy-base-url>/sitemap.xml

The adapter detects sitemap requests by checking whether req.url ends with /sitemap.xml. The full path prefix (if any) is determined by how the reverse proxy routes requests to this adapter.
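The detection rule above can be sketched as a small predicate. This is an illustration only: `isSitemapRequest` is a hypothetical name, not a function confirmed to exist in kmeContentSourceAdapter.js, and stripping the query string or fragment before the suffix check is an assumption made so that the check applies to the path.

```javascript
// Hypothetical helper, not taken from kmeContentSourceAdapter.js.
// Assumption: a query string or fragment should not defeat the suffix check,
// so only the path portion of req.url is compared.
function isSitemapRequest(reqUrl) {
  const path = String(reqUrl).split(/[?#]/, 1)[0];
  return path.endsWith('/sitemap.xml');
}

// Example routing decision for a few request URLs.
for (const url of ['/sitemap.xml', '/kme/sitemap.xml', '/article-123']) {
  console.log(url, isSitemapRequest(url) ? 'sitemap flow' : 'OIDC auth flow');
}
```

If the adapter compares the raw `req.url` instead, a request such as `/sitemap.xml?x=1` would fall through to the existing OIDC flow; which behaviour is intended is for the implementation to decide.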


Request

Method

GET

Headers

No special request headers are required. The adapter authenticates the upstream call to the KME Knowledge Search Service with its own internally cached OIDC token.

Body

None.


Responses

200 OK — Sitemap generated successfully

Condition: The KME Knowledge Search Service returned a 2xx response and the sitemap was built without errors.

Headers:

Content-Type: application/xml

Body: A well-formed XML Sitemap document conforming to https://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd.

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>{proxyBaseUrl}?kmeURL={encodeURIComponent(vkmUrl)}</loc>
  </url>
  <!-- one <url> element per knowledge item with a non-empty vkm:url -->
</urlset>

Empty-result variant (search service returns zero items):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"/>
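As a rough sketch of how both body variants might be serialised, the function below takes a list of already-built `<loc>` strings and renders the document. The names `renderSitemapXml` and `escapeXml` are hypothetical, and the XML escaping of `<loc>` values is an assumption added for well-formedness; the contract itself does not describe it.

```javascript
// Hypothetical helpers, not taken from the adapter source.
const SITEMAP_NS = 'http://www.sitemaps.org/schemas/sitemap/0.9';

// Escape the five XML special characters so a <loc> value cannot break the document.
function escapeXml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Render the 200 OK body from a list of <loc> strings.
function renderSitemapXml(locs) {
  const header = '<?xml version="1.0" encoding="UTF-8"?>\n';
  if (locs.length === 0) {
    // Empty-result variant: a self-closing <urlset/>.
    return `${header}<urlset xmlns="${SITEMAP_NS}"/>\n`;
  }
  const entries = locs
    .map((loc) => `  <url>\n    <loc>${escapeXml(loc)}</loc>\n  </url>\n`)
    .join('');
  return `${header}<urlset xmlns="${SITEMAP_NS}">\n${entries}</urlset>\n`;
}

console.log(renderSitemapXml(['https://adapter.example.com?kmeURL=https%3A%2F%2Fkme.example.com%2Fa']));
```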

500 Internal Server Error — Missing configuration

Condition: One or more required settings fields (searchApiBaseUrl, tenant, proxyBaseUrl) are absent from kme_CSA_settings.

Headers:

Content-Type: text/plain

Body:

Configuration error: missing required field: <fieldName>
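A minimal sketch of the configuration check, assuming the settings object has already been loaded from kme_CSA_settings. `checkSitemapSettings` is a hypothetical name; treating an empty string as absent and reporting only the first missing field are both assumptions, chosen because the body format names a single `<fieldName>`.

```javascript
// Hypothetical helper, not taken from the adapter source.
const REQUIRED_SITEMAP_FIELDS = ['searchApiBaseUrl', 'tenant', 'proxyBaseUrl'];

// Returns null when all required fields are present, otherwise a description
// of the 500 response defined by this contract.
function checkSitemapSettings(settings) {
  const missing = REQUIRED_SITEMAP_FIELDS.find((field) => {
    const value = settings ? settings[field] : undefined;
    // Assumption: undefined, null and '' all count as "absent".
    return value === undefined || value === null || value === '';
  });
  if (missing === undefined) {
    return null;
  }
  return {
    status: 500,
    contentType: 'text/plain',
    body: `Configuration error: missing required field: ${missing}`,
  };
}

console.log(checkSitemapSettings({ searchApiBaseUrl: 'https://search.example.com', tenant: 'acme' }));
```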

502 Bad Gateway — Upstream search service error

Condition: The KME Knowledge Search Service returned a non-2xx HTTP response.

Headers:

Content-Type: text/plain

Body:

Search service error: HTTP <status>

504 Gateway Timeout — Upstream search service timeout

Condition: The KME Knowledge Search Service connection timed out (>10 000 ms).

Headers:

Content-Type: text/plain

Body:

Search service timeout
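The 502 and 504 cases can be summarised as a mapping from the upstream outcome to the consumer-facing response. The sketch below assumes the caller has already reduced the upstream call to an object of the form `{ timedOut, status }`; that shape and the name `errorResponseFor` are illustrative, not part of the adapter.

```javascript
// Hypothetical helper, not taken from the adapter source.
// Timeout budget for the upstream call, as stated in this contract.
const SEARCH_TIMEOUT_MS = 10000;

// Maps an upstream outcome to an error response, or null when the
// upstream call succeeded and the sitemap can be built.
function errorResponseFor(outcome) {
  if (outcome.timedOut) {
    return { status: 504, contentType: 'text/plain', body: 'Search service timeout' };
  }
  if (outcome.status < 200 || outcome.status >= 300) {
    return {
      status: 502,
      contentType: 'text/plain',
      body: `Search service error: HTTP ${outcome.status}`,
    };
  }
  return null;
}

console.log(errorResponseFor({ timedOut: false, status: 503 }));
console.log(errorResponseFor({ timedOut: true }));
```

How the timeout itself is enforced (for example, an abort signal or a socket timeout set to `SEARCH_TIMEOUT_MS`) is left to the implementation.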

<loc> URL Format

Each <loc> element is constructed as:

{proxyBaseUrl}?kmeURL={encodeURIComponent(item['vkm:url'])}

Where:

  • proxyBaseUrl is taken from kme_CSA_settings.proxyBaseUrl (e.g., https://adapter.example.com)
  • item['vkm:url'] is the raw vkm:url value from the search service result
  • encodeURIComponent percent-encodes the value so it is safe as a query parameter

Example:

https://adapter.example.com?kmeURL=https%3A%2F%2Fkme.example.com%2Fknowledge%2Farticle-123
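The construction above, including the rule that only items with a non-empty vkm:url produce an entry, might look like the following. `buildLocs` is a hypothetical name, and `items` is assumed to be the array of search results (the hydra:member entries) already extracted from the response.

```javascript
// Hypothetical helper, not taken from the adapter source.
// Builds one <loc> value per item that carries a non-empty vkm:url.
function buildLocs(items, proxyBaseUrl) {
  return items
    .map((item) => item['vkm:url'])
    .filter((vkmUrl) => typeof vkmUrl === 'string' && vkmUrl.length > 0)
    .map((vkmUrl) => `${proxyBaseUrl}?kmeURL=${encodeURIComponent(vkmUrl)}`);
}

const locs = buildLocs(
  [
    { 'vkm:url': 'https://kme.example.com/knowledge/article-123' },
    { 'vkm:url': '' }, // skipped: empty vkm:url
    {},                // skipped: no vkm:url at all
  ],
  'https://adapter.example.com'
);
console.log(locs);
// → [ 'https://adapter.example.com?kmeURL=https%3A%2F%2Fkme.example.com%2Fknowledge%2Farticle-123' ]
```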

Authentication to Upstream (internal, not exposed to consumer)

The adapter authenticates to the KME Knowledge Search Service using:

Authorization: OIDC_id_token <token>

Where <token> is the id_token from the OIDC token service, cached in Redis at authorization.token. Token refresh uses the same stampede-guarded fetch already present in the existing OIDC auth flow.
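For illustration, the upstream request headers could be assembled as below once the token has been read from the cache. `upstreamAuthHeaders` is a hypothetical name, and rejecting an empty token is an assumption; the Redis lookup and the stampede-guarded refresh are deliberately left out of this sketch.

```javascript
// Hypothetical helper, not taken from the adapter source.
// Builds the headers for the call to the KME Knowledge Search Service.
function upstreamAuthHeaders(idToken) {
  if (typeof idToken !== 'string' || idToken.length === 0) {
    // Assumption: a missing token is a caller error and should trigger a refresh upstream of this helper.
    throw new Error('id_token is missing');
  }
  return { Authorization: `OIDC_id_token ${idToken}` };
}

console.log(upstreamAuthHeaders('example-token'));
```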


Existing Endpoint Behaviour (unchanged)

All requests whose URL does not end in /sitemap.xml continue to use the existing OIDC authentication flow with no change in response behaviour:

| Condition | Response |
| --- | --- |
| Valid cached OIDC token | 200 Authorized (text/plain) |
| No cached token; fetch succeeds | 200 Authorized (text/plain) |
| Token service unreachable | 401 Unauthorized: <error> (text/plain) |

Non-Functional Constraints

| Constraint | Value | Source |
| --- | --- | --- |
| Search API timeout | 10 000 ms | Spec assumption |
| Max response time (normal conditions) | < 5 000 ms | SC-001 |
| Max response time (error scenarios) | < 10 000 ms | SC-005 |
| Pagination | Not supported (v1) | Spec assumption |
| Multi-tenant | Not supported (v1) | Spec assumption |

Sitemap Protocol Compliance

The returned XML must validate against the Sitemaps XSD: https://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd

Required elements per entry (v1 scope):

  • <loc> — mandatory

Optional elements not included in v1:

  • <lastmod> — out of scope
  • <changefreq> — out of scope
  • <priority> — out of scope