# Changelog

All notable changes to this project will be documented in this file.
The format follows [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

---
## [Unreleased]
---
## [0.4.0] - 2026-04-23
### Added
- Sitemap pagination via `hydra:view['hydra:last']`: after the first search page, all subsequent pages are fetched in parallel using the correct 0-based item-index `start` model (`start = size, 2×size, …, lastStart`); when all results fit on one page (`hydra:view` absent) no additional requests are made
- Latest `vkm:datePublished` selection per `SearchResultItem`: when a search result contains multiple content fragments, only the fragment with the most recent `vkm:datePublished` is included in the sitemap; fragments without a date are treated as epoch 0
- Sitemap URL cap: output is limited to 50,000 `<loc>` entries per the [Sitemaps protocol](https://www.sitemaps.org/protocol.html); a `warn` log is emitted when results are truncated
- Full HTML document wrapper for content fetch responses: body is now `<!DOCTYPE html><html><head><title>…</title></head><body>…</body></html>` instead of a bare `articleBody` fragment
- `<title>` element populated from the `vkm:name` field of the fetched article (empty `<title></title>` when `vkm:name` is absent)
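
The pagination and fragment-selection rules above can be sketched roughly as follows. This is an illustration, not the adapter's code: the helper names are hypothetical, and it assumes that `hydra:last` is a URL whose `start` query parameter carries the 0-based item index of the final page.

```javascript
// Assumption: `hydra:last` is a (possibly relative) URL with a `start` query
// parameter. The placeholder base is only there so relative URLs parse.
function lastStartFromView(view) {
  if (!view || typeof view['hydra:last'] !== 'string') return null;
  const url = new URL(view['hydra:last'], 'http://placeholder.invalid');
  const start = Number.parseInt(url.searchParams.get('start'), 10);
  return Number.isNaN(start) ? null : start;
}

// Starts for every page after the first: size, 2*size, ..., lastStart.
// Returns an empty list when there is no `hydra:view` (single page).
function remainingStarts(size, lastStart) {
  const starts = [];
  if (lastStart === null || size <= 0) return starts;
  for (let start = size; start <= lastStart; start += size) starts.push(start);
  return starts;
}

// Fragments without a parseable `vkm:datePublished` rank as epoch 0.
function publishedMillis(fragment) {
  const ms = Date.parse(fragment['vkm:datePublished']);
  return Number.isNaN(ms) ? 0 : ms;
}

// Picks the fragment with the most recent date; the first one wins on ties.
function latestFragment(fragments) {
  return fragments.reduce(
    (best, f) => (best === null || publishedMillis(f) > publishedMillis(best) ? f : best),
    null,
  );
}

const view = { 'hydra:last': '/search?start=30&size=10' };
console.log(remainingStarts(10, lastStartFromView(view))); // [ 10, 20, 30 ]
```

Each value returned by `remainingStarts` would correspond to one request, so the remaining pages can be fetched together with `Promise.all`.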
### Changed
- `oidcAuthFlow` route removed: requests that do not match `?kmeURL=` or `/sitemap.xml` now return `404 Not Found`
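
As a rough sketch of the resulting routing decision: `selectRoute` and its return labels are hypothetical names used only for illustration, and checking `kmeURL` before the path is an assumption about precedence.

```javascript
// Hypothetical helper: names the branch a request would take after 0.4.0.
function selectRoute(requestUrl) {
  // The placeholder base lets path-only request URLs parse.
  const url = new URL(requestUrl, 'http://placeholder.invalid');
  if (url.searchParams.has('kmeURL')) return 'contentFetch';
  if (url.pathname.endsWith('/sitemap.xml')) return 'sitemap';
  return 'notFound'; // answered with 404 Not Found
}

console.log(selectRoute('/sitemap.xml')); // sitemap
console.log(selectRoute('/anything-else')); // notFound
```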
### Fixed
- `proxyBaseUrl` is now derived dynamically from the incoming request (`X-Forwarded-Proto`, `X-Forwarded-Host`, `Host` headers) rather than read from settings, ensuring correct `<loc>` URLs in all deployment environments
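
One way this derivation could look is sketched below. The function name is hypothetical, and three details are assumptions rather than documented behaviour: header keys are lowercased (as Node's `http` module provides them), the first entry of a comma-separated forwarded value is used, and the scheme falls back to `http` when `X-Forwarded-Proto` is absent.

```javascript
// Hypothetical helper: builds the base URL used for <loc> entries from request headers.
function deriveProxyBaseUrl(headers) {
  // Forwarded headers may hold a comma-separated chain; take the first entry.
  const first = (value) => (typeof value === 'string' ? value.split(',')[0].trim() : '');
  const proto = first(headers['x-forwarded-proto']) || 'http';
  const host = first(headers['x-forwarded-host']) || first(headers['host']);
  return host ? `${proto}://${host}` : null;
}

console.log(deriveProxyBaseUrl({
  'x-forwarded-proto': 'https',
  'x-forwarded-host': 'kb.example.com',
  host: '10.0.0.5:3000',
})); // https://kb.example.com
```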
---
## [0.3.0] - 2026-04-23
### Added
- `GET /?kmeURL=<upstream-article-url>` content fetch endpoint: fetches a KME article by URL and returns its `vkm:articleBody` as `200 text/html; charset=utf-8`
- `contentFetchFlow()` async function in `kmeContentSourceAdapter.js` — URL routing branch, 9-step implementation: validates `kmeURL` parameter (400 for missing/blank/malformed/non-http), acquires OIDC token via `getValidToken` (502 on failure), fetches upstream article with 10-second timeout, handles all error paths (4xx upstream → 404, 5xx/timeout/network → 502, unparseable body → 502, missing/empty `vkm:articleBody` → 404)
- URL routing updated: `?kmeURL=` present → `contentFetchFlow()`, `/sitemap.xml` → `sitemapFlow()`, otherwise → `oidcAuthFlow()` (passthrough, FR-012 preserved)
- `extractArticleBody(data)` pure helper in `kmeContentSourceAdapterHelpers.js` — returns `data['vkm:articleBody']` if non-empty non-whitespace string, otherwise `null`; guards against null/non-object input
- Unit test describe blocks in `tests/unit/proxy.test.js`: `extractArticleBody helper` (7 edge-case tests), `US-content-fetch: happy path` (2 tests), `US-content-fetch: input validation` (6 tests), `US-content-fetch: upstream errors` (7 tests), `US-content-fetch: body parsing` (5 tests), `US-content-fetch: passthrough preserved` (1 test)
- Contract tests in `tests/contract/proxy-http.test.js`: `content fetch: happy path` (full round-trip 200 + SC-001 timing), `content fetch: error handling` (upstream 404 → 404, upstream 503 → 502, server hang → 502 within 12s)
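
The `extractArticleBody` contract and the `kmeURL` validation described above might be sketched like this. The body of `extractArticleBody` follows the behaviour stated in this entry; `isValidKmeUrl` is a hypothetical name, and treating both `http:` and `https:` as acceptable is an assumption.

```javascript
// Returns the article body only when it is a non-empty, non-whitespace string.
function extractArticleBody(data) {
  if (data === null || typeof data !== 'object') return null;
  const body = data['vkm:articleBody'];
  return typeof body === 'string' && body.trim() !== '' ? body : null;
}

// Hypothetical validator: a false result would map to a 400 response.
function isValidKmeUrl(value) {
  if (typeof value !== 'string' || value.trim() === '') return false;
  try {
    const { protocol } = new URL(value);
    return protocol === 'http:' || protocol === 'https:';
  } catch {
    return false; // malformed URL
  }
}

console.log(extractArticleBody({ 'vkm:articleBody': '<p>Hello</p>' })); // <p>Hello</p>
console.log(extractArticleBody({ 'vkm:articleBody': '   ' })); // null
console.log(isValidKmeUrl('ftp://example.com/article')); // false
```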
---
## [0.2.0] - 2026-04-23
### Added
- `GET /sitemap.xml` endpoint: returns a well-formed XML Sitemap (Sitemaps protocol 0.9) containing one `<url><loc>` per knowledge item from the KME Knowledge Search Service
- `sitemapFlow()` async function in `kmeContentSourceAdapter.js` — settings validation, OIDC token reuse, search API call, XML build via `xmlbuilder2`, 10-second timeout, 502/504/500 error responses
- `getValidToken()` shared helper extracted from the existing OIDC auth flow — used by both sitemap and non-sitemap paths
- URL routing at IIFE entry point: requests ending in `/sitemap.xml` → `sitemapFlow()`, all others → `oidcAuthFlow()`
- Three new fields in `src/globalVariables/kme_CSA_settings.json`: `searchApiBaseUrl`, `tenant`, `proxyBaseUrl`
- Three new placeholder fields in `src/globalVariables/kme_CSA_settings.json.example`
- Unit tests for sitemap flow: happy path (items present), empty results, `vkm:url` filtering, 502/504/500 error scenarios, non-sitemap regression tests
- Contract tests for sitemap endpoint: full round-trip 200, empty results 200, 502 upstream error, 504 timeout
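
For orientation, the shape of the sitemap document can be sketched with plain string templating. Note that this deliberately swaps in string concatenation for `xmlbuilder2`, which the adapter actually uses, so that the example has no dependencies; `buildSitemap` and `escapeXml` are hypothetical names.

```javascript
const SITEMAP_NS = 'http://www.sitemaps.org/schemas/sitemap/0.9';

// Escapes the five characters that the Sitemaps protocol requires to be entity-escaped.
function escapeXml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Builds one <url><loc> element per URL string in `locs`.
function buildSitemap(locs) {
  const entries = locs.map((loc) => `  <url><loc>${escapeXml(loc)}</loc></url>`);
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    `<urlset xmlns="${SITEMAP_NS}">`,
    ...entries,
    '</urlset>',
  ].join('\n');
}

console.log(buildSitemap(['https://kb.example.com/?kmeURL=a&lang=en']));
```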
---
## [0.1.0] - 2026-04-23
### Added
- `src/proxyScripts/kmeContentSourceAdapter.js` — OIDC authentication proxy script running in a Node.js VM sandbox (zero imports/exports)
- `src/globalVariables/kme_CSA_settings.json` — OIDC credentials and token endpoint configuration (gitignored)
- `src/globalVariables/kme_CSA_settings.json.example` — placeholder settings file for version control
- Redis-backed token cache (`authorization` hash, fields `token` and `expiry`) — token persists across adapter restarts
- Token stampede guard via in-process `_pendingFetch` promise — only one token fetch in-flight at a time
- Absolute Unix epoch expiry check (`Date.now() / 1000 < expiry`)
- `200 OK / Authorized` response on successful authentication
- `401 Unauthorized` response with descriptive message on auth failure (bad credentials, timeout, unreachable service)
- 5-second timeout on OIDC token POST requests
- Structured logging throughout proxy script using `console.debug`, `console.info`, and `console.error`
- `redis` dependency wired into VM context via `createClient().connect()` in `server.js`
- Unit tests (`tests/unit/proxy.test.js`) — 12 tests covering US1, US2, US3, and stampede guard
- Contract tests (`tests/contract/proxy-http.test.js`) — 2 tests covering HTTP 200/401 response shape
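
The expiry check and the stampede guard listed above can be illustrated with a minimal sketch. Only `_pendingFetch` and the `Date.now() / 1000 < expiry` comparison come from this changelog; `isTokenFresh`, `getTokenOnce`, and the `fetchToken` callback are hypothetical names, and the Redis cache is left out.

```javascript
let _pendingFetch = null;

// `expiry` is an absolute Unix timestamp in seconds.
function isTokenFresh(expiry, nowMs = Date.now()) {
  return nowMs / 1000 < expiry;
}

// `fetchToken` is a hypothetical async function that performs the token POST.
// Concurrent callers share one in-flight promise instead of each starting a request.
function getTokenOnce(fetchToken) {
  if (_pendingFetch === null) {
    _pendingFetch = Promise.resolve()
      .then(fetchToken)
      .finally(() => {
        _pendingFetch = null; // allow a new fetch once this one settles
      });
  }
  return _pendingFetch;
}

let calls = 0;
const fakeFetch = async () => {
  calls += 1;
  return 'token-value';
};

Promise.all([getTokenOnce(fakeFetch), getTokenOnce(fakeFetch)]).then(() => {
  console.log(calls); // 1
});
```

Because the shared promise is cleared in `finally`, a failed fetch does not block later attempts.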
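
[Unreleased]: https://github.com/your-org/kme-content-adapter/compare/v0.4.0...HEAD
[0.4.0]: https://github.com/your-org/kme-content-adapter/compare/v0.3.0...v0.4.0
[0.3.0]: https://github.com/your-org/kme-content-adapter/compare/v0.2.0...v0.3.0
[0.2.0]: https://github.com/your-org/kme-content-adapter/compare/v0.1.0...v0.2.0
[0.1.0]: https://github.com/your-org/kme-content-adapter/releases/tag/v0.1.0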