kme_content_adapter/specs/003-kme-content-fetch/quickstart.md
Peter.Morton f840587e5e feat: content fetch, sitemap fixes, remove oidcAuthFlow
- Add contentFetchFlow() to proxy (FR-001 through FR-012)
- Add extractArticleBody() helper with vkm:articleBody / articleBody fallback
- Dynamic proxyBaseUrl derivation from x-forwarded-proto/host headers
- Forward query/size/category params on /sitemap.xml requests
- Add Accept: application/ld+json header to content API calls
- Remove oidcAuthFlow() - unmatched requests now return 404 Not Found
- Fix xmlbuilder2 import: default import, call as xmlbuilder2.create(...)
- Version bump 0.2.0 → 0.3.0
- 45/45 tests passing

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-23 16:40:06 -05:00


Quickstart: KME Article Content Fetch (003)

Feature branch: 003-kme-content-fetch

This guide explains how to develop and test the content-fetch feature locally.


Prerequisites

  • Node.js ≥18
  • A running Redis instance (default: localhost:6379)
  • kme_CSA_settings.json populated (see src/globalVariables/kme_CSA_settings.json.example)
  • npm install already run

Running the Proxy

npm run dev        # start with --watch (auto-restart on changes)
npm start          # start with jq log formatting

Testing the Content-Fetch Route

Happy path (requires a real or stubbed KME Content Service)

curl -s "http://localhost:3000/?kmeURL=https://content.kme.example/articles/123"
# Expected: 200 OK, Content-Type: text/html, body = <p>Article HTML...</p>
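
If no real KME Content Service is available, a small stub can stand in for it. The following is a minimal sketch, not part of the repository: the function names, the port, and the JSON-LD payload shape (a top-level vkm:articleBody property) are assumptions made for illustration.

```javascript
const http = require('node:http');

// Hypothetical JSON-LD payload; the real KME response shape may differ.
function stubArticle(id) {
  return {
    '@type': 'Article',
    'vkm:articleBody': `<p>Stub article ${id}</p>`,
  };
}

// Builds (but does not start) a server that answers every request
// with a stub article named after the last path segment.
function createStubServer() {
  return http.createServer((req, res) => {
    const id = new URL(req.url, 'http://localhost').pathname.split('/').pop();
    res.writeHead(200, { 'Content-Type': 'application/ld+json' });
    res.end(JSON.stringify(stubArticle(id)));
  });
}

// To use it, start the stub on a free port (4000 is an arbitrary choice):
// createStubServer().listen(4000);
```

With the stub listening, kmeURL can point at it, for example http://localhost:4000/articles/123. The proxy still has to obtain a token before it calls the stub.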

Bad input — missing kmeURL

curl -s -o /dev/null -w "%{http_code}" "http://localhost:3000/"
# Expected: 404 (no kmeURL, so no route matches; oidcAuthFlow() has been removed)

curl -s -o /dev/null -w "%{http_code}" "http://localhost:3000/?kmeURL="
# Expected: 400

curl -s -o /dev/null -w "%{http_code}" "http://localhost:3000/?kmeURL=not-a-url"
# Expected: 400

curl -s -o /dev/null -w "%{http_code}" "http://localhost:3000/?kmeURL=ftp://example.com/article"
# Expected: 400

Existing sitemap route (unchanged)

curl -s -o /dev/null -w "%{http_code}" "http://localhost:3000/sitemap.xml"
# Expected: 200 (sitemap XML) or 502 if KME Search API is unreachable

Running Tests

npm run test:unit       # unit tests (mocked axios and Redis)
npm run test:contract   # contract tests (real HTTP servers, fake Redis)
npm test                # all tests

Running a single test file

node --test tests/unit/proxy.test.js
node --test tests/contract/proxy-http.test.js

Key Files Modified by This Feature

| File | Change |
|---|---|
| src/proxyScripts/kmeContentSourceAdapter.js | Add contentFetchFlow() function; add routing branch else if (searchParams.has('kmeURL')) |
| src/globalVariables/kmeContentSourceAdapterHelpers.js | Add extractArticleBody(data) function; export it in return { ... } |
| tests/unit/proxy.test.js | Add describe blocks for content-fetch unit tests and extractArticleBody helper tests |
| tests/contract/proxy-http.test.js | Add contract tests for content-fetch (real mock HTTP servers) |
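
For orientation, the extractArticleBody(data) helper could look roughly like the sketch below. It only illustrates the vkm:articleBody / articleBody fallback. The assumptions that both properties sit at the top level of the response and that blank strings count as missing are mine, not confirmed by the source.

```javascript
// Sketch only: returns the article HTML string, or null when none is found.
function extractArticleBody(data) {
  if (data === null || typeof data !== 'object') return null;

  // Prefer the namespaced property, then fall back to the plain one.
  for (const key of ['vkm:articleBody', 'articleBody']) {
    const value = data[key];
    // Assumption: a blank string is treated the same as a missing body.
    if (typeof value === 'string' && value.trim() !== '') return value;
  }
  return null;
}
```

Returning null rather than throwing lets the caller decide which status to send when no body is present.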

Architecture Reminder

The proxy runs inside a Node.js vm.Script / vm.createContext sandbox, so kmeContentSourceAdapter.js must not contain any import or export statements. All dependencies arrive via the injected context:

| Variable | What it is |
|---|---|
| axios | HTTP client: axios.get(url, { headers, timeout }) |
| kmeContentSourceAdapterHelpers | Helpers object: getValidToken(), extractArticleBody(), validateSettings() |
| kme_CSA_settings | OIDC + service settings from src/globalVariables/kme_CSA_settings.json |
| URL, URLSearchParams | WHATWG URL API, for parsing req.url and validating kmeURL |
| console | Structured logger: console.info/debug/error({ message, ... }) |
| req, res | Node.js HTTP request/response |
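
As an illustration of the injected-context model, here is a minimal sketch of running an import-free script in a vm context. The script source, the fake req/res objects, and the runInSandbox name are hypothetical. The real server.js injects more (axios, helpers, settings) and may be wired differently.

```javascript
const vm = require('node:vm');

// Hypothetical stand-in for the proxy script: no imports or exports,
// it only touches names injected through the context.
const scriptSource = `
  const { searchParams } = new URL(req.url, 'http://localhost');
  res.statusCode = searchParams.has('kmeURL') ? 200 : 404;
  res.end(searchParams.get('kmeURL') ?? 'Not Found');
`;

// Runs the script against minimal fake req/res objects so the sketch
// works without an HTTP server.
function runInSandbox(url) {
  const req = { url };
  const res = {
    statusCode: 0,
    body: '',
    end(chunk) { this.body = chunk; },
  };
  // Everything the script may use has to be listed here explicitly.
  const context = vm.createContext({ req, res, URL, URLSearchParams, console });
  new vm.Script(scriptSource).runInContext(context);
  return res;
}
```

Any name missing from the context object is simply undefined inside the script, which is why the table above matters when adding a new dependency.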

The helpers file (kmeContentSourceAdapterHelpers.js) is a literal function body — it ends with return { ... } and contains no import/export statements. server.js wraps it as an IIFE.
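
The function-body convention can be sketched as follows. This is an assumption about the general technique, not a copy of server.js: the loadHelpers name, the wrapping string, and the placeholder helper are all illustrative.

```javascript
const vm = require('node:vm');

// Hypothetical helpers source: a bare function body ending in `return { ... }`.
// The helper inside is a simplified placeholder, not the real implementation.
const helpersBody = `
  function extractArticleBody(data) {
    return (data && data['vkm:articleBody']) || null;
  }
  return { extractArticleBody };
`;

// Wrapping the body in an immediately invoked function makes the
// top-level `return` legal and yields the helpers object.
function loadHelpers(body, injected = {}) {
  const wrapped = `(function () {\n${body}\n})()`;
  return vm.runInContext(wrapped, vm.createContext({ ...injected }));
}

const helpers = loadHelpers(helpersBody);
```

The resulting helpers object can then be placed into the proxy's context under the name kmeContentSourceAdapterHelpers.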


Content-Fetch Flow Summary

Request: GET /?kmeURL=https://content.kme.example/articles/123

1. Routing: req.url has ?kmeURL= → contentFetchFlow()
2. Extract kmeURL: new URL(req.url, 'http://localhost').searchParams.get('kmeURL')
3. Validate kmeURL: empty → 400; malformed / non-http(s) → 400
4. getValidToken() → OIDC id_token (from Redis cache or fresh fetch)
5. axios.get(kmeURL, { headers: { Authorization: 'OIDC_id_token {token}', Accept: 'application/ld+json' }, timeout: 10000 })
6. Error handling: 4xx upstream → 404; 5xx/timeout/network → 502
7. extractArticleBody(response.data) → vkm:articleBody string (falling back to articleBody) or null
8. null → 404; string → 200 text/html
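
Steps 2, 3, and 6 above can be sketched as two small pure functions. The names validateKmeUrl and statusForUpstreamError and the result-object shape are hypothetical. The sketch assumes axios-style errors, where err.response is set only when the upstream actually answered.

```javascript
// Steps 2-3: extract and validate kmeURL.
// Returns { ok: true, url } or { ok: false, status: 400 }.
function validateKmeUrl(reqUrl) {
  const raw = new URL(reqUrl, 'http://localhost').searchParams.get('kmeURL');
  if (!raw) return { ok: false, status: 400 }; // empty or absent

  let parsed;
  try {
    parsed = new URL(raw);
  } catch {
    return { ok: false, status: 400 }; // malformed
  }
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    return { ok: false, status: 400 }; // non-http(s) scheme
  }
  return { ok: true, url: parsed.href };
}

// Step 6: map an upstream failure to the status the proxy returns.
// 4xx upstream becomes 404; anything else (5xx, timeout, network) becomes 502.
function statusForUpstreamError(err) {
  const upstream = err && err.response ? err.response.status : undefined;
  return upstream >= 400 && upstream < 500 ? 404 : 502;
}
```

Keeping these checks free of req/res side effects makes them easy to cover in the unit tests without mocking axios or Redis.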