- Add contentFetchFlow() to proxy (FR-001 through FR-012)
- Add extractArticleBody() helper with vkm:articleBody / articleBody fallback
- Dynamic proxyBaseUrl derivation from x-forwarded-proto/host headers
- Forward query/size/category params on /sitemap.xml requests
- Add Accept: application/ld+json header to content API calls
- Remove oidcAuthFlow() - unmatched requests now return 404 Not Found
- Fix xmlbuilder2 import: default import, call as xmlbuilder2.create(...)
- Version bump 0.2.0 → 0.3.0
- 45/45 tests passing

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Quickstart: KME Article Content Fetch (003)
Feature branch: 003-kme-content-fetch
This guide explains how to develop and test the content-fetch feature locally.
Prerequisites
- Node.js ≥18
- A running Redis instance (default: localhost:6379)
- kme_CSA_settings.json populated (see src/globalVariables/kme_CSA_settings.json.example)
- npm install already run
Running the Proxy
npm run dev # start with --watch (auto-restart on changes)
npm start # start with jq log formatting
Testing the Content-Fetch Route
Happy path (requires a real or stubbed KME Content Service)
curl -s "http://localhost:3000/?kmeURL=https://content.kme.example/articles/123"
# Expected: 200 OK, Content-Type: text/html, body = <p>Article HTML...</p>
Bad input: missing or invalid kmeURL
curl -s -o /dev/null -w "%{http_code}" "http://localhost:3000/"
# Expected: 404 (no kmeURL and no other matching route; oidcAuthFlow() has been removed)
curl -s -o /dev/null -w "%{http_code}" "http://localhost:3000/?kmeURL="
# Expected: 400
curl -s -o /dev/null -w "%{http_code}" "http://localhost:3000/?kmeURL=not-a-url"
# Expected: 400
curl -s -o /dev/null -w "%{http_code}" "http://localhost:3000/?kmeURL=ftp://example.com/article"
# Expected: 400
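The 400 cases above can be reproduced without a running proxy. The following is a minimal sketch of the validation they imply, assuming the check is built on the WHATWG URL constructor. The function name validateKmeUrl, its return shape, and the reason strings are hypothetical and not taken from the adapter source.

```javascript
// Hypothetical helper: classify a raw kmeURL query value the way the
// curl examples expect. Not taken from the actual adapter source.
function validateKmeUrl(raw) {
  // Missing or empty value (e.g. "/?kmeURL=") is a client error.
  if (raw === null || raw === undefined || raw.trim() === '') {
    return { ok: false, status: 400, reason: 'kmeURL is empty' };
  }
  let parsed;
  try {
    // With no base URL supplied, the URL constructor throws on
    // malformed or relative input such as "not-a-url".
    parsed = new URL(raw);
  } catch {
    return { ok: false, status: 400, reason: 'kmeURL is not a valid URL' };
  }
  // Only http and https targets are accepted.
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    return { ok: false, status: 400, reason: 'kmeURL must use http or https' };
  }
  return { ok: true, url: parsed.href };
}

// Pull the parameter out of a request path, then validate it.
const reqUrl = '/?kmeURL=ftp://example.com/article';
const raw = new URL(reqUrl, 'http://localhost').searchParams.get('kmeURL');
console.log(validateKmeUrl(raw).status);
```

Returning a result object instead of writing to res directly keeps the check easy to unit-test; the caller decides how to turn the status into an HTTP response.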
Existing sitemap route (unchanged)
curl -s -o /dev/null -w "%{http_code}" "http://localhost:3000/sitemap.xml"
# Expected: 200 (sitemap XML) or 502 if KME Search API is unreachable
Running Tests
npm run test:unit # unit tests (mocked axios and Redis)
npm run test:contract # contract tests (real HTTP servers, real Redis fake)
npm test # all tests
Running a single test file
node --test tests/unit/proxy.test.js
node --test tests/contract/proxy-http.test.js
Key Files Modified by This Feature
| File | Change |
|---|---|
| src/proxyScripts/kmeContentSourceAdapter.js | Add contentFetchFlow() function; add routing branch else if (searchParams.has('kmeURL')) |
| src/globalVariables/kmeContentSourceAdapterHelpers.js | Add extractArticleBody(data) function; export it in return { ... } |
| tests/unit/proxy.test.js | Add describe blocks for content-fetch unit tests and extractArticleBody helper tests |
| tests/contract/proxy-http.test.js | Add contract tests for content-fetch (real mock HTTP servers) |
Architecture Reminder
The proxy runs inside a Node.js vm.Script / vm.createContext sandbox — zero imports or
exports are permitted in kmeContentSourceAdapter.js. All dependencies arrive via the injected
context:
| Variable | What it is |
|---|---|
| axios | HTTP client: axios.get(url, { headers, timeout }) |
| kmeContentSourceAdapterHelpers | Helpers object: getValidToken(), extractArticleBody(), validateSettings() |
| kme_CSA_settings | OIDC + service settings from src/globalVariables/kme_CSA_settings.json |
| URL, URLSearchParams | WHATWG URL API, for parsing req.url and validating kmeURL |
| console | Structured logger: console.info/debug/error({ message, ... }) |
| req, res | Node.js HTTP request/response |
The helpers file (kmeContentSourceAdapterHelpers.js) is a literal function body — it ends
with return { ... } and contains no import/export statements. server.js wraps it as an IIFE.
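To make the sandbox model concrete, here is a minimal sketch of how a function-body helpers file and an import-free script could be wired together with node:vm. It is an illustration under assumptions, not a copy of server.js: the inline source strings, the payload and result variables, and the one-line extractArticleBody are stand-ins invented for the example.

```javascript
const vm = require('node:vm');

// Stand-in for the text of the helpers file: a bare function body that
// ends with `return { ... }` and has no import/export statements.
const helpersSource = `
  function extractArticleBody(data) {
    return (data && data['vkm:articleBody']) || null;
  }
  return { extractArticleBody };
`;

// Wrap the function body as an IIFE so the trailing `return` is legal,
// then evaluate it to obtain the helpers object.
const helpers = vm.runInNewContext(`(function () {${helpersSource}})()`);

// Stand-in for the adapter script: it uses only injected globals.
const adapterSource = `
  const body = kmeContentSourceAdapterHelpers.extractArticleBody(payload);
  console.info({ message: 'extracted', found: body !== null });
  result.body = body;
`;

// Everything the script may touch is placed on the context explicitly.
const result = {};
const context = vm.createContext({
  kmeContentSourceAdapterHelpers: helpers,
  payload: { 'vkm:articleBody': '<p>Hello</p>' },
  console,
  result,
});

new vm.Script(adapterSource).runInContext(context);
console.log(result.body);
```

Because the script sees only what the context object exposes, a require or import inside it would fail, which is why every dependency in the table above has to be injected.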
Content-Fetch Flow Summary
Request: GET /?kmeURL=https://content.kme.example/articles/123
1. Routing: req.url has ?kmeURL= → contentFetchFlow()
2. Extract kmeURL: new URL(req.url, 'http://localhost').searchParams.get('kmeURL')
3. Validate kmeURL: empty → 400; malformed / non-http(s) → 400
4. getValidToken() → OIDC id_token (from Redis cache or fresh fetch)
5. axios.get(kmeURL, { headers: { Authorization: 'OIDC_id_token {token}' }, timeout: 10000 })
6. Error handling: 4xx upstream → 404; 5xx/timeout/network → 502
7. extractArticleBody(response.data) → vkm:articleBody string (falling back to articleBody) or null
8. null → 404; string → 200 text/html
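The eight steps can be sketched as a single async function. This is a hedged illustration, not the adapter's actual code: it takes its dependencies as a deps argument instead of reading vm globals, returns a plain result object instead of writing to res, and the error body strings and the stubbed axios and helpers are assumptions made so the example runs without a network or Redis.

```javascript
// Hypothetical sketch of the eight steps; `deps` stands in for the
// values the real adapter receives through the vm context.
async function contentFetchFlow(reqUrl, deps) {
  const { axios, helpers } = deps;

  // Steps 2-3: extract and validate kmeURL.
  const kmeURL = new URL(reqUrl, 'http://localhost').searchParams.get('kmeURL');
  if (!kmeURL) return { status: 400, body: 'kmeURL is required' };
  let target;
  try {
    target = new URL(kmeURL);
  } catch {
    return { status: 400, body: 'kmeURL is malformed' };
  }
  if (!['http:', 'https:'].includes(target.protocol)) {
    return { status: 400, body: 'kmeURL must be http(s)' };
  }

  // Step 4: obtain the OIDC id_token.
  const token = await helpers.getValidToken();

  // Steps 5-6: call upstream and map failures to 404 or 502.
  let response;
  try {
    response = await axios.get(target.href, {
      headers: { Authorization: `OIDC_id_token ${token}` },
      timeout: 10000,
    });
  } catch (err) {
    // axios sets err.response only when the server actually answered;
    // timeouts and network failures leave it undefined.
    const upstream = err.response ? err.response.status : null;
    const is4xx = upstream !== null && upstream >= 400 && upstream < 500;
    return { status: is4xx ? 404 : 502, body: 'upstream error' };
  }

  // Steps 7-8: extract the article body; null means not found.
  const html = helpers.extractArticleBody(response.data);
  if (html === null) return { status: 404, body: 'no article body' };
  return { status: 200, contentType: 'text/html', body: html };
}

// Stubbed dependencies (assumptions for the demo only).
const stubDeps = {
  axios: { get: async () => ({ data: { 'vkm:articleBody': '<p>Article</p>' } }) },
  helpers: {
    getValidToken: async () => 'fake-token',
    extractArticleBody: (data) => data['vkm:articleBody'] ?? data.articleBody ?? null,
  },
};

contentFetchFlow('/?kmeURL=https://content.kme.example/articles/123', stubDeps)
  .then((r) => console.log(r.status, r.body));
```

Swapping stubDeps.axios.get for a function that rejects with { response: { status: 403 } } or with a plain Error exercises the 404 and 502 branches respectively.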