---
description: "Task list for KME Article Content Fetch (003)"
---

# Tasks: KME Article Content Fetch

**Input**: Design documents from `specs/003-kme-content-fetch/`

**Prerequisites**: plan.md ✅, spec.md ✅, research.md ✅, data-model.md ✅, contracts/http-content-fetch.md ✅, quickstart.md ✅

**Architecture constraints**:

- Zero new files in `src/` — only `src/proxyScripts/kmeContentSourceAdapter.js` and `src/globalVariables/kmeContentSourceAdapterHelpers.js` are modified
- VM sandbox: zero `import`/`export` statements in proxy script or helpers file
- Helpers file is a literal function body (ends with `return { ... }`) — new function added before that block
- Tests use Node.js built-in test runner (`node:test`)

**Files in scope**:

| File | Change |
|------|--------|
| `src/globalVariables/kmeContentSourceAdapterHelpers.js` | Add `extractArticleBody(data)`; export in `return { ... }` |
| `src/proxyScripts/kmeContentSourceAdapter.js` | Add `contentFetchFlow()`; add routing branch |
| `tests/unit/proxy.test.js` | Add content-fetch describe blocks and helper tests |
| `tests/contract/proxy-http.test.js` | Add content-fetch contract tests |
| `CHANGELOG.md` | Add feature entry |

## Format: `[ID] [P?] [Story?] Description`

- **[P]**: Can run in parallel (different files, no dependencies on incomplete tasks)
- **[Story]**: Which user story this task belongs to (US1–US4)
- All file paths are relative to repository root

---

## Phase 1: Setup

**Purpose**: Confirm baseline before any modifications

- [X] T001 Run `npm test` from repository root to confirm all existing tests pass and record the baseline count

**Checkpoint**: Baseline confirmed — no pre-existing failures

---

## Phase 2: Foundational (Blocking Prerequisite)

**Purpose**: Add `extractArticleBody` pure helper — required by `contentFetchFlow()` in every user story phase

**⚠️ CRITICAL**: Phase 3 implementation cannot begin until T002 and T003 are complete; T004 is independently testable after T002+T003

- [X] T002 Add `extractArticleBody(data)` function body to `src/globalVariables/kmeContentSourceAdapterHelpers.js` — insert immediately before the existing `return { ... }` block; implementation: guard for non-object input (`if (!data || typeof data !== 'object') return null`), extract `data['vkm:articleBody']`, return null if field is null/undefined/non-string/empty/whitespace, otherwise return the string
- [X] T003 Add `extractArticleBody` to the exports in the `return { ... }` block at the bottom of `src/globalVariables/kmeContentSourceAdapterHelpers.js` so the injected VM context exposes the new function
- [X] T004 [P] Add `extractArticleBody helper` describe block to `tests/unit/proxy.test.js` covering all 7 edge cases per data-model.md: valid HTML string → returns string; empty string → null; whitespace-only string → null; null field value → null; field absent (`{}`) → null; null input → null; non-object input (string) → null — no mocking needed, call the helper directly

**Checkpoint**: `extractArticleBody` is implemented, exported, and unit-tested; run `npm run test:unit` to confirm T004 passes

---

## Phase 3: User Story 1 — Happy Path Article Fetch (Priority: P1) 🎯 MVP

**Goal**: Proxy receives a valid `?kmeURL=` request, obtains an OIDC token, fetches the upstream article, extracts `vkm:articleBody`, and returns it as `200 text/html`

**Independent Test**: `curl "http://localhost:3000/?kmeURL=https://content.kme.example/articles/123"` returns `200 OK`, `Content-Type: text/html`, and body matching `vkm:articleBody` from the mock upstream

### Implementation for User Story 1

- [X] T005 [US1] Implement complete `contentFetchFlow()` async function in `src/proxyScripts/kmeContentSourceAdapter.js` following the 9-step design in plan.md: (1) extract `kmeURL` via `new URL(req.url, 'http://localhost').searchParams.get('kmeURL') ?? ''`, (2) empty/blank → 400, (3) malformed/non-http(s) → 400, (4) `validateSettings` missing field → 500, (5) `getValidToken` throws → 502, (6) `axios.get(kmeURL, { headers: { Authorization: 'OIDC_id_token {token}' }, timeout: 10000 })` — ECONNABORTED/ERR_CANCELED → 502, upstream 4xx → 404, upstream 5xx → 502, network error → 502, (7) string body fallback `JSON.parse` — failure → 502; non-object → 502, (8) `extractArticleBody(data)` → null → 404, (9) `res.writeHead(200, { 'Content-Type': 'text/html' }); res.end(articleBody)`
- [X] T006 [US1] Add content-fetch routing branch to the URL dispatch block in `src/proxyScripts/kmeContentSourceAdapter.js`: insert `else if (new URL(req.url, 'http://localhost').searchParams.has('kmeURL')) { await contentFetchFlow(); }` between the existing sitemap check and the `oidcAuthFlow()` fallback

### Tests for User Story 1

- [X] T007 [P] [US1] Add `US-content-fetch: happy path` describe block to `tests/unit/proxy.test.js` with two tests: (a) stub `getValidToken` returning cached token + stub `axios.get` returning `{ data: { 'vkm:articleBody': '<p>Hello</p>' } }` → assert status 200, `Content-Type: text/html`, body `<p>Hello</p>`; (b) stub `getValidToken` simulating cache miss (returns a freshly acquired token) → same 200 assertion
- [X] T008 [P] [US1] Add happy path contract test to `tests/contract/proxy-http.test.js`: start a real mock HTTP server that returns `{ "vkm:articleBody": "<p>Contract test article</p>" }` with `Content-Type: application/ld+json`; start a real mock token server; issue `GET /?kmeURL={mock-server-url}` to the proxy; assert status 200, `Content-Type: text/html`, response body equals `<p>Contract test article</p>`; verify total round-trip is under 11 s (SC-001)

**Checkpoint**: `npm run test:unit` and `npm run test:contract` both pass for happy path; manually verify with `curl` per quickstart.md

---

## Phase 4: User Story 2 — Missing or Empty kmeURL Parameter (Priority: P2)

**Goal**: Requests with absent, empty, whitespace, or malformed `kmeURL` receive a 400 response with no upstream call made

**Independent Test**: `curl -o /dev/null -w "%{http_code}" "http://localhost:3000/?kmeURL="` returns `400`; `curl -o /dev/null -w "%{http_code}" "http://localhost:3000/?kmeURL=not-a-url"` returns `400`

### Tests for User Story 2

- [X] T009 [P] [US2] Add `US-content-fetch: input validation` describe block to `tests/unit/proxy.test.js` with 6 tests using a spy on `axios.get` to assert it is never called: (a) `?kmeURL` absent (no kmeURL param) → routes to `oidcAuthFlow` → 200 (confirms FR-012); (b) `?kmeURL=` empty string → 400, body `Bad Request: kmeURL parameter is required`; (c) `?kmeURL=%20` whitespace-only → 400; (d) `?kmeURL=relative/path` → 400, body `Bad Request: kmeURL must be a well-formed absolute http/https URL`; (e) `?kmeURL=ftp://example.com/article` non-http protocol → 400; (f) `?kmeURL=:::malformed` → 400

**Checkpoint**: `npm run test:unit` passes for validation tests; confirm no upstream stubs are invoked in any 400 scenario

---

## Phase 5: User Story 3 — Upstream Failure & Missing Article Body (Priority: P3)

**Goal**: All upstream error conditions (token failure, 4xx, 5xx, timeout, network error, bad body, missing/empty `vkm:articleBody`) return the correct 404 or 502 status to the caller

**Independent Test**: Stub `axios.get` to throw an ECONNABORTED error; verify proxy returns 502. Stub `getValidToken` to throw; verify proxy returns 502. Stub `axios.get` returning `{ data: {} }`; verify proxy returns 404.
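The error conditions in this phase can be pictured as one mapping from an axios-style error to a proxy response. The sketch below is illustrative only: `mapUpstreamError` is a hypothetical name that does not come from plan.md. The status codes and most bodies follow the expectations listed in T010, while the body for `ERR_CANCELED` and the exact wording of the network-error body are assumptions.

```javascript
// Hypothetical helper (not part of the real adapter): turns an error thrown
// by axios.get into the status and body the proxy is expected to send.
function mapUpstreamError(err) {
  const code = err ? err.code : undefined;

  // Timeout or cancelled request -> 502.
  // Assumption: ERR_CANCELED reuses the timeout body; T010 only fixes its status.
  if (code === 'ECONNABORTED' || code === 'ERR_CANCELED') {
    return { status: 502, body: 'Bad Gateway: upstream request timed out' };
  }

  const upstreamStatus = err && err.response ? err.response.status : undefined;

  // Any upstream 4xx is reported to the caller as 404.
  if (upstreamStatus >= 400 && upstreamStatus < 500) {
    return { status: 404, body: 'Not Found: article not found at upstream' };
  }

  // Any upstream 5xx is reported as 502 and names the upstream status.
  if (upstreamStatus >= 500) {
    return { status: 502, body: `Bad Gateway: upstream error HTTP ${upstreamStatus}` };
  }

  // No response and no recognised code: plain network failure.
  // Assumption: only the "Bad Gateway:" prefix is required here.
  const detail = err && err.message ? err.message : 'network error';
  return { status: 502, body: `Bad Gateway: ${detail}` };
}

console.log(mapUpstreamError({ response: { status: 410 } }).status);
console.log(mapUpstreamError({ response: { status: 503 } }).body);
console.log(mapUpstreamError({ code: 'ECONNABORTED' }).body);
```

Keeping the mapping in a single pure function like this would let each T010 case be asserted without a running server, although the real `contentFetchFlow()` may inline these branches instead.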
### Tests for User Story 3

- [X] T010 [P] [US3] Add `US-content-fetch: upstream errors` describe block to `tests/unit/proxy.test.js` with 7 tests: (a) `getValidToken` throws → 502, body `Bad Gateway: token acquisition failed`; (b) `axios.get` throws with `{ response: { status: 404 } }` → 404, body `Not Found: article not found at upstream`; (c) `axios.get` throws with `{ response: { status: 410 } }` → 404; (d) `axios.get` throws with `{ response: { status: 503 } }` → 502, body `Bad Gateway: upstream error HTTP 503`; (e) `axios.get` throws with `{ code: 'ECONNABORTED' }` → 502, body `Bad Gateway: upstream request timed out`; (f) `axios.get` throws with `{ code: 'ERR_CANCELED' }` → 502; (g) `axios.get` throws with `{ message: 'ENOTFOUND' }` (no `response`, no code) → 502, body contains `Bad Gateway:`
- [X] T011 [P] [US3] Add `US-content-fetch: body parsing` describe block to `tests/unit/proxy.test.js` with 5 tests (all require valid `getValidToken` stub): (a) `axios.get` returns `{ data: 'not json{{{' }` (string, unparseable) → 502, body `Bad Gateway: unparseable response from upstream`; (b) `axios.get` returns `{ data: { 'vkm:articleBody': undefined } }` (field absent) → 404, body `Not Found: article body not present in upstream response`; (c) field is `null` → 404; (d) field is `''` empty string → 404; (e) field is `' '` whitespace-only → 404
- [X] T012 [P] [US3] Add contract error tests to `tests/contract/proxy-http.test.js`: (a) mock upstream server returns HTTP 404 → proxy returns 404; (b) mock upstream server returns HTTP 503 → proxy returns 502; (c) mock server accepts connection but never responds (use `server.on('request', () => {})`) → proxy returns 502 within 12 s and does not hang

**Checkpoint**: All 12 unit tests in T010+T011 pass; all 3 contract error tests in T012 pass

---

## Phase 6: User Story 4 — Passthrough Behaviour Preserved (Priority: P4)

**Goal**: Requests without `kmeURL` and without `/sitemap.xml` suffix continue to receive the existing 200 OK auth-check passthrough — zero regression

**Independent Test**: `curl -o /dev/null -w "%{http_code}" "http://localhost:3000/"` returns `200` and body is `Authorized` (unchanged)

### Tests for User Story 4

- [X] T013 [US4] Add `US-content-fetch: passthrough preserved` describe block to `tests/unit/proxy.test.js` with 1 test: GET `/?someOtherParam=value` (no `kmeURL`, not sitemap) → assert status 200, body `Authorized`, and confirm `axios.get` is never called (spy asserts not called) — verifies FR-012 and SC-005

**Checkpoint**: Passthrough test passes; run full `npm test` to confirm zero regressions across entire suite

---

## Final Phase: Polish & Cross-Cutting Concerns

**Purpose**: Changelog documentation and final validation

- [X] T014 [P] Add entry to `CHANGELOG.md` for feature `003-kme-content-fetch`: document new `contentFetchFlow()` in `kmeContentSourceAdapter.js` (routes `?kmeURL=` requests, handles all error paths 400/404/500/502, 10 s timeout), new `extractArticleBody(data)` in `kmeContentSourceAdapterHelpers.js`, new unit test describe blocks in `tests/unit/proxy.test.js`, and new contract tests in `tests/contract/proxy-http.test.js`
- [X] T015 Run full test suite `npm test` and confirm all tests pass; run the four quickstart.md `curl` smoke tests (valid kmeURL passthrough, empty kmeURL → 400, malformed kmeURL → 400, sitemap → 200) to validate end-to-end behaviour

---

## Dependencies & Execution Order

### Phase Dependencies

- **Setup (Phase 1)**: No dependencies — run immediately
- **Foundational (Phase 2)**: Depends on Setup ✅ — **BLOCKS** all user story phases
  - T002 → T003 (sequential, same file)
  - T004 [P] can run after T002+T003 (different file: test file)
- **US1 (Phase 3)**: Depends on Foundational complete (T002+T003)
  - T005 → T006 (sequential, same file)
  - T007 [P] and T008 [P] can run after T005+T006 (different files)
- **US2 (Phase 4)**: Depends on T005+T006 complete (tests the validation guards inside `contentFetchFlow`)
- **US3 (Phase 5)**: Depends on T005+T006 complete (tests the error guards inside `contentFetchFlow`)
- **US4 (Phase 6)**: Depends on T006 complete (tests the routing branch)
- **Polish (Final)**: Depends on all user story phases complete

### User Story Dependencies

- **US1 (P1)**: Depends only on Foundational phase
- **US2 (P2)**: Depends on US1 implementation (T005+T006) — validation lives inside `contentFetchFlow()`
- **US3 (P3)**: Depends on US1 implementation (T005+T006) — error paths live inside `contentFetchFlow()`
- **US4 (P4)**: Depends on T006 routing branch — tests that the passthrough is still reached when no `kmeURL` is present

### Within Each Phase

- Source file edits must complete before their corresponding test tasks
- T002 must complete before T003 (same file, sequential)
- T005 must complete before T006 (same file, sequential)
- T005+T006 must complete before T007, T008, T009, T010, T011, T012, T013

### Parallel Opportunities

- T004 [P] runs in parallel with T005+T006 (different files: test file vs source file)
- After T005+T006: T007 [P], T008 [P], T009 [P], T010 [P], T011 [P], T012 [P] can all run in parallel (different describe blocks, or separate test file vs unit file)
- T013 and T014 [P] run in parallel (different files)

---

## Parallel Execution Examples

### Foundational Phase Parallelism

```
# After T002+T003 complete, run simultaneously:
Task A: T004 — Write extractArticleBody unit tests in tests/unit/proxy.test.js
Task B: T005 — Implement contentFetchFlow() in src/proxyScripts/kmeContentSourceAdapter.js
```

### After T005+T006 Complete

```
# These 6 tasks can all run in parallel (different describe blocks / different files):
Task A: T007 — Happy path unit tests (proxy.test.js)
Task B: T008 — Happy path contract test (proxy-http.test.js)
Task C: T009 — Input validation unit tests (proxy.test.js, separate describe block)
Task D: T010 — Upstream error unit tests (proxy.test.js, separate describe block)
Task E: T011 — Body parsing unit tests (proxy.test.js, separate describe block)
Task F: T012 — Contract error tests (proxy-http.test.js, separate describe block)
```

---

## Implementation Strategy

### MVP First (User Story 1 Only)

1. Complete Phase 1: Setup baseline verification
2. Complete Phase 2: Add `extractArticleBody` helper (CRITICAL — blocks everything)
3. Complete Phase 3: Implement `contentFetchFlow()`, routing branch, and happy path tests
4. **STOP and VALIDATE**: `npm run test:unit` + `npm run test:contract` pass; manual `curl` smoke test works
5. **Deploy/demo if ready** — consumers can now fetch articles via the proxy

### Incremental Delivery

1. Foundation + US1 → happy path working → Demo MVP
2. Add US2 tests → validate 400 rejection works
3. Add US3 tests → validate error handling works
4. Add US4 test → confirm no regression
5. Polish → CHANGELOG + final `npm test`

### Single-Developer Sequence (Optimal Order)

```
T001 → T002 → T003 → T005 → T006 → T004* → T007 → T009 → T010 → T011 → T013 → T008 → T012 → T014 → T015
(* T004 can be done any time after T003 — fits naturally here before test sprint)
```

---

## Notes

- **VM sandbox constraint**: `contentFetchFlow()` must not contain any `import` or `require` — all dependencies (`axios`, `kmeContentSourceAdapterHelpers`, `kme_CSA_settings`, `URL`, `URLSearchParams`, `console`, `req`, `res`) arrive via the injected VM context
- **Helpers file constraint**: `extractArticleBody` must be inserted as a plain `function` declaration before the existing `return { ... }` block — no module syntax
- **`[P]` tasks**: different files with no dependency on incomplete tasks in the same file
- **`[Story]` labels**: map each test task back to the user story it validates for traceability
- Each user story's test tasks are independently runnable with `node --test tests/unit/proxy.test.js` (filter by describe block name, e.g. with `--test-name-pattern`)
- Commit after each logical group (e.g., after T002+T003, after T005+T006, after all unit test tasks)
- Verify `npm test` green at each checkpoint before proceeding to next phase
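
To make the helpers-file constraint concrete, here is a minimal sketch of how `extractArticleBody` could sit inside a literal function body, following the guard order described in T002. Evaluating the body with `new Function` is an assumption made for this demonstration; the real proxy injects the helpers through its own VM context, and the existing helpers are omitted here.

```javascript
// Stand-in for the helpers file: a literal function body with no
// import/export that ends in a return statement exposing the helpers.
const helpersBody = `
  function extractArticleBody(data) {
    // Guard: null, undefined, or non-object input
    if (!data || typeof data !== 'object') return null;
    const body = data['vkm:articleBody'];
    // Absent, null, non-string, empty, or whitespace-only field
    if (typeof body !== 'string' || body.trim() === '') return null;
    return body;
  }

  return { extractArticleBody };
`;

// Assumption for this demo only: evaluate the body with new Function.
const helpers = new Function(helpersBody)();

console.log(helpers.extractArticleBody({ 'vkm:articleBody': '<p>Hello</p>' })); // <p>Hello</p>
console.log(helpers.extractArticleBody({ 'vkm:articleBody': '   ' }));          // null
console.log(helpers.extractArticleBody('not an object'));                        // null
```

Because the function is a plain declaration with no module syntax, the same text can be called directly from unit tests, which matches the "no mocking needed" note in T004.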