kme-content-adapter

An HTTP proxy adapter that searches and fetches content from KME (Knowledge Management Engine) and exposes it as a Sitemaps-compliant XML feed and individual HTML article pages. Business logic runs in an isolated Node.js VM sandbox, mirroring the IVA Studio proxy script execution environment.

Requirements

  • Node.js ≥ 18
  • Redis (used for token caching)
  • jq (optional — used by npm start for log pretty-printing)

Setup

npm install
cp src/globalVariables/kme_CSA_settings.json.example src/globalVariables/kme_CSA_settings.json
# Edit kme_CSA_settings.json with real credentials

Configuration

src/globalVariables/kme_CSA_settings.json

Credentials and API settings — never commit this file.

{
  "tokenUrl": "https://<host>/oidc-token-service/<env>/token",
  "username": "<username>",
  "password": "<password>",
  "clientId": "default",
  "scope": "openid tags content_entitlements",
  "searchApiBaseUrl": "https://<host>/km-search-service",
  "tenant": "<env>"
}

| Field | Description |
| --- | --- |
| tokenUrl | OIDC token endpoint |
| username / password | KME credentials |
| clientId | OAuth client ID (usually default) |
| scope | OAuth scopes |
| searchApiBaseUrl | KME Knowledge Search Service base URL |
| tenant | KME tenant/environment path segment (e.g. qa) |
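
A settings file with a missing field is easier to diagnose at startup than at the first token request. The following is a minimal sketch of the contract the Helpers section gives for validateSettings (return the first missing field name, or null). The function name `validateSettingsSketch`, the `REQUIRED_FIELDS` list, and the rule that empty strings count as missing are assumptions, not the adapter's actual code.

```javascript
// Assumed list of required fields, taken from the example settings file.
const REQUIRED_FIELDS = [
  'tokenUrl', 'username', 'password', 'clientId',
  'scope', 'searchApiBaseUrl', 'tenant',
];

// Hypothetical stand-in for validateSettings(settings, fields).
// Returns the first missing field name, or null when all are present.
function validateSettingsSketch(settings, fields = REQUIRED_FIELDS) {
  for (const field of fields) {
    const value = settings?.[field];
    // Treating '' as missing is an assumption of this sketch.
    if (value === undefined || value === null || value === '') {
      return field;
    }
  }
  return null;
}

console.log(validateSettingsSketch({ tokenUrl: 'https://example.invalid/token' }));
```

With only tokenUrl set, the sketch reports the next field in the list, which gives a caller a specific name to put in an error message.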

config/default.json

Infrastructure settings (port, host, log level). Override with environment variables:

| Variable | Default | Description |
| --- | --- | --- |
| PORT | 3000 | HTTP server port |
| HOST | 0.0.0.0 | Bind address |
| LOG_LEVEL | debug | Log level: DEBUG, INFO, WARN, ERROR |
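
One way the override order could work is sketched below: environment variables win, and the documented defaults apply otherwise. `resolveInfraConfig` is a hypothetical helper for illustration; the adapter may resolve config/default.json through a configuration library instead.

```javascript
// Defaults mirror the table above.
const DEFAULTS = { port: 3000, host: '0.0.0.0', logLevel: 'debug' };

// Hypothetical helper: merges defaults with environment overrides.
function resolveInfraConfig(env = process.env) {
  const port = Number.parseInt(env.PORT ?? '', 10);
  return {
    // Fall back to the default when PORT is unset or not a number.
    port: Number.isInteger(port) ? port : DEFAULTS.port,
    host: env.HOST || DEFAULTS.host,
    logLevel: env.LOG_LEVEL || DEFAULTS.logLevel,
  };
}

console.log(resolveInfraConfig({ PORT: '8080', LOG_LEVEL: 'INFO' }));
```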

Endpoints

GET /sitemap.xml

Returns a Sitemaps protocol 0.9 XML document. Each <loc> points back to this adapter's content fetch endpoint so crawlers can retrieve individual articles.

Query parameters (all optional):

| Parameter | Default | Description |
| --- | --- | --- |
| query | * | KME search query string |
| size | 100 | Max results per search page |
| category | vkm:ArticleCategory | KME category filter |

The adapter pages through search results automatically, using hydra:view['hydra:last'] in the search response to find the last page. The response is capped at 50,000 URLs, the per-file limit in the Sitemaps protocol.

GET /sitemap.xml?query=temple&size=50&category=vkm:ArticleCategory
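
The pagination loop and the URL cap can be sketched as follows. This is an illustration under assumptions, not the adapter's code: it assumes each page lists results in `hydra:member`, that `hydra:view` carries the current page in `@id` and the following page in `hydra:next`, and that `fetchPage` is a hypothetical async function returning the parsed page.

```javascript
// Sitemaps protocol limit on URLs per sitemap file.
const MAX_SITEMAP_URLS = 50000;

// fetchPage is hypothetical: (url: string) => Promise<parsed JSON-LD page>.
async function collectAllItems(firstUrl, fetchPage) {
  const items = [];
  let url = firstUrl;
  while (url && items.length < MAX_SITEMAP_URLS) {
    const page = await fetchPage(url);
    items.push(...(page['hydra:member'] ?? []));
    const view = page['hydra:view'] ?? {};
    // Stop once the current page is the one named by hydra:last.
    const isLast = !view['hydra:last'] || view['@id'] === view['hydra:last'];
    url = isLast ? null : view['hydra:next'];
  }
  // Enforce the cap even if the final page overshoots it.
  return items.slice(0, MAX_SITEMAP_URLS);
}
```

Passing `fetchPage` in as a parameter keeps the loop testable without a live KME search service.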

GET /?kmeURL=<upstream-article-url>

Fetches a single KME article by its upstream URL and returns it as a full HTML document.

GET /?kmeURL=https%3A%2F%2F<kme-host>%2Fkm-content-service%2F...

Response: 200 text/html; charset=utf-8 — a complete HTML document:

<!DOCTYPE html>
<html>
<head><title>Article Title from vkm:name</title></head>
<body>
<!-- vkm:articleBody content verbatim -->
</body>
</html>
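
A minimal sketch of how such a document could be assembled from a content API response is shown below. `renderArticleHtml` and `escapeHtml` are hypothetical names, and escaping the title is an assumption of this sketch; the body is inserted verbatim, as the template indicates.

```javascript
// Escapes characters that are significant in HTML text nodes.
// Whether the adapter escapes the title is an assumption of this sketch.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical helper: wraps an article in the document shell shown above.
function renderArticleHtml(article) {
  const title = escapeHtml(article['vkm:name'] ?? '');
  // Falls back to articleBody, matching the extractArticleBody description.
  const body = article['vkm:articleBody'] ?? article.articleBody ?? '';
  return [
    '<!DOCTYPE html>',
    '<html>',
    `<head><title>${title}</title></head>`,
    '<body>',
    body, // inserted verbatim
    '</body>',
    '</html>',
  ].join('\n');
}

console.log(renderArticleHtml({ 'vkm:name': 'Reset & recover', 'vkm:articleBody': '<p>Steps</p>' }));
```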

Error responses:

| Status | Cause |
| --- | --- |
| 400 | kmeURL missing, blank, malformed, or non-http/https |
| 404 | Upstream returned 4xx, or article body absent in response |
| 502 | Token acquisition failed, upstream 5xx, network error, or timeout |
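
The status rules above can be sketched as two small functions. Both names are hypothetical and the error messages are illustrative. Token failures, network errors, and timeouts surface as exceptions rather than upstream statuses, so a caller would map those to 502 in a catch block.

```javascript
// Hypothetical validator: returns a message for a 400 response,
// or null when kmeURL is usable.
function validateKmeUrl(raw) {
  if (typeof raw !== 'string' || raw.trim() === '') {
    return 'kmeURL is required';
  }
  let parsed;
  try {
    parsed = new URL(raw.trim());
  } catch {
    return 'kmeURL is malformed';
  }
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    return 'kmeURL must use http or https';
  }
  return null;
}

// Hypothetical mapping from an upstream outcome to the adapter status.
function statusForUpstream(upstreamStatus, hasBody) {
  if (upstreamStatus >= 500) return 502;
  if (upstreamStatus >= 400) return 404;
  return hasBody ? 200 : 404;
}
```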

GET /* (anything else)

Returns 404 Not Found.


Running

npm run dev      # Development — auto-restart on file changes
npm start        # Production — logs piped through jq

Testing

npm test                    # All tests
npm run test:unit           # Unit tests only
npm run test:integration    # Integration tests only
npm run test:contract       # Contract tests only

# Single test file
node --test tests/unit/proxy.test.js

Tests use the Node.js built-in node:test runner; no external test framework is required.

Architecture

The server loads src/proxyScripts/kmeContentSourceAdapter.js once at startup via vm.Script, then executes it in a fresh isolated VM context per request via vm.createContext.

src/
├── proxyScripts/
│   └── kmeContentSourceAdapter.js          # All business logic (zero imports/exports)
├── globalVariables/
│   ├── kme_CSA_settings.json               # Credentials & API config (gitignored)
│   ├── kme_CSA_settings.json.example       # Template for version control
│   └── kmeContentSourceAdapterHelpers.js   # Pure utilities (literal function body)
├── logger.js                               # Structured JSON logger
└── server.js                               # HTTP server bootstrap only
config/
└── default.json                            # Infrastructure settings
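
The compile-once, run-per-request pattern described at the start of this section can be sketched with the node:vm module. The inline `source` string stands in for the proxy script, and `handle` is a hypothetical name. The real server.js injects many more globals than the three shown here.

```javascript
const vm = require('node:vm');

// Stand-in for the proxy script; the real file is read from disk.
const source =
  'globalThis.hits = (globalThis.hits || 0) + 1; res.hits = globalThis.hits;';

// Compile once at startup.
const script = new vm.Script(source, { filename: 'kmeContentSourceAdapter.js' });

// Run per request in a fresh context so no global state leaks between requests.
function handle(req, res) {
  const context = vm.createContext({ req, res, console });
  script.runInContext(context);
  return res;
}

console.log(handle({}, {}).hits, handle({}, {}).hits);
```

Because every call builds a new context, the script's `hits` counter starts over on each request instead of accumulating.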

VM Context Globals

All dependencies are injected into each request's sandbox:

| Variable | Source |
| --- | --- |
| console | Structured logger |
| crypto | Node.js Web Crypto API |
| axios | HTTP client |
| jwt | jsonwebtoken |
| uuidv4 | UUID v4 generator |
| xmlbuilder2 | xmlbuilder2 default export (call as xmlbuilder2.create(...)) |
| redis | Connected Redis client |
| URLSearchParams, URL | Node.js globals |
| kme_CSA_settings | Loaded from src/globalVariables/kme_CSA_settings.json |
| kmeContentSourceAdapterHelpers | Loaded from src/globalVariables/kmeContentSourceAdapterHelpers.js |
| req, res | Node.js HTTP request/response |

Key Constraints for kmeContentSourceAdapter.js

  • Zero import/export — runs in a VM with no module system
  • No config, global.config, or process.env — use injected globals only
  • Routing metadata is available via req.params (set by server.js)
  • proxyBaseUrl is derived dynamically from request headers (x-forwarded-proto, x-forwarded-host, host) — not read from settings
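
The header-based derivation in the last constraint might look like the sketch below. `deriveProxyBaseUrl` is a hypothetical name. The precedence (forwarded headers first, then host), the http default, and the handling of comma-separated values are assumptions rather than documented behaviour.

```javascript
// Hypothetical helper: derives the public base URL from request headers.
function deriveProxyBaseUrl(headers) {
  // A proxy chain may produce comma-separated values; take the first.
  const first = (value) => (value ? String(value).split(',')[0].trim() : '');
  const proto = first(headers['x-forwarded-proto']) || 'http';
  const host = first(headers['x-forwarded-host']) || first(headers.host);
  if (!host) {
    throw new Error('cannot derive proxyBaseUrl: no host header');
  }
  return `${proto}://${host}`;
}

console.log(deriveProxyBaseUrl({
  'x-forwarded-proto': 'https',
  'x-forwarded-host': 'kb.example.com',
  host: '10.0.0.5:3000',
}));
```

Deriving the base URL per request lets the same deployment serve correct sitemap links behind different reverse proxies without a settings change.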

Token Caching

OIDC tokens are cached in Redis under the hash key authorization (fields token and expiry). The cache survives adapter restarts. Token expiry is stored as an absolute Unix epoch timestamp. A stampede guard ensures only one token fetch is in flight at a time when multiple concurrent requests encounter a cache miss.
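
The caching and stampede-guard behaviour can be sketched as follows. This is an illustration, not the adapter's code: `fakeRedis` is an in-memory stand-in whose hGetAll/hSet methods follow node-redis v4 naming, `fetchToken` is a hypothetical function, and storing the expiry in seconds is an assumption. The guard is a shared promise, so it only deduplicates fetches within one process.

```javascript
// In-memory stand-in for a connected Redis client (assumption for the sketch).
const fakeRedis = {
  hashes: new Map(),
  async hGetAll(key) { return { ...(this.hashes.get(key) ?? {}) }; },
  async hSet(key, fields) {
    this.hashes.set(key, { ...(this.hashes.get(key) ?? {}), ...fields });
  },
};

let inFlight = null; // shared promise: the stampede guard

// fetchToken is hypothetical: () => Promise<{ token, expiresInSec }>.
async function refresh(redis, fetchToken, nowSec) {
  const { token, expiresInSec } = await fetchToken();
  // Store expiry as an absolute Unix epoch timestamp (seconds assumed).
  await redis.hSet('authorization', { token, expiry: String(nowSec + expiresInSec) });
  return token;
}

async function getToken(redis, fetchToken, nowSec = Math.floor(Date.now() / 1000)) {
  const cached = await redis.hGetAll('authorization');
  if (cached.token && Number(cached.expiry) > nowSec) return cached.token;
  // Concurrent cache misses await the same promise instead of refetching.
  if (!inFlight) {
    inFlight = refresh(redis, fetchToken, nowSec).finally(() => { inFlight = null; });
  }
  return inFlight;
}
```

Clearing `inFlight` in `finally` means a failed fetch does not block later attempts.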

Helpers (kmeContentSourceAdapterHelpers.js)

A pure-utility module injected into the VM context. Key functions:

| Function | Description |
| --- | --- |
| getValidToken(reqUrl, reqMethod) | Returns a cached or freshly-fetched OIDC id_token; throws on failure |
| extractHydraItems(data) | Extracts one fragment per SearchResultItem: the one with the latest vkm:datePublished |
| buildSitemapXml(items, proxyBaseUrl) | Builds Sitemaps 0.9 XML from an array of fragments |
| extractArticleBody(data) | Returns vkm:articleBody (or articleBody fallback) from a content API response |
| validateSettings(settings, fields) | Returns the first missing required field name, or null |
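
To make the shape of the sitemap output concrete, here is a dependency-free sketch of what buildSitemapXml produces. It is not the real helper, which has xmlbuilder2 available in its sandbox. The name `buildSitemapXmlSketch` and the assumption that each fragment carries its upstream URL in `@id` are illustrative only.

```javascript
const SITEMAP_NS = 'http://www.sitemaps.org/schemas/sitemap/0.9';

// Entity-escapes the characters the Sitemaps protocol requires to be escaped.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Assumes each fragment exposes its upstream URL as item['@id'].
function buildSitemapXmlSketch(items, proxyBaseUrl) {
  const urls = items.map((item) => {
    // Each <loc> points back at the adapter's content fetch endpoint.
    const loc = `${proxyBaseUrl}/?kmeURL=${encodeURIComponent(item['@id'])}`;
    return `  <url><loc>${escapeXml(loc)}</loc></url>`;
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    `<urlset xmlns="${SITEMAP_NS}">`,
    ...urls,
    '</urlset>',
  ].join('\n');
}

console.log(buildSitemapXmlSketch(
  [{ '@id': 'https://kme.example.com/a/1' }],
  'https://kb.example.com',
));
```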

Note: This file is a literal function body — server.js wraps it as (function() { <file> })(). It must end with a bare return { ... } and contain zero import/export statements.
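
The wrapping step in the note above can be sketched like this. `helperSource` stands in for the helpers file, and evaluating the wrapped text with vm.runInNewContext is an assumption; the README only states that server.js wraps the file in an immediately invoked function.

```javascript
const vm = require('node:vm');

// Stand-in for the helpers file: a function body ending in a bare return.
const helperSource = `
  function double(n) { return n * 2; }
  return { double };
`;

// Wrapping the body in an IIFE makes the bare return legal.
// The newlines guard against a trailing line comment in the file.
const wrapped = `(function() {\n${helperSource}\n})()`;
const helpers = vm.runInNewContext(wrapped, {});

console.log(helpers.double(21));
```

This is why the file cannot contain import or export statements: they are only valid at the top level of a module, not inside a function body.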

Changelog

See CHANGELOG.md.
