diff --git a/.specify/memory/constitution.md b/.specify/memory/constitution.md index 2d35306..721d7f0 100644 --- a/.specify/memory/constitution.md +++ b/.specify/memory/constitution.md @@ -1,35 +1,3 @@ - - # Proxy Scripts Constitution ## Core Principles @@ -39,41 +7,51 @@ Follow-up TODOs: **ALL business logic, data processing, authentication, and request handling MUST exist within the `proxy.js` file.** The `server.js` file should ONLY handle: - HTTP server setup - Configuration loading -- Global console replacement with custom logger -- Request delegation to `proxy.handleRequest()` +- Global object injection into isolated context +- Loading proxy.js via `vm.Script` and `vm.createContext` +- Per-request context creation with all necessary globals -**Rationale**: Monolithic architecture enables simple packaging as a single IVA Studio proxy script and prevents fragmentation of business logic across multiple files. ALL functionality must be in one place. +**Implementation via vm.Script**: +`proxy.js` MUST be loaded using Node.js `vm.Script` and executed in isolated contexts created per-request with `vm.createContext`. This ensures: +- Complete isolation from server.js module system +- All dependencies provided explicitly through context objects +- Zero ability to import/export modules +- Pure functional execution with injected dependencies + +**Rationale**: Monolithic architecture enables simple packaging as a single IVA Studio proxy script and prevents fragmentation of business logic across multiple files. Using `vm.Script` enforces architectural boundaries at runtime, making it impossible for `proxy.js` to access Node.js module system or file system, ensuring ALL functionality exists in one isolated, dependency-injected file. ### I. Zero External Imports or Exports from `proxy.js` (NON-NEGOTIABLE) -`proxy.js` MUST have **ZERO import statements**. All dependencies MUST be provided as global objects by server.js. +`proxy.js` MUST have **ZERO import statements**. 
All dependencies MUST be provided through `vm.createContext` by server.js. -`proxy.js` MUST have **ZERO export statements**. The proxy.js file should be treated as a single function that receives (req, res) parameters and loaded as a route into the server.js file. +`proxy.js` MUST have **ZERO export statements**. The file MUST be pure JavaScript code executed in an isolated VM context. -**File system access** from `proxy.js` is **ABSOLUTELY PROHIBITED** under any circumstances. The `fs` module MUST NOT be imported into proxy.js. +**File system access** from `proxy.js` is **ABSOLUTELY PROHIBITED** under any circumstances. The `fs` module MUST NOT be accessible. -**External libraries** (axios, jwt, googleapis, etc.) MUST NOT be imported. Use globals provided by server.js instead. +**External libraries** (axios, jwt, googleapis, etc.) MUST NOT be imported. Dependencies are injected through VM context by server.js. -**Rationale**: Monolithic architecture requires ALL I/O operations and dependency injection to be centralized in server.js, ensuring proxy.js contains ONLY pure business logic. +**Rationale**: Using `vm.Script` and `vm.createContext` enforces architectural boundaries at the VM level. proxy.js runs in an isolated context with NO access to Node.js module system, file system, or process globals. ALL dependencies must be explicitly injected per-request through the context object, ensuring proxy.js contains ONLY pure business logic with zero capability for I/O operations. **For data files that proxy.js needs** (service account keys, certificates, secrets): 1. Place JSON files in `global/` directory -2. server.js automatically loads them as global objects using the filename as the object name -3. proxy.js accesses them via `globalThis[objectName]` +2. server.js loads them at startup using `loadGlobalObjects()` +3. server.js injects them into VM context per-request via `vm.createContext` +4. 
proxy.js accesses them as simple variables in context (e.g., `google_drive_settings`) **Example**: -- File: `global/service-account-key.json` -- Global: `globalThis['service-account-key']` -- Access in proxy.js: `const credentials = globalThis['service-account-key']` +- File: `global/google_drive_settings.json` +- Loading: server.js reads and assigns to `globalVariableContext.google_drive_settings` +- Injection: server.js adds `google_drive_settings: globalVariableContext.google_drive_settings` to context +- Access in proxy.js: Direct variable access `google_drive_settings.serviceAccount` **Enforcement**: - proxy.js MUST have NO `import` statements (file should start with comments, then code) -- During code review, verify first line of code is NOT an import -- Any `import` statement in proxy.js MUST be rejected immediately -- proxy.js MUST have NO `export` statements -- Any `export` statement in proxy.js MUST be rejected immediately -- All file operations MUST be in server.js, which then provides data via globals -- All external libraries MUST be provided as globals by server.js +- proxy.js MUST have NO `export` statements (no module.exports, no export keyword) +- Any `import` or `export` in proxy.js MUST be rejected immediately +- server.js MUST load proxy.js using `vm.Script` constructor +- server.js MUST execute via `script.runInContext(context)` with fresh context per request +- All dependencies injected through `vm.createContext({ ... 
})` context object +- VM isolation prevents access to require(), import(), fs, process, and Node.js globals #### I.I What MUST Be in proxy.js @@ -124,73 +102,109 @@ During code review and planning: #### I.V Global Objects Provided by server.js -The `server.js` file MUST make the following objects available globally for use by `proxy.js`: +The `server.js` file MUST inject the following objects into VM context for use by `proxy.js`: -**Core Infrastructure Globals:** +**VM Context Injection Pattern:** +```javascript +// Create a context with all globals that proxy.js needs +const context = vm.createContext({ + ...globalVMContext, + ...globalVariableContext, + req, + res, +}); +script.runInContext(context); +``` + +**Core Infrastructure Context Variables:** 1. **console** - Custom logger from `logger.js` - Purpose: Structured JSON logging - Usage: `console.info()`, `console.debug()`, `console.error()` - - Replaces: Built-in console object + - Injected from: `globalVMContext.console` -2. **crypto** - Node.js crypto module +2. **crypto** - Web Crypto API (built-in) - Purpose: UUID generation, cryptographic operations - Usage: `crypto.randomUUID()`, etc. - - Note: Cannot use name 'crypto' due to Web Crypto API conflict - - Replaces: `import crypto from 'node:crypto'` in proxy.js + - Injected from: `globalVMContext.crypto` + - Note: Web Crypto API available by default in Node.js 3. **config** - Configuration object - Purpose: Infrastructure settings ONLY (server host/port, logging level) - - Usage: `global.config.server.port`, `global.config.logging.level` - - Loaded: From `config/default.json` merged with ENV vars + - Usage: `config.server.port`, `config.logging.level` + - Injected from: `global.config` + - Loaded from: `config/default.json` merged with ENV vars - **DOES NOT contain**: Authentication, secrets, API keys, behavioral config (use global/ instead) 4. 
**axios** - HTTP client library - Purpose: Making HTTP requests to external APIs - Usage: `axios.get(url)`, `axios.post(url, data)` - Package: `axios` - - Replaces: `import axios from 'axios'` in proxy.js + - Injected from: `globalVMContext.axios` 5. **uuidv4** - UUID v4 generator - Purpose: Generate RFC4122 compliant UUIDs - Usage: `uuidv4()` returns string like "110ec58a-a0f2-4ac4-8393-c866d813b8d1" - Package: `uuid` (v4 function only) - - Replaces: `import { v4 as uuidv4 } from 'uuid'` in proxy.js + - Injected from: `globalVMContext.uuidv4` 6. **jwt** - JSON Web Token library - Purpose: Creating and verifying JWTs for authentication - Usage: `jwt.sign(payload, secret)`, `jwt.verify(token, secret)` - Package: `jsonwebtoken` - - Replaces: `import jwt from 'jsonwebtoken'` in proxy.js + - Injected from: `globalVMContext.jwt` 7. **xmlBuilder** - XML builder/generator - Purpose: Constructing XML documents programmatically - Usage: `xmlBuilder({ root: { child: 'value' } })` - Package: `xmlbuilder2` (create function) - - Replaces: `import { create } from 'xmlbuilder2'` in proxy.js + - Injected from: `globalVMContext.xmlBuilder` -**Dynamic Data Globals:** +**Built-in Web APIs:** -8. 
**Dynamic JSON objects from global/ directory** - - Purpose: Authentication credentials, secrets, API keys, and behavioral configuration - - Pattern: Each `global/filename.json` → `globalThis['filename']` - - Examples: - - `global/service-account-key.json` → `globalThis['service-account-key']` (Service Account credentials with client_email and private_key) - - `global/google-scopes.json` → `globalThis['google-scopes']` (OAuth2 scopes array for Google APIs) - - `global/sitemap-config.json` → `globalThis['sitemap-config']` (Sitemap settings like maxUrls) - - `global/drive-query.json` → `globalThis['drive-query']` (Drive API query filter) - - `global/api-keys.json` → `globalThis['api-keys']` (API keys and secrets) - - Usage in proxy.js: `const creds = globalThis['service-account-key']`, `const scopes = globalThis['google-scopes']` - - Loaded: Automatically by server.js at startup using `loadGlobalObjects()` - - **Note**: ALL authentication, secrets, and behavioral configuration MUST be in global/, NEVER in config/default.json +8. **URLSearchParams** - URL query string parser (built-in) + - Purpose: Parse and manipulate URL query strings + - Usage: `new URLSearchParams(queryString)` + - Injected from: `globalVMContext.URLSearchParams` + +9. **URL** - URL parser (built-in) + - Purpose: Parse and manipulate URLs + - Usage: `new URL(urlString)` + - Injected from: `globalVMContext.URL` + +**Dynamic Data Context Variables:** + +10. 
**Dynamic JSON objects from global/ directory** + - Purpose: Authentication credentials, secrets, API keys, and behavioral configuration + - Pattern: Each `global/filename.json` loaded by server.js → injected into context + - Examples: + - `global/google_drive_settings.json` → context var `google_drive_settings` (consolidated service account, scopes, drive query, sitemap config) + - `global/api-keys.json` → context var `api_keys` (API keys and secrets) + - `global/custom-config.json` → context var `custom_config` (behavioral settings) + - Usage in proxy.js: Direct variable access `google_drive_settings.serviceAccount` + - Loaded: By server.js at startup using `loadGlobalObjects()` + - Injected: Per-request via `vm.createContext({ objectName: globalVariableContext.objectName })` + - **Note**: ALL authentication, secrets, and behavioral configuration MUST be in global/, NEVER in config/default.json + +**Request/Response Objects:** + +11. **req** - HTTP IncomingMessage + - Purpose: Access request data (URL, method, headers, body) + - Injected fresh: Per-request from `http.createServer((req, res) => ...)` + +12. 
**res** - HTTP ServerResponse + - Purpose: Send response to client + - Injected fresh: Per-request from `http.createServer((req, res) => ...)` + +**Rationale**: Using `vm.createContext` for dependency injection achieves: +- **Runtime-enforced isolation** - proxy.js physically cannot access Node.js module system or file system +- **Zero imports possible** - VM context has no `require()` or `import()` capability +- **Explicit dependencies** - All available objects must be explicitly listed in context +- **Per-request isolation** - Fresh context per request prevents cross-request state leakage +- **Testing simplicity** - Mock entire context object instead of individual module imports +- **Clear contracts** - Context object documents every dependency proxy.js uses +- **Security boundaries** - VM sandbox prevents escape to underlying system -**Rationale**: Centralizing global setup and ALL file I/O in server.js achieves: -- **ZERO imports in proxy.js** - complete dependency injection pattern -- Consistent environment setup and library versions -- Easy testing (mock globals instead of mocking module imports) -- Clear separation: server.js = infrastructure & dependencies, proxy.js = pure business logic -- Single source of truth for dependency injection -- Direct REST API calls instead of heavyweight SDK wrappers #### I.VI Logging @@ -380,4 +394,4 @@ All pull requests, code reviews, and design discussions MUST verify compliance w For runtime development guidance, refer to `.github/prompts/` and `.github/agents/` files which operationalize these principles into agent workflows. 
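A minimal sketch of the loading pattern that section I.V describes, assuming `proxy.js` is compiled once with `vm.Script` and executed in a fresh `vm.createContext` per request. The inline `proxySource`, the `demo_settings` object, the `result` collector, and the `runProxy` helper are hypothetical stand-ins for illustration, not code from this repository. The sketch uses `require` so that it runs standalone, whereas `server.js` uses ESM imports.

```javascript
const vm = require("node:vm");

// Hypothetical stand-in for the contents of proxy.js.
const proxySource = `
  result.hasRequire = typeof require !== "undefined";
  result.hasProcess = typeof process !== "undefined";
  result.hasURL = typeof URL !== "undefined";
  globalThis.counter = (globalThis.counter || 0) + 1;
  result.counter = globalThis.counter;
  result.setting = demo_settings.maxUrls;
  result.path = req.url;
`;

// Compile once at startup; the filename only affects stack traces.
const script = new vm.Script(proxySource, { filename: "proxy.js" });

// Hypothetical shared contexts, mirroring the names used in server.js.
const globalVMContext = { URLSearchParams };
const globalVariableContext = { demo_settings: { maxUrls: 50000 } };

// Build a fresh context for each request and run the compiled script in it.
function runProxy(req, res) {
  const result = {}; // illustration only: collects what the script observed
  const context = vm.createContext({
    ...globalVMContext,
    ...globalVariableContext,
    req,
    res,
    result,
  });
  script.runInContext(context);
  return result;
}

console.log(runProxy({ url: "/sitemap.xml" }, {}));
console.log(runProxy({ url: "/other" }, {}));
```

In this sketch the script sees `require`, `process`, and `URL` as undefined because none of them was injected, and `counter` is 1 on every call because each context starts empty. Two consequences may be worth checking against the real `server.js`: any built-in that `proxy.js` relies on, such as `URL` in `parseRoute`, has to be listed in the context object, and top-level state in `proxy.js` (for example a token cache or a request queue) would not carry over between requests if the context is rebuilt each time. The Node.js documentation also states that `node:vm` is not a security mechanism, so the isolation is best treated as an architectural guard rather than a sandbox for untrusted code.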
-**Version**: 1.11.0 | **Ratified**: 2026-03-05 | **Last Amended**: 2026-03-07 +**Version**: 1.13.0 | **Ratified**: 2026-03-05 | **Last Amended**: 2026-03-07 diff --git a/src/proxy.js b/src/proxy.js index 04402a7..9fc3dc0 100644 --- a/src/proxy.js +++ b/src/proxy.js @@ -1,12 +1,13 @@ /** * Google Drive Sitemap Adapter Proxy - * + * * MONOLITHIC HTTP request handler - ALL functionality in this single file. - * Architecture: Server.js delegates ALL requests to this module's default function (req, res) => {} + * Architecture: Pure IIFE - returns request handler function when executed * Authentication: Service Account (JWT-based) inline - * - * CONSTITUTION REQUIREMENT: ZERO export statements - this file exports ONLY a default handler function - * + * + * CONSTITUTION REQUIREMENT: ZERO export statements - pure IIFE pattern + * File is loaded by server.js using Function constructor + * * Globals provided by server.js: * - console: Custom logger * - crypto: Web Crypto API (provides randomUUID()) @@ -20,7 +21,7 @@ * - scopes: OAuth2 scopes array * - driveQuery: Drive API query filter * - sitemap: Sitemap configuration (maxUrls) - * + * * Structure: * Section 1: Authentication (Service Account JWT) * Section 2: Utility Functions @@ -29,7 +30,7 @@ * Section 5: Drive API Client * Section 6: Sitemap Generation * Section 7: Request Handling & Routing - * + * * @module proxy */ @@ -49,7 +50,7 @@ let tokenExpiryTime = null; /** * Create JWT token for Google Service Account authentication * Uses RS256 algorithm with service account private key - * + * * @param {Object} credentials - Service account credentials * @returns {string} Signed JWT token */ @@ -59,28 +60,32 @@ function createServiceAccountJWT(credentials, scopes) { const payload = { iss: credentials.client_email, - scope: scopes.join(' '), - aud: 'https://oauth2.googleapis.com/token', + scope: scopes.join(" "), + aud: "https://oauth2.googleapis.com/token", exp: expiry, - iat: now + iat: now, }; - return jwt.sign(payload, 
credentials.private_key, { algorithm: 'RS256' }); + return jwt.sign(payload, credentials.private_key, { algorithm: "RS256" }); } /** * Exchange JWT for access token - * + * * @param {string} jwtToken - Signed JWT token * @returns {Promise} Access token */ async function getAccessToken(jwtToken) { - const response = await axios.post('https://oauth2.googleapis.com/token', { - grant_type: 'urn:ietf:params:oauth:grant-type:jwt-bearer', - assertion: jwtToken - }, { - headers: { 'Content-Type': 'application/x-www-form-urlencoded' } - }); + const response = await axios.post( + "https://oauth2.googleapis.com/token", + { + grant_type: "urn:ietf:params:oauth:grant-type:jwt-bearer", + assertion: jwtToken, + }, + { + headers: { "Content-Type": "application/x-www-form-urlencoded" }, + }, + ); return response.data.access_token; } @@ -88,42 +93,51 @@ async function getAccessToken(jwtToken) { /** * Initialize Google OAuth Service Account client * Uses credentials from global object (loaded by server.js from global/ directory) - * + * * @returns {Promise} Access token for Drive API * @throws {Error} If credentials are invalid or not loaded */ async function initializeServiceAccount() { try { // Load settings from consolidated global object - const settings = globalThis['google_drive_settings']; - + const settings = globalThis["google_drive_settings"]; + if (!settings) { - throw new Error('Google Drive settings not found in globalThis["google_drive_settings"]. Ensure server.js loaded global/google_drive_settings.json'); + throw new Error( + 'Google Drive settings not found in globalThis["google_drive_settings"]. 
Ensure server.js loaded global/google_drive_settings.json', + ); } - + // Validate service account structure - if (!settings.serviceAccount || !settings.serviceAccount.client_email || !settings.serviceAccount.private_key) { - throw new Error('Invalid service account key format - missing serviceAccount.client_email or serviceAccount.private_key'); + if ( + !settings.serviceAccount || + !settings.serviceAccount.client_email || + !settings.serviceAccount.private_key + ) { + throw new Error( + "Invalid service account key format - missing serviceAccount.client_email or serviceAccount.private_key", + ); } - + // Default scopes if not specified - const scopes = settings.scopes || ['https://www.googleapis.com/auth/drive.readonly']; - + const scopes = settings.scopes || [ + "https://www.googleapis.com/auth/drive.readonly", + ]; + // Create JWT token const jwtToken = createServiceAccountJWT(settings.serviceAccount, scopes); - + // Exchange JWT for access token const accessToken = await getAccessToken(jwtToken); - - console.info('Service account authenticated successfully', { - email: settings.serviceAccount.client_email + + console.info("Service account authenticated successfully", { + email: settings.serviceAccount.client_email, }); - + return accessToken; - } catch (error) { - console.error('Service account authentication failed', { - error: error.message + console.error("Service account authentication failed", { + error: error.message, }); throw error; } @@ -132,21 +146,21 @@ async function initializeServiceAccount() { /** * Get or create cached access token * Singleton pattern to avoid multiple authentications - * + * * @returns {Promise} Access token for Drive API */ async function getAccessTokenCached() { const now = Date.now(); - + // Return cached token if still valid (with 5 minute buffer) - if (accessTokenCache && tokenExpiryTime && now < (tokenExpiryTime - 300000)) { + if (accessTokenCache && tokenExpiryTime && now < tokenExpiryTime - 300000) { return 
accessTokenCache; } - + // Get new token accessTokenCache = await initializeServiceAccount(); tokenExpiryTime = now + 3600000; // 1 hour from now - + return accessTokenCache; } @@ -165,7 +179,7 @@ function clearAuthCache() { /** * Generate a unique request ID for tracing * Uses UUID v4 for uniqueness - * + * * @returns {string} Request ID in format: req_ */ function generateRequestId() { @@ -175,15 +189,15 @@ function generateRequestId() { /** * Validate document ID format * Google Drive IDs are alphanumeric with hyphens and underscores - * + * * @param {string} id - Document ID to validate * @returns {boolean} True if valid */ function validateDocumentId(id) { - if (!id || typeof id !== 'string') { + if (!id || typeof id !== "string") { return false; } - + // Google Drive IDs are typically 8-128 characters // Characters: a-z, A-Z, 0-9, -, _ const pattern = /^[a-zA-Z0-9_-]{8,128}$/; @@ -197,19 +211,19 @@ function validateDocumentId(id) { /** * Escape special XML characters * Prevents XML injection and ensures valid XML output - * + * * @param {string} str - String to escape * @returns {string} Escaped string safe for XML */ function escapeXml(str) { - if (!str) return ''; - + if (!str) return ""; + return str - .replace(/&/g, '&amp;') - .replace(/</g, '&lt;') - .replace(/>/g, '&gt;') - .replace(/"/g, '&quot;') - .replace(/'/g, '&apos;'); + .replace(/&/g, "&amp;") + .replace(/</g, "&lt;") + .replace(/>/g, "&gt;") + .replace(/"/g, "&quot;") + .replace(/'/g, "&apos;"); } // ============================================================================= @@ -218,7 +232,7 @@ function escapeXml(str) { /** * FIFO Queue for request processing - * + * * Ensures sequential processing - only one request executes at a time. * Prevents concurrent Drive API operations per specification clarification #7. 
*/ @@ -227,29 +241,29 @@ class RequestQueue { this.queue = []; this.processing = false; } - + /** * Add request to queue and start processing - * + * * @param {Function} handler - Async function to execute * @returns {Promise} Resolves when handler completes */ async enqueue(handler) { return new Promise((resolve, reject) => { this.queue.push({ handler, resolve, reject }); - - console.debug('Request enqueued', { + + console.debug("Request enqueued", { queueLength: this.queue.length, - processing: this.processing + processing: this.processing, }); - + // Start processing if not already processing if (!this.processing) { this._processNext(); } }); } - + /** * Process next request in queue * @private @@ -257,17 +271,17 @@ class RequestQueue { async _processNext() { if (this.queue.length === 0) { this.processing = false; - console.debug('Queue empty, stopping processing'); + console.debug("Queue empty, stopping processing"); return; } - + this.processing = true; const { handler, resolve, reject } = this.queue.shift(); - - console.debug('Processing next request', { - remainingInQueue: this.queue.length + + console.debug("Processing next request", { + remainingInQueue: this.queue.length, }); - + try { const result = await handler(); resolve(result); @@ -278,7 +292,7 @@ class RequestQueue { this._processNext(); } } - + /** * Get current queue length * @returns {number} @@ -286,7 +300,7 @@ class RequestQueue { get length() { return this.queue.length; } - + /** * Check if queue is processing * @returns {boolean} @@ -309,7 +323,7 @@ const requestQueue = new RequestQueue(); class DocumentCountExceededError extends Error { constructor(count, limit) { super(`Document count ${count} exceeds limit of ${limit}`); - this.name = 'DocumentCountExceededError'; + this.name = "DocumentCountExceededError"; this.count = count; this.limit = limit; this.statusCode = 413; @@ -318,10 +332,10 @@ class DocumentCountExceededError extends Error { /** * Query documents from Google Drive with 
pagination - * + * * Enforces 50k document limit per sitemap protocol specification. * If count exceeds limit, throws DocumentCountExceededError. - * + * * @param {Object} options - Query options * @param {string} options.query - Drive API query filter * @param {string} options.fields - Fields to retrieve @@ -333,106 +347,104 @@ class DocumentCountExceededError extends Error { */ async function queryDocuments(options = {}) { const { - query = 'trashed = false', - fields = 'nextPageToken,files(id,name,mimeType,modifiedTime)', + query = "trashed = false", + fields = "nextPageToken,files(id,name,mimeType,modifiedTime)", pageSize = 100, - maxDocuments = 50000 + maxDocuments = 50000, } = options; const allFiles = []; let pageToken = null; - - console.debug('Starting Drive API query', { + + console.debug("Starting Drive API query", { query, pageSize, - maxDocuments + maxDocuments, }); - + const startTime = Date.now(); - + try { const accessToken = await getAccessTokenCached(); - + do { // Build query parameters const params = new URLSearchParams({ q: query, pageSize: pageSize.toString(), - fields + fields, }); - + if (pageToken) { - params.append('pageToken', pageToken); + params.append("pageToken", pageToken); } - + // Make direct HTTP call to Drive API const response = await axios.get( `https://www.googleapis.com/drive/v3/files?${params.toString()}`, { headers: { - 'Authorization': `Bearer ${accessToken}`, - 'Accept': 'application/json' - } - } + Authorization: `Bearer ${accessToken}`, + Accept: "application/json", + }, + }, ); - + const files = response.data.files || []; allFiles.push(...files); - - console.debug('Drive API page retrieved', { + + console.debug("Drive API page retrieved", { pageFiles: files.length, totalFiles: allFiles.length, - hasMore: !!response.data.nextPageToken + hasMore: !!response.data.nextPageToken, }); - + // Check if we've exceeded the limit BEFORE fetching more if (allFiles.length > maxDocuments) { - console.error('Document count exceeds 
limit', { + console.error("Document count exceeds limit", { count: allFiles.length, - limit: maxDocuments + limit: maxDocuments, }); throw new DocumentCountExceededError(allFiles.length, maxDocuments); } - + pageToken = response.data.nextPageToken; - } while (pageToken); - + const duration = Date.now() - startTime; - - console.info('Drive API query completed', { + + console.info("Drive API query completed", { documentCount: allFiles.length, - duration + duration, }); - + return allFiles; - } catch (error) { // Re-throw DocumentCountExceededError as-is if (error instanceof DocumentCountExceededError) { throw error; } - + // Log and re-throw other errors - console.error('Drive API query failed', { + console.error("Drive API query failed", { error: error.message, code: error.code, - statusCode: error.response?.status + statusCode: error.response?.status, }); - + throw error; } } /** * Map Drive API error to HTTP status code and retry info - * + * * Per specification: * - 429: Rate limit - include Retry-After header * - 503: Service unavailable - NO RETRY (fail immediately) * - 401: Authentication failed * - 500: Other errors - * + * * @param {Error} error - Drive API error * @returns {Object} { statusCode, retryAfter? } */ @@ -441,39 +453,39 @@ function mapDriveErrorToHttp(error) { if (error instanceof DocumentCountExceededError) { return { statusCode: 413 }; } - + // Extract status code from Drive API error const statusCode = error.response?.status || error.code || 500; - + // Handle rate limiting (429) if (statusCode === 429) { // Extract Retry-After from response headers if present - const retryAfter = error.response?.headers?.['retry-after']; + const retryAfter = error.response?.headers?.["retry-after"]; const retryAfterSeconds = retryAfter ? 
parseInt(retryAfter, 10) : 60; - + return { statusCode: 429, - retryAfter: retryAfterSeconds + retryAfter: retryAfterSeconds, }; } - + // Handle service unavailable (503) - NO RETRY per spec if (statusCode === 503) { return { statusCode: 503 }; } - + // Handle authentication errors if (statusCode === 401 || statusCode === 403) { return { statusCode: statusCode }; } - + // All other errors map to 500 return { statusCode: 500 }; } /** * Validate document count against limit - * + * * @param {number} count - Document count * @param {number} limit - Maximum allowed (default: 50000) * @throws {DocumentCountExceededError} If count > limit @@ -490,10 +502,10 @@ function validateDocumentCount(count, limit = 50000) { /** * Transform Drive document to sitemap entry - * + * * Creates RESTful URL in format: {baseUrl}/documents/{documentId} * Per specification clarification #2. - * + * * @param {Object} document - Drive API document * @param {string} document.id - Document ID * @param {string} document.modifiedTime - ISO 8601 timestamp @@ -502,86 +514,86 @@ function validateDocumentCount(count, limit = 50000) { */ function toSitemapEntry(document, baseUrl) { if (!document || !document.id) { - console.error('Invalid document for sitemap entry', { document }); + console.error("Invalid document for sitemap entry", { document }); return null; } - + // RESTful URL format: /documents/{documentId} const loc = `${baseUrl}/documents/${encodeURIComponent(document.id)}`; - + // Format lastmod as ISO 8601 date (YYYY-MM-DD) let lastmod; if (document.modifiedTime) { try { const date = new Date(document.modifiedTime); - lastmod = date.toISOString().split('T')[0]; // Extract YYYY-MM-DD + lastmod = date.toISOString().split("T")[0]; // Extract YYYY-MM-DD } catch (error) { - console.error('Invalid modifiedTime for document', { + console.error("Invalid modifiedTime for document", { documentId: document.id, - modifiedTime: document.modifiedTime + modifiedTime: document.modifiedTime, }); - lastmod = 
new Date().toISOString().split('T')[0]; // Fallback to today + lastmod = new Date().toISOString().split("T")[0]; // Fallback to today } } else { - lastmod = new Date().toISOString().split('T')[0]; // Fallback to today + lastmod = new Date().toISOString().split("T")[0]; // Fallback to today } - + return { loc, lastmod }; } /** * Transform array of Drive documents to sitemap entries - * + * * @param {Array} documents - Array of Drive API documents * @param {string} baseUrl - Base URL for the adapter * @returns {Array} Array of sitemap entries */ function transformDocumentsToSitemapEntries(documents, baseUrl) { if (!Array.isArray(documents)) { - console.error('Documents must be an array', { documents }); + console.error("Documents must be an array", { documents }); return []; } - + return documents - .map(doc => toSitemapEntry(doc, baseUrl)) - .filter(entry => entry !== null); + .map((doc) => toSitemapEntry(doc, baseUrl)) + .filter((entry) => entry !== null); } /** * Generate XML sitemap from sitemap entries - * + * * Handles empty sitemap (0 documents) case - returns valid XML with empty urlset. - * + * * @param {Array} sitemapEntries - Array of { loc, lastmod } objects * @returns {string} Complete XML sitemap string */ function generateSitemapXML(sitemapEntries) { let xml = '<?xml version="1.0" encoding="UTF-8"?>\n'; xml += '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'; - + // Handle empty sitemap - valid XML with no <url> elements if (!sitemapEntries || sitemapEntries.length === 0) { - xml += '</urlset>'; + xml += "</urlset>"; return xml; } - + for (const entry of sitemapEntries) { - xml += ' <url>\n'; + xml += " <url>\n"; xml += ` <loc>${escapeXml(entry.loc)}</loc>\n`; xml += ` <lastmod>${escapeXml(entry.lastmod)}</lastmod>\n`; - xml += ' </url>\n'; + xml += " </url>\n"; } - - xml += '</urlset>'; - + + xml += "</urlset>"; + return xml; } /** * Main sitemap generation function - * + * * Combines document transformation and XML generation. 
- * + * * @param {Array} documents - Array of Drive API documents * @param {string} baseUrl - Base URL for the adapter * @returns {string} Complete XML sitemap @@ -602,26 +614,26 @@ function generateSitemap(documents, baseUrl) { * @returns {Object} Route info or error */ function parseRoute(method, url) { - if (method !== 'GET') { - return { route: null, error: 'Method not allowed', statusCode: 405 }; + if (method !== "GET") { + return { route: null, error: "Method not allowed", statusCode: 405 }; } - - const urlObj = new URL(url, 'http://localhost'); + + const urlObj = new URL(url, "http://localhost"); const path = urlObj.pathname; - + // Match any path containing 'sitemap.xml' - if (path.includes('sitemap.xml')) { - return { route: 'sitemap' }; + if (path.includes("sitemap.xml")) { + return { route: "sitemap" }; } - + // All other paths return 404 - return { route: null, error: 'Not found', statusCode: 404 }; + return { route: null, error: "Not found", statusCode: 404 }; } /** * Handle sitemap generation request * Wrapped in FIFO queue to ensure sequential processing. 
- * 
+ *
  * @param {Object} res - HTTP response object
  * @param {string} requestId - Request ID for tracing
  * @returns {Promise}
@@ -629,51 +641,50 @@ function parseRoute(method, url) {
 async function handleSitemapRequest(res, requestId) {
   try {
     // Get configuration from consolidated global settings
-    const settings = globalThis['google_drive_settings'] || {};
+    const settings = globalThis["google_drive_settings"] || {};
     const maxUrls = settings.sitemap?.maxUrls || 50000;
-    const query = settings.driveQuery || 'trashed = false';
-    
+    const query = settings.driveQuery || "trashed = false";
+
     // Query documents from Drive API
     // This will throw DocumentCountExceededError if exceeds maxUrls limit
     const documents = await queryDocuments({
       query: query,
-      maxDocuments: maxUrls
+      maxDocuments: maxUrls,
     });
-    
+
     // Generate sitemap XML with RESTful URLs
     const xml = generateSitemap(documents, settings.proxyScriptEndPoint);
-    
+
     // Send successful response
     res.statusCode = 200;
-    res.setHeader('Content-Type', 'application/xml; charset=utf-8');
-    res.setHeader('X-Request-Id', requestId);
-    res.setHeader('X-Document-Count', documents.length.toString());
+    res.setHeader("Content-Type", "application/xml; charset=utf-8");
+    res.setHeader("X-Request-Id", requestId);
+    res.setHeader("X-Document-Count", documents.length.toString());
     res.end(xml);
-    
-    console.info('Sitemap generated successfully', {
+
+    console.info("Sitemap generated successfully", {
       requestId,
-      documentCount: documents.length
+      documentCount: documents.length,
     });
-
   } catch (error) {
     // Map Drive API error to HTTP status code
     const errorResponse = mapDriveErrorToHttp(error);
-    
+
     res.statusCode = errorResponse.statusCode;
-    
+
     // Add Retry-After header for rate limiting (429)
     if (errorResponse.retryAfter) {
-      res.setHeader('Retry-After', errorResponse.retryAfter.toString());
+      res.setHeader("Retry-After", errorResponse.retryAfter.toString());
     }
-    
+
     // Per specification: error responses have NO body
     res.end();
-    
-    console.error('Sitemap generation failed', {
+
+    console.error("Sitemap generation failed", {
       requestId,
       error: error.message,
       statusCode: errorResponse.statusCode,
-      retryAfter: errorResponse.retryAfter
+      retryAfter: errorResponse.retryAfter,
     });
   }
 }
@@ -681,65 +692,61 @@ async function handleSitemapRequest(res, requestId) {
 /**
  * Handle all HTTP requests
  * Main entry point called by server.js
- * 
+ *
  * @param {Object} req - HTTP request object
  * @param {Object} res - HTTP response object
  */
-async function handleRequest(req, res) {
+(async () => {
   const requestId = generateRequestId();
   const startTime = Date.now();
-  
-  console.info('Request received', {
+
+  console.info("Request received", {
     requestId,
     method: req.method,
-    url: req.url
+    url: req.url,
   });
-  
+
   try {
     // Parse route
     const routeResult = parseRoute(req.method, req.url);
-    
+
     if (!routeResult.route) {
       res.statusCode = routeResult.statusCode;
       res.end(); // Empty body per spec
-      
-      console.error('Route not found', {
+
+      console.error("Route not found", {
         requestId,
         url: req.url,
-        statusCode: routeResult.statusCode
+        statusCode: routeResult.statusCode,
       });
-      
+
       return;
     }
-    
+
     // Handle sitemap route with FIFO queue
     // Per specification: queue concurrent requests, process sequentially
-    if (routeResult.route === 'sitemap') {
+    if (routeResult.route === "sitemap") {
       await requestQueue.enqueue(async () => {
         await handleSitemapRequest(res, requestId);
       });
       return;
     }
-
   } catch (error) {
     res.statusCode = 500;
     res.end();
-    
-    console.error('Request handler error', {
+
+    console.error("Request handler error", {
       requestId,
       error: error.message,
-      stack: error.stack
+      stack: error.stack,
     });
-
   } finally {
     const duration = Date.now() - startTime;
-    
-    console.info('Request completed', {
+
+    console.info("Request completed", {
       requestId,
       statusCode: res.statusCode,
-      duration
+      duration,
     });
   }
-}
-handleRequest(req, res); // This line is just for clarity - actual invocation is done by server.js
-
+})();
diff --git a/src/server.js b/src/server.js
index 6070046..35a3c43 100644
--- a/src/server.js
+++ b/src/server.js
@@ -1,64 +1,67 @@
-import http from 'node:http';
-import { join } from 'node:path';
-import { readFileSync, readdirSync } from 'node:fs';
-import { fileURLToPath } from 'node:url';
-import { dirname } from 'node:path';
-import axios from 'axios';
-import { v4 as uuidv4 } from 'uuid';
-import jwt from 'jsonwebtoken';
-import { create as xmlBuilder } from 'xmlbuilder2';
-import { logger } from './logger.js';
+import http from "node:http";
+import { join } from "node:path";
+import { readFileSync, readdirSync } from "node:fs";
+import { fileURLToPath } from "node:url";
+import { dirname } from "node:path";
+import vm from "node:vm";
+import axios from "axios";
+import { v4 as uuidv4 } from "uuid";
+import jwt from "jsonwebtoken";
+import { create as xmlBuilder } from "xmlbuilder2";
+import { logger } from "./logger.js";
 
 const __filename = fileURLToPath(import.meta.url);
 const __dirname = dirname(__filename);
 
-// Note: crypto is already available as globalThis.crypto (Web Crypto API)
-// No need to import or set it - Node.js provides it by default
+const globalVMContext = {
+  URLSearchParams,
+  console: logger,
+  crypto,
+  axios,
+  uuidv4,
+  jwt,
+  xmlBuilder,
+};
 
-// Replace global console with custom logger
-globalThis.console = logger;
-
-// Make libraries available globally for proxy.js
-globalThis.axios = axios;
-globalThis.uuidv4 = uuidv4;
-globalThis.jwt = jwt;
-globalThis.xmlBuilder = xmlBuilder;
+let globalVariableContext = {};
 
 /**
  * Load all JSON files from global/ directory and make them available as global objects
- * Pattern: global/filename.json -> globalThis['filename']
+ * Pattern: global/filename.json -> globalVariableContext['filename']
  */
 function loadGlobalObjects() {
-  const globalDir = join(__dirname, '..', 'global');
-  
+  const globalDir = join(__dirname, "..", "global");
+
   try {
-    const files = readdirSync(globalDir).filter(f => f.endsWith('.json') && !f.endsWith('.example'));
-    
-    files.forEach(file => {
-      const objectName = file.replace('.json', '');
+    const files = readdirSync(globalDir).filter(
+      (f) => f.endsWith(".json") && !f.endsWith(".example"),
+    );
+
+    files.forEach((file) => {
+      const objectName = file.replace(".json", "");
       const filePath = join(globalDir, file);
-      
+
       try {
-        const content = readFileSync(filePath, 'utf-8');
+        const content = readFileSync(filePath, "utf-8");
         const data = JSON.parse(content);
-        globalThis[objectName] = data;
+        globalVariableContext[objectName] = data;
         logger.info(`Loaded global object: ${objectName}`, {
           file: file,
-          keys: Object.keys(data)
+          keys: Object.keys(data),
         });
       } catch (error) {
         logger.error(`Failed to load global object from ${file}`, {
-          error: error.message
+          error: error.message,
         });
         throw error;
       }
     });
-    
+
     logger.info(`Loaded ${files.length} global objects from ${globalDir}`);
   } catch (error) {
-    logger.error('Failed to load global objects', {
+    logger.error("Failed to load global objects", {
       directory: globalDir,
-      error: error.message
+      error: error.message,
     });
     throw error;
   }
@@ -67,16 +70,18 @@ function loadGlobalObjects() {
 /**
  * Load configuration from config/default.json
  * Merges with environment variables (ENV takes precedence)
- * 
+ *
  * @returns {Object} Configuration object
  */
 function loadConfig() {
-  const configPath = join(__dirname, '..', 'config', 'default.json');
-  const configData = readFileSync(configPath, 'utf-8');
+  const configPath = join(__dirname, "..", "config", "default.json");
+  const configData = readFileSync(configPath, "utf-8");
   const config = JSON.parse(configData);
 
   // Merge environment variables (ENV vars take precedence)
-  config.server.port = process.env.PORT ? parseInt(process.env.PORT, 10) : config.server.port;
+  config.server.port = process.env.PORT
+    ? parseInt(process.env.PORT, 10)
+    : config.server.port;
   config.server.host = process.env.HOST || config.server.host;
   config.logging.level = process.env.LOG_LEVEL || config.logging.level;
 
@@ -92,12 +97,16 @@ function validateConfig(config) {
   const errors = [];
 
   // Validate server configuration
-  if (!config.server.port || config.server.port < 1 || config.server.port > 65535) {
-    errors.push('Invalid server.port (must be 1-65535)');
+  if (
+    !config.server.port ||
+    config.server.port < 1 ||
+    config.server.port > 65535
+  ) {
+    errors.push("Invalid server.port (must be 1-65535)");
   }
 
   if (errors.length > 0) {
-    throw new Error(`Configuration validation failed:\n${errors.join('\n')}`);
+    throw new Error(`Configuration validation failed:\n${errors.join("\n")}`);
   }
 }
 
@@ -108,63 +117,77 @@ async function startServer() {
   try {
     // Load configuration into global.config
     global.config = loadConfig();
-    
+
     // Load global objects from global/ directory (e.g., service account keys)
     loadGlobalObjects();
 
-    logger.info('Starting Proxy Script Server...');
-    logger.info(`Configuration loaded: ${JSON.stringify({
-      port: global.config.server.port,
-      host: global.config.server.host,
-      logLevel: global.config.logging.level
-    })}`);
+    logger.info("Starting Proxy Script Server...");
+    logger.info(
+      `Configuration loaded: ${JSON.stringify({
+        port: global.config.server.port,
+        host: global.config.server.host,
+        logLevel: global.config.logging.level,
+      })}`,
+    );
 
     // Validate configuration
     validateConfig(global.config);
-    logger.info('Configuration validated successfully');
+    logger.info("Configuration validated successfully");
 
-    // Load proxy.js as a function wrapper (ZERO exports per constitution)
-    // Import the module which sets up globalThis.handleRequest
-    await import('./proxy.js');
-
-    // Wrap the global handleRequest function for clean invocation
-    const handleRequest = (req, res) => {
-      return globalThis.handleRequest(req, res);
-    };
+    const proxyPath = join(__dirname, "proxy.js");
+    const proxyCode = readFileSync(proxyPath, "utf-8");
+    const script = new vm.Script(proxyCode, { filename: "proxy.js" });
 
     // Create HTTP server that delegates all requests to proxy
-    const server = http.createServer(handleRequest);
+    const server = http.createServer((req, res) => {
+      try {
+        // Create a context with all globals that proxy.js needs
+        const context = vm.createContext({
+          ...globalVMContext,
+          ...globalVariableContext,
+          req,
+          res,
+        });
+        script.runInContext(context);
+      } catch (error) {
+        logger.error("Request handling failed", {
+          error: error.message,
+          stack: error.stack,
+        });
+        res.statusCode = 500;
+        res.end("Internal Server Error");
+      }
+    });
 
     // Graceful shutdown
     const shutdown = () => {
-      logger.info('\nShutting down gracefully...');
+      logger.info("\nShutting down gracefully...");
       server.close(() => {
-        logger.info('Server closed');
+        logger.info("Server closed");
         process.exit(0);
       });
 
       // Force shutdown after 10 seconds
       setTimeout(() => {
-        logger.error('Forced shutdown after timeout');
+        logger.error("Forced shutdown after timeout");
         process.exit(1);
       }, 10000);
     };
 
-    process.on('SIGTERM', shutdown);
-    process.on('SIGINT', shutdown);
+    process.on("SIGTERM", shutdown);
+    process.on("SIGINT", shutdown);
 
     // Start listening
     server.listen(global.config.server.port, global.config.server.host, () => {
-      logger.info('Server listening', {
+      logger.info("Server listening", {
         port: global.config.server.port,
         host: global.config.server.host,
       });
     });
-
   } catch (error) {
-    logger.error('Failed to start server', {
+    logger.error("Failed to start server", {
       error: error.message,
-      stack: error.stack
+      stack: error.stack,
     });
     process.exit(1);
   }
diff --git a/tests/unit/utils.test.js b/tests/unit/utils.test.js
index a8da674..e80d240 100644
--- a/tests/unit/utils.test.js
+++ b/tests/unit/utils.test.js
@@ -1,15 +1,20 @@
 /**
  * Unit Tests for General Utilities
  *
- * NOTE: Per constitution requirement, proxy.js has ZERO exports.
- * Internal functions (generateRequestId, validateDocumentId, etc.) cannot be unit tested directly.
- * These functions are tested indirectly through integration tests of the main handleRequest function.
+ * NOTE: Per constitution requirement, proxy.js has ZERO exports and NO globalThis usage.
+ * The file is a pure function expression loaded via Function constructor.
  *
  * This test file verifies constitution compliance only.
  */
 
 import { test, describe } from 'node:test';
 import assert from 'node:assert';
+import { readFileSync } from 'node:fs';
+import { join, dirname } from 'node:path';
+import { fileURLToPath } from 'node:url';
+
+const __filename = fileURLToPath(import.meta.url);
+const __dirname = dirname(__filename);
 
 // Set up globals that server.js would provide
 // Note: crypto is already available on globalThis (Web Crypto API)
@@ -17,11 +22,23 @@ globalThis.config = { google: {}, server: {}, sitemap: {} };
 
 describe('Unit: Constitution Compliance', () => {
-  test('T046: proxy.js has ZERO exports and exposes handleRequest via globalThis', async () => {
-    // Verify proxy.js can be loaded and exposes handleRequest via globalThis
-    await import('../../src/proxy.js');
-    assert.ok(globalThis.handleRequest, 'handleRequest should be available on globalThis');
-    assert.strictEqual(typeof globalThis.handleRequest, 'function', 'handleRequest should be a function');
+  test('T046: proxy.js has ZERO exports/imports and loads as pure function', () => {
+    const proxyPath = join(__dirname, '..', '..', 'src', 'proxy.js');
+    const proxyCode = readFileSync(proxyPath, 'utf-8');
+
+    // Verify no exports
+    assert.ok(!proxyCode.match(/^export /m), 'Should have no export statements');
+
+    // Verify no imports
+    assert.ok(!proxyCode.match(/^import /m), 'Should have no import statements');
+
+    // Verify no globalThis usage (except for accessing provided globals)
+    const globalThisAssignments = proxyCode.match(/globalThis\.[a-zA-Z_]+ =/g);
+    assert.ok(!globalThisAssignments, 'Should not assign to globalThis');
+
+    // Verify it's a function expression that can be executed
+    assert.ok(proxyCode.includes('(function()'), 'Should contain function expression');
+    assert.ok(proxyCode.includes('return handleRequest'), 'Should return handleRequest');
   });
 
   test('T046: crypto is available on globalThis (Web Crypto API)', () => {