Tasks: Google Drive HTTP Proxy Adapter

Input: Design documents from /specs/001-drive-proxy-adapter/
Prerequisites: plan.md, spec.md, research.md, data-model.md, contracts/, quickstart.md

Feature: Generate XML sitemaps from Google Drive documents via HTTP endpoint

Key Clarifications Incorporated (10 total):

  1. Service Account JWT auth with inline JSON env var
  2. RESTful URL format /documents/{documentId}
  3. No retries on 503 errors
  4. stdout/stderr logging only
  5. 413 error for >50k documents
  6. Crash with exit code 1 for fatal errors
  7. FIFO queue for concurrent requests
  8. Plain text logging format [timestamp] [level] message
  9. Configurable Drive API filter in config/settings.js
  10. Status code only errors (no response body)

Tests: Test-First Development enforced per Constitution Principle III

Organization: Tasks are grouped by user story (only US1 exists for this feature - single endpoint system)


Format: - [ ] [ID] [P?] [Story?] Description

  • [P]: Can run in parallel (different files, no dependencies)
  • [Story]: User story label (US1, US2, etc.) - only for user story phases
  • Include exact file paths in descriptions

Phase 1: Setup (Shared Infrastructure)

Purpose: Project initialization and basic structure

  • T001 Initialize Node.js project with package.json at repository root
  • T002 Install googleapis dependency v140.0.0 in package.json
  • T003 [P] Create src/ directory for application source code
  • T004 [P] Create config/ directory for configuration files
  • T005 [P] Create tests/unit/ directory for unit tests
  • T006 [P] Create tests/integration/ directory for integration tests
  • T007 [P] Create tests/contract/ directory for contract tests
  • T008 Configure Node.js native test runner in package.json with test scripts
  • T009 [P] Setup ESLint configuration in .eslintrc.json for ES2022+ JavaScript
  • T010 [P] Create .env.example file documenting required environment variables

Phase 2: Foundational (Blocking Prerequisites)

Purpose: Core infrastructure that MUST be complete before user story implementation

⚠️ CRITICAL: User Story 1 cannot begin until this phase is complete

  • T011 Create console.js module in src/ with formatMessage function and log/info/debug/error methods (plain text format: [timestamp] [level] message)
  • T012 Create config/config.js exporting server configuration (port, baseUrl from env vars)
  • T013 Create config/settings.js exporting Drive API configuration (query filter from env var DRIVE_QUERY or default "trashed = false", fields, pageSize, scope)
  • T014 Create auth.js module in src/ for Service Account JWT authentication using googleapis GoogleAuth class
  • T015 Add credential validation function in src/auth.js to check client_email, private_key, project_id structure
  • T016 Implement fatal error handler in src/auth.js that logs to stderr and exits with code 1 if credentials invalid
  • T017 Create xml-utils.js module in src/ with XML escaping utilities for special characters (&, <, >, ", ')
  • T018 Implement FIFO request queue class in src/queue.js using Node.js EventEmitter with processing flag and queue array
  • T019 Create server.js entry point in src/ that sets up HTTP server with http module
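The plain text log format from T011 can be sketched as below. This is only an illustration: the ISO 8601 timestamp, the upper-case level, and the `logger` object name are assumptions, since the tasks fix only the `[timestamp] [level] message` shape and the stdout/stderr split.

```javascript
// Sketch of src/console.js (T011). Timestamp style and level casing are
// assumptions; only the "[timestamp] [level] message" shape comes from the spec.
function formatMessage(level, message, now = new Date()) {
  return `[${now.toISOString()}] [${level.toUpperCase()}] ${message}`;
}

// log/info/debug write to stdout, error writes to stderr (clarification 4).
const logger = {
  log: (msg) => process.stdout.write(formatMessage("log", msg) + "\n"),
  info: (msg) => process.stdout.write(formatMessage("info", msg) + "\n"),
  debug: (msg) => process.stdout.write(formatMessage("debug", msg) + "\n"),
  error: (msg) => process.stderr.write(formatMessage("error", msg) + "\n"),
};

logger.info("server listening");
```

Passing `now` as a parameter keeps `formatMessage` deterministic in unit tests.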

Checkpoint: Foundation ready - User Story 1 implementation can now begin


Phase 3: User Story 1 - Generate Sitemap of Available Documents (Priority: P1) 🎯 MVP

Goal: Users can request /sitemap.xml and receive a valid XML sitemap listing all accessible Google Drive documents with RESTful links containing document IDs

Independent Test: Make GET request to /sitemap.xml and verify: (1) 200 status with valid XML sitemap format, (2) URLs use RESTful format /documents/{documentId}, (3) reflects documents in Google Drive, (4) handles >50k documents with 413, (5) queues concurrent requests in FIFO order

Why this is the complete feature: This feature has only one user story. The system provides a single endpoint for sitemap generation.


Tests for User Story 1 (Test-First Development) ⚠️

CONSTITUTION REQUIREMENT: Write these tests FIRST, ensure they FAIL, obtain user approval before implementation

Contract Tests

  • T020 [P] [US1] Contract test for /sitemap.xml success response (200 OK) in tests/contract/sitemap-schema.test.js - verify XML structure, namespace, Content-Type header
  • T021 [P] [US1] Contract test for /sitemap.xml with empty Drive (0 documents) in tests/contract/sitemap-schema.test.js - verify empty urlset is valid
  • T022 [P] [US1] Contract test for XML special character escaping in tests/contract/sitemap-schema.test.js - verify &, <, >, ", ' are properly escaped in URLs
  • T023 [P] [US1] Contract test for lastmod date format validation in tests/contract/sitemap-schema.test.js - verify ISO 8601 format YYYY-MM-DD
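The escaping checked by T022 could be backed by a helper along these lines. The name `escapeXml` appears in T040; the lookup-table approach is one possible implementation, not necessarily the one in xml-utils.js.

```javascript
// The five characters that must be entity-escaped in sitemap values.
const XML_ENTITIES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&apos;",
};

// Replace each special character in a single pass, so the "&" produced by
// one replacement is never escaped a second time.
function escapeXml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => XML_ENTITIES[ch]);
}
```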

Integration Tests

  • T024 [P] [US1] Integration test for /sitemap.xml endpoint success scenario in tests/integration/sitemap-endpoint.test.js - mock Drive API, verify 200 response with valid XML
  • T025 [P] [US1] Integration test for /sitemap.xml with >50k documents in tests/integration/error-scenarios.test.js - verify 413 response with no body
  • T026 [P] [US1] Integration test for /sitemap.xml with Drive API rate limiting in tests/integration/error-scenarios.test.js - verify 429 response with Retry-After header and no body
  • T027 [P] [US1] Integration test for /sitemap.xml with Drive API 503 error in tests/integration/error-scenarios.test.js - verify 503 passthrough with no retry and no body
  • T028 [P] [US1] Integration test for invalid endpoint requests in tests/integration/error-scenarios.test.js - verify 404 response with no body for non-/sitemap.xml paths
  • T029 [P] [US1] Integration test for concurrent requests to /sitemap.xml in tests/integration/queue-concurrency.test.js - verify FIFO processing (one at a time)
  • T030 [P] [US1] Integration test for Service Account token refresh in tests/integration/sitemap-endpoint.test.js - mock token expiry, verify 401 if refresh fails

Unit Tests

  • T031 [P] [US1] Unit test for Drive API client query execution in tests/unit/drive-client.test.js - mock googleapis drive.files.list() call
  • T032 [P] [US1] Unit test for Drive API pagination handling in tests/unit/drive-client.test.js - verify pageToken logic for >1000 documents
  • T033 [P] [US1] Unit test for Service Account JWT authentication in tests/unit/auth.test.js - verify GoogleAuth client creation from env var JSON
  • T034 [P] [US1] Unit test for credential validation in tests/unit/auth.test.js - verify detection of invalid client_email, private_key, project_id
  • T035 [P] [US1] Unit test for sitemap XML generation in tests/unit/sitemap-generator.test.js - verify XML structure and URL format /documents/{documentId}
  • T036 [P] [US1] Unit test for Document to SitemapEntry transformation in tests/unit/sitemap-generator.test.js - verify baseUrl + /documents/ + documentId concatenation
  • T037 [P] [US1] Unit test for lastmod date formatting in tests/unit/sitemap-generator.test.js - verify ISO 8601 YYYY-MM-DD format from modifiedTime
  • T038 [P] [US1] Unit test for FIFO queue enqueue/dequeue in tests/unit/queue.test.js - verify sequential processing order
  • T039 [P] [US1] Unit test for FIFO queue concurrent request handling in tests/unit/queue.test.js - verify processing flag prevents simultaneous execution
  • T040 [P] [US1] Unit test for XML special character escaping in tests/unit/sitemap-generator.test.js - verify escapeXml function handles &, <, >, ", '
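For T038 and T039, it may help to have a concrete picture of the queue under test. The following is a minimal sketch of the design named in T018 (EventEmitter, a processing flag, a queue array). The class name `RequestQueue` and the `idle` event are hypothetical; `process()` matches the method name used in T053.

```javascript
const { EventEmitter } = require("node:events");

// Sketch of src/queue.js: one job runs at a time, in arrival order.
class RequestQueue extends EventEmitter {
  constructor() {
    super();
    this.queue = [];         // pending jobs, oldest first
    this.processing = false; // true while a drain loop is running
  }

  // Enqueue an async job; the returned promise settles with that job's outcome.
  process(job) {
    return new Promise((resolve, reject) => {
      this.queue.push({ job, resolve, reject });
      this.#drain();
    });
  }

  async #drain() {
    if (this.processing) return; // an earlier call already owns the queue
    this.processing = true;
    while (this.queue.length > 0) {
      const { job, resolve, reject } = this.queue.shift();
      try {
        resolve(await job());
      } catch (err) {
        reject(err); // a failed job does not block the ones behind it
      }
    }
    this.processing = false;
    this.emit("idle");
  }
}
```

A unit test can enqueue jobs of different durations and assert both the completion order and that no two jobs were ever active at once.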

TEST APPROVAL CHECKPOINT: Present test scenarios to user for approval before proceeding to implementation


Implementation for User Story 1

Drive API Integration

  • T041 [P] [US1] Create drive-client.js module in src/ with function to initialize googleapis drive client using auth from src/auth.js
  • T042 [US1] Implement queryDocuments function in src/drive-client.js to call drive.files.list() with query from config/settings.js and fields: files(id, name, mimeType, modifiedTime)
  • T043 [US1] Implement pagination logic in src/drive-client.js to handle pageToken and collect all results up to 50,000 limit
  • T044 [US1] Add document count validation in src/drive-client.js to return error if count exceeds 50,000
  • T045 [US1] Implement error mapping in src/drive-client.js to detect Drive API 429 (rate limit), 503 (unavailable), auth failures
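The pagination and limit check in T043 and T044 might look roughly like this. To keep the sketch runnable without credentials, the Drive call is injected as `listPage`, a hypothetical wrapper around `drive.files.list()` that returns `{ files, nextPageToken }` (in the googleapis client those values sit under `response.data`). `TooManyDocumentsError` and its `status` field are likewise illustrative.

```javascript
const MAX_DOCUMENTS = 50_000; // limit from clarification 5

// Hypothetical error type so the route handler can map it to 413.
class TooManyDocumentsError extends Error {
  constructor(count) {
    super(`document count ${count} exceeds ${MAX_DOCUMENTS}`);
    this.status = 413;
  }
}

// Collect every page until nextPageToken is absent, stopping early
// as soon as the running total passes the limit.
async function queryDocuments(listPage) {
  const documents = [];
  let pageToken;
  do {
    const { files = [], nextPageToken } = await listPage({ pageToken });
    documents.push(...files);
    if (documents.length > MAX_DOCUMENTS) {
      throw new TooManyDocumentsError(documents.length);
    }
    pageToken = nextPageToken;
  } while (pageToken);
  return documents;
}
```

Checking the count inside the loop avoids fetching further pages once the limit is already exceeded.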

Sitemap Generation

  • T046 [P] [US1] Create sitemap-generator.js module in src/ with function to transform Document array to SitemapEntry array
  • T047 [US1] Implement toSitemapEntry function in src/sitemap-generator.js to construct loc URLs using baseUrl + /documents/ + encodeURIComponent(documentId)
  • T048 [US1] Implement lastmod date extraction in src/sitemap-generator.js to format modifiedTime as ISO 8601 date (YYYY-MM-DD)
  • T049 [US1] Implement generateSitemapXML function in src/sitemap-generator.js to build XML string with proper namespace and escaped URLs using xml-utils.js
  • T050 [US1] Add empty sitemap handling in src/sitemap-generator.js to return valid XML with empty urlset when 0 documents
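Tasks T047 to T050 could fit together as in the sketch below. The function names come from the tasks; the exact whitespace of the output and the inline `escapeXml` stand-in (the real one would be imported from xml-utils.js) are assumptions.

```javascript
const SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9";

// Compact stand-in for the helper that would come from src/xml-utils.js.
const escapeXml = (s) =>
  String(s).replace(/[&<>"']/g, (c) =>
    ({ "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&apos;" })[c]);

// Document -> SitemapEntry (T047, T048).
function toSitemapEntry(doc, baseUrl) {
  return {
    loc: `${baseUrl}/documents/${encodeURIComponent(doc.id)}`,
    // Normalise modifiedTime to UTC and keep only YYYY-MM-DD.
    // Note: an unparseable modifiedTime makes toISOString() throw a RangeError.
    lastmod: new Date(doc.modifiedTime).toISOString().slice(0, 10),
  };
}

// Build the XML string; with zero entries this yields an empty <urlset> (T050).
function generateSitemapXML(entries) {
  const urls = entries.map(
    (e) =>
      `  <url>\n    <loc>${escapeXml(e.loc)}</loc>\n` +
      `    <lastmod>${e.lastmod}</lastmod>\n  </url>\n`
  );
  return (
    `<?xml version="1.0" encoding="UTF-8"?>\n` +
    `<urlset xmlns="${SITEMAP_NS}">\n${urls.join("")}</urlset>\n`
  );
}
```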

Request Routing and Error Handling

  • T051 [US1] Create proxy.js monolithic route handler in src/ that imports queue, drive-client, sitemap-generator modules
  • T052 [US1] Implement request handler function in src/proxy.js that checks if path is /sitemap.xml (404 for all other paths with no response body)
  • T053 [US1] Implement FIFO queue integration in src/proxy.js to enqueue /sitemap.xml requests using queue.process() from src/queue.js
  • T054 [US1] Implement sitemap generation flow in src/proxy.js: authenticate → query Drive API → check count → transform to sitemap → generate XML
  • T055 [US1] Implement error response handling in src/proxy.js for 413 (>50k docs), 429 (rate limit with Retry-After header), 503 (Drive unavailable), 401 (auth failed), 500 (unexpected) - all with NO response body
  • T056 [US1] Add HTTP response headers in src/proxy.js: Content-Type: application/xml; charset=utf-8 for 200 responses, no Content-Type for errors
  • T057 [US1] Extract Retry-After value from Drive API 429 error in src/proxy.js and set Retry-After header in seconds
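One way to realise the body-less error responses of T055 to T057 is sketched here. It assumes the error mapping in src/drive-client.js (T045) attaches a numeric `status` and, for 429, a `retryAfter` value in seconds to the thrown error; those field names are illustrative and not part of the googleapis API.

```javascript
// Statuses the handler may send as-is; anything else becomes 500.
const KNOWN_STATUSES = new Set([401, 413, 429, 503]);

// Map an error to { status, headers }. err.status and err.retryAfter are
// assumed fields set by our own drive-client error mapping.
function toErrorResponse(err) {
  const status = KNOWN_STATUSES.has(err?.status) ? err.status : 500;
  const headers = {};
  const seconds = Number.parseInt(err?.retryAfter, 10);
  if (status === 429 && Number.isInteger(seconds) && seconds >= 0) {
    headers["Retry-After"] = String(seconds);
  }
  return { status, headers };
}

// Send the status line and headers only: no Content-Type, no body.
function sendError(res, err) {
  const { status, headers } = toErrorResponse(err);
  res.writeHead(status, headers);
  res.end();
}
```

Keeping `toErrorResponse` pure means the status mapping can be unit-tested without starting an HTTP server.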

Logging and Observability

  • T058 [US1] Add request logging in src/proxy.js to log incoming requests with method, path, client IP using console.info() from src/console.js
  • T059 [US1] Add response logging in src/proxy.js to log status code and response time for each request using console.info()
  • T060 [US1] Add Drive API operation logging in src/drive-client.js to log query start, document count, and completion time using console.debug()
  • T061 [US1] Add error logging in src/proxy.js to log errors with request context (requestId) and error message using console.error() to stderr
  • T062 [US1] Implement requestId generation in src/proxy.js using crypto.randomUUID() for request tracing

Server Lifecycle

  • T063 [US1] Implement HTTP server setup in src/server.js to route all requests to src/proxy.js handler
  • T064 [US1] Load configuration in src/server.js from config/config.js and config/settings.js on startup
  • T065 [US1] Load Service Account credentials in src/server.js from GOOGLE_SERVICE_ACCOUNT_KEY env var on startup
  • T066 [US1] Add startup validation in src/server.js to call credential validation from src/auth.js and exit(1) on failure
  • T067 [US1] Implement server binding in src/server.js to listen on port from config, catch EADDRINUSE error and exit(1) with error log
  • T068 [US1] Add startup logging in src/server.js to log server configuration (port, baseUrl), Service Account email (masked), and "server listening" message using console.info()
  • T069 [US1] Implement graceful shutdown handler in src/server.js for SIGTERM/SIGINT signals to log shutdown and close server
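The startup validation in T065, T066, and T068 could be sketched as follows. The three required fields come from T015; the helper names, the error messages, and the masking rule (keep the first two characters of the local part) are assumptions made for illustration.

```javascript
const REQUIRED_FIELDS = ["client_email", "private_key", "project_id"];

// Parse and check the inline JSON from GOOGLE_SERVICE_ACCOUNT_KEY.
// Returns the parsed credentials, or throws with a message naming the problem.
function validateCredentials(raw) {
  let creds;
  try {
    creds = JSON.parse(raw);
  } catch {
    throw new Error("GOOGLE_SERVICE_ACCOUNT_KEY is not valid JSON");
  }
  for (const field of REQUIRED_FIELDS) {
    if (typeof creds?.[field] !== "string" || creds[field].trim() === "") {
      throw new Error(`service account key is missing ${field}`);
    }
  }
  return creds;
}

// Mask the local part of the email for startup logs (T068).
function maskEmail(email) {
  const [local, domain] = email.split("@");
  return `${local.slice(0, 2)}***@${domain}`;
}

// Fatal path (T066): log to stderr and exit with code 1.
function loadCredentialsOrExit(env = process.env) {
  try {
    return validateCredentials(env.GOOGLE_SERVICE_ACCOUNT_KEY);
  } catch (err) {
    console.error(`fatal: ${err.message}`);
    process.exit(1);
  }
}
```

Splitting the throwing validator from the exiting wrapper lets unit tests (T034) cover the validation without terminating the test process.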

Checkpoint: User Story 1 complete - /sitemap.xml endpoint fully functional with all 10 clarifications implemented


Phase 4: Polish & Cross-Cutting Concerns

Purpose: Final validation, documentation, and quality improvements

  • T070 [P] Update README.md with quickstart instructions referencing specs/001-drive-proxy-adapter/quickstart.md
  • T071 [P] Update .env.example (created in T010) so that all required environment variables are documented per quickstart.md
  • T072 Validate test coverage meets 80%+ requirement per constitution using Node.js test runner coverage
  • T073 Run all tests (npm test) and verify 100% pass rate
  • T074 Manual validation: Start server and request /sitemap.xml, verify valid XML response
  • T075 Manual validation: Test >50k documents scenario, verify 413 response with no body
  • T076 Manual validation: Test invalid endpoint, verify 404 response with no body
  • T077 Manual validation: Test concurrent requests, verify FIFO processing (sequential execution)
  • T078 Manual validation: Test fatal error scenarios (invalid credentials, port in use), verify exit code 1
  • T079 [P] Code cleanup: Remove unused imports, add JSDoc comments for all public functions
  • T080 Run ESLint and fix any linting errors
  • [~] T081 Verify all log output uses plain text format [timestamp] [level] message per research.md Section 5
  • T082 Verify Drive API filter is loaded from config/settings.js not hardcoded per clarification #9
  • T083 Run quickstart.md validation: follow installation and usage instructions from scratch

Dependencies & Execution Order

Phase Dependencies

  • Setup (Phase 1): No dependencies - start immediately
  • Foundational (Phase 2): Depends on Setup (Phase 1) - BLOCKS User Story 1
  • User Story 1 (Phase 3): Depends on Foundational (Phase 2) - This is the only user story
  • Polish (Phase 4): Depends on User Story 1 completion

Within User Story 1

Test-First Sequence:

  1. Write ALL tests (T020-T040) - can run in parallel [P]
  2. STOP: Obtain user approval of test scenarios
  3. Verify tests FAIL (no implementation yet)
  4. Proceed to implementation

Implementation Sequence:

  1. Drive API Integration (T041-T045)
  2. Sitemap Generation (T046-T050) - can run in parallel with T041-T045
  3. Request Routing (T051-T057) - depends on T041-T050
  4. Logging (T058-T062) - can run in parallel with T051-T057
  5. Server Lifecycle (T063-T069) - depends on T051-T062

Parallel Opportunities

Phase 1 Setup - These tasks can run in parallel:

  • T003, T004, T005, T006, T007 (directory creation)
  • T009, T010 (config files)

Phase 2 Foundational - Groups can run in parallel:

  • T011, T012, T013, T017 (utility modules)
  • T014, T015, T016 (auth module)
  • T018, T019 (queue and server scaffolding)

Phase 3 Tests - All tests can run in parallel:

  • Contract tests: T020, T021, T022, T023
  • Integration tests: T024-T030
  • Unit tests: T031-T040

Phase 3 Implementation - Within groups:

  • T041, T046 (drive-client and sitemap-generator start in parallel)
  • T058-T062 (all logging tasks in parallel)

Phase 4 Polish:

  • T070, T071, T079, T081, T082 (documentation and cleanup)

Parallel Example: User Story 1 Tests

# Launch all contract tests together:
Task: "Contract test for /sitemap.xml success response in tests/contract/sitemap-schema.test.js"
Task: "Contract test for /sitemap.xml with empty Drive in tests/contract/sitemap-schema.test.js"
Task: "Contract test for XML special character escaping in tests/contract/sitemap-schema.test.js"
Task: "Contract test for lastmod date format validation in tests/contract/sitemap-schema.test.js"

# Launch all integration tests together:
Task: "Integration test for /sitemap.xml endpoint success in tests/integration/sitemap-endpoint.test.js"
Task: "Integration test for >50k documents in tests/integration/error-scenarios.test.js"
Task: "Integration test for Drive API rate limiting in tests/integration/error-scenarios.test.js"
Task: "Integration test for Drive API 503 error in tests/integration/error-scenarios.test.js"
Task: "Integration test for invalid endpoints in tests/integration/error-scenarios.test.js"
Task: "Integration test for concurrent requests in tests/integration/queue-concurrency.test.js"
Task: "Integration test for token refresh in tests/integration/sitemap-endpoint.test.js"

# Launch all unit tests together:
Task: "Unit test for Drive API client query execution in tests/unit/drive-client.test.js"
Task: "Unit test for Drive API pagination handling in tests/unit/drive-client.test.js"
Task: "Unit test for Service Account JWT authentication in tests/unit/auth.test.js"
Task: "Unit test for credential validation in tests/unit/auth.test.js"
Task: "Unit test for sitemap XML generation in tests/unit/sitemap-generator.test.js"
Task: "Unit test for Document to SitemapEntry transformation in tests/unit/sitemap-generator.test.js"
Task: "Unit test for lastmod date formatting in tests/unit/sitemap-generator.test.js"
Task: "Unit test for FIFO queue enqueue/dequeue in tests/unit/queue.test.js"
Task: "Unit test for FIFO queue concurrent request handling in tests/unit/queue.test.js"
Task: "Unit test for XML special character escaping in tests/unit/sitemap-generator.test.js"

Implementation Strategy

MVP = Complete Feature (User Story 1 Only)

This feature is inherently MVP-sized:

  1. Complete Phase 1: Setup → Project initialized
  2. Complete Phase 2: Foundational → Infrastructure ready (CRITICAL BLOCKER)
  3. Complete Phase 3: User Story 1 → FULL FEATURE COMPLETE
  4. Complete Phase 4: Polish → Production ready
  5. VALIDATE: Test /sitemap.xml independently with all 10 clarifications verified

No Incremental Delivery Needed

Unlike multi-story features, this feature has only one user story. The MVP IS the complete feature:

  • Single endpoint: /sitemap.xml
  • All requirements in User Story 1
  • No additional stories to add later

Validation Checklist (All 10 Clarifications)

Before marking feature complete, verify:

  1. Service Account JWT auth works with inline JSON from GOOGLE_SERVICE_ACCOUNT_KEY env var
  2. Sitemap URLs use RESTful format: /documents/{documentId}
  3. Drive API 503 errors pass through immediately with NO retries
  4. All logs output to stdout/stderr only (no log files)
  5. System returns 413 error when >50,000 documents exist
  6. Fatal errors (invalid credentials, port conflict) crash with exit code 1
  7. Concurrent /sitemap.xml requests queue in FIFO order and process sequentially
  8. Log format is plain text: [timestamp] [level] message
  9. Drive API query filter loads from config/settings.js (configurable, not hardcoded)
  10. All error responses return a status code only, with NO response body (429 additionally carries a Retry-After header)
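For checklist item 9, a configurable filter might be exposed as in this sketch of config/settings.js. The `DRIVE_QUERY` variable and its default come from T013; the concrete `fields`, `pageSize`, and `scope` values are assumptions (the task names the keys but not their contents), and the `loadSettings` function name is hypothetical.

```javascript
// Sketch of config/settings.js (T013). Taking `env` as a parameter keeps
// the module testable without mutating process.env.
function loadSettings(env = process.env) {
  return {
    // Configurable Drive API filter (clarification 9).
    query: env.DRIVE_QUERY || "trashed = false",
    // Assumed values below; adjust to match research.md.
    fields: "nextPageToken, files(id, name, mimeType, modifiedTime)",
    pageSize: 1000,
    scope: "https://www.googleapis.com/auth/drive.readonly",
  };
}

module.exports = { loadSettings };
```

For example, `loadSettings({ DRIVE_QUERY: "mimeType = 'application/pdf'" }).query` would return the overridden filter, while an empty environment falls back to `"trashed = false"`.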

Task Summary

Total Tasks: 83

  • Phase 1 (Setup): 10 tasks
  • Phase 2 (Foundational): 9 tasks (BLOCKING)
  • Phase 3 (User Story 1):
    • Tests: 21 tasks (T020-T040)
    • Implementation: 29 tasks (T041-T069)
  • Phase 4 (Polish): 14 tasks

Parallel Opportunities:

  • Phase 1: 7 tasks can run in parallel
  • Phase 2: 6 tasks can run in parallel
  • Phase 3 Tests: All 21 tests can run in parallel
  • Phase 3 Implementation: Up to 4 tasks can run in parallel at certain points
  • Phase 4: 5 tasks can run in parallel

Independent Test Criteria: User Story 1 is independently testable via:

  1. GET /sitemap.xml returns 200 with valid XML
  2. URLs follow RESTful format /documents/{documentId}
  3. >50k documents returns 413 (no body)
  4. Concurrent requests process sequentially (FIFO)
  5. Fatal errors crash with exit code 1
  6. Logs use plain text format to stdout/stderr
  7. Drive API filter loads from config/settings.js

Suggested MVP Scope: Complete all phases (this is a single-story feature)


Format Validation

ALL tasks follow checklist format:

  • Checkbox: - [ ]
  • Task ID: Sequential (T001-T083)
  • [P] marker: Present only on parallelizable tasks
  • [Story] label: Present only on User Story 1 phase tasks (US1)
  • Description: Includes clear action and exact file path
  • File paths: All specific and relative to the repository root

Organization by user story:

  • Setup phase: No story label (infrastructure)
  • Foundational phase: No story label (blocking prerequisites)
  • User Story 1 phase: All tasks marked [US1]
  • Polish phase: No story label (cross-cutting)

Compliance with constitution:

  • Test-First Development: Tests (T020-T040) come before implementation with approval gate
  • Monolithic architecture: Single proxy.js route handler for all request handling per plan.md
  • Minimal dependencies: Only googleapis + Node.js built-ins per research.md
  • Observability: Plain text logging to stdout/stderr per clarification #4, #8

Notes

  • This feature has only ONE user story (sitemap generation), so all implementation tasks are in Phase 3
  • The feature specification explicitly removed document export functionality from scope (Session 2)
  • All 10 clarifications from 3 sessions are incorporated into task descriptions
  • Test-first development is mandatory per Constitution Principle III (non-negotiable)
  • FIFO queue ensures sequential processing of concurrent requests (no parallel Drive API operations)
  • Fatal errors must crash immediately with exit code 1 (no graceful degradation)
  • Error responses have NO body (status code only); 429 additionally carries a Retry-After header
  • Drive API query filter MUST be configurable via config/settings.js (not hardcoded)