Google Drive Sitemap Adapter
HTTP service that generates XML sitemaps listing all accessible documents in a Google Drive account. Uses Service Account authentication for secure, automated access.
Features
- Sitemap Generation: XML sitemap at /sitemap.xml listing all accessible Google Drive documents
- RESTful URLs: Document links in format /documents/{documentId} per sitemap protocol
- Service Account Auth: JWT-based authentication using Google Service Account credentials
- Pagination Support: Handles large document sets (up to 50,000 URLs per sitemap protocol)
- 50k Limit Enforcement: Returns 413 error if document count exceeds sitemap protocol limit
- FIFO Request Queue: Concurrent requests processed sequentially (one at a time)
- Rate Limit Handling: Returns 429 with Retry-After header when the Drive API rate-limits requests
- No Retry on 503: Fails immediately on Drive API unavailability (per spec)
- Minimal Dependencies: Only googleapis package required
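The pagination and 50k-limit features can be pictured with a minimal sketch. It is not the project's actual code: `collectDocuments`, `LimitExceededError`, and `listPage` are hypothetical names, and `listPage` is assumed to wrap a Drive `files.list` call that resolves to `{ files, nextPageToken }`.

```javascript
// Hypothetical error type; a request handler could translate it into a 413.
class LimitExceededError extends Error {
  constructor(limit) {
    super(`Document count exceeds sitemap limit of ${limit}`);
    this.name = "LimitExceededError";
    this.statusCode = 413;
  }
}

// Collect every document by following nextPageToken until it is absent.
// `listPage(pageToken)` is an assumed wrapper around Drive's files.list
// that resolves to { files: [...], nextPageToken?: string }.
async function collectDocuments(listPage, maxUrls = 50000) {
  const documents = [];
  let pageToken;
  do {
    const page = await listPage(pageToken);
    documents.push(...(page.files ?? []));
    // Stop as soon as the limit is crossed instead of fetching further pages.
    if (documents.length > maxUrls) {
      throw new LimitExceededError(maxUrls);
    }
    pageToken = page.nextPageToken;
  } while (pageToken);
  return documents;
}
```

Checking the limit inside the loop means an oversized account fails early rather than after every page has been downloaded.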
Quick Start
Prerequisites
- Node.js v18.x or later
- Google Cloud Project with Drive API enabled
- Service Account credentials with Drive API access
Setup
- Install dependencies:
  npm install
- Configure Service Account (see specs/001-drive-proxy-adapter/quickstart.md for detailed steps):
  - Create Service Account in Google Cloud Console
  - Download service account key JSON file
  - Share Drive files/folders with service account email
  - Place key file at config/service-account-key.json
- Configure environment:
  cp .env.example .env
  # Edit .env with your service account email
- Start the server:
  npm start
  # or for development with auto-reload:
  npm run dev
- Generate sitemap:
  curl http://localhost:3000/sitemap.xml
Usage Examples
# Get sitemap of all documents
curl http://localhost:3000/sitemap.xml
# Verify XML format
curl http://localhost:3000/sitemap.xml | xmllint --noout -
# Count documents in sitemap
curl http://localhost:3000/sitemap.xml | grep -c '<loc>'
Architecture
Monolithic Design
This project follows a monolithic architecture as specified in the project constitution:
- Single Route File: ALL routing, business logic, and Drive API integration in src/proxy.js (~350 LOC)
- Utility Modules: Separate files for auth, logging, XML utils (constitution-compliant separation of concerns)
- Configuration as Data: JSON configuration in config/default.json loaded into global.config at startup
- Minimal Dependencies: Only googleapis package for Drive API integration
Why Monolithic?
Rationale defined in constitution:
- Simplicity: Easy to understand, debug, and maintain
- Direct Code Flow: No dependency injection, no framework magic
- YAGNI Principle: No premature abstraction for a focused service
Structure
src/
├── server.js # HTTP server, config loader, validation
├── proxy.js # Request handler with FIFO queue integration
├── drive-client.js # Drive API integration with 50k limit enforcement
├── sitemap-generator.js # Sitemap XML generation with RESTful URLs
├── queue.js # FIFO request queue (sequential processing)
├── auth.js # Service Account authentication
├── logger.js # Structured logging utility
├── utils.js # Request ID, validation
└── xml-utils.js # XML escaping
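As an illustration of what a FIFO request queue with sequential processing (the role of queue.js above) might look like, here is a minimal sketch built on a promise chain. `RequestQueue` and `enqueue` are hypothetical names; the real module may be structured differently.

```javascript
// Minimal FIFO queue: tasks run one at a time, in arrival order.
class RequestQueue {
  constructor() {
    // Tail of the promise chain; each new task waits for the previous one.
    this.tail = Promise.resolve();
  }

  // `task` is a function returning a promise. The returned promise settles
  // with the task's own result or error.
  enqueue(task) {
    const result = this.tail.then(() => task());
    // Swallow the error on the chain only, so one failed task
    // does not block the tasks queued behind it.
    this.tail = result.catch(() => {});
    return result;
  }
}
```

A handler could call something like `queue.enqueue(() => generateSitemap())`, where `generateSitemap` is a placeholder for the actual work, so that concurrent requests are served strictly one after another.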
Testing
Test Structure
Tests follow TDD workflow with real assertions:
tests/
├── contract/ # API contract tests (HTTP interface)
├── integration/ # Drive API integration tests
└── unit/ # Pure function unit tests
Running Tests
# All tests
npm test
# Specific test suites
npm run test:unit
npm run test:integration
npm run test:contract
Coverage Requirements
- Minimum: 80% code coverage (enforced)
- Tests Written First: TDD mandatory per constitution
- Real Assertions: No placeholder tests
Configuration
Configuration is loaded from config/default.json and merged with environment variables:
{
"server": {
"port": 3000,
"host": "0.0.0.0",
"baseUrl": "http://localhost:3000"
},
"google": {
"serviceAccountEmail": "service@project.iam.gserviceaccount.com",
"serviceAccountKeyPath": "./config/service-account-key.json",
"scopes": ["https://www.googleapis.com/auth/drive.readonly"]
},
"sitemap": {
"maxUrls": 50000
},
"logging": {
"level": "info"
}
}
Environment variables override JSON config (e.g., PORT, GOOGLE_SERVICE_ACCOUNT_EMAIL).
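A minimal sketch of the override step follows. Only `PORT` and `GOOGLE_SERVICE_ACCOUNT_EMAIL` are named in this README; which config fields they map to, and the function name `applyEnvOverrides`, are assumptions made for illustration.

```javascript
// Apply environment overrides on top of the JSON defaults.
// The variable-to-field mapping below is assumed, not taken from the project.
function applyEnvOverrides(config, env = process.env) {
  // Copy first so the loaded defaults are never mutated.
  const merged = structuredClone(config);
  if (env.PORT !== undefined) {
    const port = Number.parseInt(env.PORT, 10);
    if (Number.isNaN(port)) {
      throw new Error(`Invalid PORT: ${env.PORT}`);
    }
    merged.server.port = port;
  }
  if (env.GOOGLE_SERVICE_ACCOUNT_EMAIL) {
    merged.google.serviceAccountEmail = env.GOOGLE_SERVICE_ACCOUNT_EMAIL;
  }
  return merged;
}
```

Passing `env` as a parameter keeps the function easy to unit test without touching the real process environment.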
API Documentation
Endpoints
- GET /sitemap.xml - XML sitemap of all accessible documents (200 OK with XML body)
- GET /* - All other paths return 404 Not Found (empty body)
Response Headers
Successful sitemap response (200 OK):
- Content-Type: application/xml; charset=utf-8
- X-Request-Id: req_<uuid> - Request tracing ID
- X-Document-Count: <number> - Number of documents in sitemap
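To make the successful response concrete, the sketch below builds the XML body and the three headers from a list of document IDs. `escapeXml` and `buildSitemapResponse` are hypothetical names, and the returned `{ status, headers, body }` shape is an assumption rather than the project's actual interface.

```javascript
const { randomUUID } = require("node:crypto");

// Escape the five XML special characters so a URL is safe inside <loc>.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build the 200 response for /sitemap.xml from a list of document IDs.
function buildSitemapResponse(baseUrl, documentIds) {
  const urls = documentIds
    .map((id) => {
      const loc = `${baseUrl}/documents/${encodeURIComponent(id)}`;
      return `  <url><loc>${escapeXml(loc)}</loc></url>`;
    })
    .join("\n");
  const body =
    '<?xml version="1.0" encoding="UTF-8"?>\n' +
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n' +
    (urls ? urls + "\n" : "") +
    "</urlset>\n";
  return {
    status: 200,
    headers: {
      "Content-Type": "application/xml; charset=utf-8",
      "X-Request-Id": `req_${randomUUID()}`,
      "X-Document-Count": String(documentIds.length),
    },
    body,
  };
}
```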
Error Responses
All errors return HTTP status code only with no response body (per specification):
- 401 Unauthorized - Service account authentication failed
- 404 Not Found - Path is not /sitemap.xml
- 413 Payload Too Large - Document count exceeds 50,000 (sitemap protocol limit)
- 429 Too Many Requests - Drive API rate limit exceeded (includes Retry-After header in seconds)
- 500 Internal Server Error - Server error
- 503 Service Unavailable - Drive API unavailable (NO RETRY per specification)
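One way to picture this mapping is a small function that turns an upstream failure into a status and headers, with no body. This is a sketch under stated assumptions: `toErrorResponse` is a hypothetical name, `err.code` as a numeric HTTP status and `err.retryAfterSeconds` are assumed error fields, and the 60-second fallback is an arbitrary illustrative default.

```javascript
// Map an upstream failure to the status (and headers) listed above.
// `err.code` and `err.retryAfterSeconds` are assumed shapes for illustration;
// real Drive client errors may carry this information differently.
function toErrorResponse(err, defaultRetryAfterSeconds = 60) {
  switch (err && err.code) {
    case 401:
      return { status: 401, headers: {} };
    case 413:
      return { status: 413, headers: {} };
    case 429: {
      // Retry-After is expressed in whole seconds.
      const seconds = Number.isFinite(err.retryAfterSeconds)
        ? Math.max(0, Math.ceil(err.retryAfterSeconds))
        : defaultRetryAfterSeconds;
      return { status: 429, headers: { "Retry-After": String(seconds) } };
    }
    case 503:
      // Fail immediately: no retry against the Drive API.
      return { status: 503, headers: {} };
    default:
      // Anything unrecognized is reported as a generic server error.
      return { status: 500, headers: {} };
  }
}
```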
Performance Characteristics
- Cold Start: < 10 seconds to accepting requests
- Sitemap Generation: < 5 seconds for 10,000 documents
- Concurrent Requests: 10+ without degradation
- Memory Usage: < 256MB under normal load
Development
Project Structure
google-drive-content-adapter/
├── config/
│ └── default.json # Configuration
├── src/
│ ├── server.js # HTTP server
│ ├── proxy.js # Request handler (monolithic)
│ ├── auth.js # Service Account auth
│ ├── logger.js # Structured logging
│ ├── utils.js # Utilities
│ └── xml-utils.js # XML escaping
├── tests/
│ ├── contract/ # API contract tests
│ ├── integration/ # Integration tests
│ └── unit/ # Unit tests
├── specs/
│ └── 001-drive-proxy-adapter/ # Feature spec, plan, tasks
├── .env.example # Environment template
├── package.json # Dependencies and scripts
└── README.md # This file
Development Workflow
- Write Tests First (TDD)
- Implement Minimum Code
- Run Tests: npm test
- Run in Development: npm run dev
Deployment
Docker
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --production
COPY src/ ./src/
COPY config/ ./config/
CMD ["node", "src/server.js"]
EXPOSE 3000
docker build -t drive-sitemap-adapter .
docker run -p 3000:3000 -v $(pwd)/config:/app/config drive-sitemap-adapter
Direct Node.js
NODE_ENV=production npm start
Troubleshooting
Authentication Failed (401)
- Verify service account key file exists at config/service-account-key.json
- Check service account email matches configuration
- Ensure Drive API is enabled in Google Cloud project
Empty Sitemap
- Service account needs access to Drive files
- Share files/folders with service account email
- Check service account has "Viewer" permission
Rate Limit (429)
- Wait for time specified in Retry-After header
- Reduce frequency of sitemap requests
- Check Google Cloud Console quotas
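On the client side, honoring the Retry-After header could look like the sketch below. `retryDelayMs` and `fetchSitemap` are hypothetical helpers, the single retry and the 1-second fallback are illustrative choices, and the global `fetch` is assumed to be available (Node.js v18 or later).

```javascript
// Convert a Retry-After header value (seconds) into a delay in milliseconds.
// Falls back to `fallbackMs` when the header is missing or not a number.
function retryDelayMs(retryAfterHeader, fallbackMs = 1000) {
  const seconds = Number.parseInt(retryAfterHeader ?? "", 10);
  return Number.isNaN(seconds) || seconds < 0 ? fallbackMs : seconds * 1000;
}

// Fetch the sitemap once; on 429, wait as instructed and try one more time.
// `fetchImpl` is injectable so the function can be exercised without a server.
async function fetchSitemap(url, fetchImpl = fetch) {
  let response = await fetchImpl(url);
  if (response.status === 429) {
    const delay = retryDelayMs(response.headers.get("Retry-After"));
    await new Promise((resolve) => setTimeout(resolve, delay));
    response = await fetchImpl(url);
  }
  return response;
}
```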
License
ISC
Documentation
For detailed setup and usage instructions, see: