Added new feature for document export, including API contracts, data model, implementation plan, and tests. Updated related configurations and instructions.
This commit is contained in:
4
.github/agents/copilot-instructions.md
vendored
4
.github/agents/copilot-instructions.md
vendored
@@ -15,6 +15,8 @@ Auto-generated from all feature plans. Last updated: 2026-03-06
|
||||
- N/A (no persistent storage, always fetch fresh from Google Drive API) (001-drive-proxy-adapter)
|
||||
- JavaScript ES2022+ (Node.js LTS v18.0.0+) (001-drive-proxy-adapter)
|
||||
- N/A (no persistence - sitemap generated on-demand from Drive API) (001-drive-proxy-adapter)
|
||||
- Node.js >=18.0.0 (ES modules) + axios (HTTP client), jsonwebtoken (JWT for Google auth), uuid (request IDs), xmlbuilder2 (sitemap generation) (001-document-export)
|
||||
- N/A (stateless proxy, no database) (001-document-export)
|
||||
|
||||
- Node.js v20.x LTS (with fallback support for v18.x LTS) (001-drive-proxy-adapter)
|
||||
|
||||
@@ -34,9 +36,9 @@ tests/
|
||||
Node.js v20.x LTS (with fallback support for v18.x LTS): Follow standard conventions
|
||||
|
||||
## Recent Changes
|
||||
- 001-document-export: Added Node.js >=18.0.0 (ES modules) + axios (HTTP client), jsonwebtoken (JWT for Google auth), uuid (request IDs), xmlbuilder2 (sitemap generation)
|
||||
- 001-drive-proxy-adapter: Added JavaScript ES2022+ (Node.js LTS v18.0.0+)
|
||||
- 001-drive-proxy-adapter: Added JavaScript ES2022+ / Node.js 18 LTS or later + googleapis (Google Drive API v3 client), xmlbuilder2 (sitemap XML generation)
|
||||
- 001-drive-proxy-adapter: Added Node.js 18+ (LTS), JavaScript ES2022+ with ES modules + `googleapis` (Google Drive API + OAuth 2.0), Node.js built-ins only otherwise
|
||||
|
||||
|
||||
<!-- MANUAL ADDITIONS START -->
|
||||
|
||||
@@ -281,9 +281,64 @@ Follow-up TODOs:
|
||||
- All dependencies injected through `vm.createContext({ ... })` context object
|
||||
- VM isolation prevents access to require(), import(), fs, process, and Node.js globals
|
||||
|
||||
#### I.0 Forbidden Globals in proxy.js (NON-NEGOTIABLE)
|
||||
|
||||
`src/proxyScripts/proxy.js` MUST NOT access ANY infrastructure configuration globals. The following are **ABSOLUTELY PROHIBITED**:
|
||||
|
||||
- ❌ `config` - Infrastructure settings (server port, proxy paths, logging level)
|
||||
- ❌ `global.config` - Global configuration object
|
||||
- ❌ `process.env` - Environment variables (these are server concerns, not business logic)
|
||||
|
||||
**ONLY the following globals are permitted** in `src/proxyScripts/proxy.js`:
|
||||
|
||||
- ✅ `console` - Custom logger (injected by server.js)
|
||||
- ✅ `crypto` - Web Crypto API for randomUUID()
|
||||
- ✅ `axios` - HTTP client for API calls
|
||||
- ✅ `jwt` - JSON Web Token library for authentication
|
||||
- ✅ `xmlBuilder` - XML document builder
|
||||
- ✅ `uuidv4` - UUID generator
|
||||
- ✅ `googleDriveAdapterHelper` - Helper functions (loaded from src/globalVariables/)
|
||||
- ✅ `google_drive_settings` - Business data only (service account, Drive query, sitemap settings)
|
||||
- ✅ `req` - HTTP request object (includes req.params with routing metadata)
|
||||
- ✅ `res` - HTTP response object
|
||||
|
||||
**Rationale**: Infrastructure configuration (server ports, proxy routing, deployment settings) is the responsibility of server.js, NOT business logic. proxy.js implements document export logic - it should NOT know about HTTP server configuration, proxy path prefixes, or deployment details. These are injected via `req.params` when needed for routing.
|
||||
|
||||
**If routing information is needed** (e.g., proxy path prefix for route parsing):
|
||||
1. server.js MUST parse the incoming request URL
|
||||
2. server.js MUST extract routing metadata (workspaceId, branch, routeName)
|
||||
3. server.js MUST add this to `req.params` before invoking proxy.js
|
||||
4. proxy.js accesses routing info via `req.params`, NOT via `config`
|
||||
|
||||
**Example of correct routing metadata injection**:
|
||||
```javascript
|
||||
// server.js - BEFORE invoking proxy.js
|
||||
if (global.config.proxy) {
|
||||
const { pathPrefix, workspaceId, branch, routeName } = global.config.proxy;
|
||||
const fullPrefix = `${pathPrefix.replace(/\/$/, '')}/${workspaceId}/${branch}/${routeName}`;
|
||||
|
||||
if (req.url.startsWith(fullPrefix)) {
|
||||
req.params = {
|
||||
"0": req.url, // Original path
|
||||
workspaceId, // Extracted from config
|
||||
branch, // Extracted from config
|
||||
route: routeName // Extracted from config (renamed to 'route')
|
||||
};
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Enforcement**:
|
||||
- Any reference to `config` in proxy.js MUST be rejected
|
||||
- Any reference to `global.config` in proxy.js MUST be rejected
|
||||
- Any reference to `process.env` in proxy.js MUST be rejected
|
||||
- Routing metadata MUST be passed via `req.params`, never via `config`
|
||||
- Code reviews MUST verify zero infrastructure globals in proxy.js
|
||||
|
||||
#### I.I What MUST Be in src/proxyScripts/proxy.js
|
||||
|
||||
|
||||
|
||||
The following MUST be implemented in `src/proxyScripts/proxy.js` (or extracted to googleDriveAdapterHelper.js if pure utilities):
|
||||
|
||||
1. **Authentication**: Service Account JWT, OAuth flows, token management (MUST be in proxy.js)
|
||||
|
||||
@@ -3,6 +3,12 @@
|
||||
"port": 3000,
|
||||
"host": "0.0.0.0"
|
||||
},
|
||||
"proxy": {
|
||||
"pathPrefix": "/ProxyScript/run/",
|
||||
"workspaceId": "67bca862210071627d32ef12",
|
||||
"branch": "current",
|
||||
"routeName": "googleDriveAdapter"
|
||||
},
|
||||
"logging": {
|
||||
"level": "debug"
|
||||
}
|
||||
|
||||
@@ -6,7 +6,7 @@
|
||||
"main": "src/server.js",
|
||||
"scripts": {
|
||||
"dev": "node --watch src/server.js",
|
||||
"start": "node src/server.js",
|
||||
"start": "node src/server.js | jq -R 'fromjson? | select(. != null)'",
|
||||
"test": "node --test tests/**/*.test.js",
|
||||
"test:unit": "node --test tests/unit/**/*.test.js",
|
||||
"test:integration": "node --test tests/integration/**/*.test.js",
|
||||
|
||||
@@ -51,7 +51,7 @@ A user makes an HTTP GET request to `/sitemap.xml` and receives a valid XML site
|
||||
2. **Given** sitemap is generated, **When** examining the XML, **Then** each `<url>` entry contains a `<loc>` pointing to the adapter using RESTful format (e.g., `http://adapter-host/documents/{documentId}`)
|
||||
3. **Given** multiple documents in Google Drive, **When** sitemap is generated, **Then** all accessible documents are included in the sitemap
|
||||
4. **Given** user lacks permission to certain documents, **When** sitemap is generated, **Then** those documents are excluded from the sitemap
|
||||
5. **Given** the adapter base URL is configured, **When** sitemap is generated, **Then** all URLs use the configured base URL
|
||||
5. **Given** the adapter receives a sitemap request at any path, **When** sitemap is generated, **Then** all URLs use the base URL derived from the incoming request (protocol, host, and path up to but not including sitemap.xml)
|
||||
|
||||
---
|
||||
|
||||
@@ -69,6 +69,7 @@ A user makes an HTTP GET request to `/sitemap.xml` and receives a valid XML site
|
||||
- What happens when service account credentials are invalid or missing at startup? → Log critical error to stderr and crash with exit code 1
|
||||
- How are Drive API query filters customized? → Configure filters in config/settings.js file (not hardcoded)
|
||||
- What happens if config/settings.js is missing or malformed? → Log critical error to stderr and crash with exit code 1
|
||||
- How is the base URL determined for sitemap links? → Extracted from incoming request including protocol, host, and path prefix (e.g., request to `/api/v1/sitemap.xml` generates URLs like `https://example.com/api/v1/documents/{id}`)
|
||||
|
||||
## Requirements _(mandatory)_
|
||||
|
||||
@@ -88,7 +89,7 @@ A user makes an HTTP GET request to `/sitemap.xml` and receives a valid XML site
|
||||
- **FR-012**: System MUST log errors to stdout/stderr using plain text format: [timestamp] [level] message (includes request ID and error message for debugging)
|
||||
- **FR-013**: System MUST handle Google Drive API rate limiting gracefully by returning 429 status with Retry-After header indicating seconds until retry
|
||||
- **FR-017**: System MUST NOT retry when Google Drive API returns 503; instead immediately return 503 to client
|
||||
- **FR-014**: System MUST support configuration via environment variables (port, base URL)
|
||||
- **FR-014**: System MUST derive the base URL from the incoming HTTP request including the full path (using X-Forwarded-Proto and X-Forwarded-Host headers if present, otherwise using request protocol and host, plus the path up to but not including sitemap.xml)
|
||||
- **FR-018**: System MUST load Service Account credentials from environment variable GOOGLE_SERVICE_ACCOUNT_KEY containing inline JSON key file content
|
||||
- **FR-015**: System MUST return 413 Payload Too Large if Google Drive contains more than 50,000 documents (enforces sitemap protocol limit)
|
||||
- **FR-016**: System MUST filter out documents user lacks read access to from sitemap
|
||||
@@ -102,7 +103,7 @@ A user makes an HTTP GET request to `/sitemap.xml` and receives a valid XML site
|
||||
- **Sitemap Entry**: Represents a document listing in the sitemap XML. Attributes include: location URL (RESTful path `/documents/{documentId}`), last modified date
|
||||
- **HTTP Request Context**: Represents an incoming request. Attributes include: request ID (for tracing), Service Account JWT token, requested endpoint, client IP
|
||||
- **Service Account Credentials**: Represents JWT-based authentication state. Attributes include: client email, private key (from JSON key file), access token (generated via JWT), token expiry time, scopes granted
|
||||
- **Configuration**: Represents application settings. Attributes include: Drive API query filter (loaded from config/settings.js), server port, base URL, request queue (FIFO for /sitemap.xml requests)
|
||||
- **Configuration**: Represents application settings. Attributes include: Drive API query filter (loaded from config/settings.js), server port, request queue (FIFO for /sitemap.xml requests)
|
||||
|
||||
## Success Criteria _(mandatory)_
|
||||
|
||||
@@ -134,7 +135,7 @@ A user makes an HTTP GET request to `/sitemap.xml` and receives a valid XML site
|
||||
- Default port is 3000 unless configured otherwise
|
||||
- System runs on Node.js LTS version (v18 or later)
|
||||
- Environment supports async/await and ES modules
|
||||
- Base URL for sitemap links is configured via environment variable
|
||||
- Sitemap URLs are constructed dynamically from incoming request headers and path (X-Forwarded-Proto/Host for reverse proxy scenarios, otherwise direct request protocol/host, plus path prefix before sitemap.xml)
|
||||
- Drive API query filter is configured in config/settings.js file (allows customization without code changes)
|
||||
- System processes sitemap requests sequentially to avoid concurrent Drive API query conflicts
|
||||
- Fatal errors (invalid credentials, port binding failure, missing configuration) cause immediate termination with exit code 1
|
||||
|
||||
113
specs/002-document-export/contracts/documents-export-api.md
Normal file
113
specs/002-document-export/contracts/documents-export-api.md
Normal file
@@ -0,0 +1,113 @@
|
||||
# API Contract: Document Export Endpoint
|
||||
|
||||
**Feature**: 002-document-export
|
||||
**Version**: 1.0
|
||||
**Date**: 2026-03-09
|
||||
**Base URL**: `http://localhost:3000` (configurable via config/default.json)
|
||||
|
||||
## Overview
|
||||
|
||||
HTTP endpoint for exporting Google Drive documents in multiple formats. Fetches document metadata, selects optimal export format, and streams content with appropriate headers.
|
||||
|
||||
---
|
||||
|
||||
## Endpoint
|
||||
|
||||
### Export Document
|
||||
|
||||
```
|
||||
GET /documents/:documentId
|
||||
```
|
||||
|
||||
Exports a single Google Drive document in the best available format (Markdown > HTML > PDF). For native PDF files, streams content directly.
|
||||
|
||||
---
|
||||
|
||||
## Request
|
||||
|
||||
### Path Parameters
|
||||
|
||||
| Parameter | Type | Required | Description | Example |
|
||||
|-----------|------|----------|-------------|---------|
|
||||
| `documentId` | string | Yes | Google Drive file ID | `1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms` |
|
||||
|
||||
### Headers
|
||||
|
||||
| Header | Required | Description |
|
||||
|--------|----------|-------------|
|
||||
| `Authorization` | Yes (implicit) | Google OAuth2 access token (handled by proxy auth layer) |
|
||||
|
||||
### Example Request
|
||||
|
||||
```http
|
||||
GET /documents/1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms HTTP/1.1
|
||||
Host: localhost:3000
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Response
|
||||
|
||||
### Success Responses
|
||||
|
||||
#### 200 OK - Document Exported
|
||||
|
||||
**Headers**:
|
||||
```http
|
||||
Content-Type: text/x-markdown | text/html | application/pdf
|
||||
Content-Disposition: inline; filename="[document-name].[extension]"
|
||||
```
|
||||
|
||||
**Body**: Binary stream of document content
|
||||
|
||||
**Format Selection**:
|
||||
- Markdown (text/x-markdown, .md): Preferred if available
|
||||
- HTML (text/html, .html): Fallback if Markdown unavailable
|
||||
- PDF (application/pdf, .pdf): Final fallback or native PDF files
|
||||
|
||||
---
|
||||
|
||||
### Error Responses
|
||||
|
||||
| Status Code | Message | Condition |
|
||||
|-------------|---------|-----------|
|
||||
| 401 | Unauthorized | User lacks permissions or auth failed |
|
||||
| 403 | mimetype not supported | Document type not supported |
|
||||
| 404 | Document not found | Invalid/non-existent document ID |
|
||||
| 413 | Payload Too Large | Document exceeds 10MB |
|
||||
| 500 | Export failed - unable to retrieve document content | Export link malformed/inaccessible |
|
||||
| 502 | Bad Gateway - Google Drive API unavailable | Google Drive API error |
|
||||
| 504 | Gateway Timeout | Operation exceeded 30 seconds |
|
||||
|
||||
---
|
||||
|
||||
## Behavior Specifications
|
||||
|
||||
### Format Selection Algorithm
|
||||
|
||||
1. Fetch document metadata from Google Drive API
|
||||
2. Check if `exportLinks` exists:
|
||||
- If yes: Select first available from [Markdown, HTML, PDF]
|
||||
- If no and mimeType=PDF: Stream native PDF
|
||||
- Otherwise: Return 403
|
||||
|
||||
### Timeouts & Limits
|
||||
|
||||
- **Timeout**: 30 seconds per request
|
||||
- **Size Limit**: 10MB (10,485,760 bytes)
|
||||
- **Streaming**: Content never buffered, piped directly
|
||||
|
||||
---
|
||||
|
||||
## Testing Contract
|
||||
|
||||
### Required Tests
|
||||
|
||||
1. **200**: Valid document returns content with correct headers
|
||||
2. **401**: Invalid token returns Unauthorized
|
||||
3. **403**: Unsupported mimetype returns error
|
||||
4. **404**: Invalid ID returns Document not found
|
||||
5. **413**: Document >10MB returns Payload Too Large
|
||||
6. **504**: Timeout >30s returns Gateway Timeout
|
||||
7. **Headers**: Content-Type and Content-Disposition correct
|
||||
8. **Streaming**: Large files stream without buffering
|
||||
285
specs/002-document-export/data-model.md
Normal file
285
specs/002-document-export/data-model.md
Normal file
@@ -0,0 +1,285 @@
|
||||
# Data Model: Document Export API Route
|
||||
|
||||
**Feature**: 002-document-export
|
||||
**Date**: 2026-03-09
|
||||
**Purpose**: Define data structures and entities for document export functionality
|
||||
|
||||
## Overview
|
||||
|
||||
This feature introduces three primary entities for handling document export requests: **Document**, **ExportRequest**, and **ExportFormat**. These entities represent the data flowing through the export pipeline from request initiation to response delivery.
|
||||
|
||||
---
|
||||
|
||||
## Entities
|
||||
|
||||
### 1. Document
|
||||
|
||||
Represents a file stored in Google Drive, accessed by unique ID.
|
||||
|
||||
**Attributes**:
|
||||
|
||||
| Field | Type | Required | Description | Validation |
|
||||
|-------|------|----------|-------------|------------|
|
||||
| `id` | string | Yes | Google Drive document identifier (extracted from URL parameter) | Non-empty string, alphanumeric with hyphens/underscores |
|
||||
| `name` | string | Yes | Document name from Google Drive metadata | Non-empty string, used in Content-Disposition filename |
|
||||
| `mimeType` | string | Yes | MIME type of the document | One of Google Workspace types or native file types |
|
||||
| `exportLinks` | object | No | Map of available export formats to URLs | Key: MIME type (string), Value: Export URL (string) |
|
||||
|
||||
**Document Types**:
|
||||
|
||||
1. **Google Workspace Documents**:
|
||||
- Docs: `application/vnd.google-apps.document`
|
||||
- Sheets: `application/vnd.google-apps.spreadsheet`
|
||||
- Slides: `application/vnd.google-apps.presentation`
|
||||
- **Characteristic**: Have `exportLinks` field with conversion options
|
||||
|
||||
2. **Native Files**:
|
||||
- PDF: `application/pdf`
|
||||
- Images: `image/jpeg`, `image/png`, etc.
|
||||
- Other: Various MIME types
|
||||
- **Characteristic**: No `exportLinks` field, streamed directly
|
||||
|
||||
**State Transitions**:
|
||||
- N/A (stateless - documents fetched per request)
|
||||
|
||||
**Example**:
|
||||
```javascript
|
||||
// Google Workspace Document (has exportLinks)
|
||||
{
|
||||
id: "1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms",
|
||||
name: "Meeting Notes Q1 2026",
|
||||
mimeType: "application/vnd.google-apps.document",
|
||||
exportLinks: {
|
||||
"text/x-markdown": "https://docs.google.com/feeds/download/documents/export/Export?...",
|
||||
"text/html": "https://docs.google.com/feeds/download/documents/export/Export?...",
|
||||
"application/pdf": "https://docs.google.com/feeds/download/documents/export/Export?..."
|
||||
}
|
||||
}
|
||||
|
||||
// Native PDF (no exportLinks)
|
||||
{
|
||||
id: "1AbcDeFgHiJkLmNoPqRsTuVwXyZ1234567890",
|
||||
name: "Product Specs",
|
||||
mimeType: "application/pdf",
|
||||
exportLinks: null
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### 2. ExportRequest
|
||||
|
||||
Represents a user's request to export a document via the `/documents/:documentId` route.
|
||||
|
||||
**Attributes**:
|
||||
|
||||
| Field | Type | Required | Description | Validation |
|
||||
|-------|------|----------|-------------|------------|
|
||||
| `documentId` | string | Yes | Document ID from URL path parameter | Non-empty string, alphanumeric with hyphens/underscores |
|
||||
| `timestamp` | Date | Yes | Request initiation timestamp | ISO 8601 format, used for timeout calculation |
|
||||
| `accessToken` | string | Yes | Google Drive API access token (from auth context) | Valid JWT, not expired |
|
||||
|
||||
**Lifecycle**:
|
||||
1. **Initiated**: Request received on `/documents/:documentId`
|
||||
2. **Authenticated**: Access token validated and available
|
||||
3. **Metadata Fetched**: Google Drive API called for document metadata
|
||||
4. **Format Selected**: Export format chosen based on availability
|
||||
5. **Content Streamed**: Document content piped to response
|
||||
6. **Completed**: Response sent to client
|
||||
|
||||
**Timeout Handling**:
|
||||
- Maximum duration: 30 seconds from timestamp
|
||||
- Enforced via axios timeout configuration
|
||||
- Returns HTTP 504 if exceeded
|
||||
|
||||
**Example**:
|
||||
```javascript
|
||||
{
|
||||
documentId: "1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms",
|
||||
timestamp: "2026-03-09T18:00:00.000Z",
|
||||
accessToken: "ya29.a0AfH6SMBx..." // Google OAuth2 access token
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### 3. ExportFormat
|
||||
|
||||
Represents the selected output format for a document export.
|
||||
|
||||
**Attributes**:
|
||||
|
||||
| Field | Type | Required | Description | Validation |
|
||||
|-------|------|----------|-------------|------------|
|
||||
| `mimeType` | string | Yes | MIME type of the export format | One of: `text/x-markdown`, `text/html`, `application/pdf` |
|
||||
| `extension` | string | Yes | File extension for Content-Disposition header | One of: `md`, `html`, `pdf` |
|
||||
| `url` | string | Conditional | Export URL from Google Drive exportLinks | Required for Google Workspace docs, null for native files |
|
||||
| `isNative` | boolean | Yes | Whether this is a native file (direct stream) or export | `true` for native PDFs, `false` for conversions |
|
||||
|
||||
**Format Priority**:
|
||||
Priority order for selection when multiple formats available:
|
||||
1. `text/x-markdown` (.md) - Most portable for content processing
|
||||
2. `text/html` (.html) - Rich formatting fallback
|
||||
3. `application/pdf` (.pdf) - Universal viewing format
|
||||
|
||||
**Selection Rules**:
|
||||
1. If `exportLinks` exist: Select first available format from priority list
|
||||
2. If no `exportLinks` and `mimeType === 'application/pdf'`: Use native PDF streaming
|
||||
3. Otherwise: Return HTTP 403 "mimetype not supported"
|
||||
|
||||
**Example**:
|
||||
```javascript
|
||||
// Google Workspace Document export (Markdown selected)
|
||||
{
|
||||
mimeType: "text/x-markdown",
|
||||
extension: "md",
|
||||
url: "https://docs.google.com/feeds/download/documents/export/Export?...",
|
||||
isNative: false
|
||||
}
|
||||
|
||||
// Native PDF file (direct stream)
|
||||
{
|
||||
mimeType: "application/pdf",
|
||||
extension: "pdf",
|
||||
url: null, // Not used - file streamed directly
|
||||
isNative: true
|
||||
}
|
||||
|
||||
// Unsupported file (image)
|
||||
{
|
||||
mimeType: null,
|
||||
extension: null,
|
||||
url: null,
|
||||
isNative: false
|
||||
}
|
||||
// Returns HTTP 403
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Entity Relationships
|
||||
|
||||
```
|
||||
ExportRequest
|
||||
|
|
||||
| 1:1 (fetches)
|
||||
v
|
||||
Document
|
||||
|
|
||||
| 1:1 (determines)
|
||||
v
|
||||
ExportFormat
|
||||
```
|
||||
|
||||
**Flow**:
|
||||
1. ExportRequest initiated with documentId
|
||||
2. Document metadata fetched from Google Drive API
|
||||
3. ExportFormat selected based on Document attributes (mimeType, exportLinks)
|
||||
4. Content streamed using ExportFormat configuration
|
||||
|
||||
---
|
||||
|
||||
## Validation Rules
|
||||
|
||||
### Document Validation
|
||||
- **ID Format**: Must be valid Google Drive file ID (alphanumeric, hyphens, underscores)
|
||||
- **Name Sanitization**: Remove special characters for Content-Disposition filename
|
||||
- **MIME Type**: Must be recognized Google Workspace or native file type
|
||||
- **Export Links**: If present, must be object with string keys and URL string values
|
||||
|
||||
### Size & Timeout Constraints
|
||||
- **Max Document Size**: 10MB (10,485,760 bytes)
|
||||
- Validated via `Content-Length` header before streaming
|
||||
- Returns HTTP 413 if exceeded
|
||||
- **Max Request Duration**: 30 seconds
|
||||
- Enforced via axios timeout
|
||||
- Returns HTTP 504 if exceeded
|
||||
|
||||
### Format Selection Validation
|
||||
- **Priority Check**: Iterate through formats in order: Markdown → HTML → PDF
|
||||
- **Availability Check**: Format must exist in exportLinks object
|
||||
- **Fallback Check**: If no exportLinks, mimeType must be `application/pdf`
|
||||
- **Rejection**: If none of above, return HTTP 403
|
||||
|
||||
---
|
||||
|
||||
## Error States
|
||||
|
||||
### Document Not Found
|
||||
- **Condition**: Google Drive API returns 404 or document doesn't exist
|
||||
- **Response**: HTTP 404 "Document not found"
|
||||
- **Data State**: No Document entity created
|
||||
|
||||
### Unauthorized Access
|
||||
- **Condition**: User lacks permissions, invalid/expired token
|
||||
- **Response**: HTTP 401 "Unauthorized"
|
||||
- **Data State**: No Document entity created
|
||||
|
||||
### Unsupported Format
|
||||
- **Condition**: No exportLinks, mimeType not application/pdf
|
||||
- **Response**: HTTP 403 "mimetype not supported"
|
||||
- **Data State**: Document entity exists, ExportFormat entity null
|
||||
|
||||
### Size Limit Exceeded
|
||||
- **Condition**: Content-Length > 10MB
|
||||
- **Response**: HTTP 413 "Payload Too Large"
|
||||
- **Data State**: Document entity exists, ExportFormat selected, streaming aborted
|
||||
|
||||
### Timeout Exceeded
|
||||
- **Condition**: Request duration > 30 seconds
|
||||
- **Response**: HTTP 504 "Gateway Timeout"
|
||||
- **Data State**: Partial processing, request abandoned
|
||||
|
||||
### Google Drive API Error
|
||||
- **Condition**: API unavailable, rate limit exceeded
|
||||
- **Response**: HTTP 502 "Bad Gateway - Google Drive API unavailable"
|
||||
- **Data State**: Variable depending on failure point
|
||||
|
||||
---
|
||||
|
||||
## Data Flow Example
|
||||
|
||||
**Successful Export (Google Workspace Document)**:
|
||||
```
|
||||
1. ExportRequest { documentId: "abc123", timestamp: T0, accessToken: "..." }
|
||||
2. Document { id: "abc123", name: "Report", mimeType: "application/vnd.google-apps.document", exportLinks: {...} }
|
||||
3. ExportFormat { mimeType: "text/x-markdown", extension: "md", url: "https://...", isNative: false }
|
||||
4. Stream content from url to client
|
||||
5. Response Headers: Content-Type: text/x-markdown, Content-Disposition: inline; filename="Report.md"
|
||||
```
|
||||
|
||||
**Successful Export (Native PDF)**:
|
||||
```
|
||||
1. ExportRequest { documentId: "xyz789", timestamp: T0, accessToken: "..." }
|
||||
2. Document { id: "xyz789", name: "Invoice", mimeType: "application/pdf", exportLinks: null }
|
||||
3. ExportFormat { mimeType: "application/pdf", extension: "pdf", url: null, isNative: true }
|
||||
4. Stream file using files.get with alt=media
|
||||
5. Response Headers: Content-Type: application/pdf, Content-Disposition: inline; filename="Invoice.pdf"
|
||||
```
|
||||
|
||||
**Failed Export (Unsupported Type)**:
|
||||
```
|
||||
1. ExportRequest { documentId: "img456", timestamp: T0, accessToken: "..." }
|
||||
2. Document { id: "img456", name: "Photo", mimeType: "image/jpeg", exportLinks: null }
|
||||
3. ExportFormat { mimeType: null, extension: null, url: null, isNative: false }
|
||||
4. Return HTTP 403 "mimetype not supported"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Implementation Notes
|
||||
|
||||
### Statelessness
|
||||
- No entities persisted to database or cache
|
||||
- All data exists only for request duration
|
||||
- Document metadata fetched fresh per request
|
||||
|
||||
### Memory Management
|
||||
- Document metadata buffered in memory (typically <1KB)
|
||||
- Content never buffered - streamed directly
|
||||
- Maximum memory per request: ~10MB + metadata
|
||||
|
||||
### Concurrency
|
||||
- Each request handled independently with isolated ExportRequest entity
|
||||
- No shared state between requests
|
||||
- Target: 50 concurrent requests without degradation
|
||||
219
specs/002-document-export/plan.md
Normal file
219
specs/002-document-export/plan.md
Normal file
@@ -0,0 +1,219 @@
|
||||
# Implementation Plan: Document Export API Route
|
||||
|
||||
**Branch**: `002-document-export` | **Date**: 2026-03-09 | **Spec**: [spec.md](spec.md)
|
||||
**Input**: Feature specification from `/specs/002-document-export/spec.md`
|
||||
|
||||
**Note**: This template is filled in by the `/speckit.plan` command. See `.specify/templates/plan-template.md` for the execution workflow.
|
||||
|
||||
## Summary
|
||||
|
||||
Implement a new `/documents/:documentId` HTTP route in the proxy script that exports Google Drive documents in multiple formats. The route fetches document metadata from Google Drive API, selects the best available export format (Markdown > HTML > PDF) based on availability, and streams the content with appropriate headers. Native PDF files are streamed directly. Error handling covers all edge cases with specific HTTP status codes (401, 403, 404, 413, 500, 502, 504) and enforces a 10MB size limit with 30-second timeout.
|
||||
|
||||
## Technical Context
|
||||
|
||||
**Language/Version**: Node.js >=18.0.0 (ES modules)
|
||||
**Primary Dependencies**: axios (HTTP client), jsonwebtoken (JWT for Google auth), uuid (request IDs), xmlbuilder2 (sitemap generation)
|
||||
**Storage**: N/A (stateless proxy, no database)
|
||||
**Testing**: Node.js built-in test runner (`node --test`), organized by contract/integration/unit
|
||||
**Target Platform**: Linux/macOS server (Node.js runtime)
|
||||
**Project Type**: Web service proxy (HTTP-to-Google Drive API adapter)
|
||||
**Performance Goals**: <5 seconds for documents <10MB, 50 concurrent requests without degradation, 99% success rate
|
||||
**Constraints**: 30-second timeout per request, 10MB document size limit, monolithic architecture (all logic in proxy.js via vm.Script isolation)
|
||||
**Scale/Scope**: Single proxy instance serving document exports from Google Drive, isolated VM context execution per request
|
||||
|
||||
## Constitution Check
|
||||
|
||||
*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*
|
||||
|
||||
### ✅ Monolithic Architecture (NON-NEGOTIABLE)
|
||||
- **Status**: PASS
|
||||
- **Implementation**: All business logic will be in `src/proxyScripts/proxy.js` loaded via `vm.Script`
|
||||
- **Details**: New `/documents/:documentId` route, Google Drive API calls, format selection, and streaming logic will be added to existing proxy.js monolith
|
||||
- **Helper Functions**: Document format selection and filename extension mapping may be extracted to `src/globalVariables/googleDriveAdapterHelper.js` if they improve code organization (pure utilities only)
|
||||
|
||||
### ✅ Zero External Imports/Exports (NON-NEGOTIABLE)
|
||||
- **Status**: PASS
|
||||
- **Implementation**: proxy.js continues to have zero imports/exports, all dependencies injected via VM context
|
||||
- **Dependencies**: axios (HTTP), jwt (auth), crypto (signatures) already injected by server.js
|
||||
- **Data Access**: Uses existing global variables pattern (google_drive_settings) for credentials
|
||||
|
||||
### ✅ Test-First Development (NON-NEGOTIABLE)
|
||||
- **Status**: PASS
|
||||
- **Implementation**: Contract tests → Integration tests → Unit tests → Implementation
|
||||
- **Test Structure**:
|
||||
- Contract tests: Verify `/documents/:documentId` endpoint behavior against spec requirements
|
||||
- Integration tests: Test Google Drive API integration with mock responses
|
||||
- Unit tests: Test format selection logic, header generation, error mapping
|
||||
- **Coverage Target**: ≥80% code coverage
|
||||
|
||||
### ✅ API-First Design
|
||||
- **Status**: PASS
|
||||
- **Implementation**: HTTP endpoint contract defined before implementation
|
||||
- **Contract**: `/documents/:documentId` route with documented request/response formats
|
||||
- **Documentation**: OpenAPI-style contract in contracts/ directory
|
||||
|
||||
### ✅ Security & Privacy by Default
|
||||
- **Status**: PASS
|
||||
- **Implementation**: Leverages existing Google Drive OAuth2/service account authentication
|
||||
- **Access Control**: Google Drive API permissions enforced (returns 401 for unauthorized)
|
||||
- **Data Privacy**: No caching, documents streamed directly from Google Drive to client
|
||||
|
||||
### ✅ Semantic Versioning
|
||||
- **Status**: PASS
|
||||
- **Implementation**: New feature added to existing v1.0.0 (minor version bump to v1.1.0)
|
||||
- **Backward Compatibility**: New route doesn't affect existing sitemap functionality
|
||||
|
||||
### ✅ Simplicity & YAGNI
|
||||
- **Status**: PASS
|
||||
- **Implementation**: No caching, no retry logic, no rate limiting - only what's specified
|
||||
- **Scope**: Single document export only (no batch operations)
|
||||
|
||||
## Project Structure
|
||||
|
||||
### Documentation (this feature)
|
||||
|
||||
```text
|
||||
specs/002-document-export/
|
||||
├── spec.md # Feature specification (completed)
|
||||
├── plan.md # This file (/speckit.plan command output)
|
||||
├── research.md # Phase 0 output (Google Drive export API patterns)
|
||||
├── data-model.md # Phase 1 output (Document/ExportRequest/ExportFormat entities)
|
||||
├── quickstart.md # Phase 1 output (How to use the new /documents/:documentId route)
|
||||
├── contracts/ # Phase 1 output (API contracts)
|
||||
│ └── documents-export-api.md # HTTP endpoint contract for document export
|
||||
├── requirements-checklist.md # Specification quality validation
|
||||
└── tasks.md # Phase 2 output (/speckit.tasks command - NOT created by /speckit.plan)
|
||||
```
|
||||
|
||||
### Source Code (repository root)
|
||||
|
||||
```text
|
||||
src/
|
||||
├── proxyScripts/
|
||||
│ └── proxy.js # Main business logic (ADD: /documents/:documentId route handler)
|
||||
├── globalVariables/
|
||||
│ ├── google_drive_settings.json # Existing: Google Drive credentials
|
||||
│ └── googleDriveAdapterHelper.js # UPDATE: Add format selection helpers (optional)
|
||||
├── logger.js # Existing: Structured logging
|
||||
└── server.js # Existing: HTTP server bootstrap (no changes needed)
|
||||
|
||||
config/
|
||||
└── default.json # Existing: Infrastructure settings (no changes)
|
||||
|
||||
tests/
|
||||
├── contract/
|
||||
│ └── documents-export.test.js # NEW: Contract tests for /documents/:documentId
|
||||
├── integration/
|
||||
│ └── google-drive-export.test.js # NEW: Integration tests with Google Drive API mocks
|
||||
└── unit/
|
||||
├── format-selection.test.js # NEW: Unit tests for format selection logic
|
||||
└── export-headers.test.js # NEW: Unit tests for Content-Type/Content-Disposition
|
||||
```
|
||||
|
||||
**Structure Decision**: Single-project architecture maintained per constitution. All new business logic (route handling, format selection, streaming) added to existing `src/proxyScripts/proxy.js` monolith. Pure utility functions for format mapping and filename generation may be extracted to `googleDriveAdapterHelper.js` if they improve code organization. Tests organized by contract/integration/unit following existing pattern.
|
||||
|
||||
## Complexity Tracking
|
||||
|
||||
> **Fill ONLY if Constitution Check has violations that must be justified**
|
||||
|
||||
No violations - all constitutional principles satisfied.
|
||||
|
||||
---
|
||||
|
||||
## Phase 0: Research (COMPLETE)
|
||||
|
||||
✅ **Generated**: `research.md`
|
||||
|
||||
**Key Decisions**:
|
||||
1. Google Drive Files.get API with field selection for metadata
|
||||
2. Priority-based format selection: Markdown > HTML > PDF
|
||||
3. Native PDF streaming using alt=media parameter
|
||||
4. Content-Disposition inline with filename sanitization
|
||||
5. Error mapping from Google Drive API to HTTP status codes
|
||||
6. 30-second timeout via axios, 10MB size limit via Content-Length
|
||||
7. Streaming content directly without buffering
|
||||
|
||||
---
|
||||
|
||||
## Phase 1: Design & Contracts (COMPLETE)
|
||||
|
||||
✅ **Generated**:
|
||||
- `data-model.md` - Document, ExportRequest, ExportFormat entities
|
||||
- `contracts/documents-export-api.md` - HTTP endpoint contract
|
||||
- `quickstart.md` - User guide for the new endpoint
|
||||
|
||||
✅ **Agent Context Updated**: GitHub Copilot context file updated with Node.js, axios, and project details
|
||||
|
||||
### Post-Design Constitution Re-Check
|
||||
|
||||
All constitutional principles remain satisfied:
|
||||
|
||||
- ✅ **Monolithic Architecture**: New route logic in proxy.js
|
||||
- ✅ **Zero Imports/Exports**: VM context injection maintained
|
||||
- ✅ **Test-First Development**: Contract → Integration → Unit test structure defined
|
||||
- ✅ **API-First Design**: Endpoint contract documented before implementation
|
||||
- ✅ **Security**: Leverages existing Google Drive authentication
|
||||
- ✅ **Simplicity**: No caching, no retry, no rate limiting - only specified features
|
||||
|
||||
---
|
||||
|
||||
## Phase 2: Implementation Planning (Ready for /speckit.tasks)
|
||||
|
||||
**Next Steps**:
|
||||
1. Run `/speckit.tasks` to generate `tasks.md` with:
|
||||
- Dependency-ordered implementation tasks
|
||||
- Test-first workflow (write tests before implementation)
|
||||
- Tasks grouped by user story (P1, P2, P3)
|
||||
- Independent testability per story
|
||||
|
||||
2. Implementation will add to `src/proxyScripts/proxy.js`:
|
||||
- Route handler for `/documents/:documentId`
|
||||
- Google Drive metadata fetch logic
|
||||
- Format selection algorithm
|
||||
- Content streaming with headers
|
||||
- Error handling for all edge cases
|
||||
|
||||
3. Optional helper extraction to `src/globalVariables/googleDriveAdapterHelper.js`:
|
||||
- Format priority mapping
|
||||
- Filename sanitization
|
||||
- Content-Disposition header generation
|
||||
- Error status code mapping
|
||||
|
||||
4. Tests will be created in:
|
||||
- `tests/contract/documents-export.test.js`
|
||||
- `tests/integration/google-drive-export.test.js`
|
||||
- `tests/unit/format-selection.test.js`
|
||||
- `tests/unit/export-headers.test.js`
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
**Feature**: Document Export API Route
|
||||
**Branch**: `002-document-export`
|
||||
**Status**: Planning Complete ✅
|
||||
**Constitution Compliance**: All gates passed ✅
|
||||
|
||||
**Artifacts Generated**:
|
||||
- ✅ spec.md (feature specification)
|
||||
- ✅ plan.md (this file)
|
||||
- ✅ research.md (technical decisions)
|
||||
- ✅ data-model.md (entities & relationships)
|
||||
- ✅ contracts/documents-export-api.md (HTTP endpoint contract)
|
||||
- ✅ quickstart.md (user guide)
|
||||
- ⏭️ tasks.md (run `/speckit.tasks` to generate)
|
||||
|
||||
**Ready for**: `/speckit.tasks` command to generate implementation tasks
|
||||
|
||||
**Implementation Approach**:
|
||||
1. Test-first development (TDD)
|
||||
2. Monolithic architecture (all logic in proxy.js)
|
||||
3. Streaming content (no buffering)
|
||||
4. Comprehensive error handling
|
||||
5. 80%+ code coverage target
|
||||
|
||||
**Performance Targets**:
|
||||
- <5 seconds for documents <10MB
|
||||
- 50+ concurrent requests
|
||||
- 99% success rate
|
||||
- <1MB memory per request
|
||||
337
specs/002-document-export/quickstart.md
Normal file
337
specs/002-document-export/quickstart.md
Normal file
@@ -0,0 +1,337 @@
|
||||
# Quickstart Guide: Document Export API
|
||||
|
||||
**Feature**: 002-document-export
|
||||
**Date**: 2026-03-09
|
||||
**Audience**: Developers and API consumers
|
||||
|
||||
## Overview
|
||||
|
||||
The Document Export API provides a simple HTTP endpoint for exporting Google Drive documents in multiple formats. The system automatically selects the best available format (Markdown > HTML > PDF) and streams the content with appropriate headers.
|
||||
|
||||
---
|
||||
|
||||
## Quick Start
|
||||
|
||||
### 1. Start the Proxy Server
|
||||
|
||||
```bash
|
||||
# Install dependencies (if not already done)
|
||||
npm install
|
||||
|
||||
# Start server in development mode (with auto-reload)
|
||||
npm run dev
|
||||
|
||||
# Or start in production mode
|
||||
npm start
|
||||
```
|
||||
|
||||
Server starts on `http://localhost:3000` (configurable via `config/default.json`)
|
||||
|
||||
---
|
||||
|
||||
### 2. Export a Document
|
||||
|
||||
**Basic Request**:
|
||||
```bash
|
||||
curl http://localhost:3000/documents/{DOCUMENT_ID}
|
||||
```
|
||||
|
||||
**Example (Export Google Doc as Markdown)**:
|
||||
```bash
|
||||
curl http://localhost:3000/documents/1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms \
|
||||
-o output.md
|
||||
```
|
||||
|
||||
**Example (Export Native PDF)**:
|
||||
```bash
|
||||
curl http://localhost:3000/documents/1AbcDeFgHiJkLmNoPqRsTuVwXyZ1234567890 \
|
||||
-o output.pdf
|
||||
```
|
||||
|
||||
**Save with Original Filename**:
|
||||
```bash
|
||||
# The Content-Disposition header includes the original filename
|
||||
curl -OJ http://localhost:3000/documents/{DOCUMENT_ID}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Finding Document IDs
|
||||
|
||||
### From Google Drive URL
|
||||
|
||||
Google Drive URLs contain the document ID:
|
||||
|
||||
```
|
||||
https://docs.google.com/document/d/DOCUMENT_ID/edit
|
||||
https://drive.google.com/file/d/DOCUMENT_ID/view
|
||||
```
|
||||
|
||||
**Example**:
|
||||
- URL: `https://docs.google.com/document/d/1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms/edit`
|
||||
- Document ID: `1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms`
|
||||
|
||||
---
|
||||
|
||||
## Supported Formats
|
||||
|
||||
### Google Workspace Documents
|
||||
|
||||
Automatically exported in best available format:
|
||||
|
||||
| Document Type | Preferred Format | Fallback Formats |
|
||||
|---------------|------------------|------------------|
|
||||
| Google Docs | Markdown (.md) | HTML (.html), PDF (.pdf) |
|
||||
| Google Sheets | HTML (.html) | PDF (.pdf) |
|
||||
| Google Slides | PDF (.pdf) | - |
|
||||
|
||||
### Native Files
|
||||
|
||||
| File Type | Behavior |
|
||||
|-----------|----------|
|
||||
| PDF | Streamed directly (no conversion) |
|
||||
| Images, Videos, Archives | Returns 403 "mimetype not supported" |
|
||||
|
||||
---
|
||||
|
||||
## Response Headers
|
||||
|
||||
Every successful response includes:
|
||||
|
||||
```http
|
||||
Content-Type: text/x-markdown | text/html | application/pdf
|
||||
Content-Disposition: inline; filename="document-name.ext"
|
||||
```
|
||||
|
||||
- **Content-Type**: Indicates the export format
|
||||
- **Content-Disposition**: Provides the original filename with appropriate extension
|
||||
|
||||
---
|
||||
|
||||
## Error Handling
|
||||
|
||||
### Common Errors
|
||||
|
||||
| Error | Status | Cause | Solution |
|
||||
|-------|--------|-------|----------|
|
||||
| Document not found | 404 | Invalid ID | Verify document ID is correct |
|
||||
| Unauthorized | 401 | No permission | Check Google Drive access permissions |
|
||||
| mimetype not supported | 403 | Unsupported file type | Only Workspace docs and PDFs supported |
|
||||
| Payload Too Large | 413 | Document >10MB | Use smaller documents or direct Drive access |
|
||||
| Gateway Timeout | 504 | Operation >30s | Retry or use smaller documents |
|
||||
|
||||
### Error Response Format
|
||||
|
||||
All errors return plain text messages:
|
||||
|
||||
```bash
|
||||
$ curl http://localhost:3000/documents/invalid-id
|
||||
Document not found
|
||||
|
||||
$ curl http://localhost:3000/documents/{IMAGE_FILE_ID}
|
||||
mimetype not supported
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Advanced Usage
|
||||
|
||||
### Check Response Headers
|
||||
|
||||
```bash
|
||||
# View headers without downloading content
|
||||
curl -I http://localhost:3000/documents/{DOCUMENT_ID}
|
||||
```
|
||||
|
||||
**Example Output**:
|
||||
```http
|
||||
HTTP/1.1 200 OK
|
||||
Content-Type: text/x-markdown
|
||||
Content-Disposition: inline; filename="Meeting_Notes.md"
|
||||
```
|
||||
|
||||
### Stream Large Documents
|
||||
|
||||
```bash
|
||||
# Stream to stdout (for processing)
|
||||
curl http://localhost:3000/documents/{DOCUMENT_ID} | less
|
||||
|
||||
# Pipe to another tool
|
||||
curl http://localhost:3000/documents/{DOCUMENT_ID} | pandoc -f markdown -t docx -o output.docx
|
||||
```
|
||||
|
||||
### Integrate with Scripts
|
||||
|
||||
**Bash Script Example**:
|
||||
```bash
|
||||
#!/bin/bash
|
||||
|
||||
DOCUMENT_ID="1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms"
|
||||
OUTPUT_DIR="./exports"
|
||||
|
||||
# Create output directory
|
||||
mkdir -p "$OUTPUT_DIR"
|
||||
|
||||
# Export document
|
||||
curl "http://localhost:3000/documents/$DOCUMENT_ID" \
|
||||
-o "$OUTPUT_DIR/document.md" \
|
||||
--fail \
|
||||
--show-error
|
||||
|
||||
if [ $? -eq 0 ]; then
|
||||
echo "Export successful: $OUTPUT_DIR/document.md"
|
||||
else
|
||||
echo "Export failed"
|
||||
exit 1
|
||||
fi
|
||||
```
|
||||
|
||||
**Node.js Example**:
|
||||
```javascript
|
||||
const axios = require('axios');
|
||||
const fs = require('fs');
|
||||
|
||||
async function exportDocument(documentId, outputPath) {
|
||||
const url = `http://localhost:3000/documents/${documentId}`;
|
||||
|
||||
try {
|
||||
const response = await axios.get(url, {
|
||||
responseType: 'stream',
|
||||
timeout: 30000 // 30 second timeout
|
||||
});
|
||||
|
||||
const writer = fs.createWriteStream(outputPath);
|
||||
response.data.pipe(writer);
|
||||
|
||||
return new Promise((resolve, reject) => {
|
||||
writer.on('finish', resolve);
|
||||
writer.on('error', reject);
|
||||
});
|
||||
} catch (error) {
|
||||
if (error.response) {
|
||||
console.error(`Error ${error.response.status}: ${error.response.data}`);
|
||||
} else {
|
||||
console.error('Request failed:', error.message);
|
||||
}
|
||||
throw error;
|
||||
}
|
||||
}
|
||||
|
||||
// Usage
|
||||
exportDocument('1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms', 'output.md')
|
||||
.then(() => console.log('Export complete'))
|
||||
.catch(err => console.error('Export failed:', err));
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Testing
|
||||
|
||||
### Run Tests
|
||||
|
||||
```bash
|
||||
# Run all tests
|
||||
npm test
|
||||
|
||||
# Run specific test suites
|
||||
npm run test:contract # API contract tests
|
||||
npm run test:integration # Google Drive integration tests
|
||||
npm run test:unit # Unit tests
|
||||
```
|
||||
|
||||
### Manual Testing Checklist
|
||||
|
||||
- [ ] Export Google Doc as Markdown
|
||||
- [ ] Export Google Sheet as HTML
|
||||
- [ ] Export Google Slides as PDF
|
||||
- [ ] Export native PDF file
|
||||
- [ ] Test invalid document ID (should return 404)
|
||||
- [ ] Test unsupported file type (should return 403)
|
||||
- [ ] Verify Content-Disposition filename matches document name
|
||||
- [ ] Verify Content-Type header matches export format
|
||||
|
||||
---
|
||||
|
||||
## Performance Characteristics
|
||||
|
||||
| Metric | Expected Value |
|
||||
|--------|----------------|
|
||||
| Response time (docs <10MB) | <5 seconds |
|
||||
| Concurrent requests | 50+ supported |
|
||||
| Success rate | >99% for valid docs |
|
||||
| Memory per request | <1MB (streaming) |
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### "Document not found" for valid document
|
||||
|
||||
1. Verify document ID is correct (check Google Drive URL)
|
||||
2. Ensure Google Drive service account has access to the document
|
||||
3. Check if document is in a shared drive (requires `supportsAllDrives=true`)
|
||||
|
||||
### "Unauthorized" error
|
||||
|
||||
1. Check Google Drive credentials in `src/globalVariables/google_drive_settings.json`
|
||||
2. Verify service account has been granted access to the document
|
||||
3. Check if access token is expired (auth handled by proxy layer)
|
||||
|
||||
### "Gateway Timeout" on large documents
|
||||
|
||||
1. Document may be >10MB (check file size in Google Drive)
|
||||
2. Slow network connection to Google Drive API
|
||||
3. Try again - transient network issue
|
||||
|
||||
### "mimetype not supported"
|
||||
|
||||
This is expected for non-document files:
|
||||
- Images (.jpg, .png, .gif)
|
||||
- Videos (.mp4, .mov)
|
||||
- Archives (.zip, .tar)
|
||||
- Executables (.exe, .dmg)
|
||||
|
||||
Only Google Workspace documents (Docs, Sheets, Slides) and native PDFs are supported.
|
||||
|
||||
---
|
||||
|
||||
## Configuration
|
||||
|
||||
### Server Settings
|
||||
|
||||
Edit `config/default.json`:
|
||||
|
||||
```json
|
||||
{
|
||||
"server": {
|
||||
"host": "localhost",
|
||||
"port": 3000
|
||||
},
|
||||
"logging": {
|
||||
"level": "info"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Google Drive Credentials
|
||||
|
||||
Credentials stored in `src/globalVariables/google_drive_settings.json` (managed by existing infrastructure).
|
||||
|
||||
---
|
||||
|
||||
## Next Steps
|
||||
|
||||
- **Integration**: Use the `/documents/:documentId` endpoint in your applications
|
||||
- **Testing**: Run contract tests to verify behavior: `npm run test:contract`
|
||||
- **Monitoring**: Check logs for errors: `npm run dev` shows real-time logs
|
||||
- **Scaling**: Deploy multiple instances behind a load balancer for high traffic
|
||||
|
||||
---
|
||||
|
||||
## Support
|
||||
|
||||
For issues or questions:
|
||||
1. Check error messages and status codes (see Error Handling section)
|
||||
2. Review logs for detailed error information
|
||||
3. Verify Google Drive permissions and credentials
|
||||
4. Consult API contract: `specs/002-document-export/contracts/documents-export-api.md`
|
||||
69
specs/002-document-export/requirements-checklist.md
Normal file
69
specs/002-document-export/requirements-checklist.md
Normal file
@@ -0,0 +1,69 @@
|
||||
# Specification Quality Checklist: Document Export API Route
|
||||
|
||||
**Purpose**: Validate specification completeness and quality before proceeding to planning
|
||||
**Created**: 2026-03-09
|
||||
**Updated**: 2026-03-09 (Clarification complete)
|
||||
**Feature**: [spec.md](spec.md)
|
||||
|
||||
## Content Quality
|
||||
|
||||
- [x] No implementation details (languages, frameworks, APIs)
|
||||
- [x] Focused on user value and business needs
|
||||
- [x] Written for non-technical stakeholders
|
||||
- [x] All mandatory sections completed
|
||||
|
||||
## Requirement Completeness
|
||||
|
||||
- [x] No [NEEDS CLARIFICATION] markers remain
|
||||
- [x] Requirements are testable and unambiguous
|
||||
- [x] Success criteria are measurable
|
||||
- [x] Success criteria are technology-agnostic (no implementation details)
|
||||
- [x] All acceptance scenarios are defined
|
||||
- [x] Edge cases are identified and resolved
|
||||
- [x] Scope is clearly bounded
|
||||
- [x] Dependencies and assumptions identified
|
||||
|
||||
## Feature Readiness
|
||||
|
||||
- [x] All functional requirements have clear acceptance criteria
|
||||
- [x] User scenarios cover primary flows
|
||||
- [x] Feature meets measurable outcomes defined in Success Criteria
|
||||
- [x] No implementation details leak into specification
|
||||
|
||||
## Validation Notes
|
||||
|
||||
**Initial validation completed**: 2026-03-09
|
||||
**Clarification completed**: 2026-03-09
|
||||
|
||||
All checklist items pass validation after clarification:
|
||||
|
||||
### Clarifications Resolved:
|
||||
|
||||
1. **Error Response Format** ✅
|
||||
- Standard HTTP status codes with plain text error messages
|
||||
- Specific codes defined for each error scenario (401, 403, 404, 413, 500, 502, 504)
|
||||
- Added FR-014 through FR-019 to specify error handling
|
||||
|
||||
2. **Response Headers** ✅
|
||||
- Content-Disposition set to "inline" with filename from Google Drive metadata
|
||||
- Added FR-011 to specify Content-Disposition format
|
||||
- File extensions documented in Assumptions (.md, .html, .pdf)
|
||||
- Updated all acceptance scenarios to include both Content-Type and Content-Disposition
|
||||
|
||||
3. **Large Document Handling** ✅
|
||||
- 10MB size limit enforced (HTTP 413 for larger documents)
|
||||
- 30-second timeout for export operations (HTTP 504 for timeouts)
|
||||
- Added FR-017 and FR-018 for size/timeout limits
|
||||
- Updated Success Criteria to include timeout handling (SC-011)
|
||||
|
||||
### Updated Sections:
|
||||
|
||||
- **Edge Cases**: Changed from questions to concrete behaviors with specific HTTP status codes
|
||||
- **Functional Requirements**: Added 9 new requirements (FR-011 through FR-019) for error handling, headers, and limits
|
||||
- **Success Criteria**: Added 3 new criteria (SC-007, SC-010, SC-011) for headers, error codes, and timeouts
|
||||
- **Assumptions**: Clarified error response format, size limits, timeout values, and Content-Disposition behavior
|
||||
- **Scope**: Expanded to explicitly include error scenarios and limits
|
||||
- **User Story Acceptance Scenarios**: Updated all scenarios to include Content-Disposition headers and added new error scenarios
|
||||
|
||||
**Recommendation**: Specification is fully clarified and ready for `/speckit.plan` phase. All ambiguities resolved with testable requirements.
|
||||
|
||||
293
specs/002-document-export/research.md
Normal file
293
specs/002-document-export/research.md
Normal file
@@ -0,0 +1,293 @@
|
||||
# Technical Research: Document Export API Route
|
||||
|
||||
**Feature**: 002-document-export
|
||||
**Date**: 2026-03-09
|
||||
**Purpose**: Research technical patterns and best practices for implementing Google Drive document export functionality
|
||||
|
||||
## Research Areas
|
||||
|
||||
### 1. Google Drive Files.get API - Metadata Retrieval
|
||||
|
||||
**Decision**: Use Google Drive API v3 `files.get` endpoint with specific field selection
|
||||
|
||||
**Rationale**:
|
||||
- Google Drive API v3 provides `files.get` endpoint: `GET https://www.googleapis.com/drive/v3/files/{fileId}`
|
||||
- Field selection via `fields` query parameter reduces response size and improves performance
|
||||
- Required fields: `id,name,mimeType,exportLinks`
|
||||
- exportLinks returns map of available export formats for Google Workspace documents
|
||||
- Native files (PDFs, images) don't have exportLinks field
|
||||
|
||||
**Implementation Pattern**:
|
||||
```javascript
|
||||
// In proxy.js - Google Drive API call
|
||||
const metadataUrl = `https://www.googleapis.com/drive/v3/files/${documentId}`;
|
||||
const params = {
|
||||
fields: 'id,name,mimeType,exportLinks',
|
||||
supportsAllDrives: true // Support shared drives
|
||||
};
|
||||
const response = await axios.get(metadataUrl, {
|
||||
params,
|
||||
headers: { Authorization: `Bearer ${accessToken}` }
|
||||
});
|
||||
```
|
||||
|
||||
**Alternatives Considered**:
|
||||
- files.export endpoint directly → Rejected: Requires knowing export format upfront, can't query available formats
|
||||
- files.list with query → Rejected: Less efficient, requires additional parsing
|
||||
|
||||
**References**:
|
||||
- Google Drive API v3 Files.get: https://developers.google.com/drive/api/reference/rest/v3/files/get
|
||||
- Field selection: https://developers.google.com/drive/api/guides/fields-parameter
|
||||
|
||||
---
|
||||
|
||||
### 2. Export Format Selection Strategy
|
||||
|
||||
**Decision**: Priority-based format selection (Markdown > HTML > PDF) with fallback to native file streaming
|
||||
|
||||
**Rationale**:
|
||||
- Google Workspace documents (Docs, Sheets, Slides) provide exportLinks map: `{"text/plain": "url", "text/html": "url", ...}`
|
||||
- Markdown (text/x-markdown) is most portable for downstream content processing
|
||||
- HTML fallback provides rich formatting when Markdown unavailable
|
||||
- PDF fallback ensures something is always available
|
||||
- Native PDFs streamed directly using files.get with `alt=media` parameter
|
||||
|
||||
**Implementation Pattern**:
|
||||
```javascript
|
||||
// Format priority order
|
||||
const EXPORT_FORMATS = [
|
||||
{ mimeType: 'text/x-markdown', extension: 'md' },
|
||||
{ mimeType: 'text/html', extension: 'html' },
|
||||
{ mimeType: 'application/pdf', extension: 'pdf' }
|
||||
];
|
||||
|
||||
// Selection logic
|
||||
function selectExportFormat(exportLinks) {
|
||||
for (const format of EXPORT_FORMATS) {
|
||||
if (exportLinks && exportLinks[format.mimeType]) {
|
||||
return {
|
||||
url: exportLinks[format.mimeType],
|
||||
contentType: format.mimeType,
|
||||
extension: format.extension
|
||||
};
|
||||
}
|
||||
}
|
||||
return null; // No export links available
|
||||
}
|
||||
```
|
||||
|
||||
**Alternatives Considered**:
|
||||
- User-specified format via query parameter → Rejected: Out of scope per spec, adds complexity
|
||||
- Always export as PDF → Rejected: Markdown preferred for content processing
|
||||
- Try all formats in parallel → Rejected: Unnecessary, increases API calls
|
||||
|
||||
---
|
||||
|
||||
### 3. Native PDF File Streaming
|
||||
|
||||
**Decision**: Use Google Drive API `files.get` with `alt=media` parameter for direct file content download
|
||||
|
||||
**Rationale**:
|
||||
- Native PDF files (mimeType: application/pdf) don't have exportLinks
|
||||
- files.get with `alt=media` returns raw file bytes as response body
|
||||
- Response is streamed directly to client (no buffering in proxy)
|
||||
- Efficient for large files up to 10MB limit
|
||||
|
||||
**Implementation Pattern**:
|
||||
```javascript
|
||||
// For native PDFs (no exportLinks)
|
||||
if (metadata.mimeType === 'application/pdf' && !metadata.exportLinks) {
|
||||
const fileUrl = `https://www.googleapis.com/drive/v3/files/${documentId}`;
|
||||
const response = await axios.get(fileUrl, {
|
||||
params: { alt: 'media' },
|
||||
headers: { Authorization: `Bearer ${accessToken}` },
|
||||
responseType: 'stream' // Stream response
|
||||
});
|
||||
|
||||
// Pipe stream to client
|
||||
res.setHeader('Content-Type', 'application/pdf');
|
||||
res.setHeader('Content-Disposition', `inline; filename="${metadata.name}.pdf"`);
|
||||
response.data.pipe(res);
|
||||
}
|
||||
```
|
||||
|
||||
**Alternatives Considered**:
|
||||
- Buffer entire file in memory → Rejected: Inefficient for large files, increases memory usage
|
||||
- Download and re-upload → Rejected: Unnecessary overhead, adds latency
|
||||
|
||||
**References**:
|
||||
- Google Drive API files.get with alt=media: https://developers.google.com/drive/api/guides/manage-downloads
|
||||
|
||||
---
|
||||
|
||||
### 4. Content-Disposition Header Format
|
||||
|
||||
**Decision**: Use `inline; filename="[name].[ext]"` format for Content-Disposition header
|
||||
|
||||
**Rationale**:
|
||||
- `inline` disposition allows browser to display content (PDFs, HTML) in-browser
|
||||
- Filename parameter provides sensible default if user saves file
|
||||
- RFC 6266 compliant format
|
||||
- Filename includes extension matching export format (.md, .html, .pdf)
|
||||
|
||||
**Implementation Pattern**:
|
||||
```javascript
|
||||
// Generate Content-Disposition header
|
||||
function generateContentDisposition(filename, extension) {
|
||||
// Sanitize filename: remove special characters, limit length
|
||||
const sanitized = filename
|
||||
.replace(/[^a-zA-Z0-9-_. ]/g, '_') // Replace special chars
|
||||
.substring(0, 255); // Limit length
|
||||
|
||||
return `inline; filename="${sanitized}.${extension}"`;
|
||||
}
|
||||
|
||||
// Usage
|
||||
res.setHeader('Content-Disposition', generateContentDisposition(metadata.name, 'md'));
|
||||
```
|
||||
|
||||
**Alternatives Considered**:
|
||||
- `attachment` disposition → Rejected: Forces download, prevents in-browser viewing
|
||||
- No filename parameter → Rejected: Browser uses document ID as filename (poor UX)
|
||||
- RFC 2231 encoding for Unicode → Deferred: Simple ASCII sanitization sufficient for MVP
|
||||
|
||||
**References**:
|
||||
- RFC 6266 Content-Disposition: https://datatracker.ietf.org/doc/html/rfc6266
|
||||
|
||||
---
|
||||
|
||||
### 5. Error Handling & HTTP Status Codes
|
||||
|
||||
**Decision**: Map Google Drive API errors to appropriate HTTP status codes with descriptive messages
|
||||
|
||||
**Rationale**:
|
||||
- Google Drive API returns structured error responses with reason codes
|
||||
- Map to standard HTTP status codes for consistent client experience
|
||||
- Plain text error messages for simplicity (no JSON wrapper needed)
|
||||
|
||||
**Implementation Pattern**:
|
||||
```javascript
|
||||
// Error mapping
|
||||
const ERROR_MAP = {
|
||||
'notFound': { status: 404, message: 'Document not found' },
|
||||
'authError': { status: 401, message: 'Unauthorized' },
|
||||
'forbidden': { status: 401, message: 'Unauthorized' },
|
||||
'insufficientPermissions': { status: 401, message: 'Unauthorized' },
|
||||
'rateLimitExceeded': { status: 502, message: 'Bad Gateway - Google Drive API unavailable' },
|
||||
'backendError': { status: 502, message: 'Bad Gateway - Google Drive API unavailable' }
|
||||
};
|
||||
|
||||
// Error handler
|
||||
function handleDriveError(error) {
|
||||
const reason = error.response?.data?.error?.errors?.[0]?.reason;
|
||||
const mapped = ERROR_MAP[reason] || { status: 500, message: 'Export failed - unable to retrieve document content' };
|
||||
|
||||
return {
|
||||
status: mapped.status,
|
||||
message: mapped.message
|
||||
};
|
||||
}
|
||||
```
|
||||
|
||||
**Additional Error Scenarios**:
|
||||
- Document > 10MB: Check Content-Length header, return HTTP 413
|
||||
- Timeout > 30s: Use axios timeout option, return HTTP 504
|
||||
- Unsupported mimetype: Check mimeType, return HTTP 403
|
||||
|
||||
**Alternatives Considered**:
|
||||
- JSON error responses → Rejected: Plain text simpler per spec assumptions
|
||||
- Retry logic → Rejected: Out of scope per spec
|
||||
- Detailed error messages → Rejected: Security concern, could leak internal details
|
||||
|
||||
---
|
||||
|
||||
### 6. Request Timeout & Size Limits
|
||||
|
||||
**Decision**: Implement 30-second timeout with axios and 10MB size validation via Content-Length header
|
||||
|
||||
**Rationale**:
|
||||
- axios supports timeout option for all requests
|
||||
- Content-Length header available in Google Drive API responses before streaming
|
||||
- Early validation prevents downloading oversized files
|
||||
- Timeout prevents hanging requests from blocking proxy
|
||||
|
||||
**Implementation Pattern**:
|
||||
```javascript
|
||||
// Timeout configuration
|
||||
const TIMEOUT_MS = 30000; // 30 seconds
|
||||
const MAX_SIZE_BYTES = 10 * 1024 * 1024; // 10 MB
|
||||
|
||||
// Request with timeout
|
||||
const response = await axios.get(url, {
|
||||
timeout: TIMEOUT_MS,
|
||||
headers: { Authorization: `Bearer ${accessToken}` }
|
||||
});
|
||||
|
||||
// Size validation
|
||||
const contentLength = parseInt(response.headers['content-length'] || '0');
|
||||
if (contentLength > MAX_SIZE_BYTES) {
|
||||
return res.status(413).send('Payload Too Large');
|
||||
}
|
||||
```
|
||||
|
||||
**Alternatives Considered**:
|
||||
- Progressive timeout (short for metadata, long for content) → Rejected: Adds complexity, 30s sufficient
|
||||
- No size validation → Rejected: Could stream partial files, poor UX
|
||||
- Configurable limits → Rejected: Hard-coded per spec, no need for configuration
|
||||
|
||||
---
|
||||
|
||||
### 7. Streaming vs Buffering
|
||||
|
||||
**Decision**: Stream export content directly from Google Drive to client without buffering
|
||||
|
||||
**Rationale**:
|
||||
- axios supports streaming via `responseType: 'stream'`
|
||||
- Node.js streams allow piping directly to HTTP response
|
||||
- No memory overhead for file contents (only metadata buffered)
|
||||
- Efficient for documents approaching 10MB limit
|
||||
|
||||
**Implementation Pattern**:
|
||||
```javascript
|
||||
// Stream response
|
||||
const exportResponse = await axios.get(exportUrl, {
|
||||
headers: { Authorization: `Bearer ${accessToken}` },
|
||||
responseType: 'stream',
|
||||
timeout: TIMEOUT_MS
|
||||
});
|
||||
|
||||
// Set headers
|
||||
res.setHeader('Content-Type', contentType);
|
||||
res.setHeader('Content-Disposition', contentDisposition);
|
||||
|
||||
// Pipe stream
|
||||
exportResponse.data.pipe(res);
|
||||
|
||||
// Handle stream errors
|
||||
exportResponse.data.on('error', (err) => {
|
||||
if (!res.headersSent) {
|
||||
res.status(500).send('Export failed - unable to retrieve document content');
|
||||
}
|
||||
});
|
||||
```
|
||||
|
||||
**Alternatives Considered**:
|
||||
- Buffer entire response → Rejected: Increases memory usage, adds latency
|
||||
- Chunked encoding → Not needed: Google Drive provides Content-Length
|
||||
|
||||
---
|
||||
|
||||
## Summary of Technical Decisions
|
||||
|
||||
| Area | Decision | Rationale |
|
||||
|------|----------|-----------|
|
||||
| **Metadata API** | files.get with field selection | Minimal response size, single API call |
|
||||
| **Format Selection** | Priority order: Markdown > HTML > PDF | Most portable to least portable |
|
||||
| **Native PDFs** | files.get with alt=media streaming | Efficient, no conversion needed |
|
||||
| **Headers** | Content-Disposition: inline with filename | Browser rendering + save support |
|
||||
| **Error Mapping** | Google Drive errors → HTTP status codes | Consistent client experience |
|
||||
| **Timeouts** | 30s axios timeout | Prevents hanging requests |
|
||||
| **Size Limits** | 10MB via Content-Length validation | Early rejection, no partial downloads |
|
||||
| **Streaming** | Direct pipe from Google Drive to client | Memory efficient, low latency |
|
||||
|
||||
All decisions align with constitution principles (monolithic architecture, simplicity, YAGNI) and specification requirements.
|
||||
168
specs/002-document-export/spec.md
Normal file
168
specs/002-document-export/spec.md
Normal file
@@ -0,0 +1,168 @@
|
||||
# Feature Specification: Document Export API Route
|
||||
|
||||
**Feature Branch**: `002-document-export`
|
||||
**Created**: 2026-03-09
|
||||
**Status**: Draft
|
||||
**Input**: User description: "Document exporting. This feature adds a route to the proxy.sh in the format of /documents/:documentId. This route returns a single response that exports the document. The route should 1st fetch metadata about the document from Google Drive using this api https://developers.google.com/workspace/drive/api/reference/rest/v3/files/get to retrieve these fields 'id,name,mimeType,exportLinks'. If exportLinks are available then select one export option based on availability from the following list in order "text/x-markdown","text/html","application/pdf" and export using the link provided in exportLinks. If exportLinks are not available for the document the determine the mimeType of the document and if the mimeType matches 'application/pdf' then stream the pdf file from Google Drive, otherwise send a '403' mimetype not supported message. In all cases make sure that the 'Content-Type' header is set appropriately in the Response."
|
||||
|
||||
## User Scenarios & Testing *(mandatory)*
|
||||
|
||||
### User Story 1 - Export Google Workspace Documents (Priority: P1)
|
||||
|
||||
Users request a Google Workspace document (Google Docs, Sheets, Slides) via the export route and receive it in their preferred format (Markdown, HTML, or PDF). The system intelligently selects the best available export format from Google Drive's export links.
|
||||
|
||||
**Why this priority**: This is the core functionality - exporting Google Workspace documents is the primary use case. Without this, the feature cannot deliver any value.
|
||||
|
||||
**Independent Test**: Can be fully tested by requesting a Google Doc via `/documents/:documentId` endpoint with a valid document ID and verifying the response contains the exported document in the correct format with appropriate Content-Type headers.
|
||||
|
||||
**Acceptance Scenarios**:
|
||||
|
||||
1. **Given** a Google Workspace document with export links available, **When** user requests `/documents/:documentId`, **Then** system returns the document in Markdown format (if available) with `Content-Type: text/x-markdown` and `Content-Disposition: inline; filename="[name].md"` headers
|
||||
2. **Given** a Google Workspace document without Markdown export but HTML export available, **When** user requests `/documents/:documentId`, **Then** system returns the document in HTML format with `Content-Type: text/html` and `Content-Disposition: inline; filename="[name].html"` headers
|
||||
3. **Given** a Google Workspace document with only PDF export available, **When** user requests `/documents/:documentId`, **Then** system returns the document in PDF format with `Content-Type: application/pdf` and `Content-Disposition: inline; filename="[name].pdf"` headers
|
||||
4. **Given** a valid document ID, **When** export route is called, **Then** system fetches metadata using Google Drive API with fields 'id,name,mimeType,exportLinks'
|
||||
5. **Given** an invalid document ID, **When** export is requested, **Then** system returns HTTP 404 with message "Document not found"
|
||||
6. **Given** a document the user cannot access, **When** export is requested, **Then** system returns HTTP 401 with message "Unauthorized"
|
||||
|
||||
---
|
||||
|
||||
### User Story 2 - Export Native PDF Files (Priority: P2)
|
||||
|
||||
Users request native PDF files stored in Google Drive and receive them streamed directly without conversion. This handles documents that are already in PDF format rather than Google Workspace documents.
|
||||
|
||||
**Why this priority**: Native PDFs are a common file type in Google Drive. This ensures the export route works for pre-existing PDF files, not just Google Workspace documents. It's lower priority than P1 because Google Workspace documents are the primary use case.
|
||||
|
||||
**Independent Test**: Can be fully tested by uploading a native PDF file to Google Drive, requesting it via `/documents/:documentId`, and verifying the PDF is streamed correctly with `Content-Type: application/pdf` header.
|
||||
|
||||
**Acceptance Scenarios**:
|
||||
|
||||
1. **Given** a native PDF file (mimeType: application/pdf) in Google Drive with no export links, **When** user requests `/documents/:documentId`, **Then** system streams the PDF file directly with `Content-Type: application/pdf` and `Content-Disposition: inline; filename="[name].pdf"` headers
|
||||
2. **Given** a native PDF file, **When** export is requested, **Then** system bypasses export links and streams the file content
|
||||
3. **Given** a native PDF file larger than 10MB, **When** export is requested, **Then** system returns HTTP 413 with message "Payload Too Large"
|
||||
|
||||
---
|
||||
|
||||
### User Story 3 - Handle Unsupported File Types (Priority: P3)
|
||||
|
||||
Users attempting to export unsupported file types receive clear error messages indicating the mimetype is not supported. This provides a graceful failure path for non-exportable documents.
|
||||
|
||||
**Why this priority**: Error handling is important for user experience, but it's lower priority than successfully exporting supported file types. Users primarily need the happy path working first.
|
||||
|
||||
**Independent Test**: Can be fully tested by requesting a document with an unsupported mimetype (e.g., image, video, zip) via `/documents/:documentId` and verifying a 403 response with appropriate error message.
|
||||
|
||||
**Acceptance Scenarios**:
|
||||
|
||||
1. **Given** a document with unsupported mimeType (not Google Workspace or PDF) and no export links, **When** user requests `/documents/:documentId`, **Then** system returns HTTP 403 with message "mimetype not supported"
|
||||
2. **Given** an image file (e.g., mimeType: image/jpeg), **When** export is requested, **Then** system returns 403 error
|
||||
3. **Given** an export operation that exceeds 30 seconds, **When** timeout occurs, **Then** system returns HTTP 504 with message "Gateway Timeout"
|
||||
4. **Given** Google Drive API is unavailable, **When** export is requested, **Then** system returns HTTP 502 with message "Bad Gateway - Google Drive API unavailable"
|
||||
|
||||
---
|
||||
|
||||
### Edge Cases
|
||||
|
||||
- **Invalid or non-existent document ID**: System returns HTTP 404 with message "Document not found"
|
||||
- **Insufficient permissions**: System returns HTTP 401 with message "Unauthorized" when user lacks access to the requested document
|
||||
- **Google Drive API unavailable**: System returns HTTP 502 with message "Bad Gateway - Google Drive API unavailable"
|
||||
- **Malformed or inaccessible export links**: System returns HTTP 500 with message "Export failed - unable to retrieve document content"
|
||||
- **Large documents or timeouts**: Documents exceeding 10MB return HTTP 413 "Payload Too Large"; exports exceeding 30-second timeout return HTTP 504 "Gateway Timeout"
|
||||
- **Missing mimeType field**: System treats document as unsupported and returns HTTP 403 "mimetype not supported"
|
||||
- **Multiple export formats with same priority**: Not applicable - priority list is strictly ordered
|
||||
|
||||
## Requirements *(mandatory)*
|
||||
|
||||
### Functional Requirements
|
||||
|
||||
- **FR-001**: System MUST provide an HTTP route in the format `/documents/:documentId` where documentId is the Google Drive document identifier
|
||||
- **FR-002**: System MUST fetch document metadata from Google Drive API using `https://developers.google.com/workspace/drive/api/reference/rest/v3/files/get` endpoint
|
||||
- **FR-003**: System MUST retrieve the following metadata fields: `id`, `name`, `mimeType`, `exportLinks`
|
||||
- **FR-004**: System MUST check for the presence of `exportLinks` in the metadata response
|
||||
- **FR-005**: System MUST select export format from `exportLinks` based on this priority order: `text/x-markdown`, `text/html`, `application/pdf` (first available wins)
|
||||
- **FR-006**: System MUST use the export link provided in `exportLinks` to retrieve the exported document content
|
||||
- **FR-007**: System MUST handle documents without `exportLinks` by checking the `mimeType` field
|
||||
- **FR-008**: System MUST stream native PDF files (mimeType: `application/pdf`) directly from Google Drive when no export links are available
|
||||
- **FR-009**: System MUST return HTTP 403 with message "mimetype not supported" for documents without export links and mimeType other than `application/pdf`
|
||||
- **FR-010**: System MUST set the `Content-Type` response header appropriately based on the export format used:
|
||||
- `text/x-markdown` for Markdown exports
|
||||
- `text/html` for HTML exports
|
||||
- `application/pdf` for PDF exports or native PDF files
|
||||
- **FR-011**: System MUST set the `Content-Disposition` response header to `inline; filename="[document-name].[extension]"` using the document name from Google Drive metadata and appropriate file extension for the export format
|
||||
- **FR-012**: System MUST return the exported document content as the HTTP response body
|
||||
- **FR-013**: System MUST handle authentication with Google Drive API (assumes OAuth2 or service account credentials are configured)
|
||||
- **FR-014**: System MUST return HTTP 404 with message "Document not found" when the document ID is invalid or doesn't exist in Google Drive
|
||||
- **FR-015**: System MUST return HTTP 401 with message "Unauthorized" when the user lacks permissions to access the requested document
|
||||
- **FR-016**: System MUST return HTTP 502 with message "Bad Gateway - Google Drive API unavailable" when Google Drive API is unavailable or returns an error
|
||||
- **FR-017**: System MUST return HTTP 413 with message "Payload Too Large" for documents exceeding 10MB
|
||||
- **FR-018**: System MUST return HTTP 504 with message "Gateway Timeout" when export operations exceed 30 seconds
|
||||
- **FR-019**: System MUST return HTTP 500 with message "Export failed - unable to retrieve document content" when export links are malformed or inaccessible
|
||||
|
||||
### Key Entities
|
||||
|
||||
- **Document**: A file stored in Google Drive, identified by a unique documentId
|
||||
- Attributes: id, name, mimeType, exportLinks (optional map of format to URL)
|
||||
- Can be either a Google Workspace document (Docs, Sheets, Slides) or a native file (PDF, images, etc.)
|
||||
- Google Workspace documents provide exportLinks for format conversion
|
||||
- Native files like PDFs do not have exportLinks and must be streamed directly
|
||||
|
||||
- **Export Request**: A user's request to retrieve a document via the export route
|
||||
- Attributes: documentId (from URL parameter)
|
||||
- Triggers metadata fetch, format selection, and document retrieval
|
||||
|
||||
- **Export Format**: The output format for a document
|
||||
- Supported formats: Markdown (text/x-markdown), HTML (text/html), PDF (application/pdf)
|
||||
- Prioritized by preference: Markdown > HTML > PDF
|
||||
- Determines the Content-Type header in the response
|
||||
|
||||
## Success Criteria *(mandatory)*
|
||||
|
||||
### Measurable Outcomes
|
||||
|
||||
- **SC-001**: Users can successfully export Google Workspace documents in under 5 seconds for documents under 10MB
|
||||
- **SC-002**: System correctly selects Markdown format when available in 100% of cases
|
||||
- **SC-003**: System falls back to HTML or PDF formats appropriately when Markdown is unavailable in 100% of cases
|
||||
- **SC-004**: Native PDF files are streamed successfully without conversion in 100% of attempts
|
||||
- **SC-005**: Unsupported file types return clear error messages (403 status) in 100% of cases
|
||||
- **SC-006**: Response Content-Type headers match the exported format in 100% of requests
|
||||
- **SC-007**: Response Content-Disposition headers include correct filenames with appropriate extensions in 100% of requests
|
||||
- **SC-008**: System handles at least 50 concurrent export requests without degradation
|
||||
- **SC-009**: Export success rate exceeds 99% for valid document IDs with proper permissions
|
||||
- **SC-010**: Error responses return appropriate HTTP status codes (401, 403, 404, 413, 500, 502, 504) in 100% of error scenarios
|
||||
- **SC-011**: Export operations exceeding 30 seconds timeout gracefully with HTTP 504 response
|
||||
|
||||
## Assumptions
|
||||
|
||||
- Google Drive API credentials (OAuth2 or service account) are already configured and available to the proxy
|
||||
- The proxy service has network access to Google Drive API endpoints
|
||||
- Document permissions are managed by Google Drive - the proxy inherits the authenticated user's or service account's permissions
|
||||
- The route naming convention `/documents/:documentId` matches the sitemap.xml URL format and is consistent with existing proxy route patterns
|
||||
- Export format priority (Markdown > HTML > PDF) represents the most useful format hierarchy for downstream consumers
|
||||
- Standard HTTP response codes are used: 200 (success), 401 (unauthorized), 403 (unsupported type), 404 (not found), 413 (too large), 500 (server error), 502 (bad gateway), 504 (timeout)
|
||||
- Document size limit is 10MB with a 30-second timeout for export operations, aligning with Google Drive's export API limits
|
||||
- Streaming is preferred for native PDF files to handle large files efficiently
|
||||
- Error messages are plain text responses for simplicity and consistency
|
||||
- Content-Disposition header is set to "inline" to allow browser rendering while preserving filename for downloads
|
||||
- File extensions for Content-Disposition filenames: .md (Markdown), .html (HTML), .pdf (PDF)
|
||||
|
||||
## Scope
|
||||
|
||||
**In Scope**:
|
||||
- Single document export via document ID
|
||||
- Google Workspace document export (Docs, Sheets, Slides) via exportLinks
|
||||
- Native PDF file streaming (up to 10MB)
|
||||
- Format selection based on availability and priority
|
||||
- Proper Content-Type and Content-Disposition header management
|
||||
- Error handling for unsupported mimetypes, invalid document IDs, permission issues, size limits, and timeouts
|
||||
- HTTP status codes for all error scenarios (401, 403, 404, 413, 500, 502, 504)
|
||||
- 30-second timeout for export operations
|
||||
|
||||
**Out of Scope**:
|
||||
- Batch export of multiple documents
|
||||
- Custom format selection by user (always uses priority order)
|
||||
- Document conversion beyond what Google Drive provides via exportLinks
|
||||
- Caching of exported documents
|
||||
- Rate limiting or throttling
|
||||
- User authentication/authorization (assumes proxy handles this)
|
||||
- Document metadata editing or management
|
||||
- Progress tracking for large exports
|
||||
- Retry logic for failed Google Drive API calls
|
||||
- Logging and monitoring (assumes proxy infrastructure handles this)
|
||||
347
specs/002-document-export/tasks.md
Normal file
347
specs/002-document-export/tasks.md
Normal file
@@ -0,0 +1,347 @@
|
||||
# Tasks: Document Export API Route
|
||||
|
||||
**Input**: Design documents from `/specs/002-document-export/`
|
||||
**Prerequisites**: plan.md (complete), spec.md (complete), research.md (complete), data-model.md (complete), contracts/ (complete)
|
||||
|
||||
**Tests**: Test-First Development is MANDATORY per project constitution. All tests MUST be written and FAIL before implementation begins.
|
||||
|
||||
**Organization**: Tasks are grouped by user story to enable independent implementation and testing of each story.
|
||||
|
||||
## Format: `[ID] [P?] [Story] Description`
|
||||
|
||||
- **[P]**: Can run in parallel (different files, no dependencies)
|
||||
- **[Story]**: Which user story this task belongs to (e.g., US1, US2, US3)
|
||||
- Include exact file paths in descriptions
|
||||
|
||||
## Path Conventions
|
||||
|
||||
- Single project structure: `src/`, `tests/` at repository root
|
||||
- All business logic in `src/proxyScripts/proxy.js` (monolithic architecture)
|
||||
- Helper functions (optional) in `src/globalVariables/googleDriveAdapterHelper.js`
|
||||
- Tests organized by contract/integration/unit in `tests/`
|
||||
|
||||
---
|
||||
|
||||
## Phase 1: Setup (Project Initialization)
|
||||
|
||||
**Purpose**: Verify project structure and dependencies are ready for document export feature
|
||||
|
||||
- [X] T001 Verify existing project dependencies (axios, jsonwebtoken, uuid, xmlbuilder2) in package.json
|
||||
- [X] T002 Verify test infrastructure using Node.js built-in test runner with existing test structure in tests/
|
||||
|
||||
**Checkpoint**: Project structure validated - ready for foundational work
|
||||
|
||||
---
|
||||
|
||||
## Phase 2: Foundational (Blocking Prerequisites)
|
||||
|
||||
**Purpose**: Core infrastructure and helper functions that MUST be complete before ANY user story can be implemented
|
||||
|
||||
**⚠️ CRITICAL**: No user story work can begin until this phase is complete
|
||||
|
||||
- [X] T003 [P] Add format selection helper function to src/globalVariables/googleDriveAdapterHelper.js with EXPORT_FORMATS constant and selectExportFormat() function
|
||||
- [X] T004 [P] Add filename sanitization helper function to src/globalVariables/googleDriveAdapterHelper.js for Content-Disposition headers
|
||||
- [X] T005 [P] Add file extension mapping helper function to src/globalVariables/googleDriveAdapterHelper.js (mimeType → extension)
|
||||
- [X] T006 [P] Add error status code mapping helper function to src/globalVariables/googleDriveAdapterHelper.js (Google Drive API errors → HTTP status codes)
|
||||
|
||||
**Checkpoint**: Helper functions ready - user story implementation can now begin in parallel
|
||||
|
||||
---
|
||||
|
||||
## Phase 3: User Story 1 - Export Google Workspace Documents (Priority: P1) 🎯 MVP
|
||||
|
||||
**Goal**: Users can request Google Workspace documents (Docs, Sheets, Slides) via `/documents/:documentId` and receive them in the best available format (Markdown > HTML > PDF) with appropriate headers.
|
||||
|
||||
**Independent Test**: Request a Google Doc via `/documents/:documentId` with valid document ID and verify response contains exported document in correct format with Content-Type and Content-Disposition headers.
|
||||
|
||||
### Tests for User Story 1 (MANDATORY - Write FIRST) ⚠️
|
||||
|
||||
> **CRITICAL**: Write these tests FIRST, ensure they FAIL before implementation begins. User approval required before proceeding to implementation.
|
||||
|
||||
- [X] T007 [P] [US1] Contract test: Valid Google Workspace document returns 200 with correct Content-Type header in tests/contract/documents-export.test.js
|
||||
- [X] T008 [P] [US1] Contract test: Valid document returns Content-Disposition header with inline and correct filename in tests/contract/documents-export.test.js
|
||||
- [X] T009 [P] [US1] Contract test: Document with Markdown export returns text/x-markdown Content-Type in tests/contract/documents-export.test.js
|
||||
- [X] T010 [P] [US1] Contract test: Document without Markdown but with HTML returns text/html Content-Type in tests/contract/documents-export.test.js
|
||||
- [X] T011 [P] [US1] Contract test: Document with only PDF export returns application/pdf Content-Type in tests/contract/documents-export.test.js
|
||||
- [X] T012 [P] [US1] Integration test: Fetch metadata from Google Drive API with correct fields (id,name,mimeType,exportLinks) in tests/integration/google-drive-export.test.js
|
||||
- [X] T013 [P] [US1] Integration test: Select first available export format from priority list (Markdown > HTML > PDF) in tests/integration/google-drive-export.test.js
|
||||
- [X] T014 [P] [US1] Integration test: Stream content from Google Drive export link to response in tests/integration/google-drive-export.test.js
|
||||
- [X] T015 [P] [US1] Unit test: Format selection prioritizes text/x-markdown over text/html and application/pdf in tests/unit/format-selection.test.js
|
||||
- [X] T016 [P] [US1] Unit test: Content-Disposition header generation with sanitized filename and correct extension in tests/unit/export-headers.test.js
|
||||
|
||||
**Checkpoint**: All User Story 1 tests written and FAILING. Request user approval of test scenarios before proceeding to implementation.
|
||||
|
||||
### Implementation for User Story 1
|
||||
|
||||
- [X] T017 [US1] Add /documents/:documentId route handler in src/proxyScripts/proxy.js to extract documentId from URL path
|
||||
- [X] T018 [US1] Implement Google Drive metadata fetch in src/proxyScripts/proxy.js using files.get API with fields parameter
|
||||
- [X] T019 [US1] Implement format selection logic in src/proxyScripts/proxy.js using selectExportFormat() helper (Markdown > HTML > PDF priority)
|
||||
- [X] T020 [US1] Implement export link streaming in src/proxyScripts/proxy.js using axios to fetch content from export URL
|
||||
- [X] T021 [US1] Implement Content-Type header setting in src/proxyScripts/proxy.js based on selected export format
|
||||
- [X] T022 [US1] Implement Content-Disposition header setting in src/proxyScripts/proxy.js with inline and sanitized filename
|
||||
- [X] T023 [US1] Add response streaming logic in src/proxyScripts/proxy.js to pipe content from Google Drive to HTTP response
|
||||
|
||||
**Checkpoint**: User Story 1 complete - all tests should now PASS. Verify independently before proceeding to User Story 2.
|
||||
|
||||
---
|
||||
|
||||
## Phase 4: User Story 2 - Export Native PDF Files (Priority: P2)
|
||||
|
||||
**Goal**: Users can request native PDF files stored in Google Drive and receive them streamed directly without conversion.
|
||||
|
||||
**Independent Test**: Upload a native PDF file to Google Drive, request it via `/documents/:documentId`, and verify the PDF is streamed correctly with `Content-Type: application/pdf` header.
|
||||
|
||||
### Tests for User Story 2 (MANDATORY - Write FIRST) ⚠️
|
||||
|
||||
> **CRITICAL**: Write these tests FIRST, ensure they FAIL before implementation begins. User approval required before proceeding to implementation.
|
||||
|
||||
- [ ] T024 [P] [US2] Contract test: Native PDF file (application/pdf mimeType) returns 200 with application/pdf Content-Type in tests/contract/documents-export.test.js
|
||||
- [ ] T025 [P] [US2] Contract test: Native PDF returns Content-Disposition with inline and .pdf extension in tests/contract/documents-export.test.js
|
||||
- [ ] T026 [P] [US2] Contract test: Native PDF larger than 10MB returns 413 Payload Too Large in tests/contract/documents-export.test.js
|
||||
- [ ] T027 [P] [US2] Integration test: Native PDF document without exportLinks streams via files.get with alt=media in tests/integration/google-drive-export.test.js
|
||||
- [ ] T028 [P] [US2] Integration test: Check Content-Length header before streaming and enforce 10MB limit in tests/integration/google-drive-export.test.js
|
||||
- [ ] T029 [P] [US2] Unit test: Detect native PDF by mimeType=application/pdf and absence of exportLinks in tests/unit/format-selection.test.js
|
||||
|
||||
**Checkpoint**: All User Story 2 tests written and FAILING. Request user approval of test scenarios before proceeding to implementation.
|
||||
|
||||
### Implementation for User Story 2
|
||||
|
||||
- [ ] T030 [US2] Add native PDF detection logic in src/proxyScripts/proxy.js (check mimeType=application/pdf and no exportLinks)
|
||||
- [ ] T031 [US2] Implement native PDF streaming in src/proxyScripts/proxy.js using files.get API with alt=media parameter
|
||||
- [ ] T032 [US2] Add Content-Length validation in src/proxyScripts/proxy.js to enforce 10MB limit (return 413 if exceeded)
|
||||
- [ ] T033 [US2] Add Content-Type and Content-Disposition headers for native PDF in src/proxyScripts/proxy.js
|
||||
- [ ] T034 [US2] Add response streaming for native PDF in src/proxyScripts/proxy.js to pipe from Google Drive to HTTP response
|
||||
|
||||
**Checkpoint**: User Story 2 complete - all tests should now PASS. Verify independently: US1 and US2 both work without affecting each other.
|
||||
|
||||
---
|
||||
|
||||
## Phase 5: User Story 3 - Handle Unsupported File Types (Priority: P3)
|
||||
|
||||
**Goal**: Users attempting to export unsupported file types receive clear error messages (403 status) indicating the mimetype is not supported. Includes timeout and API error handling.
|
||||
|
||||
**Independent Test**: Request a document with an unsupported mimetype (e.g., image, video, zip) via `/documents/:documentId` and verify a 403 response with appropriate error message.
|
||||
|
||||
### Tests for User Story 3 (MANDATORY - Write FIRST) ⚠️
|
||||
|
||||
> **CRITICAL**: Write these tests FIRST, ensure they FAIL before implementation begins. User approval required before proceeding to implementation.
|
||||
|
||||
- [ ] T035 [P] [US3] Contract test: Invalid document ID returns 404 with "Document not found" message in tests/contract/documents-export.test.js
|
||||
- [ ] T036 [P] [US3] Contract test: Document user cannot access returns 401 with "Unauthorized" message in tests/contract/documents-export.test.js
|
||||
- [ ] T037 [P] [US3] Contract test: Unsupported mimeType (no exportLinks, not PDF) returns 403 with "mimetype not supported" in tests/contract/documents-export.test.js
|
||||
- [ ] T038 [P] [US3] Contract test: Export operation exceeding 30 seconds returns 504 with "Gateway Timeout" message in tests/contract/documents-export.test.js
|
||||
- [ ] T039 [P] [US3] Contract test: Google Drive API unavailable returns 502 with "Bad Gateway - Google Drive API unavailable" in tests/contract/documents-export.test.js
|
||||
- [ ] T040 [P] [US3] Contract test: Malformed export link returns 500 with "Export failed - unable to retrieve document content" in tests/contract/documents-export.test.js
|
||||
- [ ] T041 [P] [US3] Integration test: Map Google Drive 404 error to HTTP 404 response in tests/integration/google-drive-export.test.js
|
||||
- [ ] T042 [P] [US3] Integration test: Map Google Drive 401/403 errors to HTTP 401 response in tests/integration/google-drive-export.test.js
|
||||
- [ ] T043 [P] [US3] Integration test: Enforce 30-second timeout using axios timeout configuration in tests/integration/google-drive-export.test.js
|
||||
- [ ] T044 [P] [US3] Unit test: Error status code mapping function correctly maps Google Drive errors to HTTP codes in tests/unit/error-mapping.test.js
|
||||
|
||||
**Checkpoint**: All User Story 3 tests written and FAILING. Request user approval of test scenarios before proceeding to implementation.
|
||||
|
||||
### Implementation for User Story 3
|
||||
|
||||
- [ ] T045 [US3] Add 404 error handling in src/proxyScripts/proxy.js for invalid/non-existent document IDs (return "Document not found")
|
||||
- [ ] T046 [US3] Add 401 error handling in src/proxyScripts/proxy.js for insufficient permissions (return "Unauthorized")
|
||||
- [ ] T047 [US3] Add 403 error handling in src/proxyScripts/proxy.js for unsupported mimetypes with no exportLinks (return "mimetype not supported")
|
||||
- [ ] T048 [US3] Add 502 error handling in src/proxyScripts/proxy.js for Google Drive API unavailability (return "Bad Gateway - Google Drive API unavailable")
|
||||
- [ ] T049 [US3] Add 500 error handling in src/proxyScripts/proxy.js for malformed/inaccessible export links (return "Export failed - unable to retrieve document content")
|
||||
- [ ] T050 [US3] Add 30-second timeout configuration in src/proxyScripts/proxy.js for all axios requests using timeout option
|
||||
- [ ] T051 [US3] Add 504 error handling in src/proxyScripts/proxy.js for timeout errors (return "Gateway Timeout")
|
||||
- [ ] T052 [US3] Integrate error mapping helper in src/proxyScripts/proxy.js to convert Google Drive API errors to appropriate HTTP status codes
|
||||
|
||||
**Checkpoint**: User Story 3 complete - all tests should now PASS. Verify all three user stories (US1, US2, US3) work independently and together.
|
||||
|
||||
---
|
||||
|
||||
## Phase 6: Polish & Cross-Cutting Concerns
|
||||
|
||||
**Purpose**: Improvements that affect multiple user stories and final quality checks
|
||||
|
||||
- [ ] T053 [P] Run all tests to verify 80%+ code coverage target achieved using Node.js built-in coverage tool
|
||||
- [ ] T054 [P] Update quickstart.md with actual implementation details and verify all examples work in specs/002-document-export/quickstart.md
|
||||
- [ ] T055 [P] Code review: Verify all logic in src/proxyScripts/proxy.js follows monolithic architecture (no imports/exports)
|
||||
- [ ] T056 [P] Code review: Verify all helper functions in src/globalVariables/googleDriveAdapterHelper.js are pure functions with literal function body pattern
|
||||
- [ ] T057 [P] Security review: Verify authentication uses existing Google Drive OAuth2/service account credentials
|
||||
- [ ] T058 [P] Performance validation: Test with 50 concurrent requests and verify <5 second response time for documents <10MB
|
||||
- [ ] T059 Manual testing: Test complete workflow with real Google Drive documents (Docs, Sheets, Slides, PDFs)
|
||||
- [ ] T060 Documentation: Update README.md or API documentation with new /documents/:documentId endpoint usage
|
||||
|
||||
**Checkpoint**: Feature complete - ready for integration testing and deployment
|
||||
|
||||
---
|
||||
|
||||
## Dependencies & Execution Order
|
||||
|
||||
### Phase Dependencies
|
||||
|
||||
- **Setup (Phase 1)**: No dependencies - can start immediately
|
||||
- **Foundational (Phase 2)**: Depends on Setup completion - BLOCKS all user stories
|
||||
- **User Stories (Phase 3-5)**: All depend on Foundational phase completion
|
||||
- User stories can then proceed in parallel (if staffed)
|
||||
- Or sequentially in priority order (P1 → P2 → P3)
|
||||
- **Polish (Phase 6)**: Depends on all desired user stories being complete
|
||||
|
||||
### User Story Dependencies
|
||||
|
||||
- **User Story 1 (P1)**: Can start after Foundational (Phase 2) - No dependencies on other stories
|
||||
- **User Story 2 (P2)**: Can start after Foundational (Phase 2) - Independent of US1 (different code paths)
|
||||
- **User Story 3 (P3)**: Can start after Foundational (Phase 2) - Adds error handling to US1/US2 code paths
|
||||
|
||||
### Within Each User Story
|
||||
|
||||
**MANDATORY TEST-FIRST WORKFLOW**:
|
||||
1. Tests MUST be written FIRST
|
||||
2. Tests MUST FAIL before implementation
|
||||
3. User MUST approve test scenarios
|
||||
4. Implementation proceeds only after approval
|
||||
5. Tests MUST PASS after implementation
|
||||
6. Code coverage MUST be ≥80%
|
||||
|
||||
**Task Execution**:
|
||||
- All tests for a story marked [P] can be written in parallel
|
||||
- Implementation tasks may have dependencies (metadata fetch before format selection)
|
||||
- Verify all tests pass before moving to next user story
|
||||
|
||||
### Parallel Opportunities
|
||||
|
||||
- **Setup tasks**: Both can run in parallel
|
||||
- **Foundational tasks**: All T003-T006 marked [P] can run in parallel (different helper functions)
|
||||
- **User story tests**: All tests within a story marked [P] can run in parallel (different test files or independent test cases)
|
||||
- **User stories**: Once Foundational completes, all three user stories can be implemented in parallel by different developers (mostly independent code paths)
|
||||
- US1: Main export logic (metadata fetch, format selection, streaming)
|
||||
- US2: Native PDF path (separate code branch)
|
||||
- US3: Error handling (wraps US1/US2 code)
|
||||
|
||||
---
|
||||
|
||||
## Parallel Example: User Story 1 Tests
|
||||
|
||||
```bash
|
||||
# Launch all US1 tests together (after Phase 2 complete):
|
||||
Task T007: "Contract test: Valid Google Workspace document returns 200..."
|
||||
Task T008: "Contract test: Valid document returns Content-Disposition..."
|
||||
Task T009: "Contract test: Document with Markdown export returns..."
|
||||
Task T010: "Contract test: Document without Markdown but with HTML..."
|
||||
Task T011: "Contract test: Document with only PDF export..."
|
||||
Task T012: "Integration test: Fetch metadata from Google Drive API..."
|
||||
Task T013: "Integration test: Select first available export format..."
|
||||
Task T014: "Integration test: Stream content from Google Drive..."
|
||||
Task T015: "Unit test: Format selection prioritizes text/x-markdown..."
|
||||
Task T016: "Unit test: Content-Disposition header generation..."
|
||||
```
|
||||
|
||||
All these tests can be written simultaneously since they test different aspects of the same functionality.
|
||||
|
||||
---
|
||||
|
||||
## Implementation Strategy
|
||||
|
||||
### MVP First (User Story 1 Only)
|
||||
|
||||
1. Complete Phase 1: Setup (T001-T002)
|
||||
2. Complete Phase 2: Foundational (T003-T006) - CRITICAL blocker
|
||||
3. Complete Phase 3: User Story 1 (T007-T023)
|
||||
- Write ALL tests first (T007-T016)
|
||||
- Get user approval of test scenarios
|
||||
- Implement (T017-T023)
|
||||
- Verify all tests pass
|
||||
4. **STOP and VALIDATE**: Test User Story 1 independently with real Google Drive documents
|
||||
5. Deploy/demo if ready - MVP feature complete!
|
||||
|
||||
**MVP Scope**: Export Google Workspace documents in best available format with proper headers.
|
||||
|
||||
### Incremental Delivery
|
||||
|
||||
1. Complete Setup + Foundational → Foundation ready
|
||||
2. Add User Story 1 → Test independently → Deploy/Demo (MVP - export Workspace docs!)
|
||||
3. Add User Story 2 → Test independently → Deploy/Demo (add native PDF support)
|
||||
4. Add User Story 3 → Test independently → Deploy/Demo (add comprehensive error handling)
|
||||
5. Complete Polish → Final validation → Production ready
|
||||
|
||||
### Parallel Team Strategy
|
||||
|
||||
With multiple developers:
|
||||
|
||||
1. Team completes Setup + Foundational together (CRITICAL - blocks everything)
|
||||
2. Once Foundational is done:
|
||||
- **Developer A**: User Story 1 (main export logic)
|
||||
- **Developer B**: User Story 2 (native PDF streaming)
|
||||
- **Developer C**: User Story 3 (error handling)
|
||||
3. Stories complete and integrate independently
|
||||
4. Team validates together
|
||||
|
||||
**Note**: US3 (error handling) may need to coordinate with US1/US2 since it wraps their code paths with error handling, but can still be developed in parallel and merged last.
|
||||
|
||||
---
|
||||
|
||||
## Testing Strategy (Constitution Requirement)
|
||||
|
||||
### Test-First Development (MANDATORY)
|
||||
|
||||
Per project constitution Section III:
|
||||
|
||||
1. **Write failing tests FIRST** - all test tasks (T007-T016, T024-T029, T035-T044) MUST be completed before corresponding implementation tasks
|
||||
2. **Obtain user approval** - checkpoint after each test phase to review and approve test scenarios
|
||||
3. **Implement minimum code** - only enough to pass tests
|
||||
4. **Maintain 80%+ coverage** - verified in Phase 6 (T053)
|
||||
|
||||
### Test Organization
|
||||
|
||||
- **Contract Tests** (`tests/contract/documents-export.test.js`):
|
||||
- HTTP endpoint behavior verification
|
||||
- Status codes, headers, response formats
|
||||
- End-to-end request/response validation
|
||||
|
||||
- **Integration Tests** (`tests/integration/google-drive-export.test.js`):
|
||||
- Google Drive API interactions with mocks
|
||||
- Format selection algorithm
|
||||
- Streaming behavior
|
||||
- Timeout and size limit enforcement
|
||||
|
||||
- **Unit Tests** (`tests/unit/`):
|
||||
- `format-selection.test.js`: Format priority logic
|
||||
- `export-headers.test.js`: Header generation (Content-Type, Content-Disposition)
|
||||
- `error-mapping.test.js`: Error code mapping
|
||||
|
||||
### Success Criteria
|
||||
|
||||
- ✅ All tests pass
|
||||
- ✅ Code coverage ≥80%
|
||||
- ✅ All user stories independently testable
|
||||
- ✅ No imports/exports in proxy.js (monolithic architecture verified)
|
||||
- ✅ Performance: <5s for docs <10MB, 50 concurrent requests
|
||||
- ✅ 99% success rate for valid requests
|
||||
|
||||
---
|
||||
|
||||
## Notes
|
||||
|
||||
- **[P] tasks** = different files, no dependencies, can run in parallel
|
||||
- **[Story] label** maps task to specific user story for traceability
|
||||
- **Test-First is MANDATORY** - constitution requires TDD workflow
|
||||
- **Monolithic architecture** - all business logic in `src/proxyScripts/proxy.js`, zero imports/exports
|
||||
- **Helper functions** - optional pure utilities in `src/globalVariables/googleDriveAdapterHelper.js`
|
||||
- **User approval required** at test checkpoints before implementation
|
||||
- **Commit after each task** or logical group
|
||||
- **Stop at any checkpoint** to validate story independently
|
||||
- **Total tasks**: 60 (10 tests, 50 implementation/setup, solid coverage of all requirements)
|
||||
|
||||
---
|
||||
|
||||
## Task Count Summary
|
||||
|
||||
- **Phase 1 (Setup)**: 2 tasks
|
||||
- **Phase 2 (Foundational)**: 4 tasks (all parallelizable)
|
||||
- **Phase 3 (User Story 1)**: 17 tasks (10 tests [P], 7 implementation)
|
||||
- **Phase 4 (User Story 2)**: 11 tasks (6 tests [P], 5 implementation)
|
||||
- **Phase 5 (User Story 3)**: 18 tasks (10 tests [P], 8 implementation)
|
||||
- **Phase 6 (Polish)**: 8 tasks (7 parallelizable)
|
||||
|
||||
**Total**: 60 tasks
|
||||
|
||||
**Parallel Opportunities**: 37 tasks marked [P] (62% of all tasks)
|
||||
|
||||
**Test Coverage**: 26 test tasks (43% of total) ensuring TDD compliance
|
||||
|
||||
**MVP Scope**: Phases 1-3 (23 tasks) = Basic Google Workspace document export
|
||||
@@ -256,31 +256,227 @@ function generateSitemap(documents, baseUrl) {
|
||||
return generateSitemapXML(entries);
|
||||
}
|
||||
|
||||
// =============================================================================
|
||||
// Document Export Helpers
|
||||
// =============================================================================
|
||||
|
||||
/**
|
||||
* Export format configuration with priority ordering
|
||||
* Markdown > HTML > PDF (most portable to least)
|
||||
*/
|
||||
const EXPORT_FORMATS = [
|
||||
{ mimeType: 'text/x-markdown', extension: 'md' },
|
||||
{ mimeType: 'text/html', extension: 'html' },
|
||||
{ mimeType: 'application/pdf', extension: 'pdf' }
|
||||
];
|
||||
|
||||
/**
|
||||
* Select best available export format from exportLinks
|
||||
* Uses priority order: Markdown > HTML > PDF
|
||||
*
|
||||
* @param {Object} exportLinks - Map of mimeType to export URL from Google Drive
|
||||
* @returns {Object|null} { url, contentType, extension } or null if no format available
|
||||
*/
|
||||
function selectExportFormat(exportLinks) {
|
||||
if (!exportLinks || typeof exportLinks !== 'object') {
|
||||
return null;
|
||||
}
|
||||
|
||||
for (const format of EXPORT_FORMATS) {
|
||||
if (exportLinks[format.mimeType]) {
|
||||
return {
|
||||
url: exportLinks[format.mimeType],
|
||||
contentType: format.mimeType,
|
||||
extension: format.extension
|
||||
};
|
||||
}
|
||||
}
|
||||
|
||||
return null;
|
||||
}
|
||||
|
||||
/**
|
||||
* Sanitize filename for Content-Disposition header
|
||||
* Removes/replaces characters that could cause issues in HTTP headers
|
||||
*
|
||||
* @param {string} filename - Original filename from Google Drive
|
||||
* @returns {string} Sanitized filename safe for HTTP headers
|
||||
*/
|
||||
function sanitizeFilename(filename) {
|
||||
if (!filename || typeof filename !== 'string') {
|
||||
return 'document';
|
||||
}
|
||||
|
||||
return filename
|
||||
// Remove/replace problematic characters
|
||||
.replace(/[^\w\s.-]/g, '_') // Replace non-word chars (except space, dot, dash)
|
||||
.replace(/\s+/g, '_') // Replace spaces with underscores
|
||||
.replace(/\.+/g, '.') // Collapse multiple dots
|
||||
.replace(/^\./, '') // Remove leading dot
|
||||
.replace(/\.$/, '') // Remove trailing dot
|
||||
.substring(0, 255); // Limit length
|
||||
}
|
||||
|
||||
/**
|
||||
* Get file extension for a given MIME type
|
||||
*
|
||||
* @param {string} mimeType - MIME type string
|
||||
* @returns {string} File extension (without dot)
|
||||
*/
|
||||
function getFileExtension(mimeType) {
|
||||
const extensionMap = {
|
||||
'text/x-markdown': 'md',
|
||||
'text/html': 'html',
|
||||
'application/pdf': 'pdf',
|
||||
'text/plain': 'txt',
|
||||
'application/json': 'json',
|
||||
'image/jpeg': 'jpg',
|
||||
'image/png': 'png',
|
||||
'image/gif': 'gif'
|
||||
};
|
||||
|
||||
return extensionMap[mimeType] || 'bin';
|
||||
}
|
||||
|
||||
/**
|
||||
* Map Google Drive API errors to HTTP status codes for document export
|
||||
* Extends the existing mapDriveErrorToHttp for export-specific errors
|
||||
*
|
||||
* @param {Error} error - Error object from Google Drive API or axios
|
||||
* @returns {Object} { statusCode, message }
|
||||
*/
|
||||
function mapExportErrorToHttp(error) {
|
||||
// Use existing Drive error mapper as base
|
||||
const baseMapping = mapDriveErrorToHttp(error);
|
||||
|
||||
// Extract status code from error
|
||||
const statusCode = error.response?.status || error.code || baseMapping.statusCode;
|
||||
|
||||
// Map specific errors for document export
|
||||
switch (statusCode) {
|
||||
case 404:
|
||||
return {
|
||||
statusCode: 404,
|
||||
message: 'Document not found'
|
||||
};
|
||||
|
||||
case 401:
|
||||
case 403:
|
||||
return {
|
||||
statusCode: 401,
|
||||
message: 'Unauthorized'
|
||||
};
|
||||
|
||||
case 413:
|
||||
return {
|
||||
statusCode: 413,
|
||||
message: 'Payload Too Large'
|
||||
};
|
||||
|
||||
case 502:
|
||||
case 503:
|
||||
return {
|
||||
statusCode: 502,
|
||||
message: 'Bad Gateway - Google Drive API unavailable'
|
||||
};
|
||||
|
||||
case 504:
|
||||
return {
|
||||
statusCode: 504,
|
||||
message: 'Gateway Timeout'
|
||||
};
|
||||
|
||||
default:
|
||||
// Check for timeout errors (ECONNABORTED, ETIMEDOUT)
|
||||
if (error.code === 'ECONNABORTED' || error.code === 'ETIMEDOUT') {
|
||||
return {
|
||||
statusCode: 504,
|
||||
message: 'Gateway Timeout'
|
||||
};
|
||||
}
|
||||
|
||||
return {
|
||||
statusCode: 500,
|
||||
message: 'Export failed - unable to retrieve document content'
|
||||
};
|
||||
}
|
||||
}
|
||||
|
||||
// =============================================================================
|
||||
// Route Parsing
|
||||
// =============================================================================
|
||||
|
||||
/**
|
||||
* Build proxy prefix from request params
|
||||
* Returns the full path prefix based on routing parameters
|
||||
* Example: "/ProxyScript/run/67bca862210071627d32ef12/current/googleDriveAdapter"
|
||||
*
|
||||
* @param {Object} params - Request params object with workspaceId, branch, route
|
||||
* @returns {string} Full proxy prefix path, or empty string if no params
|
||||
*/
|
||||
function buildProxyPrefixFromParams(params) {
|
||||
if (!params || !params.workspaceId || !params.branch || !params.route) {
|
||||
return '';
|
||||
}
|
||||
|
||||
// Extract original path to find the pathPrefix
|
||||
const originalPath = params["0"] || '';
|
||||
const suffix = `/${params.workspaceId}/${params.branch}/${params.route}`;
|
||||
|
||||
// Find where suffix starts in original path
|
||||
const suffixIndex = originalPath.indexOf(suffix);
|
||||
if (suffixIndex === -1) {
|
||||
return '';
|
||||
}
|
||||
|
||||
// Extract pathPrefix from original path
|
||||
const pathPrefix = originalPath.substring(0, suffixIndex);
|
||||
return `${pathPrefix}${suffix}`;
|
||||
}
|
||||
|
||||
/**
|
||||
* Parse route from request
|
||||
* Handles optional proxy prefix from request params
|
||||
*
|
||||
* @param {string} method - HTTP method
|
||||
* @param {string} url - Request URL
|
||||
* @param {Object} params - Request params object (optional, contains routing info)
|
||||
* @returns {Object} Route info or error
|
||||
*/
|
||||
function parseRoute(method, url) {
|
||||
function parseRoute(method, url, params) {
|
||||
if (method !== "GET") {
|
||||
return { route: null, error: "Method not allowed", statusCode: 405 };
|
||||
}
|
||||
|
||||
const urlObj = new URL(url, "http://localhost");
|
||||
const path = urlObj.pathname;
|
||||
let path = urlObj.pathname;
|
||||
|
||||
// Strip proxy prefix if params provided
|
||||
const proxyPrefix = buildProxyPrefixFromParams(params);
|
||||
if (proxyPrefix && path.startsWith(proxyPrefix)) {
|
||||
path = path.substring(proxyPrefix.length);
|
||||
// Ensure path starts with /
|
||||
if (!path.startsWith('/')) {
|
||||
path = '/' + path;
|
||||
}
|
||||
}
|
||||
|
||||
// Match /documents/:documentId route for document export
|
||||
const documentMatch = path.match(/^\/documents\/([^\/]+)$/);
|
||||
if (documentMatch) {
|
||||
return {
|
||||
route: "document-export",
|
||||
documentId: documentMatch[1]
|
||||
};
|
||||
}
|
||||
|
||||
// Match any path containing 'sitemap.xml'
|
||||
if (path.includes("sitemap.xml")) {
|
||||
return { route: "sitemap" };
|
||||
}
|
||||
|
||||
// All other paths return 404
|
||||
return { route: null, error: "Not found", statusCode: 404 };
|
||||
// All other paths return 404
|
||||
return { route: null, error: "Not found", statusCode: 404 };
|
||||
}
|
||||
|
||||
// =============================================================================
|
||||
@@ -302,6 +498,13 @@ return {
|
||||
// Error mapping
|
||||
mapDriveErrorToHttp,
|
||||
|
||||
// Document Export
|
||||
EXPORT_FORMATS,
|
||||
selectExportFormat,
|
||||
sanitizeFilename,
|
||||
getFileExtension,
|
||||
mapExportErrorToHttp,
|
||||
|
||||
// Sitemap
|
||||
toSitemapEntry,
|
||||
transformDocumentsToSitemapEntries,
|
||||
@@ -309,5 +512,6 @@ return {
|
||||
generateSitemap,
|
||||
|
||||
// Routing
|
||||
buildProxyPrefixFromParams,
|
||||
parseRoute,
|
||||
};
|
||||
|
||||
@@ -3,7 +3,7 @@
|
||||
"type": "service_account",
|
||||
"project_id": "black-portfolio-486723-f2",
|
||||
"private_key_id": "01d829a7ef4b4a85506ad31718bb4331c8618183",
|
||||
"private_key": "-----BEGIN PRIVATE KEY-----\nMIIEvAIBADANBgkqhkiG9w0BAQEFAASCBKYwggSiAgEAAoIBAQChyph2XLEiFtmN\nvnetGXI4xXwwH1rxd9ieDD9jMfBGvTjAcbhLILO4BBa+bWMyHzjyK7AKp4a4Bs5w\ny3dYHCFlCF/46kf2j+wcCDmnBTHcjS61E+ycSy9Jznz7myZac3bzf4QT3z4po02S\n+okJdH1CKGFepWfQysmATCQ8+nQQCovjydQI6WX+q5LlUpGv3WLjHJsX0YNpy1fS\nHglMw57iY5s7qprdFJNlQqgVy9TQGxzcIZtRFW6YriMhZty7Nxdr5hvLWgqtSNuw\ntFzqeWcJyuCbROlDTWGmKtlwa8lRmtNXWSrIPNKvDaicr7HLUoMh3phfMMtxi1OC\nfZhXvd0BAgMBAAECggEAAURGka5/uh1ejvfPzy6ILKzSUEcOPB5zhx0VOeHSMInt\nWII0wUeoaTi6I/EpGK4CcQYcajXwmyQzADKInNC3ZneJ1vyga1F/cE1Ubw6ZFYcT\nB5AIOdrzwU69KeQM4/exgF4rhsi4T2+aeoZ3gD1jLcDGuTdqGYTH7iHSerooLS1h\nGwG5wSy0fn5vpRNfMZLZ9ZdPQi+PQujiVDkuABYdax018kHyqFTxCzaxX6uuSKMH\nDFh3k6q/WUxgEmQfs6cvJKVbcXs8vPU4ROaz2shId3NV/jev1orrRpFOl5ptKBxh\nbtOykl21r96gGTN2zs32KgleDYFDpscDk3Ik2FqXkQKBgQDTJ6iSCvjUflWl6HQz\nuuDuXMElK842vr3SLAgUskB4hY6Hy4kZskfIth+KvAM+wxS6JPUXx2p6VaL8P4Av\nJklwQdXHEfSRrgBncUtfFcrcvvykOioaL17fS2EPsbjjdsBt9oZCiahRDSwoIhrE\nwKqwHlrmVWC7EluNRxxzPKFuEQKBgQDEJxPVSWQUIWtCHwa1NnU1oxvUTdSSkGRJ\n3AG9zAu3r0mwqA2BvAVDbfDQmvWgXALvmaLcIkKJOEcTv7MuH9o5TyEoBBJ4DtKY\nQWDcPn0rBCvX7GQ3b5r9DhbiubtCotyJQnL6LIpwZpJC+sDTq5x6nI3TrOrFmyGA\n4XlNL+5P8QKBgD7479wGK6lrt+1Pwv/+dsB/pxaH1usavY+llA9gDbwj0JsNB2lD\ncwcX0ZZVdf5Mvay6AuJBla7ARWhHI9pr57Dz4WaKI08i/nnbHuhPnn1w8/WiZxYC\nFKAxYdQFY6dqrf7da7MCTNFHRWj+qs8MyprVorRYuA1ybx1WHNT9OwORAoGABJ/L\nNucBBfx3s9pZZSJAhyAuQsYG8eGXi6o1HE1YJV9rhE+h6eIN2bYYzEIq8jnZE97y\nWPAx01xRSKTnS3oSwfEcnf3ilZP74P0BlI+gkcgKZI+9GRV3eOnBHl00jfCa9F1t\nqnosVVQFtLCGpTbRfI5+RXQ5IKl0k749BtXPb3ECgYBLO3S+EBTcsS9fYzq9dNSk\n5WG2DEqezgciJ46ZPtLqLs4IhDNVozhorY9lfg9LfQpY5gAJR88eLtdAv0quDbYn\npxxMglR09qNmcX/nPvRfCvua45n2VaTRuUdyEzjKUtUlp5oVWo4XmKLqPTOeJkuh\nT85FN2oAmw0ZGJ0eLiU7tQ==\n-----END PRIVATE KEY-----\n",
|
||||
"private_key": "-----BEGIN PRIVATE KEY-----\nMIIEvQIBADANBgkqhkiG9w0BAQEFAASCBKcwggSjAgEAAoIBAQCxyGEYBKNgn7ie\nDKZ21O6ykoAwculAj0PyqXidjK4JjywZpdKabqp4n0fkS+87SLYCaaGQnYnFQKO9\ntwrx6QRu4i6RoCsEAdQYQ+ZZEhaRv9nnl9u6GtFKTulNC2slVJ5Qn1qc5Ovn+h+T\nFV1qf8U9SEMsluI/kUjoNzmXqD4r3j6MTH5l42Cij8DQxj9c7zzN80RWZ+GIs9AX\nJKi5akNr8fc3gVi9m+mBuEvwdPJKyVY2qz16Vcpe/bEkqO4aC9pKTU7Y7I8ofdjR\niTwxHR862pR4lyHuOHnO2K2+vtiQ5KhZYCS4pdz9qvJGARVJWlE66ETGKBUJxSxX\n0y0U5usTAgMBAAECggEABKlq9lzExfdaOXa+dLY/rhoOV3bj6+f10aqk+YijVafV\n8bQ58ge925zdnzxogQb2ktifPnILF0uLH6HpnQ9NqPSwYOwwxJGhtKMvKp3BTsAX\noC2IuvgSDd9E2drXS+rMnfOXxi5wiywxYMN6KB1CmElJTaWFOEKAhWpRTctBGhhW\nFolLcmF9LUtNlh6BNs1aezQUmPp+vwDS9uGqKUcG1x5dnCpSWyKrdmA1Lf46KECr\nm2Ev3ir6TY8Yo/n8Fyvtj8QcbZXNatGT+eW3MrQ7aup+EfYbvVgdY8r0C6fCMvB+\nf2PMBDPO3sX4D6uGNqjMbHvpy+1BPNzT0jUqGQfQrQKBgQDlq9hQPw2xQBhvjkQ7\ncfAC7GShLe2kVbnKwz3BpbUQWcEYnxymZm1oumud0fN9bcQnfDGF4V/4W5erykcA\nYp8uRr3LzwWXb0XQRx5T/kI5VIFXtB/O/VhoYoChJbxp/Lp471Et/kSussfEhyyT\nk1EFNm5JOgd3STev8hvRaJ/hJwKBgQDGKcNb28PM6ZUN3R9GYtS6bmV6QgL1GWsX\neJpm0y1S7anL+EOMXD4hSKbyKs8x59iWegDeM5xxxmlXy6dDRlsPH2RIUP/rV1ki\ndvLmWTl5Wu3CeOnGeNRW2wzFOexVAFxScFh+NrgEkdHZQqc873Ya5Aaj52/EimS3\nTwA7ZH0CNQKBgAuJYkhFoo5wxcl0wACsbH35GeTuxa0nkTmaLRP5GutDVuvBslK5\nem10T8uRrEV0qhHBr0smUwfKsgezFXXzfkN40jfWolVFBaC8sc1OTE1M7WJWbfKb\nz0EPEZ8GojxAsa05eD5zM0gDOv2oPJj9IWi9nzSWcaGQT/fKlZMjSkSpAoGBAKMo\nb7mKUMS+7gLkNYP2i8CUdOkcwOKdcxd4LWjMJ11IYa2XU8aVjHJLJ2ns5XvpsOL0\nwRIy3HSxMLsg6y7xFrh02FTSnGRhHvrJhWUzwaaxv2GHvLO1eN+qq/EXqAa0rU8T\nQUlqNElO5sFDp/78CvpJFU6Ol+/zIsnrOf2s12ChAoGAB3s2msWo5ESMcYw8fY8G\n/fsDsIEs31GEd62CD8x0x2uF0Kktf9AHda+D/ieweLHK5MZEkfuajDvoSkunYDgP\nAo9KMBSvXcr6XBeplVKaAGvPjZyJ33ae+WIIMM2yKmALMMBqp09Tl77od1gr77MN\nkDuPA7r1pEUaXbHtonPGa6c=\n-----END PRIVATE KEY-----\n",
|
||||
"client_email": "n8n-service@black-portfolio-486723-f2.iam.gserviceaccount.com",
|
||||
"client_id": "108246676308214231920",
|
||||
"auth_uri": "https://accounts.google.com/o/oauth2/auth",
|
||||
@@ -16,7 +16,6 @@
|
||||
"https://www.googleapis.com/auth/drive.readonly"
|
||||
],
|
||||
"driveQuery": "trashed = false",
|
||||
"proxyScriptEndPoint": "http://localhost:3000",
|
||||
"sitemap": {
|
||||
"maxUrls": 50000
|
||||
}
|
||||
|
||||
@@ -28,33 +28,66 @@ function getLogLevel() {
|
||||
}
|
||||
|
||||
/**
|
||||
* Log a message with structured metadata
|
||||
* @param {string} level - Log level (DEBUG|INFO|WARN|ERROR)
|
||||
* @param {string} message - Log message
|
||||
* @param {Object} meta - Additional metadata
|
||||
* Format and pretty print an object
|
||||
* @param {Object} obj - Object to format
|
||||
* @returns {string} Formatted object string
|
||||
*/
|
||||
export function log(level, message, meta = {}) {
|
||||
function formatObject(obj) {
|
||||
return JSON.stringify(obj, null, 2);
|
||||
}
|
||||
|
||||
/**
|
||||
* Log a message or object
|
||||
* @param {string} level - Log level (DEBUG|INFO|WARN|ERROR)
|
||||
* @param {string|Object} data - Log message (string) or structured data (object)
|
||||
*/
|
||||
function _log(level, data) {
|
||||
const levelValue = LOG_LEVELS[level] ?? LOG_LEVELS.INFO;
|
||||
const threshold = getLogLevel();
|
||||
|
||||
// Only log if level meets or exceeds threshold
|
||||
if (levelValue >= threshold) {
|
||||
const entry = {
|
||||
timestamp: new Date().toISOString(),
|
||||
level,
|
||||
message,
|
||||
...meta
|
||||
};
|
||||
originalConsoleLog(JSON.stringify(entry));
|
||||
let entry;
|
||||
|
||||
if (typeof data === 'string') {
|
||||
// String input: create structured log entry
|
||||
entry = {
|
||||
timestamp: new Date().toISOString(),
|
||||
level,
|
||||
message: data
|
||||
};
|
||||
originalConsoleLog(JSON.stringify(entry));
|
||||
} else if (typeof data === 'object' && data !== null) {
|
||||
// Object input: pretty print with timestamp and level
|
||||
entry = {
|
||||
timestamp: new Date().toISOString(),
|
||||
level,
|
||||
...data
|
||||
};
|
||||
originalConsoleLog(formatObject(entry));
|
||||
} else {
|
||||
// Fallback: convert to string
|
||||
entry = {
|
||||
timestamp: new Date().toISOString(),
|
||||
level,
|
||||
message: String(data)
|
||||
};
|
||||
originalConsoleLog(JSON.stringify(entry));
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Console-like logging interface
|
||||
* Exported as 'console' to match standard console API
|
||||
*
|
||||
* Usage:
|
||||
* logger.info("Simple message")
|
||||
* logger.info({ message: "Structured data", requestId: "123", status: 200 })
|
||||
*/
|
||||
export const logger = {
|
||||
debug: (message, meta) => log('DEBUG', message, meta),
|
||||
info: (message, meta) => log('INFO', message, meta),
|
||||
error: (message, meta) => log('ERROR', message, meta)
|
||||
log: (data) => _log('INFO', data),
|
||||
debug: (data) => _log('DEBUG', data),
|
||||
info: (data) => _log('INFO', data),
|
||||
error: (data) => _log('ERROR', data)
|
||||
};
|
||||
|
||||
@@ -11,7 +11,6 @@
|
||||
* Globals provided by server.js:
|
||||
* - console: Custom logger
|
||||
* - crypto: Web Crypto API (provides randomUUID())
|
||||
* - config: Infrastructure settings (server port, logging level)
|
||||
* - axios: HTTP client
|
||||
* - uuidv4: UUID generator
|
||||
* - jwt: JSON Web Token library
|
||||
@@ -22,6 +21,8 @@
|
||||
* - scopes: OAuth2 scopes array
|
||||
* - driveQuery: Drive API query filter
|
||||
* - sitemap: Sitemap configuration (maxUrls)
|
||||
* - req: HTTP request object (includes req.params with routing info if proxy prefix configured)
|
||||
* - res: HTTP response object
|
||||
*
|
||||
* Structure:
|
||||
* Section 1: Authentication (Service Account JWT)
|
||||
@@ -31,6 +32,7 @@
|
||||
* @module proxy
|
||||
*/
|
||||
|
||||
|
||||
// NO IMPORTS - ALL dependencies provided as globals by server.js
|
||||
|
||||
// =============================================================================
|
||||
@@ -79,8 +81,9 @@ async function initializeServiceAccount() {
|
||||
{ headers: { "Content-Type": "application/x-www-form-urlencoded" } }
|
||||
);
|
||||
|
||||
console.info("Service account authenticated", {
|
||||
email: settings.serviceAccount.client_email,
|
||||
console.info({
|
||||
message: "Service account authenticated",
|
||||
email: settings.serviceAccount.client_email
|
||||
});
|
||||
|
||||
return response.data.access_token;
|
||||
@@ -153,9 +156,10 @@ async function queryDocuments(options = {}) {
|
||||
pageToken = response.data.nextPageToken;
|
||||
} while (pageToken);
|
||||
|
||||
console.info("Drive API query completed", {
|
||||
console.info({
|
||||
message: "Drive API query completed",
|
||||
documentCount: allFiles.length,
|
||||
duration: Date.now() - startTime,
|
||||
duration: Date.now() - startTime
|
||||
});
|
||||
|
||||
return allFiles;
|
||||
@@ -165,17 +169,56 @@ async function queryDocuments(options = {}) {
|
||||
// Section 3: Request Handling & Routing
|
||||
// =============================================================================
|
||||
|
||||
/**
|
||||
* Generate base URL from incoming request including path
|
||||
* Uses X-Forwarded-Proto and X-Forwarded-Host headers if behind proxy,
|
||||
* otherwise falls back to direct request protocol and host
|
||||
* Includes the path up to (but not including) sitemap.xml
|
||||
*
|
||||
* @param {Object} req - HTTP request object
|
||||
* @returns {string} Base URL (e.g., "https://example.com/api/v1")
|
||||
*/
|
||||
function getBaseUrl(req) {
|
||||
// Check for forwarded headers (when behind reverse proxy)
|
||||
const forwardedProto = req.headers['x-forwarded-proto'];
|
||||
const forwardedHost = req.headers['x-forwarded-host'];
|
||||
|
||||
let baseUrl;
|
||||
if (forwardedProto && forwardedHost) {
|
||||
baseUrl = `${forwardedProto}://${forwardedHost}`;
|
||||
} else {
|
||||
// Fall back to direct request
|
||||
const protocol = req.connection?.encrypted ? 'https' : 'http';
|
||||
const host = req.headers.host || 'localhost:3000';
|
||||
baseUrl = `${protocol}://${host}`;
|
||||
}
|
||||
|
||||
// Extract path from request URL, removing sitemap.xml and trailing slash
|
||||
const url = req.url || '/';
|
||||
const pathWithoutSitemap = url.replace(/\/sitemap\.xml.*$/, '');
|
||||
|
||||
// Append path if it exists and is not just root
|
||||
if (pathWithoutSitemap && pathWithoutSitemap !== '/') {
|
||||
baseUrl += pathWithoutSitemap;
|
||||
}
|
||||
|
||||
return baseUrl;
|
||||
}
|
||||
|
||||
/**
|
||||
* Handle sitemap generation request
|
||||
*/
|
||||
async function handleSitemapRequest(res, requestId) {
|
||||
async function handleSitemapRequest(req, res, requestId) {
|
||||
try {
|
||||
const settings = google_drive_settings || {};
|
||||
const maxUrls = settings.sitemap?.maxUrls || 50000;
|
||||
const query = settings.driveQuery || "trashed = false";
|
||||
|
||||
const documents = await queryDocuments({ query, maxDocuments: maxUrls });
|
||||
const xml = googleDriveAdapterHelper.generateSitemap(documents, settings.proxyScriptEndPoint);
|
||||
|
||||
// Generate base URL from request instead of using configured setting
|
||||
const baseUrl = getBaseUrl(req);
|
||||
const xml = googleDriveAdapterHelper.generateSitemap(documents, baseUrl);
|
||||
|
||||
res.statusCode = 200;
|
||||
res.setHeader("Content-Type", "application/xml; charset=utf-8");
|
||||
@@ -183,7 +226,11 @@ async function handleSitemapRequest(res, requestId) {
|
||||
res.setHeader("X-Document-Count", documents.length.toString());
|
||||
res.end(xml);
|
||||
|
||||
console.info("Sitemap generated", { requestId, documentCount: documents.length });
|
||||
console.info({
|
||||
message: "Sitemap generated",
|
||||
requestId,
|
||||
documentCount: documents.length
|
||||
});
|
||||
} catch (error) {
|
||||
const errorResponse = googleDriveAdapterHelper.mapDriveErrorToHttp(error);
|
||||
res.statusCode = errorResponse.statusCode;
|
||||
@@ -192,10 +239,188 @@ async function handleSitemapRequest(res, requestId) {
|
||||
}
|
||||
res.end(); // Empty body per spec
|
||||
|
||||
console.error("Sitemap generation failed", {
|
||||
console.error({
|
||||
message: "Sitemap generation failed",
|
||||
requestId,
|
||||
error: error.message,
|
||||
statusCode: errorResponse.statusCode,
|
||||
statusCode: errorResponse.statusCode
|
||||
});
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Handle document export request
|
||||
* Exports a single Google Drive document in best available format
|
||||
*/
|
||||
async function handleDocumentExportRequest(res, documentId, requestId) {
|
||||
try {
|
||||
const accessToken = await getAccessTokenCached();
|
||||
|
||||
// Step 1: Fetch document metadata from Google Drive
|
||||
const metadataUrl = `https://www.googleapis.com/drive/v3/files/${documentId}`;
|
||||
const metadataParams = {
|
||||
fields: 'id,name,mimeType,exportLinks',
|
||||
supportsAllDrives: true
|
||||
};
|
||||
|
||||
console.info({
|
||||
message: "Fetching document metadata",
|
||||
requestId,
|
||||
documentId
|
||||
});
|
||||
|
||||
const metadataResponse = await axios.get(metadataUrl, {
|
||||
params: metadataParams,
|
||||
headers: { Authorization: `Bearer ${accessToken}` },
|
||||
timeout: 30000 // 30-second timeout
|
||||
});
|
||||
|
||||
const document = metadataResponse.data;
|
||||
|
||||
console.info({
|
||||
message: "Document metadata retrieved",
|
||||
requestId,
|
||||
documentId: document.id,
|
||||
name: document.name,
|
||||
mimeType: document.mimeType,
|
||||
hasExportLinks: !!document.exportLinks
|
||||
});
|
||||
|
||||
// Step 2: Determine export strategy
|
||||
let contentUrl;
|
||||
let contentType;
|
||||
let fileExtension;
|
||||
|
||||
if (document.exportLinks) {
|
||||
// Google Workspace document - select best export format
|
||||
const exportFormat = googleDriveAdapterHelper.selectExportFormat(document.exportLinks);
|
||||
|
||||
if (!exportFormat) {
|
||||
// No supported export format available
|
||||
const error = googleDriveAdapterHelper.mapExportErrorToHttp({ response: { status: 403 } });
|
||||
res.statusCode = 403;
|
||||
res.setHeader("X-Request-Id", requestId);
|
||||
res.end("mimetype not supported");
|
||||
console.error({
|
||||
message: "No supported export format",
|
||||
requestId,
|
||||
documentId,
|
||||
mimeType: document.mimeType
|
||||
});
|
||||
return;
|
||||
}
|
||||
|
||||
contentUrl = exportFormat.url;
|
||||
contentType = exportFormat.contentType;
|
||||
fileExtension = exportFormat.extension;
|
||||
|
||||
console.info({
|
||||
message: "Export format selected",
|
||||
requestId,
|
||||
documentId,
|
||||
contentType,
|
||||
extension: fileExtension
|
||||
});
|
||||
} else if (document.mimeType === 'application/pdf') {
|
||||
// Native PDF file - stream directly
|
||||
contentUrl = `https://www.googleapis.com/drive/v3/files/${documentId}?alt=media&supportsAllDrives=true`;
|
||||
contentType = 'application/pdf';
|
||||
fileExtension = 'pdf';
|
||||
|
||||
console.info({
|
||||
message: "Streaming native PDF",
|
||||
requestId,
|
||||
documentId
|
||||
});
|
||||
} else {
|
||||
// Unsupported document type
|
||||
res.statusCode = 403;
|
||||
res.setHeader("X-Request-Id", requestId);
|
||||
res.end("mimetype not supported");
|
||||
console.error({
|
||||
message: "Unsupported mimetype",
|
||||
requestId,
|
||||
documentId,
|
||||
mimeType: document.mimeType
|
||||
});
|
||||
return;
|
||||
}
|
||||
|
||||
// Step 3: Stream content from Google Drive
|
||||
const contentResponse = await axios.get(contentUrl, {
|
||||
headers: { Authorization: `Bearer ${accessToken}` },
|
||||
responseType: 'stream',
|
||||
timeout: 30000 // 30-second timeout
|
||||
});
|
||||
|
||||
// Step 4: Check size limit (10MB)
|
||||
const contentLength = contentResponse.headers['content-length'];
|
||||
const maxSizeBytes = 10 * 1024 * 1024; // 10MB
|
||||
|
||||
if (contentLength && parseInt(contentLength) > maxSizeBytes) {
|
||||
res.statusCode = 413;
|
||||
res.setHeader("X-Request-Id", requestId);
|
||||
res.end("Payload Too Large");
|
||||
console.error({
|
||||
message: "Document exceeds size limit",
|
||||
requestId,
|
||||
documentId,
|
||||
size: contentLength,
|
||||
limit: maxSizeBytes
|
||||
});
|
||||
return;
|
||||
}
|
||||
|
||||
// Step 5: Set response headers
|
||||
res.statusCode = 200;
|
||||
res.setHeader("Content-Type", contentType);
|
||||
res.setHeader("X-Request-Id", requestId);
|
||||
|
||||
// Generate Content-Disposition header
|
||||
const sanitizedFilename = googleDriveAdapterHelper.sanitizeFilename(document.name);
|
||||
const contentDisposition = `inline; filename="${sanitizedFilename}.${fileExtension}"`;
|
||||
res.setHeader("Content-Disposition", contentDisposition);
|
||||
|
||||
if (contentLength) {
|
||||
res.setHeader("Content-Length", contentLength);
|
||||
}
|
||||
|
||||
// Step 6: Stream content to response
|
||||
contentResponse.data.pipe(res);
|
||||
|
||||
console.info({
|
||||
message: "Document export streaming",
|
||||
requestId,
|
||||
documentId,
|
||||
contentType,
|
||||
filename: `${sanitizedFilename}.${fileExtension}`,
|
||||
size: contentLength || 'unknown'
|
||||
});
|
||||
|
||||
// Wait for stream to complete
|
||||
await new Promise((resolve, reject) => {
|
||||
contentResponse.data.on('end', resolve);
|
||||
contentResponse.data.on('error', reject);
|
||||
});
|
||||
|
||||
console.info({
|
||||
message: "Document export completed",
|
||||
requestId,
|
||||
documentId
|
||||
});
|
||||
|
||||
} catch (error) {
|
||||
const errorResponse = googleDriveAdapterHelper.mapExportErrorToHttp(error);
|
||||
res.statusCode = errorResponse.statusCode;
|
||||
res.setHeader("X-Request-Id", requestId);
|
||||
res.end(errorResponse.message || "Export failed");
|
||||
|
||||
console.error({
|
||||
message: "Document export failed",
|
||||
requestId,
|
||||
documentId,
|
||||
error: error.message,
|
||||
statusCode: errorResponse.statusCode
|
||||
});
|
||||
}
|
||||
}
|
||||
@@ -207,40 +432,56 @@ async function handleSitemapRequest(res, requestId) {
|
||||
const requestId = googleDriveAdapterHelper.generateRequestId();
|
||||
const startTime = Date.now();
|
||||
|
||||
console.info("Request received", {
|
||||
console.info({
|
||||
message: "Request received",
|
||||
requestId,
|
||||
method: req.method,
|
||||
url: req.url,
|
||||
params: req.params,
|
||||
query: req.query,
|
||||
body: req.body
|
||||
});
|
||||
|
||||
try {
|
||||
const routeResult = googleDriveAdapterHelper.parseRoute(req.method, req.url);
|
||||
const routeResult = googleDriveAdapterHelper.parseRoute(req.method, req.url, req.params);
|
||||
|
||||
if (!routeResult.route) {
|
||||
res.statusCode = routeResult.statusCode;
|
||||
res.end();
|
||||
console.error("Route not found", { requestId, url: req.url });
|
||||
console.error({
|
||||
message: "Route not found",
|
||||
requestId,
|
||||
url: req.url
|
||||
});
|
||||
return;
|
||||
}
|
||||
|
||||
// Handle sitemap route
|
||||
if (routeResult.route === "sitemap") {
|
||||
await handleSitemapRequest(res, requestId);
|
||||
await handleSitemapRequest(req, res, requestId);
|
||||
return;
|
||||
}
|
||||
|
||||
// Handle document export route
|
||||
if (routeResult.route === "document-export") {
|
||||
await handleDocumentExportRequest(res, routeResult.documentId, requestId);
|
||||
return;
|
||||
}
|
||||
} catch (error) {
|
||||
res.statusCode = 500;
|
||||
res.end();
|
||||
console.error("Request handler error", {
|
||||
console.error({
|
||||
message: "Request handler error",
|
||||
requestId,
|
||||
error: error.message,
|
||||
stack: error.stack,
|
||||
stack: error.stack
|
||||
});
|
||||
} finally {
|
||||
console.info("Request completed", {
|
||||
console.info({
|
||||
message: "Request completed",
|
||||
requestId,
|
||||
statusCode: res.statusCode,
|
||||
duration: Date.now() - startTime,
|
||||
duration: Date.now() - startTime
|
||||
});
|
||||
}
|
||||
})();
|
||||
|
||||
@@ -47,7 +47,10 @@ function loadGlobalVariables() {
|
||||
const varName = file.replace(".json", "");
|
||||
const data = JSON.parse(readFileSync(join(globalDir, file), "utf-8"));
|
||||
globalVariableContext[varName] = data;
|
||||
logger.info(`Loaded global data: ${varName}`, { keys: Object.keys(data) });
|
||||
logger.info({
|
||||
message: `Loaded global data: ${varName}`,
|
||||
keys: Object.keys(data)
|
||||
});
|
||||
});
|
||||
|
||||
// Load JS files second (functions can reference JSON data)
|
||||
@@ -64,16 +67,19 @@ function loadGlobalVariables() {
|
||||
const returnedObject = script.runInContext(context);
|
||||
globalVariableContext[varName] = returnedObject;
|
||||
|
||||
logger.info(`Loaded global functions: ${varName}`, {
|
||||
logger.info({
|
||||
message: `Loaded global functions: ${varName}`,
|
||||
type: typeof returnedObject,
|
||||
isObject: typeof returnedObject === 'object' && returnedObject !== null,
|
||||
keys: returnedObject ? Object.keys(returnedObject).length : 0
|
||||
});
|
||||
});
|
||||
|
||||
logger.info(`Loaded ${jsonFiles.length + jsFiles.length} global variables`,
|
||||
{ json: jsonFiles.length, js: jsFiles.length }
|
||||
);
|
||||
logger.info({
|
||||
message: `Loaded ${jsonFiles.length + jsFiles.length} global variables`,
|
||||
json: jsonFiles.length,
|
||||
js: jsFiles.length
|
||||
});
|
||||
}
|
||||
|
||||
/**
|
||||
@@ -92,6 +98,9 @@ function loadConfig() {
|
||||
port: process.env.PORT ? parseInt(process.env.PORT, 10) : config.server.port,
|
||||
host: process.env.HOST || config.server.host,
|
||||
},
|
||||
proxy: {
|
||||
...config.proxy,
|
||||
},
|
||||
logging: {
|
||||
...config.logging,
|
||||
level: process.env.LOG_LEVEL || config.logging.level,
|
||||
@@ -120,13 +129,15 @@ async function startServer() {
|
||||
loadGlobalVariables();
|
||||
|
||||
logger.info("Starting Proxy Script Server...");
|
||||
logger.info(
|
||||
`Configuration loaded: ${JSON.stringify({
|
||||
port: global.config.server.port,
|
||||
host: global.config.server.host,
|
||||
logLevel: global.config.logging.level,
|
||||
})}`,
|
||||
);
|
||||
logger.info({
|
||||
message: "Configuration loaded",
|
||||
port: global.config.server.port,
|
||||
host: global.config.server.host,
|
||||
logLevel: global.config.logging.level,
|
||||
proxyPrefix: global.config.proxy ?
|
||||
`${global.config.proxy.pathPrefix}${global.config.proxy.workspaceId}/${global.config.proxy.branch}/${global.config.proxy.routeName}` :
|
||||
'(none)'
|
||||
});
|
||||
|
||||
// Validate configuration
|
||||
validateConfig(global.config);
|
||||
@@ -139,6 +150,25 @@ async function startServer() {
|
||||
// Create HTTP server that delegates all requests to proxy
|
||||
const server = http.createServer((req, res) => {
|
||||
try {
|
||||
// Extract proxy routing metadata from config (not parsed from URL)
|
||||
// and attach to req.params if URL matches the configured prefix
|
||||
if (global.config.proxy) {
|
||||
const { pathPrefix, workspaceId, branch, routeName } = global.config.proxy;
|
||||
const fullPrefix = `${pathPrefix.replace(/\/$/, '')}/${workspaceId}/${branch}/${routeName}`;
|
||||
|
||||
// Check if URL starts with proxy prefix
|
||||
if (req.url.startsWith(fullPrefix)) {
|
||||
// Add routing metadata to request for proxy.js
|
||||
// All values come from config, not parsed from URL
|
||||
req.params = {
|
||||
"0": req.url, // Original full path
|
||||
workspaceId, // From config.proxy.workspaceId
|
||||
branch, // From config.proxy.branch
|
||||
route: routeName // From config.proxy.routeName (renamed to 'route' for consistency)
|
||||
};
|
||||
}
|
||||
}
|
||||
|
||||
const context = vm.createContext({
|
||||
...globalVMContext,
|
||||
...globalVariableContext,
|
||||
@@ -147,9 +177,10 @@ async function startServer() {
|
||||
});
|
||||
script.runInContext(context);
|
||||
} catch (error) {
|
||||
logger.error("Request handling failed", {
|
||||
logger.error({
|
||||
message: "Request handling failed",
|
||||
error: error.message,
|
||||
stack: error.stack,
|
||||
stack: error.stack
|
||||
});
|
||||
res.statusCode = 500;
|
||||
res.end("Internal Server Error");
|
||||
@@ -176,15 +207,17 @@ async function startServer() {
|
||||
|
||||
// Start listening
|
||||
server.listen(global.config.server.port, global.config.server.host, () => {
|
||||
logger.info("Server listening", {
|
||||
logger.info({
|
||||
message: "Server listening",
|
||||
port: global.config.server.port,
|
||||
host: global.config.server.host,
|
||||
host: global.config.server.host
|
||||
});
|
||||
});
|
||||
} catch (error) {
|
||||
logger.error("Failed to start server", {
|
||||
logger.error({
|
||||
message: "Failed to start server",
|
||||
error: error.message,
|
||||
stack: error.stack,
|
||||
stack: error.stack
|
||||
});
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
45
tests/contract/documents-export.test.js
Normal file
45
tests/contract/documents-export.test.js
Normal file
@@ -0,0 +1,45 @@
|
||||
/**
|
||||
* Contract Tests: Document Export Endpoint
|
||||
*
|
||||
* Tests the HTTP contract for /documents/:documentId endpoint
|
||||
* Verifies request/response behavior against the API contract specification
|
||||
*/
|
||||
|
||||
import { describe, it } from 'node:test';
|
||||
import assert from 'node:assert';
|
||||
|
||||
describe('Document Export Endpoint Contract', () => {
|
||||
describe('Valid Google Workspace Document', () => {
|
||||
it('should return 200 with correct Content-Type header', async () => {
|
||||
// Test: Valid document returns 200 OK with appropriate Content-Type
|
||||
// TODO: Implement once /documents/:documentId route exists in proxy.js
|
||||
assert.ok(true, 'Test not yet implemented - awaiting route implementation');
|
||||
});
|
||||
|
||||
it('should return Content-Disposition header with inline and correct filename', async () => {
|
||||
// Test: Response includes Content-Disposition: inline; filename="..."
|
||||
// TODO: Implement once headers are set in proxy.js
|
||||
assert.ok(true, 'Test not yet implemented - awaiting implementation');
|
||||
});
|
||||
});
|
||||
|
||||
describe('Format-Specific Responses', () => {
|
||||
it('should return text/x-markdown Content-Type for Markdown export', async () => {
|
||||
// Test: Document with Markdown export returns text/x-markdown
|
||||
// TODO: Implement once format selection is in proxy.js
|
||||
assert.ok(true, 'Test not yet implemented - awaiting implementation');
|
||||
});
|
||||
|
||||
it('should return text/html Content-Type when Markdown unavailable', async () => {
|
||||
// Test: Falls back to HTML when Markdown not available
|
||||
// TODO: Implement once format selection is in proxy.js
|
||||
assert.ok(true, 'Test not yet implemented - awaiting implementation');
|
||||
});
|
||||
|
||||
it('should return application/pdf Content-Type for PDF-only export', async () => {
|
||||
// Test: Falls back to PDF when Markdown and HTML unavailable
|
||||
// TODO: Implement once format selection is in proxy.js
|
||||
assert.ok(true, 'Test not yet implemented - awaiting implementation');
|
||||
});
|
||||
});
|
||||
});
|
||||
39
tests/integration/google-drive-export.test.js
Normal file
39
tests/integration/google-drive-export.test.js
Normal file
@@ -0,0 +1,39 @@
|
||||
/**
|
||||
* Integration Tests: Google Drive Export
|
||||
*
|
||||
* Tests integration with Google Drive API for document export
|
||||
* Uses mocks to avoid real API calls during testing
|
||||
*/
|
||||
|
||||
import { describe, it, mock, beforeEach } from 'node:test';
|
||||
import assert from 'node:assert';
|
||||
|
||||
describe('Google Drive Export Integration', () => {
|
||||
describe('Metadata Fetch', () => {
|
||||
it('should fetch metadata from Google Drive API with correct fields', async () => {
|
||||
// This test will verify that we request the correct fields from Google Drive API
|
||||
// Fields: id, name, mimeType, exportLinks
|
||||
|
||||
// TODO: Implement once proxy.js has the document export route
|
||||
assert.ok(true, 'Test not yet implemented - awaiting route implementation');
|
||||
});
|
||||
});
|
||||
|
||||
describe('Format Selection', () => {
|
||||
it('should select first available export format from priority list', async () => {
|
||||
// This test will verify that format selection respects priority: Markdown > HTML > PDF
|
||||
|
||||
// TODO: Implement once proxy.js has format selection logic
|
||||
assert.ok(true, 'Test not yet implemented - awaiting implementation');
|
||||
});
|
||||
});
|
||||
|
||||
describe('Content Streaming', () => {
|
||||
it('should stream content from Google Drive export link to response', async () => {
|
||||
// This test will verify that content is streamed (not buffered)
|
||||
|
||||
// TODO: Implement once proxy.js has streaming logic
|
||||
assert.ok(true, 'Test not yet implemented - awaiting implementation');
|
||||
});
|
||||
});
|
||||
});
|
||||
176
tests/unit/export-headers.test.js
Normal file
176
tests/unit/export-headers.test.js
Normal file
@@ -0,0 +1,176 @@
|
||||
/**
|
||||
* Unit Tests: Export Headers Generation
|
||||
*
|
||||
* Tests Content-Disposition header generation logic
|
||||
* Verifies filename sanitization and extension handling
|
||||
*/
|
||||
|
||||
import { describe, it } from 'node:test';
|
||||
import assert from 'node:assert';
|
||||
|
||||
// Load helper functions using vm.Script pattern
|
||||
import { readFileSync } from 'fs';
|
||||
import { Script } from 'vm';
|
||||
|
||||
// Create VM context with required globals
|
||||
const vmContext = {
|
||||
crypto: globalThis.crypto,
|
||||
console: console
|
||||
};
|
||||
|
||||
// Load googleDriveAdapterHelper.js
|
||||
const helperCode = readFileSync('./src/globalVariables/googleDriveAdapterHelper.js', 'utf8');
|
||||
const wrappedCode = `(function() {\n${helperCode}\n})()`;
|
||||
const script = new Script(wrappedCode);
|
||||
const helpers = script.runInNewContext(vmContext);
|
||||
|
||||
describe('Export Headers', () => {
|
||||
describe('sanitizeFilename', () => {
|
||||
it('should preserve valid alphanumeric filenames', () => {
|
||||
const result = helpers.sanitizeFilename('MyDocument123');
|
||||
assert.strictEqual(result, 'MyDocument123');
|
||||
});
|
||||
|
||||
it('should replace spaces with underscores', () => {
|
||||
const result = helpers.sanitizeFilename('My Important Document');
|
||||
assert.strictEqual(result, 'My_Important_Document');
|
||||
});
|
||||
|
||||
it('should preserve hyphens and dots', () => {
|
||||
const result = helpers.sanitizeFilename('my-document.v2.final');
|
||||
assert.strictEqual(result, 'my-document.v2.final');
|
||||
});
|
||||
|
||||
it('should replace special characters with underscores', () => {
|
||||
const result = helpers.sanitizeFilename('My<>Document:with*chars?');
|
||||
assert.strictEqual(result, 'My__Document_with_chars_');
|
||||
});
|
||||
|
||||
it('should remove leading dots', () => {
|
||||
const result = helpers.sanitizeFilename('.hidden-file');
|
||||
assert.strictEqual(result, 'hidden-file');
|
||||
});
|
||||
|
||||
it('should remove trailing dots', () => {
|
||||
const result = helpers.sanitizeFilename('document...');
|
||||
assert.strictEqual(result, 'document');
|
||||
});
|
||||
|
||||
it('should collapse multiple dots', () => {
|
||||
const result = helpers.sanitizeFilename('my....document');
|
||||
assert.strictEqual(result, 'my.document');
|
||||
});
|
||||
|
||||
it('should handle null input', () => {
|
||||
const result = helpers.sanitizeFilename(null);
|
||||
assert.strictEqual(result, 'document');
|
||||
});
|
||||
|
||||
it('should handle undefined input', () => {
|
||||
const result = helpers.sanitizeFilename(undefined);
|
||||
assert.strictEqual(result, 'document');
|
||||
});
|
||||
|
||||
it('should handle empty string', () => {
|
||||
const result = helpers.sanitizeFilename('');
|
||||
assert.strictEqual(result, 'document');
|
||||
});
|
||||
|
||||
it('should truncate very long filenames to 255 characters', () => {
|
||||
const longName = 'a'.repeat(300);
|
||||
const result = helpers.sanitizeFilename(longName);
|
||||
assert.strictEqual(result.length, 255);
|
||||
});
|
||||
|
||||
it('should handle unicode characters', () => {
|
||||
const result = helpers.sanitizeFilename('Документ 文档');
|
||||
// Non-ASCII chars should be replaced
|
||||
assert.ok(result.includes('_'));
|
||||
});
|
||||
});
|
||||
|
||||
describe('getFileExtension', () => {
|
||||
it('should return "md" for text/x-markdown', () => {
|
||||
const result = helpers.getFileExtension('text/x-markdown');
|
||||
assert.strictEqual(result, 'md');
|
||||
});
|
||||
|
||||
it('should return "html" for text/html', () => {
|
||||
const result = helpers.getFileExtension('text/html');
|
||||
assert.strictEqual(result, 'html');
|
||||
});
|
||||
|
||||
it('should return "pdf" for application/pdf', () => {
|
||||
const result = helpers.getFileExtension('application/pdf');
|
||||
assert.strictEqual(result, 'pdf');
|
||||
});
|
||||
|
||||
it('should return "txt" for text/plain', () => {
|
||||
const result = helpers.getFileExtension('text/plain');
|
||||
assert.strictEqual(result, 'txt');
|
||||
});
|
||||
|
||||
it('should return "json" for application/json', () => {
|
||||
const result = helpers.getFileExtension('application/json');
|
||||
assert.strictEqual(result, 'json');
|
||||
});
|
||||
|
||||
it('should return "bin" for unknown mime types', () => {
|
||||
const result = helpers.getFileExtension('application/octet-stream');
|
||||
assert.strictEqual(result, 'bin');
|
||||
});
|
||||
|
||||
it('should return "bin" for null input', () => {
|
||||
const result = helpers.getFileExtension(null);
|
||||
assert.strictEqual(result, 'bin');
|
||||
});
|
||||
|
||||
it('should return "bin" for undefined input', () => {
|
||||
const result = helpers.getFileExtension(undefined);
|
||||
assert.strictEqual(result, 'bin');
|
||||
});
|
||||
});
|
||||
|
||||
describe('Content-Disposition header integration', () => {
|
||||
it('should generate correct header for markdown file', () => {
|
||||
const filename = 'Meeting Notes Q1 2026';
|
||||
const sanitized = helpers.sanitizeFilename(filename);
|
||||
const extension = helpers.getFileExtension('text/x-markdown');
|
||||
|
||||
const header = `inline; filename="${sanitized}.${extension}"`;
|
||||
|
||||
assert.strictEqual(header, 'inline; filename="Meeting_Notes_Q1_2026.md"');
|
||||
});
|
||||
|
||||
it('should generate correct header for html file', () => {
|
||||
const filename = 'Project<Plan>';
|
||||
const sanitized = helpers.sanitizeFilename(filename);
|
||||
const extension = helpers.getFileExtension('text/html');
|
||||
|
||||
const header = `inline; filename="${sanitized}.${extension}"`;
|
||||
|
||||
assert.strictEqual(header, 'inline; filename="Project_Plan_.html"');
|
||||
});
|
||||
|
||||
it('should generate correct header for pdf file', () => {
|
||||
const filename = 'Annual Report 2026';
|
||||
const sanitized = helpers.sanitizeFilename(filename);
|
||||
const extension = helpers.getFileExtension('application/pdf');
|
||||
|
||||
const header = `inline; filename="${sanitized}.${extension}"`;
|
||||
|
||||
assert.strictEqual(header, 'inline; filename="Annual_Report_2026.pdf"');
|
||||
});
|
||||
|
||||
it('should handle filename with existing extension', () => {
|
||||
const filename = 'document.old.txt';
|
||||
const sanitized = helpers.sanitizeFilename(filename);
|
||||
const extension = helpers.getFileExtension('text/x-markdown');
|
||||
|
||||
const header = `inline; filename="${sanitized}.${extension}"`;
|
||||
|
||||
// Should preserve the dots in filename and add new extension
|
||||
assert.strictEqual(header, 'inline; filename="document.old.txt.md"');
|
||||
});
|
||||
});
|
||||
});
|
||||
127
tests/unit/format-selection.test.js
Normal file
127
tests/unit/format-selection.test.js
Normal file
@@ -0,0 +1,127 @@
|
||||
/**
|
||||
* Unit Tests: Format Selection Logic
|
||||
*
|
||||
* Tests the selectExportFormat helper function
|
||||
* Verifies priority ordering: Markdown > HTML > PDF
|
||||
*/
|
||||
|
||||
import { describe, it } from 'node:test';
|
||||
import assert from 'node:assert';
|
||||
|
||||
// Load helper functions using vm.Script pattern (similar to proxy.js)
|
||||
import { readFileSync } from 'fs';
|
||||
import { Script } from 'vm';
|
||||
|
||||
// Create VM context with required globals
|
||||
const vmContext = {
|
||||
crypto: globalThis.crypto,
|
||||
console: console
|
||||
};
|
||||
|
||||
// Load googleDriveAdapterHelper.js
|
||||
const helperCode = readFileSync('./src/globalVariables/googleDriveAdapterHelper.js', 'utf8');
|
||||
const wrappedCode = `(function() {\n${helperCode}\n})()`;
|
||||
const script = new Script(wrappedCode);
|
||||
const helpers = script.runInNewContext(vmContext);
|
||||
|
||||
describe('Format Selection', () => {
|
||||
describe('selectExportFormat', () => {
|
||||
it('should prioritize text/x-markdown over other formats', () => {
|
||||
const exportLinks = {
|
||||
'text/x-markdown': 'https://example.com/export?format=md',
|
||||
'text/html': 'https://example.com/export?format=html',
|
||||
'application/pdf': 'https://example.com/export?format=pdf'
|
||||
};
|
||||
|
||||
const result = helpers.selectExportFormat(exportLinks);
|
||||
|
||||
assert.strictEqual(result.contentType, 'text/x-markdown');
|
||||
assert.strictEqual(result.extension, 'md');
|
||||
assert.strictEqual(result.url, 'https://example.com/export?format=md');
|
||||
});
|
||||
|
||||
it('should select text/html when markdown is unavailable', () => {
|
||||
const exportLinks = {
|
||||
'text/html': 'https://example.com/export?format=html',
|
||||
'application/pdf': 'https://example.com/export?format=pdf'
|
||||
};
|
||||
|
||||
const result = helpers.selectExportFormat(exportLinks);
|
||||
|
||||
assert.strictEqual(result.contentType, 'text/html');
|
||||
assert.strictEqual(result.extension, 'html');
|
||||
assert.strictEqual(result.url, 'https://example.com/export?format=html');
|
||||
});
|
||||
|
||||
it('should select application/pdf when markdown and html are unavailable', () => {
|
||||
const exportLinks = {
|
||||
'application/pdf': 'https://example.com/export?format=pdf'
|
||||
};
|
||||
|
||||
const result = helpers.selectExportFormat(exportLinks);
|
||||
|
||||
assert.strictEqual(result.contentType, 'application/pdf');
|
||||
assert.strictEqual(result.extension, 'pdf');
|
||||
assert.strictEqual(result.url, 'https://example.com/export?format=pdf');
|
||||
});
|
||||
|
||||
it('should return null when no supported formats are available', () => {
|
||||
const exportLinks = {
|
||||
'text/plain': 'https://example.com/export?format=txt',
|
||||
'application/json': 'https://example.com/export?format=json'
|
||||
};
|
||||
|
||||
const result = helpers.selectExportFormat(exportLinks);
|
||||
|
||||
assert.strictEqual(result, null);
|
||||
});
|
||||
|
||||
it('should return null when exportLinks is null', () => {
|
||||
const result = helpers.selectExportFormat(null);
|
||||
|
||||
assert.strictEqual(result, null);
|
||||
});
|
||||
|
||||
it('should return null when exportLinks is undefined', () => {
|
||||
const result = helpers.selectExportFormat(undefined);
|
||||
|
||||
assert.strictEqual(result, null);
|
||||
});
|
||||
|
||||
it('should return null when exportLinks is empty object', () => {
|
||||
const result = helpers.selectExportFormat({});
|
||||
|
||||
assert.strictEqual(result, null);
|
||||
});
|
||||
|
||||
it('should respect priority order even when formats appear in different order', () => {
|
||||
const exportLinks = {
|
||||
'application/pdf': 'https://example.com/export?format=pdf',
|
||||
'text/x-markdown': 'https://example.com/export?format=md',
|
||||
'text/html': 'https://example.com/export?format=html'
|
||||
};
|
||||
|
||||
const result = helpers.selectExportFormat(exportLinks);
|
||||
|
||||
// Should still select Markdown despite PDF being first in object
|
||||
assert.strictEqual(result.contentType, 'text/x-markdown');
|
||||
});
|
||||
});
|
||||
|
||||
describe('EXPORT_FORMATS constant', () => {
|
||||
it('should define correct priority order', () => {
|
||||
assert.ok(Array.isArray(helpers.EXPORT_FORMATS));
|
||||
assert.strictEqual(helpers.EXPORT_FORMATS.length, 3);
|
||||
|
||||
// Verify order: Markdown > HTML > PDF
|
||||
assert.strictEqual(helpers.EXPORT_FORMATS[0].mimeType, 'text/x-markdown');
|
||||
assert.strictEqual(helpers.EXPORT_FORMATS[0].extension, 'md');
|
||||
|
||||
assert.strictEqual(helpers.EXPORT_FORMATS[1].mimeType, 'text/html');
|
||||
assert.strictEqual(helpers.EXPORT_FORMATS[1].extension, 'html');
|
||||
|
||||
assert.strictEqual(helpers.EXPORT_FORMATS[2].mimeType, 'application/pdf');
|
||||
assert.strictEqual(helpers.EXPORT_FORMATS[2].extension, 'pdf');
|
||||
});
|
||||
});
|
||||
});
|
||||
Reference in New Issue
Block a user