Data Model: Document Export API Route
Feature: 002-document-export
Date: 2026-03-09
Purpose: Define data structures and entities for document export functionality
Overview
This feature introduces three primary entities for handling document export requests: Document, ExportRequest, and ExportFormat. These entities represent the data flowing through the export pipeline from request initiation to response delivery.
Entities
1. Document
Represents a file stored in Google Drive, accessed by unique ID.
Attributes:
| Field | Type | Required | Description | Validation |
|---|---|---|---|---|
| id | string | Yes | Google Drive document identifier (extracted from URL parameter) | Non-empty string, alphanumeric with hyphens/underscores |
| name | string | Yes | Document name from Google Drive metadata | Non-empty string, used in Content-Disposition filename |
| mimeType | string | Yes | MIME type of the document | One of Google Workspace types or native file types |
| exportLinks | object | No | Map of available export formats to URLs | Key: MIME type (string), Value: Export URL (string) |
Document Types:
- Google Workspace Documents:
  - Docs: application/vnd.google-apps.document
  - Sheets: application/vnd.google-apps.spreadsheet
  - Slides: application/vnd.google-apps.presentation
  - Characteristic: Have exportLinks field with conversion options
- Native Files:
  - PDF: application/pdf
  - Images: image/jpeg, image/png, etc.
  - Other: Various MIME types
  - Characteristic: No exportLinks field, streamed directly
State Transitions:
- N/A (stateless - documents fetched per request)
Example:
// Google Workspace Document (has exportLinks)
{
id: "1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms",
name: "Meeting Notes Q1 2026",
mimeType: "application/vnd.google-apps.document",
exportLinks: {
"text/x-markdown": "https://docs.google.com/feeds/download/documents/export/Export?...",
"text/html": "https://docs.google.com/feeds/download/documents/export/Export?...",
"application/pdf": "https://docs.google.com/feeds/download/documents/export/Export?..."
}
}
// Native PDF (no exportLinks)
{
id: "1AbcDeFgHiJkLmNoPqRsTuVwXyZ1234567890",
name: "Product Specs",
mimeType: "application/pdf",
exportLinks: null
}
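The attributes above could be expressed as a type along the following lines. This is a minimal sketch: the names DriveDocument, WORKSPACE_PREFIX, hasExportLinks, and isWorkspaceDocument are hypothetical and do not come from the feature's codebase; only the field names follow the Attributes table.

```typescript
// Hypothetical type; field names follow the Attributes table.
interface DriveDocument {
  id: string;
  name: string;
  mimeType: string;
  // Absent or null for native files such as PDFs and images.
  exportLinks?: Record<string, string> | null;
}

// Prefix shared by the Google Workspace MIME types listed under Document Types.
const WORKSPACE_PREFIX = "application/vnd.google-apps.";

// Type guard: true when the document carries at least one export link.
function hasExportLinks(
  doc: DriveDocument
): doc is DriveDocument & { exportLinks: Record<string, string> } {
  return doc.exportLinks != null && Object.keys(doc.exportLinks).length > 0;
}

// True for Docs, Sheets, Slides and other Workspace-native types.
function isWorkspaceDocument(doc: DriveDocument): boolean {
  return doc.mimeType.startsWith(WORKSPACE_PREFIX);
}
```

Treating a missing exportLinks field and an explicit null the same way keeps the check tolerant of either representation.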
2. ExportRequest
Represents a user's request to export a document via the /documents/:documentId route.
Attributes:
| Field | Type | Required | Description | Validation |
|---|---|---|---|---|
| documentId | string | Yes | Document ID from URL path parameter | Non-empty string, alphanumeric with hyphens/underscores |
| timestamp | Date | Yes | Request initiation timestamp | ISO 8601 format, used for timeout calculation |
| accessToken | string | Yes | Google Drive API access token (from auth context) | Non-empty OAuth 2.0 access token, not expired |
Lifecycle:
- Initiated: Request received on /documents/:documentId
- Authenticated: Access token validated and available
- Metadata Fetched: Google Drive API called for document metadata
- Format Selected: Export format chosen based on availability
- Content Streamed: Document content piped to response
- Completed: Response sent to client
Timeout Handling:
- Maximum duration: 30 seconds from timestamp
- Enforced via axios timeout configuration
- Returns HTTP 504 if exceeded
Example:
{
documentId: "1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms",
timestamp: "2026-03-09T18:00:00.000Z",
accessToken: "ya29.a0AfH6SMBx..." // Google OAuth2 access token
}
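As an illustration of how the timestamp field supports the timeout calculation, the sketch below derives the remaining time budget for a request. The names ExportRequest (as a TypeScript interface), MAX_DURATION_MS, remainingMs, and hasTimedOut are assumptions made for this example; the 30-second limit comes from the Timeout Handling list.

```typescript
// Hypothetical shape; field names follow the ExportRequest Attributes table.
interface ExportRequest {
  documentId: string;
  timestamp: Date;
  accessToken: string;
}

// 30-second limit from the Timeout Handling list.
const MAX_DURATION_MS = 30000;

// Milliseconds left before the request exceeds its budget (never negative).
function remainingMs(req: ExportRequest, now: Date = new Date()): number {
  const elapsed = now.getTime() - req.timestamp.getTime();
  return Math.max(0, MAX_DURATION_MS - elapsed);
}

// True once more than 30 seconds have passed since the request started.
function hasTimedOut(req: ExportRequest, now: Date = new Date()): boolean {
  return now.getTime() - req.timestamp.getTime() > MAX_DURATION_MS;
}
```

One possible use is to pass the value of remainingMs as the timeout option of each outbound axios call, so that the metadata fetch and the content stream share a single 30-second budget.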
3. ExportFormat
Represents the selected output format for a document export.
Attributes:
| Field | Type | Required | Description | Validation |
|---|---|---|---|---|
| mimeType | string | Yes | MIME type of the export format | One of: text/x-markdown, text/html, application/pdf |
| extension | string | Yes | File extension for Content-Disposition header | One of: md, html, pdf |
| url | string | Conditional | Export URL from Google Drive exportLinks | Required for Google Workspace docs, null for native files |
| isNative | boolean | Yes | Whether this is a native file (direct stream) or export | true for native PDFs, false for conversions |
Format Priority: Order of selection when multiple formats are available:
1. text/x-markdown (.md) - Most portable for content processing
2. text/html (.html) - Rich formatting fallback
3. application/pdf (.pdf) - Universal viewing format
Selection Rules:
- If exportLinks exist: Select first available format from priority list
- If no exportLinks and mimeType === 'application/pdf': Use native PDF streaming
- Otherwise: Return HTTP 403 "mimetype not supported"
Example:
// Google Workspace Document export (Markdown selected)
{
mimeType: "text/x-markdown",
extension: "md",
url: "https://docs.google.com/feeds/download/documents/export/Export?...",
isNative: false
}
// Native PDF file (direct stream)
{
mimeType: "application/pdf",
extension: "pdf",
url: null, // Not used - file streamed directly
isNative: true
}
// Unsupported file (image)
{
mimeType: null,
extension: null,
url: null,
isNative: false
}
// Returns HTTP 403
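The format priority and selection rules can be sketched as a single pure function. The names DocumentMeta, FORMAT_PRIORITY, and selectExportFormat are hypothetical. The source does not say what happens when exportLinks exist but contain none of the three prioritized formats; this sketch assumes that case is also unsupported and returns null, which the caller would translate into HTTP 403.

```typescript
// Hypothetical minimal view of a Document: only the fields selection needs.
interface DocumentMeta {
  mimeType: string;
  exportLinks?: Record<string, string> | null;
}

interface ExportFormat {
  mimeType: string;
  extension: string;
  url: string | null;
  isNative: boolean;
}

// Priority order from the Format Priority list: Markdown, then HTML, then PDF.
const FORMAT_PRIORITY: ReadonlyArray<{ mimeType: string; extension: string }> = [
  { mimeType: "text/x-markdown", extension: "md" },
  { mimeType: "text/html", extension: "html" },
  { mimeType: "application/pdf", extension: "pdf" },
];

// Returns the selected format, or null when the caller should answer HTTP 403.
function selectExportFormat(doc: DocumentMeta): ExportFormat | null {
  const links = doc.exportLinks;
  if (links && Object.keys(links).length > 0) {
    for (const candidate of FORMAT_PRIORITY) {
      const url = links[candidate.mimeType];
      if (url) {
        return { ...candidate, url, isNative: false };
      }
    }
    // Assumption: export links without any prioritized format count as unsupported.
    return null;
  }
  if (doc.mimeType === "application/pdf") {
    // Native PDF: no export URL, the file is streamed directly.
    return { mimeType: "application/pdf", extension: "pdf", url: null, isNative: true };
  }
  return null;
}
```

Returning null rather than an ExportFormat with null fields is a design choice for this sketch; it lets the type system force callers to handle the unsupported case.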
Entity Relationships
ExportRequest
|
| 1:1 (fetches)
v
Document
|
| 1:1 (determines)
v
ExportFormat
Flow:
- ExportRequest initiated with documentId
- Document metadata fetched from Google Drive API
- ExportFormat selected based on Document attributes (mimeType, exportLinks)
- Content streamed using ExportFormat configuration
Validation Rules
Document Validation
- ID Format: Must be valid Google Drive file ID (alphanumeric, hyphens, underscores)
- Name Sanitization: Remove special characters for Content-Disposition filename
- MIME Type: Must be recognized Google Workspace or native file type
- Export Links: If present, must be object with string keys and URL string values
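The ID format and name sanitization rules above might look like the following. This is a sketch under assumptions: the exact character allow-list for file names is not specified in this document, so the pattern used here (ASCII letters, digits, spaces, dots, underscores, hyphens) and the fallback name "document" are illustrative choices, as are the function names.

```typescript
// Assumed pattern: one or more ASCII letters, digits, hyphens, or underscores.
const DRIVE_ID_PATTERN = /^[A-Za-z0-9_-]+$/;

function isValidDocumentId(id: string): boolean {
  return DRIVE_ID_PATTERN.test(id);
}

// Removes characters outside a conservative allow-list so the name is safe
// inside a quoted Content-Disposition filename, then collapses whitespace.
function sanitizeFileName(name: string, fallback: string = "document"): string {
  const cleaned = name
    .replace(/[^A-Za-z0-9 ._-]/g, "")
    .replace(/\s+/g, " ")
    .trim();
  // Fall back to a generic name if nothing survives sanitization.
  return cleaned.length > 0 ? cleaned : fallback;
}
```

Note that this allow-list also strips non-ASCII characters, so a document whose name is entirely non-ASCII would be served under the fallback name.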
Size & Timeout Constraints
- Max Document Size: 10MB (10,485,760 bytes)
  - Validated via Content-Length header before streaming
  - Returns HTTP 413 if exceeded
- Max Request Duration: 30 seconds
  - Enforced via axios timeout
  - Returns HTTP 504 if exceeded
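The size constraint can be illustrated with a small check on the upstream Content-Length header. The names MAX_DOCUMENT_BYTES, SizeCheck, and checkContentLength are hypothetical. This document does not say what should happen when the header is missing or malformed, so the sketch reports that case as "unknown" and leaves the policy to the caller.

```typescript
// 10 * 1024 * 1024 = 10,485,760 bytes, the limit stated above.
const MAX_DOCUMENT_BYTES = 10 * 1024 * 1024;

type SizeCheck = "ok" | "too_large" | "unknown";

// Classifies a raw Content-Length header value before streaming begins.
function checkContentLength(header: string | undefined): SizeCheck {
  if (header === undefined) {
    return "unknown";
  }
  const trimmed = header.trim();
  // Content-Length must be a plain non-negative decimal integer.
  if (!/^\d+$/.test(trimmed)) {
    return "unknown";
  }
  return Number(trimmed) > MAX_DOCUMENT_BYTES ? "too_large" : "ok";
}
```

A caller would map "too_large" to HTTP 413 and decide separately whether "unknown" should be rejected or streamed with a byte counter.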
Format Selection Validation
- Priority Check: Iterate through formats in order: Markdown → HTML → PDF
- Availability Check: Format must exist in exportLinks object
- Fallback Check: If no exportLinks, mimeType must be application/pdf
- Rejection: If none of the above, return HTTP 403
Error States
Document Not Found
- Condition: Google Drive API returns 404 or document doesn't exist
- Response: HTTP 404 "Document not found"
- Data State: No Document entity created
Unauthorized Access
- Condition: User lacks permissions, invalid/expired token
- Response: HTTP 401 "Unauthorized"
- Data State: No Document entity created
Unsupported Format
- Condition: No exportLinks, mimeType not application/pdf
- Response: HTTP 403 "mimetype not supported"
- Data State: Document entity exists, ExportFormat entity null
Size Limit Exceeded
- Condition: Content-Length > 10MB
- Response: HTTP 413 "Payload Too Large"
- Data State: Document entity exists, ExportFormat selected, streaming aborted
Timeout Exceeded
- Condition: Request duration > 30 seconds
- Response: HTTP 504 "Gateway Timeout"
- Data State: Partial processing, request abandoned
Google Drive API Error
- Condition: API unavailable, rate limit exceeded
- Response: HTTP 502 "Bad Gateway - Google Drive API unavailable"
- Data State: Variable depending on failure point
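The error states above can be collected into one lookup table so that every failure path produces a consistent status and message. The type ExportErrorKind, its member names, and the helper toErrorResponse are hypothetical; the status codes and messages are copied from the sections above.

```typescript
// Hypothetical identifiers for the six error states described above.
type ExportErrorKind =
  | "not_found"
  | "unauthorized"
  | "unsupported_format"
  | "too_large"
  | "timeout"
  | "upstream_error";

interface ErrorResponse {
  status: number;
  message: string;
}

// Status codes and messages as listed under Error States.
const ERROR_RESPONSES: Record<ExportErrorKind, ErrorResponse> = {
  not_found: { status: 404, message: "Document not found" },
  unauthorized: { status: 401, message: "Unauthorized" },
  unsupported_format: { status: 403, message: "mimetype not supported" },
  too_large: { status: 413, message: "Payload Too Large" },
  timeout: { status: 504, message: "Gateway Timeout" },
  upstream_error: { status: 502, message: "Bad Gateway - Google Drive API unavailable" },
};

function toErrorResponse(kind: ExportErrorKind): ErrorResponse {
  return ERROR_RESPONSES[kind];
}
```

Because the table is typed as a Record over the union, adding a new error kind without a matching entry fails at compile time.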
Data Flow Example
Successful Export (Google Workspace Document):
1. ExportRequest { documentId: "abc123", timestamp: T0, accessToken: "..." }
2. Document { id: "abc123", name: "Report", mimeType: "application/vnd.google-apps.document", exportLinks: {...} }
3. ExportFormat { mimeType: "text/x-markdown", extension: "md", url: "https://...", isNative: false }
4. Stream content from url to client
5. Response Headers: Content-Type: text/x-markdown, Content-Disposition: inline; filename="Report.md"
Successful Export (Native PDF):
1. ExportRequest { documentId: "xyz789", timestamp: T0, accessToken: "..." }
2. Document { id: "xyz789", name: "Invoice", mimeType: "application/pdf", exportLinks: null }
3. ExportFormat { mimeType: "application/pdf", extension: "pdf", url: null, isNative: true }
4. Stream file using files.get with alt=media
5. Response Headers: Content-Type: application/pdf, Content-Disposition: inline; filename="Invoice.pdf"
Failed Export (Unsupported Type):
1. ExportRequest { documentId: "img456", timestamp: T0, accessToken: "..." }
2. Document { id: "img456", name: "Photo", mimeType: "image/jpeg", exportLinks: null }
3. ExportFormat { mimeType: null, extension: null, url: null, isNative: false }
4. Return HTTP 403 "mimetype not supported"
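Step 5 of the successful flows, building the response headers from the document name and the selected format, could be sketched as follows. The names ResponseHeaders and buildResponseHeaders are hypothetical, and replacing double quotes and backslashes with underscores is an assumption made here to keep the quoted filename well-formed; the real route may sanitize names differently.

```typescript
interface ResponseHeaders {
  "Content-Type": string;
  "Content-Disposition": string;
}

// Builds the two headers shown in the data flow examples.
function buildResponseHeaders(
  documentName: string,
  format: { mimeType: string; extension: string }
): ResponseHeaders {
  // Assumption: quotes and backslashes would break the quoted filename, so replace them.
  const safeName = documentName.replace(/["\\]/g, "_");
  return {
    "Content-Type": format.mimeType,
    "Content-Disposition": `inline; filename="${safeName}.${format.extension}"`,
  };
}

// Mirrors the Google Workspace example above.
const exampleHeaders = buildResponseHeaders("Report", {
  mimeType: "text/x-markdown",
  extension: "md",
});
```

For the "Report" document exported as Markdown, exampleHeaders carries the same Content-Type and Content-Disposition values as step 5 of the first flow.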
Implementation Notes
Statelessness
- No entities persisted to database or cache
- All data exists only for request duration
- Document metadata fetched fresh per request
Memory Management
- Document metadata buffered in memory (typically <1KB)
- Content never buffered - streamed directly
- Maximum memory per request: ~10MB + metadata
Concurrency
- Each request handled independently with isolated ExportRequest entity
- No shared state between requests
- Target: 50 concurrent requests without degradation