Quickstart Guide: Document Export API
Feature: 002-document-export
Date: 2026-03-09
Audience: Developers and API consumers
Overview
The Document Export API provides a simple HTTP endpoint for exporting Google Drive documents in multiple formats. The system automatically selects the best available format (Markdown > HTML > PDF) and streams the content with appropriate headers.
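The preference order can be read as a first-match lookup. The sketch below is illustrative only: `FORMAT_PREFERENCE` and `pickFormat` are hypothetical names, not the proxy's actual code, and it assumes the caller already knows which formats a document can be exported as.

```javascript
// Hypothetical helper: names and structure are assumptions, not the proxy's code.
// Order matches the documented preference: Markdown > HTML > PDF.
const FORMAT_PREFERENCE = ['markdown', 'html', 'pdf'];

// Return the first preferred format the document offers, or null if none match.
function pickFormat(availableFormats) {
  const available = new Set(availableFormats);
  return FORMAT_PREFERENCE.find((format) => available.has(format)) ?? null;
}

console.log(pickFormat(['pdf', 'html'])); // html
console.log(pickFormat(['pdf']));         // pdf
```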
Quick Start
1. Start the Proxy Server
# Install dependencies (if not already done)
npm install
# Start server in development mode (with auto-reload)
npm run dev
# Or start in production mode
npm start
The server starts on http://localhost:3000 (configurable via config/default.json).
2. Export a Document
Basic Request:
curl http://localhost:3000/documents/{DOCUMENT_ID}
Example (Export Google Doc as Markdown):
curl http://localhost:3000/documents/1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms \
-o output.md
Example (Export Native PDF):
curl http://localhost:3000/documents/1AbcDeFgHiJkLmNoPqRsTuVwXyZ1234567890 \
-o output.pdf
Save with Original Filename:
# The Content-Disposition header includes the original filename
curl -OJ http://localhost:3000/documents/{DOCUMENT_ID}
Finding Document IDs
From Google Drive URL
Google Drive URLs contain the document ID:
https://docs.google.com/document/d/DOCUMENT_ID/edit
https://drive.google.com/file/d/DOCUMENT_ID/view
Example:
- URL: https://docs.google.com/document/d/1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms/edit
- Document ID: 1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms
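When IDs come from pasted links, the extraction can be automated. This is a minimal sketch that assumes the URL follows one of the two shapes shown above; `extractDocumentId` is a hypothetical helper, not part of the API.

```javascript
// Hypothetical helper: pull the document ID out of a Google Drive or Docs URL.
// Assumes the ID is the path segment that follows "/d/", as in the URL shapes above.
function extractDocumentId(driveUrl) {
  const { pathname } = new URL(driveUrl); // throws on strings that are not URLs
  const match = pathname.match(/\/d\/([^/]+)/);
  return match ? match[1] : null;
}

const id = extractDocumentId(
  'https://docs.google.com/document/d/1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms/edit'
);
console.log(id); // 1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms
```

The returned ID can be passed straight into the `/documents/{DOCUMENT_ID}` path.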
Supported Formats
Google Workspace Documents
Automatically exported in best available format:
| Document Type | Preferred Format | Fallback Formats |
|---|---|---|
| Google Docs | Markdown (.md) | HTML (.html), PDF (.pdf) |
| Google Sheets | HTML (.html) | PDF (.pdf) |
| Google Slides | PDF (.pdf) | - |
Native Files
| File Type | Behavior |
|---|---|
| PDF | Streamed directly (no conversion) |
| Images, Videos, Archives | Returns 403 "mimetype not supported" |
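The two tables above amount to a decision keyed on the file's MIME type. The sketch below shows one way to express it; `WORKSPACE_FORMATS` and `planExport` are hypothetical names, and the proxy's real implementation may be organized differently. The `application/vnd.google-apps.*` strings are the MIME types Google Drive assigns to Workspace files.

```javascript
// Hypothetical decision table mirroring the Supported Formats tables.
// Each list is ordered from preferred format to last fallback.
const WORKSPACE_FORMATS = new Map([
  ['application/vnd.google-apps.document', ['md', 'html', 'pdf']],
  ['application/vnd.google-apps.spreadsheet', ['html', 'pdf']],
  ['application/vnd.google-apps.presentation', ['pdf']],
]);

// Decide how a file would be handled, based only on its MIME type.
function planExport(mimeType) {
  if (WORKSPACE_FORMATS.has(mimeType)) {
    return { action: 'export', formats: WORKSPACE_FORMATS.get(mimeType) };
  }
  if (mimeType === 'application/pdf') {
    return { action: 'stream', formats: ['pdf'] }; // native PDF, no conversion
  }
  return { action: 'reject', status: 403, message: 'mimetype not supported' };
}

console.log(planExport('application/vnd.google-apps.spreadsheet').formats); // [ 'html', 'pdf' ]
console.log(planExport('image/png').status); // 403
```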
Response Headers
Every successful response includes:
Content-Type: text/x-markdown | text/html | application/pdf
Content-Disposition: inline; filename="document-name.ext"
- Content-Type: Indicates the export format
- Content-Disposition: Provides the original filename with appropriate extension
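A client that saves exports itself can derive the output name from these two headers. The helpers below are a sketch with hypothetical names; they assume the quoted `filename="..."` form shown above and the three Content-Type values listed in this section.

```javascript
// Hypothetical helper: read the filename from a Content-Disposition header value.
function filenameFromContentDisposition(headerValue) {
  const match = /filename="([^"]*)"/i.exec(headerValue ?? '');
  return match ? match[1] : null;
}

// File extensions for the three documented Content-Type values.
const EXTENSION_BY_CONTENT_TYPE = new Map([
  ['text/x-markdown', '.md'],
  ['text/html', '.html'],
  ['application/pdf', '.pdf'],
]);

// Hypothetical helper: map a Content-Type header value to a file extension.
function extensionFor(contentType) {
  // Drop parameters such as "; charset=utf-8" before the lookup.
  const mediaType = (contentType ?? '').split(';')[0].trim().toLowerCase();
  return EXTENSION_BY_CONTENT_TYPE.get(mediaType) ?? null;
}

console.log(filenameFromContentDisposition('inline; filename="Meeting_Notes.md"')); // Meeting_Notes.md
console.log(extensionFor('text/html; charset=utf-8')); // .html
```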
Error Handling
Common Errors
| Error | Status | Cause | Solution |
|---|---|---|---|
| Document not found | 404 | Invalid ID | Verify document ID is correct |
| Unauthorized | 401 | No permission | Check Google Drive access permissions |
| mimetype not supported | 403 | Unsupported file type | Only Workspace docs and PDFs supported |
| Payload Too Large | 413 | Document >10MB | Use smaller documents or direct Drive access |
| Gateway Timeout | 504 | Operation >30s | Retry or use smaller documents |
Error Response Format
All errors return plain text messages:
$ curl http://localhost:3000/documents/invalid-id
Document not found
$ curl http://localhost:3000/documents/{IMAGE_FILE_ID}
mimetype not supported
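Because error bodies are plain text, a client can pair the status code with the suggested fix from the Common Errors table. The sketch below assumes exactly the codes listed in that table; `ERROR_HINTS` and `describeError` are hypothetical names.

```javascript
// Status codes and suggested fixes copied from the Common Errors table.
const ERROR_HINTS = new Map([
  [401, 'Check Google Drive access permissions'],
  [403, 'Only Workspace docs and PDFs supported'],
  [404, 'Verify document ID is correct'],
  [413, 'Use smaller documents or direct Drive access'],
  [504, 'Retry or use smaller documents'],
]);

// Hypothetical helper: turn a failed response into one log-friendly line.
function describeError(status, bodyText) {
  const hint = ERROR_HINTS.get(status) ?? 'See server logs';
  return `${status} ${bodyText.trim()} (${hint})`;
}

console.log(describeError(404, 'Document not found\n'));
// 404 Document not found (Verify document ID is correct)

// Usage sketch with the built-in fetch of Node.js 18+ (not executed here):
//   const res = await fetch(`http://localhost:3000/documents/${documentId}`);
//   if (!res.ok) console.error(describeError(res.status, await res.text()));
```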
Advanced Usage
Check Response Headers
# View headers without downloading content
curl -I http://localhost:3000/documents/{DOCUMENT_ID}
Example Output:
HTTP/1.1 200 OK
Content-Type: text/x-markdown
Content-Disposition: inline; filename="Meeting_Notes.md"
Stream Large Documents
# Stream to stdout (for processing)
curl http://localhost:3000/documents/{DOCUMENT_ID} | less
# Pipe to another tool
curl http://localhost:3000/documents/{DOCUMENT_ID} | pandoc -f markdown -t docx -o output.docx
Integrate with Scripts
Bash Script Example:
#!/bin/bash
DOCUMENT_ID="1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms"
OUTPUT_DIR="./exports"
# Create output directory
mkdir -p "$OUTPUT_DIR"
# Export document; --fail makes curl exit non-zero on HTTP errors (4xx/5xx)
if curl "http://localhost:3000/documents/$DOCUMENT_ID" \
    -o "$OUTPUT_DIR/document.md" \
    --fail \
    --show-error; then
  echo "Export successful: $OUTPUT_DIR/document.md"
else
  echo "Export failed" >&2
  exit 1
fi
Node.js Example:
const axios = require('axios');
const fs = require('fs');
const { pipeline } = require('stream/promises');

async function exportDocument(documentId, outputPath) {
  const url = `http://localhost:3000/documents/${documentId}`;
  try {
    const response = await axios.get(url, {
      responseType: 'stream',
      timeout: 30000 // 30 second timeout
    });
    // pipeline rejects if either the download or the file write fails
    await pipeline(response.data, fs.createWriteStream(outputPath));
  } catch (error) {
    if (error.response) {
      // With responseType 'stream' the error body is a stream, so log the status text
      console.error(`Error ${error.response.status}: ${error.response.statusText}`);
    } else {
      console.error('Request failed:', error.message);
    }
    throw error;
  }
}
// Usage
exportDocument('1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms', 'output.md')
.then(() => console.log('Export complete'))
.catch(err => console.error('Export failed:', err));
Testing
Run Tests
# Run all tests
npm test
# Run specific test suites
npm run test:contract # API contract tests
npm run test:integration # Google Drive integration tests
npm run test:unit # Unit tests
Manual Testing Checklist
- Export Google Doc as Markdown
- Export Google Sheet as HTML
- Export Google Slides as PDF
- Export native PDF file
- Test invalid document ID (should return 404)
- Test unsupported file type (should return 403)
- Verify Content-Disposition filename matches document name
- Verify Content-Type header matches export format
Performance Characteristics
| Metric | Expected Value |
|---|---|
| Response time (docs <10MB) | <5 seconds |
| Concurrent requests | 50+ supported |
| Success rate | >99% for valid docs |
| Memory per request | <1MB (streaming) |
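When exporting many documents from a client, it can help to cap the number of requests in flight so a batch stays well under the concurrency figure above. The helper below is a generic sketch: `mapWithLimit` is a hypothetical name, `limit` is assumed to be at least 1, and the task function stands in for whatever export call you use.

```javascript
// Hypothetical helper: run an async task over items with at most `limit` in flight.
// Results are returned in the same order as the input items.
async function mapWithLimit(items, limit, task) {
  const results = new Array(items.length);
  let next = 0;

  // Each worker repeatedly claims the next unprocessed index until none remain.
  async function worker() {
    while (next < items.length) {
      const index = next++;
      results[index] = await task(items[index]);
    }
  }

  const workers = Array.from({ length: Math.min(limit, items.length) }, () => worker());
  await Promise.all(workers);
  return results;
}

// Usage sketch (exportDocument stands for any function that returns a promise):
//   await mapWithLimit(documentIds, 10, (id) => exportDocument(id, `./exports/${id}.md`));
```

If any task rejects, the returned promise rejects with that error; wrap the task in its own try/catch to collect per-document failures instead.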
Troubleshooting
"Document not found" for valid document
- Verify document ID is correct (check Google Drive URL)
- Ensure Google Drive service account has access to the document
- Check if document is in a shared drive (requires supportsAllDrives=true)
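To check the shared-drive case independently of the proxy, you can query the Drive API v3 `files.get` endpoint directly with `supportsAllDrives=true`. The sketch below only builds the request URL; `driveMetadataUrl` is a hypothetical helper, the `fields` selection is an arbitrary choice, and the request still needs a valid access token in the `Authorization` header.

```javascript
// Hypothetical helper: build a Drive API v3 files.get URL that also resolves
// files stored in shared drives. How the proxy itself calls Drive may differ.
function driveMetadataUrl(fileId) {
  const url = new URL(
    `https://www.googleapis.com/drive/v3/files/${encodeURIComponent(fileId)}`
  );
  url.searchParams.set('supportsAllDrives', 'true');
  url.searchParams.set('fields', 'id,name,mimeType');
  return url.toString();
}

console.log(driveMetadataUrl('1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms'));
```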
"Unauthorized" error
- Check Google Drive credentials in src/globalVariables/google_drive_settings.json
- Verify service account has been granted access to the document
- Check if access token is expired (auth handled by proxy layer)
"Gateway Timeout" on large documents
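- Document may be large and slow to export (check file size in Google Drive); documents over 10MB return 413 Payload Too Large instead
- Slow network connection to Google Drive API
- Retry the request; the cause may be a transient network issue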
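Retrying on 504 can be automated with a small backoff loop. The sketch below assumes the built-in `fetch` of Node.js 18+; `isRetryable`, `backoffDelay`, and `fetchWithRetry` are hypothetical names, and the attempt count and one-second base delay are arbitrary choices rather than values taken from the proxy.

```javascript
// Only 504 is described as worth retrying in the Common Errors table.
function isRetryable(status) {
  return status === 504;
}

// Exponential backoff: baseMs, 2 * baseMs, 4 * baseMs, ...
function backoffDelay(attempt, baseMs = 1000) {
  return baseMs * 2 ** attempt;
}

// Hypothetical wrapper: retry a GET while the server answers 504.
async function fetchWithRetry(url, maxAttempts = 3) {
  for (let attempt = 0; ; attempt++) {
    const response = await fetch(url);
    const lastAttempt = attempt >= maxAttempts - 1;
    if (!isRetryable(response.status) || lastAttempt) return response;
    await response.body?.cancel(); // discard the error body before retrying
    await new Promise((resolve) => setTimeout(resolve, backoffDelay(attempt)));
  }
}

// Usage sketch (not executed here):
//   const res = await fetchWithRetry(`http://localhost:3000/documents/${documentId}`);
```

The final response is returned even when it is still a 504, so the caller decides how to report the failure.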
"mimetype not supported"
This is expected for non-document files:
- Images (.jpg, .png, .gif)
- Videos (.mp4, .mov)
- Archives (.zip, .tar)
- Executables (.exe, .dmg)
Only Google Workspace documents (Docs, Sheets, Slides) and native PDFs are supported.
Configuration
Server Settings
Edit config/default.json:
{
"server": {
"host": "localhost",
"port": 3000
},
"logging": {
"level": "info"
}
}
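Client scripts can reuse the same settings instead of hard-coding the address. This is a sketch under the assumption that the config object has the shape shown above; `buildBaseUrl` is a hypothetical helper, and the fallbacks are simply the default values from this guide.

```javascript
// Hypothetical helper: derive the API base URL from a config object
// shaped like config/default.json.
function buildBaseUrl(config) {
  const host = config?.server?.host ?? 'localhost';
  const port = config?.server?.port ?? 3000;
  return `http://${host}:${port}`;
}

// Inline copy of the example config; a real script might load it with
// JSON.parse(fs.readFileSync('config/default.json', 'utf8')).
const exampleConfig = { server: { host: 'localhost', port: 3000 }, logging: { level: 'info' } };
console.log(`${buildBaseUrl(exampleConfig)}/documents/{DOCUMENT_ID}`);
// http://localhost:3000/documents/{DOCUMENT_ID}
```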
Google Drive Credentials
Credentials are stored in src/globalVariables/google_drive_settings.json (managed by existing infrastructure).
Next Steps
- Integration: Use the /documents/:documentId endpoint in your applications
- Testing: Run contract tests to verify behavior: npm run test:contract
- Monitoring: Check logs for errors; npm run dev shows real-time logs
- Scaling: Deploy multiple instances behind a load balancer for high traffic
Support
For issues or questions:
- Check error messages and status codes (see Error Handling section)
- Review logs for detailed error information
- Verify Google Drive permissions and credentials
- Consult API contract: specs/002-document-export/contracts/documents-export-api.md