Added new feature for document export, including API contracts, data model, implementation plan, and tests. Updated related configurations and instructions.
specs/002-document-export/quickstart.md
# Quickstart Guide: Document Export API

**Feature**: 002-document-export
**Date**: 2026-03-09
**Audience**: Developers and API consumers

## Overview

The Document Export API provides a single HTTP endpoint for exporting Google Drive documents in multiple formats. For each document, the system automatically selects the best available format (Markdown > HTML > PDF) and streams the content with matching `Content-Type` and `Content-Disposition` headers.
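
The priority rule above can be sketched in a few lines. This is only an illustration, assuming the server already knows which export formats a document offers; `FORMAT_PRIORITY` and `pickExportFormat` are hypothetical names, not part of the documented API.

```javascript
// Hypothetical helper illustrating the Markdown > HTML > PDF priority rule.
const FORMAT_PRIORITY = ['markdown', 'html', 'pdf'];

function pickExportFormat(availableFormats) {
  // Return the first format in priority order that the document offers,
  // or null when none of the supported formats is available.
  const available = new Set(availableFormats);
  return FORMAT_PRIORITY.find((format) => available.has(format)) ?? null;
}

console.log(pickExportFormat(['pdf', 'html'])); // html
```

Keeping the priority in one ordered array means a change to the preference order touches a single line.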

---

## Quick Start

### 1. Start the Proxy Server

```bash
# Install dependencies (if not already done)
npm install

# Start server in development mode (with auto-reload)
npm run dev

# Or start in production mode
npm start
```

The server starts on `http://localhost:3000` (configurable via `config/default.json`).

---

### 2. Export a Document

**Basic Request**:
```bash
curl http://localhost:3000/documents/{DOCUMENT_ID}
```

Replace `{DOCUMENT_ID}` with the ID of the document you want to export.

**Example (Export Google Doc as Markdown)**:
```bash
curl http://localhost:3000/documents/1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms \
  -o output.md
```

**Example (Export Native PDF)**:
```bash
curl http://localhost:3000/documents/1AbcDeFgHiJkLmNoPqRsTuVwXyZ1234567890 \
  -o output.pdf
```

**Save with Original Filename**:
```bash
# The Content-Disposition header includes the original filename
curl -OJ http://localhost:3000/documents/{DOCUMENT_ID}
```

---

## Finding Document IDs

### From Google Drive URL

Google Drive URLs contain the document ID as the path segment after `/d/`:

```
https://docs.google.com/document/d/DOCUMENT_ID/edit
https://drive.google.com/file/d/DOCUMENT_ID/view
```

**Example**:
- URL: `https://docs.google.com/document/d/1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms/edit`
- Document ID: `1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms`
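
When IDs need to be pulled out of URLs in a script, something like the following may help. It is a minimal sketch that only covers the two URL shapes shown above; `extractDocumentId` is a hypothetical helper, not something the proxy provides.

```javascript
// Hypothetical helper: pull the document ID out of a Google Docs or Drive URL.
function extractDocumentId(driveUrl) {
  // The ID is the path segment that follows "/d/".
  // Note: new URL() throws a TypeError if driveUrl is not a valid absolute URL.
  const match = new URL(driveUrl).pathname.match(/\/d\/([^/]+)/);
  return match ? match[1] : null;
}

console.log(
  extractDocumentId(
    'https://docs.google.com/document/d/1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms/edit'
  )
); // 1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms
```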

---

## Supported Formats

### Google Workspace Documents

Google Workspace documents are exported automatically in the best available format:

| Document Type | Preferred Format | Fallback Formats |
|---------------|------------------|------------------|
| Google Docs | Markdown (.md) | HTML (.html), PDF (.pdf) |
| Google Sheets | HTML (.html) | PDF (.pdf) |
| Google Slides | PDF (.pdf) | - |

### Native Files

| File Type | Behavior |
|-----------|----------|
| PDF | Streamed directly (no conversion) |
| Images, Videos, Archives | Returns 403 "mimetype not supported" |
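
The two tables can be read as a single lookup keyed on the file's MIME type. The sketch below is one possible way to express that, not the proxy's actual code: the `application/vnd.google-apps.*` values are the MIME types Google Drive reports for Workspace files, while `EXPORT_CHAINS`, `resolveExport`, and the shape of the returned object are assumptions made for illustration.

```javascript
// Preferred format first, then fallbacks, per Google Workspace MIME type.
const EXPORT_CHAINS = new Map([
  ['application/vnd.google-apps.document', ['markdown', 'html', 'pdf']],
  ['application/vnd.google-apps.spreadsheet', ['html', 'pdf']],
  ['application/vnd.google-apps.presentation', ['pdf']],
]);

// Hypothetical decision function mirroring the tables above.
function resolveExport(mimeType) {
  if (EXPORT_CHAINS.has(mimeType)) {
    return { action: 'export', formats: EXPORT_CHAINS.get(mimeType) };
  }
  if (mimeType === 'application/pdf') {
    // Native PDFs are streamed as-is, with no conversion.
    return { action: 'stream', formats: ['pdf'] };
  }
  // Everything else (images, videos, archives, ...) is rejected.
  return { action: 'reject', status: 403, message: 'mimetype not supported' };
}

console.log(resolveExport('application/vnd.google-apps.spreadsheet').formats); // [ 'html', 'pdf' ]
```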

---

## Response Headers

Every successful response includes:

```http
Content-Type: text/x-markdown | text/html | application/pdf
Content-Disposition: inline; filename="document-name.ext"
```

- **Content-Type**: Indicates the export format (one of the three values above)
- **Content-Disposition**: Provides the original filename with the extension that matches the export format
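
A client can use these two headers to decide how to name and handle the download. The following is a rough sketch under a few assumptions: header names arrive lowercased (as Node.js HTTP clients typically expose them), the filename uses the quoted `filename="..."` form shown above, and `describeExport` is a hypothetical helper name.

```javascript
// File extension for each Content-Type listed above.
const EXTENSION_BY_CONTENT_TYPE = new Map([
  ['text/x-markdown', '.md'],
  ['text/html', '.html'],
  ['application/pdf', '.pdf'],
]);

// Hypothetical helper: summarize an export response from its headers.
function describeExport(headers) {
  // Drop parameters such as "; charset=utf-8" before the lookup.
  const contentType = (headers['content-type'] || '').split(';')[0].trim().toLowerCase();
  // Only the quoted filename="..." form is handled in this sketch.
  const match = (headers['content-disposition'] || '').match(/filename="([^"]*)"/i);
  return {
    extension: EXTENSION_BY_CONTENT_TYPE.get(contentType) ?? null,
    filename: match ? match[1] : null,
  };
}

console.log(
  describeExport({
    'content-type': 'text/x-markdown',
    'content-disposition': 'inline; filename="Meeting_Notes.md"',
  })
); // { extension: '.md', filename: 'Meeting_Notes.md' }
```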

---

## Error Handling

### Common Errors

| Error | Status | Cause | Solution |
|-------|--------|-------|----------|
| Document not found | 404 | Invalid ID | Verify document ID is correct |
| Unauthorized | 401 | No permission | Check Google Drive access permissions |
| mimetype not supported | 403 | Unsupported file type | Only Workspace docs and PDFs supported |
| Payload Too Large | 413 | Document >10MB | Use smaller documents or direct Drive access |
| Gateway Timeout | 504 | Operation >30s | Retry or use smaller documents |

### Error Response Format

All errors return plain text messages:

```bash
$ curl http://localhost:3000/documents/invalid-id
Document not found

$ curl http://localhost:3000/documents/{IMAGE_FILE_ID}
mimetype not supported
```
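
Because error bodies are plain text, a client may want to pair the status code with the suggested fix from the table. The sketch below shows one way to do that; `ERROR_HINTS` and `explainExportError` are hypothetical names, and the fallback hint for unlisted status codes is an assumption.

```javascript
// Suggested fixes, copied from the Common Errors table.
const ERROR_HINTS = new Map([
  [404, 'Verify document ID is correct'],
  [401, 'Check Google Drive access permissions'],
  [403, 'Only Workspace docs and PDFs supported'],
  [413, 'Use smaller documents or direct Drive access'],
  [504, 'Retry or use smaller documents'],
]);

// Hypothetical helper: combine status, plain-text body, and a hint.
function explainExportError(status, bodyText) {
  const hint = ERROR_HINTS.get(status) ?? 'See server logs for details';
  return `${status} ${bodyText.trim()}: ${hint}`;
}

console.log(explainExportError(404, 'Document not found\n'));
// 404 Document not found: Verify document ID is correct
```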

---

## Advanced Usage

### Check Response Headers

```bash
# View headers without downloading content
curl -I http://localhost:3000/documents/{DOCUMENT_ID}
```

**Example Output**:
```http
HTTP/1.1 200 OK
Content-Type: text/x-markdown
Content-Disposition: inline; filename="Meeting_Notes.md"
```

### Stream Large Documents

```bash
# Stream to stdout (for processing)
curl http://localhost:3000/documents/{DOCUMENT_ID} | less

# Pipe to another tool
curl http://localhost:3000/documents/{DOCUMENT_ID} | pandoc -f markdown -t docx -o output.docx
```

### Integrate with Scripts

**Bash Script Example**:
```bash
#!/bin/bash

DOCUMENT_ID="1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms"
OUTPUT_DIR="./exports"

# Create output directory
mkdir -p "$OUTPUT_DIR"

# Export document; --fail makes curl exit non-zero on HTTP errors (4xx/5xx)
if curl "http://localhost:3000/documents/$DOCUMENT_ID" \
  -o "$OUTPUT_DIR/document.md" \
  --fail \
  --show-error; then
  echo "Export successful: $OUTPUT_DIR/document.md"
else
  echo "Export failed" >&2
  exit 1
fi
```

**Node.js Example**:
```javascript
const axios = require('axios');
const fs = require('fs');
const { pipeline } = require('stream/promises');

async function exportDocument(documentId, outputPath) {
  const url = `http://localhost:3000/documents/${encodeURIComponent(documentId)}`;

  try {
    const response = await axios.get(url, {
      responseType: 'stream',
      timeout: 30000 // 30 second timeout
    });

    // pipeline() rejects if either the response stream or the file write fails
    await pipeline(response.data, fs.createWriteStream(outputPath));
  } catch (error) {
    if (error.response) {
      // With responseType 'stream' the error body is a stream, so log the status line
      console.error(`Error ${error.response.status}: ${error.response.statusText}`);
    } else {
      console.error('Request failed:', error.message);
    }
    throw error;
  }
}

// Usage
exportDocument('1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms', 'output.md')
  .then(() => console.log('Export complete'))
  .catch(err => console.error('Export failed:', err.message));
```

---

## Testing

### Run Tests

```bash
# Run all tests
npm test

# Run specific test suites
npm run test:contract      # API contract tests
npm run test:integration   # Google Drive integration tests
npm run test:unit          # Unit tests
```

### Manual Testing Checklist

- [ ] Export Google Doc as Markdown
- [ ] Export Google Sheet as HTML
- [ ] Export Google Slides as PDF
- [ ] Export native PDF file
- [ ] Test invalid document ID (should return 404)
- [ ] Test unsupported file type (should return 403)
- [ ] Verify Content-Disposition filename matches document name
- [ ] Verify Content-Type header matches export format

---

## Performance Characteristics

| Metric | Expected Value |
|--------|----------------|
| Response time (docs <10MB) | <5 seconds |
| Concurrent requests | 50+ supported |
| Success rate | >99% for valid docs |
| Memory per request | <1MB (streaming) |

---

## Troubleshooting

### "Document not found" for valid document

1. Verify document ID is correct (check Google Drive URL)
2. Ensure Google Drive service account has access to the document
3. Check if document is in a shared drive (requires `supportsAllDrives=true`)

### "Unauthorized" error

1. Check Google Drive credentials in `src/globalVariables/google_drive_settings.json`
2. Verify service account has been granted access to the document
3. Check if access token is expired (auth handled by proxy layer)

### "Gateway Timeout" on large documents

1. The document may be large (check file size in Google Drive; documents over 10MB are rejected with 413)
2. The network connection to the Google Drive API may be slow
3. The failure may be transient, so retry the request
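
For transient timeouts, a client-side retry with a growing delay is often enough. The wrapper below is a sketch of that idea rather than anything the proxy ships: `withRetry`, its option names, and the default values are all assumptions, and `requestFn` stands for any function that resolves to an object with a numeric `status`.

```javascript
// Hypothetical retry wrapper: retries only on 504, with exponential backoff.
async function withRetry(requestFn, { attempts = 3, baseDelayMs = 500 } = {}) {
  for (let attempt = 1; ; attempt += 1) {
    const response = await requestFn();
    // Return on any non-504 status, or once the attempt budget is used up.
    if (response.status !== 504 || attempt >= attempts) {
      return response;
    }
    // Wait baseDelayMs, then 2x, then 4x, ... before the next attempt.
    const delayMs = baseDelayMs * 2 ** (attempt - 1);
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
}

// Usage (Node.js 18+ ships a global fetch):
// const res = await withRetry(() => fetch('http://localhost:3000/documents/DOCUMENT_ID'));
```

Retrying only on 504 keeps the wrapper from repeating requests that will never succeed, such as a 404 or 403.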

### "mimetype not supported"

This is expected for non-document files:
- Images (.jpg, .png, .gif)
- Videos (.mp4, .mov)
- Archives (.zip, .tar)
- Executables (.exe, .dmg)

Only Google Workspace documents (Docs, Sheets, Slides) and native PDFs are supported.

---

## Configuration

### Server Settings

Edit `config/default.json`:

```json
{
  "server": {
    "host": "localhost",
    "port": 3000
  },
  "logging": {
    "level": "info"
  }
}
```
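
If a client script reads the same settings, it can build request URLs from them instead of hard-coding `localhost:3000`. The sketch below assumes the JSON shape shown above; `baseUrlFromConfig` and `documentUrl` are hypothetical helpers, and the fallback values simply mirror the defaults in this guide.

```javascript
// Hypothetical helper: build the server's base URL from the config object.
function baseUrlFromConfig(config) {
  // Fall back to the documented defaults when a setting is missing.
  const { host = 'localhost', port = 3000 } = config.server ?? {};
  return `http://${host}:${port}`;
}

// Hypothetical helper: build the export URL for one document.
function documentUrl(config, documentId) {
  return `${baseUrlFromConfig(config)}/documents/${encodeURIComponent(documentId)}`;
}

const config = { server: { host: 'localhost', port: 3000 }, logging: { level: 'info' } };
console.log(documentUrl(config, 'abc123')); // http://localhost:3000/documents/abc123
```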

### Google Drive Credentials

Credentials are stored in `src/globalVariables/google_drive_settings.json` (managed by existing infrastructure).

---

## Next Steps

- **Integration**: Use the `/documents/:documentId` endpoint in your applications
- **Testing**: Run contract tests to verify behavior: `npm run test:contract`
- **Monitoring**: Check logs for errors; `npm run dev` shows real-time logs
- **Scaling**: Deploy multiple instances behind a load balancer for high traffic

---

## Support

For issues or questions:
1. Check error messages and status codes (see Error Handling section)
2. Review logs for detailed error information
3. Verify Google Drive permissions and credentials
4. Consult API contract: `specs/002-document-export/contracts/documents-export-api.md`