Quickstart Guide: Document Export API

Feature: 002-document-export
Date: 2026-03-09
Audience: Developers and API consumers

Overview

The Document Export API provides a single HTTP endpoint for exporting Google Drive documents in multiple formats. For each document, the system automatically selects the best available format (in order of preference: Markdown, then HTML, then PDF) and streams the content with the appropriate headers.


Quick Start

1. Start the Proxy Server

# Install dependencies (if not already done)
npm install

# Start server in development mode (with auto-reload)
npm run dev

# Or start in production mode
npm start

The server starts on http://localhost:3000 by default (configurable via config/default.json).


2. Export a Document

Basic Request:

curl http://localhost:3000/documents/{DOCUMENT_ID}

Example (Export Google Doc as Markdown):

curl http://localhost:3000/documents/1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms \
  -o output.md

Example (Export Native PDF):

curl http://localhost:3000/documents/1AbcDeFgHiJkLmNoPqRsTuVwXyZ1234567890 \
  -o output.pdf

Save with Original Filename:

# The Content-Disposition header includes the original filename
curl -OJ http://localhost:3000/documents/{DOCUMENT_ID}

Finding Document IDs

From Google Drive URL

Google Drive URLs contain the document ID:

https://docs.google.com/document/d/DOCUMENT_ID/edit
https://drive.google.com/file/d/DOCUMENT_ID/view

Example:

  • URL: https://docs.google.com/document/d/1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms/edit
  • Document ID: 1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms
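
The extraction above can also be done in code. The following is a minimal sketch, not part of the API: `extractDocumentId` is a hypothetical helper, and it assumes document IDs consist only of letters, digits, hyphens, and underscores.

```javascript
// Hypothetical helper: pull the document ID out of a Google Docs or Drive URL.
// Assumption: IDs contain only letters, digits, "-" and "_".
function extractDocumentId(url) {
  // In both URL shapes shown above, the ID is the path segment after "/d/".
  const match = /\/d\/([A-Za-z0-9_-]+)/.exec(url);
  return match ? match[1] : null;
}

console.log(
  extractDocumentId(
    'https://docs.google.com/document/d/1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms/edit'
  )
); // 1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms
```

The helper returns null when the URL has no /d/ segment, so callers can reject bad input before sending a request.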

Supported Formats

Google Workspace Documents

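Google Workspace documents are exported automatically in the best available format:

| Document Type | Preferred Format | Fallback Formats         |
|---------------|------------------|--------------------------|
| Google Docs   | Markdown (.md)   | HTML (.html), PDF (.pdf) |
| Google Sheets | HTML (.html)     | PDF (.pdf)               |
| Google Slides | PDF (.pdf)       | -                        |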
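
The preference order in the table can be pictured as a lookup followed by a first-match search. This is only an illustrative sketch: the proxy's actual selection logic is not shown in this guide, and `FORMAT_PREFERENCE` and `pickFormat` are hypothetical names. The keys are the Google Drive MIME types for Docs, Sheets, and Slides.

```javascript
// Preference order per document type, copied from the table above.
// Keys are Google Drive MIME types; values are file extensions, best first.
const FORMAT_PREFERENCE = new Map([
  ['application/vnd.google-apps.document', ['md', 'html', 'pdf']],
  ['application/vnd.google-apps.spreadsheet', ['html', 'pdf']],
  ['application/vnd.google-apps.presentation', ['pdf']],
]);

// Hypothetical helper: return the first preferred format that is available,
// or null when the type is unknown or none of its formats are available.
function pickFormat(mimeType, availableFormats) {
  const preferred = FORMAT_PREFERENCE.get(mimeType) ?? [];
  return preferred.find((ext) => availableFormats.includes(ext)) ?? null;
}

console.log(pickFormat('application/vnd.google-apps.document', ['pdf', 'html'])); // html
console.log(pickFormat('image/png', ['pdf'])); // null
```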

Native Files

| File Type                | Behavior                             |
|--------------------------|--------------------------------------|
| PDF                      | Streamed directly (no conversion)    |
| Images, Videos, Archives | Returns 403 "mimetype not supported" |

Response Headers

Every successful response includes:

Content-Type: text/x-markdown | text/html | application/pdf
Content-Disposition: inline; filename="document-name.ext"
  • Content-Type: Indicates the export format
  • Content-Disposition: Provides the original filename with appropriate extension
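
A client can use these two headers to choose a filename and to sanity-check the export. The sketch below is a hypothetical example, not part of the API. It assumes the filename always arrives in the quoted filename="..." form shown above and does not handle the extended filename* syntax.

```javascript
// Hypothetical helper: read the filename from a Content-Disposition value
// such as: inline; filename="Meeting_Notes.md"
function filenameFromDisposition(headerValue) {
  const match = /filename="([^"]*)"/i.exec(headerValue || '');
  return match ? match[1] : null;
}

// File extension expected for each Content-Type listed above.
const EXTENSION_BY_TYPE = new Map([
  ['text/x-markdown', '.md'],
  ['text/html', '.html'],
  ['application/pdf', '.pdf'],
]);

// True when the filename's extension agrees with the Content-Type.
function headersAgree(contentType, disposition) {
  // Drop parameters such as "; charset=utf-8" before the lookup.
  const mediaType = (contentType || '').split(';')[0].trim().toLowerCase();
  const expected = EXTENSION_BY_TYPE.get(mediaType);
  const name = filenameFromDisposition(disposition);
  return Boolean(expected && name && name.toLowerCase().endsWith(expected));
}

console.log(headersAgree('text/x-markdown', 'inline; filename="Meeting_Notes.md"')); // true
```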

Error Handling

Common Errors

| Error                  | Status | Cause                 | Solution                                     |
|------------------------|--------|-----------------------|----------------------------------------------|
| Document not found     | 404    | Invalid ID            | Verify document ID is correct                |
| Unauthorized           | 401    | No permission         | Check Google Drive access permissions        |
| mimetype not supported | 403    | Unsupported file type | Only Workspace docs and PDFs supported       |
| Payload Too Large      | 413    | Document >10MB        | Use smaller documents or direct Drive access |
| Gateway Timeout        | 504    | Operation >30s        | Retry or use smaller documents               |

Error Response Format

All errors return plain-text messages:

$ curl http://localhost:3000/documents/invalid-id
Document not found

$ curl http://localhost:3000/documents/{IMAGE_FILE_ID}
mimetype not supported
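
Because error bodies are plain text, client code usually branches on the status code. The following is a minimal sketch of that branching, using only the codes from the Common Errors table. `EXPORT_ERRORS` and `classifyStatus` are hypothetical names, and treating 504 as the only retryable status is an assumption based on the table's suggested solutions.

```javascript
// Status codes and messages, copied from the Common Errors table.
// Assumption: only 504 is worth retrying, as the table suggests.
const EXPORT_ERRORS = new Map([
  [404, { message: 'Document not found', retry: false }],
  [401, { message: 'Unauthorized', retry: false }],
  [403, { message: 'mimetype not supported', retry: false }],
  [413, { message: 'Payload Too Large', retry: false }],
  [504, { message: 'Gateway Timeout', retry: true }],
]);

// Hypothetical helper: summarize what a client should do with a status code.
function classifyStatus(status) {
  if (status >= 200 && status < 300) {
    return { ok: true, retry: false, message: 'OK' };
  }
  const known = EXPORT_ERRORS.get(status);
  return {
    ok: false,
    retry: known ? known.retry : false,
    message: known ? known.message : `Unexpected status ${status}`,
  };
}

console.log(classifyStatus(504)); // { ok: false, retry: true, message: 'Gateway Timeout' }
```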

Advanced Usage

Check Response Headers

# View headers without downloading content
curl -I http://localhost:3000/documents/{DOCUMENT_ID}

Example Output:

HTTP/1.1 200 OK
Content-Type: text/x-markdown
Content-Disposition: inline; filename="Meeting_Notes.md"

Stream Large Documents

# Stream to stdout (for processing)
curl http://localhost:3000/documents/{DOCUMENT_ID} | less

# Pipe to another tool
curl http://localhost:3000/documents/{DOCUMENT_ID} | pandoc -f markdown -t docx -o output.docx

Integrate with Scripts

Bash Script Example:

#!/bin/bash

DOCUMENT_ID="1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms"
OUTPUT_DIR="./exports"

# Create output directory
mkdir -p "$OUTPUT_DIR"

# Export document; --fail makes curl exit non-zero on HTTP errors (4xx/5xx)
if curl "http://localhost:3000/documents/$DOCUMENT_ID" \
  -o "$OUTPUT_DIR/document.md" \
  --fail \
  --show-error; then
  echo "Export successful: $OUTPUT_DIR/document.md"
else
  echo "Export failed" >&2
  exit 1
fi

Node.js Example:

const axios = require('axios');
const fs = require('fs');
// stream/promises requires Node.js 15 or later
const { pipeline } = require('stream/promises');

async function exportDocument(documentId, outputPath) {
  const url = `http://localhost:3000/documents/${documentId}`;

  try {
    const response = await axios.get(url, {
      responseType: 'stream',
      timeout: 30000  // 30 second timeout
    });

    // pipeline() rejects if either the download or the file write fails
    await pipeline(response.data, fs.createWriteStream(outputPath));
  } catch (error) {
    if (error.response) {
      // With responseType 'stream', the error body is a stream too,
      // so read it before logging the plain-text message
      let body = '';
      for await (const chunk of error.response.data) body += chunk;
      console.error(`Error ${error.response.status}: ${body}`);
    } else {
      console.error('Request failed:', error.message);
    }
    throw error;
  }
}

// Usage
exportDocument('1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms', 'output.md')
  .then(() => console.log('Export complete'))
  .catch(err => console.error('Export failed:', err));

Testing

Run Tests

# Run all tests
npm test

# Run specific test suites
npm run test:contract    # API contract tests
npm run test:integration # Google Drive integration tests
npm run test:unit        # Unit tests

Manual Testing Checklist

  • Export Google Doc as Markdown
  • Export Google Sheet as HTML
  • Export Google Slides as PDF
  • Export native PDF file
  • Test invalid document ID (should return 404)
  • Test unsupported file type (should return 403)
  • Verify Content-Disposition filename matches document name
  • Verify Content-Type header matches export format

Performance Characteristics

| Metric                     | Expected Value      |
|----------------------------|---------------------|
| Response time (docs <10MB) | <5 seconds          |
| Concurrent requests        | 50+ supported       |
| Success rate               | >99% for valid docs |
| Memory per request         | <1MB (streaming)    |

Troubleshooting

"Document not found" for valid document

  1. Verify document ID is correct (check Google Drive URL)
  2. Ensure Google Drive service account has access to the document
  3. Check if document is in a shared drive (requires supportsAllDrives=true)

"Unauthorized" error

  1. Check Google Drive credentials in src/globalVariables/google_drive_settings.json
  2. Verify service account has been granted access to the document
  3. Check if access token is expired (auth handled by proxy layer)

"Gateway Timeout" on large documents

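  1. The document may be too large to export within 30 seconds (check the file size in Google Drive; documents >10MB are rejected with 413)
  2. The network connection to the Google Drive API may be slow
  3. Try again, since the timeout may be a transient network issue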
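
Retrying after a transient timeout can be automated with exponential backoff. The sketch below is one possible approach, not something the proxy provides. `backoffDelays` and `withRetry` are hypothetical helpers, and the default of 3 retries starting at 1 second is an arbitrary assumption.

```javascript
// Delay before each retry attempt: baseMs, 2 * baseMs, 4 * baseMs, ...
function backoffDelays(retries, baseMs) {
  return Array.from({ length: retries }, (_, i) => baseMs * 2 ** i);
}

// Hypothetical wrapper: call `attempt` until it resolves or retries run out.
// `attempt` is any async function, e.g. () => exportDocument(id, path).
// `shouldRetry` lets the caller retry only on timeouts (e.g. status 504).
async function withRetry(
  attempt,
  { retries = 3, baseMs = 1000, shouldRetry = () => true } = {}
) {
  const delays = backoffDelays(retries, baseMs);
  for (let i = 0; ; i++) {
    try {
      return await attempt();
    } catch (error) {
      // Give up when out of retries or when the error is not retryable.
      if (i >= delays.length || !shouldRetry(error)) throw error;
      await new Promise((resolve) => setTimeout(resolve, delays[i]));
    }
  }
}

console.log(backoffDelays(3, 1000)); // [ 1000, 2000, 4000 ]
```

With an axios-based client, passing `shouldRetry: (err) => err.response?.status === 504` would limit retries to gateway timeouts, so that 404 or 403 responses fail immediately.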

"mimetype not supported"

This is expected for non-document files:

  • Images (.jpg, .png, .gif)
  • Videos (.mp4, .mov)
  • Archives (.zip, .tar)
  • Executables (.exe, .dmg)

Only Google Workspace documents (Docs, Sheets, Slides) and native PDFs are supported.


Configuration

Server Settings

Edit config/default.json:

{
  "server": {
    "host": "localhost",
    "port": 3000
  },
  "logging": {
    "level": "info"
  }
}
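
A client that shares this configuration file can derive the endpoint URL from it rather than hard-coding the host and port. This is a minimal sketch under that assumption. `baseUrlFromConfig` and `documentUrl` are hypothetical helpers, and the inline `config` object stands in for the parsed contents of config/default.json.

```javascript
// Hypothetical helper: build the API base URL from the config shape above.
// Falls back to the documented defaults when a field is missing.
function baseUrlFromConfig(config) {
  const { host = 'localhost', port = 3000 } = (config && config.server) || {};
  return `http://${host}:${port}`;
}

// Hypothetical helper: full export URL for one document.
function documentUrl(config, documentId) {
  return `${baseUrlFromConfig(config)}/documents/${encodeURIComponent(documentId)}`;
}

// Stand-in for the parsed contents of config/default.json.
const config = {
  server: { host: 'localhost', port: 3000 },
  logging: { level: 'info' },
};

console.log(documentUrl(config, 'DOCUMENT_ID')); // http://localhost:3000/documents/DOCUMENT_ID
```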

Google Drive Credentials

Credentials are stored in src/globalVariables/google_drive_settings.json (managed by existing infrastructure).


Next Steps

  • Integration: Use the /documents/:documentId endpoint in your applications
  • Testing: Run contract tests to verify behavior: npm run test:contract
  • Monitoring: Check logs for errors (npm run dev shows real-time logs)
  • Scaling: Deploy multiple instances behind a load balancer for high traffic

Support

For issues or questions:

  1. Check error messages and status codes (see Error Handling section)
  2. Review logs for detailed error information
  3. Verify Google Drive permissions and credentials
  4. Consult API contract: specs/002-document-export/contracts/documents-export-api.md