API Contract: Sitemap Endpoint
Feature: 001-drive-proxy-adapter
Date: 2026-03-07
Phase: 1 - Design & Contracts
Endpoint: GET /sitemap.xml
Overview
The /sitemap.xml endpoint returns an XML sitemap listing all Google Drive documents accessible to the Service Account. This is the only endpoint exposed by the adapter.
Endpoint Definition
URL
GET /sitemap.xml
Authentication
- Method: None (endpoint is public)
- Backend Authentication: Service Account JWT to Google Drive API (transparent to client)
- Credentials: Loaded from the GOOGLE_SERVICE_ACCOUNT_KEY environment variable
Request
Method: GET
Headers:
- None required
Query Parameters:
- None supported
Request Body:
- None (GET request)
Example Request:
GET /sitemap.xml HTTP/1.1
Host: adapter.example.com
User-Agent: Mozilla/5.0
Response Specifications
Success Response (200 OK)
Status Code: 200 OK
Headers:
Content-Type: application/xml
Content-Length: {size_in_bytes}
Body: Valid XML sitemap conforming to sitemap protocol
XML Schema:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://adapter.example.com/documents/{documentId}</loc>
<lastmod>2026-03-06T10:30:00.000Z</lastmod>
</url>
<!-- Additional <url> entries (up to 50,000) -->
</urlset>
Field Descriptions:
- <urlset>: Root element with sitemap namespace
- <url>: Individual URL entry (0 to 50,000 entries)
- <loc>: Absolute URL to document using RESTful format /documents/{documentId}
- <lastmod>: ISO 8601 timestamp of last document modification
Constraints:
- Maximum 50,000 <url> entries (sitemap protocol limit per spec.md FR-015)
- Maximum 50MB uncompressed (protocol limit, not enforced)
- All <loc> URLs use same base URL (configured via BASE_URL env var)
- All <loc> URLs use RESTful path format: /documents/{documentId}
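The schema and constraints above can be illustrated with a minimal sketch. It is not the adapter's actual implementation: the function name build_sitemap, the (document_id, modified) input shape, and the choice to truncate at 50,000 entries are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # sitemap protocol limit on <url> entries


def build_sitemap(base_url, documents):
    """Build a sitemap string.

    `documents` is an iterable of (document_id, modified) pairs, where
    `modified` is a timezone-aware datetime (hypothetical input shape).
    """
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for count, (doc_id, modified) in enumerate(documents):
        if count >= MAX_URLS:
            break  # assumption: extra documents are dropped, not split across files
        url = ET.SubElement(urlset, "url")
        # ElementTree escapes XML special characters in text nodes on output.
        ET.SubElement(url, "loc").text = f"{base_url.rstrip('/')}/documents/{doc_id}"
        ts = modified.astimezone(timezone.utc)
        millis = ts.microsecond // 1000
        ET.SubElement(url, "lastmod").text = (
            ts.strftime("%Y-%m-%dT%H:%M:%S") + f".{millis:03d}Z"
        )
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

Building the tree with a library rather than string concatenation keeps escaping in one place, which is one way to satisfy the output sanitization requirement later in this contract.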
Example Response:
HTTP/1.1 200 OK
Content-Type: application/xml
Content-Length: 4582
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://adapter.example.com/documents/1BxAA_example123</loc>
<lastmod>2026-03-06T10:30:00.000Z</lastmod>
</url>
<url>
<loc>https://adapter.example.com/documents/1CyBB_example456</loc>
<lastmod>2026-03-05T14:20:00.000Z</lastmod>
</url>
<url>
<loc>https://adapter.example.com/documents/1DzCC_example789</loc>
<lastmod>2026-03-04T08:15:00.000Z</lastmod>
</url>
</urlset>
Performance Targets (from spec.md success criteria):
- Response time: < 5 seconds for up to 10,000 documents
- Memory usage: < 256MB under normal load
- Concurrent requests: Support 10 concurrent requests without degradation
Not Found Response (404)
Status Code: 404 Not Found
Headers: None
Body: Empty (per spec.md clarification: "HTTP status code only, no error response body")
When Returned:
- Any path other than /sitemap.xml (per spec.md FR-007)
Example Response:
HTTP/1.1 404 Not Found
Unauthorized Response (401)
Status Code: 401 Unauthorized
Headers: None
Body: Empty (per spec.md clarification: "HTTP status code only, no error response body")
When Returned:
- Service Account JWT authentication failed (per spec.md FR-010)
- OAuth token refresh failed
- Invalid Service Account credentials
Example Response:
HTTP/1.1 401 Unauthorized
Client Action: Check Service Account credentials in GOOGLE_SERVICE_ACCOUNT_KEY environment variable
Rate Limited Response (429)
Status Code: 429 Too Many Requests
Headers:
Retry-After: {seconds} (integer, seconds until retry allowed)
Body: Empty (per spec.md clarification: "HTTP status code only, no error response body")
When Returned:
- Google Drive API rate limit exceeded (per spec.md FR-013)
- Quota exhausted for Service Account
Example Response:
HTTP/1.1 429 Too Many Requests
Retry-After: 60
Client Action: Wait Retry-After seconds before retrying request
Retry-After Values:
- Derived from Google Drive API Retry-After header if available
- Default: 60 seconds if not specified by Drive API
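The derivation rule above could be sketched as follows. The function name resolve_retry_after and the plain-dict header representation are hypothetical. Falling back to the default when the upstream value is not a non-negative integer (for example an HTTP-date) is an assumption, since the contract only specifies an integer value.

```python
DEFAULT_RETRY_AFTER_SECONDS = 60  # default stated in this contract


def resolve_retry_after(upstream_headers):
    """Return the Retry-After value (in seconds) to send to the client.

    `upstream_headers` is a plain dict of the Drive API response headers
    (hypothetical representation); header names are matched case-insensitively.
    """
    lowered = {name.lower(): value for name, value in upstream_headers.items()}
    raw = lowered.get("retry-after")
    if raw is None:
        return DEFAULT_RETRY_AFTER_SECONDS
    try:
        seconds = int(raw.strip())
    except ValueError:
        # Assumption: non-integer forms (e.g. an HTTP-date) use the default.
        return DEFAULT_RETRY_AFTER_SECONDS
    return seconds if seconds >= 0 else DEFAULT_RETRY_AFTER_SECONDS
```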
Internal Server Error (500)
Status Code: 500 Internal Server Error
Headers: None
Body: Empty (per spec.md clarification: "HTTP status code only, no error response body")
When Returned:
- Unexpected server error (per spec.md FR-008)
- Configuration error (missing environment variables)
- XML generation failure
Example Response:
HTTP/1.1 500 Internal Server Error
Client Action: Report error to adapter administrator
Server Logging: All 500 errors logged with stack trace to stderr (per spec.md FR-012)
Service Unavailable Response (503)
Status Code: 503 Service Unavailable
Headers: None
Body: Empty (per spec.md clarification: "HTTP status code only, no error response body")
When Returned:
- Google Drive API unavailable (per spec.md FR-017)
- Drive API returns 503 status (no retries per spec clarification)
Example Response:
HTTP/1.1 503 Service Unavailable
Client Action: Retry request later (Drive API temporarily unavailable)
Retry Behavior: Adapter does NOT retry Drive API 503 errors; immediately returns 503 to client (per spec.md FR-017 clarification)
Error Handling Specification
Error Response Format
All error responses follow the same pattern:
- Status code indicates error type
- No response body (per spec.md clarification)
- Minimal headers (only Retry-After for 429)
Rationale: Simplicity, consistency, fail-fast approach
Error Status Code Matrix
| Error Condition | Status Code | Headers | Body | Retry? |
|---|---|---|---|---|
| Authentication failed | 401 | None | Empty | No (fix credentials) |
| Rate limit exceeded | 429 | Retry-After | Empty | Yes (after delay) |
| Drive API unavailable | 503 | None | Empty | Yes (later) |
| Internal error | 500 | None | Empty | No (report to admin) |
| Path not found | 404 | None | Empty | No |
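The matrix above translates naturally into a lookup table. The sketch below is illustrative only: ErrorCondition, ErrorResponse, and to_response are hypothetical names, not part of the adapter.

```python
from dataclasses import dataclass
from enum import Enum


class ErrorCondition(Enum):
    AUTH_FAILED = "authentication failed"
    RATE_LIMITED = "rate limit exceeded"
    DRIVE_UNAVAILABLE = "drive api unavailable"
    INTERNAL = "internal error"
    NOT_FOUND = "path not found"


# Status codes taken directly from the error status code matrix.
STATUS_BY_CONDITION = {
    ErrorCondition.AUTH_FAILED: 401,
    ErrorCondition.RATE_LIMITED: 429,
    ErrorCondition.DRIVE_UNAVAILABLE: 503,
    ErrorCondition.INTERNAL: 500,
    ErrorCondition.NOT_FOUND: 404,
}


@dataclass(frozen=True)
class ErrorResponse:
    status: int
    headers: dict
    body: bytes = b""  # error bodies are always empty per the contract


def to_response(condition, retry_after=60):
    """Map an error condition to its status code and minimal headers."""
    status = STATUS_BY_CONDITION[condition]
    # Retry-After is the only header attached to an error, and only for 429.
    headers = {"Retry-After": str(retry_after)} if status == 429 else {}
    return ErrorResponse(status=status, headers=headers)
```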
Logging Specification
Request Logging (stdout)
All requests logged with:
- Timestamp (ISO 8601)
- HTTP method and path
- Response status code
- Response time (milliseconds)
Example:
[2026-03-07T14:30:15.456Z] GET /sitemap.xml -> 200 (1234ms)
[2026-03-07T14:30:20.789Z] GET /sitemap.xml -> 429 (234ms)
[2026-03-07T14:30:25.012Z] GET /invalid.xml -> 404 (1ms)
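A line in the format shown above could be produced with a small helper like the one below. The name format_request_log and its parameters are assumptions for illustration; the contract only fixes the fields and the example layout.

```python
from datetime import datetime, timezone


def format_request_log(method, path, status, elapsed_ms, now=None):
    """Format one request log line: timestamp, method, path, status, duration."""
    now = now or datetime.now(timezone.utc)
    # ISO 8601 in UTC with millisecond precision and a trailing "Z".
    stamp = (
        now.astimezone(timezone.utc)
        .isoformat(timespec="milliseconds")
        .replace("+00:00", "Z")
    )
    return f"[{stamp}] {method} {path} -> {status} ({elapsed_ms}ms)"
```

The caller would write the returned line to stdout, for example with print(format_request_log("GET", "/sitemap.xml", 200, 1234)).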
Error Logging (stderr)
All errors logged with:
- Timestamp (ISO 8601)
- Request ID (for correlation)
- Error message
- Stack trace (for 500 errors)
Example:
[2026-03-07T14:30:20.789Z] [ERROR] Rate limit exceeded: Drive API quota exhausted
[2026-03-07T14:30:25.012Z] [ERROR] Authentication failed: Invalid Service Account key
[2026-03-07T14:30:30.345Z] [ERROR] Drive API unavailable: Connection timeout
Contract Tests
Test Scenarios
- Successful sitemap generation
  - Request: GET /sitemap.xml
  - Expected: 200 status, valid XML, Content-Type: application/xml
- Not found for other paths
  - Request: GET /invalid.xml
  - Expected: 404 status, empty body
- Rate limiting
  - Simulate Drive API 429 response
  - Expected: 429 status, Retry-After header, empty body
- Authentication failure
  - Simulate invalid credentials
  - Expected: 401 status, empty body
- Service unavailable
  - Simulate Drive API 503 response
  - Expected: 503 status, empty body (no retries)
- XML schema validation
  - Request: GET /sitemap.xml
  - Validate XML against sitemap protocol schema
- URL format validation
  - Request: GET /sitemap.xml
  - Verify all <loc> URLs use /documents/{documentId} format
Test Assertions
XML Schema Validation:
- Root element: <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
- Each <url> has required <loc> child
- Each <lastmod> is valid ISO 8601 timestamp
- Maximum 50,000 <url> entries
URL Format Validation:
- All <loc> URLs are absolute (start with http:// or https://)
- All <loc> URLs use RESTful format: {baseUrl}/documents/{documentId}
- Document IDs match regex: ^[a-zA-Z0-9_-]+$
Header Validation:
- 200 responses include Content-Type: application/xml
- 429 responses include Retry-After header with integer value
- All error responses have empty body
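The XML and URL assertions above could be checked by a helper along these lines. This is a sketch, not the project's test suite: check_sitemap is a hypothetical name, and datetime.fromisoformat is used as a stand-in for ISO 8601 validation (it accepts the timestamp shape used in this contract but is not a full ISO 8601 parser).

```python
import re
import xml.etree.ElementTree as ET
from datetime import datetime

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
DOC_ID = re.compile(r"^[a-zA-Z0-9_-]+$")  # regex from the assertions above


def check_sitemap(xml_text, base_url):
    """Return a list of assertion failures; an empty list means all passed."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    if root.tag != NS + "urlset":
        return [f"unexpected root element: {root.tag}"]
    problems = []
    urls = root.findall(NS + "url")
    if len(urls) > 50_000:
        problems.append("more than 50,000 <url> entries")
    prefix = base_url.rstrip("/") + "/documents/"
    for index, url in enumerate(urls):
        loc = url.findtext(NS + "loc")
        if not loc:
            problems.append(f"url[{index}]: missing <loc>")
            continue
        if not loc.startswith(("http://", "https://")):
            problems.append(f"url[{index}]: <loc> is not absolute")
        if not loc.startswith(prefix) or not DOC_ID.fullmatch(loc[len(prefix):]):
            problems.append(f"url[{index}]: <loc> is not {prefix}{{documentId}}")
        lastmod = url.findtext(NS + "lastmod")
        if lastmod is not None:
            try:
                datetime.fromisoformat(lastmod.replace("Z", "+00:00"))
            except ValueError:
                problems.append(f"url[{index}]: invalid <lastmod> {lastmod!r}")
    return problems
```

A contract test would then assert that check_sitemap(response_body, base_url) returns an empty list for a 200 response.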
Configuration
Environment Variables
| Variable | Required | Default | Description |
|---|---|---|---|
| GOOGLE_SERVICE_ACCOUNT_KEY | Yes | None | Inline JSON of Service Account key file |
| BASE_URL | Yes | None | Base URL for sitemap links (e.g., https://adapter.example.com) |
| PORT | No | 3000 | HTTP server port |
Example .env:
GOOGLE_SERVICE_ACCOUNT_KEY='{"type":"service_account","project_id":"...","private_key":"-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n","client_email":"...@developer.gserviceaccount.com",...}'
BASE_URL=https://adapter.example.com
PORT=3000
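Loading and validating these variables at startup might look like the sketch below. Config, ConfigError, and load_config are hypothetical names. Error messages deliberately omit the variable contents, in line with the requirement that credentials are never logged.

```python
import json
import os
from dataclasses import dataclass


class ConfigError(Exception):
    """Raised when required configuration is missing or malformed."""


@dataclass(frozen=True)
class Config:
    service_account_key: dict
    base_url: str
    port: int


def load_config(env=None):
    """Read adapter settings from the environment (or a dict, for tests)."""
    env = os.environ if env is None else env
    required = ("GOOGLE_SERVICE_ACCOUNT_KEY", "BASE_URL")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ConfigError("missing required variables: " + ", ".join(missing))
    try:
        key = json.loads(env["GOOGLE_SERVICE_ACCOUNT_KEY"])
    except json.JSONDecodeError:
        # Do not echo the value or parser detail: it may contain key material.
        raise ConfigError("GOOGLE_SERVICE_ACCOUNT_KEY is not valid JSON") from None
    try:
        port = int(env.get("PORT", "3000"))  # default from the table above
    except ValueError:
        raise ConfigError("PORT must be an integer") from None
    return Config(key, env["BASE_URL"].rstrip("/"), port)
```

Failing at startup rather than on the first request would surface the "Configuration error (missing environment variables)" case before any client sees a 500.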
Compatibility
Sitemap Protocol Compliance
Protocol: https://www.sitemaps.org/protocol.html
Compliance:
- ✅ Valid XML with namespace
- ✅ <loc> with absolute URLs
- ✅ <lastmod> with W3C Datetime format (ISO 8601)
- ✅ Maximum 50,000 URLs
- ✅ Maximum 50MB uncompressed size
Optional Elements Not Used:
- <changefreq>: Not applicable (no historical change data)
- <priority>: Not applicable (all documents equal priority)
HTTP Compliance
HTTP Version: HTTP/1.1
Methods Supported: GET only
Status Codes Used: 200, 401, 404, 429, 500, 503
Headers Used:
- Response: Content-Type, Content-Length, Retry-After
- Request: Standard HTTP headers accepted, none required
Security Considerations
Authentication
- Service Account credentials secured in environment variable (not in code or config files)
- Credentials never logged or exposed in error messages
- Read-only Drive scope (drive.readonly) - no write permissions
Rate Limiting
- Transparent propagation of Drive API rate limits to client
- No internal rate limiting (rely on Drive API limits)
Input Validation
- Path validation: Only /sitemap.xml accepted
- Method validation: Only GET accepted
- No query parameters processed (rejection not required, just ignored)
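The three rules above amount to a single predicate on the request line. The sketch below uses a hypothetical helper name, is_sitemap_request. What the adapter returns for a non-GET method is not stated in this contract, so the function only reports whether the request qualifies.

```python
from urllib.parse import urlsplit


def is_sitemap_request(method, target):
    """True only for GET requests whose path is exactly /sitemap.xml.

    `target` is the request target as received, e.g. "/sitemap.xml?x=1".
    Any query string is ignored rather than rejected.
    """
    return method == "GET" and urlsplit(target).path == "/sitemap.xml"
```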
Output Sanitization
- All URLs XML-escaped to prevent injection
- All timestamps XML-escaped (though ISO 8601 format doesn't contain XML special chars)
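If the XML is assembled as strings rather than through an XML library, the escaping described above could be done with the standard library's xml.sax.saxutils.escape. The helper name loc_element is hypothetical. Document IDs matching ^[a-zA-Z0-9_-]+$ contain no XML special characters, so in practice this mainly guards against an unusual BASE_URL.

```python
from xml.sax.saxutils import escape


def loc_element(url):
    """Render a <loc> element with &, < and > escaped in the URL."""
    return f"<loc>{escape(url)}</loc>"
```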
Versioning
Current Version: 1.0.0 (initial implementation)
Future Changes:
- Breaking changes (new required parameters): Major version bump (2.0.0)
- Backward-compatible additions (query parameters): Minor version bump (1.1.0)
- Bug fixes: Patch version bump (1.0.1)
Deprecation Policy:
- Breaking changes include migration guide
- Deprecated features supported for at least one minor version
References
- Feature Specification: /specs/001-drive-proxy-adapter/spec.md
- Data Model: /specs/001-drive-proxy-adapter/data-model.md
- Research Document: /specs/001-drive-proxy-adapter/research.md
- Sitemap Protocol: https://www.sitemaps.org/protocol.html
- Google Drive API v3: https://developers.google.com/drive/api/v3/reference