# API Contract: Sitemap XML Endpoint

**Feature**: 001-drive-proxy-adapter  
**Contract Type**: HTTP API  
**Endpoint**: `/sitemap.xml`  
**Version**: 1.0.0  
**Date**: 2026-03-07

---

## Endpoint Specification

### `GET /sitemap.xml`

Generate an XML sitemap of all accessible Google Drive documents.

---

## Request

### HTTP Method

`GET`

### URL

`/sitemap.xml`

### Query Parameters

None

### Request Headers

None required

### Request Body

None (GET request)

---

## Response

### Success Response (200 OK)

**Status Code**: `200 OK`

**Response Headers**:

```
Content-Type: application/xml; charset=utf-8
Content-Length: {size_in_bytes}
```

**Response Body** (XML):

```xml
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://example.com/documents/{documentId1}</loc>
    <lastmod>2026-03-07</lastmod>
  </url>
  <url>
    <loc>http://example.com/documents/{documentId2}</loc>
    <lastmod>2026-03-06</lastmod>
  </url>
</urlset>
```

**XML Schema Requirements**:

- Root element: `<urlset>` with namespace `xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"`
- Each document: `<url>` element containing:
  - `<loc>` (REQUIRED): Absolute URL in format `{baseUrl}/documents/{documentId}`
    - Must be URL-encoded
    - Must escape XML special characters: `&` → `&amp;`, `<` → `&lt;`, `>` → `&gt;`, `"` → `&quot;`, `'` → `&apos;`
  - `<lastmod>` (OPTIONAL): ISO 8601 date format
    - Format: `YYYY-MM-DD` or `YYYY-MM-DDTHH:MM:SS+00:00`
    - Omitted if Drive API provides no `modifiedTime`

**Empty Drive Response** (0 documents):

```xml
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
</urlset>
```

**Constraints**:

- Maximum 50,000 `<url>` entries (sitemap protocol limit)
- If more than 50,000 documents exist, the endpoint returns a 413 error instead

---

### Error Responses

#### 404 Not Found

**Trigger**: Request to any path other than `/sitemap.xml`, or to `/sitemap.xml` with a method other than `GET`

**Status Code**: `404 Not Found`

**Response Headers**: None

**Response Body**: Empty (no content)

**Example**:

```
GET /documents/abc123 → 404 Not Found (empty body)
GET /api/sitemap → 404 Not Found (empty body)
POST /sitemap.xml → 404 Not Found (empty body)
```

---

#### 413 Payload Too Large

**Trigger**: Google Drive contains more than 50,000 documents

**Status Code**: `413 Payload Too Large`

**Response Headers**: None

**Response Body**: Empty (no content)

**Rationale**: The sitemap protocol limits a sitemap to 50,000 URLs. This error prevents oversized sitemap generation.

---

#### 429 Too Many Requests

**Trigger**: Google Drive API returns a rate limit error

**Status Code**: `429 Too Many Requests`

**Response Headers**:

```
Retry-After: {seconds}
```

**Response Body**: Empty (no content)

**Example**:

```
HTTP/1.1 429 Too Many Requests
Retry-After: 60

(empty body)
```

**Rationale**: The client should retry after the specified number of seconds.

---

#### 401 Unauthorized

**Trigger**: Service Account token refresh failed

**Status Code**: `401 Unauthorized`

**Response Headers**: None

**Response Body**: Empty (no content)

**Rationale**: Authentication failed. Check the Service Account credentials configuration.

---

#### 503 Service Unavailable

**Trigger**: Google Drive API returns a 503 error

**Status Code**: `503 Service Unavailable`

**Response Headers**: None

**Response Body**: Empty (no content)

**Behavior**: No retries. The 503 is passed through to the client immediately, per specification.

---

#### 500 Internal Server Error

**Trigger**: Unexpected error during sitemap generation

**Status Code**: `500 Internal Server Error`

**Response Headers**: None

**Response Body**: Empty (no content)

**Rationale**: Unexpected server error. Check logs for details.
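The XML schema requirements and the 50,000-entry constraint described in the success response can be sketched roughly as follows. This is a minimal illustration, not the adapter's actual implementation: `build_sitemap`, `TooManyDocuments`, and the `(document_id, lastmod)` input shape are hypothetical names chosen for the example, and the caller is assumed to translate the exception into the 413 response.

```python
from typing import Iterable, Optional, Tuple
from urllib.parse import quote
from xml.sax.saxutils import escape

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # sitemap protocol limit cited in this contract


class TooManyDocuments(Exception):
    """Hypothetical signal that the caller should answer 413."""


def build_sitemap(base_url: str, docs: Iterable[Tuple[str, Optional[str]]]) -> str:
    """Render (document_id, lastmod_or_None) pairs as sitemap XML."""
    docs = list(docs)
    if len(docs) > MAX_URLS:
        raise TooManyDocuments(len(docs))

    lines = [f'<urlset xmlns="{SITEMAP_NS}">']
    for doc_id, lastmod in docs:
        # URL-encode the document ID first, then escape XML special characters.
        loc = f"{base_url.rstrip('/')}/documents/{quote(doc_id, safe='')}"
        loc = escape(loc, {'"': "&quot;", "'": "&apos;"})
        lines.append("  <url>")
        lines.append(f"    <loc>{loc}</loc>")
        if lastmod is not None:
            # <lastmod> is omitted when Drive provides no modifiedTime.
            lines.append(f"    <lastmod>{escape(lastmod)}</lastmod>")
        lines.append("  </url>")
    lines.append("</urlset>")
    return "\n".join(lines)
```

Encoding the ID before escaping keeps the two concerns separate: percent-encoding handles characters that are unsafe in a URL path, while XML escaping covers anything left in the base URL, such as `&` in a query string.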
---

## Examples

### Example 1: Successful Sitemap (3 documents)

**Request**:

```http
GET /sitemap.xml HTTP/1.1
Host: example.com
```

**Response**:

```http
HTTP/1.1 200 OK
Content-Type: application/xml; charset=utf-8
Content-Length: 512

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://example.com/documents/1A2B3C4D5E6F7G8H</loc>
    <lastmod>2026-03-07</lastmod>
  </url>
  <url>
    <loc>http://example.com/documents/9I0J1K2L3M4N5O6P</loc>
    <lastmod>2026-03-05</lastmod>
  </url>
  <url>
    <loc>http://example.com/documents/7Q8R9S0T1U2V3W4X</loc>
    <lastmod>2026-03-01</lastmod>
  </url>
</urlset>
```

---

### Example 2: Empty Drive

**Request**:

```http
GET /sitemap.xml HTTP/1.1
Host: example.com
```

**Response**:

```http
HTTP/1.1 200 OK
Content-Type: application/xml; charset=utf-8
Content-Length: 123

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
</urlset>
```

---

### Example 3: Rate Limit Exceeded

**Request**:

```http
GET /sitemap.xml HTTP/1.1
Host: example.com
```

**Response**:

```http
HTTP/1.1 429 Too Many Requests
Retry-After: 120
```

---

### Example 4: Too Many Documents

**Request**:

```http
GET /sitemap.xml HTTP/1.1
Host: example.com
```

**Response**:

```http
HTTP/1.1 413 Payload Too Large
```

---

### Example 5: Invalid Endpoint

**Request**:

```http
GET /documents/abc123 HTTP/1.1
Host: example.com
```

**Response**:

```http
HTTP/1.1 404 Not Found
```

---

## Contract Validation

### XML Schema Validation

The sitemap XML MUST validate against the sitemap protocol schema:

- **Namespace**: `http://www.sitemaps.org/schemas/sitemap/0.9`
- **Root element**: `<urlset>`
- **Child elements**: Zero or more `<url>` elements
- **URL elements**: Each contains `<loc>` (required) and `<lastmod>` (optional)

**Validation Tools**:

- XML parser (ensure well-formed XML)
- Sitemap validator: [https://www.xml-sitemaps.com/validate-xml-sitemap.html](https://www.xml-sitemaps.com/validate-xml-sitemap.html)
- XSD schema validation against official sitemap schema

---

### Contract Testing Requirements

All contract tests MUST verify:

1. **Success Path**:
   - Response status 200
   - Content-Type header is `application/xml; charset=utf-8`
   - Response body is valid XML
   - XML contains correct namespace
   - All `<loc>` URLs are absolute and properly formatted
   - All `<loc>` URLs follow pattern: `{baseUrl}/documents/{documentId}`
   - All `<lastmod>` dates are valid ISO 8601 format (if present)

2. **Error Handling**:
   - Invalid endpoints return 404 with empty body
   - >50k documents returns 413 with empty body
   - Rate limiting returns 429 with `Retry-After` header and empty body
   - Drive API 503 returns 503 with empty body (no retries)
   - All error responses have no `Content-Type` header
   - All error responses have empty body

3. **Edge Cases**:
   - Empty Drive (0 documents) returns valid sitemap with no `<url>` entries
   - Documents without `modifiedTime` omit `<lastmod>` tag
   - Special characters in document IDs are properly URL-encoded
   - XML special characters in URLs are properly escaped

---

## Breaking Changes

Changes that constitute breaking changes (require MAJOR version bump):

1. Changing URL format from `/documents/{id}` to different format
2. Changing XML namespace or root element structure
3. Removing `<lastmod>` field entirely
4. Changing error response status codes
5. Adding required query parameters
6. Changing response Content-Type

---

## References

- [Sitemap Protocol Specification](https://www.sitemaps.org/protocol.html)
- [Google Sitemap Guidelines](https://developers.google.com/search/docs/crawling-indexing/sitemaps/build-sitemap)
- [XML Specification](https://www.w3.org/TR/xml/)
- [ISO 8601 Date Format](https://en.wikipedia.org/wiki/ISO_8601)

---

## Version History

| Version | Date | Changes |
|---------|------|---------|
| 1.0.0 | 2026-03-07 | Initial contract specification |

---

## Summary

This contract defines the complete API specification for the `/sitemap.xml` endpoint, including:

1. **Request/response formats** with examples
2. **Error handling** with all status codes (404, 413, 429, 401, 503, 500)
3. **XML schema requirements** for sitemap format
4. **Validation criteria** for contract testing
5. **Breaking change policy** for version management

All error responses follow the spec requirement: **status code only, no response body** (except 429 which includes `Retry-After` header).
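As an illustration of the validation criteria in this contract, the checks could be expressed as a small helper that a test suite calls on a captured response. This is a sketch under assumptions: `check_success` and `check_error` are hypothetical names, headers are assumed to arrive as a plain dict with canonical casing, and the `lastmod` pattern only covers the two formats this contract lists rather than all of ISO 8601.

```python
import re
import xml.etree.ElementTree as ET
from typing import Dict, List

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Matches YYYY-MM-DD or YYYY-MM-DDTHH:MM:SS+00:00 style values only.
LASTMOD_RE = re.compile(r"^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}:\d{2}[+-]\d{2}:\d{2})?$")


def check_success(status: int, headers: Dict[str, str], body: str, base_url: str) -> List[str]:
    """Return contract violations for a success response (empty list = pass)."""
    problems = []
    if status != 200:
        problems.append(f"expected 200, got {status}")
    if headers.get("Content-Type") != "application/xml; charset=utf-8":
        problems.append("unexpected Content-Type")
    try:
        root = ET.fromstring(body)
    except ET.ParseError as exc:
        return problems + [f"body is not well-formed XML: {exc}"]
    if root.tag != f"{{{NS}}}urlset":
        problems.append("root must be <urlset> in the sitemap namespace")
    prefix = f"{base_url.rstrip('/')}/documents/"
    for url in root.findall(f"{{{NS}}}url"):
        loc = url.findtext(f"{{{NS}}}loc")
        # <loc> is required and must extend the {baseUrl}/documents/ prefix.
        if not loc or not loc.startswith(prefix) or len(loc) == len(prefix):
            problems.append(f"bad <loc>: {loc!r}")
        lastmod = url.findtext(f"{{{NS}}}lastmod")
        if lastmod is not None and not LASTMOD_RE.match(lastmod):
            problems.append(f"bad <lastmod>: {lastmod!r}")
    return problems


def check_error(status: int, headers: Dict[str, str], body: str) -> List[str]:
    """Return contract violations for an error response (empty list = pass)."""
    problems = []
    if body:
        problems.append("error responses must have an empty body")
    if "Content-Type" in headers:
        problems.append("error responses must not set Content-Type")
    if status == 429 and "Retry-After" not in headers:
        problems.append("429 must include Retry-After")
    return problems
```

Returning a list of violations rather than raising on the first one lets a single test run report every way a response departs from the contract.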