Added new feature for document export
290
specs/001-sitemap/contracts/openapi.yaml
Normal file
@@ -0,0 +1,290 @@
|
||||
openapi: 3.0.3
|
||||
info:
|
||||
title: Google Drive Sitemap Adapter API
|
||||
description: |
|
||||
HTTP adapter for generating XML sitemaps listing accessible Google Drive documents.
|
||||
|
||||
## Overview
|
||||
This adapter provides a single endpoint (`/sitemap.xml`) that generates a valid XML sitemap
|
||||
conforming to the sitemap protocol (https://www.sitemaps.org/protocol.html).
|
||||
|
||||
The sitemap lists all documents accessible to the configured Google Service Account,
|
||||
with URLs pointing back to this adapter using document IDs.
|
||||
|
||||
## Authentication
|
||||
The adapter uses OAuth 2.0 Service Account authentication to access Google Drive.
|
||||
External clients do not need to authenticate with this API.
|
||||
|
||||
## Rate Limiting
|
||||
When the Google Drive API rate-limits the adapter, the adapter responds with HTTP 429
and a `Retry-After` header giving the number of seconds to wait before retrying.
|
||||
|
||||
## Sitemap Protocol Compliance
|
||||
- Maximum 50,000 URLs per sitemap (protocol limit)
|
||||
- Each URL includes document ID and last modified timestamp
|
||||
- Always returns fresh data (no caching)
|
||||
|
||||
version: 1.0.0
|
||||
contact:
|
||||
name: API Support
|
||||
license:
|
||||
name: ISC
|
||||
|
||||
servers:
|
||||
- url: http://localhost:3000
|
||||
description: Development server
|
||||
- url: https://adapter.example.com
|
||||
description: Production server
|
||||
|
||||
tags:
|
||||
- name: Sitemap
|
||||
description: XML sitemap generation
|
||||
|
||||
paths:
|
||||
/sitemap.xml:
|
||||
get:
|
||||
summary: Generate XML sitemap
|
||||
description: |
|
||||
Returns an XML sitemap listing all accessible Google Drive documents.
|
||||
|
||||
Each URL in the sitemap points to this adapter with a document ID:
|
||||
`{baseUrl}/{documentId}`
|
||||
|
||||
The sitemap is generated on-demand (no caching) and may take up to 5 seconds
|
||||
for drives containing up to 10,000 documents.
|
||||
|
||||
## Sitemap Format
|
||||
Conforms to https://www.sitemaps.org/protocol.html:
|
||||
- `<loc>`: Absolute URL with document ID
|
||||
- `<lastmod>`: Last modified timestamp (ISO 8601)
|
||||
|
||||
## Document Retrieval
|
||||
Note: The URLs in the sitemap point back to this adapter, but document retrieval
|
||||
endpoints are not implemented. This adapter only generates sitemaps for discovery.
|
||||
|
||||
operationId: getSitemap
|
||||
tags:
|
||||
- Sitemap
|
||||
responses:
|
||||
'200':
|
||||
description: Successfully generated sitemap
|
||||
headers:
|
||||
Content-Type:
|
||||
description: Always application/xml
|
||||
schema:
|
||||
type: string
|
||||
example: application/xml
|
||||
Content-Length:
|
||||
description: Size of sitemap in bytes
|
||||
schema:
|
||||
type: integer
|
||||
example: 204800
|
||||
content:
|
||||
application/xml:
|
||||
schema:
|
||||
type: string
|
||||
format: xml
|
||||
example: |
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
|
||||
<url>
|
||||
<loc>https://adapter.example.com/1BxAA_example123</loc>
|
||||
<lastmod>2026-03-06T10:30:00.000Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>https://adapter.example.com/1CyBB_example456</loc>
|
||||
<lastmod>2026-03-05T14:20:00.000Z</lastmod>
|
||||
</url>
|
||||
</urlset>
|
||||
'401':
|
||||
description: Unauthorized - OAuth authentication failed
|
||||
headers:
|
||||
Content-Length:
|
||||
description: Always 0 (no response body)
|
||||
schema:
|
||||
type: integer
|
||||
example: 0
|
||||
'429':
|
||||
description: Too Many Requests - Rate limited by Google Drive API
|
||||
headers:
|
||||
Retry-After:
|
||||
description: Seconds to wait before retrying
|
||||
schema:
|
||||
type: integer
|
||||
example: 60
|
||||
Content-Length:
|
||||
description: Always 0 (no response body)
|
||||
schema:
|
||||
type: integer
|
||||
example: 0
|
||||
'500':
|
||||
description: Internal Server Error
|
||||
headers:
|
||||
Content-Length:
|
||||
description: Always 0 (no response body)
|
||||
schema:
|
||||
type: integer
|
||||
example: 0
|
||||
'503':
|
||||
description: Service Unavailable - Google Drive API is down
|
||||
headers:
|
||||
Content-Length:
|
||||
description: Always 0 (no response body)
|
||||
schema:
|
||||
type: integer
|
||||
example: 0
|
||||
|
||||
/{documentId}:
|
||||
get:
|
||||
summary: Document retrieval endpoint (NOT IMPLEMENTED)
|
||||
description: |
|
||||
This endpoint is referenced in sitemap URLs but is not implemented.
|
||||
The adapter only generates sitemaps; it does not serve documents.
|
||||
|
||||
Clients should treat sitemap URLs as metadata only.
|
||||
|
||||
operationId: getDocument
|
||||
tags:
|
||||
- Documents (Not Implemented)
|
||||
parameters:
|
||||
- name: documentId
|
||||
in: path
|
||||
description: Google Drive document ID
|
||||
required: true
|
||||
schema:
|
||||
type: string
|
||||
pattern: '^[a-zA-Z0-9_-]+$'
|
||||
example: 1BxAA_example123
|
||||
responses:
|
||||
'404':
|
||||
description: Not Found - Document retrieval not implemented
|
||||
headers:
|
||||
Content-Length:
|
||||
description: Always 0 (no response body)
|
||||
schema:
|
||||
type: integer
|
||||
example: 0
|
||||
|
||||
/{anyOtherPath}:
|
||||
get:
|
||||
summary: All other paths
|
||||
description: |
|
||||
Any path other than `/sitemap.xml` returns 404 Not Found.
|
||||
|
||||
operationId: notFound
|
||||
tags:
|
||||
- Routing
|
||||
parameters:
|
||||
- name: anyOtherPath
|
||||
in: path
|
||||
description: Any path other than /sitemap.xml
|
||||
required: true
|
||||
schema:
|
||||
type: string
|
||||
responses:
|
||||
'404':
|
||||
description: Not Found
|
||||
headers:
|
||||
Content-Length:
|
||||
description: Always 0 (no response body)
|
||||
schema:
|
||||
type: integer
|
||||
example: 0
|
||||
|
||||
components:
|
||||
schemas:
|
||||
Sitemap:
|
||||
type: object
|
||||
description: XML sitemap structure (logical representation, actual response is XML)
|
||||
properties:
|
||||
xmlns:
|
||||
type: string
|
||||
description: XML namespace for sitemap protocol
|
||||
example: http://www.sitemaps.org/schemas/sitemap/0.9
|
||||
urls:
|
||||
type: array
|
||||
description: Array of URL entries
|
||||
items:
|
||||
$ref: '#/components/schemas/SitemapUrl'
|
||||
maxItems: 50000
|
||||
|
||||
SitemapUrl:
|
||||
type: object
|
||||
description: Single URL entry in sitemap
|
||||
required:
|
||||
- loc
|
||||
- lastmod
|
||||
properties:
|
||||
loc:
|
||||
type: string
|
||||
format: uri
|
||||
description: Absolute URL to document (adapter URL + document ID)
|
||||
example: https://adapter.example.com/1BxAA_example123
|
||||
lastmod:
|
||||
type: string
|
||||
format: date-time
|
||||
description: Last modified timestamp in ISO 8601 format
|
||||
example: '2026-03-06T10:30:00.000Z'
|
||||
|
||||
Error:
|
||||
type: object
|
||||
description: Error response (note - most errors return empty body per spec)
|
||||
properties:
|
||||
code:
|
||||
type: integer
|
||||
description: HTTP status code
|
||||
example: 500
|
||||
message:
|
||||
type: string
|
||||
description: Error message (not included in actual responses)
|
||||
example: Internal Server Error
|
||||
|
||||
responses:
|
||||
UnauthorizedError:
|
||||
description: Unauthorized - OAuth authentication failed
|
||||
headers:
|
||||
Content-Length:
|
||||
schema:
|
||||
type: integer
|
||||
example: 0
|
||||
|
||||
RateLimitError:
|
||||
description: Too Many Requests - Rate limited by Google Drive API
|
||||
headers:
|
||||
Retry-After:
|
||||
description: Seconds to wait before retrying
|
||||
schema:
|
||||
type: integer
|
||||
example: 60
|
||||
Content-Length:
|
||||
schema:
|
||||
type: integer
|
||||
example: 0
|
||||
|
||||
InternalError:
|
||||
description: Internal Server Error
|
||||
headers:
|
||||
Content-Length:
|
||||
schema:
|
||||
type: integer
|
||||
example: 0
|
||||
|
||||
ServiceUnavailable:
|
||||
description: Service Unavailable - Google Drive API is down
|
||||
headers:
|
||||
Content-Length:
|
||||
schema:
|
||||
type: integer
|
||||
example: 0
|
||||
|
||||
NotFound:
|
||||
description: Not Found - Path not recognized
|
||||
headers:
|
||||
Content-Length:
|
||||
schema:
|
||||
type: integer
|
||||
example: 0
|
||||
|
||||
externalDocs:
|
||||
description: Sitemap Protocol Specification
|
||||
url: https://www.sitemaps.org/protocol.html
|
||||
454
specs/001-sitemap/contracts/openapi.yaml.backup-export-version
Normal file
@@ -0,0 +1,454 @@
|
||||
openapi: 3.0.3
|
||||
info:
|
||||
title: Google Drive HTTP Proxy Adapter API
|
||||
description: |
|
||||
HTTP proxy adapter for exporting Google Drive documents in multiple formats (Markdown, HTML, PDF)
|
||||
and generating XML sitemaps of accessible documents.
|
||||
|
||||
## Authentication
|
||||
The adapter uses OAuth 2.0 to access Google Drive on behalf of configured users.
|
||||
External clients do not need to authenticate with this API directly.
|
||||
|
||||
## Rate Limiting
|
||||
API requests are rate-limited to 100 requests per minute per IP address.
|
||||
Rate limit information is included in response headers.
|
||||
version: 1.0.0
|
||||
contact:
|
||||
name: API Support
|
||||
license:
|
||||
name: MIT
|
||||
|
||||
servers:
|
||||
- url: http://localhost:3000
|
||||
description: Development server
|
||||
- url: https://api.example.com
|
||||
description: Production server
|
||||
|
||||
tags:
|
||||
- name: Documents
|
||||
description: Document export operations
|
||||
- name: Discovery
|
||||
description: Document discovery and listing
|
||||
- name: Health
|
||||
description: Service health monitoring
|
||||
|
||||
paths:
|
||||
/health:
|
||||
get:
|
||||
summary: Health check endpoint
|
||||
description: Returns service health status and version information
|
||||
tags:
|
||||
- Health
|
||||
responses:
|
||||
'200':
|
||||
description: Service is healthy
|
||||
content:
|
||||
application/json:
|
||||
schema:
|
||||
type: object
|
||||
properties:
|
||||
status:
|
||||
type: string
|
||||
example: ok
|
||||
version:
|
||||
type: string
|
||||
example: 1.0.0
|
||||
uptime:
|
||||
type: number
|
||||
description: Service uptime in seconds
|
||||
example: 86400
|
||||
|
||||
/sitemap.xml:
|
||||
get:
|
||||
summary: Generate sitemap of accessible documents
|
||||
description: |
|
||||
Returns an XML sitemap listing all Google Drive documents accessible to the configured user.
|
||||
Follows the sitemap protocol specification (https://www.sitemaps.org/protocol.html).
|
||||
tags:
|
||||
- Discovery
|
||||
responses:
|
||||
'200':
|
||||
description: Sitemap generated successfully
|
||||
headers:
|
||||
Content-Type:
|
||||
schema:
|
||||
type: string
|
||||
example: application/xml; charset=utf-8
|
||||
X-Request-Id:
|
||||
schema:
|
||||
type: string
|
||||
format: uuid
|
||||
description: Unique request identifier for tracing
|
||||
X-Document-Count:
|
||||
schema:
|
||||
type: integer
|
||||
description: Number of documents in the sitemap
|
||||
content:
|
||||
application/xml:
|
||||
schema:
|
||||
type: string
|
||||
format: xml
|
||||
example: |
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
|
||||
<url>
|
||||
<loc>http://localhost:3000/1BxAA_example123</loc>
|
||||
<lastmod>2026-03-06T10:30:00Z</lastmod>
|
||||
</url>
|
||||
<url>
|
||||
<loc>http://localhost:3000/2CyBB_example456</loc>
|
||||
<lastmod>2026-03-05T14:20:00Z</lastmod>
|
||||
</url>
|
||||
</urlset>
|
||||
'401':
|
||||
$ref: '#/components/responses/Unauthorized'
|
||||
'429':
|
||||
$ref: '#/components/responses/RateLimited'
|
||||
'500':
|
||||
$ref: '#/components/responses/InternalError'
|
||||
'503':
|
||||
$ref: '#/components/responses/ServiceUnavailable'
|
||||
|
||||
/{documentId}:
|
||||
get:
|
||||
summary: Export Google Drive document in specified format
|
||||
description: |
|
||||
Fetches a Google Drive document by ID and exports it in the requested format.
|
||||
Supports Markdown (default), HTML, and PDF formats.
|
||||
tags:
|
||||
- Documents
|
||||
parameters:
|
||||
- name: documentId
|
||||
in: path
|
||||
required: true
|
||||
description: Google Drive file ID (8-128 alphanumeric characters, hyphens, or underscores)
|
||||
schema:
|
||||
type: string
|
||||
pattern: '^[a-zA-Z0-9_-]{8,128}$'
|
||||
example: 1BxAA_example123
|
||||
|
||||
- name: format
|
||||
in: query
|
||||
required: false
|
||||
description: Export format (defaults to markdown if not specified)
|
||||
schema:
|
||||
type: string
|
||||
enum:
|
||||
- markdown
|
||||
- html
|
||||
- pdf
|
||||
default: markdown
|
||||
example: markdown
|
||||
|
||||
responses:
|
||||
'200':
|
||||
description: Document exported successfully
|
||||
headers:
|
||||
Content-Type:
|
||||
schema:
|
||||
type: string
|
||||
enum:
|
||||
- text/markdown; charset=utf-8
|
||||
- text/html; charset=utf-8
|
||||
- application/pdf
|
||||
description: MIME type of exported document
|
||||
X-Request-Id:
|
||||
schema:
|
||||
type: string
|
||||
format: uuid
|
||||
description: Unique request identifier for tracing
|
||||
X-Document-Title:
|
||||
schema:
|
||||
type: string
|
||||
description: Original document title from Google Drive
|
||||
X-Document-Modified:
|
||||
schema:
|
||||
type: string
|
||||
format: date-time
|
||||
description: Last modified timestamp (ISO 8601)
|
||||
content:
|
||||
text/markdown:
|
||||
schema:
|
||||
type: string
|
||||
example: |
|
||||
# Document Title
|
||||
|
||||
This is a paragraph with **bold** and *italic* text.
|
||||
|
||||
## Section Heading
|
||||
|
||||
- List item 1
|
||||
- List item 2
|
||||
|
||||
text/html:
|
||||
schema:
|
||||
type: string
|
||||
example: |
|
||||
<!DOCTYPE html>
|
||||
<html>
|
||||
<head><title>Document Title</title></head>
|
||||
<body>
|
||||
<h1>Document Title</h1>
|
||||
<p>This is a paragraph with <strong>bold</strong> and <em>italic</em> text.</p>
|
||||
</body>
|
||||
</html>
|
||||
|
||||
application/pdf:
|
||||
schema:
|
||||
type: string
|
||||
format: binary
|
||||
|
||||
'400':
|
||||
$ref: '#/components/responses/BadRequest'
|
||||
'401':
|
||||
$ref: '#/components/responses/Unauthorized'
|
||||
'403':
|
||||
$ref: '#/components/responses/Forbidden'
|
||||
'404':
|
||||
$ref: '#/components/responses/NotFound'
|
||||
'413':
|
||||
$ref: '#/components/responses/PayloadTooLarge'
|
||||
'415':
|
||||
$ref: '#/components/responses/UnsupportedMediaType'
|
||||
'429':
|
||||
$ref: '#/components/responses/RateLimited'
|
||||
'500':
|
||||
$ref: '#/components/responses/InternalError'
|
||||
'503':
|
||||
$ref: '#/components/responses/ServiceUnavailable'
|
||||
|
||||
components:
|
||||
schemas:
|
||||
ErrorResponse:
|
||||
type: object
|
||||
required:
|
||||
- error
|
||||
- timestamp
|
||||
properties:
|
||||
error:
|
||||
type: object
|
||||
required:
|
||||
- code
|
||||
- message
|
||||
- requestId
|
||||
properties:
|
||||
code:
|
||||
type: string
|
||||
description: Machine-readable error code
|
||||
enum:
|
||||
- DOCUMENT_NOT_FOUND
|
||||
- DOCUMENT_FORBIDDEN
|
||||
- UNAUTHORIZED
|
||||
- INVALID_FORMAT
|
||||
- UNSUPPORTED_DOCUMENT_TYPE
|
||||
- RATE_LIMITED
|
||||
- DRIVE_API_ERROR
|
||||
- INTERNAL_ERROR
|
||||
- PAYLOAD_TOO_LARGE
|
||||
example: DOCUMENT_NOT_FOUND
|
||||
message:
|
||||
type: string
|
||||
description: Human-readable error message
|
||||
example: Document with ID '1BxAA_example123' does not exist or is not accessible
|
||||
details:
|
||||
type: object
|
||||
description: Optional additional context
|
||||
additionalProperties: true
|
||||
requestId:
|
||||
type: string
|
||||
format: uuid
|
||||
description: Request ID for support and debugging
|
||||
example: 550e8400-e29b-41d4-a716-446655440000
|
||||
timestamp:
|
||||
type: string
|
||||
format: date-time
|
||||
description: ISO 8601 timestamp when error occurred
|
||||
example: '2026-03-06T10:30:00.123Z'
|
||||
|
||||
responses:
|
||||
BadRequest:
|
||||
description: Invalid request parameters
|
||||
headers:
|
||||
X-Request-Id:
|
||||
schema:
|
||||
type: string
|
||||
format: uuid
|
||||
content:
|
||||
application/json:
|
||||
schema:
|
||||
$ref: '#/components/schemas/ErrorResponse'
|
||||
example:
|
||||
error:
|
||||
code: INVALID_FORMAT
|
||||
message: "Invalid format 'docx'. Supported formats: markdown, html, pdf"
|
||||
requestId: 550e8400-e29b-41d4-a716-446655440000
|
||||
timestamp: '2026-03-06T10:30:00.123Z'
|
||||
|
||||
Unauthorized:
|
||||
description: Authentication failed or missing
|
||||
headers:
|
||||
X-Request-Id:
|
||||
schema:
|
||||
type: string
|
||||
format: uuid
|
||||
content:
|
||||
application/json:
|
||||
schema:
|
||||
$ref: '#/components/schemas/ErrorResponse'
|
||||
example:
|
||||
error:
|
||||
code: UNAUTHORIZED
|
||||
message: Authentication with Google Drive failed
|
||||
requestId: 550e8400-e29b-41d4-a716-446655440001
|
||||
timestamp: '2026-03-06T10:30:01.456Z'
|
||||
|
||||
Forbidden:
|
||||
description: User lacks permission to access the document
|
||||
headers:
|
||||
X-Request-Id:
|
||||
schema:
|
||||
type: string
|
||||
format: uuid
|
||||
content:
|
||||
application/json:
|
||||
schema:
|
||||
$ref: '#/components/schemas/ErrorResponse'
|
||||
example:
|
||||
error:
|
||||
code: DOCUMENT_FORBIDDEN
|
||||
message: You do not have permission to access this document
|
||||
requestId: 550e8400-e29b-41d4-a716-446655440002
|
||||
timestamp: '2026-03-06T10:30:02.789Z'
|
||||
|
||||
NotFound:
|
||||
description: Document does not exist
|
||||
headers:
|
||||
X-Request-Id:
|
||||
schema:
|
||||
type: string
|
||||
format: uuid
|
||||
content:
|
||||
application/json:
|
||||
schema:
|
||||
$ref: '#/components/schemas/ErrorResponse'
|
||||
example:
|
||||
error:
|
||||
code: DOCUMENT_NOT_FOUND
|
||||
message: Document with ID '1BxAA_invalid' does not exist or is not accessible
|
||||
requestId: 550e8400-e29b-41d4-a716-446655440003
|
||||
timestamp: '2026-03-06T10:30:03.012Z'
|
||||
|
||||
PayloadTooLarge:
|
||||
description: Document exceeds maximum size limit
|
||||
headers:
|
||||
X-Request-Id:
|
||||
schema:
|
||||
type: string
|
||||
format: uuid
|
||||
content:
|
||||
application/json:
|
||||
schema:
|
||||
$ref: '#/components/schemas/ErrorResponse'
|
||||
example:
|
||||
error:
|
||||
code: PAYLOAD_TOO_LARGE
|
||||
message: Document size exceeds maximum limit of 100MB
|
||||
requestId: 550e8400-e29b-41d4-a716-446655440004
|
||||
timestamp: '2026-03-06T10:30:04.345Z'
|
||||
|
||||
UnsupportedMediaType:
|
||||
description: Document type cannot be exported in requested format
|
||||
headers:
|
||||
X-Request-Id:
|
||||
schema:
|
||||
type: string
|
||||
format: uuid
|
||||
content:
|
||||
application/json:
|
||||
schema:
|
||||
$ref: '#/components/schemas/ErrorResponse'
|
||||
example:
|
||||
error:
|
||||
code: UNSUPPORTED_DOCUMENT_TYPE
|
||||
message: Document type 'application/vnd.google-apps.form' cannot be exported as PDF
|
||||
requestId: 550e8400-e29b-41d4-a716-446655440005
|
||||
timestamp: '2026-03-06T10:30:05.678Z'
|
||||
|
||||
RateLimited:
|
||||
description: Rate limit exceeded
|
||||
headers:
|
||||
X-Request-Id:
|
||||
schema:
|
||||
type: string
|
||||
format: uuid
|
||||
X-RateLimit-Limit:
|
||||
schema:
|
||||
type: integer
|
||||
description: Maximum requests per minute
|
||||
example: 100
|
||||
X-RateLimit-Remaining:
|
||||
schema:
|
||||
type: integer
|
||||
description: Remaining requests in current window
|
||||
example: 0
|
||||
X-RateLimit-Reset:
|
||||
schema:
|
||||
type: integer
|
||||
description: Unix timestamp when rate limit resets
|
||||
example: 1709724660
|
||||
Retry-After:
|
||||
schema:
|
||||
type: integer
|
||||
description: Seconds until rate limit resets
|
||||
example: 60
|
||||
content:
|
||||
application/json:
|
||||
schema:
|
||||
$ref: '#/components/schemas/ErrorResponse'
|
||||
example:
|
||||
error:
|
||||
code: RATE_LIMITED
|
||||
message: Rate limit exceeded. Please retry after 60 seconds
|
||||
requestId: 550e8400-e29b-41d4-a716-446655440006
|
||||
timestamp: '2026-03-06T10:30:06.901Z'
|
||||
|
||||
InternalError:
|
||||
description: Internal server error
|
||||
headers:
|
||||
X-Request-Id:
|
||||
schema:
|
||||
type: string
|
||||
format: uuid
|
||||
content:
|
||||
application/json:
|
||||
schema:
|
||||
$ref: '#/components/schemas/ErrorResponse'
|
||||
example:
|
||||
error:
|
||||
code: INTERNAL_ERROR
|
||||
message: An unexpected error occurred while processing your request
|
||||
requestId: 550e8400-e29b-41d4-a716-446655440007
|
||||
timestamp: '2026-03-06T10:30:07.234Z'
|
||||
|
||||
ServiceUnavailable:
|
||||
description: Service temporarily unavailable (Google Drive API down or rate limited)
|
||||
headers:
|
||||
X-Request-Id:
|
||||
schema:
|
||||
type: string
|
||||
format: uuid
|
||||
Retry-After:
|
||||
schema:
|
||||
type: integer
|
||||
description: Seconds until service may be available
|
||||
example: 300
|
||||
content:
|
||||
application/json:
|
||||
schema:
|
||||
$ref: '#/components/schemas/ErrorResponse'
|
||||
example:
|
||||
error:
|
||||
code: DRIVE_API_ERROR
|
||||
message: Google Drive API is temporarily unavailable. Please retry later
|
||||
requestId: 550e8400-e29b-41d4-a716-446655440008
|
||||
timestamp: '2026-03-06T10:30:08.567Z'
|
||||
436
specs/001-sitemap/contracts/sitemap-api.md
Normal file
@@ -0,0 +1,436 @@
|
||||
# API Contract: Sitemap Endpoint
|
||||
|
||||
**Feature**: 001-drive-proxy-adapter
|
||||
**Date**: 2026-03-07
|
||||
**Phase**: 1 - Design & Contracts
|
||||
**Endpoint**: `GET /sitemap.xml`
|
||||
|
||||
## Overview
|
||||
|
||||
The `/sitemap.xml` endpoint returns an XML sitemap listing all Google Drive documents accessible to the Service Account. This is the only endpoint exposed by the adapter.
|
||||
|
||||
---
|
||||
|
||||
## Endpoint Definition
|
||||
|
||||
### URL
|
||||
```
|
||||
GET /sitemap.xml
|
||||
```
|
||||
|
||||
### Authentication
|
||||
- **Method**: None (endpoint is public)
|
||||
- **Backend Authentication**: Service Account JWT to Google Drive API (transparent to client)
|
||||
- **Credentials**: Loaded from `GOOGLE_SERVICE_ACCOUNT_KEY` environment variable
|
||||
|
||||
### Request
|
||||
|
||||
**Method**: `GET`
|
||||
|
||||
**Headers**:
|
||||
- None required
|
||||
|
||||
**Query Parameters**:
|
||||
- None supported
|
||||
|
||||
**Request Body**:
|
||||
- None (GET request)
|
||||
|
||||
**Example Request**:
|
||||
```http
GET /sitemap.xml HTTP/1.1
Host: adapter.example.com
User-Agent: Mozilla/5.0
```
|
||||
|
||||
---
|
||||
|
||||
## Response Specifications
|
||||
|
||||
### Success Response (200 OK)
|
||||
|
||||
**Status Code**: `200 OK`
|
||||
|
||||
**Headers**:
|
||||
- `Content-Type: application/xml`
|
||||
- `Content-Length: {size_in_bytes}`
|
||||
|
||||
**Body**: Valid XML sitemap conforming to sitemap protocol
|
||||
|
||||
**XML Schema**:
|
||||
```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://adapter.example.com/documents/{documentId}</loc>
    <lastmod>2026-03-06T10:30:00.000Z</lastmod>
  </url>
  <!-- Additional <url> entries (up to 50,000) -->
</urlset>
```
|
||||
|
||||
**Field Descriptions**:
|
||||
- `<urlset>`: Root element with sitemap namespace
|
||||
- `<url>`: Individual URL entry (0 to 50,000 entries)
|
||||
- `<loc>`: Absolute URL to document using RESTful format `/documents/{documentId}`
|
||||
- `<lastmod>`: ISO 8601 timestamp of last document modification
|
||||
|
||||
**Constraints**:
|
||||
- Maximum 50,000 `<url>` entries (sitemap protocol limit per spec.md FR-015)
|
||||
- Maximum 50MB uncompressed (protocol limit, not enforced)
|
||||
- All `<loc>` URLs use same base URL (configured via `BASE_URL` env var)
|
||||
- All `<loc>` URLs use RESTful path format: `/documents/{documentId}`
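
The constraints above can be sketched in a few lines. This is a minimal illustration, not the adapter's actual code: the function name `build_sitemap`, the `(document_id, modified)` input shape, and the choice to truncate silently at 50,000 entries are all assumptions made for the example.

```python
from datetime import datetime, timezone
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # sitemap protocol limit on <url> entries


def build_sitemap(base_url, documents):
    """Return sitemap XML (bytes) for an iterable of (document_id, modified) pairs.

    Hypothetical helper: truncating at MAX_URLS is an assumption of this sketch.
    """
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for document_id, modified in list(documents)[:MAX_URLS]:
        url = ET.SubElement(urlset, "url")
        # Every <loc> shares the same base URL and the /documents/{documentId} path.
        ET.SubElement(url, "loc").text = f"{base_url.rstrip('/')}/documents/{document_id}"
        # Millisecond-precision UTC timestamp, e.g. 2026-03-06T10:30:00.000Z
        stamp = modified.astimezone(timezone.utc).isoformat(timespec="milliseconds")
        ET.SubElement(url, "lastmod").text = stamp.replace("+00:00", "Z")
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)
```

A caller would pass the configured `BASE_URL` and the document list obtained from Google Drive; `modified` is expected to be a timezone-aware `datetime`.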
|
||||
|
||||
**Example Response**:
|
||||
```http
HTTP/1.1 200 OK
Content-Type: application/xml
Content-Length: 4582

```
|
||||
|
||||
**Performance Targets** (from spec.md success criteria):
|
||||
- Response time: < 5 seconds for up to 10,000 documents
|
||||
- Memory usage: < 256MB under normal load
|
||||
- Concurrent requests: Support 10 concurrent requests without degradation
|
||||
|
||||
---
|
||||
|
||||
### Not Found Response (404)
|
||||
|
||||
**Status Code**: `404 Not Found`
|
||||
|
||||
**Headers**: None
|
||||
|
||||
**Body**: Empty (per spec.md clarification: "HTTP status code only, no error response body")
|
||||
|
||||
**When Returned**:
|
||||
- Any path other than `/sitemap.xml` (per spec.md FR-007)
|
||||
|
||||
**Example Response**:
|
||||
```http
|
||||
HTTP/1.1 404 Not Found
|
||||
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Unauthorized Response (401)
|
||||
|
||||
**Status Code**: `401 Unauthorized`
|
||||
|
||||
**Headers**: None
|
||||
|
||||
**Body**: Empty (per spec.md clarification: "HTTP status code only, no error response body")
|
||||
|
||||
**When Returned**:
|
||||
- Service Account JWT authentication failed (per spec.md FR-010)
|
||||
- OAuth token refresh failed
|
||||
- Invalid Service Account credentials
|
||||
|
||||
**Example Response**:
|
||||
```http
|
||||
HTTP/1.1 401 Unauthorized
|
||||
|
||||
```
|
||||
|
||||
**Client Action**: Check Service Account credentials in `GOOGLE_SERVICE_ACCOUNT_KEY` environment variable
|
||||
|
||||
---
|
||||
|
||||
### Rate Limited Response (429)
|
||||
|
||||
**Status Code**: `429 Too Many Requests`
|
||||
|
||||
**Headers**:
|
||||
- `Retry-After: {seconds}` (integer, seconds until retry allowed)
|
||||
|
||||
**Body**: Empty (per spec.md clarification: "HTTP status code only, no error response body")
|
||||
|
||||
**When Returned**:
|
||||
- Google Drive API rate limit exceeded (per spec.md FR-013)
|
||||
- Quota exhausted for Service Account
|
||||
|
||||
**Example Response**:
|
||||
```http
|
||||
HTTP/1.1 429 Too Many Requests
|
||||
Retry-After: 60
|
||||
|
||||
```
|
||||
|
||||
**Client Action**: Wait `Retry-After` seconds before retrying request
|
||||
|
||||
**Retry-After Values**:
|
||||
- Derived from Google Drive API `Retry-After` header if available
|
||||
- Default: 60 seconds if not specified by Drive API
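
As a rough sketch of the rule above, the value could be derived as follows. The function name `retry_after_seconds` and the dictionary input are hypothetical; falling back to the default when the upstream header carries an HTTP-date rather than a number of seconds is an assumption of this example, not something the contract states.

```python
DEFAULT_RETRY_AFTER = 60  # seconds, the default stated in this contract


def retry_after_seconds(upstream_headers):
    """Pick the Retry-After value to send with a 429 response."""
    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in upstream_headers.items()}
    raw = lowered.get("retry-after", "").strip()
    # Only the delay-seconds form is handled here; an HTTP-date value
    # falls back to the default (an assumption of this sketch).
    if raw.isascii() and raw.isdigit():
        return int(raw)
    return DEFAULT_RETRY_AFTER
```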
|
||||
|
||||
---
|
||||
|
||||
### Internal Server Error (500)
|
||||
|
||||
**Status Code**: `500 Internal Server Error`
|
||||
|
||||
**Headers**: None
|
||||
|
||||
**Body**: Empty (per spec.md clarification: "HTTP status code only, no error response body")
|
||||
|
||||
**When Returned**:
|
||||
- Unexpected server error (per spec.md FR-008)
|
||||
- Configuration error (missing environment variables)
|
||||
- XML generation failure
|
||||
|
||||
**Example Response**:
|
||||
```http
|
||||
HTTP/1.1 500 Internal Server Error
|
||||
|
||||
```
|
||||
|
||||
**Client Action**: Report error to adapter administrator
|
||||
|
||||
**Server Logging**: All 500 errors logged with stack trace to stderr (per spec.md FR-012)
|
||||
|
||||
---
|
||||
|
||||
### Service Unavailable Response (503)
|
||||
|
||||
**Status Code**: `503 Service Unavailable`
|
||||
|
||||
**Headers**: None
|
||||
|
||||
**Body**: Empty (per spec.md clarification: "HTTP status code only, no error response body")
|
||||
|
||||
**When Returned**:
|
||||
- Google Drive API unavailable (per spec.md FR-017)
|
||||
- Drive API returns 503 status (no retries per spec clarification)
|
||||
|
||||
**Example Response**:
|
||||
```http
|
||||
HTTP/1.1 503 Service Unavailable
|
||||
|
||||
```
|
||||
|
||||
**Client Action**: Retry request later (Drive API temporarily unavailable)
|
||||
|
||||
**Retry Behavior**: Adapter does NOT retry Drive API 503 errors; immediately returns 503 to client (per spec.md FR-017 clarification)
|
||||
|
||||
---
|
||||
|
||||
## Error Handling Specification
|
||||
|
||||
### Error Response Format
|
||||
|
||||
**All error responses follow same pattern**:
|
||||
- Status code indicates error type
|
||||
- No response body (per spec.md clarification)
|
||||
- Minimal headers (only `Retry-After` for 429)
|
||||
|
||||
**Rationale**: Simplicity, consistency, fail-fast approach
|
||||
|
||||
### Error Status Code Matrix
|
||||
|
||||
| Error Condition | Status Code | Headers | Body | Retry? |
|----------------|-------------|---------|------|--------|
| Authentication failed | 401 | None | Empty | No (fix credentials) |
| Rate limit exceeded | 429 | `Retry-After` | Empty | Yes (after delay) |
| Drive API unavailable | 503 | None | Empty | Yes (later) |
| Internal error | 500 | None | Empty | No (report to admin) |
| Path not found | 404 | None | Empty | No |
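
From a client's point of view, the matrix reduces to a small retry policy. The sketch below is illustrative only: `retry_delay` is a hypothetical name, and the 60-second fallback used for 503 (which carries no `Retry-After` header) is an assumption, since the contract only says to retry "later".

```python
RETRYABLE = {429, 503}  # the two rows marked "Yes" in the matrix


def retry_delay(status, headers, fallback_seconds=60):
    """Seconds to wait before retrying, or None when a retry will not help."""
    if status not in RETRYABLE:
        return None  # 401, 404 and 500 need a fix, not a retry
    value = headers.get("Retry-After", "")
    # Only 429 responses are documented to carry Retry-After.
    if status == 429 and value.isascii() and value.isdigit():
        return int(value)
    return fallback_seconds  # assumed delay when no usable header is present
```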
|
||||
|
||||
---
|
||||
|
||||
## Logging Specification
|
||||
|
||||
### Request Logging (stdout)
|
||||
|
||||
**All requests logged with**:
|
||||
- Timestamp (ISO 8601)
|
||||
- HTTP method and path
|
||||
- Response status code
|
||||
- Response time (milliseconds)
|
||||
|
||||
**Example**:
|
||||
```
[2026-03-07T14:30:15.456Z] GET /sitemap.xml -> 200 (1234ms)
[2026-03-07T14:30:20.789Z] GET /sitemap.xml -> 429 (234ms)
[2026-03-07T14:30:25.012Z] GET /invalid.xml -> 404 (1ms)
```
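
A line in this format could be produced by a helper along these lines. The name `format_request_log` and its parameters are hypothetical, chosen only to show how the four logged fields map onto the output.

```python
from datetime import datetime, timezone


def format_request_log(when, method, path, status, elapsed_ms):
    """Render one request log line: timestamp, method, path, status, duration."""
    # ISO 8601 in UTC with millisecond precision and a trailing "Z".
    stamp = when.astimezone(timezone.utc).isoformat(timespec="milliseconds")
    stamp = stamp.replace("+00:00", "Z")
    return f"[{stamp}] {method} {path} -> {status} ({elapsed_ms}ms)"
```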
|
||||
|
||||
### Error Logging (stderr)
|
||||
|
||||
**All errors logged with**:
|
||||
- Timestamp (ISO 8601)
|
||||
- Request ID (for correlation)
|
||||
- Error message
|
||||
- Stack trace (for 500 errors)
|
||||
|
||||
**Example**:
|
||||
```
[2026-03-07T14:30:20.789Z] [ERROR] Rate limit exceeded: Drive API quota exhausted
[2026-03-07T14:30:25.012Z] [ERROR] Authentication failed: Invalid Service Account key
[2026-03-07T14:30:30.345Z] [ERROR] Drive API unavailable: Connection timeout
```
|
||||
|
||||
---
|
||||
|
||||
## Contract Tests
|
||||
|
||||
### Test Scenarios
|
||||
|
||||
1. **Successful sitemap generation**
|
||||
- Request: `GET /sitemap.xml`
|
||||
- Expected: 200 status, valid XML, `Content-Type: application/xml`
|
||||
|
||||
2. **Not found for other paths**
|
||||
- Request: `GET /invalid.xml`
|
||||
- Expected: 404 status, empty body
|
||||
|
||||
3. **Rate limiting**
|
||||
- Simulate Drive API 429 response
|
||||
- Expected: 429 status, `Retry-After` header, empty body
|
||||
|
||||
4. **Authentication failure**
|
||||
- Simulate invalid credentials
|
||||
- Expected: 401 status, empty body
|
||||
|
||||
5. **Service unavailable**
|
||||
- Simulate Drive API 503 response
|
||||
- Expected: 503 status, empty body (no retries)
|
||||
|
||||
6. **XML schema validation**
|
||||
- Request: `GET /sitemap.xml`
|
||||
- Validate XML against sitemap protocol schema
|
||||
|
||||
7. **URL format validation**
|
||||
- Request: `GET /sitemap.xml`
|
||||
- Verify all `<loc>` URLs use `/documents/{documentId}` format
|
||||
|
||||
### Test Assertions
|
||||
|
||||
**XML Schema Validation**:
|
||||
- Root element: `<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">`
|
||||
- Each `<url>` has required `<loc>` child
|
||||
- Each `<lastmod>` is valid ISO 8601 timestamp
|
||||
- Maximum 50,000 `<url>` entries
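
These assertions might be expressed as a small checker like the one below. It is a sketch under stated assumptions: `check_sitemap` is a hypothetical name, and `datetime.fromisoformat` accepts only a subset of ISO 8601, so a real contract test may want a stricter or broader parser.

```python
from datetime import datetime
from xml.etree import ElementTree as ET

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def check_sitemap(xml_bytes):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    root = ET.fromstring(xml_bytes)
    if root.tag != f"{NS}urlset":
        return [f"unexpected root element: {root.tag}"]
    urls = root.findall(f"{NS}url")
    if len(urls) > 50_000:
        problems.append("more than 50,000 <url> entries")
    for index, url in enumerate(urls):
        if not (url.findtext(f"{NS}loc") or "").strip():
            problems.append(f"entry {index}: missing <loc>")
        lastmod = url.findtext(f"{NS}lastmod")
        if lastmod is not None:
            try:
                # Map the "Z" suffix to an explicit offset before parsing.
                datetime.fromisoformat(lastmod.replace("Z", "+00:00"))
            except ValueError:
                problems.append(f"entry {index}: invalid <lastmod> {lastmod!r}")
    return problems
```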
|
||||
|
||||
**URL Format Validation**:
|
||||
- All `<loc>` URLs are absolute (start with http:// or https://)
|
||||
- All `<loc>` URLs use RESTful format: `{baseUrl}/documents/{documentId}`
|
||||
- Document IDs match regex: `^[a-zA-Z0-9_-]+$`
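
The URL checks could be sketched as a single predicate. The function name `is_valid_loc` is hypothetical; the regular expression is the one quoted above, applied with `fullmatch` so that a trailing newline is rejected.

```python
import re
from urllib.parse import urlsplit

DOCUMENT_ID = re.compile(r"^[a-zA-Z0-9_-]+$")


def is_valid_loc(loc, base_url):
    """True when loc is absolute and has the form {baseUrl}/documents/{documentId}."""
    parts = urlsplit(loc)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return False  # not an absolute http(s) URL
    prefix = base_url.rstrip("/") + "/documents/"
    if not loc.startswith(prefix):
        return False
    # Whatever follows the prefix must be a bare document ID.
    return DOCUMENT_ID.fullmatch(loc[len(prefix):]) is not None
```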
|
||||
|
||||
**Header Validation**:
- 200 responses include `Content-Type: application/xml`
- 429 responses include `Retry-After` header with integer value
- All error responses have empty body

---

## Configuration

### Environment Variables

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `GOOGLE_SERVICE_ACCOUNT_KEY` | Yes | None | Inline JSON of Service Account key file |
| `BASE_URL` | Yes | None | Base URL for sitemap links (e.g., `https://adapter.example.com`) |
| `PORT` | No | 3000 | HTTP server port |

**Example .env**:
```bash
GOOGLE_SERVICE_ACCOUNT_KEY='{"type":"service_account","project_id":"...","private_key":"-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n","client_email":"...@developer.gserviceaccount.com",...}'
BASE_URL=https://adapter.example.com
PORT=3000
```

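
One way to read these variables at startup is sketched below. It is an illustration under assumptions, not the adapter's actual loader: `AdapterConfig` and `loadConfig` are hypothetical names, and stripping a trailing slash from `BASE_URL` is a convenience this document does not require.

```typescript
// Hypothetical config shape; field names are illustrative, not from the spec.
interface AdapterConfig {
  serviceAccountKey: Record<string, unknown>;
  baseUrl: string;
  port: number;
}

function loadConfig(env: Record<string, string | undefined>): AdapterConfig {
  const rawKey = env["GOOGLE_SERVICE_ACCOUNT_KEY"];
  const baseUrl = env["BASE_URL"];
  if (!rawKey) throw new Error("GOOGLE_SERVICE_ACCOUNT_KEY is required");
  if (!baseUrl) throw new Error("BASE_URL is required");

  // The key is inline JSON. Its contents are never echoed in error messages.
  let serviceAccountKey: Record<string, unknown>;
  try {
    serviceAccountKey = JSON.parse(rawKey) as Record<string, unknown>;
  } catch {
    throw new Error("GOOGLE_SERVICE_ACCOUNT_KEY is not valid JSON");
  }

  // PORT is optional and defaults to 3000.
  const rawPort = env["PORT"];
  const port = rawPort === undefined ? 3000 : Number(rawPort);
  if (!Number.isInteger(port) || port <= 0 || port > 65535) {
    throw new Error("PORT must be an integer between 1 and 65535");
  }

  // Assumption: drop trailing slashes so `${baseUrl}/documents/...` has one slash.
  return { serviceAccountKey, baseUrl: baseUrl.replace(/\/+$/, ""), port };
}
```

In a Node.js process this would typically be called once as `loadConfig(process.env)`, failing fast before the HTTP server starts listening.
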
---

## Compatibility

### Sitemap Protocol Compliance

**Protocol**: https://www.sitemaps.org/protocol.html

**Compliance**:
- ✅ Valid XML with namespace
- ✅ `<loc>` with absolute URLs
- ✅ `<lastmod>` with W3C Datetime format (ISO 8601)
- ✅ Maximum 50,000 URLs
- ✅ Maximum 50MB uncompressed size

**Optional Elements Not Used**:
- `<changefreq>`: Not applicable (no historical change data)
- `<priority>`: Not applicable (all documents equal priority)

### HTTP Compliance

**HTTP Version**: HTTP/1.1

**Methods Supported**: `GET` only

**Status Codes Used**: 200, 401, 404, 429, 500, 503

**Headers Used**:
- Response: `Content-Type`, `Content-Length`, `Retry-After`
- Request: Standard HTTP headers accepted, none required

---

## Security Considerations

### Authentication
- Service Account credentials secured in environment variable (not in code or config files)
- Credentials never logged or exposed in error messages
- Read-only Drive scope (`drive.readonly`) - no write permissions

### Rate Limiting
- Transparent propagation of Drive API rate limits to client
- No internal rate limiting (rely on Drive API limits)

### Input Validation
- Path validation: Only `/sitemap.xml` accepted
- Method validation: Only `GET` accepted
- No query parameters processed (rejection not required, just ignored)

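
The three input validation rules can be expressed as a single routing decision. This is a hedged sketch: `routeRequest` and its return values are hypothetical, and answering non-`GET` methods with 404 is an assumption taken from the 404 examples in the sitemap XML contract elsewhere in this commit.

```typescript
// Hypothetical routing decision made before any Drive API call.
function routeRequest(method: string, rawUrl: string): "sitemap" | "not_found" {
  // Query parameters are ignored rather than rejected, so strip them first.
  const queryStart = rawUrl.indexOf("?");
  const path = queryStart === -1 ? rawUrl : rawUrl.slice(0, queryStart);

  // Only GET /sitemap.xml is served; everything else is treated as not found.
  return method === "GET" && path === "/sitemap.xml" ? "sitemap" : "not_found";
}
```

Keeping the decision in a pure function makes the path and method rules testable without starting an HTTP server.
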
### Output Sanitization
- All URLs XML-escaped to prevent injection
- All timestamps XML-escaped (though ISO 8601 format doesn't contain XML special chars)

---

## Versioning

**Current Version**: 1.0.0 (initial implementation)

**Future Changes**:
- Breaking changes (new required parameters): Major version bump (2.0.0)
- Backward-compatible additions (query parameters): Minor version bump (1.1.0)
- Bug fixes: Patch version bump (1.0.1)

**Deprecation Policy**:
- Breaking changes include migration guide
- Deprecated features supported for at least one minor version

---

## References

- Feature Specification: `/specs/001-drive-proxy-adapter/spec.md`
- Data Model: `/specs/001-drive-proxy-adapter/data-model.md`
- Research Document: `/specs/001-drive-proxy-adapter/research.md`
- Sitemap Protocol: https://www.sitemaps.org/protocol.html
- Google Drive API v3: https://developers.google.com/drive/api/v3/reference

382
specs/001-sitemap/contracts/sitemap-xml-schema.md
Normal file
@@ -0,0 +1,382 @@
# API Contract: Sitemap XML Endpoint

**Feature**: 001-drive-proxy-adapter
**Contract Type**: HTTP API
**Endpoint**: `/sitemap.xml`
**Version**: 1.0.0
**Date**: 2026-03-07

---

## Endpoint Specification

### `GET /sitemap.xml`

Generate an XML sitemap of all accessible Google Drive documents.

---

## Request

### HTTP Method
`GET`

### URL
`/sitemap.xml`

### Query Parameters
None

### Request Headers
None required

### Request Body
None (GET request)

---

## Response

### Success Response (200 OK)

**Status Code**: `200 OK`

**Response Headers**:
```
Content-Type: application/xml; charset=utf-8
Content-Length: {size_in_bytes}
```

**Response Body** (XML):
```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://example.com/documents/{documentId1}</loc>
    <lastmod>2026-03-07</lastmod>
  </url>
  <url>
    <loc>http://example.com/documents/{documentId2}</loc>
    <lastmod>2026-03-06</lastmod>
  </url>
  <!-- ... up to 50,000 entries -->
</urlset>
```

**XML Schema Requirements**:
- Root element: `<urlset>` with namespace `xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"`
- Each document: `<url>` element containing:
  - `<loc>` (REQUIRED): Absolute URL in format `{baseUrl}/documents/{documentId}`
    - Must be URL-encoded
    - Must escape XML special characters: `&` → `&amp;`, `<` → `&lt;`, `>` → `&gt;`, `"` → `&quot;`, `'` → `&apos;`
  - `<lastmod>` (OPTIONAL): ISO 8601 date format
    - Format: `YYYY-MM-DD` or `YYYY-MM-DDTHH:MM:SS+00:00`
    - Omitted if Drive API provides no `modifiedTime`

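
The schema requirements above can be sketched as a small rendering function. This is an illustration only: `DriveDocument`, `escapeXml`, and `buildSitemap` are hypothetical names, and the sketch passes `modifiedTime` through unchanged, so converting it into one of the two listed `<lastmod>` formats is assumed to happen elsewhere.

```typescript
// Hypothetical input shape: only the fields this contract mentions.
interface DriveDocument {
  id: string;
  modifiedTime?: string; // absent when the Drive API provides none
}

// Escape the five XML special characters listed in the requirements.
// The ampersand must be replaced first so later entities are not double-escaped.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

function buildSitemap(baseUrl: string, documents: DriveDocument[]): string {
  const entries = documents.map((doc) => {
    // URL-encode the document ID first, then XML-escape the whole URL.
    const loc = escapeXml(`${baseUrl}/documents/${encodeURIComponent(doc.id)}`);
    // <lastmod> is omitted entirely when there is no modifiedTime.
    const lastmod = doc.modifiedTime
      ? `\n    <lastmod>${escapeXml(doc.modifiedTime)}</lastmod>`
      : "";
    return `  <url>\n    <loc>${loc}</loc>${lastmod}\n  </url>`;
  });

  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    "</urlset>",
  ].join("\n");
}
```

Called with an empty array, the function returns only the XML declaration and an empty `<urlset>` element, which is the shape required for a Drive with no documents.
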
**Empty Drive Response** (0 documents):
```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
</urlset>
```

**Constraints**:
- Maximum 50,000 `<url>` entries (sitemap protocol limit)
- If >50,000 documents exist, return 413 error instead

---

### Error Responses

#### 404 Not Found

**Trigger**: Request to any endpoint other than `/sitemap.xml`

**Status Code**: `404 Not Found`

**Response Headers**: None

**Response Body**: Empty (no content)

**Example**:
```
GET /documents/abc123 → 404 Not Found (empty body)
GET /api/sitemap → 404 Not Found (empty body)
POST /sitemap.xml → 404 Not Found (empty body)
```

---

#### 413 Payload Too Large

**Trigger**: Google Drive contains more than 50,000 documents

**Status Code**: `413 Payload Too Large`

**Response Headers**: None

**Response Body**: Empty (no content)

**Rationale**: Sitemap protocol limits sitemaps to 50,000 URLs. This error prevents oversized sitemap generation.

---

#### 429 Too Many Requests

**Trigger**: Google Drive API returns rate limit error

**Status Code**: `429 Too Many Requests`

**Response Headers**:
```
Retry-After: {seconds}
```

**Response Body**: Empty (no content)

**Example**:
```
HTTP/1.1 429 Too Many Requests
Retry-After: 60

(empty body)
```

**Rationale**: Client should retry after the specified number of seconds.

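
On the client side, honoring this header amounts to turning its value into a wait time. The sketch below assumes the integer-seconds form this contract specifies; `retryDelayMs` is a hypothetical name and the 60-second fallback is an arbitrary illustrative default.

```typescript
// Hypothetical client-side helper: converts a Retry-After header value
// (integer seconds) into a delay in milliseconds.
function retryDelayMs(retryAfter: string | null, fallbackSeconds = 60): number {
  const value = (retryAfter ?? "").trim();
  // Use the fallback when the header is missing or is not a plain integer.
  const seconds = /^\d+$/.test(value) ? Number(value) : fallbackSeconds;
  return seconds * 1000;
}
```

A client would typically pass the result to a timer before re-issuing `GET /sitemap.xml`.
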
---

#### 401 Unauthorized

**Trigger**: Service Account token refresh failed

**Status Code**: `401 Unauthorized`

**Response Headers**: None

**Response Body**: Empty (no content)

**Rationale**: Authentication failed. Check Service Account credentials configuration.

---

#### 503 Service Unavailable

**Trigger**: Google Drive API returns 503 error

**Status Code**: `503 Service Unavailable`

**Response Headers**: None

**Response Body**: Empty (no content)

**Behavior**: No retries - immediately pass through 503 to client per specification.

---

#### 500 Internal Server Error

**Trigger**: Unexpected error during sitemap generation

**Status Code**: `500 Internal Server Error`

**Response Headers**: None

**Response Body**: Empty (no content)

**Rationale**: Unexpected server error. Check logs for details.

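
Taken together, the error responses above form a fixed mapping from failure cause to status code. The following sketch shows one way to encode it; the `Failure` union, `ErrorResponse`, and `toErrorResponse` are hypothetical names, and how the adapter actually classifies Drive API errors is not specified here.

```typescript
// Hypothetical classification of what went wrong during sitemap generation.
type Failure =
  | { kind: "too_many_documents" }
  | { kind: "rate_limited"; retryAfterSeconds: number }
  | { kind: "auth_failed" }
  | { kind: "drive_unavailable" }
  | { kind: "unexpected" };

interface ErrorResponse {
  status: number;
  headers: Record<string, string>;
  body: ""; // every error response has an empty body
}

function toErrorResponse(failure: Failure): ErrorResponse {
  switch (failure.kind) {
    case "too_many_documents":
      return { status: 413, headers: {}, body: "" };
    case "rate_limited":
      // 429 is the only error response that carries a header.
      return {
        status: 429,
        headers: { "Retry-After": String(failure.retryAfterSeconds) },
        body: "",
      };
    case "auth_failed":
      return { status: 401, headers: {}, body: "" };
    case "drive_unavailable":
      // Passed straight through to the client, with no retries.
      return { status: 503, headers: {}, body: "" };
    case "unexpected":
      return { status: 500, headers: {}, body: "" };
  }
}
```

Using a discriminated union lets the compiler flag a missing case if a new failure kind is added later.
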
---

## Examples

### Example 1: Successful Sitemap (3 documents)

**Request**:
```http
GET /sitemap.xml HTTP/1.1
Host: example.com
```

**Response**:
```http
HTTP/1.1 200 OK
Content-Type: application/xml; charset=utf-8
Content-Length: 512

```

---

### Example 2: Empty Drive

**Request**:
```http
GET /sitemap.xml HTTP/1.1
Host: example.com
```

**Response**:
```http
HTTP/1.1 200 OK
Content-Type: application/xml; charset=utf-8
Content-Length: 123

```

---

### Example 3: Rate Limit Exceeded

**Request**:
```http
GET /sitemap.xml HTTP/1.1
Host: example.com
```

**Response**:
```http
HTTP/1.1 429 Too Many Requests
Retry-After: 120

```

---

### Example 4: Too Many Documents

**Request**:
```http
GET /sitemap.xml HTTP/1.1
Host: example.com
```

**Response**:
```http
HTTP/1.1 413 Payload Too Large

```

---

### Example 5: Invalid Endpoint

**Request**:
```http
GET /documents/abc123 HTTP/1.1
Host: example.com
```

**Response**:
```http
HTTP/1.1 404 Not Found

```

---

## Contract Validation

### XML Schema Validation

The sitemap XML MUST validate against the sitemap protocol schema:
- **Namespace**: `http://www.sitemaps.org/schemas/sitemap/0.9`
- **Root element**: `<urlset>`
- **Child elements**: Zero or more `<url>` elements
- **URL elements**: Each contains `<loc>` (required) and `<lastmod>` (optional)

**Validation Tools**:
- XML parser (ensure well-formed XML)
- Sitemap validator: [https://www.xml-sitemaps.com/validate-xml-sitemap.html](https://www.xml-sitemaps.com/validate-xml-sitemap.html)
- XSD schema validation against official sitemap schema

---

### Contract Testing Requirements

All contract tests MUST verify:

1. **Success Path**:
   - Response status 200
   - Content-Type header is `application/xml; charset=utf-8`
   - Response body is valid XML
   - XML contains correct namespace
   - All `<loc>` URLs are absolute and properly formatted
   - All `<loc>` URLs follow pattern: `{baseUrl}/documents/{documentId}`
   - All `<lastmod>` dates are valid ISO 8601 format (if present)

2. **Error Handling**:
   - Invalid endpoints return 404 with empty body
   - >50k documents returns 413 with empty body
   - Rate limiting returns 429 with `Retry-After` header and empty body
   - Drive API 503 returns 503 with empty body (no retries)
   - All error responses have no `Content-Type` header
   - All error responses have empty body

3. **Edge Cases**:
   - Empty Drive (0 documents) returns valid sitemap with no `<url>` entries
   - Documents without `modifiedTime` omit `<lastmod>` tag
   - Special characters in document IDs are properly URL-encoded
   - XML special characters in URLs are properly escaped

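
For the `<lastmod>` check in the success path, a test helper along these lines may be enough. It is a sketch under assumptions: `isValidLastmod` is a hypothetical name, it accepts only the two formats this contract lists (with any numeric UTC offset), and it does not range-check the time-of-day fields.

```typescript
// Accepts YYYY-MM-DD or YYYY-MM-DDTHH:MM:SS followed by a numeric offset such as +00:00.
const LASTMOD_PATTERN = /^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}:\d{2}[+-]\d{2}:\d{2})?$/;

function isValidLastmod(value: string): boolean {
  if (!LASTMOD_PATTERN.test(value)) return false;

  // Reject impossible calendar dates such as 2026-02-30 by round-tripping
  // the date part through Date.UTC and comparing the components.
  const [year, month, day] = value.slice(0, 10).split("-").map(Number);
  const date = new Date(Date.UTC(year, month - 1, day));
  return (
    date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day
  );
}
```

A contract test would apply this to each `<lastmod>` value that is present and skip `<url>` entries that omit the element.
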
---

## Breaking Changes

Changes that constitute breaking changes (require MAJOR version bump):

1. Changing URL format from `/documents/{id}` to different format
2. Changing XML namespace or root element structure
3. Removing `<lastmod>` field entirely
4. Changing error response status codes
5. Adding required query parameters
6. Changing response Content-Type

---

## References

- [Sitemap Protocol Specification](https://www.sitemaps.org/protocol.html)
- [Google Sitemap Guidelines](https://developers.google.com/search/docs/crawling-indexing/sitemaps/build-sitemap)
- [XML Specification](https://www.w3.org/TR/xml/)
- [ISO 8601 Date Format](https://en.wikipedia.org/wiki/ISO_8601)

---

## Version History

| Version | Date | Changes |
|---------|------|---------|
| 1.0.0 | 2026-03-07 | Initial contract specification |

---

## Summary

This contract defines the complete API specification for the `/sitemap.xml` endpoint, including:

1. **Request/response formats** with examples
2. **Error handling** with all status codes (404, 413, 429, 401, 503, 500)
3. **XML schema requirements** for sitemap format
4. **Validation criteria** for contract testing
5. **Breaking change policy** for version management

All error responses follow the spec requirement: **status code only, no response body**. The 429 response additionally includes a `Retry-After` header.