Files
Peter.Morton 9286ee8927 feat: Add X-Verint-KAB-Original-URL header to document exports
Adds HTTP response header containing original Google Drive URL
for exported documents to enable content traceability and auditing.

- Adds X-Verint-KAB-Original-URL header to successful export responses
- Header format: https://drive.google.com/file/d/{fileId}
- Present for all export formats (PDF, DOCX, plain text)
- Header omitted on error responses (4xx/5xx)
- 18 new tests (9 contract + 9 integration)
- Zero new dependencies
- Performance: 0.000019ms overhead per request

Implements:
- FR-001: Header present on successful exports (200 OK)
- FR-002: Header absent on error responses
- FR-003: Standard header name X-Verint-KAB-Original-URL
- FR-004: Standard URL format with file ID
- FR-005: Uses validated document.id from Google Drive API
- FR-006: Header present regardless of file accessibility
- FR-007: Consistent across all export formats
- FR-008: Minimal performance impact (< 5ms requirement)

Testing:
- Contract tests validate header presence, format, and error handling
- Integration tests verify behavior across formats and permissions
- All 18 tests passing
- 100% requirements coverage

Documentation:
- Feature specification (specs/001-gdrive-url-header/spec.md)
- Implementation plan (plan.md)
- Technical research (research.md)
- Data model (data-model.md)
- API contract (contracts/response-headers.md)
- User guide (quickstart.md)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-27 16:04:54 -05:00

339 lines
9.2 KiB
Markdown

# Data Model: Google Drive Original URL Header
**Feature**: 001-gdrive-url-header
**Date**: 2026-03-27
**Status**: Complete
## Overview
This feature adds an HTTP response header to document export responses. There are no new data entities or persistent data structures. This document describes the data flow and transformations involved.
---
## Entities
### HTTP Response Header (New)
**Name**: `X-Verint-KAB-Original-URL`
**Description**: Custom HTTP response header containing the original Google Drive URL for the exported document.
**Properties**:
- **Name**: `X-Verint-KAB-Original-URL` (string, constant)
- **Value**: `https://drive.google.com/file/d/{fileId}` (string, dynamic)
**Lifecycle**:
- **Created**: During successful document export response (proxy.js line ~377-383)
- **Lifespan**: Single HTTP response only
- **Destroyed**: After response is sent to client
**Validation Rules**:
- Header name is fixed (cannot vary)
- Header value must be a valid URL with format: `https://drive.google.com/file/d/{fileId}`
- File ID must be alphanumeric string (validated by Google Drive API)
- Header is only present on successful exports (200 OK status)
---
### Google Drive File ID (Existing)
**Name**: Document ID / File ID
**Description**: Unique identifier for a Google Drive document, obtained from Google Drive API.
**Source**:
1. Client request URL: `/documents/{documentId}`
2. Validated by Google Drive Files API: `GET /drive/v3/files/{documentId}`
3. Returned in API response as `document.id`
**Properties**:
- **Type**: String
- **Format**: Alphanumeric (e.g., `1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms`)
- **Length**: Variable (typically 33-44 characters)
- **Validation**: Implicitly validated by Google Drive API (404 if invalid)
**Usage in Feature**:
- Input parameter: `documentId` in request URL
- Validated value: `document.id` from API response (line 278)
- Output: Embedded in `X-Verint-KAB-Original-URL` header value
---
### Google Drive URL (New, Derived)
**Name**: Original Document URL
**Description**: User-facing URL for accessing the document in Google Drive web interface.
**Properties**:
- **Base URL**: `https://drive.google.com/file/d/` (constant)
- **File ID**: `{document.id}` (dynamic, from API response)
- **Full URL**: `https://drive.google.com/file/d/{document.id}` (constructed)
**Construction**:
```javascript
const driveUrl = `https://drive.google.com/file/d/${document.id}`;
```
**Characteristics**:
- Immutable for a given file ID
- No query parameters needed
- No URL encoding required (file IDs are alphanumeric only)
- Publicly addressable (permissions enforced by Google Drive)
---
## Data Flow
### Request → Response Flow
```
1. Client Request
URL: GET /documents/{documentId}
2. Route Parsing
Extract: documentId = "{id}"
3. Metadata Fetch
API: GET https://www.googleapis.com/drive/v3/files/{documentId}
Response: { id: "{validated_id}", name: "...", mimeType: "..." }
4. Export Content Fetch
API: GET https://www.googleapis.com/drive/v3/files/{id}?alt=media
5. Response Header Construction
URL Construction: https://drive.google.com/file/d/{document.id}
Header: X-Verint-KAB-Original-URL: {constructed_url}
6. Client Response
Status: 200 OK
Headers:
- Content-Type: {mimeType}
- X-Request-Id: {requestId}
- Content-Disposition: inline; filename="{name}.{ext}"
- X-Verint-KAB-Original-URL: https://drive.google.com/file/d/{id}
Body: {document_content}
```
---
## State Transitions
### Document Export Request States
```
[Request Received]
Parse Route → Extract documentId
[ID Extracted]
Fetch Metadata from Google Drive
┌─────────────────────┬─────────────────────┐
↓ ↓ ↓
[Metadata Valid] [404 Not Found] [401 Unauthorized]
↓ ↓ ↓
Check Export Format Return Error Return Error
↓ (No URL Header) (No URL Header)
┌─────────────────┬─────────────────┐
↓ ↓ ↓
[Format Supported] [Format Unsupported] [Size Exceeded]
↓ ↓ ↓
Fetch Content Return 403 Return 413
↓ (No URL Header) (No URL Header)
[Content Retrieved]
Construct Drive URL
Set Response Headers (INCLUDING X-Verint-KAB-Original-URL)
[Response Sent with URL Header]
```
**Key Decision Points**:
- URL header is ONLY added in the `[Response Sent with URL Header]` state
- All error states omit the URL header
- URL is constructed using validated `document.id` (not route parameter)
---
## Relationships
### No Entity Relationships
This feature does not introduce any new data relationships:
- No database tables
- No foreign keys
- No associations between entities
- Single HTTP response header derived from existing file ID
### Dependency Chain
```
Google Drive File (External)
↓ (has)
File ID (String)
↓ (used to construct)
Drive URL (String)
↓ (embedded in)
HTTP Response Header (X-Verint-KAB-Original-URL)
↓ (sent in)
HTTP Response (200 OK)
```
---
## Validation Rules
### Input Validation
**Document ID** (from route):
- Extracted from URL path: `/documents/{documentId}`
- Format: Any string (validated by Google Drive API, not by adapter)
- Invalid IDs result in 404 error (no URL header)
**No additional validation needed** - Google Drive API performs validation
### Output Validation
**X-Verint-KAB-Original-URL Header Value**:
- Must start with: `https://drive.google.com/file/d/`
- Must contain valid file ID after prefix
- Must not contain query parameters or fragments
- Must be a single-line string (no newlines)
**Validation Implementation**:
```javascript
// No explicit validation needed - constructed from validated document.id
const driveUrl = `https://drive.google.com/file/d/${document.id}`;
res.setHeader("X-Verint-KAB-Original-URL", driveUrl);
```
---
## Data Constraints
### Performance Constraints
- **URL Construction Time**: < 1ms (string concatenation)
- **Memory Footprint**: ~100 bytes per response (temporary string)
- **Header Size**: ~80-120 bytes (fits well within HTTP header limits)
### Size Constraints
- **File ID Length**: Typically 33-44 characters (no upper limit enforced)
- **Full URL Length**: ~70-90 characters
- **HTTP Header Name**: 28 characters (`X-Verint-KAB-Original-URL`)
- **Total Header Size**: ~100-120 bytes
### Format Constraints
- **URL Scheme**: Must be `https://` (no http://)
- **Domain**: Must be `drive.google.com` (not docs.google.com or other domains)
- **Path Structure**: Must be `/file/d/{id}` (not `/open?id=` or other patterns)
---
## Examples
### Example 1: PDF Export
**Request**:
```
GET /documents/1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms
```
**Google Drive API Response** (metadata):
```json
{
"id": "1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms",
"name": "Q4 Financial Report",
"mimeType": "application/pdf"
}
```
**Constructed URL**:
```
https://drive.google.com/file/d/1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms
```
**HTTP Response Headers**:
```
HTTP/1.1 200 OK
Content-Type: application/pdf
X-Request-Id: req_550e8400-e29b-41d4-a716-446655440000
Content-Disposition: inline; filename="Q4 Financial Report.pdf"
Content-Length: 245760
X-Verint-KAB-Original-URL: https://drive.google.com/file/d/1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms
```
### Example 2: DOCX Export
**Request**:
```
GET /documents/2CyjMWt1YSB6oGNetLcEaCkhnVVsruqmct85PhzF3vqnt
```
**Google Drive API Response** (metadata):
```json
{
"id": "2CyjMWt1YSB6oGNetLcEaCkhnVVsruqmct85PhzF3vqnt",
"name": "Meeting Notes - March 2026",
"mimeType": "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
}
```
**Constructed URL**:
```
https://drive.google.com/file/d/2CyjMWt1YSB6oGNetLcEaCkhnVVsruqmct85PhzF3vqnt
```
**HTTP Response Headers**:
```
HTTP/1.1 200 OK
Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
X-Request-Id: req_660f9511-f3ac-52e5-b827-557766551111
Content-Disposition: inline; filename="Meeting Notes - March 2026.docx"
Content-Length: 52480
X-Verint-KAB-Original-URL: https://drive.google.com/file/d/2CyjMWt1YSB6oGNetLcEaCkhnVVsruqmct85PhzF3vqnt
```
### Example 3: Error Case (Document Not Found)
**Request**:
```
GET /documents/INVALID_ID_12345
```
**HTTP Response Headers**:
```
HTTP/1.1 404 Not Found
X-Request-Id: req_770fa622-g4bd-63f6-c938-668877662222
Document not found
```
**Note**: No `X-Verint-KAB-Original-URL` header present in error response.
---
## Summary
This feature introduces:
- **1 new HTTP header**: `X-Verint-KAB-Original-URL`
- **1 derived data element**: Google Drive URL (constructed from file ID)
- **0 new persistent entities**: All data is ephemeral (per-request)
- **0 new database tables**: No storage required
The data model is minimal by design - a simple string transformation from file ID to Drive URL, embedded in an HTTP response header.