google-drive-content-adapter/specs/001-gdrive-url-header/data-model.md
Peter.Morton 9286ee8927 feat: Add X-Verint-KAB-Original-URL header to document exports
Adds HTTP response header containing original Google Drive URL
for exported documents to enable content traceability and auditing.

- Adds X-Verint-KAB-Original-URL header to successful export responses
- Header format: https://drive.google.com/file/d/{fileId}
- Present for all export formats (PDF, DOCX, plain text)
- Header omitted on error responses (4xx/5xx)
- 18 new tests (9 contract + 9 integration)
- Zero new dependencies
- Performance: 0.000019ms overhead per request

Implements:
- FR-001: Header present on successful exports (200 OK)
- FR-002: Header absent on error responses
- FR-003: Standard header name X-Verint-KAB-Original-URL
- FR-004: Standard URL format with file ID
- FR-005: Uses validated document.id from Google Drive API
- FR-006: Header present regardless of file accessibility
- FR-007: Consistent across all export formats
- FR-008: Minimal performance impact (< 5ms requirement)

Testing:
- Contract tests validate header presence, format, and error handling
- Integration tests verify behavior across formats and permissions
- All 18 tests passing
- 100% requirements coverage

Documentation:
- Feature specification (specs/001-gdrive-url-header/spec.md)
- Implementation plan (plan.md)
- Technical research (research.md)
- Data model (data-model.md)
- API contract (contracts/response-headers.md)
- User guide (quickstart.md)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-27 16:04:54 -05:00

Data Model: Google Drive Original URL Header

Feature: 001-gdrive-url-header
Date: 2026-03-27
Status: Complete

Overview

This feature adds an HTTP response header to document export responses. There are no new data entities or persistent data structures. This document describes the data flow and transformations involved.


Entities

HTTP Response Header (New)

Name: X-Verint-KAB-Original-URL

Description: Custom HTTP response header containing the original Google Drive URL for the exported document.

Properties:

  • Name: X-Verint-KAB-Original-URL (string, constant)
  • Value: https://drive.google.com/file/d/{fileId} (string, dynamic)

Lifecycle:

  • Created: During a successful document export response (proxy.js, approximately lines 377-383)
  • Lifespan: Single HTTP response only
  • Destroyed: After response is sent to client

Validation Rules:

  • Header name is fixed (cannot vary)
  • Header value must be a valid URL with format: https://drive.google.com/file/d/{fileId}
  • File ID must be a string of URL-safe characters: letters, digits, and in some IDs hyphens or underscores (validated by the Google Drive API)
  • Header is only present on successful exports (200 OK status)

Google Drive File ID (Existing)

Name: Document ID / File ID

Description: Unique identifier for a Google Drive document, obtained from Google Drive API.

Source:

  1. Client request URL: /documents/{documentId}
  2. Validated by Google Drive Files API: GET /drive/v3/files/{documentId}
  3. Returned in API response as document.id

Properties:

  • Type: String
  • Format: Letters and digits, in some IDs also hyphens or underscores (e.g., 1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms)
  • Length: Variable (typically 33-44 characters)
  • Validation: Implicitly validated by Google Drive API (404 if invalid)

Usage in Feature:

  • Input parameter: documentId in request URL
  • Validated value: document.id from API response (line 278)
  • Output: Embedded in X-Verint-KAB-Original-URL header value

Google Drive URL (New, Derived)

Name: Original Document URL

Description: User-facing URL for accessing the document in Google Drive web interface.

Properties:

  • Base URL: https://drive.google.com/file/d/ (constant)
  • File ID: {document.id} (dynamic, from API response)
  • Full URL: https://drive.google.com/file/d/{document.id} (constructed)

Construction:

const driveUrl = `https://drive.google.com/file/d/${document.id}`;

Characteristics:

  • Immutable for a given file ID
  • No query parameters needed
  • No URL encoding required (file IDs use only URL-safe characters: letters, digits, hyphens, and underscores)
  • Publicly addressable (permissions enforced by Google Drive)
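
The construction can be wrapped in a small helper. This is a minimal sketch: the names buildDriveUrl and DRIVE_FILE_URL_PREFIX and the type guard are illustrative assumptions, not code taken from proxy.js.

```javascript
// Hypothetical helper; the constant name, function name, and guard are assumptions.
const DRIVE_FILE_URL_PREFIX = "https://drive.google.com/file/d/";

function buildDriveUrl(fileId) {
  // Defensive guard against misuse. The source relies on the Drive API
  // to validate the ID, so this check is an extra, not a requirement.
  if (typeof fileId !== "string" || fileId.length === 0) {
    throw new TypeError("fileId must be a non-empty string");
  }
  // Plain concatenation: the same fileId always yields the same URL.
  return `${DRIVE_FILE_URL_PREFIX}${fileId}`;
}
```

Keeping the prefix in one constant means contract tests and the handler can share a single definition of the URL format.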

Data Flow

Request → Response Flow

1. Client Request
   ↓
   URL: GET /documents/{documentId}
   ↓
2. Route Parsing
   ↓
   Extract: documentId = "{id}"
   ↓
3. Metadata Fetch
   ↓
   API: GET https://www.googleapis.com/drive/v3/files/{documentId}
   ↓
   Response: { id: "{validated_id}", name: "...", mimeType: "..." }
   ↓
4. Export Content Fetch
   ↓
   API: GET https://www.googleapis.com/drive/v3/files/{id}?alt=media
   ↓
5. Response Header Construction
   ↓
   URL Construction: https://drive.google.com/file/d/{document.id}
   ↓
   Header: X-Verint-KAB-Original-URL: {constructed_url}
   ↓
6. Client Response
   ↓
   Status: 200 OK
   Headers:
     - Content-Type: {mimeType}
     - X-Request-Id: {requestId}
     - Content-Disposition: inline; filename="{name}.{ext}"
     - X-Verint-KAB-Original-URL: https://drive.google.com/file/d/{id}
   Body: {document_content}
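
Step 6 of the flow can be sketched as a function that assembles the response headers from the metadata returned in step 3. The function name buildExportHeaders and its parameters are hypothetical; the header names and formats come from the flow above.

```javascript
// Hypothetical helper mirroring step 6; not taken from proxy.js.
function buildExportHeaders(document, requestId, extension) {
  return {
    "Content-Type": document.mimeType,
    "X-Request-Id": requestId,
    "Content-Disposition": `inline; filename="${document.name}.${extension}"`,
    // Built from the validated document.id, not from the route parameter.
    "X-Verint-KAB-Original-URL": `https://drive.google.com/file/d/${document.id}`,
  };
}

// Usage with example metadata for a PDF file.
const headers = buildExportHeaders(
  {
    id: "1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms",
    name: "Q4 Financial Report",
    mimeType: "application/pdf",
  },
  "req_550e8400-e29b-41d4-a716-446655440000",
  "pdf"
);
console.log(headers["X-Verint-KAB-Original-URL"]);
```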

State Transitions

Document Export Request States

[Request Received]
    ↓
    Parse Route → Extract documentId
    ↓
[ID Extracted]
    ↓
    Fetch Metadata from Google Drive
    ↓
    ┌─────────────────────┬─────────────────────┐
    ↓                     ↓                     ↓
[Metadata Valid]   [404 Not Found]     [401 Unauthorized]
    ↓                     ↓                     ↓
    Check Export Format   Return Error         Return Error
    ↓                     (No URL Header)      (No URL Header)
    ┌─────────────────┬─────────────────┐
    ↓                 ↓                 ↓
[Format Supported]  [Format Unsupported]  [Size Exceeded]
    ↓                 ↓                     ↓
    Fetch Content     Return 403           Return 413
    ↓                 (No URL Header)      (No URL Header)
    ↓
[Content Retrieved]
    ↓
    Construct Drive URL
    ↓
    Set Response Headers (INCLUDING X-Verint-KAB-Original-URL)
    ↓
[Response Sent with URL Header]

Key Decision Points:

  • URL header is ONLY added in the [Response Sent with URL Header] state
  • All error states omit the URL header
  • URL is constructed using validated document.id (not route parameter)
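
The decision points above can be sketched as a single async function. This is an illustration under stated assumptions: fetchMetadata and fetchContent are hypothetical injected functions standing in for the Google Drive API calls, and the { ok, status, document, body } result shape is invented for the example.

```javascript
// Sketch of the state transitions; the dependency names and result shapes are assumptions.
async function exportDocument(documentId, { fetchMetadata, fetchContent }) {
  const meta = await fetchMetadata(documentId);
  if (!meta.ok) {
    // Metadata failures such as 404 or 401: respond without the URL header.
    return { status: meta.status, headers: {} };
  }

  const content = await fetchContent(meta.document);
  if (!content.ok) {
    // Unsupported format (403) or size exceeded (413): still no URL header.
    return { status: content.status, headers: {} };
  }

  // Only the success path reaches this point and sets the header,
  // using the validated id from the metadata response.
  return {
    status: 200,
    headers: {
      "X-Verint-KAB-Original-URL": `https://drive.google.com/file/d/${meta.document.id}`,
    },
    body: content.body,
  };
}
```

Injecting the two fetch functions keeps the branching testable without network access.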

Relationships

No Entity Relationships

This feature does not introduce any new data relationships:

  • No database tables
  • No foreign keys
  • No associations between entities
  • Single HTTP response header derived from existing file ID

Dependency Chain

Google Drive File (External)
    ↓ (has)
File ID (String)
    ↓ (used to construct)
Drive URL (String)
    ↓ (embedded in)
HTTP Response Header (X-Verint-KAB-Original-URL)
    ↓ (sent in)
HTTP Response (200 OK)

Validation Rules

Input Validation

Document ID (from route):

  • Extracted from URL path: /documents/{documentId}
  • Format: Any string (validated by Google Drive API, not by adapter)
  • Invalid IDs result in 404 error (no URL header)

No additional validation is needed in the adapter; the Google Drive API performs it.
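
Route extraction might look like the following sketch. The function name extractDocumentId and the regular expression are assumptions; the adapter's actual routing code is not shown in this document.

```javascript
// Hypothetical route parser for /documents/{documentId}.
function extractDocumentId(pathname) {
  const match = /^\/documents\/([^/]+)$/.exec(pathname);
  // No format check on the ID itself: any non-empty path segment is
  // passed on, and the Drive API decides whether it is valid.
  return match ? match[1] : null;
}
```

Returning null for a non-matching path leaves the choice of error response to the caller.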

Output Validation

X-Verint-KAB-Original-URL Header Value:

  • Must start with: https://drive.google.com/file/d/
  • Must contain valid file ID after prefix
  • Must not contain query parameters or fragments
  • Must be a single-line string (no newlines)

Validation Implementation:

// No explicit validation needed - constructed from validated document.id
const driveUrl = `https://drive.google.com/file/d/${document.id}`;
res.setHeader("X-Verint-KAB-Original-URL", driveUrl);
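
Although the adapter itself needs no explicit check, a contract test may want to assert the output rules. The sketch below is a hypothetical test helper, not part of the adapter, and it assumes file IDs contain only letters, digits, hyphens, and underscores.

```javascript
// Hypothetical test helper; names and the ID character set are assumptions.
const ORIGINAL_URL_PREFIX = "https://drive.google.com/file/d/";

function isValidOriginalUrlHeader(value) {
  if (typeof value !== "string" || !value.startsWith(ORIGINAL_URL_PREFIX)) {
    return false;
  }
  const fileId = value.slice(ORIGINAL_URL_PREFIX.length);
  // One non-empty run of allowed characters. This also rejects query
  // strings, fragments, extra path segments, and embedded newlines.
  return /^[A-Za-z0-9_-]+$/.test(fileId);
}
```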

Data Constraints

Performance Constraints

  • URL Construction Time: < 1ms (string concatenation)
  • Memory Footprint: ~100 bytes per response (temporary string)
  • Header Size: ~95-105 bytes for typical file IDs (fits well within HTTP header limits)

Size Constraints

  • File ID Length: Typically 33-44 characters (no upper limit enforced)
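  • Full URL Length: ~65-76 characters for typical file IDs (32-character prefix plus the file ID)
  • HTTP Header Name: 25 characters (X-Verint-KAB-Original-URL)
  • Total Header Size: ~95-105 bytes (name, separator, URL, and line terminator)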
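
A quick way to check these sizes for a given file ID is to add up the parts of the header line. The sketch assumes an HTTP/1.1 wire format of name, colon and space, value, and CRLF; the function name headerLineBytes is illustrative.

```javascript
const HEADER_NAME = "X-Verint-KAB-Original-URL";
const URL_PREFIX = "https://drive.google.com/file/d/";

function headerLineBytes(fileId) {
  // "Name: value\r\n". Every character is ASCII, so string length equals byte count.
  return HEADER_NAME.length + 2 + URL_PREFIX.length + fileId.length + 2;
}

console.log(HEADER_NAME.length);              // 25
console.log(URL_PREFIX.length);               // 32
console.log(headerLineBytes("a".repeat(33))); // 94
console.log(headerLineBytes("a".repeat(44))); // 105
```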

Format Constraints

  • URL Scheme: Must be https:// (no http://)
  • Domain: Must be drive.google.com (not docs.google.com or other domains)
  • Path Structure: Must be /file/d/{id} (not /open?id= or other patterns)

Examples

Example 1: PDF Export

Request:

GET /documents/1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms

Google Drive API Response (metadata):

{
  "id": "1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms",
  "name": "Q4 Financial Report",
  "mimeType": "application/pdf"
}

Constructed URL:

https://drive.google.com/file/d/1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms

HTTP Response Headers:

HTTP/1.1 200 OK
Content-Type: application/pdf
X-Request-Id: req_550e8400-e29b-41d4-a716-446655440000
Content-Disposition: inline; filename="Q4 Financial Report.pdf"
Content-Length: 245760
X-Verint-KAB-Original-URL: https://drive.google.com/file/d/1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms

Example 2: DOCX Export

Request:

GET /documents/2CyjMWt1YSB6oGNetLcEaCkhnVVsruqmct85PhzF3vqnt

Google Drive API Response (metadata):

{
  "id": "2CyjMWt1YSB6oGNetLcEaCkhnVVsruqmct85PhzF3vqnt",
  "name": "Meeting Notes - March 2026",
  "mimeType": "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
}

Constructed URL:

https://drive.google.com/file/d/2CyjMWt1YSB6oGNetLcEaCkhnVVsruqmct85PhzF3vqnt

HTTP Response Headers:

HTTP/1.1 200 OK
Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
X-Request-Id: req_660f9511-f3ac-52e5-b827-557766551111
Content-Disposition: inline; filename="Meeting Notes - March 2026.docx"
Content-Length: 52480
X-Verint-KAB-Original-URL: https://drive.google.com/file/d/2CyjMWt1YSB6oGNetLcEaCkhnVVsruqmct85PhzF3vqnt

Example 3: Error Case (Document Not Found)

Request:

GET /documents/INVALID_ID_12345

HTTP Response (headers and body):

HTTP/1.1 404 Not Found
X-Request-Id: req_770fa622-g4bd-63f6-c938-668877662222

Document not found

Note: No X-Verint-KAB-Original-URL header present in error response.
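
On the consuming side, a client can treat the header as optional and expect it only on 200 responses. The helper below is a hypothetical sketch; it accepts any response object whose headers expose a fetch-style get(name) method.

```javascript
// Hypothetical client-side helper; not part of the adapter.
function getOriginalUrl(response) {
  // Error responses (4xx/5xx) carry no URL header, so skip the lookup.
  if (response.status !== 200) return null;
  // A fetch Headers object returns null when the header is missing.
  return response.headers.get("X-Verint-KAB-Original-URL");
}
```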


Summary

This feature introduces:

  • 1 new HTTP header: X-Verint-KAB-Original-URL
  • 1 derived data element: Google Drive URL (constructed from file ID)
  • 0 new persistent entities: All data is ephemeral (per-request)
  • 0 new database tables: No storage required

The data model is minimal by design - a simple string transformation from file ID to Drive URL, embedded in an HTTP response header.