fixing jsonSchema validation by using zod

2026-04-11 22:23:25 -05:00
parent 0bae26ae0b
commit eb0a4e8308
56 changed files with 12275 additions and 287 deletions
--- a/specs/001-mock-gds-server/data-model.md
+++ b/specs/001-mock-gds-server/data-model.md
@@ -392,3 +392,317 @@ No secondary indexes required - all queries use primary keys (sessionId, pnr)
 Proceed to contract definition (Phase 1 continued):
 - Define MCP tool schemas in `/contracts/mcp-tools.md`
 - Create quickstart guide with example usage
+
+---
+
+## Remote Access Entities (Added 2026-04-07)
+
+The following entities support remote MCP access over HTTP/2 with rate limiting, health monitoring, and global PNR retrieval.
+
+### 8. Remote Connection
+
+Represents an active remote client connection over HTTP/2.
+
+**Fields**:
+```typescript
+{
+  connectionId: string;      // Unique connection identifier (UUID)
+  sessionId: string;         // Associated MCP session ID
+  remoteIP: string;          // Client IP address (for rate limiting)
+  connectedAt: number;       // Unix timestamp (ms)
+  lastActivityAt: number;    // Unix timestamp (ms)
+  transportType: 'http' | 'stdio'; // Transport used by this connection
+  userAgent: string;         // Client user agent
+  requestCount: number;      // Total requests made in this connection
+}
+```
+
+**Storage**: Valkey hash at key `connection:{connectionId}`
+
+**Validation**:
+- `remoteIP` must be valid IPv4 or IPv6 format
+- `transportType` must be 'http' or 'stdio'
+- `connectedAt` must be <= `lastActivityAt`
+
+**State Transitions**:
+```
+[Connected] → [Active] → [Idle] → [Disconnected]
+                      ↓
+                   [Expired]
+```
+
+**Lifecycle**:
+- Created on initial HTTP request with new session
+- Updated on every request (lastActivityAt, requestCount)
+- Auto-expires after SESSION_TTL_HOURS of inactivity
+- Explicitly deleted on graceful disconnect
+
+---
+
+### 9. Rate Limit Record
+
+Tracks request rates per client IP address for abuse prevention.
+
+**Fields**:
+```typescript
+{
+  clientIP: string;          // Client IP address (key component)
+  currentWindow: number;     // Current window index: floor(Unix time in seconds / windowSeconds)
+  currentCount: number;      // Request count in current window
+  previousCount: number;     // Request count in previous window
+  limit: number;             // Max requests allowed per window
+  resetAt: number;           // Next window reset timestamp
+}
+```
+
+**Storage**: 
+- Current window: Valkey integer at key `ratelimit:{ip}:{currentWindow}`
+- Previous window: Valkey integer at key `ratelimit:{ip}:{previousWindow}`
+
+**Validation**:
+- `clientIP` must be valid IP address
+- `currentCount`, `previousCount` must be non-negative integers
+- `limit` must be positive integer (default: 100)
+- `resetAt` must be in the future
+
+**Algorithm**: Sliding Window Counter (Hybrid)
+```javascript
+// Weight the previous window by the share of it still inside the sliding window
+const previousWeight = 1 - (elapsedInWindow / windowSeconds);
+const estimatedCount = (previousCount * previousWeight) + currentCount;
+```
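The estimate above can be wrapped in a small pure function. This is a minimal sketch, not the server's actual implementation: `WindowCounts`, `estimateCount`, and `exceedsLimit` are hypothetical names, and treating a request as rejected only when the estimate is strictly greater than `limit` is an assumption the spec does not state.

```typescript
// Hypothetical helper types and names, for illustration only.
interface WindowCounts {
  previousCount: number; // requests counted in the previous fixed window
  currentCount: number;  // requests counted so far in the current window
}

// Sliding Window Counter estimate, following the formula in the spec.
function estimateCount(
  counts: WindowCounts,
  elapsedInWindow: number, // seconds since the current window started
  windowSeconds: number,   // window length in seconds
): number {
  // Share of the previous window that the sliding window still covers.
  const previousWeight = 1 - elapsedInWindow / windowSeconds;
  return counts.previousCount * previousWeight + counts.currentCount;
}

// Assumption: the limit is exceeded only when the estimate is strictly above it.
function exceedsLimit(estimated: number, limit: number): boolean {
  return estimated > limit;
}
```

For example, with 92 requests in the previous window, 45 in the current one, and 15 seconds elapsed in a 60-second window, the weight is 0.75 and the estimate is 92 × 0.75 + 45 = 114, which is over the default limit of 100.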
+
+**Lifecycle**:
+- Counter incremented on each request: `INCR ratelimit:{ip}:{window}`
+- TTL set to 2× window duration (keeps previous window accessible)
+- Auto-expires after TTL (no manual cleanup needed)
+- Resets at window boundary (new window key)
+
+**Error Response** (when limit exceeded):
+```typescript
+{
+  error: "Rate limit exceeded",
+  code: "RATE_LIMIT_EXCEEDED",
+  limit: 100,
+  current: 105,
+  resetAt: "2026-04-07T10:01:00.000Z",
+  retryAfter: 15  // seconds
+}
+```
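A sketch of how this payload might be assembled, assuming `retryAfter` is the number of whole seconds until `resetAt`, rounded up. `RateLimitError` and `buildRateLimitError` are hypothetical names; the field names and literal values come from the example above.

```typescript
// Hypothetical shape mirroring the error response in the spec.
interface RateLimitError {
  error: string;
  code: "RATE_LIMIT_EXCEEDED";
  limit: number;
  current: number;
  resetAt: string;    // ISO 8601 timestamp of the next window reset
  retryAfter: number; // seconds until the client may retry
}

function buildRateLimitError(
  limit: number,
  current: number,
  resetAtMs: number, // next window reset, Unix ms
  nowMs: number,     // current time, Unix ms
): RateLimitError {
  return {
    error: "Rate limit exceeded",
    code: "RATE_LIMIT_EXCEEDED",
    limit,
    current,
    resetAt: new Date(resetAtMs).toISOString(),
    // Assumption: round up so clients never retry before the reset.
    retryAfter: Math.max(0, Math.ceil((resetAtMs - nowMs) / 1000)),
  };
}
```

Passing the clock in as `nowMs` rather than reading `Date.now()` inside keeps the function deterministic and easy to test.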
+
+---
+
+### 10. Health Status
+
+Represents server operational status for monitoring and health checks.
+
+**Fields**:
+```typescript
+{
+  status: 'healthy' | 'degraded' | 'unhealthy'; // Overall server status
+  uptime: number;            // Seconds since server start
+  version: string;           // Server version
+  connections: {
+    stdio: number;           // Active stdio connections (0 or 1 typically)
+    http: number;            // Active HTTP connections
+    total: number;           // Total active connections
+  },
+  sessions: {
+    active: number;          // Active sessions (sessions with recent activity)
+    total: number;           // Total sessions in Valkey
+  },
+  storage: {
+    connected: boolean;      // Valkey connection status
+    responseTime: number;    // Valkey ping response time (ms)
+  },
+  memory: {
+    used: number;            // Process memory used (MB)
+    total: number;           // Total system memory (MB)
+    percentage: number;      // Memory usage as a percentage (0-100)
+  },
+  timestamp: number;         // Health check timestamp (Unix ms)
+}
+```
+
+**Endpoint**: `GET /health` (unauthenticated, exempt from rate limiting)
+
+**Status Determination**:
+- **healthy**: Storage connected with ping <= 100ms and memory < 80%
+- **degraded**: Storage connected but slow (ping > 100ms), or memory 80-90%
+- **unhealthy**: Storage disconnected or memory > 90%
+
+**Response Codes**:
+- 200 OK: status = 'healthy'
+- 200 OK: status = 'degraded' (still serving requests)
+- 503 Service Unavailable: status = 'unhealthy'
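The status rules and response codes above can be expressed as two small functions. This is a sketch under stated assumptions: `HealthInputs`, `determineStatus`, and `httpStatusFor` are hypothetical names, memory is taken as a 0-100 percentage, and a reading of exactly 90% is treated as degraded since the spec leaves that boundary open.

```typescript
type HealthState = "healthy" | "degraded" | "unhealthy";

// Hypothetical input shape; the spec's Health Status object carries more fields.
interface HealthInputs {
  storageConnected: boolean;
  storagePingMs: number;
  memoryPercentage: number; // 0-100
}

function determineStatus(h: HealthInputs): HealthState {
  // Unhealthy conditions take precedence over degraded ones.
  if (!h.storageConnected || h.memoryPercentage > 90) return "unhealthy";
  if (h.storagePingMs > 100 || h.memoryPercentage >= 80) return "degraded";
  return "healthy";
}

// Degraded servers still answer 200 because they keep serving requests.
function httpStatusFor(state: HealthState): 200 | 503 {
  return state === "unhealthy" ? 503 : 200;
}
```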
+
+**Example Response**:
+```json
+{
+  "status": "healthy",
+  "uptime": 3600,
+  "version": "0.1.0",
+  "connections": {
+    "stdio": 0,
+    "http": 12,
+    "total": 12
+  },
+  "sessions": {
+    "active": 8,
+    "total": 15
+  },
+  "storage": {
+    "connected": true,
+    "responseTime": 2
+  },
+  "memory": {
+    "used": 45,
+    "total": 8192,
+    "percentage": 0.55
+  },
+  "timestamp": 1712486460000
+}
+```
+
+**Use Cases**:
+- Load balancer health checks
+- Monitoring system integration (Prometheus, Datadog)
+- Deployment validation (verify server started successfully)
+- Debugging (check connection counts, memory usage)
+
+---
+
+## Updated Storage Schema (Remote Mode)
+
+### Global PNR Storage (Session-Independent)
+
+**Key Change**: PNRs are now stored globally with a TTL rather than scoped to a session.
+
+```
+pnr:{pnr}                                # Global PNR storage
+  → {
+      pnr: string,
+      status: 'confirmed' | 'cancelled',
+      createdAt: ISO8601,
+      expiresAt: ISO8601,
+      creatingSessionId: string,         # For logging only, not access control
+      segments: [...],
+      passengers: [...],
+      totalPrice: number
+    }
+```
+
+**TTL**: Configurable via `PNR_TTL_HOURS` (default 1 hour)
+
+**Access**: Any session can retrieve any PNR (global retrieval)
+
+**Expiration**: PNR auto-deleted by Valkey after TTL expires
+
+---
+
+### Session PNR Reference
+
+Sessions track which PNRs they created (for `listBookings` tool):
+
+```
+session:{sessionId}:pnrs                 # Set of PNR codes created in this session
+  → Set<string>                          # e.g., ["TEST-ABC123", "TEST-DEF456"]
+```
+
+**Purpose**: Enable `listBookings` to return session-created PNRs
+
+**Lifecycle**: Deleted when session expires (PNRs persist independently)
+
+---
+
+### Rate Limit Keys
+
+```
+ratelimit:{ip}:{window}                  # Request count for IP in time window
+  → integer                              # e.g., 45 (requests made)
+  TTL: windowSeconds * 2                 # Keep previous window accessible
+```
+
+**Example**:
+```
+ratelimit:192.168.1.1:287456             # Current window (e.g., minute 287456)
+  → 45
+ratelimit:192.168.1.1:287455             # Previous window
+  → 92
+```
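A sketch of how the key names and TTL could be derived from the clock, assuming the window index is the Unix time in seconds divided by the window length and rounded down. `windowIndex` and `rateLimitKeys` are hypothetical helpers, not functions from the codebase.

```typescript
// Index of the fixed window containing `nowSeconds` (assumed derivation).
function windowIndex(nowSeconds: number, windowSeconds: number): number {
  return Math.floor(nowSeconds / windowSeconds);
}

// Builds the two counter keys plus the values needed for the weighted estimate.
function rateLimitKeys(ip: string, nowSeconds: number, windowSeconds: number) {
  const current = windowIndex(nowSeconds, windowSeconds);
  return {
    currentKey: `ratelimit:${ip}:${current}`,
    previousKey: `ratelimit:${ip}:${current - 1}`,
    ttlSeconds: windowSeconds * 2, // keeps the previous window readable
    elapsedInWindow: nowSeconds - current * windowSeconds,
  };
}
```

With a 60-second window and a clock 15 seconds into window 287456, this yields the two keys shown in the example and a TTL of 120 seconds.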
+
+---
+
+### Connection Tracking
+
+```
+connection:{connectionId}                # Remote connection metadata
+  → { connectionId, sessionId, remoteIP, connectedAt, ... }
+  TTL: SESSION_TTL_HOURS
+```
+
+**Purpose**: Track active HTTP connections for health monitoring
+
+**Cleanup**: Auto-expires with session TTL
+
+---
+
+## Updated Validation Rules (Remote Mode)
+
+### Additional Validations
+
+1. **IP Address Validation**:
+   - Must be valid IPv4 (e.g., `192.168.1.1`) or IPv6 format
+   - Used for rate limiting and logging
+   - Extracted from the `X-Forwarded-For` or `X-Real-IP` header, which is honored only when the request arrives through a trusted proxy
+
+2. **Session ID Format** (HTTP mode):
+   - Must be valid UUID v4
+   - Sent via `MCP-Session-ID` header
+   - Generated by transport if not provided
+
+3. **Rate Limit Headers** (HTTP mode):
+   - `X-RateLimit-Limit`: Integer > 0
+   - `X-RateLimit-Remaining`: Integer >= 0
+   - `X-RateLimit-Reset`: Unix timestamp
+   - `Retry-After`: Seconds (when limit exceeded)
+
+4. **CORS Headers** (HTTP mode):
+   - `Origin`: Any (wildcard policy)
+   - `Access-Control-Allow-Origin`: Must be `*`
+   - Preflight requests must use OPTIONS method
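Rules 1 and 2 might be checked along these lines. This is a sketch, not the project's validation code: `HeaderMap`, `clientIpFromHeaders`, and `isUuidV4` are hypothetical names, header keys are assumed to be lower-cased already, and taking the leftmost `X-Forwarded-For` entry is only sound when the proxy overwrites or sanitizes that header.

```typescript
// Assumption: header names have been lower-cased by the HTTP layer.
type HeaderMap = Record<string, string | undefined>;

function clientIpFromHeaders(
  headers: HeaderMap,
  socketIp: string,     // address of the direct peer
  trustProxy: boolean,  // true only when a trusted proxy fronts the server
): string {
  if (!trustProxy) return socketIp;
  const forwarded = headers["x-forwarded-for"];
  if (forwarded) {
    // The header is a comma-separated list; the leftmost entry is the client.
    const [first = ""] = forwarded.split(",");
    if (first.trim()) return first.trim();
  }
  return headers["x-real-ip"]?.trim() || socketIp;
}

// UUID v4: version nibble is 4, variant nibble is 8, 9, a, or b.
const UUID_V4 =
  /^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;

function isUuidV4(value: string): boolean {
  return UUID_V4.test(value);
}
```

The returned address would still need to pass the IPv4/IPv6 format check before being used as a rate-limit key.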
+
+---
+
+## Performance Considerations (Remote Mode)
+
+### Additional Overhead
+
+- **Rate Limiting**: +3 Valkey ops per request (~1-2ms overhead)
+- **Connection Tracking**: +1 Valkey write per request (~0.5ms overhead)
+- **CORS Preflight**: OPTIONS requests handled immediately (no tool execution)
+- **Health Checks**: Separate fast path (no session/rate limit checks)
+
+### Expected Performance (Remote)
+
+- **Search Operations**: <2s (requirement: SC-003)
+- **Booking Operations**: <500ms (includes rate limit check)
+- **Retrieval Operations**: <200ms (global PNR lookup)
+- **Health Check**: <100ms (Valkey ping only)
+- **Concurrent Remote Sessions**: 50+ (requirement: SC-012)
+
+### Optimization Strategies
+
+1. **Rate Limit Caching**: Cache IP counters in memory for 1 second (reduces Valkey ops)
+2. **Connection Pooling**: Reuse HTTP/2 connections (handled by Nginx)
+3. **Health Check Caching**: Cache health status for 5 seconds
+4. **CORS Preflight Caching**: 24-hour `Access-Control-Max-Age`
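The time-based caching in strategies 1 and 3 could be built on a small wrapper like the one below. This is a sketch under stated assumptions: `cached` is a hypothetical helper, it handles synchronous computations only, and the clock is injectable purely to make the behavior testable.

```typescript
// Wraps `compute` so its result is reused until `ttlMs` has elapsed.
function cached<T>(
  compute: () => T,
  ttlMs: number,
  now: () => number = Date.now, // injectable clock (assumption, for testing)
): () => T {
  let value: T | undefined;
  let expiresAt = -Infinity; // forces a computation on the first call
  return () => {
    const t = now();
    if (t >= expiresAt) {
      value = compute();
      expiresAt = t + ttlMs;
    }
    return value as T;
  };
}
```

A health handler could then call something like `cached(collectHealth, 5000)` once at startup and invoke the returned function per request, where `collectHealth` stands in for whatever gathers the status fields.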
+
+---
+
+## Next Steps
+
+The data model now includes the remote access entities. Follow-up tasks (all completed):
+1. ✅ Update contracts/ with health endpoint schema
+2. ✅ Update quickstart.md with remote access setup
+3. ✅ Run agent context update script
+