fixing jsonSchema validation by using zod

2026-04-11 22:23:25 -05:00
parent 0bae26ae0b
commit eb0a4e8308
56 changed files with 12275 additions and 287 deletions


@@ -392,3 +392,317 @@ No secondary indexes required - all queries use primary keys (sessionId, pnr)
Proceed to contract definition (Phase 1 continued):
- Define MCP tool schemas in `/contracts/mcp-tools.md`
- Create quickstart guide with example usage
---
## Remote Access Entities (Added 2026-04-07)
The following entities support remote MCP access over HTTP/2 with rate limiting, health monitoring, and global PNR retrieval.
### 8. Remote Connection
Represents an active remote client connection over HTTP/2.
**Fields**:
```typescript
{
connectionId: string; // Unique connection identifier (UUID)
sessionId: string; // Associated MCP session ID
remoteIP: string; // Client IP address (for rate limiting)
connectedAt: number; // Unix timestamp (ms)
lastActivityAt: number; // Unix timestamp (ms)
transportType: string; // 'http' | 'stdio'
userAgent: string; // Client user agent
requestCount: number; // Total requests made in this connection
}
```
**Storage**: Valkey hash at key `connection:{connectionId}`
**Validation**:
- `remoteIP` must be valid IPv4 or IPv6 format
- `transportType` must be 'http' or 'stdio'
- `connectedAt` must be <= `lastActivityAt`
**State Transitions**:
```
[Connected] → [Active] → [Idle] → [Disconnected]
                         [Idle] → [Expired]
```
**Lifecycle**:
- Created on initial HTTP request with new session
- Updated on every request (lastActivityAt, requestCount)
- Auto-expires after SESSION_TTL_HOURS of inactivity
- Explicitly deleted on graceful disconnect
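The validation and lifecycle rules above can be sketched as follows. This is a minimal illustration, not the project's implementation: `validateConnection` and `touchConnection` are hypothetical helper names, and IP format checking is left out.

```typescript
// Shape copied from the Fields block for Remote Connection.
interface RemoteConnection {
  connectionId: string;
  sessionId: string;
  remoteIP: string;
  connectedAt: number;
  lastActivityAt: number;
  transportType: string;
  userAgent: string;
  requestCount: number;
}

// Hypothetical helper: returns the list of violated rules (empty when valid).
// IP format validation is intentionally omitted from this sketch.
function validateConnection(c: RemoteConnection): string[] {
  const errors: string[] = [];
  if (c.transportType !== "http" && c.transportType !== "stdio") {
    errors.push("transportType must be 'http' or 'stdio'");
  }
  if (c.connectedAt > c.lastActivityAt) {
    errors.push("connectedAt must be <= lastActivityAt");
  }
  return errors;
}

// Hypothetical helper: the per-request update described under Lifecycle.
// Returns a new record with lastActivityAt refreshed and requestCount bumped.
function touchConnection(c: RemoteConnection, nowMs: number): RemoteConnection {
  return { ...c, lastActivityAt: nowMs, requestCount: c.requestCount + 1 };
}
```

Returning a list of violations instead of throwing keeps the check usable for both logging and request rejection.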
---
### 9. Rate Limit Record
Tracks request rates per client IP address for abuse prevention.
**Fields**:
```typescript
{
clientIP: string; // Client IP address (key component)
currentWindow: number; // Current window index: floor(Unix timestamp in seconds / windowSeconds)
currentCount: number; // Request count in current window
previousCount: number; // Request count in previous window
limit: number; // Max requests allowed per window
resetAt: number; // Next window reset timestamp
}
```
**Storage**:
- Current window: Valkey integer at key `ratelimit:{ip}:{currentWindow}`
- Previous window: Valkey integer at key `ratelimit:{ip}:{previousWindow}`
**Validation**:
- `clientIP` must be valid IP address
- `currentCount`, `previousCount` must be non-negative integers
- `limit` must be positive integer (default: 100)
- `resetAt` must be in the future
**Algorithm**: Sliding Window Counter (Hybrid)
```javascript
// The previous window's weight shrinks linearly as the current window elapses
const previousWeight = 1 - (elapsedInWindow / windowSeconds);
const estimatedCount = (previousCount * previousWeight) + currentCount;
```
**Lifecycle**:
- Counter incremented on each request: `INCR ratelimit:{ip}:{window}`
- TTL set to 2× window duration (keeps previous window accessible)
- Auto-expires after TTL (no manual cleanup needed)
- Resets at window boundary (new window key)
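To make the sliding window estimate concrete, here is a small TypeScript sketch of the formula. `estimateCount` and `isAllowed` are hypothetical names, and the sketch assumes a request is allowed while the estimate stays at or below `limit`; the source does not say whether the comparison is strict.

```typescript
// Hypothetical helper: weighted request estimate for the sliding window counter.
function estimateCount(
  previousCount: number,
  currentCount: number,
  elapsedInWindow: number, // seconds elapsed in the current window
  windowSeconds: number,
): number {
  const previousWeight = 1 - elapsedInWindow / windowSeconds;
  return previousCount * previousWeight + currentCount;
}

// Assumption: allowed while the estimate does not exceed the limit.
function isAllowed(
  previousCount: number,
  currentCount: number,
  elapsedInWindow: number,
  windowSeconds: number,
  limit: number,
): boolean {
  return estimateCount(previousCount, currentCount, elapsedInWindow, windowSeconds) <= limit;
}

// 15s into a 60s window: 92 * 0.75 + 45 = 114, which is over a limit of 100.
const early = isAllowed(92, 45, 15, 60, 100);
// 45s into the same window: 92 * 0.25 + 45 = 68, which is under the limit.
const late = isAllowed(92, 45, 45, 60, 100);
```

The same counts give different outcomes depending on how far the current window has progressed, which is what smooths out bursts at window boundaries.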
**Error Response** (when limit exceeded):
```typescript
{
error: "Rate limit exceeded",
code: "RATE_LIMIT_EXCEEDED",
limit: 100,
current: 105,
resetAt: "2026-04-07T10:01:00.000Z",
retryAfter: 15 // seconds
}
```
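One way the error payload above might be assembled is sketched below. `buildRateLimitError` is a hypothetical helper, and the sketch assumes `resetAt` is the start of the next fixed window and `retryAfter` is the number of seconds until then.

```typescript
interface RateLimitError {
  error: string;
  code: string;
  limit: number;
  current: number;
  resetAt: string;    // ISO 8601 timestamp
  retryAfter: number; // seconds
}

// Hypothetical helper: builds the payload for a request arriving at nowMs.
function buildRateLimitError(
  limit: number,
  current: number,
  nowMs: number,
  windowSeconds: number,
): RateLimitError {
  const windowMs = windowSeconds * 1000;
  // Assumption: the limit resets at the start of the next fixed window.
  const resetMs = (Math.floor(nowMs / windowMs) + 1) * windowMs;
  return {
    error: "Rate limit exceeded",
    code: "RATE_LIMIT_EXCEEDED",
    limit,
    current,
    resetAt: new Date(resetMs).toISOString(),
    retryAfter: Math.ceil((resetMs - nowMs) / 1000),
  };
}

// A request at 10:00:45 UTC with a 60-second window resets at 10:01:00,
// which reproduces the resetAt and retryAfter values in the example above.
const sample = buildRateLimitError(100, 105, Date.UTC(2026, 3, 7, 10, 0, 45), 60);
```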
---
### 10. Health Status
Represents server operational status for monitoring and health checks.
**Fields**:
```typescript
{
  status: string; // 'healthy' | 'degraded' | 'unhealthy'
  uptime: number; // Seconds since server start
  version: string; // Server version
  connections: {
    stdio: number; // Active stdio connections (0 or 1 typically)
    http: number; // Active HTTP connections
    total: number; // Total active connections
  },
  sessions: {
    active: number; // Active sessions (sessions with recent activity)
    total: number; // Total sessions in Valkey
  },
  storage: {
    connected: boolean; // Valkey connection status
    responseTime: number; // Valkey ping response time (ms)
  },
  memory: {
    used: number; // Process memory used (MB)
    total: number; // Total system memory (MB)
    percentage: number; // Memory usage percentage
  },
  timestamp: number; // Health check timestamp (Unix ms)
}
```
**Endpoint**: `GET /health` (unauthenticated, exempt from rate limiting)
**Status Determination**:
- **healthy**: All systems operational, storage connected, memory < 80%
- **degraded**: Storage slow (ping > 100ms) or memory 80-90%
- **unhealthy**: Storage disconnected or memory > 90%
**Response Codes**:
- 200 OK: status = 'healthy'
- 200 OK: status = 'degraded' (still serving requests)
- 503 Service Unavailable: status = 'unhealthy'
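The status rules and response codes above can be expressed as two small functions. This is a sketch under stated assumptions: the function names are hypothetical, memory is given as a percentage from 0 to 100, and exactly 90% is treated as degraded because the source does not specify that boundary.

```typescript
type HealthState = "healthy" | "degraded" | "unhealthy";

// Hypothetical helper: applies the Status Determination rules in order of severity.
function determineStatus(
  storageConnected: boolean,
  pingMs: number,
  memoryPercent: number,
): HealthState {
  if (!storageConnected || memoryPercent > 90) return "unhealthy";
  if (pingMs > 100 || memoryPercent >= 80) return "degraded";
  return "healthy";
}

// Hypothetical helper: maps a status to the HTTP response code.
// Degraded servers still answer 200 because they keep serving requests.
function httpStatusFor(status: HealthState): number {
  return status === "unhealthy" ? 503 : 200;
}
```

Checking the unhealthy conditions first means a disconnected store always wins over a merely slow one.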
**Example Response**:
```json
{
  "status": "healthy",
  "uptime": 3600,
  "version": "0.1.0",
  "connections": {
    "stdio": 0,
    "http": 12,
    "total": 12
  },
  "sessions": {
    "active": 8,
    "total": 15
  },
  "storage": {
    "connected": true,
    "responseTime": 2
  },
  "memory": {
    "used": 45,
    "total": 8192,
    "percentage": 0.55
  },
  "timestamp": 1712486460000
}
```
**Use Cases**:
- Load balancer health checks
- Monitoring system integration (Prometheus, Datadog)
- Deployment validation (verify server started successfully)
- Debugging (check connection counts, memory usage)
---
## Updated Storage Schema (Remote Mode)
### Global PNR Storage (Session-Independent)
**Key Change**: PNRs are now stored globally with a TTL and are no longer scoped to sessions.
```
pnr:{pnr} # Global PNR storage
→ {
pnr: string,
status: 'confirmed' | 'cancelled',
createdAt: ISO8601,
expiresAt: ISO8601,
creatingSessionId: string, # For logging only, not access control
segments: [...],
passengers: [...],
totalPrice: number
}
```
**TTL**: Configurable via `PNR_TTL_HOURS` (default 1 hour)
**Access**: Any session can retrieve any PNR (global retrieval)
**Expiration**: PNR auto-deleted by Valkey after TTL expires
---
### Session PNR Reference
Sessions track which PNRs they created (for `listBookings` tool):
```
session:{sessionId}:pnrs # Set of PNR codes created in this session
→ Set<string> # e.g., ["TEST-ABC123", "TEST-DEF456"]
```
**Purpose**: Enable `listBookings` to return session-created PNRs
**Lifecycle**: Deleted when session expires (PNRs persist independently)
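The relationship between global PNR storage and per-session references can be illustrated with an in-memory stand-in. This is only a sketch: `InMemoryPnrStore` is a hypothetical class that uses `Map` and `Set` in place of Valkey, and expiry is checked on read rather than enforced by a real TTL.

```typescript
interface StoredPnr {
  pnr: string;
  status: "confirmed" | "cancelled";
  creatingSessionId: string; // For logging only, not access control
  expiresAtMs: number;
}

// Hypothetical in-memory stand-in for the Valkey keys described above.
class InMemoryPnrStore {
  private pnrs = new Map<string, StoredPnr>();          // pnr:{pnr}
  private sessionPnrs = new Map<string, Set<string>>(); // session:{sessionId}:pnrs

  create(pnr: string, sessionId: string, nowMs: number, ttlHours: number = 1): void {
    this.pnrs.set(`pnr:${pnr}`, {
      pnr,
      status: "confirmed",
      creatingSessionId: sessionId,
      expiresAtMs: nowMs + ttlHours * 3600 * 1000,
    });
    const key = `session:${sessionId}:pnrs`;
    const codes = this.sessionPnrs.get(key) ?? new Set<string>();
    codes.add(pnr);
    this.sessionPnrs.set(key, codes);
  }

  // Global retrieval: no session argument, so any session can read any PNR.
  retrieve(pnr: string, nowMs: number): StoredPnr | undefined {
    const record = this.pnrs.get(`pnr:${pnr}`);
    if (record === undefined || record.expiresAtMs <= nowMs) return undefined;
    return record;
  }

  // Only PNR codes created in the given session are listed.
  listBookings(sessionId: string): string[] {
    return Array.from(this.sessionPnrs.get(`session:${sessionId}:pnrs`) ?? []);
  }

  // Session expiry removes the reference set; the PNR records themselves persist.
  expireSession(sessionId: string): void {
    this.sessionPnrs.delete(`session:${sessionId}:pnrs`);
  }
}
```

Keeping the reference set separate from the PNR record is what lets a PNR outlive the session that created it.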
---
### Rate Limit Keys
```
ratelimit:{ip}:{window} # Request count for IP in time window
→ integer # e.g., 45 (requests made)
TTL: windowSeconds * 2 # Keep previous window accessible
```
**Example**:
```
ratelimit:192.168.1.1:287456 # Current window (e.g., minute 287456)
→ 45
ratelimit:192.168.1.1:287455 # Previous window
→ 92
```
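The key layout above can be derived from the current time as shown in this sketch. `rateLimitKeys` is a hypothetical helper, and it assumes the window index is the Unix time in seconds divided by the window length, rounded down.

```typescript
interface RateLimitKeys {
  current: string;
  previous: string;
  ttlSeconds: number;
}

// Hypothetical helper: key names and TTL for one IP at a given time.
function rateLimitKeys(ip: string, nowSeconds: number, windowSeconds: number): RateLimitKeys {
  const windowIndex = Math.floor(nowSeconds / windowSeconds);
  return {
    current: `ratelimit:${ip}:${windowIndex}`,
    previous: `ratelimit:${ip}:${windowIndex - 1}`,
    ttlSeconds: windowSeconds * 2, // Keeps the previous window readable
  };
}

// 15 seconds into window 287456 with a 60-second window.
const keys = rateLimitKeys("192.168.1.1", 287456 * 60 + 15, 60);
```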
---
### Connection Tracking
```
connection:{connectionId} # Remote connection metadata
→ { connectionId, sessionId, remoteIP, connectedAt, ... }
TTL: SESSION_TTL_HOURS
```
**Purpose**: Track active HTTP connections for health monitoring
**Cleanup**: Auto-expires with session TTL
---
## Updated Validation Rules (Remote Mode)
### Additional Validations
1. **IP Address Validation**:
- Must be valid IPv4 (e.g., `192.168.1.1`) or IPv6 format
- Used for rate limiting and logging
- Extracted from the `X-Forwarded-For` or `X-Real-IP` header, which should be honored only when set by a trusted proxy
2. **Session ID Format** (HTTP mode):
- Must be valid UUID v4
- Sent via `MCP-Session-ID` header
- Generated by transport if not provided
3. **Rate Limit Headers** (HTTP mode):
- `X-RateLimit-Limit`: Integer > 0
- `X-RateLimit-Remaining`: Integer >= 0
- `X-RateLimit-Reset`: Unix timestamp
- `Retry-After`: Seconds (when limit exceeded)
4. **CORS Headers** (HTTP mode):
- `Origin`: Any (wildcard policy)
- `Access-Control-Allow-Origin`: Must be `*`
- Preflight requests must use OPTIONS method
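Some of the checks above can be sketched in plain TypeScript. The helper names are hypothetical, IPv6 validation is left out for brevity, and `clientIp` assumes lower-cased header names and that the leftmost `X-Forwarded-For` entry is the client, which only holds behind a trusted proxy.

```typescript
// UUID v4: version nibble is 4, variant nibble is 8, 9, a, or b.
const UUID_V4 = /^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;

function isUuidV4(value: string): boolean {
  return UUID_V4.test(value);
}

// Dotted-quad IPv4 only: four decimal octets in 0..255, no leading zeros.
function isIPv4(value: string): boolean {
  const parts = value.split(".");
  return (
    parts.length === 4 &&
    parts.every((p) => /^(0|[1-9]\d{0,2})$/.test(p) && Number(p) <= 255)
  );
}

// Hypothetical helper: picks the client IP from proxy headers.
// Assumes header names are already lower-cased.
function clientIp(headers: Record<string, string | undefined>): string | undefined {
  const forwarded = headers["x-forwarded-for"];
  if (forwarded) {
    const first = (forwarded.split(",")[0] ?? "").trim();
    if (first) return first;
  }
  const real = headers["x-real-ip"];
  return real ? real.trim() : undefined;
}
```

A caller would typically pass the result of `clientIp` through the IP validation before using it as a rate limit key.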
---
## Performance Considerations (Remote Mode)
### Additional Overhead
- **Rate Limiting**: +3 Valkey ops per request (~1-2ms overhead)
- **Connection Tracking**: +1 Valkey write per request (~0.5ms overhead)
- **CORS Preflight**: OPTIONS requests handled immediately (no tool execution)
- **Health Checks**: Separate fast path (no session/rate limit checks)
### Expected Performance (Remote)
- **Search Operations**: <2s (requirement: SC-003)
- **Booking Operations**: <500ms (includes rate limit check)
- **Retrieval Operations**: <200ms (global PNR lookup)
- **Health Check**: <100ms (Valkey ping only)
- **Concurrent Remote Sessions**: 50+ (requirement: SC-012)
### Optimization Strategies
1. **Rate Limit Caching**: Cache IP counters in-memory for 1 second (reduce Valkey ops)
2. **Connection Pooling**: Reuse HTTP/2 connections (handled by Nginx)
3. **Health Check Caching**: Cache health status for 5 seconds
4. **CORS Preflight Caching**: 24-hour `Access-Control-Max-Age`
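The health check caching strategy could look like the sketch below. `cacheFor` is a hypothetical generic helper, and the clock is injectable so the behavior can be exercised without waiting.

```typescript
// Hypothetical helper: wraps compute() so its result is reused for ttlMs.
function cacheFor<T>(
  ttlMs: number,
  compute: () => T,
  now: () => number = Date.now,
): () => T {
  let cached: { value: T; at: number } | undefined;
  return () => {
    const t = now();
    if (cached === undefined || t - cached.at >= ttlMs) {
      // Entry missing or stale: recompute and remember when it was taken.
      cached = { value: compute(), at: t };
    }
    return cached.value;
  };
}

// Example with a fake clock: the status is recomputed at most once per 5 seconds.
let fakeNow = 0;
let computeCalls = 0;
const getHealth = cacheFor(5000, () => {
  computeCalls += 1;
  return { status: "healthy", checkedAt: fakeNow };
}, () => fakeNow);

getHealth();          // computes
fakeNow = 4999;
getHealth();          // served from cache
fakeNow = 5000;
getHealth();          // recomputes after the TTL elapses
```

The trade-off is that a status change can go unnoticed for up to the TTL, so the 5-second value bounds how stale a health response may be.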
---
## Next Steps
Data model complete with remote access entities. Proceed to:
1. ✅ Update contracts/ with health endpoint schema
2. ✅ Update quickstart.md with remote access setup
3. ✅ Run agent context update script