fixing jsonSchema validation by using zod
@@ -392,3 +392,317 @@ No secondary indexes required - all queries use primary keys (sessionId, pnr)
Proceed to contract definition (Phase 1 continued):
- Define MCP tool schemas in `/contracts/mcp-tools.md`
- Create quickstart guide with example usage

---

## Remote Access Entities (Added 2026-04-07)

The following entities support remote MCP access over HTTP/2 with rate limiting, health monitoring, and global PNR retrieval.

### 8. Remote Connection

Represents an active remote client connection over HTTP/2.

**Fields**:
```typescript
{
  connectionId: string;      // Unique connection identifier (UUID)
  sessionId: string;         // Associated MCP session ID
  remoteIP: string;          // Client IP address (for rate limiting)
  connectedAt: number;       // Unix timestamp (ms)
  lastActivityAt: number;    // Unix timestamp (ms)
  transportType: string;     // 'http' | 'stdio'
  userAgent: string;         // Client user agent
  requestCount: number;      // Total requests made in this connection
}
```

**Storage**: Valkey hash at key `connection:{connectionId}`

**Validation**:
- `remoteIP` must be valid IPv4 or IPv6 format
- `transportType` must be 'http' or 'stdio'
- `connectedAt` must be <= `lastActivityAt`

**State Transitions**:
```
[Connected] → [Active] → [Idle] → [Disconnected]
                           ↓
                       [Expired]
```

**Lifecycle**:
- Created on initial HTTP request with new session
- Updated on every request (`lastActivityAt`, `requestCount`)
- Auto-expires after `SESSION_TTL_HOURS` of inactivity
- Explicitly deleted on graceful disconnect
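
The per-request update and the inactivity check in this lifecycle can be sketched as two pure functions. This is only an illustration: the `RemoteConnection` interface mirrors the fields listed above, while `touchConnection` and `isConnectionExpired` are hypothetical names, and in the real server the Valkey key TTL would presumably do the expiring.

```typescript
// Hypothetical in-memory shape mirroring the Remote Connection fields above.
interface RemoteConnection {
  connectionId: string;
  sessionId: string;
  remoteIP: string;
  connectedAt: number;      // Unix timestamp (ms)
  lastActivityAt: number;   // Unix timestamp (ms)
  transportType: "http" | "stdio";
  userAgent: string;
  requestCount: number;
}

const MS_PER_HOUR = 60 * 60 * 1000;

// Returns an updated copy reflecting one more request seen at `nowMs`.
function touchConnection(conn: RemoteConnection, nowMs: number): RemoteConnection {
  return {
    ...conn,
    // Never move lastActivityAt backwards, so connectedAt <= lastActivityAt keeps holding.
    lastActivityAt: Math.max(conn.lastActivityAt, nowMs),
    requestCount: conn.requestCount + 1,
  };
}

// True once the connection has been inactive for at least `ttlHours`
// (the value of SESSION_TTL_HOURS is passed in rather than read from the environment).
function isConnectionExpired(conn: RemoteConnection, nowMs: number, ttlHours: number): boolean {
  return nowMs - conn.lastActivityAt >= ttlHours * MS_PER_HOUR;
}
```

Passing the clock and the TTL in as arguments keeps both functions deterministic, which makes the state transitions easy to unit test.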

---

### 9. Rate Limit Record

Tracks request rates per client IP address for abuse prevention.

**Fields**:
```typescript
{
  clientIP: string;          // Client IP address (key component)
  currentWindow: number;     // Current window index: floor(Unix timestamp in seconds / windowSeconds)
  currentCount: number;      // Request count in current window
  previousCount: number;     // Request count in previous window
  limit: number;             // Max requests allowed per window
  resetAt: number;           // Next window reset timestamp
}
```

**Storage**:
- Current window: Valkey integer at key `ratelimit:{ip}:{currentWindow}`
- Previous window: Valkey integer at key `ratelimit:{ip}:{previousWindow}`

**Validation**:
- `clientIP` must be valid IP address
- `currentCount`, `previousCount` must be non-negative integers
- `limit` must be positive integer (default: 100)
- `resetAt` must be in the future

**Algorithm**: Sliding Window Counter (Hybrid)
```javascript
// Weight of the previous window decays linearly as the current window elapses
const previousWeight = 1 - (elapsedInWindow / windowSeconds);
const estimatedCount = (previousCount * previousWeight) + currentCount;
```
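
As a worked example of the formula, the sketch below wraps it in a function and evaluates it for a 60-second window that is 45 seconds old, with 92 requests in the previous window and 45 in the current one. The function name `estimateRequestCount` is an assumption made for illustration, not part of the spec.

```typescript
// Sliding window counter estimate: the previous window's count is weighted by
// the fraction of that window still covered by the sliding interval.
function estimateRequestCount(
  previousCount: number,
  currentCount: number,
  elapsedInWindow: number, // seconds elapsed in the current window
  windowSeconds: number,
): number {
  const previousWeight = 1 - elapsedInWindow / windowSeconds;
  return previousCount * previousWeight + currentCount;
}

// 45 s into a 60 s window: previousWeight = 0.25, so 92 * 0.25 + 45 = 68
const exampleEstimate = estimateRequestCount(92, 45, 45, 60);
```

With the default limit of 100, an estimate of 68 would still be under the limit, even though the two raw counters add up to 137.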

**Lifecycle**:
- Counter incremented on each request: `INCR ratelimit:{ip}:{window}`
- TTL set to 2× window duration (keeps previous window accessible)
- Auto-expires after TTL (no manual cleanup needed)
- Resets at window boundary (new window key)
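
The key names and TTL used in this lifecycle follow directly from the current time. A minimal sketch, assuming the window index is the Unix time in seconds divided by the window length and rounded down; `rateLimitKeys` and `RateLimitWindowKeys` are hypothetical names:

```typescript
interface RateLimitWindowKeys {
  currentKey: string;      // key to INCR for this request
  previousKey: string;     // key read for the weighted previous count
  ttlSeconds: number;      // 2x window, so the previous window stays readable
  elapsedInWindow: number; // seconds since the current window started
}

function rateLimitKeys(ip: string, nowSeconds: number, windowSeconds: number): RateLimitWindowKeys {
  const currentWindow = Math.floor(nowSeconds / windowSeconds);
  return {
    currentKey: `ratelimit:${ip}:${currentWindow}`,
    previousKey: `ratelimit:${ip}:${currentWindow - 1}`,
    ttlSeconds: windowSeconds * 2,
    elapsedInWindow: nowSeconds - currentWindow * windowSeconds,
  };
}
```

Because the window index is part of the key, crossing a window boundary simply starts a fresh counter, and the old one disappears when its TTL runs out.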

**Error Response** (when limit exceeded):
```typescript
{
  error: "Rate limit exceeded",
  code: "RATE_LIMIT_EXCEEDED",
  limit: 100,
  current: 105,
  resetAt: "2026-04-07T10:01:00.000Z",
  retryAfter: 15  // seconds
}
```
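
One way the `resetAt` and `retryAfter` values could be derived is sketched below. It assumes `resetAt` is the next fixed window boundary and `retryAfter` is the whole number of seconds until then; the spec does not state this, and `buildRateLimitError` is a hypothetical helper.

```typescript
interface RateLimitErrorBody {
  error: string;
  code: string;
  limit: number;
  current: number;
  resetAt: string;    // ISO 8601
  retryAfter: number; // seconds
}

function buildRateLimitError(
  limit: number,
  current: number,
  nowMs: number,
  windowSeconds: number,
): RateLimitErrorBody {
  const windowMs = windowSeconds * 1000;
  // Assumption: the limit resets at the start of the next fixed window.
  const resetMs = (Math.floor(nowMs / windowMs) + 1) * windowMs;
  return {
    error: "Rate limit exceeded",
    code: "RATE_LIMIT_EXCEEDED",
    limit,
    current,
    resetAt: new Date(resetMs).toISOString(),
    retryAfter: Math.ceil((resetMs - nowMs) / 1000),
  };
}
```

Called with a clock reading of 10:00:45 UTC on 2026-04-07 and a 60-second window, this reproduces the `resetAt` and `retryAfter` values in the example response.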

---

### 10. Health Status

Represents server operational status for monitoring and health checks.

**Fields**:
```typescript
{
  status: string;            // 'healthy' | 'degraded' | 'unhealthy'
  uptime: number;            // Seconds since server start
  version: string;           // Server version
  connections: {
    stdio: number;           // Active stdio connections (0 or 1 typically)
    http: number;            // Active HTTP connections
    total: number;           // Total active connections
  },
  sessions: {
    active: number;          // Active sessions (sessions with recent activity)
    total: number;           // Total sessions in Valkey
  },
  storage: {
    connected: boolean;      // Valkey connection status
    responseTime: number;    // Valkey ping response time (ms)
  },
  memory: {
    used: number;            // Process memory used (MB)
    total: number;           // Total system memory (MB)
    percentage: number;      // Memory usage percentage
  },
  timestamp: number;         // Health check timestamp (Unix ms)
}
```

**Endpoint**: `GET /health` (unauthenticated, exempt from rate limiting)

**Status Determination**:
- **healthy**: All systems operational, storage connected, memory < 80%
- **degraded**: Storage slow (ping > 100ms) or memory 80-90%
- **unhealthy**: Storage disconnected or memory > 90%

**Response Codes**:
- 200 OK: status = 'healthy'
- 200 OK: status = 'degraded' (still serving requests)
- 503 Service Unavailable: status = 'unhealthy'
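
The status rules and response codes above can be expressed as a small decision function. This is a sketch rather than the server's actual code: `HealthInputs`, `determineHealthState`, and `httpStatusForHealth` are hypothetical names, and treating exactly 80% and exactly 90% memory as degraded is one reading of the "80-90%" range.

```typescript
type HealthState = "healthy" | "degraded" | "unhealthy";

interface HealthInputs {
  storageConnected: boolean;
  storageResponseTimeMs: number; // Valkey ping time
  memoryPercentage: number;      // 0-100
}

function determineHealthState(h: HealthInputs): HealthState {
  // Unhealthy conditions take priority over degraded ones.
  if (!h.storageConnected || h.memoryPercentage > 90) return "unhealthy";
  if (h.storageResponseTimeMs > 100 || h.memoryPercentage >= 80) return "degraded";
  return "healthy";
}

// Only 'unhealthy' maps to 503; a degraded server still answers 200.
function httpStatusForHealth(state: HealthState): number {
  return state === "unhealthy" ? 503 : 200;
}
```

Checking the unhealthy conditions first means a disconnected store can never be reported as merely degraded.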

**Example Response**:
```json
{
  "status": "healthy",
  "uptime": 3600,
  "version": "0.1.0",
  "connections": {
    "stdio": 0,
    "http": 12,
    "total": 12
  },
  "sessions": {
    "active": 8,
    "total": 15
  },
  "storage": {
    "connected": true,
    "responseTime": 2
  },
  "memory": {
    "used": 45,
    "total": 8192,
    "percentage": 0.55
  },
  "timestamp": 1712486460000
}
```

**Use Cases**:
- Load balancer health checks
- Monitoring system integration (Prometheus, Datadog)
- Deployment validation (verify server started successfully)
- Debugging (check connection counts, memory usage)

---

## Updated Storage Schema (Remote Mode)

### Global PNR Storage (Session-Independent)

**Key Change**: PNRs are now stored globally with a TTL rather than scoped to sessions.

```
pnr:{pnr}                          # Global PNR storage
  → {
      pnr: string,
      status: 'confirmed' | 'cancelled',
      createdAt: ISO8601,
      expiresAt: ISO8601,
      creatingSessionId: string,   # For logging only, not access control
      segments: [...],
      passengers: [...],
      totalPrice: number
    }
```

**TTL**: Configurable via `PNR_TTL_HOURS` (default 1 hour)

**Access**: Any session can retrieve any PNR (global retrieval)

**Expiration**: PNR auto-deleted by Valkey after TTL expires
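
To illustrate how the `createdAt`, `expiresAt`, and key TTL values relate, the sketch below derives all three from a creation time and a TTL in hours. The helper name `pnrExpiry` is an assumption, and the value of `PNR_TTL_HOURS` is passed in as an argument rather than read from the environment.

```typescript
interface PnrExpiry {
  createdAt: string;  // ISO 8601
  expiresAt: string;  // ISO 8601
  ttlSeconds: number; // TTL to set on the pnr:{pnr} key
}

function pnrExpiry(createdAtMs: number, ttlHours: number): PnrExpiry {
  const ttlSeconds = Math.round(ttlHours * 3600);
  return {
    createdAt: new Date(createdAtMs).toISOString(),
    expiresAt: new Date(createdAtMs + ttlSeconds * 1000).toISOString(),
    ttlSeconds,
  };
}
```

Deriving `expiresAt` from the same TTL that is set on the key keeps the stored field and the actual Valkey expiry in agreement.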

---

### Session PNR Reference

Sessions track which PNRs they created (for `listBookings` tool):

```
session:{sessionId}:pnrs           # Set of PNR codes created in this session
  → Set<string>                    # e.g., ["TEST-ABC123", "TEST-DEF456"]
```

**Purpose**: Enable `listBookings` to return session-created PNRs

**Lifecycle**: Deleted when session expires (PNRs persist independently)

---

### Rate Limit Keys

```
ratelimit:{ip}:{window}            # Request count for IP in time window
  → integer                        # e.g., 45 (requests made)
  TTL: windowSeconds * 2           # Keep previous window accessible
```

**Example**:
```
ratelimit:192.168.1.1:287456       # Current window (e.g., minute 287456)
  → 45
ratelimit:192.168.1.1:287455       # Previous window
  → 92
```

---

### Connection Tracking

```
connection:{connectionId}          # Remote connection metadata
  → { connectionId, sessionId, remoteIP, connectedAt, ... }
  TTL: SESSION_TTL_HOURS
```

**Purpose**: Track active HTTP connections for health monitoring

**Cleanup**: Auto-expires with session TTL

---

## Updated Validation Rules (Remote Mode)

### Additional Validations

1. **IP Address Validation**:
   - Must be valid IPv4 (e.g., `192.168.1.1`) or IPv6 format
   - Used for rate limiting and logging
   - Extracted from `X-Forwarded-For` or `X-Real-IP` headers (trusted proxy)

2. **Session ID Format** (HTTP mode):
   - Must be valid UUID v4
   - Sent via `MCP-Session-ID` header
   - Generated by transport if not provided

3. **Rate Limit Headers** (HTTP mode):
   - `X-RateLimit-Limit`: Integer > 0
   - `X-RateLimit-Remaining`: Integer >= 0
   - `X-RateLimit-Reset`: Unix timestamp
   - `Retry-After`: Seconds (when limit exceeded)

4. **CORS Headers** (HTTP mode):
   - `Origin`: Any (wildcard policy)
   - `Access-Control-Allow-Origin`: Must be `*`
   - Preflight requests must use OPTIONS method
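
Two of the checks above lend themselves to a short sketch: picking the client IP out of the proxy headers and checking the session ID format. The names `HeaderMap`, `extractClientIP`, and `isUuidV4` are hypothetical, and header names are assumed to be lower-cased already (as Node's `http` module delivers them). Trusting these headers is only reasonable when the request really came through the trusted proxy.

```typescript
type HeaderMap = Record<string, string | undefined>;

// Prefer the left-most X-Forwarded-For entry (the original client), then X-Real-IP.
function extractClientIP(headers: HeaderMap): string | undefined {
  const forwarded = headers["x-forwarded-for"];
  if (forwarded) {
    const first = forwarded.split(",")[0]?.trim();
    if (first) return first;
  }
  const realIP = headers["x-real-ip"]?.trim();
  return realIP ? realIP : undefined;
}

// UUID v4: version nibble is 4 and the variant nibble is one of 8, 9, a, b.
const UUID_V4_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;

function isUuidV4(value: string): boolean {
  return UUID_V4_PATTERN.test(value);
}
```

The extracted string still needs the IPv4/IPv6 format check from rule 1 before it is used as a rate-limit key.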

---

## Performance Considerations (Remote Mode)

### Additional Overhead

- **Rate Limiting**: +3 Valkey ops per request (~1-2ms overhead)
- **Connection Tracking**: +1 Valkey write per request (~0.5ms overhead)
- **CORS Preflight**: OPTIONS requests handled immediately (no tool execution)
- **Health Checks**: Separate fast path (no session/rate limit checks)

### Expected Performance (Remote)

- **Search Operations**: <2s (requirement: SC-003)
- **Booking Operations**: <500ms (includes rate limit check)
- **Retrieval Operations**: <200ms (global PNR lookup)
- **Health Check**: <100ms (Valkey ping only)
- **Concurrent Remote Sessions**: 50+ (requirement: SC-012)

### Optimization Strategies

1. **Rate Limit Caching**: Cache IP counters in-memory for 1 second (reduce Valkey ops)
2. **Connection Pooling**: Reuse HTTP/2 connections (handled by Nginx)
3. **Health Check Caching**: Cache health status for 5 seconds
4. **CORS Preflight Caching**: 24-hour `Access-Control-Max-Age`
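
Both the rate limit cache (1 second) and the health status cache (5 seconds) amount to a small in-memory cache with a time-to-live. A minimal sketch, assuming a hypothetical `TtlCache` class with an injectable clock so the expiry behaviour can be tested without waiting:

```typescript
class TtlCache<T> {
  private readonly entries = new Map<string, { value: T; expiresAtMs: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value, or undefined if it is missing or has expired.
  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAtMs) {
      this.entries.delete(key); // drop stale entries lazily on read
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: T): void {
    this.entries.set(key, { value, expiresAtMs: this.now() + this.ttlMs });
  }
}
```

A health handler could hold a `new TtlCache(5000)` and a rate limiter a `new TtlCache<number>(1000)`. One trade-off worth noting is that cached counters lag behind Valkey, so a client may briefly exceed the limit within the cache lifetime.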

---

## Next Steps

Data model complete with remote access entities. Proceed to:
1. ✅ Update contracts/ with health endpoint schema
2. ✅ Update quickstart.md with remote access setup
3. ✅ Run agent context update script