# Data Model: Mock GDS MCP Server
**Branch**: `001-mock-gds-server` | **Date**: 2026-04-07
## Overview
This document defines the core data entities, their relationships, validation rules, and state transitions for the Mock GDS MCP server.
## Core Entities
### 1. Session
Represents an MCP client connection with isolated booking context.
**Fields**:
```typescript
{
id: string; // Unique session identifier (UUID)
createdAt: number; // Unix timestamp (ms)
expiresAt: number; // Unix timestamp (ms), TTL-based
lastActivity: number; // Unix timestamp (ms)
bookingCount: number; // Number of bookings created in session
searchCount: number; // Number of searches performed
}
```
**Storage**: Valkey hash at key `gds:session:{sessionId}`
**Validation**:
- `id` must be valid UUID v4
- `expiresAt` must be > `createdAt`
- Automatic expiry via Valkey TTL (default 1 hour)
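The two field-level rules above can be checked in a few lines. This is a minimal sketch rather than the server's actual code: `validateSession` and `UUID_V4` are hypothetical names, and TTL expiry is assumed to be left entirely to Valkey.

```typescript
interface Session {
  id: string;
  createdAt: number;
  expiresAt: number;
  lastActivity: number;
  bookingCount: number;
  searchCount: number;
}

// UUID v4: the version nibble is 4 and the variant nibble is 8, 9, a, or b.
const UUID_V4 =
  /^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;

// Hypothetical helper: returns the list of violated rules (empty when valid).
function validateSession(s: Session): string[] {
  const errors: string[] = [];
  if (!UUID_V4.test(s.id)) errors.push("id must be a valid UUID v4");
  if (s.expiresAt <= s.createdAt) errors.push("expiresAt must be > createdAt");
  return errors;
}
```

Returning a list of violations instead of throwing lets a caller report every problem in one response.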
**State Transitions**:
```
[Created] → [Active] → [Expired]
            [Active] → [Ended]
```
### 2. Passenger
Represents a traveler in a booking.
**Fields**:
```typescript
{
id: string; // Unique within booking
type: enum; // 'adult' | 'child' | 'infant'
firstName: string; // Required
lastName: string; // Required
dateOfBirth: string; // ISO 8601 date (YYYY-MM-DD), optional
email: string; // Email address, optional
phone: string; // Phone number, optional
frequentFlyerNumber: string; // Airline loyalty number, optional
}
```
**Validation**:
- `firstName`, `lastName` must be 1-50 characters, alphabetic + spaces/hyphens
- `email` must match RFC 5322 pattern if provided
- `phone` must match E.164 pattern if provided
- `type` must be valid enum value
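As an illustration of the passenger rules, the sketch below applies them with regular expressions. The names are hypothetical, "alphabetic" is assumed to mean ASCII letters, and the email pattern is a deliberately simplified stand-in for a full RFC 5322 check.

```typescript
// Hypothetical input shape: the subset of Passenger fields that carry rules.
interface PassengerInput {
  type: string;
  firstName: string;
  lastName: string;
  email?: string;
  phone?: string;
}

const PASSENGER_TYPES = ["adult", "child", "infant"];
// Assumption: ASCII letters only; widen the class to accept accented names.
const NAME_PATTERN = /^[A-Za-z -]{1,50}$/;
// Simplified stand-in for RFC 5322 (the full grammar is far larger).
const EMAIL_PATTERN = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
// E.164: "+", a non-zero leading digit, at most 15 digits in total.
const E164_PATTERN = /^\+[1-9]\d{1,14}$/;

// Returns the names of the fields that failed validation.
function validatePassenger(p: PassengerInput): string[] {
  const errors: string[] = [];
  if (PASSENGER_TYPES.indexOf(p.type) === -1) errors.push("type");
  if (!NAME_PATTERN.test(p.firstName)) errors.push("firstName");
  if (!NAME_PATTERN.test(p.lastName)) errors.push("lastName");
  if (p.email !== undefined && !EMAIL_PATTERN.test(p.email)) errors.push("email");
  if (p.phone !== undefined && !E164_PATTERN.test(p.phone)) errors.push("phone");
  return errors;
}
```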
### 3. FlightSegment
Represents a single flight leg in an itinerary.
**Fields**:
```typescript
{
id: string; // Unique segment identifier
flightNumber: string; // e.g., "AA123"
airlineCode: string; // IATA 2-letter code (e.g., "AA")
airlineName: string; // Full airline name
originCode: string; // IATA 3-letter airport code (e.g., "JFK")
originName: string; // Airport name
destinationCode: string; // IATA 3-letter airport code
destinationName: string; // Airport name
departureTime: string; // ISO 8601 datetime
arrivalTime: string; // ISO 8601 datetime
duration: number; // Minutes
aircraftType: string; // e.g., "Boeing 737-800"
cabin: enum; // 'economy' | 'premium_economy' | 'business' | 'first'
price: number; // USD cents (e.g., 29900 = $299.00)
seatsAvailable: number; // Available seats count
bookingClass: string; // Fare class code (e.g., "Y", "J", "F")
status: enum; // 'available' | 'sold_out' | 'cancelled'
}
```
**Validation**:
- `airlineCode` must be valid IATA 2-letter code
- `originCode`, `destinationCode` must be valid IATA 3-letter codes
- `originCode` ≠ `destinationCode`
- `departureTime` < `arrivalTime`
- `duration` must match calculated time difference
- `seatsAvailable` must be >= 0
- `price` must be > 0
**Business Rules**:
- Flight times must be realistic for route (e.g., JFK→LAX ~6 hours)
- Prices scale with distance and cabin class
- `status` is `sold_out` when `seatsAvailable` is 0
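The segment checks above might be sketched as follows. All names are hypothetical, the IATA pattern checks format only (not whether the airport exists), and both timestamps are assumed to carry a UTC offset so the duration is computed across time zones.

```typescript
interface FlightCheckInput {
  originCode: string;
  destinationCode: string;
  departureTime: string; // ISO 8601 datetime with offset
  arrivalTime: string;   // ISO 8601 datetime with offset
  duration: number;      // minutes
  seatsAvailable: number;
  price: number;         // USD cents
}

const IATA_AIRPORT = /^[A-Z]{3}$/; // format check only

function minutesBetween(departure: string, arrival: string): number {
  return Math.round((Date.parse(arrival) - Date.parse(departure)) / 60000);
}

function validateFlightSegment(f: FlightCheckInput): string[] {
  const errors: string[] = [];
  if (!IATA_AIRPORT.test(f.originCode)) errors.push("originCode");
  if (!IATA_AIRPORT.test(f.destinationCode)) errors.push("destinationCode");
  if (f.originCode === f.destinationCode) errors.push("origin equals destination");
  const minutes = minutesBetween(f.departureTime, f.arrivalTime);
  // NaN (unparseable dates) also fails this comparison.
  if (!(minutes > 0)) errors.push("departureTime must precede arrivalTime");
  if (f.duration !== minutes) errors.push("duration");
  if (!Number.isInteger(f.seatsAvailable) || f.seatsAvailable < 0) errors.push("seatsAvailable");
  if (!(f.price > 0)) errors.push("price");
  return errors;
}

// Business rule: a flight with no seats left is reported as sold out.
function deriveStatus(seatsAvailable: number): "available" | "sold_out" {
  return seatsAvailable === 0 ? "sold_out" : "available";
}
```

For example, a JFK departure at `08:00-04:00` and an LAX arrival at `11:15-07:00` on the same day is 375 minutes, not 195, because the offsets differ by three hours.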
### 4. HotelReservation
Represents a hotel booking segment.
**Fields**:
```typescript
{
id: string; // Unique reservation identifier
hotelCode: string; // Internal hotel identifier
hotelName: string; // Hotel property name
chainCode: string; // Hotel chain code (e.g., "MAR" for Marriott)
chainName: string; // Hotel chain name
address: string; // Full address
cityCode: string; // IATA city code (e.g., "LAX")
cityName: string; // City name
checkInDate: string; // ISO 8601 date (YYYY-MM-DD)
checkOutDate: string; // ISO 8601 date (YYYY-MM-DD)
nights: number; // Calculated night count
roomType: string; // e.g., "Standard King", "Deluxe Suite"
rateCode: string; // Rate plan code
starRating: number; // 1-5 stars
price: number; // USD cents, total for stay
pricePerNight: number; // USD cents
guestCount: number; // Number of guests
amenities: string[]; // List of amenities
status: enum; // 'available' | 'sold_out' | 'confirmed' | 'cancelled'
}
```
**Validation**:
- `checkInDate` < `checkOutDate`
- `nights` must equal date difference
- `starRating` must be 1-5
- `guestCount` must be >= 1
- `price` must equal `pricePerNight` * `nights`
**Business Rules**:
- Check-in date must not be in the past
- Minimum 1 night stay
- Prices vary by star rating and city
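A possible way to enforce the night-count and price rules is sketched below. The helper names and the `today` parameter are assumptions made for illustration; dates are compared as `YYYY-MM-DD` strings, which sort chronologically.

```typescript
interface HotelCheckInput {
  checkInDate: string;  // YYYY-MM-DD
  checkOutDate: string; // YYYY-MM-DD
  nights: number;
  starRating: number;
  guestCount: number;
  price: number;         // USD cents, total for stay
  pricePerNight: number; // USD cents
}

function nightsBetween(checkIn: string, checkOut: string): number {
  // Date-only ISO strings parse as UTC midnight, so DST cannot skew the count.
  return Math.round((Date.parse(checkOut) - Date.parse(checkIn)) / 86400000);
}

function validateHotelReservation(h: HotelCheckInput, today: string): string[] {
  const errors: string[] = [];
  const nights = nightsBetween(h.checkInDate, h.checkOutDate);
  if (!(nights >= 1)) errors.push("checkInDate must precede checkOutDate");
  if (h.nights !== nights) errors.push("nights");
  if (h.starRating < 1 || h.starRating > 5) errors.push("starRating");
  if (h.guestCount < 1) errors.push("guestCount");
  if (h.price !== h.pricePerNight * h.nights) errors.push("price");
  if (h.checkInDate < today) errors.push("checkInDate is in the past");
  return errors;
}
```

Because all prices are integer cents, the `price` comparison is exact and needs no floating-point tolerance.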
### 5. CarRental
Represents a car rental segment.
**Fields**:
```typescript
{
id: string; // Unique rental identifier
companyCode: string; // Rental company code (e.g., "ZE" for Hertz)
companyName: string; // Company name
pickupLocationCode: string; // Airport code or location ID
pickupLocationName: string; // Location name
dropoffLocationCode: string; // Airport code or location ID
dropoffLocationName: string; // Location name
pickupDate: string; // ISO 8601 datetime
dropoffDate: string; // ISO 8601 datetime
vehicleClass: enum; // 'economy' | 'compact' | 'midsize' | 'fullsize' | 'suv' | 'luxury'
vehicleModel: string; // e.g., "Toyota Camry or similar"
dailyRate: number; // USD cents per day
totalPrice: number; // USD cents
rentalDays: number; // Number of days
mileagePolicy: enum; // 'unlimited' | 'limited'
insuranceIncluded: boolean;
status: enum; // 'available' | 'confirmed' | 'cancelled'
}
```
**Validation**:
- `pickupDate` < `dropoffDate`
- `rentalDays` must match date calculation
- `totalPrice` must equal `dailyRate` * `rentalDays`
- `pickupLocationCode` should match airport/city code for traveler's destination
**Business Rules**:
- Same-location dropoff preferred (one-way rentals add surcharge)
- Pickup date should align with flight arrival
- Dropoff date should align with departure flight
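The document does not say how `rentalDays` is derived from the two timestamps. The sketch below assumes the common convention of billing each started 24-hour period as a full day; the function names are hypothetical.

```typescript
// Assumption: every started 24-hour period counts as one rental day.
function rentalDaysBetween(pickupDate: string, dropoffDate: string): number {
  const ms = Date.parse(dropoffDate) - Date.parse(pickupDate);
  return Math.ceil(ms / 86400000);
}

// Checks the two arithmetic rules: rentalDays and totalPrice.
function carPricingIsConsistent(
  pickupDate: string,
  dropoffDate: string,
  rentalDays: number,
  dailyRate: number,  // USD cents per day
  totalPrice: number, // USD cents
): boolean {
  const days = rentalDaysBetween(pickupDate, dropoffDate);
  return days >= 1 && rentalDays === days && totalPrice === dailyRate * rentalDays;
}
```

Under this convention a pickup at noon on 1 May and a dropoff at 10:00 on 4 May counts as three days.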
### 6. PNR (Passenger Name Record)
Represents a complete booking with multiple service segments.
**Fields**:
```typescript
{
pnr: string; // Unique booking reference (format: TEST-{BASE32})
sessionId: string; // Session that created the booking
createdAt: number; // Unix timestamp (ms)
lastModified: number; // Unix timestamp (ms)
status: enum; // 'pending' | 'confirmed' | 'cancelled'
passengers: Passenger[]; // Array of passengers
flights: FlightSegment[]; // Array of flight segments
hotels: HotelReservation[]; // Array of hotel bookings
cars: CarRental[]; // Array of car rentals
totalPrice: number; // USD cents, sum of all segments
currency: string; // Always "USD" for mock data
contactEmail: string; // Primary contact email
contactPhone: string; // Primary contact phone
}
```
**Storage**: Valkey hash at key `gds:session:{sessionId}:booking:{pnr}`
**Validation**:
- `pnr` must match format `TEST-[A-Z0-9]{6}`
- Must have at least one passenger
- Must have at least one service (flight, hotel, or car)
- `totalPrice` must equal sum of all segment prices
- `contactEmail` or `contactPhone` required (at least one)
**State Transitions**:
```
[Pending] → [Confirmed] → [Cancelled]
            [Confirmed] → [Modified] → [Confirmed]
```
**Business Rules**:
- Cannot modify after cancellation
- Hotel dates must overlap with flight dates
- Car pickup should align with flight arrival
- Multi-city bookings require connecting flights
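The format, pricing, and state rules for a PNR could be expressed roughly as below. This is a sketch under assumptions: the transition table is one reading of the state diagram (a modification returns the booking to `confirmed`), and all names are hypothetical.

```typescript
type PnrStatus = "pending" | "confirmed" | "cancelled";

interface PnrPricing {
  flights: { price: number }[];
  hotels: { price: number }[];
  cars: { totalPrice: number }[];
  totalPrice: number;
}

const PNR_FORMAT = /^TEST-[A-Z0-9]{6}$/;

// Assumed reading of the diagram; "cancelled" is terminal.
const NEXT_STATUS: Record<PnrStatus, PnrStatus[]> = {
  pending: ["confirmed"],
  confirmed: ["confirmed", "cancelled"],
  cancelled: [],
};

function canTransition(from: PnrStatus, to: PnrStatus): boolean {
  return NEXT_STATUS[from].indexOf(to) !== -1;
}

// Sums every segment price in USD cents.
function sumSegmentPrices(p: PnrPricing): number {
  let total = 0;
  for (const f of p.flights) total += f.price;
  for (const h of p.hotels) total += h.price;
  for (const c of p.cars) total += c.totalPrice;
  return total;
}

function totalPriceIsConsistent(p: PnrPricing): boolean {
  return p.totalPrice === sumSegmentPrices(p);
}
```

Keeping the allowed transitions in a table makes the "cannot modify after cancellation" rule a data lookup instead of scattered conditionals.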
### 7. SearchQuery
Represents a search request (flights, hotels, or cars).
**Fields**:
```typescript
{
id: string; // Unique search identifier
sessionId: string; // Session performing search
type: enum; // 'flight' | 'hotel' | 'car'
timestamp: number; // Unix timestamp (ms)
parameters: object; // Type-specific search params
resultCount: number; // Number of results returned
responseTime: number; // Milliseconds to generate results
}
```
**Storage**: Ephemeral (not persisted), tracked for statistics only
### 8. MockDataRecord
Represents a static mock data entry (airports, airlines, hotels, etc.).
**Fields**:
```typescript
{
type: enum; // 'airport' | 'airline' | 'hotel' | 'car_company'
code: string; // IATA/ICAO code or internal ID
name: string; // Full name
metadata: object; // Type-specific data (coordinates, address, etc.)
}
```
**Storage**: In-memory JavaScript modules, not in Valkey
## Relationships
### Session ↔ PNR
- **Type**: One-to-Many
- **Description**: A session can create multiple bookings
- **Key**: `sessionId` in PNR references Session
- **Cascade**: PNRs remain accessible after session expires (for retrieval)
### PNR ↔ Passengers
- **Type**: One-to-Many
- **Description**: A booking contains multiple passengers
- **Embedded**: Passengers stored within PNR document
- **Constraint**: Minimum 1 passenger per PNR
### PNR ↔ FlightSegments
- **Type**: One-to-Many
- **Description**: A booking can include multiple flight legs
- **Embedded**: Flights stored within PNR document
- **Ordering**: Flights ordered chronologically
### PNR ↔ HotelReservations
- **Type**: One-to-Many
- **Description**: A booking can include multiple hotel stays
- **Embedded**: Hotels stored within PNR document
### PNR ↔ CarRentals
- **Type**: One-to-Many
- **Description**: A booking can include multiple car rentals
- **Embedded**: Cars stored within PNR document
## Data Validation Rules
### Cross-Entity Validation
1. **Date Consistency**:
   - Hotel check-in must be >= flight arrival date
   - Hotel check-out must be <= return flight departure date
   - Car pickup must be >= flight arrival date
   - Car dropoff must be <= return flight departure date
2. **Location Consistency**:
   - Hotel city should match flight destination
   - Car pickup location should match airport or destination city
3. **Passenger Consistency**:
   - All segments in a PNR share the same passenger list
   - Passenger count must match across segments
4. **Pricing Integrity**:
   - PNR total must equal sum of all segment prices
   - Segment prices must be positive integers
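The date consistency rules compare calendar dates, so one simple approach is to take the `YYYY-MM-DD` prefix of each ISO 8601 value and compare the strings. The sketch below assumes that the date as written in each timestamp (local to the airport) is the one that matters; the interface and function names are hypothetical.

```typescript
interface ItineraryDates {
  outboundArrival: string; // ISO 8601 datetime, arrival at the destination
  returnDeparture: string; // ISO 8601 datetime, departure of the return flight
  hotelCheckIn?: string;   // YYYY-MM-DD
  hotelCheckOut?: string;  // YYYY-MM-DD
  carPickup?: string;      // ISO 8601 datetime
  carDropoff?: string;     // ISO 8601 datetime
}

// YYYY-MM-DD strings sort chronologically, so plain comparison is enough.
function datePart(iso: string): string {
  return iso.slice(0, 10);
}

function checkDateConsistency(t: ItineraryDates): string[] {
  const errors: string[] = [];
  const arrival = datePart(t.outboundArrival);
  const departure = datePart(t.returnDeparture);
  if (t.hotelCheckIn !== undefined && datePart(t.hotelCheckIn) < arrival)
    errors.push("hotel check-in before flight arrival");
  if (t.hotelCheckOut !== undefined && datePart(t.hotelCheckOut) > departure)
    errors.push("hotel check-out after return departure");
  if (t.carPickup !== undefined && datePart(t.carPickup) < arrival)
    errors.push("car pickup before flight arrival");
  if (t.carDropoff !== undefined && datePart(t.carDropoff) > departure)
    errors.push("car dropoff after return departure");
  return errors;
}
```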
### Validation Timing
- **Search-time**: Parameter validation (dates, codes, counts)
- **Booking-time**: Business rule validation (date logic, location consistency)
- **Retrieval-time**: PNR format validation
- **Modification-time**: State transition validation (no modification of cancelled bookings)
## Mock Data Generation Rules
### Deterministic Generation
- Same search inputs → same results (when MOCK_DATA_SEED=fixed)
- PNR generation uses session-scoped sequence + timestamp hash
- Flight schedules fixed based on route (JFK→LAX always ~6 hours)
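One way to get "same inputs, same results" is to hash the seed together with the search parameters and feed the hash into a small seeded generator. The sketch below uses FNV-1a and mulberry32, which are swapped-in techniques chosen for illustration, not something the document prescribes; `mockEconomyPriceCents` is a hypothetical function, and its range follows the $200-$800 economy band listed under Realistic Constraints.

```typescript
// FNV-1a: a small non-cryptographic 32-bit string hash.
function fnv1a(text: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// mulberry32: a tiny seeded PRNG returning floats in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical generator: economy fare in USD cents, 20000..80000 inclusive.
function mockEconomyPriceCents(
  seed: string,
  origin: string,
  destination: string,
  date: string,
): number {
  const rand = mulberry32(fnv1a([seed, origin, destination, date].join("|")));
  return 20000 + Math.floor(rand() * 60001);
}
```

Because the generator state is derived only from the seed and the query, repeated searches return identical fares without storing any results.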
### Realistic Constraints
- Flight prices: $200-$800 domestic economy, $800-$2000 business, $2500+ first
- Hotel prices: $80-$150 budget, $150-$300 midrange, $300-$800 luxury
- Car rental: $35-$50 economy, $50-$80 midsize, $100-$150 luxury
- Flight duration: Calculated from route distance
- Availability: 90% flights available, 10% sold out
### Data Coverage
- **Airports**: ~100 major airports (top 50 US + top 50 international)
- **Airlines**: ~30 major carriers
- **Hotels**: ~50 chains/properties across major cities
- **Car Companies**: ~6 major rental companies
## State Management (Valkey)
### Key Naming Convention
```
gds:session:{sessionId} # Session metadata
gds:session:{sessionId}:booking:{pnr} # Individual booking
gds:session:{sessionId}:bookings # Set of all PNRs in session
gds:session:{sessionId}:searches # List of search IDs
gds:stats:bookings:total # Global booking counter
gds:stats:sessions:active # Set of active session IDs
```
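Centralizing the key patterns in one module keeps them from drifting apart across call sites. A minimal sketch, with `keys` as a hypothetical name:

```typescript
// Builders for the Valkey keys listed above.
const keys = {
  session: (sessionId: string): string => `gds:session:${sessionId}`,
  booking: (sessionId: string, pnr: string): string =>
    `gds:session:${sessionId}:booking:${pnr}`,
  bookings: (sessionId: string): string => `gds:session:${sessionId}:bookings`,
  searches: (sessionId: string): string => `gds:session:${sessionId}:searches`,
  statsBookingsTotal: "gds:stats:bookings:total",
  statsSessionsActive: "gds:stats:sessions:active",
};
```

For example, `keys.booking("abc", "TEST-ABC123")` yields `gds:session:abc:booking:TEST-ABC123`.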
### TTL Strategy
- **Sessions**: 1 hour (configurable via MCP_SESSION_TIMEOUT)
- **Bookings**: No expiry (persist beyond session for retrieval)
- **Search history**: 10 minutes (ephemeral)
### Data Serialization
- **Format**: JSON strings for complex objects
- **Encoding**: UTF-8
- **Compression**: None (mock data is small)
## Performance Considerations
### Memory Footprint
- **Per Session**: ~5KB (metadata only)
- **Per Booking**: ~10-30KB (depends on segment count)
- **Mock Data**: ~2MB (embedded in code, not in Valkey)
### Query Patterns
- **Hot Path**: `HGETALL gds:session:{sessionId}:booking:{pnr}` (booking retrieval; bookings are stored as hashes, so a plain `GET` would fail with a type error)
- **Write Path**: `HSET gds:session:{sessionId}:booking:{pnr}` (booking creation)
- **Cleanup**: `SCAN` + `DEL` for expired sessions (background job)
### Indexing
No secondary indexes are required: all queries use primary keys (`sessionId`, `pnr`).
## Next Steps
Proceed to contract definition (Phase 1 continued):
- Define MCP tool schemas in `/contracts/mcp-tools.md`
- Create quickstart guide with example usage
---
## Remote Access Entities (Added 2026-04-07)
The following entities support remote MCP access over HTTP/2 with rate limiting, health monitoring, and global PNR retrieval.
### 8. Remote Connection
Represents an active remote client connection over HTTP/2.
**Fields**:
```typescript
{
connectionId: string; // Unique connection identifier (UUID)
sessionId: string; // Associated MCP session ID
remoteIP: string; // Client IP address (for rate limiting)
connectedAt: number; // Unix timestamp (ms)
lastActivityAt: number; // Unix timestamp (ms)
transportType: string; // 'http' | 'stdio'
userAgent: string; // Client user agent
requestCount: number; // Total requests made in this connection
}
```
**Storage**: Valkey hash at key `connection:{connectionId}`
**Validation**:
- `remoteIP` must be valid IPv4 or IPv6 format
- `transportType` must be 'http' or 'stdio'
- `connectedAt` must be <= `lastActivityAt`
**State Transitions**:
```
[Connected] → [Active] → [Idle] → [Disconnected]
                         [Idle] → [Expired]
```
**Lifecycle**:
- Created on initial HTTP request with new session
- Updated on every request (lastActivityAt, requestCount)
- Auto-expires after SESSION_TTL_HOURS of inactivity
- Explicitly deleted on graceful disconnect
---
### 9. Rate Limit Record
Tracks request rates per client IP address for abuse prevention.
**Fields**:
```typescript
{
clientIP: string; // Client IP address (key component)
currentWindow: number; // Current time window (Unix timestamp / window_seconds)
currentCount: number; // Request count in current window
previousCount: number; // Request count in previous window
limit: number; // Max requests allowed per window
resetAt: number; // Next window reset timestamp
}
```
**Storage**:
- Current window: Valkey integer at key `ratelimit:{ip}:{currentWindow}`
- Previous window: Valkey integer at key `ratelimit:{ip}:{previousWindow}`
**Validation**:
- `clientIP` must be valid IP address
- `currentCount`, `previousCount` must be non-negative integers
- `limit` must be positive integer (default: 100)
- `resetAt` must be in the future
**Algorithm**: Sliding Window Counter (Hybrid)
```javascript
// The previous window's weight shrinks linearly as the current window elapses.
const previousWeight = 1 - (elapsedInWindow / windowSeconds);
const estimatedCount = (previousCount * previousWeight) + currentCount;
```
**Lifecycle**:
- Counter incremented on each request: `INCR ratelimit:{ip}:{window}`
- TTL set to 2× window duration (keeps previous window accessible)
- Auto-expires after TTL (no manual cleanup needed)
- Resets at window boundary (new window key)
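The sliding window estimate can be exercised as a pure function once the two counters have been read from Valkey. The sketch below makes several assumptions: the counters already include the current request, a request is rejected only when the estimate is strictly above the limit, and `retryAfter` is approximated as the time left in the current window. All names are hypothetical.

```typescript
interface RateDecision {
  window: number;         // index of the current window
  estimatedCount: number; // weighted request estimate
  allowed: boolean;
  retryAfter: number;     // seconds; 0 when allowed
}

function rateLimitKey(ip: string, window: number): string {
  return `ratelimit:${ip}:${window}`;
}

function evaluateRateLimit(
  nowSeconds: number,
  windowSeconds: number,
  limit: number,
  currentCount: number,
  previousCount: number,
): RateDecision {
  const window = Math.floor(nowSeconds / windowSeconds);
  const elapsedInWindow = nowSeconds - window * windowSeconds;
  const previousWeight = 1 - elapsedInWindow / windowSeconds;
  const estimatedCount = previousCount * previousWeight + currentCount;
  const allowed = estimatedCount <= limit;
  // Approximation: wait until the window boundary before retrying.
  const retryAfter = allowed ? 0 : Math.ceil(windowSeconds - elapsedInWindow);
  return { window, estimatedCount, allowed, retryAfter };
}
```

With 60-second windows, 15 seconds into window 287456, a previous count of 92 and a current count of 45 give an estimate of 92 × 0.75 + 45 = 114, which exceeds a limit of 100.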
**Error Response** (when limit exceeded):
```typescript
{
error: "Rate limit exceeded",
code: "RATE_LIMIT_EXCEEDED",
limit: 100,
current: 105,
resetAt: "2026-04-07T10:01:00.000Z",
retryAfter: 15 // seconds
}
```
---
### 10. Health Status
Represents server operational status for monitoring and health checks.
**Fields**:
```typescript
{
  status: string;          // 'healthy' | 'degraded' | 'unhealthy'
  uptime: number;          // Seconds since server start
  version: string;         // Server version
  connections: {
    stdio: number;         // Active stdio connections (0 or 1 typically)
    http: number;          // Active HTTP connections
    total: number;         // Total active connections
  },
  sessions: {
    active: number;        // Active sessions (sessions with recent activity)
    total: number;         // Total sessions in Valkey
  },
  storage: {
    connected: boolean;    // Valkey connection status
    responseTime: number;  // Valkey ping response time (ms)
  },
  memory: {
    used: number;          // Process memory used (MB)
    total: number;         // Total system memory (MB)
    percentage: number;    // Memory usage percentage
  },
  timestamp: number;       // Health check timestamp (Unix ms)
}
```
**Endpoint**: `GET /health` (unauthenticated, exempt from rate limiting)
**Status Determination**:
- **healthy**: All systems operational, storage connected, memory < 80%
- **degraded**: Storage slow (ping > 100ms) or memory 80-90%
- **unhealthy**: Storage disconnected or memory > 90%
**Response Codes**:
- 200 OK: status = 'healthy'
- 200 OK: status = 'degraded' (still serving requests)
- 503 Service Unavailable: status = 'unhealthy'
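The status thresholds and response codes above reduce to two small functions. This is a sketch with hypothetical names; memory is expressed in percent, and the boundary values 80 and 90 are assumed to fall in the `degraded` band, as the "80-90%" range suggests.

```typescript
type HealthState = "healthy" | "degraded" | "unhealthy";

function determineHealth(
  storageConnected: boolean,
  pingMs: number,
  memoryPercent: number,
): HealthState {
  // The most severe conditions are checked first.
  if (!storageConnected || memoryPercent > 90) return "unhealthy";
  if (pingMs > 100 || memoryPercent >= 80) return "degraded";
  return "healthy";
}

// Only "unhealthy" maps to 503; a degraded server still serves requests.
function httpStatusFor(state: HealthState): number {
  return state === "unhealthy" ? 503 : 200;
}
```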
**Example Response**:
```json
{
  "status": "healthy",
  "uptime": 3600,
  "version": "0.1.0",
  "connections": {
    "stdio": 0,
    "http": 12,
    "total": 12
  },
  "sessions": {
    "active": 8,
    "total": 15
  },
  "storage": {
    "connected": true,
    "responseTime": 2
  },
  "memory": {
    "used": 45,
    "total": 8192,
    "percentage": 0.55
  },
  "timestamp": 1712486460000
}
```
**Use Cases**:
- Load balancer health checks
- Monitoring system integration (Prometheus, Datadog)
- Deployment validation (verify server started successfully)
- Debugging (check connection counts, memory usage)
---
## Updated Storage Schema (Remote Mode)
### Global PNR Storage (Session-Independent)
**Key Change**: PNRs are now stored globally with a TTL instead of being scoped to sessions.
```
pnr:{pnr}                        # Global PNR storage
→ {
    pnr: string,
    status: 'confirmed' | 'cancelled',
    createdAt: ISO8601,
    expiresAt: ISO8601,
    creatingSessionId: string,   # For logging only, not access control
    segments: [...],
    passengers: [...],
    totalPrice: number
  }
```
**TTL**: Configurable via `PNR_TTL_HOURS` (default 1 hour)
**Access**: Any session can retrieve any PNR (global retrieval)
**Expiration**: PNR auto-deleted by Valkey after TTL expires
---
### Session PNR Reference
Sessions track which PNRs they created (for `listBookings` tool):
```
session:{sessionId}:pnrs # Set of PNR codes created in this session
→ Set<string> # e.g., ["TEST-ABC123", "TEST-DEF456"]
```
**Purpose**: Enable `listBookings` to return session-created PNRs
**Lifecycle**: Deleted when session expires (PNRs persist independently)
---
### Rate Limit Keys
```
ratelimit:{ip}:{window} # Request count for IP in time window
→ integer # e.g., 45 (requests made)
TTL: windowSeconds * 2 # Keep previous window accessible
```
**Example**:
```
ratelimit:192.168.1.1:287456 # Current window (e.g., minute 287456)
→ 45
ratelimit:192.168.1.1:287455 # Previous window
→ 92
```
---
### Connection Tracking
```
connection:{connectionId} # Remote connection metadata
→ { connectionId, sessionId, remoteIP, connectedAt, ... }
TTL: SESSION_TTL_HOURS
```
**Purpose**: Track active HTTP connections for health monitoring
**Cleanup**: Auto-expires with session TTL
---
## Updated Validation Rules (Remote Mode)
### Additional Validations
1. **IP Address Validation**:
   - Must be valid IPv4 (e.g., `192.168.1.1`) or IPv6 format
   - Used for rate limiting and logging
   - Extracted from `X-Forwarded-For` or `X-Real-IP` headers (trusted proxy)
2. **Session ID Format** (HTTP mode):
   - Must be valid UUID v4
   - Sent via `MCP-Session-ID` header
   - Generated by transport if not provided
3. **Rate Limit Headers** (HTTP mode):
   - `X-RateLimit-Limit`: Integer > 0
   - `X-RateLimit-Remaining`: Integer >= 0
   - `X-RateLimit-Reset`: Unix timestamp
   - `Retry-After`: Seconds (when limit exceeded)
4. **CORS Headers** (HTTP mode):
   - `Origin`: Any (wildcard policy)
   - `Access-Control-Allow-Origin`: Must be `*`
   - Preflight requests must use OPTIONS method
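The client IP extraction and rate limit headers might look like the sketch below. It assumes header names arrive lowercased, that the leftmost `X-Forwarded-For` entry is the client (only safe behind a trusted proxy), and that `X-RateLimit-Reset` is expressed in Unix seconds, which the document does not specify. The function names are hypothetical.

```typescript
function clientIpFromHeaders(
  headers: Record<string, string | undefined>,
  socketAddress: string,
): string {
  const forwarded = headers["x-forwarded-for"];
  if (forwarded) {
    // The header may hold a chain: "client, proxy1, proxy2".
    const first = (forwarded.split(",")[0] || "").trim();
    if (first) return first;
  }
  return headers["x-real-ip"] || socketAddress;
}

function rateLimitHeaders(
  limit: number,
  used: number,
  resetAtUnixSeconds: number,
  nowUnixSeconds: number,
): Record<string, string> {
  const headers: Record<string, string> = {
    "X-RateLimit-Limit": String(limit),
    "X-RateLimit-Remaining": String(Math.max(0, limit - used)),
    "X-RateLimit-Reset": String(resetAtUnixSeconds),
  };
  // Retry-After is only sent once the limit has been exceeded.
  if (used > limit) {
    headers["Retry-After"] = String(Math.max(0, resetAtUnixSeconds - nowUnixSeconds));
  }
  return headers;
}
```

Clamping `X-RateLimit-Remaining` at zero keeps the header within its documented `>= 0` range even when the counter overshoots the limit.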
---
## Performance Considerations (Remote Mode)
### Additional Overhead
- **Rate Limiting**: +3 Valkey ops per request (~1-2ms overhead)
- **Connection Tracking**: +1 Valkey write per request (~0.5ms overhead)
- **CORS Preflight**: OPTIONS requests handled immediately (no tool execution)
- **Health Checks**: Separate fast path (no session/rate limit checks)
### Expected Performance (Remote)
- **Search Operations**: <2s (requirement: SC-003)
- **Booking Operations**: <500ms (includes rate limit check)
- **Retrieval Operations**: <200ms (global PNR lookup)
- **Health Check**: <100ms (Valkey ping only)
- **Concurrent Remote Sessions**: 50+ (requirement: SC-012)
### Optimization Strategies
1. **Rate Limit Caching**: Cache IP counters in-memory for 1 second (reduce Valkey ops)
2. **Connection Pooling**: Reuse HTTP/2 connections (handled by Nginx)
3. **Health Check Caching**: Cache health status for 5 seconds
4. **CORS Preflight Caching**: 24-hour `Access-Control-Max-Age`
---
## Next Steps
Data model complete with remote access entities. Proceed to:
1. ✅ Update contracts/ with health endpoint schema
2. ✅ Update quickstart.md with remote access setup
3. ✅ Run agent context update script