# ADR-002: JSON File-Based Inter-Process Communication
## Status
Accepted
## Context
The orderflow backtest system requires communication between the data processing pipeline and the web-based visualization frontend. Key requirements include:
- Real-time data updates from processor to visualization
- Tolerance for timing mismatches between writer and reader
- Simple implementation without external dependencies
- Support for different update frequencies (OHLC bars vs. orderbook depth)
- Graceful handling of process crashes or restarts
## Decision
We will use JSON files with atomic write operations for inter-process communication between the data processor and Dash visualization frontend.
## Consequences
### Positive
- **Simplicity**: No message queues, sockets, or complex protocols
- **Fault tolerance**: File-based communication survives process restarts
- **Debugging friendly**: Data files can be inspected manually
- **No dependencies**: Built-in JSON support, no external libraries
- **Atomic operations**: Temp file + rename prevents partial reads
- **Language agnostic**: Any process can read/write JSON files
- **Bounded memory**: Rolling data windows prevent unlimited growth
### Negative
- **File I/O overhead**: Disk writes may be slower than in-memory communication
- **Polling required**: Reader must poll for updates (500ms interval)
- **Limited throughput**: Not suitable for high-frequency (microsecond) updates
- **No acknowledgments**: Writer cannot confirm reader has processed data
- **File system dependency**: Performance varies by storage type
## Implementation Details
### File Structure
```
ohlc_data.json # Rolling array of OHLC bars (max 1000)
depth_data.json # Current orderbook depth snapshot
metrics_data.json # Rolling array of OBI/CVD metrics (max 1000)
```
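On the writer side, the rolling windows are what keep these files bounded. A minimal sketch of the publish step, assuming the `atomic_write` helper defined in the next subsection; `MAX_BARS`, `ohlc_window`, and `publish_bar` are illustrative names, not the pipeline's actual API:
```python
from pathlib import Path

MAX_BARS = 1000  # Rolling window size noted above

ohlc_window = []  # In-memory buffer mirrored to ohlc_data.json

def publish_bar(bar):
    """Append a bar, trim to the rolling window, republish the file."""
    ohlc_window.append(bar)
    del ohlc_window[:-MAX_BARS]  # Keep only the newest MAX_BARS entries
    atomic_write(Path("ohlc_data.json"), ohlc_window)
```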
### Atomic Write Pattern
```python
import json
import os
from pathlib import Path
from typing import Any

def atomic_write(file_path: Path, data: Any) -> None:
    """Write data atomically to prevent partial reads."""
    temp_path = file_path.with_suffix('.tmp')  # Same directory, same filesystem
    with open(temp_path, 'w') as f:
        json.dump(data, f)
        f.flush()
        os.fsync(f.fileno())  # Force bytes to disk before the rename
    temp_path.replace(file_path)  # Atomic rename on POSIX systems
```
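The rename is what makes this safe: POSIX `rename(2)` swaps the directory entry in a single step, so a concurrent reader opens either the complete old snapshot or the complete new one, never a partially written file. Keeping the temp file in the same directory (as `with_suffix` does) ensures the rename stays on one filesystem; `Path.replace` wraps `os.replace`, which also overwrites the destination on Windows.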
### Data Formats
```python
# OHLC format: [timestamp_ms, open, high, low, close, volume]
ohlc_data = [
    [1640995200000, 50000.0, 50100.0, 49900.0, 50050.0, 125.5],
    [1640995260000, 50050.0, 50200.0, 50000.0, 50150.0, 98.3],
]

# Depth format: top-N levels per side
depth_data = {
    "bids": [[49990.0, 1.5], [49985.0, 2.1]],
    "asks": [[50010.0, 1.2], [50015.0, 1.8]],
}

# Metrics format: [timestamp_ms, obi_open, obi_high, obi_low, obi_close]
metrics_data = [
    [1640995200000, 0.15, 0.22, 0.08, 0.18],
    [1640995260000, 0.18, 0.25, 0.12, 0.20],
]
```
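The OBI series in `metrics_data` is derived from depth snapshots like the one above. One common order-book-imbalance definition, normalized to [-1, 1], is sketched below; whether the pipeline uses exactly this formula (or a different depth weighting) is an assumption:
```python
def order_book_imbalance(depth):
    """(bid volume - ask volume) / (total volume), in [-1, 1]."""
    bid_vol = sum(size for _price, size in depth["bids"])
    ask_vol = sum(size for _price, size in depth["asks"])
    total = bid_vol + ask_vol
    return (bid_vol - ask_vol) / total if total else 0.0

order_book_imbalance(depth_data)  # (3.6 - 3.0) / 6.6 ≈ 0.091
```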
### Error Handling
```python
import json
import logging

_LAST_DATA = None  # Last successfully parsed payload

# Reader pattern with graceful fallback to the cached snapshot
def read_with_fallback(data_file):
    global _LAST_DATA
    try:
        with open(data_file) as f:
            new_data = json.load(f)
        _LAST_DATA = new_data  # Cache successful read
    except (FileNotFoundError, json.JSONDecodeError) as e:
        logging.warning(f"Using cached data: {e}")
        new_data = _LAST_DATA  # Use cached data
    return new_data
```
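On the visualization side, the 500ms polling interval maps directly onto a `dcc.Interval` callback. A hedged sketch using the Dash >= 2 API and the `read_with_fallback` reader above; the component ids and candlestick rendering are illustrative, not the app's actual layout:
```python
from pathlib import Path

import plotly.graph_objects as go
from dash import Dash, Input, Output, dcc, html

app = Dash(__name__)
app.layout = html.Div([
    dcc.Graph(id="ohlc-chart"),
    dcc.Interval(id="poll", interval=500),  # Poll the JSON files every 500ms
])

@app.callback(Output("ohlc-chart", "figure"), Input("poll", "n_intervals"))
def refresh(_n):
    rows = read_with_fallback(Path("ohlc_data.json")) or []
    # Each row is [timestamp_ms, open, high, low, close, volume]
    ts, o, h, l, c, _v = zip(*rows) if rows else ((),) * 6
    return go.Figure(go.Candlestick(x=ts, open=o, high=h, low=l, close=c))
```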
## Performance Characteristics
### Write Performance
- **Small files**: < 1MB typical, writes complete in < 10ms
- **Atomic operations**: Add ~2-5ms overhead for temp file creation
- **Throttling**: Updates limited to prevent excessive I/O
### Read Performance
- **Parse time**: < 5ms for typical JSON file sizes
- **Polling overhead**: 500ms interval balances responsiveness and CPU usage
- **Error recovery**: Falling back to cached data avoids visual glitches when a read fails
### Memory Usage
- **Bounded datasets**: Max 1000 bars × 6 fields × 8 bytes = ~48KB per file
- **JSON overhead**: ~2x memory during parsing
- **Total footprint**: < 500KB for all IPC data
## Alternatives Considered
### Redis Pub/Sub
- **Rejected**: Additional service dependency, overkill for simple use case
- **Pros**: True real-time updates, built-in data structures
- **Cons**: External dependency, memory overhead, configuration complexity
### ZeroMQ
- **Rejected**: Additional library dependency, more complex than needed
- **Pros**: High performance, flexible patterns
- **Cons**: Learning curve, binary dependency, networking complexity
### Named Pipes/Unix Sockets
- **Rejected**: Platform-specific, more complex error handling
- **Pros**: Better performance, no file I/O
- **Cons**: Platform limitations, harder debugging, process lifetime coupling
### SQLite as Message Queue
- **Rejected**: Overkill for simple data exchange
- **Pros**: ACID transactions, complex queries possible
- **Cons**: Schema management, locking considerations, overhead
### HTTP API
- **Rejected**: Too much overhead for local communication
- **Pros**: Standard protocol, language agnostic
- **Cons**: Network stack overhead, port management, authentication
## Future Considerations
### Scalability Limits
Current approach suitable for:
- Update frequencies: 1-10 Hz
- Data volumes: < 10MB total
- Process counts: 1 writer, few readers
### Migration Path
If performance becomes insufficient:
1. **Phase 1**: Add compression (gzip) to reduce I/O (see the sketch after this list)
2. **Phase 2**: Implement shared memory for high-frequency data
3. **Phase 3**: Consider message queue for complex routing
4. **Phase 4**: Migrate to streaming protocol for real-time requirements
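As a sketch of what Phase 1 might look like: the temp-file + rename pattern stays the same, only the encoding changes. The `atomic_write_gz` name and `.json.gz` file naming are assumptions, not part of the current layout:
```python
import gzip
import json
import os
from pathlib import Path

def atomic_write_gz(file_path, data):
    """Variant of atomic_write that gzip-compresses the JSON payload."""
    temp_path = file_path.with_suffix('.tmp')
    with gzip.open(temp_path, 'wt', encoding='utf-8') as f:
        json.dump(data, f)
    os.replace(temp_path, file_path)  # Same atomic rename as before
```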
## Monitoring
Track these metrics to validate the approach:
- File write latency and frequency
- JSON parse times in visualization
- Error rates for partial reads
- Memory usage growth over time
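The first of these metrics can be captured with a thin wrapper around the `atomic_write` helper above; `timed_write` is an illustrative name, not existing code:
```python
import logging
import time

def timed_write(file_path, data):
    """Record write latency around atomic_write for monitoring."""
    start = time.perf_counter()
    atomic_write(file_path, data)
    logging.info("wrote %s in %.2f ms", file_path,
                 (time.perf_counter() - start) * 1000)
```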
## Review Triggers
Reconsider this decision if:
- Update frequency requirements exceed 10 Hz
- File I/O becomes a performance bottleneck
- Multiple visualization clients need the same data
- Complex message routing becomes necessary
- Platform portability becomes a concern