orderflow_backtest/docs/modules/ohlc_processor.md

# Module: ohlc_processor
## Purpose
The `ohlc_processor` module is the main coordinator for trade data processing. It orchestrates OHLC aggregation, orderbook management, and metrics calculation, and delegates the specialized work to helper modules through composition.
## Public Interface
### Classes
- `OHLCProcessor(window_seconds: int = 60, depth_levels_per_side: int = 50)`: Main orchestrator class that coordinates trade processing using composition
### Methods
- `process_trades(trades: list[tuple]) -> None`: Aggregate trades into OHLC bars and update the cumulative volume delta (CVD) metric
- `update_orderbook(ob_update: OrderbookUpdate) -> None`: Apply orderbook updates and calculate the orderbook imbalance (OBI) metric
- `finalize() -> None`: Emit the final OHLC bar and metrics data
- `cvd_cumulative` (property): Current cumulative volume delta value
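
To illustrate what window-based OHLC aggregation involves, here is a minimal standalone sketch. It is not the module's implementation: the `Bar` dataclass, the `aggregate_ohlc` function, and the `(timestamp_s, price, size)` tuple layout are assumptions made for this example, since the real trade tuple format is not specified here.

```python
from dataclasses import dataclass


@dataclass
class Bar:
    window_start: int  # window start in seconds, aligned to window_seconds
    open: float
    high: float
    low: float
    close: float
    volume: float


def aggregate_ohlc(trades, window_seconds=60):
    """Group (timestamp_s, price, size) trades into fixed-width OHLC bars.

    The tuple layout is hypothetical. Trades must be sorted by timestamp.
    """
    bars = []
    current = None
    for ts, price, size in trades:
        start = int(ts // window_seconds) * window_seconds
        if current is None or start != current.window_start:
            # A trade in a new window closes the previous bar.
            if current is not None:
                bars.append(current)
            current = Bar(start, price, price, price, price, 0.0)
        current.high = max(current.high, price)
        current.low = min(current.low, price)
        current.close = price
        current.volume += size
    if current is not None:
        bars.append(current)  # flush the last open bar, similar in spirit to finalize()
    return bars


sample = [(0, 100.0, 1.0), (30, 102.0, 2.0), (59, 99.0, 1.0), (60, 101.0, 0.5)]
bars = aggregate_ohlc(sample, window_seconds=60)
```

The trade at second 60 falls into a new window, so the sample produces two bars. The final flush shows why a `finalize()` step is needed: the last bar is never closed by a later trade.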
### Composed Modules
- `OrderbookManager`: Handles in-memory orderbook state and depth snapshots
- `MetricsCalculator`: Manages OBI and CVD metric calculations
- `level_parser` functions: Parse and normalize orderbook level data
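
As a rough illustration of level normalization, the sketch below accepts either a JSON string or a Python-literal string and returns `(price, size)` pairs. The function name `parse_levels` and the decision to return an empty list on malformed input are assumptions for this example, not the documented behavior of `level_parser`.

```python
import ast
import json


def parse_levels(raw):
    """Return a list of (price, size) floats from a JSON or Python-literal string.

    Hypothetical helper: malformed input yields an empty list instead of raising.
    """
    if not raw:
        return []
    try:
        data = json.loads(raw)
    except (TypeError, ValueError):
        # Not valid JSON, so try a Python literal such as "[('100.5', '2')]".
        try:
            data = ast.literal_eval(raw)
        except (TypeError, ValueError, SyntaxError):
            return []
    if not isinstance(data, (list, tuple)):
        return []
    levels = []
    for item in data:
        try:
            price, size = float(item[0]), float(item[1])
        except (TypeError, ValueError, IndexError, KeyError):
            continue  # skip entries that are not [price, size] pairs
        levels.append((price, size))
    return levels


json_levels = parse_levels('[["100.5", "2"], ["101", "0"]]')
literal_levels = parse_levels("[('100.5', '2')]")
```

Zero-size levels are kept in the output on purpose, so that the caller can interpret them as deletions.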
## Usage Examples
```python
from ohlc_processor import OHLCProcessor
from db_interpreter import DBInterpreter

# db_path is assumed to be defined by the caller (location of the source database)

# Initialize processor with 1-minute windows and 50 depth levels
processor = OHLCProcessor(window_seconds=60, depth_levels_per_side=50)

# Process streaming data
for ob_update, trades in DBInterpreter(db_path).stream():
    # Aggregate trades into OHLC bars
    processor.process_trades(trades)
    # Update orderbook and emit depth snapshots
    processor.update_orderbook(ob_update)

# Finalize processing: emit the last OHLC bar and metrics
processor.finalize()
```
### Advanced Configuration
```python
# Custom window size and depth levels
processor = OHLCProcessor(
    window_seconds=30,         # 30-second bars
    depth_levels_per_side=25,  # Top 25 levels per side
)
```
## Dependencies
### Internal Modules
- `orderbook_manager.OrderbookManager`: In-memory orderbook state management
- `metrics_calculator.MetricsCalculator`: OBI and CVD metrics calculation
- `level_parser`: Orderbook level parsing utilities
- `viz_io`: JSON output for visualization
- `db_interpreter.OrderbookUpdate`: Input data structures
### External
- `typing`: Type annotations
- `logging`: Debug and operational logging
## Modular Architecture
The processor follows a composition pattern with four parts:
1. **Main Coordinator** (`OHLCProcessor`):
   - Orchestrates trade and orderbook processing
   - Maintains OHLC bar state and window management
   - Delegates specialized tasks to composed modules
2. **Orderbook Management** (`OrderbookManager`):
   - Maintains in-memory price→size mappings
   - Applies partial updates and handles deletions
   - Provides sorted top-N level extraction
3. **Metrics Calculation** (`MetricsCalculator`):
   - Tracks CVD from trade flow (buy/sell volume delta)
   - Calculates OBI from orderbook volume imbalance
   - Manages windowed metrics aggregation with throttling
4. **Level Parsing** (`level_parser` module):
   - Normalizes JSON and Python literal level representations
   - Handles zero-size levels for orderbook deletions
   - Provides robust error handling for malformed data
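
The orderbook and metrics responsibilities listed above can be sketched in a few standalone functions. None of these names come from the module. The OBI formula used here, `(bid volume - ask volume) / (bid volume + ask volume)`, is a common definition and is assumed rather than confirmed, and the `"buy"`/`"sell"` side labels are also hypothetical.

```python
def apply_updates(side, updates):
    """Apply partial (price, size) updates to one book side; size 0 deletes the level."""
    for price, size in updates:
        if size == 0:
            side.pop(price, None)
        else:
            side[price] = size


def top_levels(bids, asks, n):
    """Return the best n bids (highest price first) and asks (lowest price first)."""
    best_bids = sorted(bids.items(), reverse=True)[:n]
    best_asks = sorted(asks.items())[:n]
    return best_bids, best_asks


def order_book_imbalance(bids, asks):
    """Assumed OBI definition: (bid_vol - ask_vol) / (bid_vol + ask_vol), 0.0 if empty."""
    bid_vol = sum(bids.values())
    ask_vol = sum(asks.values())
    total = bid_vol + ask_vol
    return (bid_vol - ask_vol) / total if total else 0.0


def cvd_delta(trades):
    """Sum signed sizes over (size, side) trades: buys add, sells subtract."""
    return sum(size if side == "buy" else -size for size, side in trades)


bids, asks = {}, {}
apply_updates(bids, [(100.0, 3.0), (99.0, 1.0), (98.0, 2.0)])
apply_updates(asks, [(101.0, 1.0), (102.0, 1.0)])
apply_updates(bids, [(98.0, 0.0)])  # zero size removes the 98.0 level

best_bids, best_asks = top_levels(bids, asks, n=1)
obi = order_book_imbalance(bids, asks)
delta = cvd_delta([(2.0, "buy"), (0.5, "sell")])
```

Plain dictionaries keep updates cheap because only the changed levels are touched. Sorting happens only when a top-N snapshot is requested.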
## Performance Characteristics
- **Throttled Updates**: Prevents excessive I/O during high-frequency periods
- **Memory Efficient**: Maintains only current window and top-N depth levels
- **Incremental Processing**: Applies only changed orderbook levels
- **Atomic Operations**: Thread-safe updates to shared data structures
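
One way to picture the throttling behavior is a small gate that allows an action only when a minimum interval has passed. The `Throttle` class below is a hypothetical sketch. Whether the module measures the interval against event timestamps or wall-clock time is not stated here, so the example takes the current time as an explicit argument.

```python
class Throttle:
    """Allow an action at most once per min_interval_ms (hypothetical helper)."""

    def __init__(self, min_interval_ms):
        self.min_interval_ms = min_interval_ms
        self._last_ms = None  # time of the last allowed action, if any

    def ready(self, now_ms):
        """Return True and record now_ms if enough time has passed since the last action."""
        if self._last_ms is None or now_ms - self._last_ms >= self.min_interval_ms:
            self._last_ms = now_ms
            return True
        return False


throttle = Throttle(min_interval_ms=100)
# Only timestamps at least 100 ms after the last emitted one pass the gate.
emitted = [t for t in (0, 40, 99, 100, 150, 200) if throttle.ready(t)]
```

With a 100 ms interval, the events at 40, 99, and 150 ms are skipped, which is how a throttle reduces I/O during bursts of updates.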
## Testing
Run module tests:
```bash
uv run pytest test_ohlc_processor.py -v
```
Test coverage includes:
- OHLC calculation accuracy across window boundaries
- Volume accumulation correctness
- High/low price tracking
- Orderbook update application
- Depth snapshot generation
- OBI metric calculation
## Known Issues
- Orderbook level parsing assumes well-formed JSON or Python literals
- Memory usage scales with number of active price levels
- Clock skew between trades and orderbook updates is not handled
## Configuration Options
- `window_seconds`: Time window size for OHLC aggregation (default: 60)
- `depth_levels_per_side`: Number of top price levels to maintain (default: 50)
- `UPSERT_THROTTLE_MS`: Minimum interval between upsert operations (internal)
- `DEPTH_EMIT_THROTTLE_MS`: Minimum interval between depth emissions (internal)