Module: ohlc_processor

Purpose

The ohlc_processor module is the main coordinator for trade data processing: it orchestrates OHLC aggregation, orderbook management, and metrics calculation. It is built by composition, delegating specialized work to helper modules.

Public Interface

Classes

  • OHLCProcessor(window_seconds: int = 60, depth_levels_per_side: int = 50): Main orchestrator class that coordinates trade processing using composition

Methods

  • process_trades(trades: list[tuple]) -> None: Aggregate trades into OHLC bars and update cumulative volume delta (CVD) metrics
  • update_orderbook(ob_update: OrderbookUpdate) -> None: Apply orderbook updates and calculate order book imbalance (OBI) metrics
  • finalize() -> None: Emit the final OHLC bar and metrics data
  • cvd_cumulative (property): Current cumulative volume delta value
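
The aggregation behind process_trades can be pictured as bucketing trades into fixed time windows. The following is a minimal sketch of that idea, not the module's code: it assumes each trade is a (timestamp_ms, price, size) tuple, which this document does not specify, and the names Bar and aggregate are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Bar:
    """One OHLC bar; field names are illustrative, not the module's."""
    start_ms: int
    open: float
    high: float
    low: float
    close: float
    volume: float


def aggregate(trades, window_seconds=60):
    """Group (timestamp_ms, price, size) trades into fixed-width OHLC bars.

    Assumes trades arrive sorted by timestamp.
    """
    window_ms = window_seconds * 1000
    bars = []
    for ts_ms, price, size in trades:
        # Align the timestamp down to the start of its window.
        start = ts_ms - ts_ms % window_ms
        if not bars or bars[-1].start_ms != start:
            # The first trade of a new window opens a new bar.
            bars.append(Bar(start, price, price, price, price, size))
            continue
        bar = bars[-1]
        bar.high = max(bar.high, price)
        bar.low = min(bar.low, price)
        bar.close = price
        bar.volume += size
    return bars
```

A trade stamped exactly on a window boundary (for example 60000 ms with 60-second windows) opens the next bar rather than closing the previous one.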

Composed Modules

  • OrderbookManager: Handles in-memory orderbook state and depth snapshots
  • MetricsCalculator: Manages OBI and CVD metric calculations
  • level_parser functions: Parse and normalize orderbook level data

Usage Examples

from ohlc_processor import OHLCProcessor
from db_interpreter import DBInterpreter

# Initialize processor with 1-minute windows and 50 depth levels
processor = OHLCProcessor(window_seconds=60, depth_levels_per_side=50)

# Process streaming data
db_path = "path/to/database.db"  # placeholder; set this to your database file
for ob_update, trades in DBInterpreter(db_path).stream():
    # Aggregate trades into OHLC bars
    processor.process_trades(trades)

    # Update orderbook and emit depth snapshots
    processor.update_orderbook(ob_update)

# Finalize processing
processor.finalize()

Advanced Configuration

# Custom window size and depth levels
processor = OHLCProcessor(
    window_seconds=30,        # 30-second bars
    depth_levels_per_side=25  # Top 25 levels per side
)

Dependencies

Internal Modules

  • orderbook_manager.OrderbookManager: In-memory orderbook state management
  • metrics_calculator.MetricsCalculator: OBI and CVD metrics calculation
  • level_parser: Orderbook level parsing utilities
  • viz_io: JSON output for visualization
  • db_interpreter.OrderbookUpdate: Input data structures

External (Python standard library)

  • typing: Type annotations
  • logging: Debug and operational logging

Modular Architecture

The processor now follows a clean composition pattern:

  1. Main Coordinator (OHLCProcessor):

    • Orchestrates trade and orderbook processing
    • Maintains OHLC bar state and window management
    • Delegates specialized tasks to composed modules

  2. Orderbook Management (OrderbookManager):

    • Maintains in-memory price→size mappings
    • Applies partial updates and handles deletions
    • Provides sorted top-N level extraction

  3. Metrics Calculation (MetricsCalculator):

    • Tracks CVD from trade flow (buy/sell volume delta)
    • Calculates OBI from orderbook volume imbalance
    • Manages windowed metrics aggregation with throttling

  4. Level Parsing (level_parser module):

    • Normalizes JSON and Python literal level representations
    • Handles zero-size levels for orderbook deletions
    • Provides robust error handling for malformed data
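
The responsibilities of the orderbook and metrics components can be sketched in a few lines. This is an illustration under stated assumptions, not the actual OrderbookManager or MetricsCalculator: the names BookSide, obi, and cvd are hypothetical, the OBI formula (bid volume minus ask volume, divided by their sum) is a common definition that this document does not confirm, and trades are assumed to carry a "buy" or "sell" side label.

```python
class BookSide:
    """Price -> size mapping for one side of the book (illustrative only)."""

    def __init__(self, descending):
        self.levels = {}
        self.descending = descending  # True for bids, False for asks

    def apply(self, updates):
        """Apply partial updates; a size of zero deletes the level."""
        for price, size in updates:
            if size == 0:
                self.levels.pop(price, None)
            else:
                self.levels[price] = size

    def top(self, n):
        """Return the best n levels as (price, size), best price first."""
        ordered = sorted(self.levels.items(), reverse=self.descending)
        return ordered[:n]


def obi(bids, asks):
    """Assumed OBI formula: (bid_vol - ask_vol) / (bid_vol + ask_vol)."""
    bid_vol = sum(size for _, size in bids)
    ask_vol = sum(size for _, size in asks)
    total = bid_vol + ask_vol
    return (bid_vol - ask_vol) / total if total else 0.0


def cvd(trades):
    """Running buy-minus-sell volume over (side, size) pairs."""
    total = 0.0
    for side, size in trades:
        total += size if side == "buy" else -size
    return total
```

Keeping each side as a plain dictionary makes partial updates and deletions constant-time; the sort happens only when a top-N snapshot is requested.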

Performance Characteristics

  • Throttled Updates: Upserts and depth emissions are rate-limited (UPSERT_THROTTLE_MS, DEPTH_EMIT_THROTTLE_MS) to prevent excessive I/O during high-frequency periods
  • Memory Efficient: Maintains only current window and top-N depth levels
  • Incremental Processing: Applies only changed orderbook levels
  • Atomic Operations: Thread-safe updates to shared data structures
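
Throttling of this kind usually means skipping an emission when the previous one happened too recently. The sketch below shows one way to do that; the Throttle class is hypothetical and is not taken from the module, and the injectable clock parameter exists only to make the behavior easy to test.

```python
import time


class Throttle:
    """Allow an action at most once per min_interval_ms (illustrative)."""

    def __init__(self, min_interval_ms, clock=time.monotonic):
        self.min_interval = min_interval_ms / 1000.0
        self.clock = clock  # injectable so tests can control time
        self.last = None

    def ready(self):
        """Return True and record the time if enough time has elapsed."""
        now = self.clock()
        if self.last is not None and now - self.last < self.min_interval:
            return False
        self.last = now
        return True
```

A caller would wrap each write in a check such as `if throttle.ready(): emit(...)`, where emit stands for whatever output call applies.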

Testing

Run module tests:

uv run pytest test_ohlc_processor.py -v

Test coverage includes:

  • OHLC calculation accuracy across window boundaries
  • Volume accumulation correctness
  • High/low price tracking
  • Orderbook update application
  • Depth snapshot generation
  • OBI metric calculation

Known Issues

  • Orderbook level parsing assumes well-formed JSON or Python literals
  • Memory usage scales with number of active price levels
  • Clock skew between trades and orderbook updates not handled
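
To make the first issue concrete, a parser that accepts both encodings might look like the sketch below. It is hypothetical and does not reproduce the level_parser functions: the name parse_levels and the assumed input shape (a string holding a list of [price, size] pairs) are illustrative only.

```python
import ast
import json


def parse_levels(text):
    """Parse '[[price, size], ...]' given as JSON or a Python literal.

    Returns a list of (price, size) floats; returns [] on malformed input.
    """
    try:
        raw = json.loads(text)
    except (json.JSONDecodeError, TypeError):
        try:
            raw = ast.literal_eval(text)
        except (ValueError, SyntaxError, TypeError):
            return []
    levels = []
    for item in raw if isinstance(raw, (list, tuple)) else []:
        try:
            price, size = item
            levels.append((float(price), float(size)))
        except (TypeError, ValueError):
            continue  # skip entries that are not numeric pairs
    return levels
```

Zero-size levels are kept in the output so that the caller can treat them as deletions.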

Configuration Options

  • window_seconds: Time window size for OHLC aggregation (default: 60)
  • depth_levels_per_side: Number of top price levels to maintain (default: 50)
  • UPSERT_THROTTLE_MS: Minimum interval between upsert operations (internal)
  • DEPTH_EMIT_THROTTLE_MS: Minimum interval between depth emissions (internal)