# Module: db_interpreter

## Purpose

The `db_interpreter` module provides efficient streaming access to SQLite databases containing orderbook and trade data. It handles batch reading, temporal windowing, and data structure normalization for downstream processing.

## Public Interface

### Classes

- `OrderbookLevel(price: float, size: float)`: Dataclass representing a single price level in the orderbook
- `OrderbookUpdate`: Container for windowed orderbook data with bids, asks, timestamp, and end_timestamp

### Functions

- `DBInterpreter(db_path: Path)`: Constructor that initializes a read-only SQLite connection with optimized PRAGMA settings

### Methods

- `stream() -> Iterator[tuple[OrderbookUpdate, list[tuple]]]`: Primary streaming interface that yields orderbook updates with their associated trades in temporal windows

## Usage Examples

```python
from pathlib import Path

from db_interpreter import DBInterpreter

# Initialize interpreter
db_path = Path("data/BTC-USDT-2025-01-01.db")
interpreter = DBInterpreter(db_path)

# Stream orderbook and trade data
for ob_update, trades in interpreter.stream():
    # Process orderbook update
    print(f"Book update: {len(ob_update.bids)} bids, {len(ob_update.asks)} asks")
    print(f"Time window: {ob_update.timestamp} - {ob_update.end_timestamp}")

    # Process trades in this window
    for trade in trades:
        trade_id, price, size, side, timestamp_ms = trade[1:6]
        print(f"Trade: {side} {size} @ {price}")
```

## Dependencies

### Internal

- None (standalone module)

### External

- `sqlite3`: Database connectivity
- `pathlib`: Path handling
- `dataclasses`: Data structure definitions
- `typing`: Type annotations
- `logging`: Debug and error logging

## Performance Characteristics

- **Batch sizes**: `BOOK_BATCH=2048`, `TRADE_BATCH=4096` for optimal memory usage
- **SQLite optimizations**: Read-only, immutable mode with large mmap and cache sizes
- **Memory efficient**: Streaming iterator pattern avoids loading the entire dataset into memory
- **Temporal windowing**: One-row lookahead for precise time-boundary calculation

## Testing

Run module tests:

```bash
uv run pytest test_db_interpreter.py -v
```

Test coverage includes:

- Batch reading correctness
- Temporal window boundary handling
- Trade-to-window assignment accuracy
- End-of-stream behavior
- Error handling for malformed data

## Known Issues

- Requires a specific database schema (`book` and `trades` tables)
- Python-literal string parsing assumes well-formed input
- Large databases may require memory monitoring during streaming

## Configuration

- `BOOK_BATCH`: Number of orderbook rows to fetch per query (default: 2048)
- `TRADE_BATCH`: Number of trade rows to fetch per query (default: 4096)
- SQLite PRAGMA settings optimized for read-only sequential access
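
The module's actual constructor and windowing code are not reproduced in this document. The following is a minimal sketch of the two ideas described above: the read-only, immutable connection with large mmap/cache PRAGMAs, and the one-row-lookahead boundary calculation. The `open_readonly` and `window_bounds` helpers, the specific PRAGMA values, and the timestamp layout are illustrative assumptions, not the module's real API.

```python
# Illustrative sketch only: helper names, PRAGMA values, and row layout are
# assumptions, not the db_interpreter implementation.
import sqlite3
from pathlib import Path
from typing import Iterable, Iterator, Optional


def open_readonly(db_path: Path) -> sqlite3.Connection:
    # Read-only, immutable mode lets SQLite skip locking and journal handling.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro&immutable=1", uri=True)
    conn.execute("PRAGMA mmap_size = 268435456")  # 256 MiB of memory-mapped I/O
    conn.execute("PRAGMA cache_size = -65536")    # ~64 MiB page cache (KiB when negative)
    return conn


def window_bounds(timestamps: Iterable[int]) -> Iterator[tuple[int, Optional[int]]]:
    # One-row lookahead: each window ends where the next row begins;
    # the final window has no successor, so its end is None.
    it = iter(timestamps)
    try:
        current = next(it)
    except StopIteration:
        return
    for nxt in it:
        yield current, nxt
        current = nxt
    yield current, None
```

For example, `list(window_bounds([100, 250, 400]))` yields `[(100, 250), (250, 400), (400, None)]`, mirroring how each `OrderbookUpdate` gets a `timestamp` and an `end_timestamp` derived from the following row.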