Module: db_interpreter
Purpose
The db_interpreter module provides efficient streaming access to SQLite databases containing orderbook and trade data. It handles batch reading, temporal windowing, and data structure normalization for downstream processing.
Public Interface
Classes
- OrderbookLevel(price: float, size: float): Dataclass representing a single price level in the orderbook
- OrderbookUpdate: Container for windowed orderbook data with bids, asks, timestamp, and end_timestamp
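As a rough illustration of the two containers described above, the following sketch declares them as plain dataclasses. The field names come from the descriptions above; the field types of OrderbookUpdate (lists of OrderbookLevel, integer millisecond timestamps) and the default values are assumptions, not the module's confirmed definitions.

```python
from dataclasses import dataclass, field


@dataclass
class OrderbookLevel:
    """A single price level: price and resting size."""
    price: float
    size: float


@dataclass
class OrderbookUpdate:
    """Orderbook state for one temporal window (field types are assumed)."""
    bids: list[OrderbookLevel] = field(default_factory=list)
    asks: list[OrderbookLevel] = field(default_factory=list)
    timestamp: int = 0      # assumed: window start, epoch milliseconds
    end_timestamp: int = 0  # assumed: window end, epoch milliseconds


# Build one update by hand to show how the pieces fit together.
update = OrderbookUpdate(
    bids=[OrderbookLevel(price=100.0, size=1.5)],
    asks=[OrderbookLevel(price=100.5, size=2.0)],
    timestamp=1_000,
    end_timestamp=2_000,
)
```

Using default_factory gives each update its own bid and ask lists rather than sharing one mutable default.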
Functions
DBInterpreter(db_path: Path): Constructor that initializes read-only SQLite connection with optimized PRAGMA settings
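One way a read-only connection of this kind might be opened is sketched below. The helper name open_readonly and the specific PRAGMA values are hypothetical; the module's actual settings are not shown in this document. The URI parameters mode=ro and immutable=1 and the mmap_size and cache_size PRAGMAs are standard SQLite features.

```python
import sqlite3
from pathlib import Path


def open_readonly(db_path: Path) -> sqlite3.Connection:
    """Open a SQLite file read-only (hypothetical helper, illustrative values)."""
    # as_uri() yields a percent-encoded file: URI. mode=ro refuses writes;
    # immutable=1 tells SQLite the file will not change while it is open.
    uri = f"{db_path.resolve().as_uri()}?mode=ro&immutable=1"
    conn = sqlite3.connect(uri, uri=True)
    # Illustrative values only: request up to 256 MiB of memory-mapped I/O
    # and roughly 64 MiB of page cache (a negative cache_size is in KiB).
    conn.execute("PRAGMA mmap_size = 268435456")
    conn.execute("PRAGMA cache_size = -65536")
    return conn
```

Note that immutable=1 is only safe when no other process writes to the file while it is open, since SQLite then skips locking and change detection.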
Methods
stream() -> Iterator[tuple[OrderbookUpdate, list[tuple]]]: Primary streaming interface that yields orderbook updates with associated trades in temporal windows
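The windowing idea behind stream() can be sketched independently of SQLite. The function below is a simplified stand-in, not the module's implementation: it assumes both inputs are sorted by timestamp, that a window spans from one book row's timestamp up to (but excluding) the next row's timestamp, that trades before the first book row are dropped, and that the final window is open-ended. The names windows and ts_of are hypothetical.

```python
from typing import Callable, Iterable, Iterator, Optional


def windows(
    book_rows: Iterable[tuple],
    trades: Iterable[tuple],
    ts_of: Callable[[tuple], int],
) -> Iterator[tuple[int, Optional[int], tuple, list[tuple]]]:
    """Yield (start, end, book_row, trades_in_window) using one-row lookahead."""
    book_iter = iter(book_rows)
    trade_iter = iter(trades)
    current = next(book_iter, None)
    pending = next(trade_iter, None)
    while current is not None:
        nxt = next(book_iter, None)  # one-row lookahead gives the window end
        start = current[0]           # assumed: timestamp is the first column
        end = nxt[0] if nxt is not None else None
        # Assumption: trades earlier than the window start are skipped.
        while pending is not None and ts_of(pending) < start:
            pending = next(trade_iter, None)
        bucket: list[tuple] = []
        while pending is not None and (end is None or ts_of(pending) < end):
            bucket.append(pending)
            pending = next(trade_iter, None)
        yield start, end, current, bucket
        current = nxt


book = [(100, "a"), (200, "b"), (300, "c")]
trade_rows = [(50,), (100,), (150,), (200,), (350,)]
result = list(windows(book, trade_rows, ts_of=lambda t: t[0]))
```

Both inputs are consumed lazily, so only the current book row, its successor, and the trades of one window are held at a time.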
Usage Examples
from pathlib import Path
from db_interpreter import DBInterpreter

# Initialize interpreter
db_path = Path("data/BTC-USDT-2025-01-01.db")
interpreter = DBInterpreter(db_path)

# Stream orderbook and trade data
for ob_update, trades in interpreter.stream():
    # Process orderbook update
    print(f"Book update: {len(ob_update.bids)} bids, {len(ob_update.asks)} asks")
    print(f"Time window: {ob_update.timestamp} - {ob_update.end_timestamp}")
    # Process trades in this window
    for trade in trades:
        trade_id, price, size, side, timestamp_ms = trade[1:6]
        print(f"Trade: {side} {size} @ {price}")
Dependencies
Internal
- None (standalone module)
External
- sqlite3: Database connectivity
- pathlib: Path handling
- dataclasses: Data structure definitions
- typing: Type annotations
- logging: Debug and error logging
Performance Characteristics
- Batch sizes: BOOK_BATCH=2048 and TRADE_BATCH=4096 rows per fetch, which bounds the number of rows held in memory at once
- SQLite optimizations: Read-only, immutable mode, large mmap and cache sizes
- Memory efficient: Streaming iterator pattern avoids loading the entire dataset into memory
- Temporal windowing: One-row lookahead for precise time boundary calculation
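The batch-reading pattern listed above can be approximated with cursor.fetchmany. This is a minimal sketch, assuming a generic query; the helper name iter_rows and the demo table are hypothetical, and only the BOOK_BATCH value is taken from this document.

```python
import sqlite3
from typing import Iterator

BOOK_BATCH = 2048  # value taken from the list above


def iter_rows(
    conn: sqlite3.Connection, query: str, batch_size: int = BOOK_BATCH
) -> Iterator[tuple]:
    """Yield rows one at a time while fetching them from SQLite in batches."""
    cur = conn.execute(query)
    while True:
        batch = cur.fetchmany(batch_size)
        if not batch:  # an empty list signals the end of the result set
            break
        yield from batch


# Demo against an in-memory database with a hypothetical table.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE demo_rows (x INTEGER)")
demo.executemany("INSERT INTO demo_rows VALUES (?)", [(i,) for i in range(5)])
rows = list(iter_rows(demo, "SELECT x FROM demo_rows ORDER BY x", batch_size=2))
```

At most batch_size rows are materialized per fetch, so memory use stays flat regardless of table size.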
Testing
Run module tests:
uv run pytest test_db_interpreter.py -v
Test coverage includes:
- Batch reading correctness
- Temporal window boundary handling
- Trade-to-window assignment accuracy
- End-of-stream behavior
- Error handling for malformed data
Known Issues
- Requires specific database schema (book and trades tables)
- Python-literal string parsing assumes well-formed input
- Large databases may require memory monitoring during streaming
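The literal-parsing issue above could be mitigated by wrapping ast.literal_eval in a defensive helper. The sketch below is illustrative only: it assumes levels are stored as a Python-literal string such as "[(price, size), ...]", which this document does not confirm, and the name parse_levels is hypothetical.

```python
import ast
import logging

logger = logging.getLogger(__name__)


def parse_levels(raw: str) -> list[tuple[float, float]]:
    """Parse an assumed "[(price, size), ...]" literal, tolerating bad input."""
    try:
        value = ast.literal_eval(raw)
    except (ValueError, TypeError, SyntaxError, MemoryError, RecursionError):
        logger.warning("Skipping malformed level string: %r", raw[:60])
        return []
    if not isinstance(value, (list, tuple)):
        return []
    levels: list[tuple[float, float]] = []
    for item in value:
        # Keep only two-element numeric pairs; silently drop anything else.
        if (
            isinstance(item, (list, tuple))
            and len(item) == 2
            and all(
                isinstance(x, (int, float)) and not isinstance(x, bool)
                for x in item
            )
        ):
            levels.append((float(item[0]), float(item[1])))
    return levels


good = parse_levels("[(100.5, 2), (101, 0.5)]")
truncated = parse_levels("[(1, 2")
```

ast.literal_eval only evaluates literals, so a string containing a function call is rejected rather than executed.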
Configuration
- BOOK_BATCH: Number of orderbook rows to fetch per query (default: 2048)
- TRADE_BATCH: Number of trade rows to fetch per query (default: 4096)
- SQLite PRAGMA settings optimized for read-only sequential access