orderflow_backtest/docs/modules/db_interpreter.md

Module: db_interpreter

Purpose

The db_interpreter module provides efficient streaming access to SQLite databases containing orderbook and trade data. It handles batch reading, temporal windowing, and data structure normalization for downstream processing.

Public Interface

Classes

  • OrderbookLevel(price: float, size: float): Dataclass representing a single price level in the orderbook
  • OrderbookUpdate: Container for windowed orderbook data with bids, asks, timestamp, and end_timestamp
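The two containers above might be defined roughly as follows; this is a sketch, and any field defaults or ordering beyond the attributes listed above are assumptions:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class OrderbookLevel:
    """A single price level in the orderbook."""
    price: float
    size: float

@dataclass
class OrderbookUpdate:
    """Orderbook data for one temporal window.

    Timestamps are assumed here to be epoch milliseconds.
    """
    timestamp: int
    end_timestamp: int
    bids: list[OrderbookLevel] = field(default_factory=list)
    asks: list[OrderbookLevel] = field(default_factory=list)
```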

Constructor

  • DBInterpreter(db_path: Path): Constructor that initializes read-only SQLite connection with optimized PRAGMA settings

Methods

  • stream() -> Iterator[tuple[OrderbookUpdate, list[tuple]]]: Primary streaming interface that yields orderbook updates with associated trades in temporal windows

Usage Examples

from pathlib import Path
from db_interpreter import DBInterpreter

# Initialize interpreter
db_path = Path("data/BTC-USDT-2025-01-01.db")
interpreter = DBInterpreter(db_path)

# Stream orderbook and trade data
for ob_update, trades in interpreter.stream():
    # Process orderbook update
    print(f"Book update: {len(ob_update.bids)} bids, {len(ob_update.asks)} asks")
    print(f"Time window: {ob_update.timestamp} - {ob_update.end_timestamp}")
    
    # Process trades in this window
    for trade in trades:
        trade_id, price, size, side, timestamp_ms = trade[1:6]
        print(f"Trade: {side} {size} @ {price}")

Dependencies

Internal

  • None (standalone module)

External

  • sqlite3: Database connectivity
  • pathlib: Path handling
  • dataclasses: Data structure definitions
  • typing: Type annotations
  • logging: Debug and error logging

Performance Characteristics

  • Batch sizes: BOOK_BATCH=2048 book rows and TRADE_BATCH=4096 trade rows per fetch, balancing query overhead against memory footprint
  • SQLite optimizations: Read-only, immutable mode, large mmap and cache sizes
  • Memory efficient: Streaming iterator pattern prevents loading entire dataset
  • Temporal windowing: One-row lookahead for precise time boundary calculation
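The one-row lookahead can be illustrated generically: each window is closed by peeking at the next row's timestamp. The row shape and function name below are illustrative, not the module's actual internals:

```python
from collections.abc import Iterable, Iterator

def windows_with_lookahead(
    rows: Iterable[tuple[int, str]],
) -> Iterator[tuple[int, int, str]]:
    """Yield (start_ts, end_ts, payload) triples, where end_ts is the
    timestamp of the *next* row -- a one-row lookahead closes each window."""
    it = iter(rows)
    try:
        prev = next(it)
    except StopIteration:
        return  # no rows at all
    for row in it:
        yield (prev[0], row[0], prev[1])
        prev = row
    # The last row has no successor, so its window closes at its own timestamp.
    yield (prev[0], prev[0], prev[1])
```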

Testing

Run module tests:

uv run pytest test_db_interpreter.py -v

Test coverage includes:

  • Batch reading correctness
  • Temporal window boundary handling
  • Trade-to-window assignment accuracy
  • End-of-stream behavior
  • Error handling for malformed data
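A boundary-handling test in this style might check that a trade timestamped exactly at a window's end lands in the next window, assuming half-open [start, end) windows. The helper and test names here are hypothetical:

```python
def assign_trades(windows, trades):
    """Assign each (ts, ...) trade to its half-open window [start, end)."""
    out = {w: [] for w in windows}
    for trade in trades:
        ts = trade[0]
        for start, end in windows:
            if start <= ts < end:
                out[(start, end)].append(trade)
                break
    return out

def test_boundary_trade_goes_to_next_window():
    windows = [(0, 10), (10, 20)]
    trades = [(9, "in-first"), (10, "boundary")]
    assigned = assign_trades(windows, trades)
    assert assigned[(0, 10)] == [(9, "in-first")]
    assert assigned[(10, 20)] == [(10, "boundary")]
```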

Known Issues

  • Requires specific database schema (book and trades tables)
  • Python-literal string parsing assumes well-formed input
  • Large databases may require memory monitoring during streaming
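If the stored book sides are Python-literal strings (e.g. "[(100.0, 2.5), ...]"), a defensive parse with ast.literal_eval could reject malformed input instead of assuming it is well formed. The column format here is an assumption:

```python
import ast

def parse_levels(raw: str) -> list[tuple[float, float]]:
    """Parse a Python-literal list of (price, size) pairs, rejecting
    anything that is not a well-formed list of 2-element pairs."""
    try:
        value = ast.literal_eval(raw)
    except (ValueError, SyntaxError) as exc:
        raise ValueError(f"malformed level string: {raw!r}") from exc
    if not isinstance(value, list):
        raise ValueError(f"expected a list, got {type(value).__name__}")
    levels = []
    for item in value:
        if not (isinstance(item, (tuple, list)) and len(item) == 2):
            raise ValueError(f"expected a (price, size) pair, got {item!r}")
        price, size = item
        levels.append((float(price), float(size)))
    return levels
```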

Configuration

  • BOOK_BATCH: Number of orderbook rows to fetch per query (default: 2048)
  • TRADE_BATCH: Number of trade rows to fetch per query (default: 4096)
  • SQLite PRAGMA settings optimized for read-only sequential access
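
Opening the database read-only in immutable mode with such PRAGMAs might look like the sketch below; the specific mmap and cache sizes are assumptions, not the module's actual values:

```python
import sqlite3
from pathlib import Path

def open_readonly(db_path: Path) -> sqlite3.Connection:
    """Open a SQLite database read-only with settings suited to
    sequential scans of a file that will not change underneath us."""
    # immutable=1 tells SQLite the file cannot change, so it skips locking.
    uri = f"file:{db_path}?mode=ro&immutable=1"
    conn = sqlite3.connect(uri, uri=True)
    conn.execute("PRAGMA mmap_size = 268435456")  # 256 MiB memory-mapped I/O
    conn.execute("PRAGMA cache_size = -65536")    # negative value = KiB (64 MiB)
    return conn
```

Because the connection is read-only, any write attempt raises sqlite3.OperationalError rather than silently modifying the data file.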