System Architecture
Overview
The current system is a streamlined, high-performance pipeline that streams orderflow from SQLite databases, aggregates trades into OHLC bars, maintains a lightweight depth snapshot, and serves visuals via a Dash web application. Inter-process communication (IPC) between the processor and visualizer uses atomic JSON files for simplicity and robustness.
High-Level Architecture
┌─────────────────┐   ┌─────────────────────┐   ┌──────────────────┐   ┌──────────────────┐
│ SQLite Files    │ → │ DB Interpreter      │ → │ OHLC/Depth       │ → │ Dash Visualizer  │
│ (book,trades)   │   │ (stream rows)       │   │ Processor        │   │ (app.py)         │
└─────────────────┘   └─────────────────────┘   └─────────┬────────┘   └────────────▲─────┘
                                                          │                         │
                                                          │    Atomic JSON (IPC)    │
                                                          ▼                         │
                                           ohlc_data.json, depth_data.json          │
                                           metrics_data.json                        │
                                                                                    │
                                                                               Browser UI
Components
Data Access (db_interpreter.py)
- OrderbookLevel: dataclass representing one price level.
- OrderbookUpdate: container for a book row window with bids, asks, timestamp, and end_timestamp.
- DBInterpreter:
  - stream() -> Iterator[tuple[OrderbookUpdate, list[tuple]]] streams the book table with lookahead and the trades table in timestamp order.
  - Efficient read-only connection opened in immutable mode, with PRAGMA tuning: query_only, temp_store=MEMORY, mmap_size, cache_size.
  - Batching constants: BOOK_BATCH = 2048, TRADE_BATCH = 4096.
  - Each yielded trades element is a tuple (id, trade_id, price, size, side, timestamp_ms) that falls within [book.timestamp, next_book.timestamp).
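A minimal sketch of how such a read-only connection might be opened with Python's sqlite3 module. The function name open_readonly and the mmap_size and cache_size values are assumptions for illustration; they are not taken from db_interpreter.py.

```python
import sqlite3
from pathlib import Path


def open_readonly(path: Path) -> sqlite3.Connection:
    """Open a SQLite file read-only with scan-friendly settings (illustrative)."""
    # mode=ro and immutable=1 are SQLite URI parameters. immutable tells SQLite
    # the file will not change, so it can skip locking and change detection.
    uri = path.resolve().as_uri() + "?mode=ro&immutable=1"
    conn = sqlite3.connect(uri, uri=True)
    conn.execute("PRAGMA query_only = 1")         # reject write statements
    conn.execute("PRAGMA temp_store = MEMORY")    # keep temp tables in RAM
    conn.execute("PRAGMA mmap_size = 268435456")  # 256 MiB, assumed value
    conn.execute("PRAGMA cache_size = -65536")    # about 64 MiB, assumed value
    return conn
```

Immutable mode is only safe when nothing else writes to the file while it is open, which fits a pipeline that replays historical databases.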
Processing (Modular Architecture)
Main Coordinator (ohlc_processor.py)
- OHLCProcessor(window_seconds=60, depth_levels_per_side=50): orchestrates trade processing using composition.
  - process_trades(trades): aggregates trades into OHLC bars and delegates CVD updates.
  - update_orderbook(ob_update): coordinates orderbook updates and OBI metric calculation.
  - finalize(): finalizes both OHLC bars and metrics data.
  - cvd_cumulative (property): provides access to cumulative volume delta.
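The window aggregation can be sketched roughly as follows. This is not the code in ohlc_processor.py: the function name aggregate_ohlc is hypothetical, and the sketch assumes that bars are aligned to multiples of the window, that bar timestamps are in milliseconds, and that trades arrive in time order with the tuple layout described under Data Access.

```python
def aggregate_ohlc(trades, window_seconds=60):
    """Group time-ordered trades into [ts, open, high, low, close, volume] bars."""
    window_ms = window_seconds * 1000
    bars = []
    for _id, _trade_id, price, size, _side, ts_ms in trades:
        bar_start = ts_ms - ts_ms % window_ms  # assumed: bars align to window multiples
        if bars and bars[-1][0] == bar_start:
            bar = bars[-1]
            bar[2] = max(bar[2], price)  # high
            bar[3] = min(bar[3], price)  # low
            bar[4] = price               # close is the latest trade seen
            bar[5] += size               # volume
        else:
            # First trade of a new window opens a new bar.
            bars.append([bar_start, price, price, price, price, size])
    return bars
```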
Orderbook Management (orderbook_manager.py)
- OrderbookManager: handles in-memory orderbook state with partial updates.
  - Maintains separate bid/ask price→size dictionaries.
  - Supports deletions via zero-size updates.
  - Provides sorted top-N level extraction for visualization.
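A minimal sketch of the partial-update and top-N behaviour described above. The helper names apply_levels and top_levels are hypothetical and do not reflect the actual OrderbookManager interface.

```python
def apply_levels(side: dict[float, float], levels) -> None:
    """Apply partial updates to one side of the book; size zero deletes a level."""
    for price, size in levels:
        if size == 0:
            side.pop(price, None)  # deleting a missing level is a no-op
        else:
            side[price] = size


def top_levels(bids: dict[float, float], asks: dict[float, float], n: int):
    """Return the best n levels per side: bids descending, asks ascending."""
    top_bids = sorted(bids.items(), reverse=True)[:n]
    top_asks = sorted(asks.items())[:n]
    return top_bids, top_asks
```

Sorting on every extraction keeps the update path cheap; that trade-off suits a book that is updated far more often than it is snapshotted.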
Metrics Calculation (metrics_calculator.py)
- MetricsCalculator: manages OBI and CVD metrics with windowed aggregation.
  - Tracks CVD from trade flow (buy vs sell volume delta).
  - Calculates OBI from orderbook volume imbalance.
  - Provides throttled updates and OHLC-style metric bars.
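The two metrics can be illustrated with a short sketch. The exact formulas used by MetricsCalculator are not stated here, so this assumes a normalized imbalance, (bid volume − ask volume) / (bid volume + ask volume), and assumes trade sides are the strings "buy" and "sell". Both function names are hypothetical.

```python
def order_book_imbalance(bids, asks):
    """Assumed definition: (bid_volume - ask_volume) / (bid_volume + ask_volume)."""
    bid_volume = sum(size for _price, size in bids)
    ask_volume = sum(size for _price, size in asks)
    total = bid_volume + ask_volume
    return 0.0 if total == 0 else (bid_volume - ask_volume) / total


def cumulative_volume_delta(trades, start=0.0):
    """Add buy sizes and subtract sell sizes on top of a running total."""
    cvd = start
    for _id, _trade_id, _price, size, side, _ts_ms in trades:
        cvd += size if side == "buy" else -size  # assumes side is "buy" or "sell"
    return cvd
```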
Level Parsing (level_parser.py)
- Utility functions for normalizing orderbook level data:
  - normalize_levels(): parses levels, filtering zero/negative sizes.
  - parse_levels_including_zeros(): preserves zeros for deletion operations.
  - Supports JSON and Python literal formats with robust error handling.
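One way the dual-format parsing could look is sketched below. It folds both behaviours into a single hypothetical function, parse_levels, with a keep_zeros flag; the real module exposes two separate functions whose signatures are not shown in this document.

```python
import ast
import json


def parse_levels(raw, keep_zeros=False):
    """Parse a JSON or Python-literal list of [price, size] pairs into float tuples."""
    if isinstance(raw, str):
        try:
            raw = json.loads(raw)
        except json.JSONDecodeError:
            try:
                raw = ast.literal_eval(raw)  # e.g. "[('100.5', '2')]"
            except (ValueError, SyntaxError, TypeError):
                return []  # unparseable input yields no levels
    if not isinstance(raw, (list, tuple)):
        return []
    levels = []
    for item in raw:
        try:
            price, size = float(item[0]), float(item[1])
        except (TypeError, ValueError, IndexError, KeyError):
            continue  # skip malformed entries instead of failing the batch
        if size > 0 or (keep_zeros and size == 0):
            levels.append((price, size))
    return levels
```

Using ast.literal_eval rather than eval keeps the fallback limited to literals, so a malformed database field cannot execute code.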
Inter-Process Communication (viz_io.py)
- File paths (relative to project root):
  - ohlc_data.json: rolling list of OHLC bars (max 1000).
  - depth_data.json: latest depth snapshot (bids/asks).
  - metrics_data.json: rolling list of OBI/TOT OHLC bars (max 1000).
- Atomic writes via temp files prevent partial reads by the Dash app.
- API:
  - add_ohlc_bar(...): append a new bar; trim to last 1000.
  - upsert_ohlc_bar(...): replace last bar if timestamp matches; else append; trim.
  - clear_data(): reset OHLC data to an empty list.
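The atomic-write and upsert behaviour can be sketched as below. The names write_json_atomic, upsert_bar, and MAX_BARS are hypothetical stand-ins, not the viz_io.py API; the sketch assumes the temp file lives in the same directory as the target so that os.replace stays on one filesystem.

```python
import json
import os
import tempfile
from pathlib import Path

MAX_BARS = 1000  # matches the documented bound on the rolling list


def write_json_atomic(path: Path, payload) -> None:
    """Write to a temp file in the target directory, then swap it into place."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(payload, tmp)
        os.replace(tmp_name, path)  # readers see the old or the new file, never a mix
    except BaseException:
        os.unlink(tmp_name)  # do not leave a stray temp file behind
        raise


def upsert_bar(bars: list, bar: list) -> list:
    """Replace the last bar when timestamps match, else append; keep the newest MAX_BARS."""
    if bars and bars[-1][0] == bar[0]:
        bars[-1] = bar
    else:
        bars.append(bar)
    return bars[-MAX_BARS:]
```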
Visualization (app.py)
- Dash application with two graphs plus an OBI subplot:
  - OHLC + Volume subplot with shared x-axis.
  - OBI candlestick subplot (blue tones) sharing x-axis.
  - Depth (cumulative) chart for bids and asks.
- Polling interval (500 ms) callback reads JSON files and updates figures resiliently:
  - Caches last good values to tolerate in-flight writes/decoding errors.
  - Builds figures with Plotly dark theme.
- Exposed on http://localhost:8050 by default (host=0.0.0.0).
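The "last good value" caching could be sketched as follows, independent of Dash. The function name read_json_cached and the module-level cache are assumptions for illustration and are not taken from app.py.

```python
import json
from pathlib import Path

_last_good: dict[str, object] = {}  # path -> most recent successfully parsed payload


def read_json_cached(path: Path, default):
    """Return parsed JSON from path, or the last good value if the read fails."""
    key = str(path)
    try:
        value = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing file, decode error, or malformed JSON: reuse the previous payload.
        return _last_good.get(key, default)
    _last_good[key] = value
    return value
```

A polling callback can call this on every tick and rebuild figures from whatever it returns, so a failed read leaves the chart on its previous data instead of blanking it.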
CLI Orchestration (main.py)
- Typer CLI entrypoint:
  - Arguments: instrument, start_date, end_date (UTC, YYYY-MM-DD); options: --window-seconds.
  - Discovers SQLite files under ../data/OKX matching the instrument.
  - Launches Dash visualizer as a separate process: uv run python app.py.
  - Streams databases sequentially: for each book row, processes trades and updates orderbook.
Data Flow
- Discover and open SQLite database(s) for the requested instrument.
- Stream book rows with one-row lookahead to form time windows.
- Stream trades in timestamp order and bucket into the active window.
- For each window:
  - Aggregate trades into OHLC using OHLCProcessor.process_trades.
  - Apply partial depth updates via OHLCProcessor.update_orderbook and emit periodic snapshots.
- Persist current OHLC bar(s) and depth snapshots to JSON via atomic writes.
- Dash app polls JSON and renders charts.
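The lookahead-and-bucket step of this flow can be sketched as a generator. The function name bucket_trades and the (timestamp_ms, payload) shape of a book row are assumptions; the sketch also assumes that trades preceding the first book row are dropped and that the final window is open-ended.

```python
from typing import Iterable, Iterator


def bucket_trades(book_rows: Iterable[tuple], trades: Iterable[tuple]) -> Iterator[tuple]:
    """Yield (book_row, trades_in_window) pairs using a one-row lookahead.

    book_rows: (timestamp_ms, payload) tuples in time order (assumed layout).
    trades: (id, trade_id, price, size, side, timestamp_ms) tuples in time order.
    """
    trade_iter = iter(trades)
    pending = next(trade_iter, None)
    book_iter = iter(book_rows)
    current = next(book_iter, None)
    while current is not None:
        upcoming = next(book_iter, None)  # the lookahead row closes the window
        end_ts = upcoming[0] if upcoming is not None else float("inf")
        window = []
        while pending is not None and pending[5] < end_ts:
            if pending[5] >= current[0]:
                window.append(pending)
            # Trades earlier than the first book row fall through and are dropped.
            pending = next(trade_iter, None)
        yield current, window
        current = upcoming
```

Because both inputs are consumed as iterators, neither table has to be loaded into memory in full.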
IPC JSON Schemas
- OHLC (ohlc_data.json): array of bars; each bar is [ts, open, high, low, close, volume].
- Depth (depth_data.json): object with bids/asks arrays: {"bids": [[price, size], ...], "asks": [[price, size], ...]}.
- Metrics (metrics_data.json): array of bars; each bar is [ts, obi_open, obi_high, obi_low, obi_close, tot_open, tot_high, tot_low, tot_close].
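As an illustration of consuming the depth schema, the sketch below turns a snapshot into cumulative curves of the kind a cumulative depth chart would plot. The function name cumulative_depth is hypothetical, and accumulating outward from the best price on each side is an assumption about how the chart is built.

```python
from itertools import accumulate


def cumulative_depth(snapshot: dict):
    """Return (prices, cumulative_sizes) per side, accumulating away from the best price."""
    bids = sorted(snapshot.get("bids", []), reverse=True)  # best (highest) bid first
    asks = sorted(snapshot.get("asks", []))                # best (lowest) ask first
    bid_curve = ([p for p, _ in bids], list(accumulate(s for _, s in bids)))
    ask_curve = ([p for p, _ in asks], list(accumulate(s for _, s in asks)))
    return bid_curve, ask_curve
```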
Configuration
- OHLCProcessor(window_seconds, depth_levels_per_side) controls aggregation granularity and depth snapshot size.
- Visualizer interval (500 ms) balances UI responsiveness and CPU usage.
- Paths: JSON files (ohlc_data.json, depth_data.json) are colocated with the code and written atomically.
- CLI parameters select instrument and time range; databases are expected under ../data/OKX.
Performance Characteristics
- Read-only SQLite tuned for fast sequential scans: immutable URI, query_only, large mmap and cache.
- Batching minimizes cursor churn and Python overhead.
- JSON IPC uses atomic replace to avoid contention; OHLC list is bounded to 1000 entries.
- Processor throttles intra-window OHLC upserts and depth emissions to reduce I/O.
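Throttling of this kind is often implemented as a minimum interval between emissions. The sketch below shows that approach; whether the processor throttles by wall-clock time, by data timestamps, or by update count is not stated here, so the Throttle class and its interval-based policy are assumptions.

```python
import time


class Throttle:
    """Allow an action at most once per min_interval seconds."""

    def __init__(self, min_interval: float, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock  # injectable so tests can supply a fake clock
        self._last = None

    def ready(self) -> bool:
        """Return True and record the time if enough time has passed."""
        now = self.clock()
        if self._last is None or now - self._last >= self.min_interval:
            self._last = now
            return True
        return False
```

A caller would guard each write with a check such as `if throttle.ready(): ...`, skipping intermediate updates and relying on a final flush at window close.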
Error Handling
- Visualizer tolerates JSON decode races by reusing last good values and logging warnings.
- Processor guards depth parsing and writes; logs at debug/info levels.
- Visualizer startup is wrapped; if it fails, processing continues without UI.
Security Considerations
- SQLite connections are read-only and immutable; no write queries executed.
- File writes are confined to project directory; no paths derived from untrusted input.
- Logs avoid sensitive data; only operational metadata.
Testing Guidance
- Unit tests (run with uv run pytest):
  - OHLCProcessor: window boundary handling, high/low tracking, volume accumulation, upsert behavior.
  - Depth maintenance: deletions (size==0), top-N sorting, throttling.
  - DBInterpreter.stream: correct trade-window assignment, end-of-stream handling.
- Integration: end-to-end generation of JSON from a tiny fixture DB and basic figure construction without launching a server.
Roadmap (Optional Enhancements)
- Metrics: add OBI/CVD computation and persist metrics to a dedicated table.
- Repository Pattern: extract DB access into a repository module with typed methods.
- Orchestrator: introduce a Storage pipeline module coordinating batch processing and persistence.
- Strategy Layer: compute signals/alerts on stored metrics.
- Visualization: add OBI/CVD subplots and richer interactions.
This document reflects the current implementation centered on SQLite streaming, JSON-based IPC, and a Dash visualizer, providing a clear foundation for incremental enhancements.