# System Architecture

## Overview

The current system is a streamlined, high-performance pipeline that streams orderflow from SQLite databases, aggregates trades into OHLC bars, maintains a lightweight depth snapshot, and serves visuals via a Dash web application. Inter-process communication (IPC) between the processor and visualizer uses atomic JSON files for simplicity and robustness.

## High-Level Architecture

```
┌─────────────────┐    ┌─────────────────────┐    ┌──────────────────┐    ┌──────────────────┐
│  SQLite Files   │ →  │   DB Interpreter    │ →  │    OHLC/Depth    │ →  │ Dash Visualizer  │
│ (book, trades)  │    │   (stream rows)     │    │    Processor     │    │     (app.py)     │
└─────────────────┘    └─────────────────────┘    └─────────┬────────┘    └────────────▲─────┘
                                                            │                          │
                                                            │  Atomic JSON (IPC)       │
                                                            ▼                          │
                                             ohlc_data.json, depth_data.json           │
                                             metrics_data.json                         │
                                                                                       │
                                                                                  Browser UI
```

## Components

### Data Access (`db_interpreter.py`)

- `OrderbookLevel`: dataclass representing one price level.
- `OrderbookUpdate`: container for a book row window with `bids`, `asks`, `timestamp`, and `end_timestamp`.
- `DBInterpreter`:
  - `stream() -> Iterator[tuple[OrderbookUpdate, list[tuple]]]` streams the book table with lookahead and the trades table in timestamp order.
  - Efficient read-only connection with PRAGMA tuning: immutable mode, `query_only`, `temp_store=MEMORY`, `mmap_size`, `cache_size` (see the sketch after this list).
  - Batching constants: `BOOK_BATCH = 2048`, `TRADE_BATCH = 4096`.
  - Each yielded `trades` element is a tuple `(id, trade_id, price, size, side, timestamp_ms)` that falls within `[book.timestamp, next_book.timestamp)`.
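
The following is a minimal sketch of how such a read-only connection can be opened with the standard `sqlite3` module; the PRAGMA values are illustrative and may differ from what `db_interpreter.py` actually uses.

```python
import sqlite3
from pathlib import Path


def open_readonly(db_path: Path) -> sqlite3.Connection:
    """Open a SQLite file for fast, read-only sequential scans (illustrative values)."""
    # immutable=1 assumes no other process writes to the file while it is open.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro&immutable=1", uri=True)
    conn.execute("PRAGMA query_only = ON")        # reject any write statement
    conn.execute("PRAGMA temp_store = MEMORY")    # keep temporary structures off disk
    conn.execute("PRAGMA mmap_size = 268435456")  # 256 MiB memory-mapped I/O
    conn.execute("PRAGMA cache_size = -65536")    # 64 MiB page cache (negative = KiB)
    return conn
```

Rows can then be fetched in chunks (e.g. `cursor.fetchmany(BOOK_BATCH)`) to keep per-row Python overhead low.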

### Processing (Modular Architecture)

#### Main Coordinator (`ohlc_processor.py`)

- `OHLCProcessor(window_seconds=60, depth_levels_per_side=50)`: orchestrates trade processing using composition (usage sketched after this list).
  - `process_trades(trades)`: aggregates trades into OHLC bars and delegates CVD updates.
  - `update_orderbook(ob_update)`: coordinates orderbook updates and OBI metric calculation.
  - `finalize()`: finalizes both OHLC bars and metrics data.
  - `cvd_cumulative` (property): provides access to the cumulative volume delta.
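
A short usage sketch of this API, assuming the trade tuple layout described in the Data Access section; the method names come from the list above, while the literal trade values and the "buy"/"sell" side encoding are assumptions.

```python
from ohlc_processor import OHLCProcessor

processor = OHLCProcessor(window_seconds=60, depth_levels_per_side=50)

# Synthetic trades shaped as (id, trade_id, price, size, side, timestamp_ms).
trades = [
    (1, "t-1", 50_000.0, 0.25, "buy", 1_700_000_000_000),
    (2, "t-2", 50_010.0, 0.10, "sell", 1_700_000_012_000),
]
processor.process_trades(trades)   # aggregate into the active OHLC bar, update CVD

# ... feed OrderbookUpdate objects via processor.update_orderbook(ob_update) ...

processor.finalize()               # close out the last partial bar and metrics
print("cumulative volume delta:", processor.cvd_cumulative)
```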

#### Orderbook Management (`orderbook_manager.py`)

- `OrderbookManager`: handles in-memory orderbook state with partial updates.
  - Maintains separate bid/ask price→size dictionaries.
  - Supports deletions via zero-size updates.
  - Provides sorted top-N level extraction for visualization (see the sketch after this list).
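
A minimal sketch of the dictionary-based state described above; the class and method names here are illustrative and not the actual `orderbook_manager.py` API.

```python
class DepthState:
    """Illustrative bid/ask book keyed by price; not the real OrderbookManager interface."""

    def __init__(self) -> None:
        self.bids: dict[float, float] = {}  # price -> size
        self.asks: dict[float, float] = {}

    def apply(self, side: dict[float, float], levels: list[tuple[float, float]]) -> None:
        """Apply a partial update: zero size deletes the level, otherwise overwrite it."""
        for price, size in levels:
            if size <= 0:
                side.pop(price, None)
            else:
                side[price] = size

    def top_n(self, n: int) -> tuple[list[tuple[float, float]], list[tuple[float, float]]]:
        """Best N bids (highest prices first) and best N asks (lowest prices first)."""
        best_bids = sorted(self.bids.items(), key=lambda kv: kv[0], reverse=True)[:n]
        best_asks = sorted(self.asks.items(), key=lambda kv: kv[0])[:n]
        return best_bids, best_asks
```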

#### Metrics Calculation (`metrics_calculator.py`)

- `MetricsCalculator`: manages OBI and CVD metrics with windowed aggregation.
  - Tracks CVD from trade flow (buy vs. sell volume delta).
  - Calculates OBI from orderbook volume imbalance (formulas sketched after this list).
  - Provides throttled updates and OHLC-style metric bars.
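
For reference, a sketch of the conventional definitions of these two metrics; the exact windowing, level count, and normalization used in `metrics_calculator.py` may differ.

```python
def order_book_imbalance(bids: list[tuple[float, float]], asks: list[tuple[float, float]]) -> float:
    """OBI in [-1, 1]: +1 means all resting volume sits on the bid side."""
    bid_vol = sum(size for _, size in bids)
    ask_vol = sum(size for _, size in asks)
    total = bid_vol + ask_vol
    return 0.0 if total == 0 else (bid_vol - ask_vol) / total


def cumulative_volume_delta(trades: list[tuple[float, float, str]], cvd: float = 0.0) -> float:
    """Running buy-minus-sell volume; each trade here is (price, size, side)."""
    for _, size, side in trades:
        cvd += size if side == "buy" else -size
    return cvd
```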

#### Level Parsing (`level_parser.py`)

- Utility functions for normalizing orderbook level data:
  - `normalize_levels()`: parses levels, filtering zero/negative sizes.
  - `parse_levels_including_zeros()`: preserves zeros for deletion operations.
  - Supports JSON and Python literal formats with robust error handling (see the sketch after this list).
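
A sketch of what such parsing can look like, assuming levels arrive as JSON or Python-literal strings such as `"[[100.5, 2.0], [100.4, 0]]"`; the real helpers may accept additional shapes.

```python
import ast
import json


def parse_levels_including_zeros(raw: str) -> list[tuple[float, float]]:
    """Parse a level string into (price, size) pairs, keeping zero sizes for deletions."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        try:
            data = ast.literal_eval(raw)   # fall back to Python literal syntax
        except (ValueError, SyntaxError):
            return []                      # unparseable input yields no levels
    return [(float(price), float(size)) for price, size, *_ in data]


def normalize_levels(raw: str) -> list[tuple[float, float]]:
    """Same parsing, but drop zero/negative sizes."""
    return [(price, size) for price, size in parse_levels_including_zeros(raw) if size > 0]
```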

### Inter-Process Communication (`viz_io.py`)

- File paths (relative to project root):
  - `ohlc_data.json`: rolling list of OHLC bars (max 1000).
  - `depth_data.json`: latest depth snapshot (bids/asks).
  - `metrics_data.json`: rolling list of OBI/TOT OHLC bars (max 1000).
- Atomic writes via temp files prevent partial reads by the Dash app (see the sketch after this list).
- API:
  - `add_ohlc_bar(...)`: append a new bar; trim to last 1000.
  - `upsert_ohlc_bar(...)`: replace last bar if timestamp matches; else append; trim.
  - `clear_data()`: reset OHLC data to an empty list.
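
A minimal sketch of the write-to-temp-then-rename pattern behind these helpers; the function name `atomic_write_json` is illustrative, but `os.replace` is an atomic swap on the same filesystem.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write_json(path: Path, payload: object) -> None:
    """Write JSON so readers only ever see a complete file."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(payload, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())     # make sure the bytes hit disk first
        os.replace(tmp_name, path)     # atomic swap into place
    except BaseException:
        os.unlink(tmp_name)            # clean up the temp file on failure
        raise
```

`add_ohlc_bar` and `upsert_ohlc_bar` would then load the current list, modify it, trim it to 1000 entries, and hand the result to a writer like this.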

### Visualization (`app.py`)

- Dash application with two graphs plus an OBI subplot:
  - OHLC + Volume subplot with shared x-axis.
  - OBI candlestick subplot (blue tones) sharing the x-axis.
  - Depth (cumulative) chart for bids and asks.
- Polling-interval (500 ms) callback reads the JSON files and updates figures resiliently (see the sketch after this list):
  - Caches last good values to tolerate in-flight writes/decoding errors.
  - Builds figures with the Plotly dark theme.
- Exposed on `http://localhost:8050` by default (`host=0.0.0.0`).
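
A pared-down sketch of the polling pattern using `dcc.Interval`; the component IDs, single candlestick figure, and file handling here are placeholders rather than the actual layout in `app.py`.

```python
import json

import plotly.graph_objects as go
from dash import Dash, Input, Output, dcc, html

app = Dash(__name__)
app.layout = html.Div([
    dcc.Graph(id="ohlc-graph"),
    dcc.Interval(id="poll", interval=500),   # fire every 500 ms
])

_last_bars: list = []                        # cache of the last successfully read payload


@app.callback(Output("ohlc-graph", "figure"), Input("poll", "n_intervals"))
def refresh(_n):
    global _last_bars
    try:
        with open("ohlc_data.json") as fh:
            _last_bars = json.load(fh)
    except (OSError, json.JSONDecodeError):
        pass                                 # file missing or mid-write: reuse the cache
    fig = go.Figure(go.Candlestick(
        x=[bar[0] for bar in _last_bars],
        open=[bar[1] for bar in _last_bars],
        high=[bar[2] for bar in _last_bars],
        low=[bar[3] for bar in _last_bars],
        close=[bar[4] for bar in _last_bars],
    ))
    fig.update_layout(template="plotly_dark")
    return fig


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8050)
```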

### CLI Orchestration (`main.py`)

- Typer CLI entrypoint (see the sketch after this list):
  - Arguments: `instrument`, `start_date`, `end_date` (UTC, `YYYY-MM-DD`); options: `--window-seconds`.
  - Discovers SQLite files under `../data/OKX` matching the instrument.
  - Launches the Dash visualizer as a separate process: `uv run python app.py`.
  - Streams databases sequentially: for each book row, processes trades and updates the orderbook.
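
A minimal Typer sketch of such an entrypoint. The argument names mirror the list above, but the glob pattern, date handling, and shutdown of the visualizer process are assumptions.

```python
import subprocess
from pathlib import Path

import typer

app = typer.Typer()


@app.command()
def run(
    instrument: str,
    start_date: str,
    end_date: str,
    window_seconds: int = typer.Option(60, "--window-seconds"),
) -> None:
    """Stream databases for INSTRUMENT between START_DATE and END_DATE (UTC)."""
    db_files = sorted(Path("../data/OKX").glob(f"*{instrument}*.sqlite"))  # pattern is an assumption
    typer.echo(f"{instrument} {start_date}..{end_date}: {len(db_files)} database(s)")

    # Launch the visualizer as an independent process; processing continues even if it dies.
    viz = subprocess.Popen(["uv", "run", "python", "app.py"])
    try:
        for db in db_files:
            typer.echo(f"streaming {db.name} with {window_seconds}s bars")
            # ... DBInterpreter(db).stream() -> OHLCProcessor -> viz_io writes ...
    finally:
        viz.terminate()


if __name__ == "__main__":
    app()
```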

## Data Flow

1. Discover and open SQLite database(s) for the requested instrument.
2. Stream `book` rows with one-row lookahead to form time windows.
3. Stream `trades` in timestamp order and bucket them into the active window.
4. For each window:
   - Aggregate trades into OHLC using `OHLCProcessor.process_trades`.
   - Apply partial depth updates via `OHLCProcessor.update_orderbook` and emit periodic snapshots.
5. Persist current OHLC bar(s) and depth snapshots to JSON via atomic writes (the whole loop is sketched after this list).
6. Dash app polls JSON and renders charts.
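
Put together, the processing side of this flow reduces to a loop along the following lines; `clear_data` and `upsert_ohlc_bar` come from the IPC section, while the `DBInterpreter(db_path)` constructor and the placement of the persistence calls are assumptions.

```python
import viz_io
from db_interpreter import DBInterpreter
from ohlc_processor import OHLCProcessor


def run_pipeline(db_path: str, window_seconds: int = 60) -> None:
    processor = OHLCProcessor(window_seconds=window_seconds)
    viz_io.clear_data()                            # start the UI from a clean slate

    for ob_update, trades in DBInterpreter(db_path).stream():   # steps 2-3
        processor.process_trades(trades)           # step 4: OHLC bars + CVD
        processor.update_orderbook(ob_update)      # step 4: depth state + OBI
        # Step 5: the in-progress bar and a depth snapshot are persisted through
        # viz_io's atomic writers (e.g. upsert_ohlc_bar), subject to throttling.

    processor.finalize()                           # close out the last partial bar
```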

## IPC JSON Schemas

- OHLC (`ohlc_data.json`): array of bars; each bar is `[ts, open, high, low, close, volume]`.
- Depth (`depth_data.json`): object with bids/asks arrays: `{"bids": [[price, size], ...], "asks": [[price, size], ...]}`.
- Metrics (`metrics_data.json`): array of bars; each bar is `[ts, obi_open, obi_high, obi_low, obi_close, tot_open, tot_high, tot_low, tot_close]`.
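
For concreteness, a consumer of these payloads might look like the following sketch; the field order follows the schemas above, and the example values in the comments are made up.

```python
import json

with open("ohlc_data.json") as fh:
    bars = json.load(fh)     # e.g. [[1700000000000, 50000.0, 50100.0, 49950.0, 50080.0, 12.5], ...]

for ts, open_, high, low, close, volume in bars:
    print(f"{ts}: O={open_} H={high} L={low} C={close} V={volume}")

with open("depth_data.json") as fh:
    depth = json.load(fh)    # {"bids": [[price, size], ...], "asks": [[price, size], ...]}

best_bid = max(price for price, _ in depth["bids"])
best_ask = min(price for price, _ in depth["asks"])
print("spread:", best_ask - best_bid)
```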

## Configuration

- `OHLCProcessor(window_seconds, depth_levels_per_side)` controls aggregation granularity and depth snapshot size.
- The visualizer polling interval (500 ms) balances UI responsiveness and CPU usage.
- Paths: the JSON files (`ohlc_data.json`, `depth_data.json`, `metrics_data.json`) are colocated with the code and written atomically.
- CLI parameters select the instrument and time range; databases are expected under `../data/OKX`.

## Performance Characteristics

- Read-only SQLite tuned for fast sequential scans: immutable URI, `query_only`, large mmap and cache sizes.
- Batching minimizes cursor churn and Python overhead.
- JSON IPC uses atomic replace to avoid contention; the OHLC list is bounded to 1000 entries.
- The processor throttles intra-window OHLC upserts and depth emissions to reduce I/O.

## Error Handling

- The visualizer tolerates JSON decode races by reusing the last good values and logging warnings.
- The processor guards depth parsing and writes; it logs at debug/info levels.
- Visualizer startup is wrapped in error handling; if it fails, processing continues without the UI.

## Security Considerations

- SQLite connections are read-only and immutable; no write queries are executed.
- File writes are confined to the project directory; no paths are derived from untrusted input.
- Logs avoid sensitive data and contain only operational metadata.

## Testing Guidance

- Unit tests (run with `uv run pytest`):
  - `OHLCProcessor`: window boundary handling, high/low tracking, volume accumulation, upsert behavior (see the sketch after this list).
  - Depth maintenance: deletions (`size == 0`), top-N sorting, throttling.
  - `DBInterpreter.stream`: correct trade-window assignment, end-of-stream handling.
- Integration: end-to-end generation of JSON from a tiny fixture DB and basic figure construction without launching a server.
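
A hedged pytest sketch for the window-boundary case; the trade tuple layout comes from the Data Access section, while the `bars` attribute and its `[ts, open, high, low, close, volume]` layout are assumptions that a real test would replace with the actual accessors.

```python
from ohlc_processor import OHLCProcessor


def test_trades_in_different_windows_produce_separate_bars():
    processor = OHLCProcessor(window_seconds=60)

    # Two trades 61 seconds apart must land in different 60-second windows.
    first = (1, "t-1", 100.0, 1.0, "buy", 0)
    second = (2, "t-2", 101.0, 2.0, "sell", 61_000)
    processor.process_trades([first])
    processor.process_trades([second])
    processor.finalize()

    bars = processor.bars      # assumed accessor for completed bars
    assert len(bars) == 2
    assert bars[0][5] == 1.0   # volume of the first bar
    assert bars[1][5] == 2.0   # volume of the second bar
```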

## Roadmap (Optional Enhancements)

- Metrics: persist the computed OBI/CVD metrics to a dedicated database table (they currently live only in `metrics_data.json`).
- Repository Pattern: extract DB access into a repository module with typed methods.
- Orchestrator: introduce a `Storage` pipeline module coordinating batch processing and persistence.
- Strategy Layer: compute signals/alerts on stored metrics.
- Visualization: add a CVD subplot and richer interactions.

---

This document reflects the current implementation centered on SQLite streaming, JSON-based IPC, and a Dash visualizer, providing a clear foundation for incremental enhancements.