orderflow_backtest/tasks/prd-order-book-imbalance.md

2025-09-10 15:39:16 +08:00
## Order Book Imbalance (OBI) Product Requirements Document
### 1) Introduction / Overview
- Compute and visualize Order Book Imbalance (OBI) from the in-memory order book maintained by `OHLCProcessor`, aligned to the existing OHLC bar cadence.
- OBI is defined as raw `B - A`, where `B` is total bid size and `A` is total ask size.
- Persist an OBI time series as OHLC-style bars to `metrics_data.json` and render an OBI candlestick chart beneath the current Volume subplot in the Dash UI.
### 2) Goals
- Compute OBI from the full in-memory aggregated book (all bid/ask levels) on every order book update.
- Aggregate OBI into OHLC-style bars per `window_seconds`.
- Persist OBI bars to `metrics_data.json` with atomic writes and a rolling retention of 1000 rows.
- Add an OBI candlestick subplot (blue-toned) beneath Volume in the main chart, sharing the time axis.
- Throttle intra-window upserts of OBI bars using the same approach/frequency as current OHLC throttling; always write on window close.
### 3) User Stories
- As a researcher, I want OBI computed from the entire book so I can assess true depth imbalance.
- As an analyst, I want OBI stored per time window as candlesticks so I can compare it with price/volume behavior.
- As a developer, I want raw OBI values so I can analyze absolute imbalance patterns.
### 4) Functional Requirements
1. Inputs and Definitions
- Compute on every order book update using the complete in-memory book:
  - `B = sum(self._book_bids.values())`
  - `A = sum(self._book_asks.values())`
  - `OBI = B - A`
- Edge case: if both sides are empty → `OBI = 0`.
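
The definition above can be sketched as a standalone function. This is a minimal illustration, assuming each side of the book is a mapping from price to size (implied by the `.values()` calls); `compute_obi` is a hypothetical name, not an existing helper in the codebase.

```python
from typing import Mapping


def compute_obi(bids: Mapping[float, float], asks: Mapping[float, float]) -> float:
    """Return raw OBI = B - A over all levels of a price -> size book.

    Hypothetical helper: the mappings stand in for `_book_bids` / `_book_asks`.
    """
    total_bid_size = sum(bids.values())  # B
    total_ask_size = sum(asks.values())  # A
    # With both sides empty, both sums are 0, so the edge case needs no branch.
    return total_bid_size - total_ask_size


symmetric = compute_obi({100.0: 2.0, 99.5: 1.0}, {100.5: 1.0, 101.0: 2.0})
empty = compute_obi({}, {})
bid_heavy = compute_obi({100.0: 5.0}, {100.5: 1.5})
```

A positive result means more resting bid size than ask size; a negative result means the opposite.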
2. Windowing & Aggregation
- Use the same `window_seconds` boundary as OHLC bars; window anchor is derived from the order book update timestamp.
- Maintain OBI OHLC per window: `obi_open`, `obi_high`, `obi_low`, `obi_close`.
- On window rollover, finalize and persist the bar.
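
One possible shape for the per-window aggregation is sketched below. The class name `ObiBarAggregator` is hypothetical, and two things are assumptions rather than confirmed behavior: the window anchor is taken as the update timestamp floored to a `window_seconds` boundary, and update timestamps are assumed to arrive in non-decreasing order.

```python
from typing import List, Optional


class ObiBarAggregator:
    """Fold OBI samples into [timestamp_ms, open, high, low, close] bars."""

    def __init__(self, window_seconds: int) -> None:
        self.window_ms = window_seconds * 1000
        self.current: Optional[List[float]] = None  # bar being built, if any

    def update(self, timestamp_ms: int, obi: float) -> Optional[List[float]]:
        """Add one sample; return the finalized bar on rollover, else None."""
        # Assumption: anchor = timestamp floored to the window boundary.
        window_start = (timestamp_ms // self.window_ms) * self.window_ms
        finished = None
        if self.current is None or window_start != self.current[0]:
            finished = self.current  # None for the very first sample
            self.current = [window_start, obi, obi, obi, obi]
        else:
            bar = self.current
            bar[2] = max(bar[2], obi)  # obi_high
            bar[3] = min(bar[3], obi)  # obi_low
            bar[4] = obi               # obi_close
        return finished


aggregator = ObiBarAggregator(window_seconds=60)
closed_bars = []
for ts, value in [(60_000, 1.0), (75_000, 4.0), (90_000, -2.0), (120_000, 3.0)]:
    finished_bar = aggregator.update(ts, value)
    if finished_bar is not None:
        closed_bars.append(finished_bar)
```

After the loop, `closed_bars` holds the single finalized 60-second bar and `aggregator.current` holds the still-open bar, which a finalize path would persist at shutdown.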
3. Persistence
- Introduce `metrics_data.json` (co-located with other IPC files) with atomic writes.
- Schema: a list of fixed-length rows:
  - `[timestamp_ms, obi_open, obi_high, obi_low, obi_close]`
- Keep only the last 1000 rows.
- Upsert intra-window bars periodically (throttled, matching the OHLC approach) and always write on window close.
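
The persistence rules above (upsert, rolling trim, atomic write) could be combined roughly as follows. This is a sketch, not the existing `viz_io.py` code: `upsert_obi_bar` is a hypothetical name, and atomicity is assumed to come from writing a temporary file in the same directory and swapping it in with `os.replace`.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import List

MAX_ROWS = 1000  # rolling retention from the requirements


def upsert_obi_bar(path: Path, bar: List[float], max_rows: int = MAX_ROWS) -> List[List[float]]:
    """Insert or replace the bar for its timestamp, trim, and write atomically."""
    try:
        rows = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        rows = []  # missing or unreadable file: start a fresh series

    if rows and rows[-1][0] == bar[0]:
        rows[-1] = list(bar)   # same window: overwrite the in-progress bar
    else:
        rows.append(list(bar))  # new window: append
    rows = rows[-max_rows:]     # keep only the newest rows

    # Write to a temp file next to the target, then rename over it so that
    # readers never observe a half-written file.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(rows, handle)
        os.replace(tmp_name, path)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return rows
```

Calling the function twice with the same `timestamp_ms` updates the last row in place; calling it with a later timestamp appends a row and drops the oldest one once `max_rows` is exceeded.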
4. Visualization
- Read `metrics_data.json` in the Dash app with the same tolerant JSON reading/caching approach as other IPC files.
- Extend the main figure to a third row for OBI candlesticks beneath Volume, sharing the x-axis.
- Style OBI candlesticks in blue tones (distinct increasing/decreasing shades) and add a zero baseline.
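
For the tolerant read in the first bullet, a minimal sketch is shown below. It assumes "tolerant" means falling back to the last successfully parsed rows when the file is missing or mid-write; `read_metrics_rows` and the module-level cache are hypothetical and may differ from how the other IPC readers are implemented.

```python
from pathlib import Path
from typing import List
import json

_last_good_rows: List[list] = []  # hypothetical cache of the last valid read


def read_metrics_rows(path: Path) -> List[list]:
    """Return OBI rows from the metrics file, or the cached rows if unreadable."""
    global _last_good_rows
    try:
        rows = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing file, partial write, or bad encoding: keep showing old data.
        return _last_good_rows
    if isinstance(rows, list):
        # Keep only rows matching the 5-field schema.
        _last_good_rows = [
            row for row in rows if isinstance(row, list) and len(row) == 5
        ]
    return _last_good_rows
```

The rows returned here can be unpacked column-wise into the `x`, `open`, `high`, `low`, and `close` inputs of the OBI candlestick trace.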
5. Performance & Correctness
- OBI compute happens on every order book update; I/O is throttled to maintain UI responsiveness.
- Use existing logging and error handling patterns; the Dash app must not crash if `metrics_data.json` is temporarily unreadable.
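
The split between per-update computation and throttled I/O might be gated as in the sketch below. The actual OHLC throttle cadence is not stated in this document, so `min_interval_s` is a placeholder, and `UpsertThrottle` is a hypothetical name; the clock is injectable only to make the behavior easy to test.

```python
import time
from typing import Callable, Optional


class UpsertThrottle:
    """Allow a write at most once per interval, except on window close."""

    def __init__(
        self,
        min_interval_s: float,  # placeholder: real value should match OHLC throttling
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.min_interval_s = min_interval_s
        self._clock = clock
        self._last_write: Optional[float] = None

    def should_write(self, window_closed: bool = False) -> bool:
        """Return True if the caller should persist the current bar now."""
        now = self._clock()
        due = (
            self._last_write is None
            or now - self._last_write >= self.min_interval_s
        )
        if window_closed or due:
            self._last_write = now
            return True
        return False
```

In use, OBI is recomputed and folded into the in-memory bar on every update, while the file write happens only when `should_write(...)` returns `True`.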
6. Testing
- Unit tests for OBI on symmetric, empty, and imbalanced books; intra-window aggregation; window rollover.
- Integration test: fixture DB produces `metrics_data.json` aligned with OHLC bars, valid schema/lengths.
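
The schema and alignment checks in the integration test could be expressed with a helper like the one below. It is a sketch under assumptions: `find_metrics_errors` is a hypothetical name, "aligned" is read as every metrics timestamp also appearing as an OHLC bar timestamp, and rows are assumed to be strictly increasing in time.

```python
from typing import Iterable, List, Sequence


def find_metrics_errors(
    metric_rows: Sequence[Sequence[float]],
    ohlc_timestamps: Iterable[int],
    max_rows: int = 1000,
) -> List[str]:
    """Return human-readable problems found in metrics rows (empty if valid)."""
    errors: List[str] = []
    if len(metric_rows) > max_rows:
        errors.append(f"too many rows: {len(metric_rows)} > {max_rows}")

    known_timestamps = set(ohlc_timestamps)
    previous_ts = None
    for index, row in enumerate(metric_rows):
        if len(row) != 5:
            errors.append(f"row {index}: expected 5 fields, got {len(row)}")
            continue
        ts, obi_open, obi_high, obi_low, obi_close = row
        if ts not in known_timestamps:
            errors.append(f"row {index}: timestamp {ts} has no OHLC bar")
        if previous_ts is not None and ts <= previous_ts:
            errors.append(f"row {index}: timestamps not strictly increasing")
        # Open and close must both lie inside [low, high].
        if not (obi_low <= min(obi_open, obi_close)
                and max(obi_open, obi_close) <= obi_high):
            errors.append(f"row {index}: open/close outside [low, high]")
        previous_ts = ts
    return errors
```

An integration test can then assert that the helper returns an empty list for the `metrics_data.json` produced from the fixture DB.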
### 5) Non-Goals
- No additional derived metrics; keep only raw OBI values for maximum flexibility.
- No database persistence for metrics; JSON IPC only.
- No strategy/signal changes.
### 6) Design Considerations
- Reuse `OHLCProcessor` in-memory book (`_book_bids`, `_book_asks`).
- Introduce new metrics IO helpers in `viz_io.py` mirroring existing OHLC IO (atomic write, rolling trim, upsert).
- Keep `metrics_data.json` separate from `ohlc_data.json` to avoid schema churn.
### 7) Technical Considerations
- Implement OBI compute and aggregation inside `OHLCProcessor.update_orderbook` after applying partial updates.
- Throttle intra-window upserts with the same cadence concept as OHLC; on window close always persist.
- Add a finalize path to persist the last OBI bar.
### 8) Success Metrics
- `metrics_data.json` present with valid rows during processing.
- OBI subplot updates smoothly and aligns with OHLC window timestamps.
- OBI ≈ 0 for symmetric books; correct sign for imbalanced cases; no noticeable performance regression.
### 9) Open Questions
- None; cadence confirmed to match OHLC throttling. Styling: blue tones for OBI candlesticks.