orderflow_backtest/tasks/prd-cumulative-volume-delta.md

## Cumulative Volume Delta (CVD) Product Requirements Document
### 1) Introduction / Overview
- Compute and visualize Cumulative Volume Delta (CVD) from trade data processed by `OHLCProcessor.process_trades`, aligned to the existing OHLC bar cadence.
- CVD is defined as the cumulative sum of per-trade volume delta: a buy trade contributes `+size` and a sell trade contributes `-size`, so over any span the change in CVD equals buy volume minus sell volume.
- Trade classification: `side == "buy"` → positive volume delta, `side == "sell"` → negative volume delta.
- Persist CVD time series as scalar values per window to `metrics_data.json` and render a CVD line chart beneath the current OBI subplot in the Dash UI.
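The definition above can be illustrated with a minimal sketch. The `Trade` class here is a hypothetical stand-in for the project's Trade dataclass (only `side` is confirmed by this document; `timestamp` and `size` field names are assumed from the requirements below), and `volume_delta` / `cumulative_volume_delta` are illustrative helper names, not existing functions.

```python
from dataclasses import dataclass
from itertools import accumulate


@dataclass
class Trade:
    # Hypothetical stand-in for the project's Trade dataclass;
    # only the fields needed for this sketch.
    timestamp: float
    size: float
    side: str


def volume_delta(trade: Trade) -> float:
    """Signed size of one trade: +size for a buy, -size for a sell, 0 otherwise."""
    if trade.side == "buy":
        return trade.size
    if trade.side == "sell":
        return -trade.size
    return 0.0


def cumulative_volume_delta(trades):
    """Running sum of volume delta, one value per trade."""
    return list(accumulate(volume_delta(t) for t in trades))


trades = [
    Trade(0.0, 2.0, "buy"),
    Trade(1.0, 0.5, "sell"),
    Trade(2.0, 1.0, "buy"),
]
cvd_series = cumulative_volume_delta(trades)
print(cvd_series)  # [2.0, 1.5, 2.5]
```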
### 2) Goals
- Compute volume delta from individual trades using the `side` field in the Trade dataclass.
- Accumulate CVD across all processed trades (no session resets initially).
- Aggregate CVD into window-aligned scalar values per `window_seconds`.
- Extend `metrics_data.json` schema to include CVD values alongside existing OBI data.
- Add a CVD line chart subplot beneath OBI in the main chart, sharing the time axis.
- Throttle intra-window upserts of CVD values using the same approach/frequency as current OHLC throttling; always write on window close.
### 3) User Stories
- As a researcher, I want CVD computed from actual trade data so I can assess buying/selling pressure over time.
- As an analyst, I want CVD stored per time window so I can correlate it with price movements and OBI patterns.
- As a developer, I want cumulative CVD values so I can analyze long-term directional bias in volume flow.
### 4) Functional Requirements
1. Inputs and Definitions
- Compute volume delta on every trade in `OHLCProcessor.process_trades`:
- If `trade.side == "buy"` → `volume_delta = +trade.size`
- If `trade.side == "sell"` → `volume_delta = -trade.size`
- If `trade.side` is neither "buy" nor "sell" → `volume_delta = 0` (log warning)
- Accumulate into running CVD: `self.cvd_cumulative += volume_delta`
2. Windowing & Aggregation
- Use the same `window_seconds` boundary as OHLC bars; window anchor is derived from the trade timestamp.
- Store CVD value at window boundaries (end-of-window CVD snapshot).
- On window rollover, capture the current `self.cvd_cumulative` value for that window.
3. Persistence
- Extend `metrics_data.json` schema from `[timestamp, obi_open, obi_high, obi_low, obi_close]` to `[timestamp, obi_open, obi_high, obi_low, obi_close, cvd_value]`.
- Update `viz_io.py` functions to handle the new 6-element schema.
- Retain only the most recent 1000 rows in `metrics_data.json`.
- Upsert intra-window CVD values periodically (throttled, matching OHLC's approach) and always write on window close.
4. Visualization
- Read extended `metrics_data.json` in the Dash app with the same tolerant JSON reading/caching approach.
- Extend the main figure with a fourth row containing the CVD line chart beneath OBI, sharing the x-axis.
- Style CVD as a line chart with appropriate color (distinct from OHLC/Volume/OBI) and add a zero baseline.
5. Performance & Correctness
- CVD compute happens on every trade; I/O is throttled to maintain UI responsiveness.
- Use existing logging and error handling patterns; must not crash if metrics JSON is temporarily unreadable.
- Handle backward compatibility: if existing `metrics_data.json` has 5-element rows, treat missing CVD as 0.
6. Testing
- Unit tests for volume delta calculation with "buy", "sell", and invalid side values.
- Unit tests for CVD accumulation across multiple trades and window boundaries.
- Integration test: fixture trades produce correct CVD progression in `metrics_data.json`.
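Requirements 1 and 2 above can be sketched together as a standalone function, which may also serve as a starting point for the unit tests in requirement 6. This is an illustration under stated assumptions, not the `OHLCProcessor` implementation: trades are plain `(timestamp, size, side)` tuples, timestamps are assumed to be in seconds, and `window_start` and `cvd_by_window` are hypothetical names.

```python
import logging

logger = logging.getLogger(__name__)


def window_start(timestamp: float, window_seconds: int) -> int:
    """Floor a trade timestamp (assumed seconds) to its window anchor."""
    return int(timestamp // window_seconds) * window_seconds


def cvd_by_window(trades, window_seconds):
    """Return [window_start, cvd_value] rows, one per window that saw trades.

    `trades` is an iterable of (timestamp, size, side) tuples in time order.
    Each row holds the running CVD as of the end of that window.
    """
    cvd = 0.0
    rows = []
    current = None
    for timestamp, size, side in trades:
        anchor = window_start(timestamp, window_seconds)
        if current is not None and anchor != current:
            # Window rollover: snapshot the CVD accumulated so far.
            rows.append([current, cvd])
        current = anchor
        if side == "buy":
            cvd += size
        elif side == "sell":
            cvd -= size
        else:
            # Unknown side contributes zero volume delta.
            logger.warning("Unrecognized trade side %r; volume delta = 0", side)
    if current is not None:
        # Snapshot the final (possibly still open) window.
        rows.append([current, cvd])
    return rows


fixture = [
    (0.0, 2.0, "buy"),
    (30.0, 1.0, "sell"),
    (61.0, 3.0, "buy"),
    (185.0, 1.0, "unknown"),
]
window_rows = cvd_by_window(fixture, window_seconds=60)
print(window_rows)  # [[0, 1.0], [60, 4.0], [180, 4.0]]
```

In this sketch a window with no trades (the one starting at 120 in the fixture) produces no row; whether empty windows should carry the previous CVD forward is not specified in this document.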
### 5) Non-Goals
- No CVD reset functionality (will be implemented later).
- No additional derived CVD metrics (e.g., CVD rate of change, normalized CVD).
- No database persistence for CVD; JSON IPC only.
- No strategy/signal changes based on CVD.
### 6) Design Considerations
- Implement CVD calculation in `OHLCProcessor.process_trades` alongside existing OHLC aggregation.
- Extend `viz_io.py` metrics functions to support 6-element schema while maintaining backward compatibility.
- Add CVD state tracking: `self.cvd_cumulative`, `self.cvd_window_value` per window.
- Follow the same throttling pattern as OBI metrics for consistency.
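One way the state tracking and throttling could fit together is sketched below. The existing throttling mechanism is not described in this document, so the time-based `min_write_interval` is an assumption, as are the class name `CvdWindowState`, the `write_row` callback, and the explicit `now` argument (used here only to keep the example deterministic).

```python
class CvdWindowState:
    """Tracks running CVD and throttles intra-window writes (illustrative only)."""

    def __init__(self, window_seconds, write_row, min_write_interval=1.0):
        self.window_seconds = window_seconds
        self.write_row = write_row  # hypothetical callback: (window_start, cvd_value)
        self.min_write_interval = min_write_interval  # assumed throttle, in seconds
        self.cvd_cumulative = 0.0
        self.cvd_window_value = 0.0
        self.current_window = None
        self._last_write = None

    def on_trade(self, timestamp, size, side, now):
        anchor = int(timestamp // self.window_seconds) * self.window_seconds
        if self.current_window is not None and anchor != self.current_window:
            # Window close: always write the final value for the old window.
            self.write_row(self.current_window, self.cvd_window_value)
            self._last_write = now
        self.current_window = anchor

        if side == "buy":
            self.cvd_cumulative += size
        elif side == "sell":
            self.cvd_cumulative -= size
        self.cvd_window_value = self.cvd_cumulative

        # Intra-window upsert, at most once per min_write_interval.
        if self._last_write is None or now - self._last_write >= self.min_write_interval:
            self.write_row(anchor, self.cvd_window_value)
            self._last_write = now


writes = []
state = CvdWindowState(60, lambda w, v: writes.append((w, v)), min_write_interval=5.0)
state.on_trade(0.0, 2.0, "buy", now=0.0)    # first trade: written immediately
state.on_trade(1.0, 0.5, "sell", now=1.0)   # throttled: no write
state.on_trade(6.0, 1.0, "buy", now=6.0)    # interval elapsed: written
state.on_trade(61.0, 3.0, "sell", now=61.0) # rollover: old window written on close
print(writes)  # [(0, 2.0), (0, 2.5), (0, 2.5)]
```

Repeated writes for the same window are harmless if the writer upserts by timestamp, which is what the persistence requirements describe.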
### 7) Technical Considerations
- Add CVD computation in the trade processing loop within `OHLCProcessor.process_trades`.
- Extend `upsert_metric_bar` and `add_metric_bar` functions to accept optional `cvd_value` parameter.
- Handle schema migration gracefully: read existing 5-element rows, append 0.0 for missing CVD.
- Use the same window alignment as trades (based on trade timestamp, not orderbook timestamp).
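The schema migration could look roughly like the following sketch. It does not reproduce the actual `viz_io.py` functions; `normalize_metric_rows` and `load_metric_rows` are hypothetical helpers, and silently skipping rows of any other length is an assumption about how malformed data should be handled.

```python
import json

MAX_ROWS = 1000  # row cap taken from the persistence requirement


def normalize_metric_rows(rows):
    """Upgrade 5-element OBI rows to the 6-element schema with cvd_value = 0.0."""
    normalized = []
    for row in rows:
        if not isinstance(row, list):
            continue  # assumption: ignore malformed entries
        if len(row) == 5:
            normalized.append(row + [0.0])
        elif len(row) == 6:
            normalized.append(list(row))
        # Rows of any other length are skipped (assumption).
    return normalized[-MAX_ROWS:]


def load_metric_rows(path):
    """Tolerant read: return [] if the file is missing or mid-write."""
    try:
        with open(path, "r", encoding="utf-8") as fh:
            data = json.load(fh)
    except (OSError, json.JSONDecodeError):
        return []
    if not isinstance(data, list):
        return []
    return normalize_metric_rows(data)


legacy = [
    [1, 0.1, 0.2, 0.0, 0.1],        # old 5-element row
    [2, 0.1, 0.2, 0.0, 0.1, 5.0],   # new 6-element row
    [3],                            # malformed row
]
migrated = normalize_metric_rows(legacy)
print(migrated)
```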
### 8) Success Metrics
- `metrics_data.json` present with valid 6-element rows during processing.
- CVD subplot updates smoothly and aligns with OHLC window timestamps.
- CVD increases during buy-heavy periods, decreases during sell-heavy periods.
- No noticeable performance regression in trade processing or UI responsiveness.
### 9) Open Questions
- None; the CVD computation approach is confirmed using the `trade.side` field, and the schema extension approach is confirmed for `metrics_data.json`.