# PRD: Interactive Visualizer with Plotly + Dash
## Introduction/Overview
The current orderflow backtest system uses a static Matplotlib-based visualizer that displays OHLC candlesticks, volume bars, Order Book Imbalance (OBI), and Cumulative Volume Delta (CVD) charts. This PRD outlines the development of a new interactive visualization system using Plotly + Dash that will provide responsive interactivity, detailed data inspection, and an enhanced user experience for cryptocurrency trading analysis.
The goal is to replace the static visualization with a professional, web-based interactive dashboard that allows traders to explore orderbook metrics with precision and flexibility.
## Goals
1. **Replace Static Visualization**: Create a new `InteractiveVisualizer` class using Plotly + Dash
2. **Enable Cross-Chart Interactivity**: Implement synchronized zooming, panning, and time range selection across all charts
3. **Provide Precision Navigation**: Add crosshair cursor with vertical line indicator across all charts
4. **Display Contextual Information**: Show detailed metrics in a side panel when hovering over data points
5. **Support Multiple Time Granularities**: Allow users to adjust time resolution dynamically
6. **Maintain Performance**: Handle large datasets (months of data) with smooth interactions
7. **Preserve Integration**: Seamlessly integrate with existing metrics storage and data processing pipeline
## User Stories
### Primary Use Cases
- **US-1**: As a trader, I want to zoom into specific time periods across all charts simultaneously so that I can analyze market behavior during critical moments
- **US-2**: As a trader, I want to see a vertical crosshair line that spans all charts so that I can precisely align data points across OHLC, volume, OBI, and CVD metrics
- **US-3**: As a trader, I want to hover over any data point and see detailed information in a side panel so that I can inspect exact values without cluttering the charts
- **US-4**: As a trader, I want to pan through historical data smoothly so that I can explore different time periods efficiently
- **US-5**: As a trader, I want to reset CVD calculations from a selected point in time so that I can analyze cumulative volume delta from specific market events
### Secondary Use Cases
- **US-6**: As a trader, I want to adjust time granularity (e.g., 1min, 5min, 1hour) so that I can view data at different resolutions
- **US-7**: As a trader, I want navigation controls (reset zoom, home button) so that I can quickly return to full data view
- **US-8**: As a trader, I want to select custom time ranges so that I can focus analysis on specific market sessions
## Functional Requirements
### Core Interactive Features
1. **F1**: The system must provide synchronized zooming across all four charts (OHLC, Volume, OBI, CVD)
2. **F2**: The system must provide synchronized panning across all four charts with a shared X-axis
3. **F3**: The system must display a vertical crosshair line that spans all charts and follows the mouse cursor
4. **F4**: The system must show detailed hover information for each chart type:
   - OHLC: timestamp, open, high, low, close, spread
   - Volume: timestamp, total volume, buy/sell breakdown if available
   - OBI: timestamp, OBI value, bid volume, ask volume, imbalance percentage
   - CVD: timestamp, CVD value, volume delta, cumulative change
### User Interface Requirements
5. **F5**: The system must display charts in a 4-row layout with shared X-axis (OHLC on top, Volume, OBI, CVD at bottom)
6. **F6**: The system must provide a side panel on the right displaying detailed information for the current cursor position
7. **F7**: The system must include navigation controls:
   - Zoom in/out buttons
   - Reset zoom button
   - Home view button
   - Time range selector
8. **F8**: The system must provide time granularity controls (1min, 5min, 15min, 1hour, 6hour)
### Data Integration Requirements
9. **F9**: The system must integrate with existing `SQLiteOrderflowRepository` for metrics data loading
10. **F10**: The system must support loading data from multiple database files seamlessly
11. **F11**: The system must maintain the existing `set_db_path()` and `update_from_book()` interface for compatibility
12. **F12**: The system must calculate OHLC bars from snapshots with configurable time windows
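
One possible way to satisfy F12 is pandas resampling. The sketch below assumes snapshots have already been reduced to a timestamp-indexed price series; the `GRANULARITY` mapping and the `snapshots_to_ohlc` helper are hypothetical names, not part of the existing codebase.

```python
import pandas as pd

# Hypothetical mapping from the UI labels in F8 to pandas resample windows.
GRANULARITY = {"1min": "1min", "5min": "5min", "15min": "15min",
               "1hour": "60min", "6hour": "360min"}


def snapshots_to_ohlc(prices: pd.Series, granularity: str = "1min") -> pd.DataFrame:
    """Aggregate a timestamp-indexed price series into OHLC bars."""
    bars = prices.resample(GRANULARITY[granularity]).ohlc()
    return bars.dropna()  # drop windows that contain no snapshots


index = pd.to_datetime(["2024-01-01 00:00:05", "2024-01-01 00:00:30",
                        "2024-01-01 00:00:55", "2024-01-01 00:01:10"])
prices = pd.Series([100.0, 102.0, 99.0, 101.0], index=index)
bars = snapshots_to_ohlc(prices, "1min")
```

The resulting frame has `open`, `high`, `low`, and `close` columns, which map directly onto the arguments of a Plotly candlestick trace.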
### Performance Requirements
13. **F13**: The system must render charts with <2 second initial load time for datasets up to 1 million data points
14. **F14**: The system must provide smooth zooming and panning interactions with <100ms response time
15. **F15**: The system must efficiently update hover information with <50ms latency
### CVD Reset Functionality
16. **F16**: The system must allow users to click on any point in the CVD chart to reset cumulative calculation from that timestamp
17. **F17**: The system must visually indicate CVD reset points with markers or annotations
18. **F18**: The system must recalculate and redraw CVD values from the reset point forward
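
The recalculation in F18 could look roughly like the following. This is a minimal sketch that assumes per-bar volume deltas are available as a timestamp-indexed pandas series; `cvd_with_reset` is a hypothetical helper. It also assumes the cumulative sum restarts at the reset bar itself, so the first value after a reset equals that bar's own delta.

```python
import pandas as pd


def cvd_with_reset(volume_delta: pd.Series, reset_at=None) -> pd.Series:
    """Cumulative volume delta, optionally restarted at `reset_at`."""
    cvd = volume_delta.cumsum()
    if reset_at is not None:
        # Values before the reset point are left unchanged; values from the
        # reset point forward are re-accumulated starting from zero.
        after = volume_delta.index >= pd.Timestamp(reset_at)
        cvd[after] = volume_delta[after].cumsum()
    return cvd


index = pd.date_range("2024-01-01 00:00", periods=4, freq="1min")
delta = pd.Series([5, -2, 3, 4], index=index)

full = cvd_with_reset(delta)
reset = cvd_with_reset(delta, reset_at="2024-01-01 00:02")
```

With this sample, the values before 00:02 are identical in `full` and `reset`, and only the bars from 00:02 onward change.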
## Non-Goals (Out of Scope)
1. **Advanced Drawing Tools**: Trend lines, Fibonacci retracements, or annotation tools
2. **Multiple Instrument Support**: Multi-symbol comparison or overlay charts
3. **Real-time Streaming**: Live data updates or WebSocket integration
4. **Export Functionality**: Chart export to PNG/PDF or data export to CSV
5. **User Authentication**: User accounts, saved layouts, or personalization
6. **Mobile Optimization**: Touch interfaces or responsive mobile design
7. **Advanced Indicators**: Technical analysis indicators beyond OBI/CVD
8. **Alert System**: Price alerts, threshold notifications, or automated signals
## Design Considerations
### Chart Layout
- **Layout**: 4-row subplot layout with 80% chart area, 20% side panel
- **Color Scheme**: Professional dark theme with customizable colors
- **Typography**: Clear, readable fonts optimized for financial data
- **Responsive Design**: Adaptable to different screen sizes (desktop focus)
### Side Panel Design
```
┌─────────────────┐
│ Current Data    │
├─────────────────┤
│ Time: 16:30:45  │
│ Price: $50,123  │
│ Volume: 1,234   │
│ OBI: 0.234      │
│ CVD: -123.45    │
├─────────────────┤
│ Controls        │
│ [Reset CVD]     │
│ [Zoom Reset]    │
│ [Time Range ▼]  │
│ [Granularity ▼] │
└─────────────────┘
```
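
The "Current Data" section of the panel could be filled from the `hoverData` payload that a Dash `dcc.Graph` reports. The sketch below covers only the formatting step and uses the standard library alone. The `METRICS` lookup and the `format_side_panel` helper are hypothetical; in practice the values would come from the metrics pipeline, and the function would be called from a Dash callback.

```python
from typing import Optional

# Hypothetical per-timestamp metrics, keyed by the x value Plotly reports.
METRICS = {
    "2024-01-01 16:30:45": {"price": 50123.0, "volume": 1234.0,
                            "obi": 0.234, "cvd": -123.45},
}


def format_side_panel(hover_data: Optional[dict]) -> list:
    """Turn a Plotly hoverData payload into the side panel's text lines."""
    if not hover_data or not hover_data.get("points"):
        return ["Hover over a chart to inspect values"]
    ts = hover_data["points"][0]["x"]
    row = METRICS.get(ts)
    if row is None:
        return [f"Time: {ts}", "No metrics for this timestamp"]
    return [
        f"Time: {ts[-8:]}",  # keep only the HH:MM:SS part
        f"Price: ${row['price']:,.0f}",
        f"Volume: {row['volume']:,.0f}",
        f"OBI: {row['obi']:.3f}",
        f"CVD: {row['cvd']:.2f}",
    ]


lines = format_side_panel(
    {"points": [{"x": "2024-01-01 16:30:45", "y": 50123.0}]}
)
```

Keeping the formatting in a plain function makes it testable without starting the Dash server.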
### Navigation Controls
- **Zoom**: Mouse wheel, zoom box selection, zoom buttons
- **Pan**: Click and drag, arrow keys, scroll bars
- **Reset**: Double-click to auto-scale, reset button to full view
- **Selection**: Click and drag for time range selection
## Technical Considerations
### Architecture Changes
- **New Class**: `InteractiveVisualizer` class separate from existing `Visualizer`
- **Dependencies**: Add `dash`, `plotly`, `dash-bootstrap-components` to requirements
- **Web Server**: Dash development server for local deployment
- **Data Flow**: Maintain existing metrics loading pipeline, adapt to Plotly data structures
### Integration Points
```python
# Maintain the existing interface for compatibility
from pathlib import Path


class InteractiveVisualizer:
    def set_db_path(self, db_path: Path) -> None: ...
    def update_from_book(self, book: "Book") -> None: ...  # Book: existing project type
    def show(self) -> None: ...  # Launch the Dash server instead of plt.show()
```
### Data Structure Adaptation
- **OHLC Data**: Convert bars to Plotly candlestick format
- **Metrics Data**: Transform to Plotly time series format
- **Memory Management**: Implement data decimation for large datasets
- **Caching**: Cache processed data to improve interaction performance
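
One candidate for the decimation mentioned above is min/max bucketing, which keeps the lowest and highest point in each bucket so that spikes stay visible after downsampling. This is only a sketch of one option, not a chosen strategy (see the open question on decimation); `decimate_min_max` is a hypothetical helper and assumes `max_points >= 2`.

```python
def decimate_min_max(values, max_points):
    """Return indices to keep: the min and max of each bucket."""
    n = len(values)
    if n <= max_points:
        return list(range(n))          # nothing to drop
    buckets = max(1, max_points // 2)  # two points kept per bucket
    size = -(-n // buckets)            # ceiling division
    keep = []
    for start in range(0, n, size):
        idx = range(start, min(start + size, n))
        lo = min(idx, key=lambda i: values[i])
        hi = max(idx, key=lambda i: values[i])
        keep.extend(sorted({lo, hi}))  # preserve time order, drop duplicates
    return keep


values = [1, 5, 2, 9, 3, 4, 0, 7, 6, 8]
kept = decimate_min_max(values, max_points=4)
```

Returning indices instead of values lets the same selection be applied to the timestamp column and to every metric column.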
### Technology Stack
- **Frontend**: Dash + Plotly.js for charts
- **Backend**: Python Dash server with existing data pipeline
- **Styling**: Dash Bootstrap Components for professional UI
- **Data Processing**: Pandas for efficient data manipulation
## Success Metrics
### User Experience Metrics
1. **Interaction Responsiveness**: 95% of zoom/pan operations complete within 100ms
2. **Data Precision**: Crosshair position and hover values match the underlying data exactly (100% accuracy)
3. **Navigation Efficiency**: Users can navigate to specific time periods 3x faster than static charts
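
The responsiveness target above can be checked against measured interaction latencies. The sketch below assumes latencies are collected in milliseconds by some instrumentation that this PRD does not specify; `fraction_within` is a hypothetical helper and the sample values are made up.

```python
def fraction_within(latencies_ms, budget_ms=100.0):
    """Share of operations that finished in under `budget_ms`."""
    if not latencies_ms:
        raise ValueError("no latency samples")
    fast = sum(1 for v in latencies_ms if v < budget_ms)
    return fast / len(latencies_ms)


samples = [12.0, 35.0, 48.0, 90.0, 140.0]  # made-up measurements
share = fraction_within(samples)
meets_target = share >= 0.95               # target: 95% within 100ms
```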
### Technical Performance Metrics
4. **Load Time**: Initial chart rendering completes within 2 seconds for 500k data points
5. **Memory Usage**: Interactive visualizer uses less than 150% of the static version's memory
6. **Error Rate**: <1% interaction failures or display errors during normal usage
### Feature Adoption Metrics
7. **Feature Usage**: CVD reset functionality used in >30% of analysis sessions
8. **Time Range Analysis**: Custom time range selection used in >50% of sessions
9. **Granularity Changes**: Time resolution adjustment used in >40% of sessions
## Implementation Priority
### Phase 1: Core Interactive Charts (High Priority)
- Basic Plotly + Dash setup
- 4-chart layout with synchronized axes
- Basic zoom, pan, and crosshair functionality
- Integration with existing data pipeline
### Phase 2: Enhanced Interactivity (High Priority)
- Side panel with hover information
- Navigation controls and buttons
- Time granularity selection
- CVD reset functionality
### Phase 3: Performance Optimization (Medium Priority)
- Large dataset handling
- Interaction performance tuning
- Memory usage optimization
- Error handling and edge cases
### Phase 4: Polish and UX (Medium Priority)
- Professional styling and themes
- Enhanced navigation controls
- Time range selection tools
- User experience refinements
## Open Questions
1. **Deployment Method**: Should the interactive visualizer run as a local Dash server or be deployable as a standalone web application?
2. **Data Decimation Strategy**: How should the system handle datasets with millions of points while maintaining interactivity? Should it implement automatic decimation based on zoom level?
3. **CVD Reset Persistence**: Should CVD reset points be saved to the database or only exist in the current session?
4. **Multiple Database Sessions**: How should the interactive visualizer handle switching between different database files during the same session?
5. **Backward Compatibility**: Should the system maintain both static and interactive visualizers, or completely replace the matplotlib implementation?
6. **Configuration Management**: How should users configure default time granularities, color schemes, and layout preferences?
7. **Performance Baselines**: What are the acceptable performance thresholds for different dataset sizes and interaction types?
---
**Document Version**: 1.0
**Created**: Current Date
**Target Audience**: Junior Developer
**Estimated Implementation**: 3-4 weeks for complete feature set