Add interactive visualizer using Plotly and Dash, replacing the static matplotlib implementation. Introduce core modules for Dash app setup, custom components, and callback functions. Enhance data processing utilities for Plotly format integration and update dependencies in pyproject.toml.

2025-09-01 11:17:10 +08:00
parent fa6df78c1e
commit 36385af6f3
27 changed files with 1694 additions and 933 deletions


@@ -213,7 +213,7 @@ def get_best_bid_ask(snapshot: BookSnapshot) -> tuple[float | None, float | None
### SQLiteOrderflowRepository
-Read-only repository for orderbook and trades data.
+Repository for orderbook, trades data and metrics.
#### connect()
@@ -270,10 +270,6 @@ def iterate_book_rows(self, conn: sqlite3.Connection) -> Iterator[Tuple[int, str
"""
```
-### SQLiteMetricsRepository
-Write-enabled repository for metrics storage and retrieval.
#### create_metrics_table()
```python
@@ -659,7 +655,7 @@ for trades in trades_by_timestamp.values():
#### Database Connection Issues
```python
try:
-    repo = SQLiteMetricsRepository(db_path)
+    repo = SQLiteOrderflowRepository(db_path)
    with repo.connect() as conn:
        metrics = repo.load_metrics_by_timerange(conn, start, end)
except sqlite3.Error as e:
@@ -669,7 +665,7 @@ except sqlite3.Error as e:
#### Missing Metrics Table
```python
-repo = SQLiteMetricsRepository(db_path)
+repo = SQLiteOrderflowRepository(db_path)
with repo.connect() as conn:
    if not repo.table_exists(conn, "metrics"):
        repo.create_metrics_table(conn)


@@ -13,7 +13,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- **Persistent Metrics Storage**: SQLite-based storage for calculated metrics to avoid recalculation
- **Memory Optimization**: >70% reduction in peak memory usage through streaming processing
- **Enhanced Visualization**: Multi-subplot charts with OHLC, Volume, OBI, and CVD displays
-- **Metrics Repository**: `SQLiteMetricsRepository` for write-enabled database operations
- **MetricCalculator Class**: Static methods for financial metrics computation
- **Batch Processing**: High-performance batch inserts (1000 records per operation)
- **Time-Range Queries**: Efficient metrics retrieval for specified time periods


@@ -1,306 +0,0 @@
# Contributing to Orderflow Backtest System
## Development Guidelines
Thank you for your interest in contributing to the Orderflow Backtest System. This document outlines the development process, coding standards, and best practices for maintaining code quality.
## Development Environment Setup
### Prerequisites
- **Python**: 3.12 or higher
- **Package Manager**: UV (recommended) or pip
- **Database**: SQLite 3.x
- **GUI**: Qt5 for visualization (Linux/macOS)
### Installation
```bash
# Clone the repository
git clone <repository-url>
cd orderflow_backtest

# Install dependencies
uv sync

# Install development dependencies
uv add --dev pytest coverage mypy

# Verify installation
uv run pytest
```
### Development Tools
```bash
# Run tests
uv run pytest

# Run tests with coverage
uv run pytest --cov=. --cov-report=html

# Run type checking
uv run mypy .

# Run specific test module
uv run pytest tests/test_storage_metrics.py -v
```
## Code Standards
### Function and File Size Limits
- **Functions**: Maximum 50 lines
- **Files**: Maximum 250 lines
- **Classes**: Single responsibility, clear purpose
- **Methods**: One main function per method
### Naming Conventions
```python
# Good examples
def calculate_order_book_imbalance(snapshot: BookSnapshot) -> float: ...
def load_metrics_by_timerange(start: int, end: int) -> List[Metric]: ...

class MetricCalculator: ...
class SQLiteMetricsRepository: ...
# Avoid abbreviations except domain terms
# Good: OBI, CVD (standard financial terms)
# Avoid: calc_obi, proc_data, mgr
```
### Type Annotations
```python
# Required for all public interfaces
def process_trades(trades: List[Trade]) -> Dict[int, float]:
    """Process trades and return volume by timestamp."""

class Storage:
    def __init__(self, instrument: str) -> None:
        self.instrument = instrument
```
### Documentation Standards
```python
def calculate_metrics(snapshot: BookSnapshot, trades: List[Trade]) -> Metric:
    """
    Calculate OBI and CVD metrics for a snapshot.

    Args:
        snapshot: Orderbook state at specific timestamp
        trades: List of trades executed at this timestamp

    Returns:
        Metric: Calculated OBI, CVD, and best bid/ask values

    Raises:
        ValueError: If snapshot contains invalid data

    Example:
        >>> snapshot = BookSnapshot(...)
        >>> trades = [Trade(...), ...]
        >>> metric = calculate_metrics(snapshot, trades)
        >>> print(f"OBI: {metric.obi:.3f}")
        OBI: 0.333
    """
```
## Architecture Principles
### Separation of Concerns
- **Storage**: Data processing and persistence only
- **Strategy**: Trading analysis and signal generation only
- **Visualizer**: Chart rendering and display only
- **Main**: Application orchestration and flow control
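The division above can be sketched as a thin `main` that only wires the layers together. The class and method names below are illustrative stand-ins, not the project's actual API:

```python
from pathlib import Path

class Storage:
    """Data processing and persistence only."""
    def __init__(self, db_path: Path) -> None:
        self.db_path = db_path

    def load_metrics(self) -> list[float]:
        return [0.1, -0.2, 0.3]  # stand-in for a real SQLite read

class Strategy:
    """Trading analysis and signal generation only."""
    def signals(self, metrics: list[float]) -> list[str]:
        return ["buy" if m > 0 else "sell" for m in metrics]

class Visualizer:
    """Chart rendering and display only."""
    def render(self, signals: list[str]) -> str:
        return f"rendered {len(signals)} signals"

def main(db_path: Path) -> str:
    # Orchestration only: no SQL, no analysis, no drawing here
    metrics = Storage(db_path).load_metrics()
    return Visualizer().render(Strategy().signals(metrics))
```

Because each layer exposes a narrow interface, any of them can be tested in isolation by substituting the others.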
### Repository Pattern
```python
# Good: Clean interface
class SQLiteMetricsRepository:
    def load_metrics_by_timerange(self, conn: Connection, start: int, end: int) -> List[Metric]:
        ...  # Implementation details hidden

# Avoid: Direct SQL in business logic
def analyze_strategy(db_path: Path):
    # Don't do this
    conn = sqlite3.connect(db_path)
    cursor = conn.execute("SELECT * FROM metrics WHERE ...")
```
### Error Handling
```python
# Required pattern
try:
    result = risky_operation()
    return process_result(result)
except SpecificException as e:
    logging.error(f"Operation failed: {e}")
    return default_value
except Exception as e:
    logging.error(f"Unexpected error in operation: {e}")
    raise
```
## Testing Requirements
### Test Coverage
- **Unit Tests**: All public methods must have unit tests
- **Integration Tests**: End-to-end workflow testing required
- **Edge Cases**: Handle empty data, boundary conditions, error scenarios
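A table-driven test keeps those edge cases visible in one place; `total_volume` below is a hypothetical stand-in for a function under test:

```python
def total_volume(trades: list[float]) -> float:
    """Hypothetical stand-in for an aggregation under test."""
    return sum(trades)

# One row per edge case: empty data, zero boundary, normal input
EDGE_CASES = [
    ([], 0.0),
    ([0.0], 0.0),
    ([1.5, 2.5], 4.0),
]

def test_total_volume_edge_cases() -> None:
    for trades, expected in EDGE_CASES:
        assert total_volume(trades) == expected, (trades, expected)
```

Under pytest, the same table fits naturally into `@pytest.mark.parametrize`, which reports each case separately.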
### Test Structure
```python
def test_feature_description():
    """Test that feature behaves correctly under normal conditions."""
    # Arrange
    test_data = create_test_data()

    # Act
    result = function_under_test(test_data)

    # Assert
    assert result.expected_property == expected_value
    assert len(result.collection) == expected_count
```
### Test Data Management
```python
import tempfile
from pathlib import Path

# Use temporary files for database tests
def test_database_operation():
    with tempfile.NamedTemporaryFile(suffix=".db", delete=False) as tmp_file:
        db_path = Path(tmp_file.name)
    try:
        pass  # Test implementation
    finally:
        db_path.unlink(missing_ok=True)
```
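When tests run under pytest, the built-in `tmp_path` fixture gives the same isolation without manual cleanup (the table schema here is illustrative):

```python
import sqlite3
from pathlib import Path

def test_database_operation(tmp_path: Path) -> None:
    # tmp_path is pytest's built-in fixture: a fresh per-test directory
    db_path = tmp_path / "test.db"
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("CREATE TABLE metrics (timestamp INTEGER, obi REAL)")
        conn.execute("INSERT INTO metrics VALUES (?, ?)", (1, 0.5))
        assert conn.execute("SELECT COUNT(*) FROM metrics").fetchone()[0] == 1
    finally:
        conn.close()
    # No unlink needed: pytest removes tmp_path automatically after the test
```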
## Database Development
### Schema Changes
1. **Create Migration**: Document schema changes in ADR format
2. **Backward Compatibility**: Ensure existing databases continue to work
3. **Auto-Migration**: Implement automatic schema updates where possible
4. **Performance**: Add appropriate indexes for new queries
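Step 3 can often be a guard run at connection time; a minimal sketch, assuming a `metrics` table with this illustrative schema:

```python
import sqlite3

def ensure_schema(conn: sqlite3.Connection) -> None:
    """Create missing tables and indexes so older databases keep working."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS metrics (
            timestamp INTEGER PRIMARY KEY,
            obi REAL,
            cvd REAL
        )
        """
    )
    # Index backs the time-range queries introduced with this schema
    conn.execute("CREATE INDEX IF NOT EXISTS idx_metrics_ts ON metrics(timestamp)")
    conn.commit()
```

Because both statements use `IF NOT EXISTS`, the function is idempotent and safe to call against already-migrated databases.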
### Query Patterns
```python
# Good: Parameterized queries
cursor.execute(
    "SELECT obi, cvd FROM metrics WHERE timestamp >= ? AND timestamp <= ?",
    (start_timestamp, end_timestamp),
)

# Bad: String formatting (security risk)
query = f"SELECT * FROM metrics WHERE timestamp = {timestamp}"
```
### Performance Guidelines
- **Batch Operations**: Process in batches of 1000 records
- **Indexes**: Add indexes for frequently queried columns
- **Transactions**: Use transactions for multi-record operations
- **Connection Management**: Caller manages connection lifecycle
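Combining the batch-size and transaction guidelines, a batch insert might look like this (the `metrics` schema and record shape are assumptions, and the connection is passed in by the caller per the guideline above):

```python
import sqlite3
from itertools import islice
from typing import Iterable, Iterator, Tuple

BATCH_SIZE = 1000  # per the guideline above

Record = Tuple[int, float, float]  # assumed shape: (timestamp, obi, cvd)

def batched(records: Iterable[Record], size: int) -> Iterator[list[Record]]:
    it = iter(records)
    while batch := list(islice(it, size)):
        yield batch

def insert_metrics_batch(conn: sqlite3.Connection, records: Iterable[Record]) -> int:
    inserted = 0
    for batch in batched(records, BATCH_SIZE):
        with conn:  # one transaction per batch; commits or rolls back atomically
            conn.executemany(
                "INSERT INTO metrics (timestamp, obi, cvd) VALUES (?, ?, ?)",
                batch,
            )
        inserted += len(batch)
    return inserted
```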
## Performance Requirements
### Memory Management
- **Target**: >70% memory reduction vs. full snapshot retention
- **Measurement**: Profile memory usage with large datasets
- **Optimization**: Stream processing, batch operations, minimal object retention
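Peak usage can be compared before and after an optimization with the standard library's `tracemalloc`; the two workloads below are toy stand-ins for full snapshot retention versus streaming:

```python
import tracemalloc
from typing import Callable

def peak_memory_mb(workload: Callable[[], object]) -> float:
    """Run a workload and report its peak traced allocation in MiB."""
    tracemalloc.start()
    try:
        workload()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak / (1024 * 1024)

def full_retention() -> list[list[int]]:
    return [list(range(1000)) for _ in range(100)]  # keeps every snapshot alive

def streaming() -> int:
    total = 0
    for _ in range(100):
        total += sum(range(1000))  # process each snapshot, then discard it
    return total
```

Comparing `peak_memory_mb(full_retention)` against `peak_memory_mb(streaming)` yields the reduction percentage to check against the >70% target.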
### Processing Speed
- **Target**: >500 snapshots/second processing rate
- **Measurement**: Benchmark with realistic datasets
- **Optimization**: Database batching, efficient algorithms, minimal I/O
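The snapshots-per-second target can be checked with a simple timer harness; `process` here is whatever per-snapshot work is being benchmarked:

```python
import time
from typing import Callable, Iterable

def snapshots_per_second(snapshots: Iterable, process: Callable) -> float:
    items = list(snapshots)  # materialize first so only processing is timed
    start = time.perf_counter()
    for snap in items:
        process(snap)
    elapsed = time.perf_counter() - start
    return len(items) / elapsed if elapsed > 0 else float("inf")
```

Run it against a realistic dataset and assert the result exceeds 500 in a regression benchmark.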
### Storage Efficiency
- **Target**: <25% storage overhead for metrics
- **Measurement**: Compare metrics table size to source data
- **Optimization**: Efficient data types, minimal redundancy
## Submission Process
### Before Submitting
1. **Run Tests**: Ensure all tests pass
```bash
uv run pytest
```
2. **Check Type Hints**: Verify type annotations
```bash
uv run mypy .
```
3. **Test Coverage**: Ensure adequate test coverage
```bash
uv run pytest --cov=. --cov-report=term-missing
```
4. **Documentation**: Update relevant documentation files
### Pull Request Guidelines
- **Description**: Clear description of changes and motivation
- **Testing**: Include tests for new functionality
- **Documentation**: Update docs for API changes
- **Breaking Changes**: Document any breaking changes
- **Performance**: Include performance impact analysis for significant changes
### Code Review Checklist
- [ ] Follows function/file size limits
- [ ] Has comprehensive test coverage
- [ ] Includes proper error handling
- [ ] Uses type annotations consistently
- [ ] Maintains backward compatibility
- [ ] Updates relevant documentation
- [ ] No security vulnerabilities (SQL injection, etc.)
- [ ] Performance impact analyzed
## Documentation Maintenance
### When to Update Documentation
- **API Changes**: Any modification to public interfaces
- **Architecture Changes**: New patterns, data structures, or workflows
- **Performance Changes**: Significant performance improvements or regressions
- **Feature Additions**: New capabilities or metrics
### Documentation Types
- **Code Comments**: Complex algorithms and business logic
- **Docstrings**: All public functions and classes
- **Module Documentation**: Purpose and usage examples
- **Architecture Documentation**: System design and component relationships
## Getting Help
### Resources
- **Architecture Overview**: `docs/architecture.md`
- **API Documentation**: `docs/API.md`
- **Module Documentation**: `docs/modules/`
- **Decision Records**: `docs/decisions/`
### Communication
- **Issues**: Use GitHub issues for bug reports and feature requests
- **Discussions**: Use GitHub discussions for questions and design discussions
- **Code Review**: Comment on pull requests for specific code feedback
---
## Development Workflow
### Feature Development
1. **Create Branch**: Feature-specific branch from main
2. **Develop**: Follow coding standards and test requirements
3. **Test**: Comprehensive testing including edge cases
4. **Document**: Update relevant documentation
5. **Review**: Submit pull request for code review
6. **Merge**: Merge after approval and CI success
### Bug Fixes
1. **Reproduce**: Create test that reproduces the bug
2. **Fix**: Implement minimal fix addressing root cause
3. **Verify**: Ensure fix resolves issue without regressions
4. **Test**: Add regression test to prevent future occurrences
### Performance Improvements
1. **Benchmark**: Establish baseline performance metrics
2. **Optimize**: Implement performance improvements
3. **Measure**: Verify performance gains with benchmarks
4. **Document**: Update performance characteristics in docs
Thank you for contributing to the Orderflow Backtest System! Your contributions help make this a better tool for cryptocurrency trading analysis.


@@ -53,15 +53,12 @@ MetricCalculator # Static methods for OBI/CVD computation
**Purpose**: Database access and persistence layer
```python
-# Read-only base repository
+# Repository
SQLiteOrderflowRepository:
    - connect()                    # Optimized SQLite connection
    - load_trades_by_timestamp()   # Efficient trade loading
    - iterate_book_rows()          # Memory-efficient snapshot streaming
    - count_rows()                 # Performance monitoring
-# Write-enabled metrics repository
-SQLiteMetricsRepository:
    - create_metrics_table()       # Schema creation
    - insert_metrics_batch()       # High-performance batch inserts
    - load_metrics_by_timerange()  # Time-range queries