Add interactive visualizer using Plotly and Dash, replacing the static matplotlib implementation. Introduce core modules for Dash app setup, custom components, and callback functions. Enhance data processing utilities for Plotly format integration and update dependencies in pyproject.toml.

2025-09-01 11:17:10 +08:00
parent fa6df78c1e
commit 36385af6f3
27 changed files with 1694 additions and 933 deletions


@@ -213,7 +213,7 @@ def get_best_bid_ask(snapshot: BookSnapshot) -> tuple[float | None, float | None
### SQLiteOrderflowRepository
-Read-only repository for orderbook and trades data.
+Repository for orderbook, trades data and metrics.
#### connect()
@@ -270,10 +270,6 @@ def iterate_book_rows(self, conn: sqlite3.Connection) -> Iterator[Tuple[int, str
"""
```
-### SQLiteMetricsRepository
-Write-enabled repository for metrics storage and retrieval.
#### create_metrics_table()
```python
@@ -659,7 +655,7 @@ for trades in trades_by_timestamp.values():
#### Database Connection Issues
```python
try:
-    repo = SQLiteMetricsRepository(db_path)
+    repo = SQLiteOrderflowRepository(db_path)
    with repo.connect() as conn:
        metrics = repo.load_metrics_by_timerange(conn, start, end)
except sqlite3.Error as e:
@@ -669,7 +665,7 @@ except sqlite3.Error as e:
#### Missing Metrics Table
```python
-repo = SQLiteMetricsRepository(db_path)
+repo = SQLiteOrderflowRepository(db_path)
with repo.connect() as conn:
    if not repo.table_exists(conn, "metrics"):
        repo.create_metrics_table(conn)


@@ -13,7 +13,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- **Persistent Metrics Storage**: SQLite-based storage for calculated metrics to avoid recalculation
- **Memory Optimization**: >70% reduction in peak memory usage through streaming processing
- **Enhanced Visualization**: Multi-subplot charts with OHLC, Volume, OBI, and CVD displays
-- **Metrics Repository**: `SQLiteMetricsRepository` for write-enabled database operations
- **MetricCalculator Class**: Static methods for financial metrics computation
- **Batch Processing**: High-performance batch inserts (1000 records per operation)
- **Time-Range Queries**: Efficient metrics retrieval for specified time periods


@@ -1,306 +0,0 @@
# Contributing to Orderflow Backtest System
## Development Guidelines
Thank you for your interest in contributing to the Orderflow Backtest System. This document outlines the development process, coding standards, and best practices for maintaining code quality.
## Development Environment Setup
### Prerequisites
- **Python**: 3.12 or higher
- **Package Manager**: UV (recommended) or pip
- **Database**: SQLite 3.x
- **GUI**: Qt5 for visualization (Linux/macOS)
### Installation
```bash
# Clone the repository
git clone <repository-url>
cd orderflow_backtest

# Install dependencies
uv sync

# Install development dependencies
uv add --dev pytest coverage mypy

# Verify installation
uv run pytest
```
### Development Tools
```bash
# Run tests
uv run pytest

# Run tests with coverage
uv run pytest --cov=. --cov-report=html

# Run type checking
uv run mypy .

# Run specific test module
uv run pytest tests/test_storage_metrics.py -v
```
## Code Standards
### Function and File Size Limits
- **Functions**: Maximum 50 lines
- **Files**: Maximum 250 lines
- **Classes**: Single responsibility, clear purpose
- **Methods**: One main function per method
### Naming Conventions
```python
# Good examples
def calculate_order_book_imbalance(snapshot: BookSnapshot) -> float: ...
def load_metrics_by_timerange(start: int, end: int) -> List[Metric]: ...

class MetricCalculator: ...
class SQLiteMetricsRepository: ...
# Avoid abbreviations except domain terms
# Good: OBI, CVD (standard financial terms)
# Avoid: calc_obi, proc_data, mgr
```
### Type Annotations
```python
# Required for all public interfaces
def process_trades(trades: List[Trade]) -> Dict[int, float]:
    """Process trades and return volume by timestamp."""

class Storage:
    def __init__(self, instrument: str) -> None:
        self.instrument = instrument
```
### Documentation Standards
```python
def calculate_metrics(snapshot: BookSnapshot, trades: List[Trade]) -> Metric:
    """
    Calculate OBI and CVD metrics for a snapshot.

    Args:
        snapshot: Orderbook state at specific timestamp
        trades: List of trades executed at this timestamp

    Returns:
        Metric: Calculated OBI, CVD, and best bid/ask values

    Raises:
        ValueError: If snapshot contains invalid data

    Example:
        >>> snapshot = BookSnapshot(...)
        >>> trades = [Trade(...), ...]
        >>> metric = calculate_metrics(snapshot, trades)
        >>> print(f"OBI: {metric.obi:.3f}")
        OBI: 0.333
    """
```
## Architecture Principles
### Separation of Concerns
- **Storage**: Data processing and persistence only
- **Strategy**: Trading analysis and signal generation only
- **Visualizer**: Chart rendering and display only
- **Main**: Application orchestration and flow control
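The division above can be sketched as a thin `main` that only wires the layers together. The class and method names below are illustrative stand-ins, not the project's actual API:

```python
from pathlib import Path

class Storage:
    """Data processing and persistence only."""
    def __init__(self, db_path: Path) -> None:
        self.db_path = db_path

    def load_metrics(self) -> list[float]:
        return [0.1, -0.2, 0.3]  # stand-in for a real SQLite read

class Strategy:
    """Trading analysis and signal generation only."""
    def signals(self, metrics: list[float]) -> list[str]:
        return ["buy" if m > 0 else "sell" for m in metrics]

class Visualizer:
    """Chart rendering and display only."""
    def render(self, signals: list[str]) -> str:
        return f"rendered {len(signals)} signals"

def main(db_path: Path) -> str:
    # Orchestration only: no SQL, no analysis, no drawing here
    metrics = Storage(db_path).load_metrics()
    return Visualizer().render(Strategy().signals(metrics))
```

Because each layer exposes a narrow interface, any of them can be tested in isolation by substituting the others.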
### Repository Pattern
```python
# Good: Clean interface
class SQLiteMetricsRepository:
    def load_metrics_by_timerange(self, conn: Connection, start: int, end: int) -> List[Metric]:
        ...  # Implementation details hidden

# Avoid: Direct SQL in business logic
def analyze_strategy(db_path: Path):
    # Don't do this
    conn = sqlite3.connect(db_path)
    cursor = conn.execute("SELECT * FROM metrics WHERE ...")
```
### Error Handling
```python
# Required pattern
try:
    result = risky_operation()
    return process_result(result)
except SpecificException as e:
    logging.error(f"Operation failed: {e}")
    return default_value
except Exception as e:
    logging.error(f"Unexpected error in operation: {e}")
    raise
```
## Testing Requirements
### Test Coverage
- **Unit Tests**: All public methods must have unit tests
- **Integration Tests**: End-to-end workflow testing required
- **Edge Cases**: Handle empty data, boundary conditions, error scenarios
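A table-driven test keeps those edge cases visible in one place; `total_volume` below is a hypothetical stand-in for a function under test:

```python
def total_volume(trades: list[float]) -> float:
    """Hypothetical stand-in for an aggregation under test."""
    return sum(trades)

# One row per edge case: empty data, zero boundary, normal input
EDGE_CASES = [
    ([], 0.0),
    ([0.0], 0.0),
    ([1.5, 2.5], 4.0),
]

def test_total_volume_edge_cases() -> None:
    for trades, expected in EDGE_CASES:
        assert total_volume(trades) == expected, (trades, expected)
```

Under pytest, the same table fits naturally into `@pytest.mark.parametrize`, which reports each case separately.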
### Test Structure
```python
def test_feature_description():
    """Test that feature behaves correctly under normal conditions."""
    # Arrange
    test_data = create_test_data()

    # Act
    result = function_under_test(test_data)

    # Assert
    assert result.expected_property == expected_value
    assert len(result.collection) == expected_count
```
### Test Data Management
```python
import tempfile
from pathlib import Path

# Use temporary files for database tests
def test_database_operation():
    with tempfile.NamedTemporaryFile(suffix=".db", delete=False) as tmp_file:
        db_path = Path(tmp_file.name)
    try:
        pass  # Test implementation
    finally:
        db_path.unlink(missing_ok=True)
```
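When tests run under pytest, the built-in `tmp_path` fixture gives the same isolation without manual cleanup (the table schema here is illustrative):

```python
import sqlite3
from pathlib import Path

def test_database_operation(tmp_path: Path) -> None:
    # tmp_path is pytest's built-in fixture: a fresh per-test directory
    db_path = tmp_path / "test.db"
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("CREATE TABLE metrics (timestamp INTEGER, obi REAL)")
        conn.execute("INSERT INTO metrics VALUES (?, ?)", (1, 0.5))
        assert conn.execute("SELECT COUNT(*) FROM metrics").fetchone()[0] == 1
    finally:
        conn.close()
    # No unlink needed: pytest removes tmp_path automatically after the test
```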
## Database Development
### Schema Changes
1. **Create Migration**: Document schema changes in ADR format
2. **Backward Compatibility**: Ensure existing databases continue to work
3. **Auto-Migration**: Implement automatic schema updates where possible
4. **Performance**: Add appropriate indexes for new queries
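Step 3 can often be a guard run at connection time; a minimal sketch, assuming a `metrics` table with this illustrative schema:

```python
import sqlite3

def ensure_schema(conn: sqlite3.Connection) -> None:
    """Create missing tables and indexes so older databases keep working."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS metrics (
            timestamp INTEGER PRIMARY KEY,
            obi REAL,
            cvd REAL
        )
        """
    )
    # Index backs the time-range queries introduced with this schema
    conn.execute("CREATE INDEX IF NOT EXISTS idx_metrics_ts ON metrics(timestamp)")
    conn.commit()
```

Because both statements use `IF NOT EXISTS`, the function is idempotent and safe to call against already-migrated databases.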
### Query Patterns
```python
# Good: Parameterized queries
cursor.execute(
    "SELECT obi, cvd FROM metrics WHERE timestamp >= ? AND timestamp <= ?",
    (start_timestamp, end_timestamp),
)

# Bad: String formatting (security risk)
query = f"SELECT * FROM metrics WHERE timestamp = {timestamp}"
```
### Performance Guidelines
- **Batch Operations**: Process in batches of 1000 records
- **Indexes**: Add indexes for frequently queried columns
- **Transactions**: Use transactions for multi-record operations
- **Connection Management**: Caller manages connection lifecycle
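Combining the batch-size and transaction guidelines, a batch insert might look like this (the `metrics` schema and record shape are assumptions, and the connection is passed in by the caller per the guideline above):

```python
import sqlite3
from itertools import islice
from typing import Iterable, Iterator, Tuple

BATCH_SIZE = 1000  # per the guideline above

Record = Tuple[int, float, float]  # assumed shape: (timestamp, obi, cvd)

def batched(records: Iterable[Record], size: int) -> Iterator[list[Record]]:
    it = iter(records)
    while batch := list(islice(it, size)):
        yield batch

def insert_metrics_batch(conn: sqlite3.Connection, records: Iterable[Record]) -> int:
    inserted = 0
    for batch in batched(records, BATCH_SIZE):
        with conn:  # one transaction per batch; commits or rolls back atomically
            conn.executemany(
                "INSERT INTO metrics (timestamp, obi, cvd) VALUES (?, ?, ?)",
                batch,
            )
        inserted += len(batch)
    return inserted
```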
## Performance Requirements
### Memory Management
- **Target**: >70% memory reduction vs. full snapshot retention
- **Measurement**: Profile memory usage with large datasets
- **Optimization**: Stream processing, batch operations, minimal object retention
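Peak usage can be compared before and after an optimization with the standard library's `tracemalloc`; the two workloads below are toy stand-ins for full snapshot retention versus streaming:

```python
import tracemalloc
from typing import Callable

def peak_memory_mb(workload: Callable[[], object]) -> float:
    """Run a workload and report its peak traced allocation in MiB."""
    tracemalloc.start()
    try:
        workload()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak / (1024 * 1024)

def full_retention() -> list[list[int]]:
    return [list(range(1000)) for _ in range(100)]  # keeps every snapshot alive

def streaming() -> int:
    total = 0
    for _ in range(100):
        total += sum(range(1000))  # process each snapshot, then discard it
    return total
```

Comparing `peak_memory_mb(full_retention)` against `peak_memory_mb(streaming)` yields the reduction percentage to check against the >70% target.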
### Processing Speed
- **Target**: >500 snapshots/second processing rate
- **Measurement**: Benchmark with realistic datasets
- **Optimization**: Database batching, efficient algorithms, minimal I/O
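The snapshots-per-second target can be checked with a simple timer harness; `process` here is whatever per-snapshot work is being benchmarked:

```python
import time
from typing import Callable, Iterable

def snapshots_per_second(snapshots: Iterable, process: Callable) -> float:
    items = list(snapshots)  # materialize first so only processing is timed
    start = time.perf_counter()
    for snap in items:
        process(snap)
    elapsed = time.perf_counter() - start
    return len(items) / elapsed if elapsed > 0 else float("inf")
```

Run it against a realistic dataset and assert the result exceeds 500 in a regression benchmark.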
### Storage Efficiency
- **Target**: <25% storage overhead for metrics
- **Measurement**: Compare metrics table size to source data
- **Optimization**: Efficient data types, minimal redundancy
## Submission Process
### Before Submitting
1. **Run Tests**: Ensure all tests pass
```bash
uv run pytest
```
2. **Check Type Hints**: Verify type annotations
```bash
uv run mypy .
```
3. **Test Coverage**: Ensure adequate test coverage
```bash
uv run pytest --cov=. --cov-report=term-missing
```
4. **Documentation**: Update relevant documentation files
### Pull Request Guidelines
- **Description**: Clear description of changes and motivation
- **Testing**: Include tests for new functionality
- **Documentation**: Update docs for API changes
- **Breaking Changes**: Document any breaking changes
- **Performance**: Include performance impact analysis for significant changes
### Code Review Checklist
- [ ] Follows function/file size limits
- [ ] Has comprehensive test coverage
- [ ] Includes proper error handling
- [ ] Uses type annotations consistently
- [ ] Maintains backward compatibility
- [ ] Updates relevant documentation
- [ ] No security vulnerabilities (SQL injection, etc.)
- [ ] Performance impact analyzed
## Documentation Maintenance
### When to Update Documentation
- **API Changes**: Any modification to public interfaces
- **Architecture Changes**: New patterns, data structures, or workflows
- **Performance Changes**: Significant performance improvements or regressions
- **Feature Additions**: New capabilities or metrics
### Documentation Types
- **Code Comments**: Complex algorithms and business logic
- **Docstrings**: All public functions and classes
- **Module Documentation**: Purpose and usage examples
- **Architecture Documentation**: System design and component relationships
## Getting Help
### Resources
- **Architecture Overview**: `docs/architecture.md`
- **API Documentation**: `docs/API.md`
- **Module Documentation**: `docs/modules/`
- **Decision Records**: `docs/decisions/`
### Communication
- **Issues**: Use GitHub issues for bug reports and feature requests
- **Discussions**: Use GitHub discussions for questions and design discussions
- **Code Review**: Comment on pull requests for specific code feedback
---
## Development Workflow
### Feature Development
1. **Create Branch**: Feature-specific branch from main
2. **Develop**: Follow coding standards and test requirements
3. **Test**: Comprehensive testing including edge cases
4. **Document**: Update relevant documentation
5. **Review**: Submit pull request for code review
6. **Merge**: Merge after approval and CI success
### Bug Fixes
1. **Reproduce**: Create test that reproduces the bug
2. **Fix**: Implement minimal fix addressing root cause
3. **Verify**: Ensure fix resolves issue without regressions
4. **Test**: Add regression test to prevent future occurrences
### Performance Improvements
1. **Benchmark**: Establish baseline performance metrics
2. **Optimize**: Implement performance improvements
3. **Measure**: Verify performance gains with benchmarks
4. **Document**: Update performance characteristics in docs
Thank you for contributing to the Orderflow Backtest System! Your contributions help make this a better tool for cryptocurrency trading analysis.


@@ -53,15 +53,12 @@ MetricCalculator # Static methods for OBI/CVD computation
**Purpose**: Database access and persistence layer
```python
-# Read-only base repository
+# Repository
SQLiteOrderflowRepository:
    - connect()                    # Optimized SQLite connection
    - load_trades_by_timestamp()   # Efficient trade loading
    - iterate_book_rows()          # Memory-efficient snapshot streaming
    - count_rows()                 # Performance monitoring
-# Write-enabled metrics repository
-SQLiteMetricsRepository:
    - create_metrics_table()       # Schema creation
    - insert_metrics_batch()       # High-performance batch inserts
    - load_metrics_by_timerange()  # Time-range queries