# Contributing to Orderflow Backtest System

## Development Guidelines

Thank you for your interest in contributing to the Orderflow Backtest System. This document outlines the development process, coding standards, and best practices for maintaining code quality.

## Development Environment Setup

### Prerequisites

- **Python**: 3.12 or higher
- **Package Manager**: UV (recommended) or pip
- **Database**: SQLite 3.x
- **GUI**: Qt5 for visualization (Linux/macOS)

### Installation

```bash
# Clone the repository
git clone <repository-url>
cd orderflow_backtest

# Install dependencies
uv sync

# Install development dependencies (pytest-cov provides the --cov options)
uv add --dev pytest pytest-cov mypy

# Verify installation
uv run pytest
```

### Development Tools

```bash
# Run tests
uv run pytest

# Run tests with coverage
uv run pytest --cov=. --cov-report=html

# Run type checking
uv run mypy .

# Run specific test module
uv run pytest tests/test_storage_metrics.py -v
```

## Code Standards

### Function and File Size Limits

- **Functions**: Maximum 50 lines
- **Files**: Maximum 250 lines
- **Classes**: Single responsibility, clear purpose
- **Methods**: One primary responsibility per method

### Naming Conventions

```python
# Good examples (bodies elided)
def calculate_order_book_imbalance(snapshot: BookSnapshot) -> float: ...
def load_metrics_by_timerange(start: int, end: int) -> List[Metric]: ...
class MetricCalculator: ...
class SQLiteMetricsRepository: ...

# Avoid abbreviations except domain terms
# Good: OBI, CVD (standard financial terms)
# Avoid: calc_obi, proc_data, mgr
```

### Type Annotations

```python
# Required for all public interfaces
def process_trades(trades: List[Trade]) -> Dict[int, float]:
    """Process trades and return volume by timestamp."""

class Storage:
    def __init__(self, instrument: str) -> None:
        self.instrument = instrument
```

### Documentation Standards

```python
def calculate_metrics(snapshot: BookSnapshot, trades: List[Trade]) -> Metric:
    """
    Calculate OBI and CVD metrics for a snapshot.

    Args:
        snapshot: Orderbook state at specific timestamp
        trades: List of trades executed at this timestamp

    Returns:
        Metric: Calculated OBI, CVD, and best bid/ask values

    Raises:
        ValueError: If snapshot contains invalid data

    Example:
        >>> snapshot = BookSnapshot(...)
        >>> trades = [Trade(...), ...]
        >>> metric = calculate_metrics(snapshot, trades)
        >>> print(f"OBI: {metric.obi:.3f}")
        OBI: 0.333
    """
```

## Architecture Principles

### Separation of Concerns

- **Storage**: Data processing and persistence only
- **Strategy**: Trading analysis and signal generation only
- **Visualizer**: Chart rendering and display only
- **Main**: Application orchestration and flow control

### Repository Pattern

```python
# Good: Clean interface
class SQLiteMetricsRepository:
    def load_metrics_by_timerange(self, conn: Connection, start: int, end: int) -> List[Metric]:
        # Implementation details hidden
        ...

# Avoid: Direct SQL in business logic
def analyze_strategy(db_path: Path):
    # Don't do this
    conn = sqlite3.connect(db_path)
    cursor = conn.execute("SELECT * FROM metrics WHERE ...")
```

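As a minimal sketch of the pattern above, the following keeps all SQL inside the repository while the caller owns the connection. The `Metric` fields, the `metrics` table layout (`timestamp`, `obi`, `cvd`), and the `count_positive_obi` helper are assumptions made for illustration, not the project's actual schema or API.

```python
import sqlite3
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Metric:
    # Hypothetical shape; the real Metric may carry more fields.
    timestamp: int
    obi: float
    cvd: float


class SQLiteMetricsRepository:
    """Hides SQL behind a small interface; the caller owns the connection."""

    def load_metrics_by_timerange(
        self, conn: sqlite3.Connection, start: int, end: int
    ) -> List[Metric]:
        rows = conn.execute(
            "SELECT timestamp, obi, cvd FROM metrics "
            "WHERE timestamp >= ? AND timestamp <= ? ORDER BY timestamp",
            (start, end),
        ).fetchall()
        return [Metric(*row) for row in rows]


def count_positive_obi(
    repo: SQLiteMetricsRepository, conn: sqlite3.Connection, start: int, end: int
) -> int:
    """Business logic talks to the repository, never to SQL directly."""
    metrics = repo.load_metrics_by_timerange(conn, start, end)
    return sum(1 for metric in metrics if metric.obi > 0)


# Usage with an in-memory database (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (timestamp INTEGER PRIMARY KEY, obi REAL, cvd REAL)")
conn.executemany(
    "INSERT INTO metrics VALUES (?, ?, ?)",
    [(1, 0.5, 10.0), (2, -0.25, 7.0), (3, 0.125, 9.0)],
)
repo = SQLiteMetricsRepository()
positive = count_positive_obi(repo, conn, 1, 3)
loaded = repo.load_metrics_by_timerange(conn, 2, 3)
conn.close()
```

Because the strategy-side function only sees `Metric` objects, the storage backend could be swapped or mocked in tests without touching the analysis code.
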
### Error Handling

```python
# Required pattern
try:
    result = risky_operation()
    return process_result(result)
except SpecificException as e:
    logging.error(f"Operation failed: {e}")
    return default_value
except Exception as e:
    logging.error(f"Unexpected error in operation: {e}")
    raise
```

## Testing Requirements

### Test Coverage

- **Unit Tests**: All public methods must have unit tests
- **Integration Tests**: End-to-end workflow testing required
- **Edge Cases**: Handle empty data, boundary conditions, error scenarios

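To illustrate the edge-case requirement, here is a hedged sketch of tests for a hypothetical `order_book_imbalance` helper. The formula (bid volume minus ask volume, divided by their sum) is the common definition of OBI and is assumed here; the project's own implementation may differ, for instance in how it treats an empty book.

```python
def order_book_imbalance(bid_volume: float, ask_volume: float) -> float:
    """Hypothetical helper: (bid - ask) / (bid + ask), 0.0 for an empty book."""
    if bid_volume < 0 or ask_volume < 0:
        raise ValueError("volumes must be non-negative")
    total = bid_volume + ask_volume
    if total == 0:
        return 0.0  # Assumed convention for empty data
    return (bid_volume - ask_volume) / total


def test_normal_case() -> None:
    assert abs(order_book_imbalance(2.0, 1.0) - 1 / 3) < 1e-12


def test_empty_book_returns_zero() -> None:
    assert order_book_imbalance(0.0, 0.0) == 0.0


def test_boundaries_are_plus_minus_one() -> None:
    assert order_book_imbalance(5.0, 0.0) == 1.0
    assert order_book_imbalance(0.0, 5.0) == -1.0


def test_negative_volume_is_rejected() -> None:
    try:
        order_book_imbalance(-1.0, 1.0)
    except ValueError:
        return
    raise AssertionError("expected ValueError")
```

Under pytest, `pytest.raises(ValueError)` would be the idiomatic replacement for the manual try/except in the last test; the plain version is used here so the sketch runs without extra dependencies.
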
### Test Structure

```python
def test_feature_description():
    """Test that feature behaves correctly under normal conditions."""
    # Arrange
    test_data = create_test_data()

    # Act
    result = function_under_test(test_data)

    # Assert
    assert result.expected_property == expected_value
    assert len(result.collection) == expected_count
```

### Test Data Management

```python
import tempfile
from pathlib import Path

# Use temporary files for database tests
def test_database_operation():
    # delete=False keeps the file on disk after the handle is closed
    with tempfile.NamedTemporaryFile(suffix=".db", delete=False) as tmp_file:
        db_path = Path(tmp_file.name)

    try:
        # Test implementation
        pass
    finally:
        # Always remove the temporary database, even if the test fails
        db_path.unlink(missing_ok=True)
```

## Database Development

### Schema Changes

1. **Create Migration**: Document schema changes in ADR format
2. **Backward Compatibility**: Ensure existing databases continue to work
3. **Auto-Migration**: Implement automatic schema updates where possible
4. **Performance**: Add appropriate indexes for new queries

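One way the backward-compatibility and auto-migration steps could look in practice is sketched below. It is an illustration under assumptions: the `metrics` table layout, the new `best_bid` and `best_ask` columns, and the index name `idx_metrics_timestamp` are hypothetical, not the project's actual schema.

```python
import sqlite3


def ensure_metrics_schema(conn: sqlite3.Connection) -> None:
    """Bring an existing database up to date without touching stored rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS metrics ("
        "timestamp INTEGER NOT NULL, obi REAL NOT NULL, cvd REAL NOT NULL)"
    )
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute("PRAGMA table_info(metrics)")}
    for column in ("best_bid", "best_ask"):  # Hypothetical new columns
        if column not in existing:
            # Identifiers cannot be bound as parameters; these names come from
            # the fixed tuple above, never from user input.
            # Nullable columns keep rows written by older versions valid.
            conn.execute(f"ALTER TABLE metrics ADD COLUMN {column} REAL")
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_metrics_timestamp ON metrics(timestamp)"
    )
    conn.commit()


# Simulate a database created by an older version, then migrate it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metrics (timestamp INTEGER NOT NULL, obi REAL NOT NULL, cvd REAL NOT NULL)"
)
conn.execute("INSERT INTO metrics VALUES (1, 0.5, 10.0)")
ensure_metrics_schema(conn)
ensure_metrics_schema(conn)  # Safe to run repeatedly
columns = [row[1] for row in conn.execute("PRAGMA table_info(metrics)")]
old_row = conn.execute("SELECT * FROM metrics").fetchone()
conn.close()
```

The migration is idempotent, so it can run on every startup; existing rows simply read `NULL` for the columns they predate.
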
### Query Patterns

```python
# Good: Parameterized queries
cursor.execute(
    "SELECT obi, cvd FROM metrics WHERE timestamp >= ? AND timestamp <= ?",
    (start_timestamp, end_timestamp)
)

# Bad: String formatting (security risk)
query = f"SELECT * FROM metrics WHERE timestamp = {timestamp}"
```

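The difference between the two patterns above can be demonstrated with a small, self-contained experiment. The table layout and the sample input are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (timestamp INTEGER, obi REAL, cvd REAL)")
conn.executemany(
    "INSERT INTO metrics VALUES (?, ?, ?)", [(1, 0.1, 1.0), (2, 0.2, 2.0)]
)

malicious = "0 OR 1=1"  # Attacker-controlled "timestamp"

# String formatting: the input becomes part of the SQL and matches every row.
unsafe_rows = conn.execute(
    f"SELECT * FROM metrics WHERE timestamp = {malicious}"
).fetchall()

# Parameter binding: the input is compared as one value and matches nothing.
safe_rows = conn.execute(
    "SELECT * FROM metrics WHERE timestamp = ?", (malicious,)
).fetchall()
conn.close()
```

With formatting, the `OR 1=1` fragment is parsed as SQL and the filter is bypassed; with binding, the whole string is treated as a single value.
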
### Performance Guidelines

- **Batch Operations**: Process in batches of 1000 records
- **Indexes**: Add indexes for frequently queried columns
- **Transactions**: Use transactions for multi-record operations
- **Connection Management**: Caller manages connection lifecycle

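The guidelines above can be combined in a short sketch: rows are written in batches of 1000, each batch in its own transaction, and the caller creates and closes the connection. The row layout and the helper names `in_batches` and `insert_metrics` are hypothetical.

```python
import sqlite3
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

BATCH_SIZE = 1000
Row = Tuple[int, float, float]  # (timestamp, obi, cvd), assumed layout


def in_batches(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield lists of at most `size` rows without materializing the input."""
    iterator = iter(rows)
    while batch := list(islice(iterator, size)):
        yield batch


def insert_metrics(conn: sqlite3.Connection, rows: Iterable[Row]) -> int:
    """Insert rows in batches; each batch is one transaction."""
    written = 0
    for batch in in_batches(rows, BATCH_SIZE):
        with conn:  # Commits on success, rolls back if the batch raises
            conn.executemany("INSERT INTO metrics VALUES (?, ?, ?)", batch)
        written += len(batch)
    return written


# The caller creates and closes the connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (timestamp INTEGER PRIMARY KEY, obi REAL, cvd REAL)")
total = insert_metrics(conn, ((t, 0.0, float(t)) for t in range(2500)))
stored = conn.execute("SELECT COUNT(*) FROM metrics").fetchone()[0]
batch_sizes = [len(b) for b in in_batches(((t, 0.0, 0.0) for t in range(2500)), BATCH_SIZE)]
conn.close()
```

Feeding a generator into `insert_metrics` keeps at most one batch in memory at a time, which also supports the memory targets described later in this document.
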
## Performance Requirements

### Memory Management

- **Target**: >70% memory reduction vs. full snapshot retention
- **Measurement**: Profile memory usage with large datasets
- **Optimization**: Stream processing, batch operations, minimal object retention

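One possible way to measure the reduction is with the standard-library `tracemalloc` module, comparing peak allocations when every snapshot is retained against a streaming approach that keeps one value per snapshot. The synthetic snapshots and all function names below are assumptions for illustration; real measurements should use representative datasets.

```python
import tracemalloc
from typing import Callable, Dict, Iterator, List


def generate_snapshots(count: int, depth: int = 50) -> Iterator[Dict[float, float]]:
    """Synthetic order book snapshots: price -> size (illustrative data only)."""
    for i in range(count):
        yield {
            100.0 + i * 0.001 + level * 0.01: float(level + 1)
            for level in range(depth)
        }


def retain_all(count: int) -> List[Dict[float, float]]:
    """Baseline: keep every full snapshot in memory."""
    return list(generate_snapshots(count))


def stream_metrics(count: int) -> List[float]:
    """Streaming: keep one number per snapshot and drop the snapshot itself."""
    return [sum(snapshot.values()) for snapshot in generate_snapshots(count)]


def peak_bytes(func: Callable[[int], object], count: int) -> int:
    """Peak traced allocation while running func(count)."""
    tracemalloc.start()
    try:
        result = func(count)  # Held so the result counts toward the peak
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


baseline = peak_bytes(retain_all, 2000)
streamed = peak_bytes(stream_metrics, 2000)
reduction = 1 - streamed / baseline
print(f"Memory reduction: {reduction:.1%}")
```

The printed figure depends on the data shape and the Python version, so it should be read as a relative indicator rather than a fixed number.
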
### Processing Speed

- **Target**: >500 snapshots/second processing rate
- **Measurement**: Benchmark with realistic datasets
- **Optimization**: Database batching, efficient algorithms, minimal I/O

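A throughput check against the target could be sketched as follows. The `process_snapshot` workload is a placeholder and the `(bid_volume, ask_volume)` snapshot shape is a simplification; both are assumptions to be replaced with the real processing path and realistic data.

```python
import time
from typing import Callable, List, Tuple

Snapshot = Tuple[float, float]  # (bid_volume, ask_volume), simplified stand-in


def process_snapshot(snapshot: Snapshot) -> float:
    """Placeholder workload; swap in the real per-snapshot processing."""
    bid, ask = snapshot
    total = bid + ask
    return (bid - ask) / total if total else 0.0


def snapshots_per_second(
    process: Callable[[Snapshot], float],
    snapshots: List[Snapshot],
    repeats: int = 3,
) -> float:
    """Best-of-N throughput, which dampens noise from other processes."""
    best = float("inf")
    for _ in range(repeats):
        started = time.perf_counter()
        for snapshot in snapshots:
            process(snapshot)
        best = min(best, time.perf_counter() - started)
    return len(snapshots) / best


data = [(float(i % 7 + 1), float(i % 5 + 1)) for i in range(50_000)]
rate = snapshots_per_second(process_snapshot, data)
print(f"{rate:,.0f} snapshots/second (target: > 500)")
```

Taking the best of several runs is a common benchmarking convention; reporting the median instead is equally defensible when variance matters.
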
### Storage Efficiency

- **Target**: <25% storage overhead for metrics
- **Measurement**: Compare metrics table size to source data
- **Optimization**: Efficient data types, minimal redundancy

## Submission Process

### Before Submitting

1. **Run Tests**: Ensure all tests pass
   ```bash
   uv run pytest
   ```

2. **Check Type Hints**: Verify type annotations
   ```bash
   uv run mypy .
   ```

3. **Test Coverage**: Ensure adequate test coverage
   ```bash
   uv run pytest --cov=. --cov-report=term-missing
   ```

4. **Documentation**: Update relevant documentation files

### Pull Request Guidelines

- **Description**: Clear description of changes and motivation
- **Testing**: Include tests for new functionality
- **Documentation**: Update docs for API changes
- **Breaking Changes**: Document any breaking changes
- **Performance**: Include performance impact analysis for significant changes

### Code Review Checklist

- [ ] Follows function/file size limits
- [ ] Has comprehensive test coverage
- [ ] Includes proper error handling
- [ ] Uses type annotations consistently
- [ ] Maintains backward compatibility
- [ ] Updates relevant documentation
- [ ] No security vulnerabilities (SQL injection, etc.)
- [ ] Performance impact analyzed

## Documentation Maintenance

### When to Update Documentation

- **API Changes**: Any modification to public interfaces
- **Architecture Changes**: New patterns, data structures, or workflows
- **Performance Changes**: Significant performance improvements or regressions
- **Feature Additions**: New capabilities or metrics

### Documentation Types

- **Code Comments**: Complex algorithms and business logic
- **Docstrings**: All public functions and classes
- **Module Documentation**: Purpose and usage examples
- **Architecture Documentation**: System design and component relationships

## Getting Help

### Resources

- **Architecture Overview**: `docs/architecture.md`
- **API Documentation**: `docs/API.md`
- **Module Documentation**: `docs/modules/`
- **Decision Records**: `docs/decisions/`

### Communication

- **Issues**: Use GitHub issues for bug reports and feature requests
- **Discussions**: Use GitHub discussions for questions and design discussions
- **Code Review**: Comment on pull requests for specific code feedback

---

## Development Workflow

### Feature Development

1. **Create Branch**: Feature-specific branch from main
2. **Develop**: Follow coding standards and test requirements
3. **Test**: Comprehensive testing including edge cases
4. **Document**: Update relevant documentation
5. **Review**: Submit pull request for code review
6. **Merge**: Merge after approval and CI success

### Bug Fixes

1. **Reproduce**: Create test that reproduces the bug
2. **Fix**: Implement minimal fix addressing root cause
3. **Verify**: Ensure fix resolves issue without regressions
4. **Test**: Add regression test to prevent future occurrences

### Performance Improvements

1. **Benchmark**: Establish baseline performance metrics
2. **Optimize**: Implement performance improvements
3. **Measure**: Verify performance gains with benchmarks
4. **Document**: Update performance characteristics in docs

Thank you for contributing to the Orderflow Backtest System! Your contributions help make this a better tool for cryptocurrency trading analysis.