# Contributing to Orderflow Backtest System

## Development Guidelines
Thank you for your interest in contributing to the Orderflow Backtest System. This document outlines the development process, coding standards, and best practices for maintaining code quality.
## Development Environment Setup
### Prerequisites
- Python: 3.12 or higher
- Package Manager: UV (recommended) or pip
- Database: SQLite 3.x
- GUI: Qt5 for visualization (Linux/macOS)
### Installation

```bash
# Clone the repository
git clone <repository-url>
cd orderflow_backtest

# Install dependencies
uv sync

# Install development dependencies
uv add --dev pytest coverage mypy

# Verify installation
uv run pytest
```
### Development Tools

```bash
# Run tests
uv run pytest

# Run tests with coverage
uv run pytest --cov=. --cov-report=html

# Run type checking
uv run mypy .

# Run specific test module
uv run pytest tests/test_storage_metrics.py -v
```
## Code Standards
### Function and File Size Limits
- Functions: Maximum 50 lines
- Files: Maximum 250 lines
- Classes: Single responsibility, clear purpose
- Methods: One main function per method
### Naming Conventions

```python
# Good examples
def calculate_order_book_imbalance(snapshot: BookSnapshot) -> float: ...
def load_metrics_by_timerange(start: int, end: int) -> List[Metric]: ...

class MetricCalculator: ...
class SQLiteMetricsRepository: ...

# Avoid abbreviations except domain terms
# Good: OBI, CVD (standard financial terms)
# Avoid: calc_obi, proc_data, mgr
```
### Type Annotations

```python
# Required for all public interfaces
def process_trades(trades: List[Trade]) -> Dict[int, float]:
    """Process trades and return volume by timestamp."""
    ...

class Storage:
    def __init__(self, instrument: str) -> None:
        self.instrument = instrument
```
### Documentation Standards

```python
def calculate_metrics(snapshot: BookSnapshot, trades: List[Trade]) -> Metric:
    """
    Calculate OBI and CVD metrics for a snapshot.

    Args:
        snapshot: Orderbook state at a specific timestamp
        trades: List of trades executed at this timestamp

    Returns:
        Metric: Calculated OBI, CVD, and best bid/ask values

    Raises:
        ValueError: If snapshot contains invalid data

    Example:
        >>> snapshot = BookSnapshot(...)
        >>> trades = [Trade(...), ...]
        >>> metric = calculate_metrics(snapshot, trades)
        >>> print(f"OBI: {metric.obi:.3f}")
        OBI: 0.333
    """
```
## Architecture Principles
### Separation of Concerns
- Storage: Data processing and persistence only
- Strategy: Trading analysis and signal generation only
- Visualizer: Chart rendering and display only
- Main: Application orchestration and flow control
### Repository Pattern

```python
# Good: Clean interface
class SQLiteMetricsRepository:
    def load_metrics_by_timerange(
        self, conn: Connection, start: int, end: int
    ) -> List[Metric]:
        ...  # Implementation details hidden behind the repository

# Avoid: Direct SQL in business logic
def analyze_strategy(db_path: Path):
    conn = sqlite3.connect(db_path)  # Don't do this
    cursor = conn.execute("SELECT * FROM metrics WHERE ...")
```
### Error Handling

```python
# Required pattern
try:
    result = risky_operation()
    return process_result(result)
except SpecificException as e:
    logging.error(f"Operation failed: {e}")
    return default_value
except Exception as e:
    logging.error(f"Unexpected error in operation: {e}")
    raise
```
## Testing Requirements
### Test Coverage
- Unit Tests: All public methods must have unit tests
- Integration Tests: End-to-end workflow testing required
- Edge Cases: Handle empty data, boundary conditions, error scenarios
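To illustrate the edge-case requirement, here is a minimal pytest-style sketch. `volume_by_timestamp` is a hypothetical helper invented for this example, not part of the project's API:

```python
def volume_by_timestamp(trades: list[tuple[int, float]]) -> dict[int, float]:
    """Aggregate traded volume per timestamp (hypothetical helper for illustration)."""
    totals: dict[int, float] = {}
    for ts, size in trades:
        totals[ts] = totals.get(ts, 0.0) + size
    return totals

def test_empty_trades_returns_empty_dict():
    """Edge case: empty input must not raise and must return an empty mapping."""
    assert volume_by_timestamp([]) == {}

def test_trades_at_same_timestamp_are_summed():
    """Boundary condition: two trades at one timestamp accumulate."""
    assert volume_by_timestamp([(1000, 0.5), (1000, 0.5)]) == {1000: 1.0}
```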
### Test Structure

```python
def test_feature_description():
    """Test that the feature behaves correctly under normal conditions."""
    # Arrange
    test_data = create_test_data()

    # Act
    result = function_under_test(test_data)

    # Assert
    assert result.expected_property == expected_value
    assert len(result.collection) == expected_count
```
### Test Data Management

```python
import tempfile
from pathlib import Path

# Use temporary files for database tests
def test_database_operation():
    with tempfile.NamedTemporaryFile(suffix=".db", delete=False) as tmp_file:
        db_path = Path(tmp_file.name)
        try:
            # Test implementation
            pass
        finally:
            db_path.unlink(missing_ok=True)
```
## Database Development
### Schema Changes
- Create Migration: Document schema changes in ADR format
- Backward Compatibility: Ensure existing databases continue to work
- Auto-Migration: Implement automatic schema updates where possible
- Performance: Add appropriate indexes for new queries
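One common way to sketch the auto-migration guideline is SQLite's `PRAGMA user_version`. The table and column names below are illustrative assumptions, not the project's actual schema:

```python
import sqlite3

SCHEMA_VERSION = 2  # bump whenever a migration step is added

def migrate(conn: sqlite3.Connection) -> None:
    """Apply pending schema migrations based on the stored user_version."""
    version = conn.execute("PRAGMA user_version").fetchone()[0]
    if version < 1:
        # Initial schema (hypothetical)
        conn.execute(
            "CREATE TABLE IF NOT EXISTS metrics "
            "(timestamp INTEGER, obi REAL, cvd REAL)"
        )
    if version < 2:
        # Additive change: existing databases keep working
        conn.execute("ALTER TABLE metrics ADD COLUMN best_bid REAL")
        conn.execute(
            "CREATE INDEX IF NOT EXISTS idx_metrics_ts ON metrics(timestamp)"
        )
    conn.execute(f"PRAGMA user_version = {SCHEMA_VERSION}")
    conn.commit()
```

Running `migrate` on an already-current database is a no-op, so it can be called unconditionally at startup.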
### Query Patterns

```python
# Good: Parameterized queries
cursor.execute(
    "SELECT obi, cvd FROM metrics WHERE timestamp >= ? AND timestamp <= ?",
    (start_timestamp, end_timestamp),
)

# Bad: String formatting (security risk)
query = f"SELECT * FROM metrics WHERE timestamp = {timestamp}"
```
### Performance Guidelines
- Batch Operations: Process in batches of 1000 records
- Indexes: Add indexes for frequently queried columns
- Transactions: Use transactions for multi-record operations
- Connection Management: Caller manages connection lifecycle
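The batching and transaction guidelines can be sketched together. `insert_metrics_batched` and the column list are hypothetical; the caller owns the connection, per the guideline above:

```python
import sqlite3
from itertools import islice
from typing import Iterable

def insert_metrics_batched(
    conn: sqlite3.Connection,
    rows: Iterable[tuple[int, float, float]],
    batch_size: int = 1000,
) -> None:
    """Insert rows in batches of batch_size, one transaction per batch."""
    it = iter(rows)
    while batch := list(islice(it, batch_size)):
        with conn:  # commits on success, rolls back on error
            conn.executemany(
                "INSERT INTO metrics (timestamp, obi, cvd) VALUES (?, ?, ?)",
                batch,
            )
```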
## Performance Requirements
### Memory Management
- Target: >70% memory reduction vs. full snapshot retention
- Measurement: Profile memory usage with large datasets
- Optimization: Stream processing, batch operations, minimal object retention
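Memory profiling can be sketched with the standard library's `tracemalloc`; `peak_memory_of` is an illustrative helper, not project code:

```python
import tracemalloc

def peak_memory_of(fn, *args):
    """Run fn(*args) and return (result, peak allocated bytes during the call)."""
    tracemalloc.start()
    try:
        result = fn(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak

# Streaming (generator) vs. full retention (list) of the same data:
_, peak_stream = peak_memory_of(lambda: sum(range(100_000)))
_, peak_retain = peak_memory_of(lambda: list(range(100_000)))
```

Comparing the two peaks with a realistic dataset is one way to check the >70% reduction target.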
### Processing Speed
- Target: >500 snapshots/second processing rate
- Measurement: Benchmark with realistic datasets
- Optimization: Database batching, efficient algorithms, minimal I/O
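A minimal throughput benchmark along these lines, assuming a generic `process` callable (illustrative, not project code):

```python
import time
from typing import Callable, Sequence

def throughput(process: Callable, items: Sequence) -> float:
    """Return processed items per second for one pass over items."""
    start = time.perf_counter()
    for item in items:
        process(item)
    elapsed = time.perf_counter() - start
    return len(items) / elapsed if elapsed > 0 else float("inf")
```

Run it against a realistic snapshot set and compare the result to the 500 snapshots/second target.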
### Storage Efficiency
- Target: <25% storage overhead for metrics
- Measurement: Compare metrics table size to source data
- Optimization: Efficient data types, minimal redundancy
## Submission Process
### Before Submitting

1. Run Tests: Ensure all tests pass

   ```bash
   uv run pytest
   ```

2. Check Type Hints: Verify type annotations

   ```bash
   uv run mypy .
   ```

3. Test Coverage: Ensure adequate test coverage

   ```bash
   uv run pytest --cov=. --cov-report=term-missing
   ```

4. Documentation: Update relevant documentation files
### Pull Request Guidelines
- Description: Clear description of changes and motivation
- Testing: Include tests for new functionality
- Documentation: Update docs for API changes
- Breaking Changes: Document any breaking changes
- Performance: Include performance impact analysis for significant changes
### Code Review Checklist
- Follows function/file size limits
- Has comprehensive test coverage
- Includes proper error handling
- Uses type annotations consistently
- Maintains backward compatibility
- Updates relevant documentation
- No security vulnerabilities (SQL injection, etc.)
- Performance impact analyzed
## Documentation Maintenance
### When to Update Documentation
- API Changes: Any modification to public interfaces
- Architecture Changes: New patterns, data structures, or workflows
- Performance Changes: Significant performance improvements or regressions
- Feature Additions: New capabilities or metrics
### Documentation Types
- Code Comments: Complex algorithms and business logic
- Docstrings: All public functions and classes
- Module Documentation: Purpose and usage examples
- Architecture Documentation: System design and component relationships
## Getting Help
### Resources

- Architecture Overview: `docs/architecture.md`
- API Documentation: `docs/API.md`
- Module Documentation: `docs/modules/`
- Decision Records: `docs/decisions/`
### Communication
- Issues: Use GitHub issues for bug reports and feature requests
- Discussions: Use GitHub discussions for questions and design discussions
- Code Review: Comment on pull requests for specific code feedback
## Development Workflow
### Feature Development

1. Create Branch: Create a feature-specific branch from main
2. Develop: Follow coding standards and test requirements
3. Test: Perform comprehensive testing, including edge cases
4. Document: Update relevant documentation
5. Review: Submit a pull request for code review
6. Merge: Merge after approval and CI success
### Bug Fixes

1. Reproduce: Create a test that reproduces the bug
2. Fix: Implement a minimal fix addressing the root cause
3. Verify: Ensure the fix resolves the issue without regressions
4. Test: Add a regression test to prevent future occurrences
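As a sketch of the reproduce-then-fix pattern, assuming a hypothetical out-of-range bug (all names invented for illustration):

```python
def clamp_ratio(value: float) -> float:
    """Clamp an imbalance ratio to [-1.0, 1.0] (the hypothetical fix)."""
    return max(-1.0, min(1.0, value))

def test_regression_ratio_never_exceeds_bounds():
    """Regression test: values outside [-1, 1] previously leaked through."""
    assert clamp_ratio(1.7) == 1.0
    assert clamp_ratio(-2.3) == -1.0
    assert clamp_ratio(0.25) == 0.25
```

Write the test first so it fails against the buggy code, then keep it in the suite after the fix lands.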
### Performance Improvements

1. Benchmark: Establish baseline performance metrics
2. Optimize: Implement the performance improvements
3. Measure: Verify performance gains with benchmarks
4. Document: Update performance characteristics in the docs
Thank you for contributing to the Orderflow Backtest System! Your contributions help make this a better tool for cryptocurrency trading analysis.