# Module: main

## Purpose

The `main` module provides the command-line interface (CLI) and orchestration for the orderflow backtest system. It handles database discovery and process management, and coordinates the streaming pipeline with the visualization frontend. Arguments are parsed with Typer.

## Public Interface

### Functions

- `main(instrument: str, start_date: str, end_date: str, window_seconds: int = 60) -> None`: Primary CLI entrypoint
- `discover_databases(instrument: str, start_date: str, end_date: str) -> list[Path]`: Find matching database files
- `launch_visualizer() -> subprocess.Popen | None`: Start Dash application in separate process

### CLI Arguments

- `instrument`: Trading pair identifier (e.g., "BTC-USDT")
- `start_date`: Start date in YYYY-MM-DD format (UTC)
- `end_date`: End date in YYYY-MM-DD format (UTC)
- `--window-seconds`: OHLC aggregation window size in seconds (default: 60)

## Usage Examples

### Command Line Usage

```bash
# Basic usage with default 60-second windows
uv run python main.py BTC-USDT 2025-01-01 2025-01-31

# Custom window size
uv run python main.py ETH-USDT 2025-02-01 2025-02-28 --window-seconds 30

# Single day processing
uv run python main.py SOL-USDT 2025-03-15 2025-03-15
```

### Programmatic Usage

```python
from main import main, discover_databases

# Run processing pipeline
main("BTC-USDT", "2025-01-01", "2025-01-31", window_seconds=120)

# Discover available databases
db_files = discover_databases("ETH-USDT", "2025-02-01", "2025-02-28")
print(f"Found {len(db_files)} database files")
```

## Dependencies

### Internal

- `db_interpreter.DBInterpreter`: Database streaming
- `ohlc_processor.OHLCProcessor`: Trade aggregation and orderbook processing
- `viz_io`: Data clearing functions

### External

- `typer`: CLI framework and argument parsing
- `subprocess`: Process management for visualization
- `pathlib`: File and directory operations
- `datetime`: Date parsing and validation
- `logging`: Operational logging
- `sys`: Exit code management

## Database Discovery Logic

### File Pattern Matching

```text
# Expected directory structure
../data/OKX/{instrument}/{date}/

# Example paths
../data/OKX/BTC-USDT/2025-01-01/trades.db
../data/OKX/ETH-USDT/2025-02-15/trades.db
```

### Discovery Algorithm

1. Parse start and end dates to datetime objects
2. Iterate through date range (inclusive)
3. Construct expected path for each date
4. Verify file existence and readability
5. Return sorted list of valid database paths

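The five steps above can be sketched as follows. This is a minimal illustration rather than the module's actual implementation: the `data_root` parameter is hypothetical, the `trades.db` filename is taken from the example paths, and using `os.access` for the readability check is an assumption.

```python
import os
from datetime import datetime, timedelta
from pathlib import Path

DB_FILENAME = "trades.db"  # assumption: filename taken from the example paths


def discover_databases(
    instrument: str,
    start_date: str,
    end_date: str,
    data_root: Path = Path("../data/OKX"),  # hypothetical parameter for illustration
) -> list[Path]:
    """Return existing, readable database files for each date in the range."""
    # Step 1: parse both dates (raises ValueError on a malformed date)
    start = datetime.strptime(start_date, "%Y-%m-%d").date()
    end = datetime.strptime(end_date, "%Y-%m-%d").date()

    found: list[Path] = []
    current = start
    # Step 2: iterate through the inclusive date range
    while current <= end:
        # Step 3: build the expected path for this date
        candidate = data_root / instrument / current.isoformat() / DB_FILENAME
        # Step 4: keep only files that exist and are readable
        if candidate.is_file() and os.access(candidate, os.R_OK):
            found.append(candidate)
        current += timedelta(days=1)

    # Step 5: ISO dates sort lexicographically, so this is chronological
    return sorted(found)
```

Because missing days are simply skipped, a range with gaps returns only the days that have data, and a reversed range returns an empty list.
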
## Process Orchestration

### Visualization Process Management

```python
import subprocess

# Launch the Dash app in a separate process.
# `project_root`, `db_files`, and `process_databases` are assumed to be
# defined elsewhere in the module.
viz_process = subprocess.Popen(
    ["uv", "run", "python", "app.py"],
    cwd=project_root,
)

try:
    # Main processing loop
    process_databases(db_files)
finally:
    # Always clean up the visualization process, even if processing fails
    if viz_process is not None:
        viz_process.terminate()
        try:
            viz_process.wait(timeout=5)
        except subprocess.TimeoutExpired:
            # Force-kill if the app ignores the termination request
            viz_process.kill()
            viz_process.wait()
```

### Data Processing Pipeline

1. **Initialize**: Clear existing data files
2. **Launch**: Start visualization process
3. **Stream**: Process each database sequentially
4. **Aggregate**: Generate OHLC bars and depth snapshots
5. **Cleanup**: Terminate visualization and finalize

## Error Handling

### Database Access Errors

- **File not found**: Log warning and skip missing databases
- **Permission denied**: Log error and exit with status code 1
- **Corruption**: Log error for specific database and continue with next

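A rough sketch of how these three policies might fit together is shown below. The helper name `filter_usable_databases` is hypothetical, and the code assumes the `.db` files are SQLite databases, which this document does not state. The probe query only catches files SQLite cannot open at all; `PRAGMA integrity_check` would be a more thorough (and slower) test.

```python
import logging
import os
import sqlite3
from pathlib import Path

logger = logging.getLogger("main")


def filter_usable_databases(paths: list[Path]) -> list[Path]:
    """Hypothetical helper: apply the three policies above to candidate files."""
    usable: list[Path] = []
    for path in paths:
        if not path.is_file():
            # File not found: warn and skip
            logger.warning("Database not found, skipping: %s", path)
            continue
        if not os.access(path, os.R_OK):
            # Permission denied: log and exit with status code 1
            logger.error("Permission denied: %s", path)
            raise SystemExit(1)
        try:
            # Assumption: the files are SQLite databases; open read-only
            conn = sqlite3.connect(f"{path.resolve().as_uri()}?mode=ro", uri=True)
            try:
                conn.execute("SELECT count(*) FROM sqlite_master").fetchone()
            finally:
                conn.close()
        except sqlite3.DatabaseError as exc:
            # Corruption: log the specific database and continue with the next
            logger.error("Unreadable database %s: %s", path, exc)
            continue
        usable.append(path)
    return usable
```
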
### Process Management Errors

- **Visualization startup failure**: Log error but continue processing
- **Process termination**: Graceful shutdown with timeout
- **Resource cleanup**: Ensure child processes are terminated

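The startup-failure and termination behavior described above could look roughly like this. It is a sketch under assumptions: the `command` and `cwd` parameters and the `stop_visualizer` helper are hypothetical additions for illustration, since the documented `launch_visualizer()` takes no arguments.

```python
import logging
import subprocess
from pathlib import Path
from typing import Optional, Sequence

logger = logging.getLogger("main")


def launch_visualizer(
    command: Sequence[str] = ("uv", "run", "python", "app.py"),
    cwd: Optional[Path] = None,  # hypothetical parameters for illustration
) -> Optional[subprocess.Popen]:
    """Start the visualization process; return None if it cannot be started."""
    try:
        return subprocess.Popen(list(command), cwd=cwd)
    except OSError as exc:
        # Startup failure: log the error and let processing continue without it
        logger.error("Could not start visualization: %s", exc)
        return None


def stop_visualizer(process: Optional[subprocess.Popen], timeout: float = 5.0) -> None:
    """Terminate the child gracefully, force-killing it after the timeout."""
    if process is None or process.poll() is not None:
        return  # never started, or already exited
    process.terminate()
    try:
        process.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        process.kill()
        process.wait()
```

Returning `None` instead of raising lets the caller run the pipeline without a frontend, and `stop_visualizer(None)` is then a harmless no-op in the cleanup path.
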
### Date Validation

- **Invalid format**: Clear error message with expected format
- **Invalid range**: End date must be >= start date
- **Future dates**: Warning for dates beyond data availability

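One way to express these three checks is sketched below. The helper name `parse_date_range` is hypothetical, and treating "beyond data availability" as "later than today in UTC" is an assumption.

```python
import logging
from datetime import date, datetime, timezone

logger = logging.getLogger("main")
DATE_FORMAT = "%Y-%m-%d"


def parse_date_range(start_date: str, end_date: str) -> tuple[date, date]:
    """Hypothetical helper: validate the two CLI date strings."""
    try:
        start = datetime.strptime(start_date, DATE_FORMAT).date()
        end = datetime.strptime(end_date, DATE_FORMAT).date()
    except ValueError as exc:
        # Invalid format: state the expected format in the message
        raise ValueError(f"Dates must use YYYY-MM-DD format: {exc}") from exc
    if end < start:
        # Invalid range: the end date must not precede the start date
        raise ValueError(f"End date {end} is before start date {start}")
    today = datetime.now(timezone.utc).date()
    if end > today:
        # Future dates: warn but do not fail
        logger.warning("End date %s is in the future; data may be unavailable", end)
    return start, end
```
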
## Performance Characteristics

- **Sequential processing**: Databases are processed one at a time
- **Memory efficient**: Streaming avoids loading entire datasets into memory
- **Process isolation**: Visualization runs in its own process
- **Resource cleanup**: The visualization process is terminated automatically on exit

## Testing

Run module tests:

```bash
uv run pytest test_main.py -v
```

Test coverage includes:

- Database discovery logic
- Date parsing and validation
- Process management
- Error handling scenarios
- CLI argument validation

## Configuration

### Default Settings

- **Data directory**: `../data/OKX` (relative to project root)
- **Visualization command**: `uv run python app.py`
- **Window size**: 60 seconds
- **Process timeout**: 5 seconds for termination

### Environment Variables

- **DATA_PATH**: Override default data directory
- **VISUALIZATION_PORT**: Override Dash port (requires app.py modification)

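As an illustration of the `DATA_PATH` override, the lookup might resemble the sketch below. The helper `resolve_data_dir` and its `environ` parameter are hypothetical; only the variable name and the default location come from this document. `VISUALIZATION_PORT` is not covered here because it depends on changes to `app.py`.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_data_dir(
    project_root: Path,
    environ: Optional[Mapping[str, str]] = None,  # injectable for testing
) -> Path:
    """Hypothetical helper: pick the data directory, honoring DATA_PATH if set."""
    env = os.environ if environ is None else environ
    override = env.get("DATA_PATH")
    if override:
        return Path(override)
    # Default from the settings above: ../data/OKX relative to the project root
    return project_root / ".." / "data" / "OKX"
```
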
## Known Issues

- Assumes specific directory structure under `../data/OKX`
- No validation of database schema compatibility
- Limited error recovery for process management
- No progress indication for large datasets

## Development Notes

- Uses Typer for modern CLI interface
- Subprocess management compatible with Unix/Windows
- Logging configured for both development and production use
- Exit codes follow Unix conventions (0=success, 1=error)