Module: main

Purpose

The main module provides the command-line interface (CLI) orchestration for the orderflow backtest system. It discovers database files, manages the visualization subprocess, and coordinates the streaming pipeline with the visualization frontend. Argument parsing is handled by Typer.

Public Interface

Functions

  • main(instrument: str, start_date: str, end_date: str, window_seconds: int = 60) -> None: Primary CLI entrypoint
  • discover_databases(instrument: str, start_date: str, end_date: str) -> list[Path]: Find matching database files
  • launch_visualizer() -> subprocess.Popen | None: Start Dash application in separate process

CLI Arguments

  • instrument: Trading pair identifier (e.g., "BTC-USDT")
  • start_date: Start date in YYYY-MM-DD format (UTC)
  • end_date: End date in YYYY-MM-DD format (UTC)
  • --window-seconds: OHLC aggregation window size (default: 60)

Usage Examples

Command Line Usage

# Basic usage with default 60-second windows
uv run python main.py BTC-USDT 2025-01-01 2025-01-31

# Custom window size
uv run python main.py ETH-USDT 2025-02-01 2025-02-28 --window-seconds 30

# Single day processing
uv run python main.py SOL-USDT 2025-03-15 2025-03-15

Programmatic Usage

from main import main, discover_databases

# Run processing pipeline
main("BTC-USDT", "2025-01-01", "2025-01-31", window_seconds=120)

# Discover available databases
db_files = discover_databases("ETH-USDT", "2025-02-01", "2025-02-28")
print(f"Found {len(db_files)} database files")

Dependencies

Internal

  • db_interpreter.DBInterpreter: Database streaming
  • ohlc_processor.OHLCProcessor: Trade aggregation and orderbook processing
  • viz_io: Data clearing functions

External

  • typer: CLI framework and argument parsing
  • subprocess: Process management for visualization
  • pathlib: File and directory operations
  • datetime: Date parsing and validation
  • logging: Operational logging
  • sys: Exit code management

Database Discovery Logic

File Pattern Matching

# Expected directory structure
../data/OKX/{instrument}/{date}/

# Example paths
../data/OKX/BTC-USDT/2025-01-01/trades.db
../data/OKX/ETH-USDT/2025-02-15/trades.db

Discovery Algorithm

  1. Parse start and end dates to datetime objects
  2. Iterate through date range (inclusive)
  3. Construct expected path for each date
  4. Verify file existence and readability
  5. Return sorted list of valid database paths
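
The five steps above can be sketched as follows. This is a minimal illustration, not the module's actual code: the name discover_databases_sketch and the data_root parameter are hypothetical, and the trades.db file name is taken from the example paths above.

```python
import os
from datetime import date, timedelta
from pathlib import Path


def discover_databases_sketch(
    instrument: str,
    start_date: str,
    end_date: str,
    data_root: Path = Path("../data/OKX"),  # hypothetical parameter for illustration
) -> list[Path]:
    """Return readable trades.db files for every day in [start_date, end_date]."""
    # Step 1: parse both dates
    start = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)

    found: list[Path] = []
    day = start
    # Step 2: walk the range, including the end date
    while day <= end:
        # Step 3: build the expected path for this day
        candidate = data_root / instrument / day.isoformat() / "trades.db"
        # Step 4: keep it only if it exists as a file and is readable
        if candidate.is_file() and os.access(candidate, os.R_OK):
            found.append(candidate)
        day += timedelta(days=1)

    # Step 5: return the paths in sorted order
    return sorted(found)
```

Days without a database are simply absent from the result, so the caller can compare the number of returned paths with the number of requested days to detect gaps.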

Process Orchestration

Visualization Process Management

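import subprocess

# Launch Dash app in separate process
# (project_root, db_files and process_databases are assumed to be defined elsewhere)
viz_process = subprocess.Popen(
    ["uv", "run", "python", "app.py"],
    cwd=project_root,
)

# Process management
try:
    # Main processing loop
    process_databases(db_files)
finally:
    # Cleanup visualization process
    if viz_process is not None:
        viz_process.terminate()
        try:
            viz_process.wait(timeout=5)
        except subprocess.TimeoutExpired:
            # Child did not exit within 5 seconds; force it to stop
            viz_process.kill()
            viz_process.wait()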

Data Processing Pipeline

  1. Initialize: Clear existing data files
  2. Launch: Start visualization process
  3. Stream: Process each database sequentially
  4. Aggregate: Generate OHLC bars and depth snapshots
  5. Cleanup: Terminate visualization and finalize

Error Handling

Database Access Errors

  • File not found: Log warning and skip missing databases
  • Permission denied: Log error and exit with status code 1
  • Corruption: Log error for specific database and continue with next
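
A sketch of how these three policies might map onto exception handling. It assumes the databases are SQLite files, which this document does not state, so sqlite3.DatabaseError stands in for "corruption". The names process_all and process_one are hypothetical.

```python
import logging
import sqlite3
import sys
from pathlib import Path
from typing import Callable

logger = logging.getLogger("main")


def process_all(db_files: list[Path], process_one: Callable[[Path], None]) -> int:
    """Run process_one on each database; return how many completed successfully."""
    processed = 0
    for db_path in db_files:
        try:
            process_one(db_path)
        except FileNotFoundError:
            # Missing file: warn and move on to the next database
            logger.warning("Database not found, skipping: %s", db_path)
            continue
        except PermissionError:
            # Permission problem: treated as fatal, exit with status 1
            logger.error("Permission denied: %s", db_path)
            sys.exit(1)
        except sqlite3.DatabaseError as exc:
            # Assumption: corruption surfaces as sqlite3.DatabaseError
            logger.error("Unreadable database %s (%s), continuing", db_path, exc)
            continue
        processed += 1
    return processed
```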

Process Management Errors

  • Visualization startup failure: Log error but continue processing
  • Process termination: Graceful shutdown via terminate(), waiting up to 5 seconds for the child to exit
  • Resource cleanup: Ensure child processes are terminated
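
The "log error but continue" policy for startup failures fits the Popen | None return type of launch_visualizer. A minimal sketch, assuming a failed launch surfaces as OSError (for example, when the executable is not found); the function name and its parameters are hypothetical:

```python
import logging
import subprocess
import sys
from typing import Optional, Sequence

logger = logging.getLogger("main")


def launch_visualizer_sketch(
    command: Sequence[str],
    cwd: Optional[str] = None,
) -> Optional[subprocess.Popen]:
    """Start the visualizer; return None instead of raising if it cannot start."""
    try:
        return subprocess.Popen(list(command), cwd=cwd)
    except OSError as exc:
        # Covers FileNotFoundError and PermissionError raised while spawning
        logger.error("Could not start visualizer %s: %s", list(command), exc)
        return None
```

A caller would then guard cleanup with a None check, so that processing proceeds whether or not the frontend came up.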

Date Validation

  • Invalid format: Clear error message with expected format
  • Invalid range: End date must be >= start date
  • Future dates: Warning for dates beyond data availability
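
These three checks could look roughly like the sketch below. The function name parse_date_range and the today parameter are hypothetical, and "beyond data availability" is approximated here as "after the current UTC date", which is an assumption.

```python
import logging
from datetime import date, datetime, timezone
from typing import Optional

logger = logging.getLogger("main")


def parse_date_range(
    start_date: str,
    end_date: str,
    today: Optional[date] = None,  # injectable for testing; defaults to the current UTC date
) -> tuple[date, date]:
    """Parse and validate a start/end pair given as YYYY-MM-DD strings."""
    try:
        start = datetime.strptime(start_date, "%Y-%m-%d").date()
        end = datetime.strptime(end_date, "%Y-%m-%d").date()
    except ValueError as exc:
        # Invalid format: report the expected format alongside the parser's message
        raise ValueError(f"Dates must use YYYY-MM-DD format: {exc}") from exc

    # Invalid range: end date must be >= start date
    if end < start:
        raise ValueError(f"end_date {end} is earlier than start_date {start}")

    # Future dates: warn only, since data may simply not exist yet
    if today is None:
        today = datetime.now(timezone.utc).date()
    if end > today:
        logger.warning("Requested range extends past %s; data may be unavailable", today)

    return start, end
```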

Performance Characteristics

  • Sequential processing: Databases processed one at a time
  • Memory efficient: Streaming avoids loading entire datasets into memory
  • Process isolation: Visualization runs independently
  • Resource cleanup: Automatic process termination on exit

Testing

Run module tests:

uv run pytest test_main.py -v

Test coverage includes:

  • Database discovery logic
  • Date parsing and validation
  • Process management
  • Error handling scenarios
  • CLI argument validation

Configuration

Default Settings

  • Data directory: ../data/OKX (relative to project root)
  • Visualization command: uv run python app.py
  • Window size: 60 seconds
  • Process timeout: 5 seconds for termination

Environment Variables

  • DATA_PATH: Override default data directory
  • VISUALIZATION_PORT: Override Dash port (requires app.py modification)
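
One way the DATA_PATH override could be resolved against the default directory is sketched below. The helper name resolve_data_dir is hypothetical, and treating an empty value as "unset" is an assumption.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

DEFAULT_DATA_DIR = Path("../data/OKX")


def resolve_data_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return DATA_PATH from the environment if set, else the default directory."""
    if env is None:
        env = os.environ
    override = env.get("DATA_PATH")
    # Assumption: an empty DATA_PATH is treated the same as an unset one
    return Path(override) if override else DEFAULT_DATA_DIR
```

Passing the environment as a mapping keeps the helper easy to test without mutating os.environ.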

Known Issues

  • Assumes specific directory structure under ../data/OKX
  • No validation of database schema compatibility
  • Limited error recovery for process management
  • No progress indication for large datasets

Development Notes

  • Uses Typer for the command-line interface
  • Subprocess management compatible with Unix/Windows
  • Logging configured for both development and production use
  • Exit codes follow Unix conventions (0=success, 1=error)