WIP UI rework with qt6

docs/modules/app.md (new file, 165 lines)

# Module: app

## Purpose

The `app` module provides a real-time Dash web application for visualizing OHLC candlestick charts, volume data, Order Book Imbalance (OBI) metrics, and orderbook depth. It implements a polling-based architecture that reads JSON data files and renders interactive charts with a dark theme.

## Public Interface

### Functions

- `build_empty_ohlc_fig() -> go.Figure`: Create an empty OHLC chart with proper styling
- `build_empty_depth_fig() -> go.Figure`: Create an empty depth chart with proper styling
- `build_ohlc_fig(data: List[list], metrics: List[list]) -> go.Figure`: Build the complete OHLC+Volume+OBI chart
- `build_depth_fig(depth_data: dict) -> go.Figure`: Build the orderbook depth visualization

### Global Variables

- `_LAST_DATA`: Cached OHLC data for error recovery
- `_LAST_DEPTH`: Cached depth data for error recovery
- `_LAST_METRICS`: Cached metrics data for error recovery

### Dash Application

- `app`: Main Dash application instance with Bootstrap theme
- Layout with responsive grid (9:3 ratio for OHLC:Depth charts)
- 500ms polling interval for real-time updates

## Usage Examples

### Running the Application

```bash
# Start the Dash server
uv run python app.py

# Access the web interface
# Open http://localhost:8050 in your browser
```

### Programmatic Usage

```python
from app import build_ohlc_fig, build_depth_fig

# Build charts with sample data
ohlc_data = [[1640995200000, 50000, 50100, 49900, 50050, 125.5]]
metrics_data = [[1640995200000, 0.15, 0.22, 0.08, 0.18]]
depth_data = {
    "bids": [[49990, 1.5], [49985, 2.1]],
    "asks": [[50010, 1.2], [50015, 1.8]]
}

ohlc_fig = build_ohlc_fig(ohlc_data, metrics_data)
depth_fig = build_depth_fig(depth_data)
```

## Dependencies

### Internal

- `viz_io`: Data file paths and JSON reading
- `viz_io.DATA_FILE`: OHLC data source
- `viz_io.DEPTH_FILE`: Depth data source
- `viz_io.METRICS_FILE`: Metrics data source

### External

- `dash`: Web application framework
- `dash.html`, `dash.dcc`: HTML and core components
- `dash_bootstrap_components`: Bootstrap styling
- `plotly.graph_objs`: Chart objects
- `plotly.subplots`: Multiple subplot support
- `pandas`: Data manipulation (minimal usage)
- `json`: JSON file parsing
- `logging`: Error and debug logging
- `pathlib`: File path handling

## Chart Architecture

### OHLC Chart (Left Panel, 9/12 width)

- **Main subplot**: Candlestick chart with OHLC data
- **Volume subplot**: Bar chart sharing its x-axis with the main chart
- **OBI subplot**: Order Book Imbalance candlestick chart in blue tones
- **Shared x-axis**: Synchronized zooming and panning across subplots

### Depth Chart (Right Panel, 3/12 width)

- **Cumulative depth**: Stepped line chart showing bid/ask liquidity
- **Color coding**: Green for bids, red for asks
- **Real-time updates**: Reflects current orderbook state
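
The cumulative depth series behind the stepped line chart can be sketched in plain Python. This is an illustrative helper, not the module's actual implementation (which builds the figure with Plotly): bids accumulate from the best bid downward, asks from the best ask upward.

```python
# Illustrative sketch: turn raw [price, size] levels into the cumulative
# series a stepped depth chart plots.
def cumulative_depth(levels, side):
    # Best bid is the highest price, best ask the lowest.
    ordered = sorted(levels, key=lambda pl: pl[0], reverse=(side == "bids"))
    total, out = 0.0, []
    for price, size in ordered:
        total += size
        out.append((price, total))
    return out

bids = cumulative_depth([[49990, 1.5], [49985, 2.1]], "bids")
asks = cumulative_depth([[50010, 1.2], [50015, 3.0]], "asks")
# Each bid/ask point carries the liquidity available up to that price.
```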

## Styling and Theme

### Dark Theme Configuration

- Background: Black (`#000000`)
- Text: White (`#ffffff`)
- Grid: Dark gray with transparency
- Candlesticks: Green (up) / Red (down)
- Volume: Gray bars
- OBI: Blue tones for candlesticks
- Depth: Green (bids) / Red (asks)
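
The theme above can be captured as a reusable layout mapping in the keyword style of Plotly's `Figure.update_layout`. The specific color values below are illustrative assumptions, not the module's exact constants:

```python
# Hedged sketch of the dark-theme settings; the real module's exact
# values may differ. Keys follow Plotly's update_layout conventions.
DARK_LAYOUT = {
    "paper_bgcolor": "#000000",    # black page background
    "plot_bgcolor": "#000000",     # black plotting area
    "font": {"color": "#ffffff"},  # white text
    "xaxis": {"gridcolor": "rgba(128, 128, 128, 0.3)"},  # translucent gray grid
    "yaxis": {"gridcolor": "rgba(128, 128, 128, 0.3)"},
}

# Per-trace colors (assumed shades of the green/red scheme described above)
UP_COLOR, DOWN_COLOR = "#00cc66", "#ff4444"   # candlesticks
BID_COLOR, ASK_COLOR = "#00cc66", "#ff4444"   # depth chart
```

Keeping these in one mapping lets every `build_*_fig` helper apply the same theme with a single `fig.update_layout(**DARK_LAYOUT)` call.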

### Responsive Design

- Bootstrap grid system for layout
- Fluid container for full-width usage
- 100vh height for full viewport coverage
- Configurable chart display modes

## Data Polling and Error Handling

### Polling Strategy

- **Interval**: 500ms for near real-time updates
- **Graceful degradation**: Uses cached data on JSON read errors
- **Atomic reads**: Tolerates partial writes during file updates
- **Logging**: Warnings for data inconsistencies

### Error Recovery

```python
# Pseudocode for the error handling pattern
try:
    with open(data_file) as f:
        new_data = json.load(f)
    _LAST_DATA = new_data  # Cache successful read
except (FileNotFoundError, json.JSONDecodeError):
    logging.warning("Using cached data due to read error")
    new_data = _LAST_DATA  # Use cached data
```
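
The pattern generalizes to a small self-contained helper. Note this is a sketch: the real module caches per-stream in module globals (`_LAST_DATA`, `_LAST_DEPTH`, `_LAST_METRICS`) rather than in a dict keyed by path.

```python
import json
import logging

_CACHE = {}  # path -> last successfully parsed payload

def read_json_cached(path, default=None):
    """Read a JSON file, falling back to the last good read on error.

    Tolerates missing files and partial writes (invalid JSON) by
    returning the cached payload, or `default` if nothing was cached yet.
    """
    try:
        with open(path) as f:
            payload = json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        logging.warning("read_json_cached: using cached data for %s", path)
        return _CACHE.get(path, default)
    _CACHE[path] = payload
    return payload
```

A writer that pairs with this reader should write to a temporary file and `os.replace` it into place, so readers never observe a half-written file.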

## Performance Characteristics

- **Client-side rendering**: Plotly.js handles chart rendering
- **Efficient updates**: Only redraws when data changes
- **Memory bounded**: Limited by the maximum number of bars kept in the data files (1000)
- **Network efficient**: Local file polling (no external API calls)

## Testing

Run application tests:

```bash
uv run pytest test_app.py -v
```

Test coverage includes:

- Chart building functions
- Data loading and caching
- Error handling scenarios
- Layout rendering
- Callback functionality

## Configuration Options

### Server Configuration

- **Host**: `0.0.0.0` (accessible from the network)
- **Port**: `8050` (default Dash port)
- **Debug mode**: Disabled in production

### Chart Configuration

- **Update interval**: 500ms (configurable via `dcc.Interval`)
- **Display mode bar**: Enabled for user interaction
- **Logo display**: Disabled for a clean interface

## Known Issues

- High CPU usage during rapid data updates
- Memory usage grows with chart history
- No authentication or access control
- Limited mobile responsiveness for complex charts

## Development Notes

- Uses the Flask development server (not suitable for production)
- Callback exceptions are suppressed for partial data scenarios
- Bootstrap CSS is loaded from a CDN
- Chart configurations are optimized for financial data visualization

docs/modules/db_interpreter.md (new file, 83 lines)

# Module: db_interpreter

## Purpose

The `db_interpreter` module provides efficient streaming access to SQLite databases containing orderbook and trade data. It handles batch reading, temporal windowing, and data structure normalization for downstream processing.

## Public Interface

### Classes

- `OrderbookLevel(price: float, size: float)`: Dataclass representing a single price level in the orderbook
- `OrderbookUpdate`: Container for windowed orderbook data with bids, asks, timestamp, and end_timestamp

### Functions

- `DBInterpreter(db_path: Path)`: Constructor that initializes a read-only SQLite connection with optimized PRAGMA settings

### Methods

- `stream() -> Iterator[tuple[OrderbookUpdate, list[tuple]]]`: Primary streaming interface that yields orderbook updates with associated trades in temporal windows

## Usage Examples

```python
from pathlib import Path
from db_interpreter import DBInterpreter

# Initialize the interpreter
db_path = Path("data/BTC-USDT-2025-01-01.db")
interpreter = DBInterpreter(db_path)

# Stream orderbook and trade data
for ob_update, trades in interpreter.stream():
    # Process the orderbook update
    print(f"Book update: {len(ob_update.bids)} bids, {len(ob_update.asks)} asks")
    print(f"Time window: {ob_update.timestamp} - {ob_update.end_timestamp}")

    # Process trades in this window
    for trade in trades:
        trade_id, price, size, side, timestamp_ms = trade[1:6]
        print(f"Trade: {side} {size} @ {price}")
```

## Dependencies

### Internal

- None (standalone module)

### External

- `sqlite3`: Database connectivity
- `pathlib`: Path handling
- `dataclasses`: Data structure definitions
- `typing`: Type annotations
- `logging`: Debug and error logging

## Performance Characteristics

- **Batch sizes**: BOOK_BATCH=2048, TRADE_BATCH=4096 for optimal memory usage
- **SQLite optimizations**: Read-only, immutable mode, large mmap and cache sizes
- **Memory efficient**: Streaming iterator pattern avoids loading the entire dataset
- **Temporal windowing**: One-row lookahead for precise time boundary calculation
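
The one-row lookahead can be sketched generically: each snapshot's window ends where the next snapshot begins. This is an illustrative generator over `(timestamp, payload)` pairs, not the module's actual code:

```python
def windowed(rows, final_end=None):
    """Yield (payload, start_ts, end_ts) using one-row lookahead.

    `rows` is an iterable of (timestamp, payload) tuples ordered by time.
    The last row's window ends at `final_end` (or its own timestamp).
    """
    it = iter(rows)
    try:
        prev = next(it)
    except StopIteration:
        return  # empty stream: nothing to window
    for cur in it:
        yield prev[1], prev[0], cur[0]  # window = [prev_ts, next_ts)
        prev = cur
    yield prev[1], prev[0], final_end if final_end is not None else prev[0]
```

Buffering one row ahead is what lets `OrderbookUpdate.end_timestamp` be exact instead of estimated, at the cost of delaying each yield by a single row.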

## Testing

Run module tests:

```bash
uv run pytest test_db_interpreter.py -v
```

Test coverage includes:

- Batch reading correctness
- Temporal window boundary handling
- Trade-to-window assignment accuracy
- End-of-stream behavior
- Error handling for malformed data

## Known Issues

- Requires a specific database schema (book and trades tables)
- Python-literal string parsing assumes well-formed input
- Large databases may require memory monitoring during streaming

## Configuration

- `BOOK_BATCH`: Number of orderbook rows to fetch per query (default: 2048)
- `TRADE_BATCH`: Number of trade rows to fetch per query (default: 4096)
- SQLite PRAGMA settings optimized for read-only sequential access
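
A read-only connection with such settings might look like this. The URI flags and PRAGMAs are real sqlite3 features, but the specific sizes below are assumptions, not the module's actual values:

```python
import sqlite3
from pathlib import Path

def open_readonly(db_path: Path) -> sqlite3.Connection:
    # mode=ro + immutable=1 tells SQLite the file cannot change, which
    # skips locking entirely. immutable is only safe when no writer
    # touches the file while we read.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro&immutable=1", uri=True)
    conn.execute("PRAGMA mmap_size = 268435456")  # 256 MiB memory map (assumed size)
    conn.execute("PRAGMA cache_size = -65536")    # 64 MiB page cache (assumed size)
    return conn
```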

docs/modules/dependencies.md (new file, 162 lines)

# External Dependencies

## Overview

This document describes all external dependencies used in the orderflow backtest system, their purposes, versions, and justifications for inclusion.

## Production Dependencies

### Core Framework Dependencies

#### Dash (^2.18.2)

- **Purpose**: Web application framework for interactive visualizations
- **Usage**: Real-time chart rendering and user interface
- **Justification**: Mature Python-based framework with excellent Plotly integration
- **Key Features**: Reactive components, built-in server, callback system

#### Dash Bootstrap Components (^1.6.0)

- **Purpose**: Bootstrap CSS framework integration for Dash
- **Usage**: Responsive layout grid and modern UI styling
- **Justification**: Provides a professional appearance with minimal custom CSS

#### Plotly (^5.24.1)

- **Purpose**: Interactive charting and visualization library
- **Usage**: OHLC candlesticks, volume bars, depth charts, OBI metrics
- **Justification**: Industry standard for financial data visualization
- **Key Features**: WebGL acceleration, zooming/panning, dark themes

### Data Processing Dependencies

#### Pandas (^2.2.3)

- **Purpose**: Data manipulation and analysis library
- **Usage**: Minimal usage for data structure conversions in visualization
- **Justification**: Standard tool for financial data handling
- **Note**: Usage kept minimal to maintain performance

#### Typer (^0.13.1)

- **Purpose**: Modern CLI framework
- **Usage**: Command-line argument parsing and help generation
- **Justification**: Type-safe, auto-generated help, better UX than argparse
- **Key Features**: Type hints integration, automatic validation

### Data Storage Dependencies

#### SQLite3 (Built-in)

- **Purpose**: Database connectivity for historical data
- **Usage**: Read-only access to orderbook and trade data
- **Justification**: Built into Python, no external dependencies, excellent performance
- **Configuration**: Optimized with immutable mode and mmap

## Development and Testing Dependencies

#### Pytest (^8.3.4)

- **Purpose**: Testing framework
- **Usage**: Unit tests, integration tests, test discovery
- **Justification**: Standard Python testing tool with an excellent plugin ecosystem

#### Coverage (^7.6.9)

- **Purpose**: Code coverage measurement
- **Usage**: Test coverage reporting and quality metrics
- **Justification**: Essential for maintaining code quality

## Build and Package Management

#### UV (Package Manager)

- **Purpose**: Fast Python package manager and task runner
- **Usage**: Dependency management, virtual environments, script execution
- **Justification**: Significantly faster than pip/poetry, better lock file format
- **Commands**: `uv sync`, `uv run`, `uv add`

## Python Standard Library Usage

### Core Libraries

- **sqlite3**: Database connectivity
- **json**: JSON serialization for IPC
- **pathlib**: Modern file path handling
- **subprocess**: Process management for visualization
- **logging**: Structured logging throughout the application
- **datetime**: Date/time parsing and manipulation
- **dataclasses**: Structured data types
- **typing**: Type annotations and hints
- **tempfile**: Atomic file operations
- **ast**: Safe evaluation of Python literals

### Performance Libraries

- **itertools**: Efficient iteration patterns
- **functools**: Function decoration and caching
- **collections**: Specialized data structures

## Dependency Justifications

### Why Dash Over Alternatives?

- **vs. Streamlit**: Better real-time updates, more control over layout
- **vs. Flask + Custom JS**: Integrated Plotly support, faster development
- **vs. Jupyter**: Better for production deployment, process isolation

### Why SQLite Over Alternatives?

- **vs. PostgreSQL**: No server setup required, excellent read performance
- **vs. Parquet**: Better for time-series queries, built-in indexing
- **vs. CSV**: Proper data types, much faster queries, atomic transactions

### Why UV Over Poetry/Pip?

- **vs. Poetry**: Significantly faster dependency resolution and installation
- **vs. Pip**: Better dependency locking, integrated task runner
- **vs. Pipenv**: More active development, better performance

## Version Pinning Strategy

### Patch Version Pinning

- Core dependencies (Dash, Plotly) pinned to patch versions
- Prevents breaking changes while allowing security updates

### Range Pinning

- Development tools use caret (^) ranges for flexibility
- Testing tools can update more freely

### Lock File Management

- `uv.lock` ensures reproducible builds across environments
- Regular updates scheduled monthly for security patches

## Security Considerations

### Dependency Scanning

- Regular audit of dependencies for known vulnerabilities
- Automated updates for security patches
- Minimal dependency tree to reduce attack surface

### Data Isolation

- Read-only database access prevents data modification
- No external network connections required for core functionality
- All file operations contained within the project directory

## Performance Impact

### Bundle Size

- Core runtime: ~50MB with all dependencies
- Dash frontend: additional ~10MB for JavaScript assets
- SQLite: zero overhead (built-in)

### Startup Time

- Cold start: ~2-3 seconds for the full application
- UV virtual environment activation: ~100ms
- Database connection: ~50ms per file

### Memory Usage

- Base application: ~100MB
- Per 1000 OHLC bars: ~5MB additional
- Plotly charts: ~20MB for complex visualizations

## Maintenance Schedule

### Monthly

- Security update review and application
- Dependency version bump evaluation

### Quarterly

- Major version update consideration
- Performance impact assessment
- Alternative technology evaluation

### Annually

- Complete dependency audit
- Technology stack review
- Migration planning for deprecated packages

docs/modules/level_parser.md (new file, 101 lines)

# Module: level_parser

## Purpose

The `level_parser` module provides utilities for parsing and normalizing orderbook level data from various string formats. It handles JSON and Python literal representations, converting them into standardized numeric tuples for processing.

## Public Interface

### Functions

- `normalize_levels(levels: Any) -> List[List[float]]`: Parse levels into `[[price, size], ...]` format, filtering out zero/negative sizes
- `parse_levels_including_zeros(levels: Any) -> List[Tuple[float, float]]`: Parse levels preserving zero sizes for deletion operations

### Private Functions

- `_parse_string_to_list(levels: Any) -> List[Any]`: Core parsing logic, trying JSON first, then `literal_eval`
- `_extract_price_size(item: Any) -> Tuple[Any, Any]`: Extract price/size from dict or list/tuple formats
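
The JSON-first, `literal_eval`-fallback strategy can be sketched as follows; this mirrors the described behavior but is not the module's exact code:

```python
import ast
import json
import logging

def parse_levels_string(raw):
    """Parse a levels string, trying JSON first, then Python literals.

    Returns a list on success and an empty list on failure, matching
    the module's graceful-degradation contract.
    """
    if isinstance(raw, list):
        return raw  # already parsed
    try:
        parsed = json.loads(raw)  # fast path: valid JSON
    except (TypeError, json.JSONDecodeError):
        try:
            parsed = ast.literal_eval(raw)  # fallback: "[(50000.0, 1.5)]"
        except (ValueError, SyntaxError):
            logging.warning("Unparseable levels: %r", raw)
            return []
    return list(parsed) if isinstance(parsed, (list, tuple)) else []
```

`ast.literal_eval` only evaluates Python literals (no names, calls, or operators), which is why it is safe to use on untrusted strings where `eval` would not be.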

## Usage Examples

```python
from level_parser import normalize_levels, parse_levels_including_zeros

# Parse standard levels (filters zeros)
levels = normalize_levels('[[50000.0, 1.5], [49999.0, 2.0]]')
# Returns: [[50000.0, 1.5], [49999.0, 2.0]]

# Parse with zero sizes preserved (for deletions)
updates = parse_levels_including_zeros('[[50000.0, 0.0], [49999.0, 1.5]]')
# Returns: [(50000.0, 0.0), (49999.0, 1.5)]

# Supports dict format
dict_levels = normalize_levels('[{"price": 50000.0, "size": 1.5}]')
# Returns: [[50000.0, 1.5]]

# Short key format
short_levels = normalize_levels('[{"p": 50000.0, "s": 1.5}]')
# Returns: [[50000.0, 1.5]]
```

## Dependencies

### External

- `json`: Primary parsing method for level data
- `ast.literal_eval`: Fallback parsing for Python literal formats
- `logging`: Debug logging for parsing issues
- `typing`: Type annotations

## Input Formats Supported

### JSON Array Format

```json
[[50000.0, 1.5], [49999.0, 2.0]]
```

### Dict Format (Full Keys)

```json
[{"price": 50000.0, "size": 1.5}, {"price": 49999.0, "size": 2.0}]
```

### Dict Format (Short Keys)

```json
[{"p": 50000.0, "s": 1.5}, {"p": 49999.0, "s": 2.0}]
```

### Python Literal Format

```python
"[(50000.0, 1.5), (49999.0, 2.0)]"
```

## Error Handling

- **Graceful Degradation**: Returns an empty list on parse failures
- **Data Validation**: Filters out invalid price/size pairs
- **Type Safety**: Converts all values to float before processing
- **Debug Logging**: Logs warnings for malformed input without crashing

## Performance Characteristics

- **Fast Path**: JSON parsing prioritized for performance
- **Fallback Support**: `ast.literal_eval` as a backup for edge cases
- **Memory Efficient**: Processes items iteratively rather than loading the entire dataset
- **Validation**: Minimal overhead with early filtering of invalid data

## Testing

```bash
uv run pytest test_level_parser.py -v
```

Test coverage includes:

- JSON format parsing accuracy
- Dict format (both key styles) parsing
- Python literal fallback parsing
- Zero size preservation vs. filtering
- Error handling for malformed input
- Type conversion edge cases

## Known Limitations

- Assumes well-formed numeric data (price/size as numbers)
- Does not validate economic constraints (e.g., positive prices)
- Limited to list/dict input formats
- No support for streaming/incremental parsing

docs/modules/main.md (new file, 168 lines)

# Module: main

## Purpose

The `main` module provides the command-line interface (CLI) orchestration for the orderflow backtest system. It handles database discovery, process management, and coordinates the streaming pipeline with the visualization frontend using Typer for argument parsing.

## Public Interface

### Functions

- `main(instrument: str, start_date: str, end_date: str, window_seconds: int = 60) -> None`: Primary CLI entrypoint
- `discover_databases(instrument: str, start_date: str, end_date: str) -> list[Path]`: Find matching database files
- `launch_visualizer() -> subprocess.Popen | None`: Start the Dash application in a separate process

### CLI Arguments

- `instrument`: Trading pair identifier (e.g., "BTC-USDT")
- `start_date`: Start date in YYYY-MM-DD format (UTC)
- `end_date`: End date in YYYY-MM-DD format (UTC)
- `--window-seconds`: OHLC aggregation window size (default: 60)

## Usage Examples

### Command Line Usage

```bash
# Basic usage with default 60-second windows
uv run python main.py BTC-USDT 2025-01-01 2025-01-31

# Custom window size
uv run python main.py ETH-USDT 2025-02-01 2025-02-28 --window-seconds 30

# Single day processing
uv run python main.py SOL-USDT 2025-03-15 2025-03-15
```

### Programmatic Usage

```python
from main import main, discover_databases

# Run the processing pipeline
main("BTC-USDT", "2025-01-01", "2025-01-31", window_seconds=120)

# Discover available databases
db_files = discover_databases("ETH-USDT", "2025-02-01", "2025-02-28")
print(f"Found {len(db_files)} database files")
```

## Dependencies

### Internal

- `db_interpreter.DBInterpreter`: Database streaming
- `ohlc_processor.OHLCProcessor`: Trade aggregation and orderbook processing
- `viz_io`: Data clearing functions

### External

- `typer`: CLI framework and argument parsing
- `subprocess`: Process management for visualization
- `pathlib`: File and directory operations
- `datetime`: Date parsing and validation
- `logging`: Operational logging
- `sys`: Exit code management

## Database Discovery Logic

### File Pattern Matching

```text
# Expected directory structure
../data/OKX/{instrument}/{date}/

# Example paths
../data/OKX/BTC-USDT/2025-01-01/trades.db
../data/OKX/ETH-USDT/2025-02-15/trades.db
```

### Discovery Algorithm

1. Parse start and end dates to datetime objects
2. Iterate through the date range (inclusive)
3. Construct the expected path for each date
4. Verify file existence and readability
5. Return a sorted list of valid database paths
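
The steps above can be sketched with the standard library. The directory layout and `trades.db` filename follow the examples in this document; the `data_root` parameter is an illustrative addition:

```python
from datetime import date, timedelta
from pathlib import Path

def discover_databases(instrument: str, start_date: str, end_date: str,
                       data_root: Path = Path("../data/OKX")) -> list[Path]:
    start = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)
    found = []
    day = start
    while day <= end:  # inclusive date range
        candidate = data_root / instrument / day.isoformat() / "trades.db"
        if candidate.is_file():  # skip missing days
            found.append(candidate)
        day += timedelta(days=1)
    return sorted(found)
```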

## Process Orchestration

### Visualization Process Management

```python
# Launch the Dash app in a separate process
viz_process = subprocess.Popen([
    "uv", "run", "python", "app.py"
], cwd=project_root)

# Process management
try:
    # Main processing loop
    process_databases(db_files)
finally:
    # Clean up the visualization process
    if viz_process:
        viz_process.terminate()
        viz_process.wait(timeout=5)
```

### Data Processing Pipeline

1. **Initialize**: Clear existing data files
2. **Launch**: Start the visualization process
3. **Stream**: Process each database sequentially
4. **Aggregate**: Generate OHLC bars and depth snapshots
5. **Cleanup**: Terminate visualization and finalize

## Error Handling

### Database Access Errors

- **File not found**: Log a warning and skip missing databases
- **Permission denied**: Log an error and exit with status code 1
- **Corruption**: Log an error for the specific database and continue with the next

### Process Management Errors

- **Visualization startup failure**: Log the error but continue processing
- **Process termination**: Graceful shutdown with timeout
- **Resource cleanup**: Ensure child processes are terminated

### Date Validation

- **Invalid format**: Clear error message with the expected format
- **Invalid range**: End date must be >= start date
- **Future dates**: Warning for dates beyond data availability
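
A minimal validation helper matching these rules; the exact error messages are illustrative, not the CLI's actual wording:

```python
from datetime import date

def validate_date_range(start_date: str, end_date: str) -> tuple[date, date]:
    """Parse YYYY-MM-DD strings and enforce end >= start."""
    try:
        start = date.fromisoformat(start_date)
        end = date.fromisoformat(end_date)
    except ValueError as exc:
        raise ValueError(f"Expected YYYY-MM-DD dates, got: {exc}") from exc
    if end < start:
        raise ValueError("End date must be >= start date")
    if end > date.today():
        # The CLI only warns here; the data may simply not exist yet.
        print(f"Warning: {end} is beyond data availability")
    return start, end
```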

## Performance Characteristics

- **Sequential processing**: Databases processed one at a time
- **Memory efficient**: Streaming approach prevents loading entire datasets
- **Process isolation**: Visualization runs independently
- **Resource cleanup**: Automatic process termination on exit

## Testing

Run module tests:

```bash
uv run pytest test_main.py -v
```

Test coverage includes:

- Database discovery logic
- Date parsing and validation
- Process management
- Error handling scenarios
- CLI argument validation

## Configuration

### Default Settings

- **Data directory**: `../data/OKX` (relative to the project root)
- **Visualization command**: `uv run python app.py`
- **Window size**: 60 seconds
- **Process timeout**: 5 seconds for termination

### Environment Variables

- **DATA_PATH**: Override the default data directory
- **VISUALIZATION_PORT**: Override the Dash port (requires an app.py modification)

## Known Issues

- Assumes a specific directory structure under `../data/OKX`
- No validation of database schema compatibility
- Limited error recovery for process management
- No progress indication for large datasets

## Development Notes

- Uses Typer for a modern CLI interface
- Subprocess management compatible with Unix/Windows
- Logging configured for both development and production use
- Exit codes follow Unix conventions (0 = success, 1 = error)

@@ -1,302 +0,0 @@

# Module: Metrics Calculation System

## Purpose

The metrics calculation system provides high-performance computation of Order Book Imbalance (OBI) and Cumulative Volume Delta (CVD) indicators for cryptocurrency trading analysis. It processes orderbook snapshots and trade data to generate financial metrics with per-snapshot granularity.

## Public Interface

### Classes

#### `Metric` (dataclass)

Represents calculated metrics for a single orderbook snapshot.

```python
@dataclass(slots=True)
class Metric:
    snapshot_id: int        # Reference to source snapshot
    timestamp: int          # Unix timestamp
    obi: float              # Order Book Imbalance [-1, 1]
    cvd: float              # Cumulative Volume Delta
    best_bid: float | None  # Best bid price
    best_ask: float | None  # Best ask price
```

#### `MetricCalculator` (static class)

Provides calculation methods for financial metrics.

```python
class MetricCalculator:
    @staticmethod
    def calculate_obi(snapshot: BookSnapshot) -> float: ...

    @staticmethod
    def calculate_volume_delta(trades: List[Trade]) -> float: ...

    @staticmethod
    def calculate_cvd(previous_cvd: float, volume_delta: float) -> float: ...

    @staticmethod
    def get_best_bid_ask(snapshot: BookSnapshot) -> tuple[float | None, float | None]: ...
```

### Functions

#### Order Book Imbalance (OBI) Calculation

```python
def calculate_obi(snapshot: BookSnapshot) -> float:
    """
    Calculate Order Book Imbalance using the standard formula.

    Formula: OBI = (Vb - Va) / (Vb + Va)
    Where:
        Vb = Total volume on the bid side
        Va = Total volume on the ask side

    Args:
        snapshot: BookSnapshot containing bids and asks data

    Returns:
        float: OBI value between -1 and 1, or 0.0 if no volume

    Example:
        >>> snapshot = BookSnapshot(bids={50000.0: OrderbookLevel(...)}, ...)
        >>> obi = MetricCalculator.calculate_obi(snapshot)
        >>> print(f"OBI: {obi:.3f}")
        OBI: 0.333
    """
```
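
The formula translates directly into code. In this sketch the snapshot shape (dicts of price -> level with a `.size` attribute) follows the examples in this document, but it is not the module's actual implementation:

```python
def calculate_obi(bids, asks):
    """OBI = (Vb - Va) / (Vb + Va), where V* sums level sizes.

    `bids`/`asks` map price -> level object with a `.size` attribute,
    as in the BookSnapshot examples; returns 0.0 when the book is empty.
    """
    vb = sum(level.size for level in bids.values())
    va = sum(level.size for level in asks.values())
    total = vb + va
    return (vb - va) / total if total else 0.0
```

The zero-volume guard matters: an empty book would otherwise divide by zero, and 0.0 is the natural "balanced" value for it.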

#### Volume Delta Calculation

```python
def calculate_volume_delta(trades: List[Trade]) -> float:
    """
    Calculate the Volume Delta for a list of trades.

    Volume Delta = Buy Volume - Sell Volume
    - Buy trades (side = "buy"): positive contribution
    - Sell trades (side = "sell"): negative contribution

    Args:
        trades: List of Trade objects for a specific timestamp

    Returns:
        float: Net volume delta (positive = buy pressure, negative = sell pressure)

    Example:
        >>> trades = [
        ...     Trade(side="buy", size=10.0, ...),
        ...     Trade(side="sell", size=3.0, ...)
        ... ]
        >>> vd = MetricCalculator.calculate_volume_delta(trades)
        >>> print(f"Volume Delta: {vd}")
        Volume Delta: 7.0
    """
```

#### Cumulative Volume Delta (CVD) Calculation

```python
def calculate_cvd(previous_cvd: float, volume_delta: float) -> float:
    """
    Calculate Cumulative Volume Delta with incremental support.

    Formula: CVD_t = CVD_{t-1} + Volume_Delta_t

    Args:
        previous_cvd: Previous CVD value (use 0.0 for a reset)
        volume_delta: Current volume delta to add

    Returns:
        float: New cumulative volume delta value

    Example:
        >>> cvd = 0.0  # Starting value
        >>> cvd = MetricCalculator.calculate_cvd(cvd, 10.0)  # First trade
        >>> cvd = MetricCalculator.calculate_cvd(cvd, -5.0)  # Second trade
        >>> print(f"CVD: {cvd}")
        CVD: 5.0
    """
```
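
Since CVD is a running sum, a whole session's series can be derived from per-window deltas in one pass with `itertools.accumulate`; a hedged sketch, not the module's API:

```python
from itertools import accumulate

def cvd_series(volume_deltas, initial=0.0):
    """Return the CVD value after each window, starting from `initial`."""
    # accumulate(initial=...) prepends the seed, so drop the first element.
    return list(accumulate(volume_deltas, initial=initial))[1:]
```

This is equivalent to calling `calculate_cvd` in a loop, but it makes the "reset" semantics explicit: resetting a session is just starting a new series with `initial=0.0`.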
|
||||
|
||||
## Usage Examples

### Basic OBI Calculation
```python
from models import MetricCalculator, BookSnapshot, OrderbookLevel

# Create a sample orderbook snapshot
snapshot = BookSnapshot(
    id=1,
    timestamp=1640995200,
    bids={
        50000.0: OrderbookLevel(price=50000.0, size=10.0, liquidation_count=0, order_count=1),
        49999.0: OrderbookLevel(price=49999.0, size=5.0, liquidation_count=0, order_count=1),
    },
    asks={
        50001.0: OrderbookLevel(price=50001.0, size=3.0, liquidation_count=0, order_count=1),
        50002.0: OrderbookLevel(price=50002.0, size=2.0, liquidation_count=0, order_count=1),
    }
)

# Calculate OBI
obi = MetricCalculator.calculate_obi(snapshot)
print(f"OBI: {obi:.3f}")  # Output: OBI: 0.500
# Explanation: (15 - 5) / (15 + 5) = 10/20 = 0.5
```

### CVD Calculation with Reset
```python
from models import MetricCalculator, Trade

# Simulate a trading session
cvd = 0.0  # Reset CVD at session start

# Process trades for the first timestamp
trades_t1 = [
    Trade(id=1, trade_id=1.0, price=50000.0, size=8.0, side="buy", timestamp=1000),
    Trade(id=2, trade_id=2.0, price=50001.0, size=3.0, side="sell", timestamp=1000),
]

vd_t1 = MetricCalculator.calculate_volume_delta(trades_t1)  # 8.0 - 3.0 = 5.0
cvd = MetricCalculator.calculate_cvd(cvd, vd_t1)            # 0.0 + 5.0 = 5.0

# Process trades for the second timestamp
trades_t2 = [
    Trade(id=3, trade_id=3.0, price=49999.0, size=2.0, side="buy", timestamp=1001),
    Trade(id=4, trade_id=4.0, price=50000.0, size=7.0, side="sell", timestamp=1001),
]

vd_t2 = MetricCalculator.calculate_volume_delta(trades_t2)  # 2.0 - 7.0 = -5.0
cvd = MetricCalculator.calculate_cvd(cvd, vd_t2)            # 5.0 + (-5.0) = 0.0

print(f"Final CVD: {cvd}")  # Output: Final CVD: 0.0
```

### Complete Metrics Processing
```python
from models import MetricCalculator, Metric


def process_snapshot_metrics(snapshot, trades, previous_cvd=0.0):
    """Process complete metrics for a single snapshot."""

    # Calculate OBI
    obi = MetricCalculator.calculate_obi(snapshot)

    # Calculate volume delta and CVD
    volume_delta = MetricCalculator.calculate_volume_delta(trades)
    cvd = MetricCalculator.calculate_cvd(previous_cvd, volume_delta)

    # Extract best bid/ask
    best_bid, best_ask = MetricCalculator.get_best_bid_ask(snapshot)

    # Create metric record
    metric = Metric(
        snapshot_id=snapshot.id,
        timestamp=snapshot.timestamp,
        obi=obi,
        cvd=cvd,
        best_bid=best_bid,
        best_ask=best_ask
    )

    return metric, cvd


# Usage in a processing loop
current_cvd = 0.0
for snapshot, trades in snapshot_trade_pairs:
    metric, current_cvd = process_snapshot_metrics(snapshot, trades, current_cvd)
    # Store metric to database...
```

## Dependencies

### Internal
- `models.BookSnapshot`: Orderbook state data
- `models.Trade`: Individual trade execution data
- `models.OrderbookLevel`: Price level information

### External
- **Python Standard Library**: `typing` for type hints
- **No external packages required**

## Performance Characteristics

### Computational Complexity
- **OBI Calculation**: O(n), where n = number of price levels
- **Volume Delta**: O(m), where m = number of trades
- **CVD Calculation**: O(1), a single addition
- **Best Bid/Ask**: O(n) for min/max operations

### Memory Usage
- **Static Methods**: No instance state, minimal memory overhead
- **Calculations**: Process data in place without copying
- **Results**: Lightweight `Metric` objects with slots optimization

### Typical Performance
Benchmark results (approximate):

- Snapshot with 50 price levels: ~0.1 ms per OBI calculation
- Timestamp with 20 trades: ~0.05 ms per volume delta
- CVD update: ~0.001 ms per calculation
- Complete metric processing: ~0.2 ms per snapshot

## Error Handling

### Edge Cases Handled
```python
# Empty orderbook
empty_snapshot = BookSnapshot(bids={}, asks={})
obi = MetricCalculator.calculate_obi(empty_snapshot)  # Returns 0.0

# No trades
empty_trades = []
vd = MetricCalculator.calculate_volume_delta(empty_trades)  # Returns 0.0

# Zero-volume scenario
zero_vol_snapshot = BookSnapshot(
    bids={50000.0: OrderbookLevel(price=50000.0, size=0.0, ...)},
    asks={50001.0: OrderbookLevel(price=50001.0, size=0.0, ...)}
)
obi = MetricCalculator.calculate_obi(zero_vol_snapshot)  # Returns 0.0
```

### Validation
- **OBI Range**: Results automatically bounded to [-1, 1]
- **Division by Zero**: Handled gracefully with a 0.0 return
- **Invalid Data**: Empty collections handled without errors

## Testing

### Test Coverage
- **Unit Tests**: `tests/test_metric_calculator.py`
- **Integration Tests**: Included in storage and strategy tests
- **Edge Cases**: Empty data, zero volume, boundary conditions

### Running Tests
```bash
# Run metric calculator tests specifically
uv run pytest tests/test_metric_calculator.py -v

# Run all tests with metrics
uv run pytest -k "metric" -v

# Performance tests
uv run pytest tests/test_metric_calculator.py::test_calculate_obi_performance
```

## Known Issues

### Current Limitations
- **Precision**: Floating-point arithmetic limits accuracy for very small numbers
- **Scale**: No optimization for extremely large orderbooks (>10k levels)
- **Currency**: No multi-currency support (assumes a single denomination)

### Planned Enhancements
- **Decimal Precision**: Consider `decimal.Decimal` for high-precision calculations
- **Vectorization**: NumPy integration for batch calculations
- **Additional Metrics**: Volume Profile, liquidity metrics, Delta Flow

---

The metrics calculation system provides a solid foundation for financial analysis, with clean interfaces, comprehensive error handling, and per-call overhead low enough for high-frequency trading data.

147
docs/modules/metrics_calculator.md
Normal file
@@ -0,0 +1,147 @@
# Module: metrics_calculator

## Purpose
The `metrics_calculator` module handles calculation and management of trading metrics, including Order Book Imbalance (OBI) and Cumulative Volume Delta (CVD). It provides windowed aggregation with throttled updates for real-time visualization.

## Public Interface

### Classes
- `MetricsCalculator(window_seconds: int = 60, emit_every_n_updates: int = 25)`: Main metrics calculation engine

### Methods
- `update_cvd_from_trade(side: str, size: float) -> None`: Update CVD from individual trade data
- `update_obi_metrics(timestamp: str, total_bids: float, total_asks: float) -> None`: Update OBI metrics from orderbook volumes
- `finalize_metrics() -> None`: Emit the final metrics bar at the end of processing

### Properties
- `cvd_cumulative: float`: Current cumulative volume delta value

### Private Methods
- `_emit_metrics_bar() -> None`: Emit current metrics to the visualization layer

## Usage Examples

```python
from metrics_calculator import MetricsCalculator

# Initialize calculator
calc = MetricsCalculator(window_seconds=60, emit_every_n_updates=25)

# Update CVD from trades
calc.update_cvd_from_trade("buy", 1.5)   # +1.5 CVD
calc.update_cvd_from_trade("sell", 1.0)  # -1.0 CVD, net +0.5

# Update OBI from the orderbook
total_bids, total_asks = 150.0, 120.0
calc.update_obi_metrics("1640995200000", total_bids, total_asks)

# Access the current CVD
current_cvd = calc.cvd_cumulative  # 0.5

# Finalize at end of processing
calc.finalize_metrics()
```

## Metrics Definitions

### Cumulative Volume Delta (CVD)
- **Formula**: CVD = Σ(buy_volume - sell_volume)
- **Interpretation**: Positive = more buying pressure, negative = more selling pressure
- **Accumulation**: Running total across all processed trades
- **Update Frequency**: Every trade

### Order Book Imbalance (OBI)
- **Formula**: OBI = total_bid_volume - total_ask_volume (raw volume difference, not a normalized ratio)
- **Interpretation**: Positive = more bid liquidity, negative = more ask liquidity
- **Aggregation**: OHLC-style bars per time window (open, high, low, close)
- **Update Frequency**: Throttled per orderbook update

## Dependencies

### Internal
- `viz_io.upsert_metric_bar`: Output interface for visualization

### External
- `logging`: Warning messages for unknown trade sides
- `typing`: Type annotations

## Windowed Aggregation

### OBI Windows
- **Window Size**: Configurable via `window_seconds` (default: 60)
- **Window Alignment**: Aligned to epoch time boundaries
- **OHLC Tracking**: Maintains open, high, low, close values per window
- **Rollover**: Automatic window transitions with final bar emission

### Throttling Mechanism
- **Purpose**: Reduce I/O overhead during high-frequency updates
- **Trigger**: Every N updates (configurable via `emit_every_n_updates`)
- **Behavior**: Emits intermediate updates for real-time visualization
- **Final Emission**: Guaranteed on window rollover and finalization
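
The windowing and throttling rules above can be sketched as follows. This is an illustrative stand-in, not the actual `MetricsCalculator`: the class name `WindowedOBIAggregator` and its `emitted` list (which stands in for calls to `viz_io.upsert_metric_bar`) are invented for the example.

```python
from typing import Optional


class WindowedOBIAggregator:
    """Epoch-aligned OBI windowing with throttled emission (illustrative only)."""

    def __init__(self, window_seconds: int = 60, emit_every_n_updates: int = 25):
        self.window_seconds = window_seconds
        self.emit_every_n = emit_every_n_updates
        self.window_start: Optional[int] = None
        self.bar: Optional[dict] = None
        self._since_last_emit = 0
        self.emitted = []  # stands in for viz_io.upsert_metric_bar calls

    def update(self, ts_seconds: int, obi: float) -> None:
        # Align the window to absolute epoch boundaries.
        window_start = (ts_seconds // self.window_seconds) * self.window_seconds
        if window_start != self.window_start:
            self._emit()  # guaranteed final emission on rollover
            self.window_start = window_start
            self.bar = {"obi_open": obi, "obi_high": obi,
                        "obi_low": obi, "obi_close": obi}
            self._since_last_emit = 0
            return
        # Same window: update the OHLC-style bar in place.
        self.bar["obi_high"] = max(self.bar["obi_high"], obi)
        self.bar["obi_low"] = min(self.bar["obi_low"], obi)
        self.bar["obi_close"] = obi
        # Throttle intermediate emissions to every N in-window updates.
        self._since_last_emit += 1
        if self._since_last_emit >= self.emit_every_n:
            self._emit()
            self._since_last_emit = 0

    def _emit(self) -> None:
        if self.bar is not None:
            self.emitted.append((self.window_start, dict(self.bar)))


agg = WindowedOBIAggregator(window_seconds=60, emit_every_n_updates=2)
agg.update(120, 10.0)   # opens window [120, 180)
agg.update(130, 25.0)   # in-window update (no emit yet)
agg.update(140, 5.0)    # second in-window update: throttled emit fires
agg.update(185, -5.0)   # rollover: final bar for [120, 180), new window opens
print(agg.emitted[-1])
```

Note how the rollover path emits the closing bar before opening the next window, which is what guarantees the final emission regardless of the throttle counter.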

## State Management

### CVD State
- `cvd_cumulative: float`: Running total across all trades
- **Persistence**: Maintained throughout the processor lifetime
- **Updates**: Incremental addition/subtraction per trade

### OBI State
- `metrics_window_start: int`: Current window start timestamp
- `metrics_bar: dict`: Current OBI OHLC values
- `_metrics_since_last_emit: int`: Throttling counter

## Output Format

### Metrics Bar Structure
```python
{
    'obi_open': float,   # First OBI value in window
    'obi_high': float,   # Maximum OBI in window
    'obi_low': float,    # Minimum OBI in window
    'obi_close': float,  # Latest OBI value
}
```

### Visualization Integration
- Emitted via `viz_io.upsert_metric_bar(timestamp, obi_open, obi_high, obi_low, obi_close, cvd_value)`
- Compatible with the existing OHLC visualization infrastructure
- Real-time updates during active processing

## Performance Characteristics

- **Low Memory**: Maintains only the current window state
- **Throttled I/O**: Configurable update frequency prevents excessive writes
- **Efficient Updates**: O(1) operations for trade and OBI updates
- **Window Management**: Automatic transitions without manual intervention

## Configuration

### Constructor Parameters
- `window_seconds: int`: Time window for OBI aggregation (default: 60)
- `emit_every_n_updates: int`: Throttling factor for intermediate updates (default: 25)

### Tuning Guidelines
- **Higher throttling**: Reduces I/O load but delays real-time updates
- **Lower throttling**: More responsive visualization, higher I/O overhead
- **Window size**: Affects granularity of OBI trends (shorter = more detail)

## Testing

```bash
uv run pytest test_metrics_calculator.py -v
```

Test coverage includes:
- CVD accumulation accuracy across multiple trades
- OBI window rollover and OHLC tracking
- Throttling behavior verification
- Edge cases (unknown trade sides, empty windows)
- Integration with visualization output

## Known Limitations

- CVD calculation assumes binary buy/sell classification
- No support for partial fills or complex order types
- OBI calculation treats all liquidity equally (no price weighting)
- Window boundaries aligned to absolute timestamps (no sliding windows)

122
docs/modules/ohlc_processor.md
Normal file
@@ -0,0 +1,122 @@
# Module: ohlc_processor

## Purpose
The `ohlc_processor` module is the main coordinator for trade data processing, orchestrating OHLC aggregation, orderbook management, and metrics calculation. It has been refactored into a modular architecture built on composition with specialized helper modules.

## Public Interface

### Classes
- `OHLCProcessor(window_seconds: int = 60, depth_levels_per_side: int = 50)`: Main orchestrator class that coordinates trade processing using composition

### Methods
- `process_trades(trades: list[tuple]) -> None`: Aggregate trades into OHLC bars and update CVD metrics
- `update_orderbook(ob_update: OrderbookUpdate) -> None`: Apply orderbook updates and calculate OBI metrics
- `finalize() -> None`: Emit final OHLC bar and metrics data
- `cvd_cumulative` (property): Access to the cumulative volume delta value

### Composed Modules
- `OrderbookManager`: Handles in-memory orderbook state and depth snapshots
- `MetricsCalculator`: Manages OBI and CVD metric calculations
- `level_parser` functions: Parse and normalize orderbook level data

## Usage Examples

```python
from ohlc_processor import OHLCProcessor
from db_interpreter import DBInterpreter

# Initialize processor with 1-minute windows and 50 depth levels
processor = OHLCProcessor(window_seconds=60, depth_levels_per_side=50)

# Process streaming data
for ob_update, trades in DBInterpreter(db_path).stream():
    # Aggregate trades into OHLC bars
    processor.process_trades(trades)

    # Update orderbook and emit depth snapshots
    processor.update_orderbook(ob_update)

# Finalize processing
processor.finalize()
```

### Advanced Configuration
```python
# Custom window size and depth levels
processor = OHLCProcessor(
    window_seconds=30,         # 30-second bars
    depth_levels_per_side=25   # Top 25 levels per side
)
```

## Dependencies

### Internal Modules
- `orderbook_manager.OrderbookManager`: In-memory orderbook state management
- `metrics_calculator.MetricsCalculator`: OBI and CVD metrics calculation
- `level_parser`: Orderbook level parsing utilities
- `viz_io`: JSON output for visualization
- `db_interpreter.OrderbookUpdate`: Input data structures

### External
- `typing`: Type annotations
- `logging`: Debug and operational logging

## Modular Architecture

The processor follows a clean composition pattern:

1. **Main Coordinator** (`OHLCProcessor`):
   - Orchestrates trade and orderbook processing
   - Maintains OHLC bar state and window management
   - Delegates specialized tasks to composed modules

2. **Orderbook Management** (`OrderbookManager`):
   - Maintains in-memory price→size mappings
   - Applies partial updates and handles deletions
   - Provides sorted top-N level extraction

3. **Metrics Calculation** (`MetricsCalculator`):
   - Tracks CVD from trade flow (buy/sell volume delta)
   - Calculates OBI from orderbook volume imbalance
   - Manages windowed metrics aggregation with throttling

4. **Level Parsing** (`level_parser` module):
   - Normalizes JSON and Python literal level representations
   - Handles zero-size levels for orderbook deletions
   - Provides robust error handling for malformed data
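
The delegation flow can be sketched with minimal stand-ins for the composed modules. `_Book`, `_Metrics`, `MiniProcessor`, and the `(price, size, side, ts)` trade-tuple layout are all hypothetical simplifications invented for this example; the real classes are `OrderbookManager`, `MetricsCalculator`, and `OHLCProcessor`.

```python
class _Book:
    """Tiny stand-in for OrderbookManager."""

    def __init__(self):
        self.bids, self.asks = {}, {}

    def apply_updates(self, bids, asks):
        for side, updates in ((self.bids, bids), (self.asks, asks)):
            for price, size in updates:
                if size == 0:
                    side.pop(price, None)  # zero size deletes the level
                elif size > 0:
                    side[price] = size     # positive size upserts the level

    def get_total_volume(self):
        return sum(self.bids.values()), sum(self.asks.values())


class _Metrics:
    """Tiny stand-in for MetricsCalculator."""

    def __init__(self):
        self.cvd_cumulative = 0.0
        self.last_obi = None

    def update_cvd_from_trade(self, side, size):
        self.cvd_cumulative += size if side == "buy" else -size

    def update_obi_metrics(self, ts, total_bids, total_asks):
        self.last_obi = total_bids - total_asks


class MiniProcessor:
    """Coordinator that owns no book/metric logic itself, only delegation."""

    def __init__(self):
        self.book = _Book()
        self.metrics = _Metrics()

    def process_trades(self, trades):
        for price, size, side, ts in trades:
            self.metrics.update_cvd_from_trade(side, size)

    def update_orderbook(self, ts, bids, asks):
        self.book.apply_updates(bids, asks)
        total_bids, total_asks = self.book.get_total_volume()
        self.metrics.update_obi_metrics(ts, total_bids, total_asks)


p = MiniProcessor()
p.process_trades([(50000.0, 2.0, "buy", "t0"), (50001.0, 0.5, "sell", "t0")])
p.update_orderbook("t0", [(50000.0, 3.0)], [(50001.0, 1.0)])
print(p.metrics.cvd_cumulative, p.metrics.last_obi)  # 1.5 2.0
```

The point of the pattern is visible here: the coordinator only routes data, so each composed module can be tested in isolation.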

## Performance Characteristics

- **Throttled Updates**: Prevents excessive I/O during high-frequency periods
- **Memory Efficient**: Maintains only the current window and top-N depth levels
- **Incremental Processing**: Applies only changed orderbook levels
- **Atomic Operations**: Thread-safe updates to shared data structures

## Testing

Run module tests:
```bash
uv run pytest test_ohlc_processor.py -v
```

Test coverage includes:
- OHLC calculation accuracy across window boundaries
- Volume accumulation correctness
- High/low price tracking
- Orderbook update application
- Depth snapshot generation
- OBI metric calculation

## Known Issues

- Orderbook level parsing assumes well-formed JSON or Python literals
- Memory usage scales with the number of active price levels
- Clock skew between trades and orderbook updates is not handled

## Configuration Options

- `window_seconds`: Time window size for OHLC aggregation (default: 60)
- `depth_levels_per_side`: Number of top price levels to maintain (default: 50)
- `UPSERT_THROTTLE_MS`: Minimum interval between upsert operations (internal)
- `DEPTH_EMIT_THROTTLE_MS`: Minimum interval between depth emissions (internal)

121
docs/modules/orderbook_manager.md
Normal file
@@ -0,0 +1,121 @@
# Module: orderbook_manager

## Purpose
The `orderbook_manager` module provides in-memory orderbook state management with partial update capabilities. It maintains separate bid and ask sides and supports efficient top-level extraction for visualization.

## Public Interface

### Classes
- `OrderbookManager(depth_levels_per_side: int = 50)`: Main orderbook state manager

### Methods
- `apply_updates(bids_updates: List[Tuple[float, float]], asks_updates: List[Tuple[float, float]]) -> None`: Apply partial updates to both sides
- `get_total_volume() -> Tuple[float, float]`: Get total bid and ask volumes
- `get_top_levels() -> Tuple[List[List[float]], List[List[float]]]`: Get sorted top levels for both sides

### Private Methods
- `_apply_partial_updates(side_map: Dict[float, float], updates: List[Tuple[float, float]]) -> None`: Apply updates to one side
- `_build_top_levels(side_map: Dict[float, float], limit: int, reverse: bool) -> List[List[float]]`: Extract sorted top levels

## Usage Examples

```python
from orderbook_manager import OrderbookManager

# Initialize manager
manager = OrderbookManager(depth_levels_per_side=25)

# Apply orderbook updates
bids = [(50000.0, 1.5), (49999.0, 2.0)]
asks = [(50001.0, 1.2), (50002.0, 0.8)]
manager.apply_updates(bids, asks)

# Get volume totals for OBI calculation
total_bids, total_asks = manager.get_total_volume()
obi = total_bids - total_asks

# Get top levels for depth visualization
bids_sorted, asks_sorted = manager.get_top_levels()

# Handle deletions (size = 0)
deletions = [(50000.0, 0.0)]  # Remove price level
manager.apply_updates(deletions, [])
```

## Dependencies

### External
- `typing`: Type annotations for Dict, List, Tuple

## State Management

### Internal State
- `_book_bids: Dict[float, float]`: Price → size mapping for the bid side
- `_book_asks: Dict[float, float]`: Price → size mapping for the ask side
- `depth_levels_per_side: int`: Configuration for top-N extraction

### Update Semantics
- **Size = 0**: Remove price level (deletion)
- **Size > 0**: Upsert price level with new size
- **Size < 0**: Ignored (invalid update)

### Sorting Behavior
- **Bids**: Descending by price (highest price first)
- **Asks**: Ascending by price (lowest price first)
- **Top-N**: Limited by the `depth_levels_per_side` parameter
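
A minimal sketch of these update and sorting semantics, using hypothetical free functions rather than the class's private methods:

```python
def apply_partial_updates(side_map: dict, updates: list) -> None:
    """Apply (price, size) updates to one side of the book."""
    for price, size in updates:
        if size == 0:
            side_map.pop(price, None)  # size = 0 deletes the level
        elif size > 0:
            side_map[price] = size     # size > 0 upserts the level
        # size < 0 is ignored (invalid update)


def top_levels(side_map: dict, limit: int, reverse: bool) -> list:
    """Sorted top-N [price, size] pairs; bids use reverse=True, asks reverse=False."""
    return [[p, side_map[p]] for p in sorted(side_map, reverse=reverse)[:limit]]


bids = {}
apply_partial_updates(bids, [(50000.0, 1.5), (49999.0, 2.0), (49998.0, 0.7)])
apply_partial_updates(bids, [(50000.0, 0.0), (49997.0, -1.0)])  # delete + invalid
print(top_levels(bids, 2, reverse=True))  # [[49999.0, 2.0], [49998.0, 0.7]]
```

Keeping the book in plain dicts and sorting only on extraction matches the performance notes below: updates stay O(1) and the O(n log n) sort cost is paid only when a depth snapshot is needed.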

## Performance Characteristics

- **Memory Efficient**: Only stores non-zero price levels
- **Fast Updates**: O(1) upsert/delete operations using dicts
- **Efficient Sorting**: Only sorts when extracting top levels
- **Bounded Output**: Limits result size for visualization performance

## Use Cases

### OBI Calculation
```python
total_bids, total_asks = manager.get_total_volume()
order_book_imbalance = total_bids - total_asks
```

### Depth Visualization
```python
bids, asks = manager.get_top_levels()
depth_payload = {"bids": bids, "asks": asks}
```

### Incremental Updates
```python
# Typical orderbook update cycle
updates = parse_orderbook_changes(raw_data)
manager.apply_updates(updates['bids'], updates['asks'])
```

## Testing

```bash
uv run pytest test_orderbook_manager.py -v
```

Test coverage includes:
- Partial update application correctness
- Deletion handling (size = 0)
- Volume calculation accuracy
- Top-level sorting and limiting
- Edge cases (empty books, single levels)
- Performance with large orderbooks

## Configuration

- `depth_levels_per_side`: Controls output size for visualization (default: 50)
  - Affects memory usage and sorting performance
  - Higher values provide more market depth detail
  - Lower values improve processing speed

## Known Limitations

- No built-in validation of price/size values
- Memory usage scales with the number of unique price levels
- No historical state tracking (current snapshot only)
- No support for spread calculation or market data statistics

155
docs/modules/viz_io.md
Normal file
@@ -0,0 +1,155 @@
# Module: viz_io

## Purpose
The `viz_io` module provides atomic inter-process communication (IPC) between the data processing pipeline and the visualization frontend. It manages JSON file-based data exchange with atomic writes to prevent race conditions and data corruption.

## Public Interface

### Functions
- `add_ohlc_bar(timestamp, open_price, high_price, low_price, close_price, volume)`: Append a new OHLC bar to the rolling dataset
- `upsert_ohlc_bar(timestamp, open_price, high_price, low_price, close_price, volume)`: Update an existing bar or append a new one
- `clear_data()`: Reset the OHLC dataset to an empty state
- `add_metric_bar(timestamp, obi_open, obi_high, obi_low, obi_close)`: Append an OBI metric bar
- `upsert_metric_bar(timestamp, obi_open, obi_high, obi_low, obi_close)`: Update an existing OBI bar or append a new one
- `clear_metrics()`: Reset the metrics dataset to an empty state
- `set_depth_data(bids, asks)`: Update the current orderbook depth snapshot

### Constants
- `DATA_FILE`: Path to the OHLC data JSON file
- `DEPTH_FILE`: Path to the depth data JSON file
- `METRICS_FILE`: Path to the metrics data JSON file
- `MAX_BARS`: Maximum number of bars to retain (1000)

## Usage Examples

### Basic OHLC Operations
```python
import viz_io

# Add a new OHLC bar
viz_io.add_ohlc_bar(
    timestamp=1640995200000,  # Unix timestamp in milliseconds
    open_price=50000.0,
    high_price=50100.0,
    low_price=49900.0,
    close_price=50050.0,
    volume=125.5
)

# Update the current bar (if the timestamp matches) or add a new one
viz_io.upsert_ohlc_bar(
    timestamp=1640995200000,
    open_price=50000.0,
    high_price=50150.0,   # Updated high
    low_price=49850.0,    # Updated low
    close_price=50075.0,  # Updated close
    volume=130.2          # Updated volume
)
```

### Orderbook Depth Management
```python
# Set the current depth snapshot
bids = [[49990.0, 1.5], [49985.0, 2.1], [49980.0, 0.8]]
asks = [[50010.0, 1.2], [50015.0, 1.8], [50020.0, 2.5]]

viz_io.set_depth_data(bids, asks)
```

### Metrics Operations
```python
# Add Order Book Imbalance metrics
viz_io.add_metric_bar(
    timestamp=1640995200000,
    obi_open=0.15,
    obi_high=0.22,
    obi_low=0.08,
    obi_close=0.18
)
```

## Dependencies

### Internal
- None (standalone utility module)

### External
- `json`: JSON serialization/deserialization
- `pathlib`: File path handling
- `typing`: Type annotations
- `tempfile`: Atomic write operations

## Data Formats

### OHLC Data (`ohlc_data.json`)
```json
[
  [1640995200000, 50000.0, 50100.0, 49900.0, 50050.0, 125.5],
  [1640995260000, 50050.0, 50200.0, 50000.0, 50150.0, 98.3]
]
```
Format: `[timestamp, open, high, low, close, volume]`

### Depth Data (`depth_data.json`)
```json
{
  "bids": [[49990.0, 1.5], [49985.0, 2.1]],
  "asks": [[50010.0, 1.2], [50015.0, 1.8]]
}
```
Format: `{"bids": [[price, size], ...], "asks": [[price, size], ...]}`

### Metrics Data (`metrics_data.json`)
```json
[
  [1640995200000, 0.15, 0.22, 0.08, 0.18],
  [1640995260000, 0.18, 0.25, 0.12, 0.20]
]
```
Format: `[timestamp, obi_open, obi_high, obi_low, obi_close]`

## Atomic Write Operations

All write operations use atomic file replacement to prevent partial reads:

1. Write data to a temporary file
2. Flush and sync to disk
3. Atomically rename the temporary file to the target file

This ensures the visualization frontend always reads complete, valid JSON data.
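
That three-step write path can be sketched as follows. This is an illustration of the technique, not the actual `viz_io` code: the helper name `atomic_write_json` is invented for the example.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write_json(target: Path, payload) -> None:
    """Write JSON via temp file + fsync + rename so readers never see partial data."""
    # 1. Write to a temporary file in the same directory (the rename must not
    #    cross filesystems to stay atomic).
    fd, tmp_path = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(payload, f)
            f.flush()
            os.fsync(f.fileno())      # 2. force the bytes to disk
        os.replace(tmp_path, target)  # 3. atomic rename over the target
    except BaseException:
        os.unlink(tmp_path)
        raise


out = Path(tempfile.mkdtemp()) / "depth_data.json"
atomic_write_json(out, {"bids": [[49990.0, 1.5]], "asks": [[50010.0, 1.2]]})
print(json.loads(out.read_text()))
```

`os.replace` is atomic on POSIX and Windows, so a concurrent reader of the target path always sees either the old complete file or the new one.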

## Performance Characteristics

- **Bounded Memory**: OHLC and metrics datasets limited to 1000 bars max
- **Atomic Operations**: No partial reads possible during writes
- **Rolling Window**: Automatic trimming of old data maintains constant memory usage
- **Fast Lookups**: Timestamp-based upsert operations use list scanning (acceptable for 1000 items)
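
The upsert-plus-trim behavior can be sketched as follows; `upsert_bar` is a hypothetical simplification of what `upsert_ohlc_bar` is described as doing, not the real function.

```python
MAX_BARS = 1000


def upsert_bar(bars: list, ts: int, o: float, h: float, l: float, c: float, v: float) -> None:
    """Update the bar with a matching timestamp, or append; keep only the newest MAX_BARS."""
    row = [ts, o, h, l, c, v]
    for i, bar in enumerate(bars):
        if bar[0] == ts:      # timestamp match: update in place
            bars[i] = row
            break
    else:
        bars.append(row)      # new timestamp: append
    del bars[:-MAX_BARS]      # rolling trim (no-op while under the limit)


bars = []
upsert_bar(bars, 1640995200000, 50000.0, 50100.0, 49900.0, 50050.0, 125.5)
upsert_bar(bars, 1640995200000, 50000.0, 50150.0, 49850.0, 50075.0, 130.2)  # update
upsert_bar(bars, 1640995260000, 50075.0, 50200.0, 50000.0, 50150.0, 98.3)   # append
print(len(bars), bars[0])
```

The linear scan keeps the code simple; with the 1000-bar cap, a dict index would add complexity for no measurable gain.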

## Testing

Run module tests:
```bash
uv run pytest test_viz_io.py -v
```

Test coverage includes:
- Atomic write operations
- Data format validation
- Rolling window behavior
- Upsert logic correctness
- File corruption prevention
- Concurrent read/write scenarios

## Known Issues

- File I/O may block briefly during atomic writes
- JSON parsing errors are not propagated to callers
- Limited to 1000 bars maximum (configurable via `MAX_BARS`)
- No compression for large datasets

## Thread Safety

All operations are thread-safe for single-writer, multiple-reader scenarios:
- Writer: data processing pipeline (single thread)
- Readers: visualization frontend (polling)
- Atomic file operations prevent corruption during concurrent access