# Real-Time Strategy Architecture - Technical Specification

## Overview

This document outlines the technical specification for updating the trading strategy system to support real-time data processing with incremental calculations. The current architecture processes entire datasets during initialization, which is inefficient for real-time trading where new data arrives continuously.

## Current Architecture Issues

### Problems with Current Implementation
1. **Initialization-Heavy Design**: All calculations performed during `initialize()` method
2. **Full Dataset Processing**: Entire historical dataset processed on each initialization
3. **Memory Inefficient**: Stores complete calculation history in arrays
4. **No Incremental Updates**: Cannot add new data without full recalculation
5. **Performance Bottleneck**: Recalculating years of data for each new candle
6. **Index-Based Access**: Signal generation relies on pre-calculated arrays with fixed indices

### Current Strategy Flow
```
Data → initialize() → Full Calculation → Store Arrays → get_signal(index)
```

## Target Architecture: Incremental Calculation

### New Strategy Flow
```
Initial Data → initialize() → Warm-up Calculation → Ready State
New Data Point → calculate_on_data() → Update State → get_signal()
```

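The new flow can be illustrated with a minimal sketch. The class `EmaCrossStrategy`, its exponential moving average rule, and the `"ENTRY"`/`"HOLD"` return values are hypothetical and chosen only to keep the example short; they are not part of the existing strategy code. The point is that `initialize()` only warms up state, and each later bar costs a constant amount of work.

```python
from typing import Dict, List, Optional


class EmaCrossStrategy:
    """Hypothetical strategy: signals entry when close is above its EMA."""

    def __init__(self, period: int = 3) -> None:
        self.period = period
        self._alpha = 2.0 / (period + 1)
        self._ema: Optional[float] = None
        self._last_close: Optional[float] = None
        self._data_points_received = 0
        self.is_warmed_up = False

    def initialize(self, history: List[Dict[str, float]]) -> None:
        # Warm-up: replay the initial data once, then the strategy is ready.
        for bar in history:
            self.calculate_on_data(bar)

    def calculate_on_data(self, bar: Dict[str, float]) -> None:
        # Constant-time update per bar: no stored arrays, no recalculation.
        close = bar["close"]
        if self._ema is None:
            self._ema = close
        else:
            self._ema = self._alpha * close + (1 - self._alpha) * self._ema
        self._last_close = close
        self._data_points_received += 1
        self.is_warmed_up = self._data_points_received >= self.period

    def get_signal(self) -> str:
        # No index argument: the signal always reflects the latest state.
        if not self.is_warmed_up:
            return "HOLD"
        return "ENTRY" if self._last_close > self._ema else "HOLD"
```

Note that `get_signal()` takes no index, in contrast to `get_signal(index)` in the current flow, because only the latest state is kept.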
## Technical Requirements

### 1. Base Strategy Interface Updates

#### New Abstract Methods
```python
@abstractmethod
def get_minimum_buffer_size(self) -> Dict[str, int]:
    """
    Return minimum data points needed for each timeframe.

    Returns:
        Dict[str, int]: {timeframe: min_points} mapping

    Example:
        {"15min": 50, "1min": 750}  # 50 15min candles = 750 1min candles
    """
    pass

@abstractmethod
def calculate_on_data(self, new_data_point: Dict, timestamp: pd.Timestamp) -> None:
    """
    Process a single new data point incrementally.

    Args:
        new_data_point: OHLCV data point {open, high, low, close, volume}
        timestamp: Timestamp of the data point
    """
    pass

@abstractmethod
def supports_incremental_calculation(self) -> bool:
    """
    Whether strategy supports incremental calculation.

    Returns:
        bool: True if incremental mode supported
    """
    pass
```

#### New Properties and Methods
```python
@property
def calculation_mode(self) -> str:
    """Current calculation mode: 'initialization' or 'incremental'"""
    return self._calculation_mode

@property
def is_warmed_up(self) -> bool:
    """Whether strategy has sufficient data for reliable signals"""
    return self._is_warmed_up

def reset_calculation_state(self) -> None:
    """Reset internal calculation state for reinitialization"""
    pass

def get_current_state_summary(self) -> Dict:
    """Get summary of current calculation state for debugging"""
    pass
```

### 2. Internal State Management

#### State Variables
Each strategy must maintain:
```python
class StrategyBase:
    def __init__(self, ...):
        # Calculation state
        self._calculation_mode = "initialization"  # or "incremental"
        self._is_warmed_up = False
        self._data_points_received = 0

        # Timeframe-specific buffers
        self._timeframe_buffers = {}  # {timeframe: deque(maxlen=buffer_size)}
        self._timeframe_last_update = {}  # {timeframe: timestamp}

        # Indicator states (strategy-specific)
        self._indicator_states = {}

        # Signal generation state
        self._last_signals = {}  # Cache recent signals
        self._signal_history = deque(maxlen=100)  # Recent signal history
```

#### Buffer Management
```python
def _update_timeframe_buffers(self, new_data_point: Dict, timestamp: pd.Timestamp):
    """Update all timeframe buffers with new data point"""

def _should_update_timeframe(self, timeframe: str, timestamp: pd.Timestamp) -> bool:
    """Check if timeframe should be updated based on timestamp"""

def _get_timeframe_buffer(self, timeframe: str) -> pd.DataFrame:
    """Get current buffer for specific timeframe"""
```

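One possible shape for a single timeframe buffer is sketched below. `TimeframeBuffer` is a hypothetical helper, not an existing class. It uses only the standard library (`datetime` instead of `pd.Timestamp`) and assumes the timeframe length in minutes divides 60 evenly. It folds 1-minute bars into the open candle and pushes the candle into a bounded `deque` once the timestamp moves into the next bucket.

```python
from collections import deque
from datetime import datetime
from typing import Deque, Dict, Optional


class TimeframeBuffer:
    """Hypothetical helper: aggregates 1-minute bars into N-minute candles."""

    def __init__(self, minutes: int, maxlen: int) -> None:
        self.minutes = minutes
        # Completed candles only; the oldest are dropped automatically.
        self.candles: Deque[Dict] = deque(maxlen=maxlen)
        self._open_candle: Optional[Dict] = None

    def _bucket_start(self, ts: datetime) -> datetime:
        # Floor the timestamp to the start of its N-minute bucket.
        floored = ts.minute - ts.minute % self.minutes
        return ts.replace(minute=floored, second=0, microsecond=0)

    def update(self, bar: Dict[str, float], ts: datetime) -> bool:
        """Add a 1-minute bar; return True if a candle was completed."""
        start = self._bucket_start(ts)
        completed = False
        if self._open_candle is not None and self._open_candle["start"] != start:
            # The bar belongs to a new bucket, so the previous candle is final.
            self.candles.append(self._open_candle)
            self._open_candle = None
            completed = True
        if self._open_candle is None:
            self._open_candle = {"start": start, **bar}
        else:
            candle = self._open_candle
            candle["high"] = max(candle["high"], bar["high"])
            candle["low"] = min(candle["low"], bar["low"])
            candle["close"] = bar["close"]
            candle["volume"] += bar["volume"]
        return completed
```

The boolean return value plays the role of `_should_update_timeframe()`: higher-timeframe indicators only need to run when it is `True`.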
### 3. Strategy-Specific Requirements

#### DefaultStrategy (Supertrend-based)
```python
class DefaultStrategy(StrategyBase):
    def get_minimum_buffer_size(self) -> Dict[str, int]:
        primary_tf = self.params.get("timeframe", "15min")
        if primary_tf == "15min":
            return {"15min": 50, "1min": 750}
        elif primary_tf == "5min":
            return {"5min": 50, "1min": 250}
        # ... other timeframes

    def _initialize_indicator_states(self):
        """Initialize Supertrend calculation states"""
        self._supertrend_states = [
            SupertrendState(period=10, multiplier=3.0),
            SupertrendState(period=11, multiplier=2.0),
            SupertrendState(period=12, multiplier=1.0)
        ]

    def _update_supertrend_incrementally(self, ohlc_data):
        """Update Supertrend calculations with new data"""
        # Incremental ATR calculation
        # Incremental Supertrend calculation
        # Update meta-trend based on all three Supertrends
```

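The "Incremental ATR calculation" step could look roughly like the sketch below. It assumes Wilder's smoothing (a simple average of the first `period` true ranges, then a recursive update), which this specification does not confirm as the ATR variant in use. `IncrementalATR` is a hypothetical name.

```python
from typing import Optional


class IncrementalATR:
    """Hypothetical incremental ATR using Wilder's smoothing (an assumption)."""

    def __init__(self, period: int) -> None:
        self.period = period
        self._prev_close: Optional[float] = None
        self._count = 0
        self._tr_sum = 0.0
        self.value: Optional[float] = None  # None until `period` bars are seen

    def update(self, high: float, low: float, close: float) -> Optional[float]:
        # True range: bar range, extended by any gap from the previous close.
        if self._prev_close is None:
            true_range = high - low
        else:
            true_range = max(
                high - low,
                abs(high - self._prev_close),
                abs(low - self._prev_close),
            )
        self._prev_close = close
        self._count += 1

        if self._count <= self.period:
            # Seed phase: accumulate a simple average of the first ranges.
            self._tr_sum += true_range
            if self._count == self.period:
                self.value = self._tr_sum / self.period
        else:
            # Recursive phase: only the previous ATR is needed.
            self.value = (self.value * (self.period - 1) + true_range) / self.period
        return self.value
```

Only four scalars are stored per indicator, so three Supertrend instances with different periods add negligible memory.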
#### BBRSStrategy (Bollinger Bands + RSI)
```python
class BBRSStrategy(StrategyBase):
    def get_minimum_buffer_size(self) -> Dict[str, int]:
        bb_period = self.params.get("bb_period", 20)
        rsi_period = self.params.get("rsi_period", 14)
        min_periods = max(bb_period, rsi_period) + 10  # +10 for warmup
        return {"1min": min_periods}

    def _initialize_indicator_states(self):
        """Initialize BB and RSI calculation states"""
        self._bb_state = BollingerBandsState(period=self.params.get("bb_period", 20))
        self._rsi_state = RSIState(period=self.params.get("rsi_period", 14))
        self._market_regime_state = MarketRegimeState()

    def _update_indicators_incrementally(self, price_data):
        """Update BB, RSI, and market regime with new data"""
        # Incremental moving average for BB
        # Incremental RSI calculation
        # Market regime detection update
```

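As an illustration of the Bollinger Bands part, the sketch below keeps a running sum and a running sum of squares over a fixed window. `IncrementalBollinger` and the `num_std` parameter are hypothetical, and the population standard deviation is an assumption; a batch implementation that uses the sample standard deviation would produce slightly wider bands.

```python
import math
from collections import deque
from typing import Deque, Optional, Tuple


class IncrementalBollinger:
    """Hypothetical incremental Bollinger Bands over a fixed window."""

    def __init__(self, period: int = 20, num_std: float = 2.0) -> None:
        self.period = period
        self.num_std = num_std
        self._window: Deque[float] = deque(maxlen=period)
        self._sum = 0.0
        self._sum_sq = 0.0

    def update(self, close: float) -> Optional[Tuple[float, float, float]]:
        """Return (lower, middle, upper), or None while warming up."""
        if len(self._window) == self.period:
            # Remove the value that is about to fall out of the window.
            oldest = self._window[0]
            self._sum -= oldest
            self._sum_sq -= oldest * oldest
        self._window.append(close)
        self._sum += close
        self._sum_sq += close * close

        if len(self._window) < self.period:
            return None
        mean = self._sum / self.period
        # Clamp at zero: rounding can push the variance slightly negative.
        variance = max(self._sum_sq / self.period - mean * mean, 0.0)
        band = self.num_std * math.sqrt(variance)
        return (mean - band, mean, mean + band)
```

Running sums can accumulate floating-point drift over very long streams, so a periodic recomputation from the window is a reasonable safeguard.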
#### RandomStrategy
```python
class RandomStrategy(StrategyBase):
    def get_minimum_buffer_size(self) -> Dict[str, int]:
        return {"1min": 1}  # No indicators needed

    def supports_incremental_calculation(self) -> bool:
        return True  # Always supports incremental
```

### 4. Indicator State Classes

#### Base Indicator State
```python
class IndicatorState(ABC):
    """Base class for maintaining indicator calculation state"""

    @abstractmethod
    def update(self, new_value: float) -> float:
        """Update indicator with new value and return current indicator value"""
        pass

    @abstractmethod
    def is_warmed_up(self) -> bool:
        """Whether indicator has enough data for reliable values"""
        pass

    @abstractmethod
    def reset(self) -> None:
        """Reset indicator state"""
        pass
```

#### Specific Indicator States
```python
class MovingAverageState(IndicatorState):
    """Maintains state for incremental moving average calculation"""

class RSIState(IndicatorState):
    """Maintains state for incremental RSI calculation"""

class SupertrendState(IndicatorState):
    """Maintains state for incremental Supertrend calculation"""

class BollingerBandsState(IndicatorState):
    """Maintains state for incremental Bollinger Bands calculation"""
```

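To make the interface concrete, here is one way `MovingAverageState` might be filled in for a simple moving average. The constructor signature and the choice to return a partial average before warm-up are assumptions; the three overridden methods follow the `IndicatorState` interface above, which is repeated so the sketch runs on its own.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import Deque


class IndicatorState(ABC):
    """Interface repeated from the specification."""

    @abstractmethod
    def update(self, new_value: float) -> float: ...

    @abstractmethod
    def is_warmed_up(self) -> bool: ...

    @abstractmethod
    def reset(self) -> None: ...


class MovingAverageState(IndicatorState):
    """Sketch of a simple moving average with constant-time updates."""

    def __init__(self, period: int) -> None:
        if period <= 0:
            raise ValueError("period must be positive")
        self.period = period
        self._values: Deque[float] = deque(maxlen=period)
        self._sum = 0.0

    def update(self, new_value: float) -> float:
        if len(self._values) == self.period:
            # Subtract the value the deque is about to evict.
            self._sum -= self._values[0]
        self._values.append(new_value)
        self._sum += new_value
        # Before warm-up this is the average of the values seen so far.
        return self._sum / len(self._values)

    def is_warmed_up(self) -> bool:
        return len(self._values) == self.period

    def reset(self) -> None:
        self._values.clear()
        self._sum = 0.0
```

Callers that need reliable values should check `is_warmed_up()` before acting on the result of `update()`.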
### 5. Data Flow Architecture

#### Initialization Phase
```
1. Strategy.initialize(backtester)
2. Strategy._resample_data(original_data)
3. Strategy._initialize_indicator_states()
4. Strategy._warm_up_with_historical_data()
5. Strategy._calculation_mode = "incremental"
6. Strategy._is_warmed_up = True
```

#### Real-Time Processing Phase
```
1. New data arrives → StrategyManager.process_new_data()
2. StrategyManager → Strategy.calculate_on_data(new_point)
3. Strategy._update_timeframe_buffers()
4. Strategy._update_indicators_incrementally()
5. Strategy ready for get_entry_signal()/get_exit_signal()
```

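The manager side of the real-time phase might be sketched as follows. The `IncrementalStrategy` protocol, the `name` attribute, and the boolean return type of `get_entry_signal()` are assumptions made for the example; the specification does not define the signal types or the internals of `StrategyManager`.

```python
from typing import Any, Dict, List, Protocol


class IncrementalStrategy(Protocol):
    """Assumed minimal surface a strategy exposes to the manager."""

    name: str
    is_warmed_up: bool

    def calculate_on_data(self, new_data_point: Dict, timestamp: Any) -> None: ...

    def get_entry_signal(self) -> bool: ...


class StrategyManager:
    """Hypothetical manager that fans each data point out to all strategies."""

    def __init__(self, strategies: List[IncrementalStrategy]) -> None:
        self._strategies = strategies

    def process_new_data(self, new_data_point: Dict, timestamp: Any) -> Dict[str, bool]:
        signals: Dict[str, bool] = {}
        for strategy in self._strategies:
            # Steps 2-4: the strategy updates its buffers and indicators.
            strategy.calculate_on_data(new_data_point, timestamp)
            # Step 5: only warmed-up strategies are asked for a signal.
            if strategy.is_warmed_up:
                signals[strategy.name] = strategy.get_entry_signal()
        return signals
```

Strategies that are still warming up receive every data point but contribute no signal, which keeps their state current without exposing unreliable output.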
### 6. Performance Requirements

#### Memory Efficiency
- Maximum buffer size per timeframe: configurable (default: 200 periods)
- Use `collections.deque` with `maxlen` for automatic buffer management
- Store only essential state, not full calculation history

#### Processing Speed
- Target: <1ms per data point for incremental updates
- Target: <10ms for signal generation
- Batch processing support for multiple data points

#### Accuracy Requirements
- Incremental calculations must match batch calculations within 0.01% tolerance
- Indicator values must be identical to traditional calculation methods
- Signal timing must be preserved exactly

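The 0.01% tolerance can be checked mechanically by feeding the same series to both code paths. The helper below is a sketch with hypothetical names: `update` stands for any incremental indicator step, and `batch_fn` recomputes the value from the full prefix seen so far, which is what the batch path does today.

```python
import math
from typing import Callable, List, Optional, Sequence

TOLERANCE = 1e-4  # 0.01% expressed as a fraction


def check_incremental_matches_batch(
    update: Callable[[float], Optional[float]],
    batch_fn: Callable[[List[float]], Optional[float]],
    series: Sequence[float],
    rel_tol: float = TOLERANCE,
) -> bool:
    """Return True if both paths agree at every step within rel_tol."""
    seen: List[float] = []
    for value in series:
        seen.append(value)
        incremental = update(value)
        reference = batch_fn(seen)
        # Both paths must also agree on when a value becomes available.
        if (incremental is None) != (reference is None):
            return False
        if incremental is not None and not math.isclose(
            incremental, reference, rel_tol=rel_tol
        ):
            return False
    return True
```

Comparing at every step, rather than only at the end of the series, also covers the requirement that signal timing is preserved.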
### 7. Error Handling and Recovery

#### State Corruption Recovery
```python
def _validate_calculation_state(self) -> bool:
    """Validate internal calculation state consistency"""

def _recover_from_state_corruption(self) -> None:
    """Recover from corrupted calculation state"""
    # Reset to initialization mode
    # Recalculate from available buffer data
    # Resume incremental mode
```

#### Data Gap Handling
```python
def handle_data_gap(self, gap_duration: pd.Timedelta) -> None:
    """Handle gaps in data stream"""
    if gap_duration > self._max_acceptable_gap:
        self._trigger_reinitialization()
    else:
        self._interpolate_missing_data()
```

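`handle_data_gap()` receives a gap duration, so something upstream has to measure it. A possible detector is sketched below using the standard library's `datetime` and `timedelta` in place of the pandas types. `GapDetector` and its definition of a gap (elapsed time beyond the expected bar interval) are assumptions, not part of the specification.

```python
from datetime import datetime, timedelta
from typing import Optional


class GapDetector:
    """Hypothetical helper that classifies the spacing between bars."""

    def __init__(self, expected_interval: timedelta, max_acceptable_gap: timedelta) -> None:
        self.expected_interval = expected_interval
        self.max_acceptable_gap = max_acceptable_gap
        self._last_timestamp: Optional[datetime] = None

    def classify(self, timestamp: datetime) -> str:
        """Return 'ok', 'interpolate', or 'reinitialize' for the incoming bar."""
        previous = self._last_timestamp
        self._last_timestamp = timestamp
        if previous is None:
            return "ok"  # first bar: nothing to compare against
        # Gap = time missing beyond the normal spacing between two bars.
        gap = timestamp - previous - self.expected_interval
        if gap <= timedelta(0):
            return "ok"
        if gap > self.max_acceptable_gap:
            return "reinitialize"
        return "interpolate"
```

This sketch treats late or duplicate timestamps as "ok"; a production version would likely reject or log them separately.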
### 8. Backward Compatibility

#### Compatibility Layer
- Existing `initialize()` method continues to work
- New methods are optional with default implementations
- Gradual migration path for existing strategies
- Fallback to batch calculation if incremental not supported

#### Migration Strategy
1. Phase 1: Add new interface with default implementations
2. Phase 2: Implement incremental calculation for each strategy
3. Phase 3: Optimize and remove batch calculation fallbacks
4. Phase 4: Make incremental calculation mandatory

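A Phase 1 compatibility layer could look like the sketch below, where the new methods have non-abstract defaults and unmigrated strategies fall back to a full recalculation. The dispatcher `on_new_data()`, the placeholder `_run_batch()`, and the `batch_runs` counter are hypothetical names introduced only for this illustration.

```python
from typing import Any, Dict, List


class StrategyBase:
    """Hypothetical base class showing default implementations for Phase 1."""

    def __init__(self) -> None:
        self._history: List[Dict] = []
        self.batch_runs = 0  # counts full recalculations, for illustration

    def supports_incremental_calculation(self) -> bool:
        # Default: legacy strategies stay in batch mode until migrated.
        return False

    def calculate_on_data(self, new_data_point: Dict, timestamp: Any) -> None:
        raise NotImplementedError("strategy has not been migrated yet")

    def initialize(self, data: List[Dict]) -> None:
        # Existing entry point keeps working unchanged.
        self._history = list(data)
        self._run_batch()

    def _run_batch(self) -> None:
        # Placeholder for the existing full-dataset calculation.
        self.batch_runs += 1

    def on_new_data(self, new_data_point: Dict, timestamp: Any) -> None:
        if self.supports_incremental_calculation():
            self.calculate_on_data(new_data_point, timestamp)
        else:
            # Fallback: append and recompute everything, as before.
            self._history.append(new_data_point)
            self._run_batch()
```

A migrated strategy overrides the two new methods and never reaches the fallback branch, so Phase 3 can remove that branch once every strategy reports `True`.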
### 9. Testing Requirements

#### Unit Tests
- Test incremental vs. batch calculation accuracy
- Test state management and recovery
- Test buffer management and memory usage
- Test performance benchmarks

#### Integration Tests
- Test with real-time data streams
- Test strategy manager coordination
- Test error recovery scenarios
- Test memory usage over extended periods

#### Performance Tests
- Benchmark incremental vs. batch processing
- Memory usage profiling
- Latency measurements for signal generation
- Stress testing with high-frequency data

### 10. Configuration and Monitoring

#### Configuration Options
```python
STRATEGY_CONFIG = {
    "calculation_mode": "incremental",  # or "batch"
    "buffer_size_multiplier": 2.0,      # multiply minimum buffer size
    "max_acceptable_gap": "5min",       # max data gap before reinitialization
    "enable_state_validation": True,    # enable periodic state validation
    "performance_monitoring": True      # enable performance metrics
}
```

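Two of these options need a small amount of interpretation before use. The sketch below shows one way to apply `buffer_size_multiplier` to the result of `get_minimum_buffer_size()` and to turn the `max_acceptable_gap` string into a duration. Both helper names are hypothetical, and the parser handles only the `min` and `s` suffixes; with pandas available, `pd.Timedelta("5min")` would serve the same purpose.

```python
import math
from datetime import timedelta
from typing import Dict


def parse_gap(text: str) -> timedelta:
    """Parse strings such as '5min' or '30s' (only these two units)."""
    if text.endswith("min"):
        return timedelta(minutes=int(text[:-3]))
    if text.endswith("s"):
        return timedelta(seconds=int(text[:-1]))
    raise ValueError(f"unsupported gap format: {text!r}")


def effective_buffer_sizes(minimum: Dict[str, int], multiplier: float) -> Dict[str, int]:
    """Scale each minimum buffer size, rounding up to whole data points."""
    return {
        timeframe: math.ceil(points * multiplier)
        for timeframe, points in minimum.items()
    }
```

With the defaults above, a strategy that reports `{"15min": 50, "1min": 750}` would get buffers of 100 and 1500 points.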
#### Monitoring Metrics
- Calculation latency per strategy
- Memory usage per strategy
- State validation failures
- Data gap occurrences
- Signal generation frequency

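For the latency metric, a lightweight recorder built on `time.perf_counter()` is usually enough. `LatencyMonitor` below is a hypothetical sketch; it keeps every sample in memory, so a long-running deployment would want a bounded buffer or streaming statistics instead.

```python
import time
from collections import defaultdict
from typing import Callable, Dict, List


class LatencyMonitor:
    """Hypothetical per-strategy latency recorder (milliseconds)."""

    def __init__(self) -> None:
        self._samples_ms: Dict[str, List[float]] = defaultdict(list)

    def timed(self, strategy_name: str, fn: Callable[[], None]) -> None:
        """Run fn and record how long it took, even if it raises."""
        start = time.perf_counter()
        try:
            fn()
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            self._samples_ms[strategy_name].append(elapsed_ms)

    def summary(self) -> Dict[str, Dict[str, float]]:
        """Return count, mean, and max latency for each strategy."""
        return {
            name: {
                "count": float(len(samples)),
                "mean_ms": sum(samples) / len(samples),
                "max_ms": max(samples),
            }
            for name, samples in self._samples_ms.items()
        }
```

The `mean_ms` and `max_ms` figures can be compared directly against the <1ms and <10ms targets in the performance requirements.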
This specification provides the foundation for implementing efficient real-time strategy processing while maintaining accuracy and reliability.