# Technical Indicators Module

## Overview

The Technical Indicators module provides a **vectorized, DataFrame-centric** system for calculating technical analysis indicators. It is designed to handle sparse OHLCV data efficiently using pandas for high-performance calculations, making it ideal for real-time trading applications and chart visualization.

## Key Features

- **DataFrame-Centric Design**: All indicators return pandas DataFrames with timestamp index for easy alignment and plotting
- **Vectorized Calculations**: Leverages pandas and numpy for high-speed computation
- **Clean Output**: Returns only relevant indicator columns (e.g., `'sma'`, `'ema'`, `'rsi'`) with timestamp index
- **Safe Trading**: Proper warm-up periods ensure no early/invalid values are returned
- **Gap Handling**: Maintains timestamp alignment without interpolation for trading integrity
- **Modular Architecture**: Clear separation between calculation logic and result formatting

## Architecture

### Package Structure
```
data/common/indicators/
├── __init__.py           # Package exports
├── technical.py          # Main facade class
├── base.py               # Base indicator class
├── result.py             # Result container class (legacy)
├── utils.py              # Utility functions
└── implementations/      # Individual indicator implementations
    ├── __init__.py
    ├── sma.py            # Simple Moving Average
    ├── ema.py            # Exponential Moving Average
    ├── rsi.py            # Relative Strength Index
    ├── macd.py           # MACD
    └── bollinger.py      # Bollinger Bands
```

### Key Components

#### 1. Base Classes
- **BaseIndicator**: Abstract base class providing common functionality
  - Data preparation with timestamp handling
  - Validation and error handling
  - Logging support

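
The responsibilities listed above can be pictured with a minimal sketch. This is not the contents of `base.py`: the method names `prepare_dataframe` and `validate`, and the constructor signature, are assumptions made for illustration only.

```python
import logging
from abc import ABC, abstractmethod
from typing import Optional

import pandas as pd


class BaseIndicator(ABC):
    """Hypothetical base class; method names are illustrative assumptions."""

    def __init__(self, logger: Optional[logging.Logger] = None) -> None:
        # Logging support: fall back to a module-level logger.
        self.logger = logger or logging.getLogger(__name__)

    def prepare_dataframe(self, df: pd.DataFrame) -> pd.DataFrame:
        """Timestamp handling: require a DatetimeIndex and sort by it."""
        if not isinstance(df.index, pd.DatetimeIndex):
            raise TypeError("expected a DataFrame indexed by timestamp")
        return df.sort_index()

    def validate(self, df: pd.DataFrame, period: int, price_column: str) -> bool:
        """Validation: check the column exists and enough rows are available."""
        if price_column not in df.columns:
            self.logger.warning("missing price column %r", price_column)
            return False
        if period < 1 or len(df) < period:
            self.logger.warning("need at least %d rows, got %d", period, len(df))
            return False
        return True

    @abstractmethod
    def calculate(self, df: pd.DataFrame, **kwargs) -> pd.DataFrame:
        """Return a DataFrame of indicator values indexed by timestamp."""
```

Keeping preparation and validation in the base class means each concrete indicator only has to supply its own `calculate` logic.
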
#### 2. Individual Indicators
Each indicator is implemented as a separate class inheriting from `BaseIndicator`:
- **Vectorized calculations** using pandas operations
- **Clean DataFrame output** with only relevant columns
- **Proper warm-up periods** for safe trading
- **Independent testing** and maintenance

#### 3. TechnicalIndicators Facade
Main entry point providing:
- Unified DataFrame-based interface
- Batch calculations
- Consistent error handling
- Data preparation utilities

## Supported Indicators

### Simple Moving Average (SMA)
```python
from data.common.indicators import TechnicalIndicators

indicators = TechnicalIndicators()
result_df = indicators.sma(df, period=20, price_column='close')
# Returns DataFrame with columns: ['sma'], indexed by timestamp
```
- **Parameters**:
  - `period`: Number of periods (default: 20)
  - `price_column`: Column to average (default: 'close')
- **Returns**: DataFrame with `'sma'` column, indexed by timestamp
- **Warm-up**: First `period-1` values are excluded for safety

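
For intuition, the behaviour described above can be reproduced with plain pandas. The function `sma_sketch` below is a hypothetical stand-in rather than the module's implementation; it assumes a count-based rolling window and drops the first `period-1` rows.

```python
import pandas as pd


def sma_sketch(df: pd.DataFrame, period: int = 20, price_column: str = "close") -> pd.DataFrame:
    """Rolling mean with the warm-up rows removed (illustrative only)."""
    if period < 1 or price_column not in df.columns or len(df) < period:
        return pd.DataFrame()  # insufficient data: empty DataFrame
    sma = df[price_column].rolling(window=period, min_periods=period).mean()
    # The first period-1 rows have no complete window, so they are excluded.
    return sma.iloc[period - 1:].to_frame("sma")


idx = pd.date_range("2024-01-01 09:00", periods=5, freq="min")
prices = pd.DataFrame({"close": [10.0, 11.0, 12.0, 13.0, 14.0]}, index=idx)
print(sma_sketch(prices, period=3))
```

With five input rows and `period=3`, the result has three rows, starting at the third timestamp.
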
### Exponential Moving Average (EMA)
```python
result_df = indicators.ema(df, period=12, price_column='close')
# Returns DataFrame with columns: ['ema'], indexed by timestamp
```
- **Parameters**:
  - `period`: Number of periods (default: 20)
  - `price_column`: Column to average (default: 'close')
- **Returns**: DataFrame with `'ema'` column, indexed by timestamp
- **Warm-up**: First `period-1` values are excluded for safety

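
A rough equivalent in plain pandas is sketched below. It assumes the conventional smoothing factor `alpha = 2 / (period + 1)` (pandas `span=period`) and the recursive form (`adjust=False`); whether the module uses exactly these settings is not stated here, and `ema_sketch` is a hypothetical name.

```python
import pandas as pd


def ema_sketch(df: pd.DataFrame, period: int = 20, price_column: str = "close") -> pd.DataFrame:
    """Exponentially weighted mean with the warm-up rows removed (illustrative only)."""
    if period < 1 or price_column not in df.columns or len(df) < period:
        return pd.DataFrame()
    # span=period gives alpha = 2 / (period + 1); adjust=False uses the
    # recursive update ema[t] = alpha * price[t] + (1 - alpha) * ema[t-1].
    ema = df[price_column].ewm(span=period, adjust=False, min_periods=period).mean()
    return ema.iloc[period - 1:].to_frame("ema")


idx = pd.date_range("2024-01-01 09:00", periods=5, freq="min")
prices = pd.DataFrame({"close": [10.0, 11.0, 12.0, 13.0, 14.0]}, index=idx)
print(ema_sketch(prices, period=3))
```
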
### Relative Strength Index (RSI)
```python
result_df = indicators.rsi(df, period=14, price_column='close')
# Returns DataFrame with columns: ['rsi'], indexed by timestamp
```
- **Parameters**:
  - `period`: Number of periods (default: 14)
  - `price_column`: Column to analyze (default: 'close')
- **Returns**: DataFrame with `'rsi'` column, indexed by timestamp
- **Warm-up**: First `period-1` values are excluded for safety

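
The sketch below shows one common way to compute RSI with pandas, using Wilder-style smoothing (`alpha = 1 / period`). It is an assumption that the module uses this variant, and `rsi_sketch` is a hypothetical name. Note that this particular variant needs `period` price changes before its first value, so it drops the first `period` rows, which may differ from the warm-up rule stated above.

```python
import pandas as pd


def rsi_sketch(df: pd.DataFrame, period: int = 14, price_column: str = "close") -> pd.DataFrame:
    """Wilder-smoothed RSI (illustrative only)."""
    if period < 1 or price_column not in df.columns or len(df) <= period:
        return pd.DataFrame()
    delta = df[price_column].diff()      # first value is NaN
    gain = delta.clip(lower=0)           # upward moves, else 0
    loss = (-delta).clip(lower=0)        # downward moves as positive numbers
    avg_gain = gain.ewm(alpha=1 / period, adjust=False, min_periods=period).mean()
    avg_loss = loss.ewm(alpha=1 / period, adjust=False, min_periods=period).mean()
    # Equivalent to 100 - 100 / (1 + avg_gain / avg_loss), written without
    # an explicit division by avg_loss.
    rsi = 100 * avg_gain / (avg_gain + avg_loss)
    return rsi.iloc[period:].to_frame("rsi")


idx = pd.date_range("2024-01-01 09:00", periods=20, freq="min")
rising = pd.DataFrame({"close": [100.0 + i for i in range(20)]}, index=idx)
print(rsi_sketch(rising, period=14))
```

A strictly rising series has no losses, so every value comes out at 100; a strictly falling series gives 0.
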
### Moving Average Convergence Divergence (MACD)
```python
result_df = indicators.macd(
    df,
    fast_period=12,
    slow_period=26,
    signal_period=9,
    price_column='close'
)
# Returns DataFrame with columns: ['macd', 'signal', 'histogram'], indexed by timestamp
```
- **Parameters**:
  - `fast_period`: Fast EMA period (default: 12)
  - `slow_period`: Slow EMA period (default: 26)
  - `signal_period`: Signal line period (default: 9)
  - `price_column`: Column to analyze (default: 'close')
- **Returns**: DataFrame with `'macd'`, `'signal'`, `'histogram'` columns, indexed by timestamp
- **Warm-up**: First `max(slow_period, signal_period)-1` values are excluded for safety

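
As a rough illustration of how the three columns relate, the following sketch computes them with pandas `ewm`. It assumes the standard definitions (MACD line = fast EMA minus slow EMA, signal = EMA of the MACD line, histogram = MACD minus signal) and applies the warm-up rule stated above; `macd_sketch` is a hypothetical name, not the module's code.

```python
import pandas as pd


def macd_sketch(
    df: pd.DataFrame,
    fast_period: int = 12,
    slow_period: int = 26,
    signal_period: int = 9,
    price_column: str = "close",
) -> pd.DataFrame:
    """MACD line, signal line and histogram (illustrative only)."""
    warmup = max(slow_period, signal_period) - 1
    if price_column not in df.columns or len(df) <= warmup:
        return pd.DataFrame()
    close = df[price_column]
    fast = close.ewm(span=fast_period, adjust=False).mean()
    slow = close.ewm(span=slow_period, adjust=False).mean()
    macd = fast - slow
    signal = macd.ewm(span=signal_period, adjust=False).mean()
    out = pd.DataFrame({"macd": macd, "signal": signal, "histogram": macd - signal})
    return out.iloc[warmup:]  # drop the warm-up rows


idx = pd.date_range("2024-01-01 09:00", periods=40, freq="min")
prices = pd.DataFrame({"close": [100.0 + i for i in range(40)]}, index=idx)
print(macd_sketch(prices).tail())
```
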
### Bollinger Bands
```python
result_df = indicators.bollinger_bands(
    df,
    period=20,
    std_dev=2.0,
    price_column='close'
)
# Returns DataFrame with columns: ['upper_band', 'middle_band', 'lower_band'], indexed by timestamp
```
- **Parameters**:
  - `period`: SMA period (default: 20)
  - `std_dev`: Standard deviation multiplier (default: 2.0)
  - `price_column`: Column to analyze (default: 'close')
- **Returns**: DataFrame with `'upper_band'`, `'middle_band'`, `'lower_band'` columns, indexed by timestamp
- **Warm-up**: First `period-1` values are excluded for safety

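
A minimal pandas sketch of the band construction follows. It assumes the middle band is the SMA and the outer bands sit `std_dev` rolling standard deviations away from it. pandas' `rolling().std()` uses the sample standard deviation (`ddof=1`) by default; some implementations use the population form instead, and which one the module uses is not stated here. `bollinger_sketch` is a hypothetical name.

```python
import pandas as pd


def bollinger_sketch(
    df: pd.DataFrame,
    period: int = 20,
    std_dev: float = 2.0,
    price_column: str = "close",
) -> pd.DataFrame:
    """Upper, middle and lower bands (illustrative only)."""
    if period < 2 or price_column not in df.columns or len(df) < period:
        return pd.DataFrame()
    window = df[price_column].rolling(window=period, min_periods=period)
    middle = window.mean()
    spread = std_dev * window.std()  # sample standard deviation (ddof=1)
    bands = pd.DataFrame(
        {
            "upper_band": middle + spread,
            "middle_band": middle,
            "lower_band": middle - spread,
        }
    )
    return bands.iloc[period - 1:]  # drop the warm-up rows


idx = pd.date_range("2024-01-01 09:00", periods=5, freq="min")
prices = pd.DataFrame({"close": [1.0, 2.0, 3.0, 4.0, 5.0]}, index=idx)
print(bollinger_sketch(prices, period=5))
```
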
## Usage Examples

### Basic Usage with DataFrame Output
```python
from data.common.indicators import TechnicalIndicators

# Initialize calculator
indicators = TechnicalIndicators(logger=my_logger)

# Calculate single indicator - returns DataFrame
sma_df = indicators.sma(df, period=20)

# Access results using DataFrame operations
print(f"First SMA value: {sma_df['sma'].iloc[0]}")
print(f"Latest SMA value: {sma_df['sma'].iloc[-1]}")
print(f"All SMA values: {sma_df['sma'].tolist()}")

# Plotting integration
import plotly.graph_objects as go
fig = go.Figure()
fig.add_trace(go.Scatter(
    x=sma_df.index,
    y=sma_df['sma'],
    name='SMA 20',
    line=dict(color='blue')
))
```

### Using the Dynamic `calculate` Method
```python
# Calculate any indicator by type name
rsi_df = indicators.calculate('rsi', df, period=14)
if rsi_df is not None and not rsi_df.empty:
    print(f"RSI range: {rsi_df['rsi'].min():.2f} - {rsi_df['rsi'].max():.2f}")

# MACD with custom parameters
macd_df = indicators.calculate('macd', df, fast_period=10, slow_period=30, signal_period=8)
if macd_df is not None and not macd_df.empty:
    print(f"MACD signal line: {macd_df['signal'].iloc[-1]:.4f}")
```

### Batch Calculations
```python
# Configure multiple indicators
config = {
    'sma_20': {'type': 'sma', 'period': 20},
    'ema_12': {'type': 'ema', 'period': 12},
    'rsi_14': {'type': 'rsi', 'period': 14},
    'macd': {
        'type': 'macd',
        'fast_period': 12,
        'slow_period': 26,
        'signal_period': 9
    }
}

# Calculate all at once - returns dict of DataFrames
results = indicators.calculate_multiple_indicators(df, config)

# Access individual results
sma_df = results['sma_20']   # DataFrame with 'sma' column
ema_df = results['ema_12']   # DataFrame with 'ema' column
rsi_df = results['rsi_14']   # DataFrame with 'rsi' column
macd_df = results['macd']    # DataFrame with 'macd', 'signal', 'histogram' columns
```

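
Because every result shares the timestamp index, a dict of results like the one above can be merged into a single frame for charting. The helper below is a hypothetical sketch, not part of the module; it prefixes each column with its config key and skips empty or `None` entries. The sample frames stand in for real indicator output.

```python
from typing import Dict, Optional

import pandas as pd


def combine_results(results: Dict[str, Optional[pd.DataFrame]]) -> pd.DataFrame:
    """Join indicator frames on their timestamp index (illustrative only)."""
    frames = []
    for name, frame in results.items():
        if frame is None or frame.empty:
            continue  # skip indicators that produced no values
        frames.append(frame.add_prefix(f"{name}_"))
    if not frames:
        return pd.DataFrame()
    # Outer join on the index: rows still in an indicator's warm-up are NaN.
    return pd.concat(frames, axis=1).sort_index()


idx = pd.date_range("2024-01-01 09:00", periods=5, freq="min")
sample = {
    "sma_3": pd.DataFrame({"sma": [11.0, 12.0, 13.0]}, index=idx[2:]),
    "rsi_4": pd.DataFrame({"rsi": [55.0, 60.0]}, index=idx[3:]),
    "macd": pd.DataFrame(),  # e.g. insufficient data
}
combined = combine_results(sample)
print(combined)
```
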
### Working with Different Price Columns
```python
# Calculate SMA on the 'high' price
sma_high_df = indicators.sma(df, period=20, price_column='high')

# Calculate RSI on the 'open' price
rsi_open_df = indicators.calculate('rsi', df, period=14, price_column='open')

# All results are DataFrames with the same structure
assert 'sma' in sma_high_df.columns
assert 'rsi' in rsi_open_df.columns
```

## Data Handling and Best Practices

### DataFrame Preparation
```python
from components.charts.utils import prepare_chart_data

# Prepare DataFrame from candle data
df = prepare_chart_data(candles)
# df has columns: ['open', 'high', 'low', 'close', 'volume'] with DatetimeIndex
```

### Gap Handling
The system handles data gaps naturally:
- **No interpolation**: Gaps in timestamps are preserved
- **Rolling calculations**: Use only available data points
- **Safe trading**: No artificial data is introduced

```python
# Example: If you have gaps in your data
# 09:00, 09:01, 09:02, 09:04, 09:05 (missing 09:03)
# The indicators will calculate correctly using available data
# No interpolation or filling of gaps
```

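
The gap scenario above can be reproduced directly in pandas. This sketch assumes a count-based rolling window (the last N available rows, regardless of how far apart they are in time), which matches "use only available data points" but is an assumption about the module's internals.

```python
import pandas as pd

# Minute bars with 09:03 missing.
idx = pd.DatetimeIndex(
    [
        "2024-01-01 09:00",
        "2024-01-01 09:01",
        "2024-01-01 09:02",
        "2024-01-01 09:04",
        "2024-01-01 09:05",
    ]
)
close = pd.Series([10.0, 11.0, 12.0, 14.0, 15.0], index=idx, name="close")

# A window of 3 rows: the value at 09:04 averages 09:01, 09:02 and 09:04.
sma = close.rolling(window=3, min_periods=3).mean().dropna()
print(sma)

# The missing timestamp is neither filled in nor interpolated.
print(pd.Timestamp("2024-01-01 09:03") in sma.index)
```
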
### Warm-up Periods
All indicators implement proper warm-up periods for safe trading:
- **SMA/EMA/RSI/BB**: First `period-1` values excluded
- **MACD**: First `max(slow_period, signal_period)-1` values excluded
- **Result**: Only reliable, fully-calculated values are returned

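
These rules make it possible to work out how many rows to expect back, or how much history to load, before calling an indicator. The helpers below simply encode the two rules listed above; `warmup_rows` and `expected_rows` are hypothetical names, not part of the module.

```python
def warmup_rows(indicator_type: str, **params: int) -> int:
    """Number of leading rows excluded, per the warm-up rules above."""
    if indicator_type == "macd":
        return max(params["slow_period"], params["signal_period"]) - 1
    # SMA, EMA, RSI and Bollinger Bands all use period - 1.
    return params["period"] - 1


def expected_rows(n_input: int, indicator_type: str, **params: int) -> int:
    """Rows left after warm-up; 0 means the input is too short."""
    return max(n_input - warmup_rows(indicator_type, **params), 0)


print(warmup_rows("sma", period=20))
print(warmup_rows("macd", fast_period=12, slow_period=26, signal_period=9))
print(expected_rows(100, "sma", period=20))
```
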
### Error Handling
```python
try:
    result_df = indicators.rsi(df, period=14)
    if result_df is not None and not result_df.empty:
        # Process results
        pass
    else:
        # Handle insufficient data
        logger.warning("Insufficient data for RSI calculation")
except Exception as e:
    logger.error(f"RSI calculation failed: {e}")
    # Handle calculation errors
```

## Performance Considerations

1. **Vectorized Operations**
   - Uses pandas rolling/ewm functions for maximum performance
   - Minimal data copying and transformations
   - Efficient memory usage

2. **DataFrame Alignment**
   - Timestamp index ensures proper alignment with price data
   - Easy integration with plotting libraries
   - Consistent data structure across all indicators

3. **Memory Efficiency**
   - Returns only necessary columns
   - No metadata overhead in result DataFrames
   - Clean, minimal output format

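
To make the first point concrete, the sketch below computes the same moving average twice: once with an explicit Python loop and once with `rolling().mean()`. Both give the same numbers; the vectorized form pushes the per-row work into pandas instead of the interpreter. No timings are claimed here, so measure with `timeit` on your own data. The helper `sma_loop` is hypothetical and exists only for comparison.

```python
from typing import List

import numpy as np
import pandas as pd


def sma_loop(values: List[float], period: int) -> List[float]:
    """Plain-Python moving average, one window at a time."""
    out = [float("nan")] * len(values)
    for i in range(period - 1, len(values)):
        out[i] = sum(values[i - period + 1 : i + 1]) / period
    return out


rng = np.random.default_rng(seed=0)
prices = pd.Series(100 + rng.standard_normal(500).cumsum())

looped = sma_loop(prices.tolist(), period=20)
vectorized = prices.rolling(window=20).mean()

# Same values, including the NaN warm-up rows.
print(np.allclose(looped, vectorized, equal_nan=True))
```
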
## Testing

The module includes comprehensive tests for the new DataFrame-based approach:
- Unit tests for each indicator's DataFrame output
- Integration tests for the facade
- Edge case handling (gaps, insufficient data)
- Performance benchmarks

Run tests with:
```bash
uv run pytest tests/test_indicators.py
uv run pytest tests/test_indicators_safety.py
```

## Migration from Legacy Format

If you were using the old `List[IndicatorResult]` format:

### Old Style:
```python
results = indicators.sma(df, period=20)
for result in results:
    print(f"Time: {result.timestamp}, SMA: {result.values['sma']}")
```

### New Style:
```python
result_df = indicators.sma(df, period=20)
for timestamp, row in result_df.iterrows():
    print(f"Time: {timestamp}, SMA: {row['sma']}")
```

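
If stored legacy results still need to be consumed by DataFrame-based code, a small adapter can bridge the two formats. The sketch below is hypothetical: `LegacyResult` is a stand-in that models only the `timestamp` and `values` attributes used in the old-style example, and `legacy_to_frame` is not part of the module.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List

import pandas as pd


@dataclass
class LegacyResult:
    """Stand-in for the legacy IndicatorResult (assumed attributes only)."""

    timestamp: datetime
    values: Dict[str, float]


def legacy_to_frame(results: List[LegacyResult]) -> pd.DataFrame:
    """Convert a list of legacy results into a timestamp-indexed DataFrame."""
    if not results:
        return pd.DataFrame()
    index = pd.DatetimeIndex([r.timestamp for r in results], name="timestamp")
    return pd.DataFrame([r.values for r in results], index=index)


legacy = [
    LegacyResult(datetime(2024, 1, 1, 9, 0), {"sma": 101.5}),
    LegacyResult(datetime(2024, 1, 1, 9, 1), {"sma": 101.7}),
]
print(legacy_to_frame(legacy))
```
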
## Contributing

When adding new indicators:
1. Create a new class in `implementations/`
2. Inherit from `BaseIndicator`
3. Implement the `calculate` method to return a DataFrame
4. Ensure proper warm-up periods
5. Add comprehensive tests
6. Update documentation

See [Adding New Indicators](./adding-new-indicators.md) for detailed instructions.

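
Steps 2 to 4 might look roughly like the sketch below. Everything in it is illustrative: the `BaseIndicator` shown is a minimal stand-in for the real class in `base.py`, and `RollingVolatilityIndicator` is a made-up indicator (rolling standard deviation) chosen only because it is short.

```python
from abc import ABC, abstractmethod

import pandas as pd


class BaseIndicator(ABC):
    """Minimal stand-in for the real base class (assumption)."""

    @abstractmethod
    def calculate(self, df: pd.DataFrame, **kwargs) -> pd.DataFrame:
        ...


class RollingVolatilityIndicator(BaseIndicator):
    """Hypothetical indicator: rolling standard deviation of a price column."""

    def calculate(
        self,
        df: pd.DataFrame,
        period: int = 20,
        price_column: str = "close",
        **kwargs,
    ) -> pd.DataFrame:
        # Insufficient data: return an empty DataFrame, as the other indicators do.
        if period < 2 or price_column not in df.columns or len(df) < period:
            return pd.DataFrame()
        vol = df[price_column].rolling(window=period, min_periods=period).std()
        # Warm-up: exclude the first period-1 rows.
        return vol.iloc[period - 1:].to_frame("volatility")


idx = pd.date_range("2024-01-01 09:00", periods=5, freq="min")
prices = pd.DataFrame({"close": [1.0, 2.0, 3.0, 4.0, 5.0]}, index=idx)
print(RollingVolatilityIndicator().calculate(prices, period=5))
```
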
## API Reference

### TechnicalIndicators Class

```python
from typing import Any, Dict, Optional

import pandas as pd


class TechnicalIndicators:
    # Signatures only; method bodies are omitted.
    def sma(self, df: pd.DataFrame, period: int, price_column: str = 'close') -> pd.DataFrame: ...
    def ema(self, df: pd.DataFrame, period: int, price_column: str = 'close') -> pd.DataFrame: ...
    def rsi(self, df: pd.DataFrame, period: int = 14, price_column: str = 'close') -> pd.DataFrame: ...
    def macd(self, df: pd.DataFrame, fast_period: int = 12, slow_period: int = 26,
             signal_period: int = 9, price_column: str = 'close') -> pd.DataFrame: ...
    def bollinger_bands(self, df: pd.DataFrame, period: int = 20, std_dev: float = 2.0,
                        price_column: str = 'close') -> pd.DataFrame: ...
    def calculate(self, indicator_type: str, df: pd.DataFrame, **kwargs) -> Optional[pd.DataFrame]: ...
    def calculate_multiple_indicators(self, df: pd.DataFrame,
                                      indicators_config: Dict[str, Dict[str, Any]]) -> Dict[str, pd.DataFrame]: ...
```

### Return Format
All methods return:
- **Success**: `pd.DataFrame` with indicator column(s) and DatetimeIndex
- **Failure/Insufficient Data**: `pd.DataFrame()` (empty DataFrame)
- **Error**: `None` (with logged error)

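
Since a caller may see a populated frame, an empty frame, or `None`, it can help to funnel all three through one check. The helpers below are a hypothetical convenience, not part of the module.

```python
from typing import Optional

import pandas as pd


def has_values(result: Optional[pd.DataFrame]) -> bool:
    """True only for a non-empty DataFrame (covers both None and empty)."""
    return result is not None and not result.empty


def latest_value(result: Optional[pd.DataFrame], column: str) -> Optional[float]:
    """Most recent value of `column`, or None if nothing usable was returned."""
    if not has_values(result) or column not in result.columns:
        return None
    return float(result[column].iloc[-1])


idx = pd.date_range("2024-01-01 09:00", periods=2, freq="min")
ok = pd.DataFrame({"rsi": [48.0, 52.5]}, index=idx)

print(latest_value(ok, "rsi"))
print(latest_value(pd.DataFrame(), "rsi"))
print(latest_value(None, "rsi"))
```
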
### DataFrame Structure
```python
# Example SMA result
result_df = indicators.sma(df, period=20)
# result_df.index: DatetimeIndex (timestamps)
# result_df.columns: ['sma']
# result_df.shape: (N, 1) where N = len(df) - period + 1 (after warm-up)
```