Add daily model training scripts and terminal UI for live trading

- Introduced `train_daily.sh` for automating daily model retraining, including data download and model training steps.
- Added `install_cron.sh` for setting up a cron job to run the daily training script.
- Created `setup_schedule.sh` for configuring Systemd timers for daily training tasks.
- Implemented a terminal UI using Rich for real-time monitoring of trading performance, including metrics display and log handling.
- Updated `pyproject.toml` to include the `rich` dependency for UI functionality.
- Enhanced `.gitignore` to exclude model and log files.
- Added database support for trade persistence and metrics calculation.
- Updated README with installation and usage instructions for the new features.
2026-01-18 11:08:57 +08:00
parent 35992ee374
commit b5550f4ff4
27 changed files with 3582 additions and 113 deletions
--- a/README.md
+++ b/README.md
@@ -1,82 +1,262 @@
-### lowkey_backtest — Supertrend Backtester
+# Lowkey Backtest

-### Overview
-Backtest a simple, long-only strategy driven by a meta Supertrend signal on aggregated OHLCV data. The script:
- Loads 1-minute BTC/USD data from `../data/btcusd_1-min_data.csv`
- Aggregates to multiple timeframes (e.g., `5min`, `15min`, `30min`, `1h`, `4h`, `1d`)
- Computes three Supertrend variants and creates a meta signal when all agree
- Executes entries/exits at the aggregated bar open price
- Applies OKX spot fee assumptions (taker by default)
- Evaluates stop-loss using intra-bar 1-minute data
- Writes detailed trade logs and a summary CSV
+A backtesting framework supporting multiple market types (spot, perpetual) with realistic trading simulation including leverage, funding, and shorts.
+
+## Requirements

-### Requirements
 - Python 3.12+
- Dependencies: `pandas`, `numpy`, `ta`
- Package management: `uv`
+- Package manager: `uv`

-Install dependencies with uv:
+## Installation

 ```bash
 uv sync
-# If a dependency is missing, add it explicitly and sync
-uv add pandas numpy ta
-uv sync
 ```

-### Data
- Expected CSV location: `../data/btcusd_1-min_data.csv` (relative to the repo root)
- Required columns: `Timestamp`, `Open`, `High`, `Low`, `Close`, `Volume`
- `Timestamp` should be UNIX seconds; zero-volume rows are ignored
+## Quick Reference

-### Quickstart
-Run the backtest with defaults:
+| Command | Description |
+|---------|-------------|
+| `uv run python main.py download -p BTC-USDT` | Download data |
+| `uv run python main.py backtest -s meta_st -p BTC-USDT` | Run backtest |
+| `uv run python main.py wfa -s regime -p BTC-USDT` | Walk-forward analysis |
+| `uv run python train_model.py --download` | Train/retrain ML model |
+| `uv run python research/regime_detection.py` | Run research script |
+
+---
+
+## Backtest CLI
+
+The main entry point is `main.py`, which provides three commands: `download`, `backtest`, and `wfa`.
+
+### Download Data
+
+Download historical OHLCV data from exchanges.

 ```bash
-uv run python main.py
+uv run python main.py download -p BTC-USDT -t 1h
 ```

-Outputs:
- Per-run trade logs in `backtest_logs/` named like `trade_log_<TIMEFRAME>_sl<STOPLOSS>.csv`
- Run-level summary in `backtest_summary.csv`
+**Options:**
+- `-p, --pair` (required): Trading pair (e.g., `BTC-USDT`, `ETH-USDT`)
+- `-t, --timeframe`: Timeframe (default: `1m`)
+- `-e, --exchange`: Exchange (default: `okx`)
+- `-m, --market`: Market type: `spot` or `perpetual` (default: `spot`)
+- `--start`: Start date in `YYYY-MM-DD` format

-### Configuring a Run
-Adjust parameters directly in `main.py`:
- Date range (in `load_data`): `load_data('2021-11-01', '2024-10-16')`
- Timeframes to test (any subset of `"5min", "15min", "30min", "1h", "4h", "1d"`):
-  - `timeframes = ["5min", "15min", "30min", "1h", "4h", "1d"]`
- Stop-loss percentages:
-  - `stoplosses = [0.03, 0.05, 0.1]`
- Supertrend settings (in `add_supertrend_indicators`): `(period, multiplier)` pairs `(12, 3.0)`, `(10, 1.0)`, `(11, 2.0)`
- Fee model (in `calculate_okx_taker_maker_fee`): taker `0.0010`, maker `0.0008`
+**Examples:**
+```bash
+# Download 1-hour spot data
+uv run python main.py download -p ETH-USDT -t 1h

-### What the Backtester Does
- Aggregation: Resamples 1-minute data to your selected timeframe using OHLCV rules
- Supertrend signals: Computes three Supertrends and derives a meta trend of `+1` (bullish) or `-1` (bearish) when all agree; otherwise `0`
- Trade logic (long-only):
-  - Entry when the meta trend changes to bullish; uses aggregated bar open price
-  - Exit when the meta trend changes to bearish; uses aggregated bar open price
-  - Stop-loss: For each aggregated bar, scans corresponding 1-minute closes to detect stop-loss and exits using a realistic fill (threshold or next 1-minute open if gapped)
- Performance metrics: total return, max drawdown, Sharpe (daily, factor 252), win rate, number of trades, final/initial equity, and total fees
-
-### Important: Lookahead Bias Note
-The current implementation uses the meta Supertrend signal of the same bar for entries/exits, which introduces lookahead bias. To avoid this, lag the signal by one bar inside `backtest()` in `main.py`:
-
-```python
-# Replace the current line
-meta_trend_signal = meta_trend
-
-# With a one-bar lag to remove lookahead
-# meta_trend_signal = np.roll(meta_trend, 1)
-# meta_trend_signal[0] = 0
+# Download perpetual data from a specific date
+uv run python main.py download -p BTC-USDT -m perpetual --start 2024-01-01
 ```

-### Outputs
- `backtest_logs/trade_log_<TIMEFRAME>_sl<STOPLOSS>.csv`: trade-by-trade records including type (`buy`, `sell`, `stop_loss`, `forced_close`), timestamps, prices, balances, PnL, and fees
- `backtest_summary.csv`: one row per (timeframe, stop-loss) combination with `timeframe`, `stop_loss`, `total_return`, `max_drawdown`, `sharpe_ratio`, `win_rate`, `num_trades`, `final_equity`, `initial_equity`, `num_stop_losses`, `total_fees`
+### Run Backtest

-### Troubleshooting
- CSV not found: Ensure the dataset is located at `../data/btcusd_1-min_data.csv`
- Missing packages: Run `uv add pandas numpy ta` then `uv sync`
- Memory/performance: Large date ranges on 1-minute data can be heavy; narrow the date span or test fewer timeframes
+Run a backtest with a specific strategy.

+```bash
+uv run python main.py backtest -s <strategy> -p <pair> [options]
+```
+
+**Available Strategies:**
+- `meta_st` - Meta Supertrend (triple supertrend consensus)
+- `regime` - Regime Reversion (ML-based spread trading)
+- `rsi` - RSI overbought/oversold
+- `macross` - Moving Average Crossover
+
+**Options:**
+- `-s, --strategy` (required): Strategy name
+- `-p, --pair` (required): Trading pair
+- `-t, --timeframe`: Timeframe (default: `1m`)
+- `--start`: Start date
+- `--end`: End date
+- `-g, --grid`: Run grid search optimization
+- `--plot`: Show equity curve plot
+- `--sl`: Stop loss percentage
+- `--tp`: Take profit percentage
+- `--trail`: Enable trailing stop
+- `--fees`: Override fee rate
+- `--slippage`: Slippage (default: `0.001`)
+- `-l, --leverage`: Leverage multiplier
+
+**Examples:**
+```bash
+# Basic backtest with Meta Supertrend
+uv run python main.py backtest -s meta_st -p BTC-USDT -t 1h
+
+# Backtest with date range and plot
+uv run python main.py backtest -s meta_st -p BTC-USDT --start 2024-01-01 --end 2024-12-31 --plot
+
+# Grid search optimization
+uv run python main.py backtest -s meta_st -p BTC-USDT -t 4h -g
+
+# Backtest with risk parameters
+uv run python main.py backtest -s meta_st -p BTC-USDT --sl 0.05 --tp 0.10 --trail
+
+# Regime strategy on ETH/BTC spread
+uv run python main.py backtest -s regime -p ETH-USDT -t 1h
+```
+
+### Walk-Forward Analysis (WFA)
+
+Run walk-forward optimization to reduce the risk of overfitting.
+
+```bash
+uv run python main.py wfa -s <strategy> -p <pair> [options]
+```
+
+**Options:**
+- `-s, --strategy` (required): Strategy name
+- `-p, --pair` (required): Trading pair
+- `-t, --timeframe`: Timeframe (default: `1d`)
+- `-w, --windows`: Number of walk-forward windows (default: `10`)
+- `--plot`: Show WFA results plot
+
+**Examples:**
+```bash
+# Walk-forward analysis with 10 windows
+uv run python main.py wfa -s meta_st -p BTC-USDT -t 1d -w 10
+
+# WFA with plot output
+uv run python main.py wfa -s regime -p ETH-USDT --plot
+```
+
+---
+
+## Research Scripts
+
+Scripts for experimental analysis are located in the `research/` directory.
+
+### Regime Detection Research
+
+Tests multiple holding horizons for the regime reversion strategy using walk-forward training.
+
+```bash
+uv run python research/regime_detection.py
+```
+
+**Options:**
+- `--days DAYS`: Number of days of historical data (default: 90)
+- `--start DATE`: Start date (YYYY-MM-DD), overrides `--days`
+- `--end DATE`: End date (YYYY-MM-DD), defaults to now
+- `--output PATH`: Output CSV path
+
+**Examples:**
+```bash
+# Use last 90 days (default)
+uv run python research/regime_detection.py
+
+# Use last 180 days
+uv run python research/regime_detection.py --days 180
+
+# Specific date range
+uv run python research/regime_detection.py --start 2025-07-01 --end 2025-12-31
+```
+
+**What it does:**
+- Loads BTC and ETH hourly data
+- Calculates spread features (Z-score, RSI, volume ratios)
+- Trains RandomForest classifier with walk-forward methodology
+- Tests horizons from 6h to 150h
+- Outputs best parameters by F1 score, Net PnL, and MAE
+
+**Output:**
+- Console: Summary of results for each horizon
+- File: `research/horizon_optimization_results.csv`
+
+---
+
+## ML Model Training
+
+The `regime` strategy uses a RandomForest classifier that can be trained with new data.
+
+### Train Model
+
+Train or retrain the ML model with the latest data:
+
+```bash
+uv run python train_model.py [options]
+```
+
+**Options:**
+- `--days DAYS`: Days of historical data (default: 90)
+- `--pair PAIR`: Base pair for context (default: BTC-USDT)
+- `--spread-pair PAIR`: Trading pair (default: ETH-USDT)
+- `--timeframe TF`: Timeframe (default: 1h)
+- `--market TYPE`: Market type: `spot` or `perpetual` (default: perpetual)
+- `--output PATH`: Model output path (default: `data/regime_model.pkl`)
+- `--train-ratio R`: Train/test split ratio (default: 0.7)
+- `--horizon H`: Prediction horizon in bars (default: 102)
+- `--download`: Download latest data before training
+- `--dry-run`: Run without saving model
+
+**Examples:**
+```bash
+# Train with last 90 days of data
+uv run python train_model.py
+
+# Download fresh data and train
+uv run python train_model.py --download
+
+# Train with 180 days of data
+uv run python train_model.py --days 180
+
+# Train on spot market data
+uv run python train_model.py --market spot
+
+# Dry run to see metrics without saving
+uv run python train_model.py --dry-run
+```
+
+### Daily Retraining (Cron)
+
+To automate daily model retraining, add a cron job:
+
+```bash
+# Edit crontab
+crontab -e
+
+# Add entry to retrain daily at 00:30 (cron uses the host's time zone, so this is 00:30 UTC only if the host runs on UTC)
+30 0 * * * cd /path/to/lowkey_backtest_live && uv run python train_model.py --download >> logs/training.log 2>&1
+```
+
+### Model Files
+
+| File | Description |
+|------|-------------|
+| `data/regime_model.pkl` | Current production model |
+| `data/regime_model_YYYYMMDD_HHMMSS.pkl` | Versioned model snapshots |
+
+The model file contains:
+- Trained RandomForest classifier
+- Feature column names
+- Training metrics (F1 score, sample counts)
+- Training timestamp
+
+---
+
+## Output Files
+
+| Location | Description |
+|----------|-------------|
+| `backtest_logs/` | Trade logs and WFA results |
+| `research/` | Research output files |
+| `data/` | Downloaded OHLCV data and ML models |
+| `data/regime_model.pkl` | Trained ML model for regime strategy |
+
+---
+
+## Running Tests
+
+```bash
+uv run pytest tests/
+```
+
+Run a specific test file:
+
+```bash
+uv run pytest tests/test_data_manager.py
+```
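
The Walk-Forward Analysis section of the README in this diff takes a window count (`-w`) but does not show how a window is formed. A minimal sketch of one common scheme follows: consecutive, non-overlapping windows, each split chronologically into a training part and a test part. The names `walk_forward_windows` and `train_ratio` are hypothetical and chosen for illustration; the repository's `wfa` command may split the data differently (for example with expanding or overlapping windows).

```python
from typing import Iterator, Tuple


def walk_forward_windows(
    n_samples: int,
    n_windows: int,
    train_ratio: float = 0.7,  # assumed value, not taken from the wfa command
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges for consecutive, non-overlapping windows.

    Within each window the training indices always precede the test indices,
    so no future bar leaks into the data used for fitting.
    """
    if n_windows < 1 or n_samples < 2 * n_windows:
        raise ValueError("need at least two samples per window")
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be strictly between 0 and 1")

    window_size = n_samples // n_windows
    for i in range(n_windows):
        start = i * window_size
        # The last window absorbs any remainder left by integer division.
        stop = n_samples if i == n_windows - 1 else start + window_size
        split = start + int((stop - start) * train_ratio)
        # Keep at least one sample on each side of the split.
        split = min(max(split, start + 1), stop - 1)
        yield range(start, split), range(split, stop)


if __name__ == "__main__":
    for train, test in walk_forward_windows(n_samples=1000, n_windows=10):
        print(f"train [{train.start}, {train.stop})  test [{test.start}, {test.stop})")
```

In this scheme, parameters would be optimized on each `train` range and scored only on the matching `test` range, and the out-of-sample scores would then be aggregated across windows.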