# PRD: VectorBT Migration & CCXT Integration

## 1. Introduction
The goal of this project is to migrate the current backtesting infrastructure to a stack built on **VectorBT** for vectorized, high-performance backtesting and **CCXT** for historical data acquisition. The system will support rapid prototyping of many simple strategies, parameter optimization (Grid Search), and stability testing (Walk-Forward Analysis).

## 2. Goals
- **Replace Custom Backtester:** Retire the existing loop-based backtesting logic in favor of vectorized operations using `vectorbt`.
- **Automate Data Collection:** Implement a `ccxt`-based downloader that fetches and caches OHLCV data from OKX (and other exchanges) automatically.
- **Enable Optimization:** Provide built-in Grid Search support for finding optimal strategy parameters.
- **Validation:** Implement Walk-Forward Analysis (WFA) to validate strategy robustness and guard against overfitting.
- **Standardized Reporting:** Generate consistent outputs: console summaries, CSV logs, and interactive VectorBT plots.

## 3. User Stories
- **Data Acquisition:** "As a user, I want to run a command `download_data --pair BTC/USDT --exchange okx` and have the system fetch historical 1-minute candles and save them to `data/ccxt/okx/BTC-USDT_1m.csv`."
- **Strategy Dev:** "As a researcher, I want to define a new strategy by simply writing a class or function that defines entry/exit signals, without worrying about the backtesting loop."
- **Optimization:** "As a researcher, I want to say 'Optimize RSI period between 10 and 20' and get a heatmap of results."
- **Validation:** "As a researcher, I want to verify whether my 'best' parameters work on unseen data using Walk-Forward Analysis."
- **Analysis:** "As a user, I want to see an equity curve and key metrics (Sharpe, Drawdown) immediately after a test run."
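
The command in the Data Acquisition story could be parsed along the following lines. This is a minimal sketch, not a settled interface: only `--pair` and `--exchange` come from the story above, while `--timeframe`, its default, and the function names `build_parser` and `main` are assumptions made for illustration.

```python
import argparse
from typing import Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    """Build the argument parser for the hypothetical `download_data` command."""
    parser = argparse.ArgumentParser(
        prog="download_data",
        description="Fetch and cache historical OHLCV candles.",
    )
    # Flags taken from the user story.
    parser.add_argument("--pair", required=True, help="Market symbol, e.g. BTC/USDT")
    parser.add_argument("--exchange", default="okx", help="Exchange id, e.g. okx")
    # Assumed extra flag: the story only mentions 1-minute candles.
    parser.add_argument("--timeframe", default="1m", help="Candle size, e.g. 1m")
    return parser


def main(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse arguments; a real entry point would hand them to the downloader."""
    args = build_parser().parse_args(argv)
    print(f"Would download {args.pair} {args.timeframe} candles from {args.exchange}")
    return args
```

Calling `main(["--pair", "BTC/USDT", "--exchange", "okx"])` parses the same arguments the story uses on the command line.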

## 4. Functional Requirements

### 4.1 Data Module (`data_manager`)
- **Exchange Interface:** Use `ccxt` to connect to exchanges (initially OKX).
- **Fetching Logic:** Fetch OHLCV data in chunks to handle rate limits and long histories.
- **Storage:** Save data to standardized paths: `data/ccxt/{exchange}/{pair}_{timeframe}.csv`, with the `/` in the pair replaced by `-` (for example, `BTC/USDT` becomes `BTC-USDT`).
- **Loading:** Provide a utility that loads saved CSVs into a pandas DataFrame compatible with `vectorbt`.
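
The fetching and storage requirements above can be sketched as follows. This is an illustrative outline under stated assumptions: `exchange` is any object exposing ccxt's unified `fetch_ohlcv(symbol, timeframe, since, limit)` method (for example an instance created with `ccxt.okx()`), while `csv_path`, `fetch_ohlcv_chunked`, `save_csv`, the `TIMEFRAME_MS` table, and the default `limit` of 100 are hypothetical names and values. Rate limiting is assumed to be left to ccxt's built-in throttling.

```python
import csv
from pathlib import Path

# Candle length in milliseconds for the timeframes this sketch supports (assumed subset).
TIMEFRAME_MS = {"1m": 60_000, "5m": 300_000, "1h": 3_600_000}


def csv_path(root: str, exchange_id: str, pair: str, timeframe: str) -> Path:
    """Build data/ccxt/{exchange}/{pair}_{timeframe}.csv, replacing '/' with '-'."""
    return Path(root) / exchange_id / f"{pair.replace('/', '-')}_{timeframe}.csv"


def fetch_ohlcv_chunked(exchange, symbol, timeframe, since_ms, until_ms, limit=100):
    """Page through fetch_ohlcv from since_ms (inclusive) to until_ms (exclusive)."""
    step = TIMEFRAME_MS[timeframe]
    rows = []
    cursor = since_ms
    while cursor < until_ms:
        batch = exchange.fetch_ohlcv(symbol, timeframe, since=cursor, limit=limit)
        # Keep only candles inside the requested window.
        batch = [row for row in batch if cursor <= row[0] < until_ms]
        if not batch:
            # No more data; note this also stops at a gap in the exchange history.
            break
        rows.extend(batch)
        # Resume one candle after the last timestamp received.
        cursor = batch[-1][0] + step
    return rows


def save_csv(rows, path: Path) -> None:
    """Write candles as timestamp,open,high,low,close,volume rows."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["timestamp", "open", "high", "low", "close", "volume"])
        writer.writerows(rows)
```

Passing the exchange in as an argument keeps the paging logic testable without network access. A file written this way can be read back with `pandas.read_csv`, converting the millisecond `timestamp` column to a datetime index before handing the frame to `vectorbt`.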

### 4.2 Strategy Interface (`strategies/`)
- **Base Protocol:** Define a standard structure for strategies. A strategy should define or return:
  - Indicator calculations (vectorized).
  - Entry signals (Boolean Series).
  - Exit signals (Boolean Series).
- **Parameterization:** Strategies must accept dynamic parameters to support Grid Search.
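
One possible shape for this interface is sketched below. To stay dependency-free it uses plain Python lists where the real system would use pandas Boolean Series and vectorized indicators; `Signals`, `Strategy`, `MaCross`, `sma`, and the default periods are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Protocol, Sequence


@dataclass(frozen=True)
class Signals:
    """Entry and exit flags, one per bar (stand-ins for Boolean Series)."""
    entries: List[bool]
    exits: List[bool]


class Strategy(Protocol):
    """Anything with a generate() method returning Signals counts as a strategy."""
    def generate(self, close: Sequence[float]) -> Signals: ...


def sma(values: Sequence[float], window: int) -> List[Optional[float]]:
    """Simple moving average; None until `window` values are available."""
    out: List[Optional[float]] = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]
        out.append(running / window if i >= window - 1 else None)
    return out


@dataclass(frozen=True)
class MaCross:
    """Moving-average crossover; `fast` and `slow` are the tunable parameters."""
    fast: int = 10
    slow: int = 30

    def generate(self, close: Sequence[float]) -> Signals:
        fast_ma, slow_ma = sma(close, self.fast), sma(close, self.slow)
        valid = [s is not None for s in slow_ma]
        above = [f is not None and s is not None and f > s
                 for f, s in zip(fast_ma, slow_ma)]
        n = len(close)
        # Enter when the fast average crosses above the slow one, exit on the reverse.
        entries = [i > 0 and valid[i - 1] and above[i] and not above[i - 1]
                   for i in range(n)]
        exits = [i > 0 and above[i - 1] and not above[i] for i in range(n)]
        return Signals(entries, exits)
```

Because the parameters are constructor fields, a grid search can build one instance per combination, for example `MaCross(fast=5, slow=20)`.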

### 4.3 Backtest Engine (`engine.py`)
- **Simulation:** Use `vectorbt.Portfolio.from_signals` (or similar) for fast simulation.
- **Cost Model:** Support configurable fees (maker/taker) and slippage estimates.
- **Grid Search:** Use `vectorbt`'s parameter broadcasting to run many variations simultaneously.
- **Walk-Forward Analysis:**
  - Implement a splitting mechanism (e.g., `Splitter` in VectorBT PRO or `rolling_split` in open-source `vectorbt`) to divide data into In-Sample (Train) and Out-of-Sample (Test) sets.
  - Optimize parameters on Train, then validate the selected parameters on Test.
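
The grid search and walk-forward steps above can be sketched in plain Python. This only illustrates the split and selection logic; in the real engine the grid would be evaluated through `vectorbt` broadcasting rather than a Python loop. The names `param_grid`, `rolling_splits`, and `walk_forward` are hypothetical, and `score` stands in for whatever metric the backtest returns (assumed here to be "higher is better").

```python
from itertools import product
from typing import Callable, Dict, Iterable, Iterator, List, Sequence, Tuple


def param_grid(**ranges: Iterable) -> Iterator[Dict]:
    """Yield every parameter combination, e.g. param_grid(rsi_period=range(10, 21))."""
    names = list(ranges)
    for combo in product(*(ranges[name] for name in names)):
        yield dict(zip(names, combo))


def rolling_splits(n_bars: int, train_len: int, test_len: int) -> Iterator[Tuple[slice, slice]]:
    """Yield (train, test) slices, moving forward by test_len bars each time."""
    start = 0
    while start + train_len + test_len <= n_bars:
        mid = start + train_len
        yield slice(start, mid), slice(mid, mid + test_len)
        start += test_len


def walk_forward(
    data: Sequence,
    grid: Iterable[Dict],
    score: Callable[[Dict, Sequence], float],
    train_len: int,
    test_len: int,
) -> List[Dict]:
    """Pick the best parameters on each Train window, then score them on Test."""
    candidates = list(grid)  # materialize so the grid can be reused per window
    results = []
    for train, test in rolling_splits(len(data), train_len, test_len):
        best = max(candidates, key=lambda params: score(params, data[train]))
        results.append({
            "params": best,
            "train_score": score(best, data[train]),
            "test_score": score(best, data[test]),
        })
    return results
```

Each result row pairs the in-sample score with the out-of-sample score for the same parameters, which is the comparison Walk-Forward Analysis is meant to expose.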

### 4.4 Reporting (`reporting.py`)
- **Console:** Print key metrics: Total Return, Sharpe Ratio, Max Drawdown, Win Rate, Trade Count.
- **Files:** Save detailed trade logs and metrics summaries to `backtest_logs/`.
- **Visuals:** Generate and save or show `vectorbt` plots (equity curve, drawdowns).
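
For reference, the console metrics listed above can be computed from an equity curve and a list of per-trade PnLs as sketched below. In practice `vectorbt` portfolio statistics would supply these values; this dependency-free version only pins down the definitions. The function names are hypothetical, and the Sharpe ratio assumes a zero risk-free rate.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def total_return(equity: Sequence[float]) -> float:
    """Fractional gain from the first to the last equity value."""
    return equity[-1] / equity[0] - 1.0


def max_drawdown(equity: Sequence[float]) -> float:
    """Largest peak-to-trough decline, as a negative fraction (0.0 if none)."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = min(worst, value / peak - 1.0)
    return worst


def sharpe_ratio(equity: Sequence[float], periods_per_year: int) -> float:
    """Annualized Sharpe ratio of per-bar returns, assuming a zero risk-free rate."""
    returns = [b / a - 1.0 for a, b in zip(equity, equity[1:])]
    if len(returns) < 2:
        return float("nan")
    spread = stdev(returns)
    if spread == 0:
        return float("nan")
    return mean(returns) / spread * math.sqrt(periods_per_year)


def win_rate(trade_pnls: Sequence[float]) -> float:
    """Share of trades with strictly positive PnL."""
    if not trade_pnls:
        return float("nan")
    return sum(pnl > 0 for pnl in trade_pnls) / len(trade_pnls)
```

For 1-minute bars on a market that trades around the clock, `periods_per_year` would be 365 * 24 * 60 = 525,600.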

## 5. Non-Goals
- Real-time live trading execution (this is strictly for research/backtesting).
- Complex Machine Learning models (initially focusing on indicator-based logic).
- High-frequency tick-level backtesting (1-minute granularity is the target).

## 6. Technical Architecture Proposal
```text
project_root/
├── data/
│   └── ccxt/               # New data storage structure
├── strategies/             # Strategy definitions
│   ├── __init__.py
│   ├── base.py             # Abstract Base Class
│   └── ma_cross.py         # Example strategy
├── engine/
│   ├── data_loader.py      # CCXT wrapper
│   ├── backtester.py       # VBT runner
│   └── optimizer.py        # Grid Search & WFA logic
├── main.py                 # CLI entry point
└── pyproject.toml
```

## 7. Success Metrics
- Can download 1 year of 1m BTC/USDT data from OKX in < 2 minutes.
- Can run a grid search over 100 parameter combinations on 1 year of 1m data in < 10 seconds.
- Walk-forward analysis produces a clear "Robustness Score" or a visual comparison of Train vs Test performance.
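
The "Robustness Score" is not defined in this document. One candidate definition, offered purely as an assumption, is the ratio of average out-of-sample performance to average in-sample performance across the walk-forward windows, where values near 1.0 mean Test results hold up against Train results. The function name and the handling of non-positive Train averages are hypothetical.

```python
from statistics import mean
from typing import Sequence


def robustness_score(train_scores: Sequence[float], test_scores: Sequence[float]) -> float:
    """Average Test score divided by average Train score (assumed definition)."""
    if not train_scores or len(train_scores) != len(test_scores):
        raise ValueError("need one train and one test score per window")
    avg_train = mean(train_scores)
    if avg_train <= 0:
        # The ratio is not meaningful when the in-sample average is not positive.
        return float("nan")
    return mean(test_scores) / avg_train
```

With Train returns of 10% and 20% and Test returns of 5% and 10%, the score is about 0.5, meaning half of the in-sample performance survived out of sample.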

## 8. Open Questions
- Do we need to handle funding rates for perpetual futures in the PnL calculation immediately? (Assumption: no for V1; stick to spot and simple futures price action.)