2025-10-10 14:57:51 +08:00
# BTC–ETH Regime Modeling
2025-10-10 06:53:24 +00:00
This project builds and tests a **Hidden Markov Model (HMM)** that classifies structural market regimes in Bitcoin and Ethereum based on 1-minute OHLCV data from Bitstamp (https://www.kaggle.com/datasets/mczielinski/bitcoin-historical-data/ and https://www.kaggle.com/datasets/viniciusqroz/ethereum-historical-data).
It is designed to identify volatility and correlation phases (*risk-on*, *risk-off*, and *neutral*) and to evaluate how predictive or useful these regimes are across multiple timeframes and forecast horizons.
---
## Project Overview
### Goals
1. Detect repeating market “regimes” using unsupervised learning.
2. Evaluate how those regimes behave across timeframes and forecast horizons.
3. Use regime identification to **select trading strategies per market state**, rather than predict short-term direction.
### Datasets
Two synchronized 1-minute OHLCV datasets:
* `btcusd_1-min_data.csv`
* `ethusd_1min_ohlc.csv`
Both sourced from Bitstamp (Kaggle datasets).
---
## Architecture
### 1. `main.py`
Core experiment runner.
Implements:
* **Feature construction**:
  * Multi-scale realized volatility (`rv_*`)
  * Trend ratios (`trend_*`)
  * Rolling BTC–ETH correlations (`corr_*`)
  * Cross-asset beta and divergence
  * Liquidity proxies (`volratio`, `vol_sum`, `vol_diff`)
* **Hidden Markov Model**: Gaussian emissions, diagonal covariance.
* **Randomized time-split validation**: multiple random train/test windows with a configurable embargo gap to avoid leakage.
* **Metrics**:
  * Hit rate (directional accuracy)
  * Annualized Sharpe ratio of the regime-implied signal
  * Mean ± std across random splits
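
To make two of the feature families above concrete, here is a minimal sketch of a rolling realized volatility and a rolling BTC–ETH correlation in plain Python. The function names, the window length, and the toy prices are illustrative assumptions; the exact definitions used in `main.py` may differ (for example in scaling or window choice).

```python
import math

def log_returns(closes):
    """Log returns between consecutive closes."""
    return [math.log(b / a) for a, b in zip(closes, closes[1:])]

def realized_vol(returns, window):
    """Rolling realized volatility: sqrt of the sum of squared returns per window."""
    out = []
    for i in range(window, len(returns) + 1):
        out.append(math.sqrt(sum(r * r for r in returns[i - window:i])))
    return out

def rolling_corr(x, y, window):
    """Rolling Pearson correlation of two equal-length return series."""
    out = []
    for i in range(window, len(x) + 1):
        xs, ys = x[i - window:i], y[i - window:i]
        mx, my = sum(xs) / window, sum(ys) / window
        cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
        vx = sum((a - mx) ** 2 for a in xs)
        vy = sum((b - my) ** 2 for b in ys)
        denom = math.sqrt(vx * vy)
        # A flat window has no defined correlation; report 0.0 in that case.
        out.append(cov / denom if denom > 0 else 0.0)
    return out

# Toy closes (made-up numbers, already resampled to a common bar size).
btc = [100.0, 101.0, 100.5, 102.0, 101.0, 103.0]
eth = [10.0, 10.2, 10.1, 10.4, 10.2, 10.6]

r_btc, r_eth = log_returns(btc), log_returns(eth)
rv_3 = realized_vol(r_btc, window=3)           # analogous to an `rv_*` column
corr_3 = rolling_corr(r_btc, r_eth, window=3)  # analogous to a `corr_*` column
```

Computing the same quantity at several window lengths is one way to obtain "multi-scale" versions of each feature.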
This script explores model robustness across **different resample rules** (e.g. `30min`, `45min`, `1H`).
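
The randomized time-split validation listed above can be sketched as follows. This is an illustration under stated assumptions, not the code in `main.py`: it assumes each split places a training window before its test window, separated by an embargo gap, and the helper name `random_time_splits` and the `train_bars` value are hypothetical. The split count, test size, and gap mirror the `Splits=8 TestBars=500 GapBars=24` values shown under Typical Output.

```python
import random

def random_time_splits(n_bars, n_splits, train_bars, test_bars, gap_bars, seed=0):
    """Return (train_range, test_range) pairs with an embargo gap between them."""
    span = train_bars + gap_bars + test_bars
    if span > n_bars:
        raise ValueError("not enough bars for a single split")
    rng = random.Random(seed)
    splits = []
    for _ in range(n_splits):
        start = rng.randint(0, n_bars - span)  # randint is inclusive on both ends
        train = range(start, start + train_bars)
        test_start = start + train_bars + gap_bars  # skip the embargoed bars
        test = range(test_start, test_start + test_bars)
        splits.append((train, test))
    return splits

splits = random_time_splits(
    n_bars=10_000, n_splits=8, train_bars=2_000, test_bars=500, gap_bars=24
)
for train, test in splits[:2]:
    print(f"train {train.start}-{train.stop - 1}  test {test.start}-{test.stop - 1}")
```

The gap matters because rolling features and forward-looking labels near the boundary overlap in time; dropping those bars keeps information from the test window out of training.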
---
### 2. `main_conf_metrics.py`
Lightweight evaluator used for the **confidence and coverage sweep**.
* Adds a `--conf` parameter to control how confident the model must be before emitting a trade (pseudo-ETSC gate).
* Prints per-run metrics:
  * `cov`: coverage (fraction of bars with predictions)
  * `hit`: overall hit rate
  * `hit_trades`: accuracy conditional on trading
  * `Sharpe`: annualized risk-adjusted performance
Used by shell scripts to benchmark many timeframes and confidence thresholds.
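
As a rough illustration of how a confidence gate interacts with these metrics, the sketch below computes all four from per-bar inputs. Several things here are assumptions rather than the script's actual logic: the function name `gated_metrics`, the use of the maximum state posterior as the confidence (for example, the row-wise maximum of hmmlearn's `predict_proba` output), counting abstained bars as misses in `hit`, and annualizing with 13,140 bars per year (40-minute bars, 24/7 trading).

```python
import math

def gated_metrics(confidence, direction, returns, conf, bars_per_year):
    """Coverage, hit rates, and annualized Sharpe for a confidence-gated signal.

    confidence[i]: model confidence on bar i (assumed: max state posterior)
    direction[i]:  regime-implied position, +1 or -1
    returns[i]:    realized forward return for bar i
    """
    n = len(returns)
    if n < 2:
        raise ValueError("need at least two bars")
    # Trade only when the model is confident enough; otherwise stay flat (0).
    signals = [d if c >= conf else 0 for c, d in zip(confidence, direction)]
    traded = [(s, r) for s, r in zip(signals, returns) if s != 0]
    correct = sum(1 for s, r in traded if s * r > 0)

    pnl = [s * r for s, r in zip(signals, returns)]
    mean = sum(pnl) / n
    var = sum((x - mean) ** 2 for x in pnl) / (n - 1)
    sharpe = mean / math.sqrt(var) * math.sqrt(bars_per_year) if var > 0 else float("nan")

    return {
        "cov": len(traded) / n,
        "hit": correct / n,  # assumption: abstentions count as misses
        "hit_trades": correct / len(traded) if traded else float("nan"),
        "Sharpe": sharpe,
    }

# Toy inputs (made-up numbers).
m = gated_metrics(
    confidence=[0.95, 0.50, 0.40, 0.99, 0.70],
    direction=[1, -1, 1, 1, -1],
    returns=[0.01, 0.02, -0.01, 0.005, -0.003],
    conf=0.45,
    bars_per_year=365 * 24 * 60 // 40,
)
print(m)
```

Raising `conf` can only lower `cov`; whether `hit_trades` improves in exchange is what the sweep measures.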
---
### 3. Shell scripts
#### `run_grid.sh`
Runs a large grid of:
* multiple resample rules (e.g. 20–60 min),
* multiple horizons (e.g. 2–6 bars ahead).
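
For readers who prefer Python to shell, the grid can be pictured as the Cartesian product below. This is not the contents of `run_grid.sh`; it is a sketch that only builds command lines. The 5-minute step, the helper name `grid_commands`, and the conversion of "bars ahead" into a `--horizon` value in minutes are assumptions; the flags themselves are the ones shown under Example Usage.

```python
from itertools import product

def grid_commands(bar_minutes, bars_ahead, states=3):
    """Build one `main.py` command line per (bar size, horizon) combination."""
    cmds = []
    for minutes, k in product(bar_minutes, bars_ahead):
        cmds.append([
            "python", "main.py",
            "--btc", "../data/btcusd_1-min_data.csv",
            "--eth", "../data/ethusd_1min_ohlc.csv",
            "--rules", f"{minutes}min",
            "--states", str(states),
            # Assumption: --horizon is in minutes, so k bars ahead = minutes * k.
            "--horizon", str(minutes * k),
        ])
    return cmds

cmds = grid_commands(bar_minutes=range(20, 61, 5), bars_ahead=range(2, 7))
print(len(cmds), "runs")
print(" ".join(cmds[0]))
```

Each list can be passed to `subprocess.run` to execute the run.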
#### `run_focus.sh`
Focuses on the most promising regions (37–41 min, 49–59 min)
and sweeps confidence thresholds (0.45–0.60).
Produces concise summary lines for each combination.
---
## Key Findings
1. **Optimal timeframe:**
   Bars of roughly **35–45 minutes** consistently yield the highest Sharpe ratios (~2.2–2.3).
2. **Forecast horizon:**
   Best performance is around **two bars ahead** (~80 min look-ahead for 40 min bars).
3. **Confidence threshold:**
   Little effect between 0.45 and 0.60; the model is already confident on > 90% of bars.
4. **Interpretation:**
   Regimes reflect volatility and structure, not raw direction.
   Use them to switch *strategy archetypes* (trend vs. mean-reversion) rather than to predict sign.
---
## Example Usage
### Single test
```bash
python main.py \
--btc ../data/btcusd_1-min_data.csv \
--eth ../data/ethusd_1min_ohlc.csv \
--rules "30min,45min,1H" \
--states 3 \
--horizon 60
```
### Confidence and coverage sweep
```bash
./run_focus.sh
```
---
## Typical Output
```
# Randomized time-split comparison
States=3 HorizonMin=60 Splits=8 TestBars=500 GapBars=24
rule splits hit sharpe
45min 8 0.4642 ± 0.0071 2.0575 ± 0.0413
39min 8 0.4662 ± 0.0083 2.3124 ± 0.0502
30min 8 0.4632 ± 0.0090 2.0331 ± 0.0368
```
---
## Interpretation for Strategy Design
| Regime Type | Market Traits | Suggested Strategy |
| -------------------- | ------------------------ | ------------------------- |
| High-vol / decoupled | large ETH/BTC divergence | Momentum / Breakout |
| Low-vol / correlated | calm, mean-reverting | Reversion / Market-Making |
| Neutral | noisy transitions | Flat / Reduced exposure |
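
HMM states come out as unlabeled integers, so using the table requires a labeling step. One simple possibility, sketched below, ranks states by their mean realized volatility and maps the extremes to the high-vol and low-vol rows. The ranking rule, the helper name `label_regimes`, and the sample numbers are assumptions for illustration; the sketch ignores correlation entirely and needs at least two states.

```python
STRATEGY = {
    "High-vol / decoupled": "Momentum / Breakout",
    "Low-vol / correlated": "Reversion / Market-Making",
    "Neutral": "Flat / Reduced exposure",
}

def label_regimes(state_mean_vol):
    """Map each HMM state id to a regime name by ranking mean realized volatility."""
    if len(state_mean_vol) < 2:
        raise ValueError("need at least two states to rank")
    ordered = sorted(state_mean_vol, key=state_mean_vol.get)
    labels = {state: "Neutral" for state in ordered}
    labels[ordered[0]] = "Low-vol / correlated"    # calmest state
    labels[ordered[-1]] = "High-vol / decoupled"   # most volatile state
    return labels

# Hypothetical per-state mean realized volatility from a fitted 3-state model.
state_mean_vol = {0: 0.012, 1: 0.004, 2: 0.007}
labels = label_regimes(state_mean_vol)

for state in sorted(labels):
    print(state, labels[state], "->", STRATEGY[labels[state]])
```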
---
## Requirements
* **Python ≥ 3.11**
* **Environment manager:** [**uv**](https://github.com/astral-sh/uv) (fast Python package installer and environment manager)
### Setup
Create and activate a local environment using **uv**:
```bash
# from the project root
uv venv
source .venv/bin/activate
# install dependencies
uv pip install numpy pandas scikit-learn hmmlearn
```
---
## Repository Structure
```
.
├── main.py # core HMM regime experiment with CV
├── main_conf_metrics.py # confidence/coverage sweep
├── run_grid.sh # full grid search over horizons/timeframes
├── run_focus.sh # focused confidence sweep
└── README.md
```