# BTC–ETH Regime Modeling

This project builds and tests a **Hidden Markov Model (HMM)** that classifies structural market regimes in Bitcoin and Ethereum based on 1-minute OHLCV data from Bitstamp (https://www.kaggle.com/datasets/mczielinski/bitcoin-historical-data/ and https://www.kaggle.com/datasets/viniciusqroz/ethereum-historical-data).

It is designed to identify volatility and correlation phases (*risk-on*, *risk-off*, and *neutral*) and to evaluate how predictive or useful these regimes are across multiple timeframes and forecast horizons.

---

## Project Overview

### Goals

1. Detect repeating market “regimes” using unsupervised learning.
2. Evaluate how those regimes behave across timeframes and forecast horizons.
3. Use regime identification to **select trading strategies per market state**, rather than predict short-term direction.

### Datasets

Two synchronized 1-minute OHLCV datasets:

* `btcusd_1-min_data.csv`
* `ethusd_1min_ohlc.csv`

Both sourced from Bitstamp (Kaggle datasets).

---

## Architecture

### 1. `main.py`

Core experiment runner. Implements:

* **Feature construction**:
  * Multi-scale realized volatility (`rv_*`)
  * Trend ratios (`trend_*`)
  * Rolling BTC–ETH correlations (`corr_*`)
  * Cross-asset beta and divergence
  * Liquidity proxies (`volratio`, `vol_sum`, `vol_diff`)
* **Hidden Markov Model**: Gaussian emissions, diagonal covariance.
* **Randomized time-split validation**: multiple random train/test windows with configurable embargo gap to avoid leakage.
* **Metrics**:
  * Hit rate (directional accuracy)
  * Annualized Sharpe ratio of the regime-implied signal
  * Mean ± std across random splits

This script explores model robustness across **different resample rules** (e.g. `30min`, `45min`, `1H`).

---

### 2. `main_conf_metrics.py`

Lightweight evaluator used for the **confidence and coverage sweep**.

* Adds a `--conf` parameter to control how confident the model must be before emitting a trade (pseudo-ETSC gate).
* Prints per-run metrics:
  * `cov`: coverage (fraction of bars with predictions)
  * `hit`: overall hit rate
  * `hit_trades`: accuracy conditional on trading
  * `Sharpe`: annualized risk-adjusted performance

Used by shell scripts to benchmark many timeframes and confidence thresholds.

---

### 3. Shell scripts

#### `run_grid.sh`

Runs a large grid of:

* multiple resample rules (e.g. 20–60 min),
* multiple horizons (e.g. 2–6 bars ahead).

#### `run_focus.sh`

Focuses on the most promising regions (37–41 min, 49–59 min) and sweeps confidence thresholds (0.45–0.60). Produces concise summary lines for each combination.

---

## Key Findings

1. **Optimal timeframe:** ~**35–45 minutes** consistently yields the highest Sharpe ratios (~2.2–2.3).
2. **Forecast horizon:** Best performance around **two bars ahead** (~80 min look-ahead for 40 min bars).
3. **Confidence threshold:** Little effect between 0.45 and 0.60; the model is already confident on more than 90% of bars.
4. **Interpretation:** Regimes reflect volatility and structure, not raw direction. Use them to switch *strategy archetypes* (trend vs. mean-reversion) rather than predict sign.

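
The confidence gate and the `cov`, `hit`, `hit_trades`, and `Sharpe` metrics can be illustrated with a minimal standard-library sketch. It is not the code in `main_conf_metrics.py`: the function name `gated_metrics`, the state-to-signal mapping, counting only non-flat bars as coverage, the definition of `hit` as hits over all bars, and the annualization by bars per 365-day year are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, pstdev

MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: crypto trades around the clock


def gated_metrics(posteriors, state_signal, fwd_returns, conf=0.5, bar_minutes=40):
    """Apply a confidence gate to per-bar state posteriors and score the result.

    posteriors   : list of per-bar state probabilities (one list per bar)
    state_signal : hypothetical mapping state index -> position (-1, 0, +1)
    fwd_returns  : realized forward return for each bar
    """
    positions = []
    for probs in posteriors:
        best = max(range(len(probs)), key=probs.__getitem__)
        # Trade only when the most likely state clears the threshold.
        positions.append(state_signal[best] if probs[best] >= conf else 0)

    traded = [(p, r) for p, r in zip(positions, fwd_returns) if p != 0]
    hits = sum(1 for p, r in traded if p * r > 0)

    pnl = [p * r for p, r in zip(positions, fwd_returns)]
    sd = pstdev(pnl)
    bars_per_year = MINUTES_PER_YEAR / bar_minutes
    sharpe = mean(pnl) / sd * sqrt(bars_per_year) if sd > 0 else 0.0

    return {
        "cov": len(traded) / len(positions),          # bars with a non-flat position
        "hit": hits / len(positions),                 # hits over all bars (assumed)
        "hit_trades": hits / len(traded) if traded else 0.0,
        "sharpe": sharpe,
    }


# Toy data: 4 bars, 3 states (0 = long, 1 = flat, 2 = short).
posteriors = [[0.7, 0.2, 0.1], [0.4, 0.35, 0.25], [0.1, 0.1, 0.8], [0.2, 0.6, 0.2]]
fwd_returns = [0.01, -0.02, -0.005, 0.003]
metrics = gated_metrics(posteriors, [1, 0, -1], fwd_returns, conf=0.5)
print(metrics)
```

In this toy run, bar 2 is gated out because its top posterior (0.4) is below `conf`, and bar 4 is confidently assigned to the flat state, so only two of the four bars carry a position.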

---

## Example Usage

### Single test

```bash
python main.py \
  --btc ../data/btcusd_1-min_data.csv \
  --eth ../data/ethusd_1min_ohlc.csv \
  --rules "30min,45min,1H" \
  --states 3 \
  --horizon 60
```

### Confidence and coverage sweep

```bash
./run_focus.sh
```

---

## Typical Output

```
# Randomized time-split comparison
States=3 HorizonMin=60 Splits=8 TestBars=500 GapBars=24
rule   splits  hit              sharpe
45min  8       0.4642 ± 0.0071  2.0575 ± 0.0413
39min  8       0.4662 ± 0.0083  2.3124 ± 0.0502
30min  8       0.4632 ± 0.0090  2.0331 ± 0.0368
```

---

## Interpretation for Strategy Design

| Regime Type          | Market Traits            | Suggested Strategy        |
| -------------------- | ------------------------ | ------------------------- |
| High-vol / decoupled | large ETH/BTC divergence | Momentum / Breakout       |
| Low-vol / correlated | calm, mean-reverting     | Reversion / Market-Making |
| Neutral              | noisy transitions        | Flat / Reduced exposure   |

---

## Requirements

* **Python ≥ 3.11**
* **Environment manager:** [**uv**](https://github.com/astral-sh/uv) (fast Python package installer and environment manager)

### Setup

Create and activate a local environment using **uv**:

```bash
# from the project root
uv venv
source .venv/bin/activate

# install dependencies
uv pip install numpy pandas scikit-learn hmmlearn
```

---

## Repository Structure

```
.
├── main.py                 # core HMM regime experiment with CV
├── main_conf_metrics.py    # confidence/coverage sweep
├── run_grid.sh             # full grid search over horizons/timeframes
├── run_focus.sh            # focused confidence sweep
└── README.md
```
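
The `Splits`, `TestBars`, and `GapBars` values in the typical output header relate to the randomized time-split validation with an embargo gap. As a rough sketch of that idea, the helper below draws random test windows and leaves `gap_bars` unused bars between the end of training and the start of testing. The function name `random_time_splits`, the `min_train` parameter, and the choice to train on everything before the gap are assumptions for illustration, not the logic in `main.py`.

```python
import random


def random_time_splits(n_bars, n_splits=8, test_bars=500, gap_bars=24,
                       min_train=1000, seed=0):
    """Yield (train, test) index ranges separated by an embargo gap.

    Assumption: each split trains on all bars before the gap and tests on
    the `test_bars` bars that follow it.
    """
    rng = random.Random(seed)
    earliest = min_train + gap_bars   # first admissible test start
    latest = n_bars - test_bars       # last admissible test start
    if latest < earliest:
        raise ValueError("not enough bars for the requested split sizes")
    for _ in range(n_splits):
        test_start = rng.randint(earliest, latest)
        train = range(0, test_start - gap_bars)
        test = range(test_start, test_start + test_bars)
        yield train, test


splits = list(random_time_splits(n_bars=20_000))
for train, test in splits:
    # The embargo keeps features computed near the end of training from
    # overlapping with the forward returns scored in the test window.
    print(f"train=[0, {train.stop})  test=[{test.start}, {test.stop})")
```

The gap should be at least as long as the forecast horizon plus the longest rolling feature window, otherwise information from the test window can still leak into training.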