End-to-end pipeline for engineering OHLCV features, training an XGBoost regressor (GPU by default), and running inference via a small, reusable predictor API.
## Quickstart (uv)
Prereqs:
- Python 3.12+
- `uv` installed (see `https://docs.astral.sh/uv/`)
Install dependencies:
```powershell
uv sync
```
Run training (expects an input CSV; see Data requirements):
```powershell
uv run python main.py
```
Run the inference demo:
```powershell
uv run python inference_example.py
```
## Data requirements
Your input DataFrame/CSV must include these columns:
- `Timestamp`: either a pandas datetime-like column or Unix seconds (int). During inference, the predictor tries to parse string (object-dtype) values as datetimes; non-object dtypes are treated as Unix seconds.
- `dash`/Plotly for charts (Plotly is used by `plot_results.py`)
Install using:
```powershell
uv sync
```
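If you prefer to normalize `Timestamp` yourself before calling the predictor, the rule described under Data requirements can be approximated as below. This is a minimal sketch, not the predictor's actual code: the function name `normalize_timestamp` is hypothetical, and returning datetime columns unchanged is an assumption made here for safety.

```python
import pandas as pd


def normalize_timestamp(series: pd.Series) -> pd.Series:
    """Return a datetime64 Series from datetimes, Unix seconds, or strings.

    Hypothetical helper; it mirrors the documented rule only approximately.
    """
    if pd.api.types.is_datetime64_any_dtype(series):
        # Assumption: datetime-like columns are already usable as-is.
        return series
    if pd.api.types.is_numeric_dtype(series):
        # Numeric values are interpreted as Unix seconds.
        return pd.to_datetime(series, unit="s")
    # Everything else (typically strings) is parsed as datetimes.
    return pd.to_datetime(series)


if __name__ == "__main__":
    frame = pd.DataFrame({"Timestamp": [1_700_000_000, 1_700_000_060]})
    frame["Timestamp"] = normalize_timestamp(frame["Timestamp"])
    print(frame.dtypes)
```

Normalizing up front makes the dtype explicit, so it no longer depends on how the CSV happened to be parsed.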
## Troubleshooting
- KeyError: `'log_return'` during inference: ensure your input DataFrame includes `log_return` as described above.
- Model file not found: confirm the path passed to `OHLCVPredictor(...)` matches where training saved it (default `../data/xgboost_model_all_features.json`).
- Feature mismatch (e.g., XGBoost "Number of columns does not match"): ensure you use the model together with its companion feature list JSON. The predictor uses it automatically if present. If it is missing, retrain with the current code so the feature list is generated.
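Most of the issues above can be caught before inference with a small pre-flight check. The sketch below uses only the standard library; the function name `preflight` and its parameters are hypothetical. It also assumes you pass the feature list path yourself, since the companion file's name is not specified here.

```python
from pathlib import Path


def preflight(model_path, columns, feature_list_path=None):
    """Return a list of problems found; an empty list means nothing obvious is wrong.

    Hypothetical helper: `columns` is any iterable of input column names,
    e.g. `df.columns` from a pandas DataFrame.
    """
    problems = []
    columns = set(columns)

    # "Model file not found": check the path before constructing the predictor.
    model_path = Path(model_path)
    if not model_path.is_file():
        problems.append(f"model file not found: {model_path}")

    # "KeyError: 'log_return'": the input must already carry this column.
    if "log_return" not in columns:
        problems.append("input is missing the 'log_return' column")

    # "Feature mismatch": the companion feature list should sit next to the model.
    if feature_list_path is not None and not Path(feature_list_path).is_file():
        problems.append(f"feature list not found: {feature_list_path}")

    return problems


if __name__ == "__main__":
    issues = preflight(
        "../data/xgboost_model_all_features.json",
        ["Timestamp", "log_return"],
    )
    for issue in issues:
        print("WARNING:", issue)
```

Returning a list rather than raising on the first failure reports every problem in a single run.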