docs

2025-05-20 18:36:59 +08:00
parent 369b3c1daf
commit 955a340d02
4 changed files with 202 additions and 2 deletions
--- a/docs/analysis.md
+++ b/docs/analysis.md
@@ -0,0 +1,78 @@
+# Analysis Module
+
+This document provides an overview of the `Analysis` module and its components, which are typically used for technical analysis of financial market data.
+
+## Modules
+
+The `Analysis` module includes classes for calculating common technical indicators:
+
+-   **Relative Strength Index (RSI)**: Implemented in `cycles/Analysis/rsi.py`.
+-   **Bollinger Bands**: Implemented in `cycles/Analysis/boillinger_band.py`.
+
+## Class: `RSI`
+
+Found in `cycles/Analysis/rsi.py`.
+
+Calculates the Relative Strength Index.
+### Mathematical Model  
+1. **Average Gain (AvgU)** and **Average Loss (AvgD)** over the last N periods (N = `period`, 14 by default), with downward changes taken as absolute values:  
+   $$
+   \text{AvgU} = \frac{\sum \text{Upward Price Changes}}{N}, \quad \text{AvgD} = \frac{\sum |\text{Downward Price Changes}|}{N}
+   $$  
+2. **Relative Strength (RS)**:  
+   $$
+   RS = \frac{\text{AvgU}}{\text{AvgD}}
+   $$  
+3. **RSI**:  
+   $$
+   RSI = 100 - \frac{100}{1 + RS}
+   $$  
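+
+The three steps above can be sketched in plain Python. This is a minimal illustration of the simple-average formulas as written here, not the code in `cycles/Analysis/rsi.py`; the function name `simple_rsi` and the choice to report 100 when the window contains no losses are assumptions made for the example.
+
+```python
+def simple_rsi(prices, period=14):
+    """Return one RSI value per price; None where fewer than `period` changes exist."""
+    # changes[i] is the move from prices[i] to prices[i + 1]
+    changes = [later - earlier for earlier, later in zip(prices, prices[1:])]
+    rsi = [None] * len(prices)
+    for end in range(period, len(prices)):
+        # The `period` most recent changes, ending with the move into prices[end]
+        window = changes[end - period:end]
+        avg_up = sum(c for c in window if c > 0) / period
+        avg_down = sum(-c for c in window if c < 0) / period
+        if avg_down == 0:
+            # Assumed convention: no losses in the window means RSI = 100
+            rsi[end] = 100.0
+        else:
+            rs = avg_up / avg_down
+            rsi[end] = 100 - 100 / (1 + rs)
+    return rsi
+
+
+values = simple_rsi([10, 11, 10, 11, 10], period=4)
+```
+
+With `period=4` and prices `[10, 11, 10, 11, 10]`, gains and losses balance, so RS is 1 and the last RSI value is 50; the first four entries stay `None`.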
+
+### `__init__(self, period: int = 14)`
+
+-   **Description**: Initializes the RSI calculator.
+-   **Parameters**:
+    -   `period` (int, optional): The period for RSI calculation. Defaults to 14. Must be a positive integer.
+
+### `calculate(self, data_df: pd.DataFrame, price_column: str = 'close') -> pd.DataFrame`
+
+-   **Description**: Calculates the RSI and adds it as an 'RSI' column to the DataFrame. If the data has fewer rows than the period, it issues a warning and returns a copy of the original DataFrame.
+-   **Parameters**:
+    -   `data_df` (pd.DataFrame): DataFrame with historical price data. Must contain the `price_column`.
+    -   `price_column` (str, optional): The name of the column containing price data. Defaults to 'close'.
+-   **Returns**: `pd.DataFrame` - The input DataFrame with an added 'RSI' column (containing `np.nan` for initial periods where RSI cannot be calculated). Returns a copy of the original DataFrame if the period is larger than the number of data points.
+
+## Class: `BollingerBands`
+
+Found in `cycles/Analysis/boillinger_band.py`.
+
+Calculates Bollinger Bands.
+### Mathematical Model  
+1. **Middle Band**: 20-day Simple Moving Average (SMA)  
+   $$
+   \text{Middle Band} = \frac{1}{20} \sum_{i=0}^{19} \text{Close}_{t-i}
+   $$  
+2. **Upper Band**: Middle Band + 2 × 20-day Standard Deviation (σ)  
+   $$
+   \text{Upper Band} = \text{Middle Band} + 2 \times \sigma_{20}
+   $$  
+3. **Lower Band**: Middle Band − 2 × 20-day Standard Deviation (σ)  
+   $$
+   \text{Lower Band} = \text{Middle Band} - 2 \times \sigma_{20}
+   $$  
+
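+The band formulas can be sketched in plain Python as follows. This is an illustration only, not the code in `cycles/Analysis/boillinger_band.py`: the function name `bollinger_bands` is hypothetical, the window is taken to end at the current close, and the population standard deviation (`statistics.pstdev`) is an assumption, since this document does not say whether the module uses the population or the sample form.
+
+```python
+from statistics import mean, pstdev
+
+
+def bollinger_bands(closes, period=20, multiplier=2.0):
+    """Return (lower, middle, upper) per close; None until `period` closes exist."""
+    bands = []
+    for t in range(len(closes)):
+        if t + 1 < period:
+            bands.append(None)  # not enough history for a full window yet
+            continue
+        window = closes[t + 1 - period:t + 1]  # last `period` closes, current one included
+        middle = mean(window)
+        sigma = pstdev(window)  # assumed: population standard deviation
+        bands.append((middle - multiplier * sigma, middle, middle + multiplier * sigma))
+    return bands
+
+
+bands = bollinger_bands([1, 2, 3, 4, 5], period=3)
+```
+
+If the sample form is wanted instead, `statistics.stdev` can replace `pstdev`; the bands then come out slightly wider for the same data.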
+
+### `__init__(self, period: int = 20, std_dev_multiplier: float = 2.0)`
+
+-   **Description**: Initializes the BollingerBands calculator.
+-   **Parameters**:
+    -   `period` (int, optional): The period for the moving average and standard deviation. Defaults to 20. Must be a positive integer.
+    -   `std_dev_multiplier` (float, optional): The number of standard deviations for the upper and lower bands. Defaults to 2.0. Must be positive.
+
+### `calculate(self, data_df: pd.DataFrame, price_column: str = 'close') -> pd.DataFrame`
+
+-   **Description**: Calculates Bollinger Bands and adds 'SMA' (Simple Moving Average), 'UpperBand', and 'LowerBand' columns to the DataFrame.
+-   **Parameters**:
+    -   `data_df` (pd.DataFrame): DataFrame with price data. Must include the `price_column`.
+    -   `price_column` (str, optional): The name of the column containing the price data (e.g., 'close'). Defaults to 'close'.
+-   **Returns**: `pd.DataFrame` - The original DataFrame with added columns: 'SMA', 'UpperBand', 'LowerBand'.
--- a/docs/utils_storage.md
+++ b/docs/utils_storage.md
@@ -0,0 +1,73 @@
+# Storage Utilities
+
+This document describes the storage utility functions found in `cycles/utils/storage.py`.
+
+## Overview
+
+The `storage.py` module provides a `Storage` class designed for handling the loading and saving of data and results. It supports operations with CSV and JSON files and integrates with pandas DataFrames for data manipulation. The class also manages the creation of necessary `results` and `data` directories.
+
+## Constants
+
+-   `RESULTS_DIR`: Defines the default directory name for storing results (default: "results").
+-   `DATA_DIR`: Defines the default directory name for storing input data (default: "data").
+
+## Class: `Storage`
+
+Handles storage operations for data and results.
+
+### `__init__(self, logging=None, results_dir=RESULTS_DIR, data_dir=DATA_DIR)`
+
+-   **Description**: Initializes the `Storage` class. It creates the results and data directories if they don't already exist.
+-   **Parameters**:
+    -   `logging` (optional): A logging instance for outputting information. Defaults to `None`.
+    -   `results_dir` (str, optional): Path to the directory for storing results. Defaults to `RESULTS_DIR`.
+    -   `data_dir` (str, optional): Path to the directory for storing data. Defaults to `DATA_DIR`.
+
+### `load_data(self, file_path, start_date, stop_date)`
+
+-   **Description**: Loads data from a specified file (CSV or JSON), performs type optimization, filters by date range, and converts column names to lowercase. The timestamp column is set as the DataFrame index.
+-   **Parameters**:
+    -   `file_path` (str): Path to the data file (relative to `data_dir`).
+    -   `start_date` (datetime-like): The start date for filtering data.
+    -   `stop_date` (datetime-like): The end date for filtering data.
+-   **Returns**: `pandas.DataFrame` - The loaded and processed data, with a `timestamp` index. Returns an empty DataFrame on error.
+
+### `save_data(self, data: pd.DataFrame, file_path: str)`
+
+-   **Description**: Saves a pandas DataFrame to a CSV file within the `data_dir`. If the DataFrame has a DatetimeIndex, it's converted to a Unix timestamp (seconds since epoch) and stored in a column named 'timestamp', which becomes the first column in the CSV. The DataFrame's active index is not saved if a 'timestamp' column is created.
+-   **Parameters**:
+    -   `data` (pd.DataFrame): The DataFrame to save.
+    -   `file_path` (str): Path to the data file (relative to `data_dir`).
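+
+The index handling described above can be illustrated without pandas. The sketch below is not the `Storage.save_data` implementation; `rows_to_csv` and its arguments are hypothetical, and it only shows the idea of turning datetimes into integer Unix seconds and writing them as the first `timestamp` column.
+
+```python
+import csv
+import io
+from datetime import datetime, timezone
+
+
+def rows_to_csv(rows, fieldnames):
+    """rows: list of (datetime, dict) pairs. Returns CSV text with 'timestamp' first."""
+    buffer = io.StringIO()
+    writer = csv.DictWriter(
+        buffer, fieldnames=["timestamp", *fieldnames], lineterminator="\n"
+    )
+    writer.writeheader()
+    for moment, values in rows:
+        # Use timezone-aware datetimes; naive ones are interpreted as local time
+        writer.writerow({"timestamp": int(moment.timestamp()), **values})
+    return buffer.getvalue()
+
+
+csv_text = rows_to_csv(
+    [(datetime(2025, 1, 1, tzinfo=timezone.utc), {"close": 100.5})],
+    ["close"],
+)
+```
+
+The datetime itself is not written as a separate column, which mirrors the note that the active index is not saved once the 'timestamp' column exists.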
+
+### `format_row(self, row)`
+
+-   **Description**: Formats a dictionary row for output to a combined results CSV file, applying specific string formatting for percentages and float values.
+-   **Parameters**:
+    -   `row` (dict): The row of data to format.
+-   **Returns**: `dict` - The formatted row.
+
+### `write_results_chunk(self, filename, fieldnames, rows, write_header=False, initial_usd=None)`
+
+-   **Description**: Writes a chunk of results (list of dictionaries) to a CSV file. Can append to an existing file or write a new one with a header. An optional `initial_usd` can be written as a comment in the header.
+-   **Parameters**:
+    -   `filename` (str): The name of the file to write to (the path is absolute or relative to the current working directory).
+    -   `fieldnames` (list): A list of strings representing the CSV header/column names.
+    -   `rows` (list): A list of dictionaries, where each dictionary is a row.
+    -   `write_header` (bool, optional): If `True`, writes the header. Defaults to `False`.
+    -   `initial_usd` (numeric, optional): If provided and `write_header` is `True`, this value is written as a comment in the CSV header. Defaults to `None`.
+
+### `write_results_combined(self, filename, fieldnames, rows)`
+
+-   **Description**: Writes combined results to a CSV file in the `results_dir`. Uses tab as a delimiter and formats rows using `format_row`.
+-   **Parameters**:
+    -   `filename` (str): The name of the file to write to (relative to `results_dir`).
+    -   `fieldnames` (list): A list of strings representing the CSV header/column names.
+    -   `rows` (list): A list of dictionaries, where each dictionary is a row.
+
+### `write_trades(self, all_trade_rows, trades_fieldnames)`
+
+-   **Description**: Writes trade data to separate CSV files based on timeframe and stop-loss percentage. Files are named `trades_{tf}_ST{sl_percent}pct.csv` and stored in `results_dir`.
+-   **Parameters**:
+    -   `all_trade_rows` (list): A list of dictionaries, where each dictionary represents a trade.
+    -   `trades_fieldnames` (list): A list of strings for the CSV header of trade files.
+
--- a/docs/utils_system.md
+++ b/docs/utils_system.md
@@ -0,0 +1,49 @@
+# System Utilities
+
+This document describes the system utility functions found in `cycles/utils/system.py`.
+
+## Overview
+
+The `system.py` module provides utility functions related to system information and resource management. It currently includes a class `SystemUtils` for determining optimal configurations based on system resources.
+
+## Classes and Methods
+
+### `SystemUtils`
+
+A class to provide system-related utility methods.
+
+#### `__init__(self, logging=None)`
+
+-   **Description**: Initializes the `SystemUtils` class.
+-   **Parameters**:
+    -   `logging` (optional): A logging instance to output information. Defaults to `None`.
+
+#### `get_optimal_workers(self)`
+
+-   **Description**: Determines the optimal number of worker processes based on available CPU cores and memory.
+    The heuristic aims to use 75% of the CPU cores, capped by available memory (assuming each worker might need about 2 GB for large datasets). It returns the smaller of the CPU-based and memory-based worker counts.
+-   **Parameters**: None.
+-   **Returns**: `int` - The recommended number of worker processes.
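+
+The heuristic can be sketched as a small pure function. The function name, its parameters, and the floor of one worker are assumptions for illustration; how `SystemUtils` actually reads the core count and the available memory is not described here, so both are passed in as arguments.
+
+```python
+def optimal_workers(cpu_count, available_gb, gb_per_worker=2.0, cpu_fraction=0.75):
+    """Return min(CPU-based count, memory-based count), never less than 1."""
+    by_cpu = max(1, int(cpu_count * cpu_fraction))  # about 75% of the cores
+    by_memory = max(1, int(available_gb // gb_per_worker))  # about 2 GB per worker
+    return min(by_cpu, by_memory)
+
+
+cpu_bound = optimal_workers(cpu_count=8, available_gb=32.0)
+memory_bound = optimal_workers(cpu_count=16, available_gb=8.0)
+```
+
+Here `cpu_bound` is 6 (the CPU limit applies) and `memory_bound` is 4 (the memory cap applies).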
+
+## Usage Examples
+
+```python
+from cycles.utils.system import SystemUtils
+
+# Initialize (optionally with a logger)
+# import logging
+# logging.basicConfig(level=logging.INFO)
+# logger = logging.getLogger(__name__)
+# sys_utils = SystemUtils(logging=logger)
+sys_utils = SystemUtils()
+
+
+optimal_workers = sys_utils.get_optimal_workers()
+print(f"Optimal number of workers: {optimal_workers}")
+
+# This value can then be used, for example, when setting up a ProcessPoolExecutor
+# from concurrent.futures import ProcessPoolExecutor
+# with ProcessPoolExecutor(max_workers=optimal_workers) as executor:
+#     # ... submit tasks ...
+#     pass
+``` 
--- a/test_bbrsi.py
+++ b/test_bbrsi.py
@@ -109,8 +109,8 @@ if __name__ == "__main__":
        # Plot 2: RSI
        if 'RSI' in data_bb.columns: # Check data_bb now as it should contain RSI
            sns.lineplot(x=data_bb.index, y='RSI', data=data_bb, label='RSI (14)', ax=ax2, color='purple')
-            ax2.axhline(70, color='red', linestyle='--', linewidth=0.8, label='Overbought (70)')
-            ax2.axhline(30, color='green', linestyle='--', linewidth=0.8, label='Oversold (30)')
+            ax2.axhline(75, color='red', linestyle='--', linewidth=0.8, label='Overbought (75)')
+            ax2.axhline(25, color='green', linestyle='--', linewidth=0.8, label='Oversold (25)')
            # Plot Buy/Sell signals on RSI chart
            if not buy_signals.empty:
                ax2.scatter(buy_signals.index, buy_signals['RSI'], color='green', marker='o', s=20, label='Buy Signal (RSI)', zorder=5)