# ADR-002: BaseDataCollector Refactoring and Component Extraction

## Status

Accepted

## Context

The `BaseDataCollector` class was initially monolithic, handling connection management, state and telemetry, and callback dispatching directly. This led to a less modular, harder-to-test, and less maintainable codebase. Additionally, `OHLCVData` and its associated validation, although broadly applicable, were tightly coupled within the `data` module, leading to potential import complexities and naming conflicts.

## Decision

To improve modularity, maintainability, testability, and reusability, we decided to refactor `BaseDataCollector` by extracting its core responsibilities into dedicated, smaller, and focused components. We also decided to relocate `OHLCVData` to a more common and accessible location.

### Extracted Components:

1. **`CollectorStateAndTelemetry`**: Responsible for managing collector status, health, statistics, and logging.
2. **`ConnectionManager`**: Responsible for handling WebSocket connection lifecycle (connect, disconnect, reconnect) and related error management.
3. **`CallbackDispatcher`**: Responsible for managing and dispatching data callbacks to registered listeners.

### OHLCVData Relocation:

- The `OHLCVData` class and the `validate_ohlcv_data` function, along with the `DataValidationError` exception, were moved from `data/ohlcv_data.py` to `data/common/ohlcv_data.py`.

## Consequences

**Positive:**

- **Improved Modularity**: `BaseDataCollector` is now leaner and focuses solely on orchestrating the new components.
- **Enhanced Testability**: Each extracted component can be unit-tested in isolation, reducing test complexity and improving test coverage.
- **Increased Maintainability**: Changes to connection logic, state management, or callback handling are isolated to their respective components, minimizing impact on other parts of the system.
- **Greater Reusability**: `CollectorStateAndTelemetry`, `ConnectionManager`, and `CallbackDispatcher` can potentially be reused in other contexts or for different types of collectors.
- **Clearer Separation of Concerns**: Each component has a single, well-defined responsibility.
- **Centralized `OHLCVData`**: Moving `OHLCVData` to `data/common` provides a more intuitive and accessible location for a common data structure, resolving potential import conflicts and improving code organization.

**Negative:**

- **Increased File Count**: More files are introduced, potentially increasing initial navigation overhead (mitigated by clear naming and directory structure).
- **Refactoring Overhead**: Required updating existing code to use the new components and adjusting imports across multiple files.

## Alternatives Considered

- **Keeping Monolithic `BaseDataCollector`**: Rejected due to the drawbacks of tightly coupled code (poor testability, maintainability).
- **Partial Extraction**: Considered extracting only one or two components, but decided against it to achieve maximum modularity benefits.
- **Different `OHLCVData` Location**: Considered `utils/data_types.py` or `data/models.py`, but `data/common/ohlcv_data.py` was deemed most appropriate given its nature as a common data structure within the `data` module.
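
To make the composition described under Decision more concrete, the following is a minimal sketch of how `BaseDataCollector` might delegate to the three extracted components. Only the four class names come from this ADR. Every method and attribute (`set_status`, `record_message`, `connect`, `disconnect`, `add_callback`, `dispatch`, `handle_message`, and so on) is a hypothetical placeholder, and the sketch is synchronous with no real WebSocket I/O, whereas the actual implementation may well be asynchronous.

```python
from typing import Any, Callable, Dict, List, Optional


class CollectorStateAndTelemetry:
    """Tracks status and simple counters (hypothetical API, not the real one)."""

    def __init__(self) -> None:
        self.status: str = "stopped"
        self.stats: Dict[str, int] = {"messages": 0}

    def set_status(self, status: str) -> None:
        self.status = status

    def record_message(self) -> None:
        self.stats["messages"] += 1


class ConnectionManager:
    """Stand-in for the WebSocket lifecycle; it only flips a flag here."""

    def __init__(self) -> None:
        self.connected: bool = False

    def connect(self) -> None:
        self.connected = True

    def disconnect(self) -> None:
        self.connected = False


class CallbackDispatcher:
    """Keeps registered listeners and forwards each data item to all of them."""

    def __init__(self) -> None:
        self._callbacks: List[Callable[[Any], None]] = []

    def add_callback(self, callback: Callable[[Any], None]) -> None:
        self._callbacks.append(callback)

    def dispatch(self, data: Any) -> None:
        for callback in self._callbacks:
            callback(data)


class BaseDataCollector:
    """Orchestrates the components; holds no connection or callback logic itself."""

    def __init__(
        self,
        state: Optional[CollectorStateAndTelemetry] = None,
        connection: Optional[ConnectionManager] = None,
        dispatcher: Optional[CallbackDispatcher] = None,
    ) -> None:
        # Components are injectable so tests can substitute fakes.
        self.state = state if state is not None else CollectorStateAndTelemetry()
        self.connection = connection if connection is not None else ConnectionManager()
        self.dispatcher = dispatcher if dispatcher is not None else CallbackDispatcher()

    def start(self) -> None:
        self.connection.connect()
        self.state.set_status("running")

    def stop(self) -> None:
        self.connection.disconnect()
        self.state.set_status("stopped")

    def handle_message(self, data: Any) -> None:
        self.state.record_message()
        self.dispatcher.dispatch(data)


# Usage: register a listener, feed one message through, then stop.
received: List[Any] = []
collector = BaseDataCollector()
collector.dispatcher.add_callback(received.append)
collector.start()
collector.handle_message({"symbol": "BTC-USDT", "close": 100.0})
collector.stop()
```

Because each component is passed in through the constructor in this sketch, a unit test can exercise `CallbackDispatcher` or `ConnectionManager` on its own, or hand `BaseDataCollector` a fake, which is the testability benefit listed under Consequences.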