Implement data collection architecture with modular components
- Introduced a comprehensive data collection framework, including `CollectorServiceConfig`, `BaseDataCollector`, and `CollectorManager`, enhancing modularity and maintainability.
- Developed `CollectorFactory` for streamlined collector creation, promoting separation of concerns and improved configuration handling.
- Enhanced `DataCollectionService` to utilize the new architecture, ensuring robust error handling and logging practices.
- Added `TaskManager` for efficient management of asynchronous tasks, improving performance and resource management.
- Implemented health monitoring and auto-recovery features in `CollectorManager`, ensuring reliable operation of data collectors.
- Updated imports across the codebase to reflect the new structure, ensuring consistent access to components.

These changes significantly improve the architecture and maintainability of the data collection service, aligning with project standards for modularity, performance, and error handling.
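The commit message names `TaskManager` but the diff shown here does not include its source. As a rough illustration of what "management of asynchronous tasks" could look like, here is a minimal sketch. The method names (`create_task`, `cancel_all`, `active_count`) and the behaviour are assumptions made for this example, not the contents of `utils/async_task_manager.py`.

```python
import asyncio
from typing import Any, Coroutine, Dict


class TaskManager:
    """Hypothetical sketch: tracks asyncio tasks by name so they can be cancelled together."""

    def __init__(self) -> None:
        self._tasks: Dict[str, asyncio.Task] = {}

    def create_task(self, name: str, coro: Coroutine[Any, Any, Any]) -> asyncio.Task:
        # Must be called from inside a running event loop.
        task = asyncio.create_task(coro, name=name)
        self._tasks[name] = task
        # Drop the registry entry once the task finishes or is cancelled.
        task.add_done_callback(lambda t, n=name: self._on_done(n, t))
        return task

    def _on_done(self, name: str, task: asyncio.Task) -> None:
        # Only remove the entry if it still refers to this exact task.
        if self._tasks.get(name) is task:
            del self._tasks[name]

    @property
    def active_count(self) -> int:
        return len(self._tasks)

    async def cancel_all(self) -> None:
        tasks = list(self._tasks.values())
        for task in tasks:
            task.cancel()
        # Wait for cancellation to complete; collect exceptions instead of raising.
        await asyncio.gather(*tasks, return_exceptions=True)
        self._tasks.clear()


async def demo() -> int:
    manager = TaskManager()
    manager.create_task("heartbeat", asyncio.sleep(3600))
    manager.create_task("poller", asyncio.sleep(3600))
    await asyncio.sleep(0)  # yield once so both tasks start running
    await manager.cancel_all()
    return manager.active_count
```

Keeping every task in one registry is what would let a service shut down without leaving orphaned coroutines behind; the real class may track more (timeouts, statistics, error logging) than this sketch does.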
@@ -1,9 +1,11 @@
 ## Relevant Files
 
-- `data/collector_manager.py` - Core manager for data collectors (refactored: 563→178 lines).
-- `data/collection_service.py` - Main service for data collection.
+- `data/collector_manager.py` - Core manager for data collectors (refactored: 563→178 lines, enhanced with TaskManager).
+- `data/collection_service.py` - Main service for data collection (enhanced with TaskManager).
 - `data/collector_types.py` - Shared data types for collector management (new file).
 - `data/manager_components/` - Component classes for modular manager architecture (new directory).
+- `data/manager_components/manager_stats_tracker.py` - Enhanced with performance monitoring and cache optimization.
+- `utils/async_task_manager.py` - New comprehensive async task management utility (new file).
 - `data/__init__.py` - Updated imports for new structure.
 - `tests/test_collector_manager.py` - Unit tests for `collector_manager.py` (imports updated).
 - `tests/test_data_collection_aggregation.py` - Integration tests (imports updated).
@@ -113,11 +115,11 @@ Both files show good foundational architecture but exceed the recommended file s
 - [x] 3.5 Test './scripts/start_data_collection.py' and './scripts/production_clean.py' to ensure they work as expected.
 
-- [ ] 4.0 Optimize Performance and Resource Management
-- [ ] 4.1 Implement a `TaskManager` class in `utils/async_task_manager.py` to manage and track `asyncio.Task` instances in `CollectorManager` and `DataCollectionService`.
-- [ ] 4.2 Introduce a `CachedStatsManager` in `data/manager_components/manager_stats_tracker.py` for `CollectorManager` to cache statistics and update them periodically instead of on every `get_status` call.
-- [ ] 4.3 Review all `asyncio.sleep` calls for optimal intervals.
-- [ ] 4.4 Test './scripts/start_data_collection.py' and './scripts/production_clean.py' to ensure they work as expected.
+- [x] 4.0 Optimize Performance and Resource Management
+- [x] 4.1 Implement a `TaskManager` class in `utils/async_task_manager.py` to manage and track `asyncio.Task` instances in `CollectorManager` and `DataCollectionService`.
+- [x] 4.2 Introduce a `CachedStatsManager` in `data/manager_components/manager_stats_tracker.py` for `CollectorManager` to cache statistics and update them periodically instead of on every `get_status` call.
+- [x] 4.3 Review all `asyncio.sleep` calls for optimal intervals.
+- [x] 4.4 Test './scripts/start_data_collection.py' and './scripts/production_clean.py' to ensure they work as expected.
 
 - [ ] 5.0 Improve Documentation and Test Coverage
 - [ ] 5.1 Add comprehensive docstrings to all public methods and classes in `CollectorManager` and `DataCollectionService`, including examples, thread safety notes, and performance considerations.
 
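Task 4.2 describes caching statistics rather than recomputing them on every `get_status` call. The sketch below shows one way that idea could work, using a time-to-live refresh rather than a background timer. The constructor parameters (`compute_stats`, `ttl_seconds`, `clock`) are hypothetical and are not taken from `manager_stats_tracker.py`.

```python
import time
from typing import Any, Callable, Dict, Optional


class CachedStatsManager:
    """Hypothetical sketch: recompute stats only when the cached copy is older than a TTL."""

    def __init__(
        self,
        compute_stats: Callable[[], Dict[str, Any]],
        ttl_seconds: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._compute = compute_stats
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._cached: Optional[Dict[str, Any]] = None
        self._expires_at = 0.0

    def get_status(self) -> Dict[str, Any]:
        now = self._clock()
        if self._cached is None or now >= self._expires_at:
            # Cache miss or stale entry: run the expensive computation once.
            self._cached = self._compute()
            self._expires_at = now + self._ttl
        # Return a copy so callers cannot mutate the cached dictionary.
        return dict(self._cached)
```

With this shape, a caller that polls `get_status` many times per second triggers at most one recomputation per TTL window, at the cost of returning values that may be up to `ttl_seconds` old.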