TCPDashboard/tasks/collector-service-tasks-optimization.md
Vasily.onl f6cb1485b1 Implement data collection architecture with modular components
- Introduced a comprehensive data collection framework, including `CollectorServiceConfig`, `BaseDataCollector`, and `CollectorManager`, enhancing modularity and maintainability.
- Developed `CollectorFactory` for streamlined collector creation, promoting separation of concerns and improved configuration handling.
- Enhanced `DataCollectionService` to utilize the new architecture, ensuring robust error handling and logging practices.
- Added `TaskManager` for efficient management of asynchronous tasks, improving performance and resource management.
- Implemented health monitoring and auto-recovery features in `CollectorManager`, ensuring reliable operation of data collectors.
- Updated imports across the codebase to reflect the new structure, ensuring consistent access to components.

These changes significantly improve the architecture and maintainability of the data collection service, aligning with project standards for modularity, performance, and error handling.
2025-06-10 13:40:28 +08:00


## Relevant Files
- `data/collector_manager.py` - Core manager for data collectors (refactored: 563→178 lines, enhanced with TaskManager).
- `data/collection_service.py` - Main service for data collection (enhanced with TaskManager).
- `data/collector_types.py` - Shared data types for collector management (new file).
- `data/manager_components/` - Component classes for modular manager architecture (new directory).
- `data/manager_components/manager_stats_tracker.py` - Enhanced with performance monitoring and cache optimization.
- `utils/async_task_manager.py` - New comprehensive async task management utility (new file).
- `data/__init__.py` - Updated imports for new structure.
- `tests/test_collector_manager.py` - Unit tests for `collector_manager.py` (imports updated).
- `tests/test_data_collection_aggregation.py` - Integration tests (imports updated).
- `scripts/production_clean.py` - Production script (verified working).
- `scripts/start_data_collection.py` - Data collection script (verified working).
## Code Review Analysis: `collection_service.py` & `collector_manager.py`
### Overall Assessment
Both files have a sound foundational architecture, but each exceeds the 250-line file size limit and contains functions over the 50-line limit, among other areas for improvement.
### 📏 File Size Violations
- **`collector_manager.py`**: 563 lines (❌ Exceeds 250-line limit by 125%)
- **`collection_service.py`**: 451 lines (❌ Exceeds 250-line limit by 80%)
### 🔍 Function Size Analysis
**Functions Exceeding 50-Line Limit:**
**`collector_manager.py`:**
- `__init__()` - 65 lines
- `_global_health_monitor()` - 71 lines
- `get_status()` - 53 lines
**`collection_service.py`:**
- `_create_default_config()` - 89 lines
- `run()` - 98 lines
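
Counts like the ones above can be reproduced mechanically. The following is a minimal sketch, assuming function length is measured from the `def` line to the last line of the body; the review may have counted differently. The helper names `function_lengths` and `oversized` are hypothetical and not part of the codebase.

```python
import ast


def function_lengths(source: str) -> dict:
    """Map each function or method name to the number of source lines it spans.

    Note: functions that share a name (e.g. __init__ in two classes)
    overwrite each other in this simple sketch.
    """
    lengths = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # lineno is the `def` line; end_lineno requires Python 3.8+.
            lengths[node.name] = node.end_lineno - node.lineno + 1
    return lengths


def oversized(source: str, limit: int = 50) -> list:
    """Return the names of functions longer than `limit` lines, sorted."""
    return sorted(name for name, n in function_lengths(source).items() if n > limit)


# A synthetic module: one 2-line function and one 61-line coroutine.
SAMPLE = "def short():\n    return 1\n\nasync def longer():\n" + "    pass\n" * 60
```

Running `oversized(open("data/collector_manager.py").read())` against a real file would list the functions that still need splitting.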
### 🏗️ Architecture & Design Issues
1. **Tight Coupling in `CollectorManager`**
   - **Issue**: The manager class handles too many responsibilities (collector lifecycle, health monitoring, statistics, logging).
   - **Solution**: Apply the Single Responsibility Principle by creating dedicated component classes.
2. **Configuration Management Complexity**
   - **Issue**: Configuration logic is scattered across multiple methods.
   - **Solution**: Introduce a dedicated configuration manager for centralized handling.
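
The component split proposed above can be sketched as a thin facade that delegates to single-purpose classes. This is an illustration only: `LifecycleComponent`, `StatsComponent`, and their methods are hypothetical stand-ins, and the real components in `data/manager_components/` may be shaped differently.

```python
from __future__ import annotations


class LifecycleComponent:
    """Owns the collector registry (hypothetical stand-in)."""

    def __init__(self) -> None:
        self.collectors: dict[str, object] = {}
        self.enabled: set[str] = set()

    def add_collector(self, name: str, collector: object) -> None:
        self.collectors[name] = collector
        self.enabled.add(name)

    def disable_collector(self, name: str) -> None:
        self.enabled.discard(name)


class StatsComponent:
    """Builds status snapshots; knows nothing about starting or stopping collectors."""

    def snapshot(self, lifecycle: LifecycleComponent) -> dict[str, int]:
        return {
            "total": len(lifecycle.collectors),
            "enabled": len(lifecycle.enabled),
        }


class CollectorManager:
    """Thin facade: each public method delegates to exactly one component."""

    def __init__(self) -> None:
        self._lifecycle = LifecycleComponent()
        self._stats = StatsComponent()

    def add_collector(self, name: str, collector: object) -> None:
        self._lifecycle.add_collector(name, collector)

    def disable_collector(self, name: str) -> None:
        self._lifecycle.disable_collector(name)

    def get_status(self) -> dict[str, int]:
        return self._stats.snapshot(self._lifecycle)
```

With this shape the manager's `__init__` stays short, and each component can be unit-tested in isolation.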
### 🔒 Security & Error Handling Review
**Strengths:**
- Proper exception handling with context
- No hardcoded credentials
- Graceful shutdown handling
- Input validation in configuration
**Areas for Improvement:**
1. **Error Information Leakage**
   - **Issue**: Raw exception messages could expose internal details when logged.
   - **Solution**: Sanitize error messages before logging.
2. **Configuration File Security**
   - **Issue**: Configuration file permissions are not validated.
   - **Solution**: Add validation to ensure appropriate file permissions.
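
Both improvements can be sketched with the standard library. The code below is an illustration, not the project's `_sanitize_error` implementation. The redaction patterns are assumptions about what counts as an internal detail, and "appropriate permissions" is assumed to mean "not writable by group or others" on a POSIX system. The function names are hypothetical.

```python
import os
import re
import stat

# Assumed patterns: key=value style secrets and absolute filesystem paths.
_SECRET = re.compile(r"\b(password|token|api_?key|secret)\s*[=:]\s*\S+", re.IGNORECASE)
_PATH = re.compile(r"(?<!\S)(?:[A-Za-z]:)?[\\/][^\s:]+")


def sanitize_error(exc: BaseException, max_len: int = 200) -> str:
    """Return a log-safe summary of an exception with secrets and paths masked."""
    message = _SECRET.sub(r"\1=<redacted>", str(exc))
    message = _PATH.sub("<path>", message)
    # Truncate so that very long payloads never reach the log.
    return f"{type(exc).__name__}: {message[:max_len]}"


def config_permissions_ok(path: str) -> bool:
    """Return True if the file is not writable by group or others (POSIX only)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return not (mode & (stat.S_IWGRP | stat.S_IWOTH))
```

A caller would log `sanitize_error(exc)` instead of `str(exc)`, and refuse to load a configuration file for which `config_permissions_ok` returns `False`.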
### 🚀 Performance Optimization Opportunities
1. **Async Task Management**
   - **Issue**: Untracked `asyncio` tasks are a potential source of memory leaks.
   - **Solution**: Implement proper task lifecycle management with a `TaskManager`.
2. **Statistics Collection Optimization**
   - **Issue**: Statistics are recalculated on every status request.
   - **Solution**: Use cached statistics with background updates via a `CachedStatsManager`.
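
A minimal sketch of both ideas follows. The class names `TaskManager` and `CachedStatsManager` come from this document, but every method, parameter, and default shown here is an assumption and may not match `utils/async_task_manager.py` or `manager_stats_tracker.py`.

```python
from __future__ import annotations

import asyncio
from typing import Any, Callable, Coroutine


class TaskManager:
    """Holds a reference to every task it creates so none is lost or leaked."""

    def __init__(self) -> None:
        self._tasks: set[asyncio.Task] = set()

    def create_task(self, coro: Coroutine[Any, Any, Any], name: str | None = None) -> asyncio.Task:
        task = asyncio.create_task(coro, name=name)
        self._tasks.add(task)
        # Drop the reference as soon as the task finishes or is cancelled.
        task.add_done_callback(self._tasks.discard)
        return task

    @property
    def active_count(self) -> int:
        return len(self._tasks)

    async def shutdown(self) -> None:
        """Cancel all tracked tasks and wait for them to finish."""
        tasks = list(self._tasks)
        for task in tasks:
            task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)


class CachedStatsManager:
    """Serves a cached snapshot; a background coroutine refreshes it periodically."""

    def __init__(self, compute: Callable[[], dict], interval: float = 5.0) -> None:
        self._compute = compute
        self._interval = interval
        self._snapshot = compute()  # initial value so get() never returns nothing

    def get(self) -> dict:
        # Cheap: no recomputation on the request path.
        return self._snapshot

    async def refresh_forever(self) -> None:
        while True:
            await asyncio.sleep(self._interval)
            self._snapshot = self._compute()
```

In this sketch the refresh loop would be started with `task_manager.create_task(stats.refresh_forever())`, so that `shutdown()` also stops the background update.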
### 🧪 Testing & Maintainability
**Missing Test Coverage Areas:**
1. Collector manager state transitions
2. Health monitoring edge cases
3. Configuration validation
4. Signal handling
5. Concurrent collector operations
### 📝 Documentation Improvements
1. **Missing API Documentation**
   - **Issue**: Public methods and classes lack comprehensive docstrings.
   - **Solution**: Add docstrings with examples, thread-safety notes, and performance considerations.
2. **Configuration Schema Documentation**
   - **Issue**: Configurations are not validated against a formal schema.
   - **Solution**: Implement JSON schema validation for configurations.
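
To illustrate what schema validation buys, here is a simplified, standard-library stand-in for JSON Schema that checks only required keys and value types. It is not JSON Schema itself; a full implementation would more likely use a dedicated validator. The keys `collectors`, `health_check_interval`, and `log_level` are hypothetical examples, not the service's actual configuration fields.

```python
import json

# Hypothetical schema: key -> (expected type or tuple of types, required?)
SCHEMA = {
    "collectors": (list, True),
    "health_check_interval": ((int, float), False),
    "log_level": (str, False),
}


def _type_name(expected) -> str:
    types = expected if isinstance(expected, tuple) else (expected,)
    return " or ".join(t.__name__ for t in types)


def validate_config(config: dict, schema: dict = SCHEMA) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for key, (expected, required) in schema.items():
        if key not in config:
            if required:
                errors.append(f"missing required key: {key}")
            continue
        value = config[key]
        # bool is a subclass of int, so reject it explicitly for numeric fields.
        is_stray_bool = isinstance(value, bool) and expected is not bool
        if is_stray_bool or not isinstance(value, expected):
            errors.append(
                f"{key}: expected {_type_name(expected)}, got {type(value).__name__}"
            )
    for key in config:
        if key not in schema:
            errors.append(f"unknown key: {key}")
    return errors


def load_and_validate(text: str) -> dict:
    """Parse JSON text and raise ValueError listing every schema problem."""
    config = json.loads(text)
    problems = validate_config(config)
    if problems:
        raise ValueError("; ".join(problems))
    return config
```

Collecting all problems before raising lets an operator fix a configuration file in one pass rather than one error at a time.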
### 📊 Quality Metrics Summary
| Metric | Current | Target | Status |
|--------|---------|--------|--------|
| File Size | 563/451 lines | <250 lines | ❌ |
| Function Size | 5 functions >50 lines | 0 functions >50 lines | ❌ |
| Cyclomatic Complexity | Medium-High | Low-Medium | ⚠️ |
| Test Coverage | ~30% estimated | >80% | ❌ |
| Documentation | Basic | Comprehensive | ⚠️ |
| Error Handling | Good | Excellent | ✅ |
## Tasks
- [x] 1.0 Refactor `collector_manager.py` for Modularity and Readability
  - [x] 1.1 Extract `ManagerStatus` and `CollectorConfig` dataclasses to `data/collector_types.py`.
  - [x] 1.2 Create `data/manager_components/collector_lifecycle_manager.py` to handle `add_collector`, `remove_collector`, `enable_collector`, `disable_collector`, `_start_collector`, `restart_collector`, `restart_all_collectors`.
  - [x] 1.3 Create `data/manager_components/manager_health_monitor.py` to encapsulate `_global_health_monitor` logic.
  - [x] 1.4 Create `data/manager_components/manager_stats_tracker.py` to manage statistics in `get_status` and update `_stats`.
  - [x] 1.5 Create `data/manager_components/manager_logger.py` to centralize logging methods (`_log_debug`, `_log_info`, `_log_warning`, `_log_error`, `_log_critical`).
  - [x] 1.6 Update `CollectorManager` to use instances of these new component classes.
  - [x] 1.7 Ensure `CollectorManager` `__init__` method is under 50 lines by delegating initialization to helper methods within the class or component classes.
- [x] 2.0 Refactor `collection_service.py` for Improved Structure
  - [x] 2.1 Create `config/service_config.py` to handle `_load_config` and `_create_default_config` logic, including schema validation.
  - [x] 2.2 Create `data/collector_factory.py` to encapsulate `_create_collector` logic.
  - [x] 2.3 Update `DataCollectionService` to use instances of these new component classes.
  - [x] 2.4 Refactor the `run()` method to be under 50 lines by extracting sub-steps into helpers (e.g., `_run_main_loop`).
  - [x] 2.5 Test `./scripts/start_data_collection.py` and `./scripts/production_clean.py` to ensure they work as expected.
- [x] 3.0 Enhance Error Handling and Security
  - [x] 3.1 Implement a `_sanitize_error` method in `CollectorManager` and `DataCollectionService` to prevent leaking internal error details.
  - [x] 3.2 Add file permission validation for configuration files in `config/service_config.py`.
  - [x] 3.3 Review all `try-except` blocks to ensure specific exceptions are caught rather than broad `Exception`.
  - [x] 3.4 Ensure all logger calls include `exc_info=True` for error and critical logs.
  - [x] 3.5 Test `./scripts/start_data_collection.py` and `./scripts/production_clean.py` to ensure they work as expected.
- [x] 4.0 Optimize Performance and Resource Management
  - [x] 4.1 Implement a `TaskManager` class in `utils/async_task_manager.py` to manage and track `asyncio.Task` instances in `CollectorManager` and `DataCollectionService`.
  - [x] 4.2 Introduce a `CachedStatsManager` in `data/manager_components/manager_stats_tracker.py` for `CollectorManager` to cache statistics and update them periodically instead of on every `get_status` call.
  - [x] 4.3 Review all `asyncio.sleep` calls for optimal intervals.
  - [x] 4.4 Test `./scripts/start_data_collection.py` and `./scripts/production_clean.py` to ensure they work as expected.
- [ ] 5.0 Improve Documentation and Test Coverage
  - [ ] 5.1 Add comprehensive docstrings to all public methods and classes in `CollectorManager` and `DataCollectionService`, including examples, thread-safety notes, and performance considerations.
  - [ ] 5.2 Create new unit test files: `tests/data/manager_components/test_collector_lifecycle_manager.py`, `tests/data/manager_components/test_manager_health_monitor.py`, `tests/data/manager_components/test_manager_stats_tracker.py`, `tests/config/test_service_config.py`, `tests/data/test_collector_factory.py`.
  - [ ] 5.3 Write unit tests for all new components (lifecycle manager, health monitor, stats tracker, service config, collector factory).
  - [ ] 5.4 Enhance existing tests or create new ones for `CollectorManager` to cover state transitions, health monitoring edge cases, and concurrent operations.
  - [ ] 5.5 Enhance existing tests or create new ones for `DataCollectionService` to cover configuration validation, service lifecycle, and signal handling.
  - [ ] 5.6 Run all tests with `uv run pytest` and verify that they pass.
  - [ ] 5.7 Test `./scripts/start_data_collection.py` and `./scripts/production_clean.py` to ensure they work as expected.