TCPDashboard/tasks/collector-service-tasks-optimization.md


## Relevant Files
- `data/collector_manager.py` - Core manager for data collectors (refactored: 563→178 lines).
- `data/collection_service.py` - Main service for data collection.
- `data/collector_types.py` - Shared data types for collector management (new file).
- `data/manager_components/` - Component classes for modular manager architecture (new directory).
- `data/__init__.py` - Updated imports for new structure.
- `tests/test_collector_manager.py` - Unit tests for `collector_manager.py` (imports updated).
- `tests/test_data_collection_aggregation.py` - Integration tests (imports updated).
- `scripts/production_clean.py` - Production script (verified working).
- `scripts/start_data_collection.py` - Data collection script (verified working).
## Code Review Analysis: `collection_service.py` & `collector_manager.py`
### Overall Assessment
Both files show good foundational architecture but exceed the recommended file size limits and contain several areas for improvement.
### 📏 File Size Violations
- **`collector_manager.py`**: 563 lines (❌ Exceeds 250-line limit by 125%)
- **`collection_service.py`**: 451 lines (❌ Exceeds 250-line limit by 80%)
### 🔍 Function Size Analysis
**Functions Exceeding 50-Line Limit:**
**`collector_manager.py`:**
- `__init__()` - 65 lines
- `_global_health_monitor()` - 71 lines
- `get_status()` - 53 lines
**`collection_service.py`:**
- `_create_default_config()` - 89 lines
- `run()` - 98 lines
### 🏗️ Architecture & Design Issues
1. **Tight Coupling in CollectorManager**
   - **Issue**: The manager class handles too many responsibilities (collector lifecycle, health monitoring, statistics, logging).
   - **Solution**: Apply the Single Responsibility Principle by creating dedicated component classes.
2. **Configuration Management Complexity**
   - **Issue**: Configuration logic is scattered across multiple methods.
   - **Solution**: Introduce a dedicated configuration manager for centralized handling.
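
The split proposed in item 1 could look roughly like the sketch below. `StatsTracker` and `LifecycleManager` are hypothetical stand-ins, not the actual classes under `data/manager_components/`; the point is only that `CollectorManager` becomes a thin facade that delegates to components with one responsibility each.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class StatsTracker:
    """Hypothetical component: owns counters and nothing else."""
    counts: dict[str, int] = field(default_factory=dict)

    def record(self, event: str) -> None:
        self.counts[event] = self.counts.get(event, 0) + 1


class LifecycleManager:
    """Hypothetical component: owns the collector registry only."""

    def __init__(self, stats: StatsTracker) -> None:
        self._collectors: dict[str, object] = {}
        self._stats = stats

    def add_collector(self, name: str, collector: object) -> None:
        self._collectors[name] = collector
        self._stats.record("added")

    def remove_collector(self, name: str) -> bool:
        removed = self._collectors.pop(name, None) is not None
        if removed:
            self._stats.record("removed")
        return removed

    def names(self) -> list[str]:
        return sorted(self._collectors)


class CollectorManager:
    """Thin facade: wires the components together and delegates to them."""

    def __init__(self) -> None:
        self.stats = StatsTracker()
        self.lifecycle = LifecycleManager(self.stats)

    def add_collector(self, name: str, collector: object) -> None:
        self.lifecycle.add_collector(name, collector)

    def get_status(self) -> dict:
        return {"collectors": self.lifecycle.names(), "stats": dict(self.stats.counts)}
```

With this shape, each component can be unit-tested without constructing the full manager.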
### 🔒 Security & Error Handling Review
**Strengths:**
- Proper exception handling with context
- No hardcoded credentials
- Graceful shutdown handling
- Input validation in configuration
**Areas for Improvement:**
1. **Error Information Leakage**
   - **Issue**: Error messages could leak internal details when logged.
   - **Solution**: Sanitize error messages before logging.
2. **Configuration File Security**
   - **Issue**: Configuration file permissions are not validated.
   - **Solution**: Add validation to ensure appropriate file permissions.
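
Both improvements can be sketched with the standard library alone. The function names and the redaction patterns below are assumptions for illustration; what counts as sensitive, and which permission bits are acceptable, depends on the deployment. The permission check assumes a POSIX file system.

```python
import os
import re
import stat

# Assumed patterns: absolute-looking paths and key=value style secrets.
_PATH_RE = re.compile(r"(?:/[\w.\-]+){2,}")
_SECRET_RE = re.compile(r"\b(password|token|api_key|secret)=\S+", re.IGNORECASE)


def sanitize_error(exc: Exception) -> str:
    """Return a log-safe summary: exception type plus a scrubbed message."""
    message = _SECRET_RE.sub(r"\1=<redacted>", str(exc))
    message = _PATH_RE.sub("<path>", message)
    return f"{type(exc).__name__}: {message}"


def config_permissions_ok(path: str) -> bool:
    """True if neither group nor others have any access to the file (POSIX only)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
```

A caller might refuse to load a configuration file when `config_permissions_ok` returns `False`, or only log a warning; the review does not say which policy is intended.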
### 🚀 Performance Optimization Opportunities
1. **Async Task Management**
   - **Issue**: Potential memory leaks with untracked tasks.
   - **Solution**: Implement proper task lifecycle management with a `TaskManager`.
2. **Statistics Collection Optimization**
   - **Issue**: Statistics calculated on every status request.
   - **Solution**: Use cached statistics with background updates via a `CachedStatsManager`.
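
A minimal sketch of both ideas follows. The method names are assumptions, not the planned API. Note that `CachedStats` here is a simpler read-through cache with a time-to-live, standing in for the background-update design named above; it still avoids recomputing on every status request.

```python
from __future__ import annotations

import asyncio
import time
from typing import Any, Callable, Coroutine


class TaskManager:
    """Keeps strong references to tasks and drops them when they finish."""

    def __init__(self) -> None:
        self._tasks: set[asyncio.Task] = set()

    def create(self, coro: Coroutine[Any, Any, Any], name: str | None = None) -> asyncio.Task:
        task = asyncio.create_task(coro, name=name)
        self._tasks.add(task)
        # Remove the reference once the task is done so the set cannot grow forever.
        task.add_done_callback(self._tasks.discard)
        return task

    @property
    def active(self) -> int:
        return len(self._tasks)

    async def shutdown(self) -> None:
        """Cancel everything still running and wait for it to settle."""
        pending = list(self._tasks)
        for task in pending:
            task.cancel()
        await asyncio.gather(*pending, return_exceptions=True)


class CachedStats:
    """Recomputes statistics at most once per `ttl` seconds."""

    def __init__(self, compute: Callable[[], dict], ttl: float = 5.0) -> None:
        self._compute = compute
        self._ttl = ttl
        self._value: dict | None = None
        self._stamp = 0.0

    def get(self) -> dict:
        now = time.monotonic()
        if self._value is None or now - self._stamp >= self._ttl:
            self._value = self._compute()
            self._stamp = now
        return self._value
```

`TaskManager.create` must be called from within a running event loop, since it relies on `asyncio.create_task`.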
### 🧪 Testing & Maintainability
**Missing Test Coverage Areas:**
1. Collector manager state transitions
2. Health monitoring edge cases
3. Configuration validation
4. Signal handling
5. Concurrent collector operations
### 📝 Documentation Improvements
1. **Missing API Documentation**
   - **Issue**: Public methods and classes lack comprehensive docstrings.
   - **Solution**: Add docstrings with examples, thread-safety notes, and performance considerations.
2. **Configuration Schema Documentation**
   - **Issue**: No formal schema validation.
   - **Solution**: Implement JSON schema validation for configurations.
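
To illustrate what schema validation would catch, here is a small hand-rolled checker that covers only the `type`, `required`, and `properties` keywords of JSON Schema. It is a stand-in to show the idea, not a full implementation; a dedicated validation library may be the better choice in practice. The schema keys (`collectors`, `health_check_interval`) are hypothetical, since the review does not list the real configuration fields.

```python
# Hypothetical schema; the real configuration keys are not listed in this review.
SERVICE_SCHEMA = {
    "type": "object",
    "required": ["collectors", "health_check_interval"],
    "properties": {
        "collectors": {"type": "array"},
        "health_check_interval": {"type": "number"},
    },
}

_TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "number": (int, float),
    "boolean": bool,
}


def validate(instance, schema, path="$"):
    """Return a list of error strings; an empty list means the instance is valid."""
    errors = []
    expected = schema.get("type")
    if expected:
        ok = isinstance(instance, _TYPES[expected])
        # bool is a subclass of int in Python, so reject it explicitly for numbers.
        if expected == "number" and isinstance(instance, bool):
            ok = False
        if not ok:
            return [f"{path}: expected {expected}, got {type(instance).__name__}"]
    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"{path}.{key}: missing required key")
        for key, sub in schema.get("properties", {}).items():
            if key in instance:
                errors.extend(validate(instance[key], sub, f"{path}.{key}"))
    return errors
```

Returning a list of errors instead of raising on the first one lets the service report every configuration problem in a single pass.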
### 📊 Quality Metrics Summary
| Metric | Current | Target | Status |
|--------|---------|--------|--------|
| File Size | 563/451 lines | <250 lines | ❌ |
| Function Size | 5 functions >50 lines | 0 functions >50 lines | ❌ |
| Cyclomatic Complexity | Medium-High | Low-Medium | ⚠️ |
| Test Coverage | ~30% estimated | >80% | ❌ |
| Documentation | Basic | Comprehensive | ⚠️ |
| Error Handling | Good | Excellent | ✅ |
## Tasks
- [x] 1.0 Refactor `collector_manager.py` for Modularity and Readability
  - [x] 1.1 Extract `ManagerStatus` and `CollectorConfig` dataclasses to `data/collector_types.py`.
  - [x] 1.2 Create `data/manager_components/collector_lifecycle_manager.py` to handle `add_collector`, `remove_collector`, `enable_collector`, `disable_collector`, `_start_collector`, `restart_collector`, `restart_all_collectors`.
  - [x] 1.3 Create `data/manager_components/manager_health_monitor.py` to encapsulate `_global_health_monitor` logic.
  - [x] 1.4 Create `data/manager_components/manager_stats_tracker.py` to manage statistics in `get_status` and update `_stats`.
  - [x] 1.5 Create `data/manager_components/manager_logger.py` to centralize logging methods (`_log_debug`, `_log_info`, `_log_warning`, `_log_error`, `_log_critical`).
  - [x] 1.6 Update `CollectorManager` to use instances of these new component classes.
  - [x] 1.7 Ensure the `CollectorManager` `__init__` method is under 50 lines by delegating initialization to helper methods within the class or to component classes.
- [x] 2.0 Refactor `collection_service.py` for Improved Structure
  - [x] 2.1 Create `config/service_config.py` to handle `_load_config` and `_create_default_config` logic, including schema validation.
  - [x] 2.2 Create `data/collector_factory.py` to encapsulate `_create_collector` logic.
  - [x] 2.3 Update `DataCollectionService` to use instances of these new component classes.
  - [x] 2.4 Refactor the `run()` method to be under 50 lines by extracting sub-steps (e.g., `_run_main_loop`).
  - [x] 2.5 Test `./scripts/start_data_collection.py` and `./scripts/production_clean.py` to ensure they work as expected.
- [ ] 3.0 Enhance Error Handling and Security
  - [ ] 3.1 Implement a `_sanitize_error` method in `CollectorManager` and `DataCollectionService` to prevent leaking internal error details.
  - [ ] 3.2 Add file permission validation for configuration files in `config/service_config.py`.
  - [ ] 3.3 Review all `try`/`except` blocks to ensure specific exceptions are caught rather than broad `Exception`.
  - [ ] 3.4 Ensure all error and critical logger calls include `exc_info=True`.
  - [ ] 3.5 Test `./scripts/start_data_collection.py` and `./scripts/production_clean.py` to ensure they work as expected.
- [ ] 4.0 Optimize Performance and Resource Management
  - [ ] 4.1 Implement a `TaskManager` class in `utils/async_task_manager.py` to manage and track `asyncio.Task` instances in `CollectorManager` and `DataCollectionService`.
  - [ ] 4.2 Introduce a `CachedStatsManager` in `data/manager_components/manager_stats_tracker.py` for `CollectorManager` to cache statistics and update them periodically instead of on every `get_status` call.
  - [ ] 4.3 Review all `asyncio.sleep` calls for optimal intervals.
  - [ ] 4.4 Test `./scripts/start_data_collection.py` and `./scripts/production_clean.py` to ensure they work as expected.
- [ ] 5.0 Improve Documentation and Test Coverage
  - [ ] 5.1 Add comprehensive docstrings to all public methods and classes in `CollectorManager` and `DataCollectionService`, including examples, thread safety notes, and performance considerations.
  - [ ] 5.2 Create new unit test files: `tests/data/manager_components/test_collector_lifecycle_manager.py`, `tests/data/manager_components/test_manager_health_monitor.py`, `tests/data/manager_components/test_manager_stats_tracker.py`, `tests/config/test_service_config.py`, `tests/data/test_collector_factory.py`.
  - [ ] 5.3 Write unit tests for all new components (lifecycle manager, health monitor, stats tracker, service config, collector factory).
  - [ ] 5.4 Enhance existing tests or create new ones for `CollectorManager` to cover state transitions, health monitoring edge cases, and concurrent operations.
  - [ ] 5.5 Enhance existing tests or create new ones for `DataCollectionService` to cover configuration validation, service lifecycle, and signal handling.
  - [ ] 5.6 Run all tests with `uv run pytest` and verify that they pass.
  - [ ] 5.7 Test `./scripts/start_data_collection.py` and `./scripts/production_clean.py` to ensure they work as expected.
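
For tasks 3.3 and 3.4, the intended pattern might look like the sketch below: catch only the exceptions the operation is expected to raise, and pass `exc_info=True` so the traceback is attached to the log record. The `load_config` function and the logger name are hypothetical and only serve to show the pattern.

```python
from __future__ import annotations

import json
import logging

logger = logging.getLogger("collector_service")  # hypothetical logger name


def load_config(path: str) -> dict | None:
    """Load a JSON config, catching only the failures we expect."""
    try:
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
    except FileNotFoundError:
        # exc_info=True attaches the active exception's traceback to the record.
        logger.error("Config file not found: %s", path, exc_info=True)
    except json.JSONDecodeError:
        logger.error("Config file is not valid JSON: %s", path, exc_info=True)
    return None
```

Any other exception (for example `PermissionError`) propagates to the caller instead of being swallowed by a broad `except Exception`.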