# Unified Logging System The TCP Dashboard project uses a unified logging system built on Python's standard `logging` library. It provides consistent, centralized logging across all components. ## Key Features - **Component-based Logging**: Each component (e.g., `bot_manager`, `data_collector`) gets its own dedicated logger, with logs organized into separate directories under `logs/`. - **Standardized & Simple**: Relies on standard Python `logging` handlers, making it robust and easy to maintain. - **Date-based Rotation**: Log files are automatically rotated daily at midnight by `TimedRotatingFileHandler`. - **Automatic Cleanup**: Log file retention is managed automatically based on the number of backup files to keep (`backupCount`), preventing excessive disk usage. - **Unified Format**: All log messages follow a detailed, consistent format: `[YYYY-MM-DD HH:MM:SS - LEVEL - pathname:lineno - funcName] - message]`. - **Configurable Console Output**: Optional console output for real-time monitoring, configurable via function arguments or environment variables. ## Usage ### Getting a Logger The primary way to get a logger is via the `get_logger` function. It is thread-safe and ensures that loggers are configured only once. ```python from utils.logger import get_logger # Get a logger for the bot manager component # This will create a file logger and, if verbose=True, a console logger. logger = get_logger('bot_manager', verbose=True) logger.info("Bot started successfully") logger.debug("Connecting to database...") logger.warning("API response time is high") logger.error("Failed to execute trade", exc_info=True) ``` ### Configuration The `get_logger` function accepts the following parameters: | Parameter | Type | Default | Description | |-------------------|---------------------|---------|-----------------------------------------------------------------------------| | `component_name` | `str` | `default_logger` | Name of the component (e.g., `bot_manager`). Used for the logger name and directory. | | `log_level` | `str` | `INFO` | The minimum logging level to be processed (DEBUG, INFO, WARNING, ERROR, CRITICAL). | | `verbose` | `Optional[bool]` | `None` | If `True`, enables console logging. If `None`, uses `VERBOSE_LOGGING` or `LOG_TO_CONSOLE` from environment variables. | | `max_log_files` | `int` | `30` | The maximum number of backup log files to keep. The core of the log cleanup mechanism. | | `clean_old_logs` | `bool` | `True` | **Deprecated**. Kept for backward compatibility but has no effect. Cleanup is controlled by `max_log_files`. | For centralized control, you can use environment variables: - `VERBOSE_LOGGING`: Set to `true` to enable console logging for all loggers. - `LOG_TO_CONSOLE`: An alias for `VERBOSE_LOGGING`. ### Log File Structure The logger creates a directory for each component inside `logs/`. The main log file is named `component_name.log`. When rotated, old logs are renamed with a date suffix. ``` logs/ ├── bot_manager/ │ ├── bot_manager.log (current log file) │ └── bot_manager.log.2023-11-15 ├── data_collector/ │ ├── data_collector.log │ └── data_collector.log.2023-11-15 └── default_logger/ └── default_logger.log └── test_component/ └── test_component.log ``` ## Advanced Usage ### Age-Based Log Cleanup While the primary cleanup mechanism is count-based (via `max_log_files`), a separate utility function, `cleanup_old_logs`, is available for age-based cleanup if you have specific retention policies. ```python from utils.logger import cleanup_old_logs # Deletes all log files in the 'bot_manager' directory older than 15 days cleanup_old_logs('bot_manager', days_to_keep=15) ``` ### Shutting Down Logging In some cases, especially in tests or when an application is shutting down gracefully, you may need to explicitly close all log file handlers. ```python from utils.logger import shutdown_logging # Closes all open file handlers managed by the logging system shutdown_logging() ``` ## Component Integration Pattern (Conditional Logging) While the logger utility is simple, it is designed to support a powerful conditional logging pattern at the application level. This allows components to be developed to run with or without logging, making them more flexible and easier to test. ### Key Concepts 1. **Optional Logging**: Components are designed to accept `logger=None` in their constructor and function normally without producing any logs. 2. **Error-Only Mode**: A component can be designed to only log messages of level `ERROR` or higher. This is a component-level implementation pattern, not a feature of `get_logger`. 3. **Logger Inheritance**: Parent components can pass their logger instance to child components, ensuring a consistent logging context. ### Example: Component Constructor All major components should follow this pattern to support conditional logging. ```python class ComponentExample: def __init__(self, logger=None, log_errors_only=False): self.logger = logger self.log_errors_only = log_errors_only def _log_info(self, message: str) -> None: """Log info message if logger is available and not in errors-only mode.""" if self.logger and not self.log_errors_only: self.logger.info(message) def _log_error(self, message: str, exc_info: bool = False) -> None: """Log error message if logger is available.""" if self.logger: self.logger.error(message, exc_info=exc_info) # ... other helper methods for debug, warning, critical ... ``` This pattern decouples the component's logic from the global logging configuration and makes its logging behavior explicit and easy to manage. ## Troubleshooting - **Permissions**: Ensure the application has write permissions to the `logs/` directory. - **No Logs**: If file logging fails (e.g., due to permissions), a warning is printed to the console. If `verbose` is not enabled, no further logs will be produced. Ensure the `logs/` directory is writable. - **Console Spam**: If the console is too noisy, set `verbose=False` when calling `get_logger` and ensure `VERBOSE_LOGGING` is not set to `true` in your environment. ## Best Practices ### 1. Component Naming Use descriptive, consistent component names: - `bot_manager` - for bot lifecycle management - `data_collector` - for market data collection - `strategies` - for trading strategies - `backtesting` - for backtesting engine - `dashboard` - for web dashboard ### 2. Log Level Guidelines - **DEBUG**: Detailed diagnostic information, typically only of interest when diagnosing problems - **INFO**: General information about program execution - **WARNING**: Something unexpected happened, but the program is still working - **ERROR**: A serious problem occurred, the program couldn't perform a function - **CRITICAL**: A serious error occurred, the program may not be able to continue ### 3. Verbose Logging Guidelines ```python # Development: Use verbose logging with DEBUG level dev_logger = get_logger('component', 'DEBUG', verbose=True, max_log_files=3) # Production: Use INFO level with no console output prod_logger = get_logger('component', 'INFO', verbose=False, max_log_files=30) # Testing: Disable cleanup to preserve test logs test_logger = get_logger('test_component', 'DEBUG', verbose=True, clean_old_logs=False) ``` ### 4. Log Retention Guidelines ```python # High-frequency components (data collectors): shorter retention data_logger = get_logger('data_collector', max_log_files=7) # Important components (bot managers): longer retention bot_logger = get_logger('bot_manager', max_log_files=30) # Development: very short retention dev_logger = get_logger('dev_component', max_log_files=3) ``` ### 5. Message Content ```python # Good: Descriptive and actionable logger.error("Failed to connect to OKX API: timeout after 30s") # Bad: Vague and unhelpful logger.error("Error occurred") # Good: Include relevant context logger.info(f"Bot {bot_id} executed trade: {symbol} {side} {quantity}@{price}") # Good: Include duration for performance monitoring start_time = time.time() # ... do work ... duration = time.time() - start_time logger.info(f"Data aggregation completed in {duration:.2f}s") ``` ### 6. Exception Handling ```python try: execute_trade(symbol, quantity, price) logger.info(f"Trade executed successfully: {symbol}") except APIError as e: logger.error(f"API error during trade execution: {e}", exc_info=True) raise except ValidationError as e: logger.warning(f"Trade validation failed: {e}") return False except Exception as e: logger.critical(f"Unexpected error during trade execution: {e}", exc_info=True) raise ``` ### 7. Performance Considerations ```python # Good: Efficient string formatting logger.debug(f"Processing {len(data)} records") # Avoid: Expensive operations in log messages unless necessary # logger.debug(f"Data: {expensive_serialization(data)}") # Only if needed # Better: Check log level first for expensive operations if logger.isEnabledFor(logging.DEBUG): logger.debug(f"Data: {expensive_serialization(data)}") ``` ## Migration Guide ### Updating Existing Components 1. **Add logger parameter to constructor**: ```python def __init__(self, ..., logger=None, log_errors_only=False): ``` 2. **Add conditional logging helpers**: ```python def _log_debug(self, message: str) -> None: if self.logger and not self.log_errors_only: self.logger.debug(message) ``` 3. **Update all logging calls**: ```python # Before self.logger.info("Message") # After self._log_info("Message") ``` 4. **Pass logger to child components**: ```python child = ChildComponent(logger=self.logger) ``` ### From Standard Logging ```python # Old logging (if any existed) import logging logging.basicConfig(level=logging.INFO) logger = logging.getLogger(__name__) # New unified logging from utils.logger import get_logger logger = get_logger('component_name', verbose=True) ``` ### Gradual Adoption 1. **Phase 1**: Add optional logger parameters to new components 2. **Phase 2**: Update existing components to support conditional logging 3. **Phase 3**: Implement hierarchical logging structure 4. **Phase 4**: Add error-only logging mode ## Testing ### Testing Conditional Logging #### Test Script Example ```python # test_conditional_logging.py from utils.logger import get_logger from data.collector_manager import CollectorManager from data.exchanges.okx.collector import OKXCollector def test_no_logging(): """Test components work without loggers.""" manager = CollectorManager(logger=None) collector = OKXCollector("BTC-USDT", logger=None) print("✓ No logging test passed") def test_with_logging(): """Test components work with loggers.""" logger = get_logger('test_system') manager = CollectorManager(logger=logger) collector = OKXCollector("BTC-USDT", logger=logger) print("✓ With logging test passed") def test_error_only(): """Test error-only logging mode.""" logger = get_logger('test_errors') collector = OKXCollector("BTC-USDT", logger=logger, log_errors_only=True) print("✓ Error-only logging test passed") if __name__ == "__main__": test_no_logging() test_with_logging() test_error_only() print("✅ All conditional logging tests passed!") ``` ### Testing Changes ```python # Test without logger component = MyComponent(logger=None) # Should work without errors, no logging # Test with logger logger = get_logger('test_component') component = MyComponent(logger=logger) # Should log normally # Test error-only mode component = MyComponent(logger=logger, log_errors_only=True) # Should only log errors ``` ### Basic System Test Run a simple test to verify the logging system: ```bash python -c "from utils.logger import get_logger; logger = get_logger('test', verbose=True); logger.info('Test message'); print('Check logs/test/ directory')" ``` ## Troubleshooting ### Common Issues 1. **Permission errors**: Ensure the application has write permissions to the project directory 2. **Disk space**: Monitor disk usage and adjust log retention with `max_log_files` 3. **Threading issues**: The logger is thread-safe, but check for application-level concurrency issues 4. **Too many console messages**: Adjust `verbose` parameter or log levels ### Debug Mode Enable debug logging to troubleshoot issues: ```python logger = get_logger('component_name', 'DEBUG', verbose=True) ``` ### Console Output Issues ```python # Force console output regardless of environment logger = get_logger('component_name', verbose=True) # Check environment variables import os print(f"VERBOSE_LOGGING: {os.getenv('VERBOSE_LOGGING')}") print(f"LOG_TO_CONSOLE: {os.getenv('LOG_TO_CONSOLE')}") ``` ### Fallback Logging If file logging fails, the system automatically falls back to console logging with a warning message. ## Integration with Existing Code The logging system is designed to be gradually adopted: 1. **Start with new modules**: Use the unified logger in new code 2. **Replace existing logging**: Gradually migrate existing logging to the unified system 3. **No breaking changes**: Existing code continues to work ## Maintenance ### Automatic Cleanup Benefits The automatic cleanup feature provides several benefits: - **Disk space management**: Prevents log directories from growing indefinitely - **Performance**: Fewer files to scan in log directories - **Maintenance-free**: No need for external cron jobs or scripts - **Component-specific**: Each component can have different retention policies ### Manual Cleanup for Special Cases For cases requiring age-based cleanup instead of count-based: ```python # cleanup_logs.py from utils.logger import cleanup_old_logs components = ['bot_manager', 'data_collector', 'strategies', 'dashboard'] for component in components: cleanup_old_logs(component, days_to_keep=30) ``` ### Monitoring Disk Usage Monitor the `logs/` directory size and adjust retention policies as needed: ```bash # Check log directory size du -sh logs/ # Find large log files find logs/ -name "*.txt" -size +10M # Count log files per component find logs/ -name "*.txt" | cut -d'/' -f2 | sort | uniq -c ``` This conditional logging system provides maximum flexibility while maintaining clean, maintainable code that works in all scenarios.