Unified Logging System
The TCP Dashboard project uses a unified logging system built on Python's standard logging library. It provides consistent, centralized logging across all components.
Key Features
- Component-based Logging: Each component (e.g., `bot_manager`, `data_collector`) gets its own dedicated logger, with logs organized into separate directories under `logs/`.
- Standardized & Simple: Relies on standard Python `logging` handlers, making it robust and easy to maintain.
- Date-based Rotation: Log files are automatically rotated daily at midnight by `TimedRotatingFileHandler` (see the sketch after this list).
- Automatic Cleanup: Log file retention is managed automatically based on the number of backup files to keep (`backupCount`), preventing excessive disk usage.
- Unified Format: All log messages follow a consistent format: `[YYYY-MM-DD HH:MM:SS - LEVEL - message]`.
- Configurable Console Output: Optional verbose console output for real-time monitoring, configurable via function arguments or environment variables.
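The rotation and cleanup behavior described above maps directly onto the standard library. The following is a minimal sketch of that wiring, not the actual `utils.logger` implementation; the log path, level, and format string are illustrative assumptions:

```python
import logging
from logging.handlers import TimedRotatingFileHandler
from pathlib import Path

# Sketch only: illustrative path and settings, not the real get_logger internals.
Path("logs/bot_manager").mkdir(parents=True, exist_ok=True)

handler = TimedRotatingFileHandler(
    "logs/bot_manager/bot_manager.log",
    when="midnight",   # rotate daily at midnight
    backupCount=30,    # keep at most 30 rotated files, deleting older ones
    encoding="utf-8",
)
handler.setFormatter(logging.Formatter(
    "%(asctime)s - %(levelname)s - %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
))

logger = logging.getLogger("bot_manager")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.info("Rotation and retention handled by TimedRotatingFileHandler")
```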
Usage
Getting a Logger
The primary way to get a logger is via the get_logger function. It is thread-safe and ensures that loggers are configured only once.
from utils.logger import get_logger
# Get a logger for the bot manager component
# This will create a file logger and, if verbose=True, a console logger.
logger = get_logger('bot_manager', verbose=True)
logger.info("Bot started successfully")
logger.debug("Connecting to database...")
logger.warning("API response time is high")
logger.error("Failed to execute trade", exc_info=True)
Configuration
The get_logger function accepts the following parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| `component_name` | `str` | - | Name of the component (e.g., `bot_manager`). Used for the logger name and directory. |
| `log_level` | `str` | `INFO` | The minimum logging level to be processed (`DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL`). |
| `verbose` | `Optional[bool]` | `None` | If `True`, enables console logging. If `None`, uses `VERBOSE_LOGGING` or `LOG_TO_CONSOLE` from environment variables. |
| `max_log_files` | `int` | `30` | The maximum number of backup log files to keep. The core of the log cleanup mechanism. |
| `clean_old_logs` | `bool` | `True` | Deprecated. Kept for backward compatibility but has no effect. Cleanup is controlled by `max_log_files`. |
For centralized control, you can use environment variables:
- `VERBOSE_LOGGING`: Set to `true` to enable console logging for all loggers.
- `LOG_TO_CONSOLE`: An alias for `VERBOSE_LOGGING`.
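For example, console output can be enabled without changing any `get_logger` call sites. This sketch assumes the environment variable is read at logger creation time when `verbose` is left as `None`:

```python
import os

# Assumption for illustration: VERBOSE_LOGGING is checked when the logger is created.
os.environ["VERBOSE_LOGGING"] = "true"

from utils.logger import get_logger

logger = get_logger("data_collector")  # verbose defaults to None, so the env var applies
logger.info("Console logging enabled via VERBOSE_LOGGING")
```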
Log File Structure
The logger creates a directory for each component inside logs/. The main log file is named component_name.log. When rotated, old logs are renamed with a date suffix.
logs/
├── bot_manager/
│ ├── bot_manager.log (current log file)
│ └── bot_manager.log.2023-11-15
├── data_collector/
│ ├── data_collector.log
│ └── data_collector.log.2023-11-15
└── test_component/
└── test_component.log
Advanced Usage
Age-Based Log Cleanup
While the primary cleanup mechanism is count-based (via max_log_files), a separate utility function, cleanup_old_logs, is available for age-based cleanup if you have specific retention policies.
from utils.logger import cleanup_old_logs
# Deletes all log files in the 'bot_manager' directory older than 15 days
cleanup_old_logs('bot_manager', days_to_keep=15)
Shutting Down Logging
In some cases, especially in tests or when an application is shutting down gracefully, you may need to explicitly close all log file handlers.
from utils.logger import shutdown_logging
# Closes all open file handlers managed by the logging system
shutdown_logging()
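For example, in a test suite this can be paired with `get_logger` so file handles are released between runs. The fixture below is an illustrative sketch using pytest, which is not required by the logging utility itself:

```python
import pytest
from utils.logger import get_logger, shutdown_logging

@pytest.fixture
def component_logger():
    logger = get_logger("test_component", "DEBUG", verbose=True)
    yield logger
    # Close file handlers so temporary log directories can be removed cleanly
    shutdown_logging()
```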
Component Integration Pattern (Conditional Logging)
While the logger utility itself is simple, it is designed to support a powerful conditional logging pattern at the application level. This allows components to run with or without logging, making them more flexible and easier to test.
Key Concepts
- Optional Logging: Components are designed to accept `logger=None` in their constructor and function normally without producing any logs.
- Error-Only Mode: A component can be designed to log only messages of level `ERROR` or higher. This is a component-level implementation pattern, not a feature of `get_logger`.
- Logger Inheritance: Parent components can pass their logger instance to child components, ensuring a consistent logging context.
Example: Component Constructor
All major components should follow this pattern to support conditional logging.
```python
class ComponentExample:
    def __init__(self, logger=None, log_errors_only=False):
        self.logger = logger
        self.log_errors_only = log_errors_only

    def _log_info(self, message: str) -> None:
        """Log info message if logger is available and not in errors-only mode."""
        if self.logger and not self.log_errors_only:
            self.logger.info(message)

    def _log_error(self, message: str, exc_info: bool = False) -> None:
        """Log error message if logger is available."""
        if self.logger:
            self.logger.error(message, exc_info=exc_info)

    # ... other helper methods for debug, warning, critical ...
```
This pattern decouples the component's logic from the global logging configuration and makes its logging behavior explicit and easy to manage.
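A usage sketch of the pattern follows; the component and message text are illustrative, not part of the logging utility:

```python
from utils.logger import get_logger

parent_logger = get_logger("bot_manager", verbose=True)

# Full logging: helper methods forward to the injected logger
component = ComponentExample(logger=parent_logger)
component._log_info("Component initialized")

# Error-only mode: info messages are suppressed, errors still get through
quiet = ComponentExample(logger=parent_logger, log_errors_only=True)
quiet._log_info("Silently dropped")
quiet._log_error("Still logged")

# No logger: the helpers become no-ops and the component keeps working
silent = ComponentExample(logger=None)
silent._log_info("No output and no error")
```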
Troubleshooting
- Permissions: Ensure the application has write permissions to the `logs/` directory.
- No Logs: If file logging fails (e.g., due to permissions), a warning is printed to the console. If `verbose` is not enabled, no further logs will be produced. Ensure the `logs/` directory is writable.
- Console Spam: If the console is too noisy, set `verbose=False` when calling `get_logger` and ensure `VERBOSE_LOGGING` is not set to `true` in your environment.
Best Practices
1. Component Naming
Use descriptive, consistent component names:
- `bot_manager` - for bot lifecycle management
- `data_collector` - for market data collection
- `strategies` - for trading strategies
- `backtesting` - for the backtesting engine
- `dashboard` - for the web dashboard
2. Log Level Guidelines
- DEBUG: Detailed diagnostic information, typically only of interest when diagnosing problems
- INFO: General information about program execution
- WARNING: Something unexpected happened, but the program is still working
- ERROR: A serious problem occurred, the program couldn't perform a function
- CRITICAL: A serious error occurred, the program may not be able to continue
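As a quick illustration of these guidelines (the component name and messages are made up for the example):

```python
from utils.logger import get_logger

logger = get_logger("data_collector", "DEBUG", verbose=True)

logger.debug("Order book snapshot received: 50 levels")          # diagnostic detail
logger.info("Data collector started for BTC-USDT")               # normal operation
logger.warning("WebSocket reconnected after 3 failed attempts")  # unexpected but recovered
logger.error("Order placement failed: insufficient balance")     # a function could not complete
logger.critical("Database connection lost; shutting down")       # program may not continue
```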
3. Verbose Logging Guidelines
# Development: Use verbose logging with DEBUG level
dev_logger = get_logger('component', 'DEBUG', verbose=True, max_log_files=3)
# Production: Use INFO level with no console output
prod_logger = get_logger('component', 'INFO', verbose=False, max_log_files=30)
# Testing: keep many backups to preserve test logs
# (clean_old_logs is deprecated and has no effect; retention is controlled by max_log_files)
test_logger = get_logger('test_component', 'DEBUG', verbose=True, max_log_files=100)
4. Log Retention Guidelines
# High-frequency components (data collectors): shorter retention
data_logger = get_logger('data_collector', max_log_files=7)
# Important components (bot managers): longer retention
bot_logger = get_logger('bot_manager', max_log_files=30)
# Development: very short retention
dev_logger = get_logger('dev_component', max_log_files=3)
5. Message Content
# Good: Descriptive and actionable
logger.error("Failed to connect to OKX API: timeout after 30s")
# Bad: Vague and unhelpful
logger.error("Error occurred")
# Good: Include relevant context
logger.info(f"Bot {bot_id} executed trade: {symbol} {side} {quantity}@{price}")
# Good: Include duration for performance monitoring
start_time = time.time()
# ... do work ...
duration = time.time() - start_time
logger.info(f"Data aggregation completed in {duration:.2f}s")
6. Exception Handling
```python
try:
    execute_trade(symbol, quantity, price)
    logger.info(f"Trade executed successfully: {symbol}")
except APIError as e:
    logger.error(f"API error during trade execution: {e}", exc_info=True)
    raise
except ValidationError as e:
    logger.warning(f"Trade validation failed: {e}")
    return False
except Exception as e:
    logger.critical(f"Unexpected error during trade execution: {e}", exc_info=True)
    raise
```
7. Performance Considerations
```python
import logging

# Good: Efficient string formatting
logger.debug(f"Processing {len(data)} records")

# Avoid: Expensive operations in log messages unless necessary
# logger.debug(f"Data: {expensive_serialization(data)}")  # Only if needed

# Better: Check the log level first for expensive operations
if logger.isEnabledFor(logging.DEBUG):
    logger.debug(f"Data: {expensive_serialization(data)}")
```
Migration Guide
Updating Existing Components
- Add logger parameter to constructor:
def __init__(self, ..., logger=None, log_errors_only=False):
- Add conditional logging helpers:
def _log_debug(self, message: str) -> None:
    if self.logger and not self.log_errors_only:
        self.logger.debug(message)
- Update all logging calls:
# Before
self.logger.info("Message")
# After
self._log_info("Message")
- Pass logger to child components:
child = ChildComponent(logger=self.logger)
From Standard Logging
# Old logging (if any existed)
import logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
# New unified logging
from utils.logger import get_logger
logger = get_logger('component_name', verbose=True)
Gradual Adoption
- Phase 1: Add optional logger parameters to new components
- Phase 2: Update existing components to support conditional logging
- Phase 3: Implement hierarchical logging structure
- Phase 4: Add error-only logging mode
Testing
Testing Conditional Logging
Test Script Example
```python
# test_conditional_logging.py
from utils.logger import get_logger
from data.collector_manager import CollectorManager
from data.exchanges.okx.collector import OKXCollector


def test_no_logging():
    """Test components work without loggers."""
    manager = CollectorManager(logger=None)
    collector = OKXCollector("BTC-USDT", logger=None)
    print("✓ No logging test passed")


def test_with_logging():
    """Test components work with loggers."""
    logger = get_logger('test_system')
    manager = CollectorManager(logger=logger)
    collector = OKXCollector("BTC-USDT", logger=logger)
    print("✓ With logging test passed")


def test_error_only():
    """Test error-only logging mode."""
    logger = get_logger('test_errors')
    collector = OKXCollector("BTC-USDT", logger=logger, log_errors_only=True)
    print("✓ Error-only logging test passed")


if __name__ == "__main__":
    test_no_logging()
    test_with_logging()
    test_error_only()
    print("✅ All conditional logging tests passed!")
```
Testing Changes
# Test without logger
component = MyComponent(logger=None)
# Should work without errors, no logging
# Test with logger
logger = get_logger('test_component')
component = MyComponent(logger=logger)
# Should log normally
# Test error-only mode
component = MyComponent(logger=logger, log_errors_only=True)
# Should only log errors
Basic System Test
Run a simple test to verify the logging system:
python -c "from utils.logger import get_logger; logger = get_logger('test', verbose=True); logger.info('Test message'); print('Check logs/test/ directory')"
Troubleshooting
Common Issues
- Permission errors: Ensure the application has write permissions to the project directory
- Disk space: Monitor disk usage and adjust log retention with `max_log_files`
- Threading issues: The logger is thread-safe, but check for application-level concurrency issues
- Too many console messages: Adjust the `verbose` parameter or log levels
Debug Mode
Enable debug logging to troubleshoot issues:
logger = get_logger('component_name', 'DEBUG', verbose=True)
Console Output Issues
# Force console output regardless of environment
logger = get_logger('component_name', verbose=True)
# Check environment variables
import os
print(f"VERBOSE_LOGGING: {os.getenv('VERBOSE_LOGGING')}")
print(f"LOG_TO_CONSOLE: {os.getenv('LOG_TO_CONSOLE')}")
Fallback Logging
If file logging fails, the system automatically falls back to console logging with a warning message.
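A minimal sketch of how such a fallback can be structured with the standard library; the helper name is hypothetical and the actual behavior of `utils.logger` may differ:

```python
import logging

def attach_file_or_console(logger: logging.Logger, path: str) -> None:
    """Illustrative helper (not part of utils.logger): prefer a file handler,
    fall back to the console if the file cannot be opened."""
    try:
        logger.addHandler(logging.FileHandler(path))
    except OSError as exc:  # e.g. permission denied or missing directory
        print(f"Warning: file logging unavailable ({exc}); falling back to console")
        logger.addHandler(logging.StreamHandler())
```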
Integration with Existing Code
The logging system is designed to be gradually adopted:
- Start with new modules: Use the unified logger in new code
- Replace existing logging: Gradually migrate existing logging to the unified system
- No breaking changes: Existing code continues to work
Maintenance
Automatic Cleanup Benefits
The automatic cleanup feature provides several benefits:
- Disk space management: Prevents log directories from growing indefinitely
- Performance: Fewer files to scan in log directories
- Maintenance-free: No need for external cron jobs or scripts
- Component-specific: Each component can have different retention policies
Manual Cleanup for Special Cases
For cases requiring age-based cleanup instead of count-based:
```python
# cleanup_logs.py
from utils.logger import cleanup_old_logs

components = ['bot_manager', 'data_collector', 'strategies', 'dashboard']
for component in components:
    cleanup_old_logs(component, days_to_keep=30)
```
Monitoring Disk Usage
Monitor the logs/ directory size and adjust retention policies as needed:
# Check log directory size
du -sh logs/
# Find large log files
find logs/ -name "*.log*" -size +10M
# Count log files per component
find logs/ -name "*.log*" | cut -d'/' -f2 | sort | uniq -c
This conditional logging system provides maximum flexibility while maintaining clean, maintainable code that works in all scenarios.