Vasily.onl 848119e2cb Enhance logging system documentation and default logger configuration
- Updated the logging documentation to reflect changes in the unified log message format, including the addition of pathname, line number, and function name for better traceability.
- Modified the `get_logger` function to set a default value for `component_name`, improving usability for users who may not specify a component name.
- Ensured consistency in the documentation regarding the parameters and their descriptions.

These updates improve the clarity and ease of use of the logging system, making it more accessible for developers.
2025-06-06 21:07:52 +08:00

14 KiB

Unified Logging System

The TCP Dashboard project uses a unified logging system built on Python's standard logging library. It provides consistent, centralized logging across all components.

Key Features

  • Component-based Logging: Each component (e.g., bot_manager, data_collector) gets its own dedicated logger, with logs organized into separate directories under logs/.
  • Standardized & Simple: Relies on standard Python logging handlers, making it robust and easy to maintain.
  • Date-based Rotation: Log files are automatically rotated daily at midnight by TimedRotatingFileHandler.
  • Automatic Cleanup: Log file retention is managed automatically based on the number of backup files to keep (backupCount), preventing excessive disk usage.
  • Unified Format: All log messages follow a detailed, consistent format: [YYYY-MM-DD HH:MM:SS - LEVEL - pathname:lineno - funcName] - message].
  • Configurable Console Output: Optional console output for real-time monitoring, configurable via function arguments or environment variables.

Usage

Getting a Logger

The primary way to get a logger is via the get_logger function. It is thread-safe and ensures that loggers are configured only once.

from utils.logger import get_logger

# Get a logger for the bot manager component
# This will create a file logger and, if verbose=True, a console logger.
logger = get_logger('bot_manager', verbose=True)

logger.info("Bot started successfully")
logger.debug("Connecting to database...")
logger.warning("API response time is high")
logger.error("Failed to execute trade", exc_info=True)

Configuration

The get_logger function accepts the following parameters:

Parameter Type Default Description
component_name str default_logger Name of the component (e.g., bot_manager). Used for the logger name and directory.
log_level str INFO The minimum logging level to be processed (DEBUG, INFO, WARNING, ERROR, CRITICAL).
verbose Optional[bool] None If True, enables console logging. If None, uses VERBOSE_LOGGING or LOG_TO_CONSOLE from environment variables.
max_log_files int 30 The maximum number of backup log files to keep. The core of the log cleanup mechanism.
clean_old_logs bool True Deprecated. Kept for backward compatibility but has no effect. Cleanup is controlled by max_log_files.

For centralized control, you can use environment variables:

  • VERBOSE_LOGGING: Set to true to enable console logging for all loggers.
  • LOG_TO_CONSOLE: An alias for VERBOSE_LOGGING.

Log File Structure

The logger creates a directory for each component inside logs/. The main log file is named component_name.log. When rotated, old logs are renamed with a date suffix.

logs/
├── bot_manager/
│   ├── bot_manager.log         (current log file)
│   └── bot_manager.log.2023-11-15
├── data_collector/
│   ├── data_collector.log
│   └── data_collector.log.2023-11-15
└── default_logger/
    └── default_logger.log
└── test_component/
    └── test_component.log

Advanced Usage

Age-Based Log Cleanup

While the primary cleanup mechanism is count-based (via max_log_files), a separate utility function, cleanup_old_logs, is available for age-based cleanup if you have specific retention policies.

from utils.logger import cleanup_old_logs

# Deletes all log files in the 'bot_manager' directory older than 15 days
cleanup_old_logs('bot_manager', days_to_keep=15)

Shutting Down Logging

In some cases, especially in tests or when an application is shutting down gracefully, you may need to explicitly close all log file handlers.

from utils.logger import shutdown_logging

# Closes all open file handlers managed by the logging system
shutdown_logging()

Component Integration Pattern (Conditional Logging)

While the logger utility is simple, it is designed to support a powerful conditional logging pattern at the application level. This allows components to be developed to run with or without logging, making them more flexible and easier to test.

Key Concepts

  1. Optional Logging: Components are designed to accept logger=None in their constructor and function normally without producing any logs.
  2. Error-Only Mode: A component can be designed to only log messages of level ERROR or higher. This is a component-level implementation pattern, not a feature of get_logger.
  3. Logger Inheritance: Parent components can pass their logger instance to child components, ensuring a consistent logging context.

Example: Component Constructor

All major components should follow this pattern to support conditional logging.

class ComponentExample:
    def __init__(self, logger=None, log_errors_only=False):
        self.logger = logger
        self.log_errors_only = log_errors_only

    def _log_info(self, message: str) -> None:
        """Log info message if logger is available and not in errors-only mode."""
        if self.logger and not self.log_errors_only:
            self.logger.info(message)

    def _log_error(self, message: str, exc_info: bool = False) -> None:
        """Log error message if logger is available."""
        if self.logger:
            self.logger.error(message, exc_info=exc_info)

    # ... other helper methods for debug, warning, critical ...

This pattern decouples the component's logic from the global logging configuration and makes its logging behavior explicit and easy to manage.

Troubleshooting

  • Permissions: Ensure the application has write permissions to the logs/ directory.
  • No Logs: If file logging fails (e.g., due to permissions), a warning is printed to the console. If verbose is not enabled, no further logs will be produced. Ensure the logs/ directory is writable.
  • Console Spam: If the console is too noisy, set verbose=False when calling get_logger and ensure VERBOSE_LOGGING is not set to true in your environment.

Best Practices

1. Component Naming

Use descriptive, consistent component names:

  • bot_manager - for bot lifecycle management
  • data_collector - for market data collection
  • strategies - for trading strategies
  • backtesting - for backtesting engine
  • dashboard - for web dashboard

2. Log Level Guidelines

  • DEBUG: Detailed diagnostic information, typically only of interest when diagnosing problems
  • INFO: General information about program execution
  • WARNING: Something unexpected happened, but the program is still working
  • ERROR: A serious problem occurred, the program couldn't perform a function
  • CRITICAL: A serious error occurred, the program may not be able to continue

3. Verbose Logging Guidelines

# Development: Use verbose logging with DEBUG level
dev_logger = get_logger('component', 'DEBUG', verbose=True, max_log_files=3)

# Production: Use INFO level with no console output
prod_logger = get_logger('component', 'INFO', verbose=False, max_log_files=30)

# Testing: Disable cleanup to preserve test logs
test_logger = get_logger('test_component', 'DEBUG', verbose=True, clean_old_logs=False)

4. Log Retention Guidelines

# High-frequency components (data collectors): shorter retention
data_logger = get_logger('data_collector', max_log_files=7)

# Important components (bot managers): longer retention
bot_logger = get_logger('bot_manager', max_log_files=30)

# Development: very short retention
dev_logger = get_logger('dev_component', max_log_files=3)

5. Message Content

# Good: Descriptive and actionable
logger.error("Failed to connect to OKX API: timeout after 30s")

# Bad: Vague and unhelpful
logger.error("Error occurred")

# Good: Include relevant context
logger.info(f"Bot {bot_id} executed trade: {symbol} {side} {quantity}@{price}")

# Good: Include duration for performance monitoring
start_time = time.time()
# ... do work ...
duration = time.time() - start_time
logger.info(f"Data aggregation completed in {duration:.2f}s")

6. Exception Handling

try:
    execute_trade(symbol, quantity, price)
    logger.info(f"Trade executed successfully: {symbol}")
except APIError as e:
    logger.error(f"API error during trade execution: {e}", exc_info=True)
    raise
except ValidationError as e:
    logger.warning(f"Trade validation failed: {e}")
    return False
except Exception as e:
    logger.critical(f"Unexpected error during trade execution: {e}", exc_info=True)
    raise

7. Performance Considerations

# Good: Efficient string formatting
logger.debug(f"Processing {len(data)} records")

# Avoid: Expensive operations in log messages unless necessary
# logger.debug(f"Data: {expensive_serialization(data)}")  # Only if needed

# Better: Check log level first for expensive operations
if logger.isEnabledFor(logging.DEBUG):
    logger.debug(f"Data: {expensive_serialization(data)}")

Migration Guide

Updating Existing Components

  1. Add logger parameter to constructor:
def __init__(self, ..., logger=None, log_errors_only=False):
  1. Add conditional logging helpers:
def _log_debug(self, message: str) -> None:
    if self.logger and not self.log_errors_only:
        self.logger.debug(message)
  1. Update all logging calls:
# Before
self.logger.info("Message")

# After
self._log_info("Message")
  1. Pass logger to child components:
child = ChildComponent(logger=self.logger)

From Standard Logging

# Old logging (if any existed)
import logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# New unified logging
from utils.logger import get_logger
logger = get_logger('component_name', verbose=True)

Gradual Adoption

  1. Phase 1: Add optional logger parameters to new components
  2. Phase 2: Update existing components to support conditional logging
  3. Phase 3: Implement hierarchical logging structure
  4. Phase 4: Add error-only logging mode

Testing

Testing Conditional Logging

Test Script Example

# test_conditional_logging.py
from utils.logger import get_logger
from data.collector_manager import CollectorManager
from data.exchanges.okx.collector import OKXCollector

def test_no_logging():
    """Test components work without loggers."""
    manager = CollectorManager(logger=None)
    collector = OKXCollector("BTC-USDT", logger=None)
    print("✓ No logging test passed")

def test_with_logging():
    """Test components work with loggers."""
    logger = get_logger('test_system')
    manager = CollectorManager(logger=logger)
    collector = OKXCollector("BTC-USDT", logger=logger)
    print("✓ With logging test passed")

def test_error_only():
    """Test error-only logging mode."""
    logger = get_logger('test_errors')
    collector = OKXCollector("BTC-USDT", logger=logger, log_errors_only=True)
    print("✓ Error-only logging test passed")

if __name__ == "__main__":
    test_no_logging()
    test_with_logging()
    test_error_only()
    print("✅ All conditional logging tests passed!")

Testing Changes

# Test without logger
component = MyComponent(logger=None)
# Should work without errors, no logging

# Test with logger
logger = get_logger('test_component')
component = MyComponent(logger=logger)
# Should log normally

# Test error-only mode
component = MyComponent(logger=logger, log_errors_only=True)
# Should only log errors

Basic System Test

Run a simple test to verify the logging system:

python -c "from utils.logger import get_logger; logger = get_logger('test', verbose=True); logger.info('Test message'); print('Check logs/test/ directory')"

Troubleshooting

Common Issues

  1. Permission errors: Ensure the application has write permissions to the project directory
  2. Disk space: Monitor disk usage and adjust log retention with max_log_files
  3. Threading issues: The logger is thread-safe, but check for application-level concurrency issues
  4. Too many console messages: Adjust verbose parameter or log levels

Debug Mode

Enable debug logging to troubleshoot issues:

logger = get_logger('component_name', 'DEBUG', verbose=True)

Console Output Issues

# Force console output regardless of environment
logger = get_logger('component_name', verbose=True)

# Check environment variables
import os
print(f"VERBOSE_LOGGING: {os.getenv('VERBOSE_LOGGING')}")
print(f"LOG_TO_CONSOLE: {os.getenv('LOG_TO_CONSOLE')}")

Fallback Logging

If file logging fails, the system automatically falls back to console logging with a warning message.

Integration with Existing Code

The logging system is designed to be gradually adopted:

  1. Start with new modules: Use the unified logger in new code
  2. Replace existing logging: Gradually migrate existing logging to the unified system
  3. No breaking changes: Existing code continues to work

Maintenance

Automatic Cleanup Benefits

The automatic cleanup feature provides several benefits:

  • Disk space management: Prevents log directories from growing indefinitely
  • Performance: Fewer files to scan in log directories
  • Maintenance-free: No need for external cron jobs or scripts
  • Component-specific: Each component can have different retention policies

Manual Cleanup for Special Cases

For cases requiring age-based cleanup instead of count-based:

# cleanup_logs.py
from utils.logger import cleanup_old_logs

components = ['bot_manager', 'data_collector', 'strategies', 'dashboard']
for component in components:
    cleanup_old_logs(component, days_to_keep=30)

Monitoring Disk Usage

Monitor the logs/ directory size and adjust retention policies as needed:

# Check log directory size
du -sh logs/

# Find large log files
find logs/ -name "*.txt" -size +10M

# Count log files per component
find logs/ -name "*.txt" | cut -d'/' -f2 | sort | uniq -c

This conditional logging system provides maximum flexibility while maintaining clean, maintainable code that works in all scenarios.