
# Unified Logging System
The TCP Dashboard project uses a unified logging system that provides consistent, centralized logging across all components.
## Features
- **Component-specific directories**: Each component gets its own log directory
- **Date-based file rotation**: New log files created daily automatically
- **Unified format**: Consistent timestamp and message format across all logs
- **Thread-safe**: Safe for use in multi-threaded applications
- **Verbose console logging**: Configurable console output with proper log level handling
- **Automatic log cleanup**: Built-in functionality to remove old log files automatically
- **Error handling**: Graceful fallback to console logging if file logging fails
## Log Format
All log messages follow this unified format:
```
[YYYY-MM-DD HH:MM:SS - LEVEL - message]
```
Example:
```
[2024-01-15 14:30:25 - INFO - Bot started successfully]
[2024-01-15 14:30:26 - ERROR - Connection failed: timeout]
```
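The format above can be reproduced with the standard `logging` module. The following is a minimal sketch, assuming `get_logger()` builds on `logging.Formatter`; the constant names and `make_unified_formatter` are hypothetical and not taken from `utils/logger.py`.

```python
import logging

# Assumed format strings; the real ones in utils/logger.py may differ.
UNIFIED_FORMAT = "[%(asctime)s - %(levelname)s - %(message)s]"
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def make_unified_formatter() -> logging.Formatter:
    """Build a formatter that renders records in the bracketed layout above."""
    return logging.Formatter(fmt=UNIFIED_FORMAT, datefmt=DATE_FORMAT)


# Attach the formatter to a console handler and emit one line.
handler = logging.StreamHandler()
handler.setFormatter(make_unified_formatter())
format_demo = logging.getLogger("format_demo")
format_demo.setLevel(logging.INFO)
format_demo.addHandler(handler)
format_demo.info("Bot started successfully")
```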
## File Organization
Logs are organized in a hierarchical structure:
```
logs/
├── app/
│   ├── 2024-01-15.txt
│   └── 2024-01-16.txt
├── bot_manager/
│   ├── 2024-01-15.txt
│   └── 2024-01-16.txt
├── data_collector/
│   └── 2024-01-15.txt
└── strategies/
    └── 2024-01-15.txt
```
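The layout maps a component name and a date to a single file. As an illustration, the path could be derived as below; `log_file_path` and its `base_dir` parameter are hypothetical helpers, not part of the documented API.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def log_file_path(component_name: str, day: Optional[date] = None,
                  base_dir: str = "logs") -> Path:
    """Return <base_dir>/<component>/<YYYY-MM-DD>.txt (default day: today)."""
    day = day or date.today()
    return Path(base_dir) / component_name / f"{day.isoformat()}.txt"


print(log_file_path("bot_manager", date(2024, 1, 15)).as_posix())
# logs/bot_manager/2024-01-15.txt
```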
## Basic Usage
### Import and Initialize
```python
from utils.logger import get_logger
# Basic usage - gets logger with default settings
logger = get_logger('bot_manager')
# With verbose console output
logger = get_logger('bot_manager', verbose=True)
# With custom cleanup settings
logger = get_logger('bot_manager', clean_old_logs=True, max_log_files=7)
# All parameters
logger = get_logger(
    component_name='bot_manager',
    log_level='DEBUG',
    verbose=True,
    clean_old_logs=True,
    max_log_files=14
)
```
### Log Messages
```python
# Different log levels
logger.debug("Detailed debugging information")
logger.info("General information about program execution")
logger.warning("Something unexpected happened")
logger.error("An error occurred", exc_info=True) # Include stack trace
logger.critical("A critical error occurred")
```
### Complete Example
```python
from utils.logger import get_logger


class BotManager:
    def __init__(self):
        # Initialize with verbose output and keep only 7 days of logs
        self.logger = get_logger('bot_manager', verbose=True, max_log_files=7)
        self.logger.info("BotManager initialized")

    def start_bot(self, bot_id: str):
        try:
            self.logger.info(f"Starting bot {bot_id}")
            # Bot startup logic here
            self.logger.info(f"Bot {bot_id} started successfully")
        except Exception as e:
            self.logger.error(f"Failed to start bot {bot_id}: {e}", exc_info=True)
            raise

    def stop_bot(self, bot_id: str):
        self.logger.info(f"Stopping bot {bot_id}")
        # Bot shutdown logic here
        self.logger.info(f"Bot {bot_id} stopped")
```
## Configuration
### Logger Parameters
The `get_logger()` function accepts several parameters for customization:
```python
get_logger(
    component_name: str,             # Required: component name
    log_level: str = "INFO",         # Log level: DEBUG, INFO, WARNING, ERROR, CRITICAL
    verbose: Optional[bool] = None,  # Console logging: True, False, or None (use env)
    clean_old_logs: bool = True,     # Auto-cleanup old logs
    max_log_files: int = 30          # Max number of log files to keep
)
```
### Log Levels
Set the log level when getting a logger:
```python
# Available levels: DEBUG, INFO, WARNING, ERROR, CRITICAL
logger = get_logger('component_name', 'DEBUG') # Show all messages
logger = get_logger('component_name', 'ERROR') # Show only errors and critical
```
### Verbose Console Logging
Control console output with the `verbose` parameter:
```python
# Explicit verbose settings
logger = get_logger('bot_manager', verbose=True) # Always show console logs
logger = get_logger('bot_manager', verbose=False) # Never show console logs
# Use environment variable (default behavior)
logger = get_logger('bot_manager', verbose=None) # Uses VERBOSE_LOGGING from .env
```
Environment variables for console logging:
```bash
# In .env file or environment
VERBOSE_LOGGING=true # Enable verbose console logging
LOG_TO_CONSOLE=true # Alternative environment variable (backward compatibility)
```
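One way the `verbose` parameter and the two environment variables could be combined is sketched below. The set of values treated as "enabled" and the rule that either variable turns console output on are assumptions; `resolve_verbose` is a hypothetical name, and the actual logic in `utils/logger.py` may differ.

```python
import os
from typing import Mapping, Optional

# Assumption: which strings count as "enabled" is not stated in this document.
_TRUTHY = {"1", "true", "yes", "on"}


def resolve_verbose(verbose: Optional[bool] = None,
                    env: Optional[Mapping[str, str]] = None) -> bool:
    """An explicit True/False wins; None defers to the environment variables."""
    if verbose is not None:
        return verbose
    env = os.environ if env is None else env
    for name in ("VERBOSE_LOGGING", "LOG_TO_CONSOLE"):
        if env.get(name, "").strip().lower() in _TRUTHY:
            return True
    return False
```

Passing `env` explicitly keeps the helper easy to test without touching the real process environment.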
Console output respects log levels:
- **DEBUG level**: Shows all messages (DEBUG, INFO, WARNING, ERROR, CRITICAL)
- **INFO level**: Shows INFO and above (INFO, WARNING, ERROR, CRITICAL)
- **WARNING level**: Shows WARNING and above (WARNING, ERROR, CRITICAL)
- **ERROR level**: Shows ERROR and above (ERROR, CRITICAL)
- **CRITICAL level**: Shows only CRITICAL messages
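The threshold behaviour listed above matches how handler levels work in the standard `logging` module. The sketch below uses only the standard library (not `get_logger()`) and captures output in memory to show which levels pass a WARNING threshold.

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setLevel(logging.WARNING)               # console threshold
handler.setFormatter(logging.Formatter("%(levelname)s"))

level_demo = logging.getLogger("level_demo")
level_demo.setLevel(logging.DEBUG)              # the logger itself lets everything through
level_demo.propagate = False
level_demo.addHandler(handler)

for level in (logging.DEBUG, logging.INFO, logging.WARNING,
              logging.ERROR, logging.CRITICAL):
    level_demo.log(level, "sample message")

shown = stream.getvalue().split()
print(shown)  # ['WARNING', 'ERROR', 'CRITICAL']
```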
### Automatic Log Cleanup
Control automatic cleanup of old log files:
```python
# Enable automatic cleanup (default)
logger = get_logger('bot_manager', clean_old_logs=True, max_log_files=7)
# Disable automatic cleanup
logger = get_logger('bot_manager', clean_old_logs=False)
# Custom retention (keep 14 most recent log files)
logger = get_logger('bot_manager', max_log_files=14)
```
**How automatic cleanup works:**
- Triggered every time a new log file is created (date change)
- Keeps only the most recent `max_log_files` files
- Deletes older files automatically
- Based on file modification time, not filename
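The steps above can be sketched as a small count-based pruning function. This is an illustration under the stated rules (sort by modification time, keep the newest `max_log_files`), not the actual implementation; `prune_log_files` is a hypothetical name.

```python
from pathlib import Path
from typing import List


def prune_log_files(log_dir: Path, max_log_files: int = 30) -> List[Path]:
    """Delete all but the newest max_log_files *.txt files; return what was removed."""
    # Newest first, ordered by modification time rather than by filename.
    files = sorted(log_dir.glob("*.txt"),
                   key=lambda p: p.stat().st_mtime, reverse=True)
    removed = []
    for old_file in files[max(max_log_files, 0):]:
        try:
            old_file.unlink()
            removed.append(old_file)
        except OSError:
            # Best-effort cleanup: a failed delete should not break logging.
            continue
    return removed
```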
## Advanced Features
### Manual Log Cleanup
Remove old log files manually based on age:
```python
from utils.logger import cleanup_old_logs
# Remove logs older than 30 days for a specific component
cleanup_old_logs('bot_manager', days_to_keep=30)
# Or clean up logs for multiple components
for component in ['bot_manager', 'data_collector', 'strategies']:
    cleanup_old_logs(component, days_to_keep=7)
```
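For comparison with the count-based automatic cleanup, an age-based pass might look like the sketch below. Whether `cleanup_old_logs` uses modification time or the date in the filename is not stated here, so the mtime check is an assumption; `remove_logs_older_than` and its `now` parameter are hypothetical.

```python
import time
from pathlib import Path
from typing import Optional


def remove_logs_older_than(log_dir: Path, days_to_keep: int = 30,
                           now: Optional[float] = None) -> int:
    """Delete *.txt files last modified before the cutoff; return the count."""
    now = time.time() if now is None else now
    cutoff = now - days_to_keep * 86400  # seconds per day
    removed = 0
    for log_file in log_dir.glob("*.txt"):
        if log_file.stat().st_mtime < cutoff:
            log_file.unlink()
            removed += 1
    return removed
```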
### Error Handling with Context
```python
try:
    risky_operation()
except Exception as e:
    logger.error(f"Operation failed: {e}", exc_info=True)
    # exc_info=True includes the full stack trace
```
### Structured Logging
For complex data, use structured messages:
```python
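from datetime import datetime

# Good: Structured information in the message itself
logger.info(f"Trade executed: symbol={symbol}, price={price}, quantity={quantity}")

# Alternative: attach fields to the log record via `extra`.
# Note: with the standard `logging` module these fields appear in the output
# only if the formatter or a handler reads them.
logger.info("Trade executed", extra={
    'symbol': symbol,
    'price': price,
    'quantity': quantity,
    'timestamp': datetime.now().isoformat()
})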
```
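Fields passed through `extra` are stored on the log record but are not printed by a plain `[time - LEVEL - message]` format. The sketch below shows one way to surface them, assuming the logger is a standard `logging.Logger`; `ExtraFieldsFormatter` is a hypothetical class and not part of `utils.logger`.

```python
import io
import logging

# Attribute names present on every LogRecord; anything else came from `extra`.
_STANDARD_ATTRS = set(vars(logging.LogRecord("", 0, "", 0, "", None, None)))
_STANDARD_ATTRS |= {"message", "asctime"}


class ExtraFieldsFormatter(logging.Formatter):
    """Append key=value pairs passed via `extra` to the formatted line."""

    def format(self, record: logging.LogRecord) -> str:
        base = super().format(record)
        extras = {k: v for k, v in vars(record).items()
                  if k not in _STANDARD_ATTRS}
        if not extras:
            return base
        pairs = " ".join(f"{k}={v}" for k, v in sorted(extras.items()))
        return f"{base} {pairs}"


buffer = io.StringIO()
extra_handler = logging.StreamHandler(buffer)
extra_handler.setFormatter(ExtraFieldsFormatter("%(levelname)s - %(message)s"))
extra_demo = logging.getLogger("extra_demo")
extra_demo.setLevel(logging.INFO)
extra_demo.propagate = False
extra_demo.addHandler(extra_handler)

extra_demo.info("Trade executed", extra={"symbol": "BTC-USDT", "price": 100})
print(buffer.getvalue().strip())
# INFO - Trade executed price=100 symbol=BTC-USDT
```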
## Configuration Examples
### Development Environment
```python
# Verbose logging with frequent cleanup
logger = get_logger(
    'bot_manager',
    log_level='DEBUG',
    verbose=True,
    max_log_files=3  # Keep only 3 days of logs
)
```
### Production Environment
```python
# Minimal console output with longer retention
logger = get_logger(
    'bot_manager',
    log_level='INFO',
    verbose=False,
    max_log_files=30  # Keep 30 days of logs
)
```
### Testing Environment
```python
# Disable cleanup for testing
logger = get_logger(
    'test_component',
    log_level='DEBUG',
    verbose=True,
    clean_old_logs=False  # Don't delete logs during tests
)
```
## Environment Variables
Create a `.env` file to control default logging behavior:
```bash
# Enable verbose console logging globally
VERBOSE_LOGGING=true
# Alternative (backward compatibility)
LOG_TO_CONSOLE=true
```
## Best Practices
### 1. Component Naming
Use descriptive, consistent component names:
- `bot_manager` - for bot lifecycle management
- `data_collector` - for market data collection
- `strategies` - for trading strategies
- `backtesting` - for backtesting engine
- `dashboard` - for web dashboard
### 2. Log Level Guidelines
- **DEBUG**: Detailed diagnostic information, typically only of interest when diagnosing problems
- **INFO**: General information about program execution
- **WARNING**: Something unexpected happened, but the program is still working
- **ERROR**: A serious problem occurred and the program could not perform a function
- **CRITICAL**: A serious error occurred and the program may not be able to continue
### 3. Verbose Logging Guidelines
```python
# Development: Use verbose logging with DEBUG level
dev_logger = get_logger('component', 'DEBUG', verbose=True, max_log_files=3)
# Production: Use INFO level with no console output
prod_logger = get_logger('component', 'INFO', verbose=False, max_log_files=30)
# Testing: Disable cleanup to preserve test logs
test_logger = get_logger('test_component', 'DEBUG', verbose=True, clean_old_logs=False)
```
### 4. Log Retention Guidelines
```python
# High-frequency components (data collectors): shorter retention
data_logger = get_logger('data_collector', max_log_files=7)
# Important components (bot managers): longer retention
bot_logger = get_logger('bot_manager', max_log_files=30)
# Development: very short retention
dev_logger = get_logger('dev_component', max_log_files=3)
```
### 5. Message Content
```python
# Good: Descriptive and actionable
logger.error("Failed to connect to OKX API: timeout after 30s")
# Bad: Vague and unhelpful
logger.error("Error occurred")
# Good: Include relevant context
logger.info(f"Bot {bot_id} executed trade: {symbol} {side} {quantity}@{price}")
# Good: Include duration for performance monitoring
import time

start_time = time.perf_counter()  # monotonic clock, suited to measuring durations
# ... do work ...
duration = time.perf_counter() - start_time
logger.info(f"Data aggregation completed in {duration:.2f}s")
```
### 6. Exception Handling
```python
try:
    execute_trade(symbol, quantity, price)
    logger.info(f"Trade executed successfully: {symbol}")
except APIError as e:
    logger.error(f"API error during trade execution: {e}", exc_info=True)
    raise
except ValidationError as e:
    logger.warning(f"Trade validation failed: {e}")
    return False
except Exception as e:
    logger.critical(f"Unexpected error during trade execution: {e}", exc_info=True)
    raise
```
### 7. Performance Considerations
```python
import logging  # needed for the logging.DEBUG constant below

# Good: Cheap values are fine to format directly
logger.debug(f"Processing {len(data)} records")

# Avoid: Expensive operations in log messages unless necessary
# logger.debug(f"Data: {expensive_serialization(data)}")  # Only if needed

# Better: Check log level first for expensive operations
if logger.isEnabledFor(logging.DEBUG):
    logger.debug(f"Data: {expensive_serialization(data)}")
```
## Integration with Existing Code
The logging system is designed to be gradually adopted:
1. **Start with new modules**: Use the unified logger in new code
2. **Replace existing logging**: Gradually migrate existing logging to the unified system
3. **No breaking changes**: Existing code continues to work
### Migration Example
```python
# Old logging (if any existed)
import logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
# New unified logging
from utils.logger import get_logger
logger = get_logger('component_name', verbose=True)
```
## Testing
Run a simple test to verify the logging system:
```bash
python -c "from utils.logger import get_logger; logger = get_logger('test', verbose=True); logger.info('Test message'); print('Check logs/test/ directory')"
```
## Maintenance
### Automatic Cleanup Benefits
The automatic cleanup feature provides several benefits:
- **Disk space management**: Prevents log directories from growing indefinitely
- **Performance**: Fewer files to scan in log directories
- **Maintenance-free**: No need for external cron jobs or scripts
- **Component-specific**: Each component can have different retention policies
### Manual Cleanup for Special Cases
For cases requiring age-based cleanup instead of count-based:
```python
# cleanup_logs.py
from utils.logger import cleanup_old_logs
components = ['bot_manager', 'data_collector', 'strategies', 'dashboard']
for component in components:
    cleanup_old_logs(component, days_to_keep=30)
```
### Monitoring Disk Usage
Monitor the `logs/` directory size and adjust retention policies as needed:
```bash
# Check log directory size
du -sh logs/
# Find large log files
find logs/ -name "*.txt" -size +10M
# Count log files per component
find logs/ -name "*.txt" | cut -d'/' -f2 | sort | uniq -c
```
## Troubleshooting
### Common Issues
1. **Permission errors**: Ensure the application has write permissions to the project directory
2. **Disk space**: Monitor disk usage and adjust log retention with `max_log_files`
3. **Threading issues**: The logger is thread-safe, but check for application-level concurrency issues
4. **Too many console messages**: Adjust `verbose` parameter or log levels
### Debug Mode
Enable debug logging to troubleshoot issues:
```python
logger = get_logger('component_name', 'DEBUG', verbose=True)
```
### Console Output Issues
```python
# Force console output regardless of environment
logger = get_logger('component_name', verbose=True)
# Check environment variables
import os
print(f"VERBOSE_LOGGING: {os.getenv('VERBOSE_LOGGING')}")
print(f"LOG_TO_CONSOLE: {os.getenv('LOG_TO_CONSOLE')}")
```
### Fallback Logging
If file logging fails, the system automatically falls back to console logging with a warning message.
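A fallback of this kind could be sketched as follows. This is an illustration of the idea only; `build_handler` is a hypothetical function, and the real warning text and error handling in `utils/logger.py` may differ.

```python
import logging
import sys
from pathlib import Path


def build_handler(log_path: Path) -> logging.Handler:
    """Return a file handler, or a console handler if the file cannot be opened."""
    try:
        log_path.parent.mkdir(parents=True, exist_ok=True)
        return logging.FileHandler(log_path, encoding="utf-8")
    except OSError as exc:
        # Covers permission errors, missing mounts, full disks, and similar failures.
        print(f"Warning: file logging unavailable ({exc}); using console",
              file=sys.stderr)
        return logging.StreamHandler(sys.stderr)
```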
## New Features Summary
### Verbose Parameter
- Controls console logging output
- Respects log levels (DEBUG shows all, ERROR shows only errors)
- Uses environment variables as default (`VERBOSE_LOGGING` or `LOG_TO_CONSOLE`)
- Can be explicitly set to `True`/`False` to override environment
### Automatic Cleanup
- Enabled by default (`clean_old_logs=True`)
- Triggered when new log files are created (date changes)
- Keeps most recent `max_log_files` files (default: 30)
- Component-specific retention policies
- Non-blocking operation with error handling