# Data Collector System Documentation
## Overview
The Data Collector System provides a robust, scalable framework for collecting real-time market data from cryptocurrency exchanges. It features comprehensive health monitoring, automatic recovery, and centralized management capabilities designed for production trading environments.
## Key Features
### 🔄 **Auto-Recovery & Health Monitoring**
- **Heartbeat System**: Continuous health monitoring with configurable intervals
- **Auto-Restart**: Automatic restart on failures with exponential backoff
- **Connection Recovery**: Robust reconnection logic for network interruptions
- **Data Freshness Monitoring**: Detects stale data and triggers recovery
### 🎛️ **Centralized Management**
- **CollectorManager**: Supervises multiple collectors with coordinated lifecycle
- **Dynamic Control**: Enable/disable collectors at runtime without system restart
- **Global Health Checks**: System-wide monitoring and alerting
- **Graceful Shutdown**: Proper cleanup and resource management
### 📊 **Comprehensive Monitoring**
- **Real-time Status**: Detailed status reporting for all collectors
- **Performance Metrics**: Message counts, uptime, error rates, restart counts
- **Health Analytics**: Connection state, data freshness, error tracking
- **Logging Integration**: Enhanced logging with configurable verbosity
## Architecture
```
┌────────────────────────────────────────────────────────────┐
│                      CollectorManager                      │
│  ┌─────────────────────────────────────────────────────┐   │
│  │                Global Health Monitor                │   │
│  │ • System-wide health checks                         │   │
│  │ • Auto-restart coordination                         │   │
│  │ • Performance analytics                             │   │
│  └─────────────────────────────────────────────────────┘   │
│                             │                              │
│  ┌─────────────────┐ ┌─────────────────┐ ┌──────────────┐  │
│  │  OKX Collector  │ │Binance Collector│ │    Custom    │  │
│  │                 │ │                 │ │  Collector   │  │
│  │ • Health Monitor│ │ • Health Monitor│ │ • Health Mon │  │
│  │ • Auto-restart  │ │ • Auto-restart  │ │ • Auto-resta │  │
│  │ • Data Validate │ │ • Data Validate │ │ • Data Valid │  │
│  └─────────────────┘ └─────────────────┘ └──────────────┘  │
└────────────────────────────────────────────────────────────┘
                              │
                     ┌─────────────────┐
                     │   Data Output   │
                     │                 │
                     │ • Callbacks     │
                     │ • Redis Pub/Sub │
                     │ • Database      │
                     └─────────────────┘
```
## Quick Start
### 1. Basic Collector Usage
```python
import asyncio
from data import BaseDataCollector, DataType, MarketDataPoint


class MyExchangeCollector(BaseDataCollector):
    """Custom collector implementation."""

    def __init__(self, symbols: list):
        super().__init__("my_exchange", symbols, [DataType.TICKER])
        self.websocket = None

    async def connect(self) -> bool:
        """Connect to exchange WebSocket."""
        try:
            # Connect to your exchange WebSocket
            self.websocket = await connect_to_exchange()
            return True
        except Exception:
            return False

    async def disconnect(self) -> None:
        """Disconnect from exchange."""
        if self.websocket:
            await self.websocket.close()

    async def subscribe_to_data(self, symbols: list, data_types: list) -> bool:
        """Subscribe to data streams."""
        try:
            await self.websocket.subscribe(symbols, data_types)
            return True
        except Exception:
            return False

    async def unsubscribe_from_data(self, symbols: list, data_types: list) -> bool:
        """Unsubscribe from data streams."""
        try:
            await self.websocket.unsubscribe(symbols, data_types)
            return True
        except Exception:
            return False

    async def _process_message(self, message) -> MarketDataPoint:
        """Process incoming message."""
        return MarketDataPoint(
            exchange=self.exchange_name,
            symbol=message['symbol'],
            timestamp=message['timestamp'],
            data_type=DataType.TICKER,
            data=message['data']
        )

    async def _handle_messages(self) -> None:
        """Handle incoming messages."""
        try:
            message = await self.websocket.receive()
            data_point = await self._process_message(message)
            await self._notify_callbacks(data_point)
        except Exception as e:
            # This will trigger the reconnection logic
            raise e


# Usage
async def main():
    # Create collector
    collector = MyExchangeCollector(["BTC-USDT", "ETH-USDT"])

    # Add data callback
    def on_data(data_point: MarketDataPoint):
        print(f"Received: {data_point.symbol} - {data_point.data}")

    collector.add_data_callback(DataType.TICKER, on_data)

    # Start collector (auto-restart is enabled by default)
    await collector.start()

    # Let it run
    await asyncio.sleep(60)

    # Stop collector
    await collector.stop()


asyncio.run(main())
```
### 2. Using CollectorManager
```python
import asyncio
from data import CollectorManager, CollectorConfig


async def main():
    # Create manager
    manager = CollectorManager(
        "trading_system_manager",
        global_health_check_interval=30.0  # Check every 30 seconds
    )

    # Create collectors
    okx_collector = OKXCollector(["BTC-USDT", "ETH-USDT"])
    binance_collector = BinanceCollector(["BTC-USDT", "ETH-USDT"])

    # Add collectors with custom configs
    manager.add_collector(okx_collector, CollectorConfig(
        name="okx_main",
        exchange="okx",
        symbols=["BTC-USDT", "ETH-USDT"],
        data_types=["ticker", "trade"],
        auto_restart=True,
        health_check_interval=15.0,
        enabled=True
    ))
    manager.add_collector(binance_collector, CollectorConfig(
        name="binance_backup",
        exchange="binance",
        symbols=["BTC-USDT", "ETH-USDT"],
        data_types=["ticker"],
        auto_restart=True,
        enabled=False  # Start disabled
    ))

    # Start manager
    await manager.start()

    # Monitor status
    while True:
        status = manager.get_status()
        print(f"Running: {len(manager.get_running_collectors())}")
        print(f"Failed: {len(manager.get_failed_collectors())}")
        print(f"Restarts: {status['statistics']['restarts_performed']}")
        await asyncio.sleep(10)


asyncio.run(main())
```
## API Reference
### BaseDataCollector
The abstract base class that all data collectors must inherit from.
#### Constructor
```python
def __init__(self,
             exchange_name: str,
             symbols: List[str],
             data_types: Optional[List[DataType]] = None,
             component_name: Optional[str] = None,
             auto_restart: bool = True,
             health_check_interval: float = 30.0)
```
**Parameters:**
- `exchange_name`: Name of the exchange (e.g., 'okx', 'binance')
- `symbols`: List of trading symbols to collect data for
- `data_types`: Types of data to collect (default: [DataType.CANDLE])
- `component_name`: Name for logging (default: based on exchange_name)
- `auto_restart`: Enable automatic restart on failures (default: True)
- `health_check_interval`: Seconds between health checks (default: 30.0)
#### Abstract Methods
Must be implemented by subclasses:
```python
async def connect(self) -> bool
async def disconnect(self) -> None
async def subscribe_to_data(self, symbols: List[str], data_types: List[DataType]) -> bool
async def unsubscribe_from_data(self, symbols: List[str], data_types: List[DataType]) -> bool
async def _process_message(self, message: Any) -> Optional[MarketDataPoint]
async def _handle_messages(self) -> None
```
#### Public Methods
```python
# Lifecycle
async def start(self) -> bool                      # Start the collector
async def stop(self, force: bool = False) -> None  # Stop the collector
async def restart(self) -> bool                    # Restart the collector

# Callback management
def add_data_callback(self, data_type: DataType, callback: Callable) -> None
def remove_data_callback(self, data_type: DataType, callback: Callable) -> None

# Symbol management
def add_symbol(self, symbol: str) -> None
def remove_symbol(self, symbol: str) -> None

# Status and monitoring
def get_status(self) -> Dict[str, Any]
def get_health_status(self) -> Dict[str, Any]

# Data validation
def validate_ohlcv_data(self, data: Dict[str, Any], symbol: str, timeframe: str) -> OHLCVData
```
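As a quick illustration, the sketch below exercises the symbol-management and validation helpers on the Quick Start collector. It is only a sketch: the `raw_candle` keys are assumptions and must match whatever your exchange adapter actually produces.
```python
# Sketch only: symbol management and OHLCV validation at runtime.
# MyExchangeCollector comes from the Quick Start; the raw_candle keys are assumed.
collector = MyExchangeCollector(["BTC-USDT"])
await collector.start()

# Adjust the symbol list without recreating the collector
collector.add_symbol("ETH-USDT")
collector.remove_symbol("BTC-USDT")

# Validate a raw candle payload into a typed OHLCVData instance
raw_candle = {
    "timestamp": "2023-01-01T00:00:00Z",
    "open": "16500.0", "high": "16550.0",
    "low": "16480.0", "close": "16525.0",
    "volume": "123.45",
}
ohlcv = collector.validate_ohlcv_data(raw_candle, symbol="ETH-USDT", timeframe="1m")
print(ohlcv.close, ohlcv.volume)
```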
#### Status Information
The `get_status()` method returns comprehensive status information:
```python
{
    'exchange': 'okx',
    'status': 'running',                    # Current status
    'should_be_running': True,              # Desired state
    'symbols': ['BTC-USDT', 'ETH-USDT'],    # Configured symbols
    'data_types': ['ticker'],               # Data types being collected
    'auto_restart': True,                   # Auto-restart enabled
    'health': {
        'time_since_heartbeat': 5.2,        # Seconds since last heartbeat
        'time_since_data': 2.1,             # Seconds since last data
        'max_silence_duration': 300.0       # Max allowed silence
    },
    'statistics': {
        'messages_received': 1250,          # Total messages received
        'messages_processed': 1248,         # Successfully processed
        'errors': 2,                        # Error count
        'restarts': 1,                      # Restart count
        'uptime_seconds': 3600.5,           # Current uptime
        'reconnect_attempts': 0,            # Current reconnect attempts
        'last_message_time': '2023-...',    # ISO timestamp
        'connection_uptime': '2023-...',    # Connection start time
        'last_error': 'Connection failed',  # Last error message
        'last_restart_time': '2023-...'     # Last restart time
    }
}
```
#### Health Status
The `get_health_status()` method provides detailed health information:
```python
{
    'is_healthy': True,                # Overall health status
    'issues': [],                      # List of current issues
    'status': 'running',               # Current collector status
    'last_heartbeat': '2023-...',      # Last heartbeat timestamp
    'last_data_received': '2023-...',  # Last data timestamp
    'should_be_running': True,         # Expected state
    'is_running': True                 # Actual running state
}
```
### CollectorManager
Manages multiple data collectors with coordinated lifecycle and health monitoring.
#### Constructor
```python
def __init__(self,
             manager_name: str = "collector_manager",
             global_health_check_interval: float = 60.0,
             restart_delay: float = 5.0)
```
#### Public Methods
```python
# Collector management
def add_collector(self, collector: BaseDataCollector, config: Optional[CollectorConfig] = None) -> None
def remove_collector(self, collector_name: str) -> bool
def enable_collector(self, collector_name: str) -> bool
def disable_collector(self, collector_name: str) -> bool

# Lifecycle management
async def start(self) -> bool
async def stop(self) -> None
async def restart_collector(self, collector_name: str) -> bool
async def restart_all_collectors(self) -> Dict[str, bool]

# Status and monitoring
def get_status(self) -> Dict[str, Any]
def get_collector_status(self, collector_name: str) -> Optional[Dict[str, Any]]
def list_collectors(self) -> List[str]
def get_running_collectors(self) -> List[str]
def get_failed_collectors(self) -> List[str]
```
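For example, a running manager can be steered entirely at runtime; the collector names below (`okx_main`, `binance_backup`) are the example configs from the Quick Start and are used here purely for illustration.
```python
# Sketch only: runtime control of collectors through a running manager.
await manager.start()

# Bring the standby collector online and pause the primary
manager.enable_collector("binance_backup")
manager.disable_collector("okx_main")

# Restart a single collector, or every collector at once
ok = await manager.restart_collector("binance_backup")
results = await manager.restart_all_collectors()  # {collector_name: success}

# Inspect the fleet
print(manager.list_collectors())
print(manager.get_running_collectors(), manager.get_failed_collectors())
```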
### CollectorConfig
Configuration dataclass for collectors:
```python
@dataclass
class CollectorConfig:
    name: str                            # Unique collector name
    exchange: str                        # Exchange name
    symbols: List[str]                   # Trading symbols
    data_types: List[str]                # Data types to collect
    auto_restart: bool = True            # Enable auto-restart
    health_check_interval: float = 30.0  # Health check interval (seconds)
    enabled: bool = True                 # Initially enabled
```
### Data Types
#### DataType Enum
```python
class DataType(Enum):
    TICKER = "ticker"        # Price and volume updates
    TRADE = "trade"          # Individual trade executions
    ORDERBOOK = "orderbook"  # Order book snapshots
    CANDLE = "candle"        # OHLCV candle data
    BALANCE = "balance"      # Account balance updates
```
#### MarketDataPoint
Standardized market data structure:
```python
@dataclass
class MarketDataPoint:
    exchange: str          # Exchange name
    symbol: str            # Trading symbol
    timestamp: datetime    # Data timestamp (UTC)
    data_type: DataType    # Type of data
    data: Dict[str, Any]   # Raw data payload
```
#### OHLCVData
OHLCV (candlestick) data structure with validation:
```python
@dataclass
class OHLCVData:
    symbol: str                         # Trading symbol
    timeframe: str                      # Timeframe (1m, 5m, 1h, etc.)
    timestamp: datetime                 # Candle timestamp
    open: Decimal                       # Opening price
    high: Decimal                       # Highest price
    low: Decimal                        # Lowest price
    close: Decimal                      # Closing price
    volume: Decimal                     # Trading volume
    trades_count: Optional[int] = None  # Number of trades
```
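For reference, constructing these structures directly looks like the following; the values are arbitrary examples.
```python
from datetime import datetime, timezone
from decimal import Decimal

# Example values only, showing the Decimal-based OHLCV fields in use
candle = OHLCVData(
    symbol="BTC-USDT",
    timeframe="1m",
    timestamp=datetime(2023, 1, 1, tzinfo=timezone.utc),
    open=Decimal("16500.0"),
    high=Decimal("16550.0"),
    low=Decimal("16480.0"),
    close=Decimal("16525.0"),
    volume=Decimal("123.45"),
    trades_count=87,
)

tick = MarketDataPoint(
    exchange="okx",
    symbol="BTC-USDT",
    timestamp=datetime.now(timezone.utc),
    data_type=DataType.TICKER,
    data={"last": "16525.0", "vol24h": "10500.2"},
)
```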
## Health Monitoring
### Monitoring Levels
The system provides multi-level health monitoring:
1. **Individual Collector Health**
- Heartbeat monitoring (message loop activity)
- Data freshness (time since last data received)
- Connection state monitoring
- Error rate tracking
2. **Manager-Level Health**
- Global health checks across all collectors
- Coordinated restart management
- System-wide performance metrics
- Resource utilization monitoring
### Health Check Intervals
- **Individual Collector**: Configurable per collector (default: 30s)
- **Global Manager**: Configurable for manager (default: 60s)
- **Heartbeat Updates**: Updated with each message loop iteration
- **Data Freshness**: Updated when data is received
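
A minimal monitoring loop built on these intervals might look like the sketch below; `notify_ops` is a placeholder alerting hook, not part of the library.
```python
import asyncio

async def watch_health(collector, manager, interval: float = 30.0):
    """Poll collector-level and manager-level health on a fixed interval (sketch)."""
    while True:
        # Individual collector health
        health = collector.get_health_status()
        if not health['is_healthy']:
            await notify_ops(f"{collector.exchange_name} unhealthy: {health['issues']}")

        # Manager-level view across all collectors
        failed = manager.get_failed_collectors()
        if failed:
            await notify_ops(f"Failed collectors: {failed}")

        await asyncio.sleep(interval)
```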
### Auto-Restart Triggers
Collectors are automatically restarted when:
1. **No Heartbeat**: Message loop becomes unresponsive
2. **Stale Data**: No data received within configured timeout
3. **Connection Failures**: WebSocket or API connection lost
4. **Error Status**: Collector enters ERROR or UNHEALTHY state
5. **Manual Trigger**: Explicit restart request
### Failure Handling
```python
# Configure failure handling
collector = MyCollector(
    symbols=["BTC-USDT"],
    auto_restart=True,          # Enable auto-restart
    health_check_interval=30.0  # Check every 30 seconds
)

# The collector will automatically:
# 1. Detect failures within 30 seconds
# 2. Attempt reconnection with exponential backoff
# 3. Restart up to 5 times (configurable)
# 4. Log all recovery attempts
# 5. Report status to the manager
```
## Configuration
### Environment Variables
The system respects these environment variables:
```bash
# Logging configuration
LOG_LEVEL=INFO # Logging level (DEBUG, INFO, WARN, ERROR)
LOG_CLEANUP=true # Enable automatic log cleanup
LOG_MAX_FILES=30 # Maximum log files to retain
# Health monitoring
DEFAULT_HEALTH_CHECK_INTERVAL=30 # Default health check interval (seconds)
MAX_SILENCE_DURATION=300 # Max time without data (seconds)
MAX_RECONNECT_ATTEMPTS=5 # Maximum reconnection attempts
RECONNECT_DELAY=5 # Delay between reconnect attempts (seconds)
```
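How these values reach the collectors depends on your startup code; one possible pattern, reading them with plain `os.getenv`, is sketched below. The variable names match the block above, but the wiring itself is an assumption rather than built-in behavior.
```python
import os

# Sketch only: map environment variables onto constructor arguments at startup.
health_interval = float(os.getenv("DEFAULT_HEALTH_CHECK_INTERVAL", "30"))
reconnect_delay = float(os.getenv("RECONNECT_DELAY", "5"))

collector = MyCollector(
    symbols=["BTC-USDT"],
    auto_restart=True,
    health_check_interval=health_interval,
)

manager = CollectorManager(
    "env_configured_manager",
    global_health_check_interval=health_interval,
    restart_delay=reconnect_delay,
)
```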
### Programmatic Configuration
```python
# Configure individual collector
collector = MyCollector(
    exchange_name="custom_exchange",
    symbols=["BTC-USDT", "ETH-USDT"],
    data_types=[DataType.TICKER, DataType.TRADE],
    auto_restart=True,
    health_check_interval=15.0  # Check every 15 seconds
)

# Configure manager
manager = CollectorManager(
    manager_name="production_manager",
    global_health_check_interval=30.0,  # Global checks every 30s
    restart_delay=10.0                  # 10s delay between restarts
)

# Configure specific collector in manager
config = CollectorConfig(
    name="primary_okx",
    exchange="okx",
    symbols=["BTC-USDT", "ETH-USDT", "SOL-USDT"],
    data_types=["ticker", "trade", "orderbook"],
    auto_restart=True,
    health_check_interval=20.0,
    enabled=True
)
manager.add_collector(collector, config)
```
## Best Practices
### 1. Collector Implementation
```python
class ProductionCollector(BaseDataCollector):
    def __init__(self, exchange_name: str, symbols: list):
        super().__init__(
            exchange_name=exchange_name,
            symbols=symbols,
            data_types=[DataType.TICKER, DataType.TRADE],
            auto_restart=True,          # Always enable auto-restart
            health_check_interval=30.0  # Reasonable interval
        )
        # Connection management
        self.connection_pool = None
        self.rate_limiter = RateLimiter(100, 60)  # 100 requests per minute

        # Data validation
        self.data_validator = DataValidator()

        # Performance monitoring
        self.metrics = MetricsCollector()

    async def connect(self) -> bool:
        """Implement robust connection logic."""
        try:
            # Use connection pooling for reliability
            self.connection_pool = await create_connection_pool(
                self.exchange_name,
                max_connections=5,
                retry_attempts=3
            )
            # Test connection
            await self.connection_pool.ping()
            return True
        except Exception as e:
            self.logger.error(f"Connection failed: {e}")
            return False

    async def _process_message(self, message) -> Optional[MarketDataPoint]:
        """Implement thorough data processing."""
        try:
            # Rate limiting
            await self.rate_limiter.acquire()

            # Data validation
            if not self.data_validator.validate(message):
                self.logger.warning(f"Invalid message: {message}")
                return None

            # Metrics collection
            self.metrics.increment('messages_processed')

            # Create standardized data point
            return MarketDataPoint(
                exchange=self.exchange_name,
                symbol=message['symbol'],
                timestamp=self._parse_timestamp(message['timestamp']),
                data_type=DataType.TICKER,
                data=self._normalize_data(message)
            )
        except Exception as e:
            self.metrics.increment('processing_errors')
            self.logger.error(f"Message processing failed: {e}")
            raise  # Let the health monitor handle it
```
### 2. Error Handling
```python
# Implement proper error handling
class RobustCollector(BaseDataCollector):
    async def _handle_messages(self) -> None:
        """Handle messages with proper error management."""
        try:
            # Check connection health
            if not await self._check_connection_health():
                raise ConnectionError("Connection health check failed")

            # Receive message with timeout
            message = await asyncio.wait_for(
                self.websocket.receive(),
                timeout=30.0  # 30 second timeout
            )

            # Process message
            data_point = await self._process_message(message)
            if data_point:
                await self._notify_callbacks(data_point)

        except asyncio.TimeoutError:
            # No data received - let the health monitor handle it
            raise ConnectionError("Message receive timeout")
        except WebSocketError as e:
            # WebSocket-specific errors
            self.logger.error(f"WebSocket error: {e}")
            raise ConnectionError(f"WebSocket failed: {e}")
        except ValidationError as e:
            # Data validation errors - don't restart for these
            self.logger.warning(f"Data validation failed: {e}")
            # Continue without raising - these are data issues, not connection issues
        except Exception as e:
            # Unexpected errors - trigger restart
            self.logger.error(f"Unexpected error: {e}")
            raise
```
### 3. Manager Setup
```python
async def setup_production_system():
    """Set up the production collector system."""
    # Create manager with appropriate settings
    manager = CollectorManager(
        manager_name="crypto_trading_system",
        global_health_check_interval=60.0,  # Check every minute
        restart_delay=30.0                  # 30s between restarts
    )

    # Add primary data sources
    exchanges = ['okx', 'binance', 'coinbase']
    symbols = ['BTC-USDT', 'ETH-USDT', 'SOL-USDT', 'AVAX-USDT']

    for exchange in exchanges:
        collector = create_collector(exchange, symbols)

        # Configure for production
        config = CollectorConfig(
            name=f"{exchange}_primary",
            exchange=exchange,
            symbols=symbols,
            data_types=["ticker", "trade"],
            auto_restart=True,
            health_check_interval=30.0,
            enabled=True
        )

        # Add callbacks for data processing
        collector.add_data_callback(DataType.TICKER, process_ticker_data)
        collector.add_data_callback(DataType.TRADE, process_trade_data)

        manager.add_collector(collector, config)

    # Start system
    success = await manager.start()
    if not success:
        raise RuntimeError("Failed to start collector system")

    return manager


# Usage
async def main():
    manager = await setup_production_system()

    # Monitor system health
    while True:
        status = manager.get_status()
        if status['statistics']['failed_collectors'] > 0:
            # Alert on failures
            await send_alert(f"Collectors failed: {manager.get_failed_collectors()}")

        # Re-check every 5 minutes
        await asyncio.sleep(300)
```
### 4. Monitoring Integration
```python
# Integrate with monitoring systems
import prometheus_client
from utils.logger import get_logger
class MonitoredCollector(BaseDataCollector):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

        # Prometheus metrics
        self.messages_counter = prometheus_client.Counter(
            'collector_messages_total',
            'Total messages processed',
            ['exchange', 'symbol', 'type']
        )
        self.errors_counter = prometheus_client.Counter(
            'collector_errors_total',
            'Total errors',
            ['exchange', 'error_type']
        )
        self.uptime_gauge = prometheus_client.Gauge(
            'collector_uptime_seconds',
            'Collector uptime',
            ['exchange']
        )

    async def _notify_callbacks(self, data_point: MarketDataPoint):
        """Override to add metrics."""
        # Update metrics
        self.messages_counter.labels(
            exchange=data_point.exchange,
            symbol=data_point.symbol,
            type=data_point.data_type.value
        ).inc()

        # Update uptime
        status = self.get_status()
        if status['statistics']['uptime_seconds']:
            self.uptime_gauge.labels(
                exchange=self.exchange_name
            ).set(status['statistics']['uptime_seconds'])

        # Call parent
        await super()._notify_callbacks(data_point)

    async def _handle_connection_error(self) -> bool:
        """Override to add error metrics."""
        self.errors_counter.labels(
            exchange=self.exchange_name,
            error_type='connection'
        ).inc()
        return await super()._handle_connection_error()
```
## Troubleshooting
### Common Issues
#### 1. Collector Won't Start
**Symptoms**: `start()` returns `False`, status shows `ERROR`
**Solutions**:
```python
# Check connection details
collector = MyCollector(symbols=["BTC-USDT"])
success = await collector.start()

if not success:
    status = collector.get_status()
    print(f"Error: {status['statistics']['last_error']}")
# Common fixes:
# - Verify API credentials
# - Check network connectivity
# - Validate symbol names
# - Review exchange-specific requirements
```
#### 2. Frequent Restarts
**Symptoms**: High restart count, intermittent data
**Solutions**:
```python
# Adjust health check intervals
collector = MyCollector(
    symbols=["BTC-USDT"],
    health_check_interval=60.0,  # Increase interval
    auto_restart=True
)
# Check for:
# - Network instability
# - Exchange rate limiting
# - Invalid message formats
# - Resource constraints
```
#### 3. No Data Received
**Symptoms**: Collector running but no callbacks triggered
**Solutions**:
```python
# Check data flow
collector = MyCollector(symbols=["BTC-USDT"])

def debug_callback(data_point):
    print(f"Received: {data_point}")

collector.add_data_callback(DataType.TICKER, debug_callback)
# Verify:
# - Callback registration
# - Symbol subscription
# - Message parsing logic
# - Exchange data availability
```
#### 4. Memory Leaks
**Symptoms**: Increasing memory usage over time
**Solutions**:
```python
# Implement proper cleanup
class CleanCollector(BaseDataCollector):
    async def disconnect(self):
        """Ensure proper cleanup."""
        # Clear buffers
        self.message_buffer.clear()

        # Close connections
        if self.websocket:
            await self.websocket.close()
            self.websocket = None

        # Clear callbacks
        for callback_list in self._data_callbacks.values():
            callback_list.clear()

        await super().disconnect()
```
### Performance Optimization
#### 1. Batch Processing
```python
import time


class BatchingCollector(BaseDataCollector):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.message_batch = []
        self.batch_size = 100
        self.batch_timeout = 1.0
        self.last_batch_time = time.time()  # Initialize so the timeout check works

    async def _handle_messages(self):
        """Batch process messages for efficiency."""
        message = await self.websocket.receive()
        self.message_batch.append(message)

        # Process batch when full or when the timeout elapses
        if (len(self.message_batch) >= self.batch_size or
                time.time() - self.last_batch_time > self.batch_timeout):
            await self._process_batch()

    async def _process_batch(self):
        """Process messages in batch."""
        batch = self.message_batch.copy()
        self.message_batch.clear()
        self.last_batch_time = time.time()

        for message in batch:
            data_point = await self._process_message(message)
            if data_point:
                await self._notify_callbacks(data_point)
```
#### 2. Connection Pooling
```python
import aiohttp


class PooledCollector(BaseDataCollector):
    async def connect(self) -> bool:
        """Use connection pooling for better performance."""
        try:
            # Create a pooled HTTP session (ClientSession is constructed, not awaited)
            self.connection_pool = aiohttp.ClientSession(
                connector=aiohttp.TCPConnector(
                    limit=10,                   # Pool size
                    limit_per_host=5,           # Per-host limit
                    keepalive_timeout=300,      # Keep connections alive
                    enable_cleanup_closed=True
                )
            )
            return True
        except Exception:
            return False
```
### Logging and Debugging
#### Enable Debug Logging
```python
import os
os.environ['LOG_LEVEL'] = 'DEBUG'
# Collector will now log detailed information
collector = MyCollector(symbols=["BTC-USDT"])
await collector.start()
# Check logs in ./logs/ directory
# - collector_debug.log: Debug information
# - collector_info.log: General information
# - collector_error.log: Error messages
```
#### Custom Logging
```python
import time

from utils.logger import get_logger


class CustomCollector(BaseDataCollector):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Add custom logger
        self.performance_logger = get_logger(
            f"{self.exchange_name}_performance",
            verbose=False
        )

    async def _process_message(self, message):
        start_time = time.time()
        try:
            result = await super()._process_message(message)

            # Log performance
            processing_time = time.time() - start_time
            self.performance_logger.info(
                f"Message processed in {processing_time:.3f}s"
            )
            return result
        except Exception as e:
            self.performance_logger.error(
                f"Processing failed after {time.time() - start_time:.3f}s: {e}"
            )
            raise
```
## Integration Examples
### Django Integration
```python
# Django management command
from django.core.management.base import BaseCommand
from data import CollectorManager
import asyncio
class Command(BaseCommand):
    help = 'Start crypto data collectors'

    def handle(self, *args, **options):
        async def run_collectors():
            manager = CollectorManager("django_collectors")

            # Add collectors
            from myapp.collectors import OKXCollector, BinanceCollector
            manager.add_collector(OKXCollector(['BTC-USDT']))
            manager.add_collector(BinanceCollector(['ETH-USDT']))

            # Start system
            await manager.start()

            # Keep running
            try:
                while True:
                    await asyncio.sleep(60)
                    status = manager.get_status()
                    self.stdout.write(f"Status: {status['statistics']}")
            except KeyboardInterrupt:
                await manager.stop()

        asyncio.run(run_collectors())
```
### FastAPI Integration
```python
# FastAPI application
from fastapi import FastAPI
from data import CollectorManager
import asyncio
app = FastAPI()
manager = None


@app.on_event("startup")
async def startup_event():
    global manager
    manager = CollectorManager("fastapi_collectors")

    # Add collectors
    from collectors import OKXCollector
    collector = OKXCollector(['BTC-USDT', 'ETH-USDT'])
    manager.add_collector(collector)

    # Start in background
    await manager.start()


@app.on_event("shutdown")
async def shutdown_event():
    global manager
    if manager:
        await manager.stop()


@app.get("/collector/status")
async def get_collector_status():
    return manager.get_status()


@app.post("/collector/{name}/restart")
async def restart_collector(name: str):
    success = await manager.restart_collector(name)
    return {"success": success}
```
### Celery Integration
```python
# Celery task
from celery import Celery
from data import CollectorManager
import asyncio
app = Celery('crypto_collectors')


@app.task
def start_data_collection():
    """Start data collection as a Celery task."""
    async def run():
        manager = CollectorManager("celery_collectors")

        # Setup collectors
        from collectors import OKXCollector, BinanceCollector
        manager.add_collector(OKXCollector(['BTC-USDT']))
        manager.add_collector(BinanceCollector(['ETH-USDT']))

        # Start and monitor
        await manager.start()

        # Run until stopped
        try:
            while True:
                await asyncio.sleep(300)  # 5 minute intervals

                # Check health and restart if needed
                failed = manager.get_failed_collectors()
                if failed:
                    print(f"Restarting failed collectors: {failed}")
                    await manager.restart_all_collectors()
        except Exception as e:
            print(f"Collection error: {e}")
        finally:
            await manager.stop()

    # Run async task
    asyncio.run(run())
```
## Migration Guide
### From Manual Connection Management
**Before** (manual management):
```python
class OldCollector:
    def __init__(self):
        self.websocket = None
        self.running = False

    async def start(self):
        self.running = True
        while self.running:
            try:
                self.websocket = await connect()
                await self.listen()
            except Exception as e:
                print(f"Error: {e}")
                await asyncio.sleep(5)  # Manual retry
```
**After** (with BaseDataCollector):
```python
class NewCollector(BaseDataCollector):
    def __init__(self):
        super().__init__("exchange", ["BTC-USDT"])
        # Auto-restart and health monitoring included

    async def connect(self) -> bool:
        self.websocket = await connect()
        return True

    async def _handle_messages(self):
        message = await self.websocket.receive()
        # Error handling and restart logic are automatic
```
### From Basic Monitoring
**Before** (basic monitoring):
```python
# Manual status tracking
status = {
    'connected': False,
    'last_message': None,
    'error_count': 0
}

# Manual health checks
async def health_check():
    if time.time() - status['last_message'] > 300:
        print("No data for 5 minutes!")
```
**After** (comprehensive monitoring):
```python
# Automatic health monitoring
collector = MyCollector(["BTC-USDT"])

# Rich status information
status = collector.get_status()
health = collector.get_health_status()

# Automatic alerts and recovery
if not health['is_healthy']:
    print(f"Issues: {health['issues']}")
    # Auto-restart already triggered
```
---
## Support and Contributing
### Getting Help
1. **Check Logs**: Review logs in `./logs/` directory
2. **Status Information**: Use `get_status()` and `get_health_status()` methods
3. **Debug Mode**: Set `LOG_LEVEL=DEBUG` for detailed logging
4. **Test with Demo**: Run `examples/collector_demo.py` to verify setup
### Contributing
The data collector system is designed to be extensible. Contributions are welcome for:
- New exchange implementations
- Enhanced monitoring features
- Performance optimizations
- Additional data types
- Integration examples
### License
This documentation and the associated code are part of the Crypto Trading Bot Platform project.
---
*For more information, see the main project documentation in `/docs/`.*