Module: level_parser

Purpose

The level_parser module provides utilities for parsing and normalizing orderbook level data supplied as strings. It accepts JSON and Python literal representations and converts them into standardized [price, size] pairs of floats for downstream processing.

Public Interface

Functions

  • normalize_levels(levels: Any) -> List[List[float]]: Parse levels into [[price, size], ...] format, filtering out zero/negative sizes
  • parse_levels_including_zeros(levels: Any) -> List[Tuple[float, float]]: Parse levels preserving zero sizes for deletion operations

Private Functions

  • _parse_string_to_list(levels: Any) -> List[Any]: Core parsing logic trying JSON first, then literal_eval
  • _extract_price_size(item: Any) -> Tuple[Any, Any]: Extract price/size from dict or list/tuple formats
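
The JSON-first, literal_eval-fallback strategy described for _parse_string_to_list could look roughly like the sketch below. This is an illustration, not the module's actual source: the name parse_string_to_list_sketch, the pass-through of already-parsed sequences, and the exact exceptions caught are all assumptions.

```python
import ast
import json
from typing import Any, List


def parse_string_to_list_sketch(levels: Any) -> List[Any]:
    """Hypothetical stand-in for _parse_string_to_list (not the real source)."""
    # Assumption: input that is already a list/tuple is passed through.
    if isinstance(levels, (list, tuple)):
        return list(levels)
    if not isinstance(levels, str):
        return []
    try:
        # Fast path: the string is valid JSON.
        parsed = json.loads(levels)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        try:
            # Fallback: Python literals such as tuples, which JSON rejects.
            parsed = ast.literal_eval(levels)
        except (ValueError, SyntaxError):
            return []
    # Anything that is not a sequence of levels degrades to an empty list.
    return list(parsed) if isinstance(parsed, (list, tuple)) else []
```

A string such as '[(50000.0, 1.5)]' fails the JSON step because JSON has no tuple syntax, which is why the fallback is needed at all.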

Usage Examples

from level_parser import normalize_levels, parse_levels_including_zeros

# Parse standard levels (filters zeros)
levels = normalize_levels('[[50000.0, 1.5], [49999.0, 2.0]]')
# Returns: [[50000.0, 1.5], [49999.0, 2.0]]

# Parse with zero sizes preserved (for deletions)
updates = parse_levels_including_zeros('[[50000.0, 0.0], [49999.0, 1.5]]')
# Returns: [(50000.0, 0.0), (49999.0, 1.5)]

# Supports dict format
dict_levels = normalize_levels('[{"price": 50000.0, "size": 1.5}]')
# Returns: [[50000.0, 1.5]]

# Short key format
short_levels = normalize_levels('[{"p": 50000.0, "s": 1.5}]')
# Returns: [[50000.0, 1.5]]

Dependencies

Standard Library

  • json: Primary parsing method for level data
  • ast.literal_eval: Fallback parsing for Python literal formats
  • logging: Debug logging for parsing issues
  • typing: Type annotations

Input Formats Supported

JSON Array Format

[[50000.0, 1.5], [49999.0, 2.0]]

Dict Format (Full Keys)

[{"price": 50000.0, "size": 1.5}, {"price": 49999.0, "size": 2.0}]

Dict Format (Short Keys)

[{"p": 50000.0, "s": 1.5}, {"p": 49999.0, "s": 2.0}]

Python Literal Format

"[(50000.0, 1.5), (49999.0, 2.0)]"
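
The formats above differ only in how each level item carries its price and size. A minimal sketch of the extraction step, assuming full keys take precedence over short keys and that unrecognized items yield (None, None), might be written as follows. The name extract_price_size_sketch is hypothetical, and the real _extract_price_size may behave differently.

```python
from typing import Any, Tuple


def extract_price_size_sketch(item: Any) -> Tuple[Any, Any]:
    """Hypothetical stand-in for _extract_price_size (not the real source)."""
    if isinstance(item, dict):
        # Assumption: "price"/"size" win over "p"/"s" when both are present.
        price = item.get("price", item.get("p"))
        size = item.get("size", item.get("s"))
        return price, size
    if isinstance(item, (list, tuple)) and len(item) >= 2:
        # Positional format: first element is price, second is size.
        return item[0], item[1]
    # Assumption: unknown shapes return (None, None) so the caller can skip them.
    return None, None
```

Returning raw values here leaves float conversion and filtering to the caller, which matches the Tuple[Any, Any] return type listed for the private function.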

Error Handling

  • Graceful Degradation: Returns empty list on parse failures
  • Data Validation: Filters out invalid price/size pairs
  • Type Safety: Converts all values to float before processing
  • Logging: Logs malformed input for debugging instead of raising an exception
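
The behaviors listed above (float conversion, skipping invalid pairs, logging instead of raising) can be sketched in one small helper. The function to_float_pairs_sketch and its keep_zeros flag are hypothetical and used only for illustration. In particular, dropping negative sizes even when zeros are kept is an assumption, since the source does not say how parse_levels_including_zeros treats them.

```python
import logging
from typing import Any, Iterable, List, Tuple

logger = logging.getLogger(__name__)


def to_float_pairs_sketch(
    pairs: Iterable[Tuple[Any, Any]], keep_zeros: bool = False
) -> List[Tuple[float, float]]:
    """Hypothetical helper: convert raw (price, size) pairs to floats."""
    result: List[Tuple[float, float]] = []
    for price, size in pairs:
        try:
            p, s = float(price), float(size)
        except (TypeError, ValueError):
            # Malformed pair: log and skip rather than crash.
            logger.debug("Skipping malformed level: %r, %r", price, size)
            continue
        # Assumption: negative sizes are always dropped; zero sizes
        # survive only when the caller asks for them (deletions).
        if s < 0 or (s == 0 and not keep_zeros):
            continue
        result.append((p, s))
    return result
```

With keep_zeros=False the helper mirrors the filtering described for normalize_levels; with keep_zeros=True it mirrors the zero-preserving behavior described for parse_levels_including_zeros.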

Performance Characteristics

  • Fast Path: JSON parsing prioritized for performance
  • Fallback Support: ast.literal_eval as backup for edge cases
  • Memory Efficient: Iterates over parsed items one at a time; the input string itself is still parsed in full
  • Validation: Minimal overhead with early filtering of invalid data

Testing

uv run pytest test_level_parser.py -v

Test coverage includes:

  • JSON format parsing accuracy
  • Dict format (both key styles) parsing
  • Python literal fallback parsing
  • Zero size preservation vs filtering
  • Error handling for malformed input
  • Type conversion edge cases

Known Limitations

  • Assumes well-formed numeric data (price/size as numbers)
  • Does not validate economic constraints (e.g., positive prices)
  • Limited to list/dict input formats
  • No support for streaming/incremental parsing
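
Because the module does not check economic constraints, callers that need them can apply a check to the parsed output. The sketch below is one possible caller-side filter, assuming levels are already in the [[price, size], ...] shape returned by normalize_levels. The name drop_invalid_prices is hypothetical and not part of the module.

```python
import math
from typing import Iterable, List, Sequence


def drop_invalid_prices(levels: Iterable[Sequence[float]]) -> List[List[float]]:
    """Hypothetical caller-side filter: keep only finite, positive prices."""
    return [
        [price, size]
        for price, size in levels
        if math.isfinite(price) and price > 0
    ]
```

Keeping this check outside the parser leaves the module's own behavior unchanged while letting each caller decide which constraints matter.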