Compare commits: ebf232317c...main (2 commits: a295a3f141, 65dab17424)
@@ -1,61 +0,0 @@
---
description: Global development standards and AI interaction principles
globs:
alwaysApply: true
---

# Rule: Always Apply - Global Development Standards

## AI Interaction Principles

### Step-by-Step Development
- **NEVER** generate large blocks of code without explanation
- **ALWAYS** present your plan as a concise bullet list and wait for confirmation before proceeding
- Break complex tasks into smaller, manageable pieces (≤250 lines per file, ≤50 lines per function)
- Explain your reasoning step-by-step before writing code
- Wait for explicit approval before moving to the next sub-task

### Context Awareness
- **ALWAYS** reference existing code patterns and data structures before suggesting new approaches
- Ask about existing conventions before implementing new functionality
- Preserve established architectural decisions unless explicitly asked to change them
- Maintain consistency with existing naming conventions and code style

## Code Quality Standards

### File and Function Limits
- **Maximum file size**: 250 lines
- **Maximum function size**: 50 lines
- **Maximum complexity**: If a function does more than one main thing, break it down
- **Naming**: Use clear, descriptive names that explain purpose

### Documentation Requirements
- **Every public function** must have a docstring explaining purpose, parameters, and return value
- **Every class** must have a class-level docstring
- **Complex logic** must have inline comments explaining the "why", not just the "what"
- **API endpoints** must be documented with request/response examples

### Error Handling
- **ALWAYS** include proper error handling for external dependencies
- **NEVER** use bare except clauses
- Provide meaningful error messages that help with debugging
- Log errors appropriately for the application context (see the sketch below)
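
A minimal sketch of these rules in Python (the `requests` dependency, endpoint URL, and `fetch_profile` name are illustrative, not project APIs):

```python
import logging

import requests  # illustrative external dependency

logger = logging.getLogger(__name__)


def fetch_profile(user_id: str) -> dict:
    """Fetch a user profile from an external API with explicit error handling."""
    try:
        response = requests.get(f"https://api.example.com/users/{user_id}", timeout=5)
        response.raise_for_status()
        return response.json()
    except requests.Timeout:  # specific exceptions, never a bare `except:`
        logger.warning("Profile request for user %s timed out", user_id)
        raise
    except requests.HTTPError as exc:
        # Meaningful, debuggable message; the original cause is preserved with `from`.
        logger.error("Profile request for user %s failed: %s", user_id, exc)
        raise RuntimeError(f"Could not fetch profile for user {user_id}") from exc
```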

## Security and Best Practices
- **NEVER** hardcode credentials, API keys, or sensitive data
- **ALWAYS** validate user inputs
- Use parameterized queries for database operations (see the sketch below)
- Follow the principle of least privilege
- Implement proper authentication and authorization
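
As a sketch of the parameterized-query rule, using the standard-library `sqlite3` driver (placeholder syntax varies by driver; the table and column names are hypothetical):

```python
import sqlite3


def find_user_by_email(conn: sqlite3.Connection, email: str):
    """Look up a user without interpolating input into the SQL string."""
    # Vulnerable: f"SELECT id, name FROM users WHERE email = '{email}'"
    # Safe: the driver binds the value, so input cannot alter the query.
    cursor = conn.execute("SELECT id, name FROM users WHERE email = ?", (email,))
    return cursor.fetchone()
```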

## Testing Requirements
- **Every implementation** should have corresponding unit tests
- **Every API endpoint** should have integration tests
- Test files should be placed alongside the code they test
- Use descriptive test names that explain what is being tested (see the example below)
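
For example, a test name can state both the scenario and the expected outcome (the `app.users` module and `DuplicateEmailError` are hypothetical):

```python
import pytest

from app.users import DuplicateEmailError, create_user  # hypothetical imports


def test_create_user_rejects_duplicate_email():
    """The name alone tells a reader what behavior broke if this fails."""
    create_user(name="Ana", email="ana@example.com")
    with pytest.raises(DuplicateEmailError):
        create_user(name="Ana Again", email="ana@example.com")
```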

## Response Format
- Be concise and avoid unnecessary repetition
- Focus on actionable information
- Provide examples when explaining complex concepts
- Ask clarifying questions when requirements are ambiguous
@@ -1,237 +0,0 @@
---
description: Modular design principles and architecture guidelines for scalable development
globs:
alwaysApply: false
---

# Rule: Architecture and Modular Design

## Goal
Maintain a clean, modular architecture that scales effectively and prevents the complexity issues that arise in AI-assisted development.

## Core Architecture Principles

### 1. Modular Design
- **Single Responsibility**: Each module has one clear purpose
- **Loose Coupling**: Modules depend on interfaces, not implementations
- **High Cohesion**: Related functionality is grouped together
- **Clear Boundaries**: Module interfaces are well-defined and stable

### 2. Size Constraints
- **Files**: Maximum 250 lines per file
- **Functions**: Maximum 50 lines per function
- **Classes**: Maximum 300 lines per class
- **Modules**: Maximum 10 public functions/classes per module

### 3. Dependency Management
- **Layer Dependencies**: Higher layers depend on lower layers only
- **No Circular Dependencies**: Modules cannot depend on each other cyclically
- **Interface Segregation**: Depend on specific interfaces, not broad ones
- **Dependency Injection**: Pass dependencies rather than creating them internally (see the sketch below)
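
A minimal sketch of the injection rule, anticipating the `UserRepository` interface from the Repository Pattern example below (`ReportService` is illustrative):

```python
# Rigid: the service hard-wires its own dependency.
class ReportServiceRigid:
    def __init__(self) -> None:
        self._repo = SqlUserRepository()  # cannot swap or fake in tests


# Flexible: the caller supplies the dependency, so tests can inject a fake.
class ReportService:
    def __init__(self, repo: UserRepository) -> None:
        self._repo = repo
```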

## Modular Architecture Patterns

### Layer Structure
```
src/
├── presentation/    # UI, API endpoints, CLI interfaces
├── application/     # Business logic, use cases, workflows
├── domain/          # Core business entities and rules
├── infrastructure/  # Database, external APIs, file systems
└── shared/          # Common utilities, constants, types
```

### Module Organization
```
module_name/
├── __init__.py   # Public interface exports
├── core.py       # Main module logic
├── types.py      # Type definitions and interfaces
├── utils.py      # Module-specific utilities
├── tests/        # Module tests
└── README.md     # Module documentation
```

## Design Patterns for AI Development

### 1. Repository Pattern
Separate data access from business logic:

```python
# Domain interface
class UserRepository:
    def get_by_id(self, user_id: str) -> User: ...
    def save(self, user: User) -> None: ...


# Infrastructure implementation
class SqlUserRepository(UserRepository):
    def get_by_id(self, user_id: str) -> User:
        # Database-specific implementation
        pass
```

### 2. Service Pattern
Encapsulate business logic in focused services:

```python
class UserService:
    def __init__(self, user_repo: UserRepository):
        self._user_repo = user_repo

    def create_user(self, data: UserData) -> User:
        # Validation and business logic
        # Single responsibility: user creation
        pass
```

### 3. Factory Pattern
Create complex objects with clear interfaces:

```python
class DatabaseFactory:
    @staticmethod
    def create_connection(config: DatabaseConfig) -> Connection:
        # Handle different database types
        # Encapsulate connection complexity
        pass
```

## Architecture Decision Guidelines

### When to Create New Modules
Create a new module when:
- **Functionality** exceeds size constraints (250 lines)
- **Responsibility** is distinct from existing modules
- **Dependencies** would create circular references
- **Reusability** would benefit other parts of the system
- **Testing** requires isolated test environments

### When to Split Existing Modules
Split modules when:
- **File size** exceeds 250 lines
- **Multiple responsibilities** are evident
- **Testing** becomes difficult due to complexity
- **Dependencies** become too numerous
- **Change frequency** differs significantly between parts

### Module Interface Design
```python
# Good: Clear, focused interface
class PaymentProcessor:
    def process_payment(self, amount: Money, method: PaymentMethod) -> PaymentResult:
        """Process a single payment transaction."""
        pass


# Bad: Unfocused, kitchen-sink interface
class PaymentManager:
    def process_payment(self, ...): pass
    def validate_card(self, ...): pass
    def send_receipt(self, ...): pass
    def update_inventory(self, ...): pass  # Wrong responsibility!
```

## Architecture Validation

### Architecture Review Checklist
- [ ] **Dependencies flow in one direction** (no cycles)
- [ ] **Layers are respected** (presentation doesn't call infrastructure directly)
- [ ] **Modules have single responsibility**
- [ ] **Interfaces are stable** and well-defined
- [ ] **Size constraints** are maintained
- [ ] **Testing** is straightforward for each module

### Red Flags
- **God Objects**: Classes/modules that do too many things
- **Circular Dependencies**: Modules that depend on each other
- **Deep Inheritance**: More than 3 levels of inheritance
- **Large Interfaces**: Interfaces with more than 7 methods
- **Tight Coupling**: Modules that know too much about each other's internals

## Refactoring Guidelines

### When to Refactor
- Module exceeds size constraints
- Code duplication across modules
- Difficult to test individual components
- New features require changing multiple unrelated modules
- Performance bottlenecks due to poor separation

### Refactoring Process
1. **Identify** the specific architectural problem
2. **Design** the target architecture
3. **Create tests** to verify current behavior
4. **Implement changes** incrementally
5. **Validate** that tests still pass
6. **Update documentation** to reflect changes

### Safe Refactoring Practices
- **One change at a time**: Don't mix refactoring with new features
- **Tests first**: Ensure comprehensive test coverage before refactoring
- **Incremental changes**: Small steps with verification at each stage
- **Backward compatibility**: Maintain existing interfaces during transition
- **Documentation updates**: Keep architecture documentation current

## Architecture Documentation

### Architecture Decision Records (ADRs)
Document significant decisions in `./docs/decisions/`:

```markdown
# ADR-003: Service Layer Architecture

## Status
Accepted

## Context
As the application grows, business logic is scattered across controllers and models.

## Decision
Implement a service layer to encapsulate business logic.

## Consequences
**Positive:**
- Clear separation of concerns
- Easier testing of business logic
- Better reusability across different interfaces

**Negative:**
- Additional abstraction layer
- More files to maintain
```

### Module Documentation Template
````markdown
# Module: [Name]

## Purpose
What this module does and why it exists.

## Dependencies
- **Imports from**: List of modules this depends on
- **Used by**: List of modules that depend on this one
- **External**: Third-party dependencies

## Public Interface
```python
# Key functions and classes exposed by this module
```

## Architecture Notes
- Design patterns used
- Important architectural decisions
- Known limitations or constraints
````

## Migration Strategies

### Legacy Code Integration
- **Strangler Fig Pattern**: Gradually replace old code with new modules
- **Adapter Pattern**: Create interfaces to integrate old and new code
- **Facade Pattern**: Simplify complex legacy interfaces

### Gradual Modernization
1. **Identify boundaries** in existing code
2. **Extract modules** one at a time
3. **Create interfaces** for each extracted module
4. **Test thoroughly** at each step
5. **Update documentation** continuously
@@ -1,123 +0,0 @@
---
description: AI-generated code review checklist and quality assurance guidelines
globs:
alwaysApply: false
---

# Rule: Code Review and Quality Assurance

## Goal
Establish systematic review processes for AI-generated code to maintain quality, security, and maintainability standards.

## AI Code Review Checklist

### Pre-Implementation Review
Before accepting any AI-generated code:

1. **Understand the Code**
   - [ ] Can you explain what the code does in your own words?
   - [ ] Do you understand each function and its purpose?
   - [ ] Are there any "magic" values or unexplained logic?
   - [ ] Does the code solve the actual problem stated?

2. **Architecture Alignment**
   - [ ] Does the code follow established project patterns?
   - [ ] Is it consistent with existing data structures?
   - [ ] Does it integrate cleanly with existing components?
   - [ ] Are new dependencies justified and necessary?

3. **Code Quality**
   - [ ] Are functions smaller than 50 lines?
   - [ ] Are files smaller than 250 lines?
   - [ ] Are variable and function names descriptive?
   - [ ] Is the code DRY (Don't Repeat Yourself)?

### Security Review
- [ ] **Input Validation**: All user inputs are validated and sanitized
- [ ] **Authentication**: Proper authentication checks are in place
- [ ] **Authorization**: Access controls are implemented correctly
- [ ] **Data Protection**: Sensitive data is handled securely
- [ ] **SQL Injection**: Database queries use parameterized statements
- [ ] **XSS Prevention**: Output is properly escaped (see the sketch below)
- [ ] **Error Handling**: Errors don't leak sensitive information
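
For the XSS item, a minimal sketch using the standard-library `html` module (`render_comment` is illustrative):

```python
from html import escape


def render_comment(comment: str) -> str:
    """Escape user-supplied text before embedding it in HTML."""
    # "<script>alert(1)</script>" becomes inert text, not executable markup.
    return f"<p>{escape(comment)}</p>"
```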

### Integration Review
- [ ] **Existing Functionality**: New code doesn't break existing features
- [ ] **Data Consistency**: Database changes maintain referential integrity
- [ ] **API Compatibility**: Changes don't break existing API contracts
- [ ] **Performance Impact**: New code doesn't introduce performance bottlenecks
- [ ] **Testing Coverage**: Appropriate tests are included

## Review Process

### Step 1: Initial Code Analysis
1. **Read through the entire generated code** before running it
2. **Identify patterns** that don't match existing codebase
3. **Check dependencies** - are new packages really needed?
4. **Verify logic flow** - does the algorithm make sense?

### Step 2: Security and Error Handling Review
1. **Trace data flow** from input to output
2. **Identify potential failure points** and verify error handling
3. **Check for security vulnerabilities** using the security checklist
4. **Verify proper logging** and monitoring implementation

### Step 3: Integration Testing
1. **Test with existing code** to ensure compatibility
2. **Run existing test suite** to verify no regressions
3. **Test edge cases** and error conditions
4. **Verify performance** under realistic conditions

## Common AI Code Issues to Watch For

### Overcomplication Patterns
- **Unnecessary abstractions**: AI creating complex patterns for simple tasks
- **Over-engineering**: Solutions that are more complex than needed
- **Redundant code**: AI recreating existing functionality
- **Inappropriate design patterns**: Using patterns that don't fit the use case

### Context Loss Indicators
- **Inconsistent naming**: Different conventions from existing code
- **Wrong data structures**: Using different patterns than established
- **Ignored existing functions**: Reimplementing existing functionality
- **Architectural misalignment**: Code that doesn't fit the overall design

### Technical Debt Indicators
- **Magic numbers**: Hardcoded values without explanation
- **Poor error messages**: Generic or unhelpful error handling
- **Missing documentation**: Code without adequate comments
- **Tight coupling**: Components that are too interdependent

## Quality Gates

### Mandatory Reviews
All AI-generated code must pass these gates before acceptance:

1. **Security Review**: No security vulnerabilities detected
2. **Integration Review**: Integrates cleanly with existing code
3. **Performance Review**: Meets performance requirements
4. **Maintainability Review**: Code can be easily modified by team members
5. **Documentation Review**: Adequate documentation is provided

### Acceptance Criteria
- [ ] Code is understandable by any team member
- [ ] Integration requires minimal changes to existing code
- [ ] Security review passes all checks
- [ ] Performance meets established benchmarks
- [ ] Documentation is complete and accurate

## Rejection Criteria
Reject AI-generated code if:
- Security vulnerabilities are present
- Code is too complex for the problem being solved
- Integration requires major refactoring of existing code
- Code duplicates existing functionality without justification
- Documentation is missing or inadequate

## Review Documentation
For each review, document:
- Issues found and how they were resolved
- Performance impact assessment
- Security concerns and mitigations
- Integration challenges and solutions
- Recommendations for future similar tasks
@@ -1,93 +0,0 @@
---
description: Context management for maintaining codebase awareness and preventing context drift
globs:
alwaysApply: false
---

# Rule: Context Management

## Goal
Maintain comprehensive project context to prevent context drift and ensure AI-generated code integrates seamlessly with existing codebase patterns and architecture.

## Context Documentation Requirements

### PRD.md File Documentation
1. **Project Overview**
   - Business objectives and goals
   - Target users and use cases
   - Key success metrics

### CONTEXT.md File Structure
Every project must maintain a `CONTEXT.md` file in the root directory with the sections below (a filled-in skeleton follows this list):

1. **Architecture Overview**
   - High-level system architecture
   - Key design patterns used
   - Database schema overview
   - API structure and conventions

2. **Technology Stack**
   - Programming languages and versions
   - Frameworks and libraries
   - Database systems
   - Development and deployment tools

3. **Coding Conventions**
   - Naming conventions
   - File organization patterns
   - Code structure preferences
   - Import/export patterns

4. **Current Implementation Status**
   - Completed features
   - Work in progress
   - Known technical debt
   - Planned improvements
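
A skeleton of such a file might look like the following (every entry is a placeholder to be replaced with the project's actual details):

```markdown
# Project Context

## Architecture Overview
- Layered architecture: presentation → application → domain → infrastructure
- Key patterns: repository pattern for data access, service layer for business logic

## Technology Stack
- Python 3.12, FastAPI, PostgreSQL
- Tooling: uv, pytest, ruff

## Coding Conventions
- snake_case functions, PascalCase classes; tests live alongside the code they test

## Current Implementation Status
- Done: user registration, authentication
- In progress: payment integration
- Known debt: legacy import script needs refactoring
```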

## Context Maintenance Protocol

### Before Every Coding Session
1. **Review CONTEXT.md and PRD.md** to understand current project state
2. **Scan recent changes** in git history to understand latest patterns
3. **Identify existing patterns** for similar functionality before implementing new features
4. **Ask for clarification** if existing patterns are unclear or conflicting

### During Development
1. **Reference existing code** when explaining implementation approaches
2. **Maintain consistency** with established patterns and conventions
3. **Update CONTEXT.md** when making architectural decisions
4. **Document deviations** from established patterns with reasoning

### Context Preservation Strategies
- **Incremental development**: Build on existing patterns rather than creating new ones
- **Pattern consistency**: Use established data structures and function signatures
- **Integration awareness**: Consider how new code affects existing functionality
- **Dependency management**: Understand existing dependencies before adding new ones

## Context Prompting Best Practices

### Effective Context Sharing
- Include relevant sections of CONTEXT.md in prompts for complex tasks
- Reference specific existing files when asking for similar functionality
- Provide examples of existing patterns when requesting new implementations
- Share recent git commit messages to understand latest changes

### Context Window Optimization
- Prioritize the most relevant context for the current task
- Use @filename references to include specific files
- Break large contexts into focused, task-specific chunks
- Update context references as the project evolves

## Red Flags - Context Loss Indicators
- AI suggests patterns that conflict with existing code
- New implementations ignore established conventions
- Proposed solutions don't integrate with existing architecture
- Code suggestions require significant refactoring of existing functionality

## Recovery Protocol
When context loss is detected:
1. **Stop development** and review CONTEXT.md
2. **Analyze existing codebase** for established patterns
3. **Update context documentation** with missing information
4. **Restart task** with proper context provided
5. **Test integration** with existing code before proceeding
@@ -1,67 +0,0 @@
---
description: Creating PRD for a project or specific task/function
globs:
alwaysApply: false
---

# Rule: Generating a Product Requirements Document (PRD)

## Goal

To guide an AI assistant in creating a detailed Product Requirements Document (PRD) in Markdown format, based on an initial user prompt. The PRD should be clear, actionable, and suitable for a junior developer to understand and implement the feature.

## Process

1. **Receive Initial Prompt:** The user provides a brief description or request for a new feature or functionality.
2. **Ask Clarifying Questions:** Before writing the PRD, the AI *must* ask clarifying questions to gather sufficient detail. The goal is to understand the "what" and "why" of the feature, not necessarily the "how" (which the developer will figure out).
3. **Generate PRD:** Based on the initial prompt and the user's answers to the clarifying questions, generate a PRD using the structure outlined below.
4. **Save PRD:** Save the generated document as `prd-[feature-name].md` inside the `/tasks` directory.

## Clarifying Questions (Examples)

The AI should adapt its questions based on the prompt, but here are some common areas to explore:

* **Problem/Goal:** "What problem does this feature solve for the user?" or "What is the main goal we want to achieve with this feature?"
* **Target User:** "Who is the primary user of this feature?"
* **Core Functionality:** "Can you describe the key actions a user should be able to perform with this feature?"
* **User Stories:** "Could you provide a few user stories? (e.g., As a [type of user], I want to [perform an action] so that [benefit].)"
* **Acceptance Criteria:** "How will we know when this feature is successfully implemented? What are the key success criteria?"
* **Scope/Boundaries:** "Are there any specific things this feature *should not* do (non-goals)?"
* **Data Requirements:** "What kind of data does this feature need to display or manipulate?"
* **Design/UI:** "Are there any existing design mockups or UI guidelines to follow?" or "Can you describe the desired look and feel?"
* **Edge Cases:** "Are there any potential edge cases or error conditions we should consider?"

## PRD Structure

The generated PRD should include the following sections (a minimal skeleton follows this list):

1. **Introduction/Overview:** Briefly describe the feature and the problem it solves. State the goal.
2. **Goals:** List the specific, measurable objectives for this feature.
3. **User Stories:** Detail the user narratives describing feature usage and benefits.
4. **Functional Requirements:** List the specific functionalities the feature must have. Use clear, concise language (e.g., "The system must allow users to upload a profile picture."). Number these requirements.
5. **Non-Goals (Out of Scope):** Clearly state what this feature will *not* include to manage scope.
6. **Design Considerations (Optional):** Link to mockups, describe UI/UX requirements, or mention relevant components/styles if applicable.
7. **Technical Considerations (Optional):** Mention any known technical constraints, dependencies, or suggestions (e.g., "Should integrate with the existing Auth module").
8. **Success Metrics:** How will the success of this feature be measured? (e.g., "Increase user engagement by 10%", "Reduce support tickets related to X").
9. **Open Questions:** List any remaining questions or areas needing further clarification.
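
A minimal skeleton following that structure (all bracketed values are placeholders):

```markdown
# PRD: [Feature Name]

## Introduction/Overview
One paragraph: the feature, the problem it solves, and the goal.

## Goals
1. [Specific, measurable objective]

## User Stories
- As a [type of user], I want to [perform an action] so that [benefit].

## Functional Requirements
1. The system must [specific capability].
2. The system must [specific capability].

## Non-Goals (Out of Scope)
- [What this feature will not do]

## Design Considerations (Optional)
## Technical Considerations (Optional)

## Success Metrics
- [How success will be measured]

## Open Questions
- [Remaining question]
```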

## Target Audience

Assume the primary reader of the PRD is a **junior developer**. Therefore, requirements should be explicit, unambiguous, and avoid jargon where possible. Provide enough detail for them to understand the feature's purpose and core logic.

## Output

* **Format:** Markdown (`.md`)
* **Location:** `/tasks/`
* **Filename:** `prd-[feature-name].md`

## Final instructions

1. Do NOT start implementing the PRD
2. Make sure to ask the user clarifying questions
3. Take the user's answers to the clarifying questions and improve the PRD
@@ -1,244 +0,0 @@
---
description: Documentation standards for code, architecture, and development decisions
globs:
alwaysApply: false
---

# Rule: Documentation Standards

## Goal
Maintain comprehensive, up-to-date documentation that supports development, onboarding, and long-term maintenance of the codebase.

## Documentation Hierarchy

### 1. Project Level Documentation (in ./docs/)
- **README.md**: Project overview, setup instructions, basic usage
- **CONTEXT.md**: Current project state, architecture decisions, patterns
- **CHANGELOG.md**: Version history and significant changes
- **CONTRIBUTING.md**: Development guidelines and processes
- **API.md**: API endpoints, request/response formats, authentication

### 2. Module Level Documentation (in ./docs/modules/)
- **[module-name].md**: Purpose, public interfaces, usage examples
- **dependencies.md**: External dependencies and their purposes
- **architecture.md**: Module relationships and data flow

### 3. Code Level Documentation
- **Docstrings**: Function and class documentation
- **Inline comments**: Complex logic explanations
- **Type hints**: Clear parameter and return types
- **README files**: Directory-specific instructions

## Documentation Standards

### Code Documentation
```python
def process_user_data(user_id: str, data: dict) -> UserResult:
    """
    Process and validate user data before storage.

    Args:
        user_id: Unique identifier for the user
        data: Dictionary containing user information to process

    Returns:
        UserResult: Processed user data with validation status

    Raises:
        ValidationError: When user data fails validation
        DatabaseError: When storage operation fails

    Example:
        >>> result = process_user_data("123", {"name": "John", "email": "john@example.com"})
        >>> print(result.status)
        'valid'
    """
```

### API Documentation Format
````markdown
### POST /api/users

Create a new user account.

**Request:**
```json
{
  "name": "string (required)",
  "email": "string (required, valid email)",
  "age": "number (optional, min: 13)"
}
```

**Response (201):**
```json
{
  "id": "uuid",
  "name": "string",
  "email": "string",
  "created_at": "iso_datetime"
}
```

**Errors:**
- 400: Invalid input data
- 409: Email already exists
````

### Architecture Decision Records (ADRs)
Document significant architecture decisions in `./docs/decisions/`:

```markdown
# ADR-001: Database Choice - PostgreSQL

## Status
Accepted

## Context
We need to choose a database for storing user data and application state.

## Decision
We will use PostgreSQL as our primary database.

## Consequences
**Positive:**
- ACID compliance ensures data integrity
- Rich query capabilities with SQL
- Good performance for our expected load

**Negative:**
- More complex setup than simpler alternatives
- Requires SQL knowledge from team members

## Alternatives Considered
- MongoDB: Rejected due to consistency requirements
- SQLite: Rejected due to scalability needs
```

## Documentation Maintenance

### When to Update Documentation

#### Always Update:
- **API changes**: Any modification to public interfaces
- **Architecture changes**: New patterns, data structures, or workflows
- **Configuration changes**: Environment variables, deployment settings
- **Dependencies**: Adding, removing, or upgrading packages
- **Business logic changes**: Core functionality modifications

#### Update Weekly:
- **CONTEXT.md**: Current development status and priorities
- **Known issues**: Bug reports and workarounds
- **Performance notes**: Bottlenecks and optimization opportunities

#### Update per Release:
- **CHANGELOG.md**: User-facing changes and improvements
- **Version documentation**: Breaking changes and migration guides
- **Examples and tutorials**: Keep sample code current

### Documentation Quality Checklist

#### Completeness
- [ ] Purpose and scope clearly explained
- [ ] All public interfaces documented
- [ ] Examples provided for complex usage
- [ ] Error conditions and handling described
- [ ] Dependencies and requirements listed

#### Accuracy
- [ ] Code examples are tested and working
- [ ] Links point to correct locations
- [ ] Version numbers are current
- [ ] Screenshots reflect current UI

#### Clarity
- [ ] Written for the intended audience
- [ ] Technical jargon is explained
- [ ] Step-by-step instructions are clear
- [ ] Visual aids used where helpful

## Documentation Automation

### Auto-Generated Documentation
- **API docs**: Generate from code annotations
- **Type documentation**: Extract from type hints
- **Module dependencies**: Auto-update from imports
- **Test coverage**: Include coverage reports

### Documentation Testing
```python
import doctest


def test_documentation_examples():
    """Verify code examples in docs actually work."""
    # One minimal approach: run the interactive (>>>) examples embedded in the
    # project docs with doctest; extend the tuple with configuration examples
    # as they are added.
    for doc in ("README.md", "docs/API.md"):
        results = doctest.testfile(doc, module_relative=False)
        assert results.failed == 0, f"{results.failed} doc example(s) failed in {doc}"
```

## Documentation Templates

### New Module Documentation Template
````markdown
# Module: [Name]

## Purpose
Brief description of what this module does and why it exists.

## Public Interface
### Functions
- `function_name(params)`: Description and example

### Classes
- `ClassName`: Purpose and basic usage

## Usage Examples
```python
# Basic usage example
```

## Dependencies
- Internal: List of internal modules this depends on
- External: List of external packages required

## Testing
How to run tests for this module.

## Known Issues
Current limitations or bugs.
````

### API Endpoint Template
```markdown
### [METHOD] /api/endpoint

Brief description of what this endpoint does.

**Authentication:** Required/Optional
**Rate Limiting:** X requests per minute

**Request:**
- Headers required
- Body schema
- Query parameters

**Response:**
- Success response format
- Error response format
- Status codes

**Example:**
Working request/response example
```

## Review and Maintenance Process

### Documentation Review
- Include documentation updates in code reviews
- Verify examples still work with code changes
- Check for broken links and outdated information
- Ensure consistency with current implementation

### Regular Audits
- Monthly review of documentation accuracy
- Quarterly assessment of documentation completeness
- Annual review of documentation structure and organization
@@ -1,207 +0,0 @@
---
description: Enhanced task list management with quality gates and iterative workflow integration
globs:
alwaysApply: false
---

# Rule: Enhanced Task List Management

## Goal
Manage task lists with integrated quality gates and iterative workflow to prevent context loss and ensure sustainable development.

## Task Implementation Protocol

### Pre-Implementation Check
Before starting any sub-task:
- [ ] **Context Review**: Have you reviewed CONTEXT.md and relevant documentation?
- [ ] **Pattern Identification**: Do you understand existing patterns to follow?
- [ ] **Integration Planning**: Do you know how this will integrate with existing code?
- [ ] **Size Validation**: Is this task small enough (≤50 lines per function, ≤250 lines per file)?

### Implementation Process
1. **One sub-task at a time**: Do **NOT** start the next sub‑task until you ask the user for permission and they say "yes" or "y"
2. **Step-by-step execution**:
   - Plan the approach in bullet points
   - Wait for approval
   - Implement the specific sub-task
   - Test the implementation
   - Update documentation if needed
3. **Quality validation**: Run through the code review checklist before marking complete

### Completion Protocol
When you finish a **sub‑task**:
1. **Immediate marking**: Change `[ ]` to `[x]`
2. **Quality check**: Verify the implementation meets quality standards
3. **Integration test**: Ensure new code works with existing functionality
4. **Documentation update**: Update relevant files if needed
5. **Parent task check**: If **all** subtasks underneath a parent task are now `[x]`, also mark the **parent task** as completed
6. **Stop and wait**: Get user approval before proceeding to the next sub-task

## Enhanced Task List Structure

### Task File Header
```markdown
# Task List: [Feature Name]

**Source PRD**: `prd-[feature-name].md`
**Status**: In Progress / Complete / Blocked
**Context Last Updated**: [Date]
**Architecture Review**: Required / Complete / N/A

## Quick Links
- [Context Documentation](./CONTEXT.md)
- [Architecture Guidelines](./docs/architecture.md)
- [Related Files](#relevant-files)
```

### Task Format with Quality Gates
```markdown
- [ ] 1.0 Parent Task Title
  - **Quality Gate**: Architecture review required
  - **Dependencies**: List any dependencies
  - [ ] 1.1 [Sub-task description 1.1]
    - **Size estimate**: [Small/Medium/Large]
    - **Pattern reference**: [Reference to existing pattern]
    - **Test requirements**: [Unit/Integration/Both]
  - [ ] 1.2 [Sub-task description 1.2]
    - **Integration points**: [List affected components]
    - **Risk level**: [Low/Medium/High]
```

## Relevant Files Management

### Enhanced File Tracking
```markdown
## Relevant Files

### Implementation Files
- `path/to/file1.ts` - Brief description of purpose and role
  - **Status**: Created / Modified / Needs Review
  - **Last Modified**: [Date]
  - **Review Status**: Pending / Approved / Needs Changes

### Test Files
- `path/to/file1.test.ts` - Unit tests for file1.ts
  - **Coverage**: [Percentage or status]
  - **Last Run**: [Date and result]

### Documentation Files
- `docs/module-name.md` - Module documentation
  - **Status**: Up to date / Needs update / Missing
  - **Last Updated**: [Date]

### Configuration Files
- `config/setting.json` - Configuration changes
  - **Environment**: [Dev/Staging/Prod affected]
  - **Backup**: [Location of backup]
```

## Task List Maintenance

### During Development
1. **Regular updates**: Update task status after each significant change
2. **File tracking**: Add new files as they are created or modified
3. **Dependency tracking**: Note when new dependencies between tasks emerge
4. **Risk assessment**: Flag tasks that become more complex than anticipated

### Quality Checkpoints
At 25%, 50%, 75%, and 100% completion:
- [ ] **Architecture alignment**: Code follows established patterns
- [ ] **Performance impact**: No significant performance degradation
- [ ] **Security review**: No security vulnerabilities introduced
- [ ] **Documentation current**: All changes are documented

### Weekly Review Process
1. **Completion assessment**: What percentage of tasks are actually complete?
2. **Quality assessment**: Are completed tasks meeting quality standards?
3. **Process assessment**: Is the iterative workflow being followed?
4. **Risk assessment**: Are there emerging risks or blockers?

## Task Status Indicators

### Status Levels
- `[ ]` **Not Started**: Task not yet begun
- `[~]` **In Progress**: Currently being worked on
- `[?]` **Blocked**: Waiting for dependencies or decisions
- `[!]` **Needs Review**: Implementation complete but needs quality review
- `[x]` **Complete**: Finished and quality approved

### Quality Indicators
- ✅ **Quality Approved**: Passed all quality gates
- ⚠️ **Quality Concerns**: Has issues but functional
- ❌ **Quality Failed**: Needs rework before approval
- 🔄 **Under Review**: Currently being reviewed

### Integration Status
- 🔗 **Integrated**: Successfully integrated with existing code
- 🔧 **Integration Issues**: Problems with existing code integration
- ⏳ **Integration Pending**: Ready for integration testing

## Emergency Procedures

### When Tasks Become Too Complex
If a sub-task grows beyond expected scope:
1. **Stop implementation** immediately
2. **Document current state** and what was discovered
3. **Break down** the task into smaller pieces
4. **Update task list** with new sub-tasks
5. **Get approval** for the new breakdown before proceeding

### When Context is Lost
If AI seems to lose track of project patterns:
1. **Pause development**
2. **Review CONTEXT.md** and recent changes
3. **Update context documentation** with current state
4. **Restart** with explicit pattern references
5. **Reduce task size** until context is re-established

### When Quality Gates Fail
If implementation doesn't meet quality standards:
1. **Mark task** with `[!]` status
2. **Document specific issues** found
3. **Create remediation tasks** if needed
4. **Don't proceed** until quality issues are resolved

## AI Instructions Integration

### Context Awareness Commands
```markdown
**Before starting any task, run these checks:**
1. @CONTEXT.md - Review current project state
2. @architecture.md - Understand design principles
3. @code-review.md - Know quality standards
4. Look at existing similar code for patterns
```

### Quality Validation Commands
```markdown
**After completing any sub-task:**
1. Run code review checklist
2. Test integration with existing code
3. Update documentation if needed
4. Mark task complete only after quality approval
```

### Workflow Commands
```markdown
**For each development session:**
1. Review incomplete tasks and their status
2. Identify next logical sub-task to work on
3. Check dependencies and blockers
4. Follow iterative workflow process
5. Update task list with progress and findings
```

## Success Metrics

### Daily Success Indicators
- Tasks are completed according to quality standards
- No sub-tasks are started without completing previous ones
- File tracking remains accurate and current
- Integration issues are caught early

### Weekly Success Indicators
- Overall task completion rate is sustainable
- Quality issues are decreasing over time
- Context loss incidents are rare
- Team confidence in codebase remains high
@@ -1,70 +0,0 @@
---
description: Generate a task list or TODO for a user requirement or implementation.
globs:
alwaysApply: false
---

# Rule: Generating a Task List from a PRD

## Goal

To guide an AI assistant in creating a detailed, step-by-step task list in Markdown format based on an existing Product Requirements Document (PRD). The task list should guide a developer through implementation.

## Output

- **Format:** Markdown (`.md`)
- **Location:** `/tasks/`
- **Filename:** `tasks-[prd-file-name].md` (e.g., `tasks-prd-user-profile-editing.md`)

## Process

1. **Receive PRD Reference:** The user points the AI to a specific PRD file.
2. **Analyze PRD:** The AI reads and analyzes the functional requirements, user stories, and other sections of the specified PRD.
3. **Phase 1: Generate Parent Tasks:** Based on the PRD analysis, create the file and generate the main, high-level tasks required to implement the feature. Use your judgement on how many high-level tasks to use; about 5 is typical. Present these tasks to the user in the specified format (without sub-tasks yet). Inform the user: "I have generated the high-level tasks based on the PRD. Ready to generate the sub-tasks? Respond with 'Go' to proceed."
4. **Wait for Confirmation:** Pause and wait for the user to respond with "Go".
5. **Phase 2: Generate Sub-Tasks:** Once the user confirms, break down each parent task into smaller, actionable sub-tasks necessary to complete the parent task. Ensure sub-tasks logically follow from the parent task and cover the implementation details implied by the PRD.
6. **Identify Relevant Files:** Based on the tasks and PRD, identify potential files that will need to be created or modified. List these under the `Relevant Files` section, including corresponding test files if applicable.
7. **Generate Final Output:** Combine the parent tasks, sub-tasks, relevant files, and notes into the final Markdown structure.
8. **Save Task List:** Save the generated document in the `/tasks/` directory with the filename `tasks-[prd-file-name].md`, where `[prd-file-name]` matches the base name of the input PRD file (e.g., if the input was `prd-user-profile-editing.md`, the output is `tasks-prd-user-profile-editing.md`).

## Output Format

The generated task list _must_ follow this structure:

```markdown
## Relevant Files

- `path/to/potential/file1.ts` - Brief description of why this file is relevant (e.g., Contains the main component for this feature).
- `path/to/file1.test.ts` - Unit tests for `file1.ts`.
- `path/to/another/file.tsx` - Brief description (e.g., API route handler for data submission).
- `path/to/another/file.test.tsx` - Unit tests for `another/file.tsx`.
- `lib/utils/helpers.ts` - Brief description (e.g., Utility functions needed for calculations).
- `lib/utils/helpers.test.ts` - Unit tests for `helpers.ts`.

### Notes

- Unit tests should typically be placed alongside the code files they are testing (e.g., `MyComponent.tsx` and `MyComponent.test.tsx` in the same directory).
- Use `npx jest [optional/path/to/test/file]` to run tests. Running without a path executes all tests found by the Jest configuration.

## Tasks

- [ ] 1.0 Parent Task Title
  - [ ] 1.1 [Sub-task description 1.1]
  - [ ] 1.2 [Sub-task description 1.2]
- [ ] 2.0 Parent Task Title
  - [ ] 2.1 [Sub-task description 2.1]
- [ ] 3.0 Parent Task Title (may not require sub-tasks if purely structural or configuration)
```

## Interaction Model

The process explicitly requires a pause after generating parent tasks to get user confirmation ("Go") before proceeding to generate the detailed sub-tasks. This ensures the high-level plan aligns with user expectations before diving into details.

## Target Audience

Assume the primary reader of the task list is a **junior developer** who will implement the feature.
@@ -1,236 +0,0 @@
---
description: Iterative development workflow for AI-assisted coding
globs:
alwaysApply: false
---

# Rule: Iterative Development Workflow

## Goal
Establish a structured, iterative development process that prevents the chaos and complexity that can arise from uncontrolled AI-assisted development.

## Development Phases

### Phase 1: Planning and Design
**Before writing any code:**

1. **Understand the Requirement**
   - Break down the task into specific, measurable objectives
   - Identify existing code patterns that should be followed
   - List dependencies and integration points
   - Define acceptance criteria

2. **Design Review**
   - Propose approach in bullet points
   - Wait for explicit approval before proceeding
   - Consider how the solution fits existing architecture
   - Identify potential risks and mitigation strategies

### Phase 2: Incremental Implementation
**One small piece at a time:**

1. **Micro-Tasks** (≤ 50 lines each)
   - Implement one function or small class at a time
   - Test immediately after implementation
   - Ensure integration with existing code
   - Document decisions and patterns used

2. **Validation Checkpoints**
   - After each micro-task, verify it works correctly
   - Check that it follows established patterns
   - Confirm it integrates cleanly with existing code
   - Get approval before moving to the next micro-task

### Phase 3: Integration and Testing
**Ensuring system coherence:**

1. **Integration Testing**
   - Test new code with existing functionality
   - Verify no regressions in existing features
   - Check performance impact
   - Validate error handling

2. **Documentation Update**
   - Update relevant documentation
   - Record any new patterns or decisions
   - Update context files if architecture changed

## Iterative Prompting Strategy

### Step 1: Context Setting
```
Before implementing [feature], help me understand:
1. What existing patterns should I follow?
2. What existing functions/classes are relevant?
3. How should this integrate with [specific existing component]?
4. What are the potential architectural impacts?
```

### Step 2: Plan Creation
```
Based on the context, create a detailed plan for implementing [feature]:
1. Break it into micro-tasks (≤50 lines each)
2. Identify dependencies and order of implementation
3. Specify integration points with existing code
4. List potential risks and mitigation strategies

Wait for my approval before implementing.
```

### Step 3: Incremental Implementation
```
Implement only the first micro-task: [specific task]
- Use existing patterns from [reference file/function]
- Keep it under 50 lines
- Include error handling
- Add appropriate tests
- Explain your implementation choices

Stop after this task and wait for approval.
```

## Quality Gates

### Before Each Implementation
- [ ] **Purpose is clear**: Can explain what this piece does and why
- [ ] **Pattern is established**: Following existing code patterns
- [ ] **Size is manageable**: Implementation is small enough to understand completely
- [ ] **Integration is planned**: Know how it connects to existing code

### After Each Implementation
- [ ] **Code is understood**: Can explain every line of implemented code
- [ ] **Tests pass**: All existing and new tests are passing
- [ ] **Integration works**: New code works with existing functionality
- [ ] **Documentation updated**: Changes are reflected in relevant documentation

### Before Moving to Next Task
- [ ] **Current task complete**: All acceptance criteria met
- [ ] **No regressions**: Existing functionality still works
- [ ] **Clean state**: No temporary code or debugging artifacts
- [ ] **Approval received**: Explicit go-ahead for next task
- [ ] **Documentation updated**: If relevant changes to the module were made

## Anti-Patterns to Avoid

### Large Block Implementation
**Don't:**
```
Implement the entire user management system with authentication,
CRUD operations, and email notifications.
```

**Do:**
```
First, implement just the User model with basic fields.
Stop there and let me review before continuing.
```

### Context Loss
**Don't:**
```
Create a new authentication system.
```

**Do:**
```
Looking at the existing auth patterns in auth.py, implement
password validation following the same structure as the
existing email validation function.
```

### Over-Engineering
**Don't:**
```
Build a flexible, extensible user management framework that
can handle any future requirements.
```

**Do:**
```
Implement user creation functionality that matches the existing
pattern in customer.py, focusing only on the current requirements.
```

## Progress Tracking

### Task Status Indicators
- 🔄 **In Planning**: Requirements gathering and design
- ⏳ **In Progress**: Currently implementing
- ✅ **Complete**: Implemented, tested, and integrated
- 🚫 **Blocked**: Waiting for decisions or dependencies
- 🔧 **Needs Refactor**: Working but needs improvement

### Weekly Review Process
1. **Progress Assessment**
   - What was completed this week?
   - What challenges were encountered?
   - How well did the iterative process work?

2. **Process Adjustment**
   - Were task sizes appropriate?
   - Did context management work effectively?
   - What improvements can be made?

3. **Architecture Review**
   - Is the code remaining maintainable?
   - Are patterns staying consistent?
   - Is technical debt accumulating?

## Emergency Procedures

### When Things Go Wrong
If development becomes chaotic or problematic:

1. **Stop Development**
   - Don't continue adding to the problem
   - Take time to assess the situation
   - Don't rush to "fix" with more AI-generated code

2. **Assess the Situation**
   - What specific problems exist?
   - How far has the code diverged from established patterns?
   - What parts are still working correctly?

3. **Recovery Process**
   - Roll back to last known good state
   - Update context documentation with lessons learned
   - Restart with smaller, more focused tasks
   - Get explicit approval for each step of recovery

### Context Recovery
When AI seems to lose track of project patterns:

1. **Context Refresh**
   - Review and update CONTEXT.md
   - Include examples of current code patterns
   - Clarify architectural decisions

2. **Pattern Re-establishment**
   - Show AI examples of existing, working code
   - Explicitly state patterns to follow
   - Start with very small, pattern-matching tasks

3. **Gradual Re-engagement**
   - Begin with simple, low-risk tasks
   - Verify pattern adherence at each step
   - Gradually increase task complexity as consistency returns

## Success Metrics

### Short-term (Daily)
- Code is understandable and well-integrated
- No major regressions introduced
- Development velocity feels sustainable
- Team confidence in codebase remains high

### Medium-term (Weekly)
- Technical debt is not accumulating
- New features integrate cleanly
- Development patterns remain consistent
- Documentation stays current

### Long-term (Monthly)
- Codebase remains maintainable as it grows
- New team members can understand and contribute
- AI assistance enhances rather than hinders development
- Architecture remains clean and purposeful
@@ -1,24 +0,0 @@
---
description:
globs:
alwaysApply: true
---
# Rule: Project specific rules

## Goal
Unify the project structure and interaction with tools and the console.

### System tools
- **ALWAYS** use UV for package management
- **ALWAYS** use Arch Linux compatible commands in the terminal

### Coding patterns
- **ALWAYS** check arguments and method names before use to avoid errors from wrong parameters or names
- If in doubt, check the [CONTEXT.md](mdc:CONTEXT.md) file and [architecture.md](mdc:docs/architecture.md)
- **PREFER** the ORM pattern for databases, with SQLAlchemy (see the sketch below)
- **DO NOT USE** emoji in code and comments
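
A minimal sketch of the preferred pattern, assuming SQLAlchemy 2.x style (the `User` model and SQLite URL are placeholders):

```python
from sqlalchemy import String, create_engine, select
from sqlalchemy.orm import DeclarativeBase, Mapped, Session, mapped_column


class Base(DeclarativeBase):
    pass


class User(Base):
    __tablename__ = "users"

    id: Mapped[int] = mapped_column(primary_key=True)
    email: Mapped[str] = mapped_column(String(255), unique=True)


engine = create_engine("sqlite:///app.db")
Base.metadata.create_all(engine)

# ORM access instead of hand-written SQL strings.
with Session(engine) as session:
    session.add(User(email="ana@example.com"))
    session.commit()
    user = session.scalar(select(User).where(User.email == "ana@example.com"))
```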

### Testing
- Run tests with UV, in the form `uv run pytest [filename]`
@@ -1,237 +0,0 @@
---
description: Code refactoring and technical debt management for AI-assisted development
globs:
alwaysApply: false
---

# Rule: Code Refactoring and Technical Debt Management

## Goal
Guide AI in systematic code refactoring to improve maintainability, reduce complexity, and prevent technical debt accumulation in AI-assisted development projects.

## When to Apply This Rule
- Code complexity has increased beyond manageable levels
- Duplicate code patterns are detected
- Performance issues are identified
- New features are difficult to integrate
- Code review reveals maintainability concerns
- Weekly technical debt assessment indicates refactoring needs

## Pre-Refactoring Assessment

Before starting any refactoring, the AI MUST:

1. **Context Analysis:**
   - Review existing `CONTEXT.md` for architectural decisions
   - Analyze current code patterns and conventions
   - Identify all files that will be affected (search the codebase for uses)
   - Check for existing tests that verify current behavior

2. **Scope Definition:**
   - Clearly define what will and will not be changed
   - Identify the specific refactoring pattern to apply
   - Estimate the blast radius of changes
   - Plan a rollback strategy if needed

3. **Documentation Review:**
   - Check `./docs/` for relevant module documentation
   - Review any existing architectural diagrams
   - Identify dependencies and integration points
   - Note any known constraints or limitations

## Refactoring Process

### Phase 1: Planning and Safety
1. **Create Refactoring Plan:**
   - Document the current state and desired end state
   - Break refactoring into small, atomic steps
   - Identify tests that must pass throughout the process
   - Plan verification steps for each change

2. **Establish Safety Net:**
   - Ensure comprehensive test coverage exists
   - If tests are missing, create them BEFORE refactoring (see the sketch after this list)
   - Document current behavior that must be preserved
   - Create a backup of the current implementation approach

3. **Get Approval:**
   - Present the refactoring plan to the user
   - Wait for explicit "Go" or "Proceed" confirmation
   - Do NOT start refactoring without approval
### Phase 2: Incremental Implementation
|
||||
4. **One Change at a Time:**
|
||||
- Implement ONE refactoring step per iteration
|
||||
- Run tests after each step to ensure nothing breaks
|
||||
- Update documentation if interfaces change
|
||||
- Mark progress in the refactoring plan
|
||||
|
||||
5. **Verification Protocol:**
|
||||
- Run all relevant tests after each change
|
||||
- Verify functionality works as expected
|
||||
- Check performance hasn't degraded
|
||||
- Ensure no new linting or type errors
|
||||
|
||||
6. **User Checkpoint:**
|
||||
- After each significant step, pause for user review
|
||||
- Present what was changed and current status
|
||||
- Wait for approval before continuing
|
||||
- Address any concerns before proceeding
|
||||
|
||||
### Phase 3: Completion and Documentation
|
||||
7. **Final Verification:**
|
||||
- Run full test suite to ensure nothing is broken
|
||||
- Verify all original functionality is preserved
|
||||
- Check that new code follows project conventions
|
||||
- Confirm performance is maintained or improved
|
||||
|
||||
8. **Documentation Update:**
|
||||
- Update `CONTEXT.md` with new patterns/decisions
|
||||
- Update module documentation in `./docs/`
|
||||
- Document any new conventions established
|
||||
- Note lessons learned for future refactoring
|
||||
|
||||
## Common Refactoring Patterns
|
||||
|
||||
### Extract Method/Function
|
||||
```
|
||||
WHEN: Functions/methods exceed 50 lines or have multiple responsibilities
|
||||
HOW:
|
||||
1. Identify logical groupings within the function
|
||||
2. Extract each group into a well-named helper function
|
||||
3. Ensure each function has a single responsibility
|
||||
4. Verify tests still pass
|
||||
```
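
A compact, hypothetical illustration of the extracted form (none of these names are project code):

```python
# Hypothetical sketch of "Extract Method": one multi-responsibility function
# split into single-purpose helpers.

def summarize_trades(trades: list[tuple[float, float]]) -> dict:
    """Summarize (price, size) pairs; assumes at least one trade."""
    return {"ohlc": _ohlc(trades), "volume": _total_volume(trades)}

def _ohlc(trades: list[tuple[float, float]]) -> tuple[float, float, float, float]:
    prices = [price for price, _ in trades]
    return prices[0], max(prices), min(prices), prices[-1]  # open, high, low, close

def _total_volume(trades: list[tuple[float, float]]) -> float:
    return sum(size for _, size in trades)
```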

### Extract Module/Class
```
WHEN: Files exceed 250 lines or handle multiple concerns
HOW:
1. Identify cohesive functionality groups
2. Create new files for each group
3. Move related functions/classes together
4. Update imports and dependencies
5. Verify module boundaries are clean
```

### Eliminate Duplication
```
WHEN: Similar code appears in multiple places
HOW:
1. Identify the common pattern or functionality
2. Extract to a shared utility function or module
3. Update all usage sites to use the shared code
4. Ensure the abstraction is not over-engineered
```

### Improve Data Structures
```
WHEN: Complex nested objects or unclear data flow
HOW:
1. Define clear interfaces/types for data structures
2. Create transformation functions between different representations
3. Ensure data flow is unidirectional where possible
4. Add validation at boundaries
```
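
A small sketch of steps 1 and 4, with hypothetical names:

```python
# Hypothetical sketch: an explicit, validated type replacing an untyped nested list.
from dataclasses import dataclass

@dataclass(frozen=True)
class Level:
    price: float
    size: float

    def __post_init__(self) -> None:
        # Validate at the boundary so bad data fails fast, close to its source.
        if self.price <= 0 or self.size < 0:
            raise ValueError(f"invalid level: price={self.price}, size={self.size}")
```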

### Reduce Coupling
```
WHEN: Modules are tightly interconnected
HOW:
1. Identify dependencies between modules
2. Extract interfaces for external dependencies
3. Use dependency injection where appropriate
4. Ensure modules can be tested in isolation
```
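
A compact sketch of steps 2-3, using a hypothetical `Writer` protocol:

```python
# Hypothetical sketch: dependency injection behind a small interface, so the
# consumer can be tested in isolation with a fake Writer.
from typing import Protocol

class Writer(Protocol):
    def write(self, payload: dict) -> None: ...

class BarEmitter:
    def __init__(self, writer: Writer) -> None:
        self.writer = writer  # injected dependency, swappable in tests

    def emit(self, bar: list[float]) -> None:
        self.writer.write({"bar": bar})
```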

## Quality Gates

Every refactoring must pass these gates:

### Technical Quality
- [ ] All existing tests pass
- [ ] No new linting errors introduced
- [ ] Code follows established project conventions
- [ ] No performance regression detected
- [ ] File sizes remain under 250 lines
- [ ] Function sizes remain under 50 lines

### Maintainability
- [ ] Code is more readable than before
- [ ] Duplicated code has been reduced
- [ ] Module responsibilities are clearer
- [ ] Dependencies are explicit and minimal
- [ ] Error handling is consistent

### Documentation
- [ ] Public interfaces are documented
- [ ] Complex logic has explanatory comments
- [ ] Architectural decisions are recorded
- [ ] Examples are provided where helpful

## AI Instructions for Refactoring

1. **Always ask for permission** before starting any refactoring work
2. **Start with tests** - ensure comprehensive coverage before changing code
3. **Work incrementally** - make small changes and verify each step
4. **Preserve behavior** - functionality must remain exactly the same
5. **Update documentation** - keep all docs current with changes
6. **Follow conventions** - maintain consistency with the existing codebase
7. **Stop and ask** if any step fails or produces unexpected results
8. **Explain changes** - clearly communicate what was changed and why

## Anti-Patterns to Avoid

### Over-Engineering
- Don't create abstractions for code that isn't duplicated
- Avoid complex inheritance hierarchies
- Don't optimize prematurely

### Breaking Changes
- Never change public APIs without explicit approval
- Don't remove functionality, even if it seems unused
- Avoid changing behavior "while we're here"

### Scope Creep
- Stick to the defined refactoring scope
- Don't add new features during refactoring
- Resist the urge to "improve" unrelated code

## Success Metrics

Track these metrics to ensure refactoring effectiveness:

### Code Quality
- Reduced cyclomatic complexity
- Lower code duplication percentage
- Improved test coverage
- Fewer linting violations

### Developer Experience
- Faster time to understand code
- Easier integration of new features
- Reduced bug introduction rate
- Higher developer confidence in changes

### Maintainability
- Clearer module boundaries
- More predictable behavior
- Easier debugging and troubleshooting
- Better performance characteristics

## Output Files

When refactoring is complete, update:
- `refactoring-log-[date].md` - Document what was changed and why
- `CONTEXT.md` - Update with new patterns and decisions
- `./docs/` - Update relevant module documentation
- Task lists - Mark refactoring tasks as complete

## Final Verification

Before marking refactoring complete:
1. Run the full test suite and verify all tests pass
2. Check that code follows all project conventions
3. Verify documentation is up to date
4. Confirm the user is satisfied with the results
5. Record lessons learned for future refactoring efforts
@@ -1,44 +0,0 @@
---
description: TODO list task implementation
globs:
alwaysApply: false
---

# Task List Management

Guidelines for managing task lists in markdown files to track progress on completing a PRD.

## Task Implementation
- **One sub-task at a time:** Do **NOT** start the next sub-task until you ask the user for permission and they say "yes" or "y"
- **Completion protocol:**
  1. When you finish a **sub-task**, immediately mark it as completed by changing `[ ]` to `[x]`.
  2. If **all** subtasks underneath a parent task are now `[x]`, also mark the **parent task** as completed.
- Stop after each sub-task and wait for the user's go-ahead.
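
For example, a task list following this protocol might look like this (task names are illustrative):

```markdown
- [ ] 1.0 Parent task
  - [x] 1.1 First sub-task (completed and marked immediately)
  - [ ] 1.2 Next sub-task (wait for the user's "yes"/"y" before starting)
```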

## Task List Maintenance

1. **Update the task list as you work:**
   - Mark tasks and subtasks as completed (`[x]`) per the protocol above.
   - Add new tasks as they emerge.

2. **Maintain the "Relevant Files" section:**
   - List every file created or modified.
   - Give each file a one-line description of its purpose.

## AI Instructions

When working with task lists, the AI must:

1. Regularly update the task list file after finishing any significant work.
2. Follow the completion protocol:
   - Mark each finished **sub-task** `[x]`.
   - Mark the **parent task** `[x]` once **all** its subtasks are `[x]`.
3. Add newly discovered tasks.
4. Keep "Relevant Files" accurate and up to date.
5. Before starting work, check which sub-task is next.
6. After implementing a sub-task, update the file and then pause for user approval.

5  .vscode/launch.json (vendored)
@@ -16,7 +16,10 @@
     "args": [
         "BTC-USDT",
         "2025-08-26",
-        "2025-08-30"
+        // "2025-08-30"
+        "2025-09-22",
+        "--timeframe-minutes", "15",
+        "--no-ui"
     ]
 }
]

3  .vscode/settings.json (vendored)
@@ -3,5 +3,6 @@
         "."
     ],
     "python.testing.unittestEnabled": false,
-    "python.testing.pytestEnabled": true
+    "python.testing.pytestEnabled": true,
+    "python.languageServer": "None"
 }

Binary file not shown (image removed; before: 105 KiB).

872  desktop_app.py (file diff suppressed because it is too large)

151  docs/API.md
@@ -1,151 +0,0 @@
# API Documentation (Current Implementation)

## Overview

This document describes the public interfaces of the current system: SQLite streaming, OHLC/depth aggregation, JSON-based IPC, and the Dash visualizer. A simplified OBI metric is included (see Notes); repository/storage layers and strategy APIs are not part of the current implementation.

## Input Database Schema (Required)

### book table
```sql
CREATE TABLE book (
    id INTEGER PRIMARY KEY,
    instrument TEXT,
    bids TEXT NOT NULL,       -- Python-literal: [[price, size, ...], ...]
    asks TEXT NOT NULL,       -- Python-literal: [[price, size, ...], ...]
    timestamp TEXT NOT NULL
);
```

### trades table
```sql
CREATE TABLE trades (
    id INTEGER PRIMARY KEY,
    instrument TEXT,
    trade_id TEXT,
    price REAL NOT NULL,
    size REAL NOT NULL,
    side TEXT NOT NULL,       -- "buy" or "sell"
    timestamp TEXT NOT NULL
);
```

## Data Access: db_interpreter.py

### Classes
- `OrderbookLevel` (dataclass): represents a price level.
- `OrderbookUpdate`: windowed book update with `bids`, `asks`, `timestamp`, `end_timestamp`.

### DBInterpreter
```python
class DBInterpreter:
    def __init__(self, db_path: Path): ...

    def stream(self) -> Iterator[tuple[OrderbookUpdate, list[tuple]]]:
        """
        Stream orderbook rows with one-row lookahead and trades in timestamp order.
        Yields pairs of (OrderbookUpdate, trades_in_window), where each trade tuple is
        (id, trade_id, price, size, side, timestamp_ms) and timestamp_ms ∈ [timestamp, end_timestamp).
        """
```

- Read-only SQLite connection with PRAGMA tuning (immutable, query_only, mmap, cache).
- Batch sizes: `BOOK_BATCH = 2048`, `TRADE_BATCH = 4096`.

## Processing: ohlc_processor.py

### OHLCProcessor
```python
class OHLCProcessor:
    def __init__(self, window_seconds: int = 60, depth_levels_per_side: int = 50): ...

    def process_trades(self, trades: list[tuple]) -> None:
        """Aggregate trades into OHLC bars per window; throttled upserts for UI responsiveness."""

    def update_orderbook(self, ob_update: OrderbookUpdate) -> None:
        """Maintain in-memory price→size maps, apply partial updates, and emit top-N depth snapshots periodically."""

    def finalize(self) -> None:
        """Emit the last OHLC bar if present."""
```

- Internal helpers for parsing levels from JSON or Python-literal strings and for applying deletions (size==0).
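
Conceptually, the windowed aggregation behind `process_trades` reduces to the following sketch; the real implementation adds throttled upserts and `viz_io` writes, which are omitted here.

```python
# Conceptual sketch of windowed OHLC aggregation; not the project implementation.
def bucket(ts_ms: int, window_seconds: int) -> int:
    """Map a millisecond timestamp to the start of its aggregation window."""
    return ts_ms - ts_ms % (window_seconds * 1000)

def aggregate(trades: list[tuple[int, float, float]], window_seconds: int = 60) -> list[list[float]]:
    """trades: (timestamp_ms, price, size) tuples in timestamp order."""
    bars: dict[int, list[float]] = {}  # window_start -> [open, high, low, close, volume]
    for ts, price, size in trades:
        key = bucket(ts, window_seconds)
        bar = bars.get(key)
        if bar is None:
            bars[key] = [price, price, price, price, size]
        else:
            bar[1] = max(bar[1], price)   # high
            bar[2] = min(bar[2], price)   # low
            bar[3] = price                # close (last trade in window)
            bar[4] += size                # volume
    return [[start, *bar] for start, bar in sorted(bars.items())]
```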

## Inter-Process Communication: viz_io.py

### Files
- `ohlc_data.json`: rolling array of OHLC bars (max 1000).
- `depth_data.json`: latest depth snapshot (bids/asks), top-N per side.
- `metrics_data.json`: rolling array of OBI OHLC bars (max 1000).

### Functions
```python
def add_ohlc_bar(timestamp: int, open_price: float, high_price: float, low_price: float, close_price: float, volume: float = 0.0) -> None: ...

def upsert_ohlc_bar(timestamp: int, open_price: float, high_price: float, low_price: float, close_price: float, volume: float = 0.0) -> None: ...

def clear_data() -> None: ...

def add_metric_bar(timestamp: int, obi_open: float, obi_high: float, obi_low: float, obi_close: float) -> None: ...

def upsert_metric_bar(timestamp: int, obi_open: float, obi_high: float, obi_low: float, obi_close: float) -> None: ...

def clear_metrics() -> None: ...
```

- Atomic writes via temp file replace to prevent partial reads.
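
A typical producer-side sequence, assuming the signatures above:

```python
# Usage sketch of viz_io, based on the signatures documented above.
from viz_io import clear_data, upsert_ohlc_bar

clear_data()  # reset ohlc_data.json before a new run
# Upsert the in-progress bar: replaces the last bar when the timestamp matches,
# otherwise appends, then trims the rolling array to 1000 entries.
upsert_ohlc_bar(1640995200000, 50000.0, 50100.0, 49900.0, 50050.0, volume=125.5)
```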

## Visualization: app.py (Dash)

- Three visuals: OHLC+Volume and Depth (cumulative) with the Plotly dark theme, plus an OBI candlestick subplot beneath Volume.
- Polling interval: 500 ms. Tolerates JSON decode races by using cached last values.

### Callback Contract
```python
@app.callback(
    [Output('ohlc-chart', 'figure'), Output('depth-chart', 'figure')],
    [Input('interval-update', 'n_intervals')]
)
```
- Reads `ohlc_data.json` (list of `[ts, open, high, low, close, volume]`).
- Reads `depth_data.json` (`{"bids": [[price, size], ...], "asks": [[price, size], ...]}`).
- Reads `metrics_data.json` (list of `[ts, obi_o, obi_h, obi_l, obi_c]`).

## CLI Orchestration: main.py

### Typer Entry Point
```python
def main(instrument: str, start_date: str, end_date: str, window_seconds: int = 60) -> None:
    """Stream DBs, process OHLC/depth, and launch the Dash visualizer in a separate process."""
```

- Discovers databases under `../data/OKX` matching the instrument and date range.
- Launches the UI: `uv run python app.py`.

## Usage Examples

### Run processing + UI
```bash
uv run python main.py BTC-USDT 2025-07-01 2025-08-01 --window-seconds 60
# Open http://localhost:8050
```

### Process trades and update depth in a loop (conceptual)
```python
from db_interpreter import DBInterpreter
from ohlc_processor import OHLCProcessor

processor = OHLCProcessor(window_seconds=60)
for ob_update, trades in DBInterpreter(db_path).stream():
    processor.process_trades(trades)
    processor.update_orderbook(ob_update)
processor.finalize()
```

## Error Handling
- Reader/writer coordination via atomic JSON prevents partial reads.
- The visualizer caches the last valid data if JSON decoding fails mid-write, and logs warnings.
- Visualizer start failures do not stop processing; the error is logged and processing continues.

## Notes
- Metrics computation includes a simplified OBI (Order Book Imbalance), calculated as bid_total - ask_total. Repository/storage layers and strategy APIs are intentionally kept minimal.
@@ -1,152 +0,0 @@
# Changelog

All notable changes to the Orderflow Backtest System are documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

### Added
- Comprehensive documentation structure with module-specific guides
- Architecture Decision Records (ADRs) for major technical decisions
- CONTRIBUTING.md with development guidelines and standards
- Enhanced module documentation in `docs/modules/` directory
- Dependency documentation with security and performance considerations

### Changed
- Documentation structure reorganized to follow documentation standards
- Improved code documentation requirements with examples
- Enhanced testing guidelines with coverage requirements

## [2.0.0] - 2024-12-Present

### Added
- **Simplified Pipeline Architecture**: Streamlined SQLite → OHLC/Depth → JSON → Dash pipeline
- **JSON-based IPC**: Atomic file-based communication between processor and visualizer
- **Real-time Visualization**: Dash web application with 500ms polling updates
- **OHLC Aggregation**: Configurable time window aggregation with throttled updates
- **Orderbook Depth**: Real-time depth snapshots with top-N level management
- **OBI Metrics**: Order Book Imbalance calculation with candlestick visualization
- **Atomic JSON Operations**: Race-condition-free data exchange via temp files
- **CLI Orchestration**: Typer-based command interface with process management
- **Performance Optimizations**: Batch reading with optimized SQLite PRAGMA settings

### Changed
- **Architecture Simplification**: Removed complex repository/storage layers
- **Data Flow**: Direct streaming from database to visualization via JSON
- **Error Handling**: Graceful degradation with cached data fallbacks
- **Process Management**: Separate visualization process launched automatically
- **Memory Efficiency**: Bounded datasets prevent unlimited memory growth

### Technical Details
- **Database Access**: Read-only SQLite with immutable mode and mmap optimization
- **Batch Sizes**: BOOK_BATCH=2048, TRADE_BATCH=4096 for optimal performance
- **JSON Formats**: Standardized schemas for OHLC, depth, and metrics data
- **Chart Architecture**: Multi-subplot layout with shared time axis
- **IPC Files**: `ohlc_data.json`, `depth_data.json`, `metrics_data.json`

### Removed
- Complex metrics storage and repository patterns
- Strategy framework components
- In-memory snapshot retention
- Multi-database orchestration complexity

## [1.0.0] - Previous Version

### Features
- **Orderbook Reconstruction**: Build complete orderbooks from SQLite database files
- **Data Models**: Core structures for `OrderbookLevel`, `Trade`, `BookSnapshot`, `Book`
- **SQLite Repository**: Read-only data access for orderbook and trades data
- **Orderbook Parser**: Text parsing with price caching optimization
- **Storage Orchestration**: High-level facade for book building
- **Basic Visualization**: OHLC candlestick charts with Qt5Agg backend
- **Strategy Framework**: Basic strategy pattern with `DefaultStrategy`
- **CLI Interface**: Command-line application for date range processing
- **Test Suite**: Unit and integration tests

### Architecture
- **Repository Pattern**: Clean separation of data access logic
- **Dataclass Models**: Lightweight, type-safe data structures
- **Parser Optimization**: Price caching for performance
- **Modular Design**: Clear separation between components

---

## Migration Guide

### Upgrading from v1.0.0 to v2.0.0

#### Code Changes Required

1. **Strategy Constructor**
   ```python
   # Before (v1.0.0)
   strategy = DefaultStrategy("BTC-USDT", enable_visualization=True)

   # After (v2.0.0)
   strategy = DefaultStrategy("BTC-USDT")
   visualizer = Visualizer(window_seconds=60, max_bars=500)
   ```

2. **Main Application Flow**
   ```python
   # Before (v1.0.0)
   strategy = DefaultStrategy(instrument, enable_visualization=True)
   storage.build_booktick_from_db(db_path, db_date)
   strategy.on_booktick(storage.book)

   # After (v2.0.0)
   strategy = DefaultStrategy(instrument)
   visualizer = Visualizer(window_seconds=60, max_bars=500)

   strategy.set_db_path(db_path)
   visualizer.set_db_path(db_path)
   storage.build_booktick_from_db(db_path, db_date)
   strategy.on_booktick(storage.book)
   visualizer.update_from_book(storage.book)
   ```

#### Database Migration
- **Automatic**: Metrics table created automatically on first run
- **No Data Loss**: Existing orderbook and trades data unchanged
- **Schema Addition**: New `metrics` table with indexes added to existing databases

#### Benefits of Upgrading
- **Memory Efficiency**: >70% reduction in memory usage
- **Performance**: Faster processing through persistent metrics storage
- **Enhanced Analysis**: Access to OBI and CVD financial indicators
- **Better Visualization**: Multi-chart display with synchronized time axis
- **Improved Architecture**: Cleaner separation of concerns

#### Testing Migration
```bash
# Verify upgrade compatibility
uv run pytest tests/test_main_integration.py -v

# Test new metrics functionality
uv run pytest tests/test_storage_metrics.py -v

# Validate visualization separation
uv run pytest tests/test_main_visualization.py -v
```

---

## Development Notes

### Performance Improvements
- **v2.0.0**: >70% memory reduction, batch processing, persistent storage
- **v1.0.0**: In-memory processing, real-time calculations

### Architecture Evolution
- **v2.0.0**: Streaming processing with metrics storage, separated visualization
- **v1.0.0**: Full snapshot retention, integrated visualization in strategies

### Testing Coverage
- **v2.0.0**: 27 tests across 6 files, integration and unit coverage
- **v1.0.0**: Basic unit tests for core components

---

*For detailed technical documentation, see the [docs/](../docs/) directory.*
@@ -1,53 +0,0 @@
# Project Context

## Current State

The project implements a modular, efficient orderflow processing pipeline:
- Stream orderflow from SQLite (`DBInterpreter.stream`).
- Process trades and orderbook updates through the modular `OHLCProcessor` architecture.
- Exchange data with the UI via atomic JSON files (`viz_io`).
- Render OHLC+Volume, Depth, and Metrics charts with a Dash app (`app.py`).

The system features a clean composition-based architecture with specialized modules for different concerns, providing OBI/CVD metrics alongside OHLC data.

## Recent Work

- **Modular Refactoring**: Extracted `ohlc_processor.py` into focused modules:
  - `level_parser.py`: Orderbook level parsing utilities (85 lines)
  - `orderbook_manager.py`: In-memory orderbook state management (90 lines)
  - `metrics_calculator.py`: OBI and CVD metrics calculation (112 lines)
- **Architecture Compliance**: Reduced the main processor from 440 to 248 lines (250-line target achieved)
- Maintained full backward compatibility and functionality
- Implemented read-only, batched SQLite streaming with PRAGMA tuning.
- Added robust JSON IPC with atomic writes and tolerant UI reads.
- Built a responsive Dash visualization polling at 500 ms.
- Unified CLI using Typer, with UV for process management.

## Conventions

- Python 3.12+, UV for dependency and command execution.
- **Modular Architecture**: Composition over inheritance, single-responsibility modules
- **File Size Limits**: ≤250 lines per file, ≤50 lines per function (enforced)
- Type hints throughout; concise, focused functions and classes.
- Error handling with meaningful logs; avoid bare exceptions.
- Prefer explicit JSON structures for IPC; keep payloads small and bounded.

## Priorities

- Improve configurability: database path discovery, CLI flags for paths and UI options.
- Add tests for `DBInterpreter.stream` and `OHLCProcessor` (run with `uv run pytest`).
- Performance tuning for large DBs while keeping the UI responsive.
- Keep documentation in sync with code; architecture docs should reflect the current design.

## Roadmap (Future Work)

- Enhance OBI metrics with additional derived calculations (e.g., normalized OBI).
- Optional repository layer abstraction and a storage orchestrator.
- Extend visualization with additional subplots and interactivity.
- Strategy module for analytics and alerting on derived metrics.

## Tooling

- Package management and commands: UV (e.g., `uv sync`, `uv run ...`).
- Visualization server: Dash on `http://localhost:8050`.
- Linting/testing: Pytest (e.g., `uv run pytest`).
@@ -1,26 +0,0 @@
# Orderflow Backtest System Documentation

## Overview

This directory contains documentation for the current Orderflow Backtest System, which streams historical orderflow from SQLite, aggregates OHLC bars, maintains a lightweight depth snapshot, and renders charts via a Dash web application.

## Documentation Structure

- `architecture.md`: System architecture, component relationships, and data flow (SQLite → Streaming → OHLC/Depth → JSON → Dash)
- `API.md`: Public interfaces for DB streaming, OHLC/depth processing, JSON IPC, Dash visualization, and CLI
- `CONTEXT.md`: Project state, conventions, and development priorities
- `decisions/`: Architecture decision records

## Quick Navigation

| Topic | Documentation |
|-------|---------------|
| Getting Started | See the usage examples in `API.md` |
| System Architecture | `architecture.md` |
| Database Schema | `API.md#input-database-schema-required` |
| Development Setup | Project root `README` and `pyproject.toml` |

## Notes

- Metrics (OBI/CVD), repository/storage layers, and strategy components have been removed from the current codebase and are planned as future enhancements.
- Use UV for package management and running commands. Example: `uv run python main.py ...`.
@@ -1,156 +0,0 @@
# System Architecture

## Overview

The current system is a streamlined, high-performance pipeline that streams orderflow from SQLite databases, aggregates trades into OHLC bars, maintains a lightweight depth snapshot, and serves visuals via a Dash web application. Inter-process communication (IPC) between the processor and visualizer uses atomic JSON files for simplicity and robustness.

## High-Level Architecture

```
┌─────────────────┐    ┌─────────────────────┐    ┌──────────────────┐    ┌──────────────────┐
│  SQLite Files   │ →  │   DB Interpreter    │ →  │   OHLC/Depth     │ →  │ Dash Visualizer  │
│ (book, trades)  │    │   (stream rows)     │    │   Processor      │    │    (app.py)      │
└─────────────────┘    └─────────────────────┘    └─────────┬────────┘    └────────────▲─────┘
                                                            │                          │
                                                            │   Atomic JSON (IPC)      │
                                                            ▼                          │
                                              ohlc_data.json, depth_data.json          │
                                              metrics_data.json                        │
                                                                                       │
                                                                              Browser UI
```

## Components

### Data Access (`db_interpreter.py`)

- `OrderbookLevel`: dataclass representing one price level.
- `OrderbookUpdate`: container for a book row window with `bids`, `asks`, `timestamp`, and `end_timestamp`.
- `DBInterpreter`:
  - `stream() -> Iterator[tuple[OrderbookUpdate, list[tuple]]]` streams the book table with lookahead and the trades table in timestamp order.
  - Efficient read-only connection with PRAGMA tuning: immutable mode, query_only, temp_store=MEMORY, mmap_size, cache_size.
  - Batching constants: `BOOK_BATCH = 2048`, `TRADE_BATCH = 4096`.
  - Each yielded `trades` element is a tuple `(id, trade_id, price, size, side, timestamp_ms)` that falls within `[book.timestamp, next_book.timestamp)`.

### Processing (Modular Architecture)

#### Main Coordinator (`ohlc_processor.py`)
- `OHLCProcessor(window_seconds=60, depth_levels_per_side=50)`: Orchestrates trade processing using composition
- `process_trades(trades)`: aggregates trades into OHLC bars and delegates CVD updates
- `update_orderbook(ob_update)`: coordinates orderbook updates and OBI metric calculation
- `finalize()`: finalizes both OHLC bars and metrics data
- `cvd_cumulative` (property): provides access to the cumulative volume delta

#### Orderbook Management (`orderbook_manager.py`)
- `OrderbookManager`: Handles in-memory orderbook state with partial updates (a minimal sketch follows below)
- Maintains separate bid/ask price→size dictionaries
- Supports deletions via zero-size updates
- Provides sorted top-N level extraction for visualization
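
The behavior described above amounts to roughly the following; this is a sketch of the documented semantics, not the actual `orderbook_manager.py`.

```python
# Sketch of the documented OrderbookManager semantics; not the project code.
class OrderbookSketch:
    def __init__(self) -> None:
        self.bids: dict[float, float] = {}  # price -> size
        self.asks: dict[float, float] = {}

    def apply(self, book: dict[float, float], price: float, size: float) -> None:
        if size <= 0:
            book.pop(price, None)  # a zero-size update deletes the level
        else:
            book[price] = size     # partial update: overwrite one level

    def top_n(self, n: int) -> tuple[list[tuple[float, float]], list[tuple[float, float]]]:
        bids = sorted(self.bids.items(), key=lambda kv: -kv[0])[:n]  # best bid first
        asks = sorted(self.asks.items(), key=lambda kv: kv[0])[:n]   # best ask first
        return bids, asks
```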

#### Metrics Calculation (`metrics_calculator.py`)
- `MetricsCalculator`: Manages OBI and CVD metrics with windowed aggregation (see the sketch below)
- Tracks CVD from trade flow (buy vs sell volume delta)
- Calculates OBI from orderbook volume imbalance
- Provides throttled updates and OHLC-style metric bars
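
Using the definitions given in these docs (OBI = bid_total - ask_total; CVD = running buy-minus-sell volume), the core calculations reduce to the sketch below; windowing and throttling are omitted.

```python
# Sketch of the two metrics as defined in this documentation; not project code.
def obi(bids: dict[float, float], asks: dict[float, float]) -> float:
    """Order Book Imbalance from in-memory price -> size maps."""
    return sum(bids.values()) - sum(asks.values())

def cvd_delta(trades: list[tuple]) -> float:
    """Signed volume of one (id, trade_id, price, size, side, timestamp_ms) batch."""
    return sum(size if side == "buy" else -size
               for _, _, _, size, side, _ in trades)
```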

#### Level Parsing (`level_parser.py`)
- Utility functions for normalizing orderbook level data:
  - `normalize_levels()`: parses levels, filtering zero/negative sizes
  - `parse_levels_including_zeros()`: preserves zeros for deletion operations
- Supports JSON and Python literal formats with robust error handling

### Inter-Process Communication (`viz_io.py`)

- File paths (relative to the project root):
  - `ohlc_data.json`: rolling list of OHLC bars (max 1000).
  - `depth_data.json`: latest depth snapshot (bids/asks).
  - `metrics_data.json`: rolling list of OBI/TOT OHLC bars (max 1000).
- Atomic writes via temp files prevent partial reads by the Dash app.
- API:
  - `add_ohlc_bar(...)`: append a new bar; trim to the last 1000.
  - `upsert_ohlc_bar(...)`: replace the last bar if the timestamp matches; else append; trim.
  - `clear_data()`: reset OHLC data to an empty list.

### Visualization (`app.py`)

- Dash application with two graphs plus an OBI subplot:
  - OHLC + Volume subplot with a shared x-axis.
  - OBI candlestick subplot (blue tones) sharing the x-axis.
  - Depth (cumulative) chart for bids and asks.
- A polling-interval (500 ms) callback reads the JSON files and updates the figures resiliently:
  - Caches last good values to tolerate in-flight writes/decoding errors.
  - Builds figures with the Plotly dark theme.
- Exposed on `http://localhost:8050` by default (`host=0.0.0.0`).

### CLI Orchestration (`main.py`)

- Typer CLI entrypoint:
  - Arguments: `instrument`, `start_date`, `end_date` (UTC, `YYYY-MM-DD`); options: `--window-seconds`.
  - Discovers SQLite files under `../data/OKX` matching the instrument.
  - Launches the Dash visualizer as a separate process: `uv run python app.py`.
  - Streams databases sequentially: for each book row, processes trades and updates the orderbook.

## Data Flow

1. Discover and open SQLite database(s) for the requested instrument.
2. Stream `book` rows with one-row lookahead to form time windows.
3. Stream `trades` in timestamp order and bucket them into the active window.
4. For each window:
   - Aggregate trades into OHLC using `OHLCProcessor.process_trades`.
   - Apply partial depth updates via `OHLCProcessor.update_orderbook` and emit periodic snapshots.
5. Persist current OHLC bar(s) and depth snapshots to JSON via atomic writes.
6. The Dash app polls the JSON files and renders charts.

## IPC JSON Schemas

- OHLC (`ohlc_data.json`): array of bars; each bar is `[ts, open, high, low, close, volume]`.
- Depth (`depth_data.json`): object with bids/asks arrays: `{"bids": [[price, size], ...], "asks": [[price, size], ...]}`.
- Metrics (`metrics_data.json`): array of bars; each bar is `[ts, obi_open, obi_high, obi_low, obi_close, tot_open, tot_high, tot_low, tot_close]`.

## Configuration

- `OHLCProcessor(window_seconds, depth_levels_per_side)` controls aggregation granularity and depth snapshot size.
- The visualizer interval (`500 ms`) balances UI responsiveness and CPU usage.
- Paths: JSON files (`ohlc_data.json`, `depth_data.json`) are colocated with the code and written atomically.
- CLI parameters select the instrument and time range; databases are expected under `../data/OKX`.

## Performance Characteristics

- Read-only SQLite tuned for fast sequential scans: immutable URI, query_only, large mmap and cache.
- Batching minimizes cursor churn and Python overhead.
- JSON IPC uses atomic replace to avoid contention; the OHLC list is bounded to 1000 entries.
- The processor throttles intra-window OHLC upserts and depth emissions to reduce I/O.

## Error Handling

- The visualizer tolerates JSON decode races by reusing last good values and logging warnings.
- The processor guards depth parsing and writes; logs at debug/info levels.
- Visualizer startup is wrapped; if it fails, processing continues without the UI.

## Security Considerations

- SQLite connections are read-only and immutable; no write queries are executed.
- File writes are confined to the project directory; no paths are derived from untrusted input.
- Logs avoid sensitive data; only operational metadata.

## Testing Guidance

- Unit tests (run with `uv run pytest`):
  - `OHLCProcessor`: window boundary handling, high/low tracking, volume accumulation, upsert behavior.
  - Depth maintenance: deletions (size==0), top-N sorting, throttling.
  - `DBInterpreter.stream`: correct trade-window assignment, end-of-stream handling.
- Integration: end-to-end generation of JSON from a tiny fixture DB and basic figure construction without launching a server.

## Roadmap (Optional Enhancements)

- Metrics: add OBI/CVD computation and persist metrics to a dedicated table.
- Repository Pattern: extract DB access into a repository module with typed methods.
- Orchestrator: introduce a `Storage` pipeline module coordinating batch processing and persistence.
- Strategy Layer: compute signals/alerts on stored metrics.
- Visualization: add OBI/CVD subplots and richer interactions.

---

This document reflects the current implementation centered on SQLite streaming, JSON-based IPC, and a Dash visualizer, providing a clear foundation for incremental enhancements.
@@ -1,122 +0,0 @@
# ADR-001: SQLite Database Choice

## Status
Accepted

## Context
The orderflow backtest system needs to efficiently store and stream large volumes of historical orderbook and trade data. Key requirements include:

- Fast sequential read access for time-series data
- Minimal setup and maintenance overhead
- Support for concurrent reads from the visualization layer
- Ability to handle databases ranging from 100MB to 10GB+
- No network dependencies for data access

## Decision
We will use SQLite as the primary database for storing historical orderbook and trade data.

## Consequences

### Positive
- **Zero configuration**: No database server setup or administration required
- **Excellent read performance**: Optimized for sequential scans with proper PRAGMA settings
- **Built-in Python support**: No external dependencies or connection libraries needed
- **File portability**: Database files can be easily shared and archived
- **ACID compliance**: Ensures data integrity during writes (for data ingestion)
- **Small footprint**: Minimal memory and storage overhead
- **Fast startup**: No connection pooling or server initialization delays

### Negative
- **Single writer limitation**: Cannot handle concurrent writes (acceptable for read-only backtests)
- **Limited scalability**: Not suitable for high-concurrency production trading systems
- **No network access**: Cannot query databases remotely (acceptable for local analysis)
- **File locking**: Potential issues with file system sharing (mitigated by read-only access)

## Implementation Details

### Schema Design
```sql
-- Orderbook snapshots with timestamp windows
CREATE TABLE book (
    id INTEGER PRIMARY KEY,
    instrument TEXT,
    bids TEXT NOT NULL,       -- JSON array of [price, size] pairs
    asks TEXT NOT NULL,       -- JSON array of [price, size] pairs
    timestamp TEXT NOT NULL
);

-- Individual trade records
CREATE TABLE trades (
    id INTEGER PRIMARY KEY,
    instrument TEXT,
    trade_id TEXT,
    price REAL NOT NULL,
    size REAL NOT NULL,
    side TEXT NOT NULL,       -- "buy" or "sell"
    timestamp TEXT NOT NULL
);

-- Indexes for efficient time-based queries
CREATE INDEX idx_book_timestamp ON book(timestamp);
CREATE INDEX idx_trades_timestamp ON trades(timestamp);
```

### Performance Optimizations
```python
import sqlite3

# Read-only connection with optimized PRAGMA settings
connection_uri = f"file:{db_path}?immutable=1&mode=ro"
conn = sqlite3.connect(connection_uri, uri=True)
conn.execute("PRAGMA query_only = 1")
conn.execute("PRAGMA temp_store = MEMORY")
conn.execute("PRAGMA mmap_size = 268435456")  # 256MB
conn.execute("PRAGMA cache_size = 10000")
```

## Alternatives Considered

### PostgreSQL
- **Rejected**: Requires server setup and maintenance
- **Pros**: Better concurrent access, richer query features
- **Cons**: Overkill for a read-only use case, deployment complexity

### Parquet Files
- **Rejected**: Limited query capabilities for time-series data
- **Pros**: Excellent compression, columnar format
- **Cons**: No indexes, complex range queries, requires additional libraries

### MongoDB
- **Rejected**: Document structure not optimal for time-series data
- **Pros**: Flexible schema, good aggregation pipeline
- **Cons**: Requires a server, higher memory usage, learning curve

### CSV Files
- **Rejected**: Poor query performance for large datasets
- **Pros**: Simple format, universal compatibility
- **Cons**: No indexing, slow filtering, type conversion overhead

### InfluxDB
- **Rejected**: Overkill for historical data analysis
- **Pros**: Optimized for time-series, good compression
- **Cons**: Additional service dependency, learning curve

## Migration Path
If scalability becomes an issue in the future:

1. **Phase 1**: Implement a database abstraction layer in `db_interpreter`
2. **Phase 2**: Add a PostgreSQL adapter for production workloads
3. **Phase 3**: Implement data partitioning for very large datasets
4. **Phase 4**: Consider distributed storage for multi-terabyte datasets

## Monitoring
Track the following metrics to validate this decision:
- Database file sizes and growth rates
- Query performance for different date ranges
- Memory usage during streaming operations
- Time to process complete backtests
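
A hedged sketch for measuring one of these metrics (sequential read throughput, the rows-per-second figure referenced in the review criteria below); the database path is a placeholder, and the batch size mirrors the documented `TRADE_BATCH = 4096`.

```python
# Sketch: measure sequential read throughput (rows/second) for the trades table.
import sqlite3
import time

conn = sqlite3.connect("file:example.db?immutable=1&mode=ro", uri=True)  # placeholder path
cur = conn.execute("SELECT id, price, size, side, timestamp FROM trades")
rows = 0
start = time.perf_counter()
while batch := cur.fetchmany(4096):  # TRADE_BATCH-sized reads
    rows += len(batch)
elapsed = time.perf_counter() - start
print(f"{rows} rows in {elapsed:.2f}s -> {rows / elapsed:,.0f} rows/s")
```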

## Review Date
This decision should be reviewed if:
- Database files consistently exceed 50GB
- Query performance degrades below 1000 rows/second
- Concurrent access requirements change
- Network-based data sharing becomes necessary
@@ -1,162 +0,0 @@
# ADR-002: JSON File-Based Inter-Process Communication

## Status
Accepted

## Context
The orderflow backtest system requires communication between the data processing pipeline and the web-based visualization frontend. Key requirements include:

- Real-time data updates from the processor to the visualization
- Tolerance for timing mismatches between writer and reader
- Simple implementation without external dependencies
- Support for different update frequencies (OHLC bars vs. orderbook depth)
- Graceful handling of process crashes or restarts

## Decision
We will use JSON files with atomic write operations for inter-process communication between the data processor and the Dash visualization frontend.

## Consequences

### Positive
- **Simplicity**: No message queues, sockets, or complex protocols
- **Fault tolerance**: File-based communication survives process restarts
- **Debugging friendly**: Data files can be inspected manually
- **No dependencies**: Built-in JSON support, no external libraries
- **Atomic operations**: Temp file + rename prevents partial reads
- **Language agnostic**: Any process can read/write JSON files
- **Bounded memory**: Rolling data windows prevent unlimited growth

### Negative
- **File I/O overhead**: Disk writes may be slower than in-memory communication
- **Polling required**: The reader must poll for updates (500ms interval)
- **Limited throughput**: Not suitable for high-frequency (microsecond) updates
- **No acknowledgments**: The writer cannot confirm the reader has processed data
- **File system dependency**: Performance varies by storage type

## Implementation Details

### File Structure
```
ohlc_data.json       # Rolling array of OHLC bars (max 1000)
depth_data.json      # Current orderbook depth snapshot
metrics_data.json    # Rolling array of OBI/CVD metrics (max 1000)
```

### Atomic Write Pattern
```python
import json
import os
from pathlib import Path
from typing import Any

def atomic_write(file_path: Path, data: Any) -> None:
    """Write data atomically to prevent partial reads."""
    temp_path = file_path.with_suffix('.tmp')
    with open(temp_path, 'w') as f:
        json.dump(data, f)
        f.flush()
        os.fsync(f.fileno())
    temp_path.replace(file_path)  # Atomic on POSIX systems
```

### Data Formats
```python
# OHLC format: [timestamp_ms, open, high, low, close, volume]
ohlc_data = [
    [1640995200000, 50000.0, 50100.0, 49900.0, 50050.0, 125.5],
    [1640995260000, 50050.0, 50200.0, 50000.0, 50150.0, 98.3]
]

# Depth format: top-N levels per side
depth_data = {
    "bids": [[49990.0, 1.5], [49985.0, 2.1]],
    "asks": [[50010.0, 1.2], [50015.0, 1.8]]
}

# Metrics format: [timestamp_ms, obi_open, obi_high, obi_low, obi_close]
metrics_data = [
    [1640995200000, 0.15, 0.22, 0.08, 0.18],
    [1640995260000, 0.18, 0.25, 0.12, 0.20]
]
```

### Error Handling
```python
# Reader pattern with graceful fallback; _LAST_DATA caches the last good read.
try:
    with open(data_file) as f:
        new_data = json.load(f)
    _LAST_DATA = new_data  # Cache successful read
except (FileNotFoundError, json.JSONDecodeError) as e:
    logging.warning(f"Using cached data: {e}")
    new_data = _LAST_DATA  # Use cached data
```

## Performance Characteristics

### Write Performance
- **Small files**: < 1MB typical; writes complete in < 10ms
- **Atomic operations**: Add ~2-5ms overhead for temp file creation
- **Throttling**: Updates limited to prevent excessive I/O

### Read Performance
- **Parse time**: < 5ms for typical JSON file sizes
- **Polling overhead**: 500ms interval balances responsiveness and CPU usage
- **Error recovery**: Cached data eliminates visual glitches

### Memory Usage
- **Bounded datasets**: Max 1000 bars × 6 fields × 8 bytes = ~48KB per file
- **JSON overhead**: ~2x memory during parsing
- **Total footprint**: < 500KB for all IPC data

## Alternatives Considered

### Redis Pub/Sub
- **Rejected**: Additional service dependency, overkill for a simple use case
- **Pros**: True real-time updates, built-in data structures
- **Cons**: External dependency, memory overhead, configuration complexity

### ZeroMQ
- **Rejected**: Additional library dependency, more complex than needed
- **Pros**: High performance, flexible patterns
- **Cons**: Learning curve, binary dependency, networking complexity

### Named Pipes/Unix Sockets
- **Rejected**: Platform-specific, more complex error handling
- **Pros**: Better performance, no file I/O
- **Cons**: Platform limitations, harder debugging, process lifetime coupling

### SQLite as Message Queue
- **Rejected**: Overkill for simple data exchange
- **Pros**: ACID transactions, complex queries possible
- **Cons**: Schema management, locking considerations, overhead

### HTTP API
- **Rejected**: Too much overhead for local communication
- **Pros**: Standard protocol, language agnostic
- **Cons**: Network stack overhead, port management, authentication

## Future Considerations

### Scalability Limits
The current approach is suitable for:
- Update frequencies: 1-10 Hz
- Data volumes: < 10MB total
- Process counts: 1 writer, a few readers

### Migration Path
If performance becomes insufficient:
1. **Phase 1**: Add compression (gzip) to reduce I/O
2. **Phase 2**: Implement shared memory for high-frequency data
3. **Phase 3**: Consider a message queue for complex routing
4. **Phase 4**: Migrate to a streaming protocol for real-time requirements

## Monitoring
Track these metrics to validate the approach:
- File write latency and frequency
- JSON parse times in the visualization
- Error rates for partial reads
- Memory usage growth over time

## Review Triggers
Reconsider this decision if:
- Update frequency requirements exceed 10 Hz
- File I/O becomes a performance bottleneck
- Multiple visualization clients need the same data
- Complex message routing becomes necessary
- Platform portability becomes a concern
@@ -1,204 +0,0 @@
# ADR-003: Dash Web Framework for Visualization

## Status
Accepted

## Context
The orderflow backtest system requires a user interface for visualizing OHLC candlestick charts, volume data, orderbook depth, and derived metrics. Key requirements include:

- Real-time chart updates with minimal latency
- Professional financial data visualization capabilities
- Support for multiple chart types (candlesticks, bars, line charts)
- Interactive features (zooming, panning, hover details)
- A dark theme suitable for trading applications
- A Python-native solution to avoid JavaScript development

## Decision
We will use Dash (by Plotly) as the web framework for building the visualization frontend, with Plotly.js for chart rendering.

## Consequences

### Positive
- **Python-native**: No JavaScript development required
- **Plotly integration**: Best-in-class financial charting capabilities
- **Reactive architecture**: Automatic UI updates via the callback system
- **Professional appearance**: High-quality charts suitable for trading applications
- **Interactive features**: Built-in zooming, panning, hover tooltips
- **Responsive design**: Bootstrap integration for modern layouts
- **Development speed**: Rapid prototyping and iteration
- **WebGL acceleration**: Smooth performance for large datasets

### Negative
- **Performance overhead**: Heavier than custom JavaScript solutions
- **Limited customization**: Constrained by the Dash component ecosystem
- **Single-page limitation**: Not suitable for complex multi-page applications
- **Memory usage**: Can be heavy for resource-constrained environments
- **Learning curve**: Callback patterns require understanding of reactive programming

## Implementation Details

### Application Structure
```python
# Main application with Bootstrap theme
app = dash.Dash(__name__, external_stylesheets=[dbc.themes.FLATLY])

# Responsive layout with a 9:3 ratio for charts:depth
app.layout = dbc.Container([
    dbc.Row([
        dbc.Col([  # OHLC + Volume + Metrics
            dcc.Graph(id='ohlc-chart', style={'height': '100vh'})
        ], width=9),
        dbc.Col([  # Orderbook Depth
            dcc.Graph(id='depth-chart', style={'height': '100vh'})
        ], width=3)
    ]),
    dcc.Interval(id='interval-update', interval=500, n_intervals=0)
])
```

### Chart Architecture
```python
# Multi-subplot chart with shared x-axis
fig = make_subplots(
    rows=3, cols=1,
    row_heights=[0.6, 0.2, 0.2],  # OHLC, Volume, Metrics
    vertical_spacing=0.02,
    shared_xaxes=True,
    subplot_titles=['Price', 'Volume', 'OBI Metrics']
)

# Candlestick chart with dark theme colors
fig.add_trace(go.Candlestick(
    x=timestamps, open=opens, high=highs, low=lows, close=closes,
    increasing_line_color='#00ff00', decreasing_line_color='#ff0000'
), row=1, col=1)
```

### Real-time Updates
```python
@app.callback(
    [Output('ohlc-chart', 'figure'), Output('depth-chart', 'figure')],
    [Input('interval-update', 'n_intervals')]
)
def update_charts(n_intervals):
    # Read data from JSON files with error handling,
    # then build and return the updated figures.
    return ohlc_fig, depth_fig
```

## Performance Characteristics

### Update Latency
- **Polling interval**: 500ms for near real-time updates
- **Chart render time**: 50-200ms depending on data size
- **Memory usage**: ~100MB for typical chart configurations
- **Browser requirements**: Modern browser with WebGL support

### Scalability Limits
- **Data points**: Up to 10,000 candlesticks without performance issues
- **Update frequency**: Optimal at 1-2 Hz, maximum ~10 Hz
- **Concurrent users**: Single-user design (development server)
- **Memory growth**: Linear with data history size

## Alternatives Considered

### Streamlit
- **Rejected**: Less interactive, slower updates, limited charting
- **Pros**: Simpler programming model, good for prototypes
- **Cons**: Poor real-time performance, limited financial chart types

### Flask + Custom JavaScript
- **Rejected**: Requires JavaScript development, more complex
- **Pros**: Complete control, potentially better performance
- **Cons**: Significant development overhead, maintenance burden

### Jupyter Notebooks
- **Rejected**: Not suitable for production deployment
- **Pros**: Great for exploration and analysis
- **Cons**: No real-time updates, not web-deployable

### Bokeh
- **Rejected**: Less mature ecosystem, fewer financial chart types
- **Pros**: Good performance, Python-native
- **Cons**: Smaller community, limited examples for financial data

### Custom React Application
- **Rejected**: Requires a separate frontend team, complex deployment
- **Pros**: Maximum flexibility, best performance potential
- **Cons**: High development cost, maintenance overhead

### Desktop GUI (Tkinter/PyQt)
- **Rejected**: Not web-accessible, limited styling options
- **Pros**: No browser dependency, good performance
- **Cons**: Deployment complexity, poor mobile support

## Configuration Options

### Theme and Styling
```python
# Dark theme configuration
dark_theme = {
    'plot_bgcolor': '#000000',
    'paper_bgcolor': '#000000',
    'font_color': '#ffffff',
    'grid_color': '#333333'
}
```

### Chart Types
- **Candlestick charts**: OHLC price data with volume
- **Bar charts**: Volume and metrics visualization
- **Line charts**: Cumulative depth and trend analysis
- **Scatter plots**: Trade-by-trade analysis (future)

### Interactive Features
- **Zoom and pan**: Time-based navigation
- **Hover tooltips**: Detailed data on mouse over
- **Crosshairs**: Precise value reading
- **Range selector**: Quick time period selection

## Future Enhancements

### Short-term (1-3 months)
- Add a range selector for time navigation
- Implement chart annotation for significant events
- Add export functionality for charts and data

### Medium-term (3-6 months)
- Multi-instrument support with tabs
- Advanced indicators and overlays
- User preference persistence

### Long-term (6+ months)
- Real-time alerts and notifications
- Strategy backtesting visualization
- Portfolio-level analytics

## Monitoring and Metrics

### Performance Monitoring
- Chart render times and update frequencies
- Memory usage growth over time
- Browser compatibility and error rates
- User interaction patterns

### Quality Metrics
- Chart accuracy compared to source data
- Visual responsiveness during heavy updates
- Error recovery from data corruption

## Review Triggers
Reconsider this decision if:
- Update frequency requirements exceed 10 Hz consistently
- Memory usage becomes prohibitive (> 1GB)
- Custom visualization requirements cannot be met
- Multi-user deployment becomes necessary
- Mobile responsiveness becomes a priority
- Integration with external charting libraries is needed

## Migration Path
If replacement becomes necessary:
1. **Phase 1**: Abstract chart-building logic from Dash specifics
2. **Phase 2**: Implement an alternative frontend while maintaining data formats
3. **Phase 3**: A/B test performance and usability
4. **Phase 4**: Complete the migration with feature parity
@@ -1,165 +0,0 @@
# Module: app

## Purpose
The `app` module provides a real-time Dash web application for visualizing OHLC candlestick charts, volume data, Order Book Imbalance (OBI) metrics, and orderbook depth. It implements a polling-based architecture that reads JSON data files and renders interactive charts with a dark theme.

## Public Interface

### Functions
- `build_empty_ohlc_fig() -> go.Figure`: Create empty OHLC chart with proper styling
- `build_empty_depth_fig() -> go.Figure`: Create empty depth chart with proper styling
- `build_ohlc_fig(data: List[list], metrics: List[list]) -> go.Figure`: Build complete OHLC+Volume+OBI chart
- `build_depth_fig(depth_data: dict) -> go.Figure`: Build orderbook depth visualization

### Global Variables
- `_LAST_DATA`: Cached OHLC data for error recovery
- `_LAST_DEPTH`: Cached depth data for error recovery
- `_LAST_METRICS`: Cached metrics data for error recovery

### Dash Application
- `app`: Main Dash application instance with Bootstrap theme
- Layout with responsive grid (9:3 ratio for OHLC:Depth charts); a minimal layout sketch follows below
- 500ms polling interval for real-time updates
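
A minimal sketch of that layout wiring, assuming the component ids used by the `update_charts` callback elsewhere in this document (`ohlc-chart`, `depth-chart`, `interval-update`); styling details are omitted:

```python
from dash import Dash, dcc
import dash_bootstrap_components as dbc

app = Dash(__name__, external_stylesheets=[dbc.themes.BOOTSTRAP])

app.layout = dbc.Container(
    [
        dbc.Row(
            [
                dbc.Col(dcc.Graph(id='ohlc-chart'), width=9),   # 9/12 width
                dbc.Col(dcc.Graph(id='depth-chart'), width=3),  # 3/12 width
            ]
        ),
        # Fires every 500 ms and drives the polling callback.
        dcc.Interval(id='interval-update', interval=500),
    ],
    fluid=True,
)
```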

## Usage Examples

### Running the Application
```bash
# Start the Dash server
uv run python app.py

# Access the web interface
# Open http://localhost:8050 in your browser
```

### Programmatic Usage
```python
from app import build_ohlc_fig, build_depth_fig

# Build charts with sample data
ohlc_data = [[1640995200000, 50000, 50100, 49900, 50050, 125.5]]
metrics_data = [[1640995200000, 0.15, 0.22, 0.08, 0.18]]
depth_data = {
    "bids": [[49990, 1.5], [49985, 2.1]],
    "asks": [[50010, 1.2], [50015, 1.8]]
}

ohlc_fig = build_ohlc_fig(ohlc_data, metrics_data)
depth_fig = build_depth_fig(depth_data)
```

## Dependencies

### Internal
- `viz_io`: Data file paths and JSON reading
  - `viz_io.DATA_FILE`: OHLC data source
  - `viz_io.DEPTH_FILE`: Depth data source
  - `viz_io.METRICS_FILE`: Metrics data source

### External
- `dash`: Web application framework
- `dash.html`, `dash.dcc`: HTML and core components
- `dash_bootstrap_components`: Bootstrap styling
- `plotly.graph_objs`: Chart objects
- `plotly.subplots`: Multiple subplot support
- `pandas`: Data manipulation (minimal usage)
- `json`: JSON file parsing
- `logging`: Error and debug logging
- `pathlib`: File path handling

## Chart Architecture

### OHLC Chart (Left Panel, 9/12 width)
- **Main subplot**: Candlestick chart with OHLC data
- **Volume subplot**: Bar chart sharing x-axis with main chart
- **OBI subplot**: Order Book Imbalance candlestick chart in blue tones
- **Shared x-axis**: Synchronized zooming and panning across subplots

### Depth Chart (Right Panel, 3/12 width)
- **Cumulative depth**: Stepped line chart showing bid/ask liquidity
- **Color coding**: Green for bids, red for asks
- **Real-time updates**: Reflects current orderbook state

## Styling and Theme

### Dark Theme Configuration
- Background: Black (`#000000`)
- Text: White (`#ffffff`)
- Grid: Dark gray with transparency
- Candlesticks: Green (up) / Red (down)
- Volume: Gray bars
- OBI: Blue tones for candlesticks
- Depth: Green (bids) / Red (asks)

### Responsive Design
- Bootstrap grid system for layout
- Fluid container for full-width usage
- 100vh height for full viewport coverage
- Configurable chart display modes

## Data Polling and Error Handling

### Polling Strategy
- **Interval**: 500ms for near real-time updates
- **Graceful degradation**: Uses cached data on JSON read errors
- **Atomic reads**: Tolerates partial writes during file updates
- **Logging**: Warnings for data inconsistencies

### Error Recovery
```python
# Error-handling pattern (sketch of the actual implementation)
def _load_ohlc(data_file):
    """Read OHLC JSON, falling back to the last good snapshot on failure."""
    global _LAST_DATA
    try:
        with open(data_file) as f:
            _LAST_DATA = json.load(f)  # cache the successful read
    except (FileNotFoundError, json.JSONDecodeError):
        logging.warning("Using cached data due to read error")
    return _LAST_DATA  # use cached data on failure
```

## Performance Characteristics

- **Client-side rendering**: Plotly.js handles chart rendering
- **Efficient updates**: Only redraws when data changes
- **Memory bounded**: Limited by max bars in data files (1000)
- **Network efficient**: Local file polling (no external API calls)

## Testing

Run application tests:
```bash
uv run pytest test_app.py -v
```

Test coverage includes:
- Chart building functions
- Data loading and caching
- Error handling scenarios
- Layout rendering
- Callback functionality

## Configuration Options

### Server Configuration
- **Host**: `0.0.0.0` (accessible from network)
- **Port**: `8050` (default Dash port)
- **Debug mode**: Disabled in production

### Chart Configuration
- **Update interval**: 500ms (configurable via dcc.Interval)
- **Display mode bar**: Enabled for user interaction
- **Logo display**: Disabled for clean interface

## Known Issues

- High CPU usage during rapid data updates
- Memory usage grows with chart history
- No authentication or access control
- Limited mobile responsiveness for complex charts

## Development Notes

- Uses Flask development server (not suitable for production)
- Callback exceptions suppressed for partial data scenarios
- Bootstrap CSS loaded from CDN
- Chart configurations optimized for financial data visualization
@@ -1,83 +0,0 @@
# Module: db_interpreter

## Purpose
The `db_interpreter` module provides efficient streaming access to SQLite databases containing orderbook and trade data. It handles batch reading, temporal windowing, and data structure normalization for downstream processing.

## Public Interface

### Classes
- `OrderbookLevel(price: float, size: float)`: Dataclass representing a single price level in the orderbook
- `OrderbookUpdate`: Container for windowed orderbook data with bids, asks, timestamp, and end_timestamp

### Functions
- `DBInterpreter(db_path: Path)`: Constructor that initializes a read-only SQLite connection with optimized PRAGMA settings

### Methods
- `stream() -> Iterator[tuple[OrderbookUpdate, list[tuple]]]`: Primary streaming interface that yields orderbook updates with associated trades in temporal windows

## Usage Examples

```python
from pathlib import Path
from db_interpreter import DBInterpreter

# Initialize interpreter
db_path = Path("data/BTC-USDT-2025-01-01.db")
interpreter = DBInterpreter(db_path)

# Stream orderbook and trade data
for ob_update, trades in interpreter.stream():
    # Process orderbook update
    print(f"Book update: {len(ob_update.bids)} bids, {len(ob_update.asks)} asks")
    print(f"Time window: {ob_update.timestamp} - {ob_update.end_timestamp}")

    # Process trades in this window
    for trade in trades:
        trade_id, price, size, side, timestamp_ms = trade[1:6]
        print(f"Trade: {side} {size} @ {price}")
```

## Dependencies

### Internal
- None (standalone module)

### External
- `sqlite3`: Database connectivity
- `pathlib`: Path handling
- `dataclasses`: Data structure definitions
- `typing`: Type annotations
- `logging`: Debug and error logging

## Performance Characteristics

- **Batch sizes**: BOOK_BATCH=2048, TRADE_BATCH=4096 for optimal memory usage
- **SQLite optimizations**: Read-only, immutable mode, large mmap and cache sizes (see the sketch below)
- **Memory efficient**: Streaming iterator pattern prevents loading the entire dataset
- **Temporal windowing**: One-row lookahead for precise time boundary calculation
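
A sketch of what those SQLite optimizations typically look like in code (the helper name and exact sizes are illustrative assumptions; the module's real constructor may differ):

```python
import sqlite3
from pathlib import Path

def _open_readonly(db_path: Path) -> sqlite3.Connection:
    # Read-only, immutable mode via a URI: SQLite skips locking and
    # change detection, which is safe for static capture files.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro&immutable=1", uri=True)
    conn.execute("PRAGMA mmap_size = 268435456")  # 256 MiB memory map
    conn.execute("PRAGMA cache_size = -65536")    # negative = KiB, ~64 MiB cache
    return conn
```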

## Testing

Run module tests:
```bash
uv run pytest test_db_interpreter.py -v
```

Test coverage includes:
- Batch reading correctness
- Temporal window boundary handling
- Trade-to-window assignment accuracy
- End-of-stream behavior
- Error handling for malformed data

## Known Issues

- Requires specific database schema (book and trades tables)
- Python-literal string parsing assumes well-formed input
- Large databases may require memory monitoring during streaming

## Configuration

- `BOOK_BATCH`: Number of orderbook rows to fetch per query (default: 2048)
- `TRADE_BATCH`: Number of trade rows to fetch per query (default: 4096)
- SQLite PRAGMA settings optimized for read-only sequential access
@@ -1,162 +0,0 @@
# External Dependencies

## Overview
This document describes all external dependencies used in the orderflow backtest system, their purposes, versions, and justifications for inclusion.

## Production Dependencies

### Core Framework Dependencies

#### Dash (^2.18.2)
- **Purpose**: Web application framework for interactive visualizations
- **Usage**: Real-time chart rendering and user interface
- **Justification**: Mature Python-based framework with excellent Plotly integration
- **Key Features**: Reactive components, built-in server, callback system

#### Dash Bootstrap Components (^1.6.0)
- **Purpose**: Bootstrap CSS framework integration for Dash
- **Usage**: Responsive layout grid and modern UI styling
- **Justification**: Provides professional appearance with minimal custom CSS

#### Plotly (^5.24.1)
- **Purpose**: Interactive charting and visualization library
- **Usage**: OHLC candlesticks, volume bars, depth charts, OBI metrics
- **Justification**: Industry standard for financial data visualization
- **Key Features**: WebGL acceleration, zooming/panning, dark themes

### Data Processing Dependencies

#### Pandas (^2.2.3)
- **Purpose**: Data manipulation and analysis library
- **Usage**: Minimal usage for data structure conversions in visualization
- **Justification**: Standard tool for financial data handling
- **Note**: Usage kept minimal to maintain performance

#### Typer (^0.13.1)
- **Purpose**: Modern CLI framework
- **Usage**: Command-line argument parsing and help generation
- **Justification**: Type-safe, auto-generated help, better UX than argparse
- **Key Features**: Type hints integration, automatic validation

### Data Storage Dependencies

#### SQLite3 (Built-in)
- **Purpose**: Database connectivity for historical data
- **Usage**: Read-only access to orderbook and trade data
- **Justification**: Built into Python, no external dependencies, excellent performance
- **Configuration**: Optimized with immutable mode and mmap

## Development and Testing Dependencies

#### Pytest (^8.3.4)
- **Purpose**: Testing framework
- **Usage**: Unit tests, integration tests, test discovery
- **Justification**: Standard Python testing tool with excellent plugin ecosystem

#### Coverage (^7.6.9)
- **Purpose**: Code coverage measurement
- **Usage**: Test coverage reporting and quality metrics
- **Justification**: Essential for maintaining code quality

## Build and Package Management

#### UV (Package Manager)
- **Purpose**: Fast Python package manager and task runner
- **Usage**: Dependency management, virtual environments, script execution
- **Justification**: Significantly faster than pip/poetry, better lock file format
- **Commands**: `uv sync`, `uv run`, `uv add`

## Python Standard Library Usage

### Core Libraries
- **sqlite3**: Database connectivity
- **json**: JSON serialization for IPC
- **pathlib**: Modern file path handling
- **subprocess**: Process management for visualization
- **logging**: Structured logging throughout application
- **datetime**: Date/time parsing and manipulation
- **dataclasses**: Structured data types
- **typing**: Type annotations and hints
- **tempfile**: Atomic file operations
- **ast**: Safe evaluation of Python literals

### Performance Libraries
- **itertools**: Efficient iteration patterns
- **functools**: Function decoration and caching
- **collections**: Specialized data structures

## Dependency Justifications

### Why Dash Over Alternatives?
- **vs. Streamlit**: Better real-time updates, more control over layout
- **vs. Flask + Custom JS**: Integrated Plotly support, faster development
- **vs. Jupyter**: Better for production deployment, process isolation

### Why SQLite Over Alternatives?
- **vs. PostgreSQL**: No server setup required, excellent read performance
- **vs. Parquet**: Better for time-series queries, built-in indexing
- **vs. CSV**: Proper data types, much faster queries, atomic transactions

### Why UV Over Poetry/Pip?
- **vs. Poetry**: Significantly faster dependency resolution and installation
- **vs. Pip**: Better dependency locking, integrated task runner
- **vs. Pipenv**: More active development, better performance

## Version Pinning Strategy

### Patch Version Pinning
- Core dependencies (Dash, Plotly) pinned to patch versions
- Prevents breaking changes while allowing security updates

### Range Pinning
- Development tools use caret (^) ranges for flexibility
- Testing tools can update more freely

### Lock File Management
- `uv.lock` ensures reproducible builds across environments
- Regular updates scheduled monthly for security patches

## Security Considerations

### Dependency Scanning
- Regular audit of dependencies for known vulnerabilities
- Automated updates for security patches
- Minimal dependency tree to reduce attack surface

### Data Isolation
- Read-only database access prevents data modification
- No external network connections required for core functionality
- All file operations contained within project directory

## Performance Impact

### Bundle Size
- Core runtime: ~50MB with all dependencies
- Dash frontend: additional ~10MB for JavaScript assets
- SQLite: zero overhead (built-in)

### Startup Time
- Cold start: ~2-3 seconds for full application
- UV virtual environment activation: ~100ms
- Database connection: ~50ms per file

### Memory Usage
- Base application: ~100MB
- Per 1000 OHLC bars: ~5MB additional
- Plotly charts: ~20MB for complex visualizations

## Maintenance Schedule

### Monthly
- Security update review and application
- Dependency version bump evaluation

### Quarterly
- Major version update consideration
- Performance impact assessment
- Alternative technology evaluation

### Annually
- Complete dependency audit
- Technology stack review
- Migration planning for deprecated packages
@@ -1,101 +0,0 @@
# Module: level_parser

## Purpose
The `level_parser` module provides utilities for parsing and normalizing orderbook level data from various string formats. It handles JSON and Python literal representations, converting them into standardized numeric tuples for processing.

## Public Interface

### Functions
- `normalize_levels(levels: Any) -> List[List[float]]`: Parse levels into [[price, size], ...] format, filtering out zero/negative sizes
- `parse_levels_including_zeros(levels: Any) -> List[Tuple[float, float]]`: Parse levels preserving zero sizes for deletion operations

### Private Functions
- `_parse_string_to_list(levels: Any) -> List[Any]`: Core parsing logic trying JSON first, then literal_eval
- `_extract_price_size(item: Any) -> Tuple[Any, Any]`: Extract price/size from dict or list/tuple formats

## Usage Examples

```python
from level_parser import normalize_levels, parse_levels_including_zeros

# Parse standard levels (filters zeros)
levels = normalize_levels('[[50000.0, 1.5], [49999.0, 2.0]]')
# Returns: [[50000.0, 1.5], [49999.0, 2.0]]

# Parse with zero sizes preserved (for deletions)
updates = parse_levels_including_zeros('[[50000.0, 0.0], [49999.0, 1.5]]')
# Returns: [(50000.0, 0.0), (49999.0, 1.5)]

# Supports dict format
dict_levels = normalize_levels('[{"price": 50000.0, "size": 1.5}]')
# Returns: [[50000.0, 1.5]]

# Short key format
short_levels = normalize_levels('[{"p": 50000.0, "s": 1.5}]')
# Returns: [[50000.0, 1.5]]
```

## Dependencies

### External
- `json`: Primary parsing method for level data
- `ast.literal_eval`: Fallback parsing for Python literal formats
- `logging`: Debug logging for parsing issues
- `typing`: Type annotations

## Input Formats Supported

### JSON Array Format
```json
[[50000.0, 1.5], [49999.0, 2.0]]
```

### Dict Format (Full Keys)
```json
[{"price": 50000.0, "size": 1.5}, {"price": 49999.0, "size": 2.0}]
```

### Dict Format (Short Keys)
```json
[{"p": 50000.0, "s": 1.5}, {"p": 49999.0, "s": 2.0}]
```

### Python Literal Format
```python
"[(50000.0, 1.5), (49999.0, 2.0)]"
```

## Error Handling

- **Graceful Degradation**: Returns empty list on parse failures
- **Data Validation**: Filters out invalid price/size pairs
- **Type Safety**: Converts all values to float before processing
- **Debug Logging**: Logs warnings for malformed input without crashing

## Performance Characteristics

- **Fast Path**: JSON parsing prioritized for performance
- **Fallback Support**: ast.literal_eval as backup for edge cases
- **Memory Efficient**: Processes items iteratively, not loading entire dataset
- **Validation**: Minimal overhead with early filtering of invalid data

## Testing

```bash
uv run pytest test_level_parser.py -v
```

Test coverage includes:
- JSON format parsing accuracy
- Dict format (both key styles) parsing
- Python literal fallback parsing
- Zero size preservation vs filtering
- Error handling for malformed input
- Type conversion edge cases

## Known Limitations

- Assumes well-formed numeric data (price/size as numbers)
- Does not validate economic constraints (e.g., positive prices)
- Limited to list/dict input formats
- No support for streaming/incremental parsing
@@ -1,168 +0,0 @@
# Module: main

## Purpose
The `main` module provides the command-line interface (CLI) orchestration for the orderflow backtest system. It handles database discovery, process management, and coordinates the streaming pipeline with the visualization frontend using Typer for argument parsing.

## Public Interface

### Functions
- `main(instrument: str, start_date: str, end_date: str, window_seconds: int = 60) -> None`: Primary CLI entrypoint
- `discover_databases(instrument: str, start_date: str, end_date: str) -> list[Path]`: Find matching database files
- `launch_visualizer() -> subprocess.Popen | None`: Start the Dash application in a separate process

### CLI Arguments
- `instrument`: Trading pair identifier (e.g., "BTC-USDT")
- `start_date`: Start date in YYYY-MM-DD format (UTC)
- `end_date`: End date in YYYY-MM-DD format (UTC)
- `--window-seconds`: OHLC aggregation window size (default: 60)

## Usage Examples

### Command Line Usage
```bash
# Basic usage with default 60-second windows
uv run python main.py BTC-USDT 2025-01-01 2025-01-31

# Custom window size
uv run python main.py ETH-USDT 2025-02-01 2025-02-28 --window-seconds 30

# Single day processing
uv run python main.py SOL-USDT 2025-03-15 2025-03-15
```

### Programmatic Usage
```python
from main import main, discover_databases

# Run processing pipeline
main("BTC-USDT", "2025-01-01", "2025-01-31", window_seconds=120)

# Discover available databases
db_files = discover_databases("ETH-USDT", "2025-02-01", "2025-02-28")
print(f"Found {len(db_files)} database files")
```

## Dependencies

### Internal
- `db_interpreter.DBInterpreter`: Database streaming
- `ohlc_processor.OHLCProcessor`: Trade aggregation and orderbook processing
- `viz_io`: Data clearing functions

### External
- `typer`: CLI framework and argument parsing
- `subprocess`: Process management for visualization
- `pathlib`: File and directory operations
- `datetime`: Date parsing and validation
- `logging`: Operational logging
- `sys`: Exit code management

## Database Discovery Logic

### File Pattern Matching
```python
# Expected directory structure
../data/OKX/{instrument}/{date}/

# Example paths
../data/OKX/BTC-USDT/2025-01-01/trades.db
../data/OKX/ETH-USDT/2025-02-15/trades.db
```

### Discovery Algorithm
1. Parse start and end dates to datetime objects
2. Iterate through the date range (inclusive)
3. Construct the expected path for each date
4. Verify file existence and readability
5. Return a sorted list of valid database paths (see the sketch after this list)
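
A minimal sketch of the discovery loop under the layout above; the `DATA_ROOT` constant and the per-day file name are illustrative assumptions, not the module's exact code:

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path

DATA_ROOT = Path("../data/OKX")  # assumed root; see Configuration below

def discover_databases(instrument: str, start_date: str, end_date: str) -> list[Path]:
    start = datetime.strptime(start_date, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    end = datetime.strptime(end_date, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    found: list[Path] = []
    day = start
    while day <= end:  # inclusive date range
        candidate = DATA_ROOT / instrument / day.strftime("%Y-%m-%d") / "trades.db"
        if candidate.is_file():  # existence check; readability is left to the caller
            found.append(candidate)
        day += timedelta(days=1)
    return sorted(found)
```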

## Process Orchestration

### Visualization Process Management
```python
# Launch Dash app in separate process
viz_process = subprocess.Popen([
    "uv", "run", "python", "app.py"
], cwd=project_root)

# Process management
try:
    # Main processing loop
    process_databases(db_files)
finally:
    # Cleanup visualization process
    if viz_process:
        viz_process.terminate()
        viz_process.wait(timeout=5)
```

### Data Processing Pipeline
1. **Initialize**: Clear existing data files
2. **Launch**: Start visualization process
3. **Stream**: Process each database sequentially
4. **Aggregate**: Generate OHLC bars and depth snapshots
5. **Cleanup**: Terminate visualization and finalize

## Error Handling

### Database Access Errors
- **File not found**: Log warning and skip missing databases
- **Permission denied**: Log error and exit with status code 1
- **Corruption**: Log error for specific database and continue with next

### Process Management Errors
- **Visualization startup failure**: Log error but continue processing
- **Process termination**: Graceful shutdown with timeout
- **Resource cleanup**: Ensure child processes are terminated

### Date Validation
- **Invalid format**: Clear error message with expected format
- **Invalid range**: End date must be >= start date
- **Future dates**: Warning for dates beyond data availability

## Performance Characteristics

- **Sequential processing**: Databases processed one at a time
- **Memory efficient**: Streaming approach prevents loading entire datasets
- **Process isolation**: Visualization runs independently
- **Resource cleanup**: Automatic process termination on exit

## Testing

Run module tests:
```bash
uv run pytest test_main.py -v
```

Test coverage includes:
- Database discovery logic
- Date parsing and validation
- Process management
- Error handling scenarios
- CLI argument validation

## Configuration

### Default Settings
- **Data directory**: `../data/OKX` (relative to project root)
- **Visualization command**: `uv run python app.py`
- **Window size**: 60 seconds
- **Process timeout**: 5 seconds for termination

### Environment Variables
- **DATA_PATH**: Override default data directory
- **VISUALIZATION_PORT**: Override Dash port (requires app.py modification)

## Known Issues

- Assumes specific directory structure under `../data/OKX`
- No validation of database schema compatibility
- Limited error recovery for process management
- No progress indication for large datasets

## Development Notes

- Uses Typer for modern CLI interface
- Subprocess management compatible with Unix/Windows
- Logging configured for both development and production use
- Exit codes follow Unix conventions (0=success, 1=error)
@@ -1,147 +0,0 @@
# Module: metrics_calculator

## Purpose
The `metrics_calculator` module handles calculation and management of trading metrics including Order Book Imbalance (OBI) and Cumulative Volume Delta (CVD). It provides windowed aggregation with throttled updates for real-time visualization.

## Public Interface

### Classes
- `MetricsCalculator(window_seconds: int = 60, emit_every_n_updates: int = 25)`: Main metrics calculation engine

### Methods
- `update_cvd_from_trade(side: str, size: float) -> None`: Update CVD from individual trade data
- `update_obi_metrics(timestamp: str, total_bids: float, total_asks: float) -> None`: Update OBI metrics from orderbook volumes
- `finalize_metrics() -> None`: Emit the final metrics bar at the end of processing

### Properties
- `cvd_cumulative: float`: Current cumulative volume delta value

### Private Methods
- `_emit_metrics_bar() -> None`: Emit current metrics to the visualization layer

## Usage Examples

```python
from metrics_calculator import MetricsCalculator

# Initialize calculator
calc = MetricsCalculator(window_seconds=60, emit_every_n_updates=25)

# Update CVD from trades
calc.update_cvd_from_trade("buy", 1.5)   # +1.5 CVD
calc.update_cvd_from_trade("sell", 1.0)  # -1.0 CVD, net +0.5

# Update OBI from orderbook
total_bids, total_asks = 150.0, 120.0
calc.update_obi_metrics("1640995200000", total_bids, total_asks)

# Access current CVD
current_cvd = calc.cvd_cumulative  # 0.5

# Finalize at end of processing
calc.finalize_metrics()
```

## Metrics Definitions

### Cumulative Volume Delta (CVD)
- **Formula**: CVD = Σ(buy_volume - sell_volume)
- **Interpretation**: Positive = more buying pressure, negative = more selling pressure
- **Accumulation**: Running total across all processed trades
- **Update Frequency**: Every trade

### Order Book Imbalance (OBI)
- **Formula**: OBI = total_bid_volume - total_ask_volume
- **Interpretation**: Positive = more bid liquidity, negative = more ask liquidity
- **Aggregation**: OHLC-style bars per time window (open, high, low, close)
- **Update Frequency**: Throttled, per orderbook update

## Dependencies

### Internal
- `viz_io.upsert_metric_bar`: Output interface for visualization

### External
- `logging`: Warning messages for unknown trade sides
- `typing`: Type annotations

## Windowed Aggregation

### OBI Windows
- **Window Size**: Configurable via `window_seconds` (default: 60)
- **Window Alignment**: Aligned to epoch time boundaries (see the sketch after these lists)
- **OHLC Tracking**: Maintains open, high, low, close values per window
- **Rollover**: Automatic window transitions with final bar emission

### Throttling Mechanism
- **Purpose**: Reduce I/O overhead during high-frequency updates
- **Trigger**: Every N updates (configurable via `emit_every_n_updates`)
- **Behavior**: Emits intermediate updates for real-time visualization
- **Final Emission**: Guaranteed on window rollover and finalization
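
A small sketch of the epoch-aligned bucketing described above, assuming millisecond timestamps (the helper name is illustrative, not the module's actual internal):

```python
def window_start_ms(ts_ms: int, window_seconds: int = 60) -> int:
    """Floor a millisecond timestamp to its epoch-aligned window start."""
    window_ms = window_seconds * 1000
    return (ts_ms // window_ms) * window_ms

# Example: with 60 s windows, 1640995230500 ms falls in the window
# starting at 1640995200000 ms (aligned to the epoch, not to the first update).
assert window_start_ms(1640995230500, 60) == 1640995200000
```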

## State Management

### CVD State
- `cvd_cumulative: float`: Running total across all trades
- **Persistence**: Maintained throughout processor lifetime
- **Updates**: Incremental addition/subtraction per trade

### OBI State
- `metrics_window_start: int`: Current window start timestamp
- `metrics_bar: dict`: Current OBI OHLC values
- `_metrics_since_last_emit: int`: Throttling counter

## Output Format

### Metrics Bar Structure
```python
{
    'obi_open': float,   # First OBI value in window
    'obi_high': float,   # Maximum OBI in window
    'obi_low': float,    # Minimum OBI in window
    'obi_close': float,  # Latest OBI value
}
```

### Visualization Integration
- Emitted via `viz_io.upsert_metric_bar(timestamp, obi_open, obi_high, obi_low, obi_close, cvd_value)`
- Compatible with existing OHLC visualization infrastructure
- Real-time updates during active processing

## Performance Characteristics

- **Low Memory**: Maintains only current window state
- **Throttled I/O**: Configurable update frequency prevents excessive writes
- **Efficient Updates**: O(1) operations for trade and OBI updates
- **Window Management**: Automatic transitions without manual intervention

## Configuration

### Constructor Parameters
- `window_seconds: int`: Time window for OBI aggregation (default: 60)
- `emit_every_n_updates: int`: Throttling factor for intermediate updates (default: 25)

### Tuning Guidelines
- **Higher throttling**: Reduces I/O load, delays real-time updates
- **Lower throttling**: More responsive visualization, higher I/O overhead
- **Window size**: Affects granularity of OBI trends (shorter = more detail)

## Testing

```bash
uv run pytest test_metrics_calculator.py -v
```

Test coverage includes:
- CVD accumulation accuracy across multiple trades
- OBI window rollover and OHLC tracking
- Throttling behavior verification
- Edge cases (unknown trade sides, empty windows)
- Integration with visualization output

## Known Limitations

- CVD calculation assumes binary buy/sell classification
- No support for partial fills or complex order types
- OBI calculation treats all liquidity equally (no price weighting)
- Window boundaries aligned to absolute timestamps (no sliding windows)
@@ -1,122 +0,0 @@
# Module: ohlc_processor

## Purpose
The `ohlc_processor` module serves as the main coordinator for trade data processing, orchestrating OHLC aggregation, orderbook management, and metrics calculation. It has been refactored into a modular architecture using composition with specialized helper modules.

## Public Interface

### Classes
- `OHLCProcessor(window_seconds: int = 60, depth_levels_per_side: int = 50)`: Main orchestrator class that coordinates trade processing using composition

### Methods
- `process_trades(trades: list[tuple]) -> None`: Aggregate trades into OHLC bars and update CVD metrics
- `update_orderbook(ob_update: OrderbookUpdate) -> None`: Apply orderbook updates and calculate OBI metrics
- `finalize() -> None`: Emit final OHLC bar and metrics data
- `cvd_cumulative` (property): Access to the cumulative volume delta value

### Composed Modules
- `OrderbookManager`: Handles in-memory orderbook state and depth snapshots
- `MetricsCalculator`: Manages OBI and CVD metric calculations
- `level_parser` functions: Parse and normalize orderbook level data

## Usage Examples

```python
from ohlc_processor import OHLCProcessor
from db_interpreter import DBInterpreter

# Initialize processor with 1-minute windows and 50 depth levels
processor = OHLCProcessor(window_seconds=60, depth_levels_per_side=50)

# Process streaming data
for ob_update, trades in DBInterpreter(db_path).stream():
    # Aggregate trades into OHLC bars
    processor.process_trades(trades)

    # Update orderbook and emit depth snapshots
    processor.update_orderbook(ob_update)

# Finalize processing
processor.finalize()
```

### Advanced Configuration
```python
# Custom window size and depth levels
processor = OHLCProcessor(
    window_seconds=30,        # 30-second bars
    depth_levels_per_side=25  # Top 25 levels per side
)
```

## Dependencies

### Internal Modules
- `orderbook_manager.OrderbookManager`: In-memory orderbook state management
- `metrics_calculator.MetricsCalculator`: OBI and CVD metrics calculation
- `level_parser`: Orderbook level parsing utilities
- `viz_io`: JSON output for visualization
- `db_interpreter.OrderbookUpdate`: Input data structures

### External
- `typing`: Type annotations
- `logging`: Debug and operational logging

## Modular Architecture

The processor now follows a clean composition pattern (a constructor sketch follows this list):

1. **Main Coordinator** (`OHLCProcessor`):
   - Orchestrates trade and orderbook processing
   - Maintains OHLC bar state and window management
   - Delegates specialized tasks to composed modules

2. **Orderbook Management** (`OrderbookManager`):
   - Maintains in-memory price→size mappings
   - Applies partial updates and handles deletions
   - Provides sorted top-N level extraction

3. **Metrics Calculation** (`MetricsCalculator`):
   - Tracks CVD from trade flow (buy/sell volume delta)
   - Calculates OBI from orderbook volume imbalance
   - Manages windowed metrics aggregation with throttling

4. **Level Parsing** (`level_parser` module):
   - Normalizes JSON and Python literal level representations
   - Handles zero-size levels for orderbook deletions
   - Provides robust error handling for malformed data
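
An illustrative skeleton of that composition (constructor body only; the attribute names are assumptions, not the module's exact code):

```python
from orderbook_manager import OrderbookManager
from metrics_calculator import MetricsCalculator

class OHLCProcessor:
    def __init__(self, window_seconds: int = 60, depth_levels_per_side: int = 50):
        self.window_seconds = window_seconds
        # Delegate orderbook state and metrics to the composed helpers.
        self.orderbook = OrderbookManager(depth_levels_per_side=depth_levels_per_side)
        self.metrics = MetricsCalculator(window_seconds=window_seconds)
```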

## Performance Characteristics

- **Throttled Updates**: Prevents excessive I/O during high-frequency periods
- **Memory Efficient**: Maintains only current window and top-N depth levels
- **Incremental Processing**: Applies only changed orderbook levels
- **Atomic Operations**: Thread-safe updates to shared data structures

## Testing

Run module tests:
```bash
uv run pytest test_ohlc_processor.py -v
```

Test coverage includes:
- OHLC calculation accuracy across window boundaries
- Volume accumulation correctness
- High/low price tracking
- Orderbook update application
- Depth snapshot generation
- OBI metric calculation

## Known Issues

- Orderbook level parsing assumes well-formed JSON or Python literals
- Memory usage scales with number of active price levels
- Clock skew between trades and orderbook updates not handled

## Configuration Options

- `window_seconds`: Time window size for OHLC aggregation (default: 60)
- `depth_levels_per_side`: Number of top price levels to maintain (default: 50)
- `UPSERT_THROTTLE_MS`: Minimum interval between upsert operations (internal)
- `DEPTH_EMIT_THROTTLE_MS`: Minimum interval between depth emissions (internal)
@@ -1,121 +0,0 @@
# Module: orderbook_manager

## Purpose
The `orderbook_manager` module provides in-memory orderbook state management with partial update capabilities. It maintains separate bid and ask sides and supports efficient top-level extraction for visualization.

## Public Interface

### Classes
- `OrderbookManager(depth_levels_per_side: int = 50)`: Main orderbook state manager

### Methods
- `apply_updates(bids_updates: List[Tuple[float, float]], asks_updates: List[Tuple[float, float]]) -> None`: Apply partial updates to both sides
- `get_total_volume() -> Tuple[float, float]`: Get total bid and ask volumes
- `get_top_levels() -> Tuple[List[List[float]], List[List[float]]]`: Get sorted top levels for both sides

### Private Methods
- `_apply_partial_updates(side_map: Dict[float, float], updates: List[Tuple[float, float]]) -> None`: Apply updates to one side
- `_build_top_levels(side_map: Dict[float, float], limit: int, reverse: bool) -> List[List[float]]`: Extract sorted top levels

## Usage Examples

```python
from orderbook_manager import OrderbookManager

# Initialize manager
manager = OrderbookManager(depth_levels_per_side=25)

# Apply orderbook updates
bids = [(50000.0, 1.5), (49999.0, 2.0)]
asks = [(50001.0, 1.2), (50002.0, 0.8)]
manager.apply_updates(bids, asks)

# Get volume totals for OBI calculation
total_bids, total_asks = manager.get_total_volume()
obi = total_bids - total_asks

# Get top levels for depth visualization
bids_sorted, asks_sorted = manager.get_top_levels()

# Handle deletions (size = 0)
deletions = [(50000.0, 0.0)]  # Remove price level
manager.apply_updates(deletions, [])
```

## Dependencies

### External
- `typing`: Type annotations for Dict, List, Tuple

## State Management

### Internal State
- `_book_bids: Dict[float, float]`: Price → size mapping for the bid side
- `_book_asks: Dict[float, float]`: Price → size mapping for the ask side
- `depth_levels_per_side: int`: Configuration for top-N extraction

### Update Semantics
- **Size = 0**: Remove price level (deletion)
- **Size > 0**: Upsert price level with the new size
- **Size < 0**: Ignored (invalid update)

### Sorting Behavior
- **Bids**: Descending by price (highest price first)
- **Asks**: Ascending by price (lowest price first)
- **Top-N**: Limited by the `depth_levels_per_side` parameter (see the sketch after this list)
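
A compact sketch of those update and sorting semantics, written as standalone functions for illustration (the module implements them as the private methods listed above; bodies are assumptions):

```python
from typing import Dict, List, Tuple

def _apply_partial_updates(side_map: Dict[float, float],
                           updates: List[Tuple[float, float]]) -> None:
    for price, size in updates:
        if size == 0.0:
            side_map.pop(price, None)  # size == 0 deletes the level
        elif size > 0.0:
            side_map[price] = size     # size > 0 upserts; negatives are ignored

def _build_top_levels(side_map: Dict[float, float],
                      limit: int, reverse: bool) -> List[List[float]]:
    # Bids use reverse=True (highest price first); asks use reverse=False.
    return [[p, s] for p, s in sorted(side_map.items(), reverse=reverse)[:limit]]
```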

## Performance Characteristics

- **Memory Efficient**: Only stores non-zero price levels
- **Fast Updates**: O(1) upsert/delete operations using dict
- **Efficient Sorting**: Only sorts when extracting top levels
- **Bounded Output**: Limits result size for visualization performance

## Use Cases

### OBI Calculation
```python
total_bids, total_asks = manager.get_total_volume()
order_book_imbalance = total_bids - total_asks
```

### Depth Visualization
```python
bids, asks = manager.get_top_levels()
depth_payload = {"bids": bids, "asks": asks}
```

### Incremental Updates
```python
# Typical orderbook update cycle
updates = parse_orderbook_changes(raw_data)
manager.apply_updates(updates['bids'], updates['asks'])
```

## Testing

```bash
uv run pytest test_orderbook_manager.py -v
```

Test coverage includes:
- Partial update application correctness
- Deletion handling (size = 0)
- Volume calculation accuracy
- Top-level sorting and limiting
- Edge cases (empty books, single levels)
- Performance with large orderbooks

## Configuration

- `depth_levels_per_side`: Controls output size for visualization (default: 50)
  - Affects memory usage and sorting performance
  - Higher values provide more market depth detail
  - Lower values improve processing speed

## Known Limitations

- No built-in validation of price/size values
- Memory usage scales with number of unique price levels
- No historical state tracking (current snapshot only)
- No support for spread calculation or market data statistics
@@ -1,155 +0,0 @@
# Module: viz_io

## Purpose
The `viz_io` module provides atomic inter-process communication (IPC) between the data processing pipeline and the visualization frontend. It manages JSON file-based data exchange with atomic writes to prevent race conditions and data corruption.

## Public Interface

### Functions
- `add_ohlc_bar(timestamp, open_price, high_price, low_price, close_price, volume)`: Append new OHLC bar to rolling dataset
- `upsert_ohlc_bar(timestamp, open_price, high_price, low_price, close_price, volume)`: Update existing bar or append new one
- `clear_data()`: Reset OHLC dataset to empty state
- `add_metric_bar(timestamp, obi_open, obi_high, obi_low, obi_close)`: Append OBI metric bar
- `upsert_metric_bar(timestamp, obi_open, obi_high, obi_low, obi_close)`: Update existing OBI bar or append new one
- `clear_metrics()`: Reset metrics dataset to empty state
- `set_depth_data(bids, asks)`: Update current orderbook depth snapshot

### Constants
- `DATA_FILE`: Path to OHLC data JSON file
- `DEPTH_FILE`: Path to depth data JSON file
- `METRICS_FILE`: Path to metrics data JSON file
- `MAX_BARS`: Maximum number of bars to retain (1000)

## Usage Examples

### Basic OHLC Operations
```python
import viz_io

# Add a new OHLC bar
viz_io.add_ohlc_bar(
    timestamp=1640995200000,  # Unix timestamp in milliseconds
    open_price=50000.0,
    high_price=50100.0,
    low_price=49900.0,
    close_price=50050.0,
    volume=125.5
)

# Update the current bar (if timestamp matches) or add new one
viz_io.upsert_ohlc_bar(
    timestamp=1640995200000,
    open_price=50000.0,
    high_price=50150.0,   # Updated high
    low_price=49850.0,    # Updated low
    close_price=50075.0,  # Updated close
    volume=130.2          # Updated volume
)
```

### Orderbook Depth Management
```python
# Set current depth snapshot
bids = [[49990.0, 1.5], [49985.0, 2.1], [49980.0, 0.8]]
asks = [[50010.0, 1.2], [50015.0, 1.8], [50020.0, 2.5]]

viz_io.set_depth_data(bids, asks)
```

### Metrics Operations
```python
# Add Order Book Imbalance metrics
viz_io.add_metric_bar(
    timestamp=1640995200000,
    obi_open=0.15,
    obi_high=0.22,
    obi_low=0.08,
    obi_close=0.18
)
```

## Dependencies

### Internal
- None (standalone utility module)

### External
- `json`: JSON serialization/deserialization
- `pathlib`: File path handling
- `typing`: Type annotations
- `tempfile`: Atomic write operations

## Data Formats

### OHLC Data (`ohlc_data.json`)
```json
[
  [1640995200000, 50000.0, 50100.0, 49900.0, 50050.0, 125.5],
  [1640995260000, 50050.0, 50200.0, 50000.0, 50150.0, 98.3]
]
```
Format: `[timestamp, open, high, low, close, volume]`

### Depth Data (`depth_data.json`)
```json
{
  "bids": [[49990.0, 1.5], [49985.0, 2.1]],
  "asks": [[50010.0, 1.2], [50015.0, 1.8]]
}
```
Format: `{"bids": [[price, size], ...], "asks": [[price, size], ...]}`

### Metrics Data (`metrics_data.json`)
```json
[
  [1640995200000, 0.15, 0.22, 0.08, 0.18],
  [1640995260000, 0.18, 0.25, 0.12, 0.20]
]
```
Format: `[timestamp, obi_open, obi_high, obi_low, obi_close]`

## Atomic Write Operations

All write operations use atomic file replacement to prevent partial reads:

1. Write data to a temporary file
2. Flush and sync to disk
3. Atomically rename the temporary file to the target file

This ensures the visualization frontend always reads complete, valid JSON data.
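
A minimal sketch of that write-then-rename pattern using only the dependencies listed above (the helper name `_atomic_write_json` is illustrative, not the module's actual function):

```python
import json
import os
import tempfile
from pathlib import Path

def _atomic_write_json(path: Path, payload) -> None:
    # 1. Write to a temporary file in the same directory (same filesystem,
    #    so the final rename is atomic).
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(payload, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # 2. Flush and sync to disk
        os.replace(tmp_name, path)  # 3. Atomic rename over the target file
    except BaseException:
        os.unlink(tmp_name)
        raise
```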

## Performance Characteristics

- **Bounded Memory**: OHLC and metrics datasets limited to 1000 bars max
- **Atomic Operations**: No partial reads possible during writes
- **Rolling Window**: Automatic trimming of old data maintains constant memory usage (see the sketch below)
- **Fast Lookups**: Timestamp-based upsert operations use list scanning (acceptable for 1000 items)
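
The rolling-window and upsert behavior can be pictured with a short sketch (the helper name and trimming details are assumptions about the implementation):

```python
MAX_BARS = 1000

def _upsert_row(rows: list[list[float]], row: list[float]) -> None:
    # Scan from the end: a matching timestamp is almost always the last bar.
    for i in range(len(rows) - 1, -1, -1):
        if rows[i][0] == row[0]:
            rows[i] = row  # same timestamp: update in place
            break
    else:
        rows.append(row)
    del rows[:-MAX_BARS]  # keep only the newest MAX_BARS rows
```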

## Testing

Run module tests:
```bash
uv run pytest test_viz_io.py -v
```

Test coverage includes:
- Atomic write operations
- Data format validation
- Rolling window behavior
- Upsert logic correctness
- File corruption prevention
- Concurrent read/write scenarios

## Known Issues

- File I/O may block briefly during atomic writes
- JSON parsing errors not propagated to callers
- Limited to 1000 bars maximum (configurable via MAX_BARS)
- No compression for large datasets

## Thread Safety

All operations are thread-safe for single-writer, multiple-reader scenarios:
- Writer: Data processing pipeline (single thread)
- Readers: Visualization frontend (polling)
- Atomic file operations prevent corruption during concurrent access
level_parser.py

@@ -1,85 +1,87 @@
-"""Level parsing utilities for orderbook data."""
+"""Ultra-fast level parsing for strings like:
+"[['110173.4', '0.0000454', '0', '4'], ['110177.1', '0', '0', '0'], ...]"
+"""
 
 import json
 import ast
 import logging
-from typing import List, Any, Tuple
+from typing import List, Tuple, Any
 
 
 def normalize_levels(levels: Any) -> List[List[float]]:
     """
-    Convert string-encoded levels into [[price, size], ...] floats.
-
-    Filters out zero/negative sizes. Supports JSON and Python literal formats.
+    Return [[price, size], ...] with size > 0 only (floats).
+    Assumes 'levels' is a single-quoted list-of-lists string as above.
     """
-    if not levels or levels == '[]':
-        return []
-
-    parsed = _parse_string_to_list(levels)
-    if not parsed:
-        return []
-
-    pairs: List[List[float]] = []
-    for item in parsed:
-        price, size = _extract_price_size(item)
-        if price is None or size is None:
-            continue
-        try:
-            p, s = float(price), float(size)
-            if s > 0:
-                pairs.append([p, s])
-        except Exception:
-            continue
-
-    if not pairs:
-        logging.debug("normalize_levels: no valid pairs parsed from input")
-    return pairs
+    pairs = _fast_pairs(levels)
+    # filter strictly positive sizes
+    return [[p, s] for (p, s) in pairs if s > 0.0]
 
 
 def parse_levels_including_zeros(levels: Any) -> List[Tuple[float, float]]:
     """
-    Parse levels into (price, size) tuples including zero sizes for deletions.
-
-    Similar to normalize_levels but preserves zero sizes (for orderbook deletions).
+    Return [(price, size), ...] (floats), preserving zeros for deletions.
+    Assumes 'levels' is a single-quoted list-of-lists string as above.
    """
-    if not levels or levels == '[]':
-        return []
-
-    parsed = _parse_string_to_list(levels)
-    if not parsed:
-        return []
-
-    results: List[Tuple[float, float]] = []
-    for item in parsed:
-        price, size = _extract_price_size(item)
-        if price is None or size is None:
-            continue
-        try:
-            p, s = float(price), float(size)
-            if s >= 0:
-                results.append((p, s))
-        except Exception:
-            continue
-
-    return results
-
-
-def _parse_string_to_list(levels: Any) -> List[Any]:
-    """Parse string levels to list, trying JSON first then literal_eval."""
-    try:
-        parsed = json.loads(levels)
-    except Exception:
-        try:
-            parsed = ast.literal_eval(levels)
-        except Exception:
-            return []
-    return parsed if isinstance(parsed, list) else []
-
-
-def _extract_price_size(item: Any) -> Tuple[Any, Any]:
-    """Extract price and size from dict or list/tuple format."""
-    if isinstance(item, dict):
-        return item.get("price", item.get("p")), item.get("size", item.get("s"))
-    elif isinstance(item, (list, tuple)) and len(item) >= 2:
-        return item[0], item[1]
-    return None, None
+    return _fast_pairs(levels)
+
+
+# ----------------- internal: fast path -----------------
+
+def _fast_pairs(levels: Any) -> List[Tuple[float, float]]:
+    """
+    Extremely fast parser for inputs like:
+        "[['110173.4','0.0000454','0','4'],['110177.1','0','0','0'], ...]"
+    Keeps only the first two fields from each row and converts to float.
+    """
+    if not levels:
+        return []
+
+    # If already a list (rare in your pipeline), fall back to simple handling
+    if isinstance(levels, (list, tuple)):
+        out: List[Tuple[float, float]] = []
+        for item in levels:
+            if isinstance(item, (list, tuple)) and len(item) >= 2:
+                try:
+                    p = float(item[0]); s = float(item[1])
+                    out.append((p, s))
+                except Exception:
+                    continue
+        return out
+
+    # Expect a string: strip outer brackets and single quotes fast
+    s = str(levels).strip()
+    if len(s) < 5:  # too short to contain "[[...]]"
+        return []
+
+    # Remove the outermost [ and ] quickly (tolerant)
+    if s[0] == '[':
+        s = s[1:]
+    if s and s[-1] == ']':
+        s = s[:-1]
+
+    # Remove *all* single quotes (input uses single quotes, not JSON)
+    s = s.replace("'", "")
+
+    # Now s looks like: [[110173.4, 0.0000454, 0, 4], [110177.1, 0, 0, 0], ...]
+    # Split into rows on "],", then strip brackets/spaces per row
+    rows = s.split("],")
+    out: List[Tuple[float, float]] = []
+
+    for row in rows:
+        row = row.strip()
+        # strip any leading/trailing brackets/spaces
+        if row.startswith('['):
+            row = row[1:]
+        if row.endswith(']'):
+            row = row[:-1]
+
+        # fast split by commas and take first two fields
+        cols = row.split(',')
+        if len(cols) < 2:
+            continue
+        try:
+            p = float(cols[0].strip())
+            s_ = float(cols[1].strip())
+            out.append((p, s_))
+        except Exception:
+            continue
+
+    return out
70
main.py
70
main.py
@@ -1,25 +1,24 @@
|
||||
import json
|
||||
import logging
|
||||
import typer
|
||||
from pathlib import Path
|
||||
from datetime import datetime, timezone
|
||||
import subprocess
|
||||
import time
|
||||
import threading
|
||||
from db_interpreter import DBInterpreter
|
||||
from ohlc_processor import OHLCProcessor
|
||||
from strategy import Strategy
|
||||
from desktop_app import MainWindow
|
||||
import sys
|
||||
from PySide6.QtWidgets import QApplication
|
||||
from PySide6.QtCore import QTimer
|
||||
|
||||
|
||||
def main(instrument: str = typer.Argument(..., help="Instrument to backtest, e.g. BTC-USDT"),
|
||||
start_date: str = typer.Argument(..., help="Start date, e.g. 2025-07-01"),
|
||||
end_date: str = typer.Argument(..., help="End date, e.g. 2025-08-01"),
|
||||
window_seconds: int = typer.Option(60, help="OHLC window size in seconds")):
|
||||
"""
|
||||
Process orderbook data and visualize OHLC charts in real-time.
|
||||
"""
|
||||
logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
|
||||
timeframe_minutes: int = typer.Option(15, "--timeframe-minutes", help="Timeframe in minutes"),
|
||||
ui: bool = typer.Option(False, "--ui/--no-ui", help="Enable UI")):
|
||||
logging.basicConfig(filename="strategy.log", level=logging.DEBUG, format="%(asctime)s %(levelname)s %(message)s")
|
||||
|
||||
start_date = datetime.strptime(start_date, "%Y-%m-%d").replace(tzinfo=timezone.utc)
|
||||
end_date = datetime.strptime(end_date, "%Y-%m-%d").replace(tzinfo=timezone.utc)
|
||||
@@ -39,11 +38,20 @@ def main(instrument: str = typer.Argument(..., help="Instrument to backtest, e.g

     logging.info(f"Found {len(db_paths)} database files: {[p.name for p in db_paths]}")

-    processor = OHLCProcessor(window_seconds=window_seconds)
+    processor = OHLCProcessor(aggregate_window_seconds=60 * timeframe_minutes)
+    strategy = Strategy(
+        lookback=30,
+        min_volume_factor=1.0,
+        confirm_break_signal_high=True,   # or False to disable
+        debug=True,
+        debug_level=2,                    # 1=key events, 2=per-bar detail
+        debug_every_n_bars=100
+    )


     def process_data():
         """Process database data in a separate thread."""
         try:
             db_to_process = []

             for db_path in db_paths:
                 db_name_parts = db_path.name.split(".")[0].split("-")
                 if len(db_name_parts) < 5:
@@ -54,40 +62,44 @@ def main(instrument: str = typer.Argument(..., help="Instrument to backtest, e.g
             db_date = datetime.strptime("".join(db_name), "%y%m%d").replace(tzinfo=timezone.utc)

             if db_date < start_date or db_date >= end_date:
                 logging.info(f"Skipping {db_path.name} - outside date range")
                 continue
             db_to_process.append(db_path)

         for i, db_path in enumerate(db_to_process):
+            print(f"{i}/{len(db_to_process)}")

             logging.info(f"Processing database: {db_path.name}")
             db_interpreter = DBInterpreter(db_path)

             batch_count = 0
             for orderbook_update, trades in db_interpreter.stream():
                 batch_count += 1

-                processor.process_trades(trades)
                 processor.update_orderbook(orderbook_update)
+                processor.process_trades(trades)
+                strategy.process(processor)

-            processor.finalize()
-            logging.info("Data processing completed")
-        except Exception as e:
-            logging.error(f"Error in data processing: {e}")
+        processor.flush()
+        strategy.process(processor)

         try:
-            app = QApplication(sys.argv)
-            desktop_app = MainWindow()
+            strategy.on_finish(processor)  # optional: flat at last close
+        except Exception:
+            pass

-        desktop_app.setup_data_processor(processor)
-        desktop_app.show()

-        logging.info("Desktop visualizer started")
+        print(json.dumps(strategy.get_stats(), indent=2))

+    if ui:
         data_thread = threading.Thread(target=process_data, daemon=True)
         data_thread.start()

-        app.exec()
+        app = QApplication(sys.argv)
+        desktop_app = MainWindow()
+        desktop_app.show()

-    except Exception as e:
-        logging.error(f"Failed to start desktop visualizer: {e}")
+        timer = QTimer()
+        timer.timeout.connect(lambda: desktop_app.update_data(processor, 30))
+        timer.start(1000)

+        app.exec()
+    else:
+        process_data()


 if __name__ == "__main__":
@@ -1,21 +1,163 @@
 # metrics_calculator.py
 import logging
-from typing import Optional, Tuple
+from typing import List, Optional


 class MetricsCalculator:
     def __init__(self):
-        self.cvd_cumulative = 0.0
-        self.obi_value = 0.0
+        # ----- OBI (value at close) -----
+        self._obi_value = 0.0
+
+        # ----- CVD (bucket-local delta, TradingView-style) -----
+        self._bucket_buy = 0.0
+        self._bucket_sell = 0.0
+        self._bucket_net = 0.0  # buy - sell within current bucket
+
+        # --- per-bucket lifecycle state ---
+        self._b_ts_start: Optional[int] = None
+        self._b_ts_end: Optional[int] = None
+
+        self._obi_o = self._obi_h = self._obi_l = self._obi_c = None
+        self._cvd_o = self._cvd_h = self._cvd_l = self._cvd_c = None
+
+        # final series rows:
+        # [ts_start, ts_end, o, h, l, c, value_at_close]
+        self._series_obi: List[List[float]] = []
+        self._series_cvd: List[List[float]] = []
+
+        # ----- ATR(14) -----
+        self._atr_period: int = 14
+        self._prev_close: Optional[float] = None
+        self._tr_window: List[float] = []   # last N TR values
+        self._series_atr: List[float] = []  # one value per finalized bucket

+    # ------------------------------
+    # CVD (bucket-local delta)
+    # ------------------------------
     def update_cvd_from_trade(self, side: str, size: float) -> None:
+        """Accumulate buy/sell within the *current* bucket (TV-style volume delta)."""
+        if self._b_ts_start is None:
+            # bucket not open yet; processor should call begin_bucket first
+            return
+
+        s = float(size)
         if side == "buy":
-            volume_delta = float(size)
+            self._bucket_buy += s
         elif side == "sell":
-            volume_delta = -float(size)
+            self._bucket_sell += s
         else:
-            logging.warning(f"Unknown trade side '{side}', treating as neutral")
+            logging.warning(f"Unknown trade side '{side}', ignoring")
             return

-        self.cvd_cumulative += volume_delta
+        self._bucket_net = self._bucket_buy - self._bucket_sell
+        v = self._bucket_net
+
+        if self._cvd_o is None:
+            self._cvd_o = 0.0
+            self._cvd_h = v
+            self._cvd_l = v
+            self._cvd_c = v
+        else:
+            self._cvd_h = max(self._cvd_h, v)
+            self._cvd_l = min(self._cvd_l, v)
+            self._cvd_c = v

+    # ------------------------------
+    # OBI
+    # ------------------------------
     def update_obi_from_book(self, total_bids: float, total_asks: float) -> None:
-        self.obi_value = float(total_bids - total_asks)
+        self._obi_value = float(total_bids - total_asks)
+        if self._b_ts_start is not None:
+            v = self._obi_value
+            if self._obi_o is None:
+                self._obi_o = self._obi_h = self._obi_l = self._obi_c = v
+            else:
+                self._obi_h = max(self._obi_h, v)
+                self._obi_l = min(self._obi_l, v)
+                self._obi_c = v
+
+    # ------------------------------
+    # ATR helpers
+    # ------------------------------
+    def _update_atr_from_bar(self, high: float, low: float, close: float) -> None:
+        if self._prev_close is None:
+            tr = float(high) - float(low)
+        else:
+            tr = max(
+                float(high) - float(low),
+                abs(float(high) - float(self._prev_close)),
+                abs(float(low) - float(self._prev_close)),
+            )
+        self._tr_window.append(tr)
+        if len(self._tr_window) > self._atr_period:
+            self._tr_window.pop(0)
+        atr = (sum(self._tr_window) / len(self._tr_window)) if self._tr_window else 0.0
+        self._series_atr.append(atr)
+        self._prev_close = float(close)
+
+    # ------------------------------
+    # Bucket lifecycle
+    # ------------------------------
+    def begin_bucket(self, ts_start_ms: int, ts_end_ms: int) -> None:
+        self._b_ts_start = int(ts_start_ms)
+        self._b_ts_end = int(ts_end_ms)
+
+        # OBI opens at current value
+        self._obi_o = self._obi_h = self._obi_l = self._obi_c = self._obi_value
+
+        # CVD resets each bucket
+        self._bucket_buy = 0.0
+        self._bucket_sell = 0.0
+        self._bucket_net = 0.0
+        self._cvd_o = 0.0
+        self._cvd_h = 0.0
+        self._cvd_l = 0.0
+        self._cvd_c = 0.0
+
+    def finalize_bucket(self, bar: Optional[dict] = None) -> None:
+        if self._b_ts_start is None or self._b_ts_end is None:
+            return
+
+        # OBI row
+        o = float(self._obi_o if self._obi_o is not None else self._obi_value)
+        h = float(self._obi_h if self._obi_h is not None else self._obi_value)
+        l = float(self._obi_l if self._obi_l is not None else self._obi_value)
+        c = float(self._obi_c if self._obi_c is not None else self._obi_value)
+        self._series_obi.append([self._b_ts_start, self._b_ts_end, o, h, l, c, float(self._obi_value)])
+
+        # CVD row (bucket-local delta)
+        o = float(self._cvd_o if self._cvd_o is not None else 0.0)
+        h = float(self._cvd_h if self._cvd_h is not None else 0.0)
+        l = float(self._cvd_l if self._cvd_l is not None else 0.0)
+        c = float(self._cvd_c if self._cvd_c is not None else 0.0)
+        self._series_cvd.append([self._b_ts_start, self._b_ts_end, o, h, l, c, float(self._bucket_net)])
+
+        # ATR from the finalized OHLC bar
+        if bar is not None:
+            try:
+                self._update_atr_from_bar(bar["high"], bar["low"], bar["close"])
+            except Exception as e:
+                logging.debug(f"ATR update error (ignored): {e}")
+
+        # reset state
+        self._b_ts_start = self._b_ts_end = None
+        self._obi_o = self._obi_h = self._obi_l = self._obi_c = None
+        self._cvd_o = self._cvd_h = self._cvd_l = self._cvd_c = None
+
+    def add_flat_bucket(self, ts_start_ms: int, ts_end_ms: int) -> None:
+        # OBI flat
+        v_obi = float(self._obi_value)
+        self._series_obi.append([int(ts_start_ms), int(ts_end_ms), v_obi, v_obi, v_obi, v_obi, v_obi])
+
+        # CVD flat at zero (no trades in this bucket)
+        self._series_cvd.append([int(ts_start_ms), int(ts_end_ms), 0.0, 0.0, 0.0, 0.0, 0.0])
+
+        # ATR: carry last ATR forward if any, else 0.0
+        last_atr = self._series_atr[-1] if self._series_atr else 0.0
+        self._series_atr.append(float(last_atr))
+
+    # ------------------------------
+    # Output
+    # ------------------------------
+    def get_series(self):
+        return {'cvd': self._series_cvd, 'obi': self._series_obi, 'atr': self._series_atr}
ohlc_processor.py

@@ -1,70 +1,171 @@
 import logging
-from typing import List, Any, Dict, Tuple
+from typing import List, Any, Dict, Tuple, Optional

-from viz_io import add_ohlc_bar, upsert_ohlc_bar, _atomic_write_json, DEPTH_FILE
 from db_interpreter import OrderbookUpdate
-from level_parser import normalize_levels, parse_levels_including_zeros
+from level_parser import parse_levels_including_zeros
 from orderbook_manager import OrderbookManager
 from metrics_calculator import MetricsCalculator


 class OHLCProcessor:
     """
-    Processes trade data and orderbook updates into OHLC bars and depth snapshots.
+    Time-bucketed OHLC aggregator with gap-bar filling and metric hooks.

-    This class aggregates individual trades into time-windowed OHLC (Open, High, Low, Close)
-    bars and maintains an in-memory orderbook state for depth visualization. It also
-    calculates Order Book Imbalance (OBI) and Cumulative Volume Delta (CVD) metrics.
-
-    The processor uses throttled updates to balance visualization responsiveness with
-    I/O efficiency, emitting intermediate updates during active windows.
-
-    Attributes:
-        window_seconds: Time window duration for OHLC aggregation
-        depth_levels_per_side: Number of top price levels to maintain per side
-        trades_processed: Total number of trades processed
-        bars_created: Total number of OHLC bars created
-        cvd_cumulative: Running cumulative volume delta (via metrics calculator)
+    - Bars are aligned to fixed buckets of length `aggregate_window_seconds`.
+    - If there is a gap (no trades for one or more buckets), synthetic zero-volume
+      candles are emitted with O=H=L=C=last_close AND a flat metrics bucket is added.
+    - By default, the next bar's OPEN is the previous bar's CLOSE (configurable via
+      `carry_forward_open`).
     """

-    def __init__(self) -> None:
-        self.current_bar = None
+    def __init__(self, aggregate_window_seconds: int) -> None:
+        self.aggregate_window_seconds = int(aggregate_window_seconds)
+        self._bucket_ms = self.aggregate_window_seconds * 1000
+
+        self.current_bar: Optional[Dict[str, Any]] = None
+        self._current_bucket_index: Optional[int] = None
+        self._last_close: Optional[float] = None

         self.trades_processed = 0
         self.bars: List[Dict[str, Any]] = []

-        self._orderbook = OrderbookManager()
-        self._metrics = MetricsCalculator()
+        self.orderbook = OrderbookManager()
+        self.metrics = MetricsCalculator()

-    @property
-    def cvd_cumulative(self) -> float:
-        """Access cumulative CVD from metrics calculator."""
-        return self._metrics.cvd_cumulative
+    # -----------------------
+    # Internal helpers
+    # -----------------------
+    def _new_bar(self, bucket_start_ms: int, open_price: float) -> Dict[str, Any]:
+        return {
+            "timestamp_start": bucket_start_ms,
+            "timestamp_end": bucket_start_ms + self._bucket_ms,
+            "open": float(open_price),
+            "high": float(open_price),
+            "low": float(open_price),
+            "close": float(open_price),
+            "volume": 0.0,
+        }
+
+    def _emit_gap_bars(self, from_index: int, to_index: int) -> None:
+        """
+        Emit empty buckets strictly between from_index and to_index.
+        Each synthetic bar has zero volume and O=H=L=C=last_close.
+        Also emit a flat metrics bucket for each gap.
+        """
+        if self._last_close is None:
+            return
+
+        for bi in range(from_index + 1, to_index):
+            start_ms = bi * self._bucket_ms
+            gap_bar = self._new_bar(start_ms, self._last_close)
+            self.bars.append(gap_bar)
+
+            # metrics: add a flat bucket to keep OBI/CVD/ATR time-continuous
+            try:
+                self.metrics.add_flat_bucket(start_ms, start_ms + self._bucket_ms)
+            except Exception as e:
+                logging.debug(f"metrics add_flat_bucket error (ignored): {e}")
+
+    # -----------------------
+    # Public API
+    # -----------------------
     def process_trades(self, trades: List[Tuple[Any, ...]]) -> None:
+        """
+        trades: iterables like (trade_id, trade_id_str, price, size, side, timestamp_ms, ...)
+        timestamp_ms expected in milliseconds.
+        """
         if not trades:
             return

+        # Ensure time-ascending order; if upstream guarantees it, you can skip.
+        trades = sorted(trades, key=lambda t: int(t[5]))

         for trade in trades:
             trade_id, trade_id_str, price, size, side, timestamp_ms = trade[:6]
+            price = float(price)
+            size = float(size)
+            timestamp_ms = int(timestamp_ms)
             self.trades_processed += 1

-            self._metrics.update_cvd_from_trade(side, size)
+            # Determine this trade's bucket
+            bucket_index = timestamp_ms // self._bucket_ms
+            bucket_start = bucket_index * self._bucket_ms

-            if not self.current_bar:
-                self.current_bar = {
-                    'open': float(price),
-                    'high': float(price),
-                    'low': float(price),
-                    'close': float(price)
-                }
-            self.current_bar['high'] = max(self.current_bar['high'], float(price))
-            self.current_bar['low'] = min(self.current_bar['low'], float(price))
-            self.current_bar['close'] = float(price)
-            self.current_bar['volume'] += float(size)
+            # New bucket?
+            if self._current_bucket_index is None or bucket_index != self._current_bucket_index:
+                # finalize prior bar (also finalizes metrics incl. ATR)
+                if self.current_bar is not None:
+                    self.bars.append(self.current_bar)
+                    self._last_close = self.current_bar["close"]
+                    self.metrics.finalize_bucket(self.current_bar)  # <— pass bar for ATR
+                # handle gaps
+                if self._current_bucket_index is not None and bucket_index > self._current_bucket_index + 1:
+                    self._emit_gap_bars(self._current_bucket_index, bucket_index)
+
+                open_for_new = self._last_close if self._last_close is not None else price
+
+                self.current_bar = self._new_bar(bucket_start, open_for_new)
+                self._current_bucket_index = bucket_index
+
+                self.metrics.begin_bucket(bucket_start, bucket_start + self._bucket_ms)
+
+            # Metrics driven by trades: update CVD
+            self.metrics.update_cvd_from_trade(side, float(size))
+
+            # Update current bucket with this trade
+            b = self.current_bar
+            b["high"] = max(b["high"], price)
+            b["low"] = min(b["low"], price)
+            b["close"] = price
+            b["volume"] += size
+            # keep timestamp_end snapped to bucket boundary
+            b["timestamp_end"] = bucket_start + self._bucket_ms

+    def flush(self) -> None:
+        """Emit the in-progress bar (if any). Call at the end of a run/backtest."""
+        if self.current_bar is not None:
+            self.bars.append(self.current_bar)
+            self._last_close = self.current_bar["close"]
+            try:
+                self.metrics.finalize_bucket(self.current_bar)  # <— pass bar for ATR
+            except Exception as e:
+                logging.debug(f"metrics finalize_bucket on flush error (ignored): {e}")
+            self.current_bar = None
+        else:
+            try:
+                self.metrics.finalize_bucket(None)
+            except Exception as e:
+                logging.debug(f"metrics finalize_bucket on flush error (ignored): {e}")

     def update_orderbook(self, ob_update: OrderbookUpdate) -> None:
+        """
+        Apply orderbook deltas and refresh OBI metrics.
+        Call this frequently (on each OB update) so intra-bucket OBI highs/lows track the book.
+        """
         bids_updates = parse_levels_including_zeros(ob_update.bids)
         asks_updates = parse_levels_including_zeros(ob_update.asks)

-        self._orderbook.apply_updates(bids_updates, asks_updates)
+        self.orderbook.apply_updates(bids_updates, asks_updates)

-        total_bids, total_asks = self._orderbook.get_total_volume()
-        self._metrics.update_obi_from_book(total_bids, total_asks)
+        total_bids, total_asks = self.orderbook.get_total_volume()
+        try:
+            self.metrics.update_obi_from_book(total_bids, total_asks)
+        except Exception as e:
+            logging.debug(f"OBI update error (ignored): {e}")

+    # -----------------------
+    # UI-facing helpers
+    # -----------------------
+    def get_metrics_series(self):
+        """
+        Returns:
+            {
+              'cvd': [[ts_start, ts_end, o, h, l, c, value_at_close], ...],
+              'obi': [[...], ...],
+              'atr': [atr_value_per_bar, ...]
+            }
+        """
+        try:
+            return self.metrics.get_series()
+        except Exception:
+            return {}
strategy.py (new file, 386 lines)

@@ -0,0 +1,386 @@
# strategy.py
import logging
from typing import List, Dict, Optional
from statistics import median
from datetime import datetime, timezone


class Strategy:
    """
    Long-only CVD Divergence with ATR-based execution, fee-aware PnL, cooldown,
    adaptive CVD strength, optional confirmation entry, and debug logging.

    Configure logging in main.py, for example:
        logging.basicConfig(
            filename="strategy.log",
            level=logging.DEBUG,
            format="%(asctime)s %(levelname)s %(message)s"
        )
    """

    def __init__(
        self,
        # Core signal windows
        lookback: int = 30,
        min_volume_factor: float = 1.0,

        # ATR & execution
        atr_period: int = 14,
        atr_mult_init: float = 2.0,
        atr_mult_trail: float = 3.0,
        breakeven_after_rr: float = 1.5,
        min_bars_before_be: int = 2,
        atr_min_rel_to_med: float = 1.0,
        cooldown_bars: int = 3,

        # Divergence strength thresholds
        price_ll_min_atr: float = 0.05,
        cvd_min_gap: float = 0.0,            # if 0 → adaptive
        cvd_gap_pct_of_range: float = 0.10,  # 10% of rolling cumCVD range

        # Entry confirmation
        confirm_break_signal_high: bool = True,

        # Fees
        fee_rate: float = 0.002,         # taker 0.20% per side
        fee_rate_maker: float = 0.0008,  # maker 0.08% per side
        maker_entry: bool = False,
        maker_exit: bool = False,

        # Debug
        debug: bool = False,
        debug_level: int = 1,  # 0=quiet, 1=key, 2=detail
        debug_every_n_bars: int = 200,
    ):
        # Params
        self.lookback = lookback
        self.min_volume_factor = min_volume_factor

        self.atr_period = atr_period
        self.atr_mult_init = atr_mult_init
        self.atr_mult_trail = atr_mult_trail
        self.breakeven_after_rr = breakeven_after_rr
        self.min_bars_before_be = min_bars_before_be
        self.atr_min_rel_to_med = atr_min_rel_to_med
        self.cooldown_bars = cooldown_bars

        self.price_ll_min_atr = price_ll_min_atr
        self.cvd_min_gap = cvd_min_gap
        self.cvd_gap_pct_of_range = cvd_gap_pct_of_range

        self.confirm_break_signal_high = confirm_break_signal_high

        self.fee_rate = fee_rate
        self.fee_rate_maker = fee_rate_maker
        self.maker_entry = maker_entry
        self.maker_exit = maker_exit

        self.debug = debug
        self.debug_level = debug_level
        self.debug_every_n_bars = debug_every_n_bars

        # Runtime state
        self._last_bar_i: int = 0
        self._cum_cvd: List[float] = []
        self._atr_vals: List[float] = []

        self._in_position: bool = False
        self._entry_price: float = 0.0
        self._entry_i: int = -1
        self._atr_at_entry: float = 0.0
        self._stop: float = 0.0

        self._pending_entry_i: Optional[int] = None
        self._pending_from_signal_i: Optional[int] = None
        self._signal_high: Optional[float] = None
        self._cooldown_until_i: int = -1

        self.trades: List[Dict] = []

        # Debug counters
        self._dbg = {
            "bars_seen": 0,
            "checked": 0,
            "fail_div_price": 0,
            "fail_div_cvd": 0,
            "fail_vol": 0,
            "fail_atr_regime": 0,
            "fail_price_strength": 0,
            "fail_cvd_strength": 0,
            "pass_all": 0,
            "signals_created": 0,
            "signals_canceled_no_break": 0,
            "skipped_cooldown": 0,
            "entries": 0,
            "trail_raises": 0,
            "to_be": 0,
            "exits_sl": 0,
            "exits_be": 0,
            "exits_eor": 0,
        }

    # ============ helpers ============
    @staticmethod
    def _fmt_ts(ms: int) -> str:
        try:
            return datetime.fromtimestamp(int(ms) / 1000, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
        except Exception:
            return str(ms)

    def _ensure_cum_cvd(self, cvd_rows: List[List[float]], upto_len: int) -> None:
        while len(self._cum_cvd) < upto_len:
            i = len(self._cum_cvd)
            bucket_net = float(cvd_rows[i][6]) if i < len(cvd_rows) else 0.0
            prev = self._cum_cvd[-1] if self._cum_cvd else 0.0
            self._cum_cvd.append(prev + bucket_net)

    def _ensure_atr(self, bars: List[Dict], upto_len: int, metrics_atr: Optional[List[float]]) -> None:
        if metrics_atr and len(metrics_atr) >= upto_len:
            self._atr_vals = [float(x) for x in metrics_atr[:upto_len]]
            return

        while len(self._atr_vals) < upto_len:
            i = len(self._atr_vals)
            if i == 0:
                self._atr_vals.append(0.0)
                continue
            h = float(bars[i]["high"])
            l = float(bars[i]["low"])
            pc = float(bars[i - 1]["close"])
            tr = max(h - l, abs(h - pc), abs(l - pc))

            if i < self.atr_period:
                prev_sum = (self._atr_vals[-1] * (i - 1)) if i > 1 else 0.0
                atr = (prev_sum + tr) / float(i)
            else:
                prev_atr = self._atr_vals[-1]
                atr = (prev_atr * (self.atr_period - 1) + tr) / float(self.atr_period)
            self._atr_vals.append(atr)

    # ============ filters ============
    def _volume_ok(self, bars, i):
        if self.min_volume_factor <= 0:
            return True
        start = max(0, i - self.lookback)
        past = [b["volume"] for b in bars[start:i]] or [0.0]
        med_v = median(past)
        return (med_v == 0) or (bars[i]["volume"] >= self.min_volume_factor * med_v)

    def _atr_ok(self, i):
        if i <= 0 or i >= len(self._atr_vals):
            return False
        start = max(0, i - self.lookback)
        window = self._atr_vals[start:i] or [0.0]
        med_atr = median(window)
        return (med_atr == 0.0) or (self._atr_vals[i] >= self.atr_min_rel_to_med * med_atr)

    def _adaptive_cvd_gap(self, i):
        if self.cvd_min_gap > 0.0:
            return self.cvd_min_gap
        start = max(0, i - self.lookback)
        window = self._cum_cvd[start:i] or [0.0]
        rng = (max(window) - min(window)) if window else 0.0
        return rng * self.cvd_gap_pct_of_range

    def _is_bullish_divergence(self, bars, i):
        if i < self.lookback:
            return False
        self._dbg["checked"] += 1
        start = i - self.lookback
        wl = [b["low"] for b in bars[start:i]]
        win_low = min(wl) if wl else bars[i]["low"]
        pr_ll = bars[i]["low"] < win_low
        if not pr_ll:
            self._dbg["fail_div_price"] += 1
            return False
        wcvd = self._cum_cvd[start:i] or [self._cum_cvd[i]]
        win_cvd_min = min(wcvd) if wcvd else self._cum_cvd[i]
        cvd_hl = self._cum_cvd[i] > win_cvd_min
        if not cvd_hl:
            self._dbg["fail_div_cvd"] += 1
            return False
        if not self._volume_ok(bars, i):
            self._dbg["fail_vol"] += 1
            return False
        if not self._atr_ok(i):
            self._dbg["fail_atr_regime"] += 1
            return False
        atr_i = self._atr_vals[i]
        price_gap = (win_low - bars[i]["low"])
        if price_gap < self.price_ll_min_atr * atr_i:
            self._dbg["fail_price_strength"] += 1
            return False
        required_gap = self._adaptive_cvd_gap(i)
        cvd_gap = (self._cum_cvd[i] - win_cvd_min)
        if cvd_gap < required_gap:
            self._dbg["fail_cvd_strength"] += 1
            return False
        self._dbg["pass_all"] += 1
        return True

    # ============ execution ============
    def _net_breakeven_price(self):
        f_entry = self.fee_rate_maker if self.maker_entry else self.fee_rate
        f_exit = self.fee_rate_maker if self.maker_exit else self.fee_rate
        return self._entry_price * ((1.0 + f_entry) / max(1e-12, (1.0 - f_exit)))

    def _do_enter(self, bars, i):
        b = bars[i]
        atr = float(self._atr_vals[i]) if i < len(self._atr_vals) else 0.0
        self._in_position = True
        self._entry_price = float(b["open"])
        self._entry_i = i
        self._atr_at_entry = atr
        self._stop = self._entry_price - self.atr_mult_init * atr
        self._pending_entry_i = None
        self._signal_high = None
        self._pending_from_signal_i = None
        self._dbg["entries"] += 1
        logging.info(f"[ENTRY] ts={b['timestamp_start']} ({self._fmt_ts(b['timestamp_start'])}) "
                     f"price={self._entry_price:.2f} stop={self._stop:.2f} (ATR={atr:.2f})")

    def _exit_with_fees(self, bars, i, exit_price, reason):
        entry = self.trades[-1] if self.trades and self.trades[-1].get("exit_i") is None else None
        if not entry:
            entry = {"entry_i": self._entry_i,
                     "entry_ts": bars[self._entry_i]["timestamp_start"] if self._entry_i >= 0 else None,
                     "entry_price": self._entry_price}
            self.trades.append(entry)
        entry_price = float(entry["entry_price"])
        exit_price = float(exit_price)
        fr_entry = self.fee_rate_maker if self.maker_entry else self.fee_rate
        fr_exit = self.fee_rate_maker if self.maker_exit else self.fee_rate
        pnl_gross = (exit_price / entry_price) - 1.0
        net_factor = (exit_price * (1.0 - fr_exit)) / (entry_price * (1.0 + fr_entry))
        pnl_net = net_factor - 1.0
        if reason == "SL": self._dbg["exits_sl"] += 1
        elif reason == "BE": self._dbg["exits_be"] += 1
        elif reason == "EoR": self._dbg["exits_eor"] += 1
        entry.update({
            "exit_i": i, "exit_ts": bars[i]["timestamp_start"], "exit_price": exit_price,
            "pnl_gross_pct": pnl_gross * 100.0, "pnl_net_pct": pnl_net * 100.0,
            "fees_pct": (fr_entry + fr_exit) * 100.0, "reason": reason,
            "fee_rate_entry": fr_entry, "fee_rate_exit": fr_exit,
        })
        logging.info(f"[EXIT {reason}] ts={bars[i]['timestamp_start']} ({self._fmt_ts(bars[i]['timestamp_start'])}) "
                     f"pnl_net={pnl_net*100:.2f}% (gross={pnl_gross*100:.2f}%, fee={(fr_entry+fr_exit)*100:.2f}%)")

    # ============ main ============
    def process(self, processor):
        bars = processor.bars
        series = processor.get_metrics_series()
        cvd_rows = series.get("cvd", [])
        metrics_atr = series.get("atr")
        n = min(len(bars), len(cvd_rows))
        if n <= self._last_bar_i:
            return

        self._ensure_cum_cvd(cvd_rows, n)
        self._ensure_atr(bars, n, metrics_atr)

        for i in range(self._last_bar_i, n):
            b = bars[i]
            self._dbg["bars_seen"] += 1

            # periodic snapshot
            if self.debug and self.debug_level >= 1 and (i % max(1, self.debug_every_n_bars) == 0):
                atr_i = self._atr_vals[i] if i < len(self._atr_vals) else 0.0
                logging.debug(f"[BAR] i={i} ts={b['timestamp_start']} "
                              f"O={b['open']:.2f} H={b['high']:.2f} L={b['low']:.2f} C={b['close']:.2f} "
                              f"V={b['volume']:.4f} ATR={atr_i:.2f} CUMCVD={self._cum_cvd[i]:.2f}")

            # pending entry
            if self._pending_entry_i is not None and i == self._pending_entry_i and not self._in_position:
                if self.confirm_break_signal_high and self._signal_high is not None:
                    if b["high"] > self._signal_high:
                        logging.debug(f"[CONFIRM] i={i} broke signal_high={self._signal_high:.2f} with H={b['high']:.2f} → ENTER")
                        self._do_enter(bars, i)
                    else:
                        self._dbg["signals_canceled_no_break"] += 1
                        logging.debug(f"[CANCEL] i={i} no break of signal_high={self._signal_high:.2f} (H={b['high']:.2f})")
                        self._pending_entry_i = None
                        self._signal_high = None
                        self._pending_from_signal_i = None
                else:
                    self._do_enter(bars, i)

            # manage position
            if self._in_position:
                if b["low"] <= self._stop:
                    be_price = self._net_breakeven_price()
                    reason = "BE" if self._stop >= be_price else "SL"
                    self._exit_with_fees(bars, i, max(self._stop, b["low"]), reason)
                    self._in_position = False
                    self._cooldown_until_i = i + self.cooldown_bars
                    logging.debug(f"[COOLDN] start i={i} until={self._cooldown_until_i}")
                    continue

                atr_i = self._atr_vals[i]
                new_trail = b["close"] - self.atr_mult_trail * atr_i
                if new_trail > self._stop:
                    self._dbg["trail_raises"] += 1
                    logging.debug(f"[TRAIL] i={i} stop {self._stop:.2f} → {new_trail:.2f} (ATR={atr_i:.2f})")
                    self._stop = new_trail

                if (i - self._entry_i) >= self.min_bars_before_be and self._atr_at_entry > 0.0:
                    if b["close"] >= self._entry_price + self.breakeven_after_rr * self._atr_at_entry:
                        be_price = self._net_breakeven_price()
                        self._stop = max(self._stop, be_price, b["close"] - self.atr_mult_trail * atr_i)
                        self._dbg["to_be"] += 1
                        logging.debug(f"[BE] i={i} set stop ≥ netBE={be_price:.2f} now stop={self._stop:.2f}")

            # new signal
            if not self._in_position:
                if i < self._cooldown_until_i:
                    self._dbg["skipped_cooldown"] += 1
                else:
                    if self._is_bullish_divergence(bars, i):
                        self._signal_high = b["high"]
                        self._pending_from_signal_i = i
                        self._pending_entry_i = i + 1
                        self._dbg["signals_created"] += 1
                        logging.debug(f"[SIGNAL] i={i} ts={b['timestamp_start']} signal_high={self._signal_high:.2f}")

        self._last_bar_i = n

    def on_finish(self, processor):
        bars = processor.bars
        if self._in_position and bars:
            last_i = len(bars) - 1
            last_close = float(bars[last_i]["close"])
            self._exit_with_fees(bars, last_i, last_close, "EoR")
            self._in_position = False

        d = self._dbg
        logging.info(
            "[SUMMARY] "
            f"bars={d['bars_seen']} checked={d['checked']} pass={d['pass_all']} "
            f"fail_price_div={d['fail_div_price']} fail_cvd_div={d['fail_div_cvd']} "
            f"fail_vol={d['fail_vol']} fail_atr_regime={d['fail_atr_regime']} "
            f"fail_price_str={d['fail_price_strength']} fail_cvd_str={d['fail_cvd_strength']} "
            f"signals={d['signals_created']} canceled_no_break={d['signals_canceled_no_break']} "
            f"cooldown_skips={d['skipped_cooldown']} entries={d['entries']} "
            f"trail_raises={d['trail_raises']} to_be={d['to_be']} "
            f"exits_sl={d['exits_sl']} exits_be={d['exits_be']} exits_eor={d['exits_eor']}"
        )

    def get_stats(self) -> Dict:
        done = [t for t in self.trades if "pnl_net_pct" in t]
        total = len(done)
        wins = [t for t in done if t["pnl_net_pct"] > 0]
        avg_net = (sum(t["pnl_net_pct"] for t in done) / total) if total else 0.0
        sum_net = sum(t["pnl_net_pct"] for t in done)
        equity = 1.0
        for t in done:
            equity *= (1.0 + t["pnl_net_pct"] / 100.0)
        compounded_net = (equity - 1.0) * 100.0
        avg_gross = (sum(t.get("pnl_gross_pct", 0.0) for t in done) / total) if total else 0.0
        total_fees = sum((t.get("fees_pct") or 0.0) for t in done)
        return {
            "trades": total,
            "win_rate": (len(wins) / total if total else 0.0),
            "avg_pnl_pct": avg_net,
            "sum_return_pct": sum_net,
            "compounded_return_pct": compounded_net,
            "avg_pnl_gross_pct": avg_gross,
            "total_fees_pct": total_fees,
        }
@@ -1,76 +0,0 @@
## Cumulative Volume Delta (CVD) – Product Requirements Document

### 1) Introduction / Overview
- Compute and visualize Cumulative Volume Delta (CVD) from trade data processed by `OHLCProcessor.process_trades`, aligned to the existing OHLC bar cadence.
- CVD is defined as the cumulative sum of volume delta, where volume delta = buy_volume - sell_volume per trade.
- Trade classification: `side == "buy"` → positive volume delta, `side == "sell"` → negative volume delta.
- Persist the CVD time series as scalar values per window to `metrics_data.json` and render a CVD line chart beneath the current OBI subplot in the Dash UI.

### 2) Goals
- Compute volume delta from individual trades using the `side` field in the Trade dataclass.
- Accumulate CVD across all processed trades (no session resets initially).
- Aggregate CVD into window-aligned scalar values per `window_seconds`.
- Extend the `metrics_data.json` schema to include CVD values alongside the existing OBI data.
- Add a CVD line-chart subplot beneath OBI in the main chart, sharing the time axis.
- Throttle intra-window upserts of CVD values using the same approach/frequency as current OHLC throttling; always write on window close.

### 3) User Stories
- As a researcher, I want CVD computed from actual trade data so I can assess buying/selling pressure over time.
- As an analyst, I want CVD stored per time window so I can correlate it with price movements and OBI patterns.
- As a developer, I want cumulative CVD values so I can analyze long-term directional bias in volume flow.

### 4) Functional Requirements
1. Inputs and Definitions
   - Compute volume delta on every trade in `OHLCProcessor.process_trades` (see the sketch after this section):
     - If `trade.side == "buy"` → `volume_delta = +trade.size`
     - If `trade.side == "sell"` → `volume_delta = -trade.size`
     - If `trade.side` is neither "buy" nor "sell" → `volume_delta = 0` (log a warning)
   - Accumulate into the running CVD: `self.cvd_cumulative += volume_delta`
2. Windowing & Aggregation
   - Use the same `window_seconds` boundary as OHLC bars; the window anchor is derived from the trade timestamp.
   - Store the CVD value at window boundaries (end-of-window CVD snapshot).
   - On window rollover, capture the current `self.cvd_cumulative` value for that window.
3. Persistence
   - Extend the `metrics_data.json` schema from `[timestamp, obi_open, obi_high, obi_low, obi_close]` to `[timestamp, obi_open, obi_high, obi_low, obi_close, cvd_value]`.
   - Update the `viz_io.py` functions to handle the new 6-element schema.
   - Keep only the last 1000 rows.
   - Upsert intra-window CVD values periodically (throttled, matching OHLC's approach) and always write on window close.
4. Visualization
   - Read the extended `metrics_data.json` in the Dash app with the same tolerant JSON reading/caching approach.
   - Extend the main figure with a fourth row for the CVD line chart beneath OBI, sharing the x-axis.
   - Style CVD as a line chart with an appropriate color (distinct from OHLC/Volume/OBI) and add a zero baseline.
5. Performance & Correctness
   - The CVD compute happens on every trade; I/O is throttled to maintain UI responsiveness.
   - Use existing logging and error-handling patterns; must not crash if the metrics JSON is temporarily unreadable.
   - Handle backward compatibility: if an existing `metrics_data.json` has 5-element rows, treat the missing CVD as 0.
6. Testing
   - Unit tests for the volume delta calculation with "buy", "sell", and invalid side values.
   - Unit tests for CVD accumulation across multiple trades and window boundaries.
   - Integration test: fixture trades produce the correct CVD progression in `metrics_data.json`.
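A minimal sketch of requirements 1–2 above (per-trade delta plus the end-of-window snapshot). The `CvdAccumulator` class, its trade fields, and the millisecond anchoring are illustrative assumptions, not the final `OHLCProcessor` code:

```python
# Illustrative only: per-trade volume delta and end-of-window CVD snapshot.
# Class name, trade fields, and ms-based window anchoring are assumptions.
import logging
from typing import List, Optional, Tuple

class CvdAccumulator:
    def __init__(self, window_seconds: int):
        self.window_ms = window_seconds * 1000
        self.cvd_cumulative = 0.0
        self.current_window: Optional[int] = None
        self.snapshots: List[Tuple[int, float]] = []  # (window_start_ms, cvd_value)

    def on_trade(self, side: str, size: float, timestamp_ms: int) -> None:
        window = timestamp_ms // self.window_ms
        if self.current_window is not None and window != self.current_window:
            # window rollover: capture the running CVD for the closed window
            self.snapshots.append((self.current_window * self.window_ms, self.cvd_cumulative))
        self.current_window = window

        if side == "buy":
            self.cvd_cumulative += size   # buy volume → positive delta
        elif side == "sell":
            self.cvd_cumulative -= size   # sell volume → negative delta
        else:
            logging.warning(f"Unknown trade side '{side}', volume_delta = 0")
```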
### 5) Non-Goals
- No CVD reset functionality (will be implemented later).
- No additional derived CVD metrics (e.g., CVD rate of change, normalized CVD).
- No database persistence for CVD; JSON IPC only.
- No strategy/signal changes based on CVD.

### 6) Design Considerations
- Implement the CVD calculation in `OHLCProcessor.process_trades` alongside the existing OHLC aggregation.
- Extend the `viz_io.py` metrics functions to support the 6-element schema while maintaining backward compatibility.
- Add CVD state tracking: `self.cvd_cumulative`, plus `self.cvd_window_value` per window.
- Follow the same throttling pattern as the OBI metrics for consistency.

### 7) Technical Considerations
- Add the CVD computation in the trade-processing loop within `OHLCProcessor.process_trades`.
- Extend the `upsert_metric_bar` and `add_metric_bar` functions to accept an optional `cvd_value` parameter.
- Handle schema migration gracefully: read existing 5-element rows and append 0.0 for the missing CVD (see the sketch below).
- Use the same window alignment as trades (based on the trade timestamp, not the orderbook timestamp).
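A small sketch of the graceful 5→6 element migration described above; `normalize_metric_row` is a hypothetical helper name, and only the row layout comes from this PRD's schema:

```python
# Hypothetical helper: pad legacy 5-element rows with cvd_value = 0.0.
from typing import List

def normalize_metric_row(row: List[float]) -> List[float]:
    if len(row) == 5:             # legacy [ts, o, h, l, c]
        return list(row) + [0.0]  # missing CVD treated as 0
    return list(row)[:6]          # already-extended row

rows = [
    [1700000000000, 1.0, 2.0, 0.5, 1.5],        # old schema
    [1700000060000, 1.5, 2.5, 1.0, 2.0, 12.5],  # new schema
]
rows = [normalize_metric_row(r) for r in rows]
```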
### 8) Success Metrics
- `metrics_data.json` is present with valid 6-element rows during processing.
- The CVD subplot updates smoothly and aligns with OHLC window timestamps.
- CVD increases during buy-heavy periods and decreases during sell-heavy periods.
- No noticeable performance regression in trade processing or UI responsiveness.

### 9) Open Questions
- None; the CVD computation approach is confirmed to use the `trade.side` field, and the schema extension approach is confirmed for `metrics_data.json`.
@@ -1,71 +0,0 @@
## Order Book Imbalance (OBI) – Product Requirements Document

### 1) Introduction / Overview
- Compute and visualize Order Book Imbalance (OBI) from the in-memory order book maintained by `OHLCProcessor`, aligned to the existing OHLC bar cadence.
- OBI is defined as raw `B - A`, where `B` is the total bid size and `A` is the total ask size.
- Persist an OBI time series as OHLC-style bars to `metrics_data.json` and render an OBI candlestick chart beneath the current Volume subplot in the Dash UI.

### 2) Goals
- Compute OBI from the full in-memory aggregated book (all bid/ask levels) on every order book update.
- Aggregate OBI into OHLC-style bars per `window_seconds`.
- Persist OBI bars to `metrics_data.json` with atomic writes and a rolling retention of 1000 rows.
- Add an OBI candlestick subplot (blue-toned) beneath Volume in the main chart, sharing the time axis.
- Throttle intra-window upserts of OBI bars using the same approach/frequency as current OHLC throttling; always write on window close.

### 3) User Stories
- As a researcher, I want OBI computed from the entire book so I can assess true depth imbalance.
- As an analyst, I want OBI stored per time window as candlesticks so I can compare it with price/volume behavior.
- As a developer, I want raw OBI values so I can analyze absolute imbalance patterns.

### 4) Functional Requirements
1. Inputs and Definitions
   - Compute on every order book update using the complete in-memory book (see the sketch after this section):
     - `B = sum(self._book_bids.values())`
     - `A = sum(self._book_asks.values())`
     - `OBI = B - A`
   - Edge case: if both sides are empty → `OBI = 0`.
2. Windowing & Aggregation
   - Use the same `window_seconds` boundary as OHLC bars; the window anchor is derived from the order book update timestamp.
   - Maintain OBI OHLC per window: `obi_open`, `obi_high`, `obi_low`, `obi_close`.
   - On window rollover, finalize and persist the bar.
3. Persistence
   - Introduce `metrics_data.json` (co-located with the other IPC files) with atomic writes.
   - Schema: a list of fixed-length rows
     - `[timestamp_ms, obi_open, obi_high, obi_low, obi_close]`
   - Keep only the last 1000 rows.
   - Upsert intra-window bars periodically (throttled, matching OHLC's approach) and always write on window close.
4. Visualization
   - Read `metrics_data.json` in the Dash app with the same tolerant JSON reading/caching approach as the other IPC files.
   - Extend the main figure with a third row for OBI candlesticks beneath Volume, sharing the x-axis.
   - Style the OBI candlesticks in blue tones (distinct increasing/decreasing shades) and add a zero baseline.
5. Performance & Correctness
   - The OBI compute happens on every order book update; I/O is throttled to maintain UI responsiveness.
   - Use existing logging and error-handling patterns; must not crash if the metrics JSON is temporarily unreadable.
6. Testing
   - Unit tests for OBI on symmetric, empty, and imbalanced books; intra-window aggregation; window rollover.
   - Integration test: a fixture DB produces `metrics_data.json` aligned with the OHLC bars, with a valid schema and row lengths.
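A minimal sketch of requirements 1–2 above (full-book OBI tracked as OHLC within each window). The class and container shapes are illustrative; only `OBI = B - A` and the window fields come from this PRD:

```python
# Illustrative OBI window aggregation: OBI = B - A over the full book,
# tracked as open/high/low/close within each window.
from typing import Dict, List, Optional

class ObiWindowAggregator:
    def __init__(self, window_seconds: int):
        self.window_ms = window_seconds * 1000
        self.current_window: Optional[int] = None
        self.o = self.h = self.l = self.c = 0.0
        self.bars: List[List[float]] = []  # [timestamp_ms, o, h, l, c]

    def on_book_update(self, book_bids: Dict[float, float],
                       book_asks: Dict[float, float], ts_ms: int) -> None:
        obi = sum(book_bids.values()) - sum(book_asks.values())  # 0 if both sides empty
        window = ts_ms // self.window_ms
        if self.current_window is None or window != self.current_window:
            if self.current_window is not None:
                # window rollover: finalize the closed bar
                self.bars.append([self.current_window * self.window_ms,
                                  self.o, self.h, self.l, self.c])
            self.current_window = window
            self.o = self.h = self.l = self.c = obi
        else:
            self.h = max(self.h, obi)
            self.l = min(self.l, obi)
            self.c = obi
```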
### 5) Non-Goals
- No additional derived metrics; keep only raw OBI values for maximum flexibility.
- No database persistence for metrics; JSON IPC only.
- No strategy/signal changes.

### 6) Design Considerations
- Reuse the `OHLCProcessor` in-memory book (`_book_bids`, `_book_asks`).
- Introduce new metrics IO helpers in `viz_io.py` mirroring the existing OHLC IO (atomic write, rolling trim, upsert; see the sketch below).
- Keep `metrics_data.json` separate from `ohlc_data.json` to avoid schema churn.

### 7) Technical Considerations
- Implement the OBI compute and aggregation inside `OHLCProcessor.update_orderbook` after applying partial updates.
- Throttle intra-window upserts with the same cadence concept as OHLC; on window close, always persist.
- Add a finalize path to persist the last OBI bar.
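A sketch of the atomic-write-plus-rolling-trim pattern the IO helpers above should mirror; the function names here are stand-ins, not the actual `viz_io.py` API:

```python
# Stand-in for the viz_io-style helpers: atomic JSON write with a
# rolling 1000-row retention. Names are illustrative, not the real API.
import json, os, tempfile
from typing import List

MAX_ROWS = 1000

def atomic_write_json(path: str, payload) -> None:
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(payload, f)
        os.replace(tmp, path)  # atomic rename; readers never see a partial file
    finally:
        if os.path.exists(tmp):
            os.remove(tmp)

def append_metric_bar(path: str, rows: List[list], new_row: list) -> List[list]:
    rows.append(new_row)
    rows = rows[-MAX_ROWS:]  # rolling retention of the last 1000 rows
    atomic_write_json(path, rows)
    return rows
```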
### 8) Success Metrics
- `metrics_data.json` is present with valid rows during processing.
- The OBI subplot updates smoothly and aligns with OHLC window timestamps.
- OBI ≈ 0 for symmetric books; correct sign for imbalanced cases; no noticeable performance regression.

### 9) Open Questions
- None; the cadence is confirmed to match OHLC throttling. Styling: blue tones for the OBI candlesticks.
@@ -1,190 +0,0 @@
# Product Requirements Document: Migration from Dash/Plotly to PySide6/PyQtGraph

## Introduction/Overview

This PRD outlines the complete migration of the orderflow backtest visualization system from the current Dash/Plotly web-based implementation to a native desktop application using PySide6 and PyQtGraph. The migration addresses critical issues with the current implementation, including async problems, debugging difficulties, performance bottlenecks, and data-handling inefficiencies.

The goal is to create a robust, high-performance desktop application that provides better control over the codebase, eliminates current visualization bugs (particularly the CVD graph display issue), and enables future real-time trading strategy monitoring capabilities.

## Goals

1. **Eliminate Current Technical Issues**
   - Resolve async-related problems causing visualization failures
   - Fix CVD graph display issues that persist despite correct-looking code
   - Enable proper debugging capabilities with breakpoint support
   - Improve overall application performance and responsiveness

2. **Improve Development Experience**
   - Gain better control over the codebase through a native Python implementation
   - Reduce dependency on intermediate file-based data exchange
   - Simplify the development and debugging workflow
   - Establish a foundation for future real-time capabilities

3. **Maintain and Enhance Visualization Capabilities**
   - Preserve all existing chart types and interactions
   - Improve performance for granular dataset handling
   - Prepare infrastructure for real-time data streaming
   - Enhance the user experience through a native desktop interface

## User Stories

1. **As a trading strategy developer**, I want to visualize OHLC data with volume, OBI, and CVD indicators in a single, synchronized view so that I can analyze market behavior patterns effectively.

2. **As a data analyst**, I want to zoom, pan, and select specific time ranges on charts so that I can focus on relevant market periods for detailed analysis.

3. **As a system developer**, I want to debug visualization issues with breakpoints and proper debugging tools so that I can identify and fix problems efficiently.

4. **As a performance-conscious user**, I want smooth chart rendering and interactions even with large, granular datasets so that my analysis workflow is not interrupted by lag or freezing.

5. **As a future trading system operator**, I want a foundation that can handle real-time data updates so that I can monitor live trading strategies effectively.

## Functional Requirements

### Core Visualization Components

1. **Main Chart Window**
   - The system must display OHLC candlestick charts in a primary plot area
   - The system must allow customizable time-window selection for the OHLC display
   - The system must synchronize all chart components to the same time axis

2. **Integrated Indicator Charts**
   - The system must display Volume bars below the OHLC chart
   - The system must display the Order Book Imbalance (OBI) indicator
   - The system must display the Cumulative Volume Delta (CVD) indicator
   - All indicators must share the same X-axis as the OHLC chart

3. **Depth Chart Visualization**
   - The system must display order book depth at selected time snapshots
   - The system must update the depth visualization based on the time selection
   - The system must provide clear bid/ask visualization

### User Interaction Features

4. **Chart Navigation**
   - The system must support zoom in/out functionality across all charts
   - The system must allow panning across time ranges
   - The system must provide time-range selection capabilities
   - The system must support rectangle selection for detailed analysis

5. **Data Inspection**
   - The system must display mouseover information for all chart elements
   - The system must show precise values for OHLC, volume, OBI, and CVD data points
   - The system must provide crosshair functionality for precise data reading (see the sketch after this list)
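A sketch of the crosshair/readout requirement (item 5) using stock PyQtGraph primitives; the window wiring and value formatting are assumptions, not the final widget design:

```python
# Illustrative crosshair + mouseover readout with PyQtGraph and PySide6.
import sys
import pyqtgraph as pg
from PySide6.QtWidgets import QApplication

app = QApplication(sys.argv)
plot = pg.PlotWidget()
plot.plot([1, 2, 3, 4, 5], [10, 12, 9, 14, 11])

# crosshair lines that follow the mouse
vline = pg.InfiniteLine(angle=90, movable=False)
hline = pg.InfiniteLine(angle=0, movable=False)
plot.addItem(vline, ignoreBounds=True)
plot.addItem(hline, ignoreBounds=True)

def mouse_moved(evt):
    pos = evt[0]  # scene position from sigMouseMoved
    if plot.plotItem.sceneBoundingRect().contains(pos):
        p = plot.plotItem.vb.mapSceneToView(pos)
        vline.setPos(p.x())
        hline.setPos(p.y())
        plot.setTitle(f"x={p.x():.2f}  y={p.y():.2f}")  # precise readout

proxy = pg.SignalProxy(plot.scene().sigMouseMoved, rateLimit=60, slot=mouse_moved)
plot.show()
sys.exit(app.exec())
```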
### Technical Architecture

6. **Application Framework**
   - The system must be built using PySide6 for the GUI framework
   - The system must use PyQtGraph for all chart rendering and interactions
   - The system must implement a native desktop application architecture

7. **Data Integration**
   - The system must integrate with the existing data processing modules (metrics_calculator, ohlc_processor, orderbook_manager)
   - The system must eliminate the dependency on intermediate JSON files for data display
   - The system must support direct in-memory data transfer between processing and visualization

8. **Performance Requirements**
   - The system must handle granular datasets efficiently without UI blocking
   - The system must provide smooth chart interactions (zoom, pan, selection)
   - The system must render updates in less than 100ms for typical dataset sizes

### Development and Debugging

9. **Code Quality**
   - The system must be fully debuggable with standard Python debugging tools
   - The system must follow the existing project architecture patterns
   - The system must maintain a clean separation between data processing and visualization

## Non-Goals (Out of Scope)

1. **Web Interface Maintenance** - The existing Dash/Plotly implementation will be completely replaced, not maintained in parallel

2. **Backward Compatibility** - No requirement to maintain compatibility with existing Dash/Plotly components or web-based deployment

3. **Multi-Platform Distribution** - Initial focus on the development environment only, not packaging for distribution

4. **Real-Time Implementation** - While the architecture should support future real-time capabilities, the initial migration will focus on historical data visualization

5. **Advanced Chart Types** - Only migrate existing chart types; new visualization features are out of scope for this migration

## Design Considerations

### User Interface Layout
- **Main Window Structure**: Primary chart area with integrated indicators below
- **Control Panel**: Side panel or toolbar for time-range selection and chart configuration
- **Status Bar**: Display the current data range, loading status, and performance metrics
- **Menu System**: File operations, view options, and application settings

### PyQtGraph Integration
- **Plot Organization**: Use PyQtGraph's PlotWidget for the main charts, with linked axes
- **Custom Plot Items**: Implement custom plot items for OHLC candlesticks and the depth visualization
- **Performance Optimization**: Utilize PyQtGraph's fast plotting capabilities for large datasets

### Data Flow Architecture
- **Direct Memory Access**: Replace JSON file intermediates with direct Python object passing (see the polling sketch below)
- **Lazy Loading**: Implement efficient data-loading strategies for large time ranges
- **Caching Strategy**: Cache processed data to improve navigation performance
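A sketch of the direct-memory data flow: the GUI polls the processor object on a QTimer instead of re-reading JSON files. The `update_data(processor, ...)` call mirrors the wiring in `main.py`; the helper and the polling interval are illustrative:

```python
# Illustrative wiring for direct in-memory data transfer: the UI pulls the
# latest bars/metrics straight from the processor object once per second.
from PySide6.QtCore import QTimer

def wire_processor_to_ui(window, processor, lookback_bars: int = 30) -> QTimer:
    timer = QTimer(window)
    # no JSON intermediates: the lambda reads the shared processor directly
    timer.timeout.connect(lambda: window.update_data(processor, lookback_bars))
    timer.start(1000)  # refresh interval is an assumption
    return timer
```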
## Technical Considerations

### Dependencies and Integration
- **PySide6**: Main GUI framework; provides native desktop capabilities
- **PyQtGraph**: High-performance plotting library, optimized for real-time data
- **Existing Modules**: Maintain integration with metrics_calculator.py, ohlc_processor.py, orderbook_manager.py
- **Database Integration**: Continue using the existing SQLite database through db_interpreter.py

### Migration Strategy (Iterative Implementation)
- **Phase 1**: Basic PySide6 window with a single PyQtGraph plot (see the minimal sketch after this list)
- **Phase 2**: OHLC candlestick chart implementation
- **Phase 3**: Volume, OBI, and CVD indicator integration
- **Phase 4**: Depth chart implementation
- **Phase 5**: User interaction features (zoom, pan, selection)
- **Phase 6**: Data integration and performance optimization
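A minimal Phase 1 sketch (a basic PySide6 main window hosting a single PyQtGraph plot); the class name and placeholder data are illustrative scaffolding only:

```python
# Minimal Phase 1 scaffolding: PySide6 main window with one PyQtGraph plot.
import sys
import pyqtgraph as pg
from PySide6.QtWidgets import QApplication, QMainWindow

class BacktestWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.setWindowTitle("Orderflow Backtest")
        self.plot = pg.PlotWidget()   # single plot area for Phase 1
        self.setCentralWidget(self.plot)
        self.plot.plot([0, 1, 2, 3], [1, 3, 2, 4], pen="w")  # placeholder data

if __name__ == "__main__":
    app = QApplication(sys.argv)
    win = BacktestWindow()
    win.show()
    sys.exit(app.exec())
```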
### Performance Considerations
- **Memory Management**: Efficient data-structure handling for large datasets
- **Rendering Optimization**: Use PyQtGraph's ViewBox and plotting optimizations
- **Thread Safety**: Proper handling of data processing in background threads
- **Resource Cleanup**: Proper cleanup of chart objects and data structures

## Success Metrics

### Technical Success Criteria
1. **Bug Resolution**: The CVD graph displays correctly and all existing visualization bugs are resolved
2. **Performance Improvement**: Chart interactions respond within 100ms for typical datasets
3. **Debugging Capability**: Developers can set breakpoints and debug visualization code effectively
4. **Data Handling**: Eliminating intermediate JSON files reduces data-transfer overhead by 50%

### User Experience Success Criteria
1. **Feature Parity**: All existing chart types and interactions are preserved and functional
2. **Responsiveness**: The application feels more responsive than the current Dash implementation
3. **Stability**: No crashes or freezing during normal chart operations
4. **Visual Quality**: Charts render clearly with proper scaling and anti-aliasing

### Development Success Criteria
1. **Code Maintainability**: The new codebase follows established project patterns and is easier to maintain
2. **Development Velocity**: Future visualization features can be implemented more quickly
3. **Testing Capability**: Comprehensive testing can be performed with proper debugging tools
4. **Architecture Foundation**: The system is ready for future real-time data integration

## Open Questions

1. **Data Loading Strategy**: Should we implement progressive loading for very large datasets, or rely on existing data-chunking mechanisms?

2. **Configuration Management**: How should chart configuration and user preferences be stored and managed in the desktop application?

3. **Error Handling**: What specific error-handling and user-feedback mechanisms should be implemented for data loading and processing failures?

4. **Performance Monitoring**: Should we include built-in performance monitoring and profiling tools in the application?

5. **Future Real-Time Integration**: What specific interface patterns should be established now to facilitate future real-time data streaming integration?

## Implementation Approach

This migration will follow the iterative development workflow with explicit approval checkpoints between phases. Each implementation phase will be:
- Limited to a manageable scope (≤250 lines per module)
- Tested immediately after implementation
- Integrated with the existing data processing modules
- Validated for performance and functionality before proceeding to the next phase

The implementation will begin with the basic PySide6 application structure and progressively add PyQtGraph visualization capabilities while maintaining integration with the existing data processing pipeline.