Description
[Conversation Reference: "User requested admin dashboard log viewing with correlation ID tracking for CIDX server administrators to troubleshoot complex issues including SSO authentication, OAuth flows, and indexing operations"]
Executive Summary
Epic Objective: Provide CIDX server administrators with comprehensive log viewing capabilities through the admin dashboard, enabling effective troubleshooting of complex operational issues via correlation ID tracking.
Business Value: Currently, administrators have no centralized UI for viewing logs, which makes troubleshooting difficult. This epic delivers searchable, filterable log access, with correlation IDs that trace errors across SSO, OAuth, and indexing operations. Access is exposed through the REST API, MCP API, and Web UI with full API parity.
Architecture Impact: Multi-handler logging strategy adding SQLite handler alongside existing system logs; CorrelationContextMiddleware for UUID v4 correlation ID propagation; shared LogAggregatorService backend for three-interface access (REST, MCP, Web UI).
Epic Scope and Objectives
Primary Objectives
- Enable administrators to view all operational logs through admin dashboard web UI
- Provide search and filter capabilities (log level, correlation ID, text search)
- Ensure ALL error conditions generate unique correlation IDs for traceability
- Deliver full API parity across REST API, MCP API, and Web UI interfaces
- Implement log export functionality for external analysis
Measured Success Criteria
- Administrators can access logs through admin dashboard Logs tab
- Logs are searchable by message content and correlation ID
- Logs are filterable by log level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
- All SSO authentication errors include correlation IDs
- All OAuth flow errors include correlation IDs
- All indexing operation errors include correlation IDs
- REST API provides full log query/search/export capabilities
- MCP API provides full log query/search/export capabilities
- Logs can be exported to JSON and CSV formats
Architecture Overview
Multi-Handler Logging Strategy (Non-Breaking)
                 Single logging call
                          |
          +---------------+---------------+
          |               |               |
          v               v               v
    Console/File   SQLite Handler  Existing Audit
     (unchanged)   (NEW - logs.db)   (unchanged)
Key Decision: Keep existing system logs unchanged. Add new SQLite handler that writes to ~/.cidx-server/logs.db alongside existing handlers.
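A minimal sketch of the multi-handler idea, assuming a handler built on the standard library `logging.Handler` and `sqlite3`. The class body, the `cidx.demo` logger name, the in-memory database, and the reduced column set are illustrative assumptions, not the project's actual implementation.

```python
import logging
import sqlite3
from datetime import datetime, timezone


class SQLiteLogHandler(logging.Handler):
    """Hypothetical handler that writes each log record to a SQLite table."""

    def __init__(self, db_path):
        super().__init__()
        # check_same_thread=False allows records emitted from other threads;
        # logging.Handler.handle() already serializes emit() with a lock.
        self.conn = sqlite3.connect(db_path, check_same_thread=False)
        # Reduced version of the epic's schema, enough for the demo.
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS logs ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "timestamp DATETIME NOT NULL, level VARCHAR(10) NOT NULL, "
            "source VARCHAR(100), message TEXT NOT NULL, "
            "correlation_id VARCHAR(36))"
        )

    def emit(self, record):
        try:
            ts = datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat()
            self.conn.execute(
                "INSERT INTO logs (timestamp, level, source, message, correlation_id) "
                "VALUES (?, ?, ?, ?, ?)",
                (ts, record.levelname, record.name, record.getMessage(),
                 getattr(record, "correlation_id", None)),
            )
            self.conn.commit()
        except Exception:
            # Never let a logging failure crash the caller.
            self.handleError(record)


logger = logging.getLogger("cidx.demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(logging.StreamHandler())      # stands in for the existing console handler
sqlite_handler = SQLiteLogHandler(":memory:")   # the real target would be ~/.cidx-server/logs.db
logger.addHandler(sqlite_handler)

# One logging call reaches both handlers.
logger.error("OIDC token exchange failed", extra={"correlation_id": "demo-correlation-id"})
```

Because the new handler is only added to the logger, the existing handlers keep receiving the same records, which is what makes the change non-breaking.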
Correlation ID Propagation
Request arrives
      |
      v
CorrelationContextMiddleware
      |
      +-- Extract X-Correlation-ID header (or generate UUID v4)
      |
      +-- Store in Python contextvars
      |
      v
All logging calls automatically inject correlation_id
      |
      v
Response includes X-Correlation-ID header
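The flow above can be sketched without any web framework, assuming a `contextvars.ContextVar` holds the ID and a `logging.Filter` injects it into each record. `begin_request`, `CorrelationIdFilter`, and `ListHandler` are hypothetical names used for illustration; only `get_correlation_id` and the `X-Correlation-ID` header come from this epic, and the real middleware would depend on the server framework in use.

```python
import contextvars
import logging
import uuid

# Correlation ID for the current request context (None outside a request).
_correlation_id = contextvars.ContextVar("correlation_id", default=None)


def get_correlation_id():
    return _correlation_id.get()


def begin_request(headers):
    """Reuse an incoming X-Correlation-ID header, or generate a UUID v4."""
    normalized = {key.lower(): value for key, value in headers.items()}
    cid = normalized.get("x-correlation-id") or str(uuid.uuid4())
    _correlation_id.set(cid)
    return cid


class CorrelationIdFilter(logging.Filter):
    """Attach the current correlation ID to every record that passes through."""

    def filter(self, record):
        record.correlation_id = get_correlation_id()
        return True


class ListHandler(logging.Handler):
    """Demo-only handler that keeps records in memory."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


handler = ListHandler()
handler.addFilter(CorrelationIdFilter())
logger = logging.getLogger("cidx.correlation_demo")
logger.propagate = False
logger.addHandler(handler)

# Request without the header: an ID is generated.
generated = begin_request({})
logger.error("Indexing job failed")

# Request with the header: the caller's ID is reused.
reused = begin_request({"X-Correlation-ID": "client-supplied-id"})
logger.error("OAuth callback rejected")

# The middleware would echo the ID back on the response.
response_headers = {"X-Correlation-ID": get_correlation_id()}
```

Attaching the filter to the handler rather than to individual loggers means call sites do not need to pass the ID explicitly, although the explicit `extra={...}` pattern described later in this epic remains compatible with it.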
Three-Interface Access Architecture
+-------------------+     +-------------------+     +-------------------+
|      Web UI       |     |     REST API      |     |      MCP API      |
|  /admin/logs tab  |     |  /admin/api/logs  |     |  admin_logs_query |
|    HTMX-based     |     |     endpoints     |     |       tools       |
+---------+---------+     +---------+---------+     +---------+---------+
          |                         |                         |
          +-------------------------+-------------------------+
                                    |
                                    v
                        +-----------+-----------+
                        | LogAggregatorService  |
                        |   (Shared Backend)    |
                        +-----------+-----------+
                                    |
                                    v
                        +-----------+-----------+
                        |    SQLite logs.db     |
                        |    ~/.cidx-server/    |
                        +-----------------------+
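A rough sketch of how a shared backend could serve all three interfaces from one query method, assuming the `logs` table described under Database Schema. The `query` method, its parameter names, and the sample rows are assumptions made for illustration; the epic does not specify the service's interface.

```python
import sqlite3


class LogAggregatorService:
    """Hypothetical shared backend used by the Web UI, REST, and MCP front ends."""

    def __init__(self, conn):
        self._conn = conn
        self._conn.row_factory = sqlite3.Row

    def query(self, level=None, correlation_id=None, search=None, limit=100):
        clauses, params = [], []
        if level:
            clauses.append("level = ?")
            params.append(level)
        if correlation_id:
            clauses.append("correlation_id = ?")
            params.append(correlation_id)
        if search:
            # Plain substring match; % and _ in the search text are not escaped here.
            clauses.append("message LIKE ?")
            params.append("%" + search + "%")
        where = ("WHERE " + " AND ".join(clauses)) if clauses else ""
        sql = (
            "SELECT timestamp, level, source, message, correlation_id "
            "FROM logs " + where + " ORDER BY timestamp DESC, id DESC LIMIT ?"
        )
        params.append(limit)
        return [dict(row) for row in self._conn.execute(sql, params)]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logs (id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "timestamp DATETIME NOT NULL, level VARCHAR(10) NOT NULL, source VARCHAR(100), "
    "message TEXT NOT NULL, correlation_id VARCHAR(36))"
)
conn.executemany(
    "INSERT INTO logs (timestamp, level, source, message, correlation_id) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("2025-01-01T10:00:00+00:00", "ERROR", "auth.oidc", "OIDC token exchange failed", "cid-1"),
        ("2025-01-01T10:00:01+00:00", "INFO", "sync", "Indexing started", "cid-2"),
        ("2025-01-01T10:00:02+00:00", "ERROR", "sync", "Indexing failed: timeout", "cid-2"),
    ],
)

service = LogAggregatorService(conn)
errors = service.query(level="ERROR")
trace = service.query(correlation_id="cid-2")
matches = service.query(search="token")
```

Keeping the filtering logic in one place is what would let the REST endpoints, MCP tools, and HTMX views stay at parity: each front end only translates its own request format into the same keyword arguments.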
Database Schema
CREATE TABLE logs (
id INTEGER PRIMARY KEY AUTOINCREMENT,
timestamp DATETIME NOT NULL,
level VARCHAR(10) NOT NULL,
source VARCHAR(100),
message TEXT NOT NULL,
correlation_id VARCHAR(36),
user_id VARCHAR(100),
request_path VARCHAR(500),
extra_data JSON,
created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX idx_logs_timestamp ON logs(timestamp);
CREATE INDEX idx_logs_level ON logs(level);
CREATE INDEX idx_logs_correlation_id ON logs(correlation_id);
CREATE INDEX idx_logs_source ON logs(source);
Features & Stories Implementation Order
Feature 1: Log Viewing UI and API
[Conversation Reference: "Log Viewing UI Tab - New tab in admin dashboard displaying logs with refresh button (no real-time streaming)"]
- #TBD [STORY] Story 1: Log Viewing with Basic Display
Feature 2: Log Search and Filtering
[Conversation Reference: "Frontend Search Capability - Search across log messages, correlation IDs, and relevant fields; Log Level Filtering - Filter by log level"]
- #TBD [STORY] Story 2: Log Search and Filtering
Feature 3: Correlation ID System
[Conversation Reference: "Correlation ID Generation - Ensure ALL error conditions generate unique correlation IDs (critical for SSO, OAuth, indexing)"]
- #TBD [STORY] Story 3: Correlation ID Generation for Error Tracking
Feature 4: Log Export
[Conversation Reference: "Log Export/Download - Download filtered/searched logs to file (JSON/CSV formats)"]
- #TBD [STORY] Story 4: Log Export for External Analysis
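For the export feature, a small sketch using only the standard library `json` and `csv` modules. The function names, the `FIELDS` column list, and the sample row are assumptions; the epic only states that filtered logs are downloadable as JSON and CSV.

```python
import csv
import io
import json

# Columns included in the export; an assumed subset of the logs table.
FIELDS = ["timestamp", "level", "source", "message", "correlation_id"]


def export_json(rows):
    """Serialize log rows (a list of dicts) as a JSON array."""
    return json.dumps(rows, indent=2)


def export_csv(rows):
    """Serialize log rows as CSV with a header line; extra keys are ignored."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


sample_rows = [
    {
        "timestamp": "2025-01-01T10:00:00+00:00",
        "level": "ERROR",
        "source": "auth.oauth",
        "message": 'Token exchange failed, "invalid_grant"',
        "correlation_id": "cid-1",
    }
]

json_text = export_json(sample_rows)
csv_text = export_csv(sample_rows)
```

Using `csv.DictWriter` rather than joining strings by hand matters for log messages, which often contain commas and quotes that need proper escaping.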
Technical Implementation Standards
Core Architectural Decisions
- Multi-Handler Logging Strategy
  - Keep existing stdout/stderr, file logs, audit logs unchanged
  - Add SQLiteLogHandler writing to ~/.cidx-server/logs.db
  - Single logging call writes to all handlers simultaneously
- Configurable Log Retention
  - Default: 30 days (user-configurable via admin UI Config tab)
  - Setting: log_retention_days (externalized config)
  - Background cleanup task runs daily
- Correlation ID Format: UUID v4 (industry standard)
- API Design: REST + MCP parity (matches CIDX query parity principle)
- Frontend: Server-side filtering (better performance)
- Export Formats: JSON + CSV
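The retention decision above could be implemented by a cleanup function along these lines. This is a sketch under the assumption that timestamps are stored as ISO-8601 UTC strings, which sort chronologically as text; `purge_old_logs` is a hypothetical name, and how the daily task is scheduled is left open.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def purge_old_logs(conn, log_retention_days, now=None):
    """Delete rows older than the retention window and return how many were removed."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=log_retention_days)).isoformat()
    # Assumes ISO-8601 UTC text timestamps, so string comparison matches time order.
    cursor = conn.execute("DELETE FROM logs WHERE timestamp < ?", (cutoff,))
    conn.commit()
    return cursor.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logs (id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "timestamp DATETIME NOT NULL, level VARCHAR(10) NOT NULL, message TEXT NOT NULL)"
)

now = datetime(2025, 3, 1, tzinfo=timezone.utc)
conn.executemany(
    "INSERT INTO logs (timestamp, level, message) VALUES (?, ?, ?)",
    [
        ((now - timedelta(days=40)).isoformat(), "INFO", "older than retention"),
        ((now - timedelta(days=29)).isoformat(), "INFO", "within retention"),
    ],
)

deleted = purge_old_logs(conn, log_retention_days=30, now=now)
remaining = [row[0] for row in conn.execute("SELECT message FROM logs")]
```

Passing `now` explicitly keeps the function testable; the background task would simply call it with the configured `log_retention_days` value once a day.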
Error Path Audit Scope
Phase 1 (P0): SSO (auth/oidc/), OAuth (auth/oauth/), Indexing (sync/error_handler.py) - ~15 files
Phase 2 (P1): Repositories, Query operations - ~20 files
Phase 3 (P2): Remaining error handlers - ~15 files
Total: ~50 files, ~100+ logging calls to update
Update Pattern:
logger.error("msg", extra={"correlation_id": get_correlation_id()})
Risk Assessment and Mitigation
Technical Risks
Risk: SQLite database growth could impact disk space
Mitigation: Configurable retention with daily cleanup; default 30 days
Risk: High log volume impacting query performance
Mitigation: Database indexes on timestamp, level, correlation_id, source
Risk: Missing correlation IDs in existing error paths
Mitigation: Comprehensive ~50 file audit with systematic update pattern
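One way to check that the index mitigation actually applies to a given query is SQLite's `EXPLAIN QUERY PLAN`. The sketch below builds the schema from this epic in memory and inspects the plan text; `plan_for` is a hypothetical helper, and the exact plan wording varies between SQLite versions, so only the index name is examined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE logs (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        timestamp DATETIME NOT NULL,
        level VARCHAR(10) NOT NULL,
        source VARCHAR(100),
        message TEXT NOT NULL,
        correlation_id VARCHAR(36),
        user_id VARCHAR(100),
        request_path VARCHAR(500),
        extra_data JSON,
        created_at DATETIME DEFAULT CURRENT_TIMESTAMP
    );
    CREATE INDEX idx_logs_timestamp ON logs(timestamp);
    CREATE INDEX idx_logs_level ON logs(level);
    CREATE INDEX idx_logs_correlation_id ON logs(correlation_id);
    CREATE INDEX idx_logs_source ON logs(source);
    """
)


def plan_for(sql, params=()):
    """Return the human-readable plan text for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row holds the description.
    return " | ".join(row[-1] for row in rows)


by_correlation = plan_for("SELECT * FROM logs WHERE correlation_id = ?", ("cid-1",))
by_text = plan_for("SELECT * FROM logs WHERE message LIKE ?", ("%timeout%",))
```

Lookups by correlation ID can use `idx_logs_correlation_id`, while a substring search on `message` has no index to use and falls back to a table scan, so free-text search is the query most likely to need attention if log volume grows.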
Operational Risks
Risk: Breaking existing logging behavior
Mitigation: Multi-handler approach adds SQLite alongside existing logs; no removal
Dependencies and Prerequisites
Technical Dependencies
- SQLite (via the standard library sqlite3 module - no new dependencies)
- Python contextvars (standard library - Python 3.7+)
- Existing admin dashboard infrastructure
Implementation Dependencies
- Story 1 (Log Viewing) provides base infrastructure for Stories 2, 3, 4
- Story 3 (Correlation ID) can proceed in parallel after Story 1
- Story 4 (Export) depends on Stories 1 and 2
Success Metrics and Validation
Functional Metrics
- 100% of log entries accessible through all three interfaces (Web UI, REST, MCP)
- Search returns results in <2 seconds for typical queries
- All error paths (SSO, OAuth, indexing) generate correlation IDs
Performance Metrics
- Log page load time <3 seconds for initial display
- Filter/search response time <2 seconds
- Export generation <10 seconds for 30 days of logs
Quality Metrics
- Test coverage >90%
- Zero critical security vulnerabilities
- Documentation completeness 100%
- Full API parity validation between REST, MCP, and Web UI
Epic Completion Criteria
Definition of Done
- All four stories implemented and deployed
- Complete test suite with >90% coverage
- Full documentation with examples
- Performance benchmarks met
- Security audit passed (admin-only access verified)
- Manual E2E testing completed for all interfaces
Acceptance Validation
- Administrator can view logs in Web UI Logs tab
- Administrator can search and filter logs via all interfaces
- Administrator can trace errors via correlation IDs
- Administrator can export logs to JSON/CSV
- All SSO/OAuth/indexing errors have correlation IDs
- REST API and MCP API have feature parity with Web UI
Epic Owner: CIDX Development Team
Stakeholders: CIDX Server Administrators, Operations Team
Success Measurement: Administrator troubleshooting time reduction; correlation ID coverage for all error paths