[EPIC] Admin Dashboard Log Viewing with Correlation ID Tracking #663

@jsbattig

[Conversation Reference: "User requested admin dashboard log viewing with correlation ID tracking for CIDX server administrators to troubleshoot complex issues including SSO authentication, OAuth flows, and indexing operations"]

Executive Summary

Epic Objective: Provide CIDX server administrators with comprehensive log viewing capabilities through the admin dashboard, enabling effective troubleshooting of complex operational issues via correlation ID tracking.

Business Value: Currently, administrators have no centralized UI for viewing logs, which makes troubleshooting difficult. This epic delivers searchable, filterable log access with correlation IDs that trace errors across SSO, OAuth, and indexing operations, exposed through the REST API, MCP API, and Web UI with full API parity.

Architecture Impact: Multi-handler logging strategy adding SQLite handler alongside existing system logs; CorrelationContextMiddleware for UUID v4 correlation ID propagation; shared LogAggregatorService backend for three-interface access (REST, MCP, Web UI).

Epic Scope and Objectives

Primary Objectives

  • Enable administrators to view all operational logs through admin dashboard web UI
  • Provide search and filter capabilities (log level, correlation ID, text search)
  • Ensure ALL error conditions generate unique correlation IDs for traceability
  • Deliver full API parity across REST API, MCP API, and Web UI interfaces
  • Implement log export functionality for external analysis

Measured Success Criteria

  • Administrators can access logs through admin dashboard Logs tab
  • Logs are searchable by message content and correlation ID
  • Logs are filterable by log level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
  • All SSO authentication errors include correlation IDs
  • All OAuth flow errors include correlation IDs
  • All indexing operation errors include correlation IDs
  • REST API provides full log query/search/export capabilities
  • MCP API provides full log query/search/export capabilities
  • Logs can be exported to JSON and CSV formats

Architecture Overview

Multi-Handler Logging Strategy (Non-Breaking)

                    Single logging call
                           |
            +--------------+--------------+
            |              |              |
            v              v              v
      Console/File   SQLite Handler   Existing Audit
       (unchanged)    (NEW - logs.db)   (unchanged)

Key Decision: Keep existing system logs unchanged. Add a new SQLite handler that writes to ~/.cidx-server/logs.db alongside the existing handlers.
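
A minimal sketch of how such a handler could sit next to the existing ones, using only the standard library. The class body, the reduced column set, and the logger name cidx.demo are assumptions for illustration; only the handler name SQLiteLogHandler and the logs.db location come from this epic.

```python
import logging
import sqlite3
from datetime import datetime, timezone


class SQLiteLogHandler(logging.Handler):
    """Hypothetical handler that writes each record to a SQLite `logs` table."""

    def __init__(self, db_path: str) -> None:
        super().__init__()
        # check_same_thread=False lets records from worker threads reuse this
        # connection; Handler.handle() already serializes emit() with a lock.
        self.conn = sqlite3.connect(db_path, check_same_thread=False)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS logs ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "timestamp DATETIME NOT NULL, level VARCHAR(10) NOT NULL, "
            "source VARCHAR(100), message TEXT NOT NULL, "
            "correlation_id VARCHAR(36))"
        )
        self.conn.commit()

    def emit(self, record: logging.LogRecord) -> None:
        try:
            ts = datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat()
            self.conn.execute(
                "INSERT INTO logs (timestamp, level, source, message, correlation_id) "
                "VALUES (?, ?, ?, ?, ?)",
                (ts, record.levelname, record.name, record.getMessage(),
                 getattr(record, "correlation_id", None)),
            )
            self.conn.commit()
        except Exception:
            # A failing log write must not crash the code that logged.
            self.handleError(record)

    def close(self) -> None:
        self.conn.close()
        super().close()


logger = logging.getLogger("cidx.demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(logging.StreamHandler())     # stands in for the existing console handler
sqlite_handler = SQLiteLogHandler(":memory:")  # the epic targets ~/.cidx-server/logs.db
logger.addHandler(sqlite_handler)

# One call; both handlers receive the same record.
logger.error("indexing failed", extra={"correlation_id": "demo-id"})
rows = sqlite_handler.conn.execute(
    "SELECT level, source, message, correlation_id FROM logs"
).fetchall()
```

Because the handler is only added, never substituted, removing it restores the previous logging behavior exactly.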

Correlation ID Propagation

Request arrives
     |
     v
CorrelationContextMiddleware
     |
     +-- Extract X-Correlation-ID header (or generate UUID v4)
     |
     +-- Store in Python contextvars
     |
     v
All logging calls automatically inject correlation_id
     |
     v
Response includes X-Correlation-ID header
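
The flow above can be sketched as a plain ASGI middleware plus a logging filter. This is an illustration under stated assumptions: the web framework is not named in this epic, so the sketch avoids framework APIs, and CorrelationIdFilter, get_correlation_id's body, and the demo app are hypothetical.

```python
import asyncio
import contextvars
import logging
import uuid

# Holds the correlation ID of the request currently being handled.
_correlation_id = contextvars.ContextVar("correlation_id", default=None)


def get_correlation_id():
    return _correlation_id.get()


class CorrelationIdFilter(logging.Filter):
    """Copies the current correlation ID onto records that lack one."""

    def filter(self, record):
        if not hasattr(record, "correlation_id"):
            record.correlation_id = get_correlation_id()
        return True


class CorrelationContextMiddleware:
    """Plain-ASGI sketch; the real middleware may be framework-specific."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return
        headers = dict(scope.get("headers") or [])
        incoming = headers.get(b"x-correlation-id")
        cid = incoming.decode("latin-1") if incoming else str(uuid.uuid4())
        token = _correlation_id.set(cid)

        async def send_with_header(message):
            # Echo the ID back on the response start message.
            if message["type"] == "http.response.start":
                message = dict(message)
                message["headers"] = list(message.get("headers", [])) + [
                    (b"x-correlation-id", cid.encode("latin-1"))
                ]
            await send(message)

        try:
            await self.app(scope, receive, send_with_header)
        finally:
            _correlation_id.reset(token)


class ListHandler(logging.Handler):
    """Collects records in memory so the demo can inspect them."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


capture = ListHandler()
capture.addFilter(CorrelationIdFilter())
demo_logger = logging.getLogger("cidx.middleware_demo")
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False
demo_logger.addHandler(capture)
sent = []


async def demo_app(scope, receive, send):
    demo_logger.info("handling request")  # no explicit correlation_id passed
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})


async def run_demo():
    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "headers": [(b"x-correlation-id", b"abc-123")]}
    await CorrelationContextMiddleware(demo_app)(scope, receive, send)


asyncio.run(run_demo())
```

Storing the ID in a context variable rather than a thread-local keeps it isolated per request even when many requests share one event loop thread.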

Three-Interface Access Architecture

+-------------------+     +-------------------+     +-------------------+
|    Web UI         |     |    REST API       |     |    MCP API        |
|  /admin/logs tab  |     | /admin/api/logs   |     | admin_logs_query  |
|  HTMX-based       |     |   endpoints       |     |    tools          |
+--------+----------+     +--------+----------+     +--------+----------+
         |                         |                         |
         +-------------------------+-------------------------+
                                   |
                                   v
                       +-----------+------------+
                       |  LogAggregatorService  |
                       |    (Shared Backend)    |
                       +-----------+------------+
                                   |
                                   v
                       +-----------+------------+
                       |     SQLite logs.db     |
                       |    ~/.cidx-server/     |
                       +------------------------+
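
One way to keep the three interfaces in parity is to make each a thin adapter over a single query method. The sketch below assumes a hypothetical LogQuery filter object and hypothetical adapter function names; an in-memory list stands in for logs.db so the example stays self-contained.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Mapping, Optional


@dataclass(frozen=True)
class LogQuery:
    """Filter set shared by all interfaces (field names are assumptions)."""
    level: Optional[str] = None
    correlation_id: Optional[str] = None
    search: Optional[str] = None
    limit: int = 100


class LogAggregatorService:
    """Single place where filtering happens; a list stands in for logs.db."""

    def __init__(self, entries: List[Dict[str, Any]]) -> None:
        self._entries = list(entries)

    def query(self, q: LogQuery) -> List[Dict[str, Any]]:
        matches = []
        for entry in self._entries:
            if q.level and entry["level"] != q.level:
                continue
            if q.correlation_id and entry.get("correlation_id") != q.correlation_id:
                continue
            if q.search and q.search.lower() not in entry["message"].lower():
                continue
            matches.append(entry)
        return matches[: q.limit]


def _to_query(raw: Mapping[str, Any]) -> LogQuery:
    """Translate loosely typed request input into one LogQuery."""
    return LogQuery(
        level=raw.get("level"),
        correlation_id=raw.get("correlation_id"),
        search=raw.get("search"),
        limit=int(raw.get("limit", 100)),
    )


def rest_get_logs(service: LogAggregatorService, query_params: Mapping[str, Any]):
    """Hypothetical handler body for GET /admin/api/logs."""
    return {"logs": service.query(_to_query(query_params))}


def mcp_admin_logs_query(service: LogAggregatorService, arguments: Mapping[str, Any]):
    """Hypothetical body of the admin_logs_query MCP tool."""
    return {"logs": service.query(_to_query(arguments))}


service = LogAggregatorService([
    {"level": "INFO", "message": "indexing started", "correlation_id": "c1"},
    {"level": "ERROR", "message": "OIDC callback rejected", "correlation_id": "c2"},
])
rest_result = rest_get_logs(service, {"level": "ERROR", "limit": "50"})
mcp_result = mcp_admin_logs_query(service, {"level": "ERROR"})
```

With this shape, a parity test reduces to asserting that every adapter returns the same payload for the same filters.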

Database Schema

CREATE TABLE logs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp DATETIME NOT NULL,
    level VARCHAR(10) NOT NULL,
    source VARCHAR(100),
    message TEXT NOT NULL,
    correlation_id VARCHAR(36),
    user_id VARCHAR(100),
    request_path VARCHAR(500),
    extra_data JSON,  -- JSON-encoded text; SQLite has no dedicated JSON storage class
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX idx_logs_timestamp ON logs(timestamp);
CREATE INDEX idx_logs_level ON logs(level);
CREATE INDEX idx_logs_correlation_id ON logs(correlation_id);
CREATE INDEX idx_logs_source ON logs(source);
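
As an illustration of how the schema and indexes might be queried, the sketch below builds a parameterized SELECT from optional filters. The function name build_log_query and the sample rows are assumptions; the demo table keeps only the columns it needs.

```python
import sqlite3


def build_log_query(level=None, correlation_id=None, search=None, limit=100):
    """Return (sql, params) for the given filters; values are always bound, never interpolated."""
    clauses, params = [], []
    if level:
        clauses.append("level = ?")
        params.append(level)
    if correlation_id:
        clauses.append("correlation_id = ?")
        params.append(correlation_id)
    if search:
        # Escape LIKE wildcards so user text is matched literally.
        escaped = search.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
        clauses.append("message LIKE ? ESCAPE '\\'")
        params.append(f"%{escaped}%")
    where = " WHERE " + " AND ".join(clauses) if clauses else ""
    sql = (
        "SELECT timestamp, level, message, correlation_id FROM logs"
        + where
        + " ORDER BY timestamp DESC LIMIT ?"
    )
    params.append(limit)
    return sql, params


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logs (timestamp DATETIME NOT NULL, level VARCHAR(10) NOT NULL, "
    "message TEXT NOT NULL, correlation_id VARCHAR(36))"
)
conn.execute("CREATE INDEX idx_logs_correlation_id ON logs(correlation_id)")
conn.executemany(
    "INSERT INTO logs VALUES (?, ?, ?, ?)",
    [
        ("2025-01-01T10:00:00", "INFO", "indexing started", "c1"),
        ("2025-01-01T10:00:05", "ERROR", "indexing failed at 100% disk usage", "c1"),
        ("2025-01-01T10:01:00", "ERROR", "OAuth token exchange failed", "c2"),
    ],
)

sql, params = build_log_query(correlation_id="c1")
trace = conn.execute(sql, params).fetchall()

sql, params = build_log_query(level="ERROR", search="100%")
disk_errors = conn.execute(sql, params).fetchall()
```

Equality filters on level and correlation_id can use the indexes above. A LIKE pattern with a leading wildcard cannot, so free-text search still scans the rows left after the other filters.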

Features & Stories Implementation Order

Feature 1: Log Viewing UI and API

[Conversation Reference: "Log Viewing UI Tab - New tab in admin dashboard displaying logs with refresh button (no real-time streaming)"]

  • #TBD [STORY] Story 1: Log Viewing with Basic Display

Feature 2: Log Search and Filtering

[Conversation Reference: "Frontend Search Capability - Search across log messages, correlation IDs, and relevant fields; Log Level Filtering - Filter by log level"]

  • #TBD [STORY] Story 2: Log Search and Filtering

Feature 3: Correlation ID System

[Conversation Reference: "Correlation ID Generation - Ensure ALL error conditions generate unique correlation IDs (critical for SSO, OAuth, indexing)"]

  • #TBD [STORY] Story 3: Correlation ID Generation for Error Tracking

Feature 4: Log Export

[Conversation Reference: "Log Export/Download - Download filtered/searched logs to file (JSON/CSV formats)"]

  • #TBD [STORY] Story 4: Log Export for External Analysis
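
A rough sketch of the two export formats using only the standard library. The column list FIELDS and the function names are assumptions; the real export would take its rows from the same filtered query the UI uses.

```python
import csv
import io
import json

# Assumed export columns; the real set would follow the logs table.
FIELDS = ["timestamp", "level", "source", "message", "correlation_id"]


def export_json(rows):
    """Serialize log rows as a JSON array."""
    return json.dumps(rows, indent=2)


def export_csv(rows):
    """Serialize log rows as CSV with a header line; keys outside FIELDS are ignored."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


sample = [
    {
        "timestamp": "2025-01-01T10:00:05+00:00",
        "level": "ERROR",
        "source": "auth.oauth",
        "message": 'token exchange failed, provider said "invalid_grant"',
        "correlation_id": "c2",
    }
]
json_text = export_json(sample)
csv_text = export_csv(sample)

# Round-trip both formats to confirm nothing is lost to quoting.
json_back = json.loads(json_text)
csv_back = list(csv.DictReader(io.StringIO(csv_text)))
```

The csv module quotes fields containing commas or double quotes, which matters here because log messages are free text.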

Technical Implementation Standards

Core Architectural Decisions

  1. Multi-Handler Logging Strategy

    • Keep existing stdout/stderr, file logs, audit logs unchanged
    • Add SQLiteLogHandler writing to ~/.cidx-server/logs.db
    • A single logging call is dispatched to every attached handler
  2. Configurable Log Retention

    • Default: 30 days (user-configurable via admin UI Config tab)
    • Setting: log_retention_days externalized config
    • Background cleanup task runs daily
  3. Correlation ID Format: UUID v4 (industry standard)

  4. API Design: REST + MCP parity (matches CIDX query parity principle)

  5. Frontend: Server-side filtering (filters run in the database, so the browser receives only matching rows)

  6. Export Formats: JSON + CSV
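
The retention rule in decision 2 could be enforced by a daily task along these lines. This is a sketch that assumes timestamps are stored as ISO-8601 UTC strings, so string comparison matches chronological order; purge_old_logs is a hypothetical name, and the scheduling of the task is left out.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def purge_old_logs(conn, retention_days, now=None):
    """Delete rows older than retention_days and return how many were removed."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=retention_days)).isoformat()
    cursor = conn.execute("DELETE FROM logs WHERE timestamp < ?", (cutoff,))
    conn.commit()
    return cursor.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (timestamp DATETIME NOT NULL, message TEXT NOT NULL)")
conn.execute("CREATE INDEX idx_logs_timestamp ON logs(timestamp)")
conn.executemany(
    "INSERT INTO logs VALUES (?, ?)",
    [
        ("2025-01-15T00:00:00+00:00", "45 days old"),
        ("2025-02-20T00:00:00+00:00", "9 days old"),
    ],
)

fixed_now = datetime(2025, 3, 1, tzinfo=timezone.utc)  # fixed clock for a repeatable demo
deleted = purge_old_logs(conn, retention_days=30, now=fixed_now)
remaining = [row[0] for row in conn.execute("SELECT message FROM logs")]
```

Passing retention_days as an argument keeps the function independent of where the log_retention_days setting is stored.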

Error Path Audit Scope

Phase 1 (P0): SSO (auth/oidc/), OAuth (auth/oauth/), Indexing (sync/error_handler.py) - ~15 files
Phase 2 (P1): Repositories, Query operations - ~20 files
Phase 3 (P2): Remaining error handlers - ~15 files
Total: ~50 files, 100+ logging calls to update

Update Pattern:

logger.error("msg", extra={"correlation_id": get_correlation_id()})
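
To show what the pattern does at runtime, here is a small sketch: values passed through extra become attributes on the log record, which a formatter or handler can then read. The body of get_correlation_id, the logger name, and the format string are assumptions.

```python
import contextvars
import io
import logging

# Assumed backing store for get_correlation_id(); "-" marks "no request".
_correlation_id = contextvars.ContextVar("correlation_id", default="-")


def get_correlation_id() -> str:
    return _correlation_id.get()


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(
    logging.Formatter("%(levelname)s [%(correlation_id)s] %(message)s")
)

logger = logging.getLogger("cidx.audit_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

_correlation_id.set("3f2b6c1e-demo")  # normally set by the middleware
try:
    raise TimeoutError("token endpoint timed out")
except TimeoutError as exc:
    logger.error(
        "OAuth token exchange failed: %s",
        exc,
        extra={"correlation_id": get_correlation_id()},
    )

output = stream.getvalue()
```

With a format string like this one, a call that omits the extra argument fails at formatting time and the record is reported as a logging error instead of being written, which is one reason the audit has to reach every call site (or a logging filter has to supply the attribute).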

Risk Assessment and Mitigation

Technical Risks

Risk: SQLite database growth could impact disk space
Mitigation: Configurable retention with daily cleanup; default 30 days

Risk: High log volume impacting query performance
Mitigation: Database indexes on timestamp, level, correlation_id, source

Risk: Missing correlation IDs in existing error paths
Mitigation: Comprehensive ~50 file audit with systematic update pattern

Operational Risks

Risk: Breaking existing logging behavior
Mitigation: Multi-handler approach adds SQLite alongside existing logs; no removal

Dependencies and Prerequisites

Technical Dependencies

  • SQLite via Python's sqlite3 module (standard library - no new dependencies)
  • Python contextvars (standard library - Python 3.7+)
  • Existing admin dashboard infrastructure

Implementation Dependencies

  • Story 1 (Log Viewing) provides base infrastructure for Stories 2, 3, 4
  • Story 3 (Correlation ID) can proceed in parallel with the other stories once Story 1 is complete
  • Story 4 (Export) depends on Stories 1 and 2

Success Metrics and Validation

Functional Metrics

  • 100% of log entries accessible through all three interfaces (Web UI, REST, MCP)
  • Search returns results in <2 seconds for typical queries
  • All error paths (SSO, OAuth, indexing) generate correlation IDs

Performance Metrics

  • Log page load time <3 seconds for initial display
  • Filter/search response time <2 seconds
  • Export generation <10 seconds for 30 days of logs

Quality Metrics

  • Test coverage >90%
  • Zero critical security vulnerabilities
  • Documentation completeness 100%
  • Full API parity validation between REST, MCP, and Web UI

Epic Completion Criteria

Definition of Done

  • All four stories implemented and deployed
  • Complete test suite with >90% coverage
  • Full documentation with examples
  • Performance benchmarks met
  • Security audit passed (admin-only access verified)
  • Manual E2E testing completed for all interfaces

Acceptance Validation

  • Administrator can view logs in Web UI Logs tab
  • Administrator can search and filter logs via all interfaces
  • Administrator can trace errors via correlation IDs
  • Administrator can export logs to JSON/CSV
  • All SSO/OAuth/indexing errors have correlation IDs
  • REST API and MCP API have feature parity with Web UI

Epic Owner: CIDX Development Team
Stakeholders: CIDX Server Administrators, Operations Team
Success Measurement: Administrator troubleshooting time reduction; correlation ID coverage for all error paths
