Add caching for chatbot experiment data #2473
base: main
Conversation
- Proposes SQL-based caching tables for experiment and session statistics
- Three update strategies: scheduled refresh, incremental, and hybrid
- Includes data compression and archival strategies
- Provides detailed implementation plans, migration phases, and a testing approach
- Addresses performance, scalability, and maintainability concerns
- Add Option B: time-bucketed statistics with hourly/daily/weekly/monthly granularity
- Add Option C: hybrid approach combining buckets with denormalized totals
- Include a comprehensive comparison table between single-row and time-bucketed approaches
- Add bucket compression policies for automatic data lifecycle management
- Show how time buckets enable trend analysis and efficient incremental updates
- Provide a clear decision matrix for choosing between options
- Recommend starting with simple Option A, migrating to Option C when needed
- Add code examples for bucket creation, compression, and trend queries
Complete plan for Option B (time-bucketed) with a simplified approach:
- No denormalized totals tables (query by summing buckets)
- Static compression policy: 1 day hourly, 30 days daily, monthly after that (sketched below)
- Never delete data, only compress into larger buckets
- Two models: ExperimentStatisticsBucket, SessionStatisticsBucket
- Manager methods for efficient batch queries
- Celery tasks for updating and compressing buckets
- Management commands for manual control and monitoring
- Comprehensive testing strategy
- 6-phase migration plan with clear deliverables
- Admin interface for monitoring
- Rollback plan and success metrics
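A minimal sketch of the static compression policy described above, expressed as constants a Celery compression task could read. The constant and tuple names here are illustrative assumptions, not taken from the plan itself.

```python
from datetime import timedelta

# Static compression policy (illustrative): keep hourly buckets for 1 day and
# daily buckets for 30 days, then roll older data into monthly buckets.
# Data is never deleted, only compressed into larger buckets.
HOURLY_RETENTION = timedelta(days=1)
DAILY_RETENTION = timedelta(days=30)

COMPRESSION_STEPS = [
    ("hour", "day", HOURLY_RETENTION),   # hourly -> daily after 1 day
    ("day", "month", DAILY_RETENTION),   # daily -> monthly after 30 days
]
```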
Key improvements per feedback:
- Use SQL aggregation (TruncHour, GROUP BY) instead of Python loops in _update_experiment_buckets
- Fix new_participant_count to count participants by first session created_at in the bucket
- Remove the is_finalized field entirely; it is not needed for a static compression policy
- Simplify indexes and remove redundant ones (unique_together already creates an index)
- Clean up admin actions, tests, and management commands accordingly
Key improvements based on feedback:
- Manager methods now sum new_participant_count from buckets instead of querying sessions
- Update tasks use SQL aggregation keyed by truncated date (no Python loops)
- Participant counting uses the ParticipantData model for accuracy
- Compression tasks filter old buckets directly in SQL
- Use dateutil.relativedelta for date calculations
- Bulk update operations to minimize database round-trips
Store experiment versions directly in SessionStatisticsBucket using ArrayField:
- Add experiment_versions ArrayField to the SessionStatisticsBucket model
- Update _update_session_buckets to populate versions from CustomTaggedItem
- Update SessionStatisticsBucketManager methods to aggregate versions from buckets
- Update compression to combine versions from hourly buckets into daily buckets
- Update view integration to include experiment_versions in cached data
- Eliminates the separate query to CustomTaggedItem when displaying session stats

This optimization caches experiment version data alongside other statistics, further reducing database queries when loading session tables.
📝 Walkthrough

Two design and implementation plan documents have been added to guide the development of a statistics caching system for experiments and sessions. The first document outlines problem statements related to expensive queries, design goals, and three architectural options (Single-Row Aggregates, Time-Bucketed Data, and Hybrid) with schema proposals, update strategies, and graceful degradation logic. The second document provides a detailed, production-ready implementation plan focused on time-bucketed statistics with SQL-centric aggregation, automatic bucket compression, Celery-based tasks, and a phased rollout strategy. Both include Django model sketches, integration points with existing views, migration guidance, monitoring considerations, and testing strategies. No code changes or API alterations are present.

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~35–40 minutes
Pre-merge checks and finishing touches: ❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 3
🧹 Nitpick comments (10)
docs/experiment-statistics-cache-design.md (5)
522-545: Use proper heading levels instead of bold emphasis. Lines 522, 533, 546, 579, 1375, and 1390 use bold emphasis (`**text**`) where Markdown headings (`### text`) would be more appropriate for section organization. This improves document structure and navigation.

Example fix (line 522): `**1. Natural Incremental Updates**` → `### 1. Natural Incremental Updates`
1338-1338: Add language identifiers to all fenced code blocks. Lines 1338, 1486, and 1501 open fenced code blocks without a language after the backticks, which affects syntax highlighting and documentation rendering.

Examples:
- Line 1338: the bare fence should specify `text` or `python`
- Lines 1486-1497: the bare fence should specify `sql`
- Lines 1501-1508: the bare fence should specify `sql`

Also applies to: 1486-1486, 1501-1501
1010-1016: Implement fallbackcalculate_experiment_statistics_live()function.Line 1010 references
calculate_experiment_statistics_live(experiment)but it's only shown as a stub (passon line 1016). This function is critical for graceful degradation.Provide a complete implementation or clearly document:
- What queries it should run (likely the original subqueries from
ChatbotExperimentTableView)- The object interface it should return (must match cached statistics interface)
- Whether it should be async-safe for concurrent requests
Would you like me to provide a reference implementation of the fallback calculation function?
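A minimal sketch of what such a fallback might look like, assuming the experiment exposes a `sessions` reverse relation, each session links to a `chat` with `messages`, and human messages are identified by `ChatMessageType.HUMAN`; the `ExperimentStatistics` namedtuple, its field names, and the import path are hypothetical and only illustrate the shape the cached interface would need to match.

```python
from collections import namedtuple

from django.db.models import Count, Max, Q

from apps.chat.models import ChatMessageType  # assumed import path

# Hypothetical return type; field names must mirror the cached statistics interface.
ExperimentStatistics = namedtuple(
    "ExperimentStatistics",
    ["session_count", "participant_count", "human_message_count", "last_activity_at"],
)


def calculate_experiment_statistics_live(experiment):
    """Fallback: compute statistics directly from source tables when no cache exists."""
    aggregates = experiment.sessions.aggregate(
        session_count=Count("id", distinct=True),
        participant_count=Count("participant", distinct=True),
        human_message_count=Count(
            "chat__messages",
            filter=Q(chat__messages__message_type=ChatMessageType.HUMAN),
        ),
        last_activity_at=Max("chat__messages__created_at"),
    )
    return ExperimentStatistics(**aggregates)
```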
1437-1437: Fix hyphenation in text. Line 1437: "low risk, incremental delivery" should use a hyphen for compound adjectives before a noun: "low-risk, incremental delivery".
55-55: Replace weak intensifiers with stronger alternatives. Static analysis flags the use of "very" in several places (lines 55, 870, 902, 1415) as an over-used intensifier. Consider:
- Line 55: "Very simple" → "Simple" or "Straightforward"
- Line 870: "very old, completed sessions" → "old completed sessions" or "historic completed sessions"
- Line 902: "very old experiments" → "historic experiments" or "aged experiments"
- Line 1415: "very large data volumes" → "large data volumes" or "high data volumes"
Also applies to: 870-870, 902-902, 1415-1415
docs/time-bucketed-statistics-implementation-plan.md (5)
1110-1145: Use database-level sorting in view queries. Lines 1140-1143 sort experiments in Python after fetching, which only sorts the current page's result set. If users paginate, the sort order may vary between pages, and the sort is inefficient.

Preferred approach: use Django QuerySet annotation with `F()` expressions to sort at the database level:

```python
from django.db.models import F, Case, When

def get_table_data(self):
    queryset = super().get_table_data()

    # Fetch stats in one batch
    experiment_ids = list(queryset.values_list('id', flat=True))
    stats_map = ExperimentStatisticsBucket.objects.get_totals_for_experiments(experiment_ids)

    # Annotate with stats and preserve database-level sorting
    # (or use prefetch_related with a custom query)
    return queryset  # Let the database handle ordering via default queryset ordering
```

If stats must be fetched post-query, consider a `preserve_order_with_stats()` helper to maintain a consistent sort across pages.
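One concrete way to keep the ordering in the database, sketched under the assumption that `ExperimentStatisticsBucket` has an `experiment` foreign key and a `last_activity_at` field (this is the direction a later revision of the plan takes, per the commit notes below):

```python
from django.db.models import Max, OuterRef, Subquery

def get_table_data(self):
    queryset = super().get_table_data()

    # Latest cached activity per experiment, computed in a correlated subquery
    last_activity = (
        ExperimentStatisticsBucket.objects
        .filter(experiment=OuterRef('pk'))
        .values('experiment')
        .annotate(last=Max('last_activity_at'))
        .values('last')[:1]
    )

    # Annotate and order in SQL so pagination stays consistent across pages
    return queryset.annotate(
        cached_last_activity=Subquery(last_activity)
    ).order_by('-cached_last_activity')
```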
505-505: Remove unused imports. Line 505 imports `OuterRef` and `Subquery`, but they are not used in `_update_experiment_buckets()`. Remove them to clean up the code:

```diff
- from django.db.models import Count, Max, OuterRef, Subquery
+ from django.db.models import Count, Max
```
217-217: Use idiomatic Django pattern for model managers. Lines 217 and 391 use `add_to_class()` to attach managers to models:

```python
ExperimentStatisticsBucket.add_to_class('objects', ExperimentStatisticsBucketManager())
SessionStatisticsBucket.add_to_class('objects', SessionStatisticsBucketManager())
```

While this works, the idiomatic Django pattern is to define the manager on the model class directly:

```python
class ExperimentStatisticsBucket(BaseTeamModel):
    objects = ExperimentStatisticsBucketManager()
    ...
```

`add_to_class()` is typically used for dynamic additions in edge cases. For model definitions, class-level assignment is clearer and more maintainable. Check whether this pattern (`add_to_class()` for managers) is used elsewhere in the codebase; if not, prefer the standard approach.

Also applies to: 391-391
874-874: Clarify datetime import in compression function. Line 874 imports `datetime` inside `_compress_session_hourly_to_daily()`:

```python
def _compress_session_hourly_to_daily(session_id, date):
    ...
    from datetime import datetime
    ...
    day_start = timezone.make_aware(datetime.combine(date, datetime.min.time()))
```

While this works, importing at function scope is unconventional. Move it to the top-level imports alongside the other datetime imports for consistency:

```python
from datetime import datetime, timedelta
```
1-10: Provide a clear assumptions and prerequisites section. The document jumps into "Design Decisions" without explicitly stating prerequisites or assumptions. Add an "Assumptions & Prerequisites" section early (after the Overview) that covers:

- Database: PostgreSQL required (due to ArrayField)
- Models: assumes `ParticipantData`, `CustomTaggedItem`, `Chat`, and `ChatMessage` models exist with specific relationships
- Django version: minimum version required for the features used
- Celery: assumes Celery is already set up with periodic tasks
- Compatibility: list which other existing models/views this depends on

This helps implementers verify they're ready before starting.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- `docs/experiment-statistics-cache-design.md` (1 hunks)
- `docs/time-bucketed-statistics-implementation-plan.md` (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/experiment-statistics-cache-design.md
[style] ~55-~55: As an alternative to the over-used intensifier ‘very’, consider replacing this phrase.
Context: ...-----------| | Query Simplicity | ✅ Very simple (single row) |
(EN_WEAK_ADJECTIVE)
[style] ~870-~870: As an alternative to the over-used intensifier ‘very’, consider replacing this phrase.
Context: ...elete or archive session statistics for very old, completed sessions. *Implementation...
(EN_WEAK_ADJECTIVE)
[style] ~902-~902: As an alternative to the over-used intensifier ‘very’, consider replacing this phrase.
Context: ...regation Compression Approach: For very old experiments, aggregate statistics by ti...
(EN_WEAK_ADJECTIVE)
[style] ~1415-~1415: As an alternative to the over-used intensifier ‘very’, consider replacing this phrase.
Context: ...analysis / historical data - ✅ You have very large data volumes (millions of messages) - ✅...
(EN_WEAK_ADJECTIVE)
[grammar] ~1437-~1437: Use a hyphen to join words.
Context: ...ple, maintainable code initially - ✅ Low risk, incremental delivery - ✅ Clear mig...
(QB_NEW_EN_HYPHEN)
🪛 markdownlint-cli2 (0.18.1)
docs/experiment-statistics-cache-design.md
522-522: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
533-533: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
546-546: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
579-579: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
1338-1338: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
1375-1375: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
1390-1390: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
1486-1486: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
1501-1501: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (3)
docs/experiment-statistics-cache-design.md (1)
687-687: Verify PostgreSQL-specific dependencies. Line 698 imports `StringAgg` from `django.contrib.postgres`, which is PostgreSQL-specific. The document should clarify whether this codebase:
- Supports PostgreSQL exclusively, or
- Needs to support multiple databases (MySQL, SQLite for dev, etc.)
If multi-database support is needed, this aggregation function will need a conditional import or alternative implementation.
Also applies to: 698-698
docs/time-bucketed-statistics-implementation-plan.md (2)
225-225: Clarify PostgreSQL dependency for ArrayField. Line 225 imports `ArrayField` from `django.contrib.postgres`, and line 262 uses it in `SessionStatisticsBucket.experiment_versions`. This creates a hard PostgreSQL dependency.

Add a prominent note in the document (near the top or in an assumptions section):

Assumption: this implementation assumes PostgreSQL is the primary database. The `ArrayField` field type is PostgreSQL-specific. If multi-database support is required, consider using a `CharField` with comma-separated values or a separate `ExperimentVersionEntry` table instead.

Alternatively, make this conditional:

```python
if DATABASE_ENGINE == 'postgresql':
    experiment_versions = ArrayField(...)
else:
    experiment_versions = CharField(max_length=500)  # fallback
```

Also applies to: 262-267
533-540: Verify the ParticipantData model exists and is appropriate. Line 533 references a `ParticipantData` model:

```python
participant_stats = ParticipantData.objects.filter(
    experiment=experiment,
    created_at__gte=cutoff,
).annotate(
    hour=TruncHour('created_at')
).values('hour').annotate(
    participant_count=Count('id'),
)
```

This model is not shown in the codebase context. Verify:

- Does `ParticipantData` exist in `apps/participants/models.py`?
- Is it the right model to count unique participants per experiment per hour?
- Does it have a `created_at` field and a relationship to `experiment`?
- Is this the intended way to count participants (vs. counting distinct `ExperimentSession.participant`)?

The design doc (doc 1) uses `ExperimentSession.objects.values('participant').distinct().count()` for participant counting, which differs from this approach.
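For comparison, an hourly participant count derived from sessions rather than `ParticipantData` might look like the sketch below; it assumes `ExperimentSession` has `experiment`, `participant`, and `created_at` fields, as the design doc's counting query implies:

```python
from django.db.models import Count
from django.db.models.functions import TruncHour

# Distinct participants per hour, counted from sessions instead of ParticipantData
session_participant_stats = (
    ExperimentSession.objects
    .filter(experiment=experiment, created_at__gte=cutoff)
    .annotate(hour=TruncHour('created_at'))
    .values('hour')
    .annotate(participant_count=Count('participant', distinct=True))
)
```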
```python
        self.total_session_count = bucket_aggregates['total_sessions'] or 0
        self.total_participant_count = participant_count
        self.total_human_message_count = bucket_aggregates['total_messages'] or 0
        self.last_activity_at = bucket_aggregates['last_activity']
        self.oldest_bucket_start = bucket_aggregates['oldest_bucket']
        self.newest_bucket_end = bucket_aggregates['newest_bucket']
        self.save()


class SessionStatisticsTotals(BaseTeamModel):
    """
    Denormalized session totals for fast access.
    """

    session = models.OneToOneField(
        ExperimentSession,
        on_delete=models.CASCADE,
        related_name='statistics_totals',
        primary_key=True
    )

    total_human_message_count = models.IntegerField(default=0)
    last_activity_at = models.DateTimeField(null=True, blank=True)
    experiment_versions = models.CharField(max_length=500, blank=True)

    last_updated_at = models.DateTimeField(auto_now=True)
    is_complete = models.BooleanField(default=False)

    class Meta:
        db_table = 'experiments_session_statistics_totals'

    def refresh_from_buckets(self):
        """Recalculate totals from bucket data."""
        from django.db.models import Sum, Max

        bucket_aggregates = self.session.statistics_buckets.aggregate(
            total_messages=Sum('human_message_count'),
            last_activity=Max('last_activity_at'),
        )

        # Get experiment versions (stored separately)
        from django.contrib.contenttypes.models import ContentType
        from apps.annotations.models import CustomTaggedItem
        from apps.chat.models import Chat, ChatMessage

        message_ct = ContentType.objects.get_for_model(ChatMessage)
        versions = CustomTaggedItem.objects.filter(
            content_type=message_ct,
            object_id__in=ChatMessage.objects.filter(
                chat=self.session.chat
            ).values('id'),
            tag__category=Chat.MetadataKeys.EXPERIMENT_VERSION,
        ).values_list('tag__name', flat=True).distinct().order_by('tag__name')
```
🛠️ Refactor suggestion | 🟠 Major
Simplify SessionStatisticsTotals.refresh_from_buckets() or document external dependencies.
The method at lines 444-471 imports and queries multiple internal models (ContentType, CustomTaggedItem, Chat). While functional, this creates tight coupling and maintenance burden. Consider:
- Option A: Move this logic to a manager method or utility function for better encapsulation
- Option B: Document the external dependencies clearly and ensure they're stable APIs
- Option C: Cache experiment versions in the bucket itself (similar to the time-bucketed doc's approach)
The time-bucketed implementation document (doc 2) handles this more elegantly by storing versions directly in buckets.
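For Option A, a rough sketch of pulling the tag lookup into a manager method; the manager class and the `versions_for_session` method name are hypothetical, and the query mirrors the one in `refresh_from_buckets()` above:

```python
from django.db import models


class CustomTaggedItemManager(models.Manager):
    def versions_for_session(self, session):
        """Return distinct experiment-version tag names for a session's messages."""
        from django.contrib.contenttypes.models import ContentType
        from apps.chat.models import Chat, ChatMessage

        message_ct = ContentType.objects.get_for_model(ChatMessage)
        return (
            self.filter(
                content_type=message_ct,
                object_id__in=ChatMessage.objects.filter(chat=session.chat).values('id'),
                tag__category=Chat.MetadataKeys.EXPERIMENT_VERSION,
            )
            .values_list('tag__name', flat=True)
            .distinct()
            .order_by('tag__name')
        )
```

`refresh_from_buckets()` would then simply call `CustomTaggedItem.objects.versions_for_session(self.session)`, keeping the content-type and tag details out of the totals model.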
```python
def _update_session_buckets(session, cutoff):
    """Update buckets for a single session using SQL aggregation."""
    from django.db.models.functions import TruncHour
    from django.contrib.contenttypes.models import ContentType
    from apps.annotations.models import CustomTaggedItem
    from apps.chat.models import Chat

    # Aggregate message statistics keyed by hour in single SQL query
    message_stats = ChatMessage.objects.filter(
        chat=session.chat,
        message_type=ChatMessageType.HUMAN,
        created_at__gte=cutoff,
    ).annotate(
        hour=TruncHour('created_at')
    ).values('hour').annotate(
        message_count=Count('id'),
        last_activity=Max('created_at'),
    )

    # Get experiment versions per hour
    message_ct = ContentType.objects.get_for_model(ChatMessage)

    # Get all messages with their versions, grouped by hour
    messages_with_versions = ChatMessage.objects.filter(
        chat=session.chat,
        message_type=ChatMessageType.HUMAN,
        created_at__gte=cutoff,
    ).annotate(
        hour=TruncHour('created_at')
    ).values('id', 'hour')

    # Build a mapping of hour -> message IDs
    hour_message_ids = {}
    for msg in messages_with_versions:
        hour = msg['hour']
        if hour not in hour_message_ids:
            hour_message_ids[hour] = []
        hour_message_ids[hour].append(msg['id'])

    # Get versions for all messages
    hour_versions = {}
    for hour, message_ids in hour_message_ids.items():
        versions = CustomTaggedItem.objects.filter(
            content_type=message_ct,
            object_id__in=message_ids,
            tag__category=Chat.MetadataKeys.EXPERIMENT_VERSION,
        ).values_list('tag__name', flat=True).distinct()
        hour_versions[hour] = sorted(set(versions))

    # Bulk create/update buckets
    buckets_to_update = []
    for stat in message_stats:
        hour_start = stat['hour']

        bucket, created = SessionStatisticsBucket.get_or_create_bucket(
            session,
            hour_start,
            SessionStatisticsBucket.BucketSize.HOUR
        )

        bucket.human_message_count = stat['message_count']
        bucket.last_activity_at = stat['last_activity']
        bucket.experiment_versions = hour_versions.get(hour_start, [])
        buckets_to_update.append(bucket)

    # Bulk update all buckets
    if buckets_to_update:
        SessionStatisticsBucket.objects.bulk_update(
            buckets_to_update,
            ['human_message_count', 'last_activity_at', 'experiment_versions']
        )

    logger.info(
        f"Updated {len(buckets_to_update)} hourly buckets for session {session.id}"
    )
```
🛠️ Refactor suggestion | 🟠 Major
Optimize session version aggregation to avoid N+1 queries.
Lines 647-654 have a potential N+1 query issue:
```python
for hour, message_ids in hour_message_ids.items():
    versions = CustomTaggedItem.objects.filter(
        content_type=message_ct,
        object_id__in=message_ids,
        tag__category=Chat.MetadataKeys.EXPERIMENT_VERSION,
    ).values_list('tag__name', flat=True).distinct()
```

This executes one query per hour. For a session with messages spread across 24 hours, this is 24 queries. Consider optimizing:
Option 1: Fetch all versions at once and group in Python:
```python
from collections import defaultdict

all_versions = CustomTaggedItem.objects.filter(
    content_type=message_ct,
    object_id__in=[msg_id for ids in hour_message_ids.values() for msg_id in ids],
    tag__category=Chat.MetadataKeys.EXPERIMENT_VERSION,
).values_list('object_id', 'tag__name')

# Map each message ID back to its hour (from the earlier messages_with_versions query)
hour_message_ids_lookup = {msg['id']: msg['hour'] for msg in messages_with_versions}

# Build dict: {hour: {versions}}
hour_versions = defaultdict(set)
for msg_id, version in all_versions:
    hour = hour_message_ids_lookup[msg_id]
    hour_versions[hour].add(version)
```

This reduces N queries to 2 (one for messages, one for versions).
🤖 Prompt for AI Agents
In docs/time-bucketed-statistics-implementation-plan.md around lines 607 to 681,
the current loop queries CustomTaggedItem once per hour (N+1) which causes many
DB round-trips; instead fetch all version tags in a single query for all message
IDs, retrieve pairs of (object_id, tag_name), build a lookup mapping from
message_id to hour (from the earlier messages_with_versions results) and then
group versions per hour in Python to populate hour_versions, finally proceed to
assign sorted unique versions to each bucket — this reduces the queries to one
(or two including initial messages fetch) and avoids the per-hour queries.
Add array_concat custom aggregate function to improve query efficiency:
- Create a migration for the array_concat PostgreSQL aggregate function
- Add an ArrayConcat Django aggregate class wrapping the custom function
- Update SessionStatisticsBucketManager.get_totals_for_sessions() to use a single query
- Concatenate version arrays in SQL instead of a separate query plus a Python loop
- Deduplicate and sort versions in Python (lightweight operation)
- Update the migration plan to include aggregate function creation
- Update the performance optimizations documentation

Benefits:
- Reduces from 2 queries to 1 for fetching session statistics
- Offloads array concatenation to PostgreSQL
- Maintains a clean separation: SQL for aggregation, Python for deduplication/sorting
Fixes based on PR review feedback:
1. Remove unused imports (OuterRef, Subquery) from _update_experiment_buckets; these imports were not being used in the function.
2. Fix the N+1 query problem in version aggregation: fetch all versions for all messages at once instead of looping per hour, reducing ~24 queries (one per hour) to 2 queries total, and group versions by hour in Python after fetching.
3. Fix database-level sorting for experiments: use a QuerySet annotation with Subquery to get last activity from buckets, order by cached_last_activity at the database level, and remove the Python-based sorting that broke pagination.
4. Improve Markdown formatting: convert bold text (**Benefits**, **Tasks**, etc.) to proper headings and use #### subheadings for better document structure and readability.

These changes optimize query performance and improve documentation quality.
Store experiment versions directly on ExperimentSession instead of in time buckets.

Design improvements:
- Add cached_experiment_versions ArrayField to the ExperimentSession model
- Remove experiment_versions from the SessionStatisticsBucket model
- Versions don't need temporal granularity, only an aggregate per session
- Eliminates high redundancy (most buckets would hold the same versions)

Implementation changes:
- Remove the ArrayConcat aggregate class (no longer needed)
- Remove the custom PostgreSQL array_concat aggregate function migration
- Update SessionStatisticsBucketManager to read from session.cached_experiment_versions
- Update _update_session_buckets to update the session field directly
- Update compression to not handle version aggregation
- Simplify get_totals_for_sessions to query the session field

Performance benefits:
- Reduced storage (one array per session vs. one array per bucket)
- Simpler queries (direct field access vs. aggregation)
- Easier to update incrementally (add new versions as they appear)
- No need for a custom aggregate function

Code reduction: -125 lines, +67 lines (net -58 lines)
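A rough sketch of this design, assuming the existing `ExperimentSession` model and `BaseTeamModel` stay as-is; the `add_cached_versions` helper is hypothetical and only illustrates how incremental updates could be applied:

```python
from django.contrib.postgres.fields import ArrayField
from django.db import models


class ExperimentSession(BaseTeamModel):
    # ... existing fields ...
    cached_experiment_versions = ArrayField(
        models.CharField(max_length=255),
        default=list,
        blank=True,
        help_text="Distinct experiment versions seen in this session's messages.",
    )

    def add_cached_versions(self, new_versions):
        """Merge newly observed versions into the cached list (hypothetical helper)."""
        merged = sorted(set(self.cached_experiment_versions) | set(new_versions))
        if merged != self.cached_experiment_versions:
            self.cached_experiment_versions = merged
            self.save(update_fields=["cached_experiment_versions"])
```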
Technical Description
Demo
Docs and Changelog