
mvs,domain,server,session,metrics: add Materialized View Service (MVS) framework and runtime integration #66242

Open
solotzg wants to merge 56 commits into pingcap:feature/release-8.5-materialized-view from
solotzg:mv

@solotzg solotzg commented Feb 13, 2026

What problem does this PR solve?

Issue Number: ref #18023

Problem Summary:

Materialized View Service (MVS) lacked a complete end-to-end server implementation. The scheduler, executor, ownership assignment, runtime operations, observability, and metadata bootstrap needed to be implemented and wired across domain/session/server.

What changed and how does it work?

  • Add full MVS module under pkg/mvs.
  • Implement MV service runtime loop for periodic fetch, DDL-triggered refresh, due-task dispatch, and next-wakeup scheduling.
  • Implement task executor with bounded concurrency, timeout handling, queueing, close/drain behavior, and optional backpressure.
  • Implement MV refresh and MVLog purge handlers, including SQL execution helpers and purge history persistence.
  • Implement server membership maintenance and consistent-hash ownership.
  • Use byte-key hashing in consistent hash and support int64 ownership keys via fixed-width binary bytes.
  • Implement metadata fetch and in-memory rebuild from mysql.tidb_mview_refresh_info and mysql.tidb_mlog_purge_info.
  • Wire MVS into domain/session startup and etcd-based metadata change fanout/watch.
  • Add runtime settings HTTP endpoint GET/POST /mvservice/settings.
  • Add MVS metrics/reporting, including executor counters/gauges, task/fetch duration histograms, and run-event counters.
  • Update bootstrap DDL for MVS metadata/history tables.
  • Add/adjust unit and intest coverage for scheduler, executor, hash ownership, handlers, and settings APIs.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

None

MVS Design

This document describes the current Materialized View Service (MVS) implementation in TiDB.
It is generated from the latest code under:

  • pkg/mvs/*
  • pkg/domain/domain.go
  • pkg/server/handler/mvhandler/mv_service_handler.go
  • pkg/session/bootstrap.go

1. Scope and Responsibilities

MVS is an in-memory scheduler/executor for two background task types:

  • Materialized view refresh (mv_refresh)
  • Materialized view log purge (mvlog_purge)

Core responsibilities:

  • Load task metadata from system tables
  • Assign task ownership by consistent hashing across TiDB nodes
  • Dispatch due tasks to a bounded executor
  • Reschedule / retry / remove tasks based on execution result
  • React to metadata change notifications and periodic refresh windows
  • Report runtime metrics and events

2. Architecture Overview

Main modules:

  • pkg/mvs/service.go: scheduler state, task queues, rebuild/reschedule logic
  • pkg/mvs/service_runtime.go: main run loop, timers, wake-up behavior
  • pkg/mvs/task_executor.go: worker pool, queue, timeout/backpressure behavior
  • pkg/mvs/server_maintainer.go: TiDB node membership and ownership mapping
  • pkg/mvs/consistenthash.go: hash ring implementation
  • pkg/mvs/impl.go: real task handlers (RefreshMV, PurgeMVLog), SQL helpers, registration
  • pkg/mvs/metrics_reporter.go: metric flushing + event/duration observers
  • pkg/mvs/service_config.go: constructor config + runtime settings APIs
  • pkg/mvs/task_backpressure.go: CPU/memory backpressure controller
  • pkg/mvs/utils.go: Notifier + generic PriorityQueue

3. Integration and Lifecycle

3.1 Domain wiring

Domain.Init() registers MVS before the DDL notifier starts:

  • pkg/domain/domain.go: mvs.RegisterMVS(do.ctx, do.ddlNotifier.RegisterHandler, do.sysSessionPool, do.notifyMVSMetadataChange)

RegisterMVS(...) (pkg/mvs/impl.go) does:

  1. Build MVService with default config + default backpressure thresholds.
  2. Call NotifyDDLChange() once to force an initial metadata refresh.
  3. Register DDL callback (notifier.MVServiceHandlerID), currently handling:
    • meta.ActionCreateMaterializedViewLog
  4. Invoke onDDLHandled() so domain can fan out notification to other nodes via etcd.

3.2 Service start

The session bootstrap flow starts MVS through the domain:

  • pkg/session/session.go: dom.StartMVService()

StartMVService() launches two goroutines:

  • mvService.Run
  • watchMVSMetaChange

3.3 Cross-node notification

Domain uses etcd key /tidb/mvs/ddl:

  • writer: notifyMVSMetadataChange()
  • watcher: watchMVSMetaChange()

When the watcher receives an event, it calls mvService.NotifyDDLChange().

4. Scheduler Data Model

MVService maintains two independent in-memory queues:

  • Refresh queue: pending map[int64]mvItem + PriorityQueue[*mv]
  • Purge queue: pending map[int64]mvLogItem + PriorityQueue[*mvLog]

Task structs:

  • mv: ID, nextRefresh, refreshInterval, orderTs, retryCount
  • mvLog: ID, baseTableID, nextPurge, purgeInterval, orderTs, retryCount

Scheduling key:

  • orderTs (Unix ms), smaller means earlier execution
  • orderTs == maxNextScheduleTs marks a task as already dispatched and currently running

5. Run Loop Semantics

Main loop (MVService.Run) processes:

  • scheduler timer
  • metrics timer
  • notifier wake-up
  • context cancellation

Per iteration:

  1. Periodically refresh TiDB node list (serverRefreshInterval) via sch.refresh().
  2. Decide metadata fetch trigger:
    • periodic (fetchInterval)
    • DDL/etcd signal (ddlDirty)
  3. Fetch and rebuild task state (fetchAllMVMeta).
  4. Pop due tasks (fetchExecTasks).
  5. Submit purge/refresh tasks to TaskExecutor.
  6. Compute the next wake time as min(nextFetchTime, nextDueTime).

If a metadata fetch fails, lastRefresh is still updated, so the failed fetch is not retried immediately and the loop avoids a tight retry cycle.

6. Ownership and Consistent Hashing

6.1 Membership model

ServerConsistentHash keeps:

  • current TiDB server map
  • local server ID
  • ConsistentHash ring

It supports:

  • init with retry backoff (sch.init())
  • full membership refresh (sch.refresh())
  • ownership check (Available)
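
As an illustration of the ownership check, here is a minimal consistent-hash ring. It is a sketch under assumptions, not the ring in pkg/mvs/consistenthash.go: the CRC32 hash, the virtual-node count, and the names ring, newRing, getNode, and available are all chosen for the example.

```go
package main

import (
	"fmt"
	"hash/crc32"
	"sort"
	"strconv"
)

// ring maps virtual-node hashes to server IDs (hypothetical layout).
type ring struct {
	hashes []uint32
	owner  map[uint32]string
}

// newRing places `replicas` virtual nodes per server on the ring.
func newRing(nodes []string, replicas int) *ring {
	r := &ring{owner: make(map[uint32]string)}
	for _, n := range nodes {
		for i := 0; i < replicas; i++ {
			h := crc32.ChecksumIEEE([]byte(n + "#" + strconv.Itoa(i)))
			if _, dup := r.owner[h]; dup {
				continue // keep the first owner on a hash collision
			}
			r.owner[h] = n
			r.hashes = append(r.hashes, h)
		}
	}
	sort.Slice(r.hashes, func(i, j int) bool { return r.hashes[i] < r.hashes[j] })
	return r
}

// getNode returns the first virtual node clockwise from the key's hash.
func (r *ring) getNode(key []byte) string {
	if len(r.hashes) == 0 {
		return ""
	}
	h := crc32.ChecksumIEEE(key)
	i := sort.Search(len(r.hashes), func(i int) bool { return r.hashes[i] >= h })
	if i == len(r.hashes) {
		i = 0 // wrap around the ring
	}
	return r.owner[r.hashes[i]]
}

// available reports whether the local server owns the key.
func available(r *ring, self string, key []byte) bool {
	return r.getNode(key) == self
}

func main() {
	r := newRing([]string{"tidb-0", "tidb-1", "tidb-2"}, 16)
	fmt.Println(r.getNode([]byte("mv-42")), available(r, "tidb-0", []byte("mv-42")))
}
```

Because every node builds the same ring from the same membership list, each key has exactly one owner without any coordination between nodes.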

6.2 Key handling

Available(key any) accepts:

  • string: hashed as raw bytes of the string
  • int64: hashed as fixed-width 8-byte binary (big-endian)

ConsistentHash.GetNode now takes []byte directly.
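
The key conversion can be sketched with the standard encoding/binary package. The helper name keyBytes and its error for unsupported types are assumptions for the example; only the string and int64 encodings come from the description above.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// keyBytes converts an ownership key into the bytes that get hashed.
func keyBytes(key any) ([]byte, error) {
	switch k := key.(type) {
	case string:
		return []byte(k), nil // raw bytes of the string
	case int64:
		var buf [8]byte
		binary.BigEndian.PutUint64(buf[:], uint64(k)) // fixed-width, big-endian
		return buf[:], nil
	default:
		return nil, fmt.Errorf("unsupported key type %T", key)
	}
}

func main() {
	b, _ := keyBytes(int64(1))
	fmt.Println(b)
}
```

A fixed-width binary encoding avoids formatting the ID as a decimal string on every ownership check, and it keeps the hash input independent of string formatting choices.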

7. Task Dispatch and Reschedule Rules

7.1 Due task collection

fetchExecTasks(now) scans each priority queue head-first:

  • stop when head is running (orderTs == maxNextScheduleTs)
  • stop when head is not due (orderTs > now)
  • for due tasks: mark orderTs = maxNextScheduleTs and dispatch

7.2 Refresh task

Execution path:

  • submit as mv-refresh/<mvID>
  • call mh.RefreshMV(...)

Result handling:

  • error: retryCount++, exponential backoff reschedule
  • nextRefresh.IsZero(): task removed
  • success with next time: retry count reset and rescheduled to next time

7.3 Purge task

Execution path:

  • submit as mvlog-purge/<mvLogID>
  • call mh.PurgeMVLog(...)

Result handling mirrors refresh logic.

Each completion path triggers notifier.Wake() to accelerate next scheduling cycle.
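
A wake-up primitive of this kind is commonly built on a one-slot buffered channel, so that repeated Wake() calls coalesce into a single pending signal and never block the caller. The sketch below shows that pattern; it is an assumption about the shape of the Notifier in pkg/mvs/utils.go, not a copy of it.

```go
package main

import "fmt"

// notifier coalesces wake-ups: at most one signal is pending at a time.
type notifier struct {
	ch chan struct{}
}

func newNotifier() *notifier {
	return &notifier{ch: make(chan struct{}, 1)}
}

// Wake never blocks; if a signal is already pending, the call is a no-op.
func (n *notifier) Wake() {
	select {
	case n.ch <- struct{}{}:
	default:
	}
}

// C is the channel the run loop selects on alongside its timers.
func (n *notifier) C() <-chan struct{} {
	return n.ch
}

func main() {
	n := newNotifier()
	n.Wake()
	n.Wake() // coalesced with the first call
	<-n.C()
	fmt.Println("woken once; pending signals:", len(n.ch))
}
```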

8. Task Executor Semantics

TaskExecutor includes:

  • ring-buffer queue
  • dynamic worker concurrency
  • timeout accounting
  • optional backpressure before dequeue
  • graceful close with task draining semantics

8.1 Timeout behavior

If task timeout is configured and reached:

  • worker slot is released immediately
  • timeout counters/gauges are updated
  • original task continues in background
  • final task completion/failure is recorded when background goroutine returns

Timeout does not cancel SQL execution directly.

8.2 Backpressure behavior

Backpressure is checked before a worker takes a task from the queue.
If blocked, the worker sleeps for the reported delay and then retries.

Default controller: CPUMemBackpressureController.

9. Metadata Fetch and Rebuild

Metadata fetch entry:

  • fetchAllTiDBMLogPurge() from mysql.tidb_mlog_purge_info
  • fetchAllTiDBMViews() from mysql.tidb_mview_refresh_info

Rebuild behavior:

  • existing task: update mutable fields
  • changed next time + not running: adjust orderTs and heap position
  • running task: defer heap adjustment until reschedule/remove
  • missing in latest metadata: remove from queue/map

Ownership filtering is applied after fetch via sch.Available(id).
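
The rebuild rules can be sketched over a plain map. This is a simplified illustration: the item type and rebuild function are hypothetical, the heap is omitted, and maxNextScheduleTs is assumed to be the largest int64.

```go
package main

import (
	"fmt"
	"math"
)

const maxNextScheduleTs = int64(math.MaxInt64) // assumed "running" sentinel

type item struct {
	nextTs  int64 // next run time from the system table, Unix ms
	orderTs int64 // scheduling key; maxNextScheduleTs while running
}

// rebuild merges freshly fetched metadata (ID -> next run time) into pending.
// owns reports whether the local node owns the ID.
func rebuild(pending map[int64]*item, latest map[int64]int64, owns func(int64) bool) {
	for id, nextTs := range latest {
		if !owns(id) {
			continue // another node schedules this task
		}
		it, ok := pending[id]
		if !ok {
			pending[id] = &item{nextTs: nextTs, orderTs: nextTs}
			continue
		}
		changed := it.nextTs != nextTs
		it.nextTs = nextTs
		if changed && it.orderTs != maxNextScheduleTs {
			it.orderTs = nextTs // a real queue would also fix the heap position
		}
		// Running tasks keep their sentinel until they are rescheduled or removed.
	}
	for id := range pending {
		if _, ok := latest[id]; !ok || !owns(id) {
			delete(pending, id) // no longer in metadata, or no longer owned
		}
	}
}

func main() {
	pending := map[int64]*item{
		1: {nextTs: 100, orderTs: 100},
		2: {nextTs: 100, orderTs: maxNextScheduleTs},
	}
	rebuild(pending, map[int64]int64{1: 500, 2: 500, 4: 700}, func(int64) bool { return true })
	fmt.Println(pending[1].orderTs, pending[2].orderTs == maxNextScheduleTs, len(pending))
}
```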

10. Handler SQL Semantics

10.1 RefreshMV

serverHelper.RefreshMV(...):

  1. borrow sys session
  2. set internal source type context
  3. clear SQL mode temporarily
  4. resolve schema/table by mvID
  5. execute REFRESH MATERIALIZED VIEW ... WITH SYNC MODE FAST
  6. read NEXT_TIME from mysql.tidb_mview_refresh_info

If the metadata no longer exists, RefreshMV returns a zero next time, which tells the scheduler to remove the task.

10.2 PurgeMVLog

PurgeMVLog(ctx, sctx, mvLogID, autoPurge):

  1. resolve mlog/base-table metadata and validate consistency
  2. parse purge schedule (PurgeStartWith, PurgeNext)
  3. compute purge upper bound from min LAST_SUCCESS_READ_TSO
  4. run pessimistic internal transaction:
    • lock purge row (FOR UPDATE)
    • evaluate whether this round should run
    • delete eligible mlog rows
    • append purge history
    • update purge info and next schedule

nextPurge.IsZero() means no further schedule is needed.

11. Runtime Settings API

HTTP endpoint:

  • GET/POST /mvservice/settings
  • handler: pkg/server/handler/mvhandler/mv_service_handler.go

Supported runtime settings:

  • task concurrency / timeout
  • backpressure enable + cpu/mem threshold + delay
  • retry base/max delay

Handler behavior:

  • parse only provided fields
  • preserve unspecified fields
  • validate, then apply through MVService setters
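
One common way to get "parse only provided fields, preserve the rest" in Go is to decode into a struct of pointer fields, where nil means "not provided". The sketch below shows that pattern with encoding/json. The JSON encoding, the field names, and the settings, patch, and applyPatch identifiers are all hypothetical; the actual request format of /mvservice/settings is not specified in this description.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// settings is a reduced, hypothetical view of the runtime settings.
type settings struct {
	Concurrency         int
	TimeoutSec          int
	BackpressureEnabled bool
}

// patch uses pointers so that an absent field stays nil after decoding.
type patch struct {
	Concurrency         *int  `json:"task_concurrency"`
	TimeoutSec          *int  `json:"task_timeout_sec"`
	BackpressureEnabled *bool `json:"backpressure_enabled"`
}

// applyPatch validates the provided fields and returns the updated settings.
// On any error the current settings are returned unchanged.
func applyPatch(cur settings, body []byte) (settings, error) {
	var p patch
	if err := json.Unmarshal(body, &p); err != nil {
		return cur, err
	}
	next := cur
	if p.Concurrency != nil {
		if *p.Concurrency <= 0 {
			return cur, errors.New("task_concurrency must be positive")
		}
		next.Concurrency = *p.Concurrency
	}
	if p.TimeoutSec != nil {
		if *p.TimeoutSec < 0 {
			return cur, errors.New("task_timeout_sec must not be negative")
		}
		next.TimeoutSec = *p.TimeoutSec
	}
	if p.BackpressureEnabled != nil {
		next.BackpressureEnabled = *p.BackpressureEnabled
	}
	return next, nil
}

func main() {
	cur := settings{Concurrency: 4, TimeoutSec: 60, BackpressureEnabled: true}
	next, err := applyPatch(cur, []byte(`{"task_concurrency": 8}`))
	fmt.Println(next, err)
}
```

Validating every provided field before applying any of them keeps a rejected request from leaving the service in a half-updated state.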

12. Metrics and Events

Metrics are defined in pkg/metrics/mv.go and flushed by serverHelper.reportMetrics.

Covered dimensions:

  • executor counters: submitted/completed/failed/timeout/rejected
  • executor gauges: running/waiting/timed_out_running
  • service gauges: total/running counts for refresh and purge tasks
  • duration histograms:
    • task durations (refresh/purge, success/failure)
    • metadata fetch durations (mlog/mview, success/failure)
  • run-loop event counters (MVServiceRunEventCounterVec)

13. Bootstrap Metadata Tables

Bootstrap creates/maintains:

  • mysql.tidb_mview_refresh_info
  • mysql.tidb_mlog_purge_info
  • mysql.tidb_mview_refresh_hist
  • mysql.tidb_mlog_purge_hist

(see pkg/session/bootstrap.go)

14. Testing Model

  • Time-dependent logic is abstracted via time_proxy wrappers.
  • intest build uses mock time module (time_proxy_intest.go) for deterministic timer tests.
  • pkg/mvs/*_test.go covers:
    • executor behavior (concurrency/timeout/backpressure/close)
    • scheduler rebuild and dispatch behavior
    • consistent hash and ownership behavior
    • refresh/purge handler SQL flow
    • runtime settings parsing/apply paths

15. Current Constraints / TODO

  • DDL-triggered refresh currently handles ActionCreateMaterializedViewLog; other MV-related metadata events are TODO.
  • Timeout only releases worker slots; it does not actively interrupt underlying SQL.

Signed-off-by: Zhigao TONG <tongzhigao@pingcap.com>
@solotzg solotzg marked this pull request as draft February 13, 2026 04:03
@ti-chi-bot ti-chi-bot bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 13, 2026

tiprow bot commented Feb 13, 2026

@solotzg: PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test.

Details

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@solotzg solotzg marked this pull request as ready for review February 13, 2026 04:52
@ti-chi-bot ti-chi-bot bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 13, 2026

codecov bot commented Feb 13, 2026

Codecov Report

❌ Patch coverage is 69.11636% with 706 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (feature/release-8.5-materialized-view@a76fec7). Learn more about missing BASE report.

Additional details and impacted files
@@                            Coverage Diff                             @@
##             feature/release-8.5-materialized-view     #66242   +/-   ##
==========================================================================
  Coverage                                         ?   57.6939%           
==========================================================================
  Files                                            ?       1801           
  Lines                                            ?     642388           
  Branches                                         ?          0           
==========================================================================
  Hits                                             ?     370619           
  Misses                                           ?     246557           
  Partials                                         ?      25212           
Flag Coverage Δ
integration 38.1238% <24.5844%> (?)
unit 72.5895% <68.6343%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 52.9278% <0.0000%> (?)
parser ∅ <0.0000%> (?)
br 63.3895% <0.0000%> (?)


solotzg commented Feb 13, 2026

/test mysql-test


tiprow bot commented Feb 13, 2026

@solotzg: PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test.

Details

In response to this:

/test mysql-test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot ti-chi-bot bot added the sig/planner SIG: Planner label Feb 18, 2026
ti-chi-bot bot commented Feb 18, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign d3hunter, time-and-fate, yudongusa for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


Labels

release-note-none Denotes a PR that doesn't merit a release note. sig/planner SIG: Planner size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
