Skip to content

Simplify main agda module of certifier#7574

Open
basetunnel wants to merge 6 commits intomasterfrom
basetunnel/cert-interface
Open

Simplify main agda module of certifier#7574
basetunnel wants to merge 6 commits intomasterfrom
basetunnel/cert-interface

Conversation

@basetunnel
Copy link
Collaborator

@basetunnel basetunnel commented Feb 6, 2026

This simplifies the setup of VerifiedCompilation.lagda and will help with future certifier changes.

  • With RelationOf and certify functions it is now clear how passes are mapped to their corresponding translation relations and decision procedures/checkers
  • Renamed MatchOrCE to DecidableCE to be consistent with naming in standard library
  • Introduced Checkable, Certifiable synonyms
  • It adds the Trace A type, which represents the compiler trace without the duplicate terms (it converts the current dump with duplicates by using the output of previous passes as input of the next, see toTrace). We should probably dump this new Trace structure from the Haskell side.
  • Add VerifiedCompilation.NotImplemented, a trivial translation relation which can be used as placeholder for unfinished translation relations
  • Add unknown : SimplifierTag to represent passes that we don't know of in the certifier

@zliu41
Copy link
Member

zliu41 commented Feb 6, 2026

It adds the Trace A type, which represents the compiler trace without the duplicate terms (it now uses the output of previous passes as input of the next, see toTrace)

That's less flexible, since if we add a new compiler pass, we must update certifier in the same PR. How about allowing each AST in Trace to optionally provide the post-term?

@zliu41
Copy link
Member

zliu41 commented Feb 8, 2026

I added hints in #7575, which you should incorporate into your approach. You may want to wait till my PR is merged.

@basetunnel
Copy link
Collaborator Author

basetunnel commented Feb 9, 2026

That's less flexible

That's intentional because the existing structure allows certain problems, like the compiler forgetting to dump a pass, dumping passes incorrectly (e.g. (PassName, t, t) instead of (PassName, t, pass t), dumping the order of passes incorrectly etc. The certifier wouldn't catch that, because there is nothing that enforces a relation between output of one pass and input of the next.

We're basically proving $R_1(t_1, t_2) \wedge R_2(t_3, t_4) \wedge \dots \wedge R_n(t_{2n - 1},t_{2n})$ for translation relations $R_i$, while we could prove the stronger $(R_1 \circ \dots \circ R_n)(t_1, t_n)$, which composes relations to state a theorem about the input and output of the entire pipeline.

We could alternatively do equality checking on outputs/inputs in the current trace, but it would slow down the certifier unnecessarily. The dumping mechanism is a trusted component, so I think they should be as correct-by-construction as possible.

since if we add a new compiler pass, we must update certifier in the same PR.

What about an additional constructor unknown : SimplifierTag? Any new pass could just dump (unknown, t, pass t), for and the certifier then defaults to the NotImplemented translation relation.

@zliu41
Copy link
Member

zliu41 commented Feb 11, 2026

What about an additional constructor unknown : SimplifierTag?

Sounds reasonable.

@basetunnel basetunnel force-pushed the basetunnel/cert-interface branch 3 times, most recently from 90cc934 to fea274c Compare February 27, 2026 13:33
@basetunnel
Copy link
Collaborator Author

@zliu41 @ana-pantilie I've rebased it on master, and incorporated hints. Also added unknown : SimplifierTag to not break the certifier when adding new passes.

This also speeds up the certifier-bench up to 30% on my machine, possibly due to ignoring the duplicate ASTs in the dump:

On master:

benchmarking N Queens
time                 134.2 ms   (111.1 ms .. 152.9 ms)
                     0.960 R²   (0.857 R² .. 0.996 R²)
mean                 121.4 ms   (108.1 ms .. 135.8 ms)
std dev              20.09 ms   (13.75 ms .. 29.98 ms)
variance introduced by outliers: 48% (moderately inflated)

benchmarking Cardano Open Oracle Protocol
time                 293.6 ms   (218.0 ms .. 364.7 ms)
                     0.984 R²   (0.956 R² .. 1.000 R²)
mean                 296.5 ms   (279.3 ms .. 316.6 ms)
std dev              23.49 ms   (12.82 ms .. 35.60 ms)
variance introduced by outliers: 18% (moderately inflated)

benchmarking Linear Vesting
time                 164.1 ms   (149.1 ms .. 177.3 ms)
                     0.989 R²   (0.956 R² .. 0.999 R²)
mean                 143.0 ms   (125.3 ms .. 153.2 ms)
std dev              19.47 ms   (8.407 ms .. 23.22 ms)
variance introduced by outliers: 41% (moderately inflated)

benchmarking Cardano Loans
time                 1.232 s    (1.018 s .. 1.460 s)
                     0.996 R²   (0.984 R² .. 1.000 R²)
mean                 1.277 s    (1.235 s .. 1.319 s)
std dev              48.50 ms   (41.39 ms .. 53.76 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking Marlowe
time                 1.221 s    (1.154 s .. 1.262 s)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 1.207 s    (1.186 s .. 1.215 s)
std dev              14.90 ms   (87.59 μs .. 18.06 ms)
variance introduced by outliers: 19% (moderately inflated)

This branch:

benchmarking N Queens
time                 98.96 ms   (74.31 ms .. 119.6 ms)
                     0.888 R²   (0.660 R² .. 0.976 R²)
mean                 95.86 ms   (82.56 ms .. 111.4 ms)
std dev              22.93 ms   (17.93 ms .. 29.46 ms)
variance introduced by outliers: 76% (severely inflated)

benchmarking Cardano Open Oracle Protocol
time                 278.5 ms   (232.1 ms .. 352.7 ms)
                     0.983 R²   (0.950 R² .. 1.000 R²)
mean                 222.4 ms   (191.7 ms .. 252.6 ms)
std dev              40.39 ms   (29.13 ms .. 47.65 ms)
variance introduced by outliers: 38% (moderately inflated)

benchmarking Linear Vesting
time                 116.1 ms   (91.89 ms .. 136.1 ms)
                     0.969 R²   (0.942 R² .. 0.994 R²)
mean                 118.3 ms   (107.0 ms .. 128.4 ms)
std dev              16.86 ms   (11.95 ms .. 22.75 ms)
variance introduced by outliers: 48% (moderately inflated)

benchmarking Cardano Loans
time                 1.018 s    (931.7 ms .. 1.134 s)
                     0.998 R²   (0.998 R² .. 1.000 R²)
mean                 990.8 ms   (969.7 ms .. 1.009 s)
std dev              22.60 ms   (12.79 ms .. 28.31 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking Marlowe
time                 939.1 ms   (846.5 ms .. 1.026 s)
                     0.998 R²   (0.998 R² .. 1.000 R²)
mean                 944.6 ms   (918.9 ms .. 963.6 ms)
std dev              25.86 ms   (10.58 ms .. 32.39 ms)
variance introduced by outliers: 19% (moderately inflated)

@basetunnel basetunnel force-pushed the basetunnel/cert-interface branch from f2d585a to 7801511 Compare March 5, 2026 14:04
@basetunnel basetunnel added the No Changelog Required Add this to skip the Changelog Check label Mar 5, 2026
@basetunnel
Copy link
Collaborator Author

Rebased, the report now also prints a question mark instead of a check mark for passes that have not been implemented yet.

Copy link
Member

@zliu41 zliu41 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

-- it should print something like "certifier unavailable" rather than "✅".
"Pass " ++ showℕ n ++ ": " ++ showTag tag ++ " ✅" ++
"Pass " ++ showℕ n ++ ": " ++ showTag tag
++ (if hasRelation tag then " ✅" else " ❓ (certifier unavailable)") ++
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I wouldn't use a red question mark since it's too eye-catching. Something lower profile like "⚠" is better.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

No Changelog Required Add this to skip the Changelog Check

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants