Skip to content

Certifier: add force-case-delay translation relation#7594

Open
ana-pantilie wants to merge 2 commits intomasterfrom
ana/certifier-forcecasedelay
Open

Certifier: add force-case-delay translation relation#7594
ana-pantilie wants to merge 2 commits intomasterfrom
ana/certifier-forcecasedelay

Conversation

@ana-pantilie
Copy link
Contributor

@ana-pantilie ana-pantilie commented Feb 16, 2026

@ana-pantilie ana-pantilie changed the title Certifier: add force-case-delay transition relation Certifier: add force-case-delay translation relation Feb 17, 2026
@ana-pantilie ana-pantilie force-pushed the ana/certifier-forcecasedelay branch from 88d9ba0 to 3d80c91 Compare February 18, 2026 18:36
@ana-pantilie ana-pantilie force-pushed the ana/certifier-forcecasedelay branch 2 times, most recently from 8238d2e to c795e79 Compare February 26, 2026 13:54
@ana-pantilie ana-pantilie force-pushed the ana/certifier-forcecasedelay branch from c795e79 to 86ec713 Compare February 26, 2026 13:58
@ana-pantilie ana-pantilie marked this pull request as ready for review February 26, 2026 13:59
Copy link
Member

@zliu41 zliu41 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable - could you share benchmarking results from cabal bench plutus-benchmark:certifier-bench?


The Force-Case-Delay compiler optimization phase transforms terms of the form: `force (case scrut [\x1... -> delay term_1, ..., \x1... -> delay term_m])` into `case scrut [\x1... -> term_1, ..., \x1... -> term_m]`. Note that the delay must appear in all the branches of the case, under any number (including zero) of lambda abstractions.

An important remark is that this transformation is not semantics-preserving in general, but it is semantics preserving when the program is "well-formed". We do not have a formal definition of well-formedness, this is left as future work. For more information about the intuitive notion of well-formedness, see `Note [Applying force to delays in case branches]` in the Untyped Plutus Core implementation of the Force-Case-Delay optimization phase.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well-formedness

Is that not the same as "the UPLC program is typeable in TPLC"? Or similarly (though not equivalently), "evaluating the UPLC program does not lead to MachineError"? If you are talking about formalizing it in Agda, I don't think that's possible - I think it is undecidable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the UPLC program is typeable in TPLC

I think this is more restrictive than what we mean by "well-defined".

evaluating the UPLC program does not lead to MachineError

That's probably the best definition, and if I'm not mistaken it appears in the note.

If you are talking about formalizing it in Agda, I don't think that's possible - I think it is undecidable.

It might not be, but I'm not entirely convinced we can't find a formalization which, although not complete, is enough for our purposes.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are talking about formalizing it in Agda, I don't think that's possible - I think it is undecidable.

It is certainly possible to formalize such a property, it's just undecidable in general. However, it (or a weaker property that suffices) may follow from typeability in TPLC and if it is preserved by the translation relations, we could prove it for what Ana describes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I imagine it's only possible to formalize it in very special cases, unless I'm missing something

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's terms like

force
  case (constr 0 [true, false])
    [ \x y z -> delay x
    ]

that make it unsound in general (note the amount of parameters in the branch does not line up), because it's optimised to

case (constr 0 [true, false])
    [ \x y z -> x
    ]

I don't expect you could ever obtain the first term from the optimiser when starting out with a TPLC-typable term. Or can we?

In any case, it's unclear what that property would be exactly.


variable
n : ℕ
M N : n ⊢
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we use X ⊢ in most other places

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know, but I don't like that notation. It gives me the impression that X can be any type, when it's just a natural number.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

X is a left-over from when terms used to be indexes by X : Set which was instantiated to nested Maybe. I'm in favor of n, and we should eventually change that in other places too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or Γ, which is commonly used for context?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That works too, although I'd associate it more with a typing context, rather than scoping.


isForceCaseDelay? : MatchOrCE (ForceCaseDelay {n})

{-# TERMINATING #-}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know which recursive call upsets the termination checker?

@ana-pantilie
Copy link
Contributor Author

ana-pantilie commented Feb 27, 2026

@zliu41 The benchmarking results:

benchmarking N Queens
time                 56.44 ms   (52.06 ms .. 59.90 ms)
                     0.985 R²   (0.965 R² .. 0.995 R²)
mean                 53.26 ms   (49.51 ms .. 56.85 ms)
std dev              7.223 ms   (5.064 ms .. 10.26 ms)
variance introduced by outliers: 52% (severely inflated)

benchmarking Cardano Open Oracle Protocol
time                 138.6 ms   (129.4 ms .. 145.4 ms)
                     0.997 R²   (0.992 R² .. 1.000 R²)
mean                 133.2 ms   (118.2 ms .. 139.3 ms)
std dev              13.93 ms   (3.785 ms .. 22.51 ms)
variance introduced by outliers: 35% (moderately inflated)

benchmarking Linear Vesting
time                 68.56 ms   (63.95 ms .. 72.28 ms)
                     0.990 R²   (0.975 R² .. 0.998 R²)
mean                 65.79 ms   (60.93 ms .. 70.00 ms)
std dev              7.765 ms   (5.032 ms .. 11.93 ms)
variance introduced by outliers: 43% (moderately inflated)

benchmarking Cardano Loans
time                 554.2 ms   (444.8 ms .. 654.2 ms)
                     0.995 R²   (0.982 R² .. 1.000 R²)
mean                 550.8 ms   (532.4 ms .. 563.1 ms)
std dev              18.72 ms   (8.999 ms .. 24.34 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking Marlowe
time                 564.4 ms   (541.3 ms .. 610.2 ms)
                     0.999 R²   (0.999 R² .. 1.000 R²)
mean                 568.9 ms   (560.2 ms .. 580.9 ms)
std dev              11.63 ms   (3.984 ms .. 15.41 ms)
variance introduced by outliers: 19% (moderately inflated)

The numbers are quite similar to the ones from your PR which improved the certifier performance. Note that we're running them on different machines, so it's possible my machine is faster and that explains why the numbers are lower.

@basetunnel
Copy link
Collaborator

@zliu41 The benchmarking results:

benchmarking N Queens
time                 56.44 ms   (52.06 ms .. 59.90 ms)
                     0.985 R²   (0.965 R² .. 0.995 R²)
mean                 53.26 ms   (49.51 ms .. 56.85 ms)
std dev              7.223 ms   (5.064 ms .. 10.26 ms)
variance introduced by outliers: 52% (severely inflated)

benchmarking Cardano Open Oracle Protocol
time                 138.6 ms   (129.4 ms .. 145.4 ms)
                     0.997 R²   (0.992 R² .. 1.000 R²)
mean                 133.2 ms   (118.2 ms .. 139.3 ms)
std dev              13.93 ms   (3.785 ms .. 22.51 ms)
variance introduced by outliers: 35% (moderately inflated)

benchmarking Linear Vesting
time                 68.56 ms   (63.95 ms .. 72.28 ms)
                     0.990 R²   (0.975 R² .. 0.998 R²)
mean                 65.79 ms   (60.93 ms .. 70.00 ms)
std dev              7.765 ms   (5.032 ms .. 11.93 ms)
variance introduced by outliers: 43% (moderately inflated)

benchmarking Cardano Loans
time                 554.2 ms   (444.8 ms .. 654.2 ms)
                     0.995 R²   (0.982 R² .. 1.000 R²)
mean                 550.8 ms   (532.4 ms .. 563.1 ms)
std dev              18.72 ms   (8.999 ms .. 24.34 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking Marlowe
time                 564.4 ms   (541.3 ms .. 610.2 ms)
                     0.999 R²   (0.999 R² .. 1.000 R²)
mean                 568.9 ms   (560.2 ms .. 580.9 ms)
std dev              11.63 ms   (3.984 ms .. 15.41 ms)
variance introduced by outliers: 19% (moderately inflated)

The numbers are quite similar to the ones from your PR which improved the certifier performance. Note that we're running them on different machines, so it's possible my machine is faster and that explains why the numbers are lower.

Could you comment out the other passes in VerifiedCompilation.lagda.md to see what the performance is of this single procedure?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants