Conversation

@Santymax98
Contributor

🚀 Fast Gaussian Copula CDF + Student-t Correlation Extensions

Summary

  1. This PR introduces a new high-performance Monte Carlo estimator for the Gaussian copula CDF, based on the GHK (Geweke–Hajivassiliou–Keane) algorithm combined with Sobol quasi-random sequences and Cranley–Patterson random shifts.
    The method achieves high numerical accuracy together with large speedups, outperforming classical GHK implementations (e.g., `mvtnorm::pmvnorm` in R or MvNormalCDF.jl in Julia).

  2. A fully numeric implementation of ₂F₁(ν, ν; 2ν; z) and of Spearman's ρₛ for the t-copula, removing the dependency on HypergeometricFunctions.jl (whether to keep that dependency is ultimately the maintainer's decision).


Implementation Details

  • Implemented _cdf_base with Sobol quasi–Monte Carlo and antithetic sampling
  • Adaptive replication control using error-based termination
  • Pivoted Cholesky decomposition via RowMaximum()
  • Added user flag fast::Bool=true (default) for efficient evaluation

The implementation supports both:

  • fast=true → optimized for simulation and fitting
  • fast=false → adaptive high-precision reference computation
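The sampling scheme described above (Sobol points, a Cranley–Patterson random shift, antithetic pairing) can be sketched in a few lines. This is an illustrative sketch only, and `shifted_sobol` is a hypothetical name, not the PR's internal function:

```julia
using Sobol, Random, Statistics

# Illustrative sketch (not the PR's code): one randomized replication of a Sobol
# point set, randomized by a Cranley–Patterson shift, with antithetic pairs.
function shifted_sobol(d::Int, m::Int, rng::AbstractRNG)
    seq = SobolSeq(d)
    shift = rand(rng, d)                       # one uniform shift per replication
    pts = Matrix{Float64}(undef, d, 2m)
    for j in 1:m
        u = mod.(next!(seq) .+ shift, 1.0)     # Cranley–Patterson shift, mod 1
        pts[:, 2j - 1] = u
        pts[:, 2j]     = 1.0 .- u              # antithetic counterpart
    end
    return pts
end

# sanity check: the QMC average of prod(u) over [0,1]^3 should approach 2^-3 = 0.125
pts = shifted_sobol(3, 4096, MersenneTwister(1))
est = mean(prod(view(pts, :, j)) for j in axes(pts, 2))
```

Averaging the GHK integrand over several such independently shifted replications is what yields both the estimate and its error bar.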

Benchmarks (ρ = 0.5, u = 0.8, 50k samples)

| d | Δ_rob | Δ_fast | t_ref (ms) | t_rob (ms) | t_fast (ms) |
|---:|---:|---:|---:|---:|---:|
| 2 | 1.3645e−06 | 6.3197e−06 | 1.640 | 4.371 | 0.544 |
| 3 | 5.0843e−06 | 3.3419e−05 | 3.462 | 209.809 | 0.791 |
| 5 | 1.0712e−05 | 1.8286e−04 | 7.538 | 648.508 | 1.328 |
| 10 | 1.5549e−05 | 1.6206e−04 | 18.903 | 1408.483 | 2.733 |
| 15 | 4.1757e−05 | 5.4560e−05 | 31.084 | 2158.698 | 4.246 |
| 20 | 5.2347e−05 | 5.0156e−05 | 43.124 | 2965.256 | 5.811 |

julia> using Copulas, LinearAlgebra, StatsFuns, MvNormalCDF, BenchmarkTools

julia> ρ = 0.5
0.5

julia> dims = [2, 3, 5, 10, 15, 20]
6-element Vector{Int64}:
  2
  3
  5
 10
 15
 20

julia> results = []
Any[]

julia> for d in dims
           Σ = fill(ρ, d, d)
           Σ[diagind(Σ)] .= 1.0
           C = GaussianCopula(Σ)
           u = fill(0.8, d)
           x = StatsFuns.norminvcdf.(u)

           # --- reference value (MvNormalCDF) ---
           p_ref, _ = MvNormalCDF.mvnormcdf(Σ, fill(-Inf, d), x; m=50_000)

           # --- Robust method ---
           p_rob = Copulas._cdf(C, u; fast=false)
           # --- fast method ---
           p_fast = Copulas._cdf(C, u; fast=true)

           Δrob = abs(p_ref - p_rob)
           Δfast = abs(p_ref - p_fast)

           # --- Benchmarks ---
           t_ref  = @belapsed MvNormalCDF.mvnormcdf($Σ, fill(-Inf, $d), $x; m=50_000)
           t_rob  = @belapsed Copulas._cdf($C, $u; fast=false)
           t_fast = @belapsed Copulas._cdf($C, $u; fast=true)

           push!(results, (d=d, p_ref=p_ref, p_rob=p_rob, p_fast=p_fast,
                           Δrob=Δrob, Δfast=Δfast,
                           t_ref=t_ref, t_rob=t_rob, t_fast=t_fast))
       end

✅ Example Usage

C = GaussianCopula([1.0 0.5; 0.5 1.0])
u = [0.8, 0.9]

julia> Copulas._cdf(C, u)
0.7515044459012423

julia> Copulas._cdf(C, u; fast=false)
0.7514970876295584

t-Copula Additions

  • _Gauss_hypergeometric, _Gauss_Euler, and _Gauss2F1_hybrid
  • rhoS_t(ν,ρ) (Spearman correlation) via adaptive quadrature
  • Stable across all ν, with a Gaussian fallback for ν > 20 based on asymptotic properties
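For reference, ₂F₁(ν, ν; 2ν; z) can be evaluated numerically from Euler's integral representation, which is valid here since c − b = ν > 0. This is a hedged sketch of the idea, with `gauss2F1` as an illustrative name rather than the PR's exact routine:

```julia
using QuadGK, SpecialFunctions

# Euler integral:  ₂F₁(a,b;c;z) = Γ(c)/(Γ(b)Γ(c-b)) ∫₀¹ t^(b-1) (1-t)^(c-b-1) (1-z t)^(-a) dt
# valid for c > b > 0 and z < 1; the t-copula case uses a = b = ν, c = 2ν.
function gauss2F1(a, b, c, z)
    pref = gamma(c) / (gamma(b) * gamma(c - b))
    val, _ = quadgk(t -> t^(b - 1) * (1 - t)^(c - b - 1) * (1 - z * t)^(-a), 0, 1)
    return pref * val
end

# check against the closed form ₂F₁(1,1;2;z) = -log(1-z)/z
gauss2F1(1, 1, 2, 0.5)   # ≈ 2log(2) ≈ 1.3863
```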

@codecov

codecov bot commented Oct 12, 2025

Codecov Report

❌ Patch coverage is 80.83832% with 32 lines in your changes missing coverage. Please review.
✅ Project coverage is 17.65%. Comparing base (c974872) to head (8655131).

| Files with missing lines | Patch % | Lines |
|---|---:|---|
| src/EllipticalCopulas/TCopula.jl | 48.78% | 21 Missing ⚠️ |
| src/EllipticalCopula.jl | 91.37% | 10 Missing ⚠️ |
| src/EllipticalCopulas/GaussianCopula.jl | 90.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff             @@
##             main     #330       +/-   ##
===========================================
- Coverage   77.95%   17.65%   -60.31%     
===========================================
  Files          84       84               
  Lines        4986     5120      +134     
===========================================
- Hits         3887      904     -2983     
- Misses       1099     4216     +3117     


@lrnv
Owner

lrnv commented Oct 12, 2025

This is very interesting, but I think we need more tests and more comparisons to MvNormalCDF and/or other sources of truth (for example, you could compare to R's mvtnorm?).

  1. MvNormalCDF outputs an error estimate; how does it compare with your ~1e−5 deltas?

  2. Your timing for the robust version looks awful compared to MvNormalCDF, but the fast ones are very good, while on the other hand the deltas are similar. So the fast one should be the default.

  3. The MvNormalCDF implementation has a default number of samples that scales with the dimension of the problem. Can you redo the benchmarks in several dimensions using their scaling? Why is your default number of samples so different from theirs?

  4. To strongly validate the results, can you import the tests from MvNormalCDF (here: https://github.com/PharmCat/MvNormalCDF.jl/blob/main/test/runtests.jl) and see if they pass as well?

  5. Can you do the same for the Student copula CDF? Then I think you'll have to compare to R's mvtnorm, maybe.

@Santymax98
Contributor Author

Okay, all of this needs to be studied. I'll expand the validation set as follows:

Perhaps I could add benchmarks using MvNormalCDF's dimension-based scaling (m = 1000*d)?

I'll compare my deltas to the reported error estimates, since I currently use the MvNormalCDF package as a reference.

Also, I think we could add a basic consistency test for the Student's t-copula CDF.

However, I think that comparing directly with R's mvtnorm would put us at a disadvantage, because R uses Alan Genz's Fortran implementation, vectorized and compiled, right?

@lrnv
Owner

lrnv commented Oct 12, 2025

  1. Yes to benchmarking with the 1000*d scaling; compare time and precision with this scaling, comparing also with the error estimate from MvNormalCDF.

  2. R's mvtnorm uses Genz's code, yes, but can't we do the same and/or reimplement it? If you need to "vectorize" in the sense of taking several evaluation points at once, then you need to dig a bit into Distributions.jl's dispatch system to understand which method to implement for a vectorized cdf, but it's clearly possible.

  3. Could you add the references to the papers in the docstrings of the classes, with a sentence like "The cdf implementation for this copula uses [myauthor1993]", and add the reference at the bottom of the docstring and in the references.bib file?

@Santymax98
Contributor Author

Well, I did the following:

@testset "GaussianCopula – Scaling vs MvNormalCDF" begin
    using MvNormalCDF, StatsFuns, BenchmarkTools
    ρ = 0.5
    dims = [2, 3, 5, 10, 15, 20]
    for d in dims
        Σ = fill(ρ, d, d)
        Σ[diagind(Σ)] .= 1.0
        C = GaussianCopula(Σ)
        u = fill(0.8, d)
        x = StatsFuns.norminvcdf.(u)
        m = min(1_000*d, 100_000)
        p_ref, err_ref = MvNormalCDF.mvnormcdf(Σ, fill(-Inf, d), x; m=m)
        p_fast = Copulas._cdf(C, u)
        Δ = abs(p_ref - p_fast)
        @info "d=$d  Δ=$Δ  err_ref=$err_ref"
        @test Δ ≤ 5 * err_ref
    end
end

and I got these results

Validation vs MvNormalCDF (scaled m = 1000*d)

To match MvNormalCDF's default scaling, we reran the validation using
m = min(1000*d, 100_000) samples. The deltas remain well within the reported Monte Carlo error.

| d | Δ | err_ref | Δ ≤ 5×err_ref |
|---:|---:|---:|:---:|
| 2 | 1.29e−4 | 2.43e−4 | ✓ |
| 3 | 2.14e−4 | 3.32e−4 | ✓ |
| 5 | 6.69e−5 | 3.71e−4 | ✓ |
| 10 | 1.28e−4 | 2.87e−4 | ✓ |
| 15 | 2.06e−4 | 5.94e−4 | ✓ |
| 20 | 7.77e−5 | 4.95e−4 | ✓ |

All tests pass:

Test Summary:                           | Pass  Total  Time
GaussianCopula – Scaling vs MvNormalCDF |    6      6  3.8s

I added the tests so you can view them and the benchmark shows the following

julia> val_est = Copulas._cdf_base(Σ, b; abseps=1e-4, releps=1e-4, maxpts=35_000, m0=1028, r=2)
0.6653147302829496

julia> @show val_est
val_est = 0.6653147302829496
0.6653147302829496

julia> @benchmark  Copulas._cdf_base($Σ, $b)
BenchmarkTools.Trial: 2308 samples with 1 evaluation per sample.
 Range (min … max):  1.899 ms … 23.032 ms  ┊ GC (min … max): 0.00% … 90.53%
 Time  (median):     2.113 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.160 ms ± 521.602 μs  ┊ GC (mean ± σ):  1.94% ±  6.39%

      ▄█▆▃▂                                                    
  ▃▅▄▄█████▇▃▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▁▁▁▁▃▅▆ █
  1.9 ms       Histogram: log(frequency) by time      4.12 ms <

 Memory estimate: 553.34 KiB, allocs estimate: 4209.
 
 julia> @benchmark Copulas._cdf_base($Σ, $b)
BenchmarkTools.Trial: 9824 samples with 1 evaluation per sample.
 Range (min … max):  421.708 μs … 26.419 ms  ┊ GC (min … max): 0.00% … 97.90%
 Time  (median):     483.375 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   507.761 μs ± 478.827 μs  ┊ GC (mean ± σ):  4.39% ±  5.10%

                                    ▂▄▆▇█▆▄▂                     
  ▂▁▁▁▁▁▂▂▁▂▂▂▂▁▁▂▂▂▂▂▁▁▁▁▁▁▂▂▂▂▃▄▅██████████▇▆▅▄▄▄▄▃▃▃▃▃▃▃▂▂▂▂ ▃
  422 μs           Histogram: frequency by time          519 μs <

 Memory estimate: 196.05 KiB, allocs estimate: 4160.

These are the results; for comparison, see the end of the MvNormalCDF.jl test file.

@Santymax98
Contributor Author

Okay, sorry, I didn't mean to make the message so long. The first benchmark corresponds to this:

[Screenshot: 2025-10-13, 2:53 PM]

and the second corresponds to this:

[Screenshot: 2025-10-13, 2:54 PM]

Both are reported at the end of the file.

@lrnv
Owner

lrnv commented Oct 13, 2025

Can you re-run the benchmarks with the two implementations, at the same level of precision (e.g., about the same number of samples) and on the same machine, so that we get comparable timings? Maybe their machine was very slow compared to yours.

W.r.t. quality, it seems like your new version is OK compared to the previous one, so perfect.

If possible, it would be nice if the default number of samples matched what MvNormalCDF did, that is, something growing linearly with the dimension. But since we are that much faster, maybe we can add a bit more at the beginning, like 1000*(d+1) or maybe 500 + 1000*d instead of 1000*d?

Other questions:

  1. How does your code behave when trying to AD through it (i.e., taking its derivatives with e.g. ForwardDiff)?
  2. Are ALL the tests from MvNormalCDF ported, or only some of them?

@Santymax98
Copy link
Contributor Author

I've been working under exactly the same conditions, and in general on my laptop the fast method is still much faster. But if we use the same number of samples and everything exactly as MvNormalCDF does, we lose there... maybe because of the Sobol sequence I use. I saw that it uses a sequence I don't know very well, and I think that one is faster than Sobol.

@lrnv
Owner

lrnv commented Oct 13, 2025

Yes, but if the new method is more precise with fewer samples, then we win?

Another way out: convert the current method from the Gaussian CDF to the Student CDF, which is really missing from the current implementation, and then you always win since there is no competition yet ;)

@Santymax98
Contributor Author

Okay, I spent a bit of time reviewing the other package... I didn't quite understand what happened XD

Back to the topic... I’ve got it working now.

Although I couldn’t fully match the Fortran implementation used in mvtnorm, my current version slightly outperforms MvNormalCDF.jl in runtime and is notably better in memory efficiency.

As expected, using 12 randomized quasi-Monte Carlo replications provides a proper error estimate. However, when we skip the error estimation (single pass), the method becomes faster and more memory-efficient — in that configuration, it clearly outperforms existing Julia implementations.

The same approach extends naturally to the multivariate Student-t case, which I’ve already implemented. It works well, although the stochastic error is somewhat larger, likely due to the Sobol sequence and the adaptive Cholesky.

@lrnv
Owner

lrnv commented Oct 16, 2025

Yes, I am sorry for what happened with the other package; maybe we could discuss it together again someday.

Waiting for your code here to be ready, tell me when you want me to take a look

@Santymax98
Contributor Author

Yes, I'll let you know when I have it ready so you can check it out. It won't take long, but I forgot my glasses and I can't work well like that 😂

@Santymax98
Contributor Author

Hi @lrnv — I tried implementing these functions and adapted parts of the logic from MvNormalCDF.jl, improving both structure and efficiency.
Overall, we’re now slightly faster, but much more memory-efficient.

For the t distribution, I experimented with several approaches, but none can really match pure Fortran performance. So I decided to reuse the same QMC structure from the Gaussian case and just add a small radial χ²-based block. It works quite well and has been empirically validated against mvtnorm (R).
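The radial χ² block mentioned above rests on the standard mixture representation of the multivariate t: if T = Z/√(W/ν) with W ~ χ²_ν, then P(T ≤ b) = E_W[Φ_Σ(b·√(W/ν))]. A minimal univariate sanity check of that identity (illustrative only, not the PR's code):

```julia
using Distributions, Random, Statistics

ν, b = 5, 0.8
rng = MersenneTwister(42)

# mix the Gaussian CDF over the χ² radial weight: P(T ≤ b) = E[Φ(b·√(W/ν))]
W = rand(rng, Chisq(ν), 200_000)
est = mean(cdf.(Normal(), b .* sqrt.(W ./ ν)))
exact = cdf(TDist(ν), b)   # closed-form univariate reference
```

In the multivariate sampler the same radial weight rescales the integration bounds inside the QMC loop, which is why the Gaussian QMC structure can be reused almost unchanged.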

I think it could still be made a bit more “Julian,” but both versions perform consistently and respond well across dimensions.

Below is a short summary of what’s implemented. The only missing step is to hook it into the main API so that it dispatches directly from Distributions.cdf.


Implemented functions

  1. Φ(x) – Standard normal CDF.
  2. φ(x) – Standard normal PDF.
  3. Φinv(p) – Stable inverse of the standard normal CDF (probit).
  4. _δ(T) – Safe tolerance √eps(T) for clamping probabilities.
  5. richtmyer_roots(T, n) – Richtmyer lattice roots (√prime sequence) for QMC, as in MvNormalCDF.jl.
  6. _chlrdr_orthant!(R, b) – Cholesky with pivoting adapted to the orthant; mutates R and b.
  7. qmc_orthant_core! – Generic QMC core with radial scaling callback.
  8. qmc_orthant_normal! – Gaussian case (fill_w! ≡ 1).
  9. qmc_orthant_t! – Student-t case (w[k] = √(ν / χ²_ν) with extra Richtmyer root).
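The Richtmyer lattice from item 5 can be sketched as follows: component k of point j is the fractional part of j·√p_k for the k-th prime p_k, plus a per-replication random shift. The names below are illustrative (not the PR's exact signatures), and the sketch assumes the Primes.jl package:

```julia
using Primes

# generators: square roots of the first n primes
richtmyer_generators(n) = [sqrt(float(prime(k))) for k in 1:n]

# j-th lattice point with a Cranley–Patterson shift, componentwise mod 1
richtmyer_point(j, q, shift) = mod.(j .* q .+ shift, 1.0)

q = richtmyer_generators(3)
p = richtmyer_point(7, q, zeros(3))   # a point in [0,1)^3
```

Because √p_k are irrational and rationally independent, the sequence j·q mod 1 equidistributes over the unit cube, which is what makes it usable as a QMC rule.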

Example 1: Normal

using Random, LinearAlgebra, Distributions, MvNormalCDF, StableRNGs, Copulas

rng = StableRNG(123)
d = 8
A = randn(rng, d, d)
Σ = Matrix(Symmetric(A*A' + 0.1I))
a = fill(-Inf, d)
b = randn(rng, d)
m, r = 25_000, 12

rng_ref, rng_qmc = StableRNG(202), StableRNG(202)
p_ref, e_ref = MvNormalCDF.qsimvnv!(copy(Σ), copy(a), copy(b), m, rng_ref)
p_qmc, e_qmc = Copulas.qmc_orthant_normal!(copy(Σ), copy(b); m=m, r=r, rng=rng_qmc)

println(("p_ref = $p_ref ± $e_ref", "p_qmc = $p_qmc ± $e_qmc", "|Δ| = $(abs(p_ref - p_qmc))"))

("p_ref = 0.024469473604786227 ± 4.8609855372307254e-5", 
"p_qmc = 0.02446947360478623 ± 4.860985537231269e-5", 
"|Δ| = 3.469446951953614e-18")

Example 2: Multivariate t

rng = StableRNG(123)
d, ν = 10, 5
A = randn(rng, d, d)
Σ = Matrix(Symmetric(A*A' + 0.1I))
b = randn(rng, d)

p_t, e_t = Copulas.qmc_orthant_t!(copy(Σ), copy(b), ν; m=25_000, r=12, rng=StableRNG(202))
println("Student-t (ν=$ν): p = $p_t ± $e_t")

Student-t (ν=5): p = 0.0003943872 ± 3.60e-6

Reference (R mvtnorm::pmvt):

pmvt(...) ≈ 0.0003932538 ± 1.78e-6

→ Results match within the estimated error range, while our implementation uses less memory and fewer allocations.
For the t-case (ν = 5), results are consistent with mvtnorm::pmvt in R.

Furthermore, this idea extends further... for example to skew-t copulas, etc. (although it is true that they are not used much 😂)

@lrnv
Owner

lrnv commented Oct 20, 2025

@Santymax98 I'm confused: the current state of this pull request does not change the _cdf() of the Gaussian and t copulas, and the methods added are not even used in the package's API? What am I supposed to do with it?

@Santymax98
Contributor Author

Santymax98 commented Oct 20, 2025

Sorry — I just realized I forgot to push the cdf dispatch for the copulas 😅. Everything I had uploaded before were only internal utilities to improve performance.

I’ve now added the dispatch for Distributions.cdf, so the interface works directly.

For example, using the covariance matrix of size 25 from the tests in MvNormalCDF.jl, you can now do the following:

using Random, BenchmarkTools, Copulas
Random.seed!(123)
td = td[14, 1] #assuming you copied the entire td from the MvNormalCDF.jl tests
G = GaussianCopula(td)
u = rand(size(td,1))

@benchmark cdf(G, u; method=:qmc)
BenchmarkTools.Trial: 244 samples with 1 evaluation per sample.
 Range (min … max):  20.070 ms … 37.483 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     20.190 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   20.464 ms ±  1.707 ms  ┊ GC (mean ± σ):  0.22% ± 1.93%

  █▇▂                                                          
  ███▆▅▄▁▅▄▁▁▁▄▁▁▄▁▁▁▁▁▁▁▁▁▁▄▁▁▁▁▁▁▄▁▁▁▁▁▁▁▁▄▁▁▁▁▁▁▁▁▁▁▁▁▄▁▁▄ ▅
  20.1 ms      Histogram: log(frequency) by time      26.9 ms <

 Memory estimate: 488.92 KiB, allocs estimate: 29.

@benchmark cdf(G, u; method=:normal)
BenchmarkTools.Trial: 197 samples with 1 evaluation per sample.
 Range (min … max):  25.122 ms … 33.025 ms  ┊ GC (min … max): 0.00% … 23.46%
 Time  (median):     25.385 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   25.436 ms ± 558.561 μs  ┊ GC (mean ± σ):  0.15% ±  1.67%

           ▁█▅▄▄ ▅▅▄                                            
  ▃▁▁▁▃▅▅▆▆█████████▆▅▇▅▃▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃ ▃
  25.1 ms         Histogram: frequency by time         26.3 ms <

 Memory estimate: 596.03 KiB, allocs estimate: 485.

T = TCopula(5, td)
@benchmark cdf(T, u)
BenchmarkTools.Trial: 134 samples with 1 evaluation per sample.
 Range (min … max):  36.161 ms … 48.235 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     37.259 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   37.451 ms ±  1.196 ms  ┊ GC (mean ± σ):  0.00% ± 0.00%

         ▅▃█                                                   
  ▄▄▂▂▁▂▃████▆▆▃▁▃▄▁▁▁▅▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂ ▂
  36.2 ms         Histogram: frequency by time        43.2 ms <

 Memory estimate: 509.06 KiB, allocs estimate: 167.

The QMC version matches MvNormalCDF.qsimvnv! numerically but with less memory allocation, and the Student-t implementation, while not as fast as Fortran’s mvtnorm, performs consistently and reuses most of the Gaussian structure efficiently.

PS: For the Gaussian case I only left the two implementations for comparison...

@lrnv
Owner

lrnv commented Oct 20, 2025

Okay thanks ^^

So now I guess:

  1. The tests need to be updated to use these new interfaces for accessing the two versions of the Gaussian one.
  2. Tests against AD for the Student one could be added? You can test _cdf(C, u) against the previous @invoke _cdf(C::Copulas.Copula, u) directly, I guess, since the default _cdf(::Copula) uses HCubature.jl directly on the pdf.
  3. Will your function allow ForwardDiff.jl to go through? The current tests error on that. Maybe you need to remove type assertions from the function signatures.

@Santymax98
Contributor Author

Yes, I'm working on it right now... actually it was a bit difficult at first to design something that works well, but I think we got it. I don't think we have to change the tests, because it should work the same, hahaha, or so I hope. In the Gaussian case I left both methods so we can compare directly, but if everything goes well, I think it's your decision whether to keep both methods or not... although if it goes well, we could remove the MvNormalCDF dependency.

@lrnv
Owner

lrnv commented Oct 20, 2025

Yes, I think the right approach is to remove the dependency on MvNormalCDF, except in the tests where it is needed for comparison, of course.

@Santymax98
Contributor Author

Okay, I'll keep working on this... and on the tests... I won't be long 👍

@Santymax98
Contributor Author

Santymax98 commented Oct 20, 2025

Ok, the Gaussian tests pass. There is one problem with the approximation of the marginal values of the t CDF in terms of tolerance... I just don't know if I should modify the entire tolerance or maybe adjust something else...

By the way, I think there are simpler forms for two and three dimensions in the code I've reviewed. Perhaps it would be appropriate to implement them? Or is it better to keep this one for all cases?

@lrnv
Owner

lrnv commented Oct 20, 2025

By simpler forms, you mean faster? It would be nice to keep the generality and have something that works for all dimensions, but if there is a much faster version for small dimensions, then it might be useful to have it too.

  1. The dimension can be dispatched on, so implementing it separately is not an issue.
  2. We can postpone specific dimensions to another pull request; let's ensure that the generic one works correctly first :)

Why did you remove the td tests???

@Santymax98
Contributor Author

Okay, I was working on it. I removed the td tests because they're generic, and I was debugging only the ones that apply to our copula case, for example, selecting those that only have a lower bound of -infinity; otherwise, it doesn't make sense to talk about copulas. But if you'd like, I can modify them to compare the direct results of MvNormalCDF.jl with the new implementation.

qmc_orthant_normal! only handles the case where the lower bound is -infinity, because otherwise it wouldn't work.

@lrnv
Owner

lrnv commented Oct 20, 2025

Do as you think is best. I haven't looked at the details yet; I was just surprised that they disappeared ^^

@lrnv
Owner

lrnv commented Oct 21, 2025

> would it be correct to use some mvtnorm values and use them as a reference?

Yes, of course; for the TCopula's CDF that is probably the right way to go.

@lrnv
Owner

lrnv commented Oct 21, 2025

For tests that are not from -Inf to b, you should compare to the measure() function.

@lrnv
Owner

lrnv commented Oct 22, 2025

From the tests, I see that the default precision is not very good: 1e-3 for the TCopula uniformity checks, and 1e-3 for the measure() checks on the Gaussian copula? Maybe we should increase the default number of samples to reduce these tolerances?

@Santymax98
Contributor Author

For the Gaussian, I think that error comparison is unfair... note that measure() also calls the Gaussian CDF 2^d times, each call carrying its own error, so the tolerance must account for the error of the QMC CDF plus the error that measure() introduces. As for the t... I think its convergence takes a little longer... a solution could be to write specialized routines for two and three variables right now.
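To make the amplification concrete: the probability of a box is an inclusion–exclusion sum of CDF values over all 2^d corners, so each CDF call contributes its own Monte Carlo error, up to 2^d times in the worst case. A generic sketch (`box_measure` is an illustrative name, not Copulas.measure):

```julia
# probability of the box ∏ₖ (lo[k], hi[k]] from a joint CDF via inclusion–exclusion
function box_measure(cdf_fn, lo, hi)
    d = length(lo)
    total = 0.0
    for mask in 0:(2^d - 1)
        # pick lo[k] where bit k of mask is set, hi[k] otherwise
        corner = [isodd(mask >> (k - 1)) ? lo[k] : hi[k] for k in 1:d]
        total += (-1)^count_ones(mask) * cdf_fn(corner)
    end
    return total
end

# independence copula C(u) = prod(u): the box [0.2,0.5]² has mass 0.3² = 0.09
box_measure(u -> prod(u), [0.2, 0.2], [0.5, 0.5])
```

If each CDF evaluation carries error ε, the sum can carry up to 2^d·ε, which is why a tolerance of ε is unfair for the measure() test.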

@lrnv
Owner

lrnv commented Oct 22, 2025

  1. Yes, I totally understand that the test against measure() is unfair; its precision is basically around 2^d * original_precision, so increasing the number of samples for this one is perfectly fine.

  2. For the TCopula in the generic tests, however, if it is really the convergence of the estimator that is slow, I think we need to increase the default number of samples, maybe from 1000*d to 2000*d? Do you know what the policy in mvtnorm is, for example?

@Santymax98
Contributor Author

Hmm, yes, theirs is adaptive, meaning they increase the sample size until they achieve the desired precision...

@Santymax98
Contributor Author

I tweaked the tests a bit, and the default sample size has increased... Generally, I think that should be documented, right? Or maybe put in a docstring?

@lrnv
Owner

lrnv commented Oct 22, 2025

Yes, everything in this PR needs to be documented ;) The right place to do it is in the TCopula, GaussianCopula and EllipticalCopula docstrings. These docs need to have the references to the papers and describe a bit how the algorithms work. Also the meaning of the parameters: m is the sample size, but what is r?

But most importantly, there are no tests yet for the TCopula's cdf? You should test it in test/EllipticalCopulas.jl:

  • test the cdf against the generic @invoke cdf(C::Copulas.Copula, u), which calls hcubature on the pdf
  • test the ForwardDiff.derivative of the cdf against the pdf.

@Santymax98
Contributor Author

r is the number of randomized Monte Carlo replications, that is, how many times we repeat the QMC pass to estimate the error. By default, the packages I reviewed use 12; I decided to expose it as a flexible parameter.

I'll work on the tests... to conclude...
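The role of r can be illustrated with a toy 1-D integrand: each replication is one independently shifted QMC pass, and the spread across the r passes gives the error bar. This is a generic sketch under that assumption, not the PR's implementation:

```julia
using Statistics, Random

# r independent randomizations of a QMC rule → (estimate, ~3σ error bound)
function rqmc(one_pass, r, rng)
    reps = [one_pass(rng) for _ in 1:r]
    return mean(reps), 3 * std(reps) / sqrt(r)
end

# toy pass: a 64-point rank-1 lattice in 1-D with a random Cranley–Patterson
# shift, estimating ∫₀¹ x² dx = 1/3
one_pass(rng) = (s = rand(rng); mean(((i - 1 + s) / 64)^2 for i in 1:64))
est, err = rqmc(one_pass, 12, MersenneTwister(7))
```

With r = 12 the error bar is already stable, which is presumably why the reference packages settled on that default.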

@Santymax98
Contributor Author

I prepared a test, but for ForwardDiff we would again hit the incompatibility with beta_inc... We forgot this detail 😶

@lrnv
Owner

lrnv commented Oct 30, 2025

Hey, just wanted to tell you @Santymax98 that I have not forgotten about this; I just have less time right now. I will come back next week with updates :)

@Santymax98
Contributor Author

Hi @lrnv , no problem, I'm also working on some things. I hope to finish them soon so I can conclude here with documentation and some tests.
