fix the shape bug in LKJCovariancePrior #2686

kayween wants to merge 2 commits into cornellius-gp:main

Conversation
Pull request overview
This PR fixes a shape bug in LKJCovariancePrior by correcting incorrect event shapes in SmoothedBoxPrior and adding an unsqueeze operation for the marginal standard deviations. The incorrect event shapes had prevented the unit tests from catching the issue reported in #2685.
Key Changes:
- Corrected event shape handling in `SmoothedBoxPrior` so that univariate distributions have empty event shapes
- Added an unsqueeze operation in `LKJCovariancePrior.sample` to ensure matrix shapes work correctly with all priors
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| test/priors/test_smoothed_box_prior.py | Added comprehensive test cases to verify correct batch and event shapes for various SmoothedBoxPrior configurations |
| gpytorch/priors/smoothed_box_prior.py | Removed line that incorrectly forced scalar inputs to have shape [1], allowing proper event shape inference |
| gpytorch/priors/lkj_prior.py | Added unsqueeze operation to marginal_sds before matrix operations to fix shape compatibility |
Two questions / concerns regarding this PR.

First, I am not sure if the changed event shape would affect downstream packages. It does seem to be the right thing to do, and all unit tests have passed. Nevertheless, it could be a breaking change.

Second, the diagonal entries of the covariance matrices generated by `LKJCovariancePrior` are identical within each batch. This is due to this line: we sample a single marginal standard deviation for each batch, so within each batch the marginal standard deviations are the same across the dimensions.

gpytorch/gpytorch/priors/lkj_prior.py, Line 111 in 60be953

I wonder if this is intentional. I thought it makes more sense to sample from `sd_prior` independently for each diagonal entry.
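For concreteness, a small sketch of the behavior in question (constructor arguments are illustrative, not taken from the issue):

```python
import torch
from gpytorch.priors import GammaPrior, LKJCovariancePrior

torch.manual_seed(0)
prior = LKJCovariancePrior(n=3, eta=torch.tensor(1.0), sd_prior=GammaPrior(2.0, 1.0))
cov = prior.sample(torch.Size([2]))  # two 3 x 3 covariance matrices

# The diagonal of each sampled matrix is sd**2 with a single sd drawn per
# sample, so the three entries within each matrix come out identical.
print(cov.diagonal(dim1=-2, dim2=-1))
```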
This change makes sense to me. It's technically BC-breaking, but it also fixes a consistency issue that's kind of a pain and has been open for a long time (see my comments here).

I don't think this is intentional; this seems like a bug. This bit in the docstring suggests that the prior is the same for each element on the diagonal, but that's of course the prior and not the realization when sampling. Let's fix it!

gpytorch/gpytorch/priors/lkj_prior.py, Lines 77 to 78 in 60be953
```diff
  marginal_sds = self.sd_prior.rsample(sample_shape)
+ # expand sds to have the same shape as the base correlation matrix
+ marginal_sds = marginal_sds.unsqueeze(-1)
```
Let's change this to sample independently for each diagonal entry.
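One possible way to do that (a hypothetical sketch, assuming `n` is the matrix dimension in scope inside `LKJCovariancePrior.sample`; not the final implementation):

```python
# Draw one standard deviation per diagonal entry instead of one per sample;
# the trailing dimension n then lines up with the matrix dimension directly.
marginal_sds = self.sd_prior.rsample(sample_shape + torch.Size([n]))
```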
It seems that we need to support both homoscedastic and heteroscedastic marginal standard deviations. Both use cases are tested in the unit tests.

**Example 1**

Let's say we want a batch of LKJ covariance priors on 3 x 3 covariance matrices and heteroscedastic Gamma priors on the diagonal entries, i.e., each diagonal entry in each batch gets a different prior. Currently this is not possible. The Gamma prior always has an empty event shape: the shapes of its inputs only affect the batch shape, but not the event shape. This makes sense (for PyTorch) because Gamma distributions are univariate, but it causes a headache for us. Ideally, we want the Gamma prior to have an event shape covering the diagonal, e.g. `torch.Size([3])` here.

**Example 2**

Again, we want a batch of LKJ covariance priors on 3 x 3 covariance matrices. This time the batch size is 3, which is coincidentally equal to the matrix dimension. The code sketched below runs fine, but its semantics is ambiguous and has two different interpretations, listed after the sketch.
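A minimal sketch of such a setup (argument values are illustrative):

```python
import torch
from gpytorch.priors import GammaPrior, LKJCovariancePrior

# The Gamma batch shape [3] coincides with both the LKJ batch size and the
# matrix dimension n = 3, so it is unclear which axis it is meant to index.
sd_prior = GammaPrior(concentration=torch.ones(3), rate=torch.ones(3))
prior = LKJCovariancePrior(n=3, eta=torch.ones(3), sd_prior=sd_prior)
print(prior.sample(torch.Size()).shape)  # torch.Size([3, 3, 3])
```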
1. The three Gamma parameters index the batch of matrices: each of the three covariance matrices gets its own (homoscedastic) marginal standard deviation.
2. The three Gamma parameters index the diagonal: within a matrix, the three diagonal entries get different (heteroscedastic) standard deviations.

Again, the root cause is that the Gamma distributions always have empty event shapes, so it is up to the implementation to choose the interpretation. Perhaps interpretation 1 is more plausible?
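To make the root cause concrete with plain torch.distributions:

```python
import torch
from torch.distributions import Gamma

# Parameter shapes only populate batch_shape; event_shape stays empty
# because Gamma is univariate.
g = Gamma(concentration=torch.ones(2, 3), rate=torch.ones(2, 3))
print(g.batch_shape)  # torch.Size([2, 3])
print(g.event_shape)  # torch.Size([])
```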
**Proposal 1**

I am going to assume the batch shape of `sd_prior` falls into one of three cases: (1) an empty batch shape, (2) the batch shape of the LKJ prior, or (3) the batch shape of the LKJ prior plus a trailing dimension equal to the matrix dimension n.

Cases 1 & 2 are homoscedastic and will invoke broadcasting in the last dimension. Case 3 is heteroscedastic.

**Pro.** This implementation should work for most priors, including all univariate priors like Gamma, Normal, and SmoothedBoxPrior.

**Con.**
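A hypothetical sketch of the shape handling under Proposal 1 (variable names and the exact case test are illustrative; note that when the batch size coincidentally equals n, as in Example 2, a shape test like this cannot distinguish the cases):

```python
marginal_sds = self.sd_prior.rsample(sample_shape)
# Cases 1 and 2 (homoscedastic): no trailing diagonal dimension, so add one
# and let broadcasting repeat the single sd across the diagonal.
if marginal_sds.dim() == 0 or marginal_sds.shape[-1] != n:
    marginal_sds = marginal_sds.unsqueeze(-1)
# Case 3 (heteroscedastic): the trailing dimension already equals n and is
# taken to hold one sd per diagonal entry.
```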
**Proposal 2**

Let the users specify the event shape properly via `torch.distributions.Independent`. For example, the issue in Example 1 would be solved nicely by reinterpreting the last dimension as the event shape.

**Pro.** This seems to be the cleanest approach (from the implementation perspective).

**Con.** The users have to do extra work.

I am leaning towards Proposal 1 in this PR, but I am also open to thoughts on 2.
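For reference, what a user would write under Proposal 2 (plain torch.distributions shown for the shape semantics; a gpytorch Prior wrapper would still be needed in practice):

```python
import torch
from torch.distributions import Gamma, Independent

# Reinterpret the trailing batch dimension as the event dimension, giving an
# explicit event shape of [3]: one Gamma per diagonal entry.
base = Gamma(concentration=torch.ones(2, 3), rate=torch.ones(2, 3))
sd_prior = Independent(base, reinterpreted_batch_ndims=1)
print(sd_prior.batch_shape)  # torch.Size([2])
print(sd_prior.event_shape)  # torch.Size([3])
```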
Suggested change:

```diff
- self.assertEqual(prior.batch_shape, torch.Size([2]))
- self.assertEqual(prior.event_shape, torch.Size([3]))
+ self.assertEqual(prior.batch_shape, torch.Size([2, 3]))
+ self.assertEqual(prior.event_shape, torch.Size([]))
```
I think we should change this test case. SmoothedBoxPrior is a univariate distribution and thus should always have an empty event shape. This is aligned with other priors like GammaPrior.
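With the fix, a quick check shows the shapes lining up with other univariate priors (values are illustrative):

```python
import torch
from gpytorch.priors import GammaPrior, SmoothedBoxPrior

prior = SmoothedBoxPrior(a=torch.zeros(2), b=torch.ones(2), sigma=0.1)
print(prior.batch_shape)  # torch.Size([2])
print(prior.event_shape)  # torch.Size([]) after the fix, matching GammaPrior
print(GammaPrior(torch.ones(2), torch.ones(2)).event_shape)  # torch.Size([])
```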
This PR would fix #2685. There are two main changes in this PR.
1. `SmoothedBoxPrior` has incorrect event shapes. As a result, the unit tests didn't catch the bug in [Bug] Error in sampling from LKJCovariancePrior #2685. Ideally, `SmoothedBoxPrior(a=1, b=2, sigma=0.1)` has an event shape of `torch.Size([])`. However, univariate smoothed box priors currently have an event shape of `torch.Size([1])`. This PR fixes it.
2. Add an unsqueeze operation in `LKJCovariancePrior.sample` so that the matrix shapes work out for all priors, e.g., `GammaPrior`.
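A minimal sketch of the failure mode from #2685 that this PR addresses (the exact arguments in the issue may differ):

```python
import torch
from gpytorch.priors import GammaPrior, LKJCovariancePrior

# Before this PR, the missing unsqueeze on marginal_sds caused a shape
# mismatch when sampling with a univariate sd_prior such as GammaPrior.
prior = LKJCovariancePrior(n=3, eta=torch.tensor(1.0), sd_prior=GammaPrior(2.0, 1.0))
sample = prior.sample(torch.Size([4]))
print(sample.shape)  # torch.Size([4, 3, 3])
```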