coopa33 commented Aug 4, 2025

Solution for issue #3507:

For partial correlations, the scatter plots in the correlation analysis did not show the residuals of X and Y regressed on the partialled-out variables Z, as they should.

This applied not only to the data points but also to the marginal distributions, which did not change when one or more partial-out variables were added.
The plotted statistics, however, did change.

Original Analysis - No partial correlation
Screenshot From 2025-08-04 12-41-24

Original Analysis - Partial correlation
Screenshot From 2025-08-04 12-48-25

Problem

  • The problem seemed to be that the functions .corrPairwisePlot() and .corrMatrixPlot() in "R/correlation.R" do not account for variables to be partialled out.
  • These functions pass the original data on to the plotting functions .corrScatter() and .corrMarginalDistribution(), which is why the scatter plot and marginal distributions don't change.
  • However, .corrValuePlot(), the function that plots the statistics, outputs the correct statistic because it accesses the pre-computed corrResults, where the partial-out variable is already accounted for.
  • Unfortunately, pcor.test(), which is used to compute partial correlation results in .corr.test(), does not make residuals available. As far as I can see, they need to be calculated separately.

Solution

  • To solve this, I defined a helper function called .corrExtractVarOrRes(). Depending on whether options$partialOutVariables has any entries, it returns either the original data (as before) or the residuals of X and Y regressed on the one or more variables Z to be partialled out.
  • This allows me to make changes only in the higher-level functions .corrPairwisePlot() and .corrMatrixPlot(). Any subsequent functions then use either the original data or the residuals and do not require further changes.
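
The idea behind such a helper can be sketched as follows. This is a minimal illustration in base R, not the code in this PR: the function name `extractVarOrResiduals` and its arguments are hypothetical, and it assumes listwise deletion of missing values.

```r
# Hypothetical helper (not the PR's implementation): return the raw x/y values,
# or their residuals after regressing each on the partialled-out variables.
extractVarOrResiduals <- function(data, xName, yName, partialOut = character(0)) {
  cols <- c(xName, yName, partialOut)
  # Keep only complete cases so that x and y residuals stay aligned row by row.
  d <- data[stats::complete.cases(data[, cols, drop = FALSE]), cols, drop = FALSE]

  # No variables to partial out: hand back the original data.
  if (length(partialOut) == 0L)
    return(data.frame(x = d[[xName]], y = d[[yName]]))

  # Regress x and y separately on all partialled-out variables.
  z    <- as.matrix(d[, partialOut, drop = FALSE])
  resX <- stats::residuals(stats::lm(d[[xName]] ~ z))
  resY <- stats::residuals(stats::lm(d[[yName]] ~ z))
  data.frame(x = unname(resX), y = unname(resY))
}

res <- extractVarOrResiduals(iris, "Petal.Length", "Sepal.Length", "Petal.Width")
# The Pearson correlation of the two residual vectors is the partial correlation.
cor(res$x, res$y)
```

Because the Pearson correlation of these residuals equals the partial correlation of X and Y given Z, a scatter plot built from them would match the statistic that is already displayed.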

Modified Analysis - No partial correlation
Screenshot From 2025-08-04 13-01-04

Modified Analysis - Partial correlation
Screenshot From 2025-08-04 13-01-42

Modified Analysis - Multiple variables to partial out + Confidence & Prediction Intervals
Screenshot From 2025-08-04 13-03-50

Further Notes

  • Please note that the displayed residuals are not rank-transformed, as you would expect for Spearman's rho. However, this was not a feature before either. Since the interface of .corrPartialResiduals includes the options parameter, the function could be extended to return rank-transformed residuals when the Spearman option is chosen.
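
One possible reading of "rank-transformed residuals" is sketched below, under the assumption that the ranks of X, Y and Z are computed first and the residuals are then taken from regressions on the ranked Z. This relies on Spearman's rho being the Pearson correlation of ranks. The function name `spearmanResiduals` is hypothetical and not part of this PR.

```r
# Hypothetical sketch: residuals of rank(x) and rank(y) after regressing on ranked z.
spearmanResiduals <- function(x, y, z) {
  z  <- as.matrix(z)        # accepts a vector or several columns
  rz <- apply(z, 2, rank)   # rank each partialled-out variable (ties get average ranks)
  rx <- rank(x)
  ry <- rank(y)
  data.frame(
    x = unname(stats::residuals(stats::lm(rx ~ rz))),
    y = unname(stats::residuals(stats::lm(ry ~ rz)))
  )
}

spRes <- spearmanResiduals(iris$Petal.Length, iris$Sepal.Length, iris$Petal.Width)
# Pearson correlation of these residuals: a partial correlation computed on ranks.
cor(spRes$x, spRes$y)
```

Whether the plot should show these rank residuals or some other transformation for Spearman's rho is a design choice that this sketch does not settle.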

…siduals of X and Y regressed on one/multiple variable(s) Z for partial correlations.
@coopa33
Author

coopa33 commented Aug 4, 2025

@JohnnyDoorn Hi Johnny, I'm not able to request a review from you here, probably because I'm not a collaborator for this module. Would you be able to review my fix for the issue?

@mupeker

mupeker commented Aug 6, 2025

When you place the two graphs side by side, you can provide the following explanation (perhaps in the help section):

"The normal correlation graph shows the relationship between the raw (original) values of the X and Y variables. The partial correlation graph reflects the relationship between X and Y after the effect of the Z variable has been removed. Therefore, the axes of the partial correlation graph represent the “residual” values obtained from regression models rather than the original variable values. As seen, when the effect of the Z variable is controlled, the relationship between X and Y exhibits a [strengthened/weakened/changed direction] structure."

coopa33 added 2 commits August 8, 2025 01:19
…tly that axes represent residuals for PC. Added same info in help sections.
@coopa33
Author

coopa33 commented Aug 7, 2025

I have added a caption under the scatterplot, stating explicitly that the axes represent residuals:
Screenshot From 2025-08-08 01-25-04

This also works with multiple variables to be controlled for:
Screenshot From 2025-08-08 01-26-16

However, the function I use to do this unfortunately doesn't work with jaspGraphs::ggMatrixPlot objects (yet).
So, for matrix plots, it displays the caption under every scatterplot instead of only once under the complete matrix. This can look quite cluttered, but can be fixed by adjusting the size of the plot (this seems to work because within the matrix, the scatterplots are individual jaspPlotR objects):
Screenshot From 2025-08-08 01-30-34

But for the matrix format (if display pairwise is not clicked), it cannot be adjusted and just looks cluttered, even when making the matrix object bigger. Also, if the variables controlled for are identical for every combination, it is unnecessary to caption every plot:
Screenshot From 2025-08-08 01-40-29

I'll work on making the function work with ggMatrixPlot objects. If anyone has some input on how to make this work, pls let me know :)

@mupeker

  • I've added additional description in the help section. The same should also now be visible in the information box that appears if you hover over the plot option.
  • Is the caption explicit enough, and is it a good idea? Otherwise, I was thinking we could have the X/Y labels state this with some kind of prefix (like Residual: "variableName"), but that might be even more cluttered then?

coopa33 added 15 commits August 16, 2025 13:31
Once again forgot to pull the repository before working on it
…ecause captions loaded only after plots were loaded"

This reverts commit b535117.

#### Plots
- Scatter plots: Display a scatter plots for each possible combination of the selected variables. In a matrix format, these are placed above the diagonal.
- Scatter plots: Displays scatter plots for all variable pairs. In a matrix format, these are placed above the diagonal. For partial correlations, plots show the relationship between X and Y after removing the effect of Z, and axes then represent residuals from regressing X and Y on Z.

I suggest replacing the conjunction "and" with "Therefore" in the JASP help text, specifically in this sentence:

Current: "For partial correlations, plots show the relationship between X and Y after removing the effect of Z**, and** axes then represent residuals from regressing X and Y on Z."

Suggested: "For partial correlations, plots show the relationship between X and Y after removing the effect of Z**. Therefore,** axes then represent residuals from regressing X and Y on Z."

This modification improves the flow of the sentence and makes clear that the second part of the statement is a logical consequence of the first. The help text becomes more explanatory, and users can more easily understand the core concept behind partial correlation plots.

Author

Good catch, in that case we could make it even more explicit? As in: "In that case, axes represent residuals from regressing X and Y on Z, instead of the raw X and Y variables", or something of the sort?


Thank you for the quick response. I agree with your suggestion to make the wording more explicit. Your proposed phrase clarifies the difference between the residuals and the raw variables, which is the key point for a correct interpretation of the plot. Your suggested wording would be a significant improvement.

@coopa33
Author

coopa33 commented Aug 19, 2025

So, after talking to @vandenman I understand that grid will be deprecated and will be replaced by patchwork in the near future. At that time, it will be possible to add captions. (see here).
Through the existing interface for jaspGraphs it is not possible to safely append a caption to a matrix plot, so there are two possibilities until the move to patchwork:

  • The caption is under each scatterplot in a matrix grid (as seen previously: b8fd878)
  • Instead of a caption, we append the axis labels. Looks like this:
Screenshot From 2025-08-19 13-36-35 Screenshot From 2025-08-19 13-37-32
  • I chose "Residual()" as this is the most explicit, but we could also shorten it

@JohnnyDoorn
Contributor

JohnnyDoorn commented Aug 21, 2025

Really great stuff @coopa33! One suggestion for clarifying that it concerns the residuals is to tweak the name of the plot (the title that is listed right below "Scatter plots"):
"petal.length vs. sepal.length" -> "Residuals of petal.length vs. sepal.length, after controlling for petal.width", although that could become quite lengthy and I think it's already good as is with the residuals on the axis labels (especially if we improve it after the patchwork update).

Also, is the stuff in renv/activate.R necessary?

@coopa33
Author

coopa33 commented Aug 21, 2025

@JohnnyDoorn Then if we tweak the title to explicitly state that these are residuals, we could also shorten the axes labels and make it less cluttered (Res: <varName>). I will try out if it works and how it looks and let you know. Thanks!

Edit: I will delete the renv/ directory. I think it was automatically generated when I tried to install the dependencies.

…nction to work for any condition and with prefix and suffix specs.
…nything if pre/suffix arguments are too long
@coopa33
Author

coopa33 commented Aug 22, 2025

@JohnnyDoorn, this is how it looks after incorporating your feedback, I think actually it's not that cluttered!

Single partial variable
Screenshot From 2025-08-22 13-22-31

Multiple partial variables
Screenshot From 2025-08-22 13-57-25

@JohnnyDoorn
Contributor

JohnnyDoorn commented Sep 1, 2025

Great work @coopa33!
I took a look at how it looks for linear regression (partial plots), and there we use "Residuals <'var name'>" on the axes, so maybe you can also do that here, for consistency? I do like that "controlling for.." is included in correlation, so I will look into also including that in the plot names for linear regression.

@coopa33
Author

coopa33 commented Sep 1, 2025

@JohnnyDoorn Glad to help!
I'll take a look!

@coopa33
Author

coopa33 commented Sep 1, 2025

@JohnnyDoorn
Screenshot From 2025-09-01 18-14-08

@JohnnyDoorn
Contributor

Sorry for being unclear - I meant to use the same convention here (i.e., correlation), as we do there (i.e., regression) - so to use "Residuals 'varname'" for the partial correlation plots - I think just "Res 'varname'" is not entirely clear. The change I suggested for the partial plots in regression is to have the "controlled for.." in the plot title.
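
The labelling convention discussed here could be sketched as below. This is only an illustration: the function name `partialPlotLabels` and the exact title wording are assumptions, not the strings used in JASP.

```r
# Hypothetical sketch of the labelling convention; wording is illustrative only.
partialPlotLabels <- function(xName, yName, partialOut = character(0)) {
  # Without partialled-out variables the plot shows the raw variables.
  if (length(partialOut) == 0L)
    return(list(title = paste(xName, "vs.", yName), xlab = xName, ylab = yName))

  list(
    # The plot title names the variables that were controlled for.
    title = sprintf("%s vs. %s, controlling for %s",
                    xName, yName, paste(partialOut, collapse = ", ")),
    # Axis labels state that residuals, not raw values, are plotted.
    xlab  = paste("Residuals", xName),
    ylab  = paste("Residuals", yName)
  )
}

labels <- partialPlotLabels("petal.length", "sepal.length", "petal.width")
```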

@coopa33
Author

coopa33 commented Sep 3, 2025

Ah, you meant the other way around :D Sure, and should I also include the "controlling for" in the linear regression partial plots for you to check how it looks?

@JohnnyDoorn
Contributor

Yes, awesome!

@coopa33
Author

coopa33 commented Sep 3, 2025

@JohnnyDoorn also a quick question: I noticed that in linear regression a partial plot is also produced when there is only one covariate/predictor. In that case it should just be a scatterplot of the raw variables and not of residuals, right? If so, should we add a condition so that it does not show the "Residuals" prefix? Or maybe the plot should not be displayed at all for only one predictor?

@JohnnyDoorn
Contributor

@coopa33 Yes good suggestion - I would either leave it as is (consistent with multiple variables), or replace it with the scatterplot of the raw values (more logical).

@coopa33
Author

coopa33 commented Sep 9, 2025

Following was implemented:

In Linear Regression, if there is only one predictor, the Partial Regression Plot gets renamed to Regression Plot, and the labels will only display the variable names, without the 'Residual:' prefix:

Screenshot From 2025-09-09 21-18-02 Screenshot From 2025-09-09 21-18-18

However, with one predictor, clicking Partial Plots now of course does not give you a partial plot. Maybe this can be explicitly mentioned in the info panel for the Partial Plot option?

Alternatively, in 181fdf9 I implemented an error message stating that you need at least 2 predictors for partial plots
(although I wrote covariates instead of predictors, but you get the idea)

@JohnnyDoorn
Contributor

Sorry for my late reaction - I missed this update!
I prefer to still give a plot when there is only one predictor, because otherwise there's no clear way to get a scatterplot for checking linearity in regression. So I think just mentioning it in the help file entry (which shows up as a tooltip on hover in newer versions) suffices.

@JohnnyDoorn
Contributor

Hi @coopa33,

Can this be merged? There will be many people appreciating this update!

@coopa33
Author

coopa33 commented Nov 11, 2025

Hi @JohnnyDoorn , didn't work on this for a while (Masters keeping me busy)...
As of my last update, the partial correlation plot should be correctly labelled and show the correct values. The only thing remaining was that the scatter plot for a single predictor in linear regression is labelled as a partial plot and the axes are labelled as residuals, as mentioned.
I didn't quite understand what you meant in the last comment, do you think it's better to leave it as is (single predictor plot labelled as partial plot, and axes labelled residuals) and add the info to the help file, or to change it for single predictors (labelled regression plot and axes without residuals label)?

Labels

enhancement New feature or request

4 participants