Helper function to plot residuals in partial correlation scatter plots #417
…siduals of X and Y regressed on one/multiple variable(s) Z for partial correlations.
|
@JohnnyDoorn Hi Johnny, I'm not able to request a review from you here, probably because I'm not a collaborator for this module. Would you be able to review my fix for the issue? |
|
When you place the two graphs side by side, you can provide the following explanation (perhaps in the help section): "The normal correlation graph shows the relationship between the raw (original) values of the X and Y variables. The partial correlation graph reflects the relationship between X and Y after the effect of the Z variable has been removed. Therefore, the axes of the partial correlation graph represent the “residual” values obtained from regression models rather than the original variable values. As seen, when the effect of the Z variable is controlled, the relationship between X and Y exhibits a [strengthened/weakened/changed direction] structure." |
…tly that axes represent residuals for PC. Added same info in help sections.
…aptions loaded only after plots were loaded
Once again forgot to pull the repository before working on it
This reverts commit 11d31f8.
…ecause captions loaded only after plots were loaded" This reverts commit b535117.
This reverts commit 01640af.
…ble caption" This reverts commit adc103f.
This reverts commit a828314.
…llow for captions" This reverts commit 1d32af7.
inst/help/Correlation.md
#### Plots
- - Scatter plots: Display a scatter plots for each possible combination of the selected variables. In a matrix format, these are placed above the diagonal.
+ - Scatter plots: Displays scatter plots for all variable pairs. In a matrix format, these are placed above the diagonal. For partial correlations, plots show the relationship between X and Y after removing the effect of Z, and axes then represent residuals from regressing X and Y on Z.
It is suggested to replace the conjunction "and" with "Therefore" in the help text of JASP, specifically in the sentence:
"For partial correlations, plots show the relationship between X and Y after removing the effect of Z**, and** axes then represent residuals from regressing X and Y on Z."
"For partial correlations, plots show the relationship between X and Y after removing the effect of Z**. Therefore,** axes then represent residuals from regressing X and Y on Z."
This modification improves the flow of the sentence and clearly highlights that the second part of the statement is a logical consequence of the first. This makes the help text more explanatory and easier for users to understand the core concept behind partial correlation plots.
Good catch, in that case we could make it even more explicit? As in: "In that case, axes represent residuals from regressing X and Y on Z, instead of the raw X and Y variables", or something of the sort?
Thank you for the quick response. I agree with your suggestion to make the wording more explicit. Your proposed phrase clarifies the difference between the residuals and the raw variables, which is the key point for a correct interpretation of the plot. Your suggested wording would be a significant improvement.
|
So, after talking to @vandenman I understand that grid will be deprecated and will be replaced by patchwork in the near future. At that time, it will be possible to add captions. (see here).
|
|
Really great stuff @coopa33! One suggestion for clarifying that it concerns the residuals is to tweak the name of the plot (the title that is listed right below "Scatter plots"). Also, is the stuff in renv/activate.R necessary? |
|
@JohnnyDoorn Then if we tweak the title to explicitly state that these are residuals, we could also shorten the axes labels and make it less cluttered (Res: <varName>). I will try out if it works and how it looks and let you know. Thanks! Edit: I will delete the renv/ directory. I think it was automatically generated when I tried to install the dependencies. |
…nction to work for any condition and with prefix and suffix specs.
…nything if pre/suffix arguments are too long
|
@JohnnyDoorn, this is how it looks after incorporating your feedback, I think actually it's not that cluttered! |
|
Great work @coopa33! |
|
@JohnnyDoorn Glad to help! |
|
Sorry for being unclear - I meant to use the same convention here (i.e., correlation), as we do there (i.e., regression) - so to use "Residuals 'varname'" for the partial correlation plots - I think just "Res 'varname'" is not entirely clear. The change I suggested for the partial plots in regression is to have the "controlled for.." in the plot title. |
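For readers following the labelling discussion, the convention proposed above can be sketched in a few lines. This is a plain Python illustration, not the module's R code: the function names `residual_axis_label` and `partial_plot_title`, the "X vs. Y" base title, and the exact "controlling for" wording are assumptions made for the example.

```python
def residual_axis_label(var_name, prefix="Residuals"):
    """Axis label for a partial plot, e.g. 'Residuals contNormal' (hypothetical helper)."""
    return f"{prefix} {var_name}"


def partial_plot_title(x_name, y_name, control_names):
    """Plot title that names the partialled-out variables (hypothetical helper).

    With no control variables the title is the plain 'X vs. Y' form.
    """
    base = f"{x_name} vs. {y_name}"
    if not control_names:
        return base
    return f"{base}, controlling for {', '.join(control_names)}"


x_label = residual_axis_label("contNormal")
title = partial_plot_title("contNormal", "contGamma", ["contExpon", "contWide"])
print(x_label)
print(title)
```

Putting the control variables in the title keeps the axis labels short, which is the trade-off discussed in this thread.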
|
Ah, you meant the other way around :D Sure, and should I also include the "controlling for" in the linear regression partial plots for you to check how it looks? |
…ing label prefix for matrixplot
|
Yes, awesome! |
|
@JohnnyDoorn also a quick question: I noticed that in linear regression a partial plot is also produced when there is only one covariate/predictor. In that case it should just be a scatterplot of the raw variables and not of residuals, right? If so, should we add control flow so that it does not show the "Residuals" prefix? Or maybe the plot should not be displayed for only one predictor? |
|
@coopa33 Yes good suggestion - I would either leave it as is (consistent with multiple variables), or replace it with the scatterplot of the raw values (more logical). |
… to Regression Plot. Also remove residuals from x and y labels
|
Following was implemented: In Linear Regression, if there is only one predictor, the Partial Regression Plot gets renamed to Regression Plot, and the labels will only display the variable names, without the 'Residual:' prefix:
Although now, for one predictor, if you click Partial Plots, you of course do not get a partial plot. But maybe this can be explicitly mentioned in the info panel for the Partial Plot option? Otherwise, I also implemented a version that only displays an error stating that you need at least 2 predictors for partial plots, in 181fdf9 |
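The control flow described in this comment could look roughly like the sketch below. It is plain Python for illustration, not the R code from the commits: the function name `regression_plot_spec`, its arguments, and the returned dictionary are hypothetical, and only the two plot titles are taken from the comment.

```python
def regression_plot_spec(predictor, dependent, n_predictors, prefix="Residuals"):
    """Choose title and axis labels depending on the number of predictors.

    Hypothetical helper: with a single predictor nothing is partialled out,
    so the plot shows raw values and the labels carry no prefix.
    """
    if n_predictors < 1:
        raise ValueError("at least one predictor is required")
    if n_predictors == 1:
        return {"title": "Regression Plot", "xlab": predictor, "ylab": dependent}
    return {
        "title": "Partial Regression Plot",
        "xlab": f"{prefix} {predictor}",
        "ylab": f"{prefix} {dependent}",
    }


single = regression_plot_spec("contNormal", "contGamma", n_predictors=1)
multiple = regression_plot_spec("contNormal", "contGamma", n_predictors=3)
print(single)
print(multiple)
```

The alternative mentioned above (an error when fewer than 2 predictors are selected) would replace the `n_predictors == 1` branch with a raised error instead of a raw scatter plot.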
|
Sorry for my late reaction - I missed this update! |
|
Hi @coopa33, Can this be merged? There will be many people appreciating this update! |
|
Hi @JohnnyDoorn , didn't work on this for a while (Masters keeping me busy)... |
Solution for issue #3507:
The scatter plots for the correlation analysis did not show the residuals of X and Y regressed on the partialled-out variables Z for partial correlations, which they should.
This was not only the case for the datapoints, but also for the marginal distributions, which also did not change when one or more partial-out variables were added.
However, the plotted statistics did change while the plots did not.
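To make the intended behaviour concrete, here is a minimal sketch of the idea in plain Python (standard library only, not the module's R implementation). It regresses X and Y on an intercept plus the control variables, keeps the residuals, and checks that their Pearson correlation matches the textbook first-order partial correlation formula. The helper names `solve_linear`, `residualize`, and `pearson`, and the toy data, are assumptions made for this example.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("control variables are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def residualize(y, controls):
    """Residuals of y after OLS regression on an intercept plus the control columns."""
    design = [[1.0] + [c[i] for c in controls] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    return [yi - sum(b * v for b, v in zip(beta, row)) for row, yi in zip(design, y)]


def pearson(a, b):
    """Pearson correlation of two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    return cov / math.sqrt(sum((u - ma) ** 2 for u in a) * sum((v - mb) ** 2 for v in b))


# Toy data (made up for illustration).
x = [2.0, 1.0, 4.0, 3.0, 6.0, 8.0]
y = [1.0, 3.0, 2.0, 5.0, 4.0, 7.0]
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# These residuals are what the partial correlation scatter plot should display.
res_x = residualize(x, [z])
res_y = residualize(y, [z])
r_partial_from_residuals = pearson(res_x, res_y)

# Cross-check against the first-order partial correlation formula.
r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
r_partial_formula = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

print(r_partial_from_residuals, r_partial_formula)
```

Passing several columns in `controls` (for example `residualize(x, [z1, z2])`) covers the case of multiple partialled-out variables. Marginal distributions drawn from `res_x` and `res_y` would then change together with the plotted points.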
Original Analysis - No partial correlation

Original Analysis - Partial correlation

Problem
Solution
Modified Analysis - No partial correlation

Modified Analysis - Partial correlation

Modified Analysis - Multiple variables to partial out + Confidence & Prediction Intervals

Further Notes