-
Notifications
You must be signed in to change notification settings - Fork 1
Use std of 0 for singleton vectors #90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov Report
@@ Coverage Diff @@
## main #90 +/- ##
=======================================
Coverage 99.25% 99.25%
=======================================
Files 12 12
Lines 135 135
=======================================
Hits 134 134
Misses 1 1
Continue to review full report at Codecov.
|
src/scaling.jl
Outdated
|
|
||
| compute_stats(x) = (mean(x), std(x)) | ||
| # Set std to 0 using corrected=false if x is a singleton | ||
| compute_stats(x) = (mean(x), std(x; corrected=(length(x) != 1))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe it's better to pass this via MeanStdScaling as a kwarg? since it's already exposed via Statistics.std anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see where you're coming from by leaving it up to the user. I guess it would just be inconvenient in the case that came up for me. I would rather change my downstream test case so that the data doesn't end up being one row, than add corrected=(length(x) != 1) to MeanStdScaling in the transform pipeline, in the source code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I think it's better to not make those kinds of assumptions in the code here, especially when Statistics.std doesn't. In the instance of edge-case we just have to handle it on that end.
| """ | ||
| MeanStdScaling(A::AbstractArray; dims=:, inds=:) -> MeanStdScaling | ||
| MeanStdScaling(table, [cols]) -> MeanStdScaling | ||
| MeanStdScaling(A::AbstractArray; dims=:, inds=:, corrected=true) -> MeanStdScaling |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just pass generic kwargs...? Or does it get confused about what to do with the dims kwarg?
Closes #89 by adding an optional kwarg
correctedwhich is passed to thestdfunction.