-
Notifications
You must be signed in to change notification settings - Fork 1k
Adding support for one level nested data.table #7568
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #7568 +/- ##
==========================================
- Coverage 98.98% 98.92% -0.07%
==========================================
Files 87 87
Lines 16729 16784 +55
==========================================
+ Hits 16560 16603 +43
- Misses 169 181 +12 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
NEWS.md
Outdated
|
|
||
| 1. `nafill()`, `setnafill()` extended to work on logical vectors (part of [#3992](https://github.com/Rdatatable/data.table/issues/3992)). Thanks @jangorecki for the request and @MichaelChirico for the PR. | ||
|
|
||
| 2. `tables()` can now optionally report `data.table` objects stored one level deep inside list objects when `list_search=TRUE`, with `list_len_threshold` to avoid scanning extremely long lists, [#2606](https://github.com/Rdatatable/data.table/issues/2606). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An example of when this new feature could be useful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @jangorecki,
To support lists such as
list(data.table(a = 1, b = 4:6)),
data.table(a = 2, b = 7:10))
which can occure due to split.data.table or fread .
The previous code supported only data.table() top level and this code adds support for list( data.table...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that if we dont want this search we can opt out for it maintaining backward compatiblity and performance as previous.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jangorecki
Here is an example of previous and current versions
Reproducing the table() on list item and data.table
Here are the results as mentioned it doesnt take list(data.table(..), data.table(..))
After the PR, adding list_search
…e#7259) * allow showProgress=INTEGER to set progress bar update time * add NEWS * cleanup merge * refine NEWS * make check more explicit * adjust docs * adjust nocov to what they really should cover * phrasing comment Co-authored-by: Michael Chirico <[email protected]> * Update NEWS.md Co-authored-by: Michael Chirico <[email protected]> * refine NEWS * update man * remove as.integer from signature * remove unnecessary brackets --------- Co-authored-by: Michael Chirico <[email protected]>
…#7572) * fix(7571): bug fix for narm issue on gforce in int64 case * fix(7571): test sequencing * fix(7571): updated the NEWS.md * trailing newline * Use $V1 * fix(7571): added db optimize 2L * refine NEWS * fix(7571): add more tests and change to code similar to int for gsum * fix(7571): added more tests for mean * eliminate intermediate variable * NEWS again --------- Co-authored-by: Michael Chirico <[email protected]>
|
Will open another PR for the same from internal branch |
Closes #2606
added 2 args to
tables()one for shallow search and one to skip if the data.table is too bigif
list_searchis false then its the previous codeelse we loop through list-like objects using
is.listadded name for the nested list found
we skip list scanning beyond threshold
we merge to the nested list to the main list if found else it follows the old process