-
Notifications
You must be signed in to change notification settings - Fork 4
CMS-948: data creation scripts #445
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This lets the signals be sent when running non-wagtail admin codepaths.
0a987a7 to
584053f
Compare
…d reusing stock delete functionality
The MP_Node factories need `parent` in `django_get_or_create`, and to be using `TreeQuerySet` else the needed methods don't exist.
Because the topic tree hides the root node in a number core method, it gets corrupted during deletion.
This keeps the command parts separate
By default, they enqueue tasks when instances are saved, which is unnecessary when many of the instances are about to be deleted.
This factory is definitely configured incorrectly, but something about the model, dependent tests or factory itself requires that it work this way.
They may not have been indexed yet
No need to be less informative when actually deleting.
| missing-function-docstring, | ||
| fixme, | ||
| too-many-public-methods, | ||
| line-too-long, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe question for @MebinAbraham but I think it might not be a good idea to disable line-too-long globally, what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Line length is already verified by ruff. pylint's line-too-long verifies that the entire line (including comments) is under a given size. That means a line can be valid with ruff, but fail with pylint simply because any ignore comments push it over the threshold. That distinction isn't really useful, so this just lets ruff handle it.
|
|
||
| class PageCreationConfig(ModelCreationConfig): | ||
| published: FractionalFloat = 0.5 | ||
| revisions: NonNegativeInt | RangeConfig = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless I misunderstood, aren't all Pages guaranteed to have at least one revision in a normal environment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the revision is the history, so the first draft save of a page won't necessarily have a revision. That's my understanding at least, I may very well be wrong.
cms/test_data/config.py
Outdated
|
|
||
|
|
||
| class PageCreationConfig(ModelCreationConfig): | ||
| published: FractionalFloat = 0.5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a better name would be published_ratio or published_probability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation is probability rather than ratio, so I'll go with that.
Naming things is hard!
cms/test_data/tests/test_commands.py
Outdated
| "dataset_manual_links": 1, | ||
| "explore_more": 2, | ||
| "published": 1, | ||
| "revisions": {"min": 1, "max": 3}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: Could we change the range min to 2? This way we test that the range works, and that it doesn't just create one item for any given range (if it always generates 1 item, the tests on line 70 and 71 will always pass).
| @@ -0,0 +1 @@ | |||
| SEEDED_DATA_PREFIX = "Z-RANDOM " | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: Would it be better for this file to be named constants.py?
Prior art: https://github.com/ONSdigital/dis-wagtail/blob/main/cms/datavis/constants.py
utils.py usually just contain utility functions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I think I'd imagined this module having more "utils", but never ended up writing any.
What is the context of this PR?
We need a way to create test data in bulk for performance testing etc.
https://officefornationalstatistics.atlassian.net/browse/CMS-948
Data creation is seeded, meaning it can be recreated reliably.
Data is identified by a prefix. To save complexity, it's only possible to delete all test data, rather than specific seeds.
How to review
There are 2 new management commands.
create_test_datacreates data based on a seed.delete_test_dataremoves all test data from the system (based on the prefix)Follow-up Actions
List any follow-up actions (if applicable), like needed documentation updates or additional testing.