Skip to content

Conversation

@RealOrangeOne
Copy link
Contributor

What is the context of this PR?

We need a way to create test data in bulk for performance testing etc.

https://officefornationalstatistics.atlassian.net/browse/CMS-948

Data creation is seeded, meaning it can be recreated reliably.

Data is identified by a prefix. To save complexity, it's only possible to delete all test data, rather than specific seeds.

How to review

There are 2 new management commands.

  • create_test_data creates data based on a seed.
  • delete_test_data removes all test data from the system (based on the prefix)

Follow-up Actions

List any follow-up actions (if applicable), like needed documentation updates or additional testing.

@MaciekBaron MaciekBaron added the component: Tooling Makefiles, linters, container instrumentation etc label Jan 2, 2026
This keeps the command parts separate
By default, they enqueue tasks when instances are saved, which is unnecessary when many of the instances are about to be
deleted.
This factory is definitely configured incorrectly, but something about the model, dependent tests or factory itself
requires that it work this way.
They may not have been indexed yet
@RealOrangeOne RealOrangeOne marked this pull request as ready for review January 21, 2026 14:52
@RealOrangeOne RealOrangeOne requested a review from a team as a code owner January 21, 2026 14:52
missing-function-docstring,
fixme,
too-many-public-methods,
line-too-long,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe question for @MebinAbraham but I think it might not be a good idea to disable line-too-long globally, what do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line length is already verified by ruff. pylint's line-too-long verifies that the entire line (including comments) is under a given size. That means a line can be valid with ruff, but fail with pylint simply because any ignore comments push it over the threshold. That distinction isn't really useful, so this just lets ruff handle it.


class PageCreationConfig(ModelCreationConfig):
published: FractionalFloat = 0.5
revisions: NonNegativeInt | RangeConfig = 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I misunderstood, aren't all Pages guaranteed to have at least one revision in a normal environment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the revision is the history, so the first draft save of a page won't necessarily have a revision. That's my understanding at least, I may very well be wrong.



class PageCreationConfig(ModelCreationConfig):
published: FractionalFloat = 0.5
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a better name would be published_ratio or published_probability.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation is probability rather than ratio, so I'll go with that.

Naming things is hard!

"dataset_manual_links": 1,
"explore_more": 2,
"published": 1,
"revisions": {"min": 1, "max": 3},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: Could we change the range min to 2? This way we test that the range works, and that it doesn't just create one item for any given range (if it always generates 1 item, the tests on line 70 and 71 will always pass).

@@ -0,0 +1 @@
SEEDED_DATA_PREFIX = "Z-RANDOM "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: Would it be better for this file to be named constants.py?

Prior art: https://github.com/ONSdigital/dis-wagtail/blob/main/cms/datavis/constants.py

utils.py usually just contain utility functions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I think I'd imagined this module having more "utils", but never ended up writing any.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

component: Tooling Makefiles, linters, container instrumentation etc

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants