-
Notifications
You must be signed in to change notification settings - Fork 11
feat: add aggregated data collector API #314
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add aggregated data collector API #314
Conversation
833d135 to
fe44aaa
Compare
d7ac416 to
12ed527
Compare
felipemontoya
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me there are serious concerns about the security of this feature. We have it on a test environment and I will do all my testing there before I requests changes. However I need to voice the concern that "as is" this must not be merged.
|
Refactor:
|
|
@luisfelipec95 podríá pedirte que el setting sea: AGGREGATED_DATA_COLLECTOR_API_ENABLED ? Quiero que si alguien ve ese setting no le entre la duda de si los datos son PII |
Done |
…ta collector endpoint
…ATA_COLLECTOR_API_ENABLED
52e4111 to
b9e0f25
Compare
magajh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@luisfelipec95 Thanks for all the work!
We need to add tests for this implementation. I also noticed that the functions don’t use type hints—we should add them.
|
Also @luisfelipec95, could you please specify how this has been tested? |
magajh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@luisfelipec95 thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! We can approve and merge this once the missing docstring is added and the unit tests are fixed
felipemontoya
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good. Must be explicitly enabled, the queries are fixed and completely anonymous. The API call is securely authenticated.
Nobody that does not want this enabled will get it by accident.
Description
This PR introduces a new
Data Collector API. The purpose of this API is to enable Open edX instances to collect and send content-related data to a central service (e.g., the Shipyard API). This lays the foundation for the Hosting division to make data-driven decisions and improve the overall service offering.The new API is part of the
data-collectormodule, ensuring clear separation from existing APIs in thedatadirectory for better modularity and future extensibility.Rationale
data-collector) allows for easier maintenance and potential migration if required in the future.Key Features
Authentication:
API Design:
/eox-core/data-api/v1/collect-data/query_file_content: Content of a YAML file containing SQL queries.query_file_url: A URL pointing to a publicly accessible YAML file with queries.destination_url: The URL to which the results will be sent.query_file_contentorquery_file_urlis required.Async Task Execution:
generate_report) processes the provided queries:edxappdatabase.destination_url.Validation:
Logging and Error Handling:
How to Test
eox-coreplugin installed.{ "query_file_content": "queries:\n - name: 'Example Query'\n query: 'SELECT COUNT(*) FROM auth_user;'", "destination_url": "https://example.com/api/v1/receive-data/" }Notes