@nathanbrophy

This PR updates how the GitLab Enterprise / Community Edition integration handles the list_repo call. The call has a bug: the request that list_repos_get_user_and_groups makes to the GitLab API is not properly paginated, so any GitLab instance with more than 100 groups will not load fully into Codecov. As a result, after a repo sync from the UI, the user still cannot view all the repos for the configured instance, even when using the bot token config. Adding pagination to this call fixes that behavior.

Legal Boilerplate

Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. In 2022 this entity acquired Codecov and as result Sentry is going to need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.

).substitute(page=page)
groups_paged = await self.api("get", url, token=token)
groups += groups_paged
if len(groups) < 100:
Contributor

Potential bug: Incorrect pagination logic causes infinite loops and data loss when listing GitLab repos.
  • Description: The pagination loop for fetching GitLab groups has a flawed break condition. It checks the total number of accumulated groups (len(groups)) instead of the number of groups returned on the current page (len(groups_paged)). This can cause an infinite loop if an organization has enough groups to fill multiple pages (e.g., 100 or more), leading to resource exhaustion and a potential service crash. It can also lead to data loss by terminating prematurely if the first page has fewer than 100 groups.

  • Suggested fix: The break condition should check the number of items on the current page, not the accumulated total. Change if len(groups) < 100: to if len(groups_paged) < 100:.
    severity: 0.9, confidence: 1.0
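
A minimal sketch of the suggested break condition, for illustration only. It is not the code from this PR: `fetch_page`, `make_fake_api`, and `count_groups` are hypothetical names, the API is replaced by an in-memory stand-in, and the page size of 100 is assumed from the diff.

```python
import asyncio

PER_PAGE = 100  # assumed page size, matching the 100 used in the diff


async def fetch_all_groups(fetch_page):
    """Collect every group by requesting pages until a short page arrives.

    `fetch_page` is a hypothetical coroutine that takes a 1-based page
    number and returns that page's list of groups.
    """
    groups = []
    page = 1
    while True:
        groups_paged = await fetch_page(page)
        groups += groups_paged
        # Check the size of the current page, not the accumulated total.
        if len(groups_paged) < PER_PAGE:
            break
        page += 1
    return groups


def make_fake_api(total):
    """Build an in-memory stand-in for the API that holds `total` groups."""
    data = [{"id": i} for i in range(total)]

    async def fetch_page(page):
        start = (page - 1) * PER_PAGE
        return data[start:start + PER_PAGE]

    return fetch_page


def count_groups(total):
    """Run the loop against the stand-in and return how many groups it found."""
    return len(asyncio.run(fetch_all_groups(make_fake_api(total))))
```

With the per-page check, an instance whose group count is an exact multiple of 100 costs one extra request that returns an empty page, after which the loop stops.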

Author

This pagination logic was taken from elsewhere in the existing GitLab code; I opted for it to stay consistent with existing standards. The GitLab API responses include pagination headers that we should be using across the board to handle pagination properly, but I view that as out of scope for this PR, since we are following existing methods.
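
For reference, a rough sketch of what header-driven pagination could look like. It assumes the offset-pagination `X-Next-Page` response header, which GitLab leaves empty on the last page. `fetch_page`, `make_fake_api`, and `collect` are hypothetical names, and the API is simulated in memory rather than called.

```python
import asyncio


async def fetch_all_by_header(fetch_page):
    """Follow the X-Next-Page response header until it comes back empty.

    `fetch_page` is a hypothetical coroutine that takes a page number and
    returns a tuple of (items, headers) for that page.
    """
    items = []
    page = "1"
    while page:
        page_items, headers = await fetch_page(int(page))
        items += page_items
        # An empty or missing X-Next-Page value ends the loop.
        page = headers.get("X-Next-Page", "")
    return items


def make_fake_api(total, per_page=100):
    """Build an in-memory stand-in that mimics the pagination header."""
    data = list(range(total))

    async def fetch_page(page):
        start = (page - 1) * per_page
        chunk = data[start:start + per_page]
        has_more = start + per_page < total
        headers = {"X-Next-Page": str(page + 1) if has_more else ""}
        return chunk, headers

    return fetch_page


def collect(total):
    """Run the header-driven loop against the stand-in."""
    return asyncio.run(fetch_all_by_header(make_fake_api(total)))
```

Driving the loop from the header avoids hard-coding the page size in the stop condition, and it skips the extra empty-page request when the total is an exact multiple of the page size.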

@nathanbrophy there's actually a Python GitLab library that abstracts a bunch of this away, which could be a useful solution.

Author

Agreed. I think replacing the GitLab client here with the published, GitLab-sponsored one would be a good follow-up.
