
support virtual packages on generic git hosts (Gitea)#587

Open
ganesanviji wants to merge 3 commits into microsoft:main from ganesanviji:feat/genric-host-gitea-private


@ganesanviji

Description

Add support for installing virtual packages from self-hosted Git services like Gitea. Currently, APM only supports virtual packages (subdirectories) on GitHub. This change enables users with Gitea to install packages from subdirectories within repositories.

Changes:

  • Enhanced virtual package detection in DependencyReference to recognize subdirectory packages on generic Git hosts (any FQDN)
  • Added authenticated raw file downloads for private repositories on generic hosts
  • Updated API endpoint from /api/v3 to /api/v1 for better compatibility with Gitea and other Git services
  • Maintains full backward compatibility with existing GitHub functionality

More details about the changes:
✅ Change 1: Virtual Package Detection (reference.py)

Analysis: This only affects generic Git hosts, not GitHub. Allows subdirectory packages to be detected as virtual even without specific file extensions. Safe because:

  • GitHub uses a separate logic path (is_generic_host = False)
  • Validation still requires package markers (apm.yml, SKILL.md, etc.) in the subdirectory
  • No impact on existing GitHub virtual file detection

✅ Change 2: Authenticated Raw Downloads (github_downloader.py)

Analysis: Improves private repo support. Safe because:

  • Only applies to generic hosts, not GitHub
  • Falls back to the API if the raw download fails
  • Uses the standard Authorization header format

✅ Change 3: API Endpoint Update

Analysis: Gitea uses /api/v1/, GitHub uses /api/v3/. Safe because:

  • GitHub still uses /api/v3/
  • Gitea API v1 is compatible for the contents endpoint
  • Falls back gracefully if the endpoint doesn't exist

Motivation:
Enterprise teams using self-hosted Git services (Gitea) cannot currently use APM to install packages from repository subdirectories. This is a significant limitation for organizations that don't use GitHub. These changes enable APM to work seamlessly across all Git hosting platforms.

Type of change

  • New feature
  • Bug fix
  • Documentation
  • Maintenance / refactor

Testing

  • Tested locally

    • Gitea virtual package parsing: PASS
    • GitHub virtual file parsing: PASS (unchanged)
    • Regular repo parsing: PASS (unchanged)
  • All existing tests pass

    • Code validated with custom test cases for Gitea URLs
    • Backward compatibility verified for GitHub usage
  • Added tests for new functionality (if applicable)

    • Validated with multiple test scenarios

@ganesanviji
Author

@microsoft-github-policy-service agree

@danielmeppiel
Collaborator

Review Feedback

Thanks @ganesanviji for adding Gitea support! The raw URL download approach is a good idea. A few issues need addressing:

1. API version change breaks GitLab (critical)

Changing /api/v3/ to /api/v1/ fixes Gitea but breaks GitLab (which uses /api/v4/). The current /api/v3/ also doesn't work for Gitea, so the real fix is per-host API version detection.

Options:

  • Preferred: Try the raw URL path first (your new code), then fall back to API with version negotiation (try v1, then v3, then v4)
  • Alternative: Make API version configurable per host in marketplace or auth config

2. Virtual package detection too broad

len(path_segments) > 2 would treat any path with 3+ segments as virtual. For example, gitea.example.com/owner/repo has exactly 2 segments (owner + repo) but gitea.example.com/owner/repo/subdir has 3. The current logic (has_virtual_ext or has_collection) is more precise. Could you check if the issue is specifically that Gitea paths aren't being detected, and narrow the condition?
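To make the concern concrete, here is a minimal sketch contrasting the two conditions. The helper names, the one-entry extension tuple, and the way a collections path is recognized are assumptions for illustration, not APM's actual code.

```python
# Illustrative only: helper names and the extension list are assumptions.
VIRTUAL_EXTENSIONS = (".prompt.md",)  # assumed subset of the real list


def is_virtual_broad(path_segments):
    """Segment-count check: anything deeper than owner/repo counts as virtual."""
    return len(path_segments) > 2


def is_virtual_precise(path_segments):
    """Indicator check: require a virtual file extension or a collections segment."""
    has_virtual_ext = bool(path_segments) and path_segments[-1].endswith(VIRTUAL_EXTENSIONS)
    has_collection = "collections" in path_segments
    return has_virtual_ext or has_collection


nested_group_repo = ["group", "subgroup", "repo"]   # a plain repo in a nested group
virtual_file = ["owner", "repo", "file.prompt.md"]  # a virtual file package

print(is_virtual_broad(nested_group_repo))    # True  (a plain repo misread as virtual)
print(is_virtual_precise(nested_group_repo))  # False
print(is_virtual_precise(virtual_file))       # True
```

The segment-count check cannot tell a nested-group repository from a subdirectory package, which is why the indicator-based condition is the safer default.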

3. Bare except: pass (line ~1069)

Please catch specific exceptions:

except (requests.RequestException, OSError):
    pass
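A minimal sketch of why the narrower clause matters, assuming a hypothetical `try_download` wrapper and an injected `fetch` callable. It is stdlib-only, so `urllib.error.URLError` stands in for `requests.RequestException` here.

```python
import urllib.error


def try_download(fetch, url):
    """Return fetch(url), or None if a network or I/O error occurs.

    `fetch` is a hypothetical callable supplied by the caller.
    """
    try:
        return fetch(url)
    except (urllib.error.URLError, OSError):
        # Expected failure modes: let the caller fall through to the next strategy.
        return None
```

Unlike a bare `except:`, this lets programming errors (for example a `ValueError` raised inside `fetch`) and `KeyboardInterrupt` propagate instead of being silently swallowed.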

4. No unit tests

Please add tests for:

  • Gitea raw URL download succeeds
  • GitLab API URL still works (regression test)
  • Virtual package detection for generic hosts

Relationship with PR #584

This PR complements #584 (which fixes the validation/ls-remote path). They don't conflict and can merge independently.

Collaborator

@danielmeppiel left a comment


As per previous comment

@ganesanviji
Author

Hi @danielmeppiel,

Thanks for the review. I have addressed all of the suggestions:

1. API version change breaks GitLab (critical)

Addressed with the preferred approach. For non-GitHub/GHE hosts we now attempt
the raw URL path first:

https://{host}/{owner}/{repo}/raw/{ref}/{file_path}

If that returns a non-200 we fall through to API version negotiation, trying
v1 -> v3 -> v4 in order. This covers Gitea (v1), legacy Gogs (v3), and
GitLab (v4) without hardcoding anything per host. GitHub and GHE continue to
use their existing code paths unchanged.
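The ordering described above can be sketched as follows. This is not the PR's code: `download_with_fallback` and the injected `fetch` callable are hypothetical, and the API URL shape is an assumption modeled on a Gitea-style contents endpoint (other hosts, GitLab in particular, lay out their file endpoints differently).

```python
API_VERSIONS = ("v1", "v3", "v4")  # negotiation order described above


def download_with_fallback(host, owner, repo, ref, file_path, fetch):
    """Try the raw URL first, then each API version in order.

    `fetch(url)` is a hypothetical callable that returns bytes on HTTP 200
    and None on any other outcome.
    """
    raw_url = f"https://{host}/{owner}/{repo}/raw/{ref}/{file_path}"
    content = fetch(raw_url)
    if content is not None:
        return content

    for version in API_VERSIONS:
        # Assumed URL shape; the real per-host endpoints may differ.
        api_url = (
            f"https://{host}/api/{version}/repos/{owner}/{repo}"
            f"/contents/{file_path}?ref={ref}"
        )
        content = fetch(api_url)
        if content is not None:
            return content
    return None  # every strategy failed
```

Injecting `fetch` keeps the ordering logic testable without network access; a real implementation would also attach the Authorization header for private repositories.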


2. Virtual package detection too broad

We did not use len(path_segments) > 2. The existing
has_virtual_ext or has_collection guard is kept intact. The only change is
the else branch (no virtual indicator present):

  • GitLab (gitlab.com or any gitlab.* hostname): keeps
    min_base_segments = len(path_segments) -- the full path is the repo,
    preserving nested-group support.
  • All other generic hosts (Gitea, Bitbucket, self-hosted git, etc.): uses
    min_base_segments = 2 -- owner/repo convention, any extra segments are
    treated as a virtual subdirectory path.

The distinction is driven by a new is_gitlab_hostname() helper added to
github_host.py.
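A minimal sketch of the else-branch behavior described above. Only the name `is_gitlab_hostname` comes from this PR; its body here is a guess at the described behavior, and `split_repo_and_virtual_path` is a hypothetical helper.

```python
def is_gitlab_hostname(hostname):
    """True for gitlab.com or any gitlab.* hostname, case-insensitively."""
    if not hostname:
        return False  # covers None and ""
    host = hostname.lower()
    return host == "gitlab.com" or host.startswith("gitlab.")


def split_repo_and_virtual_path(hostname, path_segments):
    """Split segments into (repo_path, virtual_path) when no virtual indicator is present."""
    if is_gitlab_hostname(hostname):
        # The full path is the repo, which preserves nested groups.
        min_base_segments = len(path_segments)
    else:
        # owner/repo convention; extra segments form a virtual subdirectory.
        min_base_segments = 2
    repo_path = "/".join(path_segments[:min_base_segments])
    virtual_path = "/".join(path_segments[min_base_segments:]) or None
    return repo_path, virtual_path
```

For example, `split_repo_and_virtual_path("gitea.example.com", ["owner", "repo", "subdir"])` yields `("owner/repo", "subdir")`, while the same three segments on `gitlab.com` stay a single repository path with no virtual part.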


3. Bare except: pass

Fixed. The catch at that location is now:

except (requests.RequestException, OSError):
    pass

4. No unit tests

Added in two files:

tests/unit/test_github_host.py -- test_is_gitlab_hostname() covers:

  • gitlab.com and gitlab.* self-hosted instances return True
  • Case-insensitive matching (GITLAB.COM)
  • Negative cases: GitHub, Gitea, Bitbucket, Azure DevOps, None, ""

tests/unit/test_generic_git_urls.py -- TestGiteaVirtualPackageDetection
class covers:

  • Gitea virtual file extension detected as virtual (owner/repo/file.prompt.md)
  • Gitea /collections/ path detected as virtual collection
  • Dict-format virtual package on Gitea host
  • Plain two-segment owner/repo on Gitea is never virtual

TestNestedGroupSupport provides the GitLab regression guard --
gitlab.com/group/subgroup/repo must not be detected as virtual.


3 participants