
GitHub Setup Instructions

Step 1: Get Your Personal Access Token (GUI Method)

  1. Open GitHub in your browser and log in
  2. Click your profile picture (top-right corner)
  3. Click Settings
  4. Scroll down the left sidebar and click Developer settings (at the very bottom)
  5. Click Personal access tokens
  6. Click Tokens (classic)
  7. Click Generate new token, then Generate new token (classic)
  8. Give it a name like "llama-cpp-standalone"
  9. Set expiration: Choose "No expiration" or custom
  10. Select scopes: Check the box for repo (this gives full repository access)
  11. Scroll down and click Generate token
  12. IMPORTANT: Copy the token immediately! You won't see it again.
    • It looks like: ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
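To sanity-check the token before using it, you can (optionally) ask the GitHub API which account it authenticates as. `YOUR_TOKEN` below is a placeholder for the `ghp_...` value you just copied:

```shell
# A valid token returns your user profile as JSON;
# a bad or expired one returns {"message": "Bad credentials", ...}
curl -s -H "Authorization: token YOUR_TOKEN" https://api.github.com/user
```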

Step 2: Create the Repository on GitHub

Option A: Use GitHub Web Interface (Easiest)

  1. Go to https://github.com/new
  2. Repository name: llama-cpp-python-standalone
  3. Description: "Simple Python wrapper for llama.cpp server - use new models before Python bindings catch up"
  4. Keep it Public (so others can benefit)
  5. DON'T initialize with README (we already have one)
  6. Click Create repository

Option B: Use Command Line

# Install GitHub CLI (if not installed)
# Ubuntu/Debian (older releases may need GitHub's apt repository added first):
sudo apt install gh

# macOS:
brew install gh

# Login
gh auth login

# Create repo
gh repo create llama-cpp-python-standalone --public \
  --description "Simple Python wrapper for llama.cpp server - use new models before Python bindings catch up"

Step 3: Push the Code

cd /home/gregor/aidev/supercmd/tools/llama-cpp-python-standalone

# Initialize git if not already
git init

# Add all files
git add .

# Make initial commit
git commit -m "Initial release - Python wrapper for llama.cpp server

- Simple wrapper bypassing outdated llama-cpp-python
- OpenAI-compatible API
- Support for new architectures (Qwen3-VL, etc.)
- Auto-build script with GPU detection
- Vision model examples
- Context manager support"

# Add your GitHub repo as remote (replace YOUR_TOKEN)
git remote add origin https://YOUR_TOKEN@github.com/cronos3k/llama-cpp-python-standalone.git

# Push to GitHub
git branch -M main
git push -u origin main

Security Note: A token embedded in the remote URL is stored in plaintext in .git/config, so anyone with access to the clone can read it. After the first push, consider switching to a token-free URL with a credential helper, or to SSH.
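To keep the token out of .git/config entirely, one option is a plain HTTPS remote plus git's built-in credential cache. A minimal sketch (done in a scratch repository here; in practice run the same commands inside your real clone):

```shell
# Demo in a scratch repo; in practice run these inside your real clone
tmp=$(mktemp -d) && cd "$tmp" && git init -q

# Start with a token-embedded remote, as in Step 3...
git remote add origin https://YOUR_TOKEN@github.com/cronos3k/llama-cpp-python-standalone.git

# ...then replace it with a plain HTTPS URL (no token written to .git/config)
git remote set-url origin https://github.com/cronos3k/llama-cpp-python-standalone.git
git remote get-url origin

# Cache credentials in memory for one hour instead of storing them on disk;
# the next `git push` prompts once for the token, then serves it from cache
git config credential.helper 'cache --timeout=3600'
```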

Step 4: Add Topics/Tags (Optional but Recommended)

  1. Go to your repo: https://github.com/cronos3k/llama-cpp-python-standalone
  2. Click the ⚙️ gear icon next to "About" (top right)
  3. Add topics: llama-cpp, llm, python, gguf, qwen, cuda, local-ai, openai-api
  4. Click Save changes
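If you prefer the command line, the same topics can be added with the GitHub CLI (a sketch, assuming a reasonably recent gh that is already authenticated from Step 2):

```shell
# Add the repository topics via gh instead of the web UI
gh repo edit cronos3k/llama-cpp-python-standalone \
  --add-topic llama-cpp \
  --add-topic llm \
  --add-topic python \
  --add-topic gguf \
  --add-topic qwen \
  --add-topic cuda \
  --add-topic local-ai \
  --add-topic openai-api
```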

Step 5: Create a Release (Optional)

  1. Go to your repo
  2. Click Releases (right sidebar)
  3. Click Create a new release
  4. Tag: v1.0.0
  5. Title: Initial Release
  6. Description:
## Features
- 🚀 Simple Python wrapper for llama.cpp server
- ✅ Bypass outdated llama-cpp-python bindings
- ✅ Support for new architectures (Qwen3-VL, Gemma3, etc.)
- ✅ OpenAI-compatible API
- ✅ Auto-build script with GPU detection
- ✅ Vision model examples

## Quick Start
See README.md for installation and usage instructions.
  7. Click Publish release
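The same release can be created from the command line. A sketch, assuming gh is authenticated and the notes above are saved to a local file (notes.md is a name chosen for this example):

```shell
# Save the release description to a file
cat > notes.md <<'EOF'
## Features
- 🚀 Simple Python wrapper for llama.cpp server
- ✅ Bypass outdated llama-cpp-python bindings
- ✅ Support for new architectures (Qwen3-VL, Gemma3, etc.)
- ✅ OpenAI-compatible API
- ✅ Auto-build script with GPU detection
- ✅ Vision model examples

## Quick Start
See README.md for installation and usage instructions.
EOF

# Tag and publish v1.0.0 with those notes
gh release create v1.0.0 --title "Initial Release" --notes-file notes.md
```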

Troubleshooting

"Permission denied" when pushing

  • Your token doesn't have repo scope
  • Generate a new token with correct permissions

"Repository not found"

  • Check the URL: https://github.com/cronos3k/llama-cpp-python-standalone
  • Make sure repo was created successfully

Token expired

  • Tokens can expire - generate a new one
  • Consider using SSH keys instead (more secure)

Alternative: SSH Keys (More Secure)

Instead of tokens, you can use SSH:

  1. Generate SSH key: ssh-keygen -t ed25519 -C "your_email@example.com"
  2. Add to GitHub: Settings → SSH and GPG keys → New SSH key
  3. Use SSH URL: git@github.com:cronos3k/llama-cpp-python-standalone.git
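Switching an existing clone over to the SSH URL is a one-liner; sketched here in a scratch repository (in practice run the `set-url` inside your real clone):

```shell
# Demo in a scratch repo; in practice run this inside your real clone
tmp=$(mktemp -d) && cd "$tmp" && git init -q
git remote add origin https://github.com/cronos3k/llama-cpp-python-standalone.git

# Switch the remote from HTTPS to the SSH URL
git remote set-url origin git@github.com:cronos3k/llama-cpp-python-standalone.git
git remote get-url origin

# Optional: confirm GitHub accepts your key (prints a greeting, no shell access)
# ssh -T git@github.com
```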