
Conversation

@J-SUPHA (Collaborator) commented Jan 19, 2026

PR Type

  • RL Environment PR - Complete Environment Snapshot & Zero-Training sections
  • Non-Environment PR - Complete Description, Related Issues & Type of Change sections

📝 General Information

Description

Introduced training with LoRAs, plus training via shared weights, in the Atropos example trainer.

Related Issues

Not an issue but a feature.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update
  • Code refactor (no functional changes)
  • Build/CI/CD related changes
  • Other (please describe):

✅ Developer & Reviewer Checklist

  • Code follows project style (black, isort, flake8 pass with pre-commit)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • New and existing unit tests pass locally with my changes
  • Docstrings added for all new public classes / functions
  • If new .env vars are required, add them to the .env.example in the repo root

@dmahan93 (Collaborator) commented:

so, some comments:

  • can you break up the giant files a bit 😅
  • How are you handling QKV? vLLM has QKV in one tensor, but HF has it in three different ones, right?
  • How does the LoRA work here?
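The QKV point is a layout mismatch: HF checkpoints keep `q_proj`, `k_proj`, and `v_proj` as separate weight matrices, while vLLM packs them into a single `qkv_proj` tensor by concatenating rows (q first, then k, then v). A minimal sketch of the mapping, shown on plain nested lists so it is self-contained; in the trainer this would be `torch.cat([q, k, v], dim=0)` on real weight tensors, and the row counts below are toy numbers, not any model's actual shapes:

```python
def fuse_qkv(q, k, v):
    """Row-wise concatenation: vLLM's packed layout is [q; k; v]."""
    return q + k + v  # list concat, equivalent to torch.cat(..., dim=0)


def split_qkv(qkv, q_rows, k_rows):
    """Inverse mapping, needed when pushing fused weights back to HF names."""
    q = qkv[:q_rows]
    k = qkv[q_rows:q_rows + k_rows]
    v = qkv[q_rows + k_rows:]
    return q, k, v


# Toy GQA-style shapes: q has 4 rows, k/v have 2 rows each, hidden size 2.
q = [[1, 0]] * 4
k = [[2, 0]] * 2
v = [[3, 0]] * 2
qkv = fuse_qkv(q, k, v)          # 8 rows total
assert split_qkv(qkv, 4, 2) == (q, k, v)  # round-trip is lossless
```

The round-trip matters because weight sync runs in both directions: fusing when pushing trained HF weights into vLLM, splitting when comparing or exporting.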

@J-SUPHA (Collaborator, Author) commented Jan 20, 2026

  • Will break up the files once I fix point 2 and make sure it works as intended
  • Yes, this is a mistake on my end. HF and vLLM both have fused QKV for Qwen, so I assumed all models did. Will fix and report back once I test it with all models.
  • For the LoRA workflow: the trainer loads the HF base model, creates LoRA adapters on q_proj and v_proj, trains the adapters, saves the adapter, then calls POST /lora/load on vLLM. But the /lora/load endpoint in vllm_api_server is not correct; I will update this as well.
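The hand-off step of that workflow can be sketched as below. This is a stdlib-only sketch under assumptions: the `/lora/load` route and JSON payload shape are taken from this thread (and are stated above to still need fixing), the base URL and adapter name are made up for illustration, and vLLM's stock OpenAI-compatible server instead exposes `/v1/load_lora_adapter` with `lora_name`/`lora_path` fields when runtime LoRA updating is enabled:

```python
import json
import urllib.request


def build_lora_load_request(base_url: str, name: str, path: str) -> urllib.request.Request:
    """Build the POST that asks the vLLM server to hot-load a saved adapter.

    Route and payload follow the PR discussion (/lora/load); vLLM's stock
    server uses /v1/load_lora_adapter, so adjust to whichever the repo ships.
    """
    payload = json.dumps({"lora_name": name, "lora_path": path}).encode()
    return urllib.request.Request(
        f"{base_url}/lora/load",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical values: a trainer step that just saved adapters/step_100
# (e.g. via PEFT's model.save_pretrained) and notifies a local vLLM server.
req = build_lora_load_request(
    "http://localhost:8000", "step_100", "adapters/step_100"
)
# urllib.request.urlopen(req) would actually send it; omitted here.
```

The earlier steps (LoRA adapters on `q_proj`/`v_proj`) correspond to a PEFT `LoraConfig(target_modules=["q_proj", "v_proj"])` wrapped around the HF base model; only the final notification to vLLM is shown since that is the piece under repair.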

@dmahan93 (Collaborator) commented:

@J-SUPHA (Collaborator, Author) commented Jan 20, 2026

No, you are right; will look into it.
