Add experimental support for transformers>=5.0 + min torch 2.8 + bug fixes for tests (#975)

kevalmorabia97 wants to merge 21 commits into main from
Conversation
📝 Walkthrough

Replaced many Hugging Face

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant Script
    participant HF as "HuggingFace.from_pretrained"
    participant ModelOpt as "ModelOpt plugin (_restore_qtensor_wrappers)"
    participant FS as "modelopt_state.pth (FS)"
    User->>Script: run (optional --trust_remote_code)
    Script->>HF: from_pretrained(..., dtype=..., trust_remote_code=...)
    HF-->>Script: returns model instance
    Script->>ModelOpt: patched hook invoked after instantiation
    ModelOpt->>FS: check for modelopt_state.pth
    FS-->>ModelOpt: q_tensor_state (if present)
    ModelOpt->>ModelOpt: re-wrap weights preserving QTensorWrapper metadata
    ModelOpt-->>Script: model with restored wrappers
    Script-->>User: continue (quantize/export/generate)
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Pre-merge checks: ❌ failed (1 error, 1 warning), ✅ 2 passed. Please resolve all errors before merging; addressing warnings is optional.
Codecov Report

❌ Patch coverage is

Coverage Diff:

|          | main   | #975   | +/-    |
|----------|--------|--------|--------|
| Coverage | 70.21% | 70.63% | +0.41% |
| Files    | 230    | 230    |        |
| Lines    | 26073  | 26083  | +10    |
| Hits     | 18308  | 18423  | +115   |
| Misses   | 7765   | 7660   | -105   |
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
Reviewed code fragments:

- `decoder_cls = _setup_kimi_k2_decoder()`
- `self.eagle_config = PretrainedConfig.from_dict(config.eagle_architecture_config)`
- `arch_config = config.eagle_architecture_config`
What does this PR do?

- `--warmup-ratio: float` is deprecated in transformers 5.x, so it is replaced by `--warmup-steps: float | int`, which works as a ratio when a float is passed, but only on 5.x. On 4.x, passing a float errors out and prompts the user to either switch back to `--warmup-ratio` or pass an absolute step count as an int.
- Add a workaround for TRT-LLM's import of deprecated transformers functions so that TRT-LLM-based GPU unit tests work fine. Model deployment still needs proper fixes directly in TRT-LLM, hence the LLM/VLM PTQ example tests still run with transformers 4.57.
- `trust_remote_code=True` in `tests/examples/speculative_decoding`: previously silently skipped.

Testing
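The warmup-flag behavior described above can be sketched as follows. `resolve_warmup` and the `transformers_major` parameter are hypothetical names for illustration, not the PR's actual code; the logic only mirrors the described behavior (float accepted as a ratio on 5.x, rejected on 4.x with a hint).

```python
# Hedged sketch of the --warmup-steps handling described above.
# A float is accepted as a warmup *ratio* only on transformers 5.x;
# on 4.x a float is rejected with a hint to use --warmup-ratio or an int.
def resolve_warmup(warmup_steps, transformers_major):
    """Return the warmup value to hand to the trainer."""
    if isinstance(warmup_steps, float):
        if transformers_major < 5:
            raise ValueError(
                "float --warmup-steps (ratio) requires transformers>=5.0; "
                "use --warmup-ratio or pass an absolute int step count"
            )
        return warmup_steps  # 5.x interprets a float as a ratio
    return int(warmup_steps)  # absolute step count works on both 4.x and 5.x
```

For example, `resolve_warmup(0.1, 5)` passes the ratio through, while `resolve_warmup(0.1, 4)` raises with the migration hint.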
Before your PR is "Ready for review"

- Make sure you read and follow the Contributor guidelines and your commits are signed (`git commit -s -S`).
- Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded `trust_remote_code=True`, using `torch.load(..., weights_only=True)`, avoiding `pickle`, etc.).

Summary by CodeRabbit
- New Features
- Bug Fixes
- Refactor
- Documentation
- Chores
- Tests