
chore(deps-dev): bump @huggingface/transformers from 3.8.1 to 4.0.1 #831

Open
dependabot[bot] wants to merge 1 commit into main from dependabot/npm_and_yarn/huggingface/transformers-4.0.1

Conversation


@dependabot dependabot bot commented on behalf of github Apr 4, 2026

Bumps @huggingface/transformers from 3.8.1 to 4.0.1.

Release notes

Sourced from @huggingface/transformers's releases.

4.0.0

🚀 Transformers.js v4

We're excited to announce that Transformers.js v4 is now available on NPM! After a year of development (we started in March 2025 🤯), we're finally ready for you to use it.

npm i @huggingface/transformers

Links: YouTube Video, Blog Post, Demo Collection

New WebGPU backend

The biggest change is undoubtedly the adoption of a new WebGPU Runtime, completely rewritten in C++. We've worked closely with the ONNX Runtime team to thoroughly test this runtime across our ~200 supported model architectures, as well as many new v4-exclusive architectures.

In addition to better operator support (for performance, accuracy, and coverage), this new WebGPU runtime allows the same transformers.js code to be used across a wide variety of JavaScript environments, including browsers, server-side runtimes, and desktop applications. That's right, you can now run WebGPU-accelerated models directly in Node, Bun, and Deno!
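As a sketch of what this enables (the model name, option values, and helper function below are illustrative assumptions, not taken from this PR), the same `pipeline` call can request WebGPU acceleration from Node:

```javascript
// Illustrative sketch only: the pipeline call is commented out because it
// downloads a model; the model name and option values are assumptions.
//
// import { pipeline } from '@huggingface/transformers';
// const extractor = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2', {
//   device: 'webgpu',  // select the new WebGPU runtime
//   dtype: 'q8',       // quantized weights
// });
// const output = await extractor(['hello world'], { pooling: 'mean', normalize: true });
// // output.data is a flat Float32Array of embeddings.

// With normalize: true, cosine similarity between two returned embeddings
// reduces to a plain dot product over the flat arrays:
function cosineFromNormalized(a, b) {
  let dot = 0;
  for (let i = 0; i < a.length; i++) dot += a[i] * b[i];
  return dot;
}
```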

We've proven that it's possible to run state-of-the-art AI models 100% locally in the browser, and now we're focused on performance: making these models run as fast as possible, even in resource-constrained environments. This required completely rethinking our export strategy, especially for large language models. We achieve this by re-implementing new models operation by operation, leveraging specialized ONNX Runtime Contrib Operators like com.microsoft.GroupQueryAttention, com.microsoft.MatMulNBits, or com.microsoft.QMoE to maximize performance.

For example, by adopting the com.microsoft.MultiHeadAttention operator, we achieved a ~4x speedup for BERT-based embedding models.

New models

Thanks to our new export strategy and ONNX Runtime's expanding support for custom operators, we've been able to add many new models and architectures to Transformers.js v4. These include popular models like GPT-OSS, Chatterbox, GraniteMoeHybrid, LFM2-MoE, HunYuanDenseV1, Apertus, Olmo3, FalconH1, and Youtu-LLM. Many of these required us to implement support for advanced architectural patterns, including Mamba (state-space models), Multi-head Latent Attention (MLA), and Mixture of Experts (MoE). Perhaps most importantly, these models are all compatible with WebGPU, allowing users to run them directly in the browser or server-side JavaScript environments with hardware acceleration. We've released several Transformers.js v4 demos so far... and we'll continue to release more!

Additionally, we've added support for larger models exceeding 8B parameters. In our tests, we've been able to run GPT-OSS 20B (q4f16) at ~60 tokens per second on an M4 Pro Max.

... (truncated)

Commits

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [@huggingface/transformers](https://github.com/huggingface/transformers.js) from 3.8.1 to 4.0.1.
- [Release notes](https://github.com/huggingface/transformers.js/releases)
- [Commits](https://github.com/huggingface/transformers.js/commits)

---
updated-dependencies:
- dependency-name: "@huggingface/transformers"
  dependency-version: 4.0.1
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot bot added dependencies Pull requests that update a dependency file javascript Pull requests that update javascript code labels Apr 4, 2026

greptile-apps bot commented Apr 4, 2026

Greptile Summary

This Dependabot PR bumps @huggingface/transformers from 3.8.1 to 4.0.1 across both devDependencies and peerDependencies. The jump is a major version (v3 → v4) and carries one P1 API break:

  • The quantized: true pipeline option used in src/domain/search/models.ts (line 197) was replaced by a dtype string parameter in Transformers.js v3 and is confirmed removed in v4. Without a fix, the minilm model loads in full fp32 precision instead of q8, quadrupling its memory footprint.

Confidence Score: 4/5

Unsafe to merge without fixing the quantized→dtype API change in models.ts; otherwise the upgrade is clean

One clear P1: the quantized: true option no longer has any effect in v4, causing minilm to load unquantized. All other public API surface (pipeline, feature-extraction, dispose, Xenova/ model names, output.data) is confirmed compatible. Tests mock the library so CI stays green regardless.

src/domain/search/models.ts — needs quantized: true replaced with dtype: 'q8' before merging

Important Files Changed

| Filename | Overview |
| --- | --- |
| package.json | Bumps @huggingface/transformers devDependency and peerDependency from ^3.8.1 to ^4.0.1; a major-version jump that drops the legacy quantized pipeline option used in models.ts |
| package-lock.json | Lock file updated to resolve @huggingface/transformers 4.0.1 and its new transitive dependencies; no concerns beyond the source-level API change |

Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[loadModel called] --> B{extractor cached?}
    B -- yes --> C[Return cached extractor]
    B -- no --> D[loadTransformers / dynamic import]
    D --> E["pipeline('feature-extraction', modelName, pipelineOpts)"]
    E --> F{config.quantized?}
    F -- "v3: {quantized:true} → loads q8 model" --> G["✅ ~23 MB"]
    F -- "v4: {quantized:true} ignored → loads fp32 model" --> H["⚠️ ~92 MB (4×)"]
    F -- "Fix: {dtype:'q8'} → loads q8 model" --> I["✅ ~23 MB"]
    G --> J[Extractor ready]
    H --> J
    I --> J
    J --> K["extractor(batch, {pooling:'mean', normalize:true})"]
    K --> L["output.data — flat Float32 array (unchanged in v4)"]
```

Comments Outside Diff (1)

  1. src/domain/search/models.ts, line 197 (link)

    P1 quantized option silently dropped in v4 — minilm loads unquantized

    The official Transformers.js docs state: "Before Transformers.js v3, we used the quantized option… Now, we've added the ability to select from a much larger list with the dtype parameter." With this bump to v4, { quantized: true } is no longer a recognized pipeline option and will be silently ignored, causing the minilm model to load in full fp32 precision instead of the intended q8 variant — roughly 4× the memory footprint.
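A minimal sketch of the fix; only the quantized → dtype rename comes from the docs quoted above, and the migration helper is a hypothetical convenience, not part of either library:

```javascript
// v3 (legacy): await pipeline('feature-extraction', modelName, { quantized: true });
// v4 (fix):    await pipeline('feature-extraction', modelName, { dtype: 'q8' });

// Hypothetical helper for codebases with several call sites: translate the
// legacy v3 `quantized` flag into the v4 `dtype` parameter, leaving all
// other options untouched.
function migratePipelineOptions(opts = {}) {
  const { quantized, ...rest } = opts;
  if (quantized === undefined) return rest;           // nothing to migrate
  return { ...rest, dtype: quantized ? 'q8' : 'fp32' };
}
```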

