
Add embedding cache support to oneflow base model (#5552)#5552

Closed
EddyLXJ wants to merge 1 commit intopytorch:mainfrom
EddyLXJ:export-D98399416

Conversation

Contributor

@EddyLXJ EddyLXJ commented Mar 30, 2026

Summary:

X-link: https://github.com/facebookresearch/FBGEMM/pull/2519

CONTEXT: Port the embedding cache changes from blue_reels_umia_v1_exp_hstu_simplified_v5_base_model.py to the oneflow base model. All embedding-cache-related configs are gated behind SID_INJECTION_MODE_V5 == PRETRAIN_MAP_EMBEDDING_CACHE, so the base model remains unchanged when the mode is not active.
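The gating described above is a plain config-guard pattern. A minimal sketch of the idea (illustrative only; `build_sparse_configs` and the config names below are stand-ins, not the model's actual API):

```python
# Illustrative sketch of mode-gated config assembly: when the mode flag is
# not set to the embedding-cache value, the returned config list is
# byte-identical to the base model's, so default behavior cannot change.
PRETRAIN_MAP_EMBEDDING_CACHE = "PRETRAIN_MAP_EMBEDDING_CACHE"

def build_sparse_configs(sid_injection_mode_v5: str) -> list[str]:
    # Base configs, always present regardless of mode.
    configs = ["base_feature_a", "base_feature_b"]
    if sid_injection_mode_v5 == PRETRAIN_MAP_EMBEDDING_CACHE:
        # Embedding-cache configs are appended only when the mode is active.
        configs += [
            "sparse_object_id_for_embedding_cache",
            "zch_embedding_cache_item_fb_public",
        ]
    return configs
```

With any other mode value, the function returns exactly the base list, which is what keeps the base model unchanged.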

WHAT:

  • Gate all embedding cache configs behind PRETRAIN_MAP_EMBEDDING_CACHE:
    • sparse_object_id_for_embedding_cache feature and replication config
    • zch_embedding_cache_item_fb_public embedding table (MX4-safe dibit encoding, 32x2bit)
    • emb_cache_item_ec EmbeddingCollection initialization
    • KVZCHTBEConfig with OneFlow feature store enrichment
    • TBE SSD/cache params (prefetch_pipeline, rocksdb, l2_cache, etc.)
    • QComms fp8_quantize_dim=32
  • Simplify forward path using build_embedding_cache_write_kjt from kvzch_utils
  • Extract decode_dibits_to_int64 utility to kvzch_utils.py for reuse
  • Update C++ encode/decode from FP8 nibbles (16x4bit) to MX4-safe dibits (32x2bit)
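The summary refers to a `decode_dibits_to_int64` utility and a move from 16×4-bit FP8 nibbles to 32×2-bit MX4-safe dibits. A minimal roundtrip sketch of that idea (an assumption for illustration, not the actual kvzch_utils or C++ implementation, whose packing layout may differ) could look like:

```python
# Illustrative sketch: a 64-bit id split into 32 dibits (2-bit chunks),
# least-significant dibit first, and reassembled losslessly. Small 2-bit
# values are the motivation for "MX4-safe": they survive low-precision
# storage better than 4-bit nibbles would.
def encode_int64_to_dibits(value: int) -> list[int]:
    """Split a signed 64-bit integer into 32 dibits (values 0..3)."""
    u = value & 0xFFFFFFFFFFFFFFFF  # reinterpret as unsigned 64-bit
    return [(u >> (2 * i)) & 0b11 for i in range(32)]

def decode_dibits_to_int64(dibits: list[int]) -> int:
    """Reassemble 32 dibits back into a signed 64-bit integer."""
    u = 0
    for i, d in enumerate(dibits):
        u |= (d & 0b11) << (2 * i)
    # Restore two's-complement sign.
    return u - (1 << 64) if u >= (1 << 63) else u
```

The key property is that encode followed by decode is the identity on the full signed 64-bit range, so ids round-trip through the embedding table exactly.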

Reviewed By: xinzhang-nac

Differential Revision: D98399416

@meta-cla meta-cla bot added the cla signed label Mar 30, 2026
Contributor

meta-codesync bot commented Mar 30, 2026

@EddyLXJ has exported this pull request. If you are a Meta employee, you can view the originating Diff in D98399416.

@meta-codesync meta-codesync bot changed the title Add embedding cache support to oneflow base model Add embedding cache support to oneflow base model (#5552) Apr 6, 2026
@EddyLXJ EddyLXJ force-pushed the export-D98399416 branch from f1abc7e to f365226 Compare April 6, 2026 23:27
EddyLXJ added a commit to EddyLXJ/FBGEMM-1 that referenced this pull request Apr 6, 2026
EddyLXJ added a commit to EddyLXJ/FBGEMM-1 that referenced this pull request Apr 6, 2026
@EddyLXJ EddyLXJ force-pushed the export-D98399416 branch 2 times, most recently from 4d59a9a to a83c23f Compare April 7, 2026 19:43
EddyLXJ added a commit to EddyLXJ/FBGEMM-1 that referenced this pull request Apr 7, 2026
EddyLXJ added a commit to EddyLXJ/FBGEMM-1 that referenced this pull request Apr 7, 2026
@EddyLXJ EddyLXJ force-pushed the export-D98399416 branch from a83c23f to c8b88b7 Compare April 7, 2026 19:44
EddyLXJ added a commit to EddyLXJ/FBGEMM-1 that referenced this pull request Apr 7, 2026
@EddyLXJ EddyLXJ force-pushed the export-D98399416 branch from c8b88b7 to 170aa67 Compare April 7, 2026 19:47
EddyLXJ added a commit to EddyLXJ/FBGEMM-1 that referenced this pull request Apr 7, 2026
@EddyLXJ EddyLXJ force-pushed the export-D98399416 branch from 170aa67 to 823d714 Compare April 7, 2026 19:52
EddyLXJ added a commit to EddyLXJ/FBGEMM-1 that referenced this pull request Apr 7, 2026
@EddyLXJ EddyLXJ force-pushed the export-D98399416 branch from 823d714 to 6bcef33 Compare April 7, 2026 20:57
EddyLXJ added a commit to EddyLXJ/FBGEMM-1 that referenced this pull request Apr 7, 2026
@EddyLXJ EddyLXJ force-pushed the export-D98399416 branch 2 times, most recently from ef6ec98 to 8aeec3a Compare April 7, 2026 22:33
EddyLXJ added a commit to EddyLXJ/FBGEMM-1 that referenced this pull request Apr 7, 2026
@EddyLXJ EddyLXJ force-pushed the export-D98399416 branch from 8aeec3a to f69e314 Compare April 7, 2026 22:35
Contributor

meta-codesync bot commented Apr 8, 2026

This pull request has been merged in 6de075d.
