
Conversation

@Cnp11784

Making a test run to see what happens.

liyunlu0618 and others added 30 commits October 24, 2024 15:22
* Add a symlink for llama_models

* Update README.md paths
Updated model card to remove erroneous heading markdown in Hardware and Software section
Model should work with "raw" bytes, never URLs (meta-llama#244)

* Model should work with "raw" bytes, never URLs

Modeling code, or code close to it (chat_format.py specifically), should not be concerned with downloading
URLs, and especially not doing so randomly on demand.

* Use ModelInputMessage / ModelOutputMessage and BytesIO

* Fixes

* Fold everything into a much simpler RawMessage type, update prompt_format
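For context, the end state those commits describe looks roughly like this; a minimal sketch assuming dataclass-style types, with field names that are illustrative rather than the repository's exact definitions:

```python
# Sketch of the "raw bytes, never URLs" design; field names are
# illustrative, not the repository's exact definitions.
from dataclasses import dataclass
from io import BytesIO


@dataclass
class RawMediaItem:
    # Media is held as in-memory bytes; callers fetch any URL
    # *before* constructing the message.
    data: BytesIO


@dataclass
class RawMessage:
    role: str                      # "user", "assistant", "system", ...
    content: str | RawMediaItem    # plain text or an already-loaded media item


def load_image(path: str) -> RawMediaItem:
    # I/O happens at the call site, never inside chat_format.py.
    with open(path, "rb") as f:
        return RawMediaItem(data=BytesIO(f.read()))
```

The point of the design is that the chat-formatting layer only ever sees bytes the caller has already loaded.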
… (meta-llama#255)

* Test PR Submission - Write Permission

* Update EFS volume structuring

* Test cron scheduling

* Schedule cron job and update branch inputs / actions versioning

* Remove previous temporary unnecessary commit lines
… (meta-llama#256)

* models nightly

* publish

* environment

* schedule

* test manual trigger

* fix

* name

* test

* test manual input

* move back to workflow

* dev

* dev

* workflow_dispatch
raghotham and others added 22 commits April 5, 2025 12:23
* refactor: make llama3 and llama4 generation closer to each other

* llama3 script fixes

* fixes

* add llama3 quant

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* resurrect xpu codepath

* update readme
Fixes: b12e462 ("refactor: make llama3 generation closer to llama4 (meta-llama#309)")

Signed-off-by: Dmitry Rogozhkin <[email protected]>
Verified with Llama3.2-3B-Instruct on Intel Data Center Max Series GPU (PVC):
```
torchrun --nproc-per-node=1 models/llama3/scripts/completion.py \
 "$CHECKPOINT_DIR" --world_size 1

torchrun --nproc-per-node=1 models/llama3/scripts/chat_completion.py \
 "$CHECKPOINT_DIR" --world_size 1
```

Signed-off-by: Dmitry Rogozhkin <[email protected]>
The PyTorch xccl distributed backend is available starting from 2.7 (it requires a manual build of
PyTorch with `USE_C10D_XCCL=1 USE_XCCL=1`) and is targeted for inclusion in binary builds
starting from 2.8.

This patch improves support for the Llama3 vision model on XPU devices and was tested
with `Llama3.2-11B-Vision-Instruct`.

Signed-off-by: Dmitry Rogozhkin <[email protected]>
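Backend selection then becomes a one-line choice at startup. A minimal sketch, hedged: the `"xccl"` backend string and the `torch.xpu` calls assume the PyTorch 2.7+ build described above:

```python
# Minimal sketch of distributed-backend selection for Intel GPUs; assumes a
# PyTorch >= 2.7 build compiled with USE_C10D_XCCL=1 USE_XCCL=1
# (not expected in binary wheels until 2.8).
import os

import torch
import torch.distributed as dist

local_rank = int(os.environ.get("LOCAL_RANK", 0))  # set by torchrun

if torch.xpu.is_available():
    backend = "xccl"                 # Intel GPU collective backend
    torch.xpu.set_device(local_rank)
else:
    backend = "nccl" if torch.cuda.is_available() else "gloo"

# Rank and world size are read from the environment torchrun provides.
dist.init_process_group(backend=backend)
```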
* fix: update rope scaling for Llama-4-Scout

* update

* no defaults

* fix
```
PYTHONPATH=$(git rev-parse --show-toplevel) \
torchrun --nproc_per_node=1 \
-m models.llama4.scripts.chat_completion ../checkpoints/Llama-4-Scout-17B-16E-Instruct \
--world_size 1 \
--quantization-mode int4_mixed
```

Before this PR:
```
[rank1]: TypeError: quantize_int4() got multiple values for argument 'output_device'
```
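That error is a standard Python failure mode: a value reaches a parameter both positionally and by keyword. A tiny, self-contained illustration with a hypothetical signature, not the repository's actual `quantize_int4`:

```python
# Hypothetical reproduction of this TypeError class; the real
# quantize_int4() signature in the repo may differ.
def quantize_int4(weight, output_device="cpu"):
    return weight, output_device

try:
    # A positional arg already fills the output_device slot, then the
    # keyword fills it again -> "got multiple values for argument".
    quantize_int4("w", "cuda:0", output_device="xpu")
except TypeError as e:
    print(e)  # quantize_int4() got multiple values for argument 'output_device'
```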
# What does this PR do?


## Test Plan
Making a test run to see what happens.
@facebook-github-bot

Hi @Cnp11784!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g., your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

@Cnp11784 changed the base branch from main to torchao on September 24, 2025 08:42
