Create devcontainer.json #367
base: torchao
Conversation
* Add a symlink for llama_models
* Update README.md paths
…ig -> model_config (meta-llama#225)
Updated model card to remove erroneous heading markdown in Hardware and Software section
…RLs (meta-llama#244)

* Model should work with "raw" bytes, never URLs. Modeling code or code close to it (chat_format.py specifically) should not be thinking of downloading URLs, etc. Especially not doing it randomly on-demand.
* Use ModelInputMessage / ModelOutputMessage and BytesIO
* Fixes
* Fold everything into a much simpler RawMessage type, update prompt_format
…meta-llama#255)

* Test PR Submission - Write Permission
* Update EFS volume structuring
* Test cron scheduling
* Schedule cron job and update branch inputs / actions versioning
* Remove previous temporary unnecessary commit lines
Signed-off-by: Dmitry Rogozhkin <[email protected]>
…a-llama#256)

* models nightly
* publish
* environment
* schedule
* test manual trigger
* fix
* name
* test
* test manual input
* move back to workflow
* dev
* dev
* workflow_dispatch
* refactor: make llama3 and llama4 generation closer to each other
* llama3 script fixes
* fixes
* add llama3 quant
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* resurrect xpu codepath
* update readme
…hiding of python_start, python_end
Fixes: b12e462 ("refactor: make llama3 generation closer to llama4 (meta-llama#309)")

Signed-off-by: Dmitry Rogozhkin <[email protected]>
Verified with Llama3.2-3B-Instruct on Intel Data Center Max Series GPU (PVC):

```
torchrun --nproc-per-node=1 models/llama3/scripts/completion.py \
    "$CHECKPOINT_DIR" --world_size 1
torchrun --nproc-per-node=1 models/llama3/scripts/chat_completion.py \
    "$CHECKPOINT_DIR" --world_size 1
```

Signed-off-by: Dmitry Rogozhkin <[email protected]>
PyTorch's xccl distributed backend is available starting from 2.7 (it requires a manual build of PyTorch with `USE_C10D_XCCL=1 USE_XCCL=1`) and is targeted for inclusion in binary builds starting from 2.8. This patch improves support for the visual Llama3 model on XPU devices and was tested with `Llama3.2-11B-Vision-Instruct`.

Signed-off-by: Dmitry Rogozhkin <[email protected]>
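For context, here is a minimal sketch of what bringing up torch.distributed on XPU against that backend could look like. It is not code from this PR: the `xccl` backend name, the `LOCAL_RANK` handling, and the sanity-check all-reduce are assumptions based on the commit message, and they presume a PyTorch >= 2.7 build compiled with `USE_C10D_XCCL=1 USE_XCCL=1`.

```python
# Hypothetical sketch (not from this PR): initialize torch.distributed on Intel
# XPU with the xccl backend. Assumes a PyTorch >= 2.7 build compiled with
# USE_C10D_XCCL=1 USE_XCCL=1 and that torchrun provides LOCAL_RANK.
import os

import torch
import torch.distributed as dist

local_rank = int(os.environ.get("LOCAL_RANK", "0"))
torch.xpu.set_device(local_rank)           # pin this rank to one XPU device
dist.init_process_group(backend="xccl")    # oneCCL-backed collectives for XPU

# Quick sanity check: all-reduce a single tensor across ranks.
t = torch.ones(1, device=f"xpu:{local_rank}")
dist.all_reduce(t)
print(f"rank {dist.get_rank()}/{dist.get_world_size()}: {t.item()}")

dist.destroy_process_group()
```

Launched with `torchrun --nproc-per-node=<N>`, each rank would then select its own XPU device before any collective runs.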
…snorm, name it as such (meta-llama#320)
* fix: update rope scaling for Llama-4-Scout
* update
* no defaults
* fix
```
PYTHONPATH=$(git rev-parse --show-toplevel) \
torchrun --nproc_per_node=1 \
    -m models.llama4.scripts.chat_completion ../checkpoints/Llama-4-Scout-17B-16E-Instruct \
    --world_size 1 \
    --quantization-mode int4_mixed
```

Before this PR:

```
[rank1]: TypeError: quantize_int4() got multiple values for argument 'output_device'
```
# What does this PR do?

## Test Plan
Making a test run to see what happens.
Hi @Cnp11784! Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with . If you have received this in error or have any questions, please contact us at [email protected]. Thanks!