8bit model by zshobbs · Pull Request #3 · Sentdex/ChatGPT-at-Home

zshobbs · 2023-01-25T23:08:59Z

Run the LLMs across multiple GPUs, using 8-bit models to reduce the VRAM footprint. "facebook/opt-30b" runs on two NVIDIA RTX 3090s. "facebook/opt-66b" might squeeze onto bigger GPUs, or you can use float16 with CPU or NVMe/SSD offload.
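As a rough sanity check on the sizing above, here is a back-of-the-envelope sketch of the weight footprint. It assumes the nominal parameter counts implied by the model names (30B and 66B), 24 GB of VRAM per RTX 3090, and counts weights only; activations, the KV cache, and framework overhead are ignored, so real usage will be higher. The helper names are illustrative and not part of this PR.

```python
# Weights-only estimate; helper names are hypothetical, not from the PR.

def weight_footprint_gb(n_params_billion, bytes_per_param):
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params_billion * bytes_per_param


def fits(n_params_billion, bytes_per_param, total_vram_gb):
    """True if the weights alone fit in the combined VRAM budget."""
    return weight_footprint_gb(n_params_billion, bytes_per_param) <= total_vram_gb


TOTAL_VRAM_GB = 2 * 24  # two RTX 3090s, 24 GB each

if __name__ == "__main__":
    # Parameter counts are the nominal sizes from the model names (assumption).
    for name, billions in [("facebook/opt-30b", 30), ("facebook/opt-66b", 66)]:
        for label, nbytes in [("int8", 1), ("float16", 2)]:
            size = weight_footprint_gb(billions, nbytes)
            ok = fits(billions, nbytes, TOTAL_VRAM_GB)
            print(f"{name} {label}: ~{size} GB of weights, fits in {TOTAL_VRAM_GB} GB: {ok}")
```

Under these assumptions opt-30b in 8-bit needs about 30 GB for weights, which fits in 48 GB, while the float16 version (about 60 GB) and opt-66b in 8-bit (about 66 GB) do not, which is where bigger GPUs or offload come in.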

This uses Hugging Face Accelerate and bitsandbytes.
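For readers who have not used this combination, a minimal sketch of what such a load can look like with `transformers`, assuming `transformers`, `accelerate`, and `bitsandbytes` are installed. This is not necessarily the code in this PR. `build_load_kwargs`, `load_model`, and the `offload_folder` value are illustrative names; `device_map="auto"` and `load_in_8bit=True` are the `from_pretrained` arguments available in `transformers` releases from around the time of this PR (newer releases prefer passing a `BitsAndBytesConfig` instead).

```python
# Sketch only: helper names are hypothetical, not taken from the PR.

def build_load_kwargs(use_8bit=True, offload_folder=None):
    """Build keyword arguments for AutoModelForCausalLM.from_pretrained."""
    # device_map="auto" lets Accelerate spread layers over the available GPUs.
    kwargs = {"device_map": "auto"}
    if use_8bit:
        # 8-bit weights via bitsandbytes.
        kwargs["load_in_8bit"] = True
    else:
        import torch  # imported lazily so the 8-bit path does not need it here

        kwargs["torch_dtype"] = torch.float16
        if offload_folder is not None:
            # Layers that do not fit on GPU or in CPU RAM are offloaded to disk here.
            kwargs["offload_folder"] = offload_folder
    return kwargs


def load_model(model_name="facebook/opt-30b", **options):
    """Load tokenizer and model; downloads the weights on first use."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name, **build_load_kwargs(**options)
    )
    return tokenizer, model
```

Calling `load_model()` would use the 8-bit path; `load_model("facebook/opt-66b", use_8bit=False, offload_folder="offload")` would be the float16 path with disk offload.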

zshobbs and others added 2 commits January 25, 2023 22:56

add HF accelerate for models

1f27a9a

Update README.md

dbaf544


zshobbs wants to merge 2 commits into Sentdex:main from zshobbs:8bit-model
