What is QLoRA?

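QLoRA (Quantized LoRA): a technique that combines 4-bit quantization with LoRA fine-tuning. The base model's weights are frozen and stored in 4-bit precision, and only small LoRA adapter matrices are trained on top of them. This makes it possible to fine-tune a 65B-parameter model on a single 48 GB GPU.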
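A rough back-of-the-envelope calculation shows why the 4-bit base model matters. The sketch below counts weight storage only. It deliberately ignores quantization constants, the LoRA adapters, activations, and optimizer state, so the real footprint is higher. The helper name `weight_memory_gb` is made up for this illustration.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Memory needed to store the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


params = 65e9  # 65B-parameter model, as in the definition above

fp16_gb = weight_memory_gb(params, 16)  # base model kept in 16-bit
int4_gb = weight_memory_gb(params, 4)   # base model kept in 4-bit

print(f"16-bit weights: {fp16_gb:.1f} GB")
print(f" 4-bit weights: {int4_gb:.1f} GB")
print(f"Fits in 48 GB?  16-bit: {fp16_gb < 48}, 4-bit: {int4_gb < 48}")
```

Under these assumptions the 16-bit weights alone exceed a 48 GB card, while the 4-bit weights leave headroom for the adapters and training state.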

FAQ

What is QLoRA?

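QLoRA is LoRA fine-tuning applied to a base model whose weights have been quantized to 4 bits. The quantized weights stay frozen and only the small LoRA adapters are trained, which is what lets a 65B-parameter model be fine-tuned on a single 48 GB GPU.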
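The idea can be sketched in pure Python as a toy forward pass: a frozen weight matrix stored as 4-bit codes, plus a trainable low-rank update. This is an illustration, not the actual QLoRA implementation. It uses simple per-row absmax quantization as a stand-in for the NF4 data type and block-wise quantization used by QLoRA, and all function names here are hypothetical.

```python
import random


def quantize_4bit(row):
    """Absmax-quantize a list of floats to signed integer codes in [-7, 7]."""
    scale = max(abs(x) for x in row) / 7 or 1.0  # fall back to 1.0 for an all-zero row
    codes = [round(x / scale) for x in row]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate float weights."""
    return [c * scale for c in codes]


def matvec(M, x):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def qlora_forward(q_rows, A, B, x, alpha=1.0):
    """y = dequant(W_q) x + (alpha / r) * B (A x); only A and B would be trained."""
    W = [dequantize(codes, scale) for codes, scale in q_rows]
    base = matvec(W, x)               # frozen, quantized base model path
    r = len(A)                        # LoRA rank
    update = matvec(B, matvec(A, x))  # trainable low-rank path
    return [b + (alpha / r) * u for b, u in zip(base, update)]


random.seed(0)
d_out, d_in, r = 4, 6, 2
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
q_rows = [quantize_4bit(row) for row in W]  # stored in place of W

A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
B = [[0.0] * r for _ in range(d_out)]       # LoRA starts with B = 0

x = [1.0] * d_in
y = qlora_forward(q_rows, A, B, x)
print(y)
```

Because `B` starts at zero, the adapter contributes nothing at first and the output matches the quantized base model. Training would then update only `A` and `B`, never the 4-bit codes.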

QLoRA vs LoRA?

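QLoRA stores the frozen base model in 4-bit precision, so it needs far less memory while reaching similar quality. Standard LoRA keeps the base model in 16-bit precision, which costs more memory but can give slightly better results.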
