Article Source
LLaMA & Alpaca: “ChatGPT” On Your Local Computer
- Dalai GitHub Repository: https://github.com/cocktailpeanut/dalai
- LLaMA: https://github.com/facebookresearch/llama
- LLaMA Weird Character Generation Issue (Dalai): https://github.com/cocktailpeanut/dalai/issues/65
- LLaMA Weird Character Generation Issue (official FAQ): https://github.com/facebookresearch/llama/blob/main/FAQ.md#2
Abstract
In this video I will show you how to run state-of-the-art large language models on your local computer. Yes, you read that right. For this we will use the Dalai library, which lets us run both the foundational language model LLaMA and the instruction-following Alpaca model. While LLaMA is a foundational (or broad) language model that predicts the next token (word) from a given input sequence (sentence), Alpaca is a fine-tuned version of LLaMA capable of following instructions (think ChatGPT-style behaviour). Even more impressive, both models achieve results comparable to, and in some cases better than, their GPT counterparts, while still being small enough to run on a local machine. Thanks to the Dalai library, it takes only a few steps to run “ChatGPT” on your own computer.
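The few steps mentioned above can be sketched roughly as follows. This is a minimal sketch based on the commands in the Dalai README at the time of writing; model sizes, prerequisites (Node.js, Python), and exact commands may differ on your system, so check the repository first.

```shell
# Install and download the LLaMA 7B model via Dalai
# (requires Node.js; larger variants like 13B/30B/65B are also offered)
npx dalai llama install 7B

# Optionally install the instruction-following Alpaca 7B model
npx dalai alpaca install 7B

# Start the local web UI (served on http://localhost:3000 by default)
npx dalai serve
```

Once the server is running, you can open the web UI in a browser and prompt the model locally, with no API key or internet-hosted backend involved.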
How to run Meta AI’s LLaMA 4-bit Model on Google Colab
- Colab Code in Github: https://github.com/amrrs/llama-4bit-colab/blob/main/LLaMA_4_bit_on_Google_Colab.ipynb
- GPTQ Repo: https://github.com/qwopqwop200/GPTQ-for-LLaMa
- LLaMA 7B 4-bit Model on Hugging Face Model Hub: https://huggingface.co/decapoda-research/llama-7b-hf-int4
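The Colab workflow linked above boils down to cloning the GPTQ-for-LLaMa repository, fetching the quantized weights, and running its inference script. The sketch below is an assumption-laden outline: the script name (`llama_inference.py`), the flags (`--wbits`, `--load`, `--text`), and the checkpoint filename follow the GPTQ-for-LLaMa README at the time of the video, but that repository changed rapidly, so consult the linked notebook for the exact commands.

```shell
# Clone the 4-bit GPTQ quantization code for LLaMA
git clone https://github.com/qwopqwop200/GPTQ-for-LLaMa
cd GPTQ-for-LLaMa
pip install -r requirements.txt

# Fetch the pre-quantized 4-bit checkpoint from the Hugging Face Hub
# (checkpoint filename is illustrative; see the model card for the real one)
git clone https://huggingface.co/decapoda-research/llama-7b-hf-int4

# Run inference with 4-bit weights on the GPU
# (flag names as documented in the repo README; may have changed since)
python llama_inference.py decapoda-research/llama-7b-hf \
    --wbits 4 \
    --load llama-7b-hf-int4/llama-7b-4bit.pt \
    --text "The meaning of life is"
```

Quantizing to 4 bits shrinks the 7B model's weight footprint to roughly 4 GB, which is what makes it fit in the GPU memory of a free Colab instance.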