Article Source
LLaMA & Alpaca: “ChatGPT” On Your Local Computer
- Dalai GitHub Repository: https://github.com/cocktailpeanut/dalai
- LLaMA: https://github.com/facebookresearch/llama
- LLaMA Weird Character Generation Issue (Dalai): https://github.com/cocktailpeanut/dalai/issues/65
- LLaMA Weird Character Generation Issue (official FAQ): https://github.com/facebookresearch/llama/blob/main/FAQ.md#2
Abstract
In this video I will show you how to run state-of-the-art large language models on your local computer. Yes, you read that right. For this we will use the Dalai library, which lets us run both the foundational language model LLaMA and the instruction-following Alpaca model. While LLaMA is a foundational (or broad) language model that predicts the next token (word) from a given input sequence (sentence), Alpaca is a fine-tuned version of LLaMA capable of following instructions (think ChatGPT-style behaviour). Even more impressive, both models achieve results comparable to, and in some cases better than, their GPT counterparts, while still being small enough to run on a local machine. Thanks to the Dalai library, it takes only a few steps to run “ChatGPT” on your own computer.
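The few steps mentioned above can be sketched roughly as follows. This is a minimal sketch based on the commands in the Dalai README at the time of writing; model sizes, prerequisites (Node.js, Python), and exact commands may differ on your system, so check the repository first.

```shell
# Install and download the LLaMA 7B model via Dalai
# (requires Node.js; larger variants like 13B/30B/65B are also offered)
npx dalai llama install 7B

# Optionally install the instruction-following Alpaca 7B model
npx dalai alpaca install 7B

# Start the local web UI (served on http://localhost:3000 by default)
npx dalai serve
```

Once the server is running, you can open the web UI in a browser and prompt the model locally, with no API key or internet-hosted backend involved.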
How to run Meta AI’s LLaMA 4-bit Model on Google Colab
- Colab Code in Github: https://github.com/amrrs/llama-4bit-colab/blob/main/LLaMA_4_bit_on_Google_Colab.ipynb
- GPTQ Repo: https://github.com/qwopqwop200/GPTQ-for-LLaMa
- LLaMA 7B 4-bit Model on Hugging Face Model Hub: https://huggingface.co/decapoda-research/llama-7b-hf-int4
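The Colab workflow linked above boils down to cloning the GPTQ-for-LLaMa repository, fetching the quantized weights, and running its inference script. The sketch below is an assumption-laden outline: the script name (`llama_inference.py`), the flags (`--wbits`, `--load`, `--text`), and the checkpoint filename follow the GPTQ-for-LLaMa README at the time of the video, but that repository changed rapidly, so consult the linked notebook for the exact commands.

```shell
# Clone the 4-bit GPTQ quantization code for LLaMA
git clone https://github.com/qwopqwop200/GPTQ-for-LLaMa
cd GPTQ-for-LLaMa
pip install -r requirements.txt

# Fetch the pre-quantized 4-bit checkpoint from the Hugging Face Hub
# (checkpoint filename is illustrative; see the model card for the real one)
git clone https://huggingface.co/decapoda-research/llama-7b-hf-int4

# Run inference with 4-bit weights on the GPU
# (flag names as documented in the repo README; may have changed since)
python llama_inference.py decapoda-research/llama-7b-hf \
    --wbits 4 \
    --load llama-7b-hf-int4/llama-7b-4bit.pt \
    --text "The meaning of life is"
```

Quantizing to 4 bits shrinks the 7B model's weight footprint to roughly 4 GB, which is what makes it fit in the GPU memory of a free Colab instance.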