Deploy an LLM on your local machine (Vicuna / GPU / Windows)



Vicuna has arrived: a fresh LLM that aims to deliver roughly 90% of ChatGPT's quality on your personal computer. Vicuna is a free model fine-tuned on a dataset of conversations shared by ChatGPT users, and its developers claim it can reach up to 90% of ChatGPT's capability. This guide walks through running the Vicuna model on your PC.


  1. Git
  2. An Nvidia GPU
  3. Windows OS
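Before installing anything, you can sanity-check the first two prerequisites from a terminal (a minimal sketch; the exact driver check may differ on your system):

```shell
# Quick sanity check for the prerequisites.
git --version   # should print a Git version string
# nvidia-smi ships with the NVIDIA driver; if it is missing, check your driver install.
nvidia-smi || echo "nvidia-smi not found - check your NVIDIA driver install"
```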


  1. Download the oobabooga one-click installer and extract it to a location of your choice.
    • a) Double-click the install.bat file and choose NVIDIA GPU when prompted; it will then install all of the requirements it needs to run.
    • b) Run download-model.bat. This lists all of the models you can download right now and have installed automatically for the web UI. Select option A (Facebook's OPT 6.7B), which takes around 12 GB of disk space.
    • c) In the models folder, run:
      git lfs install
      git clone
    • d) Go back to the root folder, where the start-webui.bat file is:
      • Right-click the start-webui.bat file and choose Edit with Notepad.
      • Find the line beginning with call python and append these two arguments: --wbits 4 --groupsize 128
    • e) Go back to the root folder, double-click the start-webui.bat file, and choose the Vicuna model.
  2. Download the Vicuna model by clicking on this link:
    • a) Click on Files and versions, click on the .safetensors file, then click download. This downloads the main 8 GB model file onto your computer.
    • b) Copy and paste the file into the vicuna-13b-GPTQ-4bit-128g folder.
  3. Run start-webui.bat, which will print a local URL; to open it, hold Ctrl and left-click the link. And there you go: a web UI for running your LLMs, absolutely free, on your own computer.
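For reference, after step 1d the edited call python line inside start-webui.bat might look like the sketch below. This assumes the script launches server.py (as the web UI's launcher does); any flags already present in your copy should be kept, with only the two quantization arguments appended:

```
rem Sketch of the edited launch line in start-webui.bat (existing flags may vary)
call python server.py --wbits 4 --groupsize 128
```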