
How to Run CodeLlama in VSCode on macOS

Many people are unable to use a coding assistant LLM like Copilot or ChatGPT because of privacy concerns in a non-open codebase. This is true of our private WordPress.com codebase: we don’t want to be sending our secrets to OpenAI or Microsoft.

But now, with the release of CodeLlama (and with a huge hat tip to llama.cpp), and thanks to the Continue VSCode extension, we can run these models directly on our own hardware.

Here’s how I did it:

  1. Download and install Ollama. It lets you run and serve these models in a way that Continue can use.
  2. Pick the model you want. 7B is the lightest, 13B and 34B are heavier, and there are a bunch of quantized versions of each. The quantizations are from TheBloke; see for example CodeLlama-7B-GGUF and scroll down to the Provided Files table to see the size vs. performance tradeoffs.
  3. I chose the large quantization of the 7B model, so I ran:
    ollama pull codellama:7b-instruct-q5_K_M
  4. While you’re waiting (that model is ~5GB), install the Continue VSCode extension.
  5. Follow the instructions on how to use Ollama in Continue. (The entire reason for this blog post is that those instructions are incomplete.) In my case, with config.py open, my Models line looks like:
    models=Models(default=Ollama(model="codellama:7b-instruct-q5_K_M"))
    (Note: Continue will add some extra stuff to it later, adding prompt_templates etc.)
  6. Once your model is downloaded, you need to start the server that Continue will talk to (this was my missing piece):
    ollama serve
    (The server loads the model on demand when a request names it; you can sanity-check it with the snippet after this list.)
  7. You might need to reload VSCode, but then you should be up and running!
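
To confirm that the server is really up and answering with your model before you rely on Continue, you can hit Ollama's local HTTP API directly. Here's a minimal sketch, assuming Ollama's default address (localhost:11434) and a version recent enough to support the non-streaming "stream": false option; the prompt is just a throwaway example:

    # Quick sanity check that the Ollama server is up and can load the model.
    # Assumes Ollama's default address (http://localhost:11434) and a version
    # that supports "stream": false; the prompt is just a throwaway example.
    import json
    import urllib.request

    payload = {
        "model": "codellama:7b-instruct-q5_K_M",
        "prompt": "Write a Python function that reverses a string.",
        "stream": False,  # one JSON object back instead of streamed chunks
    }

    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

    # Print the model's completion from the JSON response.
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["response"])

If that prints a completion, Continue's requests to the same endpoint should work too.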
Generating a response took ~20 seconds on my M1 Pro with 16GB of RAM.

2 responses to “How to Run CodeLlama in VSCode on macOS”

  1. @mattwiebe.blog nice bugs you got there, be a shame if somebody were to fix them

  2. […] my last post about setting up CodeLlama, some colleagues have asked the million dollar question: how does […]
