Run powerful AI models on your own computer: completely private, completely free
Ollama makes running an AI model locally as easy as installing any other app: one command to pull a model, one command to chat. Your data never leaves your machine, and there are no API keys, usage limits, or monthly fees. It supports Llama 3, Mistral, Gemma, Phi-3, DeepSeek, and dozens more.
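On the command line, the two steps are typically `ollama pull llama3.2` followed by `ollama run llama3.2`. For scripting, the same local model can be reached over Ollama's local HTTP API. The sketch below is a minimal example that assumes the Ollama server is running on its default port (11434) and that the model has already been pulled; the helper names `build_request` and `ask` are illustrative, not part of Ollama.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port, 11434.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the local model and return the generated text."""
    request = build_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    # With "stream": False the reply is a single JSON object whose
    # "response" field holds the generated text.
    return body["response"]
```

With the server running, `ask("llama3.2", "Why is the sky blue?")` should return the model's answer as a string. The request goes to `localhost`, so the prompt stays on your machine.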
| Model | Size | RAM | Best for |
|---|---|---|---|
| llama3.2 | 2 GB | 8 GB | Fast chat, low-end hardware |
| llama3.1 | 4.7 GB | 8 GB | General purpose, good quality |
| llama3.1:70b | 40 GB | 64 GB | Near GPT-4 quality, needs high-end Mac |
| mistral | 4.1 GB | 8 GB | Fast, strong coding tasks |
| deepseek-coder | 776 MB | 8 GB | Code generation, very fast |
| phi3:mini | 2.3 GB | 8 GB | Lightweight, good reasoning |
| gemma2:2b | 1.6 GB | 8 GB | Google model, very fast |
| codellama | 3.8 GB | 8 GB | Code completion, works with editors |
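
To choose from the table programmatically, one option is to filter it by the RAM and free disk space you have. The sketch below copies the Size and RAM columns from the table above (776 MB is written as 0.776 GB); the `Model` class and `models_that_fit` function are hypothetical helpers, not part of Ollama.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    size_gb: float  # download size from the table
    ram_gb: int     # RAM figure from the table


# Values copied from the table above.
MODELS = [
    Model("llama3.2", 2.0, 8),
    Model("llama3.1", 4.7, 8),
    Model("llama3.1:70b", 40.0, 64),
    Model("mistral", 4.1, 8),
    Model("deepseek-coder", 0.776, 8),
    Model("phi3:mini", 2.3, 8),
    Model("gemma2:2b", 1.6, 8),
    Model("codellama", 3.8, 8),
]


def models_that_fit(ram_gb: int, free_disk_gb: float) -> list[str]:
    """Return names of models that fit the given RAM and disk, smallest first."""
    fitting = [
        m for m in MODELS
        if m.ram_gb <= ram_gb and m.size_gb <= free_disk_gb
    ]
    return [m.name for m in sorted(fitting, key=lambda m: m.size_gb)]
```

For example, `models_that_fit(8, 3.0)` keeps only the models that list 8 GB of RAM and download in under 3 GB.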
Install the "Continue" extension in VS Code, set its provider to Ollama, and point it at your local model. You then have local AI completions in VS Code with no API costs.
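
Before pointing an editor extension at a model, it can help to confirm that the model is actually installed locally. The sketch below is one way to check. It assumes Ollama is running on its default port and that the `/api/tags` endpoint returns a JSON object with a `models` list whose entries carry a `name` such as `codellama:latest`. The function names are illustrative.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port, 11434.
TAGS_URL = "http://localhost:11434/api/tags"


def model_names(tags_response: dict) -> list[str]:
    """Extract model names from a parsed /api/tags response."""
    return [m["name"] for m in tags_response.get("models", [])]


def is_installed(wanted: str, names: list[str]) -> bool:
    """Check for a model, treating a bare name as its ':latest' tag."""
    target = wanted if ":" in wanted else f"{wanted}:latest"
    return target in names


def fetch_installed(timeout: float = 5.0) -> list[str]:
    """Ask the local Ollama server which models are installed."""
    with urllib.request.urlopen(TAGS_URL, timeout=timeout) as resp:
        return model_names(json.loads(resp.read().decode("utf-8")))
```

With the server running, `is_installed("codellama", fetch_installed())` reports whether `codellama` is ready to use. If it returns `False`, pull the model first.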