Ollama
Dear Diary,
There’s a lot of hype around the new Chinese model, DeepSeek, lately, so I wanted to check it out myself. However, I wasn’t too keen on sending heaps of data to their servers (or the government). That meant running it locally. After some digging, I found Ollama, an open-source tool that made it possible. Here’s how I got it up and running.
What is Ollama?
Ollama is a free, open-source tool that lets me run and manage LLMs on my own machine. Everything runs locally and offline, which makes it particularly valuable in privacy-sensitive settings such as finance, healthcare, and government.
Installation
- I installed it on my Mac with Homebrew: brew install ollama. (See below for how to start and verify the server.)
- For Linux, you can use: curl -fsSL https://ollama.com/install.sh | sh.
- Or, for Windows, you can download the executable from: https://ollama.com/download/windows
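Whichever route you take, it’s worth checking that the Ollama server is actually running before pulling any models. A minimal sketch (assuming the default port 11434; on macOS, Homebrew can also run the server as a background service):

```bash
# On macOS, run the server as a background service via Homebrew
brew services start ollama

# ...or start it manually in a terminal on any platform
ollama serve

# Verify the CLI and the server respond
ollama --version
curl http://127.0.0.1:11434   # should answer "Ollama is running"
```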
Usage
- You can now run a model from https://ollama.com/search
- For example: ollama run deepseek-r1:32b (a non-interactive variant is shown below).
- Ollama will download the model to your storage and load it into RAM. It executes on your GPU (if available) or CPU if no suitable GPU is found.
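You don’t have to use the interactive prompt, either; a model can be pulled ahead of time and queried in one shot. A small sketch, reusing the deepseek-r1:32b tag from above (swap in whatever size fits your hardware):

```bash
# Download the model without starting a chat session
ollama pull deepseek-r1:32b

# Ask a single question and print the answer to stdout
ollama run deepseek-r1:32b "Explain in one sentence what a local LLM is."
```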
As a rough guideline for fitting a model in RAM (figures are for the default quantized builds; actual usage can be checked as shown below):
- 7B models need ~8 GB RAM
- 13B models need ~16 GB RAM
- 32B models need ~32 GB RAM
- 70B models need ~64 GB RAM
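To see what a loaded model actually consumes, and whether it ended up on the GPU or the CPU, Ollama has a ps subcommand (the exact columns may vary between versions):

```bash
# List currently loaded models with their memory footprint
# and the processor (GPU/CPU) they are running on
ollama ps
```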
After running ollama run, you can query the model directly in the terminal.
Some useful commands include:
- /? or /help → Show help
- /set parameter num_ctx 2192 → Set the context window size (in tokens)
- /save my_super_model → Save the current session’s settings as a new model (see the Modelfile sketch at the end of this section)
- Ollama also provides a local web server (port 11434 by default), so you can query it over HTTP: curl -N http://127.0.0.1:11434/api/chat -d '{"model": "deepseek-r1:32b", "stream": true, "messages": [{"role": "user", "content": "Say Hello"}]}'
- You can also explore multimodal models, such as LLaVA, which can describe and answer questions about images:
- ollama run llava "What's in this image? /Pictures/smiley.png"
- To list all downloaded models, run ollama ls
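The /save command mentioned above also has a non-interactive counterpart: you can describe a model variant in a Modelfile and register it with ollama create. A minimal sketch (the name my_super_model, the context size, and the system prompt are just examples):

```bash
# Write a Modelfile that layers custom settings on top of a base model
cat > Modelfile <<'EOF'
FROM deepseek-r1:32b
PARAMETER num_ctx 8192
SYSTEM "You are a concise assistant that answers in bullet points."
EOF

# Register it under a new name and run it like any other model
ollama create my_super_model -f Modelfile
ollama run my_super_model
```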
User-friendly UI
If you’d like a more user-friendly UI, Open WebUI is a great client for prompting local models. It runs as a Docker container; on first launch it asks you to create an account, which is stored locally rather than on a remote service.
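For reference, this is roughly how I would start it with Docker; the image name, port, and volume below follow Open WebUI’s published defaults, so double-check them against their documentation:

```bash
# Run Open WebUI on http://localhost:3000 and let the container
# reach the Ollama server running on the host machine
docker run -d \
  -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
```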
Why Ollama?
Ollama simplifies the process of running and managing LLMs, making it accessible for developers and organizations to leverage advanced language models without complex setups. Its efficient resource management and user-friendly commands make it a powerful tool for integrating LLMs into various projects. Plus, I can work efficiently—even on an airplane. 🙂
Conclusion
As we continue to explore and integrate tools like Ollama, we're excited about the possibilities they bring to our development workflow. Stay tuned for more updates and insights as we delve deeper into the world of LLMs and open-source solutions.