Commit 9fae635: Post on Ollama

1 parent c93e559

File tree

7 files changed: +78 −0 lines changed
_posts/2025-01-01-ollama.markdown

Lines changed: 78 additions & 0 deletions
@@ -0,0 +1,78 @@
---
layout: post
title: Ollama
subtitle: running LLMs on your desktop
date: 2025-01-01
background: /img/headers/legislature.jpg
comments: true
published: true
---

<img src="/img/posts/ollama_logo.png" class="img-fluid" style="margin-left:10px; float:right"/>

While the publicly available LLMs such as OpenAI's [ChatGPT](https://chatgpt.com/), Anthropic's [Claude](https://claude.ai/), Google's [Gemini](https://gemini.google.com/app), and xAI's [Grok](https://x.com/i/grok) are all very useful, they have two primary drawbacks:

1. They all offer a free tier with limits on usage volume and features, but then start charging users for premium tiers.
1. By definition, when you chat with these public LLMs, you are sending your data (your text) to that company, compromising your privacy.

Meta (Facebook's parent company) took a different route and released its LLM as free to run on one's own computer. While not open-source, Meta offers the Llama models as free downloads for anyone to test. Most remarkably, once downloaded, a model requires no further Internet access to use, allowing for truly private conversations. With people starting to use LLMs as stand-ins for a psychologist or psychiatrist, it matters that they no longer need to share their private thoughts with a large corporation.

## Ollama with a GPU

The easiest way I found to run LLM models with Ollama was via [Open-WebUI](https://github.com/open-webui/open-webui), running on [Docker](https://www.docker.com/products/personal/). My first test machine was a gaming PC with an AMD Ryzen 7 5700X 8-core CPU paired with an Nvidia 3070 Ti GPU. This is decidedly mid-range personal hardware, now a few years old; it is not a super powerful system by any means.

From a PowerShell prompt on this Windows 11 system, I pulled the image and started the container with the following command:

```
docker run -d -p 3000:8080 --gpus=all -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama
```

This pulls the Open-WebUI image, with Ollama bundled within, and enables support for my Nvidia GPU.

<img src="/img/posts/ollama_galadriel_docker.png" class="img-fluid" />

Then I browsed to `http://localhost:3000` to create my first user and start interacting with Open-WebUI. The first job was to pull down the newish Llama 3.2 model, which is small enough to run on my older computer. It took a little while to work out these steps (a command-line alternative follows the list):

1. Click your profile icon in the top-right corner.
1. Choose 'Settings' from the drop-down menu.
1. In the Settings dialog, choose 'Admin Settings' along the left.
1. In Admin Settings, choose 'Connections', and select the wrench icon to 'Manage Ollama API Connections'.
1. In that dialog, start typing 'llama' in the 'Pull a model from Ollama.com' box.
1. Find the [model you want](https://ollama.com/library), and click the Download icon.

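If you prefer the command line, the bundled Ollama CLI can do the same pull; a sketch, assuming the container name from the `docker run` command above:

```
# Pull the Llama 3.2 model via the Ollama CLI inside the container
docker exec -it open-webui ollama pull llama3.2
```
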
Once you have loaded a model, you can start chatting with it. For example:

<img src="/img/posts/ollama_chat_gpus.png" class="img-fluid" />

I watched GPU load in the Performance tab of the Windows Task Manager, and each response from the model showed a spike in GPU usage. As well, each response has a small 'info' icon below it that shows how long it took to generate; I was seeing responses within a few seconds. Then I asked the model if my GPU was any good:

<img src="/img/posts/ollama_chat_gpu_3070ti.png" class="img-fluid" />
### Offline Mode

I then disabled my computer's network connection and asked the model to tell me about the history of Rome. Even with no Internet access, the model gave a useful response, showing just how much information is embedded within the multi-gigabyte model download.

<img src="/img/posts/ollama_offline_mode.png" class="img-fluid" />
## Ollama with only a CPU

You don't actually need a GPU to run Ollama, but having one certainly makes for a more responsive chatting experience. My other test system was a similar gaming PC with the same CPU, but an AMD Radeon 6700 XT GPU. Unfortunately, in the world of machine learning and artificial intelligence, AMD's support for consumer-level GPUs is negligible compared to Nvidia's. In this case, the Radeon GPU is useless, and Ollama relied exclusively on the CPU to generate responses.

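In that situation, the setup is just the earlier `docker run` command minus the `--gpus=all` flag, and Ollama should fall back to the CPU on its own:

```
# CPU-only variant of the earlier command: no --gpus flag
docker run -d -p 3000:8080 -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama
```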
<img src="/img/posts/ollama_amd_gpu.png" class="img-fluid" />

The 'info' pop-up for the response above showed about 20 seconds, roughly 10x slower than a similar response on the machine with an Nvidia GPU. So clearly, an Nvidia GPU is still highly recommended for running these LLMs on one's own computer.

## Conclusions

Overall, it's great that these "open-weight" models exist and can be run on consumer hardware. I don't want large corporations to be the only ones in control of this technology. However, given that actually training these models takes data-center levels of hardware (millions of dollars), it seems that large corporations may still end up controlling where these LLMs go. Definitely a space to watch...

### More in this series...
* [Google Gemini](/2024/02/16/google-gemini) - Google Gemini
* [Anthropic Claude](/2024/03/04/anthropic-claude) - Anthropic Claude
* [Llama 3](/2024/04/19/llama-3) - Llama 3
* [ChatGPT 4o](/2024/05/21/chatgpt-4o) - ChatGPT 4o
* [Anthropic Claude in Canada](/2024/06/05/anthropic-claude-canada) - Claude eh?
* [LLMs on Android](/2024/07/18/llms-on-android) - AI in your pocket
* [Google Imagen3](/2024/08/28/google-imgen3) - AI Image Generation
* [Azure AI Studio](/2024/09/30/azure-ai-studio) - AI on MS Azure
* [Google AI Studio](/2024/12/08/google-ai-studio) - AI on Google

img/posts/ollama_amd_gpu.png
img/posts/ollama_chat_gpus.png
img/posts/ollama_logo.png
img/posts/ollama_offline_mode.png