---
layout: post
title: Deepseek Distill
subtitle: testing tiny models
date: 2025-02-06
background: /img/headers/carthew_alderson_redstone.jpg
comments: true
published: true
---

<img src="/img/posts/deepseek_logo.png" class="img-fluid" style="margin-left:10px; float:right"/>

With all the hype around LLMs, the stock market was surprised by the Deepseek r1 model release in late January, leading to a [large selloff](https://www.reuters.com/technology/chinas-deepseek-sets-off-ai-market-rout-2025-01-27/) in Nvidia shares. The thinking was that Deepseek had trained their model very cheaply, for around $5 million while barely using Nvidia chips, instead of the billions being spent by large US firms. A week later, it turned out the headline was wrong: they reportedly spent [$1.6 billion using 50k Nvidia GPUs](https://www.windowscentral.com/software-apps/deepseek-6-million-r1-cost-efficient-model-training-might-be-a-ruse). A good example of why headline news should be read with skepticism.

I have been [playing](/2025/01/03/ollama) around with small models on an old gaming PC using Ollama, but with only an RTX 3070 Ti GPU and 8 GB of VRAM, running the full-sized Deepseek r1 is not possible. That is too bad, as the r1 model is reportedly quite good. Instead, it is possible to run [a distilled model](https://ollama.com/library/deepseek-r1). These are actually smaller Qwen and Llama models that have been distilled using Deepseek r1.

<img src="/img/posts/deepseek_ollama_distills.png" class="img-fluid" />

In my case, I'll try the 7b model, downloading via "Settings > Admin Settings > Connections > Manage Ollama API Connections":

<img src="/img/posts/deepseek_ollama_7b_download.png" class="img-fluid" />
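
If you prefer scripting to clicking, the same download can be requested from the local Ollama server's REST API, which listens on port 11434 by default. A minimal sketch in Python, assuming the `requests` package is installed and a recent Ollama version (newer releases accept `model` as the parameter name on `/api/pull`):

```python
import requests

# Ask the local Ollama server to download the 7b distill;
# equivalent to the download button in the web UI.
resp = requests.post(
    "http://localhost:11434/api/pull",
    json={"model": "deepseek-r1:7b", "stream": False},
    timeout=None,  # the multi-gigabyte download can take a while
)
print(resp.json())  # {'status': 'success'} once the pull completes
```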

This results in the chat window shown:

<img src="/img/posts/deepseek_prompt_window.png" class="img-fluid" />
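
The chat window isn't the only way in, either: Ollama exposes a `/api/generate` endpoint, so the same question can be asked from a script. A minimal sketch, with `stream` set to false so one JSON object comes back with the complete answer:

```python
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "deepseek-r1:7b",
        "prompt": "Why is the sky blue?",
        "stream": False,  # wait for the full answer, not streamed tokens
    },
)
print(resp.json()["response"])  # includes the <think>...</think> block
```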

As r1 is a "thinking" model, it is quite verbose, but it does a good job of explaining its thought process inside the `<think>` tag:

<img src="/img/posts/deepseek_sky_blue.png" class="img-fluid" />
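
Because the reasoning is wrapped in `<think>...</think>`, it is easy to separate from the final answer when calling the model from code. A small sketch using a hypothetical `split_thinking` helper:

```python
import re

def split_thinking(text: str) -> tuple[str, str]:
    """Split a model response into its <think> block and the final answer."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not match:
        return "", text.strip()
    return match.group(1).strip(), text[match.end():].strip()

# Example with a trimmed-down response:
sample = "<think>Sunlight scatters off air molecules...</think>The sky looks blue because..."
thinking, answer = split_thinking(sample)
print(answer)  # "The sky looks blue because..."
```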

This response came back in roughly 20 seconds, which wasn't too bad for my old GPU. I followed up with: "Is that on planet Earth only? What about on Mars?"

<img src="/img/posts/deepseek_sky_mars.png" class="img-fluid" />
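
That "roughly 20 seconds" can be pinned down, too: when `stream` is false, the `/api/generate` response includes token counts and timings (in nanoseconds). A sketch that turns them into tokens per second:

```python
import requests

data = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "deepseek-r1:7b",
          "prompt": "Is that on planet Earth only? What about on Mars?",
          "stream": False},
).json()

# eval_count = generated tokens; eval_duration / total_duration are nanoseconds
tokens_per_sec = data["eval_count"] / (data["eval_duration"] / 1e9)
print(f"{data['eval_count']} tokens in {data['total_duration'] / 1e9:.1f} s "
      f"({tokens_per_sec:.1f} tokens/s)")
```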

## Conclusions

This dabbling with a distilled model on old hardware is fun, but not at all reflective of Deepseek's true performance. Hugging Face [reports](https://huggingface.co/deepseek-ai/DeepSeek-R1) that the full model "achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks." Overall, the LLM space continues to improve rapidly. Just this week Google released [Gemini 2.0](https://blog.google/technology/google-deepmind/gemini-model-updates-february-2025/), which looks promising.

### More in this series...
* [Google Gemini](/2024/02/16/google-gemini) - Google Gemini
* [Anthropic Claude](/2024/03/04/anthropic-claude) - Anthropic Claude
* [Llama 3](/2024/04/19/llama-3) - Llama 3
* [ChatGPT 4o](/2024/05/21/chatgpt-4o) - ChatGPT 4o
* [Anthropic Claude in Canada](/2024/06/05/anthropic-claude-canada) - Claude eh?
* [LLMs on Android](/2024/07/18/llms-on-android) - AI in your pocket
* [Google Imagen3](/2024/08/28/google-imgen3) - AI Image Generation
* [Azure AI Studio](/2024/09/30/azure-ai-studio) - AI on MS Azure
* [Google AI Studio](/2024/12/08/google-ai-studio) - AI on Google
* [Ollama](/2025/01/03/ollama) - Local LLMs on your own computer