Commit 626134f: "Post on Deepseek"
Parent: 22c0b78

8 files changed (+51 −1 lines)

_posts/2025-01-03-ollama.markdown (2 additions, 1 deletion)
@@ -19,7 +19,7 @@ Meta (who owns Facebook) decided to take a different route and release their LLM

## Ollama with a GPU

- The easiest way I found to run LLM models using Ollama was with [Open-WebUI](), running on [Docker](https://www.docker.com/products/personal/). My first test machine was a gaming PC with an AMD Ryzen 7 5700x 8-core CPU paired with an Nvidia 3070ti GPU. This is decidedly mid-range personal hardware, now a few years old. It is not a super powerful system by any means.
+ The easiest way I found to run LLM models using Ollama was with [Open-WebUI](https://openwebui.com/), running on [Docker](https://www.docker.com/products/personal/). My first test machine was a gaming PC with an AMD Ryzen 7 5700x 8-core CPU paired with an Nvidia 3070ti GPU. This is decidedly mid-range personal hardware, now a few years old. It is not a super powerful system by any means.

From a PowerShell on this Windows 11 system, I loaded the image and ran the following
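
The hunk ends before the command itself. For reference, a typical Open-WebUI launch under Docker looks like the following: a sketch based on Open-WebUI's standard container image, not the exact command from the post, which this diff does not show.

```shell
# Sketch of a typical Open-WebUI launch on Docker. Assumptions: the standard
# ghcr.io image and default port mapping; the post's exact flags are not
# shown in this hunk.
docker run -d -p 3000:8080 \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
```

The UI is then reachable at http://localhost:3000. When Ollama runs on the host rather than in a container, Open-WebUI's docs commonly add `--add-host=host.docker.internal:host-gateway` so the container can reach it.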

@@ -76,3 +76,4 @@ Overall, it's great that these "open-weight" models exist and can be run on cons

* [Google Imagen3](/2024/08/28/google-imgen3) - AI Image Generation
* [Azure AI Studio](/2024/09/30/azure-ai-studio) - AI on MS Azure
* [Google AI Studio](/2024/12/08/google-ai-studio) - AI on Google
+ * [Deepseek](/2025/02/06/deepseek-distill) - Trying small distills locally.
New file (49 additions, 0 deletions): the new Deepseek Distill post, shown in full below.
---
layout: post
title: Deepseek Distill
subtitle: testing tiny models
date: 2025-02-06
background: /img/headers/carthew_alderson_redstone.jpg
comments: true
published: true
---
<img src="/img/posts/deepseek_logo.png" class="img-fluid" style="margin-left:10px; float:right"/>

With all the hype about LLMs, the stock market was surprised by the Deepseek r1 model release in late January, leading to a [large selloff](https://www.reuters.com/technology/chinas-deepseek-sets-off-ai-market-rout-2025-01-27/) in Nvidia shares. The thinking was that Deepseek had trained their model very cheaply, for about $5 million and barely using Nvidia chips, instead of the billions being spent by large US firms. A week later, it turned out the headline was wrong: they reportedly spent [$1.6 billion using 50k Nvidia GPUs](https://www.windowscentral.com/software-apps/deepseek-6-million-r1-cost-efficient-model-training-might-be-a-ruse). A good example of why headline news should be read with skepticism.
I have been [playing](/2025/01/03/ollama) around with small models on an old gaming PC using Ollama, but with only an RTX 3070ti GPU and its 8 GB of VRAM, running the full-sized Deepseek r1 is not possible. This is too bad, as the r1 model is apparently quite good. Instead, it is possible to try running [a distilled model](https://ollama.com/library/deepseek-r1). These are actually Qwen models that have been distilled using Deepseek r1.

<img src="/img/posts/deepseek_ollama_distills.png" class="img-fluid" />
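
A rough back-of-the-envelope calculation shows why the 8 GB card rules out the full model but not a small distill. This sketch counts only quantized weight memory; KV cache and runtime overhead are ignored.

```python
# Approximate weight memory: parameter count times bytes per parameter.
# Illustrative only: KV cache and runtime overhead add more on top.
def weight_gb(params_billions: float, bits_per_param: int) -> float:
    """Weight memory in GiB for a model of the given size and quantization."""
    return params_billions * 1e9 * (bits_per_param / 8) / 2**30

# A 7b distill at 4-bit quantization fits comfortably in 8 GB of VRAM:
print(round(weight_gb(7, 4), 1))   # ~3.3

# The full 671b Deepseek r1 needs hundreds of GB even at 4-bit:
print(round(weight_gb(671, 4)))    # ~312
```

That gap, roughly two orders of magnitude, is why only the distills are practical on consumer GPUs.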
In my case, I'll try the 7b model, downloading via "Settings > Admin Settings > Connections > Manage Ollama API Connections":

<img src="/img/posts/deepseek_ollama_7b_download.png" class="img-fluid" />
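
Open-WebUI is talking to Ollama's local REST API under the hood, so once the model is pulled it can also be prompted directly. A minimal sketch of the request body for Ollama's `/api/generate` endpoint follows; the model name matches the 7b distill above, and the localhost URL in the comment is Ollama's default.

```python
import json

# Build the JSON body for a POST to Ollama's /api/generate endpoint,
# which by default listens at http://localhost:11434.
def generate_request(model: str, prompt: str) -> str:
    body = {
        "model": model,      # e.g. the distill pulled above
        "prompt": prompt,
        "stream": False,     # one complete response instead of a token stream
    }
    return json.dumps(body)

payload = generate_request("deepseek-r1:7b", "Why is the sky blue?")
print(payload)
```

Sending that body with any HTTP client returns the same text the chat window shows.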
This results in the chat window shown:

<img src="/img/posts/deepseek_prompt_window.png" class="img-fluid" />
As r1 is a "thinking" model it is quite verbose, but it does a good job of explaining its thought process in the &lt;think&gt; tag:

<img src="/img/posts/deepseek_sky_blue.png" class="img-fluid" />
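
Since the distills wrap their reasoning in literal `<think>...</think>` tags in the response text, the chain of thought is easy to separate from the final answer when scripting against the model. A sketch; the tag format follows the r1 output shown above.

```python
import re

# Split an r1-style response into its <think> reasoning and the final answer.
def split_thinking(text: str) -> tuple[str, str]:
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not match:
        return "", text.strip()  # no reasoning block present
    thinking = match.group(1).strip()
    answer = text[match.end():].strip()
    return thinking, answer

raw = "<think>Rayleigh scattering favors short wavelengths.</think>The sky appears blue."
thinking, answer = split_thinking(raw)
print(answer)  # The sky appears blue.
```
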
This response came back in roughly 20 seconds, which wasn't too bad for my old GPU. I followed up with: "Is that on planet Earth only? What about on Mars?"

<img src="/img/posts/deepseek_sky_mars.png" class="img-fluid" />
## Conclusions

This dabbling with a distilled model on old hardware is fun, but not at all reflective of Deepseek's true performance. For the full-sized model, Hugging Face [reports](https://huggingface.co/deepseek-ai/DeepSeek-R1) that it "achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks." Overall, the LLM space continues to improve rapidly. Just this week Google released [Gemini 2.0](https://blog.google/technology/google-deepmind/gemini-model-updates-february-2025/), which looks promising.
### More in this series...
* [Google Gemini](/2024/02/16/google-gemini) - Google Gemini
* [Anthropic Claude](/2024/03/04/anthropic-claude) - Anthropic Claude
* [Llama 3](/2024/04/19/llama-3) - Llama 3
* [ChatGPT 4o](/2024/05/21/chatgpt-4o) - ChatGPT 4o
* [Anthropic Claude in Canada](/2024/06/05/anthropic-claude-canada) - Claude eh?
* [LLMs on Android](/2024/07/18/llms-on-android) - AI in your pocket
* [Google Imagen3](/2024/08/28/google-imgen3) - AI Image Generation
* [Azure AI Studio](/2024/09/30/azure-ai-studio) - AI on MS Azure
* [Google AI Studio](/2024/12/08/google-ai-studio) - AI on Google
* [Ollama](/2025/01/03/ollama) - Local LLMs on your own computer.

New images:
* img/posts/deepseek_logo.png (2.67 KB)
* img/posts/deepseek_sky_blue.png (105 KB)
* img/posts/deepseek_sky_mars.png (107 KB)
* three further images (22.5 KB, 27.2 KB, 31.2 KB; filenames not shown in this view)
