<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Machine‑learning‑deployment on martinuke0&#39;s Blog</title>
    <link>https://martinuke0.github.io/tags/machinelearningdeployment/</link>
    <description>Recent content in Machine‑learning‑deployment on martinuke0&#39;s Blog</description>
    <generator>Hugo -- 0.152.2</generator>
    <language>en</language>
    <lastBuildDate>Thu, 02 Apr 2026 14:00:25 +0000</lastBuildDate>
    <atom:link href="https://martinuke0.github.io/tags/machinelearningdeployment/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Scaling Small Language Models: Why SLMs Are Replacing Giants for On‑Device Edge Infrastructure</title>
      <link>https://martinuke0.github.io/posts/2026-04-02-scaling-small-language-models-why-slms-are-replacing-giants-for-ondevice-edge-infrastructure/</link>
      <pubDate>Thu, 02 Apr 2026 14:00:25 +0000</pubDate>
      <guid>https://martinuke0.github.io/posts/2026-04-02-scaling-small-language-models-why-slms-are-replacing-giants-for-ondevice-edge-infrastructure/</guid>
      <description>&lt;h2 id=&#34;table-of-contents&#34;&gt;Table of Contents&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;&lt;a href=&#34;#introduction&#34;&gt;Introduction&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#the-rise-of-edge-ai&#34;&gt;The Rise of Edge AI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#why-large-language-models-llms-struggle-on-the-edge&#34;&gt;Why Large Language Models (LLMs) Struggle on the Edge&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#defining-small-language-models-slms&#34;&gt;Defining Small Language Models (SLMs)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#core-techniques-for-scaling-down&#34;&gt;Core Techniques for Scaling Down&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;5.1 &lt;a href=&#34;#knowledge-distillation&#34;&gt;Knowledge Distillation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;5.2 &lt;a href=&#34;#quantization&#34;&gt;Quantization&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;5.3 &lt;a href=&#34;#pruning--structured-sparsity&#34;&gt;Pruning &amp;amp; Structured Sparsity&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;5.4 &lt;a href=&#34;#efficient-architectures&#34;&gt;Efficient Architectures&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#practical-example-deploying-a-7%E2%80%91b-slm-on-a-raspberry-pi-4&#34;&gt;Practical Example: Deploying a 7‑B SLM on a Raspberry Pi 4&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#real%E2%80%91world-deployments-and-case-studies&#34;&gt;Real‑World Deployments and Case Studies&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#performance-benchmarks--trade%E2%80%91offs&#34;&gt;Performance Benchmarks &amp;amp; Trade‑offs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#security-privacy-and-regulatory-advantages&#34;&gt;Security, Privacy, and Regulatory Advantages&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#future-outlook-from-slms-to-federated-llms&#34;&gt;Future Outlook: From SLMs to Federated LLMs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#conclusion&#34;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#resources&#34;&gt;Resources&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;hr&gt;
&lt;h2 id=&#34;introduction&#34;&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Over the last few years, natural language processing (NLP) has undergone a &lt;strong&gt;paradigm shift&lt;/strong&gt;. Public attention has focused on ever‑larger language models such as GPT‑4, PaLM 2, and LLaMA‑70B, but practical deployments are increasingly gravitating toward &lt;strong&gt;small language models (SLMs)&lt;/strong&gt; that run locally on edge devices such as smartphones, wearables, and industrial controllers.&lt;/p&gt;</description>
    </item>
  </channel>
</rss>
