The Anatomy of Tool Calling in LLMs: A Deep Dive

Introduction Tool calling (also called function calling or plugins) is the capability that turns large language models from text predictors into general-purpose controllers for software. Instead of only generating natural language, an LLM can decide when to call a tool (e.g., “get_weather” or “run_sql_query”), decide which tool to call, construct the arguments for that call, and use the tool’s result to continue its reasoning or response. This post is a deep dive into the anatomy of tool calling: the moving parts, how they interact, what can go wrong, and how to design reliable systems on top of them. ...
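A minimal sketch of that decide → call → resume loop, assuming the OpenAI Python SDK (`openai >= 1.x`) and a hypothetical local `get_weather` helper; any tool-calling-capable API follows the same shape.

```python
# Sketch of the tool-calling loop: model requests a tool, we execute it,
# then hand the result back so the model can finish its answer.
import json
from openai import OpenAI

client = OpenAI()

def get_weather(city: str) -> str:
    # Stand-in for a real weather API call (hypothetical helper).
    return f"Sunny, 21°C in {city}"

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Nairobi?"}]
resp = client.chat.completions.create(model="gpt-4o-mini", messages=messages, tools=tools)
msg = resp.choices[0].message

if msg.tool_calls:  # the model decided a tool is needed
    call = msg.tool_calls[0]
    args = json.loads(call.function.arguments)   # model-constructed arguments
    result = get_weather(**args)                 # execute the tool locally
    messages += [msg, {"role": "tool", "tool_call_id": call.id, "content": result}]
    final = client.chat.completions.create(model="gpt-4o-mini", messages=messages, tools=tools)
    print(final.choices[0].message.content)      # model folds the result into its reply
```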

January 7, 2026 · 14 min · 2879 words · martinuke0

A Deep Dive into Semantic Routers for LLM Applications (With Resources)

Introduction As language models are woven into more complex systems (multi-tool agents, retrieval-augmented generation, multi-model stacks), “what should handle this request?” becomes a first-class problem. That’s what a semantic router solves. Instead of routing on keywords or simple rules, a semantic router uses meaning (embeddings, similarity, sometimes LLMs themselves) to decide which tool, model, or chain to call, which knowledge base to query, and which specialized agent or microservice should own the request. This post is a detailed, practical guide to semantic routers: ...
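A minimal sketch of embedding-based routing; it assumes the OpenAI embeddings API as the backend, but any embedding model could be substituted, and the route names and threshold are illustrative.

```python
# Route a query to the closest-matching handler by cosine similarity
# between the query embedding and per-route description embeddings.
import numpy as np
from openai import OpenAI

client = OpenAI()

ROUTES = {
    "sql_agent":      "questions about data, tables, metrics, or SQL queries",
    "weather_tool":   "questions about current weather or forecasts",
    "docs_retriever": "questions about product documentation and how-tos",
}

def embed(text: str) -> np.ndarray:
    out = client.embeddings.create(model="text-embedding-3-small", input=text)
    return np.array(out.data[0].embedding)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Embed the route descriptions once, up front.
ROUTE_VECS = {name: embed(desc) for name, desc in ROUTES.items()}

def route(query: str, threshold: float = 0.3) -> str:
    q = embed(query)
    best, score = max(((n, cosine(q, v)) for n, v in ROUTE_VECS.items()), key=lambda kv: kv[1])
    return best if score >= threshold else "fallback_llm"  # no close match: default chain

print(route("How many orders did we ship last month?"))  # likely -> sql_agent
```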

January 6, 2026 · 17 min · 3454 words · martinuke0

LangChain Cookbook: Zero-to-Hero Tutorial for Developers

As an expert LangChain engineer and educator, I’ll guide you from zero knowledge to hero-level proficiency with the LangChain Cookbook. This practical resource collection offers end-to-end code examples and workflows for building production-ready AI applications using components like RAG (Retrieval-Augmented Generation), agents, chains, tools, memory, embeddings, and databases[1][5][6]. Whether you’re a beginner prototyping in Jupyter or scaling to production, this tutorial provides step-by-step runnable examples, common pitfalls, extension tips, and best practices. ...
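To set expectations for the style of example the cookbook walks through, here is a minimal "hello, chains" sketch, assuming `langchain-openai` and `langchain-core` are installed; the model name and prompt are placeholders.

```python
# Smallest useful LCEL chain: prompt -> chat model -> string output.
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

prompt = ChatPromptTemplate.from_template(
    "Explain {topic} to a new developer in two sentences."
)
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# LCEL composition: each component is a runnable, piped into the next.
chain = prompt | llm | StrOutputParser()

print(chain.invoke({"topic": "retrieval-augmented generation"}))
```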

January 4, 2026 · 5 min · 856 words · martinuke0

Sub-Agents in LLM Systems: Architecture, Execution Model, and Design Patterns

As LLM-powered systems have grown more capable, they have also grown more complex. By 2025, many production-grade AI systems no longer rely on a single monolithic agent. Instead, they are composed of multiple specialized sub-agents, each responsible for a narrow slice of reasoning, execution, or validation. Sub-agents enable scalability, reliability, and controllability. They allow systems to decompose complex goals into manageable units, reduce context pollution, and introduce clear execution boundaries. This document provides a deep technical explanation of how sub-agents work, how they are orchestrated, and the dominant architectural patterns used in real-world systems, with links to primary research and tooling. ...
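An illustrative (not prescriptive) sketch of the orchestrator/sub-agent split: a planner decomposes the goal, specialized sub-agents own narrow tasks, and each dispatch is a clear execution boundary. All names and the hard-coded plan are hypothetical; in a real system an LLM would produce both the plan and the sub-agent outputs.

```python
# Toy orchestrator: decompose a goal into tasks, dispatch each task to the
# sub-agent that owns that kind of work, and collect the results.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Task:
    kind: str      # e.g. "research", "code", "validate"
    payload: str

# Each sub-agent sees only its own task, which keeps context narrow per agent.
SUB_AGENTS: dict[str, Callable[[Task], str]] = {
    "research": lambda t: f"[research notes for: {t.payload}]",
    "code":     lambda t: f"[patch implementing: {t.payload}]",
    "validate": lambda t: f"[review of: {t.payload}]",
}

def plan(goal: str) -> list[Task]:
    # Hard-coded decomposition; an LLM planner would generate this.
    return [Task("research", goal), Task("code", goal), Task("validate", goal)]

def orchestrate(goal: str) -> list[str]:
    results = []
    for task in plan(goal):
        results.append(SUB_AGENTS[task.kind](task))  # execution boundary per sub-agent
    return results

print(orchestrate("add retry logic to the payments client"))
```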

December 30, 2025 · 4 min · 807 words · martinuke0

Top LLM Tools & Concepts for 2025: A Deep Technical & Ecosystem Guide

By 2025, Large Language Models (LLMs) have evolved from isolated text-generation systems into general-purpose reasoning engines embedded deeply in modern software systems. This evolution has been driven by agentic workflows, retrieval-augmented generation, standardized tool interfaces, long-context reasoning, and stronger evaluation and observability layers. This article provides a system-level overview of the most important LLM tools and concepts shaping 2025, with direct links to specifications, repositories, and primary sources. 1. Frontier Language Models & Architectural Shifts 1.1 Frontier Closed-Source Models Closed-source models lead in reasoning depth, multimodality, and safety research. ...

December 30, 2025 · 3 min · 488 words · martinuke0