DEV Community

# llm

Posts

Cut Your LLM Costs by ~30% With Prompt Optimization (What Actually Works in Production)

1
Comments
3 min read
⚛ MCP Explained: A Simple Guide 📜 to AI 🤖 Agents

Comments
3 min read
TPU: Why Google Doesn’t Wait in Line for NVIDIA GPUs (2/2)

Comments
7 min read
TPU: Why Google Doesn't Wait in Line for NVIDIA GPUs (1/2)

Comments
8 min read
Personal Identity Agent for your Agent

Comments
1 min read
Release my PR for the project Bifrost

Comments
2 min read
RAG is more than Vector Search

1
Comments
4 min read
Code Generation for Ablation Technique — Documentation

Comments
3 min read
How Bifrost Integrates With Your Existing LLM Stack (No Refactoring Required)

Comments
4 min read
Semantic Caching Cut Our LLM Costs by 40%

Comments
3 min read
Uncounted Tokens: The Game of Attack and Defense in AI Gateway Rate Limiting

Comments
3 min read
The Observability Tax: What You're Actually Paying for AI Agents (2026 Cost Reality)

Comments
2 min read
Building Your First Agentic AI: Complete Guide to MCP + Ollama Tool Calling

1
Comments
14 min read
Why Your API's Error Messages Fail When Called by an LLM (And How to Fix Them)

Comments
9 min read
The 6 Best AI Code Review Tools for Pull Requests in 2025

Comments
11 min read
5 Must-Read Books for Backend Engineers in 2026

6
Comments
4 min read
Create Your First MCP App

2
Comments
6 min read
Stop Evaluating AI Agents Like ML Models: A Paradigm Shift for Developers

Comments
3 min read
Progressing in Bifrost project

Comments
2 min read
Base LLMs vs Instruction-Tuned LLMs: Understanding the Architecture Behind ChatGPT and Claude

Comments
3 min read
Reranking and Two-Stage Retrieval: Precision When It Matters Most

Comments
2 min read
LLMs Hallucinate. RAG Fixes That — Here’s How We Built a Reliable Healthcare AI

Comments
3 min read
Build Better RAG Pipelines: Scraping Technical Docs to Clean Markdown

Comments
2 min read
Bifrost: The Fastest Open Source LLM Gateway

Comments
4 min read
Dec 12, 2025 | The Tongyi Weekly: Your weekly dose of cutting-edge AI from Tongyi Lab

Comments
5 min read