How to deploy LLMs as a code review layer that reduces review load without creating noise — covering diff preprocessing, false positive budgets, integration patterns, and the metrics that matter.
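The false positive budget in particular is easy to make concrete: cap how many findings ever reach the pull request. A minimal sketch (the `Finding` type, `apply_fp_budget`, and its thresholds are illustrative assumptions, not from the article):

```python
from dataclasses import dataclass

@dataclass
class Finding:
    message: str
    confidence: float  # model-reported or calibrated confidence in [0, 1]

def apply_fp_budget(findings, max_comments=5, min_confidence=0.7):
    """Keep only high-confidence findings, capped at a per-review budget.

    The cap bounds reviewer noise: even if the model emits dozens of
    findings, at most `max_comments` reach the pull request.
    """
    kept = [f for f in findings if f.confidence >= min_confidence]
    kept.sort(key=lambda f: f.confidence, reverse=True)
    return kept[:max_comments]
```

The budget turns "is this finding worth posting?" from a per-comment judgment into a system-level dial you can tune against measured reviewer trust.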
Applying feature store architecture to LLM context assembly cuts retrieval latency, reduces inference cost, and prevents the training-serving skew that quietly degrades model performance.
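One way to picture the analogy: precomputed context features keyed by entity, served through a single lookup path used both at inference time and when generating eval or fine-tuning datasets, so the two can't drift apart. A minimal sketch with hypothetical names (`ContextFeatureStore`, `get_context`) and an assumed TTL-based freshness rule:

```python
import time

class ContextFeatureStore:
    """Precomputed context features keyed by entity, with a freshness TTL.

    Using the same get_context() path for prompt construction at serving
    time and for offline dataset generation is what prevents
    training-serving skew in this design.
    """

    def __init__(self, ttl_seconds=3600.0):
        self.ttl = ttl_seconds
        self._store = {}  # entity_id -> (written_at, features)

    def put(self, entity_id, features):
        self._store[entity_id] = (time.time(), features)

    def get_context(self, entity_id):
        entry = self._store.get(entity_id)
        if entry is None:
            return None
        written_at, features = entry
        if time.time() - written_at > self.ttl:
            return None  # stale: recompute rather than serve silently
        return features
```

Returning `None` on staleness instead of the old value is the key design choice: it forces the caller to recompute, surfacing staleness as an explicit cost rather than a silent quality drop.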
Fine-tuned models can expose training data through verbatim extraction, membership inference, and attribute inference attacks — and a $200 budget is enough to demonstrate it. A technical guide to the threat model, differential privacy tradeoffs, output sanitization, and proactive audit methodology for production deployments.
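The membership inference piece can be illustrated with the classic loss-threshold attack: training examples tend to have lower loss under the fine-tuned model than unseen examples. A toy sketch, not the article's methodology; real attacks calibrate the threshold with shadow models:

```python
def membership_inference_attack(losses, threshold):
    """Loss-threshold membership inference (toy): guess that any example
    whose per-example loss falls below the threshold was in the
    fine-tuning set. The losses and threshold here are illustrative."""
    return [loss < threshold for loss in losses]

def attack_accuracy(guesses, true_membership):
    """Fraction of correct member/non-member guesses."""
    correct = sum(g == t for g, t in zip(guesses, true_membership))
    return correct / len(guesses)
```

A gap between member and non-member loss distributions is exactly what differential privacy is designed to shrink, at a cost in model utility, which is the tradeoff the guide works through.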
Running LLM services requires an operational discipline distinct from that of microservices. Here's where your existing SRE playbook transfers, where it fails, and the new runbook categories you don't have yet.
Most AI systems trust a single model, so they never know when a failure is systematic. Multi-model consensus routes outputs through multiple provider families, surfaces disagreement as a signal, and reduces tail risk in high-stakes decisions.
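The routing-and-voting core can be sketched in a few lines. The interface is an assumption: each provider is a callable from prompt to text, and outputs are compared after trivial normalization (a real system needs semantic comparison, not string equality):

```python
from collections import Counter

def consensus(providers, prompt, quorum=0.5):
    """Route one prompt to several model families and vote on the
    normalized outputs. Returns (answer, agreed): if no answer wins a
    strict majority over `quorum`, answer is None and agreed is False,
    which callers should treat as an escalation signal, not an error."""
    outputs = [fn(prompt).strip().lower() for fn in providers.values()]
    top, count = Counter(outputs).most_common(1)[0]
    if count / len(outputs) > quorum:
        return top, True
    return None, False
```

The important output is the second value: disagreement is returned as first-class data, so downstream logic can route the decision to a human or a slower, more careful path.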
Monolingual embeddings produce geometrically meaningless similarity scores across languages — here's why this silent failure mode destroys non-English retrieval quality and what to do about it.
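The geometric problem can be shown with a toy 2-D example: two independently trained monolingual models place the same concept at arbitrary orientations relative to each other, so a cosine score computed across the two spaces carries no information. All vectors below are illustrative, not real embeddings:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy "embedding spaces": the German model happens to orient the same
# concept axes 90 degrees away from the English model's axes.
en_dog = [1.0, 0.0]   # "dog" in the English model's space
en_cat = [0.9, 0.1]   # a related English word, same space
de_hund = [0.0, 1.0]  # "Hund" in the German model's space

within = cosine(en_dog, en_cat)   # high: same space, score is meaningful
across = cosine(en_dog, de_hund)  # 0.0: a translation pair looks unrelated
```

Nothing errors and nothing looks wrong in the output, which is exactly why this failure mode is silent: the scores are valid numbers, they just measure nothing.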
Adding more human approval stages to AI pipelines often produces the opposite of safety — fatigued reviewers rubber-stamp outputs, models learn to game tired annotators, and you pay the overhead of review without getting its benefit.
Long-running agent tasks destroy synchronous UX assumptions. Here are the backend and frontend patterns that keep your application responsive while agents do real work.
When AI adoption metrics become performance targets, teams optimize for the metric instead of the outcome. Here's how it happens, why it's hard to detect, and what measurements actually survive contact with organizational incentives.
Deep model-specific expertise looks like a strength until a provider deprecates a model or shifts behavior. Here's how AI teams accidentally overfit to one model family — and what model-portable teams do differently.
AI personalization systems quietly degrade as user profiles grow stale — here's how to detect the decay before it becomes churn, and how to re-personalize without forcing users through onboarding again.
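One simple decay signal: weight each profile-building event by its age with a half-life, and flag profiles whose average weight drops below a threshold before the degradation shows up as churn. A sketch with an assumed 30-day half-life; the function name and numbers are illustrative:

```python
def profile_freshness(days_since_events, half_life_days=30.0):
    """Freshness in (0, 1]: each profile-building event contributes a
    weight that halves every `half_life_days`. A profile fed only by
    old events decays toward 0 no matter how many signals it once had."""
    if not days_since_events:
        return 0.0
    weights = [0.5 ** (d / half_life_days) for d in days_since_events]
    return sum(weights) / len(weights)
```

Tracking this score per user gives a re-personalization trigger you can act on quietly (e.g. reweighting recent behavior) instead of waiting for engagement metrics to fall.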
System prompts are written for an imagined median user, but production traffic is a distribution. Here's how to find the 20% of traffic your prompt silently fails — and what to do about it.
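Finding that slice starts with segment-level eval metrics instead of one aggregate pass rate. A minimal sketch, assuming you already bucket traffic by segment (language, intent, input length) and have per-example pass/fail eval results; the name `failing_segments` and the 20% budget are illustrative:

```python
from collections import defaultdict

def failing_segments(results, max_failure_rate=0.2):
    """results: list of (segment, passed) pairs from evals run over a
    sample of real traffic. Returns segments whose failure rate exceeds
    the budget: the slices a median-user prompt silently misses."""
    totals = defaultdict(lambda: [0, 0])  # segment -> [failures, total]
    for segment, passed in results:
        totals[segment][1] += 1
        if not passed:
            totals[segment][0] += 1
    return {seg: fail / total
            for seg, (fail, total) in totals.items()
            if fail / total > max_failure_rate}
```

An aggregate 90% pass rate can hide a segment failing at 75%; grouping before averaging is the whole trick.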