35 posts tagged with "engineering"

AI Feature Payback: The ROI Model Your Finance Team Won't Fight You On

May 5, 2026 · 10 min read

Software Engineer

Every engineering team shipping AI features eventually hits the same wall: finance wants a spreadsheet that justifies the spend, and the spreadsheet you built doesn't actually work.

The problem isn't that AI features lack ROI. The problem is that AI economics break every assumption the standard ROI model was built on — fixed capital, linear cost curves, predictable timelines. Teams that treat AI spending like SaaS licensing get numbers that either look deceptively good before launch or collapse six months into production. The ten-fold gap between measured AI initiatives (55% ROI) and ad-hoc deployments (5.9% ROI) comes almost entirely from whether teams got the measurement model right before they shipped.

Building Trust Recovery Flows: What Happens After Your AI Makes a Visible Mistake

May 5, 2026 · 9 min read

Tian Pan

Software Engineer

When Google's AI Overview told users to add glue to pizza sauce and eat rocks for digestive health, it didn't just embarrass a product team — it exposed a systemic gap in how we think about AI reliability. The failure wasn't just that the model was wrong. The failure was that the model was confidently wrong, in a high-visibility context, with no recovery path for the users it misled.

Trust in AI systems doesn't erode gradually. Research shows it follows a cliff-like collapse pattern: a single noticeable error can produce a disproportionate trust decline with measurable effect sizes. Only 29% of developers say they trust AI tools — an 11-point drop from the previous year, even as adoption climbs to 84%. We're building systems that people use but don't trust. That gap matters when your product ships agentic features that act on behalf of users.

This post is about what engineers and product builders should do after the mistake happens — not just how to prevent it.

The Inherited AI System Audit: How to Take Ownership of an LLM Feature You Didn't Build

May 5, 2026 · 10 min read

Tian Pan

Software Engineer

Someone left. The onboarding doc says "ask Sarah" but Sarah is at a different company now. You're staring at a 900-line system prompt with sections titled things like ## DO NOT REMOVE THIS SECTION, and you have no idea what happens if you do.

This is the inherited AI system problem, and it's different from inheriting regular code. With legacy code, a determined engineer can trace execution paths, read tests, and reconstruct intent from behavior. With an inherited LLM feature, the prompt is the logic — but it's written in natural language, its failure modes are probabilistic, and the author's intent is trapped inside their head. There are no stack traces that tell you which guardrail fired and why.

Org-Level Goodhart: When Teams Game AI Adoption Metrics

May 5, 2026 · 9 min read

Tian Pan

Software Engineer

According to one study, 95% of generative AI pilots technically succeed—and 74% of companies using generative AI still haven't shown tangible business value. That gap is not a coincidence. It's a measurement problem dressed up as a technology problem, and most organizations won't diagnose it correctly because the people doing the measuring are the same ones being measured.

This is Goodhart's Law at the organizational level: once an AI adoption metric becomes a performance target, it ceases to measure what you care about. The metric keeps going up. The underlying outcome stays flat or gets worse.

AI System Design Advisor: What It Gets Right, What It Gets Confidently Wrong, and How to Tell the Difference

May 4, 2026 · 9 min read

Tian Pan

Software Engineer

A three-person team spent a quarter implementing event sourcing for an application serving 200 daily active users. The architecture was technically elegant. It was operationally ruinous. The design came from an AI recommendation, and the team accepted it because the reasoning was fluent, the tradeoff analysis sounded rigorous, and the system they ended up with looked exactly like the kind of thing you'd see on a senior engineer's architecture diagram.

That story is now a cautionary pattern, not an edge case. AI produces genuinely useful architectural input in specific, identifiable situations — and produces confidently wrong advice in situations that look nearly identical from the outside. The gap between them is not obvious if you approach AI as an answer machine. It becomes navigable if you approach it as a sparring partner.

The Org Chart Problem: Why AI Features Die Between Teams

May 4, 2026 · 10 min read

Tian Pan

Software Engineer

The model works. The pipeline runs. The demo looks great. And then the feature dies somewhere between the data team's Slack channel and the product engineer's JIRA board.

This is the pattern behind most AI project failures—not a technical failure, but an organizational one. A 2025 survey found that 42% of companies abandoned most of their AI initiatives that year, up from 17% the year prior. The average sunk cost per abandoned initiative was $7.2 million. When the post-mortems get written, the causes listed are "poor data readiness," "unclear ownership," and "lack of governance"—which are three different ways of saying the same thing: nobody was actually responsible for shipping the feature.

The Output Coupling Trap: Why Multi-Agent Systems Fail Silently at Interface Boundaries

May 4, 2026 · 9 min read

Tian Pan

Software Engineer

Your multi-agent pipeline finished. No exceptions were raised. The orchestrator reported success. And yet, the answer is wrong in a way that makes no sense — the executor skipped two steps, the summarizer collapsed three sections into one non-sequitur, and the output looks like it came from a different task entirely. There's no stack trace to follow. No error code to search. Just a quietly incorrect result.

This is the output coupling trap. It's not a model quality problem. It's an interface engineering problem, and it's the leading cause of silent production failures in multi-agent systems.

The AI Feature Nobody Uses: How Teams Ship Capabilities That Never Get Adopted

April 20, 2026 · 9 min read

Tian Pan

Software Engineer

A VP of Product at a mid-market project management company spent three quarters of her engineering team's roadmap building an AI assistant. Six months after launch, weekly active usage sat at 4%. When asked why they built it: "Our competitor announced one. Our board asked when we'd have ours." That's a panic decision dressed up as a product strategy — and it's endemic right now.

The 4% isn't an outlier. A customer success platform shipped AI-generated call summaries to 6% adoption after four months. A logistics SaaS added AI route optimization suggestions and got 11% click-through with a 2% action rate. An HR platform launched an AI policy Q&A bot that spiked for two weeks and flatlined at 3%. The pattern is consistent enough to name: ship an AI feature, watch it get ignored, quietly sunset it eighteen months later.

The default explanation is that the AI wasn't good enough. Sometimes that's true. More often, the model was fine — users just never found the feature at all.

The AI Feature Sunset Playbook: How to Retire Underperforming AI Without Burning Trust

April 20, 2026 · 10 min read

Tian Pan

Software Engineer

Engineering teams have built more AI features in the past three years than in the prior decade. They have retired almost none of them. Deloitte found that 42% of companies abandoned at least one AI initiative in 2025 — up from 17% the year before — with average sunk costs of $7.2 million per abandoned project. Yet the features that stay in production often cause more damage than the ones that get cut: they erode user trust slowly, accumulate technical debt that compounds monthly, and consume engineering capacity that could go toward things that work.

The asymmetry is structural. AI feature launches generate announcements, stakeholder excitement, and team recognition. Retirements are treated as admissions of failure. So bad features accumulate. The fix is not willpower — it is a decision framework that makes retirement a normal, predictable engineering outcome rather than an organizational crisis.

Organizational Antibodies: Why AI Projects Die After the Pilot

April 20, 2026 · 11 min read

Tian Pan

Software Engineer

The demo went great. The pilot ran for six weeks, showed clear results, and the stakeholders in the room were impressed. Then nothing happened. Three months later the project was quietly shelved, the engineer who built it moved on to something else, and the company's AI strategy became a slide deck that said "exploring opportunities."

This is the pattern that kills AI initiatives. Not technical failure. Not insufficient model capability. Not even budget. The technology actually works — research consistently shows that around 80% of AI projects that reach production meet or exceed their stated expectations. The problem is the 70-90% that never get there.

The AI Feature Retirement Playbook: How to Sunset What Users Barely Adopted

April 19, 2026 · 11 min read

Tian Pan

Software Engineer

Your team shipped an AI-powered summarization feature six months ago. Adoption plateaued at 8% of users. The model calls cost $4,000 a month. The one engineer who built it has moved to a different team. And now the model provider is raising prices.

Every instinct says: kill it. But killing an AI feature turns out to be significantly harder than killing any other kind of feature — and most teams find this out the hard way, mid-retirement, when the compliance questions start arriving and the power users revolt.

This is the playbook that should exist before you ship the feature, but is most useful right now, when you're staring at usage graphs that point unmistakably toward the exit.

LLM Vendor Lock-In Is a Spectrum, Not a Binary

April 19, 2026 · 10 min read

Tian Pan

Software Engineer

A team builds a production feature on GPT-4. Months later, they decide to evaluate Claude for cost reasons. They spend two weeks "migrating"—but the core API swap takes an afternoon. The remaining ten days go toward fixing broken system prompts, re-testing refusal edge cases, debugging JSON parsers that choke on unexpected prose, and re-tuning tool-calling schemas that behave differently across providers. Migration estimates that assumed a simple connector swap balloon into a multi-layer rebuild.

This is the LLM vendor lock-in problem in practice. And the teams that get burned aren't the ones who chose the wrong provider—they're the ones who didn't recognize that lock-in exists on multiple axes, each with a different risk profile.

About Tian Pan