A reality check for AI in engineering
Most engineering leaders struggle to answer the question their CFO is sure to ask: “Can you show that this AI spend is changing outcomes, not just generating activity?”
Every December, plans are finalized, budgets approved, and board presentations refined until everything looks polished and under control. Beneath the surface, however, many CTOs and VPs are operating with limited visibility. They have a feel for their teams, but lack a dependable view of workflow, the true impact of AI on delivery, or where time and resources actually go.
For some time, this was manageable. Experience, pattern recognition, and accessible capital filled the gaps. Leaders could navigate around bottlenecks, overstaff essential teams, or discreetly divert attention from the more chaotic aspects of the system. Then AI emerged as an ideal distraction, with pilots, PoCs, Copilot subscriptions, and various “AI initiatives” generating observable activity and buying time.
By 2026, this grace period will end. Boards and CFOs are shifting from “show us your experiments” to “show us measurable impact this year.” This is not a loss of faith in AI; the market is simply no longer satisfied with vague promises. Every dollar spent on AI will need a clear line to productivity, quality, or customer value.
The moment of scrutiny
If you oversee engineering, you likely recognize this scenario. You present slides showcasing AI achievements. Adoption rates are increasing. Developers express satisfaction with the tools. You share anecdotes about expedited coding and smoother reviews. Then the CFO poses a straightforward question: “How exactly is this budget influencing output and outcomes?”
Typical responses often focus on:
- AI adoption metrics and licenses
- Time saved on coding tasks
- Future potential as outlined in roadmaps “once we fully implement this.”
However, what tends to be absent is a concrete analysis of:
- The areas where AI is genuinely utilized throughout the Software Development Life Cycle (SDLC)
- The actual capacity it frees up in practice
- How that freed time is redirected toward customer-facing tasks, quality improvements, or strategic initiatives
- Whether AI is enhancing overall system performance and not just individual speed
Consequently, discussions revert to learning curves, cumulative benefits, and attracting talent. These points are valid, but they are too soft to survive a rigorous budget review.
Why a 55% speed increase doesn’t equate to 55% more output
AI providers often cite task-level statistics. A coding task completed 55% faster looks impressive on a slide. Analyzed at the team and system level, however, the reality shifts.
Extensive data from thousands of developers reveals a consistent trend:
- About half of team members report that AI is enhancing team productivity by 10% or less, with a notable portion seeing no measurable improvement at all.
- Only a small segment claims gains of 25% to 50%, as evidenced in various case studies.
Field experiments confirm that developers do complete more tasks with AI, yet the gains are far smaller than the “55% faster” headlines suggest once real-world complexity, debugging AI output, and integration work are accounted for. And when delivery metrics are viewed across teams, some organizations see throughput stagnate or even decline as AI usage rises, driven by larger changesets, higher integration risk, and greater coordination overhead.
The key takeaway is straightforward: task-level efficiency does not automatically translate into system-level productivity. Small pockets of saved time are consumed by meetings, support tasks, and context switching. Developers need long, uninterrupted blocks for deep work, but their days are fragmented. Even if AI shaves 20-30 minutes off a task, that time is typically absorbed by Slack messages, reviews, and incident alerts rather than converted into meaningful new output.
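To see why, a back-of-the-envelope Amdahl’s-law-style calculation helps. The sketch below assumes, purely for illustration, that coding occupies about 30% of a developer’s week and treats “55% faster” as a 1.55x speedup; neither number comes from the data above.

```python
def system_level_gain(coding_share: float, task_speedup: float) -> float:
    """Amdahl-style estimate: the overall throughput gain when only
    the coding fraction of a developer's week is accelerated."""
    new_total_time = (1 - coding_share) + coding_share / task_speedup
    return 1 / new_total_time - 1

# Illustrative assumptions: coding is ~30% of a developer's week,
# and "55% faster" is read as a 1.55x speedup on those tasks.
gain = system_level_gain(coding_share=0.30, task_speedup=1.55)
print(f"Estimated system-level gain: {gain:.0%}")  # -> roughly 12%, not 55%
```

Under these assumptions the system-level gain is roughly 12%; even the more generous reading of “55% faster” (tasks taking 55% less time, a 2.2x speedup) only lifts it to about 20%. Both land far closer to the survey numbers above than to the headline.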
The issue lies not with the tools themselves, but with the absence of a system to manage where the “extra” capacity is allocated.
The true productivity concern for 2026
Most organizations still assess AI productivity primarily in terms of speed: more story points, increased ticket counts, and higher deployment rates. This perspective overlooks a more critical question:
How much of our engineering capacity contributes to new value versus maintenance, incidents, and rework, and is AI enhancing that ratio?
High-level benchmarks are blunt but informative. On average, developers spend roughly 45% of their time on maintenance, minor enhancements, and bug fixes rather than on genuinely new, customer-facing work. If AI enables faster code production within an otherwise unchanged system, the risks include:
- Delivering features at an increased pace while maintaining the same defect rate
- Expanding the system’s scope while technical debt accumulates quietly
- Making teams busier without resulting in any meaningful improvements to the product or the business
Thus, one may end up with impressive local metrics while leadership perceives that engineering is actually slowing down.
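One way to make that ratio operational is to track where engineer-hours actually go and watch the trend across quarters. The sketch below is a simplified illustration; the buckets and all figures are hypothetical, not drawn from any benchmark:

```python
from dataclasses import dataclass

@dataclass
class CapacitySnapshot:
    """Engineer-hours per quarter by bucket; all figures hypothetical."""
    new_value: float    # genuinely new, customer-facing work
    maintenance: float  # upkeep and minor enhancements
    incidents: float    # firefighting and on-call work
    rework: float       # bug fixes and redone work

def new_value_ratio(s: CapacitySnapshot) -> float:
    total = s.new_value + s.maintenance + s.incidents + s.rework
    return s.new_value / total

before = CapacitySnapshot(new_value=550, maintenance=300, incidents=80, rework=70)
after_ai = CapacitySnapshot(new_value=600, maintenance=310, incidents=90, rework=100)

# Watch the trend, not the absolute number: rising output with a flat
# or falling ratio means the freed capacity is not being redirected.
print(f"before: {new_value_ratio(before):.1%}")   # 55.0%
print(f"after:  {new_value_ratio(after_ai):.1%}") # 54.5%
```

If raw output climbs while the ratio stays flat, the “freed” capacity is being absorbed by maintenance, incidents, and rework rather than redirected toward new value.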
Two strategies to transform AI from hype to real gains
For a productive