The Model Is Only 10%: The Real Lesson of the New SDLC


TL;DR

A recent Google whitepaper argues that the model accounts for only about 10% of an AI agent’s behavior; the harness around it (prompts, tools, rules, and context management) accounts for the rest. Teams should therefore focus on the harness and on verification, not just on the model. This changes how companies develop, verify, and maintain AI-driven software.

Google’s latest whitepaper, “The New SDLC With Vibe Coding,” argues that the AI model accounts for only about 10% of a system’s behavior. Its core lesson is that the harness and verification components matter far more for reliable AI deployment, which shifts industry focus away from model improvement alone.

The paper, authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis, states that 85% of professional developers now use AI coding agents (51% of them daily) and that 41% of all new code is AI-generated. However, it stresses that the behavior of AI systems is predominantly determined by the harness (the prompts, tools, rules, and context management surrounding the model) rather than by the model itself.

The paper presents concrete evidence: changing only harness components, such as prompts or middleware, can significantly improve performance even when the underlying model stays the same. For example, a team moved a coding agent from outside the Top 30 to the Top 5 on Terminal Bench 2.0 by changing only the harness, which highlights the importance of configuration and scaffolding.

The paper advocates a paradigm shift in AI development that puts verification, judgment, and configuration ahead of model improvements alone. It warns that “vibe coding” (quick prompts with minimal review) can lead to high operating costs and security vulnerabilities unless outputs are managed through a structured harness.

At a glance
Report · When: published May 2026
The development: Google’s new whitepaper argues that the core of effective AI systems lies in the harness and its configuration, not just the AI model, challenging common assumptions.
The Model Is Only 10% — The New SDLC With Vibe Coding
AI Dispatch · Field Notes
Google · Osmani, Saboo & Kartakis · May 2026

The model is only 10%

A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system — the scaffolding around it does the real work.

A spectrum, not a binary: the differentiator is how outputs get verified
Vibe Coding: casual prompts · “does it seem to work?” · disposable code · high risk
Structured AI-Assisted: detailed prompts + constraints · manual testing · features in real codebases
Agentic Engineering: formal specs · automated tests + evals + CI gates · production scale · low risk
Tests verify the deterministic; evals verify the rest. Without both, it’s vibe coding, however clever the prompt.
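To make the tests-plus-evals distinction concrete, here is a minimal sketch of a release gate, assuming a candidate function under review, a handful of exact-match tests, and a graded eval set. The names run_gate, GateResult, grader, and the 0.8 threshold are illustrative assumptions, not anything the whitepaper specifies.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GateResult:
    tests_passed: bool
    eval_score: float
    approved: bool


def run_gate(
    candidate: Callable[[str], str],
    tests: list[tuple[str, str]],
    eval_cases: list[str],
    grader: Callable[[str, str], float],
    min_eval_score: float = 0.8,  # hypothetical threshold
) -> GateResult:
    """Approve a candidate only if it passes both kinds of verification."""
    # Tests cover the deterministic part: one exact expected output per input.
    tests_passed = all(candidate(inp) == expected for inp, expected in tests)

    # Evals cover the rest: a grader scores each output between 0.0 and 1.0,
    # and the gate looks at the average rather than an exact match.
    scores = [grader(case, candidate(case)) for case in eval_cases]
    eval_score = sum(scores) / len(scores) if scores else 0.0

    approved = tests_passed and eval_score >= min_eval_score
    return GateResult(tests_passed, eval_score, approved)
```

In this sketch a candidate that fails either check is rejected, which mirrors the point that one kind of verification alone is not enough.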
The idea worth building your strategy around
Agent = Model + Harness
Model: ~10%
Harness: ~90% (prompts · tools · context · hooks · sandboxes · observability). That ~90% is your surface area, not the provider’s.
Outside Top 30 → Top 5 on Terminal Bench 2.0 by changing only the harness, same model.
“Most agent failures, examined honestly, are configuration failures”: a missing tool, a vague rule, a noisy context.
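The “Agent = Model + Harness” idea can be sketched in a few lines. This is a toy illustration under stated assumptions: the model is any callable from prompt to text, and the Harness class, its fields, and the “CALL <tool> <arg>” reply convention are hypothetical, not an API from the paper or from any provider.

```python
from dataclasses import dataclass, field
from typing import Callable

# Stand-in for any provider's completion call (assumption for this sketch).
Model = Callable[[str], str]


@dataclass
class Harness:
    system_prompt: str
    rules: list[str] = field(default_factory=list)
    tools: dict[str, Callable[[str], str]] = field(default_factory=dict)
    max_context_chars: int = 2000
    log: list[str] = field(default_factory=list)  # minimal observability

    def build_prompt(self, task: str, context: str) -> str:
        # Context management: keep only the most recent slice of context.
        trimmed = context[-self.max_context_chars:]
        rules = "\n".join(f"- {r}" for r in self.rules)
        tools = ", ".join(sorted(self.tools)) or "none"
        return (
            f"{self.system_prompt}\nRules:\n{rules}\n"
            f"Tools: {tools}\nContext:\n{trimmed}\nTask: {task}"
        )


@dataclass
class Agent:
    model: Model
    harness: Harness

    def run(self, task: str, context: str = "") -> str:
        prompt = self.harness.build_prompt(task, context)
        self.harness.log.append(f"prompt_chars={len(prompt)}")
        reply = self.model(prompt)
        # Hypothetical convention: the model requests a tool with "CALL <tool> <arg>".
        if reply.startswith("CALL "):
            parts = reply.split(" ", 2)
            name = parts[1] if len(parts) > 1 else ""
            arg = parts[2] if len(parts) > 2 else ""
            tool = self.harness.tools.get(name)
            if tool is None:
                # A configuration failure, not a model failure.
                self.harness.log.append(f"missing_tool={name}")
                return f"error: tool '{name}' is not configured"
            return tool(arg)
        return reply
```

With the same model callable, an agent whose harness lacks the requested tool returns an error, while one whose harness registers it succeeds. That is the kind of failure the quote above describes as a configuration failure.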
The economics: it’s a token-cost problem (CapEx vs OpEx)
Vibe Coding: low CapEx · high OpEx. Looks free, hides debt: token burn (fix-it loops), maintenance tax (AI spaghetti), security remediation. Crosses over to 3–10× more per feature.
Agentic Engineering: high CapEx · low OpEx. Pay upfront (specs, evals, context), then ship cheaply. Levers: context engineering for first-pass success, plus intelligent model routing (cheap models for the easy work).
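As a rough illustration of the model-routing lever, the sketch below sends easy tasks to a cheap model and harder ones to a stronger model. The tier names, prices, keyword list, and threshold are all made-up assumptions; a real router would likely use a better difficulty signal than keyword matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    usd_per_1k_tokens: float  # hypothetical price, for illustration only


CHEAP = ModelTier("small-model", 0.001)
STRONG = ModelTier("large-model", 0.03)

# Hypothetical signals that a task needs the stronger model.
HARD_MARKERS = ("refactor", "migrate", "concurrency", "security", "architecture")


def estimate_difficulty(task: str) -> int:
    """Crude heuristic: count hard-task keywords, plus one for very long tasks."""
    lowered = task.lower()
    score = sum(marker in lowered for marker in HARD_MARKERS)
    if len(task) > 400:
        score += 1
    return score


def route(task: str, threshold: int = 1) -> ModelTier:
    """Pick the cheap tier unless the task looks hard enough."""
    return STRONG if estimate_difficulty(task) >= threshold else CHEAP


def estimated_cost(task: str, expected_tokens: int) -> float:
    """Estimated spend in USD for one task at the routed tier's price."""
    return route(task).usd_per_1k_tokens * expected_tokens / 1000
```

The design choice worth noting is that routing lives in the harness, so it can be tuned without touching either model.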
85% of devs use AI coding agents (51% daily)
41% of all new code is AI-generated
~90% of agent behavior is the harness, not the model
+19% longer on some tasks (METR): verification is the cost
The read

The clearest map yet of how serious AI development works — and mostly tool-agnostic. But it’s a Google funnel: the concepts are neutral, the on-ramps point to Gemini, Jules & the ADK. If the harness is 90% and it’s yours, your moat and your costs both live there — so own your scaffolding, route across models, and remember: AI amplifies whatever engineering culture it lands in.

Source: Osmani, Saboo & Kartakis, “The New SDLC With Vibe Coding,” Google (May 2026). Figures are the paper’s own, incl. METR & LangChain. Analysis is the author’s.
thorstenmeyerai.com

Implications for AI Development Strategies

This whitepaper’s core message challenges the industry to rethink AI development. Instead of chasing ever-larger or more powerful models, companies should invest in harness design, configuration, and verification. The insight that the model accounts for only 10% of behavior means that costs, reliability, and security depend heavily on how the AI system is assembled and managed. This could lead to a shift in resource allocation and best practices across AI teams, emphasizing structured engineering over model size.

For organizations, adopting this approach could mean significant reductions in operational costs, improved system robustness, and enhanced security, especially as AI becomes more embedded in critical infrastructure.

Background on AI System Design Shifts

Until now, AI development has largely focused on improving model architecture, size, and training data. The narrative often centered on model performance benchmarks and pushing the frontier of AI capabilities. However, recent experiments and industry observations have shown that configuration and scaffolding—the harness—play a decisive role in actual system behavior. This aligns with broader trends in software engineering, where configuration management and verification are recognized as vital for reliable, secure systems.

The whitepaper builds on a growing understanding that AI’s practical deployment hinges on effective context engineering, tools, and guardrails, rather than solely on the model’s raw power. This represents a maturation of the AI engineering discipline, emphasizing system integration and control.

“The behavior you experience in AI tools is dominated by scaffolding you can build, own, and improve, not just the model itself.”

— Addy Osmani

Unresolved Questions About Practical Implementation

While the whitepaper provides strong evidence that harness configuration is critical, it remains unclear how organizations will best scale these practices across diverse AI applications. Specific guidelines for best practices in harness design, verification, and ongoing maintenance are still emerging, and industry adoption may vary.

Additionally, the long-term impact of this shift on AI model development, hardware investment, and talent requirements is still being studied.

Next Steps for AI Engineering Teams

Organizations should reevaluate their AI development processes, emphasizing harness design, context engineering, and verification. Developing standardized frameworks and tools for system configuration will likely become a priority. Industry leaders may also invest in training and best practice sharing to adopt this new paradigm effectively.

Further research and case studies are expected to clarify how best to implement these principles at scale, and whether this approach reduces costs and enhances system robustness in diverse real-world scenarios.

Key Questions

Why is the model only 10% of the system’s behavior?

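The whitepaper argues that most agent behavior, and most agent failures, come from how the system is configured (prompts, tools, rules, and context management) rather than from the underlying model. Its headline example is a coding agent that climbed from outside the Top 30 to the Top 5 on Terminal Bench 2.0 with the same model and a changed harness.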

What does harness refer to in AI systems?

The harness is everything around the model: prompts, tools, rules, context management, hooks, sandboxes, and observability. It shapes and controls the AI’s behavior, which by the paper’s account makes it the most influential component.

How does this shift affect AI development costs?

The upfront investment in specs, evals, and context is higher, but per-feature operating costs tend to fall: fewer fix-it loops burning tokens, less maintenance of tangled AI-generated code, and less security remediation. The paper puts unstructured vibe coding at 3–10× more per feature once those costs cross over.
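The crossover can be worked through with simple arithmetic. The sketch below assumes a linear cost model (a one-time upfront cost plus a fixed cost per feature); the function names and every number in the usage note are hypothetical, chosen only so that the per-feature ratio is 3×, the low end of the paper’s range.

```python
import math
from typing import Optional


def total_cost(upfront: float, per_feature: float, features: int) -> float:
    """Linear cost model: one-time investment plus a fixed cost per feature."""
    return upfront + per_feature * features


def break_even_features(
    upfront_low: float,
    per_feature_high: float,
    upfront_high: float,
    per_feature_low: float,
) -> Optional[int]:
    """First feature count at which the high-upfront approach is strictly cheaper.

    Returns None if the high-upfront approach never saves anything per feature.
    """
    saving = per_feature_high - per_feature_low
    if saving <= 0:
        return None
    extra_upfront = upfront_high - upfront_low
    if extra_upfront <= 0:
        return 0
    return math.floor(extra_upfront / saving) + 1
```

For example, with a hypothetical 0 upfront and 300 per feature for vibe coding against 2,000 upfront and 100 per feature for the structured approach, break_even_features(0, 300, 2000, 100) returns 11: the two approaches cost the same at 10 features, and the structured one is cheaper from the 11th feature on.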

Will this change the way AI models are built?

Yes, the focus will shift from solely developing larger models to designing better systems for harnessing and verifying AI outputs, requiring new skills in system configuration and system engineering.

Source: ThorstenMeyerAI.com
