TL;DR

A recent whitepaper from Google argues that the AI model accounts for only about 10% of an agent system’s behavior. The main focus should be on the harness and on verification, not just the model itself. This shift affects how companies develop, verify, and maintain AI-driven software.

Google’s latest whitepaper, “The New SDLC With Vibe Coding,” emphasizes that the AI model accounts for only about 10% of a system’s behavior. The core lesson: the harness and verification components are far more critical for reliable AI deployment, shifting industry focus away from solely improving models.

The paper, authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis, reports that 85% of professional developers now use AI coding agents (51% daily) and that 41% of all new code is AI-generated. However, it stresses that the behavior of AI systems is determined mostly by the harness (the prompts, tools, rules, and context management surrounding the model) rather than by the model itself.

Concrete evidence is presented: changing only harness components, such as prompts or middleware, can significantly improve performance even when the underlying model stays the same. For example, one team moved a coding agent from outside the Top 30 to the Top 5 on Terminal Bench 2.0 by changing only the harness, which highlights the importance of configuration and scaffolding.

The paper advocates a shift in AI development that emphasizes verification, judgment, and configuration over model improvements alone. It warns that the often-cited “vibe coding” approach (quick prompts and minimal review) can lead to high operating costs and security vulnerabilities unless it is managed through a structured harness.

At a glance

Report · When: published May 2026

The development: Google’s new whitepaper argues that the core of effective AI systems lies in harness design and configuration, not just the AI model, challenging common assumptions.

The Model Is Only 10% — The New SDLC With Vibe Coding

AI Dispatch · Field Notes

Google · Osmani, Saboo & Kartakis · May 2026

The model is only 10%

A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system — the scaffolding around it does the real work.

A spectrum, not a binary — the differentiator is how outputs get verified

Vibe Coding

Casual prompts · “does it seem to work?” · disposable code · high risk

Structured AI-Assisted

Detailed prompts + constraints · manual testing · features in real codebases

Agentic Engineering

Formal specs · automated tests + evals + CI gates · production scale · low risk

Tests verify the deterministic; evals verify the rest. Without both, it’s vibe coding — however clever the prompt.
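The split between tests and evals can be sketched in a few lines of Python. This is a minimal illustration, not the paper's tooling: `slugify`, `run_eval`, `fake_model`, the keyword rubric, and the 0.8 threshold are all hypothetical names and values chosen for the example. The test asserts an exact result on deterministic code, while the eval scores free-form output against a rubric and gates on a pass rate.

```python
from typing import Callable


def slugify(title: str) -> str:
    """Deterministic code: the same input always gives the same output."""
    return "-".join(title.lower().split())


def test_slugify() -> None:
    # A test makes an exact, repeatable assertion.
    assert slugify("Hello  World") == "hello-world"


def run_eval(
    generate: Callable[[str], str],
    cases: list[tuple[str, list[str]]],
    threshold: float,
) -> tuple[float, bool]:
    """Score free-form output against required terms and gate on the pass rate."""
    passed = 0
    for prompt, required_terms in cases:
        output = generate(prompt).lower()
        if all(term in output for term in required_terms):
            passed += 1
    pass_rate = passed / len(cases)
    return pass_rate, pass_rate >= threshold


def fake_model(prompt: str) -> str:
    # Hypothetical stand-in for a real model call, so the sketch runs offline.
    return f"Summary of '{prompt}': refunds are issued within 30 days."


# Each case pairs a prompt with terms the output is required to mention.
CASES = [
    ("refund policy", ["refund", "30 days"]),
    ("shipping policy", ["shipping", "tracking"]),
]

test_slugify()
pass_rate, gate_open = run_eval(fake_model, CASES, threshold=0.8)
print(f"pass rate {pass_rate:.0%}, CI gate open: {gate_open}")
```

With the stand-in model, the second case fails its rubric, so the pass rate is 50% and the gate stays closed. A real eval would likely use a richer scorer than keyword matching, but the gating shape stays the same.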

The idea worth building your strategy around

Agent = Model + Harness

MODEL ~10%

HARNESS · prompts · tools · context · hooks · sandboxes · observability

~90% IS YOUR SURFACE AREA, NOT THE PROVIDER’S

Outside Top 30 → Top 5 on Terminal Bench 2.0 by changing only the harness — same model.

“Most agent failures, examined honestly, are configuration failures” — a missing tool, a vague rule, a noisy context.
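One way to picture "Agent = Model + Harness" is as a thin wrapper in which the model is a swappable callable and everything else is configuration you own. The sketch below is an assumption-laden illustration: the `Harness` and `Agent` classes, their fields, and `echo_model` are hypothetical and do not come from the paper or from any specific framework.

```python
from dataclasses import dataclass, field
from typing import Callable

Model = Callable[[str], str]  # any function from prompt text to output text


@dataclass
class Harness:
    """Everything around the model: rules, tools, and context management."""
    rules: list[str]
    tools: dict[str, Callable[[str], str]] = field(default_factory=dict)
    max_context_chars: int = 2000

    def build_prompt(self, task: str, context: list[str]) -> str:
        # Context management: keep the most recent notes that fit the budget.
        kept: list[str] = []
        used = 0
        for note in reversed(context):
            if used + len(note) > self.max_context_chars:
                break
            kept.append(note)
            used += len(note)
        rules = "\n".join(f"- {rule}" for rule in self.rules)
        tool_names = ", ".join(sorted(self.tools)) or "none"
        notes = "\n".join(reversed(kept))
        return f"Rules:\n{rules}\nTools: {tool_names}\nContext:\n{notes}\nTask: {task}"


@dataclass
class Agent:
    model: Model
    harness: Harness

    def run(self, task: str, context: list[str]) -> str:
        return self.model(self.harness.build_prompt(task, context))


def echo_model(prompt: str) -> str:
    # Hypothetical stand-in: returns what it was shown, to expose the harness's effect.
    return prompt


loose = Agent(echo_model, Harness(rules=["Be helpful."]))
strict = Agent(
    echo_model,
    Harness(
        rules=["Run the tests before answering.", "Never edit files outside src/."],
        tools={"run_tests": lambda arg: "ok"},
        max_context_chars=10,
    ),
)

notes = ["an old note that is far too long to keep", "new"]
loose_out = loose.run("Fix the bug", notes)
strict_out = strict.run("Fix the bug", notes)
print(strict_out)
```

Both agents share the same model, yet they see different rules, tools, and context, which is the sense in which a missing tool, a vague rule, or a noisy context is a configuration failure rather than a model failure.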

The economics: it’s a token-cost problem (CapEx vs OpEx)

Vibe Coding

Low CapEx · High OpEx

Looks free, hides debt: token burn (fix-it loops), maintenance tax (AI spaghetti), security remediation. Crosses over to 3–10× more per feature.

Agentic Engineering

High CapEx · Low OpEx

Pay upfront (specs, evals, context), then ship cheaply. Levers: context engineering for first-pass success + intelligent model routing — cheap models for the easy work.
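The CapEx vs OpEx trade-off can be worked as simple break-even arithmetic: total cost is the upfront cost plus the per-feature cost times the number of features. The figures below are hypothetical units picked only so that the per-feature ratio (5×) falls inside the 3–10× range quoted above; they are not numbers from the paper.

```python
import math
from typing import Optional


def total_cost(capex: float, opex_per_feature: float, features: int) -> float:
    """Upfront cost plus recurring cost for a given number of shipped features."""
    return capex + opex_per_feature * features


def break_even_features(
    capex_low: float, opex_high: float, capex_high: float, opex_low: float
) -> Optional[int]:
    """Smallest feature count at which the high-CapEx approach costs no more.

    Returns None if the high-CapEx approach never catches up.
    """
    if opex_high <= opex_low:
        return None
    return max(0, math.ceil((capex_high - capex_low) / (opex_high - opex_low)))


# Hypothetical units: vibe coding has no setup cost but 5 units per feature;
# agentic engineering costs 40 units upfront and 1 unit per feature.
VIBE_CAPEX, VIBE_OPEX = 0.0, 5.0
AGENTIC_CAPEX, AGENTIC_OPEX = 40.0, 1.0

n = break_even_features(VIBE_CAPEX, VIBE_OPEX, AGENTIC_CAPEX, AGENTIC_OPEX)
print("break-even at", n, "features")
print("vibe:", total_cost(VIBE_CAPEX, VIBE_OPEX, n))
print("agentic:", total_cost(AGENTIC_CAPEX, AGENTIC_OPEX, n))
```

Under these assumed numbers the two approaches cost the same at 10 features, and every feature after that is cheaper with the upfront investment. The real crossover point depends entirely on a team's own token, maintenance, and remediation costs.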

85%

of devs use AI coding agents (51% daily)

41%

of all new code is AI-generated

~90%

of agent behavior is the harness, not the model

+19%

longer on some tasks (METR) — verification is the cost

The read

The clearest map yet of how serious AI development works — and mostly tool-agnostic. But it’s a Google funnel: the concepts are neutral, the on-ramps point to Gemini, Jules & the ADK. If the harness is 90% and it’s yours, your moat and your costs both live there — so own your scaffolding, route across models, and remember: AI amplifies whatever engineering culture it lands in.

Source: Osmani, Saboo & Kartakis, “The New SDLC With Vibe Coding,” Google (May 2026). Figures are the paper’s own, incl. METR & LangChain. Analysis is the author’s.

thorstenmeyerai.com

Implications for AI Development Strategies

This whitepaper’s core message challenges the industry to rethink AI development. Instead of chasing ever-larger or more powerful models, companies should invest in harness design, configuration, and verification. The insight that the model accounts for only 10% of behavior means that costs, reliability, and security depend heavily on how the AI system is assembled and managed. This could lead to a shift in resource allocation and best practices across AI teams, emphasizing structured engineering over model size.

For organizations, adopting this approach could mean significant reductions in operational costs, improved system robustness, and enhanced security, especially as AI becomes more embedded in critical infrastructure.

Background on AI System Design Shifts

Until now, AI development has largely focused on improving model architecture, size, and training data. The narrative often centered on model performance benchmarks and pushing the frontier of AI capabilities. However, recent experiments and industry observations have shown that configuration and scaffolding—the harness—play a decisive role in actual system behavior. This aligns with broader trends in software engineering, where configuration management and verification are recognized as vital for reliable, secure systems.

The whitepaper builds on a growing understanding that AI’s practical deployment hinges on effective context engineering, tools, and guardrails, rather than solely on the model’s raw power. This represents a maturation of the AI engineering discipline, emphasizing system integration and control.

“The behavior you experience in AI tools is dominated by scaffolding you can build, own, and improve, not just the model itself.”
— Addy Osmani

Unresolved Questions About Practical Implementation

While the whitepaper provides strong evidence that harness configuration is critical, it remains unclear how organizations will best scale these practices across diverse AI applications. Specific guidelines for best practices in harness design, verification, and ongoing maintenance are still emerging, and industry adoption may vary.

Additionally, the long-term impact of this shift on AI model development, hardware investment, and talent requirements is still being studied.

Next Steps for AI Engineering Teams

Organizations should reevaluate their AI development processes, emphasizing harness design, context engineering, and verification. Developing standardized frameworks and tools for system configuration will likely become a priority. Industry leaders may also invest in training and best practice sharing to adopt this new paradigm effectively.

Further research and case studies are expected to clarify how best to implement these principles at scale, and whether this approach reduces costs and enhances system robustness in diverse real-world scenarios.

Key Questions

Why is the model only 10% of the system’s behavior?

The whitepaper argues that most agent failures are configuration failures (a missing tool, a vague rule, a noisy context), so behavior depends more on prompts, tools, rules, and context management than on the underlying model itself.

What does “harness” refer to in AI systems?

The harness is everything around the model: prompts, tools, rules, context management, hooks, sandboxes, and observability. Because these pieces shape and constrain what the model does, the paper treats the harness as the most influential part of the system.

How does this shift affect AI development costs?

The initial investment in specs, evals, and context is higher, but operating costs tend to fall afterward: better context engineering raises first-pass success, which cuts token-burning fix-it loops, and model routing sends the easy work to cheaper models.

Will this change the way AI models are built?

Yes, in the sense that the focus shifts from solely developing larger models to designing better systems for harnessing and verifying AI outputs, which requires new skills in configuration and systems engineering.

Source: ThorstenMeyerAI.com


Author

SmartCR Team
