Why Agent Count Is the Wrong Procurement AI Metric

TL;DR

Agent count is the vendor’s revenue metric. It drives commercial growth for the vendor, not procurement value for the CPO.
Agent count and procurement outcomes diverge structurally: agent count keeps rising while outcome-per-agent metrics plateau from month six onward.
The divergence gap is where Agent Debt accumulates. Optimizing for agent count accelerates all four debt types simultaneously.
The three metrics a CPO should require: decisions traceable without engineering effort as a percentage of total agent actions, workflows completed without human exception as a percentage of workflows initiated, and outcomes per sourcing cycle versus the pre-deployment baseline.
Deloitte’s State of AI in the Enterprise 2026 found that close to three-quarters of organizations plan to deploy agentic AI within two years, but only 21% have a mature governance model for it. That gap is where Agent Debt compounds.
Explore how the Merlin Agentic AI Platform structures every agent around an Intake-to-Outcomes workflow so agent count is always a means, never the metric.

The vendor tracks agents deployed because that is what drives its revenue. The CPO should track outcomes delivered because that is what drives procurement value. These are not the same metric, and they diverge structurally as the estate grows.

The foundational Agent Debt piece established the mechanism. The CPO self-assessment provided the diagnostic. The studio teardown identified where the debt originates. The four debt types named the dimensions. The technical debt comparison explained why Agent Debt compounds faster and without visibility. This blog addresses the metric that drives the accumulation.

Agent Debt is the compounding operational liability an enterprise takes on when it deploys task-doing AI agents faster than it can govern, orchestrate, and tie them to business outcomes. The metric the enterprise uses to measure progress determines which side of that definition it is on.

Why does every agentic AI vendor lead with agent count as the headline metric?

Agent count is the metric that maps directly to vendor revenue. Studio-model vendors charge per agent, per seat, or per additional capability licensed. Vendors do not lead with agent count because it is the most useful metric for the CPO; they lead with it because it is the most useful metric for the vendor.

This is not a criticism of individual vendors. It is a structural observation about how software revenue models work. The vendor optimizes for the metric that drives its growth. The CPO needs to optimize for a different metric entirely and, before signing a contract, understand why the two metrics diverge.

What does the vendor’s financial model actually optimize for?

A studio vendor’s revenue grows with agents deployed. Their commercial incentive is to maximize the number of agents an enterprise licenses, expand the use cases covered, and increase the complexity of the estate — because complexity creates dependency and dependency creates retention.

The agent count metric serves all three of these commercial goals simultaneously. A high agent count signals a large deployment, justifies expansion licensing, and creates the integration complexity that makes the estate difficult to migrate. It is a revenue metric masquerading as a success metric.

“The top barriers to agentic AI ROI are limited visibility into long-term impact, undefined or inconsistent baseline metrics, and the absence of a dedicated AI value office or task force. Agentic AI does not solve these problems.”
IDC, Agentic AI Is Breaking Your ROI Model

Agentic AI amplifies these barriers. The absence of a CPO-side outcome metric is not accidental: it is what the vendor’s model requires.

What does the CPO’s value model actually require as a metric?

The CPO needs to measure decisions, not deployments. The substitution is direct:

Agents deployed becomes: decisions traceable without engineering involvement as a percentage of total agent actions.
Automations created becomes: workflows completed without human exception as a percentage of workflows initiated.
Agents available out of the box becomes: agent actions auditable on request within a defined time window, measured against a standard the enterprise set before deployment.

Each of these metrics measures what procurement functions actually need: outcomes and accountability, not estate size. Each is measurable before the first agent goes live. Each has a pre-deployment baseline that procurement functions already understand how to establish. And each has a definition of success that does not change based on the number of agents the vendor deploys on the account.
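The third substitution, agent actions auditable on request within a defined time window, can be measured as the share of audit requests answered inside that window. The sketch below is illustrative only: the 24-hour window, the function name auditable_rate, and the sample response times are assumptions made for the example, not figures from any vendor or survey.

```python
from datetime import timedelta
from typing import Optional, Sequence

# Hypothetical standard the enterprise sets before deployment.
AUDIT_WINDOW = timedelta(hours=24)


def auditable_rate(
    response_times: Sequence[Optional[timedelta]],
    window: timedelta = AUDIT_WINDOW,
) -> float:
    """Share of audit requests answered within the window.

    Each entry is the time taken to produce a decision log for one
    audit request; None means the log was never produced.
    """
    if not response_times:
        return 0.0
    on_time = sum(1 for t in response_times if t is not None and t <= window)
    return on_time / len(response_times)


# Made-up sample: two requests answered in time, one late, one never answered.
sample = [timedelta(hours=2), timedelta(hours=30), None, timedelta(hours=23)]
rate = auditable_rate(sample)
print(f"Auditable on request within window: {rate:.0%}")
```

Because the window is fixed before deployment, the result does not depend on how many agents the vendor later adds to the account.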

How do the two metrics diverge at 12 and 24 months?

In the first three to six months, agent count and outcomes appear correlated. Each new agent addresses a specific workflow and delivers early productivity gains. The metrics tell the same story.

From month six onward, they diverge. Integration complexity grows with each additional agent. Exception handling volume rises as agents encounter edge cases the coordination protocol did not anticipate. Governance gaps accumulate. The vendor’s revenue metric continues rising while the CPO’s outcome metrics plateau.

“Close to three-quarters of organizations plan to deploy agentic AI within two years. Only 21% have a mature model for governance of autonomous agents.”
Deloitte, State of AI in the Enterprise 2026 (N=3,235)

Three-quarters of enterprises are on the vendor’s adoption trajectory. One in five has the governance infrastructure to measure whether that adoption is delivering outcomes. That gap is where Agent Debt accumulates.

What happens to Agent Debt when the optimization target is wrong?

When agent count is the optimization target, organizations deploy additional agents before the governance and orchestration infrastructure is in place. Each new agent added without a coordination protocol contributes Orchestration Debt. Each agent added without a monitoring standard contributes Maintenance Debt. Each agent whose decision logic lives in the person who built it contributes Talent Debt.

The metric is not just wrong. It actively accelerates the four debt types this series has been documenting. Optimizing for agent count is optimizing for Agent Debt accumulation, whether or not the enterprise intends it. The compounding mechanism identified in the series does not require malicious intent: it only requires the absence of the right optimization target.

“Data security, privacy, and risk concerns are the top factor influencing AI strategy in the next six months for 91% of large enterprise AI leaders.”
KPMG, Q1 2026 AI Pulse Survey

These are precisely the governance concerns that agent count as an optimization target systematically defers.

What does a properly aligned metric look like at the estate level?

Three metrics a CPO should require from any vendor before deployment:

Decisions traceable without engineering effort ÷ total agent actions. This measures whether governance is structural or manual.
Workflows completed without human exception ÷ total workflows initiated. This measures whether the estate is delivering outcomes or generating exception queues.
Outcomes delivered per sourcing cycle vs. the pre-deployment baseline. This measures whether agentic AI is improving the function’s core purpose or just adding complexity.
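The three ratios above reduce to simple arithmetic once the underlying counts are captured. A minimal sketch, assuming a hypothetical EstateSnapshot record whose field names and sample numbers are invented for illustration and do not come from any real deployment:

```python
from dataclasses import dataclass


@dataclass
class EstateSnapshot:
    """Counts for one reporting period. All field names are hypothetical."""
    total_agent_actions: int
    traceable_actions: int            # traceable without engineering effort
    workflows_initiated: int
    workflows_without_exception: int  # completed with no human exception
    outcome_per_cycle: float          # e.g. outcomes delivered per sourcing cycle
    baseline_outcome_per_cycle: float # measured before deployment


def ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when nothing was measured."""
    return numerator / denominator if denominator else 0.0


def cpo_metrics(s: EstateSnapshot) -> dict:
    """Compute the three CPO-side metrics for one snapshot."""
    return {
        "traceability_rate": ratio(s.traceable_actions, s.total_agent_actions),
        "exception_free_rate": ratio(
            s.workflows_without_exception, s.workflows_initiated
        ),
        "outcome_vs_baseline": ratio(
            s.outcome_per_cycle, s.baseline_outcome_per_cycle
        ),
    }


# Made-up numbers, for illustration only.
snapshot = EstateSnapshot(
    total_agent_actions=1000,
    traceable_actions=800,
    workflows_initiated=200,
    workflows_without_exception=150,
    outcome_per_cycle=120.0,
    baseline_outcome_per_cycle=100.0,
)
metrics = cpo_metrics(snapshot)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Agent count does not appear anywhere in the calculation. It can still be recorded as an inventory figure, but it does not move any of the three results.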

These metrics are not radical departures from how procurement functions measure themselves. They are procurement outcome metrics applied to the agents operating within procurement workflows. They are what 30 years of technical debt management frameworks would look like translated into the CPO’s language.

Figure 1: Each vendor metric optimizes for estate size. Each CPO metric optimizes for procurement outcomes. The substitution is direct and measurable before the first agent goes live.

How does Intake-to-Outcomes make the right metric structural rather than deliberate?

The Merlin Agentic AI Platform is built around the outcome, not the agent count. Every agent in the system answers for its contribution to a specific procurement outcome: a sourcing decision, a contract milestone, a supplier qualification. Agent count is a consequence of what the workflow requires, not a target to be maximized.

The Intake-to-Outcomes architecture makes this structural: the metric the platform optimizes is procurement outcomes delivered, not agents deployed. The CPO does not need to manually enforce a different metric against the vendor’s model. The architecture enforces it by design.

Published by Zycus

Zycus measures procurement AI success by outcomes delivered, not agents deployed. See how the Merlin Agentic AI Platform structures every agent around a specific Intake-to-Outcomes workflow so that agent count is always a means, never the metric.

Read the series: What is Agent Debt? · The CPO Self-Assessment · What 50+ Agents Actually Means · The Four Types · Agent Debt vs Technical Debt · Beyond the Hype (Whitepaper)

FAQs

Q1. Why do agentic AI vendors always lead with agent count?
Agent count is the metric that maps directly to vendor revenue. Most studio-model vendors charge per agent, per seat, or per additional capability licensed. Maximizing agent count maximizes their addressable revenue on each account. The metric is not chosen because it is useful to the CPO; it is chosen because it drives the vendor’s growth model.

Q2. What metric should replace agent count for a CPO evaluating agentic AI?
The most useful replacement metrics are outcome-anchored: decisions traceable without engineering intervention as a percentage of total agent actions, workflows completed without human exception as a percentage of workflows initiated, and outcomes delivered per sourcing cycle (for example, savings realized) compared to the pre-deployment baseline. These measure what procurement functions actually need: outcomes, not activity.

Q3. How does agent count create Agent Debt?
When agent count is the optimization target, organizations deploy additional agents before the governance and orchestration infrastructure is in place to support them. Each new agent added without a coordination protocol contributes to Orchestration Debt. Each agent added without a monitoring standard contributes to Maintenance Debt. Each agent whose decision logic lives in the person who built it contributes to Talent Debt. Agent count drives the very accumulation the four-type framework describes.

Q4. Is agent count ever a useful metric?
Agent count is a useful inventory metric: it tells you what exists in the estate. It becomes dangerous when it is used as a success metric or as a proxy for value delivered. A procurement function with 12 well-governed, auditable agents delivering traceable outcomes is generating more value than an estate of 50 agents where 30 cannot produce a decision log on request. The number is not the problem; treating it as a value signal is.

Q5. How do vendor and CPO metrics diverge over time?
In the first three to six months, they appear aligned: agent count rises and there are early productivity signals. From month six onward, the divergence becomes visible. Integration complexity grows with each additional agent. Exception handling volume rises as agents encounter edge cases the coordination protocol did not anticipate. Governance gaps widen. The vendor’s revenue metric continues to rise while the CPO’s outcome metrics plateau or decline. By month 24, the gap is structural: the estate is optimized for the vendor’s model, not the CPO’s outcomes.

Q6. What does a correctly aligned vendor relationship look like?
A correctly aligned vendor relationship ties vendor commercial success to CPO outcome metrics, not to agent count. This means outcome-based or consumption-based pricing anchored to business results rather than licenses deployed. It also means the vendor has a structural incentive to ensure agents are well-governed, properly orchestrated, and delivering traceable outcomes — because that is the condition under which their revenue grows. Agent count can still be tracked as an inventory metric, but it does not determine what the vendor earns.

Q7. Can a vendor be held accountable for the Agent Debt it creates?
In the current market, rarely. Most vendor contracts define success as deployment completion, not outcome delivery. The CPO’s leverage is pre-contract: requiring outcome-based success criteria, governance documentation standards, and decision audit capabilities as conditions of the agreement before any agents are deployed. Post-deployment accountability is significantly harder to enforce, which is why the architecture and metric alignment decisions must be made before the first agent goes live.

Q8. How does the Merlin Agentic AI Platform avoid the agent count trap?
The Merlin Agentic AI Platform is built around the Intake-to-Outcomes architecture, which defines success at the workflow level rather than the agent level. Every agent in the platform answers for its contribution to a specific procurement outcome: a sourcing decision, a contract milestone, a supplier qualification. Agent count is a consequence of what the workflow requires, not a target to be maximized. The commercial model follows the same logic: Zycus grows when the CPO achieves better procurement outcomes, not when more agents are deployed.

Uday Jain

Uday is in the business of making procurement leaders read past the first line. Content and product marketer at Zycus, turning product complexity into something worth their time. Demand gen is where I learned the craft from the ground up. Every headline earning the click, every paragraph earning the next, every word pulling its weight. If they bookmark it, I’ve done my job. If they share it, I’ve done it well.