What Is Agent Washing? 5 Questions for Vendor Demos


TL;DR

Agent washing is the practice of relabelling scripted automation, workflow tools, or RPA as agentic AI, claiming autonomy, governance, and decision capability that the underlying architecture does not deliver.
It matters beyond vendor credibility: agent-washed products generate all four types of Agent Debt on day one because the capabilities that prevent each type were never present.
Five questions separate genuine agentic platforms from agent-washed products; ask them in every vendor demo before any contract is signed.
Five signals mark a washed product: no decision trace, no coordination protocol, no production monitoring, no accessible governance layer, and no retrospective audit trail.
Gartner estimates only about 130 of the thousands of vendors claiming agentic AI capability offer genuine agentic features.
See how the Merlin Agentic AI Platform answers all five questions by design, not by configuration.

The foundational Agent Debt piece established the mechanism. The CPO self-assessment provided the diagnostic. The studio teardown identified where the debt originates. The four debt types named the dimensions. The technical debt comparison explained why it compounds faster. The agent count metric identified what drives accumulation. The prevention blog described the four architectural decisions that stop debt forming. The paydown blog addressed what to do when it has accumulated. This blog addresses how to identify, before signing a contract, whether the vendor being evaluated will generate Agent Debt from day one.

Agent Debt is the compounding operational liability an enterprise takes on when it deploys task-doing AI agents faster than it can govern, orchestrate, and tie them to business outcomes. Agent washing is the commercial mechanism through which that liability is generated invisibly: the vendor demonstrates autonomous, governed, orchestrated agents, while the product delivers scripted automation with an agentic label.

What is agent washing, and why does it matter beyond vendor credibility?

Agent washing is the relabelling of scripted automation, workflow tools, or robotic process automation as agentic AI, claiming capabilities the architecture does not deliver: autonomous decision-making that is actually rule execution, governance that is actually demo-environment logging, orchestration that is actually scripted sequencing.

The reason it matters beyond marketing is structural. An agent-washed product generates all four types of Agent Debt on day one: Governance Debt because there is no real decision trace; Orchestration Debt because there is no real coordination protocol; Maintenance Debt because the monitoring is demo-environment monitoring; and Talent Debt because the operating knowledge lives in the implementation consultant. The enterprise does not realize this until a trigger event makes the debt undeniable.

Over 40 percent of agentic AI projects will be canceled by the end of 2027, due to escalating costs, unclear business value, or inadequate risk controls. Gartner estimates only about 130 of the thousands of agentic AI vendors offer genuine agentic features.
Gartner, June 2025

The five questions below help identify which side of that gap a vendor sits on, before signing.

Figure 1. Five demo questions to identify agent washing, showing what each targets and the debt risk if the vendor cannot answer.

Question 1: Show me this agent’s decision trace for the last ten production actions.

A genuine answer provides a platform-level log of the last ten decisions in a customer production environment, accessible without an engineering request: input state, logic applied, output, and timestamp. The log is pulled from production in the session, not reconstructed. An agent-washed answer defers to implementation configuration or provides demo-environment metrics only. If the decision trace does not exist in production today, it will not exist after deployment. Governance Debt is structural from day one.
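As a rough illustration of what such a record could contain, the sketch below models a decision trace with the four fields named above (input state, logic applied, output, timestamp) and pulls the last ten entries for one agent. The class names, the `invoice-agent` identifier, and the in-memory list are assumptions made for the example; they do not describe any vendor's actual schema or storage.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class DecisionRecord:
    # Hypothetical record shape: the four fields come from the text above.
    agent_id: str
    input_state: dict
    logic_applied: str
    output: dict
    timestamp: str


class DecisionTrace:
    """Append-only log; each record is written when the decision is made."""

    def __init__(self):
        self._records = []

    def record(self, agent_id, input_state, logic_applied, output):
        rec = DecisionRecord(
            agent_id=agent_id,
            input_state=input_state,
            logic_applied=logic_applied,
            output=output,
            timestamp=datetime.now(timezone.utc).isoformat(),
        )
        self._records.append(rec)
        return rec

    def last(self, agent_id, n=10):
        # Return the n most recent records for one agent, oldest first.
        matching = [r for r in self._records if r.agent_id == agent_id]
        return matching[-n:]


# Illustrative data only: twelve made-up decisions by one hypothetical agent.
trace = DecisionTrace()
for i in range(12):
    trace.record(
        "invoice-agent",
        {"invoice": i},
        "three-way match",
        {"approved": i % 2 == 0},
    )

last_ten = trace.last("invoice-agent")
for rec in last_ten:
    print(asdict(rec))
```

The point of the sketch is the demo test itself: if a query like `last("invoice-agent")` cannot be run against production during the session, the trace is probably not a platform capability.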

Question 2: What happens when two agents produce conflicting outputs on the same exception? Show me the coordination protocol.

Three-quarters of enterprise leaders say they are adopting agentic AI. Only a small minority have it running in meaningful production beyond “agentish” chatbots, and true scaled multiagent systems are rarer still.
Forrester, The State of Agentic AI 2026

That gap between adoption intent and production reality is where uncoordinated multi-agent systems live. A genuine answer demonstrates a platform-level exception routing protocol: when two agents disagree, the conflict routes to a defined owner by policy, not by manual configuration per exception. An agent-washed answer defers to the customer’s team or to implementation. No coordination protocol means every exception is a manual intervention. Orchestration Debt begins accumulating before the second agent is live.
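To make "routes to a defined owner by policy" concrete, here is a minimal sketch of policy-based exception routing. The policy table, owner names, and exception types are hypothetical; a real platform would hold this policy in its own governance layer rather than in a Python dictionary.

```python
from dataclasses import dataclass


@dataclass
class AgentOutput:
    agent_id: str
    exception_type: str
    decision: str


# Hypothetical policy: each exception type maps to a named owner.
ROUTING_POLICY = {
    "price_variance": "category-manager",
    "supplier_risk": "risk-team",
}
DEFAULT_OWNER = "procurement-ops"  # assumed fallback owner


def resolve(a: AgentOutput, b: AgentOutput) -> dict:
    """Compare two agent outputs on the same exception and route conflicts."""
    if a.exception_type != b.exception_type:
        raise ValueError("outputs refer to different exceptions")
    if a.decision == b.decision:
        return {"status": "agreed", "decision": a.decision, "owner": None}
    # Conflict: the owner is looked up from policy, not chosen per exception.
    owner = ROUTING_POLICY.get(a.exception_type, DEFAULT_OWNER)
    return {
        "status": "escalated",
        "decision": None,
        "owner": owner,
        "conflict": {a.agent_id: a.decision, b.agent_id: b.decision},
    }


pricing = AgentOutput("pricing-agent", "price_variance", "approve")
contract = AgentOutput("contract-agent", "price_variance", "reject")
result = resolve(pricing, contract)
print(result["status"], result["owner"])
```

In this sketch the owner is determined before the conflict occurs. A product that instead asks the customer's team to decide each case as it arises has sequencing, not a coordination protocol.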

Question 3: Can I see the monitoring dashboard from the last 30 days of production, not the demo environment?

A genuine answer opens a live dashboard from a customer production environment showing 30 days of agent performance: decisions made, exceptions generated, drift against the pre-deployment baseline, and alert history. The vendor shows this without preparation because the infrastructure runs continuously. An agent-washed answer offers to provide anonymized data later or explains that monitoring is customer-configured. If the vendor cannot show 30 days of production monitoring from an existing customer, the monitoring infrastructure either does not exist or must be built by the customer. Maintenance Debt begins at deployment.
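One simple way to picture "drift against the pre-deployment baseline" is to compare each day's exception rate with a baseline rate and raise an alert when the gap exceeds a tolerance. The sketch below does exactly that. The metric, the 5-point tolerance, and the sample figures are assumptions for illustration; production monitoring would typically track more signals than this.

```python
def exception_rate(decisions: int, exceptions: int) -> float:
    """Share of decisions that produced an exception (0.0 if no decisions)."""
    return exceptions / decisions if decisions else 0.0


def drift_alerts(daily, baseline_rate, tolerance=0.05):
    """Return (day, rate) pairs whose exception rate drifts past tolerance.

    daily: iterable of (day_label, decisions, exceptions) tuples.
    baseline_rate: exception rate measured before deployment (assumed known).
    """
    alerts = []
    for day, decisions, exceptions in daily:
        rate = exception_rate(decisions, exceptions)
        if abs(rate - baseline_rate) > tolerance:
            alerts.append((day, round(rate, 3)))
    return alerts


# Made-up sample data, not real production figures.
daily = [
    ("day-1", 200, 10),
    ("day-2", 180, 27),
    ("day-3", 150, 9),
]
alerts = drift_alerts(daily, baseline_rate=0.05)
print(alerts)
```

The alert history a vendor shows in the demo should be the stored output of a check like this running continuously, not something computed for the meeting.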

Question 4: Who can change the governance policy layer, and can they do it without an engineering ticket?

A genuine answer demonstrates a business-accessible governance interface: the CPO or governance team can adjust decision policies and update exception thresholds without an engineering ticket. Changes are logged with a full audit trail. An agent-washed answer reveals policy changes that require engineering involvement, or a policy layer that is the underlying model’s system prompt.

Nearly eight in ten companies have deployed gen AI in some form, but roughly the same percentage report no material impact on earnings.
McKinsey, Seizing the Agentic AI Advantage

A governance layer that requires engineering on every change cannot adapt to evolving procurement priorities. The Governance Debt risk is not only audit exposure; it is a platform structurally unable to stay aligned with the business without an engineering project for each update.
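The sketch below shows one way a business-accessible policy layer could behave: authorized business roles change a threshold directly, and every change is written to an audit log with the old value, new value, author, and reason. The role names, the `auto_approve_limit` key, and the class itself are hypothetical and stand in for whatever interface a given platform exposes.

```python
from datetime import datetime, timezone


class PolicyLayer:
    """Hypothetical governance policy store with a change audit log."""

    def __init__(self, thresholds, editor_roles):
        self._thresholds = dict(thresholds)
        self._editor_roles = set(editor_roles)
        self.audit_log = []

    def get(self, key):
        return self._thresholds[key]

    def update(self, user, role, key, new_value, reason):
        # Only named business roles may edit; no engineering ticket involved.
        if role not in self._editor_roles:
            raise PermissionError(f"role {role!r} may not edit policy")
        if key not in self._thresholds:
            raise KeyError(key)
        old_value = self._thresholds[key]
        self._thresholds[key] = new_value
        # Every change is logged when it is made.
        self.audit_log.append({
            "changed_by": user,
            "role": role,
            "key": key,
            "old": old_value,
            "new": new_value,
            "reason": reason,
            "changed_at": datetime.now(timezone.utc).isoformat(),
        })


policy = PolicyLayer(
    {"auto_approve_limit": 5000},
    editor_roles={"cpo", "governance"},
)
policy.update("a.user", "cpo", "auto_approve_limit", 2500, "tighter controls")
print(policy.get("auto_approve_limit"), len(policy.audit_log))
```

The design choice worth probing in a demo is where the policy lives. If the only way to change `auto_approve_limit` is to edit a prompt or redeploy code, the policy layer is not business-accessible.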

Question 5: If an auditor challenges one of this agent’s decisions from six months ago, what does the audit trail look like?

A genuine answer retrieves a specific decision from six months ago in the platform record and walks through the trace: information available at the time, logic applied, output, and authorization. Retrieval takes minutes. The record was captured at the time of the decision, not reconstructed after. An agent-washed answer involves working with the customer’s team to reconstruct from available logs. Reconstruction is not an audit trail. The absence of retrospective auditability means all four Agent Debt types have been accumulating without visibility. The auditor question is the trigger event that makes them undeniable.
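The text does not say how a platform would show that a record was captured at decision time rather than rebuilt later. One common technique, assumed here purely for illustration, is hash chaining: each entry stores a hash of its own content plus the previous entry's hash, so a later edit breaks the chain. The class and field names below are hypothetical.

```python
import hashlib
import json


def _digest(decision_id: str, payload: dict, prev_hash: str) -> str:
    # Hash the entry content together with the previous entry's hash.
    body = json.dumps(
        {"decision_id": decision_id, "payload": payload, "prev": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class AuditTrail:
    """Hypothetical append-only trail with hash-chained entries."""

    def __init__(self):
        self._entries = []

    def append(self, decision_id, payload):
        prev_hash = self._entries[-1]["hash"] if self._entries else ""
        self._entries.append({
            "decision_id": decision_id,
            "payload": payload,
            "prev_hash": prev_hash,
            "hash": _digest(decision_id, payload, prev_hash),
        })

    def retrieve(self, decision_id):
        for entry in self._entries:
            if entry["decision_id"] == decision_id:
                return entry
        raise KeyError(decision_id)

    def verify(self):
        # Recompute every hash in order; any altered entry fails the check.
        prev = ""
        for e in self._entries:
            expected = _digest(e["decision_id"], e["payload"], prev)
            if e["prev_hash"] != prev or e["hash"] != expected:
                return False
            prev = e["hash"]
        return True


# Made-up decisions for illustration.
trail = AuditTrail()
trail.append("D-001", {"logic": "three-way match", "approved": True})
trail.append("D-002", {"logic": "budget check", "approved": False})
print(trail.retrieve("D-001")["payload"], trail.verify())
```

Retrieval by identifier is what makes the answer take minutes, and a passing integrity check is what separates a captured record from a reconstruction.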

What does a vendor who answers all five look like, and what does it unlock?

A vendor who answers all five has built the four architectural components that prevent Agent Debt: a decision record at the platform level, a coordination protocol above the agents, continuous production monitoring, and platform-embedded operating knowledge. The CPO who buys that platform does not need the prevention decisions to be made as a separate project. The architecture makes them.

The Merlin Agentic AI Platform is built to answer all five. Decision traceability, coordination governance, production monitoring, business-accessible policy configuration, and retrospective audit capability are the architectural baseline from which every procurement agent operates, not features configured during implementation.

Published by Zycus

A procurement AI platform that cannot answer these five questions will generate Agent Debt from the first deployment. Evaluate every vendor against this standard before signing. Explore how the Merlin Agentic AI Platform is built to meet it.

Read the series: What is Agent Debt? · The CPO Self-Assessment · What 50+ Agents Actually Means · The Four Types · Agent Debt vs Technical Debt · Why Agent Count Is Wrong · How to Prevent Agent Debt · How to Pay Down Agent Debt · Beyond the Hype (Whitepaper)

FAQs

Q1. What is the difference between agent washing and ordinary vendor hype?
Ordinary vendor hype overstates the maturity or ROI of a genuine product. Agent washing misrepresents the category: the product is not, in architectural terms, an AI agent. It is rule-based automation, scripted workflow, or RPA with a natural language interface. Hyped products underperform. Washed products fail structurally when asked to do what agents do: make autonomous decisions, coordinate with other agents, and produce auditable outputs.

Q2. Can agent washing happen unintentionally?
Yes. Some vendors genuinely believe their product is agentic because it uses a large language model. Using an LLM does not make a product agentic. What makes a product agentic is autonomous decision-making, tool use, memory across tasks, and platform-level governance. Many LLM-powered products are sophisticated retrieval and summarization tools, not agents. The five questions elicit architectural evidence rather than vendor belief.

Q3. Should these five questions be asked before or after a proof of concept?
Before. A proof of concept is designed and executed by the vendor, who controls what the demo shows. A POC run with a vendor that cannot answer the five questions will produce evidence of the demo path the vendor prepared, not evidence of genuine agentic capability. The five questions should be asked at the evaluation stage, before any POC is scoped, so that the answers determine whether the POC is worth running.

Q4. What if the vendor says the audit trail and governance layer will be configured during implementation?
That means these capabilities do not exist in the product today. They will be built during your implementation, at your cost, custom to your environment. Custom-built audit infrastructure is not maintained by the vendor, is not improved across the vendor’s customer base, and will not receive platform-level investment. The Governance and Maintenance Debt risk of custom-built audit infrastructure is significantly higher than that of platform-level infrastructure maintained by a vendor with multiple customers depending on it.

Q5. How does agent washing relate to the four Agent Debt types?
Agent washing is the commercial origin of all four types simultaneously. Governance Debt from day one because there is no platform-level decision trace. Orchestration Debt because there is no coordination protocol, only scripted sequencing. Maintenance Debt because monitoring is demo-environment monitoring, not production monitoring. Talent Debt because operating knowledge lives in the implementation consultant. Prevention decisions cannot be made retroactively against a washed product because the architectural foundations they assume do not exist.

Q6. Is Gartner’s estimate that only 130 vendors are genuinely agentic reliable?
It is an estimate from a firm that evaluated thousands of vendors against a defined set of agentic criteria. The precise number is less important than the implication: the market for genuine agentic AI is narrow relative to the number of vendors claiming the label. The five questions are the CPO’s equivalent of the criteria Gartner applied. The number of vendors who can answer all five in a live demo is significantly smaller than the number claiming agentic capability.

Q7. What happens if agent washing is discovered after a multi-year contract is signed?
Options narrow significantly. The enterprise can renegotiate terms to include genuine agentic capability commitments, pursue remediation on grounds of material misrepresentation, or manage Agent Debt for the contract duration using the paydown sequence. The key difference from a clean paydown is that the vendor product actively resists remediation because the governance, coordination, and monitoring capabilities being built are not supported by the underlying architecture.

Q8. How does the Merlin Agentic AI Platform answer all five questions?
Decision trace: every agent action generates a platform-level record accessible to business users without engineering involvement. Coordination protocol: all agent-to-agent interactions are mediated through a centralized coordination layer with defined exception routing. Production monitoring: continuous drift detection and alert routing operate from the first day of production. Governance access: procurement and compliance teams configure governance policies without engineering tickets, with a full change audit trail. Retrospective audit: any past decision is retrievable from the platform record, not from reconstructed logs.


Uday Jain

Uday is in the business of making procurement leaders read past the first line. Content and product marketer at Zycus, turning product complexity into something worth their time. Demand gen is where I learned the craft from the ground up. Every headline earning the click, every paragraph earning the next, every word pulling its weight. If they bookmark it, I’ve done my job. If they share it, I’ve done it well.