Developing AI with Agile: Redefining “Done” and “User” Before the Model Redefines You

When we started applying Agile to AI work, I assumed our existing playbook (thin slices, short sprints, frequent demos) would carry over. It didn’t. We shipped increments, yes. But those increments often looked productive while the models learned the wrong lessons. In my world (Salesforce ecosystems in healthcare and banking), that gap showed up as models that “worked” in sprint reviews yet created downstream friction, caused compliance churn, or quietly optimized for the wrong behavior.

Over several programs, though, I learned that Agile does fit AI if you redefine two fundamentals up front:

“Done” is not a working model. It’s a model plus evidence that we’re learning the right thing.
And “user” isn’t just an end user. It includes the data, the reviewers, and the controls that will live with your model after you ship.

Below, I’ll share the patterns that made the difference, the moments that forced me to change my mind, and a practical backbone you can take into your next AI iteration.

1. Redefine “done”: From output to outcome evidence

The moment my thinking changed came during a sprint demo where our AI assistance reduced Salesforce Opportunity creation time dramatically in a banking context. On paper, this was a win. But a week later, compliance flagged subtle edge cases the model had learned from historical shortcuts in the CRM. We had optimized speed at the cost of governance. Sprint “done” wasn’t business “done.”

What finally worked: we added Outcome Evidence to our Definition of Done (DoD):

Metric + guardrail pair for every story (e.g., time saved paired with policy adherence rate).
Champion–Challenger check in the acceptance criteria: the new model must beat a baseline and preserve guardrails in a holdout set.
Counterfactual example review: before closing the story, we ask, “Where would this model make a confident but wrong recommendation, and how will we detect it?”

This didn’t slow us down; it prevented expensive rework. It also changed our sprint reviews: instead of demoing “look what the model can do,” we demoed “look what we learned and how we know we didn’t learn the wrong thing.”
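As an illustration, the metric + guardrail pair and the Champion–Challenger check can be written as an acceptance test that runs before a story is closed. This is a minimal sketch under stated assumptions: `HoldoutResult`, `meets_outcome_evidence`, the field names, and the sample numbers are all hypothetical and do not come from any of the programs described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HoldoutResult:
    """Scores for one model on a shared holdout set (hypothetical shape)."""
    outcome_metric: float  # e.g. minutes saved per Opportunity; higher is better
    guardrail_rate: float  # e.g. policy adherence rate, between 0.0 and 1.0


def meets_outcome_evidence(
    champion: HoldoutResult,
    challenger: HoldoutResult,
    guardrail_floor: float,
    min_gain: float = 0.0,
) -> tuple[bool, list[str]]:
    """Return (passed, reasons) for the champion-challenger acceptance check."""
    reasons: list[str] = []
    # The challenger must beat the baseline on the outcome metric.
    if challenger.outcome_metric - champion.outcome_metric <= min_gain:
        reasons.append("challenger does not beat the baseline on the outcome metric")
    # The challenger must stay above the agreed guardrail floor.
    if challenger.guardrail_rate < guardrail_floor:
        reasons.append("challenger falls below the agreed guardrail floor")
    # The challenger must not trade the guardrail away relative to the baseline.
    if challenger.guardrail_rate < champion.guardrail_rate:
        reasons.append("challenger regresses the guardrail relative to the baseline")
    return (not reasons, reasons)


if __name__ == "__main__":
    baseline = HoldoutResult(outcome_metric=4.0, guardrail_rate=0.98)
    faster_but_sloppy = HoldoutResult(outcome_metric=9.0, guardrail_rate=0.91)
    passed, reasons = meets_outcome_evidence(
        baseline, faster_but_sloppy, guardrail_floor=0.97
    )
    print(passed, reasons)
```

In this made-up example the challenger saves more time but fails both guardrail checks, which is the pattern the banking demo ran into: a speed win that is not yet business “done.”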

2. Expand “user”: Include data, reviewers, and controls

In healthcare integrations (Epic/Genesys ↔ Salesforce Health Cloud), we learned that the people who review model output (care coordinators, operations leads, security, compliance) are as critical as end users. If they lack visibility or a lightweight override path, they will quietly build and harden manual workarounds outside the system. That’s how a “shadow process” is born.

Our rule of thumb now:

Treat data stewards as primary users. Give them promptable checks, lineage notes, and drift signals where they live (often the CRM or SPM dashboards).
Treat reviewers as first‑class citizens. If a human‑in‑the‑loop can’t give feedback inside the workflow, your model quality will decay without anyone noticing.
Treat controls (security/compliance) as product features, not gate meetings. We ship small compliance surfaces early—policy hints, audit events, sampling views—so reviewers can see and shape the model’s boundaries.

When we took this view, adoption improved, and the model learned faster because feedback lived where the work happened, not in a separate spreadsheet.
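One possible shape for the “drift signals” mentioned above is a population stability index (PSI) on a categorical field, comparing a reference sample with a recent one. This is a swapped-in, commonly used technique, not something the programs described here are known to have used; the function name and the smoothing constant `eps` are assumptions for illustration.

```python
import math
from collections import Counter


def population_stability_index(
    expected: list[str], actual: list[str], eps: float = 1e-6
) -> float:
    """PSI between two samples of a categorical field; 0.0 means identical shares."""
    if not expected or not actual:
        raise ValueError("both samples must be non-empty")
    exp_counts, act_counts = Counter(expected), Counter(actual)
    psi = 0.0
    for category in set(expected) | set(actual):
        # eps keeps the logarithm defined when a category is absent from one sample.
        e = max(exp_counts[category] / len(expected), eps)
        a = max(act_counts[category] / len(actual), eps)
        psi += (a - e) * math.log(a / e)
    return psi


if __name__ == "__main__":
    last_quarter = ["billing", "billing", "clinical", "access"]
    this_week = ["billing", "clinical", "clinical", "clinical"]
    print(round(population_stability_index(last_quarter, this_week), 3))
```

Every term in the sum is non-negative, so a larger value means the two samples disagree more. The threshold at which a data steward should be alerted is a judgment call to tune per field, and the number is only useful if it is surfaced where stewards already look.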

3. Thin vertical slices for AI are different: Slice by decision, not by screen

On a Salesforce program, our first “thin slice” was UI‑centric: autocomplete fields, smart defaults, then later a recommendation panel. Velocity looked great; learning did not. The model wasn’t exposed to the decision pressure we actually cared about (e.g., which lead deserves attention now; which case needs escalation).

We changed the slice to “one decision, end‑to‑end”:

Define the decision (e.g., “Which opportunity deserves a same‑day touch?”).
Ship a tiny model (or even a rules‑based challenger), a visible rationale, a one‑click override, and a feedback capture.
Measure decision quality against a baseline, not just click speed.

That slice exposed the real trade‑offs (precision vs. recall, throughput vs. fairness) far earlier and gave us the right conversations in sprint reviews.
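A “one decision, end‑to‑end” slice can be small enough to fit in a few dozen lines. The sketch below assumes a rules‑based challenger for the same‑day‑touch decision; the `Decision` record, the thresholds (50,000 and 7 days), and every field name are hypothetical, chosen only to show rationale, override, feedback capture, and a quality measure living together.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    opportunity_id: str
    recommended: bool                # rule or model says "same-day touch"
    rationale: str                   # shown next to the recommendation
    override: Optional[bool] = None  # reviewer's one-click correction, if any
    feedback: str = ""               # reason captured together with the override

    @property
    def final(self) -> bool:
        """The decision that actually stands after any human override."""
        return self.recommended if self.override is None else self.override


def rules_challenger(opportunity_id: str, days_since_contact: int, amount: float) -> Decision:
    """Hypothetical rule: large, recently untouched opportunities get a touch today."""
    if amount >= 50_000 and days_since_contact >= 7:
        return Decision(opportunity_id, True, "amount >= 50k and no contact for 7+ days")
    return Decision(opportunity_id, False, "below amount or recency threshold")


def precision_recall(predicted: list[bool], actual: list[bool]) -> tuple[float, float]:
    """Decision quality against labeled outcomes, not click speed."""
    pairs = list(zip(predicted, actual))
    tp = sum(p and a for p, a in pairs)
    fp = sum(p and not a for p, a in pairs)
    fn = sum(a and not p for p, a in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


if __name__ == "__main__":
    decision = rules_challenger("opp-1", days_since_contact=10, amount=80_000.0)
    decision.override = False
    decision.feedback = "customer asked for no contact this week"
    print(decision.final, decision.rationale)
```

Running `precision_recall` on both the baseline's and the challenger's recommendations over the same labeled outcomes gives the comparison the slice is meant to produce, and the stored overrides become labeled edge cases for the next sprint.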

4. Backlog hygiene: Separate model work, data work, and control work, but demo them together

In classic Agile, “as a user I want…” stories can blur model training, data remediation, and governance hardening into one mega‑ticket. We split the work into three tracks, each with its own cadence, but we demo them together:

Model work: features, loss curves, challenger results, drift signals.
Data work: pipeline health, lineage notes, labeled edge cases added this sprint.
Control work: audit events shipped, sampling policy, access scopes.

One sprint review equals one narrative: what we changed in the model, what data moved because of it, and how controls evolved. Executives track value; auditors see traceability; teams see cause and effect.

5. Make feedback frictionless (and visible) inside the tooling people already use

In one hospital rollout, we had brilliant feedback buried in SharePoint and email threads. We moved it into the systems where work happened: Salesforce comments, case objects, and lightweight review forms. We also surfaced “Top 5 feedback themes” on an SPM dashboard that leadership already reviewed weekly.

That closed the loop. Instead of asking clinicians and support teams to go somewhere else to be heard, we met them in the flow of work. Model quality improved because the feedback finally reached the people who could act on it.
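The “Top 5 feedback themes” summary needs very little machinery once feedback is captured in one place. Here is a minimal sketch, assuming each feedback item already carries a theme tag chosen by the reviewer; the function name and the sample tags are hypothetical.

```python
from collections import Counter


def top_feedback_themes(
    tagged_feedback: list[tuple[str, str]], n: int = 5
) -> list[tuple[str, int]]:
    """Count theme tags on (theme, comment) pairs and return the n most frequent."""
    # Normalize case and whitespace so near-duplicate tags are counted together.
    themes = Counter(theme.strip().lower() for theme, _comment in tagged_feedback)
    return themes.most_common(n)


if __name__ == "__main__":
    items = [
        ("Wrong escalation", "Case was routine, not urgent"),
        ("wrong escalation ", "Escalated a duplicate case"),
        ("Missing context", "Recommendation ignored the last call note"),
    ]
    print(top_feedback_themes(items))
```

Free‑text comments without tags would need a separate step to assign themes, which this sketch does not attempt.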

6. When to say “no” to a model

A counterintuitive lesson: there are moments when the most Agile move is not to build a model. If the data‑generating process is unstable (e.g., new workflow, new form fields, evolving taxonomy), first stabilize the process with automation or explicit rules. Then invite an AI challenger. Otherwise, you’ll teach your model yesterday’s chaos and call it learning.

We now use a simple guardrail: If we cannot articulate a stationary slice of the process for 4–6 weeks, we don’t train, we instrument. That instrumentation later becomes gold for training.
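The guardrail can be approximated in code if the process is already instrumented. The sketch below assumes one snapshot per week of the form's field names and treats “unchanged field set” as a stand‑in for a stationary slice. That is a deliberately narrow proxy, and `stable_weeks`, `train_or_instrument`, and the field names are hypothetical. The default of 4 weeks is the lower end of the 4–6 week range above.

```python
def stable_weeks(weekly_schemas: list[frozenset[str]]) -> int:
    """Number of most recent consecutive weeks whose field set matches the latest one."""
    if not weekly_schemas:
        return 0
    latest = weekly_schemas[-1]
    count = 0
    # Walk backwards from the newest snapshot until the schema differs.
    for snapshot in reversed(weekly_schemas):
        if snapshot != latest:
            break
        count += 1
    return count


def train_or_instrument(
    weekly_schemas: list[frozenset[str]], min_stable_weeks: int = 4
) -> str:
    """Return "train" only when the process has held still long enough."""
    if stable_weeks(weekly_schemas) >= min_stable_weeks:
        return "train"
    return "instrument"


if __name__ == "__main__":
    old_form = frozenset({"stage", "amount"})
    new_form = old_form | {"risk_tier"}
    history = [old_form, old_form, new_form, new_form, new_form]
    print(stable_weeks(history), train_or_instrument(history))
```

A fuller check would also watch taxonomy values and workflow steps, not just field names, but even this version turns “is the process stable yet?” into something a team can inspect each sprint.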

A lightweight backbone you can copy

DoD upgrade: outcome metric + guardrail + challenger check + counterfactual example.
User map: end user, data steward, reviewer, control owner—each with an in‑tool feedback path.
Slice by decision: one real decision, end‑to‑end, with rationale and override.
Three‑track backlog: model / data / control, demoed as one story of change.
In‑flow feedback: capture and summarize where people already work.
Instrument before you model: only learn from processes stable enough to teach.

What this work has taught me

Working on AI inside fast‑moving, highly regulated environments hasn’t made me more certain about the “right way” to do things. If anything, it’s humbled me. I’ve had late‑night standups where the model drifted without warning, sprint reviews where a small insight changed the entire roadmap, and moments where I caught myself asking, “Are we even solving the right problem?”

What I’ve learned, often the hard way, is that the teams that thrive treat learning as the real product. Controls become features, and decisions become tiny, testable slices of truth. When we redefine “done” and “user” through that lens, the model gets better one sprint at a time, and the organization starts trusting it enough to scale.

So, if you’ve ever paused mid‑project and questioned, “Is this still working?” while building AI in a multi‑system, highly regulated world, you’re not alone. That question is usually where the real conversation begins. I’d love to hear how you’re navigating it, too.

This is a community blog post. The opinions contained within belong solely to the author or authors, and may not represent the opinion or policy of Agile Alliance.

Pulkit Singhal

Pulkit Singhal is a multi‑certified Senior Salesforce Business Analyst known for delivering enterprise‑level impact across financial services, healthcare, and operational environments. He has led transformative Salesforce initiatives, strengthened cross‑functional alignment, and significantly improved operational efficiency across key business units. Blending Lean Six Sigma discipline with deep platform expertise, Pulkit excels at turning ambiguity into scalable, user‑centered solutions, earning recognition as a trusted strategic partner, an…
