{"id":8103881,"date":"2026-01-27T12:22:47","date_gmt":"2026-01-27T20:22:47","guid":{"rendered":"https:\/\/agilealliance.org\/?p=8103881"},"modified":"2026-02-07T17:44:59","modified_gmt":"2026-02-08T01:44:59","slug":"developing-ai-with-agile-redefining-done-and-user-before-the-model-redefines-you","status":"publish","type":"post","link":"https:\/\/agilealliance.org\/developing-ai-with-agile-redefining-done-and-user-before-the-model-redefines-you\/","title":{"rendered":"Developing AI with Agile: Redefining \u201cDone\u201d and \u201cUser\u201d Before the Model Redefines You"},"content":{"rendered":"\n<p>When we started applying Agile to AI work, I assumed our existing playbook\u2014thin slices, short sprints, frequent demos\u2014would carry over. It didn\u2019t. We shipped increments, yes. But the increments often looked productive while we were learning the wrong lessons. In my world (Salesforce ecosystems in healthcare and banking), that gap showed up as models that \u201cworked\u201d in sprint reviews yet created downstream friction and compliance churn, or quietly optimized for the wrong behavior.<\/p>\n\n\n\n<p>Over several programs, though, I learned that Agile does fit AI if you redefine two fundamentals up front:<\/p>\n\n\n\n<ol start=\"1\" class=\"wp-block-list\">\n<li>\u201cDone\u201d is not a working model. It\u2019s a model + evidence that we\u2019re learning the right thing.<\/li>\n\n\n\n<li>And \u201cuser\u201d isn\u2019t just an end user. It includes the <em>data<\/em>, the <em>reviewers<\/em>, and the <em>controls<\/em> that will live with your model after you ship.<\/li>\n<\/ol>\n\n\n\n<p>Below, I\u2019ll share the patterns that made the difference, the moments that forced me to change my mind, and a practical backbone you can take into your next AI iteration.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>1. 
Redefine \u201cdone\u201d: From output to outcome evidence<\/strong><\/h2>\n\n\n\n<p>The moment my thinking changed came during a sprint demo where our AI assistance reduced Salesforce Opportunity creation time dramatically in a banking context. On paper, this was a win. But a week later, compliance flagged subtle edge cases the model had learned from historical shortcuts in the CRM. We had optimized speed at the cost of governance. Sprint \u201cdone\u201d wasn\u2019t business \u201cdone.\u201d<\/p>\n\n\n\n<p>What finally worked: we added Outcome Evidence to our Definition of Done (DoD):<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Metric + guardrail pair for every story (e.g., <em>time saved<\/em> paired with <em>policy adherence rate<\/em>).<\/li>\n\n\n\n<li>Champion\u2013Challenger check in the acceptance criteria: the new model must beat a baseline and preserve guardrails in a holdout set.<\/li>\n\n\n\n<li>Counterfactual example review: before closing the story, we ask, \u201cWhere would this model make a confident but wrong recommendation, and how will we detect it?\u201d<\/li>\n<\/ul>\n\n\n\n<p>This didn\u2019t slow us down; it prevented expensive rework. It also changed our sprint reviews: instead of demoing \u201clook what the model can do,\u201d we demoed \u201clook what we learned and how we know we didn\u2019t learn the wrong thing.\u201d<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>2. Expand \u201cuser\u201d: Include data, reviewers, and controls<\/strong><\/h2>\n\n\n\n<p>In healthcare integrations (Epic\/Genesys \u2194 Salesforce Health Cloud), we learned that the people who review model output\u2014care coordinators, operations leads, security, compliance\u2014are as critical as end users. If they lack visibility or a lightweight override path, your team will quietly harden manual workarounds outside the system. 
That\u2019s how \u201cshadow process\u201d is born.<\/p>\n\n\n\n<p><strong>Our rule of thumb now:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Treat data stewards as primary users. Give them promptable checks, lineage notes, and drift signals where they live (often the CRM or SPM dashboards).<\/li>\n\n\n\n<li>Treat reviewers as first\u2011class citizens. If a human\u2011in\u2011the\u2011loop can\u2019t give feedback inside the workflow, your model quality will decay without anyone noticing.<\/li>\n\n\n\n<li>Treat controls (security\/compliance) as product features, not gate meetings. We ship small compliance surfaces early\u2014policy hints, audit events, sampling views\u2014so reviewers can <em>see<\/em> and <em>shape<\/em> the model\u2019s boundaries.<\/li>\n<\/ul>\n\n\n\n<p>When we took this view, adoption improved, and the model learned faster because feedback lived <em>where the work is<\/em>, not in a separate spreadsheet.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>3. Thin vertical slices for AI are different: Slice by decision, not by screen<\/strong><\/h2>\n\n\n\n<p>On a Salesforce program, our first \u201cthin slice\u201d was UI\u2011centric: autocomplete fields, smart defaults, then later a recommendation panel. Velocity looked great; learning did not. The model wasn\u2019t exposed to the decision pressure we actually cared about (e.g., which lead deserves attention now; which case needs escalation).<\/p>\n\n\n\n<p>We changed the slice to \u201cone decision, end\u2011to\u2011end\u201d:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Define the decision (e.g., \u201cWhich opportunity deserves a same\u2011day touch?\u201d).<\/li>\n\n\n\n<li>Ship a tiny model (or even a rules\u2011based challenger), a visible rationale, a one\u2011click override, and a feedback capture.<\/li>\n\n\n\n<li>Measure decision quality against a baseline, not just click speed.<\/li>\n<\/ul>\n\n\n\n<p>That slice exposed the real trade\u2011offs (precision vs. 
recall, throughput vs. fairness) far earlier and gave us the right conversations in sprint reviews.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>4. Backlog hygiene: Separate model work, data work, and control work, but demo them together<\/strong><\/h2>\n\n\n\n<p>In classic Agile, \u201cas a user I want\u2026\u201d stories can blur model training, data remediation, and governance hardening into one mega\u2011ticket. We split the work into three tracks, each with its own cadence, but we demo them together:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Model work: features, loss curves, challenger results, drift signals.<\/li>\n\n\n\n<li>Data work: pipeline health, lineage notes, labeled edge cases added this sprint.<\/li>\n\n\n\n<li>Control work: audit events shipped, sampling policy, access scopes.<\/li>\n<\/ul>\n\n\n\n<p>One sprint review equals one narrative: what we changed in the model, what data moved because of it, and how controls evolved. Executives track value; auditors see traceability; teams see cause and effect.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>5. Make feedback frictionless (and visible) inside the tooling people already use<\/strong><\/h2>\n\n\n\n<p>In one hospital rollout, we had brilliant feedback buried in SharePoint and email threads. We moved it into the systems where work happened: Salesforce comments, case objects, and lightweight review forms. We also surfaced \u201cTop 5 feedback themes\u201d on an SPM dashboard that leadership already reviewed weekly.<\/p>\n\n\n\n<p>That closed the loop. Instead of asking clinicians and support teams to <em>go somewhere else<\/em> to be heard, we met them in the flow of work. Model quality improved <em>because the feedback finally did<\/em>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>6. When to say \u201cno\u201d to a model<\/strong><\/h2>\n\n\n\n<p>A counterintuitive lesson: there are moments where the most Agile move is not to build a model. 
If the data-generating process is unstable (e.g., new workflow, new form fields, evolving taxonomy), first stabilize the process with automation or explicit rules. Then invite an AI challenger. Otherwise, you\u2019ll teach your model yesterday\u2019s chaos and call it learning.<\/p>\n\n\n\n<p>We now use a simple guardrail: If we cannot articulate a stationary slice of the process for 4\u20136 weeks, we don\u2019t train, we instrument. That instrumentation later becomes gold for training.<\/p>\n\n\n\n<p><strong>A lightweight backbone you can copy<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>DoD upgrade<\/strong>: outcome metric + guardrail + challenger check + counterfactual example.<\/li>\n\n\n\n<li><strong>User map<\/strong>: end user, data steward, reviewer, control owner\u2014each with an in\u2011tool feedback path.<\/li>\n\n\n\n<li><strong>Slice by decision<\/strong>: one real decision, end\u2011to\u2011end, with rationale and override.<\/li>\n\n\n\n<li><strong>Three\u2011track backlog<\/strong>: model \/ data \/ control, demoed as one story of change.<\/li>\n\n\n\n<li><strong>In\u2011flow feedback<\/strong>: capture and summarize where people already work.<\/li>\n\n\n\n<li><strong>Instrument before you model<\/strong>: only learn from processes stable enough to teach.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>What this work has taught me<\/strong><\/h2>\n\n\n\n<p>Working on AI inside fast\u2011moving, highly regulated environments hasn\u2019t made me more certain about the \u201cright way\u201d to do things. If anything, it\u2019s humbled me. I\u2019ve had late\u2011night standups where the model drifted without warning, sprint reviews where a small insight changed the entire roadmap, and moments where I caught myself asking, \u201c<em>Are we even solving the right problem?\u201d<\/em><\/p>\n\n\n\n<p>What I\u2019ve learned, often the hard way, is that the teams that thrive treat learning as the real product. 
Controls become features, and decisions become tiny, testable slices of truth. When we redefine \u201cdone\u201d and \u201cuser\u201d through that lens, the model gets better one sprint at a time, and the organization starts trusting it enough to scale.<\/p>\n\n\n\n<p>So, if you\u2019ve ever paused mid\u2011project and questioned, \u201c<em>Is this still working?<\/em>\u201d while building AI in a multi\u2011system, highly regulated world, you\u2019re not alone. That question is usually where the real conversation begins. I\u2019d love to hear how you\u2019re navigating it, too.<\/p>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Why classic Agile breaks down with AI and what actually works instead, from redefining \u201cdone\u201d and \u201cuser\u201d to slicing by decisions, evidence, and controls in regulated environments.<\/p>\n","protected":false},"author":8142433,"featured_media":8103891,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_tec_requires_first_save":true,"_EventAllDay":false,"_EventTimezone":"","_EventStartDate":"","_EventEndDate":"","_EventStartDateUTC":"","_EventEndDateUTC":"","_EventShowMap":false,"_EventShowMapLink":false,"_EventURL":"","_EventCost":"","_EventCostDescription":"","_EventCurrencySymbol":"","_EventCurrencyCode":"","_EventCurrencyPosition":"","_EventDateTimeSeparator":"","_EventTimeRangeSeparator":"","_EventOrganizerID":[],"_EventVenueID":[],"_OrganizerEmail":"","_OrganizerPhone":"","_OrganizerWebsite":"","_VenueAddress":"","_VenueCity":"","_VenueCountry":"","_VenueProvince":"","_VenueState":"","_VenueZip":"","_VenuePhone":"","_VenueURL":"","_VenueStateProvince":"","_VenueLat":"","_VenueLng":"","_VenueShowMap":false,"_VenueShowMapLink":false,"_tribe_blocks_recurrence_rules":"","_tribe_blocks_recurrence_description":"","_tribe_blocks_recurrence_exclusions":"","ep_exclude_from_search":false,"_jf_limit_responses":"","footnotes":""},"categorie
s":[883,908,906],"tags":[],"content_source":[],"class_list":["post-8103881","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-mindset","category-process","category-technology"],"acf":[],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/agilealliance.org\/wp-json\/wp\/v2\/posts\/8103881","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/agilealliance.org\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/agilealliance.org\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/agilealliance.org\/wp-json\/wp\/v2\/users\/8142433"}],"replies":[{"embeddable":true,"href":"https:\/\/agilealliance.org\/wp-json\/wp\/v2\/comments?post=8103881"}],"version-history":[{"count":9,"href":"https:\/\/agilealliance.org\/wp-json\/wp\/v2\/posts\/8103881\/revisions"}],"predecessor-version":[{"id":8104705,"href":"https:\/\/agilealliance.org\/wp-json\/wp\/v2\/posts\/8103881\/revisions\/8104705"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/agilealliance.org\/wp-json\/wp\/v2\/media\/8103891"}],"wp:attachment":[{"href":"https:\/\/agilealliance.org\/wp-json\/wp\/v2\/media?parent=8103881"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/agilealliance.org\/wp-json\/wp\/v2\/categories?post=8103881"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/agilealliance.org\/wp-json\/wp\/v2\/tags?post=8103881"},{"taxonomy":"content_source","embeddable":true,"href":"https:\/\/agilealliance.org\/wp-json\/wp\/v2\/content_source?post=8103881"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}