The 3 Phases of Enterprise AI Implementation

Most enterprise AI implementations fail. Here's the 3-phase sequence that actually gets AI into production.
MIT reports that 95% of generative AI pilots never reach production. In our experience, the reason is straightforward: organizations try to build before they know what to build, or before they have identified the right people to build it with. Part of the problem is that AI's usefulness is genuinely hard to predict in advance. A 2025 METR study found that experienced software developers were 19% slower on tasks when using AI than without it.
If AI's impact is this hard to predict in software development, you won't be able to predict it in your organization either. Why is AI's ability to automate certain tasks so unpredictable? A 2023 Harvard/BCG study of 758 consultants offers an explanation: AI performance follows a "jagged technological frontier." People can't intuitively predict which tasks AI will excel at versus fail at, and they systematically over-trust AI where it's weakest and under-use it where it's strongest.
For example, ChatGPT can reason through graduate-level problems yet stumbles on a task as simple as counting letters ("how many r's are in the word strawberry?"). You wouldn't expect a PhD-educated human to struggle with a task like that.
The takeaway is that experimentation, not intuition, is the only reliable way to discover where AI adds value for your specific workflows. Experimentation is not a precursor to real work - it is the real work. In our experience, effective enterprise AI adoption follows this approximate sequence:
- Phase 1 - Sandbox - puts consumer AI tools directly in front of the people doing the work, generating a use case inventory grounded in actual workflows and identifying the champions who will carry the next phase.
- Phase 2 - Prototype - pairs those champions with AI engineers in an iterative development loop, where domain experts define what "correct" means and document real business logic directly into working software.
- Phase 3 - Production - deploys validated, purpose-built tools across the organization through conventional change management, made achievable by the proven use cases and internal champions that Phases 1 and 2 produced.
Let’s talk about these 3 phases and how they build upon each other.
Phase 1 - Sandbox
Consumer AI tools like ChatGPT, Copilot, Claude, and Gemini are the best place to start your team's AI journey. These tools provide a chat interface for users to interact with AI models, agents, and software built and maintained by the frontier AI labs. For the most part, we do not give these platforms direct access to our organization's internal data (hence "sandboxed").
Unlike technologies of the past, AI adoption needs to happen from the ground up. Instead of commissioning a study or forming a committee, firms should put consumer AI tools directly in front of the people doing the work and observe what happens. These tools are readily available and cost $20 - $50 per seat.
Give the AI to a carefully-selected pilot group, provide introductory training, and observe their initial usage. Your change management team should be collecting the use cases that surface organically - weekly or twice-monthly touchpoints work well - focusing on both successes and challenges of pilot users. Track usage and metrics through the platform (consumer AI tools typically have an admin analytics dashboard) and send regular surveys to pilot teams. Identify the pilot members who are engaging the most and generating the most valuable feedback; these users will become some of your most important allies through the AI change management process.
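As a minimal sketch of the "identify your most engaged pilot members" step: assuming a hypothetical export from your platform's admin analytics dashboard with per-user, per-period message counts (the column names here are illustrative, not any vendor's actual schema), ranking candidates for champion status might look like:

```python
from collections import defaultdict

def top_champions(usage_rows, n=5):
    """Rank pilot users by total activity; each row is a dict with
    'user' and 'messages' keys, standing in for a dashboard export."""
    totals = defaultdict(int)
    for row in usage_rows:
        totals[row["user"]] += int(row["messages"])
    # Highest total activity first
    return sorted(totals, key=totals.get, reverse=True)[:n]

# In-memory example standing in for a real analytics export
usage = [
    {"user": "alice", "messages": "42"},
    {"user": "bob", "messages": "7"},
    {"user": "alice", "messages": "31"},
]
print(top_champions(usage, n=2))  # ['alice', 'bob']
```

Raw activity is only a starting signal; pair it with the qualitative feedback from your touchpoints before naming champions.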
To summarize, the sandbox phase does two things simultaneously. First, it generates a use case inventory that is grounded in actual workflows rather than theoretical potential. Second, and equally important, it identifies your champions - those employees leading AI adoption through active experimentation and use case discovery.
Phase 2: Prototype
The most consequential participant in an AI development project is not the software engineer. It is the professional who has spent years accumulating the judgment, pattern recognition, and domain expertise that defines how your organization operates.
Historically, there was a lot of distance between the engineers who built the product and the people who used the product. AI has collapsed this distance. For the first time, the person who knows the business logic can encode it directly into a working software system using plain language (prompts). This requires a tighter integration of your software and non-software teams.
What This Looks Like in Practice
Let’s consider the development of an automated invoice processing pipeline. The goal isn’t just "extracting text," but reaching 99.99% accuracy of data extraction across a chaotic variety of vendor formats.
To start this process, the AI Engineer writes the initial pipeline, which is a series of calls to an LLM, database, and/or API to transform the raw input (an invoice) into structured data (the way your internal systems understand and store that invoice). The system only gets good through evaluation, and evaluation is where the SME becomes indispensable.
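A minimal sketch of such a pipeline, with a stubbed-out `call_llm` function standing in for a real provider API call (the prompt wording and field names here are illustrative assumptions, not a production design):

```python
import json

def call_llm(prompt: str) -> str:
    """Stand-in for a real LLM API call; returns JSON text.
    In a real pipeline this would call the provider's client library."""
    return '{"vendor": "Acme Corp", "invoice_number": "INV-001", "total": 1250.00}'

EXTRACTION_PROMPT = (
    "Extract vendor, invoice_number, and total from the invoice below. "
    "Respond with JSON only.\n\nInvoice:\n{invoice_text}"
)

def process_invoice(invoice_text: str) -> dict:
    """Transform raw invoice text into the structured record
    your internal systems expect."""
    raw = call_llm(EXTRACTION_PROMPT.format(invoice_text=invoice_text))
    # In practice, validate and normalize before writing to a database
    return json.loads(raw)
```

The real work is not this scaffolding; it is deciding what "correct" output looks like, which is the SME's contribution below.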
The SME defines the rubric against which our AI pipeline's output will be evaluated, a process that can be thought of as creating an "answer key." The SME builds the "golden dataset", which is a curated collection of real invoices with their corresponding answer key. This data acts as the source of truth for our software’s development.
This is a tight iterative loop: the SME evaluates the output and improves the rubric, the AI Engineer applies the logic to the system architecture, and together they converge on something that actually works. Each cycle encodes more of the organization's real business logic into the system. Evaluating an AI pipeline is an academic field of its own, and we’ll cover that in future publications.
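The evaluation half of that loop can be sketched as a field-by-field comparison against the SME's answer key (the field names and records here are hypothetical; real evals are considerably richer than exact-match scoring):

```python
def field_accuracy(golden, predictions):
    """Score pipeline output against the SME's golden dataset.
    Both arguments are parallel lists of dicts; each golden dict
    is the answer key for the corresponding prediction."""
    correct = total = 0
    for answer_key, predicted in zip(golden, predictions):
        for field, expected in answer_key.items():
            total += 1
            correct += int(predicted.get(field) == expected)
    return correct / total

# One invoice: vendor extracted correctly, total wrong
golden = [{"vendor": "Acme Corp", "total": 1250.00}]
predictions = [{"vendor": "Acme Corp", "total": 1200.00}]
print(field_accuracy(golden, predictions))  # 0.5
```

Each mismatch the metric surfaces is a prompt for the loop: the SME decides whether the rubric or the pipeline is wrong, and the engineer encodes the fix.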
This is why Phase 1 (the Sandbox) is non-negotiable. Bespoke development requires "champions" with genuine AI literacy - people who understand model limitations, can evaluate outputs against real-world experience, and can spot the edge cases that separate a viable tool from an expensive toy. In this phase, the SME is no longer a "user" providing feedback; they are a technical collaborator defining what "correct" means.
Phase 3: Production
By Phase 3, your AI tool or pipeline is ready for production. The nature of the challenge has shifted:
- ✅ you have validated use cases
- ✅ you have purpose-built tools
- ✅ you have a cohort of champions who understand AI deeply
What you now have to do is roll those tools out across the organization, and this is a problem that looks like conventional software change management.
Phase 3 is about deploying a finished product to a broad user base: training staff, integrating the tools into existing workflows, measuring adoption, and iterating on what isn't working. In some cases that means automating a process end-to-end. In others it means giving an entire department a capability they didn't have before. The technology question is largely settled by this point. The work is organizational.
The important distinction is that Phase 3 change management is significantly more tractable than the adoption challenge in Phase 1. In Phase 1, you are asking people to explore a tool whose value is unproven and whose application to their work is undefined. In Phase 3, you are asking people to adopt a specific tool that solves a specific problem - and you have champions inside the organization who can demonstrate, credibly, that it works.
Conclusion
The phases form a dependency chain. The sandbox generates use cases and champions. Those champions generate software that encodes real business logic. That software gives Phase 3 something worth deploying at scale.
A community-based approach to AI integration has countless benefits, but one deserves special mention: it partners with employees at all levels for the rollout of a technology that has, at least in the media, threatened massive job losses. Involving employees in the discussion of AI integration builds trust in the technology, and will ultimately lead to faster integration across your organization.

Jacob Coccari