
Conversation with an Expert: A Practical Guide to Building an Incremental Multi-Agent AI System for Business Process Automation

  • Writer: Lyndsay Yerbic
  • Oct 3
  • 6 min read

Lyndsay Yerbic, Head of Generative AI Solution Architecture, and Pier Paolo Ippolito, Generative AI Solution Architect. Originally posted on Medium.


The AI topic du jour is agentic automation for business processes. It’s no surprise. We all share the dream of building an army of intelligent, flexible digital workers to handle the tedious stuff, freeing us humans to tackle the more creative, high-value problems.

Large Language Models (LLMs) make this dream feel incredibly close. But as anyone who’s actually tried to build a production-grade agent will tell you, it isn’t magic. Building a truly useful AI system is a journey of careful planning and deliberate, step-by-step work — a lot like building any other piece of serious software.

To get a real-world perspective, I asked Pier, a member of our GenAI Solutions Architect team, to share his playbook. He has spent years in the trenches turning messy manual workflows into slick, automated, multi-agent systems. What he shared was a practical, phased approach that anyone can use to move from idea to implementation.

He has graciously agreed to share his framework below.

(For a deeper dive into the theory, we highly recommend reading Agentic Design Patterns.)


From Messy Workflows to Multi-Agent Systems: A 4-Phase Playbook

The promise of agentic AI is transformative, but realizing that promise requires discipline. In my work as a GenAI Solutions Architect, I’ve seen many projects get stuck in “pilot purgatory” because they leap to a technical solution before deeply understanding the problem. The most successful initiatives are not about magic; they are about methodical engineering.

To guide this process, I rely on a four-phase playbook that moves from foundational discovery to a robust, value-driven system. It treats agent development not as a science experiment, but as a core engineering discipline.


Phase 1: The Detective Work — Finding the Right Problem

Your first job isn’t writing code. It’s to observe. Before you even think about AI, you have to become an expert on the human process you want to improve.

The goal here is to understand a manual workflow so intimately that its cracks and opportunities become obvious.

Book a workshop with the Subject Matter Expert (SME) — the person who lives this workflow every day. Then, just shadow them. Record everything: their screen, their clicks, the documents they open, even the sighs of frustration. You’re trying to create a perfect, high-fidelity map of how work gets done right now.

Once you have that map, filter potential projects through three key questions:

  1. What’s the real impact? Sure, saving time and money is great, but can this unlock new revenue streams or create a capability that doesn’t exist today?

  2. Is this actually doable? Be brutally honest about what’s possible with today’s technology. Your first project needs to be a clear win, not a wild science experiment.

  3. Can we do this again? Look for a solution or pattern you can reuse in other parts of the business. The best projects create a template for future success.

You’re looking for the sweet spot: a project with high, measurable value but only moderate technical difficulty.

And here’s my most crucial piece of advice for this phase:

Don’t just try to slap an AI on top of what you’re already doing. The most powerful solutions are ‘agentic-first.’ They rethink the problem from the ground up and ask, ‘How would a team of intelligent agents solve this?’ The answer is often completely different from the human process — and far more effective.


Phase 2: The Monolith — Build One Thing Well

Once you’ve picked your problem, the temptation is to start architecting a massive, beautiful system. Resist that urge. Please. Just start with a single, simple component that proves the core idea works.


I call this the monolithic prototype. It isn’t meant to be pretty. It’s a quick-and-dirty test, often just a single, long prompt that tells one agent to perform a sequence of steps and gives it all the necessary tools. The only goal is to solve 80% of the core problem in the simplest way possible. You need to prove the approach is viable before you invest significant time and money.
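
To make that concrete, here's a minimal sketch of what a monolithic prototype can look like, assuming the google-genai Python SDK. The model name, the prompt contents, and the run_prototype helper are all illustrative placeholders for your own workflow:

```python
# A deliberately quick-and-dirty monolithic prototype:
# one model call, one long prompt that spells out every step.
from google import genai

client = genai.Client()  # expects an API key in the environment

MONOLITH_PROMPT = """\
You are a research assistant. Work through these steps in order:
1. Summarise the attached quarterly report.
2. Extract every figure mentioned alongside a revenue number.
3. Flag anything that contradicts last quarter's summary.
4. Produce a one-page brief for a non-technical reader.

Quarterly report:
{report}

Last quarter's summary:
{previous_summary}
"""

def run_prototype(report: str, previous_summary: str) -> str:
    response = client.models.generate_content(
        model="gemini-2.5-pro",  # placeholder; use whatever model you have access to
        contents=MONOLITH_PROMPT.format(
            report=report, previous_summary=previous_summary
        ),
    )
    return response.text
```

No orchestration framework, no persistence, no error handling: just enough to find out whether the model can do the job at all.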

At this stage, you face a choice: use a boilerplate (like an agent starter kit) or build from scratch. Honestly, if your task is fairly self-contained and doesn’t need deep cloud integration, starting from scratch is often cleaner. You avoid inheriting a bunch of code you don’t need.

This is also the time to start sketching out the future. I’m a big fan of using a mind map or a digital whiteboard (like Excalidraw). Put the main goal in the middle. Branch out with the absolute must-have capabilities for your prototype. Then, on totally separate branches, brainstorm all the ‘nice-to-have’ features — a news search, financial analysis, whatever. Write them down, and then park them. You’ll come back to them, I promise.


Phase 3: The Assembly Line — From One Agent to a Team

Eventually, your brilliant, do-it-all monolithic agent will start to stumble.

It’s like a star employee who’s been given way too many jobs. The prompt gets too long, and the agent starts forgetting instructions or getting confused. That’s your cue. It’s time to bring in the specialists.

This is where you break down your monolith into a “microservices-style” system of specialized agents that work together. The trick is to find the “natural division of labor.” Think about the distinct skills required. You might have a Data Scraper Agent that pulls info from APIs, a Fact Checker Agent that verifies it, and a Writer Agent that turns it all into a clean report. Each one is an expert at its single task.

So, how do you organize this new digital team? I rely on a few common patterns (there's a short code sketch of the assembly line after this list):

  • For step-by-step tasks, a simple assembly line is perfect. Agent A does its job, then hands its work to Agent B, and so on.

  • For complex decisions, a hierarchy is better. An agent trying to choose between 50 different user intents will fail. Instead, have a main Orchestrator Agent make one big decision — like routing a request to either the ‘Billing’ team or the ‘Support’ team. That sub-agent can then handle the more specific choices. You want to build deep, not wide.

  • For repetitive work, you can build loops. An agent can keep refining a draft or searching for data until it hits a certain quality bar.

  • For gathering info, you can have agents work in parallel. If you need data from three different sources, three agents can fetch it simultaneously, then pass their findings to a fourth agent to synthesize the results.

When you build this team, you have to ensure they can communicate effectively. Use the right tool for the job. Your orchestrator might be a powerful, top-tier model like Gemini 2.5 Pro, while the specialists can be smaller, faster models. Crucially, you must give them a shared memory — a common scratchpad where one agent can leave notes for the others.
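
Here's a minimal sketch of the assembly-line pattern with a shared scratchpad, assuming the Google Agent Development Kit's Python package (google-adk); the agent names, instructions, and model choices are illustrative, so check the ADK docs for the current interface:

```python
# A minimal "assembly line" of specialist agents with a shared scratchpad,
# sketched with the Google Agent Development Kit (pip install google-adk).
from google.adk.agents import LlmAgent, SequentialAgent

# Specialists can run on smaller, faster models.
scraper = LlmAgent(
    name="data_scraper",
    model="gemini-2.5-flash",
    instruction="Fetch the raw facts needed for the user's request and list them plainly.",
    output_key="raw_facts",  # result is written to shared session state
)

fact_checker = LlmAgent(
    name="fact_checker",
    model="gemini-2.5-flash",
    # {raw_facts} is filled in from the shared state left by the previous agent.
    instruction="Verify each item in {raw_facts}. Drop anything you cannot confirm.",
    output_key="verified_facts",
)

# The final, user-facing step can justify a more capable model.
writer = LlmAgent(
    name="writer",
    model="gemini-2.5-pro",
    instruction="Turn {verified_facts} into a clean, well-structured report.",
)

# SequentialAgent runs the sub-agents in order: A hands its work to B, and so on.
pipeline = SequentialAgent(
    name="report_pipeline",
    sub_agents=[scraper, fact_checker, writer],
)
```

The output_key fields are the shared memory in this sketch: each agent leaves its notes in session state, and later agents read them back through the {...} placeholders in their instructions. The same package also ships ParallelAgent and LoopAgent for the fan-out and refinement patterns above.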

One final, often overlooked, tip for orchestrators: you have to state the obvious. An LLM has no idea what day it is. If you need it to search for recent news, your prompt must literally begin with, “For context, today’s date is 20th of August 2025.”
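
In code, this is a one-liner worth baking into whatever builds your orchestrator's prompt. A minimal sketch in plain Python (the build_orchestrator_prompt helper is illustrative):

```python
# Prepend the current date so the model can reason about "recent" correctly.
from datetime import date

def build_orchestrator_prompt(task: str) -> str:
    today = date.today().strftime("%d %B %Y")
    return f"For context, today's date is {today}.\n\n{task}"

print(build_orchestrator_prompt("Find this week's news on agent frameworks."))
```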

Once this core team is working smoothly, you can finally pull out that old mind map and start adding the “nice-to-have” features, knowing you have a solid foundation to build on.


Phase 4: The Reality Check — Does It Actually Work?

So, you’ve built it. Now for the million-dollar question: Is it any good?

How you evaluate your system depends on the project. For a simple internal tool, testing it against a checklist of examples might be enough. But for a customer-facing system, you need to be much more rigorous.

For those bigger systems, I suggest more advanced methods. You can generate synthetic data to test edge cases, or even use an ‘actor-critic’ pattern, where a second AI agent’s only job is to fact-check and grade the first one. To make this easier, you can use the evaluation utilities in frameworks like the Google Agent Development Kit to run structured tests and track performance over time.
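
To make the actor-critic idea concrete, here is a minimal sketch of a critic that grades another agent's answers, again assuming the google-genai SDK; the rubric, the JSON format, and the grade helper are illustrative, not a prescribed evaluation scheme:

```python
# A minimal actor-critic evaluation: a second model fact-checks and grades the first.
import json

from google import genai
from google.genai import types

client = genai.Client()

CRITIC_PROMPT = """\
You are a strict evaluator. Check every factual claim in the answer below
and rate it. Reply with JSON: {{"score": <1-5>, "issues": [<strings>]}}.

Task: {task}
Answer: {answer}
"""

def grade(task: str, answer: str) -> dict:
    response = client.models.generate_content(
        model="gemini-2.5-pro",  # placeholder model
        contents=CRITIC_PROMPT.format(task=task, answer=answer),
        config=types.GenerateContentConfig(response_mime_type="application/json"),
    )
    return json.loads(response.text)

# Run the critic over a small (possibly synthetic) test set and track the scores.
test_cases = [("Summarise Q2 revenue drivers.", "Revenue grew 12% on ...")]
scores = [grade(task, answer)["score"] for task, answer in test_cases]
print(f"mean score: {sum(scores) / len(scores):.2f}")
```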

But the ultimate test is, and always will be, human.

Before you pop the champagne, give the tool to the SME you shadowed way back in Phase 1. Let them use it, break it, and give you honest feedback. That’s the one insight you can’t get from any automated test. Does this actually make their day easier? Does it solve the real problem?

If the answer is yes, then you’ve done it. You haven’t just built a piece of tech; you’ve built a real solution. And that’s where the journey truly begins.