Building Joule Agents That Actually Ship

There is no shortage of AI agent demos. Polished slide decks, smooth conference walkthroughs, a chatbot that answers a question about an open purchase order. What’s harder to find is an honest account of what it takes to go from that demo to a fleet of agents running against a real S/4HANA tenant, in production, that a business user will actually trust.

This is that account.

Over the past several months, my team at Mindset built 23 deployable SAP Joule agents covering procurement, order-to-cash, warehouse management, plant maintenance, finance close, SAP Business Data Cloud (BDC), and platform administration. They run on SAP BTP Cloud Foundry against an S/4HANA 2023 on-premises system. They’re real. They ship. And building them taught us things that no SAP documentation prepared us for.

What We Built

Two GitHub repositories carry the platform. The first — com.mindset.joule.directory — contains the agents themselves: 23 deployable MTARs, each scoped to a real business workflow with standardized documentation. The second — com.mindset.mcp.servers — is the tools layer: 9 domain-scoped MCP servers exposing S/4HANA OData APIs to Joule Studio, totaling 132 live tools.

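The architecture is straightforward on paper: a Joule agent built in Joule Studio calls a domain-scoped MCP server on BTP Cloud Foundry, which reaches the S/4HANA OData APIs through a BTP destination and SAP Cloud Connector.

Getting there cleanly took considerably more iteration than that one-line summary suggests. The agents split into two grounding patterns, and that distinction is intentional.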

Tool-grounded agents — the majority of the fleet — connect to live S/4HANA data via MCP servers. Every transactional workflow lives here: procurement, order-to-cash, warehouse, plant maintenance, finance close. The agent’s answers are only as stale as the system of record.

Document-grounded agents use Joule Studio’s native document grounding against product documentation. The BDC Administration Agent guides users through Datasphere and BDC formations — spaces, users, roles, connections — grounded in official SAP content. The BDC Story Creation Assistant does the same against the SAP Analytics Cloud User Guide. No live system needed; the answer lives in the doc.

Right tool for the right job. If the answer is “what’s the current stock level,” you need a live API call. If the answer is “how do I configure a space in Datasphere,” you need a well-chunked document and a good retrieval layer. Mixing those patterns — or forcing everything through one — is how you build agents that are mediocre at both.

Lesson 1: The Tool Count Problem Nobody Warns You About

Early in the build, we made a mistake that I now consider the most predictable trap in Joule agent development: we put too many tools in a single MCP server.

The symptoms are subtle at first. The agent gets slower. Tool selection becomes inconsistent — the right tool exists, but the model doesn’t always reach for it. At high tool counts, you start seeing the agent either stall or reach for the wrong operation entirely.

The root cause is how large language models handle tool selection when the option space is large. More choices create more ambiguity. The model has to reason over every available tool on every turn, and when a server exposes 50 or 80 or 100 tools, that reasoning degrades. In our experience, tool selection stayed reliable when each server exposed fewer than roughly 30 tools.

Our fix was strict domain scoping. Each MCP server covers exactly one business domain, with tool counts ranging from 8 to 24. The procurement server handles PRs and POs. The sales server handles sales orders, delivery blocks, and customer returns. The finance server handles GL close and intercompany reconciliation. Nothing bleeds across boundaries.

The result: faster agent response, dramatically more consistent tool selection, and a codebase that’s easier to maintain and extend.
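
The scoping rule above can be enforced mechanically, for example as a check that runs before a server is deployed. The following is a minimal sketch, not our actual tooling: the registry layout, the tool names, and the `check_server_boundaries` helper are all hypothetical, and the limit of 30 is the approximate ceiling mentioned earlier.

```python
from collections import Counter

# Approximate ceiling from the text; the fleet itself stays in the 8 to 24 range.
MAX_TOOLS_PER_SERVER = 30

# Hypothetical registry: MCP server name -> tool names. Names are illustrative only.
SERVERS = {
    "procurement": ["list_purchase_requisitions", "get_purchase_order"],
    "sales": ["get_sales_order", "list_delivery_blocks", "list_customer_returns"],
    "finance": ["get_gl_close_status", "list_intercompany_differences"],
}


def check_server_boundaries(servers, max_tools=MAX_TOOLS_PER_SERVER):
    """Return a sorted list of violations; an empty list means the boundaries hold."""
    problems = []
    # Rule 1: no server exposes more tools than the ceiling.
    for name, tools in servers.items():
        if len(tools) > max_tools:
            problems.append(f"{name}: {len(tools)} tools exceeds limit of {max_tools}")
    # Rule 2: no tool is registered in more than one domain.
    owners = Counter(tool for tools in servers.values() for tool in set(tools))
    for tool, count in owners.items():
        if count > 1:
            problems.append(f"{tool}: registered in {count} servers")
    return sorted(problems)


if __name__ == "__main__":
    for problem in check_server_boundaries(SERVERS):
        print(problem)
```

A check like this is cheap to run in CI, and it makes the domain boundary a property of the build rather than a convention people have to remember.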

If you take one thing from this post: design your MCP server boundaries around business domains, not technical convenience. Keep tool counts tight. An agent that picks the right tool every time is worth more than one that theoretically can do everything.

Lesson 2: Identity, Trust, and the Principal Propagation Gap

One of the more nuanced challenges we encountered involves user identity. When a Joule agent calls an MCP server, the call is authenticated under a service identity — not the identity of the business user initiating the conversation.

This matters for principal propagation. In a well-governed SAP landscape, the system should know who is taking an action, not just what service is acting. This affects audit trails, authorization checks, and — critically — the degree to which you can trust agent-initiated transactions.

We ran into this directly when working with BTP destinations connecting to SAP Business Data Cloud and Datasphere, where the absence of user context in the downstream call is a real architectural constraint.

This is an evolving area across the industry, not a problem unique to SAP or Joule. MCP as a protocol is relatively young, and enterprise identity propagation through agentic layers is something the broader ecosystem is actively working through. Our current design accounts for this by treating agent-initiated write operations as requiring an explicit human approval step — which we’ll come back to in a moment.

Additionally, we learned early that in partner and customer sandbox environments, certain Joule Studio features (specifically native skill integration with BTP destinations) depend on entitlements that vary by tenant and licensing configuration. Rather than treating this as a blocker, we found that MCP servers are actually the stronger architectural pattern regardless: they’re composable, version-controllable, locally testable, and deployable via standard Cloud Foundry tooling. What started as a constraint became our preferred approach.

Lesson 3: Why the “Friday–Monday” Test Matters

Here is the standard I actually care about: a business user reviews an agent’s recommendation on Friday afternoon. Do they trust it enough to act on Monday morning?

That trust doesn’t come from the AI model. It comes from the design of the data layer underneath it.

Every agent in our fleet reads from S/4HANA through official OData APIs — the same APIs SAP exposes to Fiori applications. No custom RFCs, no direct database reads, no shadow data. The data the agent sees is the same data the system of record holds.

Traffic flows through BTP destinations and SAP Cloud Connector, meaning every call follows the same encrypted, auditable path your other integrations use. The MCP server layer is stateless: it translates the agent’s tool call into an OData request and returns structured data. There’s no caching, no interpretation, no transformation that could introduce drift.
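
To make "stateless translation" concrete, here is a minimal sketch of what the core of one read-only tool could look like. It is an illustration under assumptions, not our server code: the function name and argument keys are hypothetical, and the purchase order service path, entity set, and field names are assumed for the example. The function only builds the relative OData URL; sending it through the destination is left out.

```python
from urllib.parse import quote, urlencode

# Assumption: an OData V2 purchase order service and entity set. Swap in whatever
# service the BTP destination actually points at.
SERVICE_PATH = "/sap/opu/odata/sap/API_PURCHASEORDER_PROCESS_SRV"
ENTITY_SET = "A_PurchaseOrder"


def _literal(value):
    """Render a value as an OData string literal (single quotes are doubled)."""
    return "'" + str(value).replace("'", "''") + "'"


def build_odata_request(arguments):
    """Translate structured tool arguments into a relative OData query URL.

    Pure function: no cache, no session, nothing remembered between calls.
    """
    filters = []
    if "supplier" in arguments:
        filters.append(f"Supplier eq {_literal(arguments['supplier'])}")
    if "company_code" in arguments:
        filters.append(f"CompanyCode eq {_literal(arguments['company_code'])}")

    # Always cap the result size so one tool call cannot pull a whole table.
    params = {"$format": "json", "$top": str(int(arguments.get("top", 20)))}
    if filters:
        params["$filter"] = " and ".join(filters)

    # Keep "$" and "'" readable; spaces become %20.
    query = urlencode(params, quote_via=quote, safe="$'")
    return f"{SERVICE_PATH}/{ENTITY_SET}?{query}"


if __name__ == "__main__":
    print(build_odata_request({"supplier": "17300001", "top": 5}))
```

Because the function depends only on its arguments, the same tool call always produces the same request, which is what lets a user reproduce the agent's result against the system directly.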

When the agent surfaces a recommendation (flag this PR for risk, release this delivery block, escalate this invoice discrepancy), it’s showing the user real S/4HANA data in a structured, consistent format. The user can verify it against the system directly. That’s the foundation of trust.

We also made a deliberate design choice for any operation that writes back to SAP: the agent recommends, the user approves. The agent is an accelerator for human judgment, not a replacement for it. That framing matters to the business stakeholders who own these processes.
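
The "agent recommends, user approves" rule can be expressed as a small gate in front of every write. The sketch below is one possible shape, not our implementation: the `Recommendation` type, `approve`, `execute_write`, and the `writer` callback are hypothetical names, and the callback stands in for whatever actually posts the change to S/4HANA.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional


class ApprovalRequired(Exception):
    """Raised when a write is attempted without a named human approver."""


@dataclass(frozen=True)
class Recommendation:
    action: str                        # e.g. "release_delivery_block"
    payload: dict                      # arguments the write would send
    approved_by: Optional[str] = None  # approver's user ID, None until approved


def approve(rec: Recommendation, user_id: str) -> Recommendation:
    """Return a copy of the recommendation stamped with the approving user."""
    if not user_id.strip():
        raise ValueError("approver user ID must not be empty")
    return replace(rec, approved_by=user_id)


def execute_write(rec: Recommendation, writer: Callable[[str, dict], str]) -> str:
    """Run the write only if a human has approved it."""
    if rec.approved_by is None:
        raise ApprovalRequired(f"{rec.action} needs user approval before it runs")
    return writer(rec.action, rec.payload)
```

Stamping the approver on the recommendation also leaves a record of which business user signed off, which partly compensates for the call itself running under a service identity.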

What the BTP Integration Layer Actually Gives You

One thing that doesn’t get enough credit: BTP as an integration platform makes this architecture significantly more tractable than it would be otherwise.

Destinations abstract the S/4 connection. Cloud Connector handles the network boundary to on-prem without requiring inbound firewall rules. OAuth2 client credentials flow is managed at the platform level — the MCP server doesn’t handle secrets directly.

We also automated destination provisioning. Every MCP server’s mta.yaml now declares a module that creates its required BTP destination on cf deploy. No manual cockpit configuration, no onboarding steps that can be skipped. The destination exists when the server does.

That kind of repeatable, infrastructure-as-code approach is what lets a team of architects move across 23 agents without the operational overhead compounding.

What’s Next

The V4 warehouse server targeting SAP EWM is scaffolded and built; deployment is the next step. We’re also watching the identity propagation space closely. As Joule Studio and the broader MCP ecosystem mature, we expect that gap to close.

The MAccelerate agent directory is a living platform, not a project deliverable. The business domains covered today are a starting point. The patterns we established — domain-scoped servers, official API surfaces, human-in-the-loop write operations, automated infrastructure — are designed to extend.

If you’re an SAP customer or partner evaluating Joule agents for your landscape, the most important thing I can tell you is this: the technology is ready enough to ship real value today. But the teams that succeed are the ones who treat agent design with the same discipline they’d apply to any enterprise integration. Scope tightly. Instrument your trust signals. Keep humans in the loop on writes. And watch your tool counts.
