New: Turn locked documents into grounded data for Agentforce.

Transform scanned documents into AI-ready data

Agentforce is only as good as the data it's grounded on. We turn PDFs, scans, and attachments into governed records in Data Cloud.

MuleSoft Anypoint IDP or Data Cloud Document AI: choosing the right path on volume, cost, governance, and where you already sit on the Salesforce platform.

A real architecture review, not a default to whatever was demoed last.

Unstructured
  • Medical record bundles
  • Loss runs & submissions
  • Certificates of insurance
  • Broker correspondence

Extraction engine

Structured in Salesforce (Data Lake Object · Submission record)
  • Policy number: WC-2026-04417
  • Named insured: Northwind Logistics
  • Coverage: Workers' Comp
  • Limit: $1,000,000
  • Effective date: 2026-01-01

Built on the Salesforce platform you already run

Salesforce, Data Cloud · Data 360, MuleSoft Anypoint, Agentforce, Einstein Trust Layer

The opportunity hiding in your documents

Most enterprise data that would make an agent genuinely useful is still locked away. The pattern is the same across insurance and beyond.

The agents are ready, the data isn't

The platform has the agents, the LLMs, and the Trust Layer. What's missing is the data, still locked in PDFs, scans, and email attachments. Grounding agents means freeing that data first.

The data exists, it just isn't structured

Loss runs, claim attachments, submissions, and policy documents already hold the answers. They simply aren't in Salesforce as queryable, governed records your agents and reps can use.

The extraction engine is swappable

The hard part is the pipeline around it, not the engine inside it. Choose IDP or Document AI, and change your mind later. The front door and landing zone stay constant.

The pattern that stays the same

One pipeline. A swappable engine.

Whichever extraction engine you pick, the surrounding architecture looks the same, so the choice stays contained and low-risk.

Document source

Upload, email, fax intake, or integration.

MuleSoft front door

Ingestion, normalization, and splitting under per-file limits.

Swappable

Extraction engine

IDP, Document AI, or both. This is the swappable piece.

Data Cloud Data Lake Object

Structured output, governed by the Einstein Trust Layer.

Salesforce + Agentforce

Records, summaries, and grounded agent context.
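The five stages above can be sketched in a few lines. This is a minimal illustration, not a real integration: the class names, the `extract` method, and the in-memory `landing_zone` list are all hypothetical stand-ins, and no MuleSoft or Data Cloud API is called. The point is only that the pipeline function depends on an interface, so the engine can change without touching the front door or the landing zone.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Document:
    name: str
    content: bytes


class ExtractionEngine(Protocol):
    """Any object with a matching extract() method can be plugged in."""

    def extract(self, doc: Document) -> dict[str, str]: ...


class StubIdpEngine:
    """Hypothetical stand-in for a MuleSoft IDP call (no real API is used)."""

    def extract(self, doc: Document) -> dict[str, str]:
        return {"source": doc.name, "engine": "idp"}


class StubDocumentAiEngine:
    """Hypothetical stand-in for a Document AI call (no real API is used)."""

    def extract(self, doc: Document) -> dict[str, str]:
        return {"source": doc.name, "engine": "document_ai"}


def run_pipeline(
    doc: Document,
    engine: ExtractionEngine,
    landing_zone: list[dict[str, str]],
) -> None:
    """Front door -> engine -> landing zone. Only the engine argument varies."""
    record = engine.extract(doc)
    landing_zone.append(record)  # stands in for a write to a Data Lake Object


landing_zone: list[dict[str, str]] = []
doc = Document(name="loss_run_001.pdf", content=b"%PDF-")
run_pipeline(doc, StubIdpEngine(), landing_zone)
run_pipeline(doc, StubDocumentAiEngine(), landing_zone)
print(landing_zone)
```

Swapping engines here is a one-argument change, which is the property that keeps the engine decision contained.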

See your numbers

Estimate your document automation ROI

Describe your workload in plain language. Our assistant gathers a few details, then estimates your savings versus manual handling, and tells you whether MuleSoft IDP or Data Cloud Document AI is the cheaper engine for your document profile.

ROI assistant

Powered by PS Advisory

Tell me about your document workload in plain English. For example, "We process about 5,000 loss-run PDFs a month, around 20 pages each, and an analyst spends roughly 15 minutes on each one." I'll ask for anything I still need, then estimate your savings and which engine is cheaper.
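For a sense of the arithmetic behind an estimate like this, here is a rough sketch using the example workload above (5,000 documents a month at 15 minutes each). The hourly cost, the share of manual review that remains after automation, and the annual extraction cost are hypothetical placeholders, not figures from any rate card, and this is not the assistant's actual logic.

```python
def manual_hours_per_year(docs_per_month: int, minutes_per_doc: float) -> float:
    """Analyst hours spent per year on fully manual handling."""
    return docs_per_month * minutes_per_doc / 60 * 12


def estimated_annual_savings(
    docs_per_month: int,
    minutes_per_doc: float,
    hourly_cost: float,               # hypothetical loaded analyst cost
    residual_review_fraction: float,  # hypothetical share of manual work that remains
    annual_extraction_cost: float,    # hypothetical engine spend per year
) -> float:
    """Manual cost minus (remaining review cost + extraction cost)."""
    manual_cost = manual_hours_per_year(docs_per_month, minutes_per_doc) * hourly_cost
    automated_cost = manual_cost * residual_review_fraction + annual_extraction_cost
    return manual_cost - automated_cost


hours = manual_hours_per_year(docs_per_month=5_000, minutes_per_doc=15)
print(hours)  # 15000.0 analyst hours per year

savings = estimated_annual_savings(
    docs_per_month=5_000,
    minutes_per_doc=15,
    hourly_cost=40.0,
    residual_review_fraction=0.2,
    annual_extraction_cost=50_000.0,
)
print(round(savings))
```

With those placeholder inputs, manual handling costs 15,000 hours × $40 = $600,000 a year, so the savings figure is driven mostly by how much human review survives automation.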

The architectures available today

Three credible answers

The right one depends on volume, cost structure, governance, and where the customer already sits on the platform.

MuleSoft Anypoint IDP

The mature, production-proven path

Best fit when

  • Already a MuleSoft customer with vCore capacity
  • High, predictable volume where batch is acceptable
  • Classification across many document types
  • Pipeline feeds multiple systems beyond Salesforce
Cost model

Consumption via Automation Credits at roughly 30 credits per page (Automation Credits 3.0).

Heavy workloads can saturate vCores and force incremental purchases. Output to Data Cloud needs explicit integration work.

Data Cloud Document AI

Salesforce's native, Agentforce-ready path

Best fit when

  • Committed to Data Cloud as the Agentforce foundation
  • Moderate volume or relatively small files
  • Native Agentforce grounding without extra pipeline work
  • Unified governance under the Einstein Trust Layer
Cost model

Consumption via Data Cloud credits at roughly 750 credits per MB under Intelligent Processing. No separate SKU.

20 MB per-file limit; scanned PDFs need rasterizing to JPEG. Per-MB metering means DPI choices flow straight to the bill.
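The 20 MB per-file limit is why the front door splits documents before extraction. A minimal sketch of that step, assuming the per-page sizes of the rasterized pages are already known: the function name and the greedy grouping strategy are illustrative choices, and a real splitter would also need to account for container overhead in the output files.

```python
MAX_FILE_MB = 20.0  # per-file limit quoted above


def split_pages(
    page_sizes_mb: list[float],
    limit_mb: float = MAX_FILE_MB,
) -> list[list[int]]:
    """Greedily group consecutive page indexes so each group fits within limit_mb."""
    chunks: list[list[int]] = []
    current: list[int] = []
    current_size = 0.0
    for index, size in enumerate(page_sizes_mb):
        if size > limit_mb:
            raise ValueError(f"page {index} alone exceeds {limit_mb} MB")
        # Close the current chunk if adding this page would exceed the limit.
        if current and current_size + size > limit_mb:
            chunks.append(current)
            current, current_size = [], 0.0
        current.append(index)
        current_size += size
    if current:
        chunks.append(current)
    return chunks


print(split_pages([8.0, 8.0, 8.0, 3.0, 15.0]))  # [[0, 1], [2, 3], [4]]
```

Because metering is per MB, splitting does not change the total billed size; it only keeps each file under the limit.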

Hybrid / Phased

For mixed workloads and transitions

Best fit when

  • Mixed workload profiles across document types
  • IDP for high-volume, complex-classification work
  • Document AI for newer Agentforce-grounded use cases
  • Customers mid-transition to Data Cloud
Cost model

MuleSoft as the front door; route to one engine or the other by document type and target system.

Two engines to operate and govern. Plan the path to selectively retire IDP routes as Agentforce grows.
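Routing by document type can be as simple as a lookup table at the front door. The sketch below is illustrative only: the document-type keys and their engine assignments are hypothetical, and a real table would come out of profiling your own documents.

```python
# Hypothetical routing table: document type -> extraction engine.
ROUTES = {
    "loss_run": "idp",
    "medical_record_bundle": "idp",
    "certificate_of_insurance": "document_ai",
    "broker_correspondence": "document_ai",
}

DEFAULT_ENGINE = "idp"  # assumption: unknown types fall back to the established path


def route(document_type: str) -> str:
    """Return the engine name for a document type, or the default if unmapped."""
    return ROUTES.get(document_type, DEFAULT_ENGINE)


print(route("certificate_of_insurance"))  # document_ai
print(route("unknown_form"))              # idp
```

Retiring an IDP route later then means changing one entry in the table rather than rebuilding the pipeline.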

Cost modeling done right

Predictable economics you can defend

Each engine meters on a different unit. Once you know which unit drives your cost, the model is simple arithmetic, and the result is an estimate you can defend and budget against.

MuleSoft IDP

Metered by page count

Pages × 30 credits × your rate
  • Straightforward to model from your contracted credit rate, so it's easy to forecast and approve.
  • Plan vCore capacity up front so production volume scales predictably alongside the rest of your roadmap.
  • No separate IDP SKU to manage, since consumption lives entirely within your existing Automation Credits.

Data Cloud Document AI

Metered by file size, not pages

Megabytes × 750 credits × your rate
  • Because cost tracks file size, scan-quality tuning is a direct lever on spend.
  • Measuring your real files produces a precise, defensible estimate instead of guesswork.
  • Right-sizing DPI keeps spend lean; we measure first so you commit with real numbers.
Plan ahead and there are no surprises. Most contracts are sized for baseline platform usage, so we fold new extraction workloads into your credit planning up front, giving you predictable budgeting and full cost visibility from day one.
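The two formulas above can be compared directly. In this sketch the credit counts (30 per page, 750 per MB) are the approximate figures quoted on this page, while the per-credit rates and the workload sizes are hypothetical placeholders. Automation Credits and Data Cloud credits are different units, so each engine takes its own contracted rate.

```python
IDP_CREDITS_PER_PAGE = 30      # approximate figure quoted above
DOC_AI_CREDITS_PER_MB = 750    # approximate figure quoted above


def idp_credits(pages: int) -> int:
    """Automation Credits consumed for a given page count."""
    return pages * IDP_CREDITS_PER_PAGE


def document_ai_credits(megabytes: float) -> float:
    """Data Cloud credits consumed for a given total file size."""
    return megabytes * DOC_AI_CREDITS_PER_MB


def cheaper_engine(pages: int, megabytes: float,
                   idp_rate: float, doc_ai_rate: float) -> str:
    """Compare cost for the same workload; rates are price per credit."""
    idp_cost = idp_credits(pages) * idp_rate
    doc_ai_cost = document_ai_credits(megabytes) * doc_ai_rate
    return "idp" if idp_cost <= doc_ai_cost else "document_ai"


def break_even_mb_per_page(idp_rate: float, doc_ai_rate: float) -> float:
    """Average page size (MB) at which both engines cost the same."""
    return (IDP_CREDITS_PER_PAGE * idp_rate) / (DOC_AI_CREDITS_PER_MB * doc_ai_rate)


pages = 5_000 * 20  # 5,000 documents of 20 pages each
print(idp_credits(pages))  # 3000000

# Hypothetical rates of 0.001 per credit for both engines.
print(cheaper_engine(pages, megabytes=10_000.0, idp_rate=0.001, doc_ai_rate=0.001))
print(cheaper_engine(pages, megabytes=2_000.0, idp_rate=0.001, doc_ai_rate=0.001))
print(break_even_mb_per_page(idp_rate=0.001, doc_ai_rate=0.001))
```

If the two per-credit rates happened to be equal, the break-even would be 30 / 750 = 0.04 MB per page, or roughly 40 KB: lighter pages favor Document AI and heavier scans favor IDP. Real contracted rates will move that threshold, which is why measuring actual file sizes matters.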

How to choose

Four questions decide the architecture

The customer's answers determine which path fits. Most decisions are clear once the document profile is actually measured.

01

Where's the platform center of gravity?

If MuleSoft is the orchestration backbone and Data Cloud is just provisioned, lean IDP. If Data Cloud is the platform of record and Agentforce is near-term, lean Document AI.

02

What's the document volume and profile?

High-volume batch processing of complex, multi-form documents favors IDP's classification and predictable per-page cost. Moderate volume, simpler document types, or real-time agentic use favor Document AI.

03

What's the governance posture?

For PHI, PII, or other regulated data where audit, BAA coverage, and unified governance matter, Document AI's path through the Einstein Trust Layer simplifies the compliance story.

04

What's the AI roadmap?

Building toward Agentforce as the core agentic platform aligns with Document AI. Using Salesforce as one of several downstream systems favors IDP's broader ecosystem integration.
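The four questions can be read as a rough scoring exercise. The sketch below is one hypothetical way to encode them, assuming each answer counts equally and that a split result points to the hybrid path. It is a conversation aid, not a substitute for measuring the document profile.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    data_cloud_is_platform_of_record: bool         # question 01
    high_volume_complex_batch: bool                # question 02
    regulated_data_needs_unified_governance: bool  # question 03
    agentforce_is_core_roadmap: bool               # question 04


def lean(profile: Profile) -> str:
    """Count the answers that point toward Document AI (equal weights assumed)."""
    document_ai_votes = sum([
        profile.data_cloud_is_platform_of_record,
        not profile.high_volume_complex_batch,
        profile.regulated_data_needs_unified_governance,
        profile.agentforce_is_core_roadmap,
    ])
    if document_ai_votes >= 3:
        return "document_ai"
    if document_ai_votes <= 1:
        return "idp"
    return "hybrid"


print(lean(Profile(True, False, True, True)))    # document_ai
print(lean(Profile(False, True, False, False)))  # idp
print(lean(Profile(True, True, True, False)))    # hybrid
```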

Where the industry is going

Data Cloud is the foundation.
Document AI is the on-platform path.

Salesforce's investment direction is unambiguous. MuleSoft IDP remains a strong production product and stays supported, but for new builds on a Data Cloud + Agentforce foundation, Document AI is increasingly the path of least resistance.

That doesn't mean every customer should rush to it. The right architecture is the one that matches volume, document profile, governance posture, and platform direction.

What we help clients do

The full unstructured-to-structured lifecycle

PS Advisory works with insurance carriers, MGAs, and reinsurers on the whole journey, from the question to a production pipeline.

Discovery & architecture

Profile your documents across volume, size, type, and scan quality, map them to the right engine, and produce a cost model grounded in your contracted rates, not list pricing.

Pilot & validation

Stand up a contained 60–90 day parallel run that benchmarks accuracy on real documents, validates operational behavior at volume, and produces AE-confirmed cost figures.
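Benchmarking accuracy in a parallel run usually comes down to comparing extracted fields against hand-verified values. A minimal sketch of a field-level accuracy measure follows; the exact-match rule (ignoring case and surrounding whitespace) is an assumption, and the sample values reuse the submission record shown earlier on this page.

```python
def field_accuracy(extracted: dict[str, str], truth: dict[str, str]) -> float:
    """Share of ground-truth fields whose extracted value matches exactly,
    ignoring case and surrounding whitespace. Missing fields count as wrong."""
    if not truth:
        raise ValueError("ground truth must contain at least one field")
    correct = sum(
        1
        for field, expected in truth.items()
        if extracted.get(field, "").strip().lower() == expected.strip().lower()
    )
    return correct / len(truth)


truth = {
    "policy_number": "WC-2026-04417",
    "named_insured": "Northwind Logistics",
    "coverage": "Workers' Comp",
    "limit": "$1,000,000",
    "effective_date": "2026-01-01",
}
extracted = {
    "policy_number": "WC-2026-04417",
    "named_insured": "northwind logistics ",
    "coverage": "Workers' Comp",
    "limit": "$100,000",  # simulated extraction error
    "effective_date": "2026-01-01",
}
print(field_accuracy(extracted, truth))  # 0.8
```

Running the same measure over both engines on the same document set gives a like-for-like accuracy comparison to sit beside the cost figures.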

Production implementation

Build the MuleSoft pipeline, the Data Cloud DLO schema, the Salesforce records, the Agentforce grounding, and the runbook, all with insurance-specific document patterns.

Optimization & roadmap

As Document AI matures and volume grows, the Year 1 choice may not fit Year 3. We help you measure, adjust, and migrate without redoing the pipeline.

Governed by design

Regulated data, handled correctly

For carriers, MGAs, and reinsurers handling PHI, PII, and other regulated content, where audit and BAA coverage are non-negotiable.

Einstein Trust Layer

Document AI processes documents through the Trust Layer, with zero data retention, prompt and response masking, and toxicity filtering. Your data never trains external models.

Unified governance

Extracted data lands in Data Cloud DLOs under one governance model, instead of adding a second AI processing surface to audit and secure.

Grounded for Agentforce

Structured output is immediately available to agents, with the same data, governance, and model surface as the agents that will consume it.

Measured, not assumed

We benchmark accuracy and cost on your real documents before you commit, so the production system behaves the way the pilot promised.

From the question to the answer, in weeks, not months.

Whether you're an AE choosing what to recommend, a partner scoping an implementation, or an architecture team deciding between IDP and Document AI for a real workload, we can help. We bring deep experience with both MuleSoft and Data Cloud, and deeper still with Salesforce for insurance.