When we set out to build Caret, we wanted to make it easy for non-technical users to build workflows. We envisioned a user describing their process in plain English and having a functional, reliable workflow within minutes: minimal back-and-forth, with the product inferring as much as possible. This meant seriously limiting the amount of manual configuration required to create a workflow.
The dominant, and likely ultimate, use case for Caret is parsing unstructured data to populate new records in Google Sheets, Notion databases, Airtable tables, and the like. This presented a challenge, because the API calls these external platforms require aren't simple. To keep things easy for both our users and our developers, we chose to rely heavily on LLM inference and structured outputs to construct requests.
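To make that concrete, here's a minimal sketch of the extraction half of the pattern, using OpenAI's structured outputs in Python. The model name, schema, and fields are illustrative, not our production code:

```python
# A minimal sketch (illustrative names, not our production code): constrain
# the model to a typed schema so downstream code gets a predictable shape.
from openai import OpenAI
from pydantic import BaseModel

class InvoiceRecord(BaseModel):
    vendor_name: str
    invoice_number: str
    amount: float
    due_date: str  # ISO 8601, e.g. "2025-01-31"

client = OpenAI()

def extract_record(raw_text: str) -> InvoiceRecord:
    # Structured outputs guarantee the response parses into InvoiceRecord,
    # so no free-form model text ever reaches the rest of the workflow.
    completion = client.beta.chat.completions.parse(
        model="gpt-4o-2024-08-06",
        messages=[
            {"role": "system", "content": "Extract the invoice fields from the user's text."},
            {"role": "user", "content": raw_text},
        ],
        response_format=InvoiceRecord,
    )
    return completion.choices[0].message.parsed
```

Because the response is guaranteed to match the schema, the rest of the workflow never has to parse free-form model text.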
Let's explore one way we tame this complexity, using a real user's Intuit QuickBooks scenario.
One of our users wanted to extract invoice data from various sources (Notion pages, PDFs, emails) and create corresponding bills in QuickBooks. The QuickBooks Accounting API is not straightforward: it involves many object types and cross-references between them. On top of that, our customer had a long list of requirements that added up to a high-complexity scenario.
Constructing the REST API calls needed to support our customer's QuickBooks use case would be impossible to do programmatically without LLM assistance. It would also be a significant undertaking for this customer to implement all of their requirements in a no-code UI workflow builder using anything but natural language. In any case, something like that would violate our ethos.
So, we started using structured LLM calls to handle the complex inference and API call construction. Here's roughly how it works in practice:
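Below is a simplified sketch of the construction step, continuing from the extraction example above. The endpoint follows the QuickBooks Online Accounting API's create-Bill shape, but the payload is pared down, and the vendor/account resolvers are hypothetical stand-ins for logic that, in our case, also leans on LLM inference:

```python
# Continuing the sketch above: translate the structured record into a
# QuickBooks "create Bill" request. The endpoint matches the QuickBooks
# Online Accounting API; the payload is pared down, and the two resolver
# helpers are hypothetical stand-ins.
import requests

def resolve_vendor_id(vendor_name: str) -> str:
    # Hypothetical helper: look up (or create) the QuickBooks Vendor.
    # In our product, fuzzy name matching here also leans on LLM inference.
    return "56"  # placeholder Vendor ID

def resolve_expense_account_id(record: InvoiceRecord) -> str:
    # Hypothetical helper: pick the expense Account this bill should post to.
    return "7"  # placeholder Account ID

def create_bill(record: InvoiceRecord, realm_id: str, access_token: str) -> dict:
    # Bills reference existing Vendor and Account objects by ID, so those
    # references have to be resolved before the request is constructed.
    payload = {
        "VendorRef": {"value": resolve_vendor_id(record.vendor_name)},
        "DocNumber": record.invoice_number,
        "DueDate": record.due_date,
        "Line": [{
            "DetailType": "AccountBasedExpenseLineDetail",
            "Amount": record.amount,
            "AccountBasedExpenseLineDetail": {
                "AccountRef": {"value": resolve_expense_account_id(record)},
            },
        }],
    }
    resp = requests.post(
        f"https://quickbooks.api.intuit.com/v3/company/{realm_id}/bill",
        json=payload,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/json",
        },
    )
    resp.raise_for_status()
    return resp.json()
```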
In some ways, constructing a REST API request using structured LLM calls is a lot like leveraging Model Context Protocol (MCP) servers. However, there are a number of advantages to our approach:
Note: points (1) and (2) are likely to change over time as MCP gains popularity.
By leaning heavily on LLMs, we significantly improve both the user experience and the developer experience. Users get to describe all of their complex requirements in plain English. Our developers get to build a much simpler product, which increases implementation velocity and supports extreme complexity with minimal upfront effort.
That said, this isn't a magic solution; it requires ongoing monitoring and refinement. LLM inference is only as smart as the instructions you give it, and real user inputs continue to surprise us: unhandled edge cases still emerge despite the flexibility of the approach. We continuously update our schemas and validation rules based on actual usage patterns.
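As a hypothetical example of the kind of rule that emerges from real usage (field names are illustrative), a validator like this rejects a malformed amount rather than letting a bad bill reach QuickBooks:

```python
# Hypothetical example of a validation rule added after real usage: some
# inputs showed credits, and the model occasionally emitted negative line
# amounts. Field names are illustrative.
from pydantic import BaseModel, field_validator

class BillLine(BaseModel):
    description: str
    amount: float

    @field_validator("amount")
    @classmethod
    def amount_must_be_positive(cls, v: float) -> float:
        if v <= 0:
            # Reject and surface the issue to the user instead of silently
            # creating a bad bill in QuickBooks.
            raise ValueError("line amounts must be positive")
        return v
```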
Looking forward, we still think this represents a shift in how integrations could work. Instead of training users to think like APIs, we're building systems that meet users on their plane of thought. The technology exists now to make this practical, and as AI capabilities improve, the gap between human intuition and machine requirements will continue to shrink.
Want to learn more about implementing AI-powered integrations for your business? Schedule a consultation with our team to explore how structured LLM outputs can transform your workflows.