When we set out to build Caret, we wanted to make it easy for non-technical users to build workflows. We envisioned a user describing their process in plain English and having a functional, reliable workflow within minutes: minimal back-and-forth, with the product inferring as much as possible. This meant seriously limiting the amount of manual configuration required to create a workflow.
The dominant, and likely ultimate, use case for Caret is parsing unstructured data to populate new records in Google Sheets, Notion databases, Airtable tables, and the like. This presented a challenge, because the API calls these external platforms require aren't simple. To keep things easy for both our users and our developers, we chose to rely heavily on LLM inference and structured outputs to construct requests.
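To make that concrete, here's a minimal sketch of the extraction half of the pattern, using OpenAI's structured outputs in Python. The model name, schema, and fields are illustrative, not our production code:

```python
# A minimal sketch (illustrative names, not our production code): constrain
# the model to a typed schema so downstream code gets a predictable shape.
from openai import OpenAI
from pydantic import BaseModel

class InvoiceRecord(BaseModel):
    vendor_name: str
    invoice_number: str
    amount: float
    due_date: str  # ISO 8601, e.g. "2025-01-31"

client = OpenAI()

def extract_record(raw_text: str) -> InvoiceRecord:
    # Structured outputs guarantee the response parses into InvoiceRecord,
    # so no free-form model text ever reaches the rest of the workflow.
    completion = client.beta.chat.completions.parse(
        model="gpt-4o-2024-08-06",
        messages=[
            {"role": "system", "content": "Extract the invoice fields from the user's text."},
            {"role": "user", "content": raw_text},
        ],
        response_format=InvoiceRecord,
    )
    return completion.choices[0].message.parsed
```

Because the response is guaranteed to match the schema, the rest of the workflow never has to parse free-form model text.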
Let's explore one way we tame this complexity, using a real user's Intuit QuickBooks scenario.
One of our users wanted to extract invoice data from various sources (Notion pages, PDFs, emails) and create corresponding bills in QuickBooks. The QuickBooks Accounting API is not straightforward: it involves many object types and cross-references between them. On top of that, our customer had a long list of requirements that added up to a high-complexity scenario.
Constructing the REST API calls needed to support our customer's QuickBooks use case would be impossible to do programmatically without LLM assistance. It would also be a significant undertaking for this customer to implement all of their requirements in a no-code UI workflow builder using anything but natural language. In any case, something like that would violate our ethos.
So, we started using structured LLM calls to handle the complex inference and API call construction. Here's roughly how it works in practice:
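Below is a simplified sketch of the construction step, continuing from the extraction example above. The endpoint follows the QuickBooks Online Accounting API's create-Bill shape, but the payload is pared down, and the vendor/account resolvers are hypothetical stand-ins for logic that, in our case, also leans on LLM inference:

```python
# Continuing the sketch above: translate the structured record into a
# QuickBooks "create Bill" request. The endpoint matches the QuickBooks
# Online Accounting API; the payload is pared down, and the two resolver
# helpers are hypothetical stand-ins.
import requests

def resolve_vendor_id(vendor_name: str) -> str:
    # Hypothetical helper: look up (or create) the QuickBooks Vendor.
    # In our product, fuzzy name matching here also leans on LLM inference.
    return "56"  # placeholder Vendor ID

def resolve_expense_account_id(record: InvoiceRecord) -> str:
    # Hypothetical helper: pick the expense Account this bill should post to.
    return "7"  # placeholder Account ID

def create_bill(record: InvoiceRecord, realm_id: str, access_token: str) -> dict:
    # Bills reference existing Vendor and Account objects by ID, so those
    # references have to be resolved before the request is constructed.
    payload = {
        "VendorRef": {"value": resolve_vendor_id(record.vendor_name)},
        "DocNumber": record.invoice_number,
        "DueDate": record.due_date,
        "Line": [{
            "DetailType": "AccountBasedExpenseLineDetail",
            "Amount": record.amount,
            "AccountBasedExpenseLineDetail": {
                "AccountRef": {"value": resolve_expense_account_id(record)},
            },
        }],
    }
    resp = requests.post(
        f"https://quickbooks.api.intuit.com/v3/company/{realm_id}/bill",
        json=payload,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/json",
        },
    )
    resp.raise_for_status()
    return resp.json()
```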
In some ways, constructing a REST API request using structured LLM calls is a lot like leveraging Model Context Protocol (MCP) servers. However, there are a number of advantages to our approach:
Note: points (1) and (2) are likely to change over time as MCP gains popularity.
By leaning heavily on LLMs, we significantly improve both the user experience and the developer experience. Users get to describe all of their complex requirements in plain English. Our developers get to build a much simpler product, which increases implementation velocity and supports extreme complexity with minimal upfront effort.
That said, this isn't a magic solution; it requires ongoing monitoring and refinement. LLM inference is only as smart as the instructions you give it, and real user inputs continue to surprise us: unhandled edge cases still emerge despite the flexibility of the approach. We continuously update our schemas and validation rules based on actual usage patterns.
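As a hypothetical example of the kind of rule that emerges from real usage (field names are illustrative), a validator like this rejects a malformed amount rather than letting a bad bill reach QuickBooks:

```python
# Hypothetical example of a validation rule added after real usage: some
# inputs showed credits, and the model occasionally emitted negative line
# amounts. Field names are illustrative.
from pydantic import BaseModel, field_validator

class BillLine(BaseModel):
    description: str
    amount: float

    @field_validator("amount")
    @classmethod
    def amount_must_be_positive(cls, v: float) -> float:
        if v <= 0:
            # Reject and surface the issue to the user instead of silently
            # creating a bad bill in QuickBooks.
            raise ValueError("line amounts must be positive")
        return v
```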
Looking forward, we still think this represents a shift in how integrations could work. Instead of training users to think like APIs, we're building systems that meet users on their plane of thought. The technology exists now to make this practical, and as AI capabilities improve, the gap between human intuition and machine requirements will continue to shrink.
Want to learn more about implementing AI-powered integrations for your business? Schedule a consultation with our team to explore how structured LLM outputs can transform your workflows.