Work-Bench | Probservations S1E2 - Durable Execution: Can’t Stop, Won’t Stop 😎

Durable execution in software is not a new idea! State persistence, fault tolerance, and retries have always been important: there’s inherently a subset of “defensive infrastructure” that developers have to build in order to make their software reliable, and the last decade has seen more products emerge to take that burden off their hands. Temporal is the market leader that comes to mind. It was founded in 2019 (well before the AI wave and 4 years after the launch of Cadence) to commercialize Cadence, the open source project that gained massive steam (over 100 use cases) within Uber.

But now, these ideas are more relevant than ever! When one squints, it becomes clear that an AI agent is just a distributed system in disguise. If modern distributed systems involve a web of API calls, agentic systems share that same property, only now those API calls are to LLMs.

Take Hebbia, for example, a company that builds agentic systems for financial services and already works with the likes of MetLife and Oak Hill Advisors. Handling retries well has been an issue for them: they built a rate-limiting system that attempted to preempt LLM failures, but it didn’t fully solve the problem. Lucas Haarmann recently shared in a talk how, by using Temporal, they can run an agent as a series of activities (calling the OpenAI API being one such activity), with each activity getting its own set of retries/backups/timeouts.
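
To make the "agent as a series of activities" idea concrete, here is a minimal sketch in plain Python. It is not Temporal's API and not Hebbia's code: `RetryPolicy`, `run_activity`, `FlakyLLM`, and `run_agent` are hypothetical names invented for illustration, and the LLM call is a stub that fails a set number of times. It only shows each step carrying its own retry settings; it leaves out timeouts and the state persistence that a real durable execution engine provides.

```python
import time
from dataclasses import dataclass
from typing import Callable, TypeVar

T = TypeVar("T")


@dataclass
class RetryPolicy:
    """Hypothetical per-activity retry settings (not a real library type)."""
    max_attempts: int = 3
    initial_backoff_s: float = 0.01
    backoff_multiplier: float = 2.0


def run_activity(fn: Callable[[], T], policy: RetryPolicy) -> T:
    """Run one step, retrying with exponential backoff per its own policy."""
    delay = policy.initial_backoff_s
    for attempt in range(1, policy.max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == policy.max_attempts:
                raise  # out of attempts: surface the last error
            time.sleep(delay)
            delay *= policy.backoff_multiplier
    # Only reached if max_attempts < 1, i.e. the loop never ran.
    raise RuntimeError("max_attempts must be at least 1")


class FlakyLLM:
    """Stand-in for an LLM API that fails a fixed number of times first."""

    def __init__(self, failures_before_success: int) -> None:
        self.remaining_failures = failures_before_success
        self.calls = 0

    def complete(self, prompt: str) -> str:
        self.calls += 1
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise ConnectionError("simulated rate limit")
        return f"summary of: {prompt}"


def run_agent(llm: FlakyLLM, document: str) -> str:
    """A two-step 'agent': a cheap local step, then a flaky LLM step."""
    # Local preprocessing is deterministic, so a single attempt is enough.
    text = run_activity(lambda: document.strip(), RetryPolicy(max_attempts=1))
    # The LLM call gets a more generous policy of its own.
    llm_policy = RetryPolicy(max_attempts=5, initial_backoff_s=0.001)
    return run_activity(lambda: llm.complete(text), llm_policy)
```

Calling `run_agent(FlakyLLM(failures_before_success=2), "  Q3 filing  ")` succeeds on the third LLM attempt without the caller ever seeing the two failures. The point of the design is that retry behavior is attached to each step rather than to the agent as a whole.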

Temporal isn’t the only company building for this growing set of agent use cases. DBOS, Inc., Inngest, Thread AI, and Restate (among others) are all at the frontier, building technology focused on letting agents run reliably at scale. Open questions remain: how will the nature of durable execution evolve to serve agentic applications? How will the non-deterministic nature of this software be accommodated? Will evals and fine-tuning flows make their way into the fold? How does one balance the tradeoff between technology robustness and the developer preference for lightweight solutions?

If you’re also thinking about these things, I would love to chat!

👋 I’m a Researcher at Work-Bench, a Seed stage enterprise-focused VC fund based in New York City. Our sweet spot for investment at Seed correlates with building out a startup’s early go-to-market motions.
