Why Vercel overhauled its serverless infrastructure for the AI era
Vercel's serverless infrastructure was designed at a time when speed was the most important goal. AI apps are a little different, and Fluid Compute is an effort to rebuild that infrastructure for the AI era.
As companies struggle to deploy apps built around large language models, they're also exposing inefficiencies in cloud tools that were designed to run older workloads. After rewriting the infrastructure beneath its serverless computing service last year, Vercel is ready to shift its customers onto a new platform that will make it cheaper to run AI apps.
Fluid Compute is a new architecture for Vercel Functions that was designed to eliminate the idle period when an AI app is waiting for a model to answer a question — which can take seconds or even minutes on computing infrastructure used to operating in milliseconds — and costs real money. In an exclusive interview with Runtime, Vercel co-founder and CEO Guillermo Rauch described Fluid Compute as the natural evolution of serverless computing.
"Fluid Compute sets out to fix serverless for the AI era," Rauch said. It's an acknowledgement that tried-and-true computing infrastructure strategies can change very quickly when something like generative AI comes along, a broader topic that I'll be discussing with Rauch at the HumanX conference in March alongside fellow panelists Andrew Feldman of Cerebras, Robert Nishihara of Anyscale, and Sharon Zhou of Lamini AI.
Vercel, which according to Crunchbase has raised $563 million in funding, is primarily known for its web application development platform. Developers use Vercel's open-source Next.js framework and managed infrastructure services to quickly launch and run cloud apps without having to provision and configure their own hardware.
However, Vercel originally designed the infrastructure that powers its managed computing services to run traditional web apps. Fluid Compute is an effort to rebuild that infrastructure to process AI apps without changing anything about the way non-AI apps run.
AWS introduced the principles behind serverless computing back in 2014 with the launch of Lambda. Apps built around Lambda and other serverless development platforms use functions that execute distinct tasks in response to external triggers, which allows computing resources to spin up and shut down very quickly.
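To make that model concrete, here is a minimal sketch of a function in the Lambda style, written in TypeScript. The handler name and the simplified event shape are illustrative assumptions, not any specific platform's API:

```typescript
// A serverless function: the platform invokes this exported handler in
// response to an external trigger (an HTTP request here) and can reclaim
// the instance as soon as the response is returned.
export async function handler(event: { path: string }) {
  // Each invocation is a distinct, short-lived task; there is no
  // long-running server to provision or manage.
  return {
    statusCode: 200,
    body: JSON.stringify({ message: `handled ${event.path}` }),
  };
}
```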
At that time developers were obsessed with speed, having realized that their users and customers wouldn't tolerate sites and apps that ran even 100 milliseconds or so slower than what they expected, and "we optimized the world's compute for that [problem]," Rauch said. Vercel's managed infrastructure runs on AWS and the company works closely with its Lambda team.
But as Vercel's customers started using the serverless platform to build AI apps, they realized they were wasting computing resources while awaiting a response from the model. Traditional servers understand how to manage idle resources, but in serverless platforms like Vercel's "the problem is that you have that computer just waiting for a very long time and while you're claiming that space of memory, the customer is indeed paying," Rauch said.
Fluid Compute gets around this problem by introducing what the company is calling "in-function concurrency," which "allows a single instance to handle multiple invocations by utilizing idle time spent waiting for backend responses," Vercel said in a blog post last October announcing a beta version of the technology. "Basically, you're treating it more like a server when you need it," Rauch said.
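Here is a rough sketch of the workload pattern in-function concurrency targets, assuming a hypothetical `callModel` helper and endpoint URL (neither is Vercel's actual API). While one invocation awaits a slow model response, the Node.js event loop sits idle, and that idle window is what a concurrency-aware runtime can use to serve a second invocation on the same instance:

```typescript
// Hypothetical sketch: an AI endpoint that spends most of its wall-clock
// time awaiting a model. The endpoint URL and callModel helper are
// illustrative assumptions, not a real API.
async function callModel(prompt: string): Promise<string> {
  const res = await fetch("https://example.com/v1/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt }),
  });
  const data = (await res.json()) as { text: string };
  return data.text;
}

export async function handler(req: { prompt: string }) {
  // On a one-invocation-per-instance platform, this await can hold the
  // instance's memory (and the customer's bill) for seconds or minutes.
  // With in-function concurrency, the runtime can route other invocations
  // into this same instance while the event loop is idle here.
  const answer = await callModel(req.prompt);
  return { statusCode: 200, body: answer };
}
```

Note that the handler itself doesn't change in this sketch; sharing the idle time across invocations is the platform's job, not the developer's.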
Suno was one of Fluid Compute's beta testers, and saw "upwards of 40% cost savings on function workloads," Rauch said. Depending on the app, other customers could see even greater savings without having to change their app's configuration, he said.
Back is the new front
Fluid Compute was designed to work with Node.js and Python applications, respectively one of the most widely used frameworks and one of the most widely used programming languages among professional developers surveyed by Stack Overflow. Cloudflare Workers is a rival serverless computing platform that uses a similar technique to deal with idle requests more efficiently, but it is based on a different runtime, and Node.js developers have to implement a few workarounds to get their apps to run.
Rauch is hopeful that Fluid Compute will reduce the number of customers shocked by the size of their Vercel bills after their apps went viral or saw an unexpected surge in demand. That experience has been even more painful for AI app developers, who found they were paying more than they expected to serve their users with a slow app.
"Fluid addresses a huge percentage of those cases," Rauch said. "Developers felt like they weren't in control of that back end becoming slower, and Fluid brings into a world of predictability where you're concerned about what you do control, which is your code and the things that you ship."
The new platform could also make Vercel a more interesting option for larger enterprises that like the principles of serverless computing but need to make sure they're operating as efficiently as possible.
"Typically, Vercel has been seen by many as for front-end workloads." Rauch said. "With Fluid, you can run any kind of back-end workload as long as it's in those runtimes like Node and Python. It's not just the ability to run it, but to do so efficiently."
Tom Krazit has covered the technology industry for over 20 years, focused on enterprise technology during the rise of cloud computing over the last ten years at Gigaom, Structure and Protocol.