Welcome to Runtime! Happy New Year! Today: Nvidia previews its next-generation chip design as it maneuvers to adapt to a changing market, Meta's latest flirtation with the enterprise business could actually stick, and the latest funding rounds in enterprise tech.
Please forward this email to a friend or colleague! If it was forwarded to you, sign up here to get Runtime each week, and if you're interested in supporting independent enterprise tech journalism, click the button below and become a Runtime supporter today.
No company is more synonymous with mid-2020s AI than Nvidia, and as a new year dawns it's hard to see that changing any time soon. But while Nvidia's short-term AI strategy looks a lot like the status quo, its long-term AI strategy could be headed down a new path.
On Monday CEO Jensen Huang took the stage at the Consumer Electronics Show to announce that Nvidia's next-generation Vera Rubin enterprise AI platform is on track to ship later this year, promising performance improvements for both training and running AI models. But that presentation was preceded by a far more interesting deal on Christmas Eve, when AI inference startup Groq announced that Nvidia had signed a "non-exclusive licensing agreement" for its technology that one investor told CNBC was worth $20 billion.
Nvidia said that Vera Rubin will deliver a 5X performance improvement over Blackwell for inference and a 3X jump for training AI models, according to Bloomberg.
The new platform, which will be available through the Big Three cloud providers as well as Nvidia's harem of neoclouds in the second half of the year, also promises to reduce inference costs compared to Blackwell: according to Nvidia, customers won't need as many Vera Rubin chips to process the same amount of data.
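For a rough sense of what that claim means for fleet sizing, here's a back-of-the-envelope sketch; every figure in it is invented except the 5X multiple, which is Nvidia's claim as reported by Bloomberg.

```python
# Back-of-the-envelope only: every number here is made up except the 5x
# inference multiple, which is Nvidia's own claim as reported by Bloomberg.
import math

blackwell_tokens_per_sec = 1_000                           # hypothetical per-chip throughput
vera_rubin_tokens_per_sec = 5 * blackwell_tokens_per_sec   # the claimed 5x uplift
fleet_demand_tokens_per_sec = 200_000                      # hypothetical customer demand

blackwell_chips = math.ceil(fleet_demand_tokens_per_sec / blackwell_tokens_per_sec)
vera_rubin_chips = math.ceil(fleet_demand_tokens_per_sec / vera_rubin_tokens_per_sec)
print(blackwell_chips, vera_rubin_chips)  # 200 vs. 40 chips for the same load
```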
But while Nvidia's GPUs were the engine of the generative AI boom over the last three years, those chips weren't originally designed for AI workloads; after all, they're still called "graphics processing units." Nvidia realized long ago that GPUs could be useful in the data center and built an impressive array of hardware and software to smooth that transition right as demand for AI workloads soared, but platform shifts tend to embrace new technologies purpose-built for the new era.
Groq raised $1.8 billion over the last several years to build fast and cheap AI inference processors called LPUs, which were designed specifically to run large language models, and the technology behind that design looks set to become part of Nvidia's future chips.
According to The Register, Groq's chips process data in an "assembly line architecture," which is more efficient than the batch process used by most GPUs and less dependent on having large amounts of memory available.
And given that the supply of memory chips is expected to be tight for the next several years amid the AI infrastructure buildout, technology that helps Nvidia keep customers in the fold by reducing their need for memory could be worth $20 billion.
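Groq hasn't described its design in these exact terms, so the toy model below is only meant to show the general principle: batch execution holds a whole batch's intermediate results at each stage, while an assembly-line pipeline holds roughly one item per stage, so its peak memory footprint scales with pipeline depth rather than batch size.

```python
# Toy model only, not Groq's actual architecture: it just illustrates why a
# pipelined "assembly line" can need less resident memory than batching.

def batched_peak_residency(batch_size: int, num_stages: int) -> int:
    """Batch execution materializes every item's intermediate result at
    each stage before moving on, so peak residency tracks batch size."""
    peak = 0
    for _ in range(num_stages):
        peak = max(peak, batch_size)  # whole batch buffered at this stage
    return peak

def pipelined_peak_residency(batch_size: int, num_stages: int) -> int:
    """An assembly line keeps at most one item in flight per stage, so
    peak residency tracks pipeline depth instead of batch size."""
    return min(batch_size, num_stages)

if __name__ == "__main__":
    print(batched_peak_residency(batch_size=1024, num_stages=8))    # 1024
    print(pipelined_peak_residency(batch_size=1024, num_stages=8))  # 8
```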
Right now enterprise AI buyers are caught between two competing impulses: Nobody wants to be dependent on one gigantic chip maker for the raw materials they need to stay in business, but running AI workloads across multiple chip architectures can create reliability problems, as Anthropic told Runtime last year.
Pretty much everyone involved with enterprise AI (including Nvidia) has predicted that inference workloads would become much more important than training workloads as the industry matured, and while 2025 was most definitely not "the year of the AI agent," there are signs that 2026 could be different.
If companies figure out how to deploy reliable AI agents at scale, they'll need fast and cheap AI inference providers to service demand for those agents.
Nvidia's deal with Groq "reflects a growing industry reality — the inference market is fragmenting, and a new category has emerged where speed isn't a feature — it's the entire value proposition," Cerebras CEO Andrew Feldman wrote on LinkedIn last week. "GPUs are phenomenal accelerators … they’re just not the right machine for high-speed inference."
Nvidia will sell a lot of Vera Rubin chips, but Groq's technology could be far more important to its continued success in the long run.
Meta's latest enterprise flirtation arrived with its deal to acquire Manus, the AI agent startup. The Wall Street Journal reported the deal will cost Mark Zuckerberg around $2 billion, and Meta said it would "continue to operate and sell the Manus service, as well as integrate it into our products." That service consists of "a general-purpose AI agent designed to help users tackle research, automation, and complex tasks," Manus said in its own announcement, and the company sees itself "as an execution layer — turning advanced AI capabilities into scalable, reliable systems that can carry out end-to-end work in real-world settings."
Agents simply won't take over enterprise computing until they prove themselves more usable and reliable than current alternatives, and Manus' service tackles some of those obstacles. "Many early 'agent' systems fail not because the underlying models can’t reason, but because execution breaks down: tools fail silently, intermediate steps drift, or long-running tasks can’t be resumed or audited. Manus’s core value proposition is that it manages those failure modes," according to VentureBeat.
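Manus hasn't published its internals, so the sketch below is a generic, hypothetical illustration of what an "execution layer" for those failure modes tends to look like: tool calls that refuse to fail silently, checkpoints so long-running tasks can resume, and an audit trail. Every name in it is invented.

```python
# Hypothetical sketch, not Manus code: a generic execution layer covering the
# failure modes VentureBeat describes (silent tool failures, lost progress on
# long-running tasks, and no audit trail).
import json
import time
from pathlib import Path

CHECKPOINT = Path("agent_task.checkpoint.json")  # invented file name

def run_step(name, fn, audit):
    """Run one tool call, refusing to let it fail silently."""
    try:
        result = fn()
    except Exception as exc:
        audit.append({"step": name, "ok": False, "error": repr(exc), "ts": time.time()})
        raise  # escalate instead of drifting past a broken step
    audit.append({"step": name, "ok": True, "ts": time.time()})
    return result

def run_task(steps):
    """Execute named steps in order, resuming from the last checkpoint."""
    state = json.loads(CHECKPOINT.read_text()) if CHECKPOINT.exists() else {"done": [], "audit": []}
    for name, fn in steps:
        if name in state["done"]:
            continue  # already completed in a previous run
        run_step(name, fn, state["audit"])
        state["done"].append(name)
        CHECKPOINT.write_text(json.dumps(state))  # durable progress after each step
    return state["audit"]

if __name__ == "__main__":
    audit = run_task([
        ("fetch_data", lambda: "rows"),
        ("summarize", lambda: "summary"),
    ])
    print(json.dumps(audit, indent=2))
```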
Enterprise funding

DayOne Data Centers landed $2 billion in Series C funding to expand its data-center campuses throughout Asia and Europe.
LMArena scored $150 million in Series A funding as it continues to build out a widely used platform for evaluating AI model performance; a minimal sketch of that kind of pairwise ranking follows this list.
Photonic raised $130 million in new funding for its quantum computing technology, which uses optical links to connect qubits.
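As noted above, here's a minimal sketch of how pairwise votes like LMArena's can become a leaderboard; the textbook Elo update below is one common approach to ranking from head-to-head comparisons, not necessarily the exact statistical method LMArena uses.

```python
# A textbook Elo update: one common way to turn pairwise model votes into a
# leaderboard. LMArena's actual methodology may differ.

def elo_update(r_winner: float, r_loser: float, k: float = 32.0):
    """Shift both ratings by K times how surprising the result was."""
    expected = 1.0 / (1.0 + 10 ** ((r_loser - r_winner) / 400.0))
    delta = k * (1.0 - expected)
    return r_winner + delta, r_loser - delta

ratings = {"model_a": 1000.0, "model_b": 1000.0}
ratings["model_a"], ratings["model_b"] = elo_update(ratings["model_a"], ratings["model_b"])
print(ratings)  # an even matchup moves the winner up ~16 points
```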
The Runtime roundup
AWS raised prices on reserved GPU instances by around 15% over the past weekend, according to The Register.
Carnegie Mellon's Andy Pavlo wrote a thorough but quite readable roundup of the past year in database technologies and companies that is worth your time.
Tom Krazit has covered the technology industry for over 20 years, and has focused on enterprise technology during the rise of cloud computing over the last ten years at Gigaom, Structure, and Protocol.