How to put agents to work

Welcome to Runtime! Today: Members of the Runtime Roundtable share their tips and tricks on getting agents from experiment to production, Google drops a new high-end version of Gemini, and the latest enterprise moves.

Keep it simple

It has taken a long time for AI agents to make the impact on enterprise software that dozens of vendors insisted was coming soon way back in September 2024, but advances in AI models and engineering techniques are finally starting to pay off. Still, the knowledge required to turn agents into productive parts of an enterprise tech stack has yet to be evenly distributed, which made it a great topic for our latest edition of the Runtime Roundtable.

When asked how their companies built and launched agents for internal use, our nine participants echoed several common themes. One thread centered on a long-held truism about internal project management: companies need to understand exactly what problem they're trying to solve before spending time and money trying to solve it.

"The lesson was simple: AI agents succeed when they solve a real problem and are easy for people to use," said Rob Lee, chief technology and growth officer at Pure Storage.

There has been so much top-down pressure from management during the AI boom to implement these technologies as far and wide as possible, but companies deploying agents have realized that they shouldn't be hammering things that aren't nails.

"Most AI agent pilots fail because they’re dropped into workflows that were never designed for them," said Asana CIO Saket Srivastava.

Successful agents are limited by design when it comes to the data they're allowed to see, the decisions they're allowed to make, and the resources they're allowed to consume. "Most failures stem from expecting large-language models to make autonomous decisions on the fly, which results in unpredictability, inconsistency, and spiraling costs," said Don Schuerman, chief technology officer at Pega.

"Agents work best when they’re treated as constrained, reliable infrastructure designed to operate quietly in the background rather than adding operational complexity," said Michael Ameling, president of SAP's Business Technology Platform.

A disciplined approach also allows companies to put more wood behind fewer arrows (as per that old Google dictum) when they identify an opportunity to deploy agents.

"We found success by targeting 'scaling bottlenecks' — areas where demand grows faster than teams can scale — and by embedding agents into the everyday jobs to be done," said Naveen Zutshi, CIO of Databricks.

As companies scale agents, they can start to reuse some of the plumbing needed to get them up and running as those new opportunities emerge. The process required to develop agents is different from the one used to develop traditional enterprise applications, which is one of the reasons it has taken so long to get here.

"The agents that were developed in silos or on top of ungoverned data/processes could not come anywhere close to the same value of agents where the data pipelines, governance, semantic context, and security were built-in," said Josh Fecteau, chief data and AI officer at Teradata.

And some key lessons from the last era of enterprise application development still hold, such as staged rollouts, access controls, and making sure you can access a steady stream of operational data on what those agents are actually doing.

"Agents are introduced gradually with clear permissions, visible decision paths, and defined handoff points to humans. If they can’t be observed, audited, or rolled back, they don’t ship," Ameling said.

Twinning

After OpenAI and Anthropic dropped new AI models during back-to-back weeks this month, Google unveiled a new version of its flagship Gemini model Thursday designed for power users. Gemini 3.1 Pro is technically still a preview, but moves the ball forward on research and coding tasks compared to Gemini 3 Pro, which was released last November.

"3.1 Pro is designed for tasks where a simple answer isn’t enough, taking advanced reasoning and making it useful for your hardest challenges," Google said in a blog post. Google said the new model compares quite favorably to those from OpenAI and Anthropic, as might be expected, but 3.1 Pro also did well on Simon Willison's trademark pelican cyclist test.

One open question about the future of generative AI development is how many frontier models developers can reasonably support, given that it could be hard to justify the huge expenses needed to train models unless they attract a substantial user base. Right now it looks like a three-way race between OpenAI, Anthropic, and Google to stay ahead of the pack, but the first generation of models trained on Nvidia's Blackwell chip is still rolling out.

Enterprise moves

Brandon Sweeney is the new president of Cyera, joining the AI security company after leadership roles at recently acquired companies like dbt Labs and HashiCorp.

Michael Henricks is the new chief financial and operating officer at One Identity, following a little over a year as CEO of Momentive Software.

Yuneeb Khan is the new chief financial officer at KnowBe4, joining the security training company after a year as CFO at Trellix.

James Chuong is the new chief financial officer at Atlassian, following 13 years in finance leadership roles at LinkedIn, most recently as CFO.

Bruce Felt is the new chief financial officer at Kong, joining the API management company after serving as CFO of Domo.

The Runtime roundup

The Microsoft 365 Copilot AI assistant was able to read and summarize sensitive emails that were supposed to be off-limits to AI assistants thanks to a "bug," according to Bleeping Computer.

The University of Mississippi Medical Center was forced to close Thursday due to some sort of cyberattack, which prevented access to medical records.

Illinois Governor J.B. Pritzker suggested canceling tax breaks for new data-center construction in the state for two years due to "rising demand and surging prices."

Meanwhile, an Oklahoma man was arrested at a town meeting while arguing against new data-center construction after taking more than his allotted three minutes, underscoring how local opposition to data centers is heating up.

Thanks for reading — see you Saturday!