Databricks found a new RAG; Lenovo thinks inference

An external shot of Moscone South in San Francisco with signs promoting the 2025 Databricks Data + AI Summit.
(Credit: Databricks)

Welcome to Runtime! Today on Product Saturday: Databricks researchers think they've come up with a better way to retrieve data in agents, Lenovo's new servers were designed for the on-premises inference enthusiast, and the quote of the week.

Please forward this email to a friend or colleague! If it was forwarded to you, sign up here to get Runtime each week, and if you value independent enterprise tech journalism, click the button below and become a Runtime supporter today.


Ship it

RAG race: Most early generative-AI applications would never have seen the light of day without RAG, or retrieval-augmented generation, a technique that lets AI models draw on data sources beyond their training data. But the limitations of this approach are becoming evident as companies work on AI agents, and Databricks researchers believe they've come up with something better.
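
For readers who haven't touched this plumbing, here's a minimal sketch of what classic RAG does under the hood: retrieve the documents most relevant to a query, then stuff them into the prompt before calling the model. This is an illustration only, not Databricks' Instructed Retriever; the corpus is made up and generate() is a hypothetical stand-in for any LLM API.

```python
# Minimal RAG sketch: keyword retrieval + prompt augmentation (illustrative only).
from collections import Counter

CORPUS = [
    "Databricks announced Instructed Retriever at its Data + AI Summit.",
    "Lenovo introduced three ThinkSystem servers for on-prem inference.",
    "SAP added AI-driven planning features to Business Data Cloud.",
]

def score(query: str, doc: str) -> int:
    """Crude keyword-overlap score between the query and a document."""
    q_words = Counter(query.lower().split())
    d_words = Counter(doc.lower().split())
    return sum((q_words & d_words).values())

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most relevant to the query."""
    return sorted(CORPUS, key=lambda doc: score(query, doc), reverse=True)[:k]

def generate(prompt: str) -> str:
    """Placeholder for a real LLM call (hypothetical, for illustration)."""
    return f"[model response to a prompt of {len(prompt)} characters]"

def answer(query: str) -> str:
    """Augment the prompt with retrieved context, then call the model."""
    context = "\n".join(retrieve(query))
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    return generate(prompt)

print(answer("What did Lenovo announce for inference?"))
```

The limitation Databricks is pointing at shows up right in that sketch: a single retrieval pass bolted onto the prompt works for narrow questions, but agents that reason over many steps need retrieval that can follow instructions and adapt as the task unfolds.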

Instructed Retriever "provides a highly-performant alternative to RAG, when low latency and small model footprint are required, while enabling more effective search agents for scenarios like deep research," Databricks said in a blog post. It's now available in the company's Agent Bricks service, and the timing might be right: "Enterprises are finding that simple retrieval-augmented generation breaks down once you move beyond narrow queries into system-level reasoning, multi-step decisions, and agentic workflows,” Phil Fersht of HFS Research told InfoWorld.

Check the stores: SAP kicked off the year at the National Retail Federation's big show, which I imagine is like CES for cash registers. The company introduced new AI-powered (of course) features for retail customers in its Business Data Cloud that promise to make checking inventory and business planning more autonomous than ever.

"Harmonizing real-time data from sales, inventory, customers and suppliers, Retail Intelligence uses AI-generated simulations so planners can anticipate outcomes and optimize inventory," the company said in a press release. Managers can now direct inventory across store locations with natural-language commands in the service, and SAP also added an MCP server to its Commerce Cloud service.

Serving tokens: Meanwhile, at the real CES, Lenovo devoted a portion of its splashy press conference at Sphere to introducing new AI servers built around Nvidia GPUs. The company said the three new servers were designed for inference workloads at businesses that want to build AI-enabled applications and agents but stay out of the cloud.

The most powerful of the three, the ThinkSystem SR675i V3, was "built to run full LLMs anywhere with massive scalability, for the largest workloads and accelerated simulation in manufacturing, critical healthcare and financial services environments," Lenovo said in a press release. It also announced partnerships with Nutanix, Red Hat, and Canonical for customers who want to buy the new servers as part of a larger package of software and storage.

No secret agents: It's only a matter of time before the first agent-related enterprise security disaster hits some unlucky company; agents require access to lots of different data sources to perform effectively, and when people try to introduce new and powerful tools into their stacks, they tend to make mistakes. Cybersecurity vendors are working on ways to save those companies from themselves, and Exabeam released new services this week that could help.

The latest release of its platform "unifies AI investigations in one place and strengthens teams’ ability to assess their security posture around AI usage and agent activity, supported by clear maturity tracking, targeted recommendations, and enhanced data and analytics to accurately model emerging agent behaviors," the company said in a press release. With agentic AI, "enterprises are no longer just protecting data. They are managing flows of autonomous software that can act on their own," according to Cato Networks' Etay Maor.


Stat of the week

A lot of companies are excited about the potential of AI coding assistants to help them unclog years of backlogged maintenance, as Microsoft's Jay Parikh told Runtime last year, but nothing in this world comes for free. According to new research released by Sonar, "40% of developers say AI has increased technical debt by creating unnecessary or duplicative code," and someone — or something — will have to deal with that code at some point.


Quote of the week

"The secret for being CEO for this long is 1, don't get fired, and 2, don't get bored. I don't know which one comes first." — Nvidia CEO Jensen Huang, explaining during a press conference at CES this week how he has stayed atop the chip juggernaut for the past 33 years.


The Runtime roundup

OpenAI and SoftBank will invest $500 million each in SB Energy, which develops data-center campuses closely tied to energy sources.

Lambda is in talks to raise as much as $350 million to challenge CoreWeave for the top spot among the neoclouds, according to The Information.


Thanks for reading — see you Tuesday!
