Welcome to Runtime! Today: data center experts outline how AI workloads are going to upend a lot of design choices, Okta's customer service team suffers a security breach, and the quote of the week.
Racked and stacked
Given that AI has become the only thing Silicon Valley is capable of talking about this year, it's no surprise that it was a big topic at this week's Open Compute Summit. But as businesses experiment with generative AI technologies in hopes of bringing new capabilities to their products and services, they're taking quite a toll on the foundational layer of tech: the data center.
We're about to go through a new cycle of fresh thinking about the best and most efficient ways to build and manage the enormous complexes that run the world, more than a decade after the rise of cloud computing established some conventional wisdom. The powerful GPUs needed for AI applications are straining the power and cooling capacity of modern data centers, and something is going to have to give.
- Synergy Research predicted this week that new data centers built with AI workloads in mind will require twice as much capacity as current designs, and as older buildings are redesigned that capacity could triple.
- “[AI] is not a trend but a major shift in the way technology is going forward to impact our lives,” said Zaid Kahn, chair of the Open Compute Project and vice president of cloud AI and advanced systems engineering at Microsoft, during this week's keynote address.
- Data centers are reaching a breaking point as they pack more and more GPUs into buildings that were originally designed to house CPUs.
- "Many data centers are built with a power budget of between 7.5 and 15kW per rack, but now a single Nvidia DGX can use up 10kW, meaning the entire power budget is used by a single 10U box," IDC's Andrew Buss told The Register.
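To make Buss's arithmetic concrete, here is a minimal sketch that uses only the figures in his quote (a 7.5 to 15kW rack budget and a 10kW system). The function name and the assumption that nothing else in the rack draws power are illustrative simplifications, not anything IDC published.

```python
import math

# Figures from the quote above, in kW. Everything else here is illustrative.
RACK_BUDGETS_KW = (7.5, 15.0)
DGX_DRAW_KW = 10.0


def systems_per_rack(budget_kw, draw_kw):
    """Whole systems that fit within a rack's power budget.

    Assumes the systems are the only load in the rack, which ignores
    switches, fans, and other overhead.
    """
    return math.floor(budget_kw / draw_kw)


for budget in RACK_BUDGETS_KW:
    count = systems_per_rack(budget, DGX_DRAW_KW)
    leftover = budget - count * DGX_DRAW_KW
    print(f"{budget} kW rack: {count} system(s), {leftover:.1f} kW left over")
```

At the low end of that range a rack can't power even one such system, and at the high end it can power exactly one, with the rest of the rack's physical space sitting empty.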
The Open Compute Project was founded in 2011 as a way for industry leaders to share what they'd learned about data-center design, and 12 years later those leaders are focused on helping the industry prepare for the AI era.
- Much of the work involves agreeing on standards for the complex but undifferentiated work of integrating GPUs and other AI hardware into existing data centers.
- Microsoft said it is working with Google, Meta, Nvidia, and AMD on standardizing the cards that are used to add accelerators and GPUs into those data centers, work that data-center providers currently have to do on their own and that doesn't add a ton of value to their services.
- Google and Microsoft are also working on a new standard for security requirements applying to firmware that runs the surprisingly large number of small devices that help modern data centers work.
But while those advances are welcome, the AI data center is going to need some new thinking around cooling technologies.
- Until recently, most data-center operators could get away with cooling those systems with air, either drawn in from the outside or pushed through evaporative cooling pads.
- But if AI workloads continue to grow at the pace that most hardware experts expect, the industry will be forced to turn to liquid cooling techniques, which are more complicated and expensive to operate.
- Meta has already said it plans to use liquid cooling technology as it redesigns data centers for AI, and just this week it resumed construction at two new facilities that will employ those techniques.
- AWS and Microsoft, however, will have a lot of work to do revamping their massive array of data-center buildings with liquid cooling in mind.
The HARd way
Okta's stock plunged more than 10% Friday after the identity management company disclosed that hackers had access to its customer support system for weeks before it was able to correct the problem.
Like many customer support teams, Okta asks customers to upload a HAR file, which records their browser's network activity, including login cookies and session tokens, to help it troubleshoot problems with its service. In this case, someone was able to access that customer-support system after stealing a valid login credential and used the HAR files to try to gain control of a customer's Okta environment by posing as customer-support agents.
BeyondTrust disclosed the problem to Okta on Oct. 3, according to Brian Krebs, but it took several weeks for Okta to identify the issue. The company emphasized that there was no breach of its actual product, but it's not clear how long the attackers were able to impersonate Okta employees and gain access to customer systems.
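For readers wondering why a HAR file is worth stealing: it is a JSON log of every request and response the browser made, headers and cookies included. The sketch below shows one way to blank out the credential-bearing fields before sharing a file. It is not Okta's tooling or any vendor's recommended procedure; the function name, the list of header names, and the sample data are all hypothetical, and it does not touch tokens that may appear in URLs or request bodies.

```python
import json

# Header names that commonly carry credentials (illustrative, not exhaustive).
SENSITIVE_HEADERS = {"authorization", "cookie", "set-cookie"}


def redact_har(har):
    """Replace credential-bearing header and cookie values in a HAR dict."""
    for entry in har.get("log", {}).get("entries", []):
        for side in ("request", "response"):
            message = entry.get(side, {})
            for header in message.get("headers", []):
                if header.get("name", "").lower() in SENSITIVE_HEADERS:
                    header["value"] = "REDACTED"
            for cookie in message.get("cookies", []):
                cookie["value"] = "REDACTED"
    return har


# A tiny made-up HAR fragment; real files contain many more fields.
sample = {
    "log": {
        "entries": [
            {
                "request": {
                    "url": "https://example.com/app",
                    "headers": [
                        {"name": "Cookie", "value": "sid=abc123"},
                        {"name": "Accept", "value": "text/html"},
                    ],
                    "cookies": [{"name": "sid", "value": "abc123"}],
                },
                "response": {
                    "headers": [{"name": "Set-Cookie", "value": "sid=def456"}],
                    "cookies": [{"name": "sid", "value": "def456"}],
                },
            }
        ]
    }
}

cleaned = redact_har(sample)
print(json.dumps(cleaned, indent=2))
```

An unredacted session cookie like the made-up `sid` above is the kind of value that can let whoever holds the file act as the logged-in user until the session expires.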
Quote of the week
"I think the opportunity that the market has is if you look at the size and growth of unstructured data within an enterprise — I've seen analyst reports saying that it's growing at 3x the pace of any other type of data — the volumes of unstructured data that with some of this new technology can be made harvestable is massive." - Michael Gilfix, chief product and engineering officer for KX, on the opportunity for vector databases in the generative AI boom.
The Runtime roundup
IBM has developed an AI chip architecture that combines processing and memory on a single die, and while it's not powerful enough for today's AI applications, researchers think it could serve as a blueprint for future designs.
One of my favorite things about the generative AI models taking over the world is how bad they are at math. Gary Marcus explains why.
Thanks for reading — see you Tuesday!