Google's strongest challenge to OpenAI arrives
Welcome to Runtime! Today: Google rolls out its most powerful LLM to date, U.S. companies form a task force to endlessly debate AI risks, and this week's enterprise moves.
It takes two
After getting caught flat-footed by the introduction of OpenAI's ChatGPT in November 2022 and GPT-4 a little less than a year ago, Google released its true answer to those groundbreaking AI models Thursday. Enterprise customers will have to wait a little longer to get their hands on its most powerful model to date, however, and it's unlikely that OpenAI is standing still.
Gemini Ultra was first unveiled in December alongside Gemini Pro, which was comparable to GPT-3.5. In true Google fashion, the launch of Ultra will also usher in new branding for Google Cloud products that have been generally available for less than a year.
- Duet AI was the brand name for Google Cloud's answer to Microsoft's Copilot AI assistants for work, but the versions of Duet AI for both Workspace and Google Cloud will now be known as Gemini for Workspace and Gemini, respectively.
- As my former colleague David Pierce put it, "Google is famous for having a million similar products with confusingly different names and seemingly nothing in common," but at least it chose to standardize around one halo brand (for now) in its generative AI era.
- That doesn't mean it isn't still a little confusing, however, as the consumer products built around Ultra will actually be known as Gemini Advanced.
- "While today is about Gemini Advanced and its new capabilities, next week we'll share more details on what's coming for developers and Cloud customers," Google CEO Sundar Pichai said in a blog post.
So how does the new model stack up against OpenAI and the rest of the LLM community?
- "The largest model Ultra 1.0 is the first to outperform human experts on MMLU (massive multitask language understanding), which uses a combination of 57 subjects — including math, physics, history, law, medicine and ethics — to test knowledge and problem-solving abilities," Pichai said in the blog post.
- Benchmarks aren't the best way to evaluate the real-world performance of anything, but they're a start until more people can put the model through its paces.
- "Let me start with the headline: Gemini Advanced is clearly a GPT-4 class model. ...for the first time since ChatGPT’s release, there is another company with an LLM that can compete with Open AI’s most advanced model," wrote Ethan Mollick, a professor at the University of Pennsylvania who was given a month to play with Gemini Advanced.
There will be a sorting-out period over the next year or so as the market evaluates the proliferation of models released since OpenAI kicked off the generative AI boom, but there will always be bragging rights at the top. Still, we don't really know if badass performance will translate to a long-term advantage among cloud providers.
- The question on the mind of AWS customers (and Amazon shareholders), of course, is whether or not the cloud leader needs to have a homegrown LLM of its own that can challenge Gemini and GPT-4, or if it can get away with its traditional retail-oriented philosophy of offering the easiest access to a wide variety of AI models across different levels of performance.
- The early results for Amazon Titan were not great, but AWS offers a ton of "good enough" cloud infrastructure services to its customers alongside market-leading services from partners and even competitors, just like a grocery store sells its own brands underneath the top-shelf stuff.
- Microsoft has acknowledged the need for smaller and cheaper models that serve a wider variety of real-world use cases, despite enjoying a clear first-mover advantage through its exclusive ability to offer OpenAI's technology to enterprise customers.
- There are few constants in the tech business, but history has shown that the best technology isn't always the winner in the end.
Safety in numbers
More than 200 companies, including every company mentioned in this newsletter so far, joined a new consortium Thursday orchestrated by the U.S. Department of Commerce that will focus on developing generative AI safely.
"The U.S. government has a significant role to play in setting the standards and developing the tools we need to mitigate the risks and harness the immense potential of artificial intelligence," Commerce Secretary Gina Raimondo said in a statement to Reuters. The companies will collaborate on ways to enforce the Biden administration's executive order on AI released last year.
It's a little unclear how 200 companies are going to muster consensus on anything meaningful, given how tech companies have approached standards-setting practices in the past. The group will work under NIST's U.S. AI Safety Institute, which Raimondo announced Wednesday would be led by Elizabeth Kelly, who played a key role in the development of the AI executive order.
Enterprise moves
Danny Allan is the new CTO at Snyk, joining the developer security company after serving in the same role at Veeam.
Mark Anderson is the new president of revenue at Cloudflare, taking over for Marc Boroditsky after serving on Cloudflare's board of directors since 2019.
Don Wight is the new chief revenue officer at Simpplr, following a stint as chief sales officer at PAR Technology.
Abe Smith is the new chief of global field operations at Freshworks, joining the company from Zoom where he was head of international.
Oliver Asmus is the new field CTO at DAS42, joining the data consultancy company from Slalom.
The Runtime roundup
Cloudflare beat Wall Street expectations for revenue and earnings and enjoyed a 21% bump in its stock price during after-hours trading.
OpenAI is working on "agent" software that could automate a number of different work-related tasks on devices, according to The Information.
Software companies that do business with the U.S. government are pushing back on some of its new cybersecurity policies, notably a provision that would let the government demand "full access" to their systems after a breach.
Cisco and Nvidia will collaborate on hardware and software that will elevate the role of Ethernet in the AI data center. Ethernet is far more widely used overall than InfiniBand, the technology used to connect lots of AI clusters.
Thanks for reading — see you Saturday!