Why enterprise AI is at an inflection point: Lessons from Google’s Jason Gelman

The enterprise AI landscape is evolving fast. Generative models are getting cheaper, more powerful, and easier to deploy — but trust, reliability, and real-world utility are becoming harder to maintain. As AI shifts from experimentation to infrastructure, the companies that succeed will be the ones that treat it not as a shiny feature but as a system.
Jason Gelman, who leads product at Google Cloud for Vertex AI, has been deep in that system for years. In a conversation with LinkedIn’s Tanya Dua, he shared where the market is heading — and what enterprises (and builders) need to do to keep up.
Key takeaways
- AI is moving from experimentation to infrastructure, and companies that focus on reliability, integration, and observability will pull ahead.
- Stateless, API-native AI architectures are lowering barriers to adoption by eliminating the need for full cloud migrations.
- The strongest AI use cases are tightly focused, solving specific, high-value problems rather than aiming for broad generalization.
- In a fragmented model landscape, trust, cost-efficiency, and real-world performance are becoming the key differentiators.
These insights came from our ScaleUp:AI event in November 2024, an industry-leading global conference that features topics across technologies and industries.
The shift from experimentation to orchestration
“Before OpenAI had released any models, before Microsoft was using them at scale, and before AWS started hosting LLMs — we had the whole stack in place,” Gelman noted. That stack included Google’s TPUs, the custom silicon the company has been developing for over a decade, as well as early internal deployments that helped shape what eventually became Vertex AI.
The message is clear: Infrastructure still matters. While much of the AI conversation focuses on model outputs and applications, Gelman emphasized that success in this space is increasingly defined by integration, reliability, and scale. “We’ve really battle-tested that part of the stack in a way nobody else did,” he added.
That depth gives Google a unique position, but more broadly, it underscores a key shift in the AI adoption curve: companies are moving from stitching tools together to demanding cohesion, observability, and accountability from day one.
“Cloud and AI may be even a little bit more divorced than they used to be”
One of the most striking insights Gelman shared was how the relationship between AI and the cloud has changed. “Cloud and AI may be even a little bit more divorced than they used to be,” he said. “Now you just use an API…and that API is stateless. You don’t have to move your data to the cloud.”
This is a major unlock for companies, especially mid-sized enterprises or data-sensitive sectors, that haven’t completed (or don’t want to begin) massive cloud migrations. With stateless, API-first architectures, AI adoption no longer requires a full replatforming effort.
This change lowers the activation energy for innovation. Enterprises can now build with AI in a modular way, targeting specific problems, without uprooting existing infrastructure.
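To make the stateless pattern concrete, here is a minimal sketch of what such an API call can look like: every request carries its own context and credentials, and nothing about the enterprise’s data has to live in the provider’s cloud between calls. The endpoint URL, payload fields, and model name below are illustrative placeholders, not any specific vendor’s schema.

```python
import os
import requests

# Hypothetical hosted-model endpoint; substitute your provider's real URL and schema.
ENDPOINT = "https://api.example-llm-provider.com/v1/generate"

def ask_model(question: str, context: str) -> str:
    """One self-contained, stateless request: all context travels with the call,
    so no data migration or server-side session is required."""
    payload = {
        "model": "hosted-model-name",                      # placeholder
        "input": f"{context}\n\nQuestion: {question}",
        "max_output_tokens": 512,
    }
    headers = {"Authorization": f"Bearer {os.environ['LLM_API_KEY']}"}
    resp = requests.post(ENDPOINT, json=payload, headers=headers, timeout=30)
    resp.raise_for_status()
    return resp.json()["output"]                           # placeholder response field

if __name__ == "__main__":
    # Context is pulled from on-prem systems at call time and discarded afterwards.
    local_context = "Ticket 1042: payment gateway latency spike, opened last week."
    print(ask_model("Summarize the open incidents for the payments team.", local_context))
```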
Why the best use cases are often the most focused
When asked about standout enterprise applications, Gelman pointed to Snapchat’s deployment of Gemini in its My AI chatbot. “Their user base tends to skew younger… and we saw much higher engagement once there was a real purpose for the chatbot,” he explained. In this case, the purpose was homework help—a use case tailored to the audience and grounded in real utility.
On the other end of the spectrum, Mayo Clinic is using Vertex AI to process petabytes of medical research, helping oncologists and other specialists keep pace with fast-moving scientific developments. “I have a family member who’s an oncologist, and he says his team of doctors can’t figure out a care plan that really is up to date with the latest science just from memory anymore. There are too many papers coming out,” Gelman said. “Our models help process that sheer amount of data.”
In both cases, the common thread is focus. These applications aren’t trying to do everything — they’re solving high-friction problems in high-value contexts.
Flexibility is powerful, but model overload is real
One of Vertex AI’s key selling points is its model-agnostic architecture. Enterprises can choose from Google’s first-party Gemini models, open-source options like Llama, or partner models from Anthropic and others. But that flexibility creates a new challenge: How to decide?
To solve this, Google is rolling out tools that help enterprises evaluate models not just on benchmarks, but on real-world performance for their own use cases.
These tools are increasingly critical. As the market fragments, performance parity across similar-sized models will become more common. Differentiation will depend on understanding what works best where, and being able to measure that clearly.
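That kind of measurement can start simple. Below is a hypothetical sketch of a task-specific evaluation harness: the same test cases run through each candidate model (wrapped behind a plain callable, whatever SDK actually sits underneath) and scored on a crude keyword-overlap metric. Real deployments would use graded labels or human review, but the structure is the same.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalCase:
    prompt: str
    expected_keywords: list[str]   # crude relevance proxy; use graded labels in practice

def keyword_score(output: str, case: EvalCase) -> float:
    """Fraction of expected keywords that appear in the model's output."""
    hits = sum(1 for kw in case.expected_keywords if kw.lower() in output.lower())
    return hits / len(case.expected_keywords)

def evaluate(models: dict[str, Callable[[str], str]],
             cases: list[EvalCase]) -> dict[str, float]:
    """Average task-specific score per model; each entry in `models` is any
    callable that wraps a provider's API (first-party, open-source, or partner)."""
    return {
        name: sum(keyword_score(call(c.prompt), c) for c in cases) / len(cases)
        for name, call in models.items()
    }

# Usage with stub callables, just to show the shape of the comparison:
cases = [EvalCase("List two side effects reported in the attached trial summary.",
                  ["nausea", "fatigue"])]
print(evaluate({"model-a": lambda p: "Nausea and fatigue were reported.",
                "model-b": lambda p: "No data available."}, cases))
# -> {'model-a': 1.0, 'model-b': 0.0}
```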
The agent era: long-term potential, short-term caution
Agentic AI — autonomous systems that can take action on your behalf — is the hottest topic in the space. But Gelman didn’t hesitate to challenge the buzz. “Overhyped,” he said plainly, referring to the popular vision of an AI agent that books flights or manages your calendar. “That scenario…we’re not there yet.”
He explained that while the ambition is sound, the infrastructure is not. Agents need persistent context, task planning abilities, and robust observability frameworks. “The key to agents is really having a task and being able to break that task into planning steps,” he said. “But right now, those models are still a frozen piece of code.”
That doesn’t mean Google isn’t investing. “We’re working on both the infrastructure for the guardrails and the monitoring, and also on the research around model planning,” he added. However, for now, enterprises are better off building robust workflows around narrow automation than relying entirely on general-purpose agents.
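As a rough illustration of the “break the task into planning steps” idea (not Google’s implementation), here is a hypothetical plan-then-execute loop with two of the safeguards Gelman mentions: an approval check before any action is taken and a log for observability. The helper names and stub components are assumptions for the sketch.

```python
from typing import Callable

def plan(task: str, llm: Callable[[str], str]) -> list[str]:
    """Ask the model to decompose the task into discrete steps (one per line)."""
    raw = llm(f"Break this task into short, numbered steps:\n{task}")
    return [line.split(".", 1)[-1].strip() for line in raw.splitlines() if line.strip()]

def run_agent(task: str, llm: Callable[[str], str],
              tools: dict[str, Callable[[str], str]],
              approve: Callable[[str], bool]) -> list[str]:
    """Plan first, then execute each step through a whitelisted tool,
    with a guardrail check before every action and a log for observability."""
    log = []
    for step in plan(task, llm):
        if not approve(step):                    # guardrail: skip anything not approved
            log.append(f"SKIPPED: {step}")
            continue
        tool_name = llm(f"Which tool from {list(tools)} fits this step: {step}? "
                        "Reply with the name only.").strip()
        result = tools.get(tool_name, lambda s: "no matching tool")(step)
        log.append(f"{step} -> {tool_name}: {result}")
    return log

# Usage with stub components, just to show the control flow:
stub_llm = lambda prompt: "1. look up flight options" if "numbered steps" in prompt else "search"
tools = {"search": lambda step: f"(pretend results for: {step})"}
print(run_agent("Book a flight to Boston", stub_llm, tools, approve=lambda step: True))
```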
Cost, hallucinations, and trust
While much attention is paid to what AI can do, Gelman highlighted the real-world constraints that still frustrate customers. “We’ve brought [the cost of AI] down by something like 92% this year,” he said. Still, for some use cases, cost remains a blocker. For others, accuracy is the top concern.
He was candid about hallucinations, too: “Hallucination is probably a technical shortcoming more than a pure ethical concern.” While accuracy is improving, no model is perfect. The takeaway? Responsible deployment requires acknowledging fallibility and designing workflows that accommodate it.
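In practice, “designing workflows that accommodate it” often means gating model output before it reaches users. The sketch below is one hypothetical version of such a gate: a crude grounding check against retrieved source documents, with anything unsupported routed to human review rather than shipped.

```python
def grounded_enough(answer: str, sources: list[str], min_overlap: float = 0.6) -> bool:
    """Crude grounding check: the share of the answer's longer words that also
    appear in the retrieved sources. Real systems use citation or entailment
    checks; this only illustrates the gate."""
    words = {w.lower().strip(".,") for w in answer.split() if len(w) > 4}
    if not words:
        return False
    source_text = " ".join(sources).lower()
    return sum(1 for w in words if w in source_text) / len(words) >= min_overlap

def deliver(answer: str, sources: list[str]) -> str:
    # Auto-accept only well-supported answers; route the rest to a person
    # rather than shipping a possible hallucination.
    return answer if grounded_enough(answer, sources) else "ROUTE_TO_HUMAN_REVIEW"

print(deliver("The trial reported nausea in 12% of patients.",
              ["The trial reported nausea in 12% of enrolled patients."]))
# Well supported, so the answer is returned as-is.
```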
Small models, multimodal systems, and introspective AI
As for what’s next, Gelman sees two clear trends: Small models getting dramatically better, and large models becoming more reflective. “Smaller models today are doing as well as state-of-the-art models were 12 to 18 months ago,” he said. These models are faster and cheaper, and in many cases, good enough.
At the same time, Google is exploring “runtime inference introspection” — models that pause, review, and refine their own outputs. That ability to reason, reflect, and course-correct may unlock a new generation of enterprise applications, especially when paired with multimodal inputs.
Tools like NotebookLM are early signs of that future. “It asked questions I didn’t even know I was supposed to ask,” Gelman said of his own experience using it.
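Even before models introspect natively, teams can approximate the pattern at the application layer. Here is a hedged sketch of a draft-critique-revise loop; the prompts and the generate_with_review helper are illustrative assumptions, not a specific product feature.

```python
from typing import Callable

def generate_with_review(prompt: str, llm: Callable[[str], str], max_passes: int = 2) -> str:
    """Draft, self-critique, revise: an application-layer approximation of
    'runtime inference introspection'. `llm` is any text-in, text-out wrapper."""
    draft = llm(prompt)
    for _ in range(max_passes):
        critique = llm("List factual or logical problems with this answer, "
                       f"or reply OK if there are none:\n{draft}")
        if critique.strip().upper().startswith("OK"):
            break                      # the model found nothing to fix
        draft = llm(f"Rewrite the answer to address these problems:\n{critique}\n\n"
                    f"Original answer:\n{draft}")
    return draft
```

Any chat-completion wrapper can be passed in as llm; since each pass adds latency and cost, a loop like this is best reserved for high-stakes outputs.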
“Enterprise readiness is the floor, not the ceiling”
Gelman ended with a reminder that often gets lost in the excitement: “Google Cloud is different than Google Consumer,” he said. “We don’t see your prompts. We don’t use your data to train our models.”
That clear line between enterprise and consumer AI, combined with encryption, stateless design, and global compliance, isn’t just policy. It’s architecture.
The companies that lead in this next chapter won’t just ship fast. They’ll ship responsibly. Because as Gelman put it, “Enterprise readiness is the floor, not the ceiling.”