Google DeepMind’s introduction of Gemma 4 12B suggests that agentic AI is shifting toward local PCs and away from the cloud. It also underscores that the AI race is still in its early stages and that the market’s direction remains fluid.
Google’s AI lab released the open source model on June 3. It is the latest in the Gemma 4 family of models Google introduced in April. Gemma 4 12B is released under the Apache 2.0 license, which gives enterprise developers the flexibility and control to use, modify and deploy the model commercially without licensing barriers. Google DeepMind also made Gemma 4 encoder-free, meaning that images and video are fed directly into the multimodal model rather than passing through a separate encoder first.
“It helps with the model running very efficiently on limited resources,” said Lian Jye Su, an analyst at Omdia, a division of Informa TechTarget.
A Bigger Trend
Google DeepMind launched Gemma 4 a day after Microsoft introduced its new Aion line of models on its Surface RTX Spark Dev Box computer. The models follow a recent trend among cloud providers of helping enterprise developers run workloads locally on edge devices rather than in the cloud. That lets developers and other users run agentic workloads without paying for per-token processing in the cloud. Instead, they pay a one-time cost for the model itself or, in the case of Gemma 4 12B, download it for free; the model is available at no charge to any user with a 16GB laptop.
In addition to its low cost for developers, Gemma 4 12B is an example of how models for edge devices such as laptops are becoming much smarter, Su said.
“They used to be rather difficult to deploy,” he said. He added that Gemma 4’s benefits include the ability to draw insights from multiple data sources, such as vision, audio or sensor data.
“That’s very powerful to enhance UX and provide richer response and feedback to the user, which means enterprises can deploy smarter devices,” Su continued.
He added that it will be interesting to see how the model supports AI agents, because it’s a small model, while agentic reasoning processes usually require models with larger parameter counts.
“You need a reasonably sized model because you do want the model to have enough knowledge,” Su said. He added that while smaller models can manage agentic workloads, those workloads are task-specific rather than complex.
Google DeepMind also released its Skills Repository, a library of skills for developers to build with.
Enterprise developers who want to try the model can do so in the Google AI Edge Gallery app, LM Studio and the Google AI Edge Eloquent app.

