Paul Krill
Editor at Large

Google intros EmbeddingGemma for on-device AI

Sep 8, 2025 | 2 mins

The multilingual text embedding model, suitable for retrieval-augmented generation and semantic search, runs in less than 200MB of RAM with quantization.

With the introduction of EmbeddingGemma, Google is providing a multilingual text embedding model designed to run directly on mobile phones, laptops, and other edge devices for mobile-first generative AI.

Unveiled September 4, EmbeddingGemma features a 308 million parameter design that enables developers to build applications using techniques such as RAG (retrieval-augmented generation) and semantic search that will run directly on the targeted hardware, Google explained. Based on the Gemma 3 lightweight model architecture, EmbeddingGemma is trained on more than 100 languages and is small enough to run in less than 200MB of RAM with quantization. Output dimensions are customizable, ranging from 768 down to 128 via Matryoshka representation, and the model has a 2K token context window.
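To illustrate what customizable output dimensions can mean in practice, here is a minimal Python sketch of Matryoshka-style truncation: keep the leading components of a full-size vector, then re-normalize to unit length. It is not Google's implementation. The function names are hypothetical, the vectors are random stand-ins rather than real model output, and the intermediate sizes 512 and 256 are chosen only for illustration.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def truncate_embedding(vec, dim):
    """Keep the first `dim` components, then re-normalize (hypothetical helper)."""
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    return l2_normalize(vec[:dim])


def cosine_similarity(a, b):
    """Dot product; equals cosine similarity when both inputs are unit length."""
    return sum(x * y for x, y in zip(a, b))


# Synthetic 768-dimensional stand-ins. These are NOT real model embeddings.
rng = random.Random(0)
full_a = l2_normalize([rng.gauss(0, 1) for _ in range(768)])
full_b = l2_normalize([rng.gauss(0, 1) for _ in range(768)])

for dim in (768, 512, 256, 128):
    a = truncate_embedding(full_a, dim)
    b = truncate_embedding(full_b, dim)
    print(dim, len(a), round(cosine_similarity(a, b), 4))
```

Smaller vectors cut storage and comparison cost roughly in proportion to their length. With a model trained for Matryoshka representation, the leading components are meant to carry most of the signal, a property these random vectors do not have.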

EmbeddingGemma empowers developers to build on-device, flexible, privacy-centric applications, according to Google. Model weights for EmbeddingGemma can be downloaded from Hugging Face, Kaggle, and Vertex AI. By working with the Gemma 3n model, EmbeddingGemma can unlock new use cases for mobile RAG pipelines, semantic search, and more, Google said. EmbeddingGemma works with tools such as sentence-transformers, llama.cpp, MLX, Ollama, LiteRT, transformers.js, LMStudio, Weaviate, Cloudflare, LlamaIndex, and LangChain. Documentation for EmbeddingGemma can be found at ai.google.dev.
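As a rough picture of the retrieval step in a semantic search or RAG pipeline, the Python sketch below ranks stored document vectors against a query vector by cosine similarity and returns the best matches. It does not call any of the tools listed above. The document IDs, the hand-written four-dimensional vectors, and the `top_k` helper are all hypothetical. In a real application the vectors would come from the embedding model, and the retrieved text would be passed to a generative model.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return the k (doc_id, score) pairs most similar to the query (hypothetical helper)."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors written by hand for illustration. These are NOT real embeddings.
doc_vecs = {
    "battery_tips": [0.9, 0.1, 0.0, 0.1],
    "wifi_setup": [0.1, 0.9, 0.1, 0.0],
    "camera_modes": [0.0, 0.1, 0.9, 0.2],
}
query_vec = [0.8, 0.2, 0.1, 0.0]

results = top_k(query_vec, doc_vecs, k=2)
for doc_id, score in results:
    print(doc_id, round(score, 3))
```

A linear scan like this is adequate for the small collections typical of on-device use. Larger collections would usually call for a vector index.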

Paul Krill is editor at large at InfoWorld. Paul has been covering computer technology as a news and feature reporter for more than 35 years, including 30 years at InfoWorld. He has specialized in coverage of software development tools and technologies since the 1990s, and he continues to lead InfoWorld’s news coverage of software development platforms including Java and .NET and programming languages including JavaScript, TypeScript, PHP, Python, Ruby, Rust, and Go. Long trusted as a reporter who prioritizes accuracy, integrity, and the best interests of readers, Paul is sought out by technology companies and industry organizations who want to reach InfoWorld’s audience of software developers and other information technology professionals. Paul has won a “Best Technology News Coverage” award from IDG.
