Gemma 4 Is Here — And It's Finally Apache 2.0
Google dropped Gemma 4 in four sizes (2B/4B/26B/31B), 140 languages, 256k context — all under Apache 2.0. The license gloves are finally off.

256,000 tokens
That's the context window on Gemma 4's top-tier 31B model. Long enough to swallow a full novel and still follow the plot. But the real story isn't the number.
On April 2, Google released Gemma 4 under a true Apache 2.0 license. For the first time, the weird "Gemma Terms of Use" that shackled the first three generations is gone. Commercial deployment, modification, redistribution, and even selling fine-tuned derivatives — all clear. The open-source crowd has been asking for this since day one.
Here's the deal
Gemma has always sat in an awkward spot. Google called it "open weight," but the license fine print was thorny. The "Gemma Prohibited Use Policy" could be updated unilaterally, and anything you built on top inherited the same restrictions. Your fine-tune, your distillation, your pruned variant — all still governed by Google.
That pushed the developer community into two camps. One side ran Llama (Meta Community License, with the 700M-MAU carve-out for mega-corps). The other ran the permissively licensed crowd: Mistral and Qwen under Apache 2.0, DeepSeek under MIT. Gemma floated in between, nobody's first pick when legal review mattered.
| Generation | Released | Max size | Context | License |
|---|---|---|---|---|
| Gemma 1 | Feb 2024 | 7B | 8k | Gemma Terms |
| Gemma 2 | Jun 2024 | 27B | 8k | Gemma Terms |
| Gemma 3 | Mar 2025 | 27B | 128k | Gemma Terms |
| Gemma 4 | Apr 2026 | 31B | 256k | Apache 2.0 |
Fourth generation, and the license friction is finally gone. If that sounds abstract, picture every startup that wanted to ship a Gemma fine-tune inside a commercial product and had legal slow-walk the review. That era just ended.
The breakdown
Four sizes, one architecture
Gemma 4 ships in 2B, 4B, 26B, and 31B. Notably absent: a mid-tier 9B or 12B. Google's engineering blog explained the reasoning.
Sharing one architecture and training recipe across sizes means developers can prototype locally at 2B and predict the performance profile when they scale to 31B in production.
2B runs on a Raspberry Pi or an M1 MacBook Air. 4B targets mobile. 26B and 31B fit on a single H100 or H200 for full-parameter serving. The wide spacing isn't accidental — it's Google telling you to pick the size that matches your hardware reality, not the one you wish you had.
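If you want to kick the tires, a minimal local-prototyping sketch with Hugging Face transformers looks like this. The checkpoint ID google/gemma-4-2b-it is an assumption, not a confirmed name; swap in whatever Google actually publishes on the Hub.

```python
# Minimal local-prototyping sketch. The checkpoint name "google/gemma-4-2b-it"
# is an assumption -- substitute the ID Google actually publishes.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-4-2b-it"  # hypothetical 2B instruction-tuned checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # roughly 4 GB of weights at 2B parameters
    device_map="auto",            # CPU, Apple Silicon (mps), or a small GPU
)

prompt = "Summarize the Apache 2.0 license in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Swapping model_id for the 31B checkpoint keeps every other line identical, which is exactly the prototype-small, deploy-big workflow Google is pitching.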
140 languages, finally
Gemma 3 was English-first with a polite nod to about 40 additional languages. Gemma 4 was trained across 140 languages, and non-English MMLU scores jumped an average of 18% over the previous generation, per Google's own docs. Korean, Japanese, Vietnamese, Arabic, Swahili — the coverage went wide, not just deeper in European languages.
This matters more than benchmark-chasers realize. Running a Korean RAG pipeline on Gemma 3 meant burning roughly 1.6x as many tokens as the equivalent English text; Gemma 4 narrows that to about 1.2x. On a monthly inference bill, that delta is the difference between launching the feature and killing it.
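You can sanity-check the ratio on your own text with a quick tokenizer-fertility comparison. The Gemma 4 checkpoint name below is an assumption, and the 1.6x/1.2x figures above come from Google's docs, not from this script:

```python
# Back-of-the-envelope tokenizer-fertility check: how many tokens the same
# Korean sentence costs versus its English equivalent.
from transformers import AutoTokenizer

korean = "오늘 아침 회의록을 세 문장으로 요약해 주세요."
english = "Please summarize this morning's meeting notes in three sentences."

# First ID is a real Gemma 3 checkpoint; the second is a hypothetical Gemma 4 one.
for model_id in ("google/gemma-3-27b-it", "google/gemma-4-26b-it"):
    tok = AutoTokenizer.from_pretrained(model_id)
    ko, en = len(tok.encode(korean)), len(tok.encode(english))
    print(f"{model_id}: ko={ko} tokens, en={en} tokens, ratio={ko / en:.2f}")
```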
256k context across the big models
Small models usually get the short end of the context stick. GPT-4o mini sits at 128k, and Gemma 4's 2B tops out there too. But the 26B and 31B variants both go to 256k, among the longest windows you'll find in an open-weight model at that parameter count (a serving sketch follows the table below).
| Model | Params | Context | License |
|---|---|---|---|
| Gemma 4 31B | 31B | 256k | Apache 2.0 |
| Qwen 3.6 Plus | 72B | 131k | Apache 2.0 |
| Llama 4 Scout | 17B active × 16 experts (MoE) | ~100k effective | Meta Community |
| gpt-oss-120b | 120B | 128k | Apache 2.0 |
| Mistral Small 4 | 22B | 128k | Apache 2.0 |
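Putting the 256k window to work is mostly a serving-config question. Here's a minimal long-context sketch with vLLM; the checkpoint name is an assumption, and a 31B model with a full 256k KV cache will realistically want an 80 GB-class GPU, quantization, or KV-cache offload:

```python
# Long-context serving sketch with vLLM. The model ID is an assumption.
from vllm import LLM, SamplingParams

llm = LLM(
    model="google/gemma-4-31b-it",   # hypothetical checkpoint name
    max_model_len=262144,            # the 256k context window
    gpu_memory_utilization=0.95,
)

# Placeholder path: any long document you want to query against.
with open("whole_novel.txt") as f:
    novel = f.read()

params = SamplingParams(max_tokens=256, temperature=0.2)
out = llm.generate([f"{novel}\n\nWho betrays whom in chapter 12?"], params)
print(out[0].outputs[0].text)
```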
The bigger picture
The open-weight LLM race in April 2026 is a six-way fight: Google (Gemma 4), Alibaba (Qwen 3.6 Plus), Meta (Llama 4), Mistral (Small 4), OpenAI (gpt-oss-120b), and Zhipu AI (GLM-5). Five of them ship under Apache 2.0 or MIT. Only Meta still holds the line on its custom community license.
So why did Google finally cave on licensing? Three forces converged.
First, Qwen pressure. Alibaba's Qwen line spans 0.8B to 397B — the widest range in the field — all under Apache 2.0. It wins or ties on five of eight major coding benchmarks, including LiveCodeBench and SWE-bench. Google knew Gemma was getting outcoded, and they weren't about to lose on licensing too.
Second, OpenAI's gpt-oss move. When OpenAI shipped gpt-oss-120b under Apache 2.0 last year, it set a new floor: if the closed-source king can ship a fully permissive open-weight branch, every other lab has to match. Google running a less-permissive license on its "open" tier just looked stingy by comparison.
Third, enterprise demand. AWS Bedrock, Vertex AI, and Azure AI Foundry all host open-weight models now, and cloud buyers want something legal can rubber-stamp without a two-week review. Apache 2.0 is effectively the industry default — nothing else clears procurement that fast.
What actually changes
Three things shift immediately for builders.
You can ship Gemma 4 fine-tunes inside commercial products without license gymnastics. Upload to Hugging Face, wire into your SaaS backend, run on-prem — same rules everywhere. No more "let me check with legal" on every deploy.
Non-English RAG gets meaningfully cheaper. Korean, Japanese, and Arabic pipelines that were bleeding tokens on Gemma 3 should see tangible savings on Gemma 4. And cost aside, engineer time is the most expensive resource; less friction in the pipeline tends to compound into shipping velocity.
Local agent workflows get real. 2B on edge devices, 31B on a single workstation GPU, 256k context for full codebases and long conversations — you can stand up offline agent loops without hitting an API. That's a category that was almost viable; now it is.
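As a sketch of what that loop looks like, here's a minimal offline agent turn against a locally served Gemma 4 through an OpenAI-compatible endpoint (vLLM, llama.cpp server, Ollama, and the like all expose one). The base_url and model name are assumptions about your local setup:

```python
# Offline agent-loop sketch against a locally served Gemma 4 via an
# OpenAI-compatible endpoint. base_url and model name are assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

messages = [{"role": "system", "content": "You are a code-review agent. Be terse."}]

def step(user_msg: str) -> str:
    """One turn of the loop: append the user message, get a local completion."""
    messages.append({"role": "user", "content": user_msg})
    reply = client.chat.completions.create(
        model="gemma-4-31b-it",  # whatever name your local server registers
        messages=messages,
        max_tokens=512,
    )
    text = reply.choices[0].message.content
    messages.append({"role": "assistant", "content": text})
    return text

print(step("Review the diff in utils.py for off-by-one errors."))
```

No API key, no egress, and with 256k of context the whole conversation plus the codebase can stay in the window.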
Gemma 4 isn't beating Gemini 3.1 Pro or GPT-5.4. That's not the point. It's claiming the "legally clean, best-in-class open weight" slot, and that slot is worth more to most teams than another point on a frontier benchmark. For context on the broader open-weight landscape, see our recent piece on Alibaba's Qwen 3.6 Plus agentic update.