Gemma 4 Is Here — And It's Finally Apache 2.0
Google dropped Gemma 4 in four sizes (2B/4B/26B/31B), 140 languages, 256k context — all under Apache 2.0. The license gloves are finally off.

256,000 tokens
That's the context window on Gemma 4's top-tier 31B model. Long enough to swallow a full novel and still follow the plot. But the real story isn't the number.
On April 2, Google released Gemma 4 under a true Apache 2.0 license. For the first time, the weird "Gemma Terms of Use" that shackled the first three generations is gone. Commercial deployment, modification, redistribution, and even selling fine-tuned derivatives — all clear. The open-source crowd has been asking for this since day one.
Here's the deal
Gemma has always sat in an awkward spot. Google called it "open weight," but the license fine print was thorny. The "Gemma Prohibited Use Policy" could be updated unilaterally, and anything you built on top inherited the same restrictions. Your fine-tune, your distillation, your pruned variant — all still governed by Google.
That pushed the developer community into two camps. One side ran Llama (Meta Community License, with the 700M-MAU carve-out for mega-corps). The other ran the permissively licensed crowd: Mistral and Qwen under Apache 2.0, DeepSeek under MIT. Gemma floated in between, nobody's first pick when legal review mattered.
| Generation | Released | Max size | Context | License |
|---|---|---|---|---|
| Gemma 1 | Feb 2024 | 7B | 8k | Gemma Terms |
| Gemma 2 | Jun 2024 | 27B | 8k | Gemma Terms |
| Gemma 3 | Mar 2025 | 27B | 128k | Gemma Terms |
| Gemma 4 | Apr 2026 | 31B | 256k | Apache 2.0 |
Fourth generation, and the license friction is finally gone. If that sounds abstract, picture every startup that wanted to ship a Gemma fine-tune inside a commercial product and had legal slow-walk the review. That era just ended.
The breakdown
Four sizes, one architecture
Gemma 4 ships in 2B, 4B, 26B, and 31B. Notably absent: a mid-tier 9B or 12B. Google's engineering blog explained the reasoning.
Sharing one architecture and training recipe across sizes means developers can prototype locally at 2B and predict the performance profile when they scale to 31B in production.
2B runs on a Raspberry Pi or an M1 MacBook Air. 4B targets mobile. 26B and 31B fit on a single H100 or H200 for full-parameter serving. The wide spacing isn't accidental — it's Google telling you to pick the size that matches your hardware reality, not the one you wish you had.
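If you want to kick the tires, a minimal local-prototyping sketch with Hugging Face transformers looks like this. The checkpoint ID google/gemma-4-2b-it is an assumption, not a confirmed name; swap in whatever Google actually publishes on the Hub.

```python
# Minimal local-prototyping sketch. The checkpoint name "google/gemma-4-2b-it"
# is an assumption -- substitute the ID Google actually publishes.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-4-2b-it"  # hypothetical 2B instruction-tuned checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # roughly 4 GB of weights at 2B parameters
    device_map="auto",            # CPU, Apple Silicon (mps), or a small GPU
)

prompt = "Summarize the Apache 2.0 license in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Swapping model_id for the 31B checkpoint keeps every other line identical, which is exactly the prototype-small, deploy-big workflow Google is pitching.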
140 languages, finally
Gemma 3 was English-first with a polite nod to about 40 additional languages. Gemma 4 was trained across 140 languages, and non-English MMLU scores jumped an average of 18% over the previous generation, per Google's own docs. Korean, Japanese, Vietnamese, Arabic, Swahili — the coverage went wide, not just deeper in European languages.
This matters more than benchmark-chasers realize. Running a Korean RAG pipeline on Gemma 3 meant burning roughly 1.6x as many tokens as the equivalent English text; Gemma 4 narrows that to about 1.2x. On a monthly inference bill, that delta is the difference between launching the feature and killing it.
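You can sanity-check the ratio on your own text with a quick tokenizer-fertility comparison. The Gemma 4 checkpoint name below is an assumption, and the 1.6x/1.2x figures above come from Google's docs, not from this script:

```python
# Back-of-the-envelope tokenizer-fertility check: how many tokens the same
# Korean sentence costs versus its English equivalent.
from transformers import AutoTokenizer

korean = "오늘 아침 회의록을 세 문장으로 요약해 주세요."
english = "Please summarize this morning's meeting notes in three sentences."

# First ID is a real Gemma 3 checkpoint; the second is a hypothetical Gemma 4 one.
for model_id in ("google/gemma-3-27b-it", "google/gemma-4-26b-it"):
    tok = AutoTokenizer.from_pretrained(model_id)
    ko, en = len(tok.encode(korean)), len(tok.encode(english))
    print(f"{model_id}: ko={ko} tokens, en={en} tokens, ratio={ko / en:.2f}")
```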
256k context across the big models
Small models usually get the short end of the context stick. GPT-4o mini sits at 128k, and Gemma 4's 2B tops out there too. But the 26B and 31B variants both go to 256k, among the longest windows you'll find in an open-weight model at that parameter count (a serving sketch follows the table below).
| Model | Params | Context | License |
|---|---|---|---|
| Gemma 4 31B | 31B | 256k | Apache 2.0 |
| Qwen 3.6 Plus | 72B | 131k | Apache 2.0 |
| Llama 4 Scout | 17B active × 16 experts (MoE) | ~100k effective | Meta Community |
| gpt-oss-120b | 120B | 128k | Apache 2.0 |
| Mistral Small 4 | 22B | 128k | Apache 2.0 |
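Putting the 256k window to work is mostly a serving-config question. Here's a minimal long-context sketch with vLLM; the checkpoint name is an assumption, and a 31B model with a full 256k KV cache will realistically want an 80 GB-class GPU, quantization, or KV-cache offload:

```python
# Long-context serving sketch with vLLM. The model ID is an assumption.
from vllm import LLM, SamplingParams

llm = LLM(
    model="google/gemma-4-31b-it",   # hypothetical checkpoint name
    max_model_len=262144,            # the 256k context window
    gpu_memory_utilization=0.95,
)

# Placeholder path: any long document you want to query against.
with open("whole_novel.txt") as f:
    novel = f.read()

params = SamplingParams(max_tokens=256, temperature=0.2)
out = llm.generate([f"{novel}\n\nWho betrays whom in chapter 12?"], params)
print(out[0].outputs[0].text)
```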
The bigger picture
The open-weight LLM race in April 2026 is a six-way fight: Google (Gemma 4), Alibaba (Qwen 3.6 Plus), Meta (Llama 4), Mistral (Small 4), OpenAI (gpt-oss-120b), and Zhipu AI (GLM-5). Five of them ship under Apache 2.0 or MIT. Only Meta still holds the line on its custom community license.
So why did Google finally cave on licensing? Three forces converged.
First, Qwen pressure. Alibaba's Qwen line spans 0.8B to 397B — the widest range in the field — all under Apache 2.0. It wins or ties on five of eight major coding benchmarks, including LiveCodeBench and SWE-bench. Google knew Gemma was getting outcoded, and they weren't about to lose on licensing too.
Second, OpenAI's gpt-oss move. When OpenAI shipped gpt-oss-120b under Apache 2.0 last year, it set a new floor: if the closed-source king can ship a fully permissive open-weight branch, every other lab has to match. Google running a less-permissive license on its "open" tier just looked stingy by comparison.
Third, enterprise demand. AWS Bedrock, Vertex AI, and Azure AI Foundry all host open-weight models now, and cloud buyers want something legal can rubber-stamp without a two-week review. Apache 2.0 is effectively the industry default — nothing else clears procurement that fast.
What actually changes
Three things shift immediately for builders.
You can ship Gemma 4 fine-tunes inside commercial products without license gymnastics. Upload to Hugging Face, wire into your SaaS backend, run on-prem — same rules everywhere. No more "let me check with legal" on every deploy.
Non-English RAG gets meaningfully cheaper. Korean, Japanese, and Arabic pipelines that were bleeding tokens on Gemma 3 should see tangible savings on Gemma 4. And cost aside, engineer time is the most expensive resource; less friction in the pipeline tends to compound into shipping velocity.
Local agent workflows get real. 2B on edge devices, 31B on a single workstation GPU, 256k context for full codebases and long conversations — you can stand up offline agent loops without hitting an API. That's a category that was almost viable; now it is.
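As a sketch of what that loop looks like, here's a minimal offline agent turn against a locally served Gemma 4 through an OpenAI-compatible endpoint (vLLM, llama.cpp server, Ollama, and the like all expose one). The base_url and model name are assumptions about your local setup:

```python
# Offline agent-loop sketch against a locally served Gemma 4 via an
# OpenAI-compatible endpoint. base_url and model name are assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

messages = [{"role": "system", "content": "You are a code-review agent. Be terse."}]

def step(user_msg: str) -> str:
    """One turn of the loop: append the user message, get a local completion."""
    messages.append({"role": "user", "content": user_msg})
    reply = client.chat.completions.create(
        model="gemma-4-31b-it",  # whatever name your local server registers
        messages=messages,
        max_tokens=512,
    )
    text = reply.choices[0].message.content
    messages.append({"role": "assistant", "content": text})
    return text

print(step("Review the diff in utils.py for off-by-one errors."))
```

No API key, no egress, and with 256k of context the whole conversation plus the codebase can stay in the window.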
Gemma 4 isn't beating Gemini 3.1 Pro or GPT-5.4. That's not the point. It's claiming the "legally clean, best-in-class open weight" slot, and that slot is worth more to most teams than another point on a frontier benchmark. For context on the broader open-weight landscape, see our recent piece on Alibaba's Qwen 3.6 Plus agentic update.