Four seconds for an image, then straight into a video — Google just rewrote the price tag
On June 30, Google quietly dropped two things. On their own, each one is notable. Chained together, the story changes completely. The first is an image model that returns results in about four seconds and costs $0.034 (3.4 cents) per 1,000 images generated. That's a small fraction of the price of a coffee for a thousand images. The second is a video model, opened to developers through an API for the first time. It runs 10 cents per second, meaning a 10-second clip costs a dollar.
But the real headline isn't the price. It's that these two models now snap together into a pipeline. You generate an image cheaply and fast, then hand that same image straight to the video model to bring it to life. In the old workflow, image generation and video generation were separate steps with a human stitching them together in between. Now that seam collapses into a handful of API calls.
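As a rough illustration of that chaining, here is a minimal Python sketch. The two model IDs are the ones quoted in this article; everything else is an assumption. In particular, `generate_image` and `generate_video` are hypothetical adapters that you would write around whatever client you use, and they are injected as parameters so the sketch makes no claim about the real SDK surface.

```python
from dataclasses import dataclass
from typing import Callable

# Model IDs as quoted in this article.
IMAGE_MODEL = "gemini-3.1-flash-lite-image"
VIDEO_MODEL = "gemini-omni-flash-preview"


@dataclass
class PipelineResult:
    image: bytes
    video: bytes


def image_to_video(
    prompt: str,
    motion_prompt: str,
    generate_image: Callable[[str, str], bytes],
    generate_video: Callable[[str, str, bytes], bytes],
) -> PipelineResult:
    """Chain one image call into one video call.

    Both callables are hypothetical adapters (model, prompt[, reference])
    and stand in for real API calls.
    """
    image = generate_image(IMAGE_MODEL, prompt)
    # The same image bytes are handed to the video step as its reference.
    video = generate_video(VIDEO_MODEL, motion_prompt, image)
    return PipelineResult(image=image, video=video)


# Stand-in adapters so the sketch runs without network access.
def fake_image(model: str, prompt: str) -> bytes:
    return f"IMG[{model}|{prompt}]".encode()


def fake_video(model: str, prompt: str, reference: bytes) -> bytes:
    return b"VID[" + model.encode() + b"|" + prompt.encode() + b"|" + reference + b"]"


result = image_to_video("a red kite over dunes", "slow pan left", fake_image, fake_video)
```

The point of the shape is that no human step sits between the two calls: the image output is passed directly as the video input.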
The naming alone is worth a smile. "Nano Banana" sounds like a toy when you first hear it, but it's already grown into a full family of models. What launched this time is the "Lite" entry — the lightest, fastest member of that family. The heavy lifting stays with the Pro sibling, while Lite competes purely on speed and volume.
The announcement itself ran on Google's official blog, The Keyword, written by two product managers, Alisa Fortin and Anish Nangia. Posts like this tend to skip the marketing flourish and just explain what got built and why. But buried in that matter-of-fact tone are some genuinely aggressive numbers: four seconds, 3.4 cents per 1,000 images, 10 cents a second. Let's unpack what those numbers actually mean.
The players
Let's start with the headline model: Nano Banana 2 Lite, whose model ID is gemini-3.1-flash-lite-image. Google itself calls it "the fastest, most cost-efficient Gemini Image model" it has shipped. Feed it a text prompt and an image comes back in roughly four seconds, at $0.034 per 1,000 images generated. To put that in context, Google is officially recommending it as the replacement for the older Nano Banana (gemini-2.5-flash-image). In other words, this isn't a side experiment — it's the designated successor.
The second player is Gemini Omni Flash, model ID gemini-omni-flash-preview, currently in public preview. It handles high-quality video generation plus conversational editing, and this marks the first time it's been made available to developers via API. Pricing is $0.10 per second of video output — matching the existing Veo 3.1 Fast rate. Right now it tops out at 10-second generations, with longer clips promised down the line. The model itself wasn't born this week; it was first introduced at Google I/O.
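The per-second pricing is easy to sanity-check. The sketch below uses the quoted $0.10 rate and the current 10-second cap; it assumes flat per-second billing with no rounding to whole seconds, which the announcement as described here does not spell out. The function name is illustrative.

```python
PRICE_PER_SECOND_USD = 0.10  # quoted rate for Gemini Omni Flash video output
MAX_CLIP_SECONDS = 10        # current per-generation cap in the preview


def clip_cost_usd(seconds: float) -> float:
    """Cost of one generated clip, assuming flat per-second billing."""
    if seconds <= 0:
        raise ValueError("clip length must be positive")
    if seconds > MAX_CLIP_SECONDS:
        raise ValueError(f"preview currently tops out at {MAX_CLIP_SECONDS} seconds")
    return round(seconds * PRICE_PER_SECOND_USD, 2)


print(clip_cost_usd(10))  # 1.0
print(clip_cost_usd(4))   # 0.4
```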
The venue matters too. A post on The Keyword written by product managers, here Alisa Fortin and Anish Nangia, usually signals a working product spec rather than a marketing teaser. Sure enough, both models are already live in Google AI Studio, the Gemini API, and the Gemini Enterprise Agent Platform.
There's also a quieter player worth naming: SynthID. Both models embed SynthID watermarking so that anything they produce carries a marker identifying it as AI-generated. As image and video generation tools multiply, telling real from synthetic keeps getting harder, and Google has baked that distinction directly into the infrastructure from day one.
Finally, there's the consumer layer. Nano Banana 2 Lite isn't confined to the developer API — it's already wired into AI Mode in Search, the Gemini app, NotebookLM, Google Photos, Stitch, Google Flow, and Google Ads. One developer-facing model launch, and it's simultaneously seeping into nearly every corner of Google's consumer ecosystem.
What launched
If you had to sum up Nano Banana 2 Lite in one line, it'd be: speed-first, but not quality-blind. Text goes in, an image comes out in roughly four seconds, and Google specifically called out that it keeps prompt adherence reliable, character consistency strong, and in-image text legible. Normally, cranking up speed means quality takes a hit somewhere. The fact that Google spells out these three quality claims suggests real effort went into holding that line.
The pricing is worth sitting with a moment longer. At $0.034 per 1,000 images, that works out to $0.000034 per image — practically free. That matters because image generation cost used to be a real constraint on product design. Questions like "how many free generations do we give users" or "how many times can someone hit regenerate" were fundamentally cost questions. At this price point, that whole category of tradeoff gets a lot lighter.
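To make the per-image arithmetic concrete, here is a small Python sketch. It takes the $0.034 per 1,000 images figure exactly as stated in this article; the helper names are illustrative, and it ignores any free tier, minimum charge, or input-token cost that real billing might add.

```python
PRICE_PER_1K_IMAGES_USD = 0.034  # figure as stated in this article


def cost_usd(num_images: int) -> float:
    """Cost of generating num_images at the quoted per-1,000 rate."""
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    return num_images * PRICE_PER_1K_IMAGES_USD / 1000


def images_for_budget(budget_usd: float) -> int:
    """How many images a budget covers at that rate, rounded down."""
    return int(budget_usd * 1000 / PRICE_PER_1K_IMAGES_USD)


print(f"{cost_usd(1):.6f}")     # 0.000034
print(images_for_budget(1.00))  # 29411
```

At that rate, a "regenerate" button pressed 20 times by one user costs well under a tenth of a cent, which is why the free-generation quota stops being a pricing question.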
Gemini Omni Flash plays a different game. Speed isn't really the headline here; "conversational" is. The standout feature is natural-language video editing: you describe a change and the model applies it. It also supports multimodal referencing, meaning you can mix images, text, and video as inputs. Google says it factors in real-world knowledge and keeps text and on-screen action synchronized. Through the Interactions API, it also supports multi-turn editing, with up to three sequential edits.
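If you build on the multi-turn editing flow, the three-edit cap is worth enforcing on the client side so users see a clear message before a request is sent. The following is a minimal sketch of that bookkeeping only. `EditSession` is a hypothetical helper, not part of the Interactions API, and it makes no network calls.

```python
from dataclasses import dataclass, field

MAX_EDITS = 3  # sequential edit limit described for the Interactions API


@dataclass
class EditSession:
    """Tracks natural-language edits queued against one generated clip."""
    initial_prompt: str
    edits: list[str] = field(default_factory=list)

    def remaining(self) -> int:
        return MAX_EDITS - len(self.edits)

    def add_edit(self, instruction: str) -> None:
        if self.remaining() == 0:
            raise RuntimeError("edit limit reached; start a new generation")
        self.edits.append(instruction)


session = EditSession("a paper boat drifting down a gutter")
for step in ["make it rain harder", "shift to dusk lighting", "add a sign reading MAIN ST"]:
    session.add_edit(step)
print(session.remaining())  # 0
```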
It's worth zooming out to see the full Nano Banana lineup, because there are now four models under that name. Here's how they stack up.
| Model | Formal Name | Positioning |
|---|---|---|
| Nano Banana 2 Lite | Gemini 3.1 Flash Lite Image | Speed-first, ~4 sec / $0.034 per 1K images |
| Nano Banana 2 | Gemini 3.1 Flash Image | Generalist workhorse, best speed-quality balance |
| Nano Banana Pro | Gemini 3 Pro Image | Professional-grade, finest control and advanced reasoning |
| Nano Banana (legacy) | Gemini 2.5 Flash Image | Older generation, replacement recommended |
That table tells you exactly what Google's strategy is. The era of one model trying to do everything is over — the new approach is "pick the tool that fits the job." Need speed, grab Lite. Need general-purpose reliability, use 2. Need precision, go Pro. This segmentation is arguably the real headline of the whole announcement.
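In application code, that "pick the tool that fits the job" rule can be as simple as a lookup. The sketch below maps a need to the formal model names from the table above. It deliberately uses names rather than API model IDs, because the only Nano Banana 2 family ID quoted in this article is the Lite one; the `Need` enum and `pick_model` function are illustrative.

```python
from enum import Enum


class Need(Enum):
    SPEED = "speed"
    GENERAL = "general"
    PRECISION = "precision"


# Formal names from the lineup table; not API model IDs.
MODEL_FOR_NEED = {
    Need.SPEED: "Gemini 3.1 Flash Lite Image",  # Nano Banana 2 Lite
    Need.GENERAL: "Gemini 3.1 Flash Image",     # Nano Banana 2
    Need.PRECISION: "Gemini 3 Pro Image",       # Nano Banana Pro
}


def pick_model(need: Need) -> str:
    """Return the formal model name that matches the stated need."""
    return MODEL_FOR_NEED[need]


print(pick_model(Need.SPEED))  # Gemini 3.1 Flash Lite Image
```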
What each side gains
Start with what Google itself gets out of this. The most obvious win is keeping developers tethered inside its ecosystem for longer. Turning an image into a video used to require stitching together tools from multiple vendors. Now that the whole image-to-video chain lives inside a single Google API, there's a lot less reason for a developer to go shopping elsewhere. "Lock-in" is the blunt word for it, and it fits.
Pricing this aggressively is also a classic land-grab move. At 3.4 cents per 1,000 images, matching that spec-for-spec becomes a genuinely expensive proposition for competitors. Google already has scale advantages from Search, Cloud, and Android that let it absorb a price war like this for longer than most rivals can. It's an endurance game that smaller players simply can't match dollar for dollar.
For developers, the upside is just as concrete. The cost structure changes outright — teams that hesitated to add image generation because of per-unit pricing can now try it with far less financial risk. Speed matters here too: a four-second turnaround directly shapes user experience, since shorter loading spinners tend to mean lower drop-off rates.
Enterprises are running a different calculation. With these models now inside the Gemini Enterprise Agent Platform, building automated pipelines that mass-produce ad creative or marketing content just got a lot more achievable. The models are already wired into Google Ads, too, so advertisers can pull creative assets without reaching for a separate tool.
Everyday consumers benefit indirectly. Because these models are already quietly running inside AI Mode in Search, the Gemini app, and Google Photos, users won't necessarily notice "a new model" — they'll just notice things feel a bit faster and more natural than before. It's an upgrade that happens invisibly, behind the scenes.
Precedents: wins and failures
Shipping a stripped-down, speed-and-cost-optimized version of an existing model isn't a new playbook in AI. One of the clearest success stories is Google's own Gemini Flash lineup. The original Pro models were powerful but heavy and expensive; releasing a lighter Flash variant gave developers a much lower-friction way to experiment and actually ship features into production. That two-track strategy — a light version to widen the funnel, a heavy version to serve advanced demand — has generally worked out well.
There are also cautionary tales worth remembering. A number of image and video generation startups have marketed themselves purely on "fast and cheap" and then lost user trust when output quality turned out to be inconsistent. Chasing speed tends to be where things break first — character consistency slips, or text rendered inside an image comes out garbled. That's likely why Google went out of its way to specify that Nano Banana 2 Lite preserves prompt adherence, character consistency, and text legibility even while prioritizing speed. It reads like a design choice made with those earlier failures in mind.
There's a parallel lesson on the video side. When the Veo series first launched, access was limited, and it took time before developers could freely experiment through an API. The lessons learned during that slower rollout appear to have shaped the decision to open Gemini Omni Flash to developer API access from day one this time — casting a wide net early to accelerate ecosystem growth rather than gating it.
Pricing wars offer a precedent too. Early cloud computing saw Amazon, Microsoft, and Google slug it out on price, and that competition ultimately grew the overall market. But the survivors were almost exclusively the large players with deep enough pockets to sustain it — a lot of smaller cloud providers got squeezed out along the way. There's a real possibility this round of image and video generation price cuts follows a similar trajectory, and that's exactly the concern circulating right now.
Rivals' counterplay
Competitors are almost certainly running the numbers right now. The most directly exposed players are video generation providers using similar per-second billing models. Google pegging the rate at 10 cents a second, matching Veo 3.1 Fast, isn't just an internal pricing note. It reads as a signal that Google intends to reset the baseline price for the entire market, not just its own lineup. Any rival positioned above that number now has to justify why it costs more.
The same pressure applies on the image side. At 3.4 cents per 1,000 images, matching that price while holding comparable speed and quality is a genuinely tough bar to clear. Competitors are left with roughly two options: match the price head-on, or pivot the pitch entirely toward "we compete on quality and control, not cost." The second path is getting narrower too, though, since Google has already claimed that premium territory for itself with Nano Banana Pro.
Expect reaction from the open-source side as well. Community-driven image and video models have traditionally leaned on "it's free" as their core weapon. When Google prices this aggressively, the calculation shifts — is the hassle of self-hosting and maintenance still worth it versus a near-free managed API? Open-source projects will likely lean harder into differentiation points like customization freedom and on-premise deployment instead of trying to win on raw price.
There's also a platform-level dimension to this. Being able to run an entire image-to-video pipeline inside one company's API sends a clear message to developers who've been stitching together multiple vendors: one platform is now enough. That puts real pressure on competitors to rush their own end-to-end pipelines to completion — a fragmented toolset is a hard sell against this kind of bundling.
So what changes
For solo developers and side-project builders, this isn't just a fun new toy to poke at. Image generation cost has effectively dropped to near zero, which opens the door to ideas that used to feel financially reckless — like generating a live thumbnail preview every time a user types something. The four-second speed matters just as much, honestly. When you're prototyping, the time spent waiting for a result is often the biggest drag on how fast you can iterate.
For content creators and marketers, the workflow itself is likely to shift. In the old process, you'd generate an image, move it into a separate tool, and convert it to video there — and that handoff was a common source of format mismatches and style drift. Now that whole chain runs inside one company's API, so the "feel" of the original image can carry through into the video output. Teams that need to produce ad creative at volume will feel this difference the most.
For business decision-makers, this might mean revisiting the budget line entirely. If image and video generation now costs a fraction of what it used to, the cost-benefit question around adding these features gets a lot easier to answer. That said, it's worth keeping an eye on policies like SynthID watermarking that flag AI-generated content — as mass generation gets easier, tracking content provenance becomes a bigger operational question, not a smaller one.
For everyday users, the visible change might be minimal. Since these models are already quietly running inside Search, Photos, and the Gemini app, most people won't consciously register "I'm using something new" — it'll more likely register as "huh, this feels faster and more natural than before." But the price war and technology race happening behind that surface will keep compounding into better consumer-facing quality over time.
Zooming out to the developer ecosystem as a whole, this announcement is really a signal that image and video generation are no longer separate technology stacks. Going forward, new products and startup ideas are more likely to be designed around a single combined pipeline by default, rather than treating the two as distinct problems. Teams that adapt to this shift early are the ones likely to move fastest.
🥄 Three Things You're Probably Wondering
— Does this make Nano Banana Pro pointless now? Probably not. Lite is optimized for speed and volume, while Pro still holds the edge for work that needs fine-grained control and advanced reasoning — complex compositing, precise detail adjustments, that kind of thing. If anything, the more likely pattern is a combined workflow: fast drafts through Lite, final output through Pro.
— Video costs 10 cents a second now, and longer clips are coming — will the price stay the same then? Too early to say. Google only said longer generations are coming, without locking in what the billing structure will look like once that happens. Whether it stays a flat per-second rate or shifts to tiered pricing by length is something we'll have to wait and see.
— Isn't Google losing money pricing this low? On a pure per-model operating cost basis, the margins probably do look thin. But Google most likely isn't evaluating this in isolation — it's almost certainly bundled into the broader math around Search, Ads, and Cloud. Factor in the effect of locking developers into its ecosystem, and this looks a lot more like a long-term market-share bet than a short-term margin play.
References
- Start building with Nano Banana 2 Lite and Gemini Omni Flash — Google (The Keyword)
- Gemini 3.1 Flash-Lite Image — Model Card, Google DeepMind
- Google unveils Nano Banana 2 Lite (Gemini 3.1 Flash-Lite) for low-cost 4-second image generation — VentureBeat
- The latest AI news we announced in June 2026 — Google
Numbers are as of announcement and may change.



