Microsoft's Big Swing at Build 2026: In-House AI Models + a Copilot Coding Model Called 'Project Polaris' to Cut the OpenAI Cord
At its June 2 Build 2026 keynote, Microsoft unveiled MAI-Image 2.5, MAI-Voice 2 and MAI-Transcribe 1.5, plus a homegrown coding model — Project Polaris — that replaces GPT-4 Turbo as GitHub Copilot's default starting August. Nadella: 'Windows is no longer a platform for human users only.'

"Windows is no longer a platform for human users only" — Nadella's June 2 opener
Morning of June 2, Fort Mason Center, San Francisco. Satya Nadella walked onto the Build 2026 keynote stage and led with this line: "Windows is no longer a platform for human users only." After 40-plus years as the OS humans click around in, Windows is being redefined as a platform that AI agents execute on directly.
But the real news came right after the declaration: a parade of models. If Microsoft used to be "the company that poured tens of billions into OpenAI," at this Build it laid out, for the first time on one stage, a full in-house model stack that runs without OpenAI: a text-to-image model, multilingual text-to-speech, speech-to-text, and, most striking of all, a homegrown coding model for GitHub Copilot called Project Polaris. This isn't just a product launch. It's Microsoft publicly committing to reduce its dependence on OpenAI across its AI supply chain.
Remember the news a few days ago that Microsoft shut down internal Claude Code usage because of exploding AI coding costs? You can't solve a cost crisis while staying dependent on someone else's models. The answer was always going to be "build our own" — and Build 2026 was the venue for revealing it.
Meet the players — MAI, Mustafa Suleyman, and the Build stage
MAI (Microsoft AI) is the organization at the center of this story, the team Microsoft assembled to build its own foundation models. Leading it is Mustafa Suleyman, a DeepMind co-founder who, after Inflection AI, joined Microsoft in 2024 to run all of consumer AI (Copilot included). Microsoft brought him in for precisely this purpose: to stop being "a company that sells models OpenAI built for it" and become "a company that owns its own models."
Satya Nadella is the one drawing the big picture. The CEO who revived Microsoft on the back of Azure cloud now arrives with the thesis "Windows as an agent platform." When a company that owns the PC OS, the cloud, Office, and the dev tooling (GitHub) declares that AI agents will operate all of it directly, that's not a feature add — it's an ambition to shift the computing paradigm itself.
Build is Microsoft's annual developer conference. It is not a consumer event but the place where Microsoft tells the world's developers, the ones building on its platform, what they'll get over the next year. So most Build reveals are tools developers can use right away. Around 2,500 developers gathered at Fort Mason this year, and in front of them Microsoft showed its entire in-house model lineup at once.
What was announced: four models, including 'Polaris,' the real bombshell
Here are the reveals in a table.
| Model | Type | Key point |
|---|---|---|
| MAI-Image 2.5 | Text→image | High-quality + faster 2.5e variant; accepts image uploads for editing |
| MAI-Voice 2 | Multilingual TTS | Adds ~15 languages incl. Korean; wider emotional range (angry, confused, embarrassed) |
| MAI-Transcribe 1.5 | Speech→text | Incremental step over the April model that claimed lowest WER across 25 languages |
| Project Polaris | Coding model | MoE; replaces GPT-4 Turbo as GitHub Copilot's default from August |
MAI-Image 2.5 generates images from text, shipping in two flavors — a high-quality version and a faster "MAI-Image-2.5e." The key is that it accepts image uploads: it doesn't just generate, it edits images users provide. It plugs into Copilot, Microsoft Foundry, and the MAI Playground.
MAI-Voice 2 is the multilingual TTS model. It adds about 15 languages — German, Australian and US English, Spanish, French, Hindi, Indonesian, Italian, Japanese, Korean, Dutch, Portuguese, Turkish, Vietnamese, and Chinese — and, crucially, a wider emotional range covering tones like angry, confused, and embarrassed. That's a core piece of making a voice assistant sound human.
MAI-Transcribe 1.5 is an upgrade to the April speech-to-text model. Where the predecessor claimed the lowest word error rate (WER) across 25 languages, this one layers incremental gains on top.
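For readers unfamiliar with the metric, word error rate is the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. The sketch below is a minimal, generic implementation of that standard definition. It is not Microsoft's evaluation code, the function name is hypothetical, and it assumes simple whitespace tokenization with no text normalization.

```python
from typing import List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref: List[str] = reference.split()  # assumption: whitespace tokenization
    hyp: List[str] = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (rolling-row Levenshtein).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

For the reference "the cat sat on the mat" and the hypothesis "the cat sit on mat", this counts one substitution and one deletion against six reference words, a WER of 2/6. Because insertions count as errors, WER can exceed 1.0 when the output is much longer than the reference.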
But those three are appetizers. The main course is Project Polaris — Microsoft's own coding model for GitHub Copilot. It's a Mixture-of-Experts (MoE) architecture with specialized sub-modules per language/framework, and it uses chain-of-thought and tree-of-thought reasoning for complex multi-file refactoring. From August 2026, it replaces GPT-4 Turbo as Copilot's default model, with automatic migration and a fallback window. In other words, millions of Copilot users will, from August, be coding on a Microsoft model rather than an OpenAI one.
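To make the Mixture-of-Experts idea concrete, here is a toy sketch of generic top-k routing: a gate scores each expert, only the best k experts run, and their outputs are blended by renormalized gate weights. Nothing here reflects Polaris's actual design, which the announcement does not detail. The expert names, the toy expert functions, and the hard-coded gate scores are all assumptions for illustration. In a real model the scores would come from a learned gating network.

```python
import math
from typing import Callable, Dict, List, Tuple

Expert = Callable[[List[float]], List[float]]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over raw gate scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores: Dict[str, float], k: int = 2) -> List[Tuple[str, float]]:
    """Select the k highest-probability experts and renormalize their weights."""
    names = list(gate_scores)
    probs = softmax([gate_scores[n] for n in names])
    ranked = sorted(zip(names, probs), key=lambda pair: pair[1], reverse=True)[:k]
    total = sum(weight for _, weight in ranked)
    return [(name, weight / total) for name, weight in ranked]


def moe_forward(x: List[float], experts: Dict[str, Expert],
                gate_scores: Dict[str, float], k: int = 2) -> List[float]:
    """Run only the selected experts and return their weighted sum."""
    out = [0.0] * len(x)
    for name, weight in route_top_k(gate_scores, k):
        y = experts[name](x)
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out


# Hypothetical per-language experts; real experts would be neural sub-networks.
EXPERTS: Dict[str, Expert] = {
    "python": lambda x: [2.0 * v for v in x],
    "typescript": lambda x: [v + 1.0 for v in x],
    "rust": lambda x: [-v for v in x],
}

# Hypothetical gate scores for one input (hard-coded, not learned).
GATE_SCORES = {"python": 2.0, "typescript": 1.0, "rust": -1.0}

selected = route_top_k(GATE_SCORES, k=2)
output = moe_forward([1.0, 2.0], EXPERTS, GATE_SCORES, k=2)
```

The point of the design is that the "rust" expert never runs for this input, so compute per request scales with k rather than with the total number of experts.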
Who gains what — Microsoft, developers, and OpenAI
For Microsoft, the core of this is cost control and independence. The better Copilot sold, the more inference cost it owed OpenAI. Use someone else's model and they set the price, policy, and roadmap. Switch to Polaris and Microsoft controls its inference cost directly, keeps the margin, and sets feature priorities itself. This is bigger than "saving money" — it's about who holds the steering wheel of the AI business.
For developers, choice expands first. Free/low-cost tiers get a faster, cheaper model; integrating image, voice, and transcription models into Azure and Foundry lowers the cost of adding AI to apps. But the forced switch of Copilot's default to Polaris in August is double-edged. Auto-migration is convenient, yet teams that tuned prompts and workflows to "the GPT-4 Turbo feel" may see subtly different output quality. That's exactly why Microsoft built in a three-month fallback.
For OpenAI, it's a complicated signal. Microsoft remains OpenAI's biggest investor and cloud provider, but it just publicly proved, on the Build stage, that "we can do this without you." With Anthropic and OpenAI racing toward IPOs, the image of your largest ally switching to in-house models casts a shadow over OpenAI's long-term revenue outlook. The ChatGPT/API market is still enormous, but losing part of a giant distribution channel like Copilot is no small thing.
Historical parallels — has "switch to your own model" always worked?
Platform companies replacing someone else's core component with their own is a recurring move in tech history — with both wins and losses.
Success: Apple's silicon independence. Apple used Intel CPUs for years, then moved the entire Mac line to its own M-series chips. Early on there was heavy doubt about compatibility, but Apple's chips won on performance, power efficiency, and cost, and the transition stuck the landing. Lesson: internalizing a core component brings short-term pain but long-term margin and control. Polaris is aiming down exactly this path.
Cautionary: Google's bumpy model transitions. Google pushed its own LLMs (Gemini family) into Search and Cloud and repeatedly hit quality controversies and launch delays. In some cases, "our model is best" announcements fell short in real use and drew criticism. Lesson: in-house models always carry a gap between "announced spec" and "real-world quality." When Polaris becomes Copilot's default in August, the real test is whether it beats GPT-4 Turbo in actual developer experience, not on HumanEval scores.
Failure risk: backlash from forced switches. Various SaaS products in the past force-swapped their engine for in-house tech and watched users churn, complaining "the old one was better." Dev tools are especially exposed: users adapt deeply to a tool's "habits," so changing the default model breeds subtle resistance. Lesson: Microsoft is stressing auto-migration and fallback precisely because it knows this risk. The switch's success rides as much on "how smoothly you migrate users" as on raw model performance.
Competitor counter-plays
OpenAI counters with "even if Copilot leaves, we have ChatGPT, the API, and the enterprise market." At the same time it'll push its own coding agent and model improvements to prove "we're still better than Microsoft's in-house model." Preparing for an IPO, it needs to show the market a story where "even as Microsoft-dependent revenue shrinks, direct revenue grows."
Anthropic sits in an odd spot. Microsoft dropping internal Claude Code for an in-house model is a direct hit — but conversely, it's a chance to pull in "developers who don't want to be locked into Microsoft's ecosystem." Anthropic will target Polaris's gaps with coding performance (Claude's strength) and positioning as a "neutral model not beholden to any one Big Tech."
Google pushes its own stack with Gemini plus in-house coding tools (e.g., Android Studio, Gemini Code Assist). As Microsoft strengthens vertical integration with "Windows + Copilot + in-house model," Google answers by bundling "Google Cloud + Gemini + dev tools" the same way. It all heads toward a vertical-integration race where each Big Tech threads "OS–cloud–model–dev tools" into one line.
So what actually changes
For developers using Copilot, August is the inflection point. Do nothing and your default model switches to Polaris in August. If your team has optimized prompts and workflows around GPT-4 Turbo, run the same tasks on Polaris in July–August and compare quality — validate during the fallback window. Conversely, for cost-sensitive individuals and small teams, a cheaper, faster default is welcome news.
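One way to run that comparison is a small regression harness: feed the same tasks to both models, apply the same acceptance check to each output, and list the tasks where the new default fails but the old one passed. The sketch below is a minimal illustration under stated assumptions. The two `stub_*` functions are hypothetical stand-ins for however your team actually invokes each model, and the string-matching checks are placeholders for real tests such as running a unit-test suite on the generated code.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Model = Callable[[str], str]  # prompt in, generated code out


@dataclass
class Task:
    name: str
    prompt: str
    check: Callable[[str], bool]  # True if the output is acceptable


def pass_rates(tasks: List[Task], models: Dict[str, Model]) -> Dict[str, float]:
    """Fraction of tasks each model passes."""
    if not tasks:
        raise ValueError("need at least one task")
    return {
        label: sum(t.check(generate(t.prompt)) for t in tasks) / len(tasks)
        for label, generate in models.items()
    }


def regressions(tasks: List[Task], baseline: Model, candidate: Model) -> List[str]:
    """Names of tasks the baseline passes but the candidate fails."""
    return [
        t.name for t in tasks
        if t.check(baseline(t.prompt)) and not t.check(candidate(t.prompt))
    ]


# Hypothetical stubs with canned outputs; replace with real model calls.
def stub_baseline(prompt: str) -> str:
    if "add" in prompt:
        return "def add(a, b):\n    return a + b"
    return "def slugify(s):\n    return s.lower().replace(' ', '-')"


def stub_candidate(prompt: str) -> str:
    if "add" in prompt:
        return "def add(a, b):\n    return a + b"
    return "# TODO: implement slugify"


TASKS = [
    Task("add", "Write a function add(a, b).", lambda out: "return a + b" in out),
    Task("slugify", "Write a function slugify(s).", lambda out: "def slugify" in out),
]

rates = pass_rates(TASKS, {"baseline": stub_baseline, "candidate": stub_candidate})
lost = regressions(TASKS, stub_baseline, stub_candidate)
```

The regression list is usually more actionable than the aggregate pass rate, since it names the exact prompts a team would need to retune before the fallback window closes.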
For people building AI apps, the key is that "you can now wire up multimodal entirely within the Microsoft stack." With image generation/editing (MAI-Image), TTS (MAI-Voice, including Korean), and transcription (MAI-Transcribe) integrated into Azure/Foundry, you can build a multimodal app in one place instead of stitching together multiple external APIs. Unit cost and integration friction go down.
For investors and readers watching the industry, the big picture is that "the axis of the AI infrastructure war has shifted to building your own models." Where "Big Tech buys models from OpenAI/Anthropic" used to be the formula, Microsoft, Google, and Amazon are all moving core workloads onto in-house models. The industry is moving from an era of buying models to one of building them. That said, a gap between announced spec and real-world quality always exists, so temper the hype for now: the real verdict on Polaris waits for developers' hands-on reviews after August.
