
The Week Vertical AI Arrived — GPT-Rosalind, Pragma, Muse Landed Together

OpenAI launched GPT-Rosalind for life sciences, Revolut unveiled its Pragma banking foundation model, and Meta confirmed its entertainment-focused Muse Spark — all in one week. The era of one-size-fits-all LLMs just ended.

7 min read · OpenAI

[Image: life sciences lab with DNA analysis — symbolizing the vertical AI moment]

Three Companies, One Message This Week

$40B. That's the estimated combined R&D spend behind OpenAI's GPT-Rosalind, Revolut's Pragma, and Meta's Muse Spark — all unveiled or confirmed this week.

Different teams. Same declaration: one general-purpose LLM isn't enough anymore.

OpenAI released GPT-Rosalind, its first domain-specific foundation model, to Amgen, Moderna, the Allen Institute, and Thermo Fisher as a research preview. Revolut dropped Pragma, a banking-specific foundation model, positioning it as "trained on 12 years of financial data — more accurate than any general model." Meta, through Alexandr Wang at HumanX, officially confirmed Muse Spark — a proprietary entertainment and advertising model kept in-house while Llama 4 goes open source.

The simultaneity matters. These aren't coincidental product launches. They're the same bet placed by three very different companies.

To Understand This, You Need the Backstory

For four years after ChatGPT, the dominant story in AI was brutally simple.

"Bigger models solve more problems." The scaling hypothesis. GPT-4, Claude 3, Gemini Ultra — all of them were bets on the idea that if you just pour in more parameters, more data, more FLOPs, general capability keeps climbing. And mostly, it worked.

But by late 2025, the cracks showed.

| Metric | 2024 | Q1 2026 |
| --- | --- | --- |
| Largest frontier model | 1.8T params | 5T+ (estimated) |
| MATH benchmark | 92.4% | 97.1% |
| MedQA medical accuracy | 88% | 89.3% |
| Legal citation accuracy | 71% | 74% |

Here's the deal: general benchmarks kept climbing, but domain-specific accuracy stalled. The formula that "the biggest model wins everywhere" started breaking.

Two paths emerged to fix it. One, the Anthropic route — move up the stack into agents. The other, which burst into full view this week: build domain-specific foundation models from scratch. Different training data. Different architecture choices. Different evaluations. Same "foundation model" label, fundamentally different product.

Breaking Down Each Move

OpenAI GPT-Rosalind — The First Non-Text-First Model

Named after Rosalind Franklin, this is OpenAI's first serious attempt at domain specialization.

Until now, OpenAI's entire lineup has been general-purpose. ChatGPT, GPT-4o, o-series reasoning models, even Sora video — all pitched as "works on anything." Rosalind is the first time they picked an industry upfront and trained a foundation model around it.

Training data: public bio literature, UniProt protein databases, PubMed abstracts, plus proprietary experimental datasets supplied by Amgen and Moderna as partners. Outputs aren't just text — Rosalind generates protein sequence predictions, drug interaction graphs, and clinical protocol drafts.

Amgen's statement: "We expect drug candidate discovery time to drop from 18 months to 6 months on average." Actual regulatory approval is separate, of course. But shaving the pre-screening phase is concrete.

OpenAI finally stepped into vertical specialization. This is the company's second identity pivot since ChatGPT.

Revolut Pragma — The First Banking Foundation Model

Revolut's move is the most interesting of the three. A challenger bank — not a research lab — just shipped its own foundation model.

Pragma trained on 140TB of proprietary data: 12 years of European and North American banking transactions, regulatory documents, KYC/AML compliance procedures. They didn't disclose the parameter count beyond "roughly 10% the size of GPT-4." Small but laser-focused.

The key differentiator: a new "financial reasoning" benchmark. In complex multi-step transaction analysis, Revolut claims Pragma's hallucination rate is 73% lower than GPT-5. In compliance-sensitive applications, that gap isn't academic — it's the difference between a fine and a license revocation.

| Test | GPT-5 | Revolut Pragma |
| --- | --- | --- |
| Complex transaction reasoning | 81% | 94% |
| Regulatory citation accuracy | 67% | 96% |
| Hallucination rate | 12% | 3.2% |
| Response latency (p50) | 1.4s | 0.3s |
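As a sanity check, the headline "73% lower" claim follows directly from the hallucination rates in the table (Revolut's reported numbers, not independently verified):

```python
# Hallucination rates as reported in Revolut's benchmark (not independently verified).
gpt5_rate = 0.12      # GPT-5: 12%
pragma_rate = 0.032   # Pragma: 3.2%

# Relative reduction: how much lower Pragma's rate is than GPT-5's.
reduction = 1 - pragma_rate / gpt5_rate
print(f"Relative hallucination reduction: {reduction:.1%}")  # → 73.3%
```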

Revolut isn't open-sourcing Pragma. It's licensing it B2B to other banks and fintechs, starting at $2M/year per institution, scaling up to $10M for global deployments. ING and Santander are reportedly signed as launch customers.

Meta Muse Spark — The Quiet Bomb

Meta's was the stealthiest move of the three. At HumanX, Alexandr Wang confirmed what had been rumored: Muse Spark, a proprietary entertainment and advertising foundation model, has been training internally since mid-2025.

While Meta makes noise about Llama 4 being open source, the model that actually drives revenue is closed. Three use cases were confirmed: Instagram Reels personalized editing suggestions, Facebook Ads conversion-optimized copy generation, and dynamic interaction generation in VR/AR environments.

Internal dashboards reportedly show +31% CTR improvement in ad campaigns using Muse Spark. Meta hasn't officially confirmed the number, but the leak has forced the conversation.

This breaks the "open source champion" narrative Yann LeCun has been pushing publicly. The gap between what Meta says and what Meta actually does with its best model is now visible.

The Bigger Picture

The thread connecting all three: who owns the data.

General LLMs train on the public internet. That's why anyone with enough GPUs can catch up — DeepSeek, Qwen, GLM-5 are proving it. Open weights have closed the gap with Claude and GPT in general tasks.

Vertical models are different. They require proprietary domain data that you can't scrape. Amgen's drug experiment results. Revolut's 12-year transaction history. Meta's ad click logs. You can't build these datasets in six months, or even six years — they're the residue of running the actual business for a decade.

Statista's estimate: global enterprise "private data" will hit 180 zettabytes in 2026. Less than 1% of that is publicly accessible. The other 99% sits locked inside corporate silos. Vertical AI is the key that unlocks it.

And here's why this matters long-term: once a domain-specific model is embedded in an industry, it's incredibly sticky. A hospital that has fine-tuned a medical model on its own patient records isn't switching back to "just use ChatGPT with a system prompt" — that's a non-starter.

This week's simultaneous moves signal the AI market splitting into two tiers. The lower tier — general LLM wars between OpenAI, Anthropic, and Google. The upper tier — industry-specific foundation models sold as platforms. And the upper tier might end up with higher margins than the lower one.

What Actually Changes

For developers, the concrete shift this week:

API options expand. Before, you picked from OpenAI, Anthropic, Google. Now industry-specific endpoints are emerging. Building a healthcare app? GPT-Rosalind API will likely outperform "GPT-5 with a clever system prompt." A 3x difference in hallucination rate isn't ignorable in production.
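In practice, that shift shows up as model routing. None of these vertical endpoints are public yet, so the model identifiers below are pure assumptions for illustration — the point is the pattern, not the names:

```python
# Illustrative routing sketch only. These vertical model IDs are assumptions;
# neither OpenAI nor Revolut has published API identifiers for these models.
VERTICAL_MODELS = {
    "life_sciences": "gpt-rosalind-preview",  # assumed ID for OpenAI's research preview
    "banking": "pragma-1",                    # assumed ID for Revolut's licensed model
    "general": "gpt-5",                       # general-purpose fallback
}

def pick_model(domain: str) -> str:
    """Route a request to a domain-specific model when one exists, else fall back."""
    return VERTICAL_MODELS.get(domain, VERTICAL_MODELS["general"])

print(pick_model("life_sciences"))  # → gpt-rosalind-preview
print(pick_model("legal"))          # → gpt-5 (no vertical model yet, falls back)
```

The design choice worth noting: the fallback means your app degrades to a general model rather than breaking when a domain has no vertical option yet.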

Pricing structure changes too. General models are priced per-token. Vertical models are mostly moving to "industry license" — a fixed annual fee with unlimited internal use. That's a barrier to entry for startups, but it's also an investment with clearer ROI once you're in.
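The per-token vs. flat-license trade-off reduces to a break-even volume. Using Pragma's reported $2M/year entry price and an assumed blended per-token rate for a general frontier model (the $10/1M-token figure is an illustration, not a quoted price):

```python
# Illustrative break-even sketch. The $2M license floor is Revolut's reported
# entry price; the per-token rate is an assumed figure for a general model.
license_cost_per_year = 2_000_000   # USD/year, Pragma entry tier
general_price_per_mtok = 10.0       # assumed blended USD per 1M tokens

# Annual token volume at which the flat license matches per-token spend.
breakeven_mtok = license_cost_per_year / general_price_per_mtok
print(f"Break-even: {breakeven_mtok:,.0f}M tokens/year "
      f"(~{breakeven_mtok / 12:,.0f}M tokens/month)")
```

Below that volume, per-token pricing wins; above it, the flat license does — which is why the license model favors large incumbents over startups.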

For professionals, industry-specific AI literacy becomes a career moat. Being "good at ChatGPT" isn't enough anymore. Being the banker who can automate compliance checks with Pragma, or the researcher who can draft trial protocols with Rosalind, becomes genuinely valuable.

Zoom out one more level and this week's combined signal is clear: AI's second act just started. The first act asked "who builds the biggest general model?" The second act asks "who owns the most domain-specific data?" And the winners of the second act might not be the same as the winners of the first.

