Revolut Trained an AI on 40 Billion Banking Events. Here's What It Learned.
Revolut published PRAGMA, a foundation model trained on 40 billion financial events from 25 million users. It improves fraud detection by 20% and handles credit scoring and lifetime-value (LTV) prediction from a single pre-trained base.

40 Billion Events, One Language
40 billion events. That's what Revolut fed into PRAGMA, its in-house foundation model.
Every transfer, payment, currency exchange, investment, and subscription from 25 million users over several years -- treated as a single massive corpus. The way GPT reads internet text, PRAGMA reads the flow of money.
The paper dropped on arXiv on April 9, and it matters for one reason: this is the first publicly documented case of a bank building its own foundation model and deploying it in production.
Treating Banking Events Like Language
Foundation models -- think GPT, Claude -- are pre-trained at scale on general data, then fine-tuned for specific tasks. Revolut applied the exact same playbook to financial data.
Instead of tokenizing text, they tokenized financial event sequences. Instead of next-token prediction, they used masked modelling -- a self-supervised objective tailored to the discrete, variable-length nature of financial records.
Think of it like this: "If a user paid at a coffee shop Monday, received salary Tuesday, and sent an international transfer Wednesday, what are they likely to do Thursday?" PRAGMA learned these patterns from tens of billions of real events.
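The masking objective can be sketched in a few lines. This is a simplified illustration, not Revolut's implementation: the event names, vocabulary, and masking rate here are all hypothetical stand-ins (PRAGMA's real tokenization encodes amounts, merchant categories, and timestamps), but the self-supervised recipe -- hide some events, train the model to recover them -- is the same idea.

```python
import random

# Hypothetical event vocabulary; PRAGMA's real token space is far richer.
MASK = "[MASK]"

def mask_sequence(events, mask_prob=0.15, rng=random.Random(0)):
    """Randomly mask events; the model is trained to recover them."""
    masked, targets = [], []
    for ev in events:
        if rng.random() < mask_prob:
            masked.append(MASK)   # hide the event from the model
            targets.append(ev)    # label the model must predict
        else:
            masked.append(ev)
            targets.append(None)  # no loss at unmasked positions
    return masked, targets

week = ["card_payment", "card_payment", "salary_in",
        "transfer_out", "subscription"]
masked, targets = mask_sequence(week, mask_prob=0.4)
print(masked)   # some positions replaced with [MASK]
print(targets)  # the original events at those positions
```

During pre-training, the model sees `masked` and is penalized only where `targets` is set -- exactly the BERT-style masked-modelling setup, applied to money movement instead of words.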
| Model Size | Parameters | Use Case |
|---|---|---|
| PRAGMA-10M | 10 million | Real-time fraud detection (ultra-low latency) |
| PRAGMA-100M | 100 million | Credit scoring, cross-sell prediction |
| PRAGMA-1B | 1 billion | Precision analysis (latency-tolerant tasks) |
All three share the same pre-trained weights and are fine-tuned per task. It's the "one base model, many applications" strategy that works so well in NLP, transplanted into finance.
20% Better Fraud Detection -- And That's Just the Start
The concrete results show why this approach matters.
Fraud Detection
Traditional fraud systems run on rules. "Flag any international transaction over $5,000." The problem: fraudsters know the rules too.
PRAGMA understands a user's entire behavioral pattern. It asks "does this transaction fit this person's normal pattern?" rather than checking against a static rulebook. Result: 20% improvement in fraud detection accuracy, with fewer false positives and more real fraud caught.
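The "does this fit the user's normal pattern?" question can be made concrete with a toy anomaly score. This is a sketch of the intuition only -- PRAGMA's actual scoring is learned end-to-end, whereas here we just measure how far a new event's embedding sits from the centroid of the user's history (all embeddings below are random stand-ins):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical: each event maps to a 32-dim embedding, and a user's
# "normal pattern" is the centroid of their recent event embeddings.
history = rng.normal(loc=0.0, scale=1.0, size=(200, 32))  # past events
profile = history.mean(axis=0)

def anomaly_score(event_embedding, profile):
    """Higher = less like this user's usual behavior."""
    return float(np.linalg.norm(event_embedding - profile))

typical = rng.normal(loc=0.0, scale=1.0, size=32)  # resembles the history
unusual = rng.normal(loc=5.0, scale=1.0, size=32)  # shifted behavior

print(anomaly_score(typical, profile) < anomaly_score(unusual, profile))
```

Unlike a static "$5,000 international transfer" rule, the threshold here is relative to each user's own history -- a large transfer can be normal for one account and anomalous for another.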
Credit Scoring
Traditional credit scoring relies on structured data -- credit scores, income, debt ratios. PRAGMA adds behavioral data: how someone actually spends money, their savings habits, subscription management. This creates a more nuanced and predictive credit picture.
Customer Lifetime Value
The model predicts which customers are likely to upgrade to premium services and which ones are at risk of churning. For a bank, that translates directly into marketing cost savings.
Here's the key insight: all these tasks run on embeddings from a single pre-trained model. No need to build separate models per task. Stack a simple linear model on top of PRAGMA's embeddings and you get strong performance out of the box.
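The "linear model on top of frozen embeddings" pattern is standard linear probing, and it is easy to sketch. Everything below is synthetic: we fake a batch of user embeddings and labels (in production the embeddings would come from the frozen pre-trained model), then fit a per-task head with scikit-learn:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Stand-in for PRAGMA: pretend each user's event history maps to a
# 64-dim embedding produced by the frozen base model.
n_users, dim = 1000, 64
embeddings = rng.normal(size=(n_users, dim))

# Synthetic labels that are roughly linearly separable in embedding
# space, mimicking a downstream task like "will this user churn?".
w_true = rng.normal(size=dim)
labels = (embeddings @ w_true + rng.normal(scale=0.5, size=n_users) > 0)
labels = labels.astype(int)

# The per-task model is just a linear head; the base stays untouched.
head = LogisticRegression(max_iter=1000).fit(embeddings[:800], labels[:800])
accuracy = head.score(embeddings[800:], labels[800:])
print(f"held-out accuracy: {accuracy:.2f}")
```

Each new task (fraud, credit, churn) gets its own cheap head like this while sharing the expensive pre-trained base -- which is why one model can serve so many products.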
200+ H100 GPUs in Production
This isn't just a paper. PRAGMA is running in Revolut's production systems right now.
The inference stack runs on 200+ NVIDIA H100 GPUs and powers AIR (Artificial Intelligence by Revolut), the company's AI assistant currently rolling out to 13 million UK customers.
The infrastructure runs on Nebius (formerly Yandex Cloud), a notable choice -- a European fintech using European-based AI cloud infrastructure, which matters for GDPR compliance.
When Banks Become AI Model Companies
The bigger picture PRAGMA reveals: fintech companies are moving beyond "using" AI to "building" their own foundation models.
Previous attempts like JPMorgan's IndexGPT or Bloomberg's BloombergGPT took text-based LLMs and added financial data. PRAGMA takes a fundamentally different approach -- financial event sequences are the native input, not an afterthought bolted onto a text model.
| Model | Approach | Training Data |
|---|---|---|
| BloombergGPT | Text LLM + financial docs | Financial news, reports |
| IndexGPT | Text LLM + financial QA | Investment advisory text |
| PRAGMA | Event sequence model | 40B real transaction events |
The distinction matters. BloombergGPT is "an AI that knows about finance." PRAGMA is closer to "an AI that has experienced finance."
What This Means for You
For developers and fintech builders, the PRAGMA paper sends clear signals.
First, domain-specific foundation models are here. General-purpose LLMs are powerful, but domains with unique data structures -- like finance -- may be better served by purpose-built models. PRAGMA proves the concept.
Second, data is the moat. Revolut can build this model because it has years of financial data from 25 million users. No startup, no research lab can replicate that dataset. The real competitive advantage isn't model architecture -- it's data.
Third, the approach is replicable. Any large fintech or neobank sitting on similar transaction volumes could build their own version. The architecture isn't secret -- it's published on arXiv. The barrier is the data, not the technique.


