OpenAI's Lilli Replaces Internal Knowledge Search with AI Agents
OpenAI's internal search system Lilli launches for enterprise. Can it replace Notion and Confluence?

OpenAI has officially launched Lilli, its internal knowledge management tool, for enterprise customers. This is not just a search tool: it is an AI agent that connects search → understanding → action.
Background — A Problem OpenAI Experienced Firsthand
Lilli was originally an internal system at OpenAI. As headcount surged from 500 to 3,000, company knowledge scattered across Slack, Google Drive, Notion, and GitHub became unmanageable.
Lilli's predecessor was a project started by then-CTO Mira Murati at a 2024 internal hackathon, designed to solve a specific problem: new engineers took an average of three weeks to understand the company's existing decisions.
After 6 months of internal use:
- New hire onboarding time reduced by 62%
- Internal search query resolution rate of 89% (vs. 23% with Slack search)
- Meetings "re-discussing previously decided topics" decreased by 41%
These results gave OpenAI the confidence to launch externally.
Core Features — Detailed Analysis
1. Multi-Source Integration
Connects 40+ enterprise tools including Slack, Google Drive, GitHub, Confluence, Notion, Jira, Linear, and Figma into one interface.
Difference from existing search:
- Slack search: Keyword matching. Searching "database migration" returns only messages containing those words
- Lilli: Answers questions like "Why did we change the database architecture in Q3, and what alternatives were considered?" by synthesizing Slack conversations, technical docs, and Jira tickets
2. Context-Aware Conversations
Maintains the context of earlier queries across follow-up questions. This is not simple Q&A; conversations flow naturally, like asking a colleague.
Example flow:
- "How is our product's authentication system implemented?"
- → Lilli synthesizes related tech docs, code, and Slack discussions
- "Then what work would be needed to switch from OAuth to SAML?"
- → Explains specific scope of changes while maintaining context from the previous answer
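The follow-up mechanics above can be sketched as a session object that folds earlier turns into each new retrieval query, so a follow-up like "switch from OAuth to SAML" inherits the earlier authentication context. Everything here is an illustrative toy (`KnowledgeSession` and `toy_retriever` are invented names), not Lilli's actual implementation:

```python
# Toy sketch of context-aware follow-ups: each turn appends to a shared
# history so later questions can resolve references against earlier ones.

class KnowledgeSession:
    def __init__(self, retriever):
        self.retriever = retriever   # callable: query -> list of source snippets
        self.history = []            # (question, answer) pairs

    def ask(self, question):
        # Fold prior turns into the retrieval query so follow-ups
        # inherit context from earlier answers.
        context = " ".join(q for q, _ in self.history)
        snippets = self.retriever(f"{context} {question}".strip())
        answer = f"Based on {len(snippets)} sources: " + "; ".join(snippets)
        self.history.append((question, answer))
        return answer

# Toy retriever standing in for Slack/Drive/GitHub connectors.
def toy_retriever(query):
    corpus = {
        "authentication": "auth-service uses OAuth 2.0 (docs/auth.md)",
        "SAML": "SAML would require an IdP integration (RFC discussion)",
    }
    return [v for k, v in corpus.items() if k.lower() in query.lower()]

session = KnowledgeSession(toy_retriever)
first = session.ask("How is our authentication system implemented?")
second = session.ask("What would switching to SAML involve?")
```

The second call retrieves both the SAML snippet and the authentication snippet, because the combined history query matches both.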
3. Action Execution
Creates tasks and drafts documents based on search results:
- Auto-create Jira/Linear tickets
- Draft technical documentation
- Share summaries to Slack channels
- Auto-generate meeting agendas
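The ticket-creation step could look roughly like the sketch below. The payload shape mirrors Jira's REST create-issue fields (`project`, `summary`, `description`, `issuetype`); the surrounding agent glue (`draft_ticket`, the `finding` dict) is hypothetical:

```python
# Illustrative sketch: turning a synthesized search finding into a
# Jira-style issue payload. The agent-side names are invented.

def draft_ticket(finding, project_key="PLAT"):
    """Convert a synthesized finding into a Jira-style issue payload."""
    return {
        "fields": {
            "project": {"key": project_key},
            "summary": finding["title"][:255],   # Jira caps summary length
            "description": finding["evidence"],
            "issuetype": {"name": "Task"},
        }
    }

finding = {
    "title": "Migrate auth-service from OAuth to SAML",
    "evidence": "Scoped from Q3 architecture discussion and docs/auth.md",
}
payload = draft_ticket(finding)
```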
4. Permission-Based Access Control
Respects existing tool permission structures: a document that a Team A member cannot open in Team B's workspace stays inaccessible through Lilli as well. SSO/SAML integration is supported, and the service is SOC 2 Type II certified.
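A minimal sketch of permission-aware retrieval as described: candidate documents are filtered against a per-document access list before matching, so a user can never surface something they could not open in the source tool. The ACL model and all names here are illustrative assumptions:

```python
# Permission-aware search sketch: filter by ACL first, then match the query.

def search(query, user, documents, acl):
    """Return only documents the requesting user is allowed to read."""
    readable = [d for d in documents if user in acl.get(d["id"], set())]
    return [d for d in readable if query.lower() in d["text"].lower()]

docs = [
    {"id": "team-a-roadmap", "text": "Team A roadmap: migrate auth"},
    {"id": "team-b-salaries", "text": "Team B compensation bands"},
]
acl = {"team-a-roadmap": {"alice", "bob"}, "team-b-salaries": {"carol"}}

hits = search("roadmap", "alice", docs, acl)  # alice sees only Team A docs
```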
Technical Architecture
Lilli is built on GPT-4o with a RAG pipeline tuned in-house.
Key technical differentiators:
- Hierarchical Indexing: Documents indexed at sentence → paragraph → section → document → project levels. Retrieves appropriate context depth based on question scope
- Temporal Awareness: Understands time expressions like "last month," "Q3," "last year" and distinguishes information by time period
- Entity Resolution: Automatically determines whether "DB," "database," "RDS," and "PostgreSQL" refer to the same thing in context
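Hierarchical indexing can be illustrated with a toy index that stores the same document at several granularities and picks a level using a crude query-scope heuristic. This is a sketch of the general technique under invented rules, not Lilli's pipeline:

```python
# Toy hierarchical index: one document stored at sentence, paragraph,
# and document granularity. The scope heuristic is an illustrative
# assumption.

def build_index(doc):
    """Index a document at sentence, paragraph, and document levels."""
    paragraphs = [p.strip() for p in doc.split("\n\n") if p.strip()]
    sentences = [s.strip() for p in paragraphs
                 for s in p.split(". ") if s.strip()]
    return {"sentence": sentences, "paragraph": paragraphs, "document": [doc]}

def retrieve(index, query):
    # Crude scope heuristic: "why"/"overview" questions get the whole
    # document; specific lookups get sentence-level chunks.
    level = ("document" if any(w in query.lower() for w in ("why", "overview"))
             else "sentence")
    tokens = query.lower().split()
    hits = [c for c in index[level] if any(t in c.lower() for t in tokens)]
    return level, hits

idx = build_index(
    "The auth service uses OAuth. Tokens expire hourly.\n\n"
    "Billing moved to Stripe in Q3."
)
level, hits = retrieve(idx, "OAuth tokens")  # narrow query -> sentence level
```

A broad question such as "Why did billing move?" would instead retrieve at document level under this heuristic.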
Competitive Landscape
The enterprise knowledge management market is $47B and growing at 12% annually (Gartner, 2026).
| Product | Approach | Strengths | Weaknesses |
|---|---|---|---|
| Lilli | AI Agent (conversational) | Multi-source, action execution | OpenAI lock-in, high price |
| Glean | AI search engine | Existing customer base | Not conversational |
| Notion AI | In-document AI | Notion ecosystem | Limited outside Notion |
| Confluence AI | In-document AI | Atlassian ecosystem | Limited to Jira/Confluence |
| Microsoft Copilot | M365 integration | Office ecosystem | Weak non-MS tool integration |
Pricing
- Starter: $20/user/month (5 source connections, 500 queries/month)
- Enterprise: $30/user/month (unlimited sources, unlimited queries, action execution)
- Enterprise Plus: $45/user/month (dedicated infra, 99.99% SLA, custom model fine-tuning)
- 14-day free trial
McKinsey's 2025 AI report found knowledge workers spend an average of 9.3 hours per week on information retrieval. If Lilli recovers 60% of that time, the productivity gain is roughly $15,000 per user per year, a clear ROI against a $30/month seat.
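Working that arithmetic through under stated assumptions (48 working weeks per year and the roughly $56/hour fully loaded labor rate implied by the $15,000 figure; neither number appears in the report itself):

```python
# ROI back-of-envelope check. The weeks-per-year and hourly-rate figures
# are assumptions chosen to match the article's $15,000 claim.

hours_per_week = 9.3    # time spent on information retrieval (McKinsey, 2025)
recovered = 0.60        # share of that time Lilli is assumed to recover
weeks_per_year = 48     # assumed working weeks
loaded_rate = 56        # assumed fully loaded cost, USD/hour

hours_saved = hours_per_week * recovered * weeks_per_year  # ~268 hours/year
annual_value = hours_saved * loaded_rate                   # ~$15,000/user/year
annual_cost = 30 * 12                                      # Enterprise tier: $360
roi = annual_value / annual_cost                           # ~42x
```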
Early Customer Feedback
Feedback from beta participants:
- Stripe: "Time to first PR for new engineers dropped from 2 weeks to 3 days"
- Figma: "Lilli was the most effective tool for tracking design decision history"
- Scale AI: "Internal ML experiment results became instantly searchable for the entire team"
Risks and Concerns
- Data privacy: Concerns about enterprise data passing through OpenAI servers. OpenAI states "Enterprise customer data is not used for model training," but some companies demand on-premise options
- Vendor lock-in: Dependency on OpenAI's API pricing changes or service disruptions
- Accuracy: The persistent RAG hallucination problem — risk of confidently delivering incorrect information
Related Projects and Background
The Evolution of RAG
Lilli's core technology, RAG (Retrieval-Augmented Generation), was proposed in 2020 by Patrick Lewis and colleagues at Facebook AI Research (now Meta AI). Early RAG was a simple two-step retrieve-then-generate pipeline, but it has evolved significantly since:
- Naive RAG (2020): Query → vector search → LLM generation
- Advanced RAG (2023): Query rewriting, hybrid search (vector + BM25), reranking
- Modular RAG (2024): Routing, iterative retrieval, self-evaluation
- Agentic RAG (2025–): The agent decides the retrieval strategy itself. Lilli operates at this level
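The "Naive RAG (2020)" stage above can be shown in a few lines: embed, retrieve the single best chunk by cosine similarity, generate, with no query rewriting, reranking, or iteration. Embedding and generation are stubbed with toy stand-ins (bag-of-words and a format string) rather than a real encoder and LLM:

```python
# Toy naive RAG: single-shot retrieve -> generate, nothing else.
import math
from collections import Counter

def embed(text):
    """Bag-of-words vector as a stand-in for a learned embedding."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def naive_rag(query, corpus, generate):
    scored = [(doc, cosine(embed(query), embed(doc))) for doc in corpus]
    top_doc = max(scored, key=lambda x: x[1])[0]  # single-shot retrieval
    return generate(query, top_doc)               # no rewriting, no reranking

corpus = [
    "The database was migrated to PostgreSQL in Q3 for replication support.",
    "Design tokens are stored in Figma and synced nightly.",
]
answer = naive_rag(
    "Why was the database migrated?",
    corpus,
    generate=lambda q, ctx: f"Context: {ctx}",
)
```

Each later stage in the list above adds machinery around this core loop: Advanced RAG rewrites the query and reranks candidates, Modular RAG iterates, and Agentic RAG lets the model choose the strategy per question.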
Comparison with Perplexity Enterprise
Perplexity also launched an enterprise search service in late 2025. The difference: Perplexity takes a hybrid approach combining external web search + internal documents, while Lilli focuses exclusively on internal data. For enterprises with high security requirements, Lilli's approach is advantageous.
The Bigger Picture: Enterprise AI Agents
Gartner predicts that by 2028, 33% of enterprises will automate business processes through AI agents. Salesforce's AgentForce, ServiceNow's Now Assist, and Atlassian's Rovo target the same market. Lilli's differentiator is tool-agnostic integration — Salesforce is limited to its ecosystem, ServiceNow to ITSM, but Lilli spans 40+ tools.
Implications
While existing knowledge management tools stop at "search," Lilli creates a flow from "search → understand → act."
In the bigger picture, this is a core pillar of OpenAI's B2B pivot. Having captured consumers with ChatGPT, they're now pursuing monthly recurring revenue (MRR) in the enterprise market with Lilli. The enterprise AI agent competition has officially begun.