spoonai

GenericAgent — a self-evolving agent grown from a 3.3K-line seed

lsdefine's minimalist agent: 9 atomic tools + ~100-line agent loop, with skill-tree accretion that turns solved tasks into permanent tools. arXiv 2604.20710.

GenericAgent skill-tree diagram — branching from 9 atomic tools
Source: GitHub (lsdefine)

9 + 100

Nine tools and a roughly hundred-line agent loop. That's the entire core of GenericAgent. Layer an LLM's coding ability on top and the agent operates a browser, terminal, file system, keyboard, mouse, screen vision, and ADB-controlled mobile devices at the system level. The repo has 8.8K stars (+320 in the past 24 hours) and an arXiv companion paper (2604.20710).

The pitch: start from a minimal seed, grow a full LLM-driven desktop control surface.

Why minimalism

Frameworks like SuperAGI and AutoGPT bundle large tool catalogs upfront. Context bloat and tool-selection errors scale with that catalog.

GenericAgent flips it. "Atomic tools cap at nine; new capabilities emerge as the LLM writes code." Solved tasks crystallize into "skills" stored in a skill tree. Repeat tasks reuse skills, cutting token cost roughly 6x — the paper's headline result.
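The "solved tasks become skills" idea can be sketched in a few lines. This is a minimal illustration, not the repo's code: the `Skill`, `SkillTree`, and `solve` names, the slash-separated paths, and the `write_code` callback standing in for the LLM are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Skill:
    name: str
    code: str      # source the model wrote when it first solved the task
    example: str   # one example invocation, stored next to the code


@dataclass
class SkillNode:
    """One node of a hypothetical skill tree, keyed by path segment."""
    children: dict = field(default_factory=dict)
    skill: Optional[Skill] = None


class SkillTree:
    def __init__(self) -> None:
        self.root = SkillNode()

    def add(self, path: str, skill: Skill) -> None:
        node = self.root
        for part in path.split("/"):
            node = node.children.setdefault(part, SkillNode())
        node.skill = skill

    def find(self, path: str) -> Optional[Skill]:
        node = self.root
        for part in path.split("/"):
            node = node.children.get(part)
            if node is None:
                return None
        return node.skill


def solve(tree: SkillTree, path: str, write_code: Callable[[str], str]):
    """Reuse a stored skill if present; otherwise generate once and store it.

    Returns (skill, generated), where generated is True only on the first call.
    """
    cached = tree.find(path)
    if cached is not None:
        return cached, False
    skill = Skill(name=path.rsplit("/", 1)[-1],
                  code=write_code(path),
                  example=f"run {path}")
    tree.add(path, skill)
    return skill, True
```

The second call to `solve` for the same path skips `write_code` entirely, which is where a token saving on repeat tasks would come from under these assumptions.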

[IMG#1]

Six features

| Feature | Description |
| --- | --- |
| 9 atomic tools | Browser, terminal, file, keyboard, mouse, vision, ADB, interface, self-modify |
| ~100-line agent loop | Minimal core, readability-first |
| Self-evolving skill tree | Solved tasks become permanent tools |
| Layered memory | 30K context preserved, 6x token savings reported |
| Dynamic runtime install | pip packages, external APIs, hardware on demand |
| 6 frontends | Streamlit (desktop) plus 5 messenger bots: QQ, Telegram, Feishu, WeCom, DingTalk |

Where most agents grow tool catalogs, GenericAgent goes the opposite way — keep tools small, grow skills.
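To make the "small tool set, short loop" shape concrete, here is a hedged sketch of an agent loop that dispatches to a fixed tool registry. The registry contents, the `{"tool": ..., "arg": ...}` action format, and the `scripted_llm` stand-in are assumptions for illustration; the real project's loop and tool signatures may look different.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical registry: tool name -> callable taking one string argument.
# Only two fake tools are shown; the project describes nine.
TOOLS: Dict[str, Callable[[str], str]] = {
    "terminal": lambda arg: f"(pretend output of `{arg}`)",
    "file": lambda arg: f"(pretend contents of {arg})",
}


def agent_loop(task: str, llm, tools=TOOLS, max_steps: int = 10):
    """Ask the model for an action, run the tool, feed the result back."""
    history: List[Tuple[str, str]] = [("task", task)]
    for _ in range(max_steps):
        action = llm(history)  # assumed shape: {"tool": str, "arg": str}
        if action["tool"] == "done":
            return action["arg"], history
        tool = tools.get(action["tool"])
        if tool is None:
            result = f"unknown tool: {action['tool']}"
        else:
            result = tool(action["arg"])
        history.append((action["tool"], result))
    return None, history  # step budget exhausted without a final answer


def scripted_llm(history):
    """Stand-in for a real model: one terminal call, then finish."""
    if len(history) == 1:
        return {"tool": "terminal", "arg": "ls"}
    return {"tool": "done", "arg": "listed the directory"}


answer, trace = agent_loop("list files", scripted_llm)
```

Swapping `scripted_llm` for a real model call is the only change needed to drive the loop; the registry stays the same size no matter how many tasks are solved.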

Tech stack

  • Language: Python
  • UI: Streamlit (desktop), 5 messenger bots
  • External integrations: ADB (mobile), Selenium/Playwright (browser)
  • LLM: OpenAI / Anthropic API

Three layers: (1) Agent Loop (~100 lines), (2) Atomic Tools (9 modules, ~300 lines each), (3) Layered Memory (context manager). New skills accumulate as code-plus-example pairs in a "Persistent Skills" region.
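One way to picture the layered-memory layer is a context manager with a pinned layer that is always kept and a working layer that is trimmed oldest-first. The sketch below reads the article's "30K context" figure as a token budget, which is an assumption, and it counts tokens by whitespace splitting rather than with a real tokenizer. The `LayeredMemory` class and its methods are hypothetical names.

```python
from collections import deque
from typing import List


def count_tokens(text: str) -> int:
    # Crude assumption: one token per whitespace-separated word.
    return len(text.split())


class LayeredMemory:
    def __init__(self, budget: int = 30_000) -> None:
        self.budget = budget
        self.pinned: List[str] = []   # always kept: task statement, skill index
        self.working = deque()        # recent tool results, oldest first

    def pin(self, text: str) -> None:
        self.pinned.append(text)

    def remember(self, text: str) -> None:
        self.working.append(text)
        self._trim()

    def used(self) -> int:
        return (sum(count_tokens(t) for t in self.pinned)
                + sum(count_tokens(t) for t in self.working))

    def _trim(self) -> None:
        # Drop the oldest working entries until the total fits the budget.
        while self.working and self.used() > self.budget:
            self.working.popleft()

    def context(self) -> str:
        return "\n".join([*self.pinned, *self.working])
```

Pinned entries are never evicted in this sketch, so a pinned layer larger than the budget would still overflow; a fuller version would need to summarize or page that layer too.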

Repo comparison

| Repo | Stars | License | Position |
| --- | --- | --- | --- |
| lsdefine/GenericAgent | 8.8K | Apache-2.0 | Minimal seed + self-evolving skill tree |
| TransformerOptimus/SuperAGI | 23K | MIT | Full tool catalog |
| Significant-Gravitas/AutoGPT | 168K | MIT | Early autonomous agent, full toolset |
| e2b-dev/awesome-ai-agents | 10K | MIT | Curation list (not a runtime) |

GenericAgent has the fewest stars of the four, but its self-evolving design sets it apart from the others. The design point matters more than the absolute star count here.

[IMG#2]

Why now — ecosystem context

Three trends converged. (1) Computer Use and the OSWorld benchmark made "screen-driven agents" a measurable category. (2) Skill-accumulation paradigms in the style of Voyager's skill library (Voyager is a Minecraft agent from the MineDojo project) are migrating into general LLM agents. (3) An arXiv companion paper pulled in academic users immediately.

Getting started

git clone https://github.com/lsdefine/GenericAgent
cd GenericAgent
cp mykey_template.py mykey.py   # add API keys
python launch.pyw

Common gotchas: mykey.py must contain an OpenAI or Anthropic API key, or the agent halts. On macOS, ADB requires brew install android-platform-tools.
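Both gotchas can be caught before launch with a small preflight script. This is a sketch that is not part of the repo: the `preflight` function is hypothetical, and it only checks that mykey.py exists and that an `adb` binary is on PATH. It does not validate the keys themselves, because the variable names inside mykey.py are not given here.

```python
import shutil
from pathlib import Path
from typing import List


def preflight(repo_dir: str = ".") -> List[str]:
    """Return a list of human-readable problems; empty means ready to launch."""
    problems = []
    if not (Path(repo_dir) / "mykey.py").is_file():
        problems.append(
            "mykey.py not found: copy mykey_template.py and add API keys")
    if shutil.which("adb") is None:
        problems.append(
            "adb not on PATH: mobile control will be unavailable")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("WARN:", problem)
```

Run it from the cloned GenericAgent directory before `python launch.pyw`; no output means both checks passed.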

Limits and outlook

Limits: (1) Security: the LLM has code-execution rights with weak isolation, which is risky in production environments; a Docker-isolation PR is in flight. (2) Scale: skill-tree retrieval cost rises with tree size, so the 6x token-savings claim should be re-measured beyond ~1,000 skills.
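To illustrate what "weak isolation" leaves open, the sketch below runs model-written code in a child interpreter with a time limit. It is an assumption-laden example, not the project's mechanism: `run_generated` is a hypothetical helper, and a subprocess with a timeout is not a sandbox, since the child keeps the user's file and network access. Container-level isolation would still be needed on top.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_generated(code: str, timeout: float = 5.0):
    """Run untrusted code in a separate interpreter with a time limit.

    Returns (returncode, combined output), or (None, "timed out").
    NOT a sandbox: the child process has the same OS permissions as the parent.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "skill.py"
        script.write_text(code, encoding="utf-8")
        try:
            proc = subprocess.run(
                # -I: isolated mode (ignores PYTHON* env vars and user site dir)
                [sys.executable, "-I", str(script)],
                cwd=workdir,
                capture_output=True,
                text=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return None, "timed out"
        return proc.returncode, proc.stdout + proc.stderr
```

The timeout bounds runaway loops and the temporary working directory keeps stray relative-path writes out of the repo, but neither stops deliberate access to the rest of the system.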

Outlook: the next six months will likely bring (a) Docker/gVisor isolation, (b) a skill marketplace for community sharing, (c) MCP integration to expose skills to other agents.

[IMG#3]

3-Line Summary

  • 9 atomic tools + 100-line agent loop deliver system-level control; 8.8K stars.
  • Solved tasks become permanent skills; reported 6x token savings via the skill tree.
  • arXiv 2604.20710 companion paper grounds the "small tools, growing skills" inversion.
