I've been frustrated with AI coding tools that load 15K-28K tokens of system prompts before you can even ask a question. The model spends most of its attention reading the manual instead of working on your code.
So I built Huiyu Pi, a self-hosted AI coding agent whose system prompt is only ~80 tokens.

What it does:

  • Browser-based IDE (no heavy Electron app)
  • ~80-token system prompt (not 20K)
  • ~0.3 s to first token
  • 90%+ cheaper per request (far fewer input tokens)
  • 100% local: your code, API keys, and conversations never leave your machine
  • Multi-LLM: Claude, GPT, DeepSeek, Gemini, Mistral, Groq, xAI, OpenRouter
  • Built-in terminal, file editor, Git integration
  • PWA support (works on mobile)
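
For anyone who wants to sanity-check the cost claim, here is a rough back-of-envelope sketch rather than a benchmark. The per-token price and the 1,000-token request size are assumptions I made up for illustration, and the function names are hypothetical. Plug in your own provider's rates.

```typescript
// ASSUMPTION: illustrative price in USD per million input tokens.
// This is not a real provider's rate; substitute your own.
const ASSUMED_USD_PER_MILLION_INPUT_TOKENS = 3;

/** Cost in USD of sending `tokens` input tokens at a per-million-token rate. */
function inputCostUsd(tokens: number, usdPerMillion: number): number {
  return (tokens / 1_000_000) * usdPerMillion;
}

/**
 * Fraction of input tokens saved per request when the system prompt
 * shrinks from `largePrompt` to `smallPrompt` tokens, with the same
 * `userTokens` (question plus code context) sent in both cases.
 */
function inputSavings(
  largePrompt: number,
  smallPrompt: number,
  userTokens: number,
): number {
  const before = largePrompt + userTokens;
  const after = smallPrompt + userTokens;
  return 1 - after / before;
}

// ASSUMPTION: a 1,000-token request; the 20K and 80 figures come from the list above.
const userTokens = 1_000;
const saving = inputSavings(20_000, 80, userTokens);

console.log(`input tokens saved per request: ${(saving * 100).toFixed(1)}%`);
console.log(
  `system prompt cost per request: ` +
    `$${inputCostUsd(20_000, ASSUMED_USD_PER_MILLION_INPUT_TOKENS).toFixed(5)} vs ` +
    `$${inputCostUsd(80, ASSUMED_USD_PER_MILLION_INPUT_TOKENS).toFixed(5)}`,
);
```

With these assumed numbers the saving works out to about 94.9% of input tokens per request. Output tokens are billed the same either way, so the real-world figure depends on how long the replies are and how much context you send.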


How to try:

    npx huiyu-pi

Then open http://localhost:9144 in your browser.

Tech stack:

  • Frontend: React 19, TypeScript, Vite, Tailwind CSS
  • Backend: Fastify, WebSocket, SSE
  • Terminal: xterm.js + node-pty
  • License: MIT

GitHub: https://github.com/huiyu9144/Huiyu-Pi

Would love feedback from the self-hosted community!