v0.3 · 66/66 passing on the baseline suite

An AI workforce you actually host yourself.

16 specialist agents. 19 integrations. All running on your own box, with your own LLM provider, behind your own auth. No cloud middleman, no vendor lock-in, no data leaving your network.

100% · Baseline evals passing
16 · Specialist agents
19 · Integrations
~5 min · From clone to running

Proof, not promises

Every agent ships with a regression suite.

We don't just say the agents work — we prove it with an automated eval suite scored by an independent LLM judge on every commit. Here's the current baseline, run against Claude Sonnet 4.6 with Opus 4.6 grading.

✓ 66 / 66 passing · 100%
baseline scenarios · 16 roles · ~15 min to run
Role · Passing
🔎 Knowledge Agent · 4 / 4
⚙️ Backend Engineer · 5 / 5
🎨 Frontend Engineer · 4 / 4
🚀 DevOps Engineer · 5 / 5
🔍 QA Engineer · 4 / 4
✏️ UI/UX Designer · 4 / 4
📊 Data Analyst · 4 / 4
📈 Growth Analyst · 4 / 4
📢 Marketing Manager · 4 / 4
💼 Sales Rep · 4 / 4
💬 Customer Support · 4 / 4
📝 Technical Writer · 4 / 4
📋 Project Manager · 4 / 4
🎯 Recruiter · 4 / 4
⚖️ Legal Analyst · 4 / 4
💰 Finance / Bookkeeper · 4 / 4

# Run them on your own install
$ python -m evals.runner --all
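
As a rough illustration of how a judge-scored regression suite can be tallied per role, here is a minimal Python sketch. The names Scenario, run_suite, fake_agent, and fake_judge are hypothetical and are not taken from evals.runner; the two stubs stand in for the agent model and the grading model so the sketch runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Scenario:
    role: str    # which specialist the scenario targets
    prompt: str  # task handed to the agent
    rubric: str  # what the judge looks for (hypothetical format)


def run_suite(
    scenarios: List[Scenario],
    agent: Callable[[str, str], str],
    judge: Callable[[str, str], bool],
) -> Dict[str, Tuple[int, int]]:
    """Return {role: (passed, total)} for the given scenarios."""
    tally: Dict[str, Tuple[int, int]] = {}
    for s in scenarios:
        answer = agent(s.role, s.prompt)
        ok = judge(s.rubric, answer)
        passed, total = tally.get(s.role, (0, 0))
        tally[s.role] = (passed + int(ok), total + 1)
    return tally


# Offline stand-ins (assumptions): a real run would call the agent LLM
# and a separate grader LLM here.
def fake_agent(role: str, prompt: str) -> str:
    return f"[{role}] reply to: {prompt}"


def fake_judge(rubric: str, answer: str) -> bool:
    return rubric.lower() in answer.lower()


scenarios = [
    Scenario("Backend Engineer", "Design a pagination API", "pagination"),
    Scenario("Backend Engineer", "Add an index for slow queries", "index"),
    Scenario("QA Engineer", "Write a regression test plan", "regression"),
]
tally = run_suite(scenarios, fake_agent, fake_judge)
passed = sum(p for p, _ in tally.values())
total = sum(t for _, t in tally.values())
print(f"{passed} / {total} passing")
```

Keeping the judge a separate callable from the agent is what lets a different model do the grading.
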
What makes it different

Six decisions you can't reverse in a cloud product.

These are the choices that make Smart Agents Pro a product you install, not a service you subscribe to.

🏠

On-prem first

Run docker compose up -d and you're live. Your data, your box, your network. Going fully air-gapped with Ollama takes a single env var.

🔀

Provider-flexible LLMs

Eight providers wired: Anthropic, OpenAI, Groq, Cerebras, Gemini, OpenRouter, Ollama, and any OpenAI-compatible endpoint. Swap models per role from the admin UI, no code changes.
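
One way to picture per-role model selection is a lookup table with a default. This is only a sketch under assumptions: ModelChoice, ROLE_OVERRIDES, model_for, and the model names are hypothetical placeholders, and the product stores these settings through the admin UI rather than in code.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ModelChoice:
    provider: str  # e.g. "anthropic", "ollama" (labels are assumptions)
    model: str     # placeholder model identifier


# Hypothetical defaults; real values would come from saved admin settings.
DEFAULT = ModelChoice("anthropic", "example-default-model")
ROLE_OVERRIDES: Dict[str, ModelChoice] = {
    "Data Analyst": ModelChoice("ollama", "example-local-model"),
}


def model_for(role: str) -> ModelChoice:
    """Pick the role's override if one is set, else the default."""
    return ROLE_OVERRIDES.get(role, DEFAULT)


print(model_for("Data Analyst"))
print(model_for("Sales Rep"))
```
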

🧬

Cross-agent memory

When Sales learns that a customer prefers async demos, Marketing sees it too. Every specialist writes to and queries a shared knowledge store — the Knowledge Agent searches across it all.
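
The idea of a shared store that any specialist can write to and query can be sketched as a tagged in-memory list. SharedMemory, its methods, and the tag format are assumptions made for illustration (the tags mirror the ones shown in the demo answer further down); the product itself ships with Mongo, and its actual schema is not described here.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Memory:
    author: str           # which specialist recorded the fact
    text: str
    tags: FrozenSet[str]


class SharedMemory:
    """Toy in-memory stand-in for a shared knowledge store."""

    def __init__(self) -> None:
        self._items: List[Memory] = []

    def write(self, author: str, text: str, tags: Iterable[str]) -> None:
        self._items.append(Memory(author, text, frozenset(tags)))

    def query(self, *tags: str) -> List[Memory]:
        """Return every memory that carries all of the requested tags."""
        wanted = set(tags)
        return [m for m in self._items if wanted <= m.tags]


store = SharedMemory()
store.write("Sales Rep", "Acme Corp prefers async demos over scheduled calls.",
            ["customer:acme"])
# The author of this entry is assumed for the example.
store.write("Finance", "Switched from Chargebee to Stripe in December 2024.",
            ["billing", "decision"])

# Any other role can read what Sales wrote.
hits = store.query("customer:acme")
print([m.text for m in hits])
```
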

Approvals, not chaos

Destructive actions stop and ask. Your agents can push code, send emails, and touch production — but only after you approve from the web UI or Telegram.
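
A stop-and-ask gate can be sketched as a queue that holds destructive actions until someone approves them. Everything named here (ApprovalGate, DESTRUCTIVE, the action labels) is hypothetical; how the product classifies actions and delivers approval prompts to the web UI or Telegram is not shown.

```python
import itertools
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# Hypothetical set of action names that require human sign-off.
DESTRUCTIVE = {"push_code", "send_email", "deploy"}


@dataclass
class Pending:
    id: int
    action: str
    run: Callable[[], str]


class ApprovalGate:
    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self.pending: Dict[int, Pending] = {}

    def submit(self, action: str, run: Callable[[], str]) -> Optional[str]:
        """Run safe actions immediately; park destructive ones for review."""
        if action not in DESTRUCTIVE:
            return run()
        req = Pending(next(self._ids), action, run)
        self.pending[req.id] = req
        return None

    def approve(self, req_id: int) -> str:
        """Execute a parked action once a human has approved it."""
        return self.pending.pop(req_id).run()

    def reject(self, req_id: int) -> None:
        """Drop a parked action without running it."""
        self.pending.pop(req_id)


gate = ApprovalGate()
sent = []
read_result = gate.submit("read_file", lambda: "contents")
queued = gate.submit("send_email", lambda: sent.append("hello") or "sent")
print(read_result, queued, list(gate.pending))
```
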

📜

Append-only audit log

Every state-changing action — logins, task submissions, model swaps, integration changes — lands in a tamper-evident log. Filterable by user, action, time.
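
One common way to make an append-only log tamper-evident is a hash chain, where each record's hash covers the previous record's hash. Whether Smart Agents Pro uses this technique is not stated; the AuditLog class below is a hypothetical sketch of the general idea, with timestamps left out to keep it short.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder hash that precedes the first record


def _digest(prev_hash: str, entry: Dict[str, str]) -> str:
    """Hash the entry together with the previous record's hash."""
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


class AuditLog:
    def __init__(self) -> None:
        self._records: List[dict] = []

    def append(self, user: str, action: str, detail: str = "") -> None:
        prev = self._records[-1]["hash"] if self._records else GENESIS
        entry = {"user": user, "action": action, "detail": detail}
        self._records.append({"entry": entry, "hash": _digest(prev, entry)})

    def verify(self) -> bool:
        """Recompute the chain; any edited entry breaks it."""
        prev = GENESIS
        for rec in self._records:
            if rec["hash"] != _digest(prev, rec["entry"]):
                return False
            prev = rec["hash"]
        return True

    def by_user(self, user: str) -> List[Dict[str, str]]:
        return [r["entry"] for r in self._records if r["entry"]["user"] == user]


log = AuditLog()
log.append("admin@example.com", "login")
log.append("admin@example.com", "model_swap", "Backend Engineer model changed")
print(log.verify())
```

Editing any stored entry after the fact changes its digest, so verify() returns False from that record onward.
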

🧪

Evals built in

Regression suite per role. Run it before every release, catch behavioral drift when you swap models. No more "it worked last week" surprises.
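
Catching drift after a model swap comes down to comparing two runs of the same scenarios. The sketch below assumes each run is a plain mapping from a scenario id to pass/fail; the ids and the regressions helper are hypothetical and not part of the eval runner.

```python
from typing import Dict, List


def regressions(baseline: Dict[str, bool], candidate: Dict[str, bool]) -> List[str]:
    """Scenario ids that passed in the baseline but not in the candidate run.

    A scenario missing from the candidate run is treated as failing.
    """
    return sorted(
        sid for sid, ok in baseline.items()
        if ok and not candidate.get(sid, False)
    )


# Made-up scenario ids for illustration only.
before = {"backend-01": True, "backend-02": True, "qa-01": True}
after = {"backend-01": True, "backend-02": False, "qa-01": True}
drifted = regressions(before, after)
print(drifted)
```
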

The killer feature

Ask anything about your team's work.

The Knowledge Agent fans out across your shared memory, Confluence, Notion, Jira, Linear, Slack, and GitHub, then synthesizes a cited answer, typically in 5–15 seconds. Read-only by design. No hallucinated quotes.

You
What do we know about Acme and how do we handle billing?

🔎 Knowledge Agent

Here's everything the team's knowledge base currently contains on both topics:

Acme Corp
  • Acme Corp prefers async demos over scheduled calls. When engaging them, favor recorded demos or written walkthroughs.
Billing
  • Switched from Chargebee to Stripe in December 2024 to support usage-based pricing.
Sources
  • Shared memory · customer:acme
  • Shared memory · billing,decision
answered in 17.7s
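
The fan-out behind an answer like the one above can be sketched with asyncio: query every source concurrently, then keep each hit paired with the source it came from so the answer can cite it. FAKE_SOURCES, search, and fan_out are hypothetical stand-ins; the real connectors and the synthesis step are not shown.

```python
import asyncio
from typing import Dict, List, Tuple

# Toy corpus standing in for real connectors (assumption).
FAKE_SOURCES: Dict[str, List[str]] = {
    "Shared memory": ["Acme Corp prefers async demos over scheduled calls."],
    "Slack": [],
    "GitHub": [],
}


async def search(source: str, keyword: str) -> List[Tuple[str, str]]:
    """Read-only lookup in one source; returns (source, text) pairs."""
    await asyncio.sleep(0)  # stands in for a network round trip
    return [
        (source, text)
        for text in FAKE_SOURCES[source]
        if keyword.lower() in text.lower()
    ]


async def fan_out(keyword: str) -> List[Tuple[str, str]]:
    """Query all sources concurrently and flatten the results."""
    batches = await asyncio.gather(*(search(s, keyword) for s in FAKE_SOURCES))
    return [hit for batch in batches for hit in batch]


hits = asyncio.run(fan_out("acme"))
for source, text in hits:
    print(f"{text} [{source}]")
```

Because the searches run concurrently, total latency tracks the slowest source rather than the sum of all of them.
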
Try it in the app →
The team

16 specialists. Enable the ones you need.

Each one is a focused agent with its own system prompt, tool selection, and regression suite. Toggle any of them on or off from the admin page.

🔎 Knowledge Agent · Ask anything
⚙️ Backend Engineer · API, DB, server-side
🎨 Frontend Engineer · UI, design system, a11y
✏️ UI/UX Designer · Wireframes, specs
🚀 DevOps Engineer · Docker, CI/CD, infra
🔍 QA Engineer · Tests, regressions
📊 Data Analyst · Queries, reports
📈 Growth Analyst · Funnels, A/B tests
📢 Marketing Manager · Research, content
💼 Sales Rep · Leads, CRM, outreach
💬 Customer Support · FAQ, triage, escalation
📝 Technical Writer · README, API docs
📋 Project Manager · Planning, status
🎯 Recruiter · JDs, screening, ATS
⚖️ Legal Analyst · Contract review (not advice)
💰 Finance · Expenses, runway, P&L

Integrations

19 services connected from the admin UI.

Paste a credential, click Test connection, it's live. No code changes, no restarts. Everything encrypted at rest with Fernet, every change logged to the audit trail.

🐙 GitHub
🦊 GitLab
🔐 SSH/EC2
🧠 LLM Providers
📌 Jira
📖 Confluence
📓 Notion
📐 Linear
💬 Slack
👥 Teams
🚨 PagerDuty
📈 Datadog
💳 Stripe
🧲 HubSpot
🎫 Zendesk
🗄️ SQL Databases
📧 Gmail
📁 Drive
✈️ Telegram
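
For readers unfamiliar with Fernet, the sketch below shows what encrypting a credential at rest looks like using the Fernet class from the Python cryptography package. The helper names and the sample secret are hypothetical, and how the product generates and stores its key is not covered here.

```python
from cryptography.fernet import Fernet, InvalidToken


def encrypt_credential(key: bytes, secret: str) -> bytes:
    """Return an authenticated, encrypted token for the secret."""
    return Fernet(key).encrypt(secret.encode("utf-8"))


def decrypt_credential(key: bytes, token: bytes) -> str:
    """Recover the secret; raises InvalidToken on a wrong key or tampering."""
    return Fernet(key).decrypt(token).decode("utf-8")


key = Fernet.generate_key()  # in practice the key is kept outside the database
token = encrypt_credential(key, "example-credential-not-real")
print(decrypt_credential(key, token))
```

Fernet tokens are authenticated as well as encrypted, so a modified token fails to decrypt instead of yielding garbage.
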
Install

Five minutes from git clone to "signed in."

One repo. One env file. One docker compose up. Smart Agents Pro brings its own Mongo, its own JWT secret, and the app container. You bring an LLM key.

Quick start

# 1. Clone the repo
$ git clone https://github.com/choochtech/smart-agents-pro.git
$ cd smart-agents-pro

# 2. Configure — set at least ANTHROPIC_API_KEY in .env
$ cp .env.example .env
$ nano .env    # or your editor of choice

# 3. Bring the stack up (app + mongo sidecar)
$ docker compose up -d

# 4. Open the app — first signup becomes the admin
$ open http://localhost:8090    # macOS; on Linux use xdg-open or just visit the URL

1. Bring your own LLM

Set one of: ANTHROPIC_API_KEY, OPEN_AI_API_KEY, GROQ_API_KEY, CEREBRAS_API_KEY, GEMINI_API_KEY, OPENROUTER_API_KEY. Or point at a local Ollama.

2. Auto-create the admin

Set AI_AGENTS_ADMIN_EMAIL + AI_AGENTS_ADMIN_PASSWORD in .env before booting, or let the first visitor sign up as admin.

3. Connect integrations

From the Integrations admin page, paste credentials for GitHub, Jira, Slack, databases, etc. Click Test connection, it's live.
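
Before booting, it can help to check which LLM keys your environment actually provides. The sketch below uses the variable names listed in the steps above; the provider labels and the configured_providers helper are hypothetical, and the app's own startup logic may differ.

```python
import os
from typing import List, Mapping

# Env var names come from the steps above; the labels are assumptions.
KEY_VARS = {
    "ANTHROPIC_API_KEY": "anthropic",
    "OPEN_AI_API_KEY": "openai",
    "GROQ_API_KEY": "groq",
    "CEREBRAS_API_KEY": "cerebras",
    "GEMINI_API_KEY": "gemini",
    "OPENROUTER_API_KEY": "openrouter",
}


def configured_providers(env: Mapping[str, str] = os.environ) -> List[str]:
    """List providers whose API key variable is set and non-empty."""
    return [label for var, label in KEY_VARS.items() if env.get(var)]


found = configured_providers()
print(found or "No API key set; point the app at a local Ollama instead.")
```
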

View on GitHub →