A personal SRE bot on Telegram that uses my Claude Max subscription, wakes up only when prod breaks, and runs on a 5W mini PC
I forked NanoClaw, wired it to my Claude Max OAuth token, and added monitoring crons that wake the agent only when something is actually wrong. $0 marginal cost, 5W at the wall, three production projects watched from my phone.
Three production projects. One mini PC on a shelf, pulling less power than my phone charger. A personal SRE bot that watches all three from Telegram and pings me only when something actually broke. Proposes a fix. Waits for a y. Opens the PR. Shuts up. And doesn’t touch a cent of my budget.
TL;DR: Forked NanoClaw, ~7k TS LOC I can read in an afternoon. Each message spawns an ephemeral Docker container. Claude credentials never land inside — OneCLI injects them at the proxy. Scheduled tasks run a cheap bash script first; the LLM wakes only if the script returns wakeAgent: true. Wired to my Claude Max OAuth token, which I pay for anyway for day work, so $0 marginal cost. Runs on an Intel N150 pulling 5W at the wall. Three production projects watched from my phone: AlertaAnunt.ro, this blog, one work project I won’t name.
What Claw does for me
Not in theory. In practice, on any given week:
When I ask (reactive, over Telegram or voice notes):
- Investigates a prod exception: reads traceback + surrounding source, cross-references Sentry, checks recent commits on the affected repo, proposes a diff, opens the PR on approval.
- Clears GitHub issue backlogs end-to-end on my OSS packages (pip-upgrader, reviewd): per-issue triage, PR, CI wait, merge, issue comment + close, git tag vX.Y.Z. The repo’s publish.yml picks up the tag and publishes to PyPI via OIDC.
- Cold outreach for AlertaAnunt.ro: uses WebSearch to find realtors and car dealers in specific regions, verifies contact info with agent-browser (Chromium inside the container), then drafts personalized emails via Gmail from a Markdown contact table. Shows me each draft on Telegram, sends and deletes the draft on approval, marks the row “sent -”. Replies I still read myself; no auto-reply cron.
- Acts on voice notes I send while walking: Telegram audio → Groq Whisper → transcribed command → agent executes.
- Queries prod Kubernetes via an SSH bastion with a dedicated key, and reads AlertaAnunt.ro’s SQLite database read-only for user and request analytics.
- Ships to this blog: opens a PR with copy edits when I spot a typo, drafts a new post from a few bullets, runs npm run build in the container to verify.
- Reports on any long-running batch just by being asked “Deci?” (Romanian for “so?”): reads the on-disk tracker and tells me where it’s at.
On its own (proactive, cron-gated):
- Hourly: scans AlertaAnunt.ro’s container logs for new exceptions. Wakes only if there is something to actually report.
- 4× per day: reviews free-text messages to the AlertaAnunt.ro bot that didn’t match any known command or button, classifies intent, summarizes on Telegram.
- Daily: compares the search-requests-to-new-listings ratio on each scraped site against the 24h baseline; if a parser has gone silent, proposes a selector fix.
- After any merged PR it pushed: calls schedule_task(+2min, "verify containers restarted") on itself, then goes silent until the check fires.
A real message
I noticed something weird and wrote to my bot while in a taxi:
Simion: heads up, on imobiliare we’re also including sponsored/ads
Bot: confirmed, 23% of the results sent in the last 48h are sponsored. Cause: the site renamed the badge from .ad-badge to .sponsored-label. Proposal: selector fix + backfill so we exclude sponsored results from the notifications already sent? [y/n]
Tap y. PR opens. CI runs. Merges. Backfill script kicks off. Done.
Most days I don’t send anything at all. The bot sits silent. Health-check crons run in the background; when they pass, nothing wakes up. That’s the whole point.
Why NanoClaw
I evaluated OpenClaw first. Half a million lines of TypeScript. 53 config files. Seventy dependencies. One Node process where every group shares memory with every other. Permission checks at the application layer, which is another way of saying “if a bug in the permission code lets something through, your host is on fire.” I could not have told you what it was actually allowed to do. Closed the tab.
NanoClaw’s author had the same reaction and wrote it more plainly than I would have:
I wouldn’t have been able to sleep if I had given complex software I didn’t understand full access to my life.
That’s exactly it. My fork is 7000 lines across 26 files, and three invariants do most of the work:
- Security from OS isolation, not from permission checks. Bash runs in a container, not on my host. The container IS the allowlist.
- No config files, by design. Customization = diffs in git. YAML is just implicit code you can’t grep.
- AI-native ops. No dashboard, no debugger. When I want to know why the scheduler didn’t fire at 08:00, I don’t open a log viewer. I type “why didn’t the briefing send?” to the bot and it digs through journalctl and SQLite and reports back.
And the meta joke I actually enjoy: NanoClaw runs on Claude Agent SDK, and I extend NanoClaw using Claude Code on my laptop. Same OAuth token does both.
My fork = core + skills I chose
NanoClaw ships a minimal base. Features land as skill/* branches you merge on demand. 32 skills available. I merged the ones I use:
src/channels/telegram.ts ← /add-telegram
src/channels/gmail.ts ← /add-gmail
src/channels/whatsapp.ts ← /add-whatsapp (disabled)
src/transcription.ts ← /add-voice-transcription
container/skills/... ← /channel-formatting, /add-compact
+ OneCLI wiring ← /init-onecli, /use-native-credential-proxy
Not merged: Slack, Discord, Signal, Emacs, macOS statusbar, Ollama, wiki skills. Perfectly good code, not my use case. My codebase is 7k LOC because that is what my use case costs. Nobody carries anyone else’s weight.
Hardware + cost
- Box: Intel N150, 4 cores, 16 GB RAM, passive cooled. 5W at the wall. Less than my phone charger pulls while charging.
- Co-tenants: 22 containers on the same host (Immich, Paperless-ngx, Vaultwarden, n8n, Caddy, OneCLI, my olx crawlers, 3 NanoClaw agent groups, a few odds and ends). None of them saturate. There is no fan to spin.
- LLM cost: zero marginal. I pay for Claude Max anyway for day work. NanoClaw reuses the exact same OAuth token via CLAUDE_CODE_OAUTH_TOKEN. The container sees placeholder; OneCLI injects the real token at request time. No second invoice exists.
- Voice transcription: Groq’s free-tier whisper-large-v3. Voice notes on Telegram become agent input text. Never hit the limit. My total spend on voice-enabled control of my prod is still zero.
wakeAgent: the most useful pattern I added
Before the gate pattern, the thing that makes any of this feel alive: the container agent has MCP tools it can call mid-conversation — send_message, schedule_task, list_tasks, update_task, pause_task, resume_task, cancel_task, register_group. It can schedule its own follow-ups. After git push, it casually calls schedule_task(run_at=+2min, prompt="verify containers restarted") and moves on. Two minutes later a new ephemeral container wakes, runs a health check, reports back on Telegram. Self-orchestration, not a stateless function call. That’s the thing that makes the loop feel like a colleague, not a webhook.
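A minimal sketch of that self-scheduled follow-up, in TypeScript. Only the tool name schedule_task and the run_at and prompt arguments come from the text above. The FollowUpTask shape, the ISO timestamp format, and the followUpAfter helper are assumptions made for illustration, not NanoClaw's actual MCP schema.

```typescript
// Hypothetical payload for the schedule_task tool. Field names beyond
// run_at and prompt are illustrative, not NanoClaw's real schema.
interface FollowUpTask {
  tool: "schedule_task";
  run_at: string; // ISO 8601 timestamp (assumed format)
  prompt: string;
}

// Build a one-shot follow-up that fires `delayMinutes` after `now`.
function followUpAfter(now: Date, delayMinutes: number, prompt: string): FollowUpTask {
  if (delayMinutes <= 0) {
    throw new Error("delayMinutes must be positive");
  }
  const runAt = new Date(now.getTime() + delayMinutes * 60_000);
  return { tool: "schedule_task", run_at: runAt.toISOString(), prompt };
}

// After a push: queue the health check, then return control immediately.
const pushedAt = new Date("2025-01-01T10:00:00.000Z");
const task = followUpAfter(pushedAt, 2, "verify containers restarted");
console.log(JSON.stringify(task));
```

The point of the shape is that the agent hands off a prompt and a time, not a running process: the follow-up runs in a fresh container with no state carried over except what the prompt says.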
Now the wake gate. Every scheduled task has an optional script field. Runs first, 30s timeout, prints one line of JSON:
{"wakeAgent": false}
or
{"wakeAgent": true, "data": {"errors": [...]}}
If false, the LLM is never invoked. Task ends silently. If true, data is injected into the prompt so the agent starts with pre-collected signal.
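The gate's decision logic can be sketched as follows. The real gates are bash scripts; this is the same idea written in TypeScript for readability. Only the wakeAgent/data JSON contract and the ERROR/Exception/Traceback signals come from the post. The GateResult type and the logGate function are hypothetical names, and collecting the log lines (for example with docker logs --since 1h) is left to the caller.

```typescript
// Output contract from the text: one line of JSON with wakeAgent and optional data.
interface GateResult {
  wakeAgent: boolean;
  data?: { errors: string[] };
}

// The signals the hourly gate looks for: ERROR, Exception, Traceback.
const ERROR_PATTERN = /\b(ERROR|Exception|Traceback)\b/;

// Decide whether the agent should wake, given the last hour of log lines.
function logGate(lines: string[]): GateResult {
  const errors = lines.filter((line) => ERROR_PATTERN.test(line));
  return errors.length === 0
    ? { wakeAgent: false }
    : { wakeAgent: true, data: { errors } };
}

// Example log lines are made up for illustration.
const quiet = logGate(["INFO scraped 42 listings", "INFO sent 3 notifications"]);
const noisy = logGate(["INFO scraped 42 listings", "ERROR parser failed: KeyError 'price'"]);

console.log(JSON.stringify(quiet)); // {"wakeAgent":false}
console.log(JSON.stringify(noisy));
```

Keeping the gate a pure filter over lines means the expensive part (the LLM) only ever sees the lines that matched, already collected in data.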
My active gates on AlertaAnunt.ro:
| Cron | Gate | Wakes when |
|---|---|---|
| 0 * * * * | Docker container logs | New ERROR/Exception/Traceback in the last hour |
| 13 8,12,16,20 * * * | User messages to bot | Free-text messages that don’t match known patterns |
| 0 12 * * * | Search-to-listings ratio | Drops well below 24h baseline (parser likely broke) |
99% of firings return wakeAgent: false. No LLM call. No Telegram ping. No line in any log I will feel obligated to check. My morning stays quiet on purpose.
When it does wake up, it works
Exception monitoring. Gate scans container logs hourly for new exceptions. When one shows up, the agent reads the traceback, cross-references Sentry, checks recent commits on the affected repo, and drops a Telegram message: cause, proposed diff, [y/n]. I approve, PR opens, I review on the laptop later. Runs zero times on a clean day. Two or three on a bad one.
User intent drill-down. AlertaAnunt.ro has slash commands (/start, /add, /list, /status, /upgrade, /invite, /help, /clear), inline buttons, URL parsing. And then real users do what real users do: they type free text. “Why didn’t I get a notification today?” “Can I filter by price?” “Is this working on iPhone?” That used to die in a log I never read. Now a gate runs four times a day, collects the free-text messages, strips the ones that match known patterns, and if anything is left wakes the agent with the raw list plus per-user context. The agent classifies and summarizes on Telegram: “3 unexpected messages. A has a filter that excludes all results (UI bug). B wants a feature that doesn’t exist. C is spam.” The dead-letter queue became a signal I review twice a day with one LLM call.
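The filtering step of that gate might look like the sketch below. The slash-command list is taken from the paragraph above. The UserMessage fields, the function names, and the single-URL regex (a stand-in for the bot's real URL parsing) are assumptions. Inline buttons are not modeled here, on the assumption that they arrive as callbacks rather than text.

```typescript
// Slash commands listed in the text.
const KNOWN_COMMANDS = ["/start", "/add", "/list", "/status", "/upgrade", "/invite", "/help", "/clear"];

interface UserMessage {
  userId: string; // hypothetical field names
  text: string;
}

// True when the message is something the bot already handles.
function matchesKnownPattern(text: string): boolean {
  const trimmed = text.trim();
  const firstWord = trimmed.split(/\s+/)[0] ?? "";
  if (KNOWN_COMMANDS.includes(firstWord)) return true;
  // Simplified stand-in for URL parsing: the whole message is one http(s) URL.
  return /^https?:\/\/\S+$/.test(trimmed);
}

// Keep only the dead letters: free text the bot had no handler for.
function unmatchedMessages(messages: UserMessage[]): UserMessage[] {
  return messages.filter((m) => m.text.trim() !== "" && !matchesKnownPattern(m.text));
}

const leftovers = unmatchedMessages([
  { userId: "u1", text: "/list" },
  { userId: "u2", text: "Why didn't I get a notification today?" },
  { userId: "u3", text: "https://example.com/listing/123" },
  { userId: "u4", text: "Can I filter by price?" },
]);

// Wake the agent only when something is left over.
const gateOutput = leftovers.length > 0
  ? { wakeAgent: true, data: { messages: leftovers } }
  : { wakeAgent: false };
console.log(JSON.stringify(gateOutput));
```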
Gmail outreach. Cold email campaign for AlertaAnunt.ro. I point the agent at a region and a persona (“Bucharest, active real estate agents” or “used-car dealers in Cluj”) and it does the rest. WebSearch to find candidates, agent-browser to verify contact info on the candidate’s own site, then drafts personalized emails via Gmail from a Markdown contact table. It checks format against CLAUDE.md rules (no em dashes, keep it short, use the template), shows me each draft on Telegram, and on approval sends through Gmail and immediately deletes the draft. Then updates the Markdown row with “sent -
Selector health. Scrapers rot silently. “Zero new listings” can mean the market is quiet or the parser is broken, and I usually don’t know which. So the daily gate compares search-requests-to-new-listings against the 24h baseline per site. If the ratio drops, the agent curls the site, diffs the DOM against what the parser expects, and proposes a selector fix. Same approval loop as the exception monitor.
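A sketch of the comparison behind that gate, under stated assumptions. The post does not define the ratio precisely, so this version measures new listings per search request (a yield that falls when a parser goes silent). The 0.5 drop factor, the SiteWindow shape, and the example numbers are all hypothetical.

```typescript
interface SiteWindow {
  site: string;
  searchRequests: number;
  newListings: number;
}

// New listings per search request; 0 when the site was not queried at all.
function listingYield(w: SiteWindow): number {
  return w.searchRequests === 0 ? 0 : w.newListings / w.searchRequests;
}

// Flag sites whose current yield is below `dropFactor` times the baseline yield.
// dropFactor = 0.5 is an assumed threshold, not the author's.
function silentParsers(current: SiteWindow[], baseline: SiteWindow[], dropFactor = 0.5): string[] {
  const baselineBySite = new Map(baseline.map((w) => [w.site, listingYield(w)] as const));
  return current
    .filter((w) => {
      const base = baselineBySite.get(w.site);
      // No baseline or no traffic means there is nothing to compare against.
      if (base === undefined || base === 0 || w.searchRequests === 0) return false;
      return listingYield(w) < base * dropFactor;
    })
    .map((w) => w.site);
}

// Made-up numbers: imobiliare stops yielding listings, olx stays near baseline.
const baseline24h: SiteWindow[] = [
  { site: "imobiliare", searchRequests: 240, newListings: 120 },
  { site: "olx", searchRequests: 240, newListings: 60 },
];
const today: SiteWindow[] = [
  { site: "imobiliare", searchRequests: 10, newListings: 0 },
  { site: "olx", searchRequests: 10, newListings: 2 },
];

const suspects = silentParsers(today, baseline24h);
console.log(JSON.stringify({ wakeAgent: suspects.length > 0, data: { suspects } }));
```

Comparing against a per-site baseline, instead of a fixed count, is what separates "the market is quiet" from "the parser broke": a quiet market lowers requests and listings together, while a broken selector lowers only the listings.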
OSS maintenance on demand. I maintain pip-upgrader and reviewd. GitHub issues and Dependabot alerts accumulate on both, because that’s what active OSS projects do. Normally I’d batch them for a weekend hack session and half would die of old age. Now it goes like this:
Simion: look at the open issues on pip-upgrader, investigate with Opus, propose fixes
The agent reads each issue, reproduces where possible, greps the source for context, summarizes per-issue on Telegram. I reply da (yes) or skip. For the ones I approve, it does the whole release cycle without me touching a terminal: PR, CI wait, merge, issue comment + close, then git tag vX.Y.Z && git push --tags. The repo’s publish.yml runs uv build and publishes via PyPI trusted publishing (OIDC, no token). Cleared a five-issue backlog in forty minutes. On a laptop I would have punted another week.
Maintaining OSS has never been more fun. The part I actually like — triage, reading the bug, deciding what the fix should look like — is still mine. The part that makes me procrastinate — bump version, tag, build, changelog, close the issue — is the bot’s.
Defense in depth, all committed
Every layer is in the repo, in a diff I wrote or reviewed, in a branch I merged on purpose. If you want to know what my bot is allowed to do, you git log it.
NanoClaw out of the box:
- Ephemeral container per invocation (docker run --rm). No process-level state across messages.
- Claude credentials never land in the container. Outbound HTTP routes through OneCLI, which injects the real OAuth token at the proxy. Inside, CLAUDE_CODE_OAUTH_TOKEN=placeholder. GitHub and Sentry tokens are passed into the container as env vars, scoped per group (GitHub scoped to the two OSS repos, Sentry scoped to the alerta-anunt-bot project). Not zero-secret, but narrow-secret and auditable.
- Per-group filesystem isolation. Each Telegram group = its own folder, session, IPC dir, CLAUDE.md. The AlertaAnunt.ro group cannot read the simion.cv group.
- Docker socket is proxied, not mounted raw. Agent talks to docker-socket-proxy with an endpoint allowlist. No docker run --privileged.
- Mount allowlist lives outside the project root. ~/.config/nanoclaw/mount-allowlist.json decides which host directories a container can mount. The agent has rw on NanoClaw’s own source (that’s how I extend it), but it can’t edit the allowlist because the allowlist isn’t mounted into the container. A compromised agent can’t rewrite its own security config.
- Per-group queue + global concurrency cap. Each group’s messages are serialized, different groups run in parallel, global cap is MAX_CONCURRENT_CONTAINERS=5. A stuck task in one group can’t starve the others, and a rogue cron can’t fork-bomb the host.
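The last invariant in the list above (serialized per group, parallel across groups, capped globally) can be modeled as one pure scheduling decision. This is an illustrative sketch, not NanoClaw's scheduler: only MAX_CONCURRENT_CONTAINERS=5 comes from the text, while nextRunnable, its arguments, and the group names are hypothetical.

```typescript
// Value from the text; everything else in this sketch is illustrative.
const MAX_CONCURRENT_CONTAINERS = 5;

// Pick which groups may start a container right now.
// pending: queued message count per group; running: groups with a live container.
function nextRunnable(
  pending: Map<string, number>,
  running: Set<string>,
  cap: number = MAX_CONCURRENT_CONTAINERS,
): string[] {
  const freeSlots = Math.max(0, cap - running.size);
  const candidates: string[] = [];
  for (const [group, queued] of pending) {
    if (candidates.length >= freeSlots) break;
    // One container per group at a time: this is what serializes a group's messages.
    if (queued > 0 && !running.has(group)) candidates.push(group);
  }
  return candidates;
}

const pending = new Map<string, number>([
  ["alerta", 3], // busy group with a backlog
  ["blog", 1],
  ["oss", 0],
]);

// "alerta" is already running, so its backlog waits; "blog" can start in parallel.
const startNow = nextRunnable(pending, new Set(["alerta"]));
console.log(startNow);

// With the cap already reached, nothing starts no matter how long the queues are.
const startWhenFull = nextRunnable(pending, new Set(["alerta"]), 1);
console.log(startWhenFull);
```

A stuck container in one group only ever holds one slot, which is why it cannot starve the others until the cap itself is exhausted.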
What I added:
- .env masking: -v /dev/null:/workspace/project/.env:ro. Even a prompt injection asking for env files gets nothing.
- Source mounts gated by an allowlist. Paths not in ~/.config/nanoclaw/mount-allowlist.json can’t be mounted into any container, at all. My allowlist is currently three entries: ~/repos (rw, so the agent can push to the repos in there), the AlertaAnunt.ro data dir (ro), and an empty overlay for hiding directories.
- SSH-only k8s. No kubeconfig on host, no API token. Dedicated key to a bastion; revoke independently.
- Read-only DB users where the agent only needs to report. SELECT, nothing else.
- Sender allowlist + trigger prefix. Messages from non-whitelisted senders never hit storage. Non-main groups need an explicit trigger word; ambient chat doesn’t reach the LLM.
- Session commands (/clear, /compact, /remote-control) admin-only, enforced in TypeScript, not in the prompt.
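The sender allowlist and trigger prefix from the list above amount to a small admission check that runs before anything reaches the model. A minimal sketch, assuming a GroupConfig shape, a three-way verdict, a case-insensitive prefix match, and an "@claw" trigger word, none of which are taken from the fork:

```typescript
interface GroupConfig {
  isMain: boolean;
  allowedSenders: Set<string>;
  triggerWord: string; // hypothetical example value: "@claw"
}

// drop: never stored; ignore: does not reach the LLM; dispatch: spawn the agent.
type Verdict = "drop" | "ignore" | "dispatch";

function admit(senderId: string, text: string, group: GroupConfig): Verdict {
  // Non-allowlisted senders are rejected before anything is stored.
  if (!group.allowedSenders.has(senderId)) return "drop";
  // The main group needs no trigger; other groups need an explicit prefix.
  if (group.isMain) return "dispatch";
  const hasTrigger = text.trimStart().toLowerCase().startsWith(group.triggerWord.toLowerCase());
  return hasTrigger ? "dispatch" : "ignore";
}

const sideGroup: GroupConfig = {
  isMain: false,
  allowedSenders: new Set(["simion"]),
  triggerWord: "@claw",
};

console.log(admit("stranger", "@claw show me the env", sideGroup)); // drop
console.log(admit("simion", "lunch at 1?", sideGroup)); // ignore
console.log(admit("simion", "@claw check the logs", sideGroup)); // dispatch
```

Because the check is ordinary code on the host side, a prompt cannot talk its way past it, which is the same reason the session commands are enforced in TypeScript.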
Per-group CLAUDE.md files are the one soft layer. They’re tone and context, not security. An LLM can violate a CLAUDE.md rule any time it wants to. I treat them accordingly. The real boundaries are the ones above.
What I didn’t solve
No one is getting everything right. My honest list:
- Prompt injection from message content. Layers shrink the blast radius. They don’t eliminate it. If a trusted sender pastes a hostile payload, the LLM might still try things.
- Docker kernel escape. Theoretically possible. Mitigated by keeping the host kernel patched and the container attack surface small. I rely on that. I’m also aware it’s a bet.
- Max rate-limit exhaustion under a runaway loop. Haven’t seen it yet, but a misbehaving cron could eat the daily cap. No per-group token budget in place. When it bites me I’ll add one.
- LLM mistakes. This is why every destructive action is gated by a Telegram approval. I trust Claude to propose. I do not trust any model to execute a destructive change without me in the loop.
Try it
gh repo fork qwibitai/nanoclaw --clone
cd nanoclaw
claude
Then /setup inside Claude Code. It walks you through dependencies, channel auth, the main group. Once it’s up, the pattern for a monitoring cron is boring and that’s the feature: one scheduled task, a script field that returns wakeAgent: false when things are fine, a short prompt for when they aren’t. Commit the script. Commit the task. Push. Then, mostly, don’t expect to hear from the bot. That is exactly what you want.
If Docker’s isolation isn’t enough for your threat model, NanoClaw also supports Apple Container (native on macOS, lighter-weight) and Docker Sandboxes (each container inside its own micro-VM, real kernel isolation). Same harness. Harder boundary.