I run my home server through an AI agent. Not a chatbot that answers questions — an agent that has access to the terminal, can read files, search the web, and take actions. It's like having a sysadmin who never sleeps.
What is an AI agent?
An AI agent is an LLM (large language model) wrapped in a loop that gives it tools:
- Terminal access: Run commands, inspect logs, restart services
- File system: Read and write files
- Web search: Look up documentation, find solutions
- Memory: Remember preferences, past decisions, and context across sessions
- Task execution: Accept multi-step goals and work through them autonomously
A chatbot tells you how to fix something. An agent fixes it.
How I use it
I interact with my agent through a CLI or a web UI. Example sessions:
$ hermes "check why the game server isn't responding"
→ Agent: SSHes into the server, checks the Docker logs, finds a port conflict,
restarts the container, and reports back with the root cause.
The agent handles:
- Diagnostics — "Why is disk usage high?" → finds large files, suggests cleanup
- Deployments — "Deploy the updated dashboard" → builds, uploads, verifies
- Backup checks — "Verify Syncthing is running" → checks status, reports
- Config changes — "Add a new subdomain" → creates DNS record, updates config
- Research — "Find the best way to mount an exFAT USB drive" → searches, implements
Without an agent, each of these tasks means: context-switch → SSH → remember the command → type it out → interpret output → switch back. With an agent, it's one sentence.
Guardrails matter
Giving an LLM terminal access sounds reckless — and it can be, without proper guardrails. The setup I use has several safety layers:
1. Confirmation prompts for dangerous actions
The agent can read and search freely, but destructive operations (deleting files, restarting services, modifying configs) require approval. This is enforced at the tool level — the agent can't bypass it.
2. Read-only by default
The agent's default mode is read-only: inspect logs, check status, query databases. Writing requires explicit intent from me or a pre-approved workflow.
3. Session isolation
Each conversation is a fresh sandbox. The agent doesn't carry state between sessions unless it explicitly saves something to memory. No lingering context from a previous task polluting a new one.
4. No autonomous posting or purchasing
The agent cannot post to the internet, make purchases, or change DNS records without my explicit approval. These are hard-blocked at the system level, not just polite suggestions.
5. Persona constraints
The agent is configured to be direct and concise. No sycophancy, no hype, no over-explaining. If something is a bad idea, it says so — and backs it up with evidence.
Why not just use a script?
Because every situation is slightly different. A script handles one case; an agent adapts. When a container fails to start, the root cause might be a port conflict, a corrupted volume, a resource limit, or a config syntax error. A script checks one thing. An agent checks all of them, searches for unfamiliar errors, and triages.
The agent is also state-aware. It remembers that I prefer hyphens over slashes in naming, that I use Cloudflare Pages for static sites, that certain services are private and must never be mentioned publicly. This isn't hardcoded — it's learned from our interactions and stored in memory.
The trade-offs
- Latency: Each action takes 3–10 seconds for the LLM to process. For quick checks, a direct command is faster.
- Cost: Cloud LLM APIs cost money. Running locally is free but slower with smaller models.
- Overhead: The agent loop (think → act → observe → repeat) uses more tokens than a simple Q&A. For complex tasks this is fine; for "what's the weather" it's overkill.
- Trust: You need to verify the agent's work for critical operations. It's like a junior sysadmin — capable but needs oversight.
Should you use one?
If you manage more than a couple of servers, yes. The time savings compound fast. Five "check the logs" requests a day add up to 15 minutes of context-switching. An agent handles all five in one interaction.
If you're curious, try a local setup first. Run a small model via Ollama and use one of the open-source agent frameworks. See if the workflow clicks. The barrier to entry is lower than you think.