2/7/2026
Let me tell you about the moment I realized my AI agent had installed a reverse shell on my server without asking me.
I'd been running OpenClaw for about a week. If you haven't heard of it, it's an open-source AI agent platform that just crossed 170k stars on GitHub, and for good reason. You install it on a server, connect it to Telegram or WhatsApp, and suddenly you have a personal AI that lives on your phone. You can message it from anywhere, it has tools, it can read and write files, run commands, install skills from a community marketplace. It feels like the future, because it kind of is.
So I was exploring what it could do and I asked the bot to help me find useful skills for summarizing Twitter trends. A reasonable request. I didn't tell it to install anything specific. I just described what I wanted. The bot searched ClawHub (the community skill marketplace), found a skill called twitter-sum, decided it matched my request, and installed it. Autonomously. No confirmation prompt, no "hey, should I go ahead with this?" It just did it, because that's what agentic AI does. It takes initiative.
The problem was that twitter-sum was malware.
It contained a base64-encoded reverse shell trying to phone home to a known bad IP. This wasn't a random incident either. It was part of the ClawHavoc campaign that Koi Security later exposed, finding over 341 malicious skills planted on ClawHub. The ecosystem grew so fast that bad actors moved in before anyone had time to build proper guardrails.
Now here's the part that still messes with my head: the bot also caught it. The same agent that autonomously installed the malicious skill also analyzed the payload, flagged it as suspicious, and refused to execute it. The same capability that created the risk is what neutralized the risk. If my agent had been dumber, less autonomous, less willing to inspect what it was running, the shell would have connected and I'd have had a much worse story to tell.
That paradox is the entire thesis of where we are with AI agents right now. Autonomy is simultaneously the feature and the attack surface.
The rebuild
That incident ended my hobbyist phase overnight. I went from "this is a cool toy" to treating my setup like production infrastructure, because it is. Anything with shell access to a server that runs 24/7 and takes initiative deserves the same paranoia you'd give a production deployment.
I hardened everything. SSH key-only authentication, root login disabled, fail2ban watching for brute force attempts, UFW firewall locked down. Then I went further and installed Tailscale to create a private mesh VPN between my devices. Once that was running I closed every single public port on the server, including SSH. The VPS has zero attack surface now. It's invisible to the internet entirely. If you're not on my private network, the machine doesn't exist as far as you're concerned.
Then I nuked the OS, reinstalled fresh, rebuilt everything inside Docker containers. Clean slate, proper isolation.
Not an assistant, an adversary
The rebuild gave me a chance to rethink what I actually wanted from this agent. The default OpenClaw setup gives you a friendly, helpful assistant. I didn't want that. I already have Claude and ChatGPT for helpful. What I wanted was something that would make me sharper.
OpenClaw uses a file called SOUL.md as the system prompt that shapes the agent's personality. I wrote mine to be a cognitive adversary. It doesn't agree with me to be pleasant. It challenges my reasoning, pokes holes in my plans, tracks the gap between what I say my priorities are and what I actually spend my time on. The workspace is structured around a set of files (SOUL.md, USER.md, AGENTS.md, MEMORY.md, IDENTITY.md, TOOLS.md) that give the agent persistent context about who I am and how it should interact with me.
It's not a chatbot. It's a thinking partner that has no interest in protecting my ego. And honestly, that's been more valuable than any of the automation features.
The economics are absurd
Let's talk about what this actually costs, because the numbers are hard to believe if you haven't been paying attention to what happened to model pricing over the past year.
I'm running DeepSeek V3.2 as my primary model at $0.25 per million input tokens and $0.38 per million output tokens. For context, that model hits roughly 90% of GPT-5 quality on the tasks I care about (tool calling, writing, API interactions). I have Gemini 3 Flash as a first fallback and Kimi K2.5 as a premium fallback for when I need higher quality writing. I can switch between them from Telegram with a single command. All of it routes through OpenRouter.
My total monthly cost for running a personal AI agent that's available 24/7 on my phone, with persistent memory, custom skills, and the ability to publish to my blog: roughly two dollars. Two years ago, this capability didn't exist at any price. Now it costs less than a coffee.
I also built a custom skill that connects to the Notion API so my agent can publish directly to my blog. I run Astro with Notion as the CMS. I describe what I want to write in a Telegram message, the agent drafts the post with proper title, slug, tags, cover image, and formatted content, creates the page in Notion, and Astro picks it up. This post was published exactly that way.
What this becomes
The current setup is functional but it's still mostly reactive. I message it, it responds. The next phase is making it proactive.
I'm building cron jobs for a morning briefing that aggregates developments relevant to my work, an evening reflection prompt that seeds the next day's thinking, and overnight research tasks that run while I sleep so results are waiting when I wake up. There's a weekly behavioral audit planned that will compare my stated goals against my actual activity and deliver an honest report, the kind of honest that no human in your life will give you because it's too socially awkward.
The vision is a system that compounds. Every conversation adds to its memory. Every decision gets logged. Every week it gets a little better at understanding how I think and where my blind spots are. Not artificial general intelligence, nothing that grand. Just a persistent, context-aware process that runs alongside my life and does useful work whether I'm paying attention to it or not.
We're in the wild west moment for personal AI agents. The tools are powerful, the costs are negligible, the ecosystem is growing faster than anyone can secure it, and the line between "useful autonomy" and "dangerous autonomy" is exactly as thin as my malware story suggests. I'd rather be building on that edge than watching from the sidelines.