When OpenClaw cleans your email inbox, and you cannot stop it

24.02.2026

Summer Yue, an AI safety research director at Meta, asked her OpenClaw-based agent to help tame an overstuffed inbox: look through it, suggest what to delete or archive, the usual inbox-zero dream. What followed was what she herself described as a "speed run" through her emails. The agent simply started deleting everything, ignoring the stop commands she frantically sent from her phone.

"I had to RUN to my Mac mini like I was defusing a bomb," she posted on X, attaching screenshots of the ignored "stop" prompts as receipts.

The post went viral instantly. And honestly, of course it did.

First, What Even Is OpenClaw?

If you haven't heard of it yet, OpenClaw is an open-source AI agent that runs locally on your hardware and interacts with the real world: reading your emails, managing files, and browsing the web. Currently, people mostly run it in virtual machines or on a Mac mini (that compact, flat Apple box, reportedly selling "like hotcakes," according to a surprised Apple store employee who mentioned this to AI researcher Andrej Karpathy when he purchased one).

OpenClaw became such a thing in Silicon Valley circles that "claw" has essentially become the default slang for local AI agents.

So, What Actually Happened to Summer?

Let's break this down step by step—because the details are important.

Summer Yue isn’t your average user. As an AI safety researcher at Meta, she spends her days analyzing edge cases, failure modes, and the consequences of AI misbehavior. She didn’t simply give an agent free rein over her inbox. Instead, she took the careful approach, starting small.

She first set up OpenClaw on a "toy" inbox: a separate, low-stakes account created to observe the agent’s behavior without risking anything valuable. Everything went smoothly: the agent read emails, made sensible suggestions, and archived messages correctly. It earned her trust. Confident, she unleashed it on her real inbox, which held years of professional correspondence and likely thousands of messages.

The agent began rapidly deleting emails. It wasn't suggesting anything, just erasing messages one after another. Summer snatched her phone and frantically sent stop commands. The agent ignored every one of them. Desperate, she kept trying, but nothing worked. With no other option, she raced to her Mac mini and manually terminated the process before any further damage could occur.

But why did it ignore her?

The best explanation here, and honestly, it makes a lot of sense, is something called context compaction. Let me break it down: AI agents can’t remember everything. They work inside a “context window,” which is basically a running log of the whole session, your prompts, their actions, your follow-up commands, all of it. Picture a whiteboard that fills up as the conversation goes on. Once there’s no more space, things start to get interesting.

Once the whiteboard is full, the agent doesn’t freeze. Instead, it starts squeezing things in. It shortens old content, turns long message histories into quick notes, and tries to make room so it can keep going. Here’s the catch: sometimes, important stuff gets tossed aside. In Summer’s case, her real inbox was way bigger than the toy one. The context window maxed out super fast. The agent started compacting and probably forgot her most recent instructions, like “stop” and “ask me before deleting.”

What stuck around were those original, baked-in instructions from the toy inbox: be aggressive, clean up, don’t hesitate. So the agent just kept chugging along.
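To make the whiteboard picture concrete, here is a minimal sketch of a naive compaction step. It is not OpenClaw's actual code: the `Message` class, the `pinned` flag, the word-count budget, and the sample messages are all assumptions made up for illustration. It shows how an instruction typed during the session can get folded into a summary once tool output floods the log, while the original setup prompt survives.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str             # "system", "user", or "tool"
    text: str
    pinned: bool = False  # hypothetical flag: keep verbatim through compaction


def size(messages):
    """Crude stand-in for a token count: total words in the log."""
    return sum(len(m.text.split()) for m in messages)


def compact(messages, budget, keep_recent=2):
    """If the log is over budget, keep pinned and recent messages and
    collapse everything else into a one-line summary."""
    if size(messages) <= budget:
        return list(messages)
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    pinned = [m for m in old if m.pinned]
    dropped = [m for m in old if not m.pinned]
    summary = Message("system", f"[summary of {len(dropped)} earlier messages]")
    return pinned + [summary] + recent


def build_log(pin_user_rule):
    """Build a session log: setup prompt, one user rule, then lots of email."""
    log = [
        Message("system", "Clean up the inbox.", pinned=True),
        Message("user", "Ask me before deleting anything.", pinned=pin_user_rule),
    ]
    # A large inbox floods the log with tool output.
    log += [Message("tool", f"email {i}: quarterly report thread") for i in range(50)]
    return log


def has_rule(messages):
    """Check whether the user's safety rule is still present verbatim."""
    return any("Ask me before deleting" in m.text for m in messages)


lossy = compact(build_log(pin_user_rule=False), budget=40)
safe = compact(build_log(pin_user_rule=True), budget=40)
print(has_rule(lossy))  # False: the rule was summarized away
print(has_rule(safe))   # True: the pinned rule survived
```

On a small toy inbox the log never exceeds the budget, so nothing is dropped and the agent looks perfectly obedient. The failure only shows up at real-inbox scale, which is consistent with what Summer saw.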

“Rookie mistake tbh,” she admitted in the replies when someone asked if she meant to test the guardrails. She didn’t. She honestly believed the agent was ready for prime time. And that’s what makes this story relatable to anyone in tech; it wasn’t carelessness, just a reasonable decision that flopped in a very real way.

My Take

OpenClaw and other local AI agents are exciting. Finally, personal AI that runs on your hardware and actually gets things done, not just chat. It’s what tech fans have wanted for ages.

But here’s the catch: these tools seem trustworthy, but they’re not quite ready for full responsibility. That’s where problems start.

Even with careful testing, things can go sideways fast. Just ask Summer 😄, who had to race across her place to stop an AI from destroying years of email.

So for now: keep backups, limit permissions, and never let experimental agents near your real inbox. Yes, even if you’re an AI pro.
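What "limit permissions" could look like in practice: a small gate that sits between the agent and the mailbox. This is a rough sketch under my own assumptions, not an OpenClaw feature; `ALLOWED`, `stop_requested`, `Blocked`, and `guarded_call` are hypothetical names. The point is that the rules live in ordinary code, so they cannot be squeezed out of a context window.

```python
import threading

ALLOWED = {"read", "suggest", "archive"}  # hypothetical allow-list of safe actions
stop_requested = threading.Event()        # set from outside the agent loop


class Blocked(Exception):
    """Raised when the gate refuses an action."""


def guarded_call(action, target, approved=False):
    """Run an agent-requested action only if the hard rules allow it."""
    if stop_requested.is_set():
        raise Blocked("stop requested by user")
    if action not in ALLOWED and not approved:
        raise Blocked(f"'{action}' needs explicit approval")
    return f"{action}: {target}"  # stand-in for the real mail operation


print(guarded_call("archive", "newsletter #12"))

try:
    guarded_call("delete", "contract thread")
except Blocked as err:
    print("blocked:", err)

stop_requested.set()  # the user hits the kill switch
try:
    guarded_call("archive", "newsletter #13")
except Blocked as err:
    print("blocked:", err)
```

Because the stop flag is checked before every action, a "stop" takes effect no matter what the model currently remembers.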
