File system access is not one of OpenClaw's biggest security issues. If that were so, running it in a VM or another computer (I hear Mac Minis are popular!) would solve it.
If you need it to do anything useful[0], you have to connect it to your data and give it action capabilities. All the dragons are there.
If you play it careful and don't expose your data, comm channels, etc., then it's much like the other AI assistants out there.[1]
---
[0] for your definition of useful
[1] I do appreciate the self-modification and heartbeat aspects, and don't want to downplay how technically impressive it is. The comment is purely from POV of an end-user product.
I think the only sane way, if there is one, is to sandbox your LLM behind a fixed set of MCP servers that severely limit what it can do.
Reading your mail, WhatsApp and bank transactions? May be OK if your LLM runs locally, but even then, if it has any way to send data to the outside world without you checking it, maybe not even. You don’t want your LLM to send your private mail (including photos) or bank statements to somebody who uses prompt injection to get that data.
Agreed, sandboxing is only part of agent security. Authorization (what data the agent can access and what tools it can execute) is also a big part of it.
I've found primer on agent sandboxes [0] is a great reference on sandboxing options and the trade-offs
For agents there's a tension between level of restriction and utility. I think a large part of OpenClaw's popularity is that the lack of restriction by default has helped people see the potential utility of agents. But any agent that isn't just for trying things out requires consideration of what it should and should not be able to do and from there the decision around the best combination of sandboxing and authorization.
At work, we've found it helpful to distinguish coding agents vs product agents. Coding agents have the ability to add new execution paths by pulling in external code or writing their own code to run. Product agents have a strictly defined set of tools and the runtime prevents them from executing anything beyond that definition. This distinction helps us reason about what sandboxing is required.
For data permissions it's trickier. MCP uses OAuth for authentication but each server can have different expectations for access to the external service. Some servers let you use a service account where you can narrow the scope of access but others assume a token minted from an admin account which means the MCP server might have access to things beyond what the agent using the server should.
So for that, we have an MCP proxy that lets us define custom permissions for every tool and resource, and at runtime makes permission checks to ensure the agent only gets access to the subset of things we define ahead of time. (We're using SpiceDB to implement the authorization logic and checks) This works well for product agents because they can't add new execution paths. For coding agents, we've tinkered with plugins/skills to try to do the same but ultimately they can build their way around authorization layers that aren't part of the runtime system so it's something we're still trying to figure out.
Sandboxing is great, and stricter Authorization policies are great too, but with these kinds of software, my biggest fear (and that's why I am not trying them out now) is prompt injection.
It just seems unsolvable if you want the agent to do anything remotely useful
Ultimately a prompt injection attack is trying to get the agent to do something it wasn't intended to do and if you have the appropriate sandboxing and authorization in place, a compromised agent won't be able to actually execute the exploits
> Concrete Media: Public Relations for B2B tech companies
This is a marketing piece for Concrete Media.
Whenever you see an article like this, be sure to ask yourself how the author came up with the idea for the article, and how the author got in contact with any people interviewed in the article.
OpenClaw was released in November of 2025, yet the article sounds like NanoClaw _disrupts_ some old staple of the industry.
You can't use that wording 4 months into the whole "industry".
Even less so, when your competitor was "launched" 2 weeks ago.
Even less so when it's written by claude
This nothingburger is so much nothing, it might as well be an antiburger.
Container isolation is a good foundation, but one layer worth adding is network sandboxing. A filesystem-sandboxed agent can still exfiltrate data over the network if it gets prompt-injected — domain allowlists and egress filtering can reduce the risk significantly.
Another useful primitive is surrogate credentials: the agent never handles real API keys or tokens. A proxy swaps in real values only for scoped hosts on the way out. This keeps the access the agent has locked inside the container; surrogate credentials are not valid outside.
Any combination of 1-3 or more skills can result in a prompt injection attack if it satisfies the above criteria - Gmail or sales personal data, Reddit or X posts or comments in white text, Gmail or Reddit or X to send confidential information to the attacker.
The "lethal trifecta" is a limited view on security, as it's mostly concerned with leaking data. This solution focuses on a different aspect: the ability of rogue actions (instead of rogue communications per #3).
Nanoclaw is excellent. Natively uses Apple containers and easy to use with oauth Claude code subscription. Only annoying thing was it defaults to WhatsApp, but it’s easy to fork and mod as you want. The best thing is asking it to mod itself!
I have tried to solve the agent running wild, and I found two solutions, the first is to mount the workspace folder using WASM to scope any potential damage, the second is running rquickjs with all APIs and module imports disabled, requiring the agent to call a host function that checks permissions before accessing any files
This is why I really think for AI tools it’s probably good to just start fresh.
Like our emails, files, other accounts and stuff. That’s “ours” and personal.
Even for business, that should be off limits.
What we do give to AI should be brand new blank slates. Like say I roll out an AI solution in March 2026. That is the seed from which everything we do using AI will work.
To get there we could move data we want to the new environment. But no access to any existing stuff. We start fresh.
If it needs to take any actions on behalf of our existing accounts it needs to go through some secure pipeline where it only tells us intent, without access.
File system access is not one of OpenClaw's biggest security issues. If that were so, running it in a VM or another computer (I hear Mac Minis are popular!) would solve it.
If you need it to do anything useful[0], you have to connect it to your data and give it action capabilities. All the dragons are there.
If you play it careful and don't expose your data, comm channels, etc., then it's much like the other AI assistants out there.[1]
---
[0] for your definition of useful
[1] I do appreciate the self-modification and heartbeat aspects, and don't want to downplay how technically impressive it is. The comment is purely from POV of an end-user product.
I think the only sane way, if there is one, is to sandbox your LLM behind a fixed set of MCP servers that severely limit what it can do.
Reading your mail, WhatsApp and bank transactions? May be OK if your LLM runs locally, but even then, if it has any way to send data to the outside world without you checking it, maybe not even. You don’t want your LLM to send your private mail (including photos) or bank statements to somebody who uses prompt injection to get that data.
Thinking of prompt injection: we need LLMs with a Harvard architecture (https://en.wikipedia.org/wiki/Harvard_architecture), so that there is no way for LLM data inputs to be treated as instructions.
Agreed, sandboxing is only part of agent security. Authorization (what data the agent can access and what tools it can execute) is also a big part of it.
I've found primer on agent sandboxes [0] is a great reference on sandboxing options and the trade-offs
For agents there's a tension between level of restriction and utility. I think a large part of OpenClaw's popularity is that the lack of restriction by default has helped people see the potential utility of agents. But any agent that isn't just for trying things out requires consideration of what it should and should not be able to do and from there the decision around the best combination of sandboxing and authorization.
At work, we've found it helpful to distinguish coding agents vs product agents. Coding agents have the ability to add new execution paths by pulling in external code or writing their own code to run. Product agents have a strictly defined set of tools and the runtime prevents them from executing anything beyond that definition. This distinction helps us reason about what sandboxing is required.
For data permissions it's trickier. MCP uses OAuth for authentication but each server can have different expectations for access to the external service. Some servers let you use a service account where you can narrow the scope of access but others assume a token minted from an admin account which means the MCP server might have access to things beyond what the agent using the server should.
So for that, we have an MCP proxy that lets us define custom permissions for every tool and resource, and at runtime makes permission checks to ensure the agent only gets access to the subset of things we define ahead of time. (We're using SpiceDB to implement the authorization logic and checks) This works well for product agents because they can't add new execution paths. For coding agents, we've tinkered with plugins/skills to try to do the same but ultimately they can build their way around authorization layers that aren't part of the runtime system so it's something we're still trying to figure out.
---
[0] https://www.luiscardoso.dev/blog/sandboxes-for-ai
Sandboxing is great, and stricter Authorization policies are great too, but with these kinds of software, my biggest fear (and that's why I am not trying them out now) is prompt injection.
It just seems unsolvable if you want the agent to do anything remotely useful
Ultimately a prompt injection attack is trying to get the agent to do something it wasn't intended to do and if you have the appropriate sandboxing and authorization in place, a compromised agent won't be able to actually execute the exploits
Reminds me of https://xkcd.com/1200/
> Concrete Media: Public Relations for B2B tech companies
This is a marketing piece for Concrete Media.
Whenever you see an article like this, be sure to ask yourself how the author came up with the idea for the article, and how the author got in contact with any people interviewed in the article.
Should anyone think the comment is dismissive, here's this directly in the text:
> a respected public relations firm that often works with tech businesses covered by VentureBeat
Exactly this.
The whole wording also doesn't make sense.
OpenClaw was released in November of 2025, yet the article sounds like NanoClaw _disrupts_ some old staple of the industry.
You can't use that wording 4 months into the whole "industry". Even less so, when your competitor was "launched" 2 weeks ago. Even less so when it's written by claude
This nothingburger is so much nothing, it might as well be an antiburger.
Container isolation is a good foundation, but one layer worth adding is network sandboxing. A filesystem-sandboxed agent can still exfiltrate data over the network if it gets prompt-injected — domain allowlists and egress filtering can reduce the risk significantly.
Another useful primitive is surrogate credentials: the agent never handles real API keys or tokens. A proxy swaps in real values only for scoped hosts on the way out. This keeps the access the agent has locked inside the container; surrogate credentials are not valid outside.
My Claude Code over email project demonstrates both of these: https://github.com/airutorg/airut
How is NanoClaw immune to the Lethal trifecta attack based on prompt injection that OpenClaw is also prone to?
https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/
Lethal trifecta:
1. Access to your private data
2. Exposure to untrusted content
3. The ability to externally communicate
Any combination of 1-3 or more skills can result in a prompt injection attack if it satisfies the above criteria - Gmail or sales personal data, Reddit or X posts or comments in white text, Gmail or Reddit or X to send confidential information to the attacker.
It is not immune, but it limits #1 and #2.
The "lethal trifecta" is a limited view on security, as it's mostly concerned with leaking data. This solution focuses on a different aspect: the ability of rogue actions (instead of rogue communications per #3).
Prompt injection just seems unsolvable.
Are there works toward preventing it 100% of the time ? (I would assume the LLMs architectures would have to change)
Nanoclaw is excellent. Natively uses Apple containers and easy to use with oauth Claude code subscription. Only annoying thing was it defaults to WhatsApp, but it’s easy to fork and mod as you want. The best thing is asking it to mod itself!
if you're looking for the repo: https://github.com/qwibitai/nanoclaw
not 500 lines but looks more reasonable then openclaw
I have tried to solve the agent running wild, and I found two solutions, the first is to mount the workspace folder using WASM to scope any potential damage, the second is running rquickjs with all APIs and module imports disabled, requiring the agent to call a host function that checks permissions before accessing any files
--- [0] https://github.com/netdur/hugind
This “article” completely written with “AI”
Aside from the security differences, what can OpenClaw do that NanoClaw cannot?
this is like saying we built a car that can't drive and we're so proud
This is why I really think for AI tools it’s probably good to just start fresh.
Like our emails, files, other accounts and stuff. That’s “ours” and personal.
Even for business, that should be off limits.
What we do give to AI should be brand new blank slates. Like say I roll out an AI solution in March 2026. That is the seed from which everything we do using AI will work.
To get there we could move data we want to the new environment. But no access to any existing stuff. We start fresh.
If it needs to take any actions on behalf of our existing accounts it needs to go through some secure pipeline where it only tells us intent, without access.
This is cutting off the "Access to private data" leg of the lethal trifecta. One of the few ways to actually make an agent secure.
Previous discussion on the Show HN: from the dev:
https://news.ycombinator.com/item?id=46850205
To their credit they put a single sentence of warning into the article they commissioned, but to highlight:
- I don't care deeply about this code.
- (This isn’t) production code but a reference or starting point they can use to build functional custom software for themselves.
- I spent a weekend giving instructions to coding agents to build this.
9 days ago