Security & Sandbox Configuration
Understand how Cowork protects your system through VM isolation, permission controls, and safety mechanisms. Learn best practices for secure autonomous AI usage.
Claude Cowork takes a defense-in-depth approach to security. Unlike traditional AI chatbots that run directly in your browser, Cowork operates as an autonomous agent with file system access—requiring robust isolation mechanisms to protect your system.
The security model combines hardware-level VM isolation, explicit permission grants, content classifiers for prompt injection detection, and reinforcement learning to recognize and refuse malicious instructions.
Security Architecture Diagram
macOS Host
Your protected system files and applications
VZVirtualMachine
Apple Virtualization Framework sandbox
Claude Agent
Runs inside isolated Linux VM
VM Isolation with Apple Virtualization Framework
Cowork does not run directly on your macOS host. Instead, it boots a lightweight, custom Linux virtual machine through Apple's Virtualization framework (the VZVirtualMachine API), the same framework Docker Desktop can use as its VM backend on macOS.
Process Isolation
The agent runs in a completely separate OS instance. Even if the agent is compromised, an attacker would still have to break out of the VM to reach the host.
No System Access
Cannot touch macOS system files, applications, or any folder you haven't explicitly granted access to.
Apple Silicon Optimized
Leverages ARM-based virtualization with unified memory for minimal performance overhead.
Fresh Every Session
Each Cowork session starts with a clean VM state, so malware that lands inside the VM cannot persist between sessions. Changes written to folders you have granted access to do persist.
Technical Note
Through the VZVirtualMachine API, Apple's Virtualization framework boots a custom Linux root filesystem built specifically for Cowork. This provides hard isolation: the agent can only access folders you explicitly "mount" into the VM environment.
Permission System
Cowork operates on the principle of least privilege. You control exactly what folders Claude can access, and the agent asks for explicit approval before performing significant actions.
Folder Access Grants
Before starting work, you select which folders Cowork can access. Claude can read, write, create, and delete files only within these mounted folders. Your Documents, Desktop, or other folders remain untouched unless explicitly granted.
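To make the "only within these mounted folders" rule concrete, here is a minimal Python sketch of a path containment check. It is an illustration of the idea only, not Cowork's implementation, which enforces the boundary at the VM level. The function name is_within_granted and its parameters are hypothetical.

```python
from pathlib import Path


def is_within_granted(path, granted_folders):
    """Return True if `path` resolves to a location inside a granted folder.

    Hypothetical helper for illustration; not part of Cowork.
    """
    # resolve() collapses ".." segments and follows symlinks, so a path that
    # starts inside a granted folder but ends up elsewhere is rejected.
    target = Path(path).resolve()
    for folder in granted_folders:
        root = Path(folder).resolve()
        # Compare whole path components rather than string prefixes, so that
        # "/work2/a.txt" is not mistaken for something inside "/work".
        if target == root or root in target.parents:
            return True
    return False
```

Comparing resolved path components is the important design choice here: a plain string prefix test would treat a sibling folder such as work2 as if it were inside work.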
Action Confirmations
For significant operations like deleting multiple files, Claude will pause and ask for your confirmation. You can review the proposed action and approve or reject it before proceeding.
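The confirmation flow can be pictured as a gate placed in front of risky actions. The sketch below is a simplified model under stated assumptions, not Cowork's actual logic: the PlannedAction type, the function names, and the threshold of two files for a "bulk" delete are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PlannedAction:
    kind: str          # e.g. "delete" or "write" (illustrative values)
    paths: List[str]   # files the action would touch


def needs_confirmation(action: PlannedAction, bulk_threshold: int = 2) -> bool:
    # Assumption: deleting `bulk_threshold` or more files counts as significant.
    return action.kind == "delete" and len(action.paths) >= bulk_threshold


def run_action(
    action: PlannedAction,
    execute: Callable[[PlannedAction], None],
    confirm: Callable[[PlannedAction], bool],
) -> str:
    """Execute `action`, asking `confirm` first when it is significant."""
    if needs_confirmation(action) and not confirm(action):
        return "rejected"  # nothing is executed when the user declines
    execute(action)
    return "done"
```

Passing confirm in as a callback keeps the decision with the user: the agent side can propose an action, but a significant one only runs after the callback returns True.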
Real-time Activity Log
Watch what Cowork is doing in real-time. Every file operation, web request, and action is logged and visible in the interface. You can pause or stop execution at any time.
Safety Mechanisms
Reinforcement Learning Safeguards
Claude has been trained through RLHF (Reinforcement Learning from Human Feedback) to recognize and refuse malicious instructions. It will decline requests that could harm your system or data.
Content Classifiers
When browsing the web or processing files, content classifiers scan untrusted content for potential prompt injections—hidden text that might try to trick the agent into unintended actions.
Audit Trail
Complete history of all actions taken during a session. Review what Claude did, when, and to which files. Useful for understanding changes and reverting if needed.
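If you want an independent record of what changed in a granted folder, you can compare a manifest of file hashes taken before and after a session. This is a sketch you would run yourself on the host, separate from Cowork's own audit trail. The function names build_manifest and diff_manifests are hypothetical.

```python
import hashlib
from pathlib import Path


def build_manifest(folder):
    """Map each file's path (relative to `folder`) to its SHA-256 digest."""
    root = Path(folder)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def diff_manifests(before, after):
    """Summarize which files were added, removed, or modified."""
    common = before.keys() & after.keys()
    return {
        "added": sorted(after.keys() - before.keys()),
        "removed": sorted(before.keys() - after.keys()),
        "modified": sorted(k for k in common if before[k] != after[k]),
    }
```

Call build_manifest on the folder before starting a task and again afterwards, then pass both results to diff_manifests to see which files to inspect or restore.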
Local File Processing
Your files are processed locally within the VM. File contents are not uploaded to external servers or used for model training. Only the conversation context is sent to Claude's API.
Security Best Practices
Grant Minimal Folder Access
Only mount folders that are necessary for the task. Don't grant access to your entire home directory—create a dedicated working folder instead.
Back Up Important Files
Before letting Cowork reorganize or modify files, ensure you have backups. The sandbox protects your OS, but not your data within granted folders.
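One simple way to follow this advice is to take a timestamped copy of the folder before each task. The sketch below uses only the Python standard library. The function name snapshot_folder and the folder layout are hypothetical, and it assumes the backup location sits outside any folder you grant to Cowork.

```python
import shutil
from datetime import datetime
from pathlib import Path


def snapshot_folder(folder, backup_root):
    """Copy `folder` to a timestamped directory under `backup_root`.

    Returns the path of the new copy.
    """
    src = Path(folder).resolve()
    backup_dir = Path(backup_root).resolve()
    # Keep backups outside the folder being copied; otherwise the copy would
    # land inside the very folder the agent is allowed to modify.
    if backup_dir == src or src in backup_dir.parents:
        raise ValueError("backup_root must be outside the folder being backed up")
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    dest = backup_dir / f"{src.name}-{stamp}"
    shutil.copytree(src, dest)  # creates missing parent directories
    return dest
```

For example, snapshot_folder("project", "backups") would create a copy such as backups/project-20250101-120000, which you can restore from if a task goes wrong.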
Review Plans Before Execution
Claude shows its plan before starting work. Take time to review it, especially for tasks involving file deletion or modification.
Be Specific in Instructions
Vague instructions like "clean up everything" can lead to unintended deletions. Be explicit about what should be kept, modified, or removed.
Monitor Browser Tasks
When using the Chrome extension for web tasks, be aware that malicious websites could attempt prompt injection. Monitor the activity log during web interactions.
Known Limitations
Understanding the security model's limitations helps you use Cowork safely:
No Protection from User Instructions
If you tell Claude to delete files, it will. The sandbox protects against external threats, not against commands you explicitly authorize.
Prompt Injection Risks
While classifiers help detect prompt injection, determined attackers on malicious websites might still succeed. Monitor web browsing tasks carefully.
Data Exfiltration Possibility
With web access enabled, a compromised session could potentially send data to external servers. Use network monitoring for sensitive work.
Research Preview Status
Cowork is still in research preview. Security measures will continue to evolve as the product matures.
Troubleshooting
VM fails to start
Folder access denied after granting permission
Claude refuses a safe operation
Security is a Priority
Anthropic has invested heavily in making Cowork secure by default. The VM isolation, permission system, and safety mechanisms work together to minimize risk. However, no system is perfect—always exercise appropriate caution when granting an AI agent access to your files.