Claude Cowork Security & Sandbox Setup

Understand how Cowork protects your system through VM isolation, permission controls, and safety mechanisms. Learn best practices for secure autonomous AI usage.

Claude Cowork takes a defense-in-depth approach to security. Unlike traditional AI chatbots that run directly in your browser, Cowork operates as an autonomous agent with file system access—requiring robust isolation mechanisms to protect your system.

The security model combines hardware-level VM isolation, explicit permission grants, content classifiers for prompt injection detection, and reinforcement learning to recognize and refuse malicious instructions.

Security Architecture Diagram

macOS Host: your protected system files and applications

VZVirtualMachine: Apple Virtualization Framework sandbox

Claude Agent: runs inside the isolated Linux VM

Boundary labels: Isolated; Mounted folders only

VM Isolation with Apple Virtualization Framework

Cowork does not run directly on your macOS host. Instead, it boots a lightweight, custom Linux virtual machine using VZVirtualMachine, part of Apple's Virtualization framework, which Docker Desktop can also use on macOS.

Process Isolation

The agent runs in a completely separate OS instance. Even if the agent is compromised, the VM boundary stands between it and your host system.

No System Access

Cannot touch macOS system files, applications, or any folder you haven't explicitly granted access to.

Apple Silicon Optimized

Leverages ARM-based virtualization with unified memory for minimal performance overhead.

Fresh Every Session

Each Cowork session starts from a clean VM state, so changes made inside the VM itself, malicious or not, do not carry over between sessions.

Technical Note

Cowork uses VZVirtualMachine to boot a custom Linux root filesystem built specifically for Cowork. This provides hard isolation: the agent can only access folders you explicitly mount into the VM environment.

Permission System

Cowork operates on the principle of least privilege. You control exactly what folders Claude can access, and the agent asks for explicit approval before performing significant actions.

Folder Access Grants

Before starting work, you select which folders Cowork can access. Claude can read, write, create, and delete files only within these mounted folders. Your Documents, Desktop, or other folders remain untouched unless explicitly granted.
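
The folder-grant rule can be pictured as a path-containment check. The sketch below is illustrative only and is not Cowork's implementation: `is_within_grants` and its parameters are hypothetical names, and the enforcement described above happens through VM mounts rather than a Python check.

```python
from pathlib import Path


def is_within_grants(target: str, granted_folders: list[str]) -> bool:
    """Return True if target resolves to a location inside a granted folder."""
    # resolve() collapses ".." segments and follows symlinks, so a path such
    # as granted/../secret is judged by where it actually points.
    resolved = Path(target).resolve()
    for folder in granted_folders:
        root = Path(folder).resolve()
        if resolved == root or root in resolved.parents:
            return True
    return False
```

Comparing resolved paths rather than raw strings matters here: a plain prefix test would wrongly treat a sibling folder such as `work2` as being inside `work`.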

Action Confirmations

For significant operations, such as deleting multiple files, Claude pauses and asks for your confirmation. You can review the proposed action and approve or reject it before anything is changed.
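
A confirmation step of this kind can be sketched as a small gate in front of the code that performs an action. Everything here is an assumption for illustration: `PlannedAction`, `needs_confirmation`, `run_with_gate`, and the threshold value are hypothetical, and the rule Cowork actually uses to decide what counts as significant is not documented here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PlannedAction:
    kind: str          # e.g. "delete", "write", "read"
    paths: list[str]   # files the action would touch


# Hypothetical policy: deleting more than this many files needs approval.
DELETE_CONFIRM_THRESHOLD = 1


def needs_confirmation(action: PlannedAction) -> bool:
    """Flag the actions this sketch treats as significant."""
    return action.kind == "delete" and len(action.paths) > DELETE_CONFIRM_THRESHOLD


def run_with_gate(
    action: PlannedAction,
    execute: Callable[[PlannedAction], None],
    ask_user: Callable[[PlannedAction], bool],
) -> bool:
    """Run the action unless it needs approval and the user rejects it.

    Returns True if the action ran, False if it was rejected.
    """
    if needs_confirmation(action) and not ask_user(action):
        return False
    execute(action)
    return True
```

Passing `ask_user` in as a callback keeps the policy separate from the interface, so the same gate works whether approval comes from a dialog, a terminal prompt, or a test stub.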

Real-time Activity Log

Watch what Cowork is doing in real time. Every file operation, web request, and action is logged and visible in the interface. You can pause or stop execution at any time.

Safety Mechanisms

Reinforcement Learning Safeguards

Claude has been trained through RLHF (Reinforcement Learning from Human Feedback) to recognize and refuse malicious instructions. It will decline requests that could harm your system or data.

Content Classifiers

When browsing the web or processing files, content classifiers scan untrusted content for potential prompt injections—hidden text that might try to trick the agent into unintended actions.

Audit Trail

Complete history of all actions taken during a session. Review what Claude did, when, and to which files. Useful for understanding changes and reverting if needed.
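
To make the "what, when, and which files" idea concrete, here is a minimal sketch of an append-only action log. The record fields and the helper names `record_action` and `actions_on` are assumptions for illustration; Cowork's actual log format is not described in this document.

```python
import json
from datetime import datetime, timezone


def record_action(log: list[str], action: str, path: str) -> None:
    """Append one JSON line describing what was done, when, and to which file."""
    entry = {
        "time": datetime.now(timezone.utc).isoformat(),
        "action": action,
        "path": path,
    }
    log.append(json.dumps(entry))


def actions_on(log: list[str], path: str) -> list[dict]:
    """Return every logged entry that touched the given path, oldest first."""
    entries = (json.loads(line) for line in log)
    return [entry for entry in entries if entry["path"] == path]
```

Because entries are only ever appended, filtering by path gives the ordered history of a single file, which is the view you would want when deciding what to revert.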

Local File Processing

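Your files are read and modified locally within the VM. File contents are not uploaded wholesale to external servers or used for model training. Only the conversation context is sent to Claude's API, and that context includes whatever Claude reads into the conversation while working.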

Security Best Practices

Grant Minimal Folder Access

Only mount folders that are necessary for the task. Don't grant access to your entire home directory—create a dedicated working folder instead.

Back Up Important Files

Before letting Cowork reorganize or modify files, ensure you have backups. The sandbox protects your OS, but not your data within granted folders.
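
One simple way to follow this advice is to take a timestamped copy of the folder before each session. The sketch below uses only the Python standard library; `snapshot_folder` is a hypothetical helper name, and the backup location is an assumption that should point somewhere outside any folder you grant to Cowork.

```python
import shutil
from datetime import datetime
from pathlib import Path


def snapshot_folder(folder: str, backup_root: str) -> Path:
    """Copy folder into backup_root under a timestamped name and return the copy."""
    source = Path(folder)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    destination = Path(backup_root) / f"{source.name}-{stamp}"
    # copytree raises FileExistsError if the destination already exists,
    # so an earlier snapshot is never silently overwritten.
    shutil.copytree(source, destination)
    return destination
```

For example, `snapshot_folder("/path/to/work", "/path/to/backups")` would produce a copy named `work-` followed by the current date and time. A full copy is the bluntest option; versioned backups or source control serve the same purpose for larger folders.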

Review Plans Before Execution

Claude shows its plan before starting work. Take time to review it, especially for tasks involving file deletion or modification.

Be Specific in Instructions

Vague instructions like "clean up everything" can lead to unintended deletions. Be explicit about what should be kept, modified, or removed.

Monitor Browser Tasks

When using the Chrome extension for web tasks, be aware that malicious websites could attempt prompt injection. Monitor the activity log during web interactions.

Known Limitations

Understanding the security model's limitations helps you use Cowork safely:

No Protection from User Instructions

If you tell Claude to delete files, it will. The sandbox protects against external threats, not against commands you explicitly authorize.

Prompt Injection Risks

While classifiers help detect prompt injection, determined attackers on malicious websites might still succeed. Monitor web browsing tasks carefully.

Data Exfiltration Possibility

With web access enabled, a compromised session could potentially send data to external servers. Use network monitoring for sensitive work.

Research Preview Status

Cowork is still in research preview. Security measures will continue to evolve as the product matures.

Troubleshooting

VM fails to start

Ensure you're running on an Apple Silicon (M1/M2/M3/M4) Mac, since Cowork's VM relies on ARM-based virtualization. Also check that you have enough free disk space for the Linux filesystem to download.
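
Both checks can be scripted as a quick preflight. This is a rough sketch rather than an official diagnostic: `preflight` is a hypothetical helper, and the `min_free_gb` default is an arbitrary assumption because the size of the Linux filesystem download is not stated here.

```python
import platform
import shutil


def preflight(path: str = "/", min_free_gb: float = 10.0) -> list[str]:
    """Return a list of problems found; an empty list means both checks passed."""
    problems = []
    # A native Python build on an Apple Silicon Mac reports Darwin / arm64.
    # A Python running under Rosetta may report x86_64 instead.
    system, machine = platform.system(), platform.machine()
    if system != "Darwin" or machine != "arm64":
        problems.append(f"not an Apple Silicon macOS host: {system} {machine}")
    # min_free_gb is a placeholder threshold, not a documented requirement.
    free_gb = shutil.disk_usage(path).free / 1e9
    if free_gb < min_free_gb:
        problems.append(
            f"only {free_gb:.1f} GB free on {path}; want at least {min_free_gb} GB"
        )
    return problems
```

Running `preflight()` and printing the result gives a short list of anything worth fixing before retrying.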

Folder access denied after granting permission

Check macOS System Settings > Privacy & Security > Files and Folders. Ensure Claude Desktop has the necessary permissions. You may need to restart the app after granting permissions.

Claude refuses a safe operation

Sometimes the safety classifiers are overly cautious. Try rephrasing your request more specifically. If Claude still refuses a legitimate task, you can provide more context about why the operation is safe and necessary.

Security is a Priority

Anthropic has invested heavily in making Cowork secure by default. The VM isolation, permission system, and safety mechanisms work together to minimize risk. However, no system is perfect—always exercise appropriate caution when granting an AI agent access to your files.