AI Agents

OpenHands vs SWE-Agent: Best AI Coding Agent 2026

February 6, 2026
18 min read
Local AI Master Research Team

Quick Comparison

OpenHands
✓ Web UI + VSCode integration
✓ Enterprise features (RBAC, audit)
✓ Multi-agent delegation
✓ 72% SWE-bench Verified
✓ $18.8M Series A funding
Best for: Enterprise, production deployment
SWE-Agent
✓ CLI-focused, minimal setup
✓ Agent-Computer Interface (ACI)
✓ EnIGMA cybersecurity mode
✓ >74% SWE-bench (Mini)
✓ Princeton/Stanford research
Best for: Research, security testing, learning

What Are AI Coding Agents?

AI coding agents represent the next evolution beyond code completion tools like GitHub Copilot. Instead of suggesting lines of code, these agents autonomously complete entire tasks—fixing bugs, implementing features, refactoring code, and resolving GitHub issues with minimal human intervention.

The two leading open-source options are OpenHands (formerly OpenDevin) and SWE-Agent (from Princeton/Stanford). Both achieve state-of-the-art performance on software engineering benchmarks, but they target different use cases:

  • OpenHands: Enterprise-ready platform with web UI, multi-agent architecture, and production features
  • SWE-Agent: Research-focused tool with innovative Agent-Computer Interface and minimal footprint

This guide compares everything: architecture, benchmarks, installation, use cases, and which to choose for your workflow.


OpenHands: Enterprise AI Coding Platform

Origin and Development

OpenHands started as OpenDevin in early 2024, inspired by Cognition's Devin announcement. The project has grown to include:

  • 188+ contributors from academia and industry
  • 2.1K+ contributions to the codebase
  • $18.8M Series A funding (November 2025) led by Madrona
  • Adoption by AMD, Apple, Google, Amazon, Netflix, TikTok, NVIDIA, Mastercard, VMware

The lead maintainer is Xingyao Wang, with key contributions from Frank F. Xu (web browsing), Mingchen Zhuge (GPTSwarm), Robert Brennan (architecture), and Boxuan Li.

Core Architecture

OpenHands uses an event-stream architecture that models the agent-environment interaction:

Agent → Actions → Environment → Observations → Agent
           ↑                           ↓
           └───────── Event Log ───────┘

Key architectural components:

  1. Docker-based Sandboxing: Each session runs in an isolated container accessed via SSH
  2. Jupyter Kernel Environment: Integrated Python execution with stateful code interaction
  3. Browser Agent API: BrowserGym interface for web automation (DOM manipulation, navigation)
  4. Multi-Agent Delegation: Hierarchical agent structures where agents can delegate to specialized sub-agents
  5. CodeAct Architecture: The default "strong generalist" agent that combines code execution with reasoning
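The event-stream loop above can be sketched in a few lines of Python. This is an illustration of the pattern only, with hypothetical names (`Event`, `EventStream`, `run_episode`), not the actual OpenHands code:

```python
from dataclasses import dataclass, field

@dataclass
class Event:
    kind: str      # "action" or "observation"
    content: str

@dataclass
class EventStream:
    """Append-only log that both the agent and the environment read from."""
    events: list = field(default_factory=list)

    def add(self, event: Event):
        self.events.append(event)

def run_episode(agent, environment, stream, max_steps=10):
    """Agent proposes actions from the full event history; the environment
    returns observations. Every step is recorded in the shared event log."""
    for _ in range(max_steps):
        action = agent(stream.events)       # decide next action from history
        stream.add(Event("action", action))
        if action == "finish":
            break
        observation = environment(action)   # execute in the sandbox
        stream.add(Event("observation", observation))
    return stream.events
```

Because both sides only communicate through the log, replaying or auditing a session is just a matter of reading the event list back.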

Installation and Setup

Docker Method (Recommended):

docker run -it --rm --pull=always \
  -e AGENT_SERVER_IMAGE_REPOSITORY=ghcr.io/openhands/agent-server \
  -e LOG_ALL_EVENTS=true \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v ~/.openhands:/.openhands \
  -p 3000:3000 \
  --add-host host.docker.internal:host-gateway \
  --name openhands-app \
  docker.openhands.dev/openhands/openhands:1.3

Access the web UI at http://localhost:3000.

pip Method:

uv pip install openhands
openhands serve --gpu  # With GPU support
openhands serve --mount-cwd  # Mount current directory

Platform Notes:

  • macOS: Enable "Allow the default Docker socket to be used" in Docker Desktop
  • Windows: Enable WSL 2 with Docker Desktop integration

Key Features

| Feature | Description |
|---|---|
| Web UI | View files, commands, code, and browser activity in real time |
| VSCode Integration | Browser-based IDE for direct code editing |
| VNC Desktop | Persistent Chromium browser for visual inspection |
| MCP Support | Model Context Protocol for tool integration |
| REST/WebSocket API | Programmatic access for automation |
| 15+ Benchmarks | SWE-bench, HumanEvalFix, ML-Bench, WebArena, GAIA, etc. |
| Enterprise Features | RBAC, audit trails, quotas, VPC deployment |

Supported Models

Cloud APIs:

  • Claude 3.5/3.7/4 Sonnet, Claude Opus
  • GPT-4, GPT-4o, GPT-5
  • Any OpenAI-compatible API

Local Models (via Ollama):

  • OpenHands LM 32B (fine-tuned Qwen Coder 2.5)
  • llama-3.3-70B
  • DeepSeek-v3

Note: Models smaller than 32B parameters aren't recommended—instruction following degrades significantly for complex coding tasks.


SWE-Agent: Research-Grade Coding Agent

Origin and Development

SWE-Agent is an academic project from Princeton University and Stanford University, created by:

  • John Yang and Carlos E. Jimenez (co-first authors)
  • Alexander Wettig, Kilian Lieret
  • Shunyu Yao, Karthik Narasimhan, Ofir Press

Published at NeurIPS 2024, the research was supported by Princeton Language and Intelligence, Oracle Collaborative Research, and the National Science Foundation.

The Agent-Computer Interface (ACI)

SWE-Agent's key innovation is the Agent-Computer Interface (ACI)—an abstraction layer designed specifically for LLM agents.

Why ACI matters:

Traditional interfaces like the Linux shell were designed for humans. LLMs struggle with:

  • Long, verbose command outputs
  • Complex state management
  • Error-prone command syntax
  • Context overflow from too much information

ACI provides:

  1. Simplified Actions: Small set of commands for viewing, searching, editing files
  2. Guardrails: Prevents common mistakes before they happen
  3. Concise Feedback: Specific, minimal output about command effects
  4. Context Management: Maintains recent 5 steps, collapses earlier history
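The context-management idea (keep the last few steps verbatim, collapse everything older) is simple to illustrate. The sketch below is illustrative only, not SWE-Agent's actual implementation:

```python
def condense_history(steps, keep_last=5):
    """Keep the most recent `keep_last` steps verbatim and collapse
    everything earlier into one-line summaries to save context tokens."""
    if len(steps) <= keep_last:
        return list(steps)
    # Summarize each older step by its first line only.
    collapsed = [f"[collapsed] {s.splitlines()[0]}" for s in steps[:-keep_last]]
    return collapsed + list(steps[-keep_last:])
```

The payoff is that prompt size stays roughly constant no matter how long the episode runs, while the model still sees a trace of what it already tried.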

Installation and Setup

SWE-Agent is CLI-focused and lightweight:

pip install sweagent

Basic Usage:

# Fix a GitHub issue
sweagent run --issue "https://github.com/user/repo/issues/123"

# With specific model
sweagent run --model claude-3-7-sonnet --issue "..."

API Key Configuration:

export ANTHROPIC_API_KEY="sk-ant-..."
export OPENAI_API_KEY="sk-..."

Key Features

| Feature | Description |
|---|---|
| GitHub Issue Automation | Takes issues and automatically creates fixes |
| EnIGMA Mode | Offensive cybersecurity for CTF challenges |
| Custom Search Commands | Specialized code search beyond grep |
| Interactive File Editing | Built-in linting and syntax checking |
| Context Management | Efficient history condensation |
| SWE-ReX | Remote execution on AWS, Modal, and other clouds |
| Mini-SWE-Agent | 100-line variant achieving >74% performance |

Supported Models

Cloud APIs:

  • Claude 3 Opus, 3.5/3.7/4 Sonnet (with extended thinking)
  • GPT-4, GPT-4o, GPT-4.1, GPT-5

Local Models:

  • Ollama models (configurable api_base: http://localhost:11434)
  • Requires thought_action parser for models without function calling

Benchmark Comparison

SWE-bench Performance

SWE-bench is the gold standard for measuring AI coding agents—it tests the ability to fix real GitHub issues.

| Agent Configuration | SWE-bench Verified | SWE-bench Lite |
|---|---|---|
| OpenHands + Claude 4.5 Extended Thinking | 72% | - |
| OpenHands + Critic Model (5 attempts) | 66.4% | - |
| OpenHands CodeAct 2.1 + Claude 3.5 Sonnet | 53% | 41.7% |
| Mini-SWE-Agent (100 lines) | >74% | - |
| SWE-Agent + GPT-4 Turbo (2024) | 12.5% | - |
| SWE-Agent 1.0 + Claude 3.7 | SOTA | SOTA |

Key insight: Mini-SWE-Agent's >74% score in just 100 lines of Python demonstrates that architecture design matters more than complexity.
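That minimalism is easy to appreciate in code. Here is a toy loop in the spirit of Mini-SWE-Agent, with a hypothetical `llm` callable and prompt format (not the real project's code):

```python
import subprocess

def mini_agent(llm, task, max_turns=20):
    """Bare-bones agent loop: the model emits one shell command per turn,
    sees its output, and replies 'DONE' when finished. `llm` is any
    function mapping a message list to a text reply."""
    messages = [{"role": "user",
                 "content": f"Task: {task}. Reply with one shell command, or DONE."}]
    for _ in range(max_turns):
        reply = llm(messages).strip()
        messages.append({"role": "assistant", "content": reply})
        if reply == "DONE":
            break
        # Run the command and feed stdout/stderr back as the next user turn.
        result = subprocess.run(reply, shell=True, capture_output=True,
                                text=True, timeout=60)
        messages.append({"role": "user",
                         "content": result.stdout + result.stderr})
    return messages
```

Everything else in a production agent (sandboxing, guardrails, history condensation) layers on top of this loop without changing its shape.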

Real-World Performance (SWE-bench-Live)

SWE-bench-Live tests on fresh, unseen issues to avoid data contamination:

| Agent | SWE-bench-Live | SWE-bench Verified (re-run) |
|---|---|---|
| OpenHands + Claude 3.7 | 19.25% | 43.20% |
| SWE-Agent + GPT-4.1 | 18.57% | - |
| SWE-Agent + Claude 3.7 | 17.13% | - |

Reality check: On truly novel issues, both agents solve ~18-20%—far below the 70%+ on curated benchmarks. This reflects the gap between benchmarks and production use.

Other Benchmarks

| Benchmark | OpenHands | SWE-Agent |
|---|---|---|
| GAIA (general tasks) | 67.9% (Claude 4.5) | - |
| HumanEvalFix | - | 87.7% pass@1 |
| Multi-SWE-Bench | #1 (8 languages) | - |
| LiveSWEBench | Top-performing | - |
| CTF/Cybersecurity | - | SOTA (EnIGMA) |

Architecture Deep Dive

OpenHands Event-Stream Architecture

┌─────────────────────────────────────────────────────────┐
│                     OpenHands Server                     │
├─────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────┐  │
│  │  Web UI     │  │  REST API   │  │  WebSocket API  │  │
│  └─────────────┘  └─────────────┘  └─────────────────┘  │
├─────────────────────────────────────────────────────────┤
│                     Agent Manager                        │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────┐  │
│  │  CodeAct    │  │  Browser    │  │  Delegate Agent │  │
│  │  Agent      │  │  Agent      │  │  (Sub-agents)   │  │
│  └─────────────┘  └─────────────┘  └─────────────────┘  │
├─────────────────────────────────────────────────────────┤
│                     Event Log                            │
│         Actions ←→ Observations ←→ State                 │
├─────────────────────────────────────────────────────────┤
│                Docker Container (SSH)                    │
│  ┌───────────┐  ┌───────────┐  ┌───────────────────┐   │
│  │  Jupyter  │  │  Browser  │  │  Filesystem       │   │
│  │  Kernel   │  │  (VNC)    │  │  (Sandboxed)      │   │
│  └───────────┘  └───────────┘  └───────────────────┘   │
└─────────────────────────────────────────────────────────┘

Strengths:

  • Multi-agent delegation for complex tasks
  • Full browser automation
  • Persistent state across interactions
  • Enterprise-grade isolation

SWE-Agent ACI Architecture

┌─────────────────────────────────────────────────────────┐
│                        LLM                               │
│         (Claude, GPT-4, Local via Ollama)               │
├─────────────────────────────────────────────────────────┤
│              Agent-Computer Interface (ACI)              │
│  ┌───────────────────────────────────────────────────┐  │
│  │  Simplified Actions:                               │  │
│  │  - view_file(path, start_line, end_line)          │  │
│  │  - search_dir(pattern, directory)                  │  │
│  │  - edit_file(path, changes)                        │  │
│  │  - run_command(cmd)                                │  │
│  └───────────────────────────────────────────────────┘  │
│  ┌───────────────────────────────────────────────────┐  │
│  │  Guardrails: Syntax checking, Linting, Validation │  │
│  └───────────────────────────────────────────────────┘  │
│  ┌───────────────────────────────────────────────────┐  │
│  │  Context Window: Last 5 steps (condensed history) │  │
│  └───────────────────────────────────────────────────┘  │
├─────────────────────────────────────────────────────────┤
│                   Execution Environment                  │
│            (Docker, SWE-ReX, or subprocess)             │
└─────────────────────────────────────────────────────────┘

Strengths:

  • Minimal complexity, maximum effectiveness
  • Purpose-built for LLM capabilities
  • Efficient context management
  • Easy to understand and extend

Feature Comparison

| Feature | OpenHands | SWE-Agent |
|---|---|---|
| Interface | Web UI + CLI | CLI only |
| IDE Integration | Built-in VSCode, VNC | None (standalone) |
| Sandboxing | Docker + SSH | Docker, SWE-ReX |
| Multi-Agent | Yes (delegation) | No |
| Browser Automation | Yes (BrowserGym) | No |
| MCP Support | Yes | No |
| Enterprise Features | RBAC, audit, quotas | None |
| Cybersecurity Mode | No | Yes (EnIGMA) |
| Minimum Setup | Docker + 4GB RAM | pip + API key |
| License | MIT | MIT |

Hardware Requirements

OpenHands

| Setup | Requirements |
|---|---|
| Cloud APIs (minimum) | 4GB RAM, modern CPU, Docker |
| Local LLMs (recommended) | 16GB+ VRAM GPU or 64GB+ Apple Silicon |
| OpenHands LM 32B | RTX 3090 (24GB) or quantized on RTX 4090 |
| Full-precision 32B | A100/H100 (64-80GB VRAM) |

SWE-Agent

| Setup | Requirements |
|---|---|
| Cloud APIs | Standard modern system |
| Local LLMs | Same as the underlying model's requirements |
| EnIGMA Mode | Docker recommended |

When to Choose Each

Choose OpenHands For:

  1. Enterprise Production

    • Companies like AMD, Apple, Google, Netflix use it
    • Self-hosted VPC deployment via Kubernetes
    • RBAC, audit trails, compliance features
  2. Complex Multi-Step Workflows

    • Tasks requiring browser + code + filesystem access
    • Multi-agent delegation for large projects
  3. Team Collaboration

    • Web UI for non-CLI users
    • Shared dashboards and quotas
  4. Multi-Language Projects

    • #1 on Multi-SWE-Bench (8 languages)
    • Not Python-focused
  5. Automated Maintenance

    • Dependency upgrades across hundreds of repos
    • Vulnerability remediation sweeps
    • PR review automation

Choose SWE-Agent For:

  1. Academic Research

    • Published at NeurIPS 2024
    • Clean, understandable codebase
    • Reproducible experiments
  2. Cybersecurity Applications

    • EnIGMA mode for CTF challenges
    • Security vulnerability research
  3. Minimal Infrastructure

    • pip install and go
    • No Docker requirement for basic use
  4. Learning Agent Development

    • ACI concepts well-documented
    • Mini-SWE-Agent shows minimal viable agent
  5. Single-Issue Bug Fixes

    • Focused GitHub issue resolution
    • No setup overhead

Alternatives Worth Considering

For CLI Users: Aider

pip install aider-chat
aider --model claude-3-7-sonnet

  • Terminal-based AI pair programming
  • Clean Git diffs and automatic commits
  • 70% of Aider's code written by Aider itself

For VS Code Users: Cline

  • 4M+ developers worldwide
  • MCP integration
  • Plan/Act modes for controlled autonomy
  • Supports Claude, Gemini 2.5, local models

For Local-First Privacy: Continue

  • 1.6M+ installs, 20K+ GitHub stars
  • Fully local via Ollama
  • VS Code and JetBrains extensions
  • Highly configurable

For Turnkey Enterprise: Devin

  • $73M ARR (as of June 2025)
  • Fully autonomous project execution
  • Powered by Claude Sonnet 4.5
  • $2B valuation

Integration Examples

OpenHands with Ollama

# Start Ollama with OpenHands LM
ollama pull openhands-lm:32b
ollama serve

# Configure OpenHands
# In web UI: Settings → LLM → Ollama
# Set model: openhands-lm:32b
# Set base URL: http://host.docker.internal:11434

SWE-Agent with Local Models

# config.yaml
model:
  name: ollama/llama3.3:70b
  api_base: http://localhost:11434
  per_instance_cost_limit: 0  # No cost for local

# Use thought_action parser for models without function calling
parser: thought_action

OpenHands Python API

from openhands.client import OpenHandsClient

client = OpenHandsClient(
    base_url="http://localhost:3000",
    api_key="your-api-key"
)

# Create a task
task = client.create_task(
    prompt="Fix the null pointer exception in src/main.py",
    workspace="/path/to/repo"
)

# Monitor progress
for event in client.stream_events(task.id):
    print(event.type, event.content)

Performance Optimization Tips

For OpenHands

  1. Use Claude 4.5 with Extended Thinking for maximum accuracy (72% SWE-bench)
  2. Enable Critic Model for self-verification (66.4% SWE-bench)
  3. Mount your codebase with --mount-cwd for faster file access
  4. Limit context to relevant files—large repos slow performance

For SWE-Agent

  1. Use Mini-SWE-Agent for simple issues—faster, same accuracy
  2. Rotate API keys with ::: separator for batch runs
  3. Enable extended thinking with Claude 4 Sonnet for complex reasoning
  4. Pre-filter issues to tasks the agent can realistically solve
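Tip 2's `:::`-separated key list can be consumed round-robin across batch jobs. A minimal sketch (the separator comes from the tip above; the rotation logic here is just an illustration, not SWE-Agent internals):

```python
from itertools import cycle

def key_rotator(env_value):
    """Split a ':::'-separated API key string and cycle through the keys
    so parallel batch runs spread load across accounts."""
    keys = [k for k in env_value.split(":::") if k]
    return cycle(keys)

# Placeholder keys for illustration only.
rotator = key_rotator("sk-key-a:::sk-key-b:::sk-key-c")
```

Each worker then calls `next(rotator)` before issuing a request, so no single key absorbs the whole batch's rate limit.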

Key Takeaways

  1. OpenHands is the enterprise choice—web UI, multi-agent, RBAC, $18.8M funding
  2. SWE-Agent excels in research—ACI innovation, cybersecurity mode, academic pedigree
  3. Benchmarks are similar (70-74% on SWE-bench)—choose based on features, not scores
  4. Real-world performance is ~20%—both struggle with truly novel issues
  5. Both support local LLMs—but 32B+ models recommended for complex tasks
  6. Mini-SWE-Agent proves simplicity wins—100 lines of Python, >74% accuracy
  7. Consider alternatives—Aider for CLI, Cline for VS Code, Continue for privacy

Next Steps

  1. Set up Ollama for local model support
  2. Compare AI coding tools for your IDE
  3. Build AI agents from scratch
  4. Understand MCP servers for tool integration
  5. Check VRAM requirements for local models

OpenHands and SWE-Agent represent the cutting edge of autonomous AI software engineering. OpenHands offers the polish and features enterprises need, while SWE-Agent provides the clean architecture researchers love. Both are MIT-licensed, actively maintained, and capable of solving real coding problems—the choice depends on whether you need production deployment or research flexibility.




Written by Pattanaik Ramswarup

AI Engineer & Dataset Architect | Creator of the 77,000 Training Dataset
