Windows automation that actually works.
Windows MCP Server gives your AI assistant direct access to Windows applications through the Windows UI Automation API, the same API screen readers use to read buttons, menus, and text fields.
Your AI says “click Save” and the server finds the Save button by name. No screenshots. No pixel parsing. No coordinate guessing.
Ask your AI assistant to control any Windows application.
Works with GitHub Copilot, Claude Desktop, Cursor, and any MCP client.
Most automation tools take screenshots and ask vision models to find buttons in the pixels. That approach is slow, expensive, and breaks when windows move or themes change.
Windows MCP Server asks Windows directly: “What buttons exist?” Windows knows. It’s deterministic. Same command works every time, regardless of DPI, theme, or resolution.
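To make the contrast with pixel parsing concrete, here is a minimal sketch of name-based lookup over an element tree. The `Element` class and the sample window are hypothetical stand-ins for illustration only; they are not the server's code or the real UI Automation API.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class Element:
    """Hypothetical stand-in for a UI Automation element (illustration only)."""
    name: str
    control_type: str
    children: list["Element"] = field(default_factory=list)


def walk(root: Element) -> Iterator[Element]:
    """Yield the root and every descendant, depth first."""
    yield root
    for child in root.children:
        yield from walk(child)


def find_by_name(root: Element, name: str, control_type: str = "Button") -> Optional[Element]:
    """Return the first element matching name and control type, or None."""
    for element in walk(root):
        if element.name == name and element.control_type == control_type:
            return element
    return None


# A made-up window tree. No coordinates appear anywhere in the lookup.
window = Element("Untitled - Notepad", "Window", [
    Element("File", "MenuItem"),
    Element("Save", "Button"),
    Element("Cancel", "Button"),
])

save = find_by_name(window, "Save")
print(save.name if save else "not found")
```

Because the lookup depends only on the element's name and type, moving the window or changing the theme does not affect the result.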
Tool descriptions that seem clear to humans often confuse AI. Parameters get misunderstood. Actions get skipped.
We test every tool with real AI models (GPT-4.1, GPT-5.2) using agent-benchmark. 54 automated tests. 100% pass rate required for release.
If the AI can’t use it correctly, we fix the tool — not the prompt.
View latest LLM test results →
Install from VS Code Marketplace →
Download from GitHub Releases. Add to your MCP config:
{ "servers": { "windows": { "command": "path/to/Sbroenne.WindowsMcp.exe" } } }
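If your MCP config already lists other servers, the `windows` entry needs to be merged in rather than pasted over them. The sketch below shows one way to do that in memory. The helper name `add_windows_server` and the `other` server entry are assumptions for illustration, and the executable path is left as the same placeholder used above; where the config file lives depends on your MCP client.

```python
import json


def add_windows_server(config: dict, exe_path: str) -> dict:
    """Return a copy of an MCP config with the "windows" server entry added.

    Hypothetical helper: existing server entries are kept, and the input
    dictionary is not modified.
    """
    merged = dict(config)
    servers = dict(merged.get("servers", {}))
    servers["windows"] = {"command": exe_path}
    merged["servers"] = servers
    return merged


# A made-up existing config with one unrelated server.
existing = {"servers": {"other": {"command": "other-server"}}}

updated = add_windows_server(existing, "path/to/Sbroenne.WindowsMcp.exe")
print(json.dumps(updated, indent=2))
```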
| Tool | What It Does |
|---|---|
| ui_click | Click buttons, checkboxes, menu items by name |
| ui_type | Type into text fields |
| ui_find | Discover elements in a window (with timeout/retry) |
| ui_read | Read text (with OCR fallback) |
| file_save | Save files via Save As dialog |
| screenshot_control | Get element metadata (image optional) |
| window_management | Find, activate, move, resize windows |
| mouse_control | Coordinate-based clicks (fallback for games) |
| keyboard_control | Hotkeys and key sequences |
| app | Launch applications |
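MCP clients invoke tools like these with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `ui_click`. The envelope follows the MCP convention, but the argument keys (`window`, `name`) are assumptions for illustration; the actual parameter names are defined by each tool's schema.

```python
import json
from itertools import count

# Monotonic request ids, as JSON-RPC expects one id per request.
_ids = count(1)


def tool_call(tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request for the given tool."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(request)


# Hypothetical arguments: the real ui_click parameters may be named differently.
msg = tool_call("ui_click", {"window": "Notepad", "name": "Save"})
print(msg)
```

In practice your AI assistant's MCP client builds and sends these requests for you; the sketch only shows what travels over the wire.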
Find elements by name, type, or ID — not coordinates. Works regardless of DPI, theme, resolution, or window position.
Every tool tested with real AI models before release. 54 automated tests across 7 scenarios. 100% pass rate required.
Tested against classic Windows apps, modern Windows 11 apps, and Electron apps (VS Code, Teams, Slack). Same commands work across all.
Full support for multiple displays with per-monitor DPI scaling. Move windows between monitors, capture any screen.
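As a rough illustration of why per-monitor DPI matters for anything coordinate-based, the sketch below converts a logical point to physical pixels using the Windows baseline of 96 DPI at 100% scaling. The function name is hypothetical and this is not the server's implementation, only the underlying arithmetic.

```python
def to_physical(x: int, y: int, dpi: int) -> tuple[int, int]:
    """Convert logical coordinates to physical pixels for a monitor's DPI.

    Windows treats 96 DPI as 100% scaling, so the scale factor is dpi / 96.
    """
    scale = dpi / 96
    return round(x * scale), round(y * scale)


# The same logical point lands on different physical pixels per monitor.
print(to_physical(400, 300, 96))   # (400, 300) at 100% scaling
print(to_physical(400, 300, 144))  # (600, 450) at 150% scaling
```

A click recorded at fixed pixel coordinates on one monitor would miss on the other, which is the failure mode that name-based lookup avoids.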
Screenshot + mouse + keyboard for games and custom controls. Annotated screenshots return element metadata — image omitted by default to save tokens.
This MCP server controls your Windows desktop. Use responsibly.