Windows MCP Server

Windows automation that actually works.

Let AI Control Windows Apps — By Name, Not Pixels

Windows MCP Server gives your AI assistant direct access to Windows applications through the Windows UI Automation API, the same API that screen readers use to read buttons, menus, and text fields.

Your AI says “click Save” and the server finds the Save button by name. No screenshots. No pixel parsing. No coordinate guessing.

What You Can Do

Ask your AI assistant to control any Windows application:

“Click the Save button in Notepad”
“Type my email in the login field”
“Toggle Dark Mode in Settings”
“Move this window to my second monitor”
“Read the error message from that dialog”

Works with GitHub Copilot, Claude Desktop, Cursor, and any MCP client.

Why This Approach

Most automation tools take screenshots and ask vision models to find buttons in the pixels. That approach is slow, expensive, and breaks when windows move or themes change.

Windows MCP Server asks Windows directly: “What buttons exist?” Windows knows. It’s deterministic. Same command works every time, regardless of DPI, theme, or resolution.
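To make the idea concrete, here is a minimal sketch of name-based lookup over an element tree. It is a toy model, not the Windows UI Automation API and not this server's implementation: the `Element` class, the `find_by_name` helper, and the sample Notepad-like tree are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Element:
    """Toy stand-in for a UI element (hypothetical, not the real API)."""
    name: str
    control_type: str
    children: list[Element] = field(default_factory=list)


def find_by_name(root: Element, name: str, control_type: str) -> Optional[Element]:
    """Depth-first search for the first element matching name and control type."""
    if root.name == name and root.control_type == control_type:
        return root
    for child in root.children:
        match = find_by_name(child, name, control_type)
        if match is not None:
            return match
    return None


# Hypothetical tree for a Notepad-like window.
window = Element("Untitled - Notepad", "Window", [
    Element("File", "MenuItem", [Element("Save", "MenuItem")]),
    Element("Text editor", "Edit"),
    Element("Save", "Button"),
])

# The lookup depends only on names and types, never on screen position.
save_button = find_by_name(window, "Save", "Button")
```

Nothing in the lookup refers to coordinates, so moving the window or switching themes does not change the result.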

Tested with Real AI Models

Tool descriptions that seem clear to humans often confuse AI. Parameters get misunderstood. Actions get skipped.

We test every tool with real AI models (GPT-4.1, GPT-5.2) using pytest-aitest. 54 automated tests. 100% pass rate required for release.

If the AI can’t use it correctly, we fix the tool — not the prompt.

View latest LLM test results →

Quick Start

VS Code (Recommended)

Install from VS Code Marketplace →

Other MCP Clients

Download from GitHub Releases. Add to your MCP config:

{
  "servers": {
    "windows": {
      "command": "path/to/Sbroenne.WindowsMcp.exe"
    }
  }
}
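If your config file already lists other servers, you may prefer to merge the entry rather than paste over the file. The sketch below shows one way to do that. The helper name `add_windows_server` and the file name `mcp.json` are assumptions for illustration; the config location and the top-level key vary by client, so check your client's documentation.

```python
import json
import tempfile
from pathlib import Path


def add_windows_server(config_path: Path, exe_path: str, key: str = "servers") -> dict:
    """Add or update the "windows" entry, leaving other servers untouched."""
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}
    config.setdefault(key, {})["windows"] = {"command": exe_path}
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config


# Demo against a throwaway file; "mcp.json" and the "other" entry are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "mcp.json"
    path.write_text('{"servers": {"other": {"command": "other.exe"}}}', encoding="utf-8")
    result = add_windows_server(path, "path/to/Sbroenne.WindowsMcp.exe")
```

Replace the placeholder path with the location of the downloaded executable.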

Tools

Tool	What It Does
`ui_click`	Click buttons, checkboxes, menu items by name
`ui_type`	Type into text fields
`ui_find`	Discover elements in a window (with timeout/retry)
`ui_read`	Read text (with OCR fallback)
`file_save`	Save files via Save As dialog
`screenshot_control`	Get element metadata (image optional)
`window_management`	Find, activate, move, resize windows
`mouse_control`	Coordinate-based clicks (fallback for games)
`keyboard_control`	Hotkeys and key sequences
`app`	Launch applications

Complete tool reference →
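Under the hood, an MCP client invokes a tool by sending a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a message for `ui_click`. The envelope follows the MCP protocol, but the argument names (`window`, `name`) are assumptions made for illustration; the tool reference defines the real schema.

```python
import json


def make_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


# Argument names below are hypothetical, not taken from the tool reference.
message = make_tool_call(1, "ui_click", {"window": "Notepad", "name": "Save"})
```

In practice your MCP client builds and sends these messages for you; you only write the natural-language request.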

Key Features

🧠 Semantic UI Access

Find elements by name, type, or ID — not coordinates. Works regardless of DPI, theme, resolution, or window position.

🧪 LLM-Tested

Every tool tested with real AI models before release. 54 automated tests across 7 scenarios. 100% pass rate required.

💻 Broad App Support

Tested against classic Windows apps, modern Windows 11 apps, and Electron apps (VS Code, Teams, Slack). Same commands work across all.

📺 Multi-Monitor

Full support for multiple displays with per-monitor DPI scaling. Move windows between monitors, capture any screen.
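Per-monitor DPI is one reason fixed pixel coordinates are fragile. As a simplified sketch (arithmetic only, not the server's implementation), Windows treats 96 DPI as 100% scaling, so the same logical position maps to different physical pixels on different monitors. The helper name `to_physical` is hypothetical.

```python
BASE_DPI = 96  # Windows treats 96 DPI as 100% scaling


def to_physical(logical_px: int, monitor_dpi: int) -> int:
    """Convert a logical pixel offset to physical pixels for a monitor's DPI."""
    return round(logical_px * monitor_dpi / BASE_DPI)


button_x = 400
on_100_percent = to_physical(button_x, 96)   # 400 physical pixels
on_150_percent = to_physical(button_x, 144)  # 600 physical pixels
```

Finding elements by name sidesteps this conversion entirely.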

🔄 Full Fallback

Screenshot, mouse, and keyboard tools cover games and custom controls. Annotated screenshots return element metadata; the image itself is omitted by default to save tokens.

⚠️ Caution

This MCP server controls your Windows desktop. Use responsibly.

pytest-aitest — LLM agent testing framework (powers our integration tests)
Excel MCP Server — AI-powered Excel automation
OBS Studio MCP Server — AI-powered streaming control
