mcp-windows

Windows MCP Server

A Model Context Protocol (MCP) server providing Windows automation capabilities for LLM agents. Built on .NET 8 with native Windows API integration.

🤖 Co-designed with Claude Opus 4.5 via GitHub Copilot - This project was developed with AI pair programming, using Claude Opus 4.5 through GitHub Copilot to design, build, and test a robust, production-ready Windows automation solution.

Features

🖱️ Mouse Control

⌨️ Keyboard Control

🪟 Window Management

📸 Screenshot Capture

Why Choose Windows MCP?

Comprehensive Windows Automation - Unlike generic computer control tools, Windows MCP is purpose-built for Windows with native API integration. It handles Windows-specific challenges (UIPI elevation blocks, secure desktop restrictions, virtual desktops) that generic solutions miss.

Multi-Monitor & DPI-Aware - Correctly handles multi-monitor setups, DPI scaling, and virtual desktops, all of which are critical in modern Windows environments. Most alternatives struggle with coordinate translation and DPI awareness.

Full Windows API Coverage - Direct P/Invoke to Windows APIs (SendInput, SetWindowPos, GetWindowText, GdiPlus) provides reliable, low-level control. No browser automation tricks or approximate solutions.

Security-Conscious Design - Detects and gracefully handles elevated windows (UIPI), UAC prompts, and lock screens. Respects Windows security model instead of bypassing it.

Performance - Asynchronous I/O on a dedicated thread pool avoids blocking the LLM client. Configurable delays provide stability without sacrificing speed.

Active Development - Release workflows, comprehensive testing, a VS Code extension, and clear contribution guidelines indicate an actively maintained project.

Key Advantages of Windows MCP

Prerequisites

Installation

Option 1: VS Code Extension

Install the Windows MCP extension from the VS Code Marketplace for one-click deployment:

  1. Open VS Code
  2. Go to Extensions (Ctrl+Shift+X)
  3. Search for “Windows MCP”
  4. Click Install

The extension automatically configures the MCP server and makes it available to GitHub Copilot.

Option 2: Download from Releases

Download pre-built binaries from the GitHub Releases page:

  1. Download the latest mcp-windows-v*.zip
  2. Extract to your preferred location
  3. Add the server to your MCP client configuration (see Manual Configuration under Usage)

Usage

VS Code Extension

If you installed via the VS Code extension, the MCP server is automatically configured. No manual setup required.

Manual Configuration (For Downloaded Releases)

If you downloaded from the releases page, add to your MCP client configuration:

{
  "servers": {
    "windows": {
      "command": "dotnet",
      "args": ["path/to/extracted/Sbroenne.WindowsMcp.dll"],
      "env": {}
    }
  }
}

Note: Releases are framework-dependent and require the .NET 8 Runtime to be installed.

Tools

mouse_control

Control mouse input on Windows.

| Action | Description | Required Parameters |
|---|---|---|
| click | Left-click at coordinates | x, y |
| double_click | Double-click at coordinates | x, y |
| right_click | Right-click at coordinates | x, y |
| middle_click | Middle-click at coordinates | x, y |
| move | Move cursor to coordinates | x, y |
| drag | Drag from current position to coordinates | x, y |
| scroll | Scroll at coordinates | x, y, direction, amount |
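
MCP clients send tool invocations as JSON-RPC 2.0 `tools/call` requests. Most clients build this envelope for you, but seeing it helps when debugging. The sketch below is a minimal illustration, not the server's documented schema: the `action` argument name, the `"down"` direction value, and the helper `tool_call` are assumptions; `x`, `y`, `direction`, and `amount` come from the table above.

```python
import json
from itertools import count

_ids = count(1)  # monotonically increasing JSON-RPC request ids


def tool_call(tool: str, arguments: dict) -> dict:
    """Wrap tool arguments in an MCP `tools/call` JSON-RPC 2.0 request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# "action" is an assumed argument name; x and y are the documented parameters.
click = tool_call("mouse_control", {"action": "click", "x": 640, "y": 360})

# "down" and 3 are illustrative values; the accepted values are not documented here.
scroll = tool_call(
    "mouse_control",
    {"action": "scroll", "x": 640, "y": 360, "direction": "down", "amount": 3},
)

print(json.dumps(click, indent=2))
```

The same helper works for the other tools in this section; only the tool name and arguments change.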

keyboard_control

Control keyboard input on Windows.

| Action | Description | Required Parameters |
|---|---|---|
| type | Type text using Unicode input | text |
| press | Press and release a key | key |
| key_down | Hold a key down | key |
| key_up | Release a held key | key |
| combo | Key + modifiers combination | key, modifiers |
| sequence | Multiple keys in order | keys |
| release_all | Release all held keys | none |
| get_keyboard_layout | Query current layout | none |
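
A client can catch missing parameters before sending a request by checking arguments against the table above. This is a client-side sketch only: the `action` argument name, passing `modifiers` as a list, and the helper `keyboard_args` are assumptions, not the server's confirmed schema.

```python
# Required parameters per action, copied from the keyboard_control table.
REQUIRED = {
    "type": {"text"},
    "press": {"key"},
    "key_down": {"key"},
    "key_up": {"key"},
    "combo": {"key", "modifiers"},
    "sequence": {"keys"},
    "release_all": set(),
    "get_keyboard_layout": set(),
}


def keyboard_args(action: str, **params) -> dict:
    """Return an arguments dict, or raise if a required parameter is missing."""
    if action not in REQUIRED:
        raise ValueError(f"unknown keyboard_control action: {action!r}")
    missing = REQUIRED[action] - set(params)
    if missing:
        raise ValueError(f"{action} is missing: {sorted(missing)}")
    return {"action": action, **params}


# Ctrl+S; the list form of `modifiers` is an assumption.
save = keyboard_args("combo", key="s", modifiers=["ctrl"])
print(save)
```

If you hold keys with key_down, pair each one with key_up or finish with release_all so no key stays pressed.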

window_management

Control windows on the Windows desktop.

| Action | Description | Required Parameters |
|---|---|---|
| list | List all visible windows | none |
| find | Find windows by title | title |
| activate | Bring window to foreground | handle |
| get_foreground | Get current foreground window | none |
| minimize | Minimize window | handle |
| maximize | Maximize window | handle |
| restore | Restore window from min/max | handle |
| close | Close window (sends WM_CLOSE) | handle |
| move | Move window to position | handle, x, y |
| resize | Resize window | handle, width, height |
| set_bounds | Move and resize atomically | handle, x, y, width, height |
| wait_for | Wait for window to appear | title |
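
The wait_for action is governed by a timeout and a polling interval (see Configuration). The sketch below shows the general poll-until-deadline pattern such an action implies. It is not the server's implementation: the function and its `find` callback are hypothetical, and only the default values (30000 ms and 250 ms) are taken from this README.

```python
import time
from typing import Callable, Optional


def wait_for(find: Callable[[], Optional[int]],
             timeout_ms: int = 30000,
             interval_ms: int = 250) -> int:
    """Poll `find` until it returns a window handle or the deadline passes."""
    deadline = time.monotonic() + timeout_ms / 1000
    while True:
        handle = find()
        if handle is not None:
            return handle
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no matching window after {timeout_ms} ms")
        time.sleep(interval_ms / 1000)


# Stand-in for a real lookup: the window "appears" on the third poll.
attempts = iter([None, None, 0x2A])
handle = wait_for(lambda: next(attempts), timeout_ms=1000, interval_ms=10)
print(handle)
```

Using a monotonic clock keeps the deadline stable if the system time changes while waiting.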

screenshot_control

Capture screenshots on Windows.

| Action | Description | Required Parameters |
|---|---|---|
| capture | Capture screenshot | target |
| list_monitors | List all connected monitors | none |

Capture Targets:

| Target | Description | Additional Parameters |
|---|---|---|
| primary_screen | Capture primary monitor | none |
| monitor | Capture specific monitor | monitor_index |
| window | Capture specific window | window_handle |
| region | Capture rectangular region | x, y, width, height |

Optional Parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| include_cursor | boolean | false | Include mouse cursor in capture |
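
Each capture target needs a different set of additional parameters, so it can help to assemble the arguments with a small helper. This is a sketch under assumptions: the `action` argument name and the helper `capture_args` are hypothetical, and the positive-size check for regions is a guess at what triggers InvalidRegion.

```python
# Additional parameters per capture target, copied from the table above.
TARGET_PARAMS = {
    "primary_screen": (),
    "monitor": ("monitor_index",),
    "window": ("window_handle",),
    "region": ("x", "y", "width", "height"),
}


def capture_args(target: str, include_cursor: bool = False, **params) -> dict:
    """Build arguments for a screenshot_control capture request."""
    if target not in TARGET_PARAMS:
        raise ValueError(f"unknown capture target: {target!r}")
    missing = [p for p in TARGET_PARAMS[target] if p not in params]
    if missing:
        raise ValueError(f"target {target!r} is missing: {missing}")
    # Assumption: a region must have a positive width and height.
    if target == "region" and (params["width"] <= 0 or params["height"] <= 0):
        raise ValueError("region needs a positive width and height")
    return {"action": "capture", "target": target,
            "include_cursor": include_cursor, **params}


region = capture_args("region", x=0, y=0, width=800, height=600)
second_monitor = capture_args("monitor", include_cursor=True, monitor_index=1)
print(region)
```

Whether monitor_index starts at 0 or 1 is not stated here; call list_monitors first to see the valid values.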

Supported Keys

Function Keys

f1 through f24

Navigation

up, down, left, right, home, end, pageup, pagedown, insert, delete

Control

enter, tab, escape, space, backspace

Modifiers

ctrl, shift, alt, win

Media

volumemute, volumedown, volumeup, mediaplaypause, medianexttrack, mediaprevtrack, mediastop

Special

copilot (Windows 11 Copilot+ PCs)

Browser

browserback, browserforward, browserrefresh, browserstop, browsersearch, browserfavorites, browserhome
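
The named keys above can be collected into a lookup set for client-side checks before a request is sent. This sketch covers only the names listed in this section; the server may accept others (single characters, for example) that it does not describe. Lower-casing the input is a client-side convenience, not documented server behavior, and `is_named_key` is a hypothetical helper.

```python
NAMED_KEYS = (
    {f"f{n}" for n in range(1, 25)}  # f1 through f24
    | {"up", "down", "left", "right", "home", "end",
       "pageup", "pagedown", "insert", "delete"}
    | {"enter", "tab", "escape", "space", "backspace"}
    | {"ctrl", "shift", "alt", "win"}
    | {"volumemute", "volumedown", "volumeup", "mediaplaypause",
       "medianexttrack", "mediaprevtrack", "mediastop"}
    | {"copilot"}
    | {"browserback", "browserforward", "browserrefresh", "browserstop",
       "browsersearch", "browserfavorites", "browserhome"}
)


def is_named_key(name: str) -> bool:
    """True if `name` matches one of the key names listed in this README."""
    return name.strip().lower() in NAMED_KEYS


print(is_named_key("PageDown"))  # True
print(is_named_key("f25"))       # False
```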

Error Handling

The server handles common Windows security scenarios:

| Error Code | Description |
|---|---|
| ElevatedWindowActive | Target window is running as Administrator |
| SecureDesktopActive | UAC prompt or lock screen is active |
| InvalidKey | Unrecognized key name |
| InputBlocked | Input was blocked by UIPI |
| Timeout | Operation timed out |
| InvalidMonitorIndex | Monitor index out of range |
| InvalidWindowHandle | Window handle is invalid or window no longer exists |
| WindowMinimized | Cannot capture minimized window |
| WindowNotVisible | Window is not visible |
| InvalidRegion | Capture region has invalid dimensions |
| CaptureFailed | Screenshot capture operation failed |
| SizeLimitExceeded | Requested capture exceeds maximum allowed size |
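
An agent usually needs to decide what to do next when one of these codes comes back. The grouping below is one possible client-side policy, offered as a suggestion rather than documented server behavior. How the code appears in a tool response is not specified here, so the sketch works on the code string alone; `Recovery` and `recovery_for` are hypothetical names.

```python
from enum import Enum


class Recovery(Enum):
    RETRY = "retry"              # likely transient; try again after a pause
    FIX_REQUEST = "fix_request"  # change the arguments or window state first
    ASK_USER = "ask_user"        # blocked by Windows security; needs a human


# Assumed policy: one reasonable grouping of the documented error codes.
RECOVERY = {
    "Timeout": Recovery.RETRY,
    "CaptureFailed": Recovery.RETRY,
    "ElevatedWindowActive": Recovery.ASK_USER,
    "SecureDesktopActive": Recovery.ASK_USER,
    "InputBlocked": Recovery.ASK_USER,
    "InvalidKey": Recovery.FIX_REQUEST,
    "InvalidMonitorIndex": Recovery.FIX_REQUEST,
    "InvalidWindowHandle": Recovery.FIX_REQUEST,
    "WindowMinimized": Recovery.FIX_REQUEST,   # e.g. restore the window first
    "WindowNotVisible": Recovery.FIX_REQUEST,
    "InvalidRegion": Recovery.FIX_REQUEST,
    "SizeLimitExceeded": Recovery.FIX_REQUEST,
}


def recovery_for(code: str) -> Recovery:
    """Map an error code to a suggested next step; unknown codes go to a human."""
    return RECOVERY.get(code, Recovery.ASK_USER)


print(recovery_for("WindowMinimized").value)
```

Routing the security-related codes to a human matches the server's stated aim of respecting the Windows security model rather than working around it.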

Configuration

Environment Variables

| Variable | Default | Description |
|---|---|---|
| MCP_WINDOWS_KEYBOARD_CHUNK_DELAY_MS | 10 | Delay between text chunks |
| MCP_WINDOWS_KEYBOARD_KEY_DELAY_MS | 10 | Delay between key presses |
| MCP_WINDOWS_KEYBOARD_SEQUENCE_DELAY_MS | 50 | Delay between sequence keys |
| MCP_WINDOWS_MOUSE_MOVE_DELAY_MS | 10 | Delay after mouse move |
| MCP_WINDOWS_MOUSE_CLICK_DELAY_MS | 50 | Delay after mouse click |
| MCP_WINDOWS_WINDOW_TIMEOUT_MS | 5000 | Default window operation timeout |
| MCP_WINDOWS_WINDOW_WAITFOR_TIMEOUT_MS | 30000 | Default wait_for timeout |
| MCP_WINDOWS_WINDOW_PROPERTY_TIMEOUT_MS | 100 | Timeout for querying window properties |
| MCP_WINDOWS_WINDOW_POLLING_INTERVAL_MS | 250 | Polling interval for wait_for |
| MCP_WINDOWS_WINDOW_ACTIVATION_MAX_RETRIES | 3 | Max retries for window activation |
| MCP_WINDOWS_SCREENSHOT_TIMEOUT_MS | 5000 | Screenshot operation timeout |
| MCP_WINDOWS_SCREENSHOT_MAX_PIXELS | 33177600 | Maximum capture size in pixels (default is 8K, 7680 × 4320) |
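
These variables can be passed through the `env` block of the MCP client configuration shown under Usage. The sketch below generates such a configuration with overrides and checks a capture size against the pixel limit. The helpers `server_config` and `fits_limit` are hypothetical, and treating the limit as an inclusive bound on width × height is an assumption.

```python
import json

# Defaults copied from the table above.
DEFAULTS = {
    "MCP_WINDOWS_KEYBOARD_CHUNK_DELAY_MS": 10,
    "MCP_WINDOWS_KEYBOARD_KEY_DELAY_MS": 10,
    "MCP_WINDOWS_KEYBOARD_SEQUENCE_DELAY_MS": 50,
    "MCP_WINDOWS_MOUSE_MOVE_DELAY_MS": 10,
    "MCP_WINDOWS_MOUSE_CLICK_DELAY_MS": 50,
    "MCP_WINDOWS_WINDOW_TIMEOUT_MS": 5000,
    "MCP_WINDOWS_WINDOW_WAITFOR_TIMEOUT_MS": 30000,
    "MCP_WINDOWS_WINDOW_PROPERTY_TIMEOUT_MS": 100,
    "MCP_WINDOWS_WINDOW_POLLING_INTERVAL_MS": 250,
    "MCP_WINDOWS_WINDOW_ACTIVATION_MAX_RETRIES": 3,
    "MCP_WINDOWS_SCREENSHOT_TIMEOUT_MS": 5000,
    "MCP_WINDOWS_SCREENSHOT_MAX_PIXELS": 33177600,
}


def server_config(dll_path: str, **overrides: int) -> dict:
    """Build an MCP client config entry with environment overrides."""
    env = {}
    for name, value in overrides.items():
        if name not in DEFAULTS:
            raise KeyError(f"unknown variable: {name}")
        if not isinstance(value, int) or value < 0:
            raise ValueError(f"{name} must be a non-negative integer")
        env[name] = str(value)  # environment values are always strings
    return {"servers": {"windows": {
        "command": "dotnet", "args": [dll_path], "env": env}}}


def fits_limit(width: int, height: int,
               max_pixels: int = DEFAULTS["MCP_WINDOWS_SCREENSHOT_MAX_PIXELS"]) -> bool:
    """Assumption: the limit is an inclusive bound on width * height."""
    return width * height <= max_pixels


config = server_config(
    "path/to/extracted/Sbroenne.WindowsMcp.dll",
    MCP_WINDOWS_MOUSE_CLICK_DELAY_MS=100,
)
print(json.dumps(config, indent=2))
print(fits_limit(7680, 4320))  # True: 7680 * 4320 == 33177600
```

Only the variables you override need to appear in `env`; the rest keep the defaults from the table.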

Testing

# Run all tests
dotnet test

# Run unit tests only
dotnet test --filter "FullyQualifiedName~Unit"

# Run integration tests only (requires Windows desktop session)
dotnet test --filter "FullyQualifiedName~Integration"

Security Considerations

License

MIT License - see LICENSE file for details.

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

Start with Getting Started if you’re new to the project.