Architecture
Enterprise automation framework for MCP-native desktop control.
WinScript bridges high-level AI reasoning and low-level Windows execution through a structured pipeline that emphasizes observability, routing, and reliable recovery.
Execution pipeline
One flow from agent intent to Windows execution.
Step 1: AI agent. Claude, GPT, or your own runtime.
Step 2: MCP server. Protocol and tool schema layer.
Step 3: WinScript. Execution and recovery engine.
Step 4: OS interface. Win32, COM, registry, and shell.
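The four steps above can be pictured as a chain of handlers. The sketch below is illustrative only: `ToolCall`, `agent`, `mcp_layer`, `winscript_engine`, and `os_interface` are hypothetical names, not WinScript's actual API, and the OS stage is a stub that makes no Windows calls.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    """Hypothetical message passed down the pipeline."""
    tool: str
    args: dict
    trail: list = field(default_factory=list)  # stages visited, in order


def agent(intent: str) -> ToolCall:
    # Step 1: the agent turns intent into a structured tool call.
    call = ToolCall(tool="open_app", args={"name": intent})
    call.trail.append("agent")
    return call


def mcp_layer(call: ToolCall) -> ToolCall:
    # Step 2: the protocol layer checks the call against a tool schema.
    if not isinstance(call.args.get("name"), str):
        raise ValueError("schema violation: 'name' must be a string")
    call.trail.append("mcp")
    return call


def winscript_engine(call: ToolCall) -> ToolCall:
    # Step 3: the engine would choose a driver and handle recovery here.
    call.trail.append("winscript")
    return call


def os_interface(call: ToolCall) -> str:
    # Step 4: stub standing in for Win32, COM, registry, or shell access.
    call.trail.append("os")
    return f"executed {call.tool}({call.args['name']})"


def run(intent: str) -> tuple[str, list]:
    call = winscript_engine(mcp_layer(agent(intent)))
    result = os_interface(call)
    return result, call.trail
```

Calling `run("notepad")` walks all four stages in order and returns the stub result together with the list of stages visited.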
Sandboxed execution
Commands are validated against the configured automation context, so the runtime stays useful without making unchecked changes to the system.
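One way to picture this validation is an allowlist check that runs before any driver is invoked. This is a minimal sketch under assumptions: `AutomationContext` and `validate` are hypothetical names, and the source does not say what the real context contains or how it is enforced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AutomationContext:
    """Hypothetical policy object: which drivers and targets are allowed."""
    allowed_drivers: frozenset
    allowed_targets: frozenset


def validate(driver: str, target: str, ctx: AutomationContext) -> None:
    """Raise PermissionError if the command falls outside the context."""
    if driver not in ctx.allowed_drivers:
        raise PermissionError(f"driver not allowed: {driver}")
    if target not in ctx.allowed_targets:
        raise PermissionError(f"target not allowed: {target}")


# Example context: only GUI automation against Notepad is permitted.
ctx = AutomationContext(
    allowed_drivers=frozenset({"AutomationProxy"}),
    allowed_targets=frozenset({"notepad.exe"}),
)
validate("AutomationProxy", "notepad.exe", ctx)  # passes silently
```

With this context, a call such as `validate("ShellExecutor", "notepad.exe", ctx)` is rejected with `PermissionError` before anything reaches the OS.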
Audit logging
Each step from MCP request to Windows action can be captured as a trace, making it easier to inspect, replay, and debug.
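A trace of this kind could be modeled as an ordered list of events that share one trace ID. The sketch below assumes a hypothetical `Trace` class and made-up field names (`seq`, `stage`, `detail`); the real trace format is not described in the source.

```python
import json
import time
import uuid


class Trace:
    """Hypothetical in-memory trace; a real system would persist this."""

    def __init__(self) -> None:
        self.trace_id = uuid.uuid4().hex
        self.events = []

    def record(self, stage: str, detail: dict) -> None:
        # seq preserves ordering even if two timestamps are identical.
        self.events.append({
            "trace_id": self.trace_id,
            "seq": len(self.events),
            "ts": time.time(),
            "stage": stage,
            "detail": detail,
        })

    def to_jsonl(self) -> str:
        # One JSON object per line, easy to search and to replay in order.
        return "\n".join(json.dumps(e, sort_keys=True) for e in self.events)


trace = Trace()
trace.record("mcp_request", {"tool": "click", "args": {"x": 10, "y": 20}})
trace.record("windows_action", {"driver": "InputSynthesizer", "ok": True})
```

Keeping a sequence number alongside the timestamp is a design choice that makes replay order unambiguous.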
Protocol state
Session state is kept separate between the agent and the OS, so workflows remain reproducible across long-running tasks.
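One possible reading of this is that each step produces a new, immutable snapshot of session state rather than mutating shared state. The sketch below illustrates that interpretation only; `SessionState`, `advance`, and the fields shown are hypothetical and not taken from WinScript.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SessionState:
    """Hypothetical immutable snapshot of one session."""
    session_id: str
    step: int = 0
    focused_window: str = ""


def advance(state: SessionState, focused_window: str) -> SessionState:
    # Return a new snapshot instead of mutating, so earlier steps stay intact.
    return replace(state, step=state.step + 1, focused_window=focused_window)


s0 = SessionState(session_id="demo")
s1 = advance(s0, "Notepad")
```

Because `s0` is unchanged after `advance`, any earlier point in a long-running workflow can be inspected or resumed from its own snapshot.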
Visual feedback pipeline
Screen capture and state inspection are part of the runtime rather than bolted on after the fact. That gives the next model step the evidence it needs to verify state, not just hope for it.
{
  "type": "image",
  "metadata": {
    "resolution": "1440x900",
    "encoding": "jpeg"
  },
  "data": "base64..."
}
Capture metadata
High-fidelity screen capture with latency tracking
Optional privacy redaction layer
Automatic OCR for text-heavy interfaces
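A message in the shape shown above could be assembled as follows. This is a sketch under assumptions: `build_capture_message` is a hypothetical helper, the placeholder bytes stand in for a real screenshot, and latency tracking, redaction, and OCR are left out because the source does not specify their fields.

```python
import base64
import json


def build_capture_message(jpeg_bytes: bytes, width: int, height: int) -> dict:
    """Wrap raw JPEG bytes in the image message shape from the sample."""
    return {
        "type": "image",
        "metadata": {
            "resolution": f"{width}x{height}",
            "encoding": "jpeg",
        },
        # Base64 keeps the binary image safe inside a JSON string.
        "data": base64.b64encode(jpeg_bytes).decode("ascii"),
    }


# Placeholder bytes, not a valid image; a real capture would supply JPEG data.
message = build_capture_message(b"\xff\xd8\xff\xd9", 1440, 900)
print(json.dumps(message["metadata"]))
```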
Backend drivers
Driver model behind the runtime.
| Driver | Technology | Use case | Safety |
|---|---|---|---|
| AutomationProxy | pywinauto | GUI navigation and clicks | HIGH |
| ComponentBridge | win32com | Office and legacy integrations | MEDIUM |
| ShellExecutor | subprocess | CLI tools and file operations | CRITICAL |
| InputSynthesizer | SendInput (Win32 API) | Raw keyboard and mouse simulation | HIGH |
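The table can be read as a registry that maps each driver to a rating. The sketch below assumes the Safety column is an ordered sensitivity rating in which CRITICAL is the most restrictive; the source does not say how the levels are enforced, and `Safety`, `DRIVERS`, and `needs_confirmation` are hypothetical names.

```python
from enum import Enum


class Safety(Enum):
    # Ordering is an assumption: higher value means more sensitive.
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3


# Driver names and levels copied from the table; the mapping is illustrative.
DRIVERS = {
    "AutomationProxy": Safety.HIGH,
    "ComponentBridge": Safety.MEDIUM,
    "ShellExecutor": Safety.CRITICAL,
    "InputSynthesizer": Safety.HIGH,
}


def needs_confirmation(driver: str, threshold: Safety = Safety.CRITICAL) -> bool:
    """Hypothetical gate: require confirmation at or above the threshold."""
    return DRIVERS[driver].value >= threshold.value
```

With the default threshold only `ShellExecutor` is gated; lowering the threshold to `Safety.HIGH` also gates `AutomationProxy` and `InputSynthesizer`.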