Fixing GPU/display crashes on Windows using MCP Server and AI (#ai #mcp #windows #troubleshoot #nvidia)

Overview
Some time ago, my desktop's displays started turning black, as if they had been switched off. This would come and go, until one day the displays went black and the video never came back, even though I could still hear the computer's audio. At that point I knew procrastination was no longer an option.
Before we get into the weeds of the MCP Server, let's understand the problem and the context.
Troubleshooting in Windows
If you're a Windows user, you might know about Event Viewer, a tool that lets you view system, application, and security event logs. These logs contain information about system events, such as errors, warnings, and informational messages. Sounds nice and easy, right? The problem is that it is absolutely overwhelming to sift through all the logs to find the relevant information, especially if you don't know exactly what you're looking for, which was my situation. I knew the symptoms, but I didn't know what was causing them.
Another problem with Event Viewer is that some logs might contain sensitive information. This was not exactly pertinent to my problem, but if I wanted to enlist an AI to help me, I needed to take it into account.
Would AI really help solve this problem?
It seemed plausible, since current AI models are good at quickly sifting through large amounts of data and (hopefully) identifying patterns and correlations. However, to let the AI do its job, I would need to allow it to run PowerShell commands to extract information, then run more scripts to parse the data into a format it can understand.
Besides the security implications, there was another problem: the sheer volume of data in those logs would get in the way, filling up the context window pretty fast and pushing token usage to an absurd amount.
It sounds like I'm exaggerating, but if you've ever had to find something in Event Viewer, you know what I'm talking about. 😅
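To make the volume problem concrete, here is a minimal Python sketch of the kind of pre-filtering that keeps log data from flooding a context window. It is not os-doctor's code (that is written in C#); the `LogEntry` shape, the function name, and the default limits are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


# Hypothetical, simplified shape of an event log record (not the real Windows schema).
@dataclass
class LogEntry:
    timestamp: datetime
    level: str    # e.g. "Critical", "Error", "Warning", "Information"
    source: str
    message: str


def trim_for_model(entries, now, hours=24, levels=("Critical", "Error"),
                   max_entries=50, max_chars=300):
    """Keep only recent, severe entries and cap their count and message length."""
    cutoff = now - timedelta(hours=hours)
    kept = [e for e in entries if e.timestamp >= cutoff and e.level in levels]
    kept.sort(key=lambda e: e.timestamp, reverse=True)  # newest first
    return [
        {
            "time": e.timestamp.isoformat(timespec="seconds"),
            "level": e.level,
            "source": e.source,
            "message": e.message[:max_chars],  # long messages are truncated
        }
        for e in kept[:max_entries]
    ]
```

Filtering by time window and severity before anything reaches the model is what keeps token usage bounded, no matter how large the underlying log is.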
Calling the OS-Doctor!
With all that in mind, it looked like MCP Servers were made for exactly this type of use case. I thought about simply searching for an existing MCP Server that did this, because in this day and age there are probably tons of ready-to-use ones out there. However, I was also curious to learn more about AI tools: how they work and what makes them tick.
How did it work?
Better than I expected. I integrated it with Claude Code, and it was quickly able to go through all the data, identify the possible causes of the crash, suggest fixes, and monitor the system while I tested it. Overall, I'm pretty happy with the result, especially because I was able to fix the issue before it became a bigger problem.
How do you use it?
First, you let your AI agent know it exists. In my case, I was using Claude Code, and I configured this tool to be available globally, so in the file c:\Users\<my username>\.claude.json I added:

    {
      "mcpServers": {
        "os-doctor": {
          "command": "gsudo",
          "args": [
            "C:/path/to/published/McpOsDoctor.exe"
          ]
        }
      }
    }

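If your .claude.json already contains other settings, editing it by hand risks breaking the JSON. As a hedged alternative, the sketch below merges the entry into existing config text without touching other keys. The function name is hypothetical, and the `other-server` entry in the demo is made up purely to show that existing servers are preserved.

```python
import json


def add_mcp_server(config_text, name, command, args):
    """Return config_text with one MCP server entry added or replaced."""
    # An empty or whitespace-only file is treated as an empty config.
    config = json.loads(config_text) if config_text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": list(args)}
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Made-up existing config, for illustration only.
    existing = '{"mcpServers": {"other-server": {"command": "other.exe", "args": []}}}'
    print(add_mcp_server(existing, "os-doctor", "gsudo",
                         ["C:/path/to/published/McpOsDoctor.exe"]))
```

The function only transforms text; reading and writing the actual file (ideally after making a backup) is left to the caller.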
Then you can start the agent and ask something like "What diagnostic tools do you have available?", and it will list what the os-doctor MCP server offers.
From there, you can simply ask troubleshooting questions, and the agent will make use of the MCP server. For instance:
- "Why is my computer running slow?": checks processes, system info, and event logs
- "Show me any recent system errors": queries the event log for Error/Critical entries
- "Is the Windows Update service running?": checks service status
- "Has my computer crashed recently?": inspects boot history for unexpected shutdowns
- "What's using all my memory?": lists top processes by memory consumption
- "Monitor my CPU and GPU temperatures": starts sensor monitoring and reports thermal data
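Under the hood, each of these questions ends up as one or more MCP tool calls, which are JSON-RPC 2.0 messages. The sketch below builds a `tools/list` request (what a client sends to discover tools) and a `tools/call` request. The two method names and the `name`/`arguments` fields come from the MCP specification; the argument names passed to `query_system_log` are hypothetical, since os-doctor's real parameter names are not shown in this post (`get_capabilities` reports the actual hints).

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids only need to be unique within a session


def list_tools_request():
    """Build the request an MCP client sends to discover a server's tools."""
    return {"jsonrpc": "2.0", "id": next(_ids), "method": "tools/list"}


def call_tool_request(name, arguments=None):
    """Build a request that invokes one tool by name."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


if __name__ == "__main__":
    print(json.dumps(list_tools_request()))
    # Argument names below are assumptions for illustration only.
    print(json.dumps(call_tool_request(
        "query_system_log", {"severity": "Error", "hours": 24})))
```

In practice you never write these messages yourself; the agent picks the tool and fills in the arguments based on your question.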
Which tools does os-doctor have?
| Tool | Description |
|---|---|
| get_capabilities | Reports available tools, platform, elevation status, and parameter hints |
| query_system_log | Search Windows Event Log entries by time, severity, source, and keywords |
| list_log_sources | List available event log sources |
| get_service_status | Query Windows services by name, pattern, or status |
| list_top_processes | List top processes sorted by CPU or memory usage |
| get_system_info | Hardware and OS snapshot (hostname, CPU, memory, disks, uptime) |
| get_boot_history | Boot, shutdown, crash, and sleep/wake events with timestamps |
| get_gpu_info | NVIDIA GPU info: model, driver, VRAM usage, temperature, utilization, power draw via nvidia-smi |
| get_directx_info | DirectX version, display adapters (VRAM, drivers, feature levels), and sound devices via dxdiag |
| start_sensor_monitoring | Start background hardware sensor polling (temperature, fan, voltage, clock, load, power) |
| stop_sensor_monitoring | Stop background sensor monitoring; collected data remains available via get_sensor_data |
| get_sensor_data | Retrieve sensor monitoring results with min/max/average/current statistics per sensor |
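To illustrate what "min/max/average/current statistics per sensor" means, here is a small Python sketch that reduces a stream of readings to that summary. It is an assumption about the general idea, not a port of os-doctor's C# implementation; the `(sensor_name, value)` sample format and the function name are hypothetical.

```python
from collections import defaultdict


def summarize(samples):
    """Reduce (sensor_name, value) samples to min/max/average/current per sensor."""
    by_sensor = defaultdict(list)
    for name, value in samples:
        by_sensor[name].append(value)  # samples are assumed to arrive in time order
    return {
        name: {
            "min": min(values),
            "max": max(values),
            "average": sum(values) / len(values),
            "current": values[-1],  # most recent sample
        }
        for name, values in by_sensor.items()
    }
```

Returning a summary instead of every raw reading is also what keeps a long monitoring session cheap to hand to the model.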
Note: `get_gpu_info` requires `nvidia-smi` to be installed and accessible in the system's PATH, and `get_directx_info` requires `dxdiag` to be installed and accessible in the system's PATH.
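A quick way to verify this requirement before wiring up the server is to check whether both executables resolve on PATH. The sketch below uses Python's standard `shutil.which`; the function name is hypothetical, and this check is independent of whatever os-doctor does internally.

```python
import shutil


def missing_tools(tools=("nvidia-smi", "dxdiag")):
    """Return the executables from `tools` that cannot be found on PATH."""
    return [t for t in tools if shutil.which(t) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("nvidia-smi and dxdiag are both reachable.")
```

If either name is reported as missing, the corresponding tool is likely to fail until the executable's directory is added to PATH.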
How to install this MCP Server?
If you want it ready to use, you can download the pre-built binary from the releases page, place it in any directory, and configure your AI agent to use it.
If you prefer building from source, you can clone the repository and go from there. It is written in C# (.NET 10).
Any plans on supporting other operating systems?
So you think this is a nice project, but you're one of the lucky ones who don't use Windows? Well, I built this MCP Server with cross-platform compatibility in mind, so it should be fairly simple to add support for Linux and macOS. That said, I don't have concrete plans to add it at the moment.
Conclusion
Overall, this was a fun project to work on, and it worked really well. It goes to show that AI can be a powerful and useful tool when you know the problem you're trying to solve.
If you want to check the code out, you can visit the GitHub repository.
Hope it helps! :)