(New) PC crashes when HWiNFO DDR5 DIMM sensor readings are turned on

Hello,

I recently built my new PC and got a sudden reboot last week when starting up a game (there may also have been some laggy behaviour in Windows before that, but I'm unsure). After the reboot I tried again and could play for 2-3 hours without issue.

I wanted to verify that my system was okay, so I downloaded OCCT and tried stress testing my CPU. After 15 minutes it crashed (black screen and restart). For a while I couldn't figure out why. At some point, just starting OCCT and leaving it idle would crash my system.


I then decided to download HWiNFO and check using other programs like Prime95. Shortly after starting HWiNFO, my PC rebooted. I saw a thread which mentioned turning off some sensors to test whether a specific sensor was the issue. So I started HWiNFO, turned off all sensors, and turned them back on one by one, leaving the RAM-related sensors until all others were on except the Network ones and PresentMon.
Turning on the DDR5 DIMM [#1] (P0 CHANNEL A/DIMM 1) caused a reboot within 20 seconds. Another thing I noticed is the RAM LEDs turning off and on when running OCCT and HWiNFO.

Does this mean my RAM is damaged, or is it just a sensor-reading issue with no actually damaged part?

Memtest86 gave no errors for either stick (4 passes, free version). But after these reboots my system sometimes fails to boot, with the motherboard LED indicators showing a RAM issue. Resetting CMOS solves this; just rebooting only rarely helps.

Also to note, I can play Baldur's Gate 3 for hours on end without issue. I've only had 2 reboots outside of OCCT and HWiNFO: that first time, and another time on the Windows login screen after a reboot caused by running OCCT.

My setup:
Old:
New:
 
This is more likely a collision between multiple programs accessing the DIMM.
Are you running some RGB control software or GIGABYTE ControlCenter together with other tools like OCCT or HWiNFO?
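
To illustrate what such a collision can look like, here is a toy model in Python. Everything in it is an assumption made up for illustration (the `FakeBus` class, the device names, the two-step select/fetch read); it is not the real SMBus protocol or the code of any tool mentioned here. It only shows how one program can end up reading data meant for another when both use a shared bus without coordinating, and how a shared lock prevents that.

```python
import threading


class FakeBus:
    """Toy shared bus: a read takes two steps, select a device, then fetch."""

    def __init__(self):
        self.selected = None
        # Made-up values standing in for a DIMM temperature and an RGB setting.
        self.data = {"dimm_temp": 38, "rgb_mode": 7}

    def select(self, device):
        self.selected = device

    def fetch(self):
        return self.data[self.selected]


# Stand-in for a lock that every tool touching the bus agrees to take.
bus_lock = threading.Lock()


def locked_read(bus, device):
    """Select and fetch as one uninterrupted transaction."""
    with bus_lock:
        bus.select(device)
        return bus.fetch()


def interleaved_read_without_lock(bus):
    """Replay one bad interleaving by hand, with no locking at all."""
    bus.select("dimm_temp")  # monitoring tool, step 1
    bus.select("rgb_mode")   # RGB tool cuts in between the two steps
    return bus.fetch()       # monitoring tool, step 2: gets the RGB value
```

Here `interleaved_read_without_lock(FakeBus())` returns 7 (the RGB value) instead of the temperature, while `locked_read(FakeBus(), "dimm_temp")` returns 38. Note that `threading.Lock` only works inside one process; separate programs would need a lock that is shared system-wide, and all of them would have to use it.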
 
Yes, please try to switch off RGB control and then try HWiNFO and let me know the result.
 
Okay, that's definitely what's been causing the crashes when starting OCCT and HWiNFO. I've had HWiNFO on for over 10 minutes without issue now.

I'm still unsure what that first crash was about, so I might still have some figuring out to do. Now I might be able to run OCCT to do so, though.

Thank you for helping me!
 
You're welcome. This problem needs to be solved by GIGABYTE, we're looking into that.
 
GIGABYTE is asking which versions of GIGABYTE Control Center, GIGABYTE Storage Library and GIGABYTE Performance Library you are using.
This should be visible via Control Panel / Installed apps.
 
And I thought my PC was going to break... I also have the GIGABYTE ControlCenter together with HWiNFO and Rainmeter in autostart, and when I booted up there was a blue screen with the message: clock watchdog timeout.
Of course I went and reinstalled Windows, and everything worked great. Then, right after I installed HWiNFO, the first reboots started as soon as I clicked on sensors.
I'm assuming that Rainmeter accessed the sensors directly at boot and that's why I got a blue screen right away, right?
Sorry for the bad English, it was translated with Google because my English is not that good.
  • Control Center 23.09.28.01
  • Storage Library 23.09.27.03
  • Performance Library 23.09.26.01

Microsoft Windows 11 Pro (64bit)
 
GIGABYTE should release a new version of the Control Center (23.0928.02) this week that is expected to fix the problem.
If not, please let us know.
 
Which version are you using now?
I'm not sure if 23.0928.02 was the version to fix all issues as we found another problem that was fixed just a few days ago. Not sure if that was already released.
 
FYI, according to GIGABYTE the following versions should fully fix this problem:
MBEasyTune 23.11.06.01
MBStorage 23.10.31.01
 
Turning on the DDR5 DIMM [#1] (P0 CHANNEL A/DIMM 1) caused a reboot...
But after these reboots it does happen that my system doesn't boot with the motherboard led indicators showing a RAM issue. Resetting CMOS solves this, just rebooting only rarely helps.

I had the exact same issue as OP.

After a crash caused by HWiNFO my system got a 'black screen of death' and I couldn't boot up the PC; it would get stuck during BIOS POST. I had to disconnect the power cables from the PC for 10 minutes, and after reconnecting them my PC booted up.

I have an Intel CPU with an ASUS Z790 board and DDR5. I do not run any ASUS programs in the background. I run SIVX64 and AIDA64 together. I've been running them together for months now without any issue, and they both access the DDR5 temperature sensors.

Only when I started using HWiNFO did my PC start to behave weirdly and become unstable. Sometimes the DDR5 temperature sensor is lost in all programs.

HWiNFO often stalls for a few seconds when starting up, and hangs for a few seconds while loading the memory sensors and the EC.
Sometimes HWiNFO even crashes during this startup while loading sensors.

The last time HWiNFO did this, my sensors were lost, and when I tried to reboot I got a hard crash and a 'black screen of death'. I could NOT boot up my PC afterwards.

It is unacceptable for software to behave like that. There can be errors or even crashes, but HWiNFO:
- should not hang the DDR5 sensors
- should not corrupt Windows kernel memory
- should not cause a 'black screen of death'
- should not hang the DDR5 internal firmware controllers and hang BIOS POST


HWiNFO should have better error handling. IMHO it is HWiNFO's fault that these kinds of crashes happen when HWiNFO is started or running in the background.
I highly doubt that all the other monitoring software, often backed by bigger teams and more resources, has interprocess communication problems while only HWiNFO is coded "in the right way".

If there is a collision or the accessed sensor is busy, just time out and ignore that reading sample or the whole sensor.

If it happens while HWiNFO is running, just time out, print 0 or NA for that sample, and go on to the next polling loop.

If it happens during module or sensor loading, skip that sensor or exit HWiNFO gracefully with an error, closing the other sensor handles kept by HWiNFO.
A CRASH is not acceptable; it simply means HWiNFO is not error proof and is missing error handling code.
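
To make the timeout-and-skip idea above concrete, here is a minimal sketch in Python. It is not HWiNFO's code and says nothing about how HWiNFO actually reads sensors; the sensor functions (`cpu_temp`, `dimm_temp`, `fan_rpm`) and the 0.1 s timeout are made up for illustration. A user-mode timeout like this can only skip a slow or failing read; it cannot undo a hang that happens at the bus or driver level.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def read_with_timeout(pool, read_fn, timeout_s):
    """Return the sensor value, or None if the read fails or takes too long."""
    future = pool.submit(read_fn)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        return None  # sensor busy or hung: skip this sample
    except Exception:
        return None  # read error: skip this sample


def poll_once(pool, sensors, timeout_s=0.1):
    """Poll every sensor once; a bad sensor yields "N/A" instead of a crash."""
    readings = {}
    for name, read_fn in sensors.items():
        value = read_with_timeout(pool, read_fn, timeout_s)
        readings[name] = "N/A" if value is None else value
    return readings


# Hypothetical sensors, for illustration only.
def cpu_temp():
    return 45.0


def dimm_temp():
    time.sleep(0.5)  # simulates a sensor that is busy for too long
    return 38.0


def fan_rpm():
    raise OSError("simulated bus error")


SENSORS = {"cpu_temp": cpu_temp, "dimm_temp": dimm_temp, "fan_rpm": fan_rpm}


def demo():
    pool = ThreadPoolExecutor(max_workers=4)
    try:
        return poll_once(pool, SENSORS)
    finally:
        pool.shutdown(wait=False)  # do not block on the slow read
```

Calling `demo()` gives a normal value for `cpu_temp` and "N/A" for the other two, and the polling loop carries on. The slow read keeps running in its worker thread until it finishes, since Python threads cannot be killed from outside.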
I'm posting this mostly to let the author know, along with other people who might run into this problem, because it can seriously mess up your PC.
 