[Workaround] Hw64 GPU power sensor - Strange behaviour - suddenly stops reading values
#1
Googled, but only found other issues that are not really similar.

This is a new build under test - no OC - stock values - fresh W10 64 install.
Since August 14, Hw64 and MSIAfterburner have worked fine.

- Yesterday I replaced a defective PSU.

Today I got this strange event. I cannot pinpoint what it is related to.

_______________________________ Programs __________
Running Hw64 + MSIAfterburner graphs to check and control system.
Running programs: Opera , NiceHash both CPU and GPU.
_______________________________ Drivers  ___ BIOS
Last BIOS from MSI flashed 8 days ago.

After a few "Reboot after Bugcheck" errors pointing to 399.07 software,
  I DDU'd back to 398.98 Nvidia Drivers.
____________________________________


As far as I can tell from the graphs, there is an event in the mining software (like finishing one job and starting another).
Suddenly the GPU power (W) value in the graph stops moving,
 while in Hw64 it stays readable but is greyed out and no longer changes.

Event Viewer reports some errors like the one below....
  So far I have not been able to check whether these events have the same timestamps. (White runs unattended for hours)
Code:
Log Name:      System
Source:        Display
Date:          2018-09-02 04:03:50 PM
Event ID:      4101
Task Category: None
Level:         Warning
Keywords:      Classic
User:          N/A
Computer:      White
Description:
Display driver nvlddmkm stopped responding and has successfully recovered.
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
 <System>
   <Provider Name="Display" />
   <EventID Qualifiers="0">4101</EventID>
   <Level>3</Level>
   <Task>0</Task>
   <Keywords>0x80000000000000</Keywords>
   <TimeCreated SystemTime="2018-09-02T14:03:50.892202800Z" />
   <EventRecordID>17407</EventRecordID>
   <Channel>System</Channel>
   <Computer>White</Computer>
   <Security />
 </System>
 <EventData>
   <Data>nvlddmkm</Data>
   <Data>
   </Data>
 </EventData>
</Event>
 

I need to understand whether this points to a hardware issue on the GPU card, or whether it's software related. (The card is new.)
Rebooting brings everything back, until it stops again.

I'm short on ideas where to look....   Many thanks for your ideas!


Alain

I cannot find where to edit a signature in this forum ... so far :

System Name WHITE:  i7-8700k - Crucial Ballistic 2x4G 2600 - Gigabyte G1 GTX1080 (driver 398.98) 
- M.2 WD120 - SSD Samsung Evo120 - HDD WD 7200 1T - Seasonic Platinum SSR750PX
- MSI Z370 Gaming Plus MS-7B61 - BIOS  1.50, 2018-07-05
- Itek IceBlack 240RGB AIO - CoolerMaster H500P mesh White
-  a White PC  with  55" 4k + 32" FHD black Samsung's Monitors and  Black Sound+ Samsung Soundbar
.

System Name BLACK: Intel Q9400 - Xtreme 4x2G 1333 -Asus P5P43 PRO - Asus GTX560Ti - Black Stock case - OEM PSU 750W 
- a Black PC (built in 2008) with a 37" black Samsung FHD


#2
The GPU Power value is read via an NVIDIA-provided interface, so NVIDIA is completely responsible for it.
Hence I believe this is a problem with the NVIDIA drivers. There are a lot of issues with recent drivers, including various BSODs.
#3
Thanks Martin... My thought too.

This is my thread in the Nvidia forums:
Own Nvidia related thread

I've already DDU'd back to 398.98 - ready to jump further back to the Gigabyte 391 drivers.
They suggested rolling back to a driver known to be stable and problem-free.

Alain

PS: this is my latest reply to Gigabyte direct support:

Code:
You are right on all points.

Testing in another "fresh" environment... that's a hurdle, but it can be solved. I would try some active steps first... and that will be the last test before sending it back.
---
What is stranger: the card is perfect and works fine -- then after a sleep (some time after, or even immediately) it starts to blackscreen - log-out/log-in doesn't do anything, but after a restart everything is spotless again.
Before the MB BIOS update, it was 10 times worse.

Right now I'm planning this schedule:

Monday I should receive - PSU replacement - Kryonaut paste
So before Monday:
a) reseat/replug GPU - test - reboot - stress tests - run 1 day

PSU arrived
b) don't change anything else, just replace the PSU (test - reboot - stress tests - run 1 day)

c) reseat the waterblock with new paste (don't laugh, I started sneezing while .... trying to keep it flat on the CPU with only one screw fastened!) (test - reboot - stress tests - run 1 day)

If nothing changes:

d) prepare "Black" with a fresh install, swap out the GTX560Ti, plug in the G1 (boot up - start to pray - test ... etc..)

Apart from d), I have to do all these steps anyway, so I will track events, hoping one of them solves the situation.

For instance: yesterday AND today the system rebooted ...
(MS Event Viewer says "BSOD or stopped, reason unknown")
!! while in sleep state !!

Let's hope for the best, thanks for your support!
#4
R391 drivers might do it, AFAIK those problems started with R397.
Let us know if that downgrade solved the problem.
#5
Ok, I will.
Maybe in 4 hours or so I'll DDU the 398 ... already downloaded the 391    384.76 from Gigabyte.

A few hours ago I got several very short blackscreens, but nothing in Event Viewer.
Meanwhile I've added a secondary screen.

Only the primary was involved in that sequence.
Very seldom both go black together, like when changing some option in Display settings
(it could also be the 55" 4K monitor causing the black on that one - I'm doubtful, as the other is also a Samsung)

Alain
#6
Well, something moved inside my head.
Some block shifted, some hidden experience surfaced:

When Hw64 goes AWOL, the fans also spin a little more than expected ...

Hmmm.... Diagnosis ... Hw64 is reading bible pages instead of sensors >then>
> MSIAfterburner gets something different from expected and reacts as best it can.

Proof... oh .. yes .. How can I prove this line of thinking is correct without rebooting (that solves it, I know)?

Went to Task Manager, checked everything connected to HW64 (not much), killed HW64.
It came back after a few seconds, correctly displaying the value that was previously stuck and greyed out.
MSIAfterburner's graphs now also show correct current values. And LO!

The fan goes back to a better sounding level!

It was simple to solve that part: "how to get them back working"
>>Now the hard part is:
Even with the Nvidia drivers going crazy for a moment (but working fine after), why doesn't HW64 reset itself?
Or better, why does it stop reading the sensors' current values and go for the bible?

:-)

The computer world was fun, once upon a time.
It is still the same crazy world today, but nobody laughs at it anymore!

Alain
#7
If MSI AB is taking information from HWiNFO as the source for fan control then an invalid readout will certainly impact the speeds.
I believe there's something wrong in NVIDIA's drivers which provide values to HWiNFO.
Still wondering if R396 or older drivers will fix this issue...
#8
I confirm that.
MSIAfterburner uses HW64 as its sensor source.
I didn't install Aida, and excluded the other sources as well.

I also confirm your belief ...
I just got the news from Nvidia that the problem started with 391.
That means that, for stability, I will be forced back to Gigabyte 384.76.
Gigabyte also told me they stopped there for some reason :-)

I will keep an eye on Gigabyte's drivers, and stay safe and sound with the 384.
When that number changes I will update the Nvidia driver :-)))

Meanwhile it's BIG time to kill W Updates. ... I get one every 3 days.... "please reboot" .... and then something stops working.

AAAAARGH

W10 Home ... not much to do other than edit the Registry or run a specific KillTask.

The latter is fun. You can tell it:
"Sir, when you see a Wupdate process you are authorized to shoot it, if you miss then..
... shoot it again...
and if it doesn't disappear immediately... kill it ! "

:-))

Alain
#9
Update:
Went to driver 384 around 1pm

A few hours later I got problems again.

I pinned the HW64 stop to the moment I get this error in Event Viewer.

Meanwhile I had also updated HW64...

When it's stuck I can kill it, it will restart and everything is back to normal.. till next time.
Boring.

Alain

Code:
Log Name:      System
Source:        nvlddmkm
Date:          2018-09-04 06:28:54 PM
Event ID:      13
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      White
Description:
The description for Event ID 13 from source nvlddmkm cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

If the event originated on another computer, the display information had to be saved with the event.

The following information was included with the event:

\Device\UVMLiteProcess4
Graphics Exception: ESR 0x514648=0x119000e 0x514650=0x4 0x514644=0xd3eff2 0x51464c=0x17f

The message resource is present but the message was not found in the message table

Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
 <System>
   <Provider Name="nvlddmkm" />
   <EventID Qualifiers="49322">13</EventID>
   <Level>2</Level>
   <Task>0</Task>
   <Keywords>0x80000000000000</Keywords>
   <TimeCreated SystemTime="2018-09-04T16:28:54.165327000Z" />
   <EventRecordID>19230</EventRecordID>
   <Channel>System</Channel>
   <Computer>White</Computer>
   <Security />
 </System>
 <EventData>
   <Data>\Device\UVMLiteProcess4</Data>
   <Data>Graphics Exception: ESR 0x514648=0x119000e 0x514650=0x4 0x514644=0xd3eff2 0x51464c=0x17f</Data>
   <Binary>0000000002003000000000000D00AAC0000000000000000000000000000000000000000000000000</Binary>
 </EventData>
</Event>
#10
So you have also updated HWiNFO to v5.88 ?
#11
(09-04-2018, 07:12 PM)Martin Wrote: So you have also updated HWiNFO to v5.88 ?

Yes, I got notified there was a new version around 4 pm.
Downloaded and installed (I had to uninstall previous version manually).

Alain
#12
I keep resetting HW64 manually (icon in taskbar, right-click "Quit")

Any better idea / solution for having it reset by itself?

Alain

In the first reset, I got back the GPU Power (W) sensor (yellow line)
In the second, the System3 fan (green line)


Attached Files Thumbnail(s)
   
#13
You might create a batch file to perform those actions, this would involve killing the process using
taskkill /im HWiNFO64.exe
and then starting it again.
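
A minimal sketch of that kill-and-restart step, written in Python rather than as a batch file so it can be checked in isolation. The install path in `HWINFO_EXE` is an assumption (adjust it to wherever HWiNFO64.exe actually lives), and the 3-second pause is an arbitrary guess, not a documented requirement. HWiNFO normally needs administrator rights, so this is assumed to run from an elevated session.

```python
import subprocess
import time

# Assumption: a typical install location; change to the real path on your system.
HWINFO_EXE = r"C:\Program Files\HWiNFO64\HWiNFO64.exe"
IMAGE_NAME = "HWiNFO64.exe"


def kill_command(image_name=IMAGE_NAME, force=False):
    """Build the taskkill command line quoted in the post above."""
    cmd = ["taskkill", "/im", image_name]
    if force:
        cmd.append("/f")  # /f forces termination if the normal close request is ignored
    return cmd


def restart_hwinfo(exe_path=HWINFO_EXE, wait_seconds=3.0,
                   run=subprocess.run, launch=subprocess.Popen, sleep=time.sleep):
    """Stop HWiNFO64, pause briefly, then start it again."""
    run(kill_command(), check=False)  # a non-zero exit usually just means it was not running
    sleep(wait_seconds)               # give the old process time to exit
    return launch([exe_path])         # start a fresh instance without waiting for it
```

Calling `restart_hwinfo()` on the affected machine would perform the same two actions as the batch file; the `run`, `launch` and `sleep` parameters exist only so the sequence can be exercised without touching a real process.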
#14
Hmm. Yes, good idea.
A one-click solution is better than a 5-click one!

How to make it..
... automatic when HW64 loses a sensor?

Right now I only discover I've lost a sensor by looking at Afterburner's graphs.
Any way to check it inside (or outside) without a human looking at a screen?

Alain
#15
If you know what the wrong sensor values are, you can set up an Alert in HWiNFO which will call that batch file.
#16
Circuitous Solution :-)

Yes, the sensors "GPU Power (W)" and "System3" (fan) - these are the two that usually get garbled (no reason given)

Yes, I know IFTTT (so far I have never used IFTTT on a PC, only for home-automation triggers); if that's what you mean, I have no idea how to start a BAT from IFTTT.

But maybe you are suggesting a "direct alert"??
In that case a problem arises:
GPU - HW64 error:
The sensor is present and the power value can be present, say 93W, but that number will never change and is greyed out.

FAN - HW64 errors:
- the value is not shown, nor is the sensor
- the sensor is shown, the value is blank
- the sensor is shown, the value is present, will not change and is greyed out

B U T > If zero, the value is correct - that fan was thermally stopped.

I'll try to think of a valid solution... thanks for the leads!!

Alain
#17
Oh, well, in such a case I don't know how to do that. Grey values mean that HWiNFO is unable to read data from the sensor. If the value is initially not shown, it also means the value is either 0 or can't be read.
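
For the stuck-value case described above, one possible check outside HWiNFO is to look at a short history of readings and flag a run of identical non-zero values. This is only a sketch: where the readings come from is left open (a sensor log file would be one hypothetical source), `is_stuck` and `min_samples` are illustrative names, and a value that is legitimately constant would be flagged too.

```python
def is_stuck(readings, min_samples=5):
    """Return True if the most recent readings look frozen.

    readings: list of numbers (or None for a blank value), oldest first.
    A steady 0 is treated as valid, because a thermally stopped fan
    really does report 0.
    """
    if len(readings) < min_samples:
        return False                  # not enough history to judge
    recent = readings[-min_samples:]
    if recent[-1] is None:
        return True                   # blank value: the sensor could not be read
    if all(r == 0 for r in recent):
        return False                  # steady zero is a legitimate reading
    return all(r == recent[0] for r in recent)
```

With GPU power under a mining load the value normally fluctuates from sample to sample, so five identical readings such as 93 W in a row would be a reasonable hint that the sensor has stopped updating.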
#18
Maybe I can check for Windows errors,
like "Display driver nvlddmkm stopped responding and has successfully recovered."
which I can find in Event Viewer.
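
A rough sketch of that check, assuming the event is available as the same "Event Xml" text posted earlier in the thread. The `TRIGGERS` set simply lists the two provider/ID pairs that were quoted here; treating them as restart triggers is an assumption, and `event_key` / `is_driver_reset` are illustrative names.

```python
import xml.etree.ElementTree as ET

# Namespace used by the Event Xml shown earlier in this thread.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# (provider, event ID) pairs quoted in this thread; using them as
# restart triggers is an assumption, not something HWiNFO defines.
TRIGGERS = {("Display", 4101), ("nvlddmkm", 13)}

# Trimmed copy of the first event posted above.
SAMPLE = """
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
 <System>
   <Provider Name="Display" />
   <EventID Qualifiers="0">4101</EventID>
   <Channel>System</Channel>
 </System>
</Event>
"""


def event_key(event_xml):
    """Return (provider name, event ID) from one event's XML text."""
    root = ET.fromstring(event_xml.strip())
    provider = root.find("e:System/e:Provider", NS).get("Name")
    event_id = int(root.find("e:System/e:EventID", NS).text)
    return provider, event_id


def is_driver_reset(event_xml):
    """True if the event is one of the driver errors seen in this thread."""
    return event_key(event_xml) in TRIGGERS
```

Here `is_driver_reset(SAMPLE)` is true for the 4101 warning, while any other provider or ID would be ignored.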

Alain
#19
Here we go :-)

[taskkill /im HWiNFO64.exe] >> -EV.BAT
--- > Managed by an Event Viewer trigger.
That's a default Event Viewer feature. Right-click on any kind of event.. "Attach Task To This Event..."
see pics
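
The same kind of trigger can probably also be registered from the command line with `schtasks` and its `ONEVENT` schedule type. The sketch below only builds the argument list; the task name and the batch-file path in the usage note are placeholders, and the default provider and event ID are the ones from the first log posted in this thread.

```python
def on_event_task_command(task_name, script_path,
                          provider="Display", event_id=4101, channel="System"):
    """Build a schtasks command that runs script_path when a matching event is logged."""
    # XPath filter on the event's System section: provider name plus event ID.
    query = f"*[System[Provider[@Name='{provider}'] and EventID={event_id}]]"
    return [
        "schtasks", "/Create",
        "/TN", task_name,      # name shown in Task Scheduler
        "/TR", script_path,    # program or batch file to run
        "/SC", "ONEVENT",      # trigger on an event-log entry
        "/EC", channel,        # event log channel to watch
        "/MO", query,          # which events in that channel should fire the task
    ]
```

For example, `on_event_task_command("RestartHWiNFO", r"C:\Scripts\restart-hwinfo.bat")` (both names hypothetical) yields a list that could be passed to `subprocess.run` from an elevated session to create the task.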

Alain


Attached Files Thumbnail(s)
       
#20
That was a great idea! :-)