Z390 Gigabyte bug report (erratic HWinfo readings)

falkentyne

Active Member
Hi Martin!
On my Z390 Aorus Master, when I use HWiNFO to monitor voltages and current while a Direct3D game is running, some readings get thrown off and become inaccurate.

Notably, the ones I notice are:
GPU MHz (speed), GPU voltage, GPU power (watts) (AMD Vega 64),
and
Current IOUT, Power POUT, VR VIN (CPU IR 35201).

The ITE 8792E sensors seem fine, as does the Super I/O (ITE 8688E).

Oddly enough, VR VOUT on the IR 35201 appears unaffected and seems to read fine.

For example, VR VIN is the CPU +12V sensor.
Usually this reads 11.906-11.938 V at most, 11.875 V during a Prime95 AVX load, and 11.813 V in the worst case (talking about some 200 amp shenanigans here).

However, when running a 3D game, VR VIN goes up to 12.000 V, even in something as light as Overwatch.
This doesn't happen at all in Realbench 2.56, Cinebench, etc. It happens *ONLY* if a fullscreen D3D game is running.

Power POUT (the IR 35201 power register) gets wild swings too. However, "CPU Package Power" remains nice and steady, as expected.

Current IOUT has wild swings as well, sometimes jumping to 125 amps (there's no way Overwatch even comes close to pulling this much) and then back down to 40 amps.
The GPU readouts above are affected too.

However, MSI Afterburner reads the GPU speed, voltage and power perfectly accurately. You can see this with both MSI Afterburner and HWiNFO running at the same time: HWiNFO is all over the place with the GPU speed (usually a 25-50 MHz difference, but sometimes dropping down to around 1000 MHz and then back up to 1600), and the voltage/power readings fluctuate, while MSI Afterburner stays on target.

Is this some sort of bug or issue that might be addressable?

(Again, this only happens in games.)
There is no problem in normal Windows benchmarks or stress tests, where everything is dead accurate.
 
The IR 35201 values come straight from the VRM; I can hardly imagine that a full-screen game could influence the accuracy of the values read.
But you might try a test - raise the priority of HWiNFO in Task Manager and see if that makes any difference.

Regarding the GPU: I'm not sure exactly how MSI AB retrieves the values from the GPU, but the difference might be because AB uses mean values - an average over a short period of time. It's normal for parameters to fluctuate at a high frequency, and it's not easy to report the actual status if you can't sample the values every microsecond. HWiNFO also switched to a new (and more accurate) interface for retrieving AMD GPU parameters some time ago. I'm not sure whether MSI AB supports it, or whether your drivers do (I'd need to see the HWiNFO Debug File). So the question is - why do you think MSI AB is right and HWiNFO is not?
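To illustrate the averaging point (a minimal sketch with made-up numbers - this is not how HWiNFO or MSI AB actually sample the GPU): a tool that reports single instantaneous samples will show far more jitter than one that reports the mean over its polling window, even though both read the same underlying signal.

[code]
import random

# Illustrative only: a rapidly fluctuating "true" GPU clock, sampled two ways.
random.seed(0)
true_clock = [1650 + random.uniform(-80, 80) for _ in range(1000)]  # MHz, fast fluctuations

# Method 1: report one instantaneous sample per polling interval.
instantaneous = true_clock[::100]

# Method 2: report the mean of all samples inside each polling interval.
averaged = [sum(true_clock[i:i + 100]) / 100 for i in range(0, 1000, 100)]

print("instantaneous:", [round(v) for v in instantaneous])  # jumps around a lot
print("averaged:     ", [round(v) for v in averaged])       # much steadier
[/code]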
 
Martin said:
The IR 35201 values come straight from the VRM; I can hardly imagine that a full-screen game could influence the accuracy of the values read.
But you might try a test - raise the priority of HWiNFO in Task Manager and see if that makes any difference.

Regarding the GPU: I'm not sure exactly how MSI AB retrieves the values from the GPU, but the difference might be because AB uses mean values - an average over a short period of time. It's normal for parameters to fluctuate at a high frequency, and it's not easy to report the actual status if you can't sample the values every microsecond. HWiNFO also switched to a new (and more accurate) interface for retrieving AMD GPU parameters some time ago. I'm not sure whether MSI AB supports it, or whether your drivers do (I'd need to see the HWiNFO Debug File). So the question is - why do you think MSI AB is right and HWiNFO is not?

Good morning Martin!
I know MSI Afterburner is correct because the framerate (FPS) is not fluctuating in those areas (maybe a couple of FPS up and down) when vsync is off (Heaven, Valley), and never when vsync is on.
And the GPU clock speed shows a consistent 1640-1673 MHz depending on temps and load (MSI Afterburner shows a steady, slow drop in speed as temps increase). :)

With HWiNFO (even running at the same time, so the RTSS overlay is on), HWiNFO will randomly show the GPU speed go up to 1760 MHz (which is impossible; the highest boost clock I get at 40C is 1683 MHz), or down to 1473 MHz - wild swings like that - and the same for GPU power usage, etc. Afterburner shows it steady (tested this in Valley, Heaven, Overwatch, etc.).
In Apex Legends, HWiNFO was even showing the GPU clock randomly dropping to 80 MHz and then back up.

The GPU is set to 1702 MHz (maximum), which you never reach under load due to droop. The highest clock Afterburner ever shows is 1683 MHz (at 40C).

These same swings happen with the IR 35201 sensors (Current IOUT, Power POUT and VR VIN specifically), which is why I was guessing it was something with HWiNFO.
If it were the Gigabyte board itself, why would it also happen with the AMD card (clock swings, power usage swings)?

Again, this ONLY happens when a Direct3D game is running. Never in Windows or a Windows benchmark.
For example:

Running Prime95 (AVX) in Windows: the IR 35201 Current IOUT is a nice, stable, predictable value (+/- 2 amps).
VR VIN (CPU 12V) is also nice and solid (11.875 V).

Idle VR VIN is 11.913 V.

But running Overwatch:
The CPU +12V goes from 11.813 V to 12.000 V.
And Overwatch only puts a very light load on the CPU.

By comparison, if I run Prime95 on 2 or 4 threads, this doesn't happen at all.
So this issue, whatever it is, happens *only* in Direct3D applications!

If I run a Windows benchmark (like Realbench 2.56), HWiNFO64 and MSI Afterburner report the same values.

Changing priority doesn't help.

*Edit*
I know it's completely different hardware, but on my laptop (GTX 1070 MXM + 7820HK), MSI Afterburner and HWiNFO64 are both 100% accurate on GPU speed and power usage.
 
Thanks for the detailed feedback.
Which Radeon driver version are you using? Have you observed this issue with older drivers too?
Please attach the HWiNFO Debug File with sensor data so I can have a more detailed look at what exactly happens there.
 
Radeon drivers are 19.2.2 (Feb 12th). I don't have any other drivers.
I'll get the HWiNFO debug file.

*Edit*
Where is the log/debug file stored?
Do I just enable "log all values for report (consumes memory)", or do I have to enable the automatic logging hotkey (or both)?

Found it.
This was just an Overwatch skirmish.
I'll take an Apex Legends one and a Realbench 2.56 one also.
 

Attachments

Martin said:
Sorry, but that's not the Debug File. Please see here how to create it: https://www.hwinfo.com/forum/Thread-IMPORTANT-Read-this-before-submitting-a-report
Also, it might be worth trying to upgrade to the latest GPU drivers.

Here's a debug from Realbench 2.56

Notice the GPU clock is much too high (it can't reach 1750 MHz; the maximum clock set in Afterburner is 1700 MHz).
CPU +12V and Power POUT are reading accurately, however (11.938 V maximum for CPU +12V on the IR 35201).
 

Attachments

Here's the apex legends debug file.
You can see the CPU +12V shoots higher in fullscreen Direct3D now (and Power POUT on the IR 35201 is erratic, while CPU Package Power is much more stable).
 

Attachments

Thanks, I can see it now.
But all the values you see in HWiNFO are exactly what's read from the hardware. I have also verified the output from another GPU interface, and that reports clocks up to 1740 MHz.
So I'm really not sure how MSI AB handles it, or whether it perhaps applies some filtering.
Can you please try HWiNFO version 6.02 or older (you can download it here: https://www.fosshub.com/HWiNFO-old.html) and see what GPU clocks it reports?
 
Martin said:
Thanks, I can see it now.
But all the values you see in HWiNFO are exactly what's read from the hardware. I have also verified the output from another GPU interface, and that reports clocks up to 1740 MHz.
So I'm really not sure how MSI AB handles it, or whether it perhaps applies some filtering.
Can you please try HWiNFO version 6.02 or older (you can download it here: https://www.fosshub.com/HWiNFO-old.html) and see what GPU clocks it reports?

It's the same (I still have 6.00!).
The other interesting thing is the VR VIN (CPU +12V) reading.
In Windows/Prime95/Realbench/Cinebench etc., it ranges from 11.938 V (idle) to 11.875 V (load). That's what I expect, and the results are proper.
But when I run a 3D game (even a light one with low CPU load), it ranges from 11.813 V to 12.000 V (!)

I know something is strange here, because Power POUT (also on the IR 35201) shows wild fluctuations as well (like 25 W to 113 W!) - in 3D games only, like Overwatch.
But CPU Package Power shows 35-50 W, very steady.
Both show proper readings in Prime95, for example (and Realbench) - not the exact same readings, but no fluctuations.

CPU Package Power = VID * Amps. (VID can be influenced by AC Loadline, the CPU/cache multiplier, and DC Loadline (mOhms).)
Power POUT = VR VOUT * Amps.
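As a rough illustration of how those two formulas can diverge (the numbers below are made up; real VID and droop behavior depend on the loadline settings mentioned above):

[code]
# Illustrative only - made-up numbers, not measured values.
vid     = 1.250   # V, voltage requested by the CPU (shaped by AC/DC Loadline)
vr_vout = 1.212   # V, actual VRM output after droop (VCC_Sense)
amps    = 40.0    # A, current draw (IOUT)

cpu_package_power = vid * amps      # 50.0 W - based on the internally tracked VID
power_pout        = vr_vout * amps  # 48.5 W - based on the measured output rail

print(f"CPU Package Power = {cpu_package_power:.1f} W")
print(f"Power POUT        = {power_pout:.1f} W")
[/code]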

It's really not a big deal anyway. I use the RTSS OSD overlay, so I like to see the CPU VR-reported power consumption while playing games.
But with it fluctuating like that, it's only useful on the desktop (no 3D games running).

Anyway, thank you for looking at this for me, Martin!

I'll update the GPU drivers later anyway. I'm always careful about updating drivers because of new bugs (if it ain't broke, don't fix it).

----------

Here is a one-minute Prime95 FMA3 15K run at 4.7 GHz (150 amp load).
You can see the VR VIN and Power POUT respond perfectly, with no fluctuations.
 

Attachments

Here is a "low load" test in Overwatch training mode (coop vs AI).
Notice the CPU +12V in IR 35201 (VR VIN) spikes to 12.000v here?
(and power POUT also seems to be erratic, compared to CPU Package Power).
 

Attachments

By "the same" on v6.00, do you mean the GPU clock? Is it showing it out of range too?
CPU Package Power is measured internally by the CPU and will never match the VR values. It's also quite possible that the value is an average over a short period, so it doesn't show intermittent spikes.
 
Martin said:
By "the same" on v6.00, do you mean the GPU clock? Is it showing it out of range too?
CPU Package Power is measured internally by the CPU and will never match the VR values. It's also quite possible that the value is an average over a short period, so it doesn't show intermittent spikes.

Yes, sorry, I've been playing video games.
Yeah, 6.00 shows the same readings as the current version.
But they never show up in MSI Afterburner.

My question is: why do the VR VIN spikes happen?
They happen only under Direct3D loads.

In a Prime95 stress test or Realbench 2.56, for example, they never happen.
(The GPU one does, though.)

I wanted to monitor the VR Power POUT register so I could measure my exact CPU power usage, which is also tied to VR VOUT (the VCC_Sense on-die voltage).

CPU Package Power is related to VID * Amps, and the VID can be influenced by IMON Slope/Offset and by the AC Loadline value (while the droop on the VID comes from the DC Loadline value).
 
Sorry, I don't have an explanation for the VRM values. All I can say is that what you see is what the VRM measures.
Also note that the VRM doesn't cover the whole CPU power usage - there are other rails feeding the CPU.
I'm more curious about the GPU clock anomaly and wondering whether MSI Afterburner perhaps applies some filtering.
 