Explaining the AMD Ryzen "Power Reporting Deviation" metric in HWiNFO

The case fans and the CPU fans were all running at full speed. The fan curves are set properly and everything is maxed out, but at full load the CPU still reaches 95°C.
My vcore is 1.45 V at idle and 1.25 V at full load.
Should I lower the vcore in Ryzen Master?

Thanks.
Your vcore isn't the problem; it's at normal levels.
Are you sure the cooler is mounted properly? It isn't butting up against the RAM, is it? Not too much thermal paste?
It's also odd that in your screenshot the 3 current CPU temps are very different. Under load mine are virtually the same, and when idle there's about a 5-10°C difference.
 
Cinebench load with Ryzen Master (monitoring), auto vs. a 4.2 GHz OC set in the BIOS: same result. Auto only has a better EDC reading (but that was a shorter test); all other metrics are worse than with fixed clocks @ 1.1 V (first pic).
Also, at idle the CPU drops to low power even with a fixed OC when using the Balanced plan, and it goes to sleep at low power too. Why bother with boost and PBO? Edit: no bias plus a regular (not auto) load line works best.

4.2 FIXED CB LOAD OC.jpg




STOCK 3600 NO PBO, NO BIAS, AUTO ( WORST)
STOCK.jpg

4.2 "FIXED" IDLE

idle.jpg
 
Back to the original subject: I tested the power deviation with the default stock config (no PBO, no bias, regular load line), and these are the results. I started Cinebench first (to load the CPU), then HWiNFO. Stock: 91.3% deviation (IMG_20200615_211228_0.jpg). Bottom: fixed 4.2 GHz at 1.1 V core (set in BIOS), no PBO, boost or bias: 100.3% power deviation (IMG_20200615_212427_2.jpg).
 


gebhaard

New Member
I checked it, it's mounted properly. I don't know, maybe I put on too much thermal paste, but how does that affect temps?
At idle it's about 30-35°C, but under R20 full load it instantly jumps to 80, then 95°C, and stays there at 3.8 GHz (thermal throttling, I guess).
Can too much paste cause this?
Thanks.
edit: Reapplied thermal paste properly, still 95°C on full load. Why? :(
I've seen people commenting with the same Freezer 34 cooler having 50-60°C temps at full load. :/
What am I doing wrong?
 

Mordeafaca

New Member
Hi guys,

Quick post just to share my results.

Asus Prime X570-P with an R5 3600: no matter what I change (PBO / auto OC / undervolt / default), the best I can get is 85~90% deviation under Cinebench R20 load.
Using the latest BIOS (1407) and chipset drivers.
Am I missing something, or did Asus cheat a bit on this motherboard?

thanks
 


Some questions from my side, just to help...

- Did you remove the plastic protector from the heatsink base?
- Are the fans on the heatsink in a push-pull configuration, not blowing against each other?
- Can you share a picture of your case internals?
- What thermal paste are you using?
- If the above does not help, do you have access to another cooler? If yes, can you try it? You never know, the cooler could be faulty, with heatpipes that are not filled or ......


Come back with some answers on these and let’s see....

UPDATE:
Just checked, and there are also different stand-offs depending on the socket type. Are you sure you did not use the wrong stand-offs? That would lead to insufficient contact with the CPU, which would explain the overheating.
Also, did you use the included spacers? Apparently these are only for Intel sockets, not for AMD!

Also see: https://support.arctic.ac/index.php?p=freezer34&lang=en

From the manual:
Stand-off type:
1592392726866.png

Stand-off orientation:
1592392888509.png
 
Too much paste can act like an insulator, which would push temperatures up of course. Although normally with screw down coolers it would eventually squeeze out the excess, unless it is really thick paste. If you're sure now you've put the right amount on, then follow what Infernoken said.

Hi guys,

Quick post just to share my results.
Asus Prime X570-P with an R5 3600: no matter what I change (PBO / auto OC / undervolt / default), the best I can get is 85~90% deviation under Cinebench R20 load.
Using the latest BIOS (1407) and chipset drivers.
Am I missing something, or did Asus cheat a bit on this motherboard?
thanks
Can't you adjust CPU PPT in the bios?
 

The Stilt

Member
V.I.P.
The "Power Reporting Deviation" metric recently introduced in HWiNFO has raised a lot of discussion among consumers and board manufacturers alike.

In addition to the much-welcomed discussion, it has also raised concerns about the effects it allegedly has on the longevity of the CPU. The alleged and, frankly, unfounded reliability concerns were mostly a creation, or at the very least a heavy exaggeration, of a third party who wrote an article based on my write-up on the subject. While the original write-up does mention the "potential negative effects on the CPUs life-span", this is generally considered an industry-standard disclaimer that is brought up every time anything is run outside of its specs. Unlike the third party's interpretation, the original write-up at no point suggests, or even hints, that there would be any imminent risk of damaging or "burning out" the CPU, the motherboard, or anything else for that matter. Rest assured, had there been any true risk of imminent "burn-outs", it would have been mentioned in the original write-up.

After various discussions with the board manufacturers about the realities of CPU silicon variability, the original telemetry calibration process itself and the tolerances generally involved in motherboard manufacturing, we decided to make a few changes. These will help the user understand the displayed metric at a glance, and also reflect its original purpose a bit better, or at least more fairly, than the first implementation did.

As said before, this feature was not implemented to nag or to go after board manufacturers who might have minor discrepancies in the telemetry either due to manufacturing tolerances, less than perfect initial calibration, or for whatever reason. The feature was and is intended to prevent certain manufacturers from heavily and continuously taking advantage of this exploit. Initially, we suggested ±5% as the maximum allowed deviation to determine if the telemetry had been intentionally biased or not. Based on the realities brought up by the board manufacturers, the facts we know and what we can independently verify, the originally suggested ±5% figure for the allowable maximum deviation was somewhat overly ambitious.

While the most commonly used methods for power measurement, RdsOn and DCR measurements, easily can and typically do provide < ±2% accuracy, there are other factors involved, e.g. CPU silicon variance, motherboard manufacturing tolerances and even ambient conditions, which can affect the accuracy of the readings and cause them to fall outside the originally suggested ±5% window. Based on these factors, and to limit any unfounded accusations towards the board manufacturers, we've decided to increase the suggested threshold for intentional telemetry biasing from ±5% to ±10%. The reporting of the metric itself remains completely unaffected in terms of the formula, since there really is no room for interpretation.


Starting from the HWiNFO v6.27-4195 Beta build (https://www.hwinfo.com/download/), there are the following "Power Reporting Deviation" related changes:


- The suggested telemetry deviation threshold for intentional biasing has been increased from ±5% to ±10%

- Perceivability has been improved by adding colour coding to the displayed figure. Questionable readings (i.e. < 90%) are displayed as blood red, values in range (the rest) remain neutral in colour.

- "Power Reporting Deviation" naming has been clarified and changed to "Power Reporting Deviation (Accuracy)"

- The "Power Reporting Deviation" -metric is now hidden when manual overclocking (i.e. AMD OC-Mode) is used, to reduce the chance for user error in reporting the results. The metric is only accurate when the CPU is in control of all of its parameters (i.e. at stock settings). NOTE: Voltage offsets or load-line changes MUST NOT be present when testing the figure.

- The metric has been disabled on TRX40 platform, since the telemetry is discarded on HEDT and server platforms and hence its accuracy is completely irrelevant.

- An error found in AMD Zen+ (Pinnacle Ridge) "Power Reporting Deviation" has been fixed
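
The colour-coding rule from the list above can be sketched in a few lines. This is only an illustration and not HWiNFO's actual code: the function name `classify_deviation` and its return strings are made up here, and the input is assumed to be an average taken under near-full load at stock settings.

```python
def classify_deviation(average_pct: float, threshold_pct: float = 10.0) -> str:
    """Classify an average Power Reporting Deviation reading (in percent).

    Hypothetical helper: per the change list, only readings below
    100% minus the threshold (i.e. < 90% by default) are flagged;
    everything else stays neutral.
    """
    if average_pct < 100.0 - threshold_pct:
        return "questionable"
    return "in range"


# Readings reported earlier in this thread, used as sample inputs.
for reading in (91.3, 100.3, 85.0):
    print(f"{reading:5.1f}% -> {classify_deviation(reading)}")
```

Passing `threshold_pct=5.0` reproduces the stricter limit that was originally suggested, under which a 91.3% reading would have been flagged.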


Just to re-iterate, for the last time (hopefully):

- The readout is ONLY VALID during a NEAR-FULL-LOAD scenario. The read-out during IDLE, SINGLE THREADED OR EVEN PART-LOAD IS TOTALLY IRRELEVANT since the power draw is anything but constant.

- Use CINEBENCH R20 NT (multithreaded) and NOTHING ELSE. Not because nothing else works, but so that the workload is consistent between the different users. 256-bit workloads, such as Prime95 are also a bad idea, since certain SKUs might hit some of the platform limits during them.

- Test the CPU at STOCK SETTINGS ONLY. The CPU must remain in control of its operating parameters (frequency, voltage). Voltage offsets and load-line adjustments will cause the CPU to deviate from its V/F curve and introduce an anomaly in the readout. The same applies to manual overclocking, since the CPU executes the parameters given by the user. During a manual overclock (i.e. OC-Mode) the accuracy of the power reporting is also completely irrelevant to begin with, since in this mode the CPU isn't making any decisions based on the reported telemetry.
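
For anyone post-processing logged readings, the "near-full-load only" rule above could be applied roughly as follows. This is a sketch under stated assumptions: the samples are hypothetical `(cpu_usage_pct, deviation_pct)` pairs, and the 95% usage cut-off is an arbitrary choice made for illustration, not a figure from HWiNFO.

```python
from statistics import mean


def load_average_deviation(samples, min_usage_pct=95.0):
    """Average the deviation over near-full-load samples only.

    samples: iterable of (cpu_usage_pct, deviation_pct) pairs.
    min_usage_pct: assumed cut-off for "near full load" (hypothetical).
    """
    under_load = [dev for usage, dev in samples if usage >= min_usage_pct]
    if not under_load:
        raise ValueError("no near-full-load samples; the readout is not meaningful")
    return mean(under_load)


# Made-up log: two idle/part-load samples and two full-load samples.
log = [(3.0, 42.0), (100.0, 91.0), (99.0, 92.0), (40.0, 130.0)]
print(load_average_deviation(log))
```

The idle and part-load samples are dropped before averaging, which is the point of the rule: they would otherwise drag the average in either direction.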
 
Thanks for the update The Stilt :).
I (and I'm sure most others) never thought that your original post meant imminent CPU burn out, I don't know what THG were thinking! I didn't think they were like that!?
I wonder how far under reporting it would have to be to do that though? 50%? Lower??

Btw, am I right in thinking that alterations to the RAM's voltages, timings and speed have no effect on PRD?
 

JJJCCC

New Member
Ryzen CPUs for the AM4 platform rely on external, motherboard-sourced telemetry to determine their power consumption. The voltage, current and power telemetry is provided to the processor by the motherboard VRM controller through the AMD SVI2 interface. This information is consumed by the processor's power management co-processor, which is responsible for adjusting the operating parameters of the CPU and ensuring that none of the CPU SKU, platform or infrastructure specific limits are violated.

The weakness of this method is that the telemetry essentially uses an undefined scale for the current (and hence power) measurements. The motherboard VRM controller sends an integer between 0 and 255 to the CPU, and based on the reference value known by the co-processor firmware, this integer is converted to a figure that represents the physical current drawn by the CPU. From the accurately known current flow and the voltage, it is possible to calculate the CPU power draw in watts (V * I).
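
As a rough illustration of the conversion described above, the snippet below turns a raw telemetry integer into a current and a power figure. It assumes a simple linear scale where 255 maps to the full reference current; the post does not spell out the exact scaling, so treat that as an assumption, and the function name is hypothetical.

```python
def telemetry_to_power(raw: int, reference_current_a: float, voltage_v: float) -> float:
    """Convert a raw 0-255 telemetry value into watts.

    Assumption (not confirmed by the post): the scale is linear,
    with raw == 255 corresponding to the declared reference current.
    """
    if not 0 <= raw <= 255:
        raise ValueError("raw telemetry value must be between 0 and 255")
    current_a = raw / 255 * reference_current_a  # assumed linear scaling
    return voltage_v * current_a                 # P = V * I


# Same raw reading interpreted with two different declared references.
print(telemetry_to_power(128, 300.0, 1.25))
print(telemetry_to_power(128, 150.0, 1.25))
```

The second call shows why the declared reference matters: with half the reference value, the same raw reading is interpreted as half the power.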

The reference value mentioned earlier is generally different for each motherboard make and model, unless boards share identical power circuitry. Because of that, it is the motherboard manufacturer's responsibility to find the correct value for their motherboard design through calibration, and then to declare it properly in AGESA at BIOS compile time. If the correct, design-specific value differs greatly from the declared value, there will be a bias in the power consumption seen by the CPU. If the declared value is greater than the actual value, the power consumption seen by the CPU is greater than it actually is. Likewise, if the declared value is an understatement, the CPU will think it consumes less power than it actually does.

Since at least two of the largest motherboard manufacturers still insist on using this exploit to gain an advantage over their competitors, despite being constantly asked and told not to, we thought it would be only fair to allow consumers to see if their boards are doing something they're not supposed to do. The issue with this exploit is that it messes up the power management of the CPU and potentially also decreases its lifespan, because it runs the CPU outside the spec, in some cases by a vast margin. It can also cause issues when the exploit goes undetected by a hardware reviewer, since both the performance and the software-based power consumption figures will be affected by it.

For example, if we take a Ryzen 7 3700X CPU that has a 65W TDP and an 88W default power limit (PPT), and use it on a board which has declared only 60% of its actual telemetry reference current, we'll end up with an effective power limit of ~147W (88 / 0.6) despite running at stock settings (i.e. without enabling manual overclocking or AMD PBO). While the 3700X SKU used in this example typically cannot even reach this kind of power draw before running into other limiters and limitations, the fact remains that the CPU is running far outside the spec without the user even being aware of it. This exploit can also cause additional cost and work for the consumer, who starts wondering about the abnormally high CPU temperatures and begins troubleshooting, initially by remounting the cooler and usually, eventually, by purchasing a better CPU cooler.
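
The arithmetic in the 3700X example can be written out as a small helper. The function name and parameters are hypothetical; the only thing taken from the post is the relationship "effective limit = PPT / declared fraction of the real reference current".

```python
def effective_power_limit(ppt_w: float, declared_fraction: float) -> float:
    """Effective power limit when the board declares only a fraction
    of its real telemetry reference current (1.0 means no bias)."""
    if declared_fraction <= 0:
        raise ValueError("declared_fraction must be positive")
    return ppt_w / declared_fraction


# Ryzen 7 3700X, 88 W PPT, board declaring 60% of the real reference.
limit = effective_power_limit(88.0, 0.6)
print(f"effective limit: {limit:.0f} W")
```

With `declared_fraction=1.0` the helper simply returns the stock PPT, which is the unbiased case.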

HWiNFO will display the "Power Reporting Deviation" metric under the CPU's enhanced sensors. The displayed figure is a percentage, with 100.0% being the completely unbiased baseline. When the motherboard manufacturer has both properly calibrated and declared the reference value, the reported figure should be pretty close to 100% under a stable, near-full-load scenario. A ballpark for a threshold where the readings become suspicious is around ±5%. So, if you see an average value that is significantly lower than ~95%, there is most likely intentional biasing going on. Obviously, the figure can be greater than 100%, but for obvious reasons it rarely is ;)

As stated before, this metric is only valid during a relatively stable near-full-load condition. That is due to the typical measurement accuracy of the VRM controller telemetry, and also due to the highly advanced and fast power management on Ryzen CPUs, which results not only in extremely low idle power, but also in extremely rapidly changing power consumption. A suggested workload to get a stable and reproducible deviation metric is Cinebench R20 NT, with the HWiNFO sample rate set to 1000ms or less.

As of now, outside of certain MSI motherboards, the biasing isn't end-user controllable. In case there is clear evidence of biasing taking place on certain motherboards or their bios versions, please contact the manufacturer and ask them to remove the telemetry biasing from the bios. The biasing can be implemented in different ways, it can be tied to a specific setting(s) (known as an "auto-rule") in the bios or be fixed in a certain bios version or in all available bios versions.

Here is a practical example recorded on an MSI X570 Godlike motherboard, using the most recent 1.93 beta BIOS version.
For this BIOS version MSI has declared a 280A reference current, while the correct value, which produces a near-100% result (i.e. no deviation) and a power draw matching other boards (same CPU and workload), is 300A. This means that the board allows 7.14% (300/280) higher power draw for the CPU than the AMD specifications state. Compared to the worst violators (up to 50%) this is a minor infraction, so MSI deserves the benefit of the doubt as to whether this is intentional or an honest error.

With the proper 300A setting, the average HWiNFO "CPU Power Reporting Deviation" during Cinebench R20 NT is 99.2%.
With this setting, the average CPU core frequency is 4027.4MHz, power consumption seen by the CPU 140.964W (of 142W limit) and peak CPU temperature of 73°C.



With 225A setting (75% of the actual), the average HWiNFO "Power Reporting Deviation" during Cinebench R20 NT is 75.3%.
With this setting, the average CPU core frequency is 4103.5MHz, power consumption seen by the CPU 125.241W (of 142W limit) and peak CPU temperature of 80°C.



With the 150A setting (50% of the actual), the average HWiNFO "Power Reporting Deviation" during Cinebench R20 NT is 50.2%. With this setting, the average CPU core frequency is 4106.6MHz, power consumption seen by the CPU 91.553W (of 142W limit) and peak CPU temperature of 79°C. This setting is already limited by the maximum voltage allowed by the silicon fitness (FIT), so there were pretty much no additional performance gains, or ill effects for that matter, to be had.
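
To put the three runs above side by side, the snippet below estimates the real power draw from the power seen by the CPU and the reported deviation. It is a sketch built on an assumption: that the deviation percentage is roughly the ratio of reported power to actual power, so actual ≈ seen / (deviation / 100). The function name is hypothetical and the results are estimates, not measurements.

```python
def estimate_actual_power(seen_w: float, deviation_pct: float) -> float:
    """Estimate real power draw from the power seen by the CPU.

    Assumes deviation_pct is approximately reported / actual * 100.
    """
    if deviation_pct <= 0:
        raise ValueError("deviation_pct must be positive")
    return seen_w / (deviation_pct / 100.0)


# (reference current setting in A, average deviation in %, power seen by the CPU in W)
runs = [(300, 99.2, 140.964), (225, 75.3, 125.241), (150, 50.2, 91.553)]

for setting_a, deviation_pct, seen_w in runs:
    estimate_w = estimate_actual_power(seen_w, deviation_pct)
    print(f"{setting_a} A: seen {seen_w:.1f} W, estimated actual {estimate_w:.1f} W")
```

Under that assumption, the power seen by the CPU falls as the reference current is lowered while the estimated real draw rises, which fits the higher temperatures recorded for the biased settings.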



I'd like to stress that although this exploit is essentially made possible by something AMD has included in the specification, its use is not something AMD condones, let alone promotes.
Instead, they have rather actively put pressure on the motherboard manufacturers who have been caught using this exploit.

In short: Some motherboard manufacturers intentionally declare an incorrect (too small) motherboard-specific reference value in AGESA. Since AM4 Ryzen CPUs rely on telemetry sourced from the motherboard VRM to determine their power consumption, declaring an incorrect reference value will affect the power consumption seen by the CPU. For instance, if the motherboard manufacturer declared 50% of the correct value, the CPU would think it consumes half the power that it actually does. In this case, the CPU would allow itself to consume twice the power of its set power limits, even at stock. This allows the CPU to clock higher due to the effectively lifted power limits; however, it also makes the CPU run hotter and potentially negatively affects its life-span, the same way overclocking does. The difference compared to overclocking or using AMD PBO is that this is done completely clandestinely, and that in the past there has been no way for most end-users to detect it or react to it.
Given the nature of semiconductors, I agree it will decrease the lifespan.
But I really don't think that full-time turbo alone (24 hours a day, 365 days a year) would exceed AMD's CPU spec.
In some server and workstation workloads, sustained turbo is a real requirement.

It is also common for some system integrators to ship their systems with overclocked settings.
 

Skara_Brae

New Member
A "Hello" to all.

I have a 3700X (coming from an R5 1600 only a few weeks ago) with the stock Wraith Prism on an ASUS Prime X370-A motherboard (BIOS upgraded to 5204 to get the 3700X to work), together with an ASUS RTX 2060 Super "Dual" and 2 x 8 GB RAM.

Windows Power Management is "AMD Ryzen Balanced".

No overclocking (I have never ever done that, nor am I interested in it), except for an activated D.O.C.P. in the BIOS to get my 16 GB "HyperX" RAM to run at 3200 MHz instead of 2400 MHz. With my R5 1600 (and without any other meddling in the BIOS), only 2666 MHz or lower was stable.

A run of Cinebench R20.060 in the "4195" Beta shows:

- 87.3 % (Current)
- 86.5 % (Minimum)
- 126.0 % (Maximum)
- 92.7 % (Average)

A second run shows:

- 85.2 % (Minimum)
- 136.8 % (Maximum)
- 95.8 % (Average)

Another run gives:

- 87.0 % (Current)
- 85.2 % (Minimum)
- 153.7 % (Maximum)
- 105.9 % (Average)

(Two results of CB R20 give 4643 and 4660, for those wondering.)

So, this is not "good", as I understand it?

But since I do not plan to ever overclock (and I am just a "casual" gamer, and this 3700X is overkill for me, really :) ), my 3700X will probably be fine for many years to come (...knock on wood...), so I probably should not have to worry much, right?
 


Petr Borodkin

New Member
Hello!
Could you please give me a link to download hwi_627_4185.zip (the old link is not working)?
Currently, there is only the hwi_627_4195.zip version, but it does not work for me. It does not show "power reporting deviation" for my EPYC CPUs (they removed this reporting for sTRX4 systems for some reason; my CPU is not sTRX4, but EPYC CPUs have the same microcode as sTRX4 CPUs).
So, please give me a link to the 627-4185 version.
Thanks.
 

Martin

HWiNFO Author
Staff member
It does not show "power reporting deviation" for my EPYC CPUs (they removed such a reporting for sTRX4 systems for some reason.
This feature was removed for valid reasons, and sTRX4 is similar to SP3 systems: the "Power Reporting Deviation" has no meaning on either.
 

wawans1975

Well-Known Member
....
Starting from HWiNFO v6.27-4195 Beta build (https://www.hwinfo.com/download/) there are following "Power Reporting Deviation" related changes:
....

- The "Power Reporting Deviation" -metric is now hidden when manual overclocking (i.e. AMD OC-Mode) is used, to reduce the chance for user error in reporting the results. The metric is only accurate when the CPU is in control of all of its parameters (i.e. at stock settings). NOTE: Voltage offsets or load-line changes MUST NOT be present when testing the figure.
.....
Hi Martin,
FYI, I manually set the CPU clock and its voltage, and the Power Reporting Deviation is still not hidden.
 


I know the number when idle is meaningless. But, at idle mine is averaging about 42% while my wife's identical system averages someplace above 100% while idle. Does that indicate there's some difference between the systems or that the number really is TRULY meaningless at idle?
 

Martin

HWiNFO Author
Staff member
That's tough to say precisely. Idle can mean several different states including low-power states with different fluctuations. That can have significant impact on power reporting.
 
I double-checked the wife's numbers. It's a bit different from what I thought. These are both Asus ROG Crosshair VIII Hero (X570) with AMD Ryzen 7 3700x:

Mine (Cinebench): Min = 80.0, Ave = 81.5
Hers (Cinebench): Min = 77.1, Ave = 78.4

Mine (Idle): Min = 32.9, Ave = 42.6
Hers (Idle): Min = 84.9, Ave = 94.9

So, at load, the two systems are fairly close (though bad). At idle, hers is within spitting distance of the load number, but mine is around half the load number. I've set up these systems using the same settings in BIOS and in Windows (I thought I did, anyway). It's odd there'd be that much difference in the idle numbers. But, you're right. There could be all kinds of things going on in the background to move things away from truly idle.
 