NVIDIA PCI Express Error Counters

In the release notes, it says:
  • Added monitoring of NVIDIA PCI Express Error Counters.
but I don't see this anywhere. Where is the option to monitor this?
 
Please attach the HWiNFO Debug File for analysis. It's quite possible that the 50 series works differently.
 
Looks like the 50 series requires an adjustment here. I'll need to look into that...
 
Could you please explain what "Recovery Count" means? And why does its value rise little by little?
 
Doing some testing with my 4090, it seems that Recovery Count is tied to power management: it increases every time the power state goes up or down, probably because the PCI-E link needs to be reset as it changes link speeds. It also seems that Bad TLP Count and NAKs Sent are tied together, as they always match, but I haven't been able to find any events that line up with them. Hopefully more documentation will be forthcoming.
 
Sorry, I posted this in the other thread, so copying here too:
The "Recovery Count" counts the number of transitions from L0 to Recovery. It increments, for example, during a change in link speed or width, or for other reasons that usually don't mean a PCIe error occurred.
The other counters, however, might indicate a problem on the PCIe interface.
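
For anyone logging these values over time, here is a rough sketch of the rule of thumb above: treat increases in Recovery Count as usually harmless and flag increases in any other counter. The counter names are the ones mentioned in this thread; the function name and the snapshot numbers are made up for illustration, and this is not how HWiNFO itself evaluates the counters.

```python
# Counters whose increase usually does not indicate a PCIe error (per the thread).
USUALLY_BENIGN = {"Recovery Count"}


def classify_changes(before, after):
    """Split counters that increased between two snapshots into
    (usually_benign, worth_investigating) dicts of name -> increase."""
    benign, suspicious = {}, {}
    for name, new_value in after.items():
        delta = new_value - before.get(name, 0)
        if delta <= 0:
            continue  # counter did not increase
        if name in USUALLY_BENIGN:
            benign[name] = delta
        else:
            suspicious[name] = delta
    return benign, suspicious


# Hypothetical snapshots, e.g. read manually from the sensor window.
before = {"Recovery Count": 120, "Bad TLP Count": 0, "NAKs Sent": 0}
after = {"Recovery Count": 134, "Bad TLP Count": 2, "NAKs Sent": 2}

benign, suspicious = classify_changes(before, after)
print("usually harmless:", benign)
print("worth investigating:", suspicious)
```

A slowly climbing Recovery Count on its own would land only in the first group; anything in the second group is what might point to a problem on the PCIe interface.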
 