WHEA - PCI/PCIe Bus Errors getting higher and higher

AngelW

New Member
Hello everyone!

Recently I got a BSOD with UNEXPECTED_STORE_EXCEPTION while playing Tom Clancy's The Division 2 (most likely I need to reinstall my OS, since I haven't done so in the past 6 months). Before doing that, I used HWiNFO64 to check the temperatures and frequencies, and whether my processor was thermal throttling or anything else. Nothing unusual, everything behaved normally. My processor is cooled with liquid metal (LM). I should mention that this is a laptop, a Lenovo Legion Y720, and yes, I know LM is not recommended for laptops, but I don't travel much now, and when I did I used regular Arctic MX-4 instead. I have been running the laptop with LM for 6-7 months, so I doubt the LM application caused any damage. Still, I remember seeing this error about a year ago (again, I was not using LM back then), and I noticed something odd: a lot of PCI/PCIe Bus Errors were stacking up, and I really don't know what they mean.

My laptop works perfectly fine, even with a crazy overclock on the GPU (100 MHz core clock / 500 MHz memory clock boost), and no crashes or anything else occur. The BSOD appears ONLY after I game for about 3-4 hours in Division 2, and even then it occurs randomly, always with the same error, so it might just be the OS needing a fresh installation. Since I like to troubleshoot problems myself first, I searched for WHEA and found a TON of identical entries in Event Viewer with Event ID 20, all without any information (apart from the raw data and similar fields) and categorized as Information. Can somebody help me decipher these errors and see what the problem might be? I will also come back tomorrow, after a fresh installation of Windows and the drivers, and tell you if anything changes (most likely the problem will still occur, since I know these errors from a year ago).

Also, before I forget, these are my rig's specs:

CPU: i7-7700HQ 2.8 GHz (TB to 3.8 GHz)
GPU: GTX 1060 6 GB non-Max-Q
RAM: 16 GB 2400 MHz DDR4
Storage: ADATA XPG SX6000Lite SSD & Seagate Barracuda 2 TB HDD

Almost forgot: the temps are fine for a laptop, even with that overclock on the GPU. The CPU reaches 83 °C max (only in Division 2 during heavy action), and for the GPU I recorded 77-78 °C max, with the card still maintaining its clock speeds (again, only in Division 2). In other games the temps are 5-7 °C lower, so I don't think it's an overheating issue. Though, I might be wrong.

Again, if somebody could help me with this problem I would be grateful.

Thanks!

PS: I attached a .DBG file from when the errors started occurring. I wanted to upload one recorded while I was playing Division 2, but it is always over 10 MB and the server does not allow me to upload it.
 

Attachment: HWiNFO64.DBG (1.5 MB)

WHEA errors can also be caused by other instabilities, such as problems in hardware or drivers.
Yours seem to be generated by the "Skylake PCH-H - PCI Express Root Port #9" device, to which the NVMe controller is attached. I'd recommend checking the SSD/NVMe drive.
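
If you want to pull those entries out of the log without clicking through Event Viewer, here is a minimal sketch in Python that wraps Windows' built-in `wevtutil` tool. It assumes the events are in the System log under the provider name `Microsoft-Windows-WHEA-Logger` and uses the Event ID 20 mentioned above; the helper names `build_query` and `build_command` are made up for this example. It only lists the events as text and does not decode the raw data.

```python
import subprocess
import sys

# Assumption: WHEA events are written to the System log under this provider name.
PROVIDER = "Microsoft-Windows-WHEA-Logger"


def build_query(event_id: int, provider: str = PROVIDER) -> str:
    """Build an XPath filter matching one provider and one event ID."""
    return f"*[System[Provider[@Name='{provider}'] and (EventID={event_id})]]"


def build_command(event_id: int, count: int = 5) -> list[str]:
    """Build a wevtutil call that returns the newest `count` matching events."""
    return [
        "wevtutil", "qe", "System",       # query-events on the System log
        f"/q:{build_query(event_id)}",    # XPath filter
        f"/c:{count}",                    # maximum number of events to read
        "/rd:true",                       # reverse direction: newest first
        "/f:text",                        # human-readable text output
    ]


if __name__ == "__main__":
    cmd = build_command(event_id=20)
    if sys.platform == "win32":
        result = subprocess.run(cmd, capture_output=True, text=True, check=False)
        print(result.stdout or result.stderr)
    else:
        # wevtutil only exists on Windows; just show the command elsewhere.
        print("Run this on Windows:", " ".join(cmd))
```

Comparing the timestamps in that output with the moments the error counter in HWiNFO64 goes up may help confirm that both are reporting the same events.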
 
So, I checked the SSD, which is indeed an NVMe PCIe drive, and found no issues using CrystalDiskInfo. "sfc /scannow" reported no corrupted files. I also tried updating the drivers from Device Manager to see if that fixes the problem (the Standard NVM Express Controller, the PCIe Root Controller, and PCI Express Root Port #9, which appears as Intel 100 Series/C230 Series Chipset Family PCI Express Root Port #9 - A118), but the WHEA events are still appearing (same Event ID).

I haven't done a fresh Windows 10 install yet, but I will do it today after finishing my work, since the PC is quite stable unless I play that damned game (sometimes it causes a BSOD even when it is just open in the background).

Did I miss something that I need to check? Can you recommend a software tool that can check the drive more in depth? Thanks in advance!
 
Coming back from two fresh W10 installations. I found what causes these WHEA errors, but I don't know how bad they are. First I installed W10 with Intel RST instead of AHCI set in the BIOS and, guess what, the errors disappeared. Only one small problem: the whole system was lagging even when properly set up (drivers, optimizing the SSD; nothing seemed to fix the issue). I thought, "Hey, I can't run like this. Going back to AHCI." I installed W10 again, this time with AHCI set in the BIOS, and voila, the errors are back. Now my question is: what can the problem be, given that no errors are reported under RST but they show up under AHCI?

*Update: I decided to install an older BIOS version, just to confirm that the BIOS was not causing the issue. Surprise: the latest BIOS update for my laptop model was causing the errors. I don't know whether it's a problem specific to my laptop, but the older BIOS version resolved the issue for whatever reason. Bear in mind that I had reinstalled Windows before downgrading, and the errors were still present. Now, with the old BIOS, everything works fine again.
 