Sensor scan hangs on IPMI #0

dwhobrey

Hi Martin,
I'm using HWiNFO64 on Windows 7 x64 on a Tyan S8812 server board.
IPMI scanning used to work before I installed Intel's IPMI driver:
Intel Intelligent Management Bus Driver V13.0, dated 17/02/2008,
and disabled the MS Generic IPMI Compliant Device driver, which wasn't working anyway (Tyan had verified that). In a prior email you said HWiNFO was accessing the IPMI interface directly via the ports rather than through a driver.
I got the Intel driver from ipmiutil.sourceforge.net.
It seems to work, in that ipmiutil uses it to report BMC sensor data OK.

So, the IPMI scan hangs when Sensors is selected.
Clicking on the scan dialog brings up a system dialog asking whether to close the program or wait. Choosing to wait gets past the hung state, but no IPMI sensor data is found.

Attached is the dbg dump.
Regards,
Darren.
 

Attachments

  • HWiNFO64.DBG
    250.5 KB · Views: 1
Hi Darren,
HWiNFO prefers the IPMI driver when one is present in the system, so in this case it uses the driver, which the DBG dump also confirms.
According to the debug dump I don't see a problem there: the IPMI interface communicates correctly and returns some sensor data as well.
Can you try waiting longer after opening sensors?
 
Hi Martin,
after waiting for two minutes, just on the IPMI scan, it only comes back with 1 CPU temp and 1 ambient temp. It should report about half a dozen temps, voltages and fan RPMs, which it used to do before I installed the Intel IPMI driver. Here's what the ipmiutil "sensor" command currently reports via the Intel driver, and it reports this immediately, with no delay:

ipmiutil ver 2.80
isensor: version 2.80
-- BMC version 0.7, IPMI version 2.0
_ID_ SDR_Type_xx ET Own Typ S_Num Sens_Description Hex & Interp Reading
0001 SDR IPMB 12 12 dev: 20 00 ff 00 01 AST2050
0002 SDR Full 01 01 20 a 01 snum 01 CPU0 TEMP = 14 OK 20.00 degrees C
0003 SDR Full 01 01 20 a 01 snum 02 CPU1 TEMP = 00 OK 0.00 degrees C
0004 SDR Full 01 01 20 a 01 snum 03 CPU2 TEMP = 00 OK 0.00 degrees C
0005 SDR Full 01 01 20 a 01 snum 04 CPU3 TEMP = 00 OK 0.00 degrees C
0006 SDR Full 01 01 20 a 01 snum 06 AMBIENT = 15 OK 21.00 degrees C
0007 SDR Full 01 01 20 a 01 snum 07 PCI_INLET_AREA = 20 OK 32.00 degrees C
0008 SDR Full 01 01 20 a 01 snum 05 RD5690_CASE_TEMP = 24 OK 36.00 degrees C
0009 SDR Full 01 01 20 a 01 snum 08 SAS_CASE_TEMP = 27 OK 39.00 degrees C
000a SDR Full 01 01 20 a 01 snum 09 CPU0_MOS_AREA = 28 OK 40.00 degrees C
000b SDR Full 01 01 20 a 01 snum 0a CPU1_MOS_AREA = 1d OK 29.00 degrees C
000c SDR Full 01 01 20 a 01 snum 0b CPU2_MOS_AREA = 1b OK 27.00 degrees C
000d SDR Full 01 01 20 a 01 snum 0c CPU3_MOS_AREA = 1b OK 27.00 degrees C
000e SDR Full 01 01 20 a 02 snum 40 CPU0_VCORE = 24 OK 1.04 Volts
000f SDR Full 01 01 20 a 02 snum 41 CPU1_VCORE = 00 Crit-lo 0.00 Volts
0010 SDR Full 01 01 20 a 02 snum 43 CPU2_VCORE = 00 Crit-lo 0.00 Volts
0011 SDR Full 01 01 20 a 02 snum 44 CPU3_VCORE = 00 Crit-lo 0.00 Volts
0012 SDR Full 01 01 20 a 02 snum 2a CPU0_MEM = 34 OK 1.51 Volts
0013 SDR Full 01 01 20 a 02 snum 2b CPU1_MEM = 00 Crit-lo 0.00 Volts
0014 SDR Full 01 01 20 a 02 snum 2c CPU2_MEM = 00 Crit-lo 0.00 Volts
0015 SDR Full 01 01 20 a 02 snum 2d CPU3_MEM = 00 Crit-lo 0.00 Volts
0016 SDR Full 01 01 20 a 02 snum 23 P0_VDDNB_RUN = 88 OK 1.09 Volts
0017 SDR Full 01 01 20 a 02 snum 16 P1_VDDNB_RUN = 00 Crit-lo 0.00 Volts
0018 SDR Full 01 01 20 a 02 snum 17 P2_VDDNB_RUN = 00 Crit-lo 0.00 Volts
0019 SDR Full 01 01 20 a 02 snum 1a P3_VDDNB_RUN = 00 Crit-lo 0.00 Volts
001a SDR Full 01 01 20 a 02 snum 15 VDD_SB7001P2_RUN = 96 OK 1.20 Volts
001b SDR Full 01 01 20 a 02 snum 18 VSENS_12V = 7e OK 12.10 Volts
001c SDR Full 01 01 20 a 02 snum 19 RD890_1P1_RUN = 89 OK 1.10 Volts
001d SDR Full 01 01 20 a 02 snum 1b CPU_VLDT_RUN = 90 OK 1.15 Volts
001e SDR Full 01 01 20 a 02 snum 1c V1P0_SAS = 7f OK 1.02 Volts
001f SDR Full 01 01 20 a 02 snum 1d P1V8_AUX = e7 OK 1.85 Volts
0020 SDR Full 01 01 20 a 02 snum 1e P1V0_AUX = 7c OK 0.99 Volts
0021 SDR Full 01 01 20 a 02 snum 1f VSENS_6V8 = de OK 7.10 Volts
0022 SDR Full 01 01 20 a 02 snum 20 VCC3 = cb OK 3.25 Volts
0023 SDR Full 01 01 20 a 02 snum 21 PECI_RD8901.1RUN = 87 OK 1.08 Volts
0024 SDR Full 01 01 20 a 02 snum 22 P0_VDDR_RUN = 95 OK 1.19 Volts
0025 SDR Full 01 01 20 a 02 snum 24 P0_VTT_RUN = 5b OK 0.73 Volts
0026 SDR Full 01 01 20 a 04 snum 30 CPU0 FAN = 09 OK 810.00 RPM
0027 SDR Full 01 01 20 a 04 snum 31 CPU1 FAN = 00 Init 0.00 RPM
0028 SDR Full 01 01 20 a 04 snum 32 CPU2 FAN = 00 Init 0.00 RPM
0029 SDR Full 01 01 20 a 04 snum 33 CPU3 FAN = 00 Init 0.00 RPM
002a SDR Full 01 01 20 a 04 snum 34 FRONT FAN1 = 0d OK 1170.00 RPM
002b SDR Full 01 01 20 a 04 snum 35 FRONT FAN2 = 0e OK 1260.00 RPM
002c SDR Full 01 01 20 a 04 snum 36 FRONT FAN3 = 0e OK 1260.00 RPM
002d SDR Full 01 01 20 a 04 snum 37 FRONT FAN4 = 0e OK 1260.00 RPM
002e SDR Full 01 01 20 a 04 snum 38 REAR FAN1 = 07 OK 630.00 RPM
002f SDR Full 01 01 20 a 04 snum 39 REAR FAN2 = 0e OK 1260.00 RPM
SDR IPMI sensor: Power On Hours = 7124 hours
ipmiutil sensor, completed successfully
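
Each "SDR Full" row in the listing above carries a raw reading in hex next to its interpreted value. As a rough sketch for anyone wanting to compare that listing against what HWiNFO shows, the rows can be split into fields with a small Python parser. The field layout is inferred only from the output pasted in this thread, and the names `Reading` and `parse_line` are made up for this example; they are not part of ipmiutil.

```python
import re
from typing import NamedTuple, Optional


class Reading(NamedTuple):
    rec_id: int       # SDR record ID (first column)
    sensor_type: int  # in this listing: 01 temperature, 02 voltage, 04 fan
    snum: int         # sensor number
    name: str
    raw: int          # raw reading byte (the hex value after "=")
    status: str       # e.g. OK, Crit-lo, Init
    value: float      # interpreted reading
    unit: str


# Layout assumed from the pasted output, not from ipmiutil documentation.
LINE = re.compile(
    r"^(?P<rec_id>[0-9a-f]{4}) SDR Full "
    r"(?:\S+ ){4}(?P<stype>[0-9a-f]{2}) "
    r"snum (?P<snum>[0-9a-f]{2}) "
    r"(?P<name>.+?) = (?P<raw>[0-9a-f]{2}) "
    r"(?P<status>\S+) (?P<value>-?\d+(?:\.\d+)?) (?P<unit>.+)$",
    re.IGNORECASE,
)


def parse_line(line: str) -> Optional[Reading]:
    """Parse one 'SDR Full' row; return None for any other kind of line."""
    m = LINE.match(line.strip())
    if m is None:
        return None
    return Reading(
        rec_id=int(m["rec_id"], 16),
        sensor_type=int(m["stype"], 16),
        snum=int(m["snum"], 16),
        name=m["name"],
        raw=int(m["raw"], 16),
        status=m["status"],
        value=float(m["value"]),
        unit=m["unit"],
    )


if __name__ == "__main__":
    row = "0026 SDR Full 01 01 20 a 04 snum 30 CPU0 FAN = 09 OK 810.00 RPM"
    print(parse_line(row))
```

Filtering the parsed rows on `status == "OK"` gives the sensors that are actually populated on this board, which is the set a complete HWiNFO scan would be expected to show.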
 
Thanks for the information. I'll check the details in the debug dump against the information you posted.
Please also post a screenshot of the HWiNFO sensors window so I can see what values have been reported.
Is it possible to dump the binary data somehow using ipmiutil?
 
I have analysed the dump and it seems HWiNFO communicates correctly with the BMC and enumerates the first entries up to index 6 (snum 06 / AMBIENT temperature).
When HWiNFO asks for the next entry (#7 / PCI_INLET_AREA), the BMC stops responding, which is why it takes so long: HWiNFO waits for a response, doesn't get one, and times out.
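
The behaviour described above can be illustrated with a small simulation. This is only a sketch, not HWiNFO's code: it relies on the fact that IPMI's Get SDR reply carries the ID of the next record, with FFFFh marking the end of the repository. The `get_sdr` callable and the fake BMC are hypothetical stand-ins for a real transport (driver or direct port access), and a time-out is modelled simply as a `None` reply.

```python
from typing import Callable, List, Optional, Tuple

LAST_RECORD = 0xFFFF  # "no more records" marker in a Get SDR reply

# Hypothetical transport: takes a record ID and returns (next_record_id, data),
# or None if the BMC did not answer before the time-out.
GetSdr = Callable[[int], Optional[Tuple[int, bytes]]]


def walk_sdr(get_sdr: GetSdr, max_records: int = 256) -> Tuple[List[bytes], Optional[int]]:
    """Walk the SDR chain; return (records read, record ID that stalled or None)."""
    records: List[bytes] = []
    rec_id = 0x0000  # 0000h requests the first record
    while rec_id != LAST_RECORD and len(records) < max_records:
        reply = get_sdr(rec_id)
        if reply is None:
            return records, rec_id  # stop at the first unanswered request
        next_id, data = reply
        records.append(data)
        rec_id = next_id
    return records, None


def make_fake_bmc(last_id: int, stall_at: Optional[int] = None) -> GetSdr:
    """Simulated BMC with records 1..last_id that goes silent from stall_at on."""
    def get_sdr(rec_id: int) -> Optional[Tuple[int, bytes]]:
        if rec_id == 0:
            rec_id = 1
        if stall_at is not None and rec_id >= stall_at:
            return None
        next_id = rec_id + 1 if rec_id < last_id else LAST_RECORD
        return next_id, bytes([rec_id & 0xFF, rec_id >> 8])
    return get_sdr


if __name__ == "__main__":
    records, stalled = walk_sdr(make_fake_bmc(0x2F, stall_at=7))
    print(len(records), stalled)
```

With the stall at record 7 the walk ends after six records, which matches the partial result seen through the driver; without a stall it reads all 0x2F records.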
 
Hi,
see attached.
I don't suppose you know the IPMI command for sending an SMBus command to a device on the BMC's internal SMBus? I'm trying to talk to a W83795G chip that hangs off the BMC's SMBus. I think the IPMI spec was written by a robot!!
d.

What do you suggest? Would increasing the timeout delay fix it?
 

Attachments

  • ipmifiles.zip
    72.8 KB · Views: 4
I'm sorry, I don't know of such a command. It should be part of the IPMI specification (which is indeed quite complex).
I don't think an increased time-out would help; it's already quite long...
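
For readers with the same question: the IPMI 2.0 specification defines a Master Write-Read command (NetFn App 06h, command 52h) for I2C/SMBus transactions on buses behind the BMC. The sketch below only builds the request data bytes as that command lays them out; whether this particular BMC firmware permits access to the bus the W83795G sits on is not something the thread establishes. The bus number and device address used in the example are placeholders, not values known for the Tyan S8812, and the function name is invented for illustration.

```python
NETFN_APP = 0x06
CMD_MASTER_WRITE_READ = 0x52


def master_write_read_request(bus_id: int, addr7: int, read_count: int,
                              write_data: bytes = b"", channel: int = 0,
                              private_bus: bool = True) -> bytes:
    """Build the data bytes of a Master Write-Read request.

    Byte 1: channel number [7:4], bus ID [3:1], bus type [0] (1 = private bus)
    Byte 2: 7-bit slave address in bits [7:1], bit 0 written as 0
    Byte 3: number of bytes to read (0 = none)
    Byte 4+: bytes to write
    """
    if not 0 <= bus_id <= 0x7:
        raise ValueError("bus_id must fit in 3 bits")
    if not 0 <= channel <= 0xF:
        raise ValueError("channel must fit in 4 bits")
    if not 0 <= addr7 <= 0x7F:
        raise ValueError("addr7 must be a 7-bit address")
    if not 0 <= read_count <= 0xFF:
        raise ValueError("read_count must fit in one byte")
    bus_byte = (channel << 4) | (bus_id << 1) | (1 if private_bus else 0)
    return bytes([bus_byte, addr7 << 1, read_count]) + bytes(write_data)


if __name__ == "__main__":
    # Placeholder values: private bus 1, device at 7-bit address 0x2F,
    # write register index 0x00, then read one byte back.
    req = master_write_read_request(bus_id=1, addr7=0x2F, read_count=1,
                                    write_data=b"\x00")
    print(f"netfn=0x{NETFN_APP:02x} cmd=0x{CMD_MASTER_WRITE_READ:02x} data={req.hex(' ')}")
```

The resulting bytes would be sent through whatever raw-command facility the IPMI tool or driver in use provides.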
 
One more idea.. Please try launching HWiNFO sensors several times in Debug Mode (and wait until the window opens each time)..
Then send me the separate debug files. I want to verify whether the time-out always occurs at the same SDR index or whether it's random...
 
Could there be a fallback option to use the IPMI port directly?

Attached is the debug output of ipmiutil when performing the sensor command. It always fetches the data super quick. Maybe the debug output will give some clues as to how it does it.
 

Attachments

  • sd.txt
    116.2 KB · Views: 1
Thanks for the additional data, I'll check it.
I'd rather fix this driver issue.
Please see my post above about the different runs (in case you haven't noticed it)..
 
OK, attached are 5 dumps. One resulted in HWiNFO crashing. One of them is with the Intel driver disabled; that one worked.
 

Attachments

  • HWiNFO64-1.DBG
    225.5 KB · Views: 1
  • HWiNFO64-2-Crash.DBG
    225.5 KB · Views: 1
  • HWiNFO64-3.DBG
    230.9 KB · Views: 1
  • HWiNFO64-4DrvDisabled.DBG
    303.3 KB · Views: 2
  • HWiNFO64-5.DBG
    228.8 KB · Views: 2
Thanks! When using the driver, it always stops at SDR #7, which doesn't respond. All data up to that point is the same.
When using direct access, all devices respond correctly.
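
The fallback idea raised earlier in the thread could look roughly like the sketch below. It is not how HWiNFO works (the approach taken in this thread was to fix the driver path instead); it only illustrates the idea of treating an incomplete enumeration as a failure and trying the next access method. The backend functions are hypothetical stand-ins that return canned data.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# A backend returns (sensor_names, complete). "complete" is False when the
# enumeration stopped early, for example because a request timed out.
Backend = Callable[[], Tuple[List[str], bool]]


def scan_with_fallback(
    backends: Sequence[Tuple[str, Backend]]
) -> Tuple[Optional[str], List[str]]:
    """Try each backend in order; return the first complete scan.

    If no backend completes, return the partial result with the most sensors.
    """
    best: Tuple[Optional[str], List[str]] = (None, [])
    for name, scan in backends:
        sensors, complete = scan()
        if complete:
            return name, sensors
        if len(sensors) > len(best[1]):
            best = (name, sensors)
    return best


# Hypothetical stand-ins mirroring the symptoms reported in this thread.
def driver_scan() -> Tuple[List[str], bool]:
    return ["CPU0 TEMP", "AMBIENT"], False  # enumeration stalled part-way


def direct_scan() -> Tuple[List[str], bool]:
    return ["CPU0 TEMP", "AMBIENT", "PCI_INLET_AREA", "CPU0 FAN"], True


if __name__ == "__main__":
    print(scan_with_fallback([("driver", driver_scan), ("direct", direct_scan)]))
```

Keeping the best partial result means a scan never reports less than the driver alone would have, even if every access method stalls.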
 
Did the ipmiutil sd.txt dump give any clues? It looked like it included the raw IPMI commands it was using to fetch the SDRs, i.e. there might be a quirk with the Intel driver in terms of command sequence, format, etc. It would be interesting to know how the commands it used differed.
 
Yes, it did.. I have an idea.. Give me a few minutes and I'll post a new build to test.
 
Hi Martin,
it worked - and super fast too!
Apologies for responding so late, I was expecting to get an auto notification from the forum but didn't - I probably forgot to select the notify button.
Thanks again.
 