SMART polling stops all of a sudden (?)

cccleaner

Member
Hello!
I have a chia farm using 19 external USB hard disks. To prevent them from going to sleep or spinning down, I use HWiNFO with SMART polling activated every 20000 ms.
For about a week now, with no idea what exactly changed, HWiNFO has suddenly stopped polling the disks regularly. I have no clue what is wrong. Without regularly keeping the disks up and spinning, I have trouble running the farm.
Restarting HWiNFO does not help. Restarting the computer helps for a while, then the issue appears again.

What is the problem, and why so suddenly? My setup ran like this without issues for months. (When it is working, I can usually see the polling from the LEDs of all the hard disks blinking regularly at the same time.) Since the blinking stopped, my disks' spindles are somehow slowing down, which causes delays in reading the chia plots on time.


HWiNFO version 7.20-4700 (together with Rainmeter and Gadgets (Devart))

How can I find out what the culprit is?

Thank you for any help!
 
Hard to say without seeing the HWiNFO Debug File.
But since only system restart helps, it might be a failure of the storage/USB driver or interface.
 
Thanks for your reply. It's an Intel NUC with two USB 3.0 hubs. I also did a fresh install of Windows, and nothing is running on this system other than the chia farm.
I don't know whether it could be a hardware error, since the farm is obviously running 24/7?


However, I of course had to move tons of GBits onto these hard disks (141Tbit in total), and all went well. The system actually runs normally; only the polling is disturbed.

Could you have a look at the reports attached? Thank you so much!
 

Attachments

  • CHIALANE.zip
    305.5 KB
Is the attached Debug File capturing the situation when SMART polling stopped?
 
No need for that as I needed to see when it stops.
What settings have you used for Global and SMART polling period?
 
Global: 20000ms
Disk SMART every 100 Cycles (default as is)
Emb. Controller every 1 Cycle (default as is)

It can be that everything runs all day without problems, and then at, say, 17:54 in the evening it stops again and the performance of the farm goes down. A restart helps for a random amount of time.

Just now I tested a bit with 1000, then 2000 and finally 10000 ms. So far only one error ("looking up qualities" time is too long, which is the right expression for chia farmers :)

I gave the HWiNFO process "realtime" priority.

I probably found a side issue which might explain the slowdown of the disks (but not actually the stopping of the polls):

The Windows default installation configures a setting which is complete nonsense:

Advanced System Settings > Virtual Memory: "Automatically manage paging file size for all drives" is on by default, so the system could be placing a page file wherever it wants, including on the external disks.

I have now disabled this setting as well and put the page file on the SSD (C:).

So far the polling has not stopped and I have had no more errors.
 
And the last observation for today (setting now at 2000 ms again): some disks do not blink at all anymore (are they getting polled?). Are they somehow occupied by the farm process? No permanent blinking and no poll blinking.

I will check again tomorrow whether the farm is stuck again. Thank you.
 
Not sure if you realize that with a global setting of 20000 ms and Disk SMART every 100 cycles, the disk SMART will be polled every 20000 ms * 100 = 2000 seconds = 33.3 minutes.
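To make the arithmetic easy to check for other values, here is a small Python sketch. It assumes only the model described above (SMART is read once every N global polling cycles); the function name is made up for illustration and is not part of HWiNFO.

```python
def smart_poll_interval_s(global_period_ms: int, smart_cycles: int) -> float:
    """Seconds between two SMART polls, assuming SMART is read
    once every `smart_cycles` global polling cycles."""
    return global_period_ms * smart_cycles / 1000.0


# Global periods mentioned in this thread, with the default of 100 cycles.
for period_ms in (20000, 10000, 2000, 1000):
    seconds = smart_poll_interval_s(period_ms, 100)
    print(f"{period_ms} ms x 100 cycles = {seconds:.0f} s ({seconds / 60:.1f} min)")
```

Under that assumption, shortening the global period or lowering the cycle count are the two ways to get SMART reads more often.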
 
Thanks for the hint.
I don't want to appear unprofessional, but when I change the Global setting, I can see the frequency of the LEDs' simultaneous blinking change to exactly what I set.
The disks really seem to be polled at exactly that interval. Is that a misunderstanding on my part?

How would you set these values, then, to make sure the USB disks never go to sleep?
 
That must be some side-effect of a different sensor polling (not SMART), maybe the Disk Activity sensor.
The exact behavior of disk sleep depends on each disk's power management setting; there are tools that can change these settings.
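Independent of any monitoring tool, a common workaround is to write a tiny file to every disk at a fixed interval so the idle timer never expires. A minimal Python sketch of that idea follows. The function names, the file name `keepalive.txt` and the drive letters in the usage note are hypothetical and not taken from this thread, and whether a small write is enough to keep a given disk awake depends on the disk and its USB bridge.

```python
import os
import time
from pathlib import Path


def touch_keepalive(roots, filename="keepalive.txt"):
    """Write a timestamp to a small file on each root and force it to disk.

    Returns the list of roots that could not be written.
    """
    failed = []
    for root in roots:
        target = Path(root) / filename
        try:
            with open(target, "w") as fh:
                fh.write(f"{time.time()}\n")
                fh.flush()
                # Ask the OS to push the data to the device, not just the cache.
                os.fsync(fh.fileno())
        except OSError:
            failed.append(str(root))
    return failed


def keep_awake(roots, interval_s=60.0, rounds=None):
    """Touch every root each `interval_s` seconds; loop forever if rounds is None."""
    done = 0
    while rounds is None or done < rounds:
        failed = touch_keepalive(roots)
        if failed:
            print("could not write to:", ", ".join(failed))
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval_s)
```

Usage would look like `keep_awake(["E:\\", "F:\\"], interval_s=60)` with the real drive letters filled in. A root that fails to accept the write is reported rather than raising, so one stalled disk does not stop the loop for the others.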
 
But I can directly influence the LEDs by changing the Global time setting. The LEDs indicate activity, surely at least enough activity that the disk does not sleep?

So would you say in general that I cannot use HWiNFO to regularly poll the disks to keep them from going to sleep?

Today at 11:54 the farm got stuck again.

So how to proceed:


Shall I change the HWiNFO settings to Global 1000 ms? With 100 cycles that would be 100 s = 1 minute 40 seconds per SMART poll cycle?

Shall I restart the whole farm again this evening, activate Debug Mode and wait until the farm gets stuck again? Would you see something then, or is HWiNFO completely outside the error chain anyway?


Thanks for your precious feedback.

I cannot tune Windows any further in terms of USB devices. No drivers. All power suspending on ports and hubs has been disabled from the start.

I did not know that I could tune the sleep time of a disk; I thought that was hardcoded. Can I just search for the tools on the support pages of the disk manufacturers' websites?

To rule out the USB hubs and disks as the problem, I could use my laptop as an alternative computer.


Thanks for your precious help and advice!
 
PS: The disks are not connected via external USB case controllers but via SATA-to-USB converters. So they are normal 3.5" disks of different brands: SATA > USB converter > USB hub > NUC, powered by a normal PC power supply. Can I still change the sleep time for the disks?
 
PPS: No issues with the setup for almost one year :) I think I really tuned everything I could and thought I had solved the problem of the disks going to sleep with HWiNFO!
 
PPPS: Unplugging and replugging both USB hubs instead of restarting the NUC apparently has the same effect of resetting the farm.
 
OK, this is now really annoying: it seems the last BIOS update on the NUC messed it up.
Resetting the BIOS to defaults and redefining the settings has done the trick so far, and I have no issues anymore. This was more of a last attempt, as I remembered that a BIOS update had been applied.

Nevertheless, I am leaving the polling at 1000 ms, 100 cycles, 1 cycle, with HWiNFO.exe at realtime priority. The disks are obviously not going to sleep, and the NUC does not leave the USB connections in a sluggish state.

The LEDs, however, are now indeed no longer blinking in the poll window. I give up on this :) Too much confusion.


DAMN
 