Fit-PC bricked due to leap second, how to prevent the second one from failing?

user asked:

I’ve got three Fit-PCs in use. They are being used as light-weight Linux servers. Unfortunately, on Jun 30, the first of them failed to start due to the leap-second bug. I tried rebooting it a few times, but the screen remained blank after the third bootup-attempt. This appeared to be hardware-related and we took it to a repair-man. He told us something had overheated and that the motherboard was broken. He was able to recover the data, but the fit-pc was written off.

The second Fit-PC was unable to reboot a few days later (the first time we actually tried to reboot it). Apparently by sheer luck, it rebooted on the third attempt, and it is now working fine.

The third Fit-PC had not given any problems. When I found out the other ones failed due to the Leap-Second, I actually thought we were lucky with this third one. Fact is, the recent slowness of the server was most likely due to this same bug, and now that I rebooted this machine (first time after Jun 30), it’s giving me the exact same symptoms as the other ones. These symptoms are:

  • Initial reboot attempt fails; OS does not load.
  • I connect a screen to see what is going on. Remains black.
  • I reboot again. I now see the regular loading screen (“Intel Atom…”), but this freezes.
  • I try to reboot again.
  • Screen now simply does not activate at all. It does not show any sign of life. The monitor simply acts as if nothing is sending any signal, so I have no way to interact with the machine whatsoever.

I’ve tried rebooting about four times now, and I very much fear the same problem as before. Where I live, Fit-PCs are uncommon, and I am not sure there are qualified techs who actually know how to repair them (I am not even sure the other tech’s diagnosis was correct). So I am asking: do you also think the motherboard overheated and that this is yet another bricked Fit-PC, or is there something else I can do?

EDIT: Using Ubuntu 12.04 on all of the Fit-PCs.

EDIT:

I also considered a power-failure. But there are a few inconsistencies:

  • the servers are on three different sites,
  • no power surge was reported and no other hardware was affected – weather was sunny and calm,
  • the only similarity between the three machines was that they started acting odd ever since Jun 30 (the third one was having high loads, but I failed to recognize this until the first reboot since Jun 30, which I did today).

I could also not find other Fit-PCs affected by the leap-second, but am simply not sure what else could cause this…

My answer:


Software leap second issues wouldn’t have caused a physical hardware failure. Applications running under Linux might have had issues, but any of those should only require a restart (Java, for instance, was infamous for going haywire at the leap second).
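
If you want to check whether a given machine's kernel actually processed the leap second, one rough approach is to search its kernel log for the notice Linux prints at that moment. Here is a minimal sketch in Python; the function name is my own, and the default path /var/log/kern.log is an assumption based on stock Ubuntu logging, so adjust it for your setup:

```python
import sys

# Substring of the notice the Linux kernel logs when it inserts a leap second.
LEAP_MARKER = "inserting leap second"


def find_leap_second_lines(lines):
    """Return the log lines that mention a leap-second insertion."""
    return [line.rstrip("\n") for line in lines if LEAP_MARKER in line]


if __name__ == "__main__":
    # Assumed default: Ubuntu's kernel log. Pass another path as the first argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/kern.log"
    with open(path, errors="replace") as log:
        for hit in find_leap_second_lines(log):
            print(hit)
```

A hit only tells you the kernel saw the leap second. It says nothing about the hardware, and a machine that shows a blank screen at power-on is failing before any of this software runs.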

Notably, a quick search fails to turn up anyone else having leap second issues with Fit-PCs.

Most likely what’s happened is you suffered a power surge, lightning strike or similar activity that fried two of your three Fit-PCs and probably made the third one questionable, and that this happened on the night of June 30. What was the weather like? (Now a power surge might theoretically have been caused by the electrical company having its own leap second failures…)

Your best bet is to have them replaced.


View the full question and answer on Server Fault.

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.