Locked out of my pc

From: CHYRON (DSMITHHFX) 5 Mar 2021 16:16
To: ALL 3 of 7
Since this is a thread about pc woes real and imagined, let me regale you with the saga of my home music/webdev server ...

Last week during a warm spell (!) our building's heat was jacked up to 11, probably because of some bottom-floor tenants' complaints. Ambient temps in our apartment hit 30C and stayed there.

Coincidentally the server started falling over. First I thought drive fail (as predicted by S.M.A.R.T.) and swapped in another spare drive, a 500G, 7200 rpm mofo. This in a SFF office pc.

Which did not help, cpu temp was still spiking >50C, and the case was hot enough to make a raw egg feel warmish. When I later mounted the removed drive with an sata-->usb cable, that fucker could deffo fry eggs into greasy carbon in <10-minutes.

It was like sticking a propane-fired space heater inside a shoebox.

Warming to the task (so to speak), I opened up the case and blew out the dust, pulling the hsf and blowing that out for good measure. The hsf did not have dust bunnies, it had dust bricks resembling crumbly blocks of felt.

This sort of helped: it still kept falling over, but at longer intervals.

I bought an ssd, not any ssd but a "WD Green" low-power ssd (primarily because it supports SATA-1, which is the limit of 2005 technology).

Now we're getting somewhere, but still falling over.

Opened the case again, pried out the pci slot covers and swapped in a 'new' case fan I'd had leftover from my Athlon 7 build, and opened our windows wide.

As a side story, I attempted to use fsarchiver to duplicate the os install from the hdd, but this was an utter fail in a virtualbox test, and the proposed fixes were ludicrously overcomplicated and under-documented, so I went straight to a clean install of a newer OS version (20.04 LTS) on the ssd.

Gotta say the ssd has given the ancient box new life, and it's stopped falling over. Can't wait for summer.

 
EDITED: 5 Mar 2021 16:20 by DSMITHHFX
From: william (WILLIAMA) 5 Mar 2021 16:56
To: CHYRON (DSMITHHFX) 4 of 7
Are you certain that your system problems were caused by overheating? I'm not saying it wasn't running too hot, I'm sure it was, with clogged up fans, but 50C isn't especially high for a CPU. Also, after an OS refresh it ran OK.
From: CHYRON (DSMITHHFX) 5 Mar 2021 17:11
To: ALL 5 of 7
Yeah I'm 99% sure it was overheating. Anyway it's running more stabler now. The psu is still problematic as a heat source. I tried blowing/sucking dust out of it while still mounted in case but may need to open it up, replace fan and completely remove dust.

I probably got the sequence of attempted fixes garbled, but the server was still falling over with the new os until the additional cooling measures were applied.

Forgot to mention one other thing I'd done, which was to replace the thermal compound between the cpu and hsf.
EDITED: 5 Mar 2021 17:13 by DSMITHHFX
From: CHYRON (DSMITHHFX) 7 Mar 2021 16:00
To: ALL 6 of 7
Or not ... after about a day and a half of running ok, it went back to the falling over tricks. As a last hurrah I've booted it from systemrescuecd on a usb stick to see how long it stays up on that (monitoring via "watch -n 20 uptime" over ssh).
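A minimal sketch of a hands-off alternative to eyeballing `watch -n 20 uptime`, assuming a Linux system where /proc/uptime is readable: append a timestamped uptime reading to a file every 20 seconds, so the last line written before a crash shows roughly how long the box lasted. The function names and the log file name are hypothetical, not anything from the setup described here.

```python
import os
import time
from typing import Optional


def parse_proc_uptime(text: str) -> float:
    """Return seconds since boot from the contents of /proc/uptime."""
    # /proc/uptime holds two numbers: uptime and idle time, both in seconds.
    return float(text.split()[0])


def format_duration(seconds: float) -> str:
    """Render a duration in seconds as hours and minutes, e.g. '4h 00m'."""
    minutes = int(seconds) // 60
    return f"{minutes // 60}h {minutes % 60:02d}m"


def log_uptime(log_path: str, interval: float = 20.0,
               samples: Optional[int] = None) -> None:
    """Append one uptime reading per interval; loop forever if samples is None."""
    taken = 0
    while samples is None or taken < samples:
        with open("/proc/uptime") as f:
            up = parse_proc_uptime(f.read())
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        with open(log_path, "a") as log:
            log.write(f"{stamp} up {format_duration(up)}\n")
            # Push the line to disk so it survives a sudden crash.
            log.flush()
            os.fsync(log.fileno())
        taken += 1
        time.sleep(interval)
```

Calling `log_uptime("uptime.log")` on the server itself would keep appending until the machine goes down; after the reboot, the final line of the file is the last reading taken.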

Maybe it'll do ok on minimal system load. Or maybe there's another coincidence I'd forgotten: the first kernel upgrade on 18.04 LTS in what feels like a year landed ~3 weeks ago. 20.04 may even share that kernel, or have features too new for the 2005 dx5150.

Closer to throwing in the towel on that one.

Anyway, I have a cunning new plan: swap the ssd, with its installed system, into MrsD.'s other discarded PC as a dual-boot. That's the newer hp dx5750 (her latest is an i5 w/ 8G), currently hosting a seldom-used Win 10 + Office 365.


 
EDITED: 7 Mar 2021 16:02 by DSMITHHFX
From: CHYRON (DSMITHHFX) 7 Apr 2021 16:04
To: ALL 7 of 7
The mystery deepens. Deep in OCD territory now, for I did not abandon the terminally ill device, but have nursed it along whilst studying the underlying morbidity.

Apparently (apologies) this model (dx5150) is known for its bad caps, which are made badder by *overheating* in the SFF case. If true, the damage was (likely) done during the mad overheat-the-whole-fucking-building episode and is, by all accounts, unrecoverable, since I lack SMD soldering skills. Others have suggested replacing the psu, yeah but no.

Anyhoo, the server stays up just so long as I don't do anything with it except (for example) "watch -n 20 uptime" via ssh console. It's been up for ~4-hours today!!!  :-O~~~

When I go and use it for anything useful, such as mounting a share, it typically goes down within an hour or two, which is enough time to get stuff done, I suppose  :-S .

Haven't yet tried the trick of swapping the new ssd into my other, newer old Mrs.D pc (5750 IIRC), but it approaches.

Edit: Whelp, I guess it didn't like the report card ^^^ because yesterday it stayed up ~10-hours and I got some use out of it as a dev server before normal shutdown via console at 5pm.
EDITED: 8 Apr 2021 13:33 by DSMITHHFX