I think I have a mobo problem. Help confirm?
I've noticed various problems for the past... well, technically a couple of years, but more frequently in the past couple of months.
PC Specs:
4-year-old custom build - NOT overclocked whatsoever
AMD Phenom II X4 955 3.2GHz
XFX Radeon 4890 1GB GDDR5
8GB DDR3 RAM
74GB WD Raptor 10k RPM HDD
1TB WD Green HDD
BlackMagic Intensity Pro Capture Card
Win 7 64bit Ultimate
Symptoms: Strange hibernate issue (see below). Slow performance, especially when booting. Problems running system tools (e.g. Add/Remove Programs not loading).
The performance issues are likely due at least in part to disk space, as both drives are really full and I need to do some spring cleaning. I won't completely rule out malware either; checking for it is on my list of things to do. But the hibernate issue, plus the results of some of my diagnostic attempts, is what's really throwing me for a loop here.
The system hibernates successfully with no problems. However, it has some issues resuming. It typically gets to the first "Resuming Windows" screen, turns black, and then never recovers. Now, the interesting thing here is that I'm stubborn and refuse to delete restoration data when I do a hard reset - instead, I try again. And about 70% of the time it works on the second attempt, and in all remaining cases it works on the third!
Now, this struck me as strange - either the data was corrupt or it wasn't, I reasoned, so I started to suspect hardware issues. I cleaned up my boot drive to see whether low free space was the cause, but after getting to about 11GB free I was still suffering from the same problem.
It's worth noting that the issue doesn't happen on reboots, only cold boots, and it seems to be more prevalent when the ambient temperature is slightly lower - practically guaranteed at around 18C and almost unheard of at 25C. To further my tenuous temperature theory, the longer I let it pretend to resume the first time, the more likely it seems to work on the second attempt, which makes the issue seem time (and therefore temperature) related. Also, I run the PC with the side panel off - there's a side intake fan, so the panel sits nearby to help push some air onto the GPU and CPU, but that's nothing like having the panel actually on - there's a good 6-8cm of clearance at least.
I've had a couple of memory issues in recent weeks with Civ 5 - it tells me that it can't access memory at a given location - so I ran a memory test (admittedly, just using Microsoft's tool, accessible using F8 while booting). I set it to do an extended test, and it got through 6.5 cycles with no errors before I decided to try testing something else.
I got OCCT and tested the GPU (I didn't suspect it at all, but I wanted to rule it out) - no errors on a 30-minute test, with reasonable and stable temps. Next I tested the CPU, and this led me to my current mobo conclusion.
You see, the CPU test (Linpack) can't make it past 4 minutes - which sounds like a CPU issue, except that the error that stops it is "CPU Temperature", which is reading 96C! From a bit of googling, it looks like the max operating temperature of this CPU is rated at around 62C - and I'm getting idle temps reported at 57C. I'm fairly certain (and most of the Google results agree) that if 3.5 minutes of load really pulled my core temps up to 96C, my computer would be crashing. Considering I put in marathon sessions of Civ 5, sometimes 10-14 hour days on the weekends, I find it hard to believe my CPU is running this hot without giving me MAJOR stability issues. Switching between the different monitoring options in OCCT doesn't change the reported temperatures by more than 1C.
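To spell out why those numbers don't add up for me, here's a quick Python sketch of the sanity check I'm doing in my head. It's only an illustration: the function name, the 10C idle margin, and the 62C figure (which is just what I found googling, not a confirmed spec value) are all my own assumptions.

```python
# Rough plausibility check for CPU temperature readings.
# RATED_MAX_C is the figure I found by googling, NOT a confirmed spec value,
# and the 10C idle margin is an arbitrary assumption for illustration.
RATED_MAX_C = 62.0
IDLE_MARGIN_C = 10.0


def assess_readings(idle_c, load_c, crashes_under_load, rated_max_c=RATED_MAX_C):
    """Return a list of notes about how believable the reported temps are."""
    notes = []
    if idle_c > rated_max_c - IDLE_MARGIN_C:
        notes.append("idle reading is within 10C of the rated max")
    if load_c > rated_max_c:
        over = load_c - rated_max_c
        notes.append("load reading exceeds the rated max by %.0fC" % over)
        if not crashes_under_load:
            # Running that far over the limit without instability suggests
            # the reading itself (sensor or readout) may be wrong.
            notes.append("no crashes despite that, so the reading is suspect")
    return notes


# My reported numbers: 57C idle, 96C under Linpack, no crashes in long sessions.
for note in assess_readings(idle_c=57.0, load_c=96.0, crashes_under_load=False):
    print(note)
```

With my readings, all three notes fire, which is why I suspect the temperature reporting rather than the CPU itself.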
So, I'm beginning to think the mobo is on the fritz. It seems like the only part of the PC which might be affected so thoroughly by a small change in ambient temperature, and it could definitely explain erroneous temperature readings. It might also explain Civ 5's occasional memory access errors (which I can't find any reference to elsewhere on the web).
I think my mobo's dying. How do I confirm? And if it is, what are my options for replacing it? I used to treat mobo replacement as an excuse to replace the processor as well, but is that actually necessary? Do any other components need to go with it?