System Hard Locking Twice a Week

Nazrac

Hey guys, hoping someone might have a moment to help me track down the root of this issue. Over the past three months, my computer has hard locked (typically looping the last couple of seconds of audio, though not always) an average of twice a week. I have tried to track down the issue, initially blaming the fan controller on my video card, but so far I have not been able to find the root cause. Any ideas you can suggest would be greatly appreciated.

System Info:

  • CPU- Pentium Dual Core E5200 Wolfdale
  • Motherboard- Gigabyte EP45-UD3LR, flashed to the latest BIOS version
  • Video Card- Radeon HD 5830
  • Power Supply- XFX XXX edition 650 watt (XFX units are rebranded Seasonic power supplies)
  • Memory- 4 GB G.Skill
  • Sound- 1 x Creative 70SB073A00000 7.1 Channels PCI Interface Sound Blaster X-Fi XtremeGamer
  • Cooling- case is Cooler Master HAF92. CPU cooler is a Xigmatek design.
  • The system is not currently overclocked in any way.
  • BIOS settings are set to safe (default) settings.
  • Operating system = Windows 7 x64
  • Currently five internal hard drives, two of which are split into two partitions.

Lock up occurrences:

  • All but two lockups have occurred while playing one particular game (Diablo 3, Blizzard Entertainment). If you Google "hard lock" along with the game you will get hits, but that is somewhat expected with the game selling over 10 million copies. I have had the issue twice outside of the game, so I don't believe the game is solely at fault.
  • 1 lockup occurred while streaming a TV show from hulu.com.
  • 1 lockup occurred in an OCCT power supply test (this occurred after I swapped video cards, as detailed below; once I returned to my own video card, the test could run for an hour without issue).

Test Info:

  • I can run the OCCT CPU, GPU, and power supply stress tests for one hour each (large data set) without issue. Is there any benefit to running a test longer and/or using small/mid data sets?
  • Intel CPU tester passes all tests.
  • I can run SuperPi to 64M decimals, 32 reps, without issue.
  • Virus and malware scans all come back clean.
  • The thermal sensors on my CPU are stuck at 57* C but kick in once the cores exceed this temp. During the stress tests the highest they reach is 61* C.
  • With SpeedFan used to set the GPU fan to 100%, the video card never goes above 59* C. Without SpeedFan controlling the fan, the GPU hits 90* C.
  • There are no minidump files (I would assume because it is a hard lock and not a crash).
  • System logs only record event 41, unexpected system shutdown. Again, I didn't expect anything useful, since it is a hard lock and not a crash.

Steps taken:

  • I used SpeedFan because I initially noticed the GPU fan never speeding up and thought the card might handle heat poorly. I set the GPU fan to 100%, which lowered the temp from 90* at full load to 58* at full load without overheating any other part of my system (SpeedFan reports all hard drives around 32* and the CPU below the stuck threshold of 57*). The issue remains, and I don't believe 90* at full load is a killer.
  • I swapped out my video card for an Nvidia 8800 GT. I completely removed all ATI drivers and apps and installed updated Nvidia drivers. Less power draw, plus a different vendor. The issue remained, and with this card the OCCT power supply test caused a hard lock in about 27 minutes. I have since swapped back to my ATI card and all OCCT tests complete without issue.
  • At this point I was looking at my sound card, as Creative doesn't have the best reputation for their drivers. I had a copy of Windows 7 x86 lying around, so I did a clean install of it, on a clean drive, as a dual-boot option. The only thing I installed in this installation of Windows was the newest ATI video driver (driver only, no Catalyst Control Center, no HDMI audio drivers). The issue seemed fixed, with four days without a lockup, until today, when I once again hit a hard lock while playing Diablo 3. This would seem to rule out software and most driver conflicts, as it was a clean install of Windows. I did point to the existing install of Diablo 3 rather than doing a clean install of the game.

Any ideas? I can't answer much while at work (Eastern time zone), but I will do all I can to provide any information that might help track this issue down. Thanks for any follow-up you might be able to provide.
 
I think this is a good guide to follow. I would skip steps one and six. At the end it asks you to install a hard drive with a known-good OS. You'll want to back up any user data beforehand.

Troubleshooting lockup and hang issues - Desktops

Thanks for the link.

2, 3. BIOS is already flashed to the latest version, and malware and virus scans have been coming up clean (AVG and Malwarebytes).
4. Not sure what this refers to? No unrecognized devices in Device Manager.
5. As noted, I tried a clean install of Windows 7 x86 on a clean drive. I don't believe a system restore on my main Windows install could provide anything further (please correct me if I am missing something).
6. Memory is correctly recognized as 4 GB, and video RAM is correctly listed as 1 GB. I have not changed the memory configuration for a couple of years.
7. Temps from every monitoring system available all show uniformly low heat values. HSF920 uses 240 fans and there are no obstructions.
8. Have not attempted. May be something to try, though won't this simply reset my BIOS settings to default? If so, I have already done that through the BIOS.
9. I have reseated everything and all appears solid.
10. I can disconnect a few hard drives, the sound card, and the 2nd monitor and see if there is any change.
11. Drivers are kept up to date in my normal Windows install. In the clean install of Windows they are only up to date if found on Windows Update (the exception is the video card drivers, which I fully updated).
12. This was done as detailed above: a clean install of Windows 7 x86 on a clean drive as a dual-boot option. The only change I made was updating to the newest video drivers.
 
Update on some of the common questions I am getting on the D3 tech support board:

I have been running dual monitors to watch the temps during play recently. The CPU never gets above 58*. Truthfully, I'm not sure what the GPU will hit in game if I don't use SpeedFan to set the fan speed (it hits 90* during stress tests without ever having an issue, and I would assume less during D3, as it is not as demanding). If I use SpeedFan to set the GPU fan speed, it never goes above 59*, and moving that heat does not cause any of my other temp sensors to spike (hard drives between 29 and 32* C). Since it is a lock and not a crash, I can even see the temps it locks at: the last lockup was at 56* on both CPU cores, 57* on the GPU, and 32, 31, 31, 30, 29 on the hard drives, and the other remote sensors all showed as cool. All indications are that it is not heat related. I specifically chose my CPU cooler and case to allow myself to overclock without too much concern about heat, so it shouldn't be struggling now with nothing overclocked. This is also in an air-conditioned environment.

The power supply is a Seasonic with plenty of headroom (650 watts) and solid amps on all rails. I would assume the targeted power supply stress test puts the power draw well above what D3 does, and it runs without any issue (no rails show out-of-line fluctuation during the tests). I would expect memory errors to show up during Memtest86+, one of the stress tests, or SuperPi, all of which run rock solid.

If the issue were hardware, which the symptoms do point to, I wouldn't expect the lockups to occur in D3 97% of the time (this is very disproportionate even before you consider that I spend more time out of game than in). I play a lot of other games, most of them more demanding than D3, and have not encountered a single crash in another game over this time period (3+ months; I was in the beta). With that said, if it were software/driver conflicts, I would expect the clean install of Windows 7 x86 to have fixed the issue.

So far, in one week of only running the game on the clean install of Windows, I have had only 1 lockup, which is better than my twice-a-week average in my main install but still within the margin of error (it might lock up tonight and put me back on track). I will provide updates as I get a larger sample size, though once a week is still not acceptable.
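
As a rough check on that margin of error, here is a minimal sketch. It assumes lockups arrive independently at a steady average rate (a Poisson model, which is an assumption for illustration and not something measured in this thread) and asks how often a "good" week would show up by chance at the old rate of two per week.

```python
import math

def poisson_cdf(k, rate):
    """Probability of seeing k or fewer events when `rate` events are expected."""
    return sum(math.exp(-rate) * rate**i / math.factorial(i) for i in range(k + 1))

# Average lockups per week on the main install, taken from the post.
BASELINE_PER_WEEK = 2.0

# Chance of one or zero lockups in a single week even if nothing has changed.
p_one_or_fewer = poisson_cdf(1, BASELINE_PER_WEEK)
# Chance of a completely clean week even if nothing has changed.
p_clean_week = poisson_cdf(0, BASELINE_PER_WEEK)

print(f"P(<=1 lockup in a week) = {p_one_or_fewer:.3f}")
print(f"P(0 lockups in a week)  = {p_clean_week:.3f}")
```

Under that assumption, a one-lockup week would happen roughly 40% of the time with no fix at all, and even a fully clean week about 14% of the time, so a single week is not much evidence either way.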
 
On Aug 21 Blizzard released patch 1.0.4 and AMD released new video drivers. Since these updates I have had a solid week of activity without a single crash. I'm not convinced yet that my issue is resolved, but this is the best stability I have had for a while. Will update if the locks reemerge.

**Test results are currently limited to the clean x86 install of Windows. If I make it to two weeks without a lock, I will attempt going back to my x64 install. It would be nice to have sound again.
 
So since Aug 21st I have had two crashes, below what I was averaging before but still unacceptable for the work I do. Here are the details:

1 Lock up occurred in my clean x86 install in game.
1 Crash occurred in my full x64 install while streaming a youtube video.

The last occurrence was interesting. At first it looked like my typical situation, with the system appearing to hard lock. However, after about 15 seconds I got an audio loop, and about 6 seconds after that I actually got a blue screen, which gives me hope, as this means a minidump file.

Need help here. I brought the minidump file into work this morning to look it over, and when I open it I receive 15 blocks all telling me that I specified an unqualified symbol. Is this a result of attempting to open the dump file from a different computer? If so, I will wait until I get home and check on my own computer. Either way, let me know and I can post the entire file. The bottom of the log does state:

Probably caused by : GenuineIntel

Followup: MachineOwner

Not really helpful to me, unless it is simply pointing to a processor issue (I would guess some of the chipsets on my motherboard could also be tied to Intel? Maybe not). Can anyone shed some better light on this?
 
It appears that I need to set the symbol path to:
SRV*C:\Symbols*http://msdl.microsoft.com/download/symbols

I did this and now I receive the following:

Microsoft (R) Windows Debugger Version 6.2.9200.16384 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.


Loading Dump File [\Minidump\090612-64600-01.dmp]
Mini Kernel Dump File: Only registers and stack trace are available

Symbol search path is: SRV*C:\Symbols*http://msdl.microsoft.com/download/symbols
Executable search path is:
Windows 7 Kernel Version 7601 (Service Pack 1) MP (2 procs) Free x64
Product: WinNt, suite: TerminalServer SingleUserTS Personal
Built by: 7601.17835.amd64fre.win7sp1_gdr.120503-2030
Machine Name:
Kernel base = 0xfffff800`03063000 PsLoadedModuleList = 0xfffff800`032a7670
Debug session time: Thu Sep 6 21:58:19.361 2012 (UTC - 4:00)
System Uptime: 1 days 2:53:10.016
Loading Kernel Symbols
...............................................................
................................................................
..............................................
Loading User Symbols
Loading unloaded module list
........
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck 124, {0, fffffa8004a3a038, f2000040, 800}

*** WARNING: Unable to verify timestamp for win32k.sys
*** ERROR: Module load completed but symbols could not be loaded for win32k.sys
Probably caused by : GenuineIntel

Followup: MachineOwner
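
For anyone repeating this on another machine, the symbol path can also be passed on the command line instead of being set inside the debugger. The sketch below only builds the command; it assumes the console debugger `kd.exe` from the Debugging Tools for Windows is on the PATH (an assumption about the local setup), and the helper name `build_kd_command` is made up for illustration. The dump path is copied as it appears in the log above.

```python
# Symbol path quoted in the post above.
SYMBOL_PATH = r"SRV*C:\Symbols*http://msdl.microsoft.com/download/symbols"

def build_kd_command(dump_file, symbol_path=SYMBOL_PATH, debugger="kd.exe"):
    """Build a kd command line: -z opens a dump file, -y sets the symbol
    path, and -c runs the given debugger commands once the dump is loaded."""
    return [debugger, "-z", dump_file, "-y", symbol_path, "-c", "!analyze -v; q"]

# Dump file name as shown in the debugger output (path left as it appears there).
cmd = build_kd_command(r"\Minidump\090612-64600-01.dmp")
print(" ".join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd)` on a machine that has the debugger installed.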
 
Detailed Info:

1: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

WHEA_UNCORRECTABLE_ERROR (124)
A fatal hardware error has occurred. Parameter 1 identifies the type of error
source that reported the error. Parameter 2 holds the address of the
WHEA_ERROR_RECORD structure that describes the error conditon.
Arguments:
Arg1: 0000000000000000, Machine Check Exception
Arg2: fffffa8004a3a038, Address of the WHEA_ERROR_RECORD structure.
Arg3: 00000000f2000040, High order 32-bits of the MCi_STATUS value.
Arg4: 0000000000000800, Low order 32-bits of the MCi_STATUS value.

Debugging Details:
------------------


BUGCHECK_STR: 0x124_GenuineIntel

CUSTOMER_CRASH_COUNT: 1

DEFAULT_BUCKET_ID: WIN7_DRIVER_FAULT

PROCESS_NAME: System

CURRENT_IRQL: f

STACK_TEXT:
fffff880`009f09d8 fffff800`0302ca3b : 00000000`00000124 00000000`00000000 fffffa80`04a3a038 00000000`f2000040 : nt!KeBugCheckEx
fffff880`009f09e0 fffff800`031efb03 : 00000000`00000001 fffffa80`04a3aea0 00000000`00000000 fffffa80`04a3aef0 : hal!HalBugCheckSystem+0x1e3
fffff880`009f0a20 fffff800`0302c700 : 00000000`00000728 fffffa80`04a3aea0 fffff880`009f0db0 fffff880`009f0d00 : nt!WheaReportHwError+0x263
fffff880`009f0a80 fffff800`0302c052 : fffffa80`04a3aea0 fffff880`009f0db0 fffffa80`04a3aea0 00000000`00000000 : hal!HalpMcaReportError+0x4c
fffff880`009f0bd0 fffff800`0302bf0d : 00000000`00000002 00000000`00000001 fffff880`009f0e30 00000000`00000000 : hal!HalpMceHandler+0x9e
fffff880`009f0c10 fffff800`0301fe88 : fffffa80`04cee1a0 fffffa80`036c6780 00000000`00000000 00000000`00000000 : hal!HalpMceHandlerWithRendezvous+0x55
fffff880`009f0c40 fffff800`030e0aac : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : hal!HalHandleMcheck+0x40
fffff880`009f0c70 fffff800`030e0913 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KxMcheckAbort+0x6c
fffff880`009f0db0 fffff880`066c9b6e : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiMcheckAbort+0x153
fffff880`0311b0b8 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : usbehci!EHCI_InterruptService+0x16


STACK_COMMAND: kb

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: GenuineIntel

IMAGE_NAME: GenuineIntel

DEBUG_FLR_IMAGE_TIMESTAMP: 0

FAILURE_BUCKET_ID: X64_0x124_GenuineIntel_PROCESSOR_BUS

BUCKET_ID: X64_0x124_GenuineIntel_PROCESSOR_BUS

Followup: MachineOwner
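
Arg3 and Arg4 above are the two halves of the machine check status register, so they can be decoded a bit further. The following is a minimal sketch, assuming the architectural bit layout Intel documents for IA32_MCi_STATUS (flag bits 63 to 57, MCA error code in bits 15:0, with bus and interconnect errors encoded as 0000 1PPT RRRR IILL). The model-specific bits, such as the 0x40 in the high word, are deliberately left undecoded, and the function names are made up for illustration.

```python
# Arg3 and Arg4 from the bugcheck above: the two halves of MCi_STATUS.
ARG3_HIGH = 0xF2000040
ARG4_LOW = 0x00000800
STATUS = (ARG3_HIGH << 32) | ARG4_LOW

# Architectural flag bits of IA32_MCi_STATUS (layout assumed from Intel's manual).
FLAG_BITS = {63: "VAL", 62: "OVER", 61: "UC", 60: "EN",
             59: "MISCV", 58: "ADDRV", 57: "PCC"}

PARTICIPATION = ["SRC (local processor originated the request)",
                 "RES (local processor responded to the request)",
                 "OBS (local processor observed the error as a third party)",
                 "generic"]
REQUEST = {0: "ERR (generic error)", 1: "RD", 2: "WR", 3: "DRD", 4: "DWR",
           5: "IRD", 6: "PREFETCH", 7: "EVICT", 8: "SNOOP"}
MEMORY_OR_IO = ["M (memory access)", "reserved", "IO (I/O access)",
                "other transaction"]
LEVEL = ["L0", "L1", "L2", "LG (generic)"]

def set_flags(status):
    """Return the names of the flag bits that are set, highest bit first."""
    return [name for bit, name in FLAG_BITS.items() if (status >> bit) & 1]

def decode_mca_code(code):
    """Decode a 16-bit MCA error code if it is a bus/interconnect error."""
    # Bus and interconnect errors have the form 0000 1PPT RRRR IILL.
    if code & 0xF800 != 0x0800:
        return {"kind": "not decoded by this sketch"}
    return {
        "kind": "bus/interconnect",
        "participation": PARTICIPATION[(code >> 9) & 0b11],
        "timeout": bool((code >> 8) & 1),
        "request": REQUEST.get((code >> 4) & 0xF, "unknown"),
        "memory_or_io": MEMORY_OR_IO[(code >> 2) & 0b11],
        "level": LEVEL[code & 0b11],
    }

flags = set_flags(STATUS)
decoded = decode_mca_code(STATUS & 0xFFFF)
print("flags:", ", ".join(flags))
for key, value in decoded.items():
    print(f"{key}: {value}")
```

With these arguments the flags come out as VAL, OVER, UC, EN and PCC (a valid, uncorrected error with processor context corrupt, no address or misc register recorded), and the error code decodes as a generic bus/interconnect error, which lines up with the PROCESSOR_BUS bucket in the analysis. That narrows things to the CPU, its bus, or whatever feeds them (motherboard, voltage), but it does not by itself single out a part.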
 