Showing posts with label memory. Show all posts

Sunday, April 8, 2012

Windows 7: BSOD: Errors 116 & 119: Interpreting the Minidump or Kernel Dump File

I had been having Blue Screen of Death (BSOD) crashes.  These were happening on one machine and not the other.  This was odd; both machines had virtually identical Windows 7 installations.  They also had the same motherboards and same amounts and kinds of RAM.  This post is a continuation in the effort to figure out why.

Given the similarities between the computers, I suspected the crashes were due to software.  Although the Windows installations were virtually identical, I was not always using exactly the same programs on both machines.  There was also a possibility that a CPU upgrade was responsible for a new bout of crashes:  both machines had previously had the same processors, but I had just installed a faster one on the crashing machine, and it had just begun crashing again.

In the previous episode, I had used BlueScreenView but had not known how to interpret its reports.  More accurately, I had not known how to interpret the minidump reports, viewable in BlueScreenView, that Windows would produce during a BSOD.  I wanted to be able to understand what the minidump file was telling me.

Understanding the minidump seemed especially important this time because, unlike the last time, the BSOD was not pausing onscreen long enough for me to see what it said.  It was flashing by so quickly that I just caught a glimpse of blue and then the machine was rebooting.  I recalled that I had seen, somewhere, a setting that would prevent that from happening.  Eventually I found it:  Start > Run > SystemPropertiesAdvanced.exe (or Control Panel > System > Advanced tab) > Startup and Recovery Settings > uncheck Automatically restart.  At present, my other settings there were for "Write an event to the system log," "Kernel memory dump" (not "None" or "Small memory dump (256KB)"), "Dump file = %SystemRoot%\MEMORY.DMP," and "Overwrite any existing file" was selected.  I wasn't sure if those were the right settings; that's just what I had.  One source told me that I would want to overwrite because the MEMORY.DMP file would eat up lots of disk space.  With these settings, I would have a minidump for every crash and a MEMORY.DMP for only the most recent crash.  So then I clicked OK and got this message:

System Properties

Windows might not be able to record details that could help identify system errors because your current paging file is disabled or less than 800 megabytes.  Click OK to return to the Virtual Memory settings window, enable the paging file, and set the size to a value over 800 megabytes, or click Cancel to change your memory dump selection.
What I inferred from this was that I could opt for the small memory dump with my existing settings, or else I would have to change the paging file settings.  Right there in the Advanced tab, I went to Performance Settings > Advanced tab > Virtual memory > Change.  I had a 16MB paging file on drive C and a minimum 2GB paging file on another drive.  Apparently the kernel dump needed at least an 800MB paging file on drive C.

Since at least the days of Windows XP, I had emphasized putting the bulk of the paging file on another drive, in the belief that this would enhance performance.  A search now led to the suggestion that, especially on a machine with substantial RAM, I would rarely if ever run out of RAM and actually use the paging file.  On the other hand, a different post in that same thread quoted Microsoft as saying that paging files are used often and should be located on fast (and, obviously, uncompressed) drives if available.  A quick look at pagefile.sys on the second drive indicated that it was presently at the minimum 2GB size I had set for it, on a system with 12GB RAM.  So it seemed that advice to make the paging file half as large as RAM, or twice as large, or some other similar value, might significantly overstate how large a paging file I would actually need.

There had long been warnings that setting the minimum size too low would impose at least a slight performance hit, because Windows would have to dynamically resize the pagefile if it needed more space; but I thought that saving or otherwise manipulating a larger file might also cause a slowdown.  I concluded that the paging file probably was not being used often, that I didn't want to preallocate space that I might need for some other purpose in a pinch, that a fixed larger size could have its own drawbacks (including being inadvertently saved in a drive image), and therefore I should set the paging files on both drive C and the other drive to the System Managed Size > Set option.
After a reboot, I saw that the memory dump settings were as I had left them and the paging file size (with a full set of programs loaded) was 18427MB recommended and 24571MB currently allocated, or about 150% and 200%, respectively, of installed RAM.
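
For future reference, the Startup and Recovery choices described above are, as far as I know, stored as values under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CrashControl.  The following Python sketch just translates a set of those values into plain English.  The sample dictionary is my assumption of what the settings described above would look like in the registry, not something read from the machine, and the function name is made up for illustration.

```python
# Meaning of the CrashDumpEnabled DWORD on Windows 7, as I understand Microsoft's
# documentation of the CrashControl key (an assumption; not checked on this machine).
DUMP_TYPES = {
    0: "None",
    1: "Complete memory dump",
    2: "Kernel memory dump",
    3: "Small memory dump (minidump)",
}


def describe_crash_control(values):
    """Turn a dict of CrashControl registry values into readable lines."""
    dump_type = DUMP_TYPES.get(values.get("CrashDumpEnabled"), "Unknown")
    return [
        "Automatically restart: " + ("yes" if values.get("AutoReboot") else "no"),
        "Write an event to the system log: " + ("yes" if values.get("LogEvent") else "no"),
        "Dump type: " + dump_type,
        "Dump file: " + str(values.get("DumpFile", "(not set)")),
        "Overwrite any existing file: " + ("yes" if values.get("Overwrite") else "no"),
    ]


# Hypothetical values matching the settings described in this post.
sample = {
    "AutoReboot": 0,
    "LogEvent": 1,
    "CrashDumpEnabled": 2,
    "DumpFile": r"%SystemRoot%\MEMORY.DMP",
    "Overwrite": 1,
}

for line in describe_crash_control(sample):
    print(line)
```

On a real system the dictionary could be filled in by hand from regedit; the point is only to show which setting corresponds to which value.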

One thing still on the burner was the indication, picked up from somewhere, that maybe I should be looking into the Windows Event Viewer (Start > Run > eventvwr.msc).  It seemed that Event Viewer was an alternative to BlueScreenView, so I wasn't sure I needed it.  Another recommended approach was to start by looking at the minidump to find the BCCode or STOP code, the cause, and the time when it happened.  I could see that BlueScreenView was showing me the Crash Time, the Bug Check Code, and a Caused by Driver column of information.  I didn't see a column for STOP codes.  I went into View > Choose Columns and saw that there wasn't even a column for STOP codes.  I had forgotten that Bug Check and STOP codes were synonyms.  Looking again, I saw that the three .dmp files shown in BlueScreenView all displayed Bug Check Codes of 0x00000116.  The "Caused by Driver" column listed three different drivers, highlighted in the lower pane, but what was this bug check code telling me?  Microsoft's Bug Check Code Reference said that Bug Check 0x116 was VIDEO_TDR_ERROR.  The detailed description said, "This indicates that an attempt to reset the display driver and recover from a timeout failed."  (Later, I saw a suggestion that I would have found FaultWire more informative.  For this particular error code, I examined the suggestions below.)
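
Since the same code can be written as 0x00000116, 0x116, or just 116, here is a minimal Python sketch of how the different spellings reduce to one number.  The lookup table contains only the two codes that appear in this post (the second one turns up further down), with the names as I found them; the function names are my own, and Microsoft's Bug Check Code Reference remains the real authority.

```python
# Names as given in this post; Microsoft's Bug Check Code Reference is the authority.
BUG_CHECK_NAMES = {
    0x116: "VIDEO_TDR_ERROR",
    0x119: "VIDEO_SCHEDULER_INTERNAL_ERROR",
}


def parse_bug_check(text):
    """Accept '0x00000116', '0x116', or '116' and return the code as an int."""
    cleaned = text.strip().lower()
    if cleaned.startswith("0x"):
        cleaned = cleaned[2:]
    return int(cleaned, 16)  # STOP codes are hexadecimal


def bug_check_name(text):
    """Look up the symbolic name for a STOP / Bug Check code string."""
    code = parse_bug_check(text)
    return BUG_CHECK_NAMES.get(code, "unknown (look it up in the reference)")


print(bug_check_name("0x00000116"))
print(bug_check_name("119"))
```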

So that was interesting.  It wasn't the CPU; it was the relatively new video card, an MSI R6570-MD2GD3 LP Radeon HD 6570 2GB.  I'd had it for a few weeks.  It seemed to me that the crashes were happening especially when I was running the Opera browser.  I couldn't make anything of the parameter information provided in the Bug Check Code Reference and listed in BlueScreenView, but I did do quick searches for the three drivers that were listed, for the three .dmp files shown in BlueScreenView:  pacer.sys, atikmpag.sys, and discache.sys.  Nothing jumped out at me for the other two, but I had seen pop-up dialogs referring to atikmpag when running Opera, and now it appeared that atikmpag.sys BSODs were related to video hardware problems (e.g., having a video card in the wrong slot). A right-click in Control Panel > Device Manager > Display Adapters indicated that I was already using the latest driver for the video card, and Opera said I was using the latest version.  Possibly this was happening only when Opera was overloaded:  I usually had a bunch of tabs open.  I decided to try the approach of killing Opera as soon as an atikmpag dialog popped up.  But the next crash wasn't due to Opera -- it wasn't running at the time -- so this was more like background information for the time being.  The next several runs of Opera produced no crashes, so possibly one or more of the steps taken here solved the problem.

Previously, I had gotten minidumps after an indication that my dump file size (presumably meaning my pagefile) was too small. Now that that was no longer a problem, I believed I could expect to see full kernel dumps instead of minidumps. I shelved my budding search for guidance on interpreting minidumps, to wait and see what I would get next.  After the next crash, I did have both a new minidump visible in BlueScreenView and a full MEMORY.DMP file in C:\Windows.  I wasn't sure how to view the MEMORY.DMP, so I ran a search and saw two options.  One was to upload the .DMP as an attachment to a request for help (at e.g., SevenForums.com).  I had a slow connection and my .DMP was about 1GB, so the recommended alternative (in an ExpertsExchange post) was to use Microsoft Debugging Tools for Windows.  (I had learned that I didn't have to pay to see the answers provided in ExpertsExchange.com threads; I just had to scroll to the bottom of the screen.)  The solution seemed to be to download the Windows SDK for Windows 7.  This gave me winsdk_web.exe.  That turned out to be a 2.5GB download that would require 4.5GB when installed.  I looked at my notes from the last time I flirted with the SDK.  I had apparently downloaded more than necessary; I was now seeing advice to download only the Debugging Tools for Windows.  (In my version of winsdk_web.exe, these were under Developer Tools, not under Common Utilities.)  This would be a 177MB download requiring 419MB when installed.  It downloaded and installed directly; it didn't give me an option of saving the download for future reinstallation.  It did not seem that it had actually downloaded 177MB, though; it was done in just a few minutes, and that would not have happened on my slow connection.

While that process was unfolding, I cleaned up the following notes that I had accumulated in the meantime; this post returns eventually (below) to the topic of using the SDK to read MEMORY.DMP.

One such miscellaneous note:  I saw a webpage on which Microsoft suggested two different sequences of steps, depending on whether Windows would start or not.  Since Windows was starting for me, their suggestion was, first, to undo recent changes using System Restore.  I had been having this problem for several days, past my most recent restore point.  Besides, by this point I believed I had traced the problem to the video hardware and/or Opera.  So the next step was to consult Control Panel > Action Center for clues.  Nothing there.  Next, make sure I was current on Control Panel > Windows Update.  I had already done that.

The next step recommended by Microsoft was to search for drivers on the manufacturer's website.  Well, I hadn't done that, not exactly.  I had relied on Device Manager, but now I went to the webpage of the video card manufacturer.  To do that, I started with GPU-Z (similar to CPU-Z).  I discovered that I had to choose the Install rather than the Portable option:  the latter would make GPU-Z not only uninstalled but uninstallable on that machine.  Fortunately, I learned this on the machine that I was not trying to diagnose.  On the machine being examined, GPU-Z ran, and it gave me lots of information, but it didn't give me any more manufacturer information than I had gotten from Device Manager:  I was being lazy, but now I saw that I had an AMD Radeon HD 6570.  For that purpose, System Information for Windows (SIW) was a competent alternative.  To get the actual manufacturer information, it seemed I had no alternative but to consult my receipt, or the box that the video card came in.  Oddly, according to Device Manager, SIW, and GPU-Z, the driver I had installed was actually newer than the latest one on the manufacturer's webpage.  I decided to try the Roll Back Driver option in Device Manager.  That put it back to a driver dated about four months earlier.  I hadn't actually installed that older driver, to my knowledge; evidently Windows downloaded and installed the older driver automatically.  So I would have to see if that fixed the problem.  And in the long haul, that was one possible reason for the reduction in BSODs that I would experience in coming days.

In the meantime, the next step recommended by Microsoft was to use Safe Mode to troubleshoot problems.  They explained how to get into Safe Mode, but not what to do once I was there.  One possible intention was that I would load safe mode without startup programs that might be causing the problem.  A clean boot could be helpful at times, but did not seem highly relevant to the kind of crash I was having.  My crashes could occur after hours of operation.  Microsoft's final suggestion was to check for hard drive and memory errors.  I had recently run Windows Explorer > right-click on a drive > Properties > Tools > Check Now > check both options, and had also run MemTest86+.  These did not appear to be the problem in this case.

FaultWire offered other suggestions specifically oriented toward error 116.  The problem, they felt, was probably either in the driver or in hardware that was either defective or improperly installed.  On the video driver side, they suggested using their own commercial (nonfree) Driver Genius or Radar Sync to verify that I had the latest drivers, assuming I hadn't been comfortable with a direct search of the manufacturer's site.  On the hardware diagnostic side, they pointed me toward their Fix-It Utilities and System Suite, and also toward Eurosoft's PC Check and Iolo's System Mechanic (all commercial).  They also suggested checking the Windows 7 compatibility list.

I did get another BSOD, within a day or two, but this time the error was different.  The number was 119 and the message was, "The video scheduler has encountered an unexpected fatal error."  I got it while running the Windows Experience Index test, so in that sense it seemed to be provoked by demanding use, as when Opera had been overloaded (above).  FaultWire had nothing new to add to what it had already said for error 116:  check the drivers, consider faulty hardware or incorrect hardware installation.  I hadn't previously searched the Win7 Compatibility Center, but now I did, and saw that there was no entry for my particular graphics card.  It was an MSI card, and a search of the Compatibility Center for "MSI" by itself turned up over 800 items, so it's not as though the database was weak.  I had evidently just stumbled into a product that was not listed.  I wasn't sure if that meant it hadn't been checked, or if it had been checked and was definitely not compatible.  Either way, this now seemed like something that I obviously should have checked before -- "obvious" being the standard word for what we have learned about, after we have learned it (or re-learned it, as the case may be).  I checked the manufacturer's page for the video card.  I was not impressed with MSI's website in this regard:  searching did not find the product, and when I did finally drill my way down to it, I got a notice:  "The specifications may differ from areas."  Some kind of typo there, but apparently they sold different products under the same model name.  I emailed MSI customer service, to verify that I was understanding the compatibility situation correctly.  They said no, it definitely was compatible.

I tried running the Windows Experience Index again, several weeks later.  By that point, I had rolled back the driver and had taken most if not all of the other steps described above.  This time, it did not crash.  I had also had no further crashes, with Opera or otherwise, during those weeks.  It seemed the driver rollback may have been the solution.  Since the problem was evidently solved, the following notes are provided just for future reference.

By this point, I had installed the SDK (above).  This gave me a couple of folders (e.g., C:\Program Files\Debugging Tools for Windows) and a Start Menu shortcut for a folder called Microsoft Windows SDK v7.0.  Choosing Open from the context menu for that folder shortcut took me to the C:\ProgramData\Microsoft\Windows\Start Menu\Programs\Microsoft Windows SDK v7.0 folder.  There, I saw a shortcut for CMD Shell.  This opened up a command window.  It said, "The x64 compilers are not currently installed.  Please go to Add/Remove Programs to update your installation."  I went to Control Panel > Programs and Features > select Microsoft Windows SDK for Windows 7 (7.0) > click Change at the top of the list of programs there > Repair > Next.  But that didn't help.  I did a search and found that few people had had this problem.  My guess was that I got this message because I had installed only a fraction of the full contents of the SDK, and the solution was to install more of it, probably through that same Programs and Features route.  In that case, it seemed I might just ignore the message.

To use the SDK for reading MEMORY.DMP, Dirk Smith said I would actually run WinDbg.exe.  The link to this program (now located in C:\Program Files\Debugging Tools for Windows (x64)) had been installed in another Start Menu folder.  So evidently I was on the wrong track, when I opened the CMD Shell, or maybe WinDbg was just a front end for the command line.  Dirk said I needed to start by using WinDbg to find the proper symbol files.  This involved going into WinDbg > File > Symbol File Path.  There, I typed this:
srv*c:\cache*http://msdl.microsoft.com/download/symbols
Then I clicked OK.  Nothing seemed to happen.  But perhaps it was downloading the appropriate symbols quietly, which was what Dirk seemed to be saying.  The next step was apparently to go into WinDbg > File > Open Crash Dump > navigate to C:\Windows or wherever MEMORY.DMP was.  This got me a command window that seemed to hang, but apparently it was just figuring things out.  After a minute or two, it came back with errors:
Module load completed but symbols could not be loaded for atikmpag.sys.
Module load completed but symbols could not be loaded for atikmdag.sys.
Probably caused by:  dxgmms1.sys
Dirk said I could ignore the first two lines, but I wasn't so sure.  As noted above, an atikmpag file was named in one of my minidumps and I was seeing references to atikmpag in Opera.  He said I should focus on the last line, the reference to dxgmms1.sys.  That one hadn't been named in my minidumps.  Dirk told me to type "!analyze -v" (without quotes) in the command line at the bottom of the WinDbg screen.  That got me another error 119 message, and more besides:
*****************************************
*                                       *
*           Bugcheck Analysis           *
*                                       *
*****************************************

VIDEO_SCHEDULER_INTERNAL_ERROR (119)
The video scheduler has detected that fatal violation has occurred. This resulted
in a condition that video scheduler can no longer progress. Any other values after
parameter 1 must be individually examined according to the subtype.

Arguments:
Arg1: 0000000000000001, The driver has reported an invalid fence ID.
Arg2: 0000000000004362
Arg3: 0000000000004363
Arg4: 0000000000004363

Debugging Details:
------------------
DEFAULT_BUCKET_ID:  VISTA_DRIVER_FAULT
BUGCHECK_STR:  0x119
PROCESS_NAME:  System
CURRENT_IRQL:  a
LAST_CONTROL_TRANSFER:  from fffff880015e322f to fffff8000307ed40
STACK_TEXT: 
[displaying, here, only the right end of each line - RW]
nt!KeBugCheckEx
watchdog!WdLogEvent5+0x11b
dxgmms1!VidSchiVerifyDriverReportedFenceId+0xad
dxgmms1!VidSchDdiNotifyInterruptWorker+0x19d
dxgmms1!VidSchDdiNotifyInterrupt+0x9e
dxgkrnl!DxgNotifyInterruptCB+0x83
atikmpag+0x52dc
atikmdag+0x4f526
atikmdag+0x4d479
atikmdag+0x62070
atikmdag+0xfb298
atikmdag+0x1015de
atikmdag+0x10161d
atikmdag+0x101714
atikmdag+0x101845
atikmdag+0x108d7b
atikmdag+0xfa0dc
atikmdag+0x4d15f
atikmpag+0x5ddb
nt!KiInterruptDispatch+0x16c
amdppm!C1Halt+0x2
nt!PoIdle+0x52a
nt!KiIdleLoop+0x2c

STACK_COMMAND:  kb
FOLLOWUP_IP:
dxgmms1!VidSchiVerifyDriverReportedFenceId+ad
fffff880`053b9eb9 c744244053eeffff mov     dword ptr [rsp+40h],0FFFFEE53h
SYMBOL_STACK_INDEX:  2
SYMBOL_NAME:  dxgmms1!VidSchiVerifyDriverReportedFenceId+ad
FOLLOWUP_NAME:  MachineOwner
MODULE_NAME: dxgmms1
IMAGE_NAME:  dxgmms1.sys
DEBUG_FLR_IMAGE_TIMESTAMP:  4ce799c1
FAILURE_BUCKET_ID:  X64_0x119_dxgmms1!VidSchiVerifyDriverReportedFenceId+ad
BUCKET_ID:  X64_0x119_dxgmms1!VidSchiVerifyDriverReportedFenceId+ad
Followup: MachineOwner
Dirk said the right ends of the STACK_TEXT lines were important for identifying third-party drivers.  Atikmpag and atikmdag were prominent there, just before (i.e., below) the dxgmms1 lines.  Anyway, the next step was to type "lmv" into the WinDbg command line.  This command listed, with details, the modules (drivers and other program files) that were loaded when Windows crashed.  As instructed, I searched this pile of information (using Ctrl-F) for the "probably caused by" item, which in my case (above) was dxgmms1.sys.  That search (with variations) found nothing.  I copied and pasted the WinDbg output into Notepad and tried my search there.  This time, it worked.  I tried it again in WinDbg, and this time it worked there too.  Not sure what I had done wrong the first time.  It seems the purpose of this step was to verify the manufacturer of the problematic file.  It looked like dxgmms1.sys came from Microsoft.  But if that Microsoft file had been the source of the problem, wouldn't I have been having these crashes before I installed the new video card?  WinDbg was showing me that the source of atikmdag.sys was AMD.  As Dirk said, Windows itself (i.e., Microsoft) was probably not the culprit.
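
To illustrate the idea of reading the right ends of the STACK_TEXT lines, here is a rough Python sketch.  It relies on a heuristic of my own, not anything Dirk or Microsoft stated:  frames that show only "module+offset", with no "module!function" name, are the ones for which WinDbg had no symbols, and in this particular dump those happen to be the third-party AMD drivers.  The list is a shortened copy of the stack above, and the function names are made up.

```python
# Right-hand ends of a few STACK_TEXT lines, copied from the WinDbg output in this post.
STACK_LINES = [
    "nt!KeBugCheckEx",
    "watchdog!WdLogEvent5+0x11b",
    "dxgmms1!VidSchiVerifyDriverReportedFenceId+0xad",
    "dxgkrnl!DxgNotifyInterruptCB+0x83",
    "atikmpag+0x52dc",
    "atikmdag+0x4f526",
    "nt!KiInterruptDispatch+0x16c",
]


def module_of(frame):
    """Return the module part of a frame such as 'nt!PoIdle+0x52a' or 'atikmdag+0x4f526'."""
    return frame.split("!")[0].split("+")[0].strip()


def modules_without_symbols(frames):
    """Modules whose frames carry no 'module!function' name, in order of first appearance."""
    seen = []
    for frame in frames:
        if "!" not in frame:
            name = module_of(frame)
            if name not in seen:
                seen.append(name)
    return seen


print(modules_without_symbols(STACK_LINES))
```

A module without symbols is not necessarily at fault; this only narrows down which names to look up in the lmv output.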

It really looked like the purpose of this whole WinDbg and MEMORY.DMP rigmarole was just to get the identity of the driver manufacturer.  I wasn't sure this process was more effective than just doing web searches for the driver name and the error message.  I guess it added dxgmms1.sys to my list of possible causes, and provided confirmation that the atikmpag and atikmdag files were near the heart of this problem.  Whether I would be seeing more of this problem remained to be seen.  As noted above, the older driver presently seemed to have provided the desired stability.

There was one other approach that I hadn't pursued, and decided not to pursue at this point.  That was simply the suggestion to look at the time of the crash, in BlueScreenView, and then use NirSoft's MyEventViewer to examine events within a second or two before the crash.  Preliminarily, that seemed to be another way of getting at the contents of MEMORY.DMP, as listed in the STACK_TEXT above.  But possibly that would be more informative.  For me, further learning on that could await a future BSOD.
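
The idea of looking at events within a second or two before the crash amounts to a simple time-window filter.  The Python sketch below shows that filter on made-up data; it does not read anything from MyEventViewer or the Windows event log, and the event list, the crash time, and the function name are all hypothetical.

```python
from datetime import datetime, timedelta


def events_before_crash(events, crash_time, window_seconds=2):
    """Return events logged in the window_seconds leading up to crash_time (inclusive)."""
    start = crash_time - timedelta(seconds=window_seconds)
    return [e for e in events if start <= e[0] <= crash_time]


# Made-up sample data, for illustration only.
crash = datetime(2012, 4, 8, 14, 30, 0)
sample_events = [
    (datetime(2012, 4, 8, 14, 29, 50), "hypothetical event A"),
    (datetime(2012, 4, 8, 14, 29, 59), "hypothetical event B"),
    (datetime(2012, 4, 8, 14, 30, 5), "hypothetical event C"),
]

for when, text in events_before_crash(sample_events, crash):
    print(when.isoformat(), text)
```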

Monday, January 9, 2012

Windows 7: BSOD: Memory Dump

As discussed in another post, I was trying to understand the words that appeared on a blue screen of death (BSOD) -- that is, a crash notice -- in Windows 7.  At the bottom of the screen, there was a mention of a "memory dump."  This post describes steps I took to learn more about that concept.

The general idea seemed to be that Windows had stored information about the current contents of the system's RAM to a file somewhere on my computer.  A search led to a Microsoft webpage that said I could find a list of such files in "the %SystemRoot%\Minidump folder."  Another search led to suggestions that %SystemRoot% was the same as %windir%, and that a person with a standard Start Menu could find %SystemRoot% by going to Start and typing "%systemroot%\system32."  I found that if I went to Start > Run and typed %systemroot% I would get a Windows Explorer session focused on C:\Windows.  This was consistent with my belief that the %SystemRoot% environment variable meant simply C:\Windows.

Unfortunately, I didn't see a C:\Windows\Minidump folder, and a search of my system also yielded no files of the format suggested in the Microsoft webpage (e.g., Mini022900-01.dmp).  It was possible that the webpage was not relevant to Win7 x64:  its list of the Windows versions to which it applied went up only as far as Windows 7 Beta.  But I wasn't finding a lot of great sources of information, so I stuck with it for the time being.
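
The manual search described above can be sketched in a few lines of Python.  This is only an illustration under my own assumptions:  it looks for .dmp files in a Minidump subfolder and for a MEMORY.DMP file (the default name for the larger dump, which comes up later in this post) directly under the Windows folder, and the function name is made up.

```python
import os
from pathlib import Path


def find_dump_files(windows_dir):
    """List the dump files Windows would normally write under windows_dir."""
    root = Path(windows_dir)
    found = []
    full_dump = root / "MEMORY.DMP"
    if full_dump.is_file():
        found.append(full_dump)
    minidump_dir = root / "Minidump"
    if minidump_dir.is_dir():
        found.extend(sorted(minidump_dir.glob("*.dmp")))
    return found


# %SystemRoot% is an environment variable; on a typical install it is C:\Windows.
system_root = os.environ.get("SystemRoot", r"C:\Windows")
for path in find_dump_files(system_root):
    print(path, path.stat().st_size, "bytes")
```

If nothing is printed, either no dump has been written yet or the dump location has been changed in Startup and Recovery.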

The Microsoft webpage posed the possibility that I lacked a Minidump folder because a memory dump would require "a paging file of at least 2 megabytes (MB) on the boot volume."  I had configured Win7 to use only a paging file on another drive.  So I went into Advanced System Settings.  (One way to get there was Control Panel > System > Advanced system settings.  Another way was Start > Run > SystemPropertiesAdvanced.exe.  I preferred the latter because, once I knew the name of the .exe file, I could put it into a batch file, if ever I wanted it to come up automatically, and also because the Run box would remember it, and that seemed faster:  the SystemPropertiesAdvanced.exe option would come up as soon as I hit S, so the whole thing could be done very quickly via WinKey-R-S, if I didn't have other items beginning with R in my Start Menu.)

In Advanced System Settings, I went into Advanced tab > Performance Settings > Advanced tab > Virtual Memory > Change.  There, I highlighted drive C, selected Custom Size, and specified a minimum and maximum of 10MB.  (I wanted most paging done on a drive other than the one containing the Windows program files, in the belief that dividing tasks between drives would improve performance.)  Then I clicked Set.  I got an error:  "The initial paging file size must be between 16 MB and 16777216 MB."  So OK, 16.  Now I got another error:  "If you disable the paging file or set the initial size to less than 400 megabytes and a system error occurs, Windows might not record details that could help identify the problem."  Once again, then:  400MB.

Then I hibernated the system and then restarted it, to see if now I would get a Minidump folder.  But I didn't get a BSOD.  I wondered why not.  The previous BSOD had come after leaving the system shut down all night.  Maybe I should have allowed a minute or two for the memory to clear itself out, before restarting the system.  I tried hibernating again, gave it a while, and then powered the machine up again.  Now I got a BSOD.  Same as before, but with the following change near the bottom:

Dump file size is too small - requires at least 521307888 bytes.
Future kernel memory dumps may require larger size.
Switching to minidump ...
Physical memory dump complete.
So it seemed that creating the paging file on drive C had indeed been necessary, but that (by my calculation) it needed to be at least 497MB for a full memory dump.  But did I now have at least a minidump file?  I punched the Reset button.  Oddly, I didn't get the choice (above) between continuing or deleting restoration data.  Instead, I got the menu that offers a Normal boot or Safe Mode.  I went with Normal, and Windows proceeded to load.
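
For anyone who wants to check my arithmetic, the conversion from the byte count on the BSOD to megabytes looks like this in Python.  I am assuming the usual 1MB = 1024 x 1024 bytes; the variable names are just for illustration.

```python
REQUIRED_BYTES = 521307888  # figure shown on the BSOD above
BYTES_PER_MB = 1024 * 1024  # assuming binary megabytes

required_mb = REQUIRED_BYTES / BYTES_PER_MB
print(f"{required_mb:.1f} MB")  # a little over 497 MB

# Was the 400MB paging file I had set on drive C big enough for this kernel dump?
print(400 * BYTES_PER_MB >= REQUIRED_BYTES)
```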

The Microsoft webpage said that I could change the location of the memory dump files by going to Advanced System Settings (above) > Advanced Tab > Startup and Recovery Settings.  There, I saw that my dump file was presently set to go to %SystemRoot%\MEMORY.DMP.  So it seemed I had been looking in the wrong place:  there was not going to be a C:\Windows\Minidump folder.  I went into C:\Windows and saw that there was also no MEMORY.DMP file.  A search of my system found no such file anywhere.  Apparently MEMORY.DMP was the file that I would have gotten if I'd had a paging file of at least 500MB.  So, OK, was there at least a file of the kind mentioned above, mini*.dmp?  No, but a search did turn up several *.dmp files with much longer names (e.g., 38f2213f-dd1e-452d-932eac0a3f6911e7.dmp).  But those were in a Firefox program folder.  They seemed to be Firefox crash data.  So far, I was not succeeding in creating a dump file that would shed light on my BSOD.  I went back into the paging file settings (above), changed the Custom Size to 1000MB, and prepared to hibernate and retry.  Later, though, I saw an indication that it might be necessary for drive C to have a pagefile large enough to hold "a file whose size equals your entire RAM plus one megabyte."  I had 8GB of RAM.  So I tried again with a 10,000MB pagefile on drive C.

Before continuing with that discussion, I should catch up, here, with some other information that came up along the way.  For one thing, the Microsoft webpage said that I could use Start > Run > dumpchk.exe to verify that my system was correctly creating memory dump files.  But it wasn't clear whether that program, available through the Windows XP Support Tools, could still somehow be available in Windows 7.  It didn't work to just go to Start > Run > dumpchk.exe.  It looked like I could find tools that would read the dump file if I downloaded and installed Microsoft's Debugging Tools for Windows.  But that would require an MSDN subscription, which I didn't think I had, and it also seemed that I was getting into developer territory, over my head.  I took a look at the download webpage anyway.  It turned out that all I needed was my Hotmail login, and suddenly I was looking at options to download all kinds of Microsoft programs -- Windows, Office, Visio, etc.  But then I ran into a couple of problems when I searched in that page for "Debugging":  it didn't show Debugging Tools for Windows, and it said, "There is no subscription associated with your Live ID."  So I seemed to be at the end of the line there.

Another possibility was to download and install the Microsoft Windows SDK (Software Development Kit) for Windows 7.  It was advertised as providing tools, among other things, and that sounded good; sometimes a tool I didn't need at one point would be useful later.  I ran the downloaded file (winsdk_web.exe) and, in the Installation Options, I selected all of the Common Utilities and all of the Redistributable Packages, except for the Application Verifiers.  I deselected Windows Native Code Development and .NET Development.  This would be a 517MB download that, after installing, would apparently take about 250MB of drive space.  It was probably somewhat more than I needed, but I was not entirely sure how to trim it further.  During the installation process, however, I got an error:
Installation Failed

A problem occurred while installing selected Windows SDK components.

Setup could not find the file WinSDKRedist_amd64\WinSDKRedist_amd64.msi at any of the specified source locations http://download.microsoft.com/download/A/6/A/A6AC035D-DA3F-4F0C-ADA4-37C8E5D34E3D/setup
A search for exactly that error message yielded no hits.  A related search led to an MSDN blog post that made me think the easiest way to proceed would be to download an ISO for the SDK and burn that to a blank DVD.  The ISO download webpage gave me several choices.  I selected the ISO for an AMD64 chip (GRMSDKX_EN_DVD.iso).  It would be a 1.4GB download.  I noticed that this would be a somewhat earlier version of the SDK; I hoped it did not matter for my purposes.

While that was downloading, I ran across a post that said an easier way to read memory dump (.dmp) files was just to use Nirsoft's BlueScreenView.  BlueScreenView was a portable with good ratings at Softpedia and CNET.  I let the SDK download proceed -- I figured I'd have time to play with its possibilities, once I had the ISO burned to a DVD -- but in the meantime I wanted to move ahead with BlueScreenView if I could.  Its homepage answered one question I'd wondered about:  it seemed to indicate that Windows 7, like XP, did create minidump files, so my lack of them wasn't due to the fact that I was running Win7.

When the SDK ISO finished downloading, I burned it to a DVD and ran its Setup.exe file.  That opened SDK options similar to those described above.  This was the SDK for Win7 and .NET Framework 3.5 SP1.  In this case, the installation options I selected were only the Debugging Tools for Windows and the Redistributable Components -- being uncertain, again, whether I would need all that.  I was kind of stuck, there, until I had a memory dump file to look at.

I hibernated the system, let it sit for several minutes, and rebooted.  No BSOD.  Windows came up normally.  Maybe I hadn't let it sit long enough.  I hibernated again, this time for an hour and a half.  But still no BSOD.  The stupid thing wouldn't fail!  It seemed that my testing of BlueScreenView would have to wait until the system reverted to a more cooperatively uncooperative stance.

Then I recalled seeing an indication that it was possible to generate a BSOD and/or a memory dump file manually.  I ran a search and found a YouTube video whose four-minute duration could be summarized as advising me to create and run this REG file, which I saved (as shown) with the name of CrashOnCtrlScroll.reg:
Windows Registry Editor Version 5.00

; *** CrashOnCtrlScroll.reg ***

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\i8042prt\Parameters]

"CrashOnCtrlScroll"=dword:00000001

; Undo:  "CrashOnCtrlScroll"=dword:00000000
and then reboot.  RightCtrl-ScrollLock x2 (i.e., hold the right-side Ctrl key and hit ScrollLock twice) would crash the computer and generate a BSOD.  As indicated by the last line of this REG file (which would function only as a comment, due to its leading semicolon), if the CrashOnCtrlScroll value was set to 1, the crash would work.  If it was set to zero, it would not work.  So with a quick bit of editing in Notepad, this REG file could be used either to turn on the crash option or to turn it off.  I would just have to remember to reboot before relying on it.
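For anyone who would rather not hand-edit the file in Notepad each time, here is a minimal Python sketch that produces the REG text for either state.  The key path and value name are taken from the REG file above; the function name build_crash_reg is hypothetical, and the sketch only builds text (it does not touch the registry).

```python
# Sketch: generate the REG-file text for turning the keyboard crash option
# on or off.  The key path and value name come from the REG file in the post;
# build_crash_reg is a made-up helper name.

KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\i8042prt\Parameters"


def build_crash_reg(enabled: bool) -> str:
    """Return REG-file text with CrashOnCtrlScroll set to 1 (on) or 0 (off)."""
    value = 1 if enabled else 0
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{KEY}]",
        # dword values are written as eight hexadecimal digits
        f'"CrashOnCtrlScroll"=dword:{value:08x}',
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(build_crash_reg(True))
```

The returned text could be saved under a .reg name and run as described above; a reboot would still be needed before the change takes effect.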

So I ran the REG file -- and it worked.  I got a BSOD that was somewhat different from the one shown above.  After the first sentence, it added this remark:  "The end-user manually generated the crashdump."  A search confirmed that lots of people were aware of this tweak.  The other difference was that the STOP error code was E2 (i.e., 0x000000E2).

As before, the BSOD indicated that it had completed a dump of physical memory to disk.  Now I had both BlueScreenView and the SDK installed; surely I would be able to find the memory dump file.  A Hotgeek webpage informed me that BlueScreenView would search automatically for minidump files.  But on reboot, before I got to the point of starting BlueScreenView, Windows gave me a dialog that I had not seen before:
Windows has recovered from an unexpected shutdown

Windows can check online for a solution to the problem.

View problem details.
When I clicked to see the details, I got a nice presentation of information from this most recent, user-induced BSOD:
Problem signature:
Problem Event Name: BlueScreen
OS Version: 6.1.7601.2.1.0.256.1
Locale ID: 1033
Additional information about the problem:
BCCode: e2
BCP1: 0000000000000000
BCP2: 0000000000000000
BCP3: 0000000000000000
BCP4: 0000000000000000
OS Version: 6_1_7601
Service Pack: 1_0
Product: 256_1
Files that help describe the problem:
C:\Windows\Minidump\010912-21075-01.dmp
C:\Users\Ray\AppData\Local\Temp\WER-43040-0.sysdata.xml
Read our privacy statement online:
http://go.microsoft.com/fwlink/?linkid=104288&clcid=0x0409
If the online privacy statement is not available, please read our privacy statement offline:
C:\Windows\system32\en-US\erofflps.txt
According to that, it seemed that now, at last, I did have a minidump file.  C:\Windows\Minidump now existed, and it contained the file shown in that list of details.  I tried viewing that file in Notepad, but it was a mess.  Definitely not the tool for viewing .dmp files.
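The problem details, unlike the .dmp file itself, are plain text, so they are easy to pick apart.  As a rough sketch, the Python below splits the "Name: value" lines into a dictionary and rewrites the BCCode in the 0x000000E2 style used on the BSOD.  The function names are hypothetical, and the parsing rule (split on a colon followed by a space) is an assumption that happens to fit the lines shown above.

```python
# Sketch: parse "Name: value" lines from the problem details dialog.
# parse_problem_details and stop_code are hypothetical helper names.

def parse_problem_details(text: str) -> dict:
    """Collect 'Name: value' pairs; lines without a value are skipped."""
    details = {}
    for line in text.splitlines():
        # Splitting on ": " (colon + space) leaves paths such as C:\... and
        # URLs such as http://... alone, because they lack the space.
        key, sep, value = line.partition(": ")
        if sep and value.strip():
            # "OS Version" appears twice in the dialog; keep the first one.
            details.setdefault(key.strip(), value.strip())
    return details


def stop_code(details: dict) -> str:
    """Format the hexadecimal BCCode as an eight-digit STOP code."""
    return "0x{:08X}".format(int(details["BCCode"], 16))


if __name__ == "__main__":
    sample = "Problem Event Name: BlueScreen\nBCCode: e2\nLocale ID: 1033"
    print(stop_code(parse_problem_details(sample)))
```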

The dialog offered a "Check for solution" button, so I clicked it.  It said it was checking, and then after a moment it disappeared.  I had hit a key on the keyboard, so maybe that was to blame.  Anyway, I started up BlueScreenView and, sure enough, it did show that file.  In its bottom pane, BlueScreenView seemed to be presenting that file's contents.

I also started up the Microsoft SDK programs that I had installed from the downloaded ISO that I had burned to DVD.  The specific program I started was called WinDbg.  In WinDbg, I went into File > Open Crash Dump.  It seemed to be looking for a file called MEMORY.DMP.  That was the name of the file that was supposed to have been created by SystemPropertiesAdvanced (see above) > Startup and Recovery Settings.  But for some reason, my BSOD had not generated the specified kernel memory dump file called %SystemRoot%\MEMORY.DMP.  There did not seem to be any such file on my computer.  I tinkered with those Startup and Recovery Settings and saw that I had the option to create a Small Memory Dump (256KB) in %SystemRoot%\Minidump.  I changed to that setting.  But this was puzzling.  Why had Windows created a minidump when SystemPropertiesAdvanced was requesting a full kernel dump?  Would this change of settings in SystemPropertiesAdvanced make any difference to anything?

If minidumps were only going to be 256KB, presumably the paging file on drive C wouldn't need to be any larger than 16MB (above).  While I was still in SystemPropertiesAdvanced, I went back into Advanced tab > Performance Settings > Advanced tab > Virtual memory Change and set the drive C paging file to a minimum and maximum of 16MB.  Before rebooting to make that new paging file setting take effect, I went back into WinDbg > File > Open Crash Dump, navigated to C:\Windows\Minidump, and tried to open the minidump .dmp file there.  It gave me a warning:
Kernel symbols are WRONG.  Please fix symbols to do analysis.

Your debugger is not using the correct symbols.
It went on from there.  But I felt this warning was not accurate.  What it should have said was, "You are a complete screwup!  You have no idea what you are doing!  Get the hell out of here!"  In other words, it presently appeared that the entire project of downloading and burning and trying to use the SDK was a complete farce, and I should just hang my head in shame and stop even pretending to utilize that software in an intelligent manner.  Which I did.  My tool of choice for minidump files, for the foreseeable future, would be BlueScreenView.

I wasn't sure whether a crash at this point would accomplish the same thing as a proper reboot, for purposes of resetting the paging file.  But I wanted to generate another minidump file, so I hit RightCtrl-ScrollLock x2.  This gave me a forced reboot but no BSOD.  Odd.  Maybe the paging file situation had interfered with it somehow.  Anyway, Windows restarted.  I had, again, the same dialog as above -- "Windows has recovered from an unexpected shutdown" -- and, as before, it checked for a solution and then vanished without further ado.  I took a look at C:\Windows\Minidump.  Now there were two minidump files.  (Incidentally, their sizes were 270KB and 283KB, so I wasn't sure exactly what SystemPropertiesAdvanced meant, when it referred to a 256KB minidump.)  I wanted to see what the second .dmp file said -- what would have been displayed at the bottom of the BSOD, regarding the paging file -- so I fired up BlueScreenView and took a peek.  It didn't contain any comments like those shown above (e.g., "Future kernel memory dumps may require larger size.  Switching to minidump ...").  But then I realized I didn't really care.  I had the minidump; perhaps I wouldn't need the full kernel dump (which I still didn't know how to produce).
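Checking those file sizes by hand gets tedious once the minidumps start to pile up.  A small Python sketch like the one below would list each .dmp file with its size in KB.  The helper name dump_sizes_kb is hypothetical; the folder in the demo is the C:\Windows\Minidump location mentioned above, and reading it may require administrator rights.

```python
# Sketch: list the minidump files in a folder with their sizes in KB.
from pathlib import Path


def dump_sizes_kb(folder):
    """Return (file name, size in KB) for each .dmp file, sorted by name."""
    return [
        (p.name, round(p.stat().st_size / 1024))
        for p in sorted(Path(folder).glob("*.dmp"))
    ]


if __name__ == "__main__":
    for name, kb in dump_sizes_kb(r"C:\Windows\Minidump"):
        print(f"{name}: {kb} KB")
```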

What BlueScreenView did reveal, of possible interest, was a line referring to stdriver64.sys -- which was the driver mentioned in the original BSOD that had prompted me to commence this voyage of discovery.  I didn't know how to interpret that line, but I thought possibly its contents would be revealing in comparison against the next BSOD produced by a crash of stdriver64.sys, as distinct from these manually initiated minidumps.  So I would be keeping these two minidump files for future reference.

I was done with BlueScreenView for now, pending the next stdriver64.sys crash.  I had managed to configure my system, and had learned something, so that now I was possibly prepared to read and begin to understand what was happening when stdriver64.sys crashed.

Sunday, January 8, 2012

Windows 7 x64: Bootable RAM Tester

I was running 64-bit Windows 7 with 8 gigabytes (8GB) of memory.  I wanted to test that RAM.  So I looked around for a program that would do the testing.

It appeared that Memtest was still limited to 32-bit Windows (hence its full name, Memtest86).  Another option was Microsoft's Windows Memory Diagnostic (WMD). It appeared both of these would involve burning the program to a CD, booting the computer from that CD, and letting it run for a while, probably hours. I tried that with WMD, but it gave me an error:

Windows Memory Diagnostic (WMD) could not process the system memory map. This occurred because of deficiencies within WMD, not the computer. As a result, not all of the computer memory will be tested. The specific problem was:

The memory map contained ranges that extended above four gigabytes.

Press (C) to continue. Press any other key to exit WMD.
So WMD wasn't good with more than 4GB of RAM. Now I saw that this limitation was indeed stated in the webpage -- down in the Appendix. RTFA. The Appendix also seemed to say that WMD, too, was suited only for x86 (not 64-bit) CPUs.

That could be confusing, because it turned out that 64-bit Windows 7 came with a memory tester built in: Start > Run > mdsched.exe would bring up a dialog titled "Windows Memory Diagnostic" with an option to restart the system and check the memory.  So apparently what I had been playing with previously (above) was a different version of WMD (which was, incidentally, a particularly unfortunate acronym for a piece of computer software).  The built-in variety of WMD would be something that one could build into a monthly or quarterly batch file, to come up automatically.  It would not be useful for a nonbootable system, however, so I continued my search.

That search led me back to Memtest86.  Wikipedia said that I (and, seemingly, Memtest's own webpage) was wrong:  Memtest86 or Memtest86+ would supposedly work with 64-bit CPUs, and Memtest86 version 3.5a or above would supposedly test more than 4GB of RAM.  The Background page at the Memtest86 website said that Version 4.0a did indeed test up to 64GB (or, if using 4.0b Server edition, up to 8TB) of RAM, and that it would also work with 64-bit CPUs and would support up to DDR3.  It came in Windows and Linux versions.  For Windows, it came in USB, floppy, and CD ISO flavors.  Meanwhile, it appeared that Memtest86+ had not been updated in nearly a year, so I stuck with just trying the original (version 4.0a) Memtest86 for now.

I downloaded, burned, and booted the Memtest86 4.0a ISO.  It recognized 7678MB of RAM.  I wasn't sure why it didn't seem to be testing a full 8GB.  I turned away to work on something else.  When I returned, 21 minutes later, it had already finished a first pass and had started a second one.  So it apparently took no more than about 20 minutes to test 8GB (or so) of RAM.  It reported, "Pass complete, no errors, press Esc to exit."
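To put a number on the shortfall, here is the arithmetic as a tiny Python sketch.  It assumes that 8GB means 8 x 1024 = 8192MB; it says nothing about why Memtest86 reported less, which the test run above does not establish.

```python
# Sketch: how much of the installed 8GB did Memtest86 not report?
INSTALLED_MB = 8 * 1024   # 8GB expressed in MB (assumes 1GB = 1024MB)
REPORTED_MB = 7678        # figure shown by Memtest86 4.0a

missing_mb = INSTALLED_MB - REPORTED_MB
missing_pct = 100 * missing_mb / INSTALLED_MB

print(f"{missing_mb} MB not shown ({missing_pct:.1f}% of installed RAM)")
# prints: 514 MB not shown (6.3% of installed RAM)
```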

I decided to test the built-in Windows Memory Diagnostic.  I booted Windows 7 and ran mdsched.exe (above).  Its blue and white design was prettier than that of Memtest86, I thought, but it was far less informative.  It was done with my 8GB of RAM in about 15 minutes.  The main advantages of mdsched.exe seemed to be that it was built-in and could be batched, run by Task Scheduler, or called up at a moment's notice, without any need to load a separate CD.  Because of that last point, the computer could reboot, run mdsched.exe, and then continue seamlessly into Windows without requiring user interaction.

Saturday, December 31, 2011

Warning: System Memory Usage High

Summary

A sudden announcement from my computer speakers appeared to be due to a bug in Rizone Memory Booster (MB).  The solution seemed to be to change, rename, or delete the MP3 files in MB's Sounds folder.  I used this problem as an opportunity to look at some related freeware.

Description

I was working along as usual in Windows 7, and suddenly a voice announced from my computer speakers, "Warning:  System memory usage high!"  I had recently reinstalled Windows and all sorts of software, so it wasn't immediately obvious what piece of hardware or software would keep repeating this announcement every few minutes.  I ran a search and saw that nobody else seemed to be reporting exactly this problem, so I thought I had probably better log, here, my efforts to resolve it.

I first checked to see whether the warning was correct.  There were all sorts of things to know about memory, such as whether I was using 32-bit or 64-bit operating systems and hardware.  Getting an accurate and informative impression of the current status of my system could be tricky.  I was using the Windows 7 Task Manager (Start > Run > taskmgr.exe > Performance tab) and the Windows 7 Resource Monitor (Start > Run > perfmon.exe > Open Resource Monitor), but I wasn't entirely sure what they were telling me.  I thought maybe another tool would help to clarify the situation, so I did a search on CNET.

Among the several highly rated options there, it seemed that Iolo's System Mechanic Free might give me a good memory tool and might incidentally address some other needs.  On Iolo's website, I saw that their "standard" version of System Mechanic cost $40, so I was a little concerned that I might actually be installing shareware.  Not that I shouldn't pay for useful software, but I was already running behind in that department, and there were other programs of long service that had first dibs on my financial resources.  It developed, in any case, that flipping the switch from "Disabled" to "Enabled" on System Mechanic's option to "Automatically repair low memory problems" brought up a prompt to upgrade to the $40 version.  System Mechanic did have manual cleanup and reporting options.  For instance, by the time I got to the point of writing these words, its IntelliStatus report more or less agreed with Task Manager that about 70% of my RAM was free, and the adjacent Optimize button opened up a memory defragmentation process that ran for maybe 10 seconds and claimed to recover another 5% of RAM.  I decided to keep System Mechanic for a while and play with it some more.  Using Windows Explorer, I added a link to System Mechanic to the Startup folder in my Start Menu, so that System Mechanic's options window would open up when I started the computer.

At some point, I noticed that CNET's editors and users ranked Advanced SystemCare Free pretty highly.  It was another program, like System Mechanic, for cleaning and optimizing the system.  It appeared to be a lot more popular.  I had used a previous version for years, but I think I had fallen away from it when I transitioned from Windows XP to Windows 7.  I decided to try it too.  It seemed likely that I would join the crowd and prefer it over System Mechanic.

For present purposes, an automatic memory optimizer appeared to be what I needed, so I went back to CNET and looked at the popular MemInfo.  Its purpose seemed to be to provide fast, system-tray access to memory information and a manual RAM defragmenter.  But then I saw the widely used Moo0 SystemMonitor Portable.  I tried it and liked it.  It took me a minute to catch on to it.  I had to right-click on its onscreen display to get options.  It was easier to access and more configurable than Windows 7 Task Manager or Resource Monitor; it took less screen space; and it could be made to minimize to the system tray.  It didn't seem to have a measure of graphics performance, though, so it seemed I would have to use the Windows Experience Index for that (Start > Control Panel > Performance Information and Tools).

At some point in this inquiry, I remembered that I actually had already installed an automatic memory optimizer:  Rizonesoft's portable Rizone Memory Booster.  I hadn't tried to tweak it or anything; it had just sort of faded into the background or, more accurately, the system tray.  But now I thought, well, of course, that had to be where this phantom voice was coming from.  I right-clicked on its icon, which now seemed terribly obvious there in the tray, and looked at its options.  Yes, it did have an option to "Play warning every 3 minutes if load exceeds 80" (percent, I assumed).  The default values of 3 and 80 were adjustable.  But, oddly, this option was turned off.  I turned it on and changed it to 1% and clicked Apply.  Nothing happened.  I retried with 30%.  Still nothing.  Memory Booster's main screen said 50% of memory was used, so I should have gotten something.  Ah, but when I closed out of the dialog altogether and let Memory Booster retreat to the system tray, the voice came back.  "Warning:  system memory usage high."  So that was the culprit.  The program's readme file seemed to indicate that there had been a previous issue with saving the sound settings, so maybe the fix for that problem had created this new one where apparently the program would sometimes turn on the sound on its own initiative.
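The behavior I expected from that option can be written down as a simple rule.  The Python sketch below is only a model of the setting as its label describes it ("Play warning every 3 minutes if load exceeds 80"); it is not Rizonesoft's code, and the function and parameter names are hypothetical.

```python
# Sketch: the warning rule implied by the option's label.  This models the
# expected behavior only; it is not taken from Memory Booster itself.

def should_warn(load_percent, threshold_percent=80, seconds_since_last=None,
                interval_minutes=3, enabled=True):
    """Return True if a spoken warning would be due under the labeled rule."""
    if not enabled:
        return False              # option switched off: never warn
    if load_percent <= threshold_percent:
        return False              # load has not exceeded the threshold
    if seconds_since_last is None:
        return True               # no warning has been played yet
    return seconds_since_last >= interval_minutes * 60


if __name__ == "__main__":
    # 50% memory load against a 30% threshold: a warning would be expected.
    print(should_warn(50, threshold_percent=30))
```

Under this rule, the voice should have stayed silent while the option was off, and should have spoken when the threshold was set to 1% or 30% with 50% of memory in use.  What actually happened was different on both counts, which is consistent with the settings not being saved or applied as displayed.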

I played with Memory Booster (MB) for a few minutes and then sent Rizonesoft a link to this blog post.  MB had options to Optimize or Defrag memory.  Its writeup and an Addictive Tips review agreed that, unlike many other memory optimizers, MB's Optimize option used a safe method, involving "a Windows API call."  This would reportedly leave programs and data in memory and, as such, would only free up a minor amount of memory -- but it might also cure memory leaks and unfreeze programs.  By contrast, the writeup said that the Defrag option was an experimental (presumably potentially unstable) function that, unlike Optimize, would force most of the contents of memory into the pagefile (i.e., the portion of hard drive space set aside as a memory overflow area).  It seemed that Defrag was the more extreme option, carrying a risk of (temporarily) screwing up the system and requiring a system reboot.

The Defrag option was not included in the version that Addictive Tips reviewed.  Possibly it was previously a feature available only in the Gold ($14.95) version of MB.  Then again, I wasn't entirely sure whether a gold version continued to exist.  The writeup (dated July 7, 2011) said that MB "is now part of the Doors system," and explained how to install MB by installing Doors.  But I hadn't had to do that.  Maybe things had changed since the time of that writeup.  I wasn't familiar with the Doors system.  It seemed to be Linux-related.  So that part was a mystery.  One source had said that MB had only a nine-day trial period.  Maybe that had changed too -- maybe it had been removed or lengthened.  Rizonesoft's webpage said, "Demand no nonsense freeware," so apparently there were no worries there, unless the Doors situation had changed that.

I did like the program -- it was informative, and it seemed to be accurate, and its Intelligent Memory Optimization seemed to be working.  While some might not understand or appreciate the sarcasm on the website or in the readme file, it seemed that the programmer was responsible and meant to be helpful.

It occurred to me that I might be able to fix the sound problem myself.  I looked into the program's Sounds subfolder.  There, I saw five MP3 files.  One was called mem-high.mp3.  I played it.  Sure enough, that was the one I'd been hearing.  I created a subfolder called "Originals" and put these MP3s into it.  I wondered whether identically named replacement MP3 files would work.  First, I inserted a song MP3 into the Sounds subfolder, renamed as my new mem-high.mp3 file.  At first it didn't seem to be working, but I fiddled with it for a few minutes, and then it did.  Of course, I realized immediately that this had some prank possibilities.  But for my purposes, I removed all MP3s from the Sounds folder.  The program didn't crash when it failed to find a mem-high.mp3 file to play, and it seemed to continue to work.  I didn't want MB to make sounds, so that was the way I left it.
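The same cleanup could be scripted.  As a hedged sketch, the Python below moves every MP3 in a Sounds folder into an "Originals" subfolder, which leaves the Sounds folder itself with nothing to play.  The function name set_aside_sounds and the example path are hypothetical; the real location depends on where Memory Booster was unpacked.

```python
# Sketch: move the MP3s out of the Sounds folder into an Originals subfolder.
from pathlib import Path
import shutil


def set_aside_sounds(sounds_dir, backup_name="Originals"):
    """Move *.mp3 files from sounds_dir into a backup subfolder.

    Returns the names of the files that were moved.
    """
    sounds = Path(sounds_dir)
    backup = sounds / backup_name
    backup.mkdir(exist_ok=True)
    moved = []
    # glob("*.mp3") is not recursive, so files already in the backup
    # subfolder are left where they are.
    for mp3 in sorted(sounds.glob("*.mp3")):
        shutil.move(str(mp3), str(backup / mp3.name))
        moved.append(mp3.name)
    return moved


if __name__ == "__main__":
    # Hypothetical path; adjust to the actual Memory Booster folder.
    print(set_aside_sounds(r"C:\Portable\MemoryBooster\Sounds"))
```

Moving the files back out of Originals would restore the original sounds.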