Tuesday, October 9, 2012

Windows 2008 R2 Server Hangs on Reboot


Have you noticed recently that your Windows 2008 R2 Server seems to hang or delay during the first logon after a reboot? 

Are you running a Windows 2008 R2 Server with backup software that leverages VSS snapshots? Whether you are snapshotting VHD files, or Exchange or SQL Server inside a Windows 2008 R2 server, you could be at risk.

We battled this issue for weeks without knowing what the cause might be.  We first noticed it on our Exchange 2010 server, which a backup process snapshots using VSS several times a day.  The first time we noticed it, the Exchange server took about 30 minutes to boot up.  We thought that maybe it was a dirty shutdown and the logs needed to replay or something.  Then reboot attempts over the following weeks produced longer and longer delays.

We worked with Microsoft Professional Services and then Premier Services for many hours over several days to try to diagnose the issue.  We were sent from the Exchange team to the Windows Server team, to the AD team, and even to their performance team.  Eventually, it was by chance that we found an article talking about this same issue.

The article showed us how to clean up the issue and then prevent it from happening again, as a two-step process.  The long and short of it is that when you use a VSS backup utility, such as Microsoft's DPM or Backup Exec, a snapshot of the VHD or volume is created and then reconnected to the server so it can be cleaned up before being stored away as a backup.  This second copy creates a record in the registry and acts like an additional plug-and-play device, just like a USB stick or camera or something.  Every time you run your backup the registry hive grows, and every time you reboot your machine and log on, the OS cycles through all of the plug-and-play registry records, trying to reconnect to each one or waiting for it to time out.  The OS seems to work fine somewhere under 6000 devices, but we were sitting at around 18,881 orphaned devices, which caused our huge delay as the OS waited to time out on each device.

You can take a look at your registry to see if this might be happening to you.  Here is where I would look.  If you have a ton of records under these keys, you may have an issue:
  • HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum\SCSI\
  • HKEY_LOCAL_MACHINE\SYSTEM\ControlSet001\Enum\Storage\Volume\
  • HKEY_LOCAL_MACHINE\SYSTEM\ControlSet001\Control\DeviceClasses\
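
If clicking through regedit is too slow, a rough count of what sits under those three keys can be scripted. The following is only a minimal sketch using Python's standard winreg module, not something from the original troubleshooting. The depth at which the per-device records live (two levels down for Enum\SCSI and DeviceClasses, one level for Storage\Volume) is my assumption, and the 6000 figure is just the rough number mentioned above. Run it from an elevated prompt on the server itself.

```python
import sys

# Keys named in the post (under HKEY_LOCAL_MACHINE), paired with the depth at
# which per-device records are ASSUMED to live. The depths are a guess.
KEYS = [
    (r"SYSTEM\CurrentControlSet\Enum\SCSI", 2),
    (r"SYSTEM\ControlSet001\Enum\Storage\Volume", 1),
    (r"SYSTEM\ControlSet001\Control\DeviceClasses", 2),
]

# Rough figure from the post: things seemed fine somewhere under 6000 devices.
ROUGH_THRESHOLD = 6000


def count_keys(list_subkeys, path, depth):
    """Count the keys that sit exactly `depth` levels below `path`.

    `list_subkeys` is any callable that returns the subkey names of a path.
    """
    if depth == 0:
        return 1
    return sum(
        count_keys(list_subkeys, path + "\\" + name, depth - 1)
        for name in list_subkeys(path)
    )


def hklm_subkeys(path):
    """Return the direct subkey names of HKLM\\<path> (Windows only)."""
    import winreg  # standard library, but only present on Windows

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path) as key:
        subkey_count = winreg.QueryInfoKey(key)[0]
        return [winreg.EnumKey(key, i) for i in range(subkey_count)]


if __name__ == "__main__" and sys.platform == "win32":
    for path, depth in KEYS:
        total = count_keys(hklm_subkeys, path, depth)
        note = "  <-- worth a closer look" if total >= ROUGH_THRESHOLD else ""
        print(f"HKLM\\{path}: {total}{note}")
```

The script only reads the registry. It does not tell phantom devices apart from live ones, so treat a large number as a hint rather than a diagnosis.
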
Microsoft Hotfix KB982210 was created to stop these plug-and-play device records from being added to the registry, but it doesn't clean up any orphaned records that already exist.  To do that, Microsoft released some code called DevNodeClean that you would have to compile yourself to identify and remove the records.  There are also sites out there that have compiled this code for you.  We used the build provided by Byte Solutions.

  1. Download DevNodeClean (this is the x64 version)
  2. Open a command prompt with elevated privileges and navigate to the folder you downloaded the application to.
  3. To see the list and a count of the phantom devices, run DevNodeClean.x64.exe (this won't remove anything)
  4. To remove phantom devices, run DevNodeClean.x64.exe /r (for 18,800 devices it took about 90 minutes)
  5. Once you have removed the phantom devices, download and install Microsoft Hotfix KB982210
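
Steps 3 and 4 can also be wrapped in a small script so the removal never runs before someone has looked at the listing. This is only a sketch: the executable name and the /r switch are the ones given in the steps above, while the confirmation prompt and the time estimate are my own additions. The estimate simply scales our single data point (about 18,800 devices in about 90 minutes) linearly, which may not hold on other hardware.

```python
import subprocess
import sys

# Executable name and /r switch as given in the steps above.
EXE = "DevNodeClean.x64.exe"

# Single data point from the post: ~18,800 devices took ~90 minutes to remove.
OBSERVED_DEVICES = 18_800
OBSERVED_MINUTES = 90


def build_command(remove=False, exe=EXE):
    """Return the argument list for a list-only run or a removal run."""
    return [exe, "/r"] if remove else [exe]


def estimate_minutes(device_count):
    """Linear extrapolation from the one data point above (an assumption)."""
    return device_count * OBSERVED_MINUTES / OBSERVED_DEVICES


if __name__ == "__main__" and sys.platform == "win32":
    # Step 3: list the phantom devices; nothing is removed here.
    subprocess.run(build_command())
    answer = input("Remove the phantom devices listed above? [y/N] ")
    if answer.strip().lower() == "y":
        # Step 4: the same tool with /r performs the removal.
        subprocess.run(build_command(remove=True))
```

Run it from an elevated prompt in the folder that holds the executable. For example, estimate_minutes(9400) gives 45.0 under the linear assumption.
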

2 comments:

Corleon said...

Does MS hotfix 2526870 help at all http://support.microsoft.com/kb/2526870 ?

DevNodeClean didn't list all the phantom devices. Did you try ghostbuster http://ghostbuster.codeplex.com/ ?

Blog created to serve as a collection of online notes as well as to provide more breadcrumbs for helpful topics that may be difficult to find.