2. Case of the Unexplained 3 Mark Russinovich Technical Fellow Microsoft Corporation Session Code: WCL303
3. About Me Technical Fellow, Microsoft Co-founder and chief software architect of Winternals Software Co-author of Windows Internals 4th and 5th edition and Inside Windows 2000 3rd edition with David Solomon Author of TechNet Sysinternals Home of blog and forums Contributing Editor TechNet Magazine, Windows IT Pro Magazine Ph.D. in Computer Engineering
5. Case of the Unexplained… This is the 2009 version of the “case of the unexplained” talk series 2007 & 2008 versions covered different cases Can view webcast on Sysinternals->Mark’s webcasts Based on real case studies Some of these have been written up on my blog
6. Troubleshooting Most applications do a poor job of reporting unexpected errors Locked, missing or corrupt files Missing or corrupt registry data Permissions problems Errors manifest in several different ways Misleading error messages Crashes or hangs
7. Purpose of Talk Show you how to solve these classes of problems by peering beneath the surface Interpreting file and registry activity Interpreting call stacks You’ll learn tools and techniques to help you solve seemingly unsolvable problems
8. Tools We’ll Use Sysinternals: www.microsoft.com/technet/sysinternals Process Explorer – process/thread viewer Process Monitor – file/registry/process/thread tracing Autoruns – displays all autostart locations SigCheck – shows file version information PsExec – execute processes remotely or in the system account Pslist – list process information Strings – dumps printable strings in any file ADInsight – real time LDAP (Active Directory) monitor Zoomit – presentation tool I’m using Microsoft downloads: Kernrate – sample-based system profiler Visual Studio: Spy++ - Window analysis utility Debugging Tools for Windows: Windbg application and kernel debugger: www.microsoft.com/whdc/devtools/debugging/Windbg
10. The Case of the Slow Outlook Attachment User would see CPU burst and Outlook would hang for 15+ seconds whenever they received an attachment:
11. Process Monitor Process Monitor is a real-time file, registry, process and thread monitor It requires Windows 2000 SP4 w/Update Rollup 1, XP SP2 or higher, Server 2003 SP1 or higher, Vista, or Server 2008 (including 64-bit versions of Windows) It replaces Filemon and Regmon, but you can use Filemon and Regmon on older operating systems Enhancements over Filemon/Regmon include: More advanced filtering Operation call stacks Boot-time logging Data mining views Process tree to see short-lived processes When in doubt, run Process Monitor! It will often show you the cause of error messages It often tells you what is causing sluggish performance
12. The Case of the Slow Outlook Attachment (Continued) Process Monitor trace of next received attachment implicated antivirus:
13. The Case of the Slow Outlook Attachment: Solved Searched web for confirmation: Checked AV settings, found the problematic option, and disabled scanning:
14. Process Explorer Process Explorer is a Task Manager replacement You can literally replace Task Manager with Options->Replace Task Manager Hide-when-minimize to always have it handy Hover the mouse to see a tooltip showing the process consuming the most CPU Open System Information graph to see CPU usage history Graphs are time stamped with hover showing biggest consumer at point in time Also includes other activity such as I/O, kernel memory limits
15. The Case of the Periodic VMWare Freezes Noticed CPU peg every 10 seconds and the desktop freeze when running VMWare Saw in the Process Explorer System Information graph that it was the System process:
16. Processes and Threads A process represents an instance of a running program Address space Resources (e.g., open handles) Security profile (token) A thread is an execution context within a process Unit of scheduling (threads run, processes don’t run) All threads in a process share the same per-process address space The System process is the default home for kernel mode system threads Functions in OS and some drivers that need to run as real threads E.g., need to run concurrently with other system activity, wait on timers, perform background “housekeeping” work Other host processes: svchost, Iexplore, mmc, dllhost
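The point that threads are the unit of scheduling while sharing one per-process address space can be illustrated with a minimal sketch. The names `counter`, `worker`, and `run` are made up for this example and are not from the talk; Python threads stand in for Windows threads here only to show the sharing behavior.

```python
import threading

# One object in the process's address space, visible to every thread.
counter = {"value": 0}
lock = threading.Lock()


def worker(increments: int) -> None:
    """Each thread is scheduled on its own but updates the same shared object."""
    for _ in range(increments):
        with lock:  # serialize access so no update is lost
            counter["value"] += 1


def run(num_threads: int = 4, increments: int = 1000) -> int:
    """Start several threads in this one process and return the shared total."""
    counter["value"] = 0
    threads = [
        threading.Thread(target=worker, args=(increments,))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]


if __name__ == "__main__":
    print(run())
```

Every thread sees the same `counter` because there is only one address space; the threads, not the process, are what actually get CPU time.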
17. Viewing Threads Task Manager doesn’t show thread details within a process Process Explorer does on “Threads” tab Displays thread details such as ID, CPU usage, start time, state, priority Start address is where the thread began running (not where it is now) Click Module to get details on module containing thread start address
18. Thread Start Functions and Symbol Information Process Explorer can map the addresses within a module to the names of functions This can help identify which component within a process is responsible for CPU usage Requires symbol information: Download the latest Debugging Tools for Windows from Microsoft (free) Configure Process Explorer’s symbol engine: Use dbghelp.dll from the Debugging Tools Point at the Microsoft public symbol server (or internal symbol server if you have access) Can configure multiple symbol paths separated by “;”
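A symbol path with several entries separated by ";" can be assembled as in the sketch below. It assumes the `srv*<local cache>*<server>` syntax used by the Debugging Tools and the Microsoft public symbol server URL; the helper name `build_symbol_path`, the cache folder, and the internal share are hypothetical.

```python
from typing import List

# Microsoft public symbol server.
MS_PUBLIC_SYMBOLS = "https://msdl.microsoft.com/download/symbols"


def build_symbol_path(local_cache: str, servers: List[str]) -> str:
    """Join one 'srv*cache*server' entry per server with ';'."""
    return ";".join(f"srv*{local_cache}*{server}" for server in servers)


if __name__ == "__main__":
    # The cache folder and the internal share are placeholders for illustration.
    print(build_symbol_path(r"C:\Symbols", [MS_PUBLIC_SYMBOLS, r"\\symsrv\symbols"]))
```

The resulting string is what would go into the symbol path field of the tool's symbol configuration dialog.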
19. The Case of the Periodic VMWare Freezes: Solved Opened Threads tab for System process and paused after a spike: Ftser2k was XM Radio USB/Serial driver Stopping it didn’t remove spikes Http.sys is IIS kernel-mode cache driver Went to device manager and showed hidden devices Stopped http.sys and hangs went away Didn’t care about dependent services
20. The Case of the Runaway Internet Explorer Noticed a CPU spike and hovered over Process Explorer to see culprit: That was unexpected, because had just installed Adobe Acrobat Reader and exited Internet Explorer IE’s window wasn’t visible, but it was still in the process list
21. The Case of the Runaway Internet Explorer: Investigation The thread had a generic start address: Required deeper investigation…
22. Call Stacks Sometimes a thread start address doesn’t tell you what a thread is doing The stack might provide a hint: The stack is a per-thread region of memory that records a history of function nesting The bottom frame (Function 3) is where the thread will continue executing Function 1 Function 2 Function 3
23. Viewing Call Stacks Click Stack on the Threads tab to view a thread’s call stack Lists functions in reverse chronological order Note that start address on Threads tab is different than first function shown in stack This is because all threads created by Windows programs start in a library function in Kernel32.dll which calls the programmed start address
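The idea of a stack as a record of function nesting, listed most recent call first, can be sketched in a few lines. This uses Python's own interpreter stack purely as an analogy to a native thread stack; the function names mirror the Function 1/2/3 diagram and are otherwise arbitrary.

```python
import inspect
from typing import List


def function_1() -> List[str]:
    return function_2()


def function_2() -> List[str]:
    return function_3()


def function_3() -> List[str]:
    # inspect.stack() lists frames innermost first, i.e. in reverse
    # chronological order of the calls that created them.
    return [frame.function for frame in inspect.stack()]


if __name__ == "__main__":
    for name in function_1():
        print(name)
```

The first three names are `function_3`, `function_2`, `function_1`; whatever follows them is the caller machinery, much as a native stack ends in the library start function rather than the programmed start address.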
24. The Case of the Runaway Internet Explorer: Stack Investigation I double-clicked on the thread to see its stack:
25. The Case of the Runaway Internet Explorer: What is GP.OCX? Opened DLL view to see DLL’s version information: DLL Search Online didn’t return any useful results
26. The Case of the Runaway Internet Explorer: Solved Searched for NOS Microsystems: Conclusion: Adobe uses gp.ocx, which had hit an infinite-loop bug Terminated IE process to stop CPU usage
28. The Case of the Logon Script Hangs Multiple users complained that logon would take three minutes Investigation revealed that all complaints were from Dell Precision 670 workstations But only some of the 670 workstations were affected User configured Process Explorer to run during logon and saw Lisa Client consuming CPU: Lisa Client was custom logon application that checked system for installed applications Lisa Client CPU then went idle for several minutes, then exited and system would start acting normally
29. The Case of the Logon Script Hangs (Continued) User captured a Process Monitor trace after manually running Lisa Client Saw three-minute delay correspond to device error: Details column showed IOCTL_SCSI_PASS_THROUGH Captured trace on working system and looked for IOCTL_SCSI_PASS_THROUGH operation No device error and no delay:
30. The Case of the Logon Script Hangs: Solved Device error led user to look at disks: Working systems had Fujitsu disks Systems with hangs had Seagate Solution: Temporary: wrote WMI script that queried disk type and would not launch Lisa Client on Seagate systems Final: Application developers changed Lisa Client to avoid performing problematic command
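The temporary workaround amounts to a gate on the disk model. The user's actual WMI script is not shown in the talk, so the following is only a sketch of the decision logic. It assumes the model strings have already been read (for example from the `Model` property of the WMI class `Win32_DiskDrive`); the function name, the marker list, and the sample model strings are hypothetical. Real model strings may not contain the vendor name, so the markers would need adjusting for a given fleet.

```python
from typing import Iterable, Sequence


def should_launch_client(
    disk_models: Iterable[str],
    blocked_markers: Sequence[str] = ("seagate",),  # assumed marker, adjust as needed
) -> bool:
    """Return False if any disk model contains a blocked marker (case-insensitive)."""
    for model in disk_models:
        lowered = model.lower()
        if any(marker.lower() in lowered for marker in blocked_markers):
            return False
    return True


if __name__ == "__main__":
    # Hypothetical model strings for illustration only.
    print(should_launch_client(["FUJITSU EXAMPLE-0001"]))
    print(should_launch_client(["Seagate EXAMPLE-0002"]))
```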
32. The Case of the MMC Startup Failure User would get an error every time they started an MMC snapin:
33. The Case of the MMC Startup Failure: Solved Ran Process Monitor and saw an Access Denied error on an IE registry key: Checked permissions and Administrators had no access Solution: added full-access for Administrators and MMC started successfully
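When a trace is large, the same search for Access Denied results can be done on an exported trace. The sketch below assumes a Process Monitor trace saved as CSV with the column names "Process Name", "Operation", "Path", and "Result"; the sample rows and the registry paths in them are invented for illustration.

```python
import csv
import io
from typing import List, Tuple

# Hypothetical two-row export; a real trace would be read from a file.
SAMPLE_TRACE = r'''"Time of Day","Process Name","PID","Operation","Path","Result","Detail"
"10:00:01.0000000","mmc.exe","1234","RegOpenKey","HKLM\Software\Example\Key","ACCESS DENIED","Desired Access: Read"
"10:00:01.0000100","mmc.exe","1234","RegOpenKey","HKCU\Software\Example","SUCCESS",""
'''


def failed_operations(
    csv_text: str, result: str = "ACCESS DENIED"
) -> List[Tuple[str, str, str]]:
    """Return (process, operation, path) for every row with the given result."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (row["Process Name"], row["Operation"], row["Path"])
        for row in reader
        if row["Result"] == result
    ]


if __name__ == "__main__":
    for hit in failed_operations(SAMPLE_TRACE):
        print(hit)
```

Inside Process Monitor itself, a filter on the Result column achieves the same thing interactively.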
34. The Case of the Favorite that Wouldn’t Save User tried to change the URL for one of his IE favorites: Trying to save a new favorite resulted in a similar error:
35. The Case of the Favorite that Wouldn’t Save: Solved Captured a Process Monitor trace: AccessChk showed that folder was Medium Integrity (IE requires Low): Fixed integrity with Icacls and problem solved
36. The Case of the Persistent Executable Noticed that opening volumes in Explorer was really slow Volume context menu indicated presence of Autorun.inf
37. The Case of the Persistent Executable (Continued) Files reappeared after deleting, so monitored activity with Process Monitor File was recreated by Explorer, so looked at stack
38. Viewing Autostarts Use Autoruns to see what’s configured to start when the system boots and you log in Windows MsConfig shows only a subset of the defined autostart locations MsConfig doesn’t show as much information
39. The Case of the Persistent Executable (Solved) Process Explorer DLL search showed that amvo.dll loaded into Explorer and all its children Found amv0.exe and used Autoruns to delete it from the system Run key
41. Application Crashes In most cases, there’s nothing you can do about application crashes They are caused by a bug in the program Only the developer can fix a bug However, the crash may be caused by misconfiguration or an extension (a plugin) Monitor the application’s crash with Process Monitor if it’s reproducible Look for extensions in the crash file with Windbg
42. Finding the Crash Dump On pre-Vista systems, finding the dump file is easy:
43. Attaching to the Dying Process Vista doesn’t save crash dumps for most crashes Only if Microsoft requests a dump for study and you send it in When a crash occurs, don’t dismiss the crash dialog: Launch Windbg and attach to the process You can save a dump with the .dump command
44. Identifying the Crashed Process On Vista, the process name might not be enough to identify the instance that’s crashed: To determine the PID of the crashed instance, look at WerFault’s command line:
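Pulling the PID out of WerFault's command line can be sketched as follows. This assumes the crashed process's ID follows a `-p` switch, which is an assumption not stated on the slide; the sample command line and the helper name `crashed_pid` are hypothetical.

```python
import re
from typing import Optional


def crashed_pid(command_line: str) -> Optional[int]:
    """Return the number following a standalone '-p' switch, or None if absent."""
    match = re.search(r"(?:^|\s)-p\s+(\d+)", command_line)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # Hypothetical command line, as it might appear in a process properties view.
    print(crashed_pid(r"C:\Windows\system32\WerFault.exe -u -p 4680 -s 1180"))
```

The returned number is the PID to pick when attaching the debugger to the right instance.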
45. Enabling Dump Archiving on Vista and Windows Server 2008 Or you can configure Vista SP1 and Windows Server 2008 to always generate and save a dump file Create a key named: HKLM\Software\Microsoft\Windows\Windows Error Reporting\LocalDumps Dumps go to %LOCALAPPDATA%\CrashDumps Override with a DumpFolder value (REG_EXPAND_SZ) Limit dump history with a DumpCount value (DWORD)
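One way to apply this configuration is with the built-in `reg add` command. The sketch below only builds the command lines and does not run them. It assumes the key is the Windows Error Reporting `LocalDumps` key under `HKLM\SOFTWARE\Microsoft\Windows`; the helper name and the example folder and count are hypothetical.

```python
from typing import List, Optional

LOCAL_DUMPS_KEY = r"HKLM\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps"


def local_dumps_commands(
    dump_folder: Optional[str] = None, dump_count: Optional[int] = None
) -> List[List[str]]:
    """Build 'reg add' argument lists: create the key, then set optional values."""
    commands = [["reg", "add", LOCAL_DUMPS_KEY, "/f"]]
    if dump_folder is not None:
        commands.append(
            ["reg", "add", LOCAL_DUMPS_KEY, "/v", "DumpFolder",
             "/t", "REG_EXPAND_SZ", "/d", dump_folder, "/f"]
        )
    if dump_count is not None:
        commands.append(
            ["reg", "add", LOCAL_DUMPS_KEY, "/v", "DumpCount",
             "/t", "REG_DWORD", "/d", str(dump_count), "/f"]
        )
    return commands


if __name__ == "__main__":
    # Example folder and count are placeholders.
    for cmd in local_dumps_commands(r"C:\Dumps", 10):
        print(" ".join(cmd))
```

Running the printed commands requires an elevated prompt, since the key lives under HKLM.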
46. Analyzing a Crash Basic crash dump analysis is easy and it might tell you the cause Requires Windbg and symbol configuration Once the dump is loaded, find the faulting thread The debugger might identify it If the debugger doesn’t, examine each thread stack looking for “fault”, “exception”, or “error” names Examine the stack of the faulting thread to look for third-party plugins If you suspect an extension: Check for a new version Uninstall it if the problem persists
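The manual scan described above, looking for "fault", "exception", or "error" names and then for third-party modules, can be sketched over stack text copied out of the debugger. It assumes frames in the `module!function+offset` form; the helper names, the sample frames, and the module `exampleext` are invented for illustration.

```python
from typing import Iterable, List

FAULT_HINTS = ("fault", "exception", "error")


def module_of(frame: str) -> str:
    """Return the lowercase module name from 'module!function+0x..' or 'module+0x..'."""
    return frame.split("!", 1)[0].split("+", 1)[0].strip().lower()


def fault_frames(frames: Iterable[str]) -> List[str]:
    """Frames whose text contains one of the fault-related hints."""
    return [f for f in frames if any(h in f.lower() for h in FAULT_HINTS)]


def third_party_modules(frames: Iterable[str], system_modules: Iterable[str]) -> List[str]:
    """Modules on the stack not in the caller-supplied system list, in stack order."""
    known = {m.lower() for m in system_modules}
    found: List[str] = []
    for frame in frames:
        module = module_of(frame)
        if module not in known and module not in found:
            found.append(module)
    return found


if __name__ == "__main__":
    # Hypothetical stack for illustration only.
    stack = [
        "ntdll!KiUserExceptionDispatcher+0xf",
        "exampleext+0x1234",
        "shell32!CDefFolderMenu::InvokeCommand+0x20",
    ]
    print(fault_frames(stack))
    print(third_party_modules(stack, ["ntdll", "shell32"]))
```

A module flagged this way is only a suspect; checking for a newer version or disabling the extension is still the test.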
47. The Case of the Explorer Context Menu Crash Explorer would randomly crash when the user right-clicked on a file Attached to process and executed !analyze -v: Didn’t know what muangys.dll was and because module was unloaded, Windbg provided no information
48. The Case of the Explorer Context Menu Crash (Cont) Ran Process Explorer and looked at Explorer DLL view to find muangys.dll: File had no version information, but Strings identified the company and application:
49. The Case of the Explorer Context Menu Crash: Solved Was part of Icon editing software, which developer relied upon No newer version Solution: disable shell extension with Autoruns
51. Crashes and Hangs Windows has various components that run in Kernel Mode, the highest privilege mode of the OS OS components: Ntoskrnl.exe, Hal.dll Drivers: Ntfs.sys, Tcpip.sys, device drivers Kernel-mode components are privileged extensions to the OS that have to adhere to various rules Not accessing invalid memory Accessing memory at the right “Interrupt Request Level” Not causing resource deadlocks When a kernel-mode component performs an illegal operation, Windows crashes (blue screens) Crashing helps preserve the integrity of user data A resource deadlock can hang the system
52. Online Crash Analysis When you reboot after a crash, Windows offers to upload it to Microsoft Online Crash Analysis (OCA) Automated server generates a thumbprint of the crash and uses it as a key in a database If the database has an entry, the user is told the cause and directed at a fix
53. Basic Crash Dump Analysis Many times OCA doesn’t know the cause: Basic crash dump analysis is easy and it might tell you the cause Requires Windbg and symbol configuration Dump files are in either: \Windows\Memory.dmp: Vista and servers \Windows\Minidump: Windows 2000 Pro and Windows XP
54. The Case of the Crashed Phone Call Laptop crashed during a Skype VOIP call User reconnected and system crashed again Minidump file pointed at Intel wireless driver:
55. The Case of the Crashed Phone Call (Cont) Looked at file properties to determine what device the driver was for: Found device in Device Manager:
56. The Case of the Crashed Phone Call (Cont) Right-clicked and checked Windows Update for newer driver: Need to check OEM site, so had to find version number
57. The Case of the Crashed Phone Call: Solved OEM site had older version: Intel site had newer one: Installed and crashes stopped
58. Summary and More Information A few basic tools and techniques can solve seemingly impossible problems I learn by always trying to determine the root cause Resources: Webcasts of two previous “Case of the Unexplained” talks Sysinternals->Mark’s Webcasts Sysinternals Video Library: in-depth dive on tools and troubleshooting My blog Windows Internals: understand the way the OS works If you’ve solved one, send me a description, screenshots and log files! I’ll send you a signed copy of Windows Internals
59. Resources www.microsoft.com/teched Sessions On-Demand & Community www.microsoft.com/learning Microsoft Certification & Training Resources http://microsoft.com/technet Resources for IT Professionals http://microsoft.com/msdn Resources for Developers