Steps for analysing hangs in PM applications:
This example is of a hang in the WorkPlace caused by a PM application fault.
First we check whether the User_Sem is held, whether the system queue is locked and, if necessary, who has the focus.
##db pmsemaphores+20 l20
9f3f:0000b4d4 50 4d 53 45 4d 00 00 00-00 00 00 00 00 00 00 00 PMSEM...........
9f3f:0000b4e4 00 00 00 00 00 00 00 00-03 00 01 80 00 00 00 00 ................
##dd pmqsyslock l1
9f3f:0000ed14 12d3128c
##dd %12d3128c
%12d3128c 12d31630 00000020 0000000a 12d31334
%12d3129c 12d31474 12d31334 12d31334 0000a400
%12d312ac 00000000 0000001a 00000009 00000012
%12d312bc 80030059 0097ec67 00000048 00000089
%12d312cc 00000001 12d3ff5c 00000000 00000010
%12d312dc 00000000 00000000 00000000 00000000
%12d312ec 00000000 00000000 00000000 00000000
%12d312fc 00000000 00000000 8000016e 00000000
##d
%12d3130c 0fe90000 00005453 00000325 00000000
%12d3131c 12d31228 0bff0002 00000000 00000000
%12d3132c 00000001 0000002e 00000000 00000000
%12d3133c 00000000 00000000 00000000 00000000
%12d3134c 00000000 00000000 00000000 00000000
%12d3135c 00000000 00000000 00000000 00000000
%12d3136c 00000000 00000000 00000000 00000000
%12d3137c 00000000 00000000 00000000 00000000
##.p2e
Slot  Pid  Ppid Csid Ord  Sta Pri  pTSD     pPTDA    pTCB     Disp SG Name
002e  001a 0002 001a 0009 blk 0500 ab911000 ab9c9408 ab9bc6c0 1ed0 12 turkey
##.s 2e
##.r
eax=80030059 ebx=00008000 ecx=00090000 edx=00000004 esi=ffffffff edi=12d3128c
eip=1bd0c8e1 esp=00293ea8 ebp=00090000 iopl=2 -- -- -- nv up ei pl zr na pe nc
cs=005b ss=0053 ds=0053 es=0053 fs=150b gs=0000 cr2=12d3028c cr3=001d4000
005b:1bd0c8e1 83c40c add esp,+0c
##ln
005b:1bd0c78c pmmerge:PM32BIT:SleepPmq + 155 005b:1bd0c940 CalcWakeBits - 5f
No one owns the User_Sem, since the words at offsets +0x8 and +0xa are both zero.
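The ownership test can be sketched in Python. This is only an illustration: the helper name `words_at` is hypothetical, the bytes are copied from the first row of the `db` output above, and the meaning of the two words is taken from the text, not from a documented structure layout.

```python
import struct

# First row of the `db pmsemaphores+20` output, copied from the dump.
ROW = "50 4d 53 45 4d 00 00 00-00 00 00 00 00 00 00 00"

def words_at(row, *offsets):
    """Return the little-endian 16-bit words at the given byte offsets."""
    data = bytes.fromhex(row.replace("-", " "))
    return [struct.unpack_from("<H", data, off)[0] for off in offsets]

# Per the text, both words are zero when no one owns the semaphore.
owner_words = words_at(ROW, 0x8, 0xA)
if any(owner_words):
    print("User_Sem is owned:", [hex(w) for w in owner_words])
else:
    print("User_Sem is not owned")
```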
We see that the system queue is held by slot 2e (the value at MQ+a4), which is blocked in PMMERGE in SleepPmq, waiting for message activity. We also notice that MQ+44 holds a non-zero value (12d3ff5c), which indicates that this thread has called WinSendMsg and is waiting for a response.
We investigate the WinSendMsg by examining the SMS pointed to by MQ+44.
##dd %12d3ff5c
%12d3ff5c 00000000 12d3ff5c 00000000 00000000
%12d3ff6c 0097ec67 12d3128c 12d32cb4 00000000
%12d3ff7c 00000000 12d33168 00000071 00250016
%12d3ff8c 00000000 12d3ff5c 12d3ff5c 00000000
%12d3ff9c 00000000 0092e954 12d3910c 12d3ca34
%12d3ffac 00000000 00000002 12d34fac 00000407
%12d3ffbc 00000000 00000000 12d3ff90 12d3ff5c
%12d3ffcc 00000000 00000000 0092834a 12d3910c
The target MQ for the sent message is at SMS offset +18, i.e. %12d32cb4.
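Reading a field at a given offset from a `dd` dump is easy to get wrong by hand, so a small Python sketch may help. The helper names `parse_dd` and `dword_at` are hypothetical, and the two rows are copied from the SMS dump above; the offset +18 is the one stated in the text.

```python
# First two rows of the `dd %12d3ff5c` output, copied from the dump.
SMS_DUMP = """\
%12d3ff5c 00000000 12d3ff5c 00000000 00000000
%12d3ff6c 0097ec67 12d3128c 12d32cb4 00000000
"""

def parse_dd(text):
    """Map the linear address of each dumped dword to its value."""
    memory = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields or not fields[0].startswith("%"):
            continue  # skip debugger commands and blank lines
        base = int(fields[0][1:], 16)
        for i, word in enumerate(fields[1:]):
            memory[base + 4 * i] = int(word, 16)
    return memory

def dword_at(memory, base, offset):
    """Return the dword stored at base + offset."""
    return memory[base + offset]

sms = 0x12D3FF5C
target_mq = dword_at(parse_dd(SMS_DUMP), sms, 0x18)
print(f"target MQ = %{target_mq:08x}")  # prints: target MQ = %12d32cb4
```

The same helper can be pointed at an MQ dump to read the other offsets used in this walkthrough, such as MQ+44 and MQ+a4.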
We find out who this is (the slot number is at MQ+a4).
##dd %12d32cb4
%12d32cb4 12d34940 00000020 0000000a 12d32d5c
%12d32cc4 12d32e9c 12d32d5c 12d32d5c 04002fff
%12d32cd4 04000400 0000001a 00000001 00000012
%12d32ce4 80030051 0093c378 00000000 00000000
%12d32cf4 00000000 00000000 00000000 00000010
%12d32d04 00000000 00000000 00000000 00000000
%12d32d14 00000000 00000000 00000000 00000000
%12d32d24 00000000 00000000 8000006c 00000000
##d
%12d32d34 0fe90000 00005453 00000325 00000000
%12d32d44 12d33304 0bff0000 00000000 12d3ff5c
%12d32d54 00000001 00000028 00000000 00000000
%12d32d64 00000000 00000000 00000000 00000000
%12d32d74 00000000 00000000 00000000 00000000
%12d32d84 00000000 00000000 00000000 00000000
%12d32d94 00000000 00000000 00000000 00000000
%12d32da4 00000000 00000000 00000000 00000000
##.p 28
Slot  Pid  Ppid Csid Ord  Sta Pri  pTSD     pPTDA    pTCB     Disp SG Name
0028  001a 0002 001a 0001 crt 0500 ab905000 ab9c9408 ab9bbaf0 1f10 12 turkey
Offset +a4 gives us the slot number, 28, which turns out to be another thread of the turkey application. The status of this thread is crt. This indicates that some other thread in the same process has entered a critical section; slot 28 would be ready to run were it not for the critical section thread. Clearly this is why our application has hung the PM messaging function. The real culprit is the thread that entered the critical section. Who is it?
The PTDA contains the address of the TCB of the thread in critical section. The dword at TCB offset +0 contains the thread id (low word) followed by the slot number (high word).
##dd %ab9c9408+ptda_ptcbcritsec-ptda_start l1
%ab9c96e8 ab9bc6c0
##dd %ab9bc6c0 l1
%ab9bc6c0 002e0009
##.p 2e
Slot  Pid  Ppid Csid Ord  Sta Pri  pTSD     pPTDA    pTCB     Disp SG Name
002e# 001a 0002 001a 0009 blk 0500 ab911000 ab9c9408 ab9bc6c0 1ed0 12 turkey
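Splitting the first TCB dword into its two fields can be sketched as follows. The function name `split_tcb_dword` is hypothetical, and the layout (thread id in the low word, slot number in the high word) is an assumption inferred from the text and from the `.p 2e` output, not a documented structure definition.

```python
def split_tcb_dword(dword):
    """Split the first TCB dword into (thread id, slot number).

    Assumed layout: thread id in the low word (offset +0),
    slot number in the high word (offset +2).
    """
    thread_id = dword & 0xFFFF
    slot = (dword >> 16) & 0xFFFF
    return thread_id, slot

# Value read by `dd %ab9bc6c0 l1` in the dump.
tid, slot = split_tcb_dword(0x002E0009)
print(f"thread id = {tid:x}, slot = {slot:x}")  # prints: thread id = 9, slot = 2e
```

Slot 2e with thread ordinal 9 matches the Slot and Ord columns of the `.p 2e` output, so the critical section thread is the same thread that holds the system queue.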
Our application has perpetrated one if not two faults: