Oracle State Objects and System State Dumps

Oracle State Objects and System State Dumps with Tanel Poder - Hacking Session slides

Video on YouTube:

https://youtu.be/O3jRs6BgidU

Published in: Technology

  1. Oracle State Objects and System State Dumps
     Hacking Session with Tanel Põder
     https://blog.tanelpoder.com, @tanelpoder
     © Tanel Poder 2020, blog.tanelpoder.com, Advanced Oracle Troubleshooting – Hacking Session
  2. About
     • This is a free hacking session, not a formal training session
       • Only a few slides
       • Not much rehearsed or planned
       • Lots of live hacking fun (hopefully!)
     • Training info at blog.tanelpoder.com/seminar
       • Feb 2020: Advanced Oracle Troubleshooting
       • May or June 2020: Advanced Oracle SQL Tuning
       • All attendees get downloadable videos (upfront if needed)
     • Latest scripts in GitHub
       • https://github.com/tanelpoder/tpt-oracle
       • https://github.com/tanelpoder/tpt-oracle/blob/master/tools/unix/ssexplorer.sh
     Please star my TPT repo if you use it :-) There's a @help.sql script now (1st step towards having actual documentation)
  3. Why do state objects exist?
     • Let's explain the why first, before going to the what and how
  4. State object trees
     • Every operation done on a shared object needs to leave a trace in shared memory
     • Can be used in case of "rollback" of that operation
     • Especially important for cleaning up after dead processes
     [Diagram: state object tree. Boxes: PMON; several process SOs; session SO; call SO; transaction SO; library object lock SO; enqueue SO; library cache object handle. An arrow is labeled "pointer to first child SO".]
     Diagram labels:
     • x$ksupr (v$process), x$ksuse (v$session), x$ktcxb (v$transaction), x$ksqrs (v$resource), x$kglob (v$sql)
     • PMON checks if the SPID stored in this process SO still exists
     • Number of state objects (slots in v$process array) is controlled by processes parameter
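The parent/child idea on this slide can be sketched as a toy model. This is only an illustration, not Oracle's implementation: the class name StateObject, the helpers cleanup and pmon_sweep, the children-first release order, and the placement of each child SO in the tree are all assumptions made for the example. The only ideas taken from the slide are that state objects form a tree under a process SO and that PMON checks whether the SPID stored in a process SO still exists.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class StateObject:
    """Toy stand-in for a state object (hypothetical fields, not Oracle's layout)."""
    kind: str                                  # e.g. "process", "session", "call"
    owner: Optional["StateObject"] = None      # parent SO, None for a process SO
    children: List["StateObject"] = field(default_factory=list)
    spid: Optional[int] = None                 # OS process id, used for process SOs only

    def add_child(self, kind: str) -> "StateObject":
        child = StateObject(kind=kind, owner=self)
        self.children.append(child)
        return child


def cleanup(so: StateObject, released: List[str]) -> None:
    """Release a subtree children-first (assumed order) and record what was released."""
    for child in list(so.children):
        cleanup(child, released)
    so.children.clear()
    released.append(so.kind)


def pmon_sweep(processes: List[StateObject],
               os_pid_exists: Callable[[Optional[int]], bool]) -> List[str]:
    """Clean up every process SO whose OS process no longer exists."""
    released: List[str] = []
    for proc in processes:
        if not os_pid_exists(proc.spid):
            cleanup(proc, released)
    return released


if __name__ == "__main__":
    proc = StateObject("process", spid=3740)
    sess = proc.add_child("session")
    call = sess.add_child("call")
    call.add_child("library object lock")
    sess.add_child("transaction")
    # Pretend the OS process is gone, so the whole tree gets released.
    print(pmon_sweep([proc], os_pid_exists=lambda spid: False))
```

The point of the sketch is that a dead process leaves behind a tree that can be walked from the process SO, so whoever cleans up knows exactly which shared resources were in use.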
  5. State object structure
     • Demo
  6. Textual dumping of state object trees
     • Processstate dump, systemstate dump
       • Processstate dump dumps all state objects under a process
       • Systemstate dump runs processstate dump for all processes (heavy operation, large trace file)
       • Can be used for determining leaks and hangs in extreme circumstances
     • alter session set events 'immediate trace name processstate level 266'
     • oradebug -g all dump systemstate 258
     • alter session set events 'immediate trace name systemstate level 266'
     • alter session set events '60 trace name systemstate level 10'
     Slide notes:
     • Level 266 (10 + 256) gets a processstate dump at level 10 plus short_stack stack traces dumped for all processes
     • The '60 trace name systemstate' event would make a session automatically take a systemstate dump when it hits the "ORA-60: deadlock detected" error
     • In RAC you probably want to use level 258 (256 + 2) to avoid dumping lots of lock elements
     • More info about state object dump levels: https://grepora.com/2017/01/04/systemstate-dump/
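The level arithmetic on this slide (266 = 10 + 256, 258 = 256 + 2) can be made explicit with a small helper. This is a sketch under one assumption: that 256 behaves like a flag added on top of a base level below 256, which is what the slide's sums suggest. The function names split_dump_level and systemstate_event are made up for illustration; the generated statement is the one shown on the slide.

```python
SHORT_STACK_FLAG = 256  # per the slide: adding 256 requests short_stack traces


def split_dump_level(level: int):
    """Split a dump level into (base_level, short_stacks_requested).

    Assumes the base level is below 256 and 256 is added as a flag.
    """
    if level < 0:
        raise ValueError("dump level must be non-negative")
    return level & (SHORT_STACK_FLAG - 1), bool(level & SHORT_STACK_FLAG)


def systemstate_event(base_level: int, short_stacks: bool = True) -> str:
    """Build the 'immediate' systemstate dump statement shown on the slide."""
    level = base_level + (SHORT_STACK_FLAG if short_stacks else 0)
    return f"alter session set events 'immediate trace name systemstate level {level}'"


if __name__ == "__main__":
    print(split_dump_level(266))   # base level 10 with short stacks
    print(split_dump_level(258))   # base level 2 with short stacks
    print(systemstate_event(10))
```

The helper only builds text; it is still up to you to decide whether a heavy systemstate dump is safe to run on the instance in question.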
  7. Manual reading of a process/system state dump – structure and organization

     ===================================================
     PROCESS STATE
     -------------
     Process global information:
          process: 28A4D1EC, call: 28B5FBFC, xact: 00000000, curses: 28B36EAC, usrses: 28B36EAC
       ----------------------------------------
       SO: 28A4D1EC, type: 2, owner: 00000000, flag: INIT/-/-/0x00    <-- SO address
       (process) Oracle pid=17, calls cur/top: 28B5FBFC/28B5FBFC, flag: (0)    <-- SO type (process)
                 int error: 0, call error: 0, sess error: 0, txn error 0
       (post info) last post received: 0 0 0
                   last post received-location: No post
                   last process to post me: none
                   last post sent: 0 0 0
                   last post sent-location: No post
                   last process posted by me: none
         (latch info) wait_event=0 bits=0
         Process Group: DEFAULT, pseudo proc: 28A7F368
         O/S info: user: SYSTEM, term: PORGAND, ospid: 3740
         OSD pid info: Windows thread id: 3740, image: ORACLE.EXE (SHAD)
       Dump of memory from 0x28A3A368 to 0x28A3A4EC
       28A3A360 00000005 27CF0130 [....0..']
       28A3A370 00000010 0003139D 28B5FBFC 00000003 [...........(....]
       28A3A380 0003139D 280AEC60 0000000B 0003139D [....`..(........]
               Repeat 21 times
       28A3A4E0 00000000 00000000 00000000 [............]
         ----------------------------------------
         SO: 28B36EAC, type: 4, owner: 28A4D1EC, flag: INIT/-/-/0x00    <-- Child SO (indented)
         (session) sid: 152 trans: 00000000, creator: 28A4D1EC, flag: (41) USR/- BSY/-/-/-/-/-
                   DID: 0001-0011-00000162, short-term DID: 0000-0000-00000000

     It's easy to search for held resources by searching for your problem object's address in the dump. For example, if you see that your problem sessions are hung waiting to get a lock on a library cache object at address X, it makes sense to search for that address X to see who else is holding a lock on that object.
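The "search for address X" approach can be sketched as a small script. It assumes only what the excerpt shows: each state object starts with a header of the form "SO: <addr>, type: <n>, owner: <addr>", with addresses in hex and an optional 0x prefix. The function names (parse_state_objects, find_references, owner_chain) are hypothetical, and real trace files may differ between Oracle versions, so treat this as a starting point rather than a parser you can rely on.

```python
import re

# Header line of a state object, as seen in the dump excerpt.
SO_HEADER = re.compile(
    r"SO:\s*(?:0x)?([0-9a-fA-F]+),\s*type:\s*(\d+),\s*owner:\s*(?:0x)?([0-9a-fA-F]+)"
)


def norm(addr: str) -> str:
    """Normalize an address: lowercase, no 0x prefix, no leading zeros."""
    a = addr.lower()
    if a.startswith("0x"):
        a = a[2:]
    return a.lstrip("0") or "0"


def parse_state_objects(dump_text: str) -> dict:
    """Map SO address -> {addr, type, owner, body} for each SO header found."""
    objects = {}
    current = None
    for line in dump_text.splitlines():
        m = SO_HEADER.search(line)
        if m:
            current = {"addr": norm(m.group(1)), "type": int(m.group(2)),
                       "owner": norm(m.group(3)), "body": []}
            objects[current["addr"]] = current
        elif current is not None:
            current["body"].append(line)
    return objects


def find_references(objects: dict, address: str) -> list:
    """Return addresses of SOs whose body lines mention the given address."""
    needle = norm(address)
    pattern = re.compile(
        r"(?<![0-9a-f])(?:0x)?0*" + re.escape(needle) + r"(?![0-9a-f])",
        re.IGNORECASE,
    )
    return [so["addr"] for so in objects.values()
            if any(pattern.search(line) for line in so["body"])]


def owner_chain(objects: dict, address: str) -> list:
    """Walk owner pointers upward; returns [(addr, type), ...] from the SO to its root."""
    chain, seen = [], set()
    cur = norm(address)
    while cur in objects and cur not in seen:
        seen.add(cur)
        so = objects[cur]
        chain.append((so["addr"], so["type"]))
        cur = so["owner"]
    return chain
```

A typical use would be to read the trace file, call find_references with the address your sessions are waiting on, and then call owner_chain on each hit to walk up to the owning session (type 4 in the excerpt) and process (type 2).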
  8. Tools for analyzing system state dumps
     • Oracle: ass.awk
       • Was in MOS
       • Part of LTOM
       • Ask your friendly support guy
       • Or search in Google:
         • https://www.cnblogs.com/lYng/p/9436244.html
         • Use (scripts downloaded from forums) at your own risk!
     • Tanel: ssexplorer.sh
       • HTML-izes system state dumps
  9. Low(er) level research – recursive sessions

     ...
     ----------------------------------------
     SO: 0x30f3c504, type: 3, owner: 0x30e23050, flag: INIT/-/-/0x00
     (call) sess: cur 30f13638, rec 30f0bf28, usr 30f13638; depth: 0
       ----------------------------------------
       SO: 0x30f0bf28, type: 4, owner: 0x30f3c504, flag: INIT/-/-/0x00
       (session) sid: 144 trans: (nil), creator: (nil), flag: (2) -/REC -/-/-/-/-/-
                 DID: 0000-0000-00000000, short-term DID: 0000-0000-00000000
                 txn branch: (nil)
                 oct: 0, prv: 0, sql: (nil), psql: (nil), user: 0/SYS
       temporary object counter: 0
         ----------------------------------------
         SO: 0x2eaa6de8, type: 53, owner: 0x30f0bf28, flag: INIT/-/-/0x00
         LIBRARY OBJECT LOCK: lock=2eaa6de8 handle=2d7e4a70 mode=N
         call pin=0x2ea5b1d0 session pin=(nil) hpc=0000 hlc=0000
         htl=0x2eaa6e34[0x2ea44f30,0x2ea92690] htb=0x2ea92690 ssga=0x2ea91fb4
         user=30f13638 session=30f0bf28 count=1 flags=[0000] savepoint=0xdb8

     • A separate, recursive session under the user session's call SO
     • Recursive sessions are used for data dictionary queries (SELECT and DML) executed as SYS
     • Recursive sessions are different from regular recursive calls (separation provided via call state object typically)
     • Read my old blog entry to learn about recursive sessions: http://tech.e2sn.com/oracle/oracle-internals-and-architecture/recursive-sessions-and-ora-00018-maximum-number-of-sessions-exceeded
  10. Low(er) level research – nested multi-level wait events

      Current Wait Stack:
        1: waiting for 'KSV master wait'
           =0, =0, =0
           wait_id=410 seq_num=602 snap_id=2
           wait times: snap=26 min 47 sec, exc=36 min 48 sec, total=36 min 49 sec
           wait times: max=infinite
           wait counts: calls=739 os=739
           in_wait=1 iflags=0x15a0
        0: waiting for 'ASM file metadata operation'
           msgop=11, locn=0, =0
           wait_id=408 seq_num=599 snap_id=2
           wait times: snap=0.000000 sec, exc=0.000019 sec, total=36 min 52 sec
           wait times: max=infinite
           wait counts: calls=0 os=0
           in_wait=1 iflags=0x1520

      Current Wait Stack:
        1: waiting for 'CSS operation: data query'
           function_id=0x4, =0x0, =0x0
           wait_id=766629 seq_num=39925 snap_id=1
           wait times: snap=0.000315 sec, exc=0.000315 sec, total=0.000315 sec
           wait times: max=infinite, heur=0.155743 sec
           wait counts: calls=0 os=0
           in_wait=1 iflags=0x520
        0: waiting for 'ASM file metadata operation'
           msgop=0x0, locn=0xb, =0x0
           wait_id=186922 seq_num=39924 snap_id=55419
           wait times: snap=0.000000 sec, exc=8.818839 sec, total=380 min 53 sec
           wait times: max=infinite, heur=380 min 53 sec
           wait counts: calls=0 os=0
           in_wait=1 iflags=0x15a0

      • One higher level wait temporarily switches into another operation (at lower level) and temporarily waits for that
      • SQL Trace traces only the top level wait event!
      • See my Oracle Wait Event internals background process communication hacking session: https://www.youtube.com/watch?v=mkmvZv58W6w
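To pull the nested wait events out of a "Current Wait Stack" section, a small extraction script is usually enough. The sketch below assumes the line format shown in the excerpts ("<n>: waiting for '<event>'") and assumes that the entry numbered 0 is the outermost wait, with higher numbers nested inside it. That reading is an inference from the wait times in the excerpts, not something the dump states. The function name wait_stack is made up for the example.

```python
import re

# Matches e.g.:  1: waiting for 'KSV master wait'
WAIT_ENTRY = re.compile(r"\b(\d+):\s*waiting for '([^']+)'")


def wait_stack(stack_text: str) -> list:
    """Return [(depth, event_name), ...] sorted by depth.

    Depth 0 is assumed to be the outermost wait; higher numbers are the
    nested waits it has temporarily switched into.
    """
    entries = [(int(depth), event) for depth, event in WAIT_ENTRY.findall(stack_text)]
    return sorted(entries)


if __name__ == "__main__":
    sample = """Current Wait Stack:
      1: waiting for 'KSV master wait'
         wait_id=410 seq_num=602 snap_id=2
      0: waiting for 'ASM file metadata operation'
         wait_id=408 seq_num=599 snap_id=2
    """
    for depth, event in wait_stack(sample):
        print(depth, event)
```

Listing the whole stack this way makes the nested event visible alongside the outer one, which matters because the slide notes that SQL Trace reports only one level.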
  11. Latchless dumping

      ----------------------------------------
      SO: 3a8729f90, type: 3, owner: 3ab9396b8, flag: INIT/-/-/0x00
      (call) sess: cur 3aaddd538, rec 0, usr 3aaddd538; depth: 0
        ----------------------------------------
        SO: 3aab76570, type: 24, owner: 3a8729f90, flag: INIT/-/-/0x00
        Aborting this subtree dump because of state inconsistency
      ----------------------------------------
      SO: 3a7268bb8, type: 16, owner: 3ab9396b8, flag: INIT/-/-/0x00
      (osp req holder)

      REDO: 0x0 SINGLE / -- / --
      itl: 2, sno: 131, row size 28
      insert key: (24): 06 68 69 6e 74 6f 6e 09 63 6c 65 76 65 6c 61 6e 64 06 00 4e f8 c5 00 07
      ------------------------------------------------------
      ------------------------------------------------------
      IMU Undo change vector list (latched dump)
      ------------------------------------------------------
      umap: 0xccc5b1d8 uba: 0x01013e30.1ad4.42 undobh 0x3fbef4c70 cv 0xccc5b060 rcvi 0 Not applied
      ------------------------------------------------------
      ktudb redo: siz: 112 spc: 1028 flg: 0x0012 seq: 0x1ad4 rec: 0x42
                  xid: 0x0017.011.00011560
      ktubl redo: slt: 17 rci: 0 opc: 11.1 [objn: 75287 objd: 75287 tsn: 7]
      Undo type: Regular undo Begin trans Last buffer split: No
      Temp Object: No

      • Systemstate dumps & hanganalyze attempt to read state objects without taking latches
      • I have hit bugs in the past where the systemstate dump itself was hung trying to get a held latch
      • Apparently some "latched dumps" are still used (but I hope it just tries once with an immediate (nowait) latch get and moves on)
  12. Other details (if have time)

      10065, 00000, "limit library cache dump information for state object dump"
      // *Document: NO
      // *Cause:
      // *Action: level 1 - minimal (only the address of state objects)
      //          level 2 - little more (no object details)
      //          level 3 - normal

      From Julian Dyke's OracleDiagnostics.ppt:
      • Level 1: Address of library object only
      • Level 2: As 1 plus library object lock details
      • Level 3: As 2 plus library object handle and library object

      10809, 00000, "Trace state object allocate / free history"
      // *Document: NO
      // *Cause:
      // *Action: Set this event only under the supervision of Oracle development
      // *Comment: This event will trace the history of KSS allocations / deletions.
      //           level: 0 = disabled, 1 = cleanup only, 2 = always

      In my testing on 18.3 this only traced state object deletions/releases and no allocations/gets. Could trace kss.* function calls instead.
  13. Additional Reading
      • Tanel’s stuff
        • http://tech.e2sn.com/oracle/oracle-internals-and-architecture/recursive-sessions-and-ora-00018-maximum-number-of-sessions-exceeded
      • MOS Notes
        • Reading and Understanding Systemstate Dumps (Doc ID 423153.1)
        • Bug 11800959 - A SYSTEMSTATE dump with level >= 10 in RAC dumps huge BUSY GLOBAL CACHE ELEMENTS - can hang/crash instances (Doc ID 11800959.8)
      • Julian Dyke’s internals diagrams (SGA data structures etc)
        • http://www.juliandyke.com/Presentations/Presentations.php
      • Frits Hoogland’s Oracle function name collection
        • http://orafun.info
        • http://orafun.info/stack
        • https://gitlab.com/FritsHoogland/ora_functions
  14. Thanks!
      • Next hacking session at the end of Jan 2020 (TBA)
      • Check http://blog.tanelpoder.com
      • Follow https://twitter.com/TanelPoder