SlideShare une entreprise Scribd logo
1  sur  33
了解IO协议栈

核心系统数据库组 余锋
 http://yufeng.info
     @淘宝褚霸
     2012-03-18
                      1
提纲

•   IO子系统架构图
•   IO子系统各层分解
•   IO请求事件跟踪点
•   blktrace/btt解释




                          2
IO子系统架构图


                     $stap -l
                     'ioblock.*'
                     ioblock.end
                     ioblock.request
                   $stap -l 'ioscheduler.*'
                   ioscheduler.elv_add_request
                   ioscheduler.elv_completed_request
blktrace   DM层     ioscheduler.elv_next_request




                                                  3
块层框图



buffered   mmap     direct io
io




                                4
思考



  IO子系统有几层?

各层的输入输出分别是什么?



                5
块层

probe ioblock.request
   –Fires whenever making a generic block I/O
    request.


probe ioblock.end
  –Fires whenever a block I/O transfer is complete.




                                                      6
DM 层




•LVM2(Linux Volume Manager 2 version)
•EVMS(Enterprise Volume Management System)      7
•dmraid(Device Mapper Raid Tool)
请求队列/电梯

probe ioscheduler.elv_add_request.kp
  - kprobe based probe to indicate that a
  request was added to the request queue

probe ioscheduler.elv_next_request
  –Fires when a request is retrieved from the request
    queue


probe ioscheduler.elv_completed_request
   –Fires when a request is completed
                                                        8
调度器参数微调
# cat /sys/block/sda/queue/scheduler
noop anticipatory [deadline] cfq




文档参考:Documentation/block/deadline-iosched.txt
                                                9
思考




电梯算法的核心作用是什么?




                10
驱动程序

• 中断平衡
 – /proc/irq/IRQ/smp_affinity
• 软中断平衡
 – /sys/block/DEV/queue/rq_affinity




                                      11
块请求关键事件点
$perf list|grep “block:”或者
$trace-cmd list |grep 'block:*‘
$stap -l ‘kernel.trace(“block_*”)‘

block:block_rq_abort         block:block_bio_queue
block:block_rq_requeue       block:block_getrq
block:block_rq_complete      block:block_sleeprq
block:block_rq_insert        block:block_plug
block:block_rq_issue         block:block_unplug_timer
block:block_bio_bounce       block:block_unplug_io
block:block_bio_complete     block:block_split
block:block_bio_backmerge    block:block_remap
block:block_bio_frontmerge   block:block_rq_remap

                                                        12
Tracepoint解释

C -- complete A previously issued request has been completed.

D -- issued A request that previously resided on the block
   layer
           queue or in the i/o scheduler has been sent to the
   driver.

I -- inserted A request is being sent to the i/o scheduler for
   addi-tion to the internal queue and later service by the
   driver.

Q -- queued This notes intent to queue i/o at the given
  location.
                                                                13
B -- bounced The data pages attached to this bio are not
Tracepoint解释(续)

M -- back merge A previously inserted request exists that
  ends on the boundary of where this i/o begins, so the i/o
  scheduler can merge them together.

F -- front merge Same as the back merge, except this i/o ends
  where a previously inserted requests starts.

G -- get request To send any type of request to a block
   device, a struct request container must be allocated first.

S -- sleep No available request structures were available,
  so the
         issuer has to wait for one to be freed.
                                                              14
Tracepoint解释(续)

P -- plug When i/o is queued to a previously empty block
   device queue, Linux will plug the queue in anticipation of
   future ios being added before this data is needed.
U -- unplug Some request data already queued in the device,
   start sending requests to the driver.
T -- unplug due to timer If nobody requests the i/o that was
   queued after plugging the queue, Linux will automatically
   unplug it after a defined period has passed.
X -- split On raid or device mapper setups, an incoming
   i/o may
           straddle a device or internal zone and needs to be
   hopped      up into smaller pieces for service.
A -- remap For stacked devices, incoming i/o is remapped to
   device below it in the i/o stack.
                                                            15
思考




如何可视化IO请求生命期?



                16
IO行为观察




不觉得信息量太少吗?

             17
blktrace架构图




              18
blktrace可过滤事件

barrier: barrier attribute
complete: completed by driver
fs: requests
issue: issued to driver
pc: packet command events
queue: queue operations
read: read traces
requeue: requeue operations
sync: synchronous attribute
write: write traces
notify: trace messages
drv_data: additional driver specific trace

                                             19
btrace第一感




            20
blkiomon




           21
btt

#   blktrace /dev/sdb
#   blkparse -i sdb -d sdb.bin
#   blkrawverify sdb
#   btt -i sdb.bin -A




                                 22
btt: Life of an I/O

• Q2I – time it takes to process an I/O prior to it
  being inserted or merged onto a request queue
  – Includes split, and remap time
• I2D – time the I/O is “idle” on the request
  queue
• D2C – time the I/O is “active” in the driver and
  on the device
• Q2I + I2D + D2C = Q2C
  – Q2C: Total processing time of the I/O




                                                      23
btt解读

==================== All Devices ====================

            ALL           MIN           AVG           MAX           N
--------------- ------------- ------------- ------------- -----------

Q2Q              0.000007098   0.085323752   1.189534849          14
Q2G              0.000000685   0.000001737   0.000004757          12
G2I              0.000000272   0.000001724   0.000004240          12
Q2M              0.000000475   0.000001036   0.000001362           3
I2D              0.000002502   0.000244633   0.002238651          12
M2D              0.000004870   0.000065011   0.000178722           3
D2C              0.000055488   0.000145720   0.000219068          15
Q2C              0.000062048   0.000357405   0.002303758          15

                                                                    24
btt解读(续)

==================== Device Overhead ====================

       DEV   |       Q2G       G2I       Q2M       I2D       D2C
----------   | --------- --------- --------- --------- ---------
 ( 8, 16)    |   0.3889%   0.3859%   0.0580% 54.7575% 40.7717%
----------   | --------- --------- --------- --------- ---------
   Overall   |   0.3889%   0.3859%   0.0580% 54.7575% 40.7717%




                                                                   25
btt解读(续)

==================== Device Merge Information ====================

       DEV |       #Q       #D Ratio |     BLKmin BLKavg BLKmax        Total
---------- | -------- -------- ------- | -------- -------- -------- --------
 ( 8, 16) |        15       12     1.2 |        8       10       24      120




                                                                               26
btt解读(续)
==================== Device Q2Q Seek Information ====================

       DEV   |          NSEEKS            MEAN          MEDIAN | MODE
----------   | --------------- --------------- --------------- | ---------------
 ( 8, 16)    |              15     620978236.7               0 | 0(5)
----------   | --------------- --------------- --------------- | ---------------
   Overall   |          NSEEKS            MEAN          MEDIAN | MODE
   Average   |              15     620978236.7               0 | 0(5)

==================== Device D2D Seek Information ====================

       DEV   |          NSEEKS            MEAN          MEDIAN | MODE
----------   | --------------- --------------- --------------- | ---------------
 ( 8, 16)    |              12     776222795.9               0 | 0(2)
----------   | --------------- --------------- --------------- | ---------------
   Overall   |          NSEEKS            MEAN          MEDIAN | MODE
   Average   |              12     776222795.9               0 | 0(2)
                                                                               27
btt解读(续)

==================== Plug Information ====================

       DEV |    # Plugs # Timer Us | % Time Q Plugged
---------- | ---------- ---------- | ----------------
 ( 8, 16) |           5(         1) |  0.226614061%

       DEV   |    IOs/Unp   IOs/Unp(to)
----------   | ----------   ----------
 ( 8, 16)    |        0.8          1.0
----------   | ----------   ----------
   Overall   |    IOs/Unp   IOs/Unp(to)
   Average   |        0.8          1.0



                                                             28
btt解读(续)

================= Active Requests At Q Information ================


       DEV | Avg Reqs @ Q
---------- | -------------
 ( 8, 16) |            0.9




                                                                      29
思考




除了用户应用,
谁还在使用块层?


           30
页面回写机制

$stap –l ‘kernel.function("congestion_wait")‘

$perf list|grep "writeback:“

writeback:writeback_nothread
..
 writeback:writeback_nowork
 writeback:writeback_bdi_register
 writeback:writeback_bdi_unregister
 writeback:writeback_task_start
 writeback:writeback_task_stop
 writeback:wbc_writeback_start
 writeback:wbc_writeback_written
 writeback:wbc_writeback_wait
 writeback:wbc_balance_dirty_start
 writeback:wbc_balance_dirty_written
 writeback:wbc_balance_dirty_wait
 writeback:wbc_writepage                  31
参考材料

• blktrace相关:
  http://blog.yufeng.info/archives/tag/blktra
  ce
• systemtap相关:
  http://blog.yufeng.info/archives/tag/system
  tap




                                            32
提问时间




谢谢大家!


         33

Contenu connexe

Tendances

A deep dive about VIP,HAIP, and SCAN
A deep dive about VIP,HAIP, and SCAN A deep dive about VIP,HAIP, and SCAN
A deep dive about VIP,HAIP, and SCAN Riyaj Shamsudeen
 
Debunking myths about_redo_ppt
Debunking myths about_redo_pptDebunking myths about_redo_ppt
Debunking myths about_redo_pptRiyaj Shamsudeen
 
Thomas+Niewel+ +Oracletuning
Thomas+Niewel+ +OracletuningThomas+Niewel+ +Oracletuning
Thomas+Niewel+ +Oracletuningafa reg
 
Dbms plan - A swiss army knife for performance engineers
Dbms plan - A swiss army knife for performance engineersDbms plan - A swiss army knife for performance engineers
Dbms plan - A swiss army knife for performance engineersRiyaj Shamsudeen
 
Understanding Query Optimization with ‘regular’ and ‘Exadata’ Oracle
Understanding Query Optimization with ‘regular’ and ‘Exadata’ OracleUnderstanding Query Optimization with ‘regular’ and ‘Exadata’ Oracle
Understanding Query Optimization with ‘regular’ and ‘Exadata’ OracleGuatemala User Group
 
The Data Center and Hadoop
The Data Center and HadoopThe Data Center and Hadoop
The Data Center and HadoopDataWorks Summit
 
Debug dpdk process bottleneck & painpoints
Debug dpdk process bottleneck & painpointsDebug dpdk process bottleneck & painpoints
Debug dpdk process bottleneck & painpointsVipin Varghese
 
Understanding cts log_messages
Understanding cts log_messagesUnderstanding cts log_messages
Understanding cts log_messagesMujahid Mohammed
 
Implementing Useful Clock Skew Using Skew Groups
Implementing Useful Clock Skew Using Skew GroupsImplementing Useful Clock Skew Using Skew Groups
Implementing Useful Clock Skew Using Skew GroupsM Mei
 
New Tuning Features in Oracle 11g - How to make your database as boring as po...
New Tuning Features in Oracle 11g - How to make your database as boring as po...New Tuning Features in Oracle 11g - How to make your database as boring as po...
New Tuning Features in Oracle 11g - How to make your database as boring as po...Sage Computing Services
 
Oow2007 performance
Oow2007 performanceOow2007 performance
Oow2007 performanceRicky Zhu
 
Computing Performance: On the Horizon (2021)
Computing Performance: On the Horizon (2021)Computing Performance: On the Horizon (2021)
Computing Performance: On the Horizon (2021)Brendan Gregg
 
Quickly Locate Poorly Performing DB2 for z/OS Batch SQL
Quickly Locate Poorly Performing DB2 for z/OS Batch SQL Quickly Locate Poorly Performing DB2 for z/OS Batch SQL
Quickly Locate Poorly Performing DB2 for z/OS Batch SQL softbasemarketing
 
12c: Testing audit features for Data Pump (Export & Import) and RMAN jobs
12c: Testing audit features for Data Pump (Export & Import) and RMAN jobs12c: Testing audit features for Data Pump (Export & Import) and RMAN jobs
12c: Testing audit features for Data Pump (Export & Import) and RMAN jobsMonowar Mukul
 
Ash masters : advanced ash analytics on Oracle
Ash masters : advanced ash analytics on Oracle Ash masters : advanced ash analytics on Oracle
Ash masters : advanced ash analytics on Oracle Kyle Hailey
 

Tendances (20)

A deep dive about VIP,HAIP, and SCAN
A deep dive about VIP,HAIP, and SCAN A deep dive about VIP,HAIP, and SCAN
A deep dive about VIP,HAIP, and SCAN
 
Rac introduction
Rac introductionRac introduction
Rac introduction
 
Debunking myths about_redo_ppt
Debunking myths about_redo_pptDebunking myths about_redo_ppt
Debunking myths about_redo_ppt
 
Redo internals ppt
Redo internals pptRedo internals ppt
Redo internals ppt
 
Thomas+Niewel+ +Oracletuning
Thomas+Niewel+ +OracletuningThomas+Niewel+ +Oracletuning
Thomas+Niewel+ +Oracletuning
 
Dbms plan - A swiss army knife for performance engineers
Dbms plan - A swiss army knife for performance engineersDbms plan - A swiss army knife for performance engineers
Dbms plan - A swiss army knife for performance engineers
 
Casnewb
CasnewbCasnewb
Casnewb
 
Understanding Query Optimization with ‘regular’ and ‘Exadata’ Oracle
Understanding Query Optimization with ‘regular’ and ‘Exadata’ OracleUnderstanding Query Optimization with ‘regular’ and ‘Exadata’ Oracle
Understanding Query Optimization with ‘regular’ and ‘Exadata’ Oracle
 
The Data Center and Hadoop
The Data Center and HadoopThe Data Center and Hadoop
The Data Center and Hadoop
 
Debug dpdk process bottleneck & painpoints
Debug dpdk process bottleneck & painpointsDebug dpdk process bottleneck & painpoints
Debug dpdk process bottleneck & painpoints
 
Understanding cts log_messages
Understanding cts log_messagesUnderstanding cts log_messages
Understanding cts log_messages
 
Implementing Useful Clock Skew Using Skew Groups
Implementing Useful Clock Skew Using Skew GroupsImplementing Useful Clock Skew Using Skew Groups
Implementing Useful Clock Skew Using Skew Groups
 
Combo fix
Combo fixCombo fix
Combo fix
 
New Tuning Features in Oracle 11g - How to make your database as boring as po...
New Tuning Features in Oracle 11g - How to make your database as boring as po...New Tuning Features in Oracle 11g - How to make your database as boring as po...
New Tuning Features in Oracle 11g - How to make your database as boring as po...
 
Oow2007 performance
Oow2007 performanceOow2007 performance
Oow2007 performance
 
Computing Performance: On the Horizon (2021)
Computing Performance: On the Horizon (2021)Computing Performance: On the Horizon (2021)
Computing Performance: On the Horizon (2021)
 
Quickly Locate Poorly Performing DB2 for z/OS Batch SQL
Quickly Locate Poorly Performing DB2 for z/OS Batch SQL Quickly Locate Poorly Performing DB2 for z/OS Batch SQL
Quickly Locate Poorly Performing DB2 for z/OS Batch SQL
 
12c: Testing audit features for Data Pump (Export & Import) and RMAN jobs
12c: Testing audit features for Data Pump (Export & Import) and RMAN jobs12c: Testing audit features for Data Pump (Export & Import) and RMAN jobs
12c: Testing audit features for Data Pump (Export & Import) and RMAN jobs
 
Rmoug ashmaster
Rmoug ashmasterRmoug ashmaster
Rmoug ashmaster
 
Ash masters : advanced ash analytics on Oracle
Ash masters : advanced ash analytics on Oracle Ash masters : advanced ash analytics on Oracle
Ash masters : advanced ash analytics on Oracle
 

Similaire à Io 120404052713-phpapp01

BRKRST-3066 - Troubleshooting Nexus 7000 (2013 Melbourne) - 2 Hours.pdf
BRKRST-3066 - Troubleshooting Nexus 7000 (2013 Melbourne) - 2 Hours.pdfBRKRST-3066 - Troubleshooting Nexus 7000 (2013 Melbourne) - 2 Hours.pdf
BRKRST-3066 - Troubleshooting Nexus 7000 (2013 Melbourne) - 2 Hours.pdfaaajjj4
 
Writing efficient sql
Writing efficient sqlWriting efficient sql
Writing efficient sqlj9soto
 
Oracle Basics and Architecture
Oracle Basics and ArchitectureOracle Basics and Architecture
Oracle Basics and ArchitectureSidney Chen
 
OSDC 2017 - Werner Fischer - Linux performance profiling and monitoring
OSDC 2017 - Werner Fischer - Linux performance profiling and monitoringOSDC 2017 - Werner Fischer - Linux performance profiling and monitoring
OSDC 2017 - Werner Fischer - Linux performance profiling and monitoringNETWAYS
 
OSMC 2015: Linux Performance Profiling and Monitoring by Werner Fischer
OSMC 2015: Linux Performance Profiling and Monitoring by Werner FischerOSMC 2015: Linux Performance Profiling and Monitoring by Werner Fischer
OSMC 2015: Linux Performance Profiling and Monitoring by Werner FischerNETWAYS
 
OSMC 2015 | Linux Performance Profiling and Monitoring by Werner Fischer
OSMC 2015 | Linux Performance Profiling and Monitoring by Werner FischerOSMC 2015 | Linux Performance Profiling and Monitoring by Werner Fischer
OSMC 2015 | Linux Performance Profiling and Monitoring by Werner FischerNETWAYS
 
OSDC 2015: Georg Schönberger | Linux Performance Profiling and Monitoring
OSDC 2015: Georg Schönberger | Linux Performance Profiling and MonitoringOSDC 2015: Georg Schönberger | Linux Performance Profiling and Monitoring
OSDC 2015: Georg Schönberger | Linux Performance Profiling and MonitoringNETWAYS
 
Linux Performance Profiling and Monitoring
Linux Performance Profiling and MonitoringLinux Performance Profiling and Monitoring
Linux Performance Profiling and MonitoringGeorg Schönberger
 
Percona Live UK 2014 Part III
Percona Live UK 2014  Part IIIPercona Live UK 2014  Part III
Percona Live UK 2014 Part IIIAlkin Tezuysal
 
BRKDCT-3144 - Advanced - Troubleshooting Cisco Nexus 7000 Series Switches (20...
BRKDCT-3144 - Advanced - Troubleshooting Cisco Nexus 7000 Series Switches (20...BRKDCT-3144 - Advanced - Troubleshooting Cisco Nexus 7000 Series Switches (20...
BRKDCT-3144 - Advanced - Troubleshooting Cisco Nexus 7000 Series Switches (20...aaajjj4
 
Cisco: Care and Feeding of Smart Licensing
Cisco: Care and Feeding of Smart LicensingCisco: Care and Feeding of Smart Licensing
Cisco: Care and Feeding of Smart Licensingdaxtindavon
 
Armboot process zeelogic
Armboot process zeelogicArmboot process zeelogic
Armboot process zeelogicAleem Shariff
 
Understanding IBM Tivoli OMEGAMON for DB2 Batch Reporting, Customization and ...
Understanding IBM Tivoli OMEGAMON for DB2 Batch Reporting, Customization and ...Understanding IBM Tivoli OMEGAMON for DB2 Batch Reporting, Customization and ...
Understanding IBM Tivoli OMEGAMON for DB2 Batch Reporting, Customization and ...Cuneyt Goksu
 
2010 03 papi_indiana
2010 03 papi_indiana2010 03 papi_indiana
2010 03 papi_indianaPTIHPA
 
YOW2020 Linux Systems Performance
YOW2020 Linux Systems PerformanceYOW2020 Linux Systems Performance
YOW2020 Linux Systems PerformanceBrendan Gregg
 
Troubleshooting Complex Performance issues - Oracle SEG$ contention
Troubleshooting Complex Performance issues - Oracle SEG$ contentionTroubleshooting Complex Performance issues - Oracle SEG$ contention
Troubleshooting Complex Performance issues - Oracle SEG$ contentionTanel Poder
 
Linux Systems Performance 2016
Linux Systems Performance 2016Linux Systems Performance 2016
Linux Systems Performance 2016Brendan Gregg
 
Jboss World 2011 Infinispan
Jboss World 2011 InfinispanJboss World 2011 Infinispan
Jboss World 2011 Infinispancbo_
 

Similaire à Io 120404052713-phpapp01 (20)

BRKRST-3066 - Troubleshooting Nexus 7000 (2013 Melbourne) - 2 Hours.pdf
BRKRST-3066 - Troubleshooting Nexus 7000 (2013 Melbourne) - 2 Hours.pdfBRKRST-3066 - Troubleshooting Nexus 7000 (2013 Melbourne) - 2 Hours.pdf
BRKRST-3066 - Troubleshooting Nexus 7000 (2013 Melbourne) - 2 Hours.pdf
 
Writing efficient sql
Writing efficient sqlWriting efficient sql
Writing efficient sql
 
Oracle Basics and Architecture
Oracle Basics and ArchitectureOracle Basics and Architecture
Oracle Basics and Architecture
 
OSDC 2017 - Werner Fischer - Linux performance profiling and monitoring
OSDC 2017 - Werner Fischer - Linux performance profiling and monitoringOSDC 2017 - Werner Fischer - Linux performance profiling and monitoring
OSDC 2017 - Werner Fischer - Linux performance profiling and monitoring
 
OSMC 2015: Linux Performance Profiling and Monitoring by Werner Fischer
OSMC 2015: Linux Performance Profiling and Monitoring by Werner FischerOSMC 2015: Linux Performance Profiling and Monitoring by Werner Fischer
OSMC 2015: Linux Performance Profiling and Monitoring by Werner Fischer
 
OSMC 2015 | Linux Performance Profiling and Monitoring by Werner Fischer
OSMC 2015 | Linux Performance Profiling and Monitoring by Werner FischerOSMC 2015 | Linux Performance Profiling and Monitoring by Werner Fischer
OSMC 2015 | Linux Performance Profiling and Monitoring by Werner Fischer
 
OSDC 2015: Georg Schönberger | Linux Performance Profiling and Monitoring
OSDC 2015: Georg Schönberger | Linux Performance Profiling and MonitoringOSDC 2015: Georg Schönberger | Linux Performance Profiling and Monitoring
OSDC 2015: Georg Schönberger | Linux Performance Profiling and Monitoring
 
Linux Performance Profiling and Monitoring
Linux Performance Profiling and MonitoringLinux Performance Profiling and Monitoring
Linux Performance Profiling and Monitoring
 
Percona Live UK 2014 Part III
Percona Live UK 2014  Part IIIPercona Live UK 2014  Part III
Percona Live UK 2014 Part III
 
BRKDCT-3144 - Advanced - Troubleshooting Cisco Nexus 7000 Series Switches (20...
BRKDCT-3144 - Advanced - Troubleshooting Cisco Nexus 7000 Series Switches (20...BRKDCT-3144 - Advanced - Troubleshooting Cisco Nexus 7000 Series Switches (20...
BRKDCT-3144 - Advanced - Troubleshooting Cisco Nexus 7000 Series Switches (20...
 
Cisco: Care and Feeding of Smart Licensing
Cisco: Care and Feeding of Smart LicensingCisco: Care and Feeding of Smart Licensing
Cisco: Care and Feeding of Smart Licensing
 
Armboot process zeelogic
Armboot process zeelogicArmboot process zeelogic
Armboot process zeelogic
 
Understanding IBM Tivoli OMEGAMON for DB2 Batch Reporting, Customization and ...
Understanding IBM Tivoli OMEGAMON for DB2 Batch Reporting, Customization and ...Understanding IBM Tivoli OMEGAMON for DB2 Batch Reporting, Customization and ...
Understanding IBM Tivoli OMEGAMON for DB2 Batch Reporting, Customization and ...
 
2010 03 papi_indiana
2010 03 papi_indiana2010 03 papi_indiana
2010 03 papi_indiana
 
Oracle 11g caracteristicas poco documentadas 3 en 1
Oracle 11g caracteristicas poco documentadas 3 en 1Oracle 11g caracteristicas poco documentadas 3 en 1
Oracle 11g caracteristicas poco documentadas 3 en 1
 
YOW2020 Linux Systems Performance
YOW2020 Linux Systems PerformanceYOW2020 Linux Systems Performance
YOW2020 Linux Systems Performance
 
Troubleshooting Complex Performance issues - Oracle SEG$ contention
Troubleshooting Complex Performance issues - Oracle SEG$ contentionTroubleshooting Complex Performance issues - Oracle SEG$ contention
Troubleshooting Complex Performance issues - Oracle SEG$ contention
 
Linux Systems Performance 2016
Linux Systems Performance 2016Linux Systems Performance 2016
Linux Systems Performance 2016
 
C&C Botnet Factory
C&C Botnet FactoryC&C Botnet Factory
C&C Botnet Factory
 
Jboss World 2011 Infinispan
Jboss World 2011 InfinispanJboss World 2011 Infinispan
Jboss World 2011 Infinispan
 

Dernier

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 

Dernier (20)

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 

Io 120404052713-phpapp01

  • 2. 提纲 • IO子系统架构图 • IO子系统各层分解 • IO请求事件跟踪点 • blktrace/btt解释 2
  • 3. IO子系统架构图 $stap -l 'ioblock.*' ioblock.end ioblock.request $stap -l 'ioscheduler.*' ioscheduler.elv_add_request ioscheduler.elv_completed_request blktrace DM层 ioscheduler.elv_next_request 3
  • 4. 块层框图 buffered mmap direct io io 4
  • 6. 块层 probe ioblock.request –Fires whenever making a generic block I/O request. probe ioblock.end –Fires whenever a block I/O transfer is complete. 6
  • 7. DM 层 •LVM2(Linux Volume Manager 2 version) •EVMS(Enterprise Volume Management System) 7 •dmraid(Device Mapper Raid Tool)
  • 8. 请求队列/电梯 probe ioscheduler.elv_add_request.kp - kprobe based probe to indicate that a request was added to the request queue probe ioscheduler.elv_next_request –Fires when a request is retrieved from the request queue probe ioscheduler.elv_completed_request –Fires when a request is completed 8
  • 9. 调度器参数微调 # cat /sys/block/sda/queue/scheduler noop anticipatory [deadline] cfq 文档参考:Documentation/block/deadline-iosched.txt 9
  • 11. 驱动程序 • 中断平衡 – /proc/irq/IRQ/smp_affinity • 软中断平衡 – /sys/block/DEV/queue/rq_affinity 11
  • 12. 块请求关键事件点 $perf list|grep “block:”或者 $trace-cmd list |grep 'block:*‘ $stap -l ‘kernel.trace(“block_*”)‘ block:block_rq_abort block:block_bio_queue block:block_rq_requeue block:block_getrq block:block_rq_complete block:block_sleeprq block:block_rq_insert block:block_plug block:block_rq_issue block:block_unplug_timer block:block_bio_bounce block:block_unplug_io block:block_bio_complete block:block_split block:block_bio_backmerge block:block_remap block:block_bio_frontmerge block:block_rq_remap 12
  • 13. Tracepoint解释 C -- complete A previously issued request has been completed. D -- issued A request that previously resided on the block layer queue or in the i/o scheduler has been sent to the driver. I -- inserted A request is being sent to the i/o scheduler for addi-tion to the internal queue and later service by the driver. Q -- queued This notes intent to queue i/o at the given location. 13 B -- bounced The data pages attached to this bio are not
  • 14. Tracepoint解释(续) M -- back merge A previously inserted request exists that ends on the boundary of where this i/o begins, so the i/o scheduler can merge them together. F -- front merge Same as the back merge, except this i/o ends where a previously inserted requests starts. G -- get request To send any type of request to a block device, a struct request container must be allocated first. S -- sleep No available request structures were available, so the issuer has to wait for one to be freed. 14
  • 15. Tracepoint解释(续) P -- plug When i/o is queued to a previously empty block device queue, Linux will plug the queue in anticipation of future ios being added before this data is needed. U -- unplug Some request data already queued in the device, start sending requests to the driver. T -- unplug due to timer If nobody requests the i/o that was queued after plugging the queue, Linux will automatically unplug it after a defined period has passed. X -- split On raid or device mapper setups, an incoming i/o may straddle a device or internal zone and needs to be hopped up into smaller pieces for service. A -- remap For stacked devices, incoming i/o is remapped to device below it in the i/o stack. 15
  • 19. blktrace可过滤事件 barrier: barrier attribute complete: completed by driver fs: requests issue: issued to driver pc: packet command events queue: queue operations read: read traces requeue: requeue operations sync: synchronous attribute write: write traces notify: trace messages drv_data: additional driver specific trace 19
  • 21. blkiomon 21
  • 22. btt # blktrace /dev/sdb # blkparse -i sdb -d sdb.bin # blkrawverify sdb # btt -i sdb.bin -A 22
  • 23. btt: Life of an I/O • Q2I – time it takes to process an I/O prior to it being inserted or merged onto a request queue – Includes split, and remap time • I2D – time the I/O is “idle” on the request queue • D2C – time the I/O is “active” in the driver and on the device • Q2I + I2D + D2C = Q2C – Q2C: Total processing time of the I/O 23
  • 24. btt解读 ==================== All Devices ==================== ALL MIN AVG MAX N --------------- ------------- ------------- ------------- ----------- Q2Q 0.000007098 0.085323752 1.189534849 14 Q2G 0.000000685 0.000001737 0.000004757 12 G2I 0.000000272 0.000001724 0.000004240 12 Q2M 0.000000475 0.000001036 0.000001362 3 I2D 0.000002502 0.000244633 0.002238651 12 M2D 0.000004870 0.000065011 0.000178722 3 D2C 0.000055488 0.000145720 0.000219068 15 Q2C 0.000062048 0.000357405 0.002303758 15 24
  • 25. btt解读(续) ==================== Device Overhead ==================== DEV | Q2G G2I Q2M I2D D2C ---------- | --------- --------- --------- --------- --------- ( 8, 16) | 0.3889% 0.3859% 0.0580% 54.7575% 40.7717% ---------- | --------- --------- --------- --------- --------- Overall | 0.3889% 0.3859% 0.0580% 54.7575% 40.7717% 25
  • 26. btt解读(续) ==================== Device Merge Information ==================== DEV | #Q #D Ratio | BLKmin BLKavg BLKmax Total ---------- | -------- -------- ------- | -------- -------- -------- -------- ( 8, 16) | 15 12 1.2 | 8 10 24 120 26
  • 27. btt解读(续) ==================== Device Q2Q Seek Information ==================== DEV | NSEEKS MEAN MEDIAN | MODE ---------- | --------------- --------------- --------------- | --------------- ( 8, 16) | 15 620978236.7 0 | 0(5) ---------- | --------------- --------------- --------------- | --------------- Overall | NSEEKS MEAN MEDIAN | MODE Average | 15 620978236.7 0 | 0(5) ==================== Device D2D Seek Information ==================== DEV | NSEEKS MEAN MEDIAN | MODE ---------- | --------------- --------------- --------------- | --------------- ( 8, 16) | 12 776222795.9 0 | 0(2) ---------- | --------------- --------------- --------------- | --------------- Overall | NSEEKS MEAN MEDIAN | MODE Average | 12 776222795.9 0 | 0(2) 27
  • 28. btt解读(续) ==================== Plug Information ==================== DEV | # Plugs # Timer Us | % Time Q Plugged ---------- | ---------- ---------- | ---------------- ( 8, 16) | 5( 1) | 0.226614061% DEV | IOs/Unp IOs/Unp(to) ---------- | ---------- ---------- ( 8, 16) | 0.8 1.0 ---------- | ---------- ---------- Overall | IOs/Unp IOs/Unp(to) Average | 0.8 1.0 28
  • 29. btt解读(续) ================= Active Requests At Q Information ================ DEV | Avg Reqs @ Q ---------- | ------------- ( 8, 16) | 0.9 29
  • 31. 页面回写机制 $stap –l ‘kernel.function("congestion_wait")‘ $perf list|grep "writeback:“ writeback:writeback_nothread .. writeback:writeback_nowork writeback:writeback_bdi_register writeback:writeback_bdi_unregister writeback:writeback_task_start writeback:writeback_task_stop writeback:wbc_writeback_start writeback:wbc_writeback_written writeback:wbc_writeback_wait writeback:wbc_balance_dirty_start writeback:wbc_balance_dirty_written writeback:wbc_balance_dirty_wait writeback:wbc_writepage 31
  • 32. 参考材料 • blktrace相关: http://blog.yufeng.info/archives/tag/blktra ce • systemtap相关: http://blog.yufeng.info/archives/tag/system tap 32