Disksim with SSD extension
          -- A developer's perspective


                    Jiannan Ouyang
                     PhD CS@PITT
                         2011/04/07
Outline

  Overview

  Disksim implementation

  SSD extension
Disksim

Disksim: An open source disk simulator originally developed at
UMich. and enhanced at CMU.
Disksim features

  Various device models, including: disk, simpledisk,
  memsmodel

  Controller models: simple, smart (with cache)

  Trace synthesis and different trace file formats

  DIXtrac: automatic disk characterization
ssdmodel

 Developed by Microsoft.

 NOT for any specific SSD Device

 For an idealized SSD that is parameterized by the
 properties of NAND flash chips

 Cache is NOT natively supported
Source Dir


     src/         disksim source (disksim_*.c/h)

     ssdmodel/    ssd extension source (ssd_*.c/h)

     diskmodel/   diskmodel layout and mech

     memsmodel/   MEMS device model

     libparam/    parameter processing lib

     ...
Outline

  Overview

  Disksim implementation

  SSD extension
Disksim source: src/


    disksim_main*         main entrance     main()

    disksim_iodriver*     driver            iodriver_send_event_down_path()

    disksim_bus*          bus               bus_deliver_event()

    disksim_controller*   controller        controller_event_arrive()

    disksim_diskctlr*     disk controller   disk_event_arrive()

    ...
Disksim Control Path

Event Based System:
   various types of events: io, interrupt, timer...
   all events are stored in a global queue in time order
   addtointq() and removefromintq() are used to access the
   global queue

Equivalent code:
while ((curr = getnextevent())) {
  switch (curr->type) {
     case IO_REQUEST_ARRIVE:
          iodriver_request(curr); break;
     /* other event types are dispatched the same way */
  }
}
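
The time-ordered global queue described above can be sketched as a sorted linked list. This is only an illustration: the `event` struct, `queue_add()` and `queue_pop()` are hypothetical stand-ins for DiskSim's own event type and its addtointq()/removefromintq(), and the real implementation may differ.

```c
#include <stddef.h>

/* Hypothetical event record; DiskSim's real event struct has more fields. */
typedef struct event {
    double time;          /* simulated time at which the event fires */
    int type;             /* e.g. an I/O request arrival */
    struct event *next;   /* next event in time order */
} event;

/* Insert ev so the list stays sorted by ascending time.
 * Events with equal times keep their insertion order. */
static void queue_add(event **head, event *ev)
{
    while (*head != NULL && (*head)->time <= ev->time)
        head = &(*head)->next;
    ev->next = *head;
    *head = ev;
}

/* Remove and return the earliest event, or NULL if the queue is empty. */
static event *queue_pop(event **head)
{
    event *ev = *head;
    if (ev != NULL) {
        *head = ev->next;
        ev->next = NULL;
    }
    return ev;
}
```

A simulator loop would then call `queue_pop()` repeatedly and switch on the event type, while handlers call `queue_add()` to schedule follow-up events.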
Example

src/disksim_iosim.c io_internal_event()
 case IO_ACCESS_ARRIVE:
   iodriver_schedule(0, curr);
   break;

src/disksim_iodriver.c iodriver_schedule()
   iodriver_send_event_down_path(curr);

src/disksim_iodriver.c iodriver_send_event_down_path()
   bus_deliver_event(busno.byte[0], slotno.byte[0], curr);
Example (cont.)

src/disksim_bus.c bus_deliver_event()
  case CONTROLLER:
   controller_event_arrive(devno, curr);
   break;

 case DEVICE:
  ASSERT(devno == curr->devno);
  device_event_arrive(curr);
  break;


This control flow simulates the delivery of a single event.
Disksim & Device Interface

INLINE void device_event_arrive (ioreq_event *curr)
{
  ASSERT1 ((curr->devno >= 0) && (curr->devno < numdevices),
           "curr->devno", curr->devno);
  return disksim->deviceinfo->devices[curr->devno]->event_arrive(curr);
}



Function pointer! Dynamic tracing with gdb shows that:
For a disk, it jumps to disk_event_arrive()
For an SSD, it jumps to ssd_event_arrive()
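
The dispatch pattern can be illustrated with a small table of function pointers. All names here (`ioreq`, `handlers`, `dispatch()`, the two handler functions) are hypothetical and much simpler than DiskSim's real device structures; the sketch only shows how one call site can reach either a disk or an SSD handler.

```c
#include <assert.h>

/* Hypothetical, simplified types; not DiskSim's real definitions. */
typedef struct ioreq { int devno; int handled_by; } ioreq;
typedef void (*event_arrive_fn)(ioreq *curr);

enum { HANDLER_DISK = 1, HANDLER_SSD = 2 };

static void disk_arrive(ioreq *curr) { curr->handled_by = HANDLER_DISK; }
static void ssd_arrive(ioreq *curr)  { curr->handled_by = HANDLER_SSD; }

/* One handler per device number, filled in when devices are configured. */
static event_arrive_fn handlers[] = { disk_arrive, ssd_arrive };
enum { NUM_DEVICES = sizeof handlers / sizeof handlers[0] };

/* Look up the handler for the request's device and call it. */
static void dispatch(ioreq *curr)
{
    assert(curr->devno >= 0 && curr->devno < NUM_DEVICES);
    handlers[curr->devno](curr);
}
```

Adding a new device type then only requires registering another handler; the caller stays unchanged.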
event_arrive: disk vs. ssd

disk_event_arrive()
 case IO_ACCESS_ARRIVE:
  disk_request_arrive(curr);
 case DEVICE_OVERHEAD_COMPLETE:
  disk_request_arrive(curr);
 case DEVICE_BUFFER_SEEKDONE:
  disk_buffer_seekdone(currdisk, curr);
 case DEVICE_BUFFER_SECTOR_DONE:
  disk_buffer_sector_done(currdisk, curr);
 case DEVICE_GOTO_REMAPPED_SECTOR:
  disk_goto_remapped_sector(currdisk, curr);
 case DEVICE_GOT_REMAPPED_SECTOR:
  disk_got_remapped_sector(currdisk, curr);
 case DEVICE_PREPARE_FOR_DATA_TRANSFER:
  disk_prepare_for_data_transfer(curr);
 case DEVICE_DATA_TRANSFER_COMPLETE:
  disk_reconnection_or_transfer_complete(curr);
 case IO_INTERRUPT_COMPLETE:
  disk_interrupt_complete(curr);

ssd_event_arrive()
 case DEVICE_OVERHEAD_COMPLETE:
  ssd_request_arrive(curr);
 case DEVICE_ACCESS_COMPLETE:
  ssd_access_complete(curr);
 case DEVICE_DATA_TRANSFER_COMPLETE:
  ssd_bustransfer_complete(curr);
 case IO_INTERRUPT_COMPLETE:
  ssd_interrupt_complete(curr);
 case SSD_CLEAN_GANG:
  ssd_clean_gang_complete(curr);
 case SSD_CLEAN_ELEMENT:
  ssd_clean_element_complete(curr);

Notes on the disk events:
 "buffer" events are cache related.
 "remapped sector" seems to be related to data layout (not sure).

Notes on the SSD events:
 "clean" events are related to garbage collection and wear-leveling.
 "Gang" and "Element" specify the allocation and reclaim unit.
Outline

  Overview

  Disksim implementation

  SSD extension
ssdmodel features

 Add an auxiliary level of parallel elements, each with a
 closed queue, to represent flash elements or gangs
 Add logic to serialize request completions from these
 parallel elements
 For each element, maintain data structures to represent
 SSD logical block maps, cleaning state and wear-leveling
 state
 Delay is introduced when a request is processed
 Parameters include background cleaning, gang size, gang
 organization, interleaving, overprovisioning
Flash Package Internal
Flash Chip Performance

1. Latency
   bus <-> data reg      100us
   media -> reg: read    25us
   reg -> media: write   200us
   erase                 1.5ms

2. Two-plane commands
   can be executed on their plane pairs 0&1 or 2&3

3. Support background copy on the same plane

4. Bandwidth and Interleave
   src plane -> dest plane 4 page copying (100us per page)
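
As a worked example of the latency figures, the sketch below adds up the steps of a single page read and a single page write. It assumes the steps run strictly one after another with no overlap, and that the 100us bus figure applies to one page; both are assumptions of this sketch, and the function names are made up for illustration.

```c
/* Latencies from the slide, in microseconds. */
enum {
    BUS_XFER_US   = 100,   /* bus <-> data register (assumed per page) */
    MEDIA_READ_US = 25,    /* media -> register */
    MEDIA_PROG_US = 200,   /* register -> media */
    ERASE_US      = 1500   /* block erase (1.5 ms) */
};

/* Read: fetch the page into the register, then move it over the bus. */
static int page_read_us(void)  { return MEDIA_READ_US + BUS_XFER_US; }

/* Write: move the page over the bus, then program it into the media. */
static int page_write_us(void) { return BUS_XFER_US + MEDIA_PROG_US; }

/* Copy n pages between planes at the slide's 100 us per page. */
static int plane_copy_us(int n_pages) { return n_pages * 100; }
```

Under these assumptions a page read takes 125us, a page write takes 300us, and the 4-page plane-to-plane copy takes 400us.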
SSD Simulation

 Logical Block Map
    allocation pool

 Cleaning
    greedy or wear-leveling aware

 Parallelism and Interconnect Density
    ganging, interleaving, background cleaning

 Persistence
    saving mapping information per block in DRAM
Interconnection - Ganging

  A gang of flash packages
  can be utilized in synchrony
  to optimize a multi-page
  request.
  Allow multiple packages to
  be used in parallel while
  sharing one request queue
  A request queue can be
  associated with each gang or
  with each element (full
  interconnection mode)
Logical Block Map

 Use the notion of an allocation pool to think about how an SSD
 allocates flash blocks to service write requests

 An allocation pool can be a flash package or a gang

 Static: a portion of each LBA constitutes a fixed mapping to
 a specific allocation pool

 Dynamic: the non-static portion of an LBA is the lookup key
 for a mapping within a pool
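
One way to picture the static/dynamic split is the sketch below. Which part of the LBA is static is not stated on the slide; this example assumes the low-order part (`lba % n_pools`) selects the pool and the rest is the lookup key. The struct and function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical split of a logical block address into (pool, key). */
typedef struct { unsigned pool; unsigned long key; } lba_split;

/* Static part: lba % n_pools picks the allocation pool (fixed mapping).
 * Dynamic part: lba / n_pools is the key looked up inside that pool.
 * Using the low-order part for the pool is an assumption of this sketch. */
static lba_split split_lba(unsigned long lba, unsigned n_pools)
{
    assert(n_pools > 0);
    lba_split s;
    s.pool = (unsigned)(lba % n_pools);
    s.key  = lba / n_pools;
    return s;
}
```

With 4 pools, LBA 10 would land in pool 2 with key 2; the pool's own map then decides which physical page currently holds that key.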
Garbage Collection (Cleaning)

  active block: block available to hold incoming writes in a
  pool

  superseded page: out-of-date page

  cleaning efficiency: (superseded / total pages) in a block

  a pure greedy approach: choosing blocks to clean based on
  potential cleaning efficiency
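
The greedy policy can be sketched as follows. The `block_info` struct and both function names are hypothetical; ssdmodel's own bookkeeping is more involved.

```c
#include <stddef.h>

/* Hypothetical per-block bookkeeping. */
typedef struct {
    int superseded_pages;  /* out-of-date pages in the block */
    int total_pages;       /* pages per block */
} block_info;

/* Cleaning efficiency = superseded / total pages, in [0, 1]. */
static double cleaning_efficiency(const block_info *b)
{
    return b->total_pages > 0
        ? (double)b->superseded_pages / b->total_pages
        : 0.0;
}

/* Greedy choice: index of the block with the highest efficiency,
 * or -1 if there are no blocks. Ties go to the lowest index. */
static int pick_block_greedy(const block_info *blocks, size_t n)
{
    int best = -1;
    double best_eff = -1.0;
    for (size_t i = 0; i < n; i++) {
        double eff = cleaning_efficiency(&blocks[i]);
        if (eff > best_eff) {
            best_eff = eff;
            best = (int)i;
        }
    }
    return best;
}
```

A block with more superseded pages needs fewer valid pages copied out before it can be erased, which is why the greedy policy prefers it.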
Wear-Leveling

   average remaining lifetime (ARL) of a block
   age variance (say 20%) of the ARL
   retirement age (say 85%) of the ARL

Wear-aware garbage collection:
1. If the remaining lifetime of the block chosen for cleaning is
   below the retirement age, migrate cold data into this block from a
   migration-candidate queue, and recycle the head block of
   the queue. Populate the queue with new blocks with cold
   data.

2. Otherwise, if its remaining lifetime is below the age-variance
   threshold, restrict recycling of the block with a probability that
   increases linearly as the remaining lifetime drops to 0
   (at 80% of the average, probability of recycling = 1; at 0% of
   the average, probability of recycling = 0).
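
The linear rate-limiting rule can be written as a small function. This sketch covers only the second branch (restricting recycling); the cold-data migration branch is not modeled. The function name and parameters are hypothetical, and lifetimes can be in any consistent unit such as remaining erase cycles.

```c
/* Probability of allowing a block to be recycled, per the slide's rule:
 * 1 at (1 - age_variance) of the average remaining lifetime, falling
 * linearly to 0 as the block's remaining lifetime drops to 0. */
static double recycle_probability(double remaining, double avg_remaining,
                                  double age_variance)
{
    double threshold = (1.0 - age_variance) * avg_remaining;
    if (threshold <= 0.0 || remaining >= threshold)
        return 1.0;   /* not worn enough to be rate-limited */
    if (remaining <= 0.0)
        return 0.0;   /* fully worn: never recycle */
    return remaining / threshold;
}
```

For example, with an age variance of 20% and an average of 100 remaining cycles, a block with 40 cycles left would be recycled with probability of about 0.5.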
Source: ssdmodel/

ssdmodel is very simple; all C files are listed below:

      ssd.c          main                          ssd_event_arrive()

      ssd_clean.c    garbage collection and        ssd_clean_blocks_greedy()
                     wear leveling
      ssd_gang.c     several flash packages        ssd_activate_gang()
                     organized as a gang

      ssd_timing.c   timing model                  ssd_compute_access_time()

      ssd_utils.c    util

      ssd_init.c     init
Example

event sequence for one request:
ssd_request_arrive -> ssd_interrupt_complete (reconnect) -> ssd_bustransfer_complete
-> ssd_access_complete -> ssd_interrupt_complete (completion)

ssd_bustransfer_complete() -> ssd_media_access_request()
ssdmodel/ssd.c: ssd_media_access_request ()
     case SSD_ALLOC_POOL_PLANE:
     case SSD_ALLOC_POOL_CHIP:
       ssd_media_access_request_element(curr);
     break;
     case SSD_ALLOC_POOL_GANG:
#if SYNC_GANG
       ssd_media_access_request_gang_sync(curr);
#else
       ssd_media_access_request_gang(curr);
#endif
     break;
Example (cont.)

ssd_media_access_request_element()
  -> ssd_activate_element()
       -> ssd_invoke_element_cleaning()
       -> ssd_compute_access_time(currdisk, elem_num, read_reqs, read_total);
       -> add completion events to the global event queue
       -> ssd_compute_access_time(currdisk, elem_num, write_reqs, write_total);
       -> add completion events to the global event queue
"Parallel processing, sequential completion": a batch of requests is processed
in parallel, but the ACCESS_COMPLETE events are generated sequentially.
References

Disksim: http://www.pdl.cmu.edu/DiskSim/
Disksim Manual: http://www.pdl.cmu.edu/PDL-FTP/DriveChar/CMU-PDL-08-101.pdf
Disksim implementation doc: src/doc/Outline.txt
SSD Extension: http://research.microsoft.com/en-us/downloads/b41019e2-1d2b-44d8-b512-ba35ab814cd4/
SSD Extension paper: Design Tradeoffs for SSD Performance, N. Agrawal et al., 2008
Cache over SSD project: Group 6 on http://www-users.cselabs.umn.edu/classes/Spring-2009/csci8980-ass/
Thanks

Q&A?
Block striping
// blocks can be concatenated (chained) from each plane
//
// plane 0     plane 1     plane 2     plane 3
// ------------------------------------------
// blk 0       blk 2048    blk 4096    blk 6144
// blk 1       blk 2049    blk 4097    blk 6145
// ...         ...         ...         ...
// blk 2047    blk 4095    blk 6143    blk 8191

// blocks can be striped across all the planes
//
// plane 0     plane 1     plane 2     plane 3
// ------------------------------------------
// blk 0       blk 1       blk 2       blk 3
// blk 4       blk 5       blk 6       blk 7
// ...         ...         ...         ...
// blk 8188    blk 8189    blk 8190    blk 8191
//
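
The two layouts above correspond to two simple mappings from block number to plane. The sketch below uses the 4 planes and 2048 blocks per plane shown in the diagrams; the function names are hypothetical and are not taken from ssdmodel.

```c
#include <assert.h>

enum { PLANES = 4, BLOCKS_PER_PLANE = 2048 };  /* values from the diagrams */

/* Chained layout: each plane holds a contiguous run of block numbers. */
static int plane_of_chained(int blk)
{
    assert(blk >= 0 && blk < PLANES * BLOCKS_PER_PLANE);
    return blk / BLOCKS_PER_PLANE;
}

/* Striped layout: consecutive block numbers rotate across the planes. */
static int plane_of_striped(int blk)
{
    assert(blk >= 0 && blk < PLANES * BLOCKS_PER_PLANE);
    return blk % PLANES;
}
```

With striping, a run of consecutive block numbers touches every plane, whereas chaining keeps such a run inside one plane.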

TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 

Disksim with SSD_extension

  • 1. Disksim with SSD extension -- A developer's perspective. Jiannan Ouyang, PhD CS@PITT, 2011/04/07
  • 2. Outline: Overview; Disksim implementation; SSD extension
  • 3. Disksim
      Disksim: an open source disk simulator originally developed at UMich. and enhanced at CMU.
  • 4. Disksim features
      Various device models including: disk, simpledisk, memsmodel
      Controller models: simple, smart (with cache)
      Trace synthesis and different trace file formats
      DIXtrac: automatic disk characterization
  • 5. ssdmodel
      Developed by Microsoft.
      NOT for any specific SSD device
      For an idealized SSD that is parameterized by the properties of NAND flash chips
      Cache is NOT natively supported
  • 6. Source Dir
      src/         disksim source (disksim_*.c/h)
      ssdmodel/    ssd extension source (ssd_*.c/h)
      diskmodel/   diskmodel layout and mech
      memsmodel/   MEMS device model
      libparam/    parameter processing lib
      ...
  • 7. Outline: Overview; Disksim implementation; SSD extension
  • 8. Disksim source: src/
      disksim_main*         main entry point   main()
      disksim_iodriver*     driver             iodriver_send_event_down_path()
      disksim_bus*          bus                bus_deliver_event()
      disksim_controller*   controller         controller_event_arrive()
      disksim_diskctlr*     disk controller    disk_event_arrive()
      ...
  • 9. Disksim Control Path
      Event-based system:
        various types of events: io, interrupt, timer...
        all events are stored in a global queue in time order
        addtointq() and removefromintq() are used to access the global queue
      Equivalent code:
        while ((curr = getnextevent())) {
          switch (curr->type) {
            case IO_REQUEST_ARRIVE:
              iodriver_request(curr);
              break;
          }
        }
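The event loop on slide 9 can be sketched as a time-ordered linked list plus a dispatch switch. This is only a minimal illustration: `event_t`, `add_to_queue()`, `get_next_event()` and `run_simulation()` are hypothetical names chosen for the sketch, not DiskSim's own `addtointq()`/`removefromintq()` code, and the real event record carries far more state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event types and record; DiskSim's real structures differ. */
enum event_type { EV_IO_REQUEST_ARRIVE, EV_TIMER };

typedef struct event {
    double time;              /* simulated time at which the event fires */
    enum event_type type;
    struct event *next;
} event_t;

static event_t *queue_head = NULL;   /* global queue, kept sorted by time */

/* Insert an event so that the queue stays in ascending time order. */
void add_to_queue(event_t *ev)
{
    assert(ev != NULL);
    event_t **link = &queue_head;
    while (*link != NULL && (*link)->time <= ev->time)
        link = &(*link)->next;
    ev->next = *link;
    *link = ev;
}

/* Remove and return the earliest event, or NULL if the queue is empty. */
event_t *get_next_event(void)
{
    event_t *ev = queue_head;
    if (ev != NULL) {
        queue_head = ev->next;
        ev->next = NULL;
    }
    return ev;
}

/* Drain the queue, dispatching on event type; returns the I/O events seen. */
int run_simulation(void)
{
    int io_handled = 0;
    event_t *curr;
    while ((curr = get_next_event()) != NULL) {
        switch (curr->type) {
        case EV_IO_REQUEST_ARRIVE:
            io_handled++;     /* a real simulator would call the I/O driver */
            break;
        default:
            break;            /* other event types are ignored in this sketch */
        }
    }
    return io_handled;
}
```

Because insertion keeps the list sorted, the loop always handles events in simulated-time order, whatever order they were scheduled in.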
  • 10. Example
      src/disksim_iosim.c io_internal_event():
        case IO_ACCESS_ARRIVE:
          iodriver_schedule(0, curr);
          break;
      src/disksim_iodriver.c iodriver_schedule():
        iodriver_send_event_down_path(curr);
      src/disksim_iodriver.c iodriver_send_event_down_path():
        bus_deliver_event(busno.byte[0], slotno.byte[0], curr);
  • 11. Example (cont.)
      src/disksim_bus.c bus_deliver_event():
        case CONTROLLER:
          controller_event_arrive(devno, curr);
          break;
        case DEVICE:
          ASSERT(devno == curr->devno);
          device_event_arrive(curr);
          break;
      This control flow simulates a single event.
  • 12. Disksim & Device Interface
      INLINE void device_event_arrive (ioreq_event *curr)
      {
        ASSERT1 ((curr->devno >= 0) && (curr->devno < numdevices), "curr->devno", curr->devno);
        return disksim->deviceinfo->devices[curr->devno]->event_arrive(curr);
      }
      Function pointer! Dynamic tracing with gdb shows that:
        for a disk, it jumps to disk_event_arrive()
        for an ssd, it jumps to ssd_event_arrive()
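The function-pointer dispatch on slide 12 can be illustrated with a small device table. This is a sketch under simplifying assumptions: `io_event_t`, `device_t`, `dispatch_to_device()` and the two handlers are hypothetical stand-ins, not DiskSim's `ioreq_event` or device header structures.

```c
#include <assert.h>

/* Hypothetical, simplified stand-ins for DiskSim's event and device types. */
typedef struct { int devno; int handled_by; } io_event_t;
typedef struct { void (*event_arrive)(io_event_t *); } device_t;

enum { HANDLER_DISK = 1, HANDLER_SSD = 2 };

static void disk_arrive(io_event_t *ev) { ev->handled_by = HANDLER_DISK; }
static void ssd_arrive(io_event_t *ev)  { ev->handled_by = HANDLER_SSD; }

/* Device table: index 0 is a disk, index 1 is an SSD. */
static device_t devices[] = { { disk_arrive }, { ssd_arrive } };
static const int numdevices = (int)(sizeof devices / sizeof devices[0]);

/* Generic entry point: the caller never inspects the device type,
 * the stored function pointer selects the handler. */
void dispatch_to_device(io_event_t *ev)
{
    assert(ev->devno >= 0 && ev->devno < numdevices);
    devices[ev->devno].event_arrive(ev);
}
```

With this layout a new device model only has to register its own `event_arrive` handler; the bus and controller code stay unchanged.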
  • 13. event_arrive: disk vs. ssd
      disk_event_arrive():
        case IO_ACCESS_ARRIVE:                  disk_request_arrive(curr);
        case DEVICE_OVERHEAD_COMPLETE:          disk_request_arrive(curr);
        case DEVICE_BUFFER_SEEKDONE:            disk_buffer_seekdone(currdisk, curr);
        case DEVICE_BUFFER_SECTOR_DONE:         disk_buffer_sector_done(currdisk, curr);
        case DEVICE_GOTO_REMAPPED_SECTOR:       disk_goto_remapped_sector(currdisk, curr);
        case DEVICE_GOT_REMAPPED_SECTOR:        disk_got_remapped_sector(currdisk, curr);
        case DEVICE_PREPARE_FOR_DATA_TRANSFER:  disk_prepare_for_data_transfer(curr);
        case DEVICE_DATA_TRANSFER_COMPLETE:     disk_reconnection_or_transfer_complete(curr);
        case IO_INTERRUPT_COMPLETE:             disk_interrupt_complete(curr);
      ssd_event_arrive():
        case DEVICE_OVERHEAD_COMPLETE:          ssd_request_arrive(curr);
        case DEVICE_ACCESS_COMPLETE:            ssd_access_complete(curr);
        case DEVICE_DATA_TRANSFER_COMPLETE:     ssd_bustransfer_complete(curr);
        case IO_INTERRUPT_COMPLETE:             ssd_interrupt_complete(curr);
        case SSD_CLEAN_GANG:                    ssd_clean_gang_complete(curr);
        case SSD_CLEAN_ELEMENT:                 ssd_clean_element_complete(curr);
      "buffer" events are cache related.
      "clean" events are garbage collection and wear-leveling.
      "remapped sector" events seem to be related to data layout.
      "Gang" and "Element" appear to specify the allocation and reclaim unit (not confirmed).
  • 14. Outline: Overview; Disksim implementation; SSD extension
  • 15. ssdmodel features
      Add an auxiliary level of parallel elements, each with a closed queue, to represent flash elements or gangs
      Add logic to serialize request completions from these parallel elements
      For each element, maintain data structures to represent SSD logical block maps, cleaning state and wear-leveling state
      A delay is introduced when a request is processed
      Parameters include background cleaning, gang size, gang organization, interleaving, overprovisioning
  • 17. Flash Chip Performance
      1. Latency
           bus <-> data reg: 100us
           media -> reg (read): 25us
           reg -> media (write): 200us
           erase: 1.5ms
      2. Two-plane commands can be executed on plane pairs 0&1 or 2&3
      3. Background copy is supported on the same plane
      4. Bandwidth and interleave
           src plane -> dest plane: 4-page copying (100us per page)
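As a rough worked example of the latencies on slide 17, the sketch below adds them up for simple page operations. It assumes that media access and bus transfer are fully serialized, with no interleaving or two-plane commands, which is a simplification; the function names are hypothetical and this is not ssdmodel's timing code.

```c
#include <assert.h>

/* Latencies in microseconds, taken from the slide. */
enum {
    BUS_XFER_US    = 100,   /* bus <-> data register, per page */
    PAGE_READ_US   = 25,    /* media -> register */
    PAGE_WRITE_US  = 200,   /* register -> media */
    BLOCK_ERASE_US = 1500   /* 1.5 ms */
};

/* Reading a page: media -> register, then register -> bus. */
long read_time_us(int npages)
{
    assert(npages >= 0);
    return (long)npages * (PAGE_READ_US + BUS_XFER_US);
}

/* Writing a page: bus -> register, then register -> media. */
long write_time_us(int npages)
{
    assert(npages >= 0);
    return (long)npages * (BUS_XFER_US + PAGE_WRITE_US);
}

/* Erasing whole blocks involves no data transfer over the bus. */
long erase_time_us(int nblocks)
{
    assert(nblocks >= 0);
    return (long)nblocks * BLOCK_ERASE_US;
}
```

Under these assumptions one page read costs 125us and one page write 300us, so a single block erase (1500us) costs as much as five serialized page writes.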
  • 18. SSD Simulation
      Logical Block Map: allocation pool
      Cleaning: greedy or wear-leveling aware
      Parallelism and Interconnect Density: ganging, interleaving, background cleaning
      Persistence: saving mapping information per block in DRAM
  • 19. Interconnection - Ganging
      A gang of flash packages can be used in synchrony to optimize a multi-page request.
      Ganging allows multiple packages to be used in parallel while sharing one request queue.
      A request queue can be associated with each gang or with each element (full interconnection mode).
  • 20. Logical Block Map
      Use the allocation pool to think about how an SSD allocates flash blocks to service write requests
      An allocation pool can be a flash package or a gang
      Static: a portion of each LBA constitutes a fixed mapping to a specific allocation pool
      Dynamic: the non-static portion of an LBA is the lookup key for a mapping within a pool
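The static/dynamic split on slide 20 can be sketched as follows. Which part of the LBA is static is an assumption made here for illustration: the sketch uses the remainder modulo a hypothetical pool count (`NUM_POOLS`) as the static part and the quotient as the dynamic lookup key. The names `lba_split_t`, `split_lba()` and `join_lba()` are likewise hypothetical.

```c
#include <assert.h>

#define NUM_POOLS 4u   /* hypothetical number of allocation pools */

typedef struct {
    unsigned pool;   /* static part: which allocation pool owns the page */
    unsigned key;    /* dynamic part: lookup key inside that pool's map  */
} lba_split_t;

/* Split a logical page number into its static and dynamic parts. */
lba_split_t split_lba(unsigned lpn)
{
    lba_split_t s;
    s.pool = lpn % NUM_POOLS;
    s.key  = lpn / NUM_POOLS;
    return s;
}

/* Inverse of split_lba(), useful for checking the mapping is lossless. */
unsigned join_lba(lba_split_t s)
{
    assert(s.pool < NUM_POOLS);
    return s.key * NUM_POOLS + s.pool;
}
```

Taking the pool from the low-order part spreads consecutive logical pages across pools, so a sequential multi-page request can be served by several pools at once.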
  • 21. Garbage Collection (Cleaning)
      active block: the block available to hold incoming writes in a pool
      superseded page: out-of-date page
      cleaning efficiency: (superseded pages / total pages) in a block
      pure greedy approach: choose blocks to clean based on potential cleaning efficiency
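The pure greedy policy on slide 21 amounts to picking the block with the highest cleaning efficiency. The sketch below is a minimal version of that idea; `block_info_t`, `cleaning_efficiency()` and `pick_block_greedy()` are hypothetical names, not the ssdmodel implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-block bookkeeping. */
typedef struct {
    int superseded_pages;   /* out-of-date pages in the block */
    int total_pages;        /* pages per block */
} block_info_t;

/* Cleaning efficiency as defined on the slide: superseded / total. */
double cleaning_efficiency(const block_info_t *b)
{
    assert(b->total_pages > 0);
    return (double)b->superseded_pages / b->total_pages;
}

/* Return the index of the block with the highest cleaning efficiency,
 * or -1 if there are no blocks. Ties go to the lowest index. */
int pick_block_greedy(const block_info_t *blocks, size_t n)
{
    int best = -1;
    double best_eff = -1.0;
    for (size_t i = 0; i < n; i++) {
        double eff = cleaning_efficiency(&blocks[i]);
        if (eff > best_eff) {
            best_eff = eff;
            best = (int)i;
        }
    }
    return best;
}
```

A block with many superseded pages needs few valid pages copied out before it can be erased, which is why the greedy choice minimizes cleaning work per reclaimed block.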
  • 22. Wear-Leveling
      average remaining lifetime (ARL) of the blocks
      age variance (say 20%) of the ARL
      retirement age (say 85%) of the ARL
      Wear-aware garbage collection:
      1. If the chosen block's remaining lifetime is below the retirement age, migrate cold data into this block from a migration-candidate queue, and recycle the head block of the queue. Populate the queue with new blocks holding cold data.
      2. Otherwise, if its remaining lifetime is below the age-variance limit, restrict recycling of the block with a probability that increases linearly as the remaining lifetime drops to 0 (at 80% of average, probability of recycling = 1; at 0% of average, probability of recycling = 0).
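The linear rate limiting on slide 22 can be sketched as a recycle probability that is 1 at the age-variance limit (80% of the average with a 20% age variance) and falls to 0 as the remaining lifetime reaches 0. The function name and the use of remaining erase cycles as the lifetime unit are assumptions for illustration only.

```c
#include <assert.h>

/*
 * Probability that a block is allowed to be recycled.
 * remaining, average: remaining lifetime of the block and the average
 *                     remaining lifetime (assumed here: erase cycles).
 * age_variance:       e.g. 0.20, so limiting starts below 80% of average.
 */
double recycle_probability(double remaining, double average, double age_variance)
{
    assert(average > 0.0);
    assert(age_variance >= 0.0 && age_variance < 1.0);

    double limit = (1.0 - age_variance) * average;   /* e.g. 80% of average */
    if (remaining >= limit)
        return 1.0;            /* young enough: recycling is not restricted */
    if (remaining <= 0.0)
        return 0.0;            /* worn out: never recycle */
    return remaining / limit;  /* linear ramp between the two end points */
}
```

For example, with an average of 100 remaining cycles and a 20% age variance, a block with 40 cycles left would be recycled with probability of about 0.5.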
  • 23. Source: ssdmodel/
      ssdmodel is very simple; all C files are listed below:
      ssd.c          main                                         ssd_event_arrive()
      ssd_clean.c    garbage collection and wear leveling         ssd_clean_blocks_greedy()
      ssd_gang.c     several flash packages organised as a gang   ssd_activate_gang()
      ssd_timing.c   timing model                                 ssd_compute_access_time()
      ssd_utils.c    util
      ssd_init.c     init
  • 24. Example
      Event sequence for one request:
        ssd_request_arrive -> ssd_interrupt_complete (reconnect) -> ssd_bustransfer_complete -> ssd_access_complete -> ssd_interrupt_complete (completion)
      ssd_bustransfer_complete() -> ssd_media_access_request()
      ssdmodel/ssd.c: ssd_media_access_request():
        case SSD_ALLOC_POOL_PLANE:
        case SSD_ALLOC_POOL_CHIP:
          ssd_media_access_request_element(curr);
          break;
        case SSD_ALLOC_POOL_GANG:
        #if SYNC_GANG
          ssd_media_access_request_gang_sync(curr);
        #else
          ssd_media_access_request_gang(curr);
        #endif
          break;
  • 25. Example (cont.)
      ssd_media_access_request_element()
        -> ssd_activate_element()
        -> ssd_invoke_element_cleaning()
        -> ssd_compute_access_time(currdisk, elem_num, read_reqs, read_total);
        -> add the completion event to the global event queue
        -> ssd_compute_access_time(currdisk, elem_num, write_reqs, write_total);
        -> add the completion event to the global event queue
      Parallel processing with sequential completion: a batch of requests is processed in parallel, but the ACCESS_COMPLETE events are generated sequentially.
  • 26. References
      Disksim: http://www.pdl.cmu.edu/DiskSim/
      Disksim Manual: http://www.pdl.cmu.edu/PDL-FTP/DriveChar/CMU-PDL-08-101.pdf
      Disksim implementation doc: src/doc/Outline.txt
      SSD Extension: http://research.microsoft.com/en-us/downloads/b41019e2-1d2b-44d8-b512-ba35ab814cd4/
      SSD Extension paper: Design Tradeoffs for SSD Performance, N Agrawal, 2008
      Cache over SSD project: Group 6 on http://www-users.cselabs.umn.edu/classes/Spring-2009/csci8980-ass/
  • 28. Block striping
      // blocks can be concatenated (chained) from each plane
      //
      // plane 0      plane 1      plane 2      plane 3
      // ------------------------------------------------
      // blk 0        blk 2048     blk 4096     blk 6144
      // blk 1        blk 2049     blk 4097     blk 6145
      // ...          ...
      // blk 2047     blk 4095     blk 6143     blk 8191
      //
      // blocks can be striped across all the planes
      //
      // plane 0      plane 1      plane 2      plane 3
      // ------------------------------------------------
      // blk 0        blk 1        blk 2        blk 3
      // blk 4        blk 5        blk 6        blk 7
      // ...          ...
      // blk 8188     blk 8189     blk 8190     blk 8191
      //
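The two layouts on slide 28 differ only in how a block number maps to a plane. The sketch below encodes both mappings for the 4-plane, 2048-blocks-per-plane geometry shown in the tables; the function names are hypothetical and not taken from ssdmodel.

```c
#include <assert.h>

/* Geometry taken from the tables on the slide. */
enum { PLANES = 4, BLOCKS_PER_PLANE = 2048 };

/* Concatenated (chained) layout: each plane holds a contiguous range. */
int plane_concatenated(int blk)
{
    assert(blk >= 0 && blk < PLANES * BLOCKS_PER_PLANE);
    return blk / BLOCKS_PER_PLANE;
}

/* Striped layout: consecutive blocks rotate across the planes. */
int plane_striped(int blk)
{
    assert(blk >= 0 && blk < PLANES * BLOCKS_PER_PLANE);
    return blk % PLANES;
}
```

For instance, block 2048 lands on plane 1 in the concatenated layout but on plane 0 in the striped layout, while block 8191 is on plane 3 in both.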