SlideShare une entreprise Scribd logo
1  sur  52
Télécharger pour lire hors ligne
Data Recovery for MySQL
       Aleksandr Kuzminsky & Istvan Podor
                Percona Live London 2011
Agenda
1. InnoDB files format overview
2. InnoDB dictionary (SYS_INDEXES, SYS_TABLES)
3. InnoDB Primary and Secondary indexes
4. Typical failure scenarios
5. InnoDB recovery tool




                                            www.percona.com
1. InnoDB format overview




                        www.percona.com
InnoDB table spaces
●   Shared table space (ibdataX)
    ●   dictionary, rollback segment, undo space, insert
        buffer, double write buffer
    ●   Primary & Secondary indexes
    ●   External pages (BLOBs)
●   A table space per table (ibdataX + *.ibd)
    ●   Primary & Secondary indexes
    ●   External pages (BLOBs)



                                                www.percona.com
InnoDB logs
●   a.k.a REDO logs, transactional logs
●   ib_logfile[01]
●   Contain changes to InnoDB pages:
              Space ID   Page ID   Data




                                          www.percona.com
InnoDB Indexes. PRIMARY
●   Table is a clustered index named PRIMARY
●   B+ tree data structure, node is a page
●   Key is Primary key or unique key or ROW_ID
    (6bytes)




                                         www.percona.com
InnoDB Indexes. Secondary
●   Also B+ tree structure
●   Key is indexed field(s), values are primary keys
    ●   If table (id, first_name, last_name, birth_date),
    ●   And index (last_name)
    ●   The index structure stores (last_name, id)




                                               www.percona.com
Index identifier (index_id)
●   8 bytes integer, often in two numbers notation:
    0 254
●   Table name → index_id correspondence is
    stored in InnoDB dictionary
●   Visible in table monitor output(see next slide)
●   In I_S tables if Percona Server




                                          www.percona.com
Table monitor output

  mysql> CREATE TABLE innodb_table_monitor(x int)
 engine=innodb

In Error log:
TABLE: name test/site_folders, id 0 119, columns 9, indexes 1, appr.rows 1
         COLUMNS: id: DATA_INT len 4 prec 0; name: type 12 len 765 prec 0;
   sites_count: DATA_INT len 4 prec 0;
                              created_at: DATA_INT len 8 prec 0; updated_at:
   DATA_INT len 8 prec 0;
                      DB_ROW_ID: DATA_SYS prtype 256 len 6 prec 0; DB_TRX_ID:
   DATA_SYS prtype 257 len 6 prec 0;
                      DB_ROLL_PTR: DATA_SYS prtype 258 len 7 prec 0;
             INDEX: name PRIMARY, id 0 254, fields 1/7, type 3
              root page 271, appr.key vals 1, leaf pages 1, size pages 1
              FIELDS:  id DB_TRX_ID DB_ROLL_PTR name sites_count created_at
   updated_at




                                                                 www.percona.com
InnoDB page format
InnoDB page is 16 kilobytes


                              Size, bytes
FIL_HEADER                    36
PAGE_HEADER                   56
INFINUM+SUPREMUM RECORDS      varies
User records                  varies
Free space
Page directory                varies
FIL_TRAILER                   fixed




                                            www.percona.com
Fil Header
●   Common for all type of pages
    Name                   Siz   Remarks
                           e

    FIL_PAGE_SPACE         4     Space ID the page is in or checksum
    FIL_PAGE_OFFSET        4     ordinal page number from start of space
    FIL_PAGE_PREV          4     offset of previous page in key order
    FIL_PAGE_NEXT          4     offset of next page in key order
    FIL_PAGE_LSN           8     log serial number of page's latest log record
    FIL_PAGE_TYPE          2     current defined types are: FIL_PAGE_INDEX,
                                 FIL_PAGE_UNDO_LOG, FIL_PAGE_INODE,
                                 FIL_PAGE_IBUF_FREE_LIST
    FIL_PAGE_FILE_FLUSH_   8     "the file has been flushed to disk at least up to this lsn" (log serial
    LSN
                                 number), valid only on the first page of the file
    FIL_PAGE_ARCH_LOG_     4     /* starting from 4.1.x this contains the space id of the page */
    NO




                                                                                       www.percona.com
Page header
●    Only for index pages
    Name               Size   Remarks

    PAGE_N_DIR_SLOTS   2      number of directory slots in the Page Directory part; initial value = 2
    PAGE_HEAP_TOP      2      record pointer to first record in heap
    PAGE_N_HEAP        2      number of heap records; initial value = 2
    PAGE_FREE          2      record pointer to first free record
    PAGE_GARBAGE       2      "number of bytes in deleted records"
    PAGE_LAST_INSERT   2      record pointer to the last inserted record
    PAGE_DIRECTION     2      either PAGE_LEFT, PAGE_RIGHT, or PAGE_NO_DIRECTION
    PAGE_N_DIRECTION   2      number of consecutive inserts in the same direction, e.g. "last 5 were all to the left"
    PAGE_N_RECS        2      number of user records
    PAGE_MAX_TRX_ID    8      the highest ID of a transaction which might have changed a record on the page (only set for secondary
                              indexes)
    PAGE_LEVEL         2      level within the index (0 for a leaf page)
    PAGE_INDEX_ID      8      identifier of the index the page belongs to
    PAGE_BTR_SEG_LEA   10     "file segment header for the leaf pages in a B-tree" (this is irrelevant here)
    F

    PAGE_BTR_SEG_TOP   10     "file segment header for the non-leaf pages in a B-tree" (this is irrelevant here)



                                                                                                                   www.percona.com
How to check row format?
●   Can be either COMPACT or REDUNDANT
●   0 stands for REDUNDANT, 1 - for COMACT
●   The highest bit of the PAGE_N_HEAP from the
    page header
●   dc -e "2o `hexdump –C d pagefile |
    grep 00000020 | awk '{ print $12}'`
    p" | sed 's/./& /g' | awk '{ print
    $1}'


                                     www.percona.com
Rows in an InnoDB page
●   Single linked list   next        INFIMUM
                         NULL        SUPREMUM
●   The first record –   next   10   data
    INFIMUM              next   20   data
                         next   30   data
●   The last –           next   50   data
    SUPREMUM             next   40   data

●   Ordered by PK




                                        www.percona.com
Records are saved in insert order
insert into t1 values(10, 'aaa');
insert into t1 values(30, 'ccc');
insert into t1 values(20, 'bbb');

JG....................N<E.......
................................
.............................2..
...infimum......supremum......6.
........)....2..aaa.............
...*....2..ccc.... ...........+.
...2..bbb.......................
................................

                                    www.percona.com
Row format
CREATE TABLE `t1` (                 Name            Size
  `ID` int(11) unsigned NOT NULL,
  `NAME` varchar(120),              Offsets         1 byte for small types
  `N_FIELDS` int(10),               (lengths)       2 bytes for longer types
PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT             Extra bytes     5 bytes COMPACT
CHARSET=latin1                                      6 bytes REDUNDANT
                                    Field content   varies




                                                             www.percona.com
Extra bytes(REDUNDANT)
Name              Size      Description

record_status     2 bits    _ORDINARY, _NODE_PTR, _INFIMUM, _SUPREMUM

deleted_flag      1 bit     1 if record is deleted

min_rec_flag      1 bit     1 if record is predefined minimum record

n_owned           4 bits    number of records owned by this record

heap_no           13 bits   record's order number in heap of index page

n_fields          10 bits   number of fields in this record, 1 to 1023

1byte_offs_flag   1 bit     1 if each Field Start Offsets is 1 byte long (this item is also called the "short"
                            flag)

next 16 bits      16 bits   pointer to next record in page



                                                                                           www.percona.com
Extra bytes(COMPACT)
Name             Size, bits   Description
record_status    4            4 bits used to delete mark a record, and mark a predefined minimum
deleted_flag                  record in alphabetical order
min_rec_flag


n_owned          4            the number of records owned by this record
                               (this term is explained in page0page.h)

heap_no          13           the order number of this record in the
                               heap of the index page

record type      3            000=conventional, 001=node pointer (inside B-tree), 010=infimum,
                              011=supremum, 1xx=reserved

next 16 bits     16           a relative pointer to the next record in the page



                                                                           www.percona.com
Example: Redundant row
A row: (10, ‘abcdef’, 20)
Actualy (10, TRX_ID, PTR_ID, ‘abcdef’, 20)


4   6 7 6 4    ... next   00 00 00 0A   ...   ...   abcdef 80 00 00 14



    Offsets     Extra                          Fields
                bytes




                                                                     www.percona.com
The same row, but COMPACT
   A row: (10, ‘abcdef’, 20)
   Actualy (10, TRX_ID, PTR_ID, ‘abcdef’, 20)


  6 ... next    00 00 00 0A   ...   ...   abcdef 80 00 00 14



Offsets Extra                                      Fields
        bytes


                                      13 % smaller!


                                                               www.percona.com
Data types (highlights)
●   Integer, float numbers are fixed size
●   Strings:
    ●   VARCHAR(x) – variable
    ●   CHAR(x) – fixed if not UTF-8
●   Date, time – fixed size
●   DECIMAL
    ●   Stored as string before 5.0.3
    ●   Binary format afterward


                                            www.percona.com
Data types (BLOBs)
●   Field length (so called offset) is one or two
    bytes long
●   If record size < (UNIV_PAGE_SIZE/2-200) ==
    ~7k – the record is stored internally (in a PK
    page)
●   Otherwise – 768 bytes in-page, the rest in an
    external page




                                           www.percona.com
InnoDB dictionary
(SYS_INDEXES, SYS_TABLES)




                   www.percona.com
Why are SYS_* tables needed?
●   Correspondence “table name” → “index_id”
●   Storage for other internal information




                                             www.percona.com
SYS_* structure
                               Always REDUNDANT format!
CREATE TABLE `SYS_INDEXES` (                  CREATE TABLE `SYS_TABLES` (
  `TABLE_ID` bigint(20) unsigned NOT NULL      `NAME` varchar(255) NOT NULL default '',
    default '0',
                                               `ID` bigint(20) unsigned NOT NULL default
  `ID` bigint(20) unsigned NOT NULL default      '0',
    '0',
                                               `N_COLS` int(10) unsigned default NULL,
  `NAME` varchar(120) default NULL,
                                               `TYPE` int(10) unsigned default NULL,
  `N_FIELDS` int(10) unsigned default NULL,
                                               `MIX_ID` bigint(20) unsigned default NULL,
  `TYPE` int(10) unsigned default NULL,
                                               `MIX_LEN` int(10) unsigned default NULL,
  `SPACE` int(10) unsigned default NULL,
                                               `CLUSTER_NAME` varchar(255) default NULL,
  `PAGE_NO` int(10) unsigned default NULL,
                                               `SPACE` int(10) unsigned default NULL,
  PRIMARY KEY   (`TABLE_ID`,`ID`)
                                               PRIMARY KEY   (`NAME`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
                                              ) ENGINE=InnoDB DEFAULT CHARSET=latin1


           index_id = 0-3                                 index_id = 0-1




                                                                             www.percona.com
Example: SYS_*

    SYS_TABLES
●
  NAME             ID    …
●
  "sakila/actor"   40 8 1 0 0 NULL 0
●
  "sakila/actor"   40 8 1 0 0 NULL 0
●
  "sakila/actor"   40 8 1 0 0 NULL 0


    SYS_INDEXES
●
  TABLE_ID ID     NAME      …
●
  40       196389 "PRIMARY" 2 3 0 21031026
●
  40       196390 "idx_actor_last_name" 1 0 0 21031028




                                                         www.percona.com
3. InnoDB Primary and Secondary
indexes




                       www.percona.com
PRIMARY Index

The table:                Fields in the PK:
CREATE TABLE `t1` (
  `ID` int(11),           1.   ID
  `NAME` varchar(120),    2.   DB_TRX_ID
  `N_FIELDS` int(10),     3.   DB_ROLL_PTR
  PRIMARY KEY (`ID`),     4.   NAME
  KEY `NAME` (`NAME`)     5.   N_FIELDS
) ENGINE=InnoDB DEFAULT
  CHARSET=latin1




                                              www.percona.com
Secondary Index

The table:                Fields in the SK:
CREATE TABLE `t1` (
  `ID` int(11),           1. NAME
  `NAME` varchar(120),    2. ID  Primary key
  `N_FIELDS` int(10),
  PRIMARY KEY (`ID`),
  KEY `NAME` (`NAME`)
) ENGINE=InnoDB DEFAULT
  CHARSET=latin1




                                              www.percona.com
4. Typical failure scenarios




                           www.percona.com
Wrong DELETE
●   DELETE FROM `actor`;
●   What happens?
    ●   Row(s) is marked as deleted
    ●   TRX_ID and ROLL_PTR are updated
    ●   Row remains in the page till UNDO log is purged
●




                                              www.percona.com
Wrong DELETE
●   What to do first?
    ●   Kill -9 mysqld_safe ASAP!
    ●   Kill -9 mysqld ASAP
●   Find pages which belong to the dropped table
●   Fetch records from the
    pages(constraints_parser from the recovery
    toolkit)



                                       www.percona.com
Dropped Table/Database
●   DROP TABLE actor;
●   Very often DROP and then CREATE
●   Bad because .frm files are removed
●   Even worse when innodb_per_table
●   What happens inside InnoDB:
    ●   Page is marked as free (or deleted tablespace)
    ●   A record is deleted from the dictionary



                                                  www.percona.com
Dropped Table/Database
●   What to do?
    ●   Kill -9 safe_mysqld and mysqld
    ●   Remount a MySQL partition read-only
        –   Or take an image
        –   Or stop other services writing to the disk
●   Fetch records from index pages
    (constraints_parser from recovery toolkit)




                                                         www.percona.com
Truncate table
●   What happens?
    ●   Equivalent to DROP/CREATE
    ●   Dictionary is updated, but index_id may be reused
●   What to do?
    ●   The same as after the DROP: kill mysqld, fetch
        records from the index




                                              www.percona.com
Wrong UPDATE statement
●   What happens?
    ●   If the new row is the same size, in-place update
        happens; otherwise – insert/delete
    ●   The old values go to UNDO space
    ●   roll_ptr points to the old value in the UNDO space
●   What to do?
    ●   Kill -9 safe_mysqld, mysqld
    ●   However no tool available now



                                                www.percona.com
Other wrongdoings
●   Removed ibdata1 file
    ●   Take mysqldump ASAP before MySQL is stopped
    ●   Ibdconnect from recovery toolkit
●   Wrong backups
    ●   innodb_force_recovery
    ●   Fetch records from index pages
●   You name it



                                           www.percona.com
Corrupt InnoDB tablespace
●   Hardware failures
●   OS or filesystem failures
●   InnoDB bugs
●   Corrupted InnoDB tablespace by other
    processes
●   What to do?
    ●   innodb_force_recovery
    ●   fetch_data.sh
    ●   constraints_parser
                                      www.percona.com
5. InnoDB recovery tool




                          www.percona.com
Features
●   A toolset to works with InnoDB at low level
    ●   page_parser – scans a bytes stream, finds InnoDB
        pages and sorts them by page type/index_id
    ●   constraints_parser – fetches records from InnoDB
        page
    ●   Ibdconnect – a tool to “connect” an .ibd file to
        system tablespace.
    ●   fetch_data.sh – fetches data from partially
        corrupted tables choosing PK ranges.


                                                  www.percona.com
Recovery prerequisites
●   Media
    ●   ibdata1
    ●   *.ibd
    ●   HDD image
●   Tables structure
    ●   SQL dump
    ●   *.FRM files




                                www.percona.com
How to get CREATE info from
             .frm files
●   Create table `actor` (id int) engine=InnoDB
●   Stop MySQL and replace actor.frm
●   Run MySQL with innodb_force_recovery=4
●   SHOW CREATE TABLE actor;




                                         www.percona.com
Percona Data Recovery Tool for
            InnoDB
http://launchpad.net/percona-innodb-recovery-tool/
page_parser – splits InnoDB tablespace into 16k
pages
constraints_parser – scans a page and finds
good records




                                       www.percona.com
page_parser
●   Accepts a file
    # ./page_parser -f /var/lib/mysql/ibdata1
●   Produces:
●   # ll pages-1319190009
●   drwxr-xr-x 16 root root 4096 Oct 21 05:40 FIL_PAGE_INDEX/
●   drwxr-xr-x 2 root root 12288 Oct 21 05:40 FIL_PAGE_TYPE_BLOB/
●   # ll pages-1319190009/FIL_PAGE_INDEX/
●   drwxr-xr-x 2 root root 4096 Oct 21 05:40 0-1/
●   drwxr-xr-x 2 root root 4096 Oct 21 05:40 0-3/
●   drwxr-xr-x 2 root root 4096 Oct 21 05:40 0-18/
●   drwxr-xr-x 2 root root 4096 Oct 21 05:40 0-19/


                                                                    www.percona.com
constraints_parser
●   Accept a page or directory with pages
●   # ./bin/constraints_parser.SYS_TABLES -4Uf pages-1319185222/FIL_PAGE_INDEX/0-1
SYS_TABLES        "SYS_FOREIGN"       11       -2147483644        1         0           0
SYS_TABLES        "SYS_FOREIGN_COLS"           12        -2147483644        1           0
SYS_TABLES        "test/t1"           16       3         1        0         0

●   Table structure is defined in
    "include/table_defs.h"
●   Prints LOAD DATA INFILE to stderr



                                                                      www.percona.com
ibdconnect
●   Create empty InnoDB tablespace
●   Create the table:
    mysql>CREATE TABLE actor (
●   actor_id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
●   first_name VARCHAR(45) NOT NULL,
●   last_name VARCHAR(45) NOT NULL,
●   last_update TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON
    UPDATE CURRENT_TIMESTAMP,
●   PRIMARY KEY   (actor_id),
●   KEY idx_actor_last_name (last_name)
●   )ENGINE=InnoDB DEFAULT CHARSET=utf8;


                                                    www.percona.com
ibdconnect
●   Update InnoDB dictionary (MySQL is down
    now)
    # ibdconnect -o /var/lib/mysql/ibdata1 -f
    /var/lib/mysql/sakila/actor.ibd -d sakila -t actor
●   Fix InnoDB checksums:
    # ./innochecksum -f /var/lib/mysql/ibdata1
    # ./innochecksum -f /var/lib/mysql/ibdata1
●   Start MySQL and take mysqldump

                                            www.percona.com
Thank You to Our Sponsors
         Platinum Sponsor




          Gold Sponsor




         Silver Sponsors




                            www.percona.com
Percona Live London Sponsors
            Exhibitor Sponsors




        Friends of Percona Sponsors




              Media Sponsors




                                      www.percona.com
Annual MySQL Users Conference
      Presented by Percona Live
    The Hyatt Regency Hotel, Santa Clara, CA
               April 10th-12th, 2012
                 Featured Speakers
                    Mark Callaghan, Facebook

                   Jeremy Zawodny, Craigslist

                Marten Mickos, Eucalyptus Systems

                   Sarah Novotny, Blue Gecko

                      Peter Zaitsev, Percona

                    Baron Schwartz, Percona

         The Call for Papers is Now Open!
   Visit www.percona.com/live/mysql-conference-2012/
                                                    www.percona.com
aleksandr.kuzminsky@percona.com
        istvan.podor@percona.com



We're Hiring! www.percona.com/about-us/careers/
www.percona.com/live

Contenu connexe

Tendances

How to Analyze and Tune MySQL Queries for Better Performance
How to Analyze and Tune MySQL Queries for Better PerformanceHow to Analyze and Tune MySQL Queries for Better Performance
How to Analyze and Tune MySQL Queries for Better Performanceoysteing
 
Hive Bucketing in Apache Spark with Tejas Patil
Hive Bucketing in Apache Spark with Tejas PatilHive Bucketing in Apache Spark with Tejas Patil
Hive Bucketing in Apache Spark with Tejas PatilDatabricks
 
Top 10 Oracle SQL tuning tips
Top 10 Oracle SQL tuning tipsTop 10 Oracle SQL tuning tips
Top 10 Oracle SQL tuning tipsNirav Shah
 
Free Load Testing Tools for Oracle Database – Which One Do I Use?
Free Load Testing Tools for Oracle Database – Which One Do I Use?Free Load Testing Tools for Oracle Database – Which One Do I Use?
Free Load Testing Tools for Oracle Database – Which One Do I Use?Christian Antognini
 
What’s New in Oracle Database 19c - Part 1
What’s New in Oracle Database 19c - Part 1What’s New in Oracle Database 19c - Part 1
What’s New in Oracle Database 19c - Part 1Satishbabu Gunukula
 
Apache Spark Core – Practical Optimization
Apache Spark Core – Practical OptimizationApache Spark Core – Practical Optimization
Apache Spark Core – Practical OptimizationDatabricks
 
Advanced PLSQL Optimizing for Better Performance
Advanced PLSQL Optimizing for Better PerformanceAdvanced PLSQL Optimizing for Better Performance
Advanced PLSQL Optimizing for Better PerformanceZohar Elkayam
 
Case study mattel_ rev
Case study mattel_ revCase study mattel_ rev
Case study mattel_ revkenahnancy
 
Efficient Use of indexes in MySQL
Efficient Use of indexes in MySQLEfficient Use of indexes in MySQL
Efficient Use of indexes in MySQLAleksandr Kuzminsky
 
Postgres MVCC - A Developer Centric View of Multi Version Concurrency Control
Postgres MVCC - A Developer Centric View of Multi Version Concurrency ControlPostgres MVCC - A Developer Centric View of Multi Version Concurrency Control
Postgres MVCC - A Developer Centric View of Multi Version Concurrency ControlReactive.IO
 
Facets of Strategic Marketing
Facets of Strategic MarketingFacets of Strategic Marketing
Facets of Strategic MarketingISAAC Jayant
 
[pgday.Seoul 2022] PostgreSQL구조 - 윤성재
[pgday.Seoul 2022] PostgreSQL구조 - 윤성재[pgday.Seoul 2022] PostgreSQL구조 - 윤성재
[pgday.Seoul 2022] PostgreSQL구조 - 윤성재PgDay.Seoul
 
Five_Things_You_Might_Not_Know_About_Oracle_Database_v2.pptx
Five_Things_You_Might_Not_Know_About_Oracle_Database_v2.pptxFive_Things_You_Might_Not_Know_About_Oracle_Database_v2.pptx
Five_Things_You_Might_Not_Know_About_Oracle_Database_v2.pptxMaria Colgan
 
Solving the DB2 LUW Administration Dilemma
Solving the DB2 LUW Administration DilemmaSolving the DB2 LUW Administration Dilemma
Solving the DB2 LUW Administration DilemmaRandy Goering
 
Oracle sql high performance tuning
Oracle sql high performance tuningOracle sql high performance tuning
Oracle sql high performance tuningGuy Harrison
 
Oracle statistics by example
Oracle statistics by exampleOracle statistics by example
Oracle statistics by exampleMauro Pagano
 
A deep dive about VIP,HAIP, and SCAN
A deep dive about VIP,HAIP, and SCAN A deep dive about VIP,HAIP, and SCAN
A deep dive about VIP,HAIP, and SCAN Riyaj Shamsudeen
 

Tendances (20)

How to Analyze and Tune MySQL Queries for Better Performance
How to Analyze and Tune MySQL Queries for Better PerformanceHow to Analyze and Tune MySQL Queries for Better Performance
How to Analyze and Tune MySQL Queries for Better Performance
 
Hive Bucketing in Apache Spark with Tejas Patil
Hive Bucketing in Apache Spark with Tejas PatilHive Bucketing in Apache Spark with Tejas Patil
Hive Bucketing in Apache Spark with Tejas Patil
 
Top 10 Oracle SQL tuning tips
Top 10 Oracle SQL tuning tipsTop 10 Oracle SQL tuning tips
Top 10 Oracle SQL tuning tips
 
Free Load Testing Tools for Oracle Database – Which One Do I Use?
Free Load Testing Tools for Oracle Database – Which One Do I Use?Free Load Testing Tools for Oracle Database – Which One Do I Use?
Free Load Testing Tools for Oracle Database – Which One Do I Use?
 
What’s New in Oracle Database 19c - Part 1
What’s New in Oracle Database 19c - Part 1What’s New in Oracle Database 19c - Part 1
What’s New in Oracle Database 19c - Part 1
 
Apache Spark Core – Practical Optimization
Apache Spark Core – Practical OptimizationApache Spark Core – Practical Optimization
Apache Spark Core – Practical Optimization
 
Advanced PLSQL Optimizing for Better Performance
Advanced PLSQL Optimizing for Better PerformanceAdvanced PLSQL Optimizing for Better Performance
Advanced PLSQL Optimizing for Better Performance
 
Case study mattel_ rev
Case study mattel_ revCase study mattel_ rev
Case study mattel_ rev
 
ASM
ASMASM
ASM
 
Efficient Use of indexes in MySQL
Efficient Use of indexes in MySQLEfficient Use of indexes in MySQL
Efficient Use of indexes in MySQL
 
Postgres MVCC - A Developer Centric View of Multi Version Concurrency Control
Postgres MVCC - A Developer Centric View of Multi Version Concurrency ControlPostgres MVCC - A Developer Centric View of Multi Version Concurrency Control
Postgres MVCC - A Developer Centric View of Multi Version Concurrency Control
 
Facets of Strategic Marketing
Facets of Strategic MarketingFacets of Strategic Marketing
Facets of Strategic Marketing
 
[pgday.Seoul 2022] PostgreSQL구조 - 윤성재
[pgday.Seoul 2022] PostgreSQL구조 - 윤성재[pgday.Seoul 2022] PostgreSQL구조 - 윤성재
[pgday.Seoul 2022] PostgreSQL구조 - 윤성재
 
Five_Things_You_Might_Not_Know_About_Oracle_Database_v2.pptx
Five_Things_You_Might_Not_Know_About_Oracle_Database_v2.pptxFive_Things_You_Might_Not_Know_About_Oracle_Database_v2.pptx
Five_Things_You_Might_Not_Know_About_Oracle_Database_v2.pptx
 
Solving the DB2 LUW Administration Dilemma
Solving the DB2 LUW Administration DilemmaSolving the DB2 LUW Administration Dilemma
Solving the DB2 LUW Administration Dilemma
 
Oracle sql high performance tuning
Oracle sql high performance tuningOracle sql high performance tuning
Oracle sql high performance tuning
 
Oracle 12c New Features
Oracle 12c New FeaturesOracle 12c New Features
Oracle 12c New Features
 
Oracle statistics by example
Oracle statistics by exampleOracle statistics by example
Oracle statistics by example
 
A deep dive about VIP,HAIP, and SCAN
A deep dive about VIP,HAIP, and SCAN A deep dive about VIP,HAIP, and SCAN
A deep dive about VIP,HAIP, and SCAN
 
SKILLWISE-DB2 DBA
SKILLWISE-DB2 DBASKILLWISE-DB2 DBA
SKILLWISE-DB2 DBA
 

Similaire à Data recovery talk on PLUK

Recovery of lost or corrupted inno db tables(mysql uc 2010)
Recovery of lost or corrupted inno db tables(mysql uc 2010)Recovery of lost or corrupted inno db tables(mysql uc 2010)
Recovery of lost or corrupted inno db tables(mysql uc 2010)guest808c167
 
Open sql2010 recovery-of-lost-or-corrupted-innodb-tables
Open sql2010 recovery-of-lost-or-corrupted-innodb-tablesOpen sql2010 recovery-of-lost-or-corrupted-innodb-tables
Open sql2010 recovery-of-lost-or-corrupted-innodb-tablesArvids Godjuks
 
InnoDB Internal
InnoDB InternalInnoDB Internal
InnoDB Internalmysqlops
 
Inno Db Internals Inno Db File Formats And Source Code Structure
Inno Db Internals Inno Db File Formats And Source Code StructureInno Db Internals Inno Db File Formats And Source Code Structure
Inno Db Internals Inno Db File Formats And Source Code StructureMySQLConference
 
cPanelCon 2014: InnoDB Anatomy
cPanelCon 2014: InnoDB AnatomycPanelCon 2014: InnoDB Anatomy
cPanelCon 2014: InnoDB AnatomyRyan Robson
 
Page Cache in Linux 2.6.pdf
Page Cache in Linux 2.6.pdfPage Cache in Linux 2.6.pdf
Page Cache in Linux 2.6.pdfycelgemici1
 
Inno db internals innodb file formats and source code structure
Inno db internals innodb file formats and source code structureInno db internals innodb file formats and source code structure
Inno db internals innodb file formats and source code structurezhaolinjnu
 
MySQL Space Management
MySQL Space ManagementMySQL Space Management
MySQL Space ManagementMIJIN AN
 
InnoDB: архитектура транзакционного хранилища (Константин Осипов)
InnoDB: архитектура транзакционного хранилища (Константин Осипов)InnoDB: архитектура транзакционного хранилища (Константин Осипов)
InnoDB: архитектура транзакционного хранилища (Константин Осипов)Ontico
 
PAGING MECHANISM Pentium.ppt
PAGING MECHANISM Pentium.pptPAGING MECHANISM Pentium.ppt
PAGING MECHANISM Pentium.pptPramodBorole2
 
(140625) #fitalk sq lite 소개와 구조 분석
(140625) #fitalk   sq lite 소개와 구조 분석(140625) #fitalk   sq lite 소개와 구조 분석
(140625) #fitalk sq lite 소개와 구조 분석INSIGHT FORENSIC
 
cPanelCon 2015: InnoDB Alchemy
cPanelCon 2015: InnoDB AlchemycPanelCon 2015: InnoDB Alchemy
cPanelCon 2015: InnoDB AlchemyRyan Robson
 
Pentium protected mode.ppt
Pentium protected mode.pptPentium protected mode.ppt
Pentium protected mode.pptPramodBorole2
 
Индексируем базу: как делать хорошо и не делать плохо Winter saint p 2021 m...
Индексируем базу: как делать хорошо и не делать плохо   Winter saint p 2021 m...Индексируем базу: как делать хорошо и не делать плохо   Winter saint p 2021 m...
Индексируем базу: как делать хорошо и не делать плохо Winter saint p 2021 m...Андрей Новиков
 
MySQL innoDB split and merge pages
MySQL innoDB split and merge pagesMySQL innoDB split and merge pages
MySQL innoDB split and merge pagesMarco Tusa
 
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...Insight Technology, Inc.
 
How to use Parquet as a basis for ETL and analytics
How to use Parquet as a basis for ETL and analyticsHow to use Parquet as a basis for ETL and analytics
How to use Parquet as a basis for ETL and analyticsJulien Le Dem
 

Similaire à Data recovery talk on PLUK (20)

Recovery of lost or corrupted inno db tables(mysql uc 2010)
Recovery of lost or corrupted inno db tables(mysql uc 2010)Recovery of lost or corrupted inno db tables(mysql uc 2010)
Recovery of lost or corrupted inno db tables(mysql uc 2010)
 
Open sql2010 recovery-of-lost-or-corrupted-innodb-tables
Open sql2010 recovery-of-lost-or-corrupted-innodb-tablesOpen sql2010 recovery-of-lost-or-corrupted-innodb-tables
Open sql2010 recovery-of-lost-or-corrupted-innodb-tables
 
InnoDB Internal
InnoDB InternalInnoDB Internal
InnoDB Internal
 
Inno Db Internals Inno Db File Formats And Source Code Structure
Inno Db Internals Inno Db File Formats And Source Code StructureInno Db Internals Inno Db File Formats And Source Code Structure
Inno Db Internals Inno Db File Formats And Source Code Structure
 
cPanelCon 2014: InnoDB Anatomy
cPanelCon 2014: InnoDB AnatomycPanelCon 2014: InnoDB Anatomy
cPanelCon 2014: InnoDB Anatomy
 
Page Cache in Linux 2.6.pdf
Page Cache in Linux 2.6.pdfPage Cache in Linux 2.6.pdf
Page Cache in Linux 2.6.pdf
 
Inno db internals innodb file formats and source code structure
Inno db internals innodb file formats and source code structureInno db internals innodb file formats and source code structure
Inno db internals innodb file formats and source code structure
 
MySQL Space Management
MySQL Space ManagementMySQL Space Management
MySQL Space Management
 
InnoDB: архитектура транзакционного хранилища (Константин Осипов)
InnoDB: архитектура транзакционного хранилища (Константин Осипов)InnoDB: архитектура транзакционного хранилища (Константин Осипов)
InnoDB: архитектура транзакционного хранилища (Константин Осипов)
 
PAGING MECHANISM Pentium.ppt
PAGING MECHANISM Pentium.pptPAGING MECHANISM Pentium.ppt
PAGING MECHANISM Pentium.ppt
 
(140625) #fitalk sq lite 소개와 구조 분석
(140625) #fitalk   sq lite 소개와 구조 분석(140625) #fitalk   sq lite 소개와 구조 분석
(140625) #fitalk sq lite 소개와 구조 분석
 
cPanelCon 2015: InnoDB Alchemy
cPanelCon 2015: InnoDB AlchemycPanelCon 2015: InnoDB Alchemy
cPanelCon 2015: InnoDB Alchemy
 
Pentium protected mode.ppt
Pentium protected mode.pptPentium protected mode.ppt
Pentium protected mode.ppt
 
Индексируем базу: как делать хорошо и не делать плохо Winter saint p 2021 m...
Индексируем базу: как делать хорошо и не делать плохо   Winter saint p 2021 m...Индексируем базу: как делать хорошо и не делать плохо   Winter saint p 2021 m...
Индексируем базу: как делать хорошо и не делать плохо Winter saint p 2021 m...
 
MySQL innoDB split and merge pages
MySQL innoDB split and merge pagesMySQL innoDB split and merge pages
MySQL innoDB split and merge pages
 
Page compression. PGCON_2016
Page compression. PGCON_2016Page compression. PGCON_2016
Page compression. PGCON_2016
 
MIPS Architecture
MIPS ArchitectureMIPS Architecture
MIPS Architecture
 
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...
 
OpenGL 4.5 Reference Card
OpenGL 4.5 Reference CardOpenGL 4.5 Reference Card
OpenGL 4.5 Reference Card
 
How to use Parquet as a basis for ETL and analytics
How to use Parquet as a basis for ETL and analyticsHow to use Parquet as a basis for ETL and analytics
How to use Parquet as a basis for ETL and analytics
 

Plus de Aleksandr Kuzminsky

Plus de Aleksandr Kuzminsky (6)

ProxySQL at Scale on AWS.pdf
ProxySQL at Scale on AWS.pdfProxySQL at Scale on AWS.pdf
ProxySQL at Scale on AWS.pdf
 
Omnibus as a Solution for Dependency Hell
Omnibus as a Solution for Dependency HellOmnibus as a Solution for Dependency Hell
Omnibus as a Solution for Dependency Hell
 
Efficient Indexes in MySQL
Efficient Indexes in MySQLEfficient Indexes in MySQL
Efficient Indexes in MySQL
 
Netstore overview
Netstore overviewNetstore overview
Netstore overview
 
Undrop for InnoDB
Undrop for InnoDBUndrop for InnoDB
Undrop for InnoDB
 
Undrop for InnoDB
Undrop for InnoDBUndrop for InnoDB
Undrop for InnoDB
 

Dernier

Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 

Dernier (20)

Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 

Data recovery talk on PLUK

  • 1. Data Recovery for MySQL Aleksandr Kuzminsky & Istvan Podor Percona Live London 2011
  • 2. Agenda 1. InnoDB files format overview 2. InnoDB dictionary (SYS_INDEXES, SYS_TABLES) 3. InnoDB Primary and Secondary indexes 4. Typical failure scenarios 5. InnoDB recovery tool www.percona.com
  • 3. 1. InnoDB format overview www.percona.com
  • 4. InnoDB table spaces ● Shared table space (ibdataX) ● dictionary, rollback segment, undo space, insert buffer, double write buffer ● Primary & Secondary indexes ● External pages (BLOBs) ● A table space per table (ibdataX + *.ibd) ● Primary & Secondary indexes ● External pages (BLOBs) www.percona.com
  • 5. InnoDB logs ● a.k.a REDO logs, transactional logs ● ib_logfile[01] ● Contain changes to InnoDB pages: Space ID Page ID Data www.percona.com
  • 6. InnoDB Indexes. PRIMARY ● Table is a clustered index named PRIMARY ● B+ tree data structure, node is a page ● Key is Primary key or unique key or ROW_ID (6bytes) www.percona.com
  • 7. InnoDB Indexes. Secondary ● Also B+ tree structure ● Key is indexed field(s), values are primary keys ● If table (id, first_name, last_name, birth_date), ● And index (last_name) ● The index structure stores (last_name, id) www.percona.com
  • 8. Index identifier (index_id) ● 8 bytes integer, often in two numbers notation: 0 254 ● Table name → index_id correspondence is stored in InnoDB dictionary ● Visible in table monitor output(see next slide) ● In I_S tables if Percona Server www.percona.com
  • 9. Table monitor output mysql> CREATE TABLE innodb_table_monitor(x int) engine=innodb In Error log: TABLE: name test/site_folders, id 0 119, columns 9, indexes 1, appr.rows 1       COLUMNS: id: DATA_INT len 4 prec 0; name: type 12 len 765 prec 0; sites_count: DATA_INT len 4 prec 0;                            created_at: DATA_INT len 8 prec 0; updated_at: DATA_INT len 8 prec 0;                    DB_ROW_ID: DATA_SYS prtype 256 len 6 prec 0; DB_TRX_ID: DATA_SYS prtype 257 len 6 prec 0;                    DB_ROLL_PTR: DATA_SYS prtype 258 len 7 prec 0;           INDEX: name PRIMARY, id 0 254, fields 1/7, type 3            root page 271, appr.key vals 1, leaf pages 1, size pages 1            FIELDS:  id DB_TRX_ID DB_ROLL_PTR name sites_count created_at updated_at www.percona.com
  • 10. InnoDB page format InnoDB page is 16 kilobytes Size, bytes FIL_HEADER 36 PAGE_HEADER 56 INFINUM+SUPREMUM RECORDS varies User records varies Free space Page directory varies FIL_TRAILER fixed www.percona.com
  • 11. Fil Header ● Common for all type of pages Name Siz Remarks e FIL_PAGE_SPACE 4 Space ID the page is in or checksum FIL_PAGE_OFFSET 4 ordinal page number from start of space FIL_PAGE_PREV 4 offset of previous page in key order FIL_PAGE_NEXT 4 offset of next page in key order FIL_PAGE_LSN 8 log serial number of page's latest log record FIL_PAGE_TYPE 2 current defined types are: FIL_PAGE_INDEX, FIL_PAGE_UNDO_LOG, FIL_PAGE_INODE, FIL_PAGE_IBUF_FREE_LIST FIL_PAGE_FILE_FLUSH_ 8 "the file has been flushed to disk at least up to this lsn" (log serial LSN number), valid only on the first page of the file FIL_PAGE_ARCH_LOG_ 4 /* starting from 4.1.x this contains the space id of the page */ NO www.percona.com
  • 12. Page header ● Only for index pages Name Size Remarks PAGE_N_DIR_SLOTS 2 number of directory slots in the Page Directory part; initial value = 2 PAGE_HEAP_TOP 2 record pointer to first record in heap PAGE_N_HEAP 2 number of heap records; initial value = 2 PAGE_FREE 2 record pointer to first free record PAGE_GARBAGE 2 "number of bytes in deleted records" PAGE_LAST_INSERT 2 record pointer to the last inserted record PAGE_DIRECTION 2 either PAGE_LEFT, PAGE_RIGHT, or PAGE_NO_DIRECTION PAGE_N_DIRECTION 2 number of consecutive inserts in the same direction, e.g. "last 5 were all to the left" PAGE_N_RECS 2 number of user records PAGE_MAX_TRX_ID 8 the highest ID of a transaction which might have changed a record on the page (only set for secondary indexes) PAGE_LEVEL 2 level within the index (0 for a leaf page) PAGE_INDEX_ID 8 identifier of the index the page belongs to PAGE_BTR_SEG_LEA 10 "file segment header for the leaf pages in a B-tree" (this is irrelevant here) F PAGE_BTR_SEG_TOP 10 "file segment header for the non-leaf pages in a B-tree" (this is irrelevant here) www.percona.com
  • 13. How to check row format? ● Can be either COMPACT or REDUNDANT ● 0 stands for REDUNDANT, 1 - for COMACT ● The highest bit of the PAGE_N_HEAP from the page header ● dc -e "2o `hexdump –C d pagefile | grep 00000020 | awk '{ print $12}'` p" | sed 's/./& /g' | awk '{ print $1}' www.percona.com
  • 14. Rows in an InnoDB page ● Single linked list next INFIMUM NULL SUPREMUM ● The first record – next 10 data INFIMUM next 20 data next 30 data ● The last – next 50 data SUPREMUM next 40 data ● Ordered by PK www.percona.com
  • 15. Records are saved in insert order insert into t1 values(10, 'aaa'); insert into t1 values(30, 'ccc'); insert into t1 values(20, 'bbb'); JG....................N<E....... ................................ .............................2.. ...infimum......supremum......6. ........)....2..aaa............. ...*....2..ccc.... ...........+. ...2..bbb....................... ................................ www.percona.com
  • 16. Row format CREATE TABLE `t1` ( Name Size `ID` int(11) unsigned NOT NULL, `NAME` varchar(120), Offsets 1 byte for small types `N_FIELDS` int(10), (lengths) 2 bytes for longer types PRIMARY KEY (`ID`) ) ENGINE=InnoDB DEFAULT Extra bytes 5 bytes COMPACT CHARSET=latin1 6 bytes REDUNDANT Field content varies www.percona.com
  • 17. Extra bytes(REDUNDANT) Name Size Description record_status 2 bits _ORDINARY, _NODE_PTR, _INFIMUM, _SUPREMUM deleted_flag 1 bit 1 if record is deleted min_rec_flag 1 bit 1 if record is predefined minimum record n_owned 4 bits number of records owned by this record heap_no 13 bits record's order number in heap of index page n_fields 10 bits number of fields in this record, 1 to 1023 1byte_offs_flag 1 bit 1 if each Field Start Offsets is 1 byte long (this item is also called the "short" flag) next 16 bits 16 bits pointer to next record in page www.percona.com
  • 18. Extra bytes(COMPACT) Name Size, bits Description record_status 4 4 bits used to delete mark a record, and mark a predefined minimum deleted_flag record in alphabetical order min_rec_flag n_owned 4 the number of records owned by this record (this term is explained in page0page.h) heap_no 13 the order number of this record in the heap of the index page record type 3 000=conventional, 001=node pointer (inside B-tree), 010=infimum, 011=supremum, 1xx=reserved next 16 bits 16 a relative pointer to the next record in the page www.percona.com
  • 19. Example: Redundant row A row: (10, ‘abcdef’, 20) Actualy (10, TRX_ID, PTR_ID, ‘abcdef’, 20) 4 6 7 6 4 ... next 00 00 00 0A ... ... abcdef 80 00 00 14 Offsets Extra Fields bytes www.percona.com
  • 20. The same row, but COMPACT A row: (10, ‘abcdef’, 20) Actualy (10, TRX_ID, PTR_ID, ‘abcdef’, 20) 6 ... next 00 00 00 0A ... ... abcdef 80 00 00 14 Offsets Extra Fields bytes 13 % smaller! www.percona.com
  • 21. Data types (highlights) ● Integer, float numbers are fixed size ● Strings: ● VARCHAR(x) – variable ● CHAR(x) – fixed if not UTF-8 ● Date, time – fixed size ● DECIMAL ● Stored as string before 5.0.3 ● Binary format afterward www.percona.com
  • 22. Data types (BLOBs) ● Field length (so called offset) is one or two bytes long ● If record size < (UNIV_PAGE_SIZE/2-200) == ~7k – the record is stored internally (in a PK page) ● Otherwise – 768 bytes in-page, the rest in an external page www.percona.com
  • 24. Why are SYS_* tables needed? ● Correspondence “table name” → “index_id” ● Storage for other internal information www.percona.com
  • 25. SYS_* structure Always REDUNDANT format! CREATE TABLE `SYS_INDEXES` ( CREATE TABLE `SYS_TABLES` ( `TABLE_ID` bigint(20) unsigned NOT NULL `NAME` varchar(255) NOT NULL default '', default '0', `ID` bigint(20) unsigned NOT NULL default `ID` bigint(20) unsigned NOT NULL default '0', '0', `N_COLS` int(10) unsigned default NULL, `NAME` varchar(120) default NULL, `TYPE` int(10) unsigned default NULL, `N_FIELDS` int(10) unsigned default NULL, `MIX_ID` bigint(20) unsigned default NULL, `TYPE` int(10) unsigned default NULL, `MIX_LEN` int(10) unsigned default NULL, `SPACE` int(10) unsigned default NULL, `CLUSTER_NAME` varchar(255) default NULL, `PAGE_NO` int(10) unsigned default NULL, `SPACE` int(10) unsigned default NULL, PRIMARY KEY (`TABLE_ID`,`ID`) PRIMARY KEY (`NAME`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 ) ENGINE=InnoDB DEFAULT CHARSET=latin1 index_id = 0-3 index_id = 0-1 www.percona.com
  • 26. Example: SYS_* SYS_TABLES ● NAME ID … ● "sakila/actor" 40 8 1 0 0 NULL 0 ● "sakila/actor" 40 8 1 0 0 NULL 0 ● "sakila/actor" 40 8 1 0 0 NULL 0 SYS_INDEXES ● TABLE_ID ID NAME … ● 40 196389 "PRIMARY" 2 3 0 21031026 ● 40 196390 "idx_actor_last_name" 1 0 0 21031028 www.percona.com
  • 27. 3. InnoDB Primary and Secondary indexes www.percona.com
  • 28. PRIMARY Index The table: Fields in the PK: CREATE TABLE `t1` ( `ID` int(11), 1. ID `NAME` varchar(120), 2. DB_TRX_ID `N_FIELDS` int(10), 3. DB_ROLL_PTR PRIMARY KEY (`ID`), 4. NAME KEY `NAME` (`NAME`) 5. N_FIELDS ) ENGINE=InnoDB DEFAULT CHARSET=latin1 www.percona.com
  • 29. Secondary Index The table: Fields in the SK: CREATE TABLE `t1` ( `ID` int(11), 1. NAME `NAME` varchar(120), 2. ID  Primary key `N_FIELDS` int(10), PRIMARY KEY (`ID`), KEY `NAME` (`NAME`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 www.percona.com
  • 30. 4. Typical failure scenarios www.percona.com
  • 31. Wrong DELETE ● DELETE FROM `actor`; ● What happens? ● Row(s) is marked as deleted ● TRX_ID and ROLL_PTR are updated ● Row remains in the page till UNDO log is purged ● www.percona.com
  • 32. Wrong DELETE ● What to do first? ● Kill -9 mysqld_safe ASAP! ● Kill -9 mysqld ASAP ● Find pages which belong to the dropped table ● Fetch records from the pages(constraints_parser from the recovery toolkit) www.percona.com
  • 33. Dropped Table/Database ● DROP TABLE actor; ● Very often DROP and then CREATE ● Bad because .frm files are removed ● Even worse when innodb_per_table ● What happens inside InnoDB: ● Page is marked as free (or deleted tablespace) ● A record is deleted from the dictionary www.percona.com
  • 34. Dropped Table/Database ● What to do? ● Kill -9 safe_mysqld and mysqld ● Remount a MySQL partition read-only – Or take an image – Or stop other services writing to the disk ● Fetch records from index pages (constraints_parser from recovery toolkit) www.percona.com
  • 35. Truncate table ● What happens? ● Equivalent to DROP/CREATE ● Dictionary is updated, but index_id may be reused ● What to do? ● The same as after the DROP: kill mysqld, fetch records from the index www.percona.com
  • 36. Wrong UPDATE statement ● What happens? ● If the new row is the same size, in-place update happens; otherwise – insert/delete ● The old values go to UNDO space ● roll_ptr points to the old value in the UNDO space ● What to do? ● Kill -9 safe_mysqld, mysqld ● However no tool available now www.percona.com
  • 37. Other wrongdoings ● Removed ibdata1 file ● Take mysqldump ASAP before MySQL is stopped ● Ibdconnect from recovery toolkit ● Wrong backups ● innodb_force_recovery ● Fetch records from index pages ● You name it www.percona.com
  • 38. Corrupt InnoDB tablespace ● Hardware failures ● OS or filesystem failures ● InnoDB bugs ● Corrupted InnoDB tablespace by other processes ● What to do? ● innodb_force_recovery ● fetch_data.sh ● constraints_parser www.percona.com
  • 39. 5. InnoDB recovery tool www.percona.com
  • 40. Features ● A toolset to works with InnoDB at low level ● page_parser – scans a bytes stream, finds InnoDB pages and sorts them by page type/index_id ● constraints_parser – fetches records from InnoDB page ● Ibdconnect – a tool to “connect” an .ibd file to system tablespace. ● fetch_data.sh – fetches data from partially corrupted tables choosing PK ranges. www.percona.com
  • 41. Recovery prerequisites ● Media ● ibdata1 ● *.ibd ● HDD image ● Tables structure ● SQL dump ● *.FRM files www.percona.com
  • 42. How to get CREATE info from .frm files ● Create table `actor` (id int) engine=InnoDB ● Stop MySQL and replace actor.frm ● Run MySQL with innodb_force_recovery=4 ● SHOW CREATE TABLE actor; www.percona.com
  • 43. Percona Data Recovery Tool for InnoDB http://launchpad.net/percona-innodb-recovery-tool/ page_parser – splits InnoDB tablespace into 16k pages constraints_parser – scans a page and finds good records www.percona.com
  • 44. page_parser ● Accepts a file # ./page_parser -f /var/lib/mysql/ibdata1 ● Produces: ● # ll pages-1319190009 ● drwxr-xr-x 16 root root 4096 Oct 21 05:40 FIL_PAGE_INDEX/ ● drwxr-xr-x 2 root root 12288 Oct 21 05:40 FIL_PAGE_TYPE_BLOB/ ● # ll pages-1319190009/FIL_PAGE_INDEX/ ● drwxr-xr-x 2 root root 4096 Oct 21 05:40 0-1/ ● drwxr-xr-x 2 root root 4096 Oct 21 05:40 0-3/ ● drwxr-xr-x 2 root root 4096 Oct 21 05:40 0-18/ ● drwxr-xr-x 2 root root 4096 Oct 21 05:40 0-19/ www.percona.com
  • 45. constraints_parser ● Accept a page or directory with pages ● # ./bin/constraints_parser.SYS_TABLES -4Uf pages-1319185222/FIL_PAGE_INDEX/0-1 SYS_TABLES "SYS_FOREIGN" 11 -2147483644 1 0 0 SYS_TABLES "SYS_FOREIGN_COLS" 12 -2147483644 1 0 SYS_TABLES "test/t1" 16 3 1 0 0 ● Table structure is defined in "include/table_defs.h" ● Prints LOAD DATA INFILE to stderr www.percona.com
  • 46. ibdconnect ● Create empty InnoDB tablespace ● Create the table: mysql>CREATE TABLE actor ( ● actor_id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT, ● first_name VARCHAR(45) NOT NULL, ● last_name VARCHAR(45) NOT NULL, ● last_update TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, ● PRIMARY KEY (actor_id), ● KEY idx_actor_last_name (last_name) ● )ENGINE=InnoDB DEFAULT CHARSET=utf8; www.percona.com
  • 47. ibdconnect ● Update InnoDB dictionary (MySQL is down now) # ibdconnect -o /var/lib/mysql/ibdata1 -f /var/lib/mysql/sakila/actor.ibd -d sakila -t actor ● Fix InnoDB checksums: # ./innochecksum -f /var/lib/mysql/ibdata1 # ./innochecksum -f /var/lib/mysql/ibdata1 ● Start MySQL and take mysqldump www.percona.com
  • 48. Thank You to Our Sponsors Platinum Sponsor Gold Sponsor Silver Sponsors www.percona.com
  • 49. Percona Live London Sponsors Exhibitor Sponsors Friends of Percona Sponsors Media Sponsors www.percona.com
  • 50. Annual MySQL Users Conference Presented by Percona Live The Hyatt Regency Hotel, Santa Clara, CA April 10th-12th, 2012 Featured Speakers Mark Callaghan, Facebook Jeremy Zawodny, Craigslist Marten Mickos, Eucalyptus Systems Sarah Novotny, Blue Gecko Peter Zaitsev, Percona Baron Schwartz, Percona The Call for Papers is Now Open! Visit www.percona.com/live/mysql-conference-2012/ www.percona.com
  • 51. aleksandr.kuzminsky@percona.com istvan.podor@percona.com We're Hiring! www.percona.com/about-us/careers/