Data Recovery for MySQL
       Aleksandr Kuzminsky & Istvan Podor
                Percona Live London 2011
Agenda
1. InnoDB file format overview
2. InnoDB dictionary (SYS_INDEXES, SYS_TABLES)
3. InnoDB Primary and Secondary indexes
4. Typical failure scenarios
5. InnoDB recovery tool




                                            www.percona.com
1. InnoDB format overview




InnoDB table spaces
●   Shared table space (ibdataX)
    ●   dictionary, rollback segment, undo space, insert
        buffer, double write buffer
    ●   Primary & Secondary indexes
    ●   External pages (BLOBs)
●   A table space per table (ibdataX + *.ibd)
    ●   Primary & Secondary indexes
    ●   External pages (BLOBs)



InnoDB logs
●   a.k.a. redo logs, transaction logs
●   ib_logfile[01]
●   Contain changes to InnoDB pages; each entry holds:
              Space ID | Page ID | Data




InnoDB Indexes. PRIMARY
●   Table is a clustered index named PRIMARY
●   B+ tree data structure, node is a page
●   Key is the primary key, else a unique key, else ROW_ID
    (6 bytes)




InnoDB Indexes. Secondary
●   Also B+ tree structure
●   Key is the indexed field(s), values are primary keys
    ●   Given a table (id, first_name, last_name, birth_date)
    ●   and an index on (last_name),
    ●   the index structure stores (last_name, id)
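
A minimal Python sketch of the lookup path this implies: the secondary index yields primary key values, and a second lookup in the clustered index returns the full row. The sorted in-memory lists, the sample rows and the function name are illustrative assumptions, not InnoDB structures.

```python
import bisect

# Clustered index (PRIMARY): rows sorted by primary key `id`.
primary = [
    (1, "Ann", "Lee", "1980-01-01"),
    (2, "Bob", "Kim", "1975-05-05"),
    (3, "Cy", "Lee", "1990-09-09"),
]

# Secondary index on last_name stores (last_name, id) pairs, sorted.
secondary = sorted((row[2], row[0]) for row in primary)


def find_by_last_name(last_name):
    """Return full rows for last_name via two index lookups."""
    # Step 1: range scan in the secondary index to collect primary keys.
    start = bisect.bisect_left(secondary, (last_name,))
    ids = []
    for name, pk in secondary[start:]:
        if name != last_name:
            break
        ids.append(pk)
    # Step 2: look up each primary key in the clustered index.
    pks = [row[0] for row in primary]
    return [primary[bisect.bisect_left(pks, pk)] for pk in ids]
```

This is also why a secondary index page alone is not enough to recover whole rows: it only holds the indexed fields and the primary key.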




Index identifier (index_id)
●   8-byte integer, often written as two numbers:
    0 254
●   The table name → index_id correspondence is
    stored in the InnoDB dictionary
●   Visible in the table monitor output (see next slide)
●   In INFORMATION_SCHEMA tables on Percona Server
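
A short Python sketch of the two-number notation, assuming the two numbers are the high and low 32-bit halves of the 8-byte index_id (that split is an assumption here; the slide only shows the notation). The function names are hypothetical.

```python
def index_id_from_pair(high, low):
    """Combine the two 32-bit halves into one 64-bit index_id."""
    return (high << 32) | low


def index_id_to_pair(index_id):
    """Split a 64-bit index_id back into the (high, low) notation."""
    return index_id >> 32, index_id & 0xFFFFFFFF
```

Under that assumption, "0 254" is simply index_id 254.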




Table monitor output

  mysql> CREATE TABLE innodb_table_monitor(x int) ENGINE=InnoDB;

In the error log:
TABLE: name test/site_folders, id 0 119, columns 9, indexes 1, appr.rows 1
  COLUMNS: id: DATA_INT len 4 prec 0; name: type 12 len 765 prec 0;
           sites_count: DATA_INT len 4 prec 0;
           created_at: DATA_INT len 8 prec 0; updated_at: DATA_INT len 8 prec 0;
           DB_ROW_ID: DATA_SYS prtype 256 len 6 prec 0;
           DB_TRX_ID: DATA_SYS prtype 257 len 6 prec 0;
           DB_ROLL_PTR: DATA_SYS prtype 258 len 7 prec 0;
  INDEX: name PRIMARY, id 0 254, fields 1/7, type 3
   root page 271, appr.key vals 1, leaf pages 1, size pages 1
   FIELDS:  id DB_TRX_ID DB_ROLL_PTR name sites_count created_at updated_at




InnoDB page format
An InnoDB page is 16 kilobytes

Section                       Size, bytes
FIL_HEADER                    38
PAGE_HEADER                   56
INFIMUM + SUPREMUM records    varies
User records                  varies
Free space                    varies
Page directory                varies
FIL_TRAILER                   8




Fil Header
●   Common to all page types
    Name                      Size   Remarks
    FIL_PAGE_SPACE            4      space ID the page is in, or checksum
    FIL_PAGE_OFFSET           4      ordinal page number from start of space
    FIL_PAGE_PREV             4      offset of previous page in key order
    FIL_PAGE_NEXT             4      offset of next page in key order
    FIL_PAGE_LSN              8      log serial number of page's latest log record
    FIL_PAGE_TYPE             2      currently defined types include FIL_PAGE_INDEX, FIL_PAGE_UNDO_LOG, FIL_PAGE_INODE, FIL_PAGE_IBUF_FREE_LIST
    FIL_PAGE_FILE_FLUSH_LSN   8      "the file has been flushed to disk at least up to this lsn" (log serial number), valid only on the first page of the file
    FIL_PAGE_ARCH_LOG_NO      4      starting from 4.1.x this contains the space ID of the page
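
As an illustration, the header fields listed above can be decoded with Python's `struct` module. This is a sketch assuming big-endian integers and the field order and sizes from the table (4+4+4+4+8+2+8+4 = 38 bytes); the tuple field names are made up for readability.

```python
import struct
from collections import namedtuple

# Big-endian, no padding: 4 x uint32, uint64, uint16, uint64, uint32.
FIL_HEADER_FORMAT = ">IIIIQHQI"
FIL_HEADER_SIZE = struct.calcsize(FIL_HEADER_FORMAT)

FilHeader = namedtuple(
    "FilHeader",
    "space_or_checksum page_no prev_page next_page lsn "
    "page_type flush_lsn arch_log_no_or_space_id",
)


def parse_fil_header(page):
    """Decode the first 38 bytes of a page into named fields."""
    return FilHeader(*struct.unpack_from(FIL_HEADER_FORMAT, page, 0))
```

Calling `parse_fil_header(page).page_type` on each 16 KB chunk of a tablespace file is enough to tell index pages from other page types.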




Page header
●   Only for index pages
    Name                Size   Remarks
    PAGE_N_DIR_SLOTS    2      number of directory slots in the Page Directory part; initial value = 2
    PAGE_HEAP_TOP       2      record pointer to first record in heap
    PAGE_N_HEAP         2      number of heap records; initial value = 2
    PAGE_FREE           2      record pointer to first free record
    PAGE_GARBAGE        2      "number of bytes in deleted records"
    PAGE_LAST_INSERT    2      record pointer to the last inserted record
    PAGE_DIRECTION      2      either PAGE_LEFT, PAGE_RIGHT, or PAGE_NO_DIRECTION
    PAGE_N_DIRECTION    2      number of consecutive inserts in the same direction, e.g. "last 5 were all to the left"
    PAGE_N_RECS         2      number of user records
    PAGE_MAX_TRX_ID     8      the highest ID of a transaction which might have changed a record on the page (only set for secondary indexes)
    PAGE_LEVEL          2      level within the index (0 for a leaf page)
    PAGE_INDEX_ID       8      identifier of the index the page belongs to
    PAGE_BTR_SEG_LEAF   10     "file segment header for the leaf pages in a B-tree" (this is irrelevant here)
    PAGE_BTR_SEG_TOP    10     "file segment header for the non-leaf pages in a B-tree" (this is irrelevant here)



How to check row format?
●   Can be either COMPACT or REDUNDANT
●   The highest bit of PAGE_N_HEAP in the page header
●   0 stands for REDUNDANT, 1 for COMPACT
●   PAGE_N_HEAP starts at byte 42 of the page (38-byte FIL
    header + 4), so print the top bit of that byte:
    od -An -tu1 -j42 -N1 pagefile |
    awk '{ print int($1 / 128) }'
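
The same check as a Python sketch, assuming a single page already read into memory, a 38-byte FIL header, and PAGE_N_HEAP as the third 2-byte field of the page header (big-endian). The function name is hypothetical.

```python
import struct

FIL_HEADER_SIZE = 38                      # page header starts right after it
PAGE_N_HEAP_OFFSET = FIL_HEADER_SIZE + 4  # after PAGE_N_DIR_SLOTS, PAGE_HEAP_TOP


def row_format(page):
    """Return (format name, number of heap records) for an index page."""
    (raw,) = struct.unpack_from(">H", page, PAGE_N_HEAP_OFFSET)
    # The top bit is the format flag; the remaining 15 bits are the count.
    name = "COMPACT" if raw & 0x8000 else "REDUNDANT"
    return name, raw & 0x7FFF
```

Masking the flag off matters when PAGE_N_HEAP is used as a record count; otherwise every COMPACT page appears to hold at least 32768 records.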


Rows in an InnoDB page
●   Singly linked list
●   The first record is INFIMUM
●   The last record is SUPREMUM
●   Ordered by PK

    Layout in the page (each record carries a "next" pointer):
    next        INFIMUM
    NULL        SUPREMUM
    next   10   data
    next   20   data
    next   30   data
    next   50   data
    next   40   data
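
A sketch of walking that list in Python. It assumes a COMPACT 16 KB page, where the infimum and supremum records are taken to start at byte offsets 99 and 112, and where the "next" pointer is a signed 16-bit offset relative to the current record, stored in the two bytes just before it. These offsets are assumptions of the sketch, not values given on the slides.

```python
import struct

PAGE_SIZE = 16384
INFIMUM_OFFSET = 99    # assumed origin of the infimum record (COMPACT)
SUPREMUM_OFFSET = 112  # assumed origin of the supremum record (COMPACT)


def walk_compact_records(page):
    """Return user-record offsets in key order by following next pointers."""
    offsets = []
    seen = {INFIMUM_OFFSET}
    pos = INFIMUM_OFFSET
    while True:
        # The relative next pointer sits in the 2 bytes before the record.
        (rel,) = struct.unpack_from(">h", page, pos - 2)
        pos = (pos + rel) % PAGE_SIZE
        if pos == SUPREMUM_OFFSET:
            return offsets
        if pos in seen or pos < 2:
            raise ValueError("broken record list; page is likely corrupt")
        seen.add(pos)
        offsets.append(pos)
```

The offsets come back in key order even though the records themselves sit in the page in insert order.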




Records are saved in insert order
insert into t1 values(10, 'aaa');
insert into t1 values(30, 'ccc');
insert into t1 values(20, 'bbb');

JG....................N<E.......
................................
.............................2..
...infimum......supremum......6.
........)....2..aaa.............
...*....2..ccc.... ...........+.
...2..bbb.......................
................................

Row format
CREATE TABLE `t1` (
  `ID` int(11) unsigned NOT NULL,
  `NAME` varchar(120),
  `N_FIELDS` int(10),
  PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

Name                Size
Offsets (lengths)   1 byte for small types, 2 bytes for longer types
Extra bytes         5 bytes COMPACT, 6 bytes REDUNDANT
Field content       varies




Extra bytes(REDUNDANT)
Name              Size      Description

record_status     2 bits    _ORDINARY, _NODE_PTR, _INFIMUM, _SUPREMUM

deleted_flag      1 bit     1 if record is deleted

min_rec_flag      1 bit     1 if record is predefined minimum record

n_owned           4 bits    number of records owned by this record

heap_no           13 bits   record's order number in heap of index page

n_fields          10 bits   number of fields in this record, 1 to 1023

1byte_offs_flag   1 bit     1 if each Field Start Offset is 1 byte long (this item is also called the "short" flag)

next 16 bits      16 bits   pointer to next record in page



Extra bytes(COMPACT)
Name                                         Size, bits   Description
record_status (deleted_flag, min_rec_flag)   4            4 bits used to delete-mark a record and to mark a predefined minimum record in alphabetical order
n_owned                                      4            the number of records owned by this record (this term is explained in page0page.h)
heap_no                                      13           the order number of this record in the heap of the index page
record type                                  3            000=conventional, 001=node pointer (inside B-tree), 010=infimum, 011=supremum, 1xx=reserved
next 16 bits                                 16           a relative pointer to the next record in the page
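
A sketch of decoding these 5 bytes in Python. The field widths follow the table; the exact bit positions (delete mark as 0x20 and minimum-record mark as 0x10 in the first byte, record type in the low 3 bits of the next two bytes) are assumptions of this sketch, as is the function name.

```python
import struct

RECORD_TYPES = {0: "conventional", 1: "node pointer", 2: "infimum", 3: "supremum"}


def decode_compact_extra(page, origin):
    """Decode the 5 extra bytes that precede a COMPACT record at `origin`."""
    flags_owned, heap_type, next_rel = struct.unpack_from(">BHh", page, origin - 5)
    return {
        "deleted": bool(flags_owned & 0x20),   # assumed delete-mark bit
        "min_rec": bool(flags_owned & 0x10),   # assumed minimum-record bit
        "n_owned": flags_owned & 0x0F,
        "heap_no": heap_type >> 3,
        "record_type": RECORD_TYPES.get(heap_type & 0x07, "reserved"),
        "next": next_rel,
    }
```

For recovery work the `deleted` flag is the interesting one: a delete-marked record still carries its field values until it is purged.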



Example: Redundant row
A row: (10, 'abcdef', 20)
Actually (10, TRX_ID, ROLL_PTR, 'abcdef', 20)

Offsets:       4 6 7 6 4
Extra bytes:   ... next
Fields:        00 00 00 0A | ... | ... | abcdef | 80 00 00 14
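
The two integers in the example are encoded differently: 10 appears as `00 00 00 0A`, while 20 appears as `80 00 00 14`. A sketch of that encoding in Python, assuming big-endian storage with the sign bit flipped for signed columns (in table `t1`, `ID` is unsigned and `N_FIELDS` is signed). The function names are hypothetical.

```python
def encode_int(value, size=4, signed=True):
    """Encode an integer the way the example row shows it on disk."""
    if signed:
        # Flipping the sign bit makes byte-wise comparison match numeric order.
        value += 1 << (size * 8 - 1)
    return value.to_bytes(size, "big")


def decode_int(raw, signed=True):
    """Reverse of encode_int for a fixed-size integer field."""
    value = int.from_bytes(raw, "big")
    if signed:
        value -= 1 << (len(raw) * 8 - 1)
    return value
```

A record parser has to know whether each integer column is signed; decoding `80 00 00 14` as unsigned would give 2147483668 instead of 20.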




The same row, but COMPACT
A row: (10, 'abcdef', 20)
Actually (10, TRX_ID, ROLL_PTR, 'abcdef', 20)

Offsets:       6
Extra bytes:   ... next
Fields:        00 00 00 0A | ... | ... | abcdef | 80 00 00 14

13% smaller!
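
The figure can be checked with a little arithmetic, sketched here in Python. It uses the field lengths from the REDUNDANT example (4, 6, 7, 6, 4) and counts only what the two diagrams show: one offset byte per field in REDUNDANT, one length byte for the variable-length `NAME` in COMPACT. Anything else a real COMPACT record may carry, such as a NULL bitmap, is left out, as in the diagrams.

```python
FIELD_BYTES = 4 + 6 + 7 + 6 + 4   # ID, TRX_ID, ROLL_PTR, 'abcdef', N_FIELDS

redundant = 5 * 1 + 6 + FIELD_BYTES   # one offset byte per field + 6 extra bytes
compact = 1 + 5 + FIELD_BYTES         # one length byte (NAME) + 5 extra bytes

saving = 1 - compact / redundant
print(f"REDUNDANT {redundant} B, COMPACT {compact} B, saving {saving:.0%}")
```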


Data types (highlights)
●   Integer and floating-point numbers are fixed size
●   Strings:
    ●   VARCHAR(x) – variable
    ●   CHAR(x) – fixed if not UTF-8
●   Date, time – fixed size
●   DECIMAL
    ●   Stored as string before 5.0.3
    ●   Binary format afterward


Data types (BLOBs)
●   Field length (the so-called offset) is one or two
    bytes long
●   If record size < (UNIV_PAGE_SIZE/2 - 200), i.e.
    7992 bytes for 16 KB pages, the record is stored
    internally (in a PK page)
●   Otherwise 768 bytes stay in-page, the rest goes to
    external pages
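
A sketch of that rule in Python, assuming 16 KB pages and a record with a single BLOB field. It only restates the rule as given here; the way InnoDB actually chooses which fields to move off-page is more involved. The function name is hypothetical.

```python
UNIV_PAGE_SIZE = 16384
INLINE_LIMIT = UNIV_PAGE_SIZE // 2 - 200   # 7992 bytes
BLOB_PREFIX = 768


def blob_placement(record_size, blob_size):
    """Return (bytes kept in the PK page, bytes moved to external pages)."""
    if record_size < INLINE_LIMIT:
        return blob_size, 0
    in_page = min(blob_size, BLOB_PREFIX)
    return in_page, blob_size - in_page
```

For recovery this means a long BLOB value cannot be rebuilt from the index page alone; the external pages are needed too.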




2. InnoDB dictionary
(SYS_INDEXES, SYS_TABLES)




Why are SYS_* tables needed?
●   Correspondence “table name” → “index_id”
●   Storage for other internal information




SYS_* structure
Always REDUNDANT format!

CREATE TABLE `SYS_INDEXES` (
  `TABLE_ID` bigint(20) unsigned NOT NULL default '0',
  `ID` bigint(20) unsigned NOT NULL default '0',
  `NAME` varchar(120) default NULL,
  `N_FIELDS` int(10) unsigned default NULL,
  `TYPE` int(10) unsigned default NULL,
  `SPACE` int(10) unsigned default NULL,
  `PAGE_NO` int(10) unsigned default NULL,
  PRIMARY KEY (`TABLE_ID`,`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
index_id = 0-3

CREATE TABLE `SYS_TABLES` (
  `NAME` varchar(255) NOT NULL default '',
  `ID` bigint(20) unsigned NOT NULL default '0',
  `N_COLS` int(10) unsigned default NULL,
  `TYPE` int(10) unsigned default NULL,
  `MIX_ID` bigint(20) unsigned default NULL,
  `MIX_LEN` int(10) unsigned default NULL,
  `CLUSTER_NAME` varchar(255) default NULL,
  `SPACE` int(10) unsigned default NULL,
  PRIMARY KEY (`NAME`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
index_id = 0-1




Example: SYS_*

    SYS_TABLES
  NAME             ID   …
  "sakila/actor"   40   8 1 0 0 NULL 0

    SYS_INDEXES
  TABLE_ID   ID       NAME                    …
  40         196389   "PRIMARY"               2 3 0 21031026
  40         196390   "idx_actor_last_name"   1 0 0 21031028
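
The two dictionary tables are joined on the table ID to get from a table name to its index_id values. A small Python sketch using only the sample rows above (the tuples keep just the columns needed for the join; the function name is hypothetical):

```python
SYS_TABLES = [
    # (NAME, ID)
    ("sakila/actor", 40),
]
SYS_INDEXES = [
    # (TABLE_ID, ID, NAME)
    (40, 196389, "PRIMARY"),
    (40, 196390, "idx_actor_last_name"),
]


def indexes_for(table_name):
    """Map a table name to {index name: index_id} via the table ID."""
    table_ids = {tid for name, tid in SYS_TABLES if name == table_name}
    return {
        idx_name: idx_id
        for table_id, idx_id, idx_name in SYS_INDEXES
        if table_id in table_ids
    }
```

For `sakila/actor` the PRIMARY entry, 196389, is the index_id whose pages hold the full rows.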




3. InnoDB Primary and Secondary indexes




PRIMARY Index

The table:
CREATE TABLE `t1` (
  `ID` int(11),
  `NAME` varchar(120),
  `N_FIELDS` int(10),
  PRIMARY KEY (`ID`),
  KEY `NAME` (`NAME`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

Fields in the PK:
1.   ID
2.   DB_TRX_ID
3.   DB_ROLL_PTR
4.   NAME
5.   N_FIELDS




Secondary Index

The table:
CREATE TABLE `t1` (
  `ID` int(11),
  `NAME` varchar(120),
  `N_FIELDS` int(10),
  PRIMARY KEY (`ID`),
  KEY `NAME` (`NAME`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

Fields in the SK:
1. NAME
2. ID (the primary key)




4. Typical failure scenarios




Wrong DELETE
●   DELETE FROM `actor`;
●   What happens?
    ●   Rows are marked as deleted
    ●   TRX_ID and ROLL_PTR are updated
    ●   Rows remain in the page until the UNDO log is purged




Wrong DELETE
●   What to do first?
    ●   kill -9 mysqld_safe ASAP!
    ●   kill -9 mysqld ASAP
●   Find pages which belong to the affected table
●   Fetch records from the pages (constraints_parser
    from the recovery toolkit)



Dropped Table/Database
●   DROP TABLE actor;
●   Very often DROP and then CREATE
●   Bad because .frm files are removed
●   Even worse with innodb_file_per_table
●   What happens inside InnoDB:
    ●   Pages are marked as free (or the tablespace file is deleted)
    ●   A record is deleted from the dictionary



Dropped Table/Database
●   What to do?
    ●   kill -9 mysqld_safe and mysqld
    ●   Remount the MySQL partition read-only
        –   Or take an image
        –   Or stop other services writing to the disk
●   Fetch records from index pages
    (constraints_parser from the recovery toolkit)




Truncate table
●   What happens?
    ●   Equivalent to DROP/CREATE
    ●   Dictionary is updated, but index_id may be reused
●   What to do?
    ●   The same as after the DROP: kill mysqld, fetch
        records from the index




Wrong UPDATE statement
●   What happens?
    ●   If the new row is the same size, in-place update
        happens; otherwise – insert/delete
    ●   The old values go to UNDO space
    ●   roll_ptr points to the old value in the UNDO space
●   What to do?
    ●   kill -9 mysqld_safe, mysqld
    ●   However, no tool is available at the moment



Other wrongdoings
●   Removed ibdata1 file
    ●   Take mysqldump ASAP before MySQL is stopped
    ●   ibdconnect from the recovery toolkit
●   Wrong backups
    ●   innodb_force_recovery
    ●   Fetch records from index pages
●   You name it



Corrupt InnoDB tablespace
●   Hardware failures
●   OS or filesystem failures
●   InnoDB bugs
●   InnoDB tablespace corrupted by other
    processes
●   What to do?
    ●   innodb_force_recovery
    ●   fetch_data.sh
    ●   constraints_parser
5. InnoDB recovery tool




Features
●   A toolset to work with InnoDB at a low level
    ●   page_parser – scans a byte stream, finds InnoDB
        pages and sorts them by page type/index_id
    ●   constraints_parser – fetches records from InnoDB
        pages
    ●   ibdconnect – a tool to "connect" an .ibd file to
        the system tablespace
    ●   fetch_data.sh – fetches data from partially
        corrupted tables, choosing PK ranges


Recovery prerequisites
●   Media
    ●   ibdata1
    ●   *.ibd
    ●   HDD image
●   Table structure
    ●   SQL dump
    ●   *.frm files




How to get CREATE info from .frm files
●   CREATE TABLE `actor` (id int) ENGINE=InnoDB;
●   Stop MySQL and replace actor.frm with the original file
●   Run MySQL with innodb_force_recovery=4
●   SHOW CREATE TABLE actor;




Percona Data Recovery Tool for InnoDB
http://launchpad.net/percona-innodb-recovery-tool/
page_parser – splits an InnoDB tablespace into 16k pages
constraints_parser – scans a page and finds good records




page_parser
●   Accepts a file
    # ./page_parser -f /var/lib/mysql/ibdata1
●   Produces:
    # ll pages-1319190009
    drwxr-xr-x 16 root root 4096 Oct 21 05:40 FIL_PAGE_INDEX/
    drwxr-xr-x 2 root root 12288 Oct 21 05:40 FIL_PAGE_TYPE_BLOB/
    # ll pages-1319190009/FIL_PAGE_INDEX/
    drwxr-xr-x 2 root root 4096 Oct 21 05:40 0-1/
    drwxr-xr-x 2 root root 4096 Oct 21 05:40 0-3/
    drwxr-xr-x 2 root root 4096 Oct 21 05:40 0-18/
    drwxr-xr-x 2 root root 4096 Oct 21 05:40 0-19/
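
The grouping behind that directory layout can be sketched in Python. This is not the tool's implementation: it assumes the input is already aligned on 16 KB page boundaries, that FIL_PAGE_TYPE sits at byte 24 of the page with the value 17855 for index pages, and that PAGE_INDEX_ID sits 28 bytes into the page header. The function name is hypothetical.

```python
import struct
from collections import defaultdict

PAGE_SIZE = 16384
FIL_PAGE_INDEX = 17855               # assumed type value for B-tree index pages
FIL_PAGE_TYPE_OFFSET = 24            # after four 4-byte fields and the 8-byte LSN
PAGE_INDEX_ID_OFFSET = 38 + 28       # FIL header size + offset in page header


def group_pages(data):
    """Map 'high-low' index_id labels to lists of page numbers."""
    groups = defaultdict(list)
    for page_no in range(len(data) // PAGE_SIZE):
        base = page_no * PAGE_SIZE
        (page_type,) = struct.unpack_from(">H", data, base + FIL_PAGE_TYPE_OFFSET)
        if page_type != FIL_PAGE_INDEX:
            continue
        # Read the 8-byte index_id as two 32-bit halves for the label.
        high, low = struct.unpack_from(">II", data, base + PAGE_INDEX_ID_OFFSET)
        groups[f"{high}-{low}"].append(page_no)
    return dict(groups)
```

The labels match the directory names above, so `0-1` and `0-3` would collect the pages of the two dictionary tables.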


constraints_parser
●   Accepts a page or a directory with pages
    # ./bin/constraints_parser.SYS_TABLES -4Uf pages-1319185222/FIL_PAGE_INDEX/0-1
    SYS_TABLES   "SYS_FOREIGN"        11   -2147483644   1   0   0
    SYS_TABLES   "SYS_FOREIGN_COLS"   12   -2147483644   1   0
    SYS_TABLES   "test/t1"            16   3             1   0   0
●   Table structure is defined in
    "include/table_defs.h"
●   Prints a LOAD DATA INFILE statement to stderr



ibdconnect
●   Create empty InnoDB tablespace
●   Create the table:
    mysql> CREATE TABLE actor (
        actor_id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
        first_name VARCHAR(45) NOT NULL,
        last_name VARCHAR(45) NOT NULL,
        last_update TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
        PRIMARY KEY (actor_id),
        KEY idx_actor_last_name (last_name)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8;


ibdconnect
●   Update the InnoDB dictionary (MySQL must be down
    at this point)
    # ibdconnect -o /var/lib/mysql/ibdata1 \
        -f /var/lib/mysql/sakila/actor.ibd -d sakila -t actor
●   Fix InnoDB checksums:
    # ./innochecksum -f /var/lib/mysql/ibdata1
    # ./innochecksum -f /var/lib/mysql/ibdata1
●   Start MySQL and take mysqldump

aleksandr.kuzminsky@percona.com
        istvan.podor@percona.com




[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 

Data recovery talk on PLUK

  • 1. Data Recovery for MySQL Aleksandr Kuzminsky & Istvan Podor Percona Live London 2011
  • 2. Agenda 1. InnoDB files format overview 2. InnoDB dictionary (SYS_INDEXES, SYS_TABLES) 3. InnoDB Primary and Secondary indexes 4. Typical failure scenarios 5. InnoDB recovery tool www.percona.com
  • 3. 1. InnoDB format overview
  • 4. InnoDB table spaces
    ● Shared table space (ibdataX)
      ● Dictionary, rollback segment, undo space, insert buffer, doublewrite buffer
      ● Primary & secondary indexes
      ● External pages (BLOBs)
    ● A table space per table (ibdataX + *.ibd)
      ● Primary & secondary indexes
      ● External pages (BLOBs)
  • 5. InnoDB logs
    ● a.k.a. REDO logs, transaction logs
    ● ib_logfile[01]
    ● Contain changes to InnoDB pages; each entry carries: Space ID, Page ID, Data
  • 6. InnoDB Indexes. PRIMARY
    ● Table is a clustered index named PRIMARY
    ● B+ tree data structure; a node is a page
    ● Key is the primary key, or a unique key, or ROW_ID (6 bytes)
  • 7. InnoDB Indexes. Secondary
    ● Also a B+ tree structure
    ● Key is the indexed field(s), values are primary keys
      ● If the table is (id, first_name, last_name, birth_date),
      ● and the index is (last_name),
      ● the index structure stores (last_name, id)
  • 8. Index identifier (index_id)
    ● 8-byte integer, often written in two-number notation: 0 254
    ● Table name → index_id correspondence is stored in the InnoDB dictionary
    ● Visible in the table monitor output (see next slide)
    ● Also in I_S tables on Percona Server
  • 9. Table monitor output
    mysql> CREATE TABLE innodb_table_monitor(x int) engine=innodb
    In the error log:
    TABLE: name test/site_folders, id 0 119, columns 9, indexes 1, appr.rows 1
      COLUMNS: id: DATA_INT len 4 prec 0; name: type 12 len 765 prec 0; sites_count: DATA_INT len 4 prec 0;
               created_at: DATA_INT len 8 prec 0; updated_at: DATA_INT len 8 prec 0;
               DB_ROW_ID: DATA_SYS prtype 256 len 6 prec 0; DB_TRX_ID: DATA_SYS prtype 257 len 6 prec 0;
               DB_ROLL_PTR: DATA_SYS prtype 258 len 7 prec 0;
      INDEX: name PRIMARY, id 0 254, fields 1/7, type 3
        root page 271, appr.key vals 1, leaf pages 1, size pages 1
        FIELDS:  id DB_TRX_ID DB_ROLL_PTR name sites_count created_at updated_at
  • 10. InnoDB page format
    An InnoDB page is 16 kilobytes
    Section                      Size, bytes
    FIL_HEADER                   38
    PAGE_HEADER                  56
    INFIMUM + SUPREMUM records   varies
    User records                 varies
    Free space
    Page directory               varies
    FIL_TRAILER                  fixed
  • 11. Fil Header
    ● Common for all types of pages
    Name                      Size  Remarks
    FIL_PAGE_SPACE            4     Space ID the page is in, or checksum
    FIL_PAGE_OFFSET           4     ordinal page number from start of space
    FIL_PAGE_PREV             4     offset of previous page in key order
    FIL_PAGE_NEXT             4     offset of next page in key order
    FIL_PAGE_LSN              8     log serial number of page's latest log record
    FIL_PAGE_TYPE             2     currently defined types are: FIL_PAGE_INDEX, FIL_PAGE_UNDO_LOG, FIL_PAGE_INODE, FIL_PAGE_IBUF_FREE_LIST
    FIL_PAGE_FILE_FLUSH_LSN   8     "the file has been flushed to disk at least up to this lsn" (log serial number), valid only on the first page of the file
    FIL_PAGE_ARCH_LOG_NO      4     /* starting from 4.1.x this contains the space id of the page */
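The table above maps directly onto a fixed binary layout, so a minimal Python sketch can unpack it. The helper and field names are illustrative and not part of the recovery toolkit; it assumes big-endian integers, which is how InnoDB stores them on disk.

```python
import struct
from collections import namedtuple

# Field order and sizes follow the slide: 4+4+4+4+8+2+8+4 = 38 bytes.
FIL_HEADER = struct.Struct(">IIIIQHQI")
FilHeader = namedtuple(
    "FilHeader",
    "space_or_checksum page_no prev_page next_page lsn page_type flush_lsn space_id",
)

def parse_fil_header(page: bytes) -> FilHeader:
    """Unpack the first 38 bytes of an InnoDB page (hypothetical helper)."""
    return FilHeader(*FIL_HEADER.unpack_from(page, 0))
```

Summing the sizes in the table gives the header length used here, 38 bytes.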
  • 12. Page header
    ● Only for index pages
    Name               Size  Remarks
    PAGE_N_DIR_SLOTS   2     number of directory slots in the Page Directory part; initial value = 2
    PAGE_HEAP_TOP      2     record pointer to first record in heap
    PAGE_N_HEAP        2     number of heap records; initial value = 2
    PAGE_FREE          2     record pointer to first free record
    PAGE_GARBAGE       2     "number of bytes in deleted records"
    PAGE_LAST_INSERT   2     record pointer to the last inserted record
    PAGE_DIRECTION     2     either PAGE_LEFT, PAGE_RIGHT, or PAGE_NO_DIRECTION
    PAGE_N_DIRECTION   2     number of consecutive inserts in the same direction, e.g. "last 5 were all to the left"
    PAGE_N_RECS        2     number of user records
    PAGE_MAX_TRX_ID    8     the highest ID of a transaction which might have changed a record on the page (only set for secondary indexes)
    PAGE_LEVEL         2     level within the index (0 for a leaf page)
    PAGE_INDEX_ID      8     identifier of the index the page belongs to
    PAGE_BTR_SEG_LEAF  10    "file segment header for the leaf pages in a B-tree" (this is irrelevant here)
    PAGE_BTR_SEG_TOP   10    "file segment header for the non-leaf pages in a B-tree" (this is irrelevant here)
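A sketch of reading these 56 bytes, assuming the page header starts right after a 38-byte Fil header and that integers are big-endian. The dictionary keys and both function names are made up for illustration. The second helper prints an index_id in the two-number notation used elsewhere in the talk.

```python
import struct

FIL_HEADER_SIZE = 38                          # page header follows the Fil header
PAGE_HEADER = struct.Struct(">9HQHQ10s10s")   # 9*2 + 8 + 2 + 8 + 10 + 10 = 56 bytes

FIELD_NAMES = [
    "n_dir_slots", "heap_top", "n_heap", "free", "garbage", "last_insert",
    "direction", "n_direction", "n_recs", "max_trx_id", "level", "index_id",
    "btr_seg_leaf", "btr_seg_top",
]

def parse_page_header(page: bytes) -> dict:
    """Return the page header fields of an index page as a dict."""
    values = PAGE_HEADER.unpack_from(page, FIL_HEADER_SIZE)
    return dict(zip(FIELD_NAMES, values))

def index_id_pair(index_id: int) -> str:
    """Split the 8-byte id into high and low 32-bit halves, e.g. 254 -> '0-254'."""
    return f"{index_id >> 32}-{index_id & 0xFFFFFFFF}"
```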
  • 13. How to check row format?
    ● Can be either COMPACT or REDUNDANT
    ● 0 stands for REDUNDANT, 1 for COMPACT
    ● The flag is the highest bit of PAGE_N_HEAP in the page header
    ● dc -e "2o `hexdump -C d pagefile | grep 00000020 | awk '{ print $12}'` p" | sed 's/./& /g' | awk '{ print $1}'
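The same check can be sketched in Python. The offset is derived from the two tables above (38-byte Fil header, then PAGE_N_DIR_SLOTS and PAGE_HEAP_TOP at 2 bytes each); the function name is hypothetical.

```python
import struct

# Fil header (38) + PAGE_N_DIR_SLOTS (2) + PAGE_HEAP_TOP (2) = byte 42 of the page
PAGE_N_HEAP_OFFSET = 38 + 4

def row_format(page: bytes) -> str:
    """Report the row format from the highest bit of PAGE_N_HEAP."""
    (n_heap,) = struct.unpack_from(">H", page, PAGE_N_HEAP_OFFSET)
    return "COMPACT" if n_heap & 0x8000 else "REDUNDANT"
```

Byte 42 is the 11th byte of the hexdump line starting at 00000020, which is why the shell one-liner picks awk field $12 (field $1 is the offset column).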
  • 14. Rows in an InnoDB page
    ● Single linked list: every record holds a "next" pointer and its data
    ● The first record is INFIMUM
    ● The last is SUPREMUM (its next pointer is NULL)
    ● Ordered by PK
    INFIMUM → 10 → 20 → 30 → 40 → 50 → SUPREMUM → NULL
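A minimal sketch of following that list in a COMPACT page. Several details here are assumptions not stated on the slides: the infimum record origin is taken to be at byte 99, the 2-byte "next" field sits directly before each record origin, and it holds an offset relative to the current record. REDUNDANT pages differ and are not handled.

```python
import struct

PAGE_SIZE = 16 * 1024
INFIMUM_COMPACT = 99   # assumed origin of the infimum record in a COMPACT page

def walk_compact_records(page: bytes, limit: int = 10000):
    """Yield record origins after infimum in key order; the last one is supremum."""
    offset = INFIMUM_COMPACT
    for _ in range(limit):                  # guard against loops in corrupt pages
        (rel,) = struct.unpack_from(">h", page, offset - 2)   # relative "next"
        if rel == 0:                        # supremum has no successor
            return
        offset = (offset + rel) % PAGE_SIZE
        yield offset
```

The walk returns records in primary key order even when they are stored physically in insert order.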
  • 15. Records are saved in insert order
    insert into t1 values(10, 'aaa');
    insert into t1 values(30, 'ccc');
    insert into t1 values(20, 'bbb');
    JG....................N<E.......
    ................................
    .............................2..
    ...infimum......supremum......6.
    ........)....2..aaa.............
    ...*....2..ccc.... ...........+.
    ...2..bbb.......................
    ................................
  • 16. Row format
    CREATE TABLE `t1` (
      `ID` int(11) unsigned NOT NULL,
      `NAME` varchar(120),
      `N_FIELDS` int(10),
      PRIMARY KEY (`ID`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1
    Name               Size
    Offsets (lengths)  1 byte for small types, 2 bytes for longer types
    Extra bytes        5 bytes COMPACT, 6 bytes REDUNDANT
    Field content      varies
  • 17. Extra bytes (REDUNDANT)
    Name             Size     Description
    record_status    2 bits   _ORDINARY, _NODE_PTR, _INFIMUM, _SUPREMUM
    deleted_flag     1 bit    1 if record is deleted
    min_rec_flag     1 bit    1 if record is predefined minimum record
    n_owned          4 bits   number of records owned by this record
    heap_no          13 bits  record's order number in heap of index page
    n_fields         10 bits  number of fields in this record, 1 to 1023
    1byte_offs_flag  1 bit    1 if each Field Start Offset is 1 byte long (this item is also called the "short" flag)
    next             16 bits  pointer to next record in page
  • 18. Extra bytes (COMPACT)
    Name                                       Size, bits  Description
    record_status, deleted_flag, min_rec_flag  4           4 bits used to delete-mark a record and to mark a predefined minimum record in alphabetical order
    n_owned                                    4           the number of records owned by this record (this term is explained in page0page.h)
    heap_no                                    13          the order number of this record in the heap of the index page
    record type                                3           000=conventional, 001=node pointer (inside B-tree), 010=infimum, 011=supremum, 1xx=reserved
    next                                       16          a relative pointer to the next record in the page
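The 40 bits above can be pulled apart with shifts and masks. This is a sketch with a hypothetical function name; it assumes the fields are packed in the order listed, most significant bit first, and that within the first 4-bit group the delete mark is the bit with value 0x20 of the first byte and the minimum-record mark is 0x10.

```python
def decode_compact_extra(extra: bytes) -> dict:
    """Decode the 5 extra bytes that precede a COMPACT record's fields."""
    if len(extra) != 5:
        raise ValueError("COMPACT extra bytes are exactly 5 bytes long")
    bits = int.from_bytes(extra, "big")        # 40 bits, first field is highest
    return {
        "deleted_flag": bool((bits >> 37) & 0x1),   # assumed position (0x20 of byte 0)
        "min_rec_flag": bool((bits >> 36) & 0x1),   # assumed position (0x10 of byte 0)
        "n_owned": (bits >> 32) & 0xF,
        "heap_no": (bits >> 19) & 0x1FFF,
        "record_type": (bits >> 16) & 0x7,
        "next": bits & 0xFFFF,
    }
```

Checking deleted_flag this way is what makes it possible to tell delete-marked rows from live ones after a wrong DELETE.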
  • 19. Example: Redundant row
    A row: (10, ‘abcdef’, 20)
    Actually stored as (10, TRX_ID, ROLL_PTR, ‘abcdef’, 20)
    Offsets:      4 6 7 6 4
    Extra bytes:  ... next
    Fields:       00 00 00 0A | ... | ... | abcdef | 80 00 00 14
  • 20. The same row, but COMPACT
    A row: (10, ‘abcdef’, 20)
    Actually stored as (10, TRX_ID, ROLL_PTR, ‘abcdef’, 20)
    Offsets:      6
    Extra bytes:  ... next
    Fields:       00 00 00 0A | ... | ... | abcdef | 80 00 00 14
    13% smaller!
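The 13% figure can be reproduced from the sizes on the two slides. This is a worked sketch that follows the slides' simplified picture: one 1-byte offset per field for REDUNDANT, a single length byte for the VARCHAR in COMPACT, and no NULL bitmap (an assumption taken from the diagram, which shows only the "6").

```python
FIELD_SIZES = [4, 6, 7, 6, 4]   # ID, TRX_ID, ROLL_PTR, 'abcdef', N_FIELDS

data = sum(FIELD_SIZES)                         # 27 bytes of field content
redundant = len(FIELD_SIZES) * 1 + 6 + data     # 5 offsets + 6 extra bytes
compact = 1 + 5 + data                          # 1 length byte + 5 extra bytes
saving = 1 - compact / redundant

print(redundant, compact, f"{saving:.0%}")      # prints: 38 33 13%
```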
  • 21. Data types (highlights)
    ● Integer and float numbers are fixed size
    ● Strings:
      ● VARCHAR(x) – variable
      ● CHAR(x) – fixed if not UTF-8
    ● Date, time – fixed size
    ● DECIMAL
      ● Stored as a string before 5.0.3
      ● Binary format afterward
  • 22. Data types (BLOBs)
    ● Field length (the so-called offset) is one or two bytes long
    ● If record size < (UNIV_PAGE_SIZE/2 - 200), about 7k, the record is stored internally (in a PK page)
    ● Otherwise 768 bytes are stored in-page and the rest in an external page
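The threshold rule can be written out as a small sketch, assuming the default 16 KB page. The function name and the returned strings are illustrative only.

```python
UNIV_PAGE_SIZE = 16 * 1024
IN_PAGE_PREFIX = 768            # bytes kept in the PK page for an external BLOB

def blob_storage(record_size: int) -> str:
    """Apply the slide's rule for where a record's BLOB data ends up."""
    limit = UNIV_PAGE_SIZE // 2 - 200       # 7992 bytes, the "~7k" on the slide
    if record_size < limit:
        return "in-page"
    return f"{IN_PAGE_PREFIX}-byte prefix in-page, rest external"
```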
  • 24. Why are SYS_* tables needed?
    ● Correspondence “table name” → “index_id”
    ● Storage for other internal information
  • 25. SYS_* structure
    Always REDUNDANT format!
    SYS_INDEXES (index_id = 0-3):
    CREATE TABLE `SYS_INDEXES` (
      `TABLE_ID` bigint(20) unsigned NOT NULL default '0',
      `ID` bigint(20) unsigned NOT NULL default '0',
      `NAME` varchar(120) default NULL,
      `N_FIELDS` int(10) unsigned default NULL,
      `TYPE` int(10) unsigned default NULL,
      `SPACE` int(10) unsigned default NULL,
      `PAGE_NO` int(10) unsigned default NULL,
      PRIMARY KEY (`TABLE_ID`,`ID`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1
    SYS_TABLES (index_id = 0-1):
    CREATE TABLE `SYS_TABLES` (
      `NAME` varchar(255) NOT NULL default '',
      `ID` bigint(20) unsigned NOT NULL default '0',
      `N_COLS` int(10) unsigned default NULL,
      `TYPE` int(10) unsigned default NULL,
      `MIX_ID` bigint(20) unsigned default NULL,
      `MIX_LEN` int(10) unsigned default NULL,
      `CLUSTER_NAME` varchar(255) default NULL,
      `SPACE` int(10) unsigned default NULL,
      PRIMARY KEY (`NAME`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1
  • 26. Example: SYS_*
    SYS_TABLES
    NAME            ID  …
    "sakila/actor"  40  8 1 0 0 NULL 0
    SYS_INDEXES
    TABLE_ID  ID      NAME                   …
    40        196389  "PRIMARY"              2 3 0 21031026
    40        196390  "idx_actor_last_name"  1 0 0 21031028
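The “table name” → “index_id” lookup is a two-step join: SYS_TABLES gives the table ID, SYS_INDEXES gives the index IDs for it. A minimal sketch using the example rows above, with plain Python lists standing in for records recovered from the dictionary:

```python
SYS_TABLES = [
    # (NAME, ID)
    ("sakila/actor", 40),
]
SYS_INDEXES = [
    # (TABLE_ID, ID, NAME)
    (40, 196389, "PRIMARY"),
    (40, 196390, "idx_actor_last_name"),
]

def index_ids(table_name: str) -> dict:
    """Map index name -> index_id for one table, joining on the table ID."""
    table_ids = {tid for name, tid in SYS_TABLES if name == table_name}
    return {name: idx for tid, idx, name in SYS_INDEXES if tid in table_ids}

print(index_ids("sakila/actor"))
```

The PRIMARY entry is the one that matters for recovery, since the clustered index holds the full rows.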
  • 27. 3. InnoDB Primary and Secondary indexes
  • 28. PRIMARY Index
    The table:
    CREATE TABLE `t1` (
      `ID` int(11),
      `NAME` varchar(120),
      `N_FIELDS` int(10),
      PRIMARY KEY (`ID`),
      KEY `NAME` (`NAME`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1
    Fields in the PK:
    1. ID
    2. DB_TRX_ID
    3. DB_ROLL_PTR
    4. NAME
    5. N_FIELDS
  • 29. Secondary Index
    The table:
    CREATE TABLE `t1` (
      `ID` int(11),
      `NAME` varchar(120),
      `N_FIELDS` int(10),
      PRIMARY KEY (`ID`),
      KEY `NAME` (`NAME`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1
    Fields in the SK:
    1. NAME
    2. ID (the primary key)
  • 30. 4. Typical failure scenarios
  • 31. Wrong DELETE
    ● DELETE FROM `actor`;
    ● What happens?
      ● Rows are marked as deleted
      ● TRX_ID and ROLL_PTR are updated
      ● Rows remain in the page until the UNDO log is purged
  • 32. Wrong DELETE
    ● What to do first?
      ● kill -9 mysqld_safe ASAP!
      ● kill -9 mysqld ASAP
      ● Find the pages which belong to the affected table
      ● Fetch records from the pages (constraints_parser from the recovery toolkit)
  • 33. Dropped Table/Database
    ● DROP TABLE actor;
    ● Very often DROP and then CREATE
    ● Bad because .frm files are removed
    ● Even worse with innodb_file_per_table
    ● What happens inside InnoDB:
      ● Pages are marked as free (or the tablespace is deleted)
      ● A record is deleted from the dictionary
  • 34. Dropped Table/Database
    ● What to do?
      ● kill -9 mysqld_safe and mysqld
      ● Remount the MySQL partition read-only
        – Or take an image
        – Or stop other services writing to the disk
      ● Fetch records from index pages (constraints_parser from the recovery toolkit)
  • 35. Truncate table
    ● What happens?
      ● Equivalent to DROP/CREATE
      ● Dictionary is updated, but index_id may be reused
    ● What to do?
      ● The same as after the DROP: kill mysqld, fetch records from the index
  • 36. Wrong UPDATE statement
    ● What happens?
      ● If the new row is the same size, an in-place update happens; otherwise insert/delete
      ● The old values go to the UNDO space
      ● roll_ptr points to the old value in the UNDO space
    ● What to do?
      ● kill -9 mysqld_safe, mysqld
      ● However, no tool is available yet
  • 37. Other wrongdoings
    ● Removed ibdata1 file
      ● Take a mysqldump ASAP, before MySQL is stopped
      ● ibdconnect from the recovery toolkit
    ● Wrong backups
      ● innodb_force_recovery
      ● Fetch records from index pages
    ● You name it
  • 38. Corrupt InnoDB tablespace
    ● Hardware failures
    ● OS or filesystem failures
    ● InnoDB bugs
    ● InnoDB tablespace corrupted by other processes
    ● What to do?
      ● innodb_force_recovery
      ● fetch_data.sh
      ● constraints_parser
  • 39. 5. InnoDB recovery tool
  • 40. Features
    ● A toolset to work with InnoDB at a low level
    ● page_parser – scans a byte stream, finds InnoDB pages and sorts them by page type/index_id
    ● constraints_parser – fetches records from InnoDB pages
    ● ibdconnect – a tool to “connect” an .ibd file to the system tablespace
    ● fetch_data.sh – fetches data from partially corrupted tables, choosing PK ranges
  • 41. Recovery prerequisites
    ● Media
      ● ibdata1
      ● *.ibd
      ● HDD image
    ● Table structure
      ● SQL dump
      ● *.FRM files
  • 42. How to get CREATE info from .frm files
    ● CREATE TABLE `actor` (id int) ENGINE=InnoDB
    ● Stop MySQL and replace actor.frm
    ● Run MySQL with innodb_force_recovery=4
    ● SHOW CREATE TABLE actor;
  • 43. Percona Data Recovery Tool for InnoDB
    http://launchpad.net/percona-innodb-recovery-tool/
    ● page_parser – splits an InnoDB tablespace into 16k pages
    ● constraints_parser – scans a page and finds good records
  • 44. page_parser
    ● Accepts a file:
      # ./page_parser -f /var/lib/mysql/ibdata1
    ● Produces:
      # ll pages-1319190009
      drwxr-xr-x 16 root root  4096 Oct 21 05:40 FIL_PAGE_INDEX/
      drwxr-xr-x  2 root root 12288 Oct 21 05:40 FIL_PAGE_TYPE_BLOB/
      # ll pages-1319190009/FIL_PAGE_INDEX/
      drwxr-xr-x 2 root root 4096 Oct 21 05:40 0-1/
      drwxr-xr-x 2 root root 4096 Oct 21 05:40 0-3/
      drwxr-xr-x 2 root root 4096 Oct 21 05:40 0-18/
      drwxr-xr-x 2 root root 4096 Oct 21 05:40 0-19/
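The grouping that produces directories such as 0-1 and 0-3 can be approximated in a few lines. This is not the page_parser implementation, only a sketch of the idea under stated assumptions: pages are aligned on 16 KB boundaries (the real tool scans an arbitrary byte stream), index pages carry the type value 17855, and the offsets come from the Fil header and page header layouts given earlier.

```python
import struct
from collections import Counter

PAGE_SIZE = 16 * 1024
FIL_PAGE_TYPE_OFFSET = 24        # after SPACE, OFFSET, PREV, NEXT (4 each) and LSN (8)
PAGE_INDEX_ID_OFFSET = 38 + 28   # Fil header + page header fields before PAGE_INDEX_ID
FIL_PAGE_INDEX = 17855           # assumed numeric value of the index page type

def count_index_pages(data: bytes) -> Counter:
    """Count index pages per index_id, keyed in 'high-low' notation."""
    counts = Counter()
    for start in range(0, len(data) - PAGE_SIZE + 1, PAGE_SIZE):
        page = data[start:start + PAGE_SIZE]
        (page_type,) = struct.unpack_from(">H", page, FIL_PAGE_TYPE_OFFSET)
        if page_type != FIL_PAGE_INDEX:
            continue                         # skip undo, inode, BLOB pages etc.
        (index_id,) = struct.unpack_from(">Q", page, PAGE_INDEX_ID_OFFSET)
        counts[f"{index_id >> 32}-{index_id & 0xFFFFFFFF}"] += 1
    return counts
```

Run against the contents of ibdata1, the keys of the returned counter would correspond to the directory names under FIL_PAGE_INDEX/.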
  • 45. constraints_parser
    ● Accepts a page or a directory with pages
      # ./bin/constraints_parser.SYS_TABLES -4Uf pages-1319185222/FIL_PAGE_INDEX/0-1
      SYS_TABLES "SYS_FOREIGN" 11 -2147483644 1 0 0
      SYS_TABLES "SYS_FOREIGN_COLS" 12 -2147483644 1 0
      SYS_TABLES "test/t1" 16 3 1 0 0
    ● Table structure is defined in "include/table_defs.h"
    ● Prints a LOAD DATA INFILE statement to stderr
  • 46. ibdconnect
    ● Create an empty InnoDB tablespace
    ● Create the table:
      mysql> CREATE TABLE actor (
        actor_id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
        first_name VARCHAR(45) NOT NULL,
        last_name VARCHAR(45) NOT NULL,
        last_update TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
        PRIMARY KEY (actor_id),
        KEY idx_actor_last_name (last_name)
      ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
  • 47. ibdconnect
    ● Update the InnoDB dictionary (MySQL is down now):
      # ibdconnect -o /var/lib/mysql/ibdata1 -f /var/lib/mysql/sakila/actor.ibd -d sakila -t actor
    ● Fix InnoDB checksums:
      # ./innochecksum -f /var/lib/mysql/ibdata1
      # ./innochecksum -f /var/lib/mysql/ibdata1
    ● Start MySQL and take a mysqldump
  • 51. aleksandr.kuzminsky@percona.com istvan.podor@percona.com We're Hiring! www.percona.com/about-us/careers/