   Overview
   Sub-systems
   Kernel API & EBA subsystem
   Wear-leveling subsystem
   Scanning subsystem




   UBI – Unsorted Block Images

   A volume management system
       Provides static and dynamic volumes
       Wear-leveling across the whole flash device
       Transparent bad-block management
       Read-disturbance handling

   Merged into the mainline Linux kernel in v2.6.22
[Figure: UBI volumes on top of an MTD device]
   MTD Device
       MTD Partition: Bootloader
       MTD Partition (UBI Device): PEBs 0, 1, 2, ..., N
           Static UBI Volume (Kernel Image): LEBs (0,0), (0,1), ..., (0,P)
           Dynamic UBI Volume (Root Filesystem, UBIFS): LEBs (1,0), (1,1), ..., (1,Q)
[Figure: UBI sub-systems, top layer first]
   UBI Kernel API
   UBI Initialization | UBI Erase Block Association Subsystem
   UBI Scanning Subsystem | UBI Wear-leveling Subsystem
   UBI I/O Subsystem
   MTD Layer
   Read from an unmapped LEB
   Read from a mapped LEB
   Write to a mapped LEB
   Write to an unmapped LEB
   Map a LEB
   Unmap a LEB
   Erase a LEB

   Functions by layer
     Filesystem: fs_read(), fs_write()
     UBI KAPI:   ubi_leb_read(), ubi_leb_write(), ubi_leb_map(),
                 ubi_leb_erase(), ubi_leb_unmap()
     UBI EBA:    ubi_eba_read_leb(), ubi_eba_write_leb(), ubi_eba_map_leb(),
                 ubi_eba_copy_leb(), ubi_eba_unmap_leb()
     UBI WL:     ubi_wl_get_peb(), ubi_wl_put_peb(), ubi_wl_scrub_peb(),
                 ubi_wl_flush()
     UBI IO:     ubi_io_read(), ubi_io_write(), ubi_io_sync_erase(),
                 ubi_io_mark_bad(), ubi_io_read_data(), ubi_io_write_data(),
                 ubi_io_read_ec_hdr(), ubi_io_write_ec_hdr(),
                 ubi_io_read_vid_hdr(), ubi_io_write_vid_hdr()
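The LEB operations listed above can be illustrated with a toy LEB-to-PEB table. This is only a sketch: every `toy_` name and size is hypothetical, the allocator is a plain counter standing in for ubi_wl_get_peb(), and it assumes that an unmapped LEB reads back as 0xFF bytes, as erased flash does. Unlike real flash, the toy lets a mapped LEB be overwritten in place.

```c
#include <assert.h>
#include <string.h>

#define TOY_LEB_COUNT 4      /* hypothetical volume size in LEBs */
#define TOY_PEB_COUNT 8      /* hypothetical device size in PEBs */
#define TOY_LEB_SIZE  16     /* hypothetical LEB size in bytes */
#define TOY_UNMAPPED  (-1)

struct toy_volume {
    int eba_tbl[TOY_LEB_COUNT];                       /* LEB -> PEB, or TOY_UNMAPPED */
    unsigned char flash[TOY_PEB_COUNT][TOY_LEB_SIZE]; /* simulated PEB contents */
    int next_free_peb;                                /* stand-in for ubi_wl_get_peb() */
};

static void toy_init(struct toy_volume *v)
{
    for (int i = 0; i < TOY_LEB_COUNT; i++)
        v->eba_tbl[i] = TOY_UNMAPPED;
    memset(v->flash, 0xFF, sizeof(v->flash));         /* erased flash reads as 0xFF */
    v->next_free_peb = 0;
}

/* Read a whole LEB; an unmapped LEB reads back as all 0xFF bytes. */
static void toy_leb_read(const struct toy_volume *v, int lnum, unsigned char *buf)
{
    assert(lnum >= 0 && lnum < TOY_LEB_COUNT);
    int pnum = v->eba_tbl[lnum];

    if (pnum == TOY_UNMAPPED)
        memset(buf, 0xFF, TOY_LEB_SIZE);
    else
        memcpy(buf, v->flash[pnum], TOY_LEB_SIZE);
}

/* Write a whole LEB; an unmapped LEB is mapped to a fresh PEB first. */
static int toy_leb_write(struct toy_volume *v, int lnum, const unsigned char *buf)
{
    assert(lnum >= 0 && lnum < TOY_LEB_COUNT);
    if (v->eba_tbl[lnum] == TOY_UNMAPPED) {
        if (v->next_free_peb >= TOY_PEB_COUNT)
            return -1;                                /* no free PEB left */
        v->eba_tbl[lnum] = v->next_free_peb++;
    }
    memcpy(v->flash[v->eba_tbl[lnum]], buf, TOY_LEB_SIZE);
    return 0;
}

/* Unmap a LEB; real UBI would also schedule the old PEB for erasure. */
static void toy_leb_unmap(struct toy_volume *v, int lnum)
{
    assert(lnum >= 0 && lnum < TOY_LEB_COUNT);
    v->eba_tbl[lnum] = TOY_UNMAPPED;
}
```

Calling toy_leb_read() before any write fills the buffer with 0xFF; after toy_leb_write() the same call returns the written data, and after toy_leb_unmap() it returns 0xFF again.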
   Responsible for
     Management of PEBs
     Wear-leveling
     Scrubbing (read disturbance)

   Works in terms of PEBs and erase counters
   Knows nothing about LEBs, volumes, etc
   Internal data structures
     Four RB-trees and one queue

   External interfaces
       ubi_wl_get_peb()
       ubi_wl_put_peb()
       ubi_wl_scrub_peb()
       ubi_wl_flush()

All good PEBs are managed with four RB-trees and one queue
drivers/mtd/ubi/ubi.h
 struct ubi_device {
   ...
   struct rb_root used;
   struct rb_root erroneous;
   struct rb_root free;
   struct rb_root scrub;
   struct list_head pq[UBI_PROT_QUEUE_LEN];
   ...
 };

[Figure: good PEBs, each shown as an (ec, pnum) pair]
   Free PEBs
       free:      (1,1) (1,7) (2,5) (3,2) (7,8)
   In-use PEBs
       pq:        (3,4)
       used:      (3,9) (6,6) (8,3)
       scrub:     (empty)
       erroneous: (empty)

 Note: These RB-trees use (ec, pnum) pairs as keys
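The (ec, pnum) key ordering can be sketched with a comparison function. The example below sorts a plain array with qsort() instead of maintaining an RB-tree, and the `toy_` names are hypothetical; it only illustrates that entries are ordered by erase counter first and by PEB number second.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a wear-leveling entry: erase counter + PEB number. */
struct toy_wl_entry {
    int ec;
    int pnum;
};

/* Order by erase counter first, then by PEB number to break ties. */
static int toy_wl_cmp(const void *a, const void *b)
{
    const struct toy_wl_entry *x = a, *y = b;

    if (x->ec != y->ec)
        return (x->ec > y->ec) - (x->ec < y->ec);
    return (x->pnum > y->pnum) - (x->pnum < y->pnum);
}

/* Sort entries in place so that index 0 is the least worn PEB. */
static void toy_wl_sort(struct toy_wl_entry *e, size_t n)
{
    assert(e != NULL && n > 0);
    qsort(e, n, sizeof(*e), toy_wl_cmp);
}
```

Sorting the free PEBs from the figure in any starting order yields (1,1), (1,7), (2,5), (3,2), (7,8), so the least worn PEB is always first.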
drivers/mtd/ubi/wl.c
int ubi_wl_get_peb(struct ubi_device *ubi, int dtype);
int ubi_wl_put_peb(struct ubi_device *ubi, int pnum, int torture);
int ubi_wl_scrub_peb(struct ubi_device *ubi, int pnum);
int ubi_wl_flush(struct ubi_device *ubi);

[Figure: the four entry points next to the PEB structures they operate on, with ubi_thread below]
   free:      (1,1) (1,7) (2,5) (3,2) (7,8)
   pq:        (3,4)
   used:      (3,9) (6,6) (8,3)
   scrub:     (empty)
   erroneous: (empty)
drivers/mtd/ubi/wl.c
 int ubi_wl_get_peb(struct ubi_device *ubi, int dtype);

1. Pick a PEB from the free RB-tree according to the hint @dtype
     • longterm
     • shortterm
     • unknown

2. Move the picked PEB to the pq queue
     • Why pq rather than used?
       To keep newly allocated PEBs from being moved due to wear-leveling.

[Figure: shortterm points at the low-EC end of the free tree, unknown at the middle, longterm at the high-EC end]
   free:      (1,1) (1,7) (2,5) (3,2) (7,8)
   pq:        (3,4)
   used:      (3,9) (6,6) (8,3)
   scrub:     (empty)
   erroneous: (empty)
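How the @dtype hint could steer the choice is sketched below. This is a simplification under stated assumptions: the free PEBs are held in an array sorted by ascending (ec, pnum), short-lived data gets the least worn PEB, long-lived data gets the most worn one, and an unknown lifetime gets one from the middle. The `toy_` names are hypothetical, and the kernel code also limits how far above the lowest erase counter it will go, which this sketch omits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the three data-type hints. */
enum toy_dtype { TOY_LONGTERM, TOY_SHORTTERM, TOY_UNKNOWN };

/*
 * Return the index to take from a free list of n entries that is
 * sorted by ascending (ec, pnum).
 */
static size_t toy_pick_free(size_t n, enum toy_dtype dtype)
{
    assert(n > 0);
    switch (dtype) {
    case TOY_SHORTTERM:
        return 0;        /* least worn: the data will be erased again soon */
    case TOY_LONGTERM:
        return n - 1;    /* most worn: the data will sit there for a long time */
    default:
        return n / 2;    /* unknown lifetime: somewhere in the middle */
    }
}
```

With the five free PEBs shown above, the sketch selects (1,1) for shortterm, (2,5) for unknown and (7,8) for longterm.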
drivers/mtd/ubi/wl.c
 int ubi_wl_scrub_peb(struct ubi_device *ubi, int pnum);

1. Move the PEB @pnum from pq/used to scrub
2. Schedule a wear-leveling request

ubi_thread: "Besides wear-leveling, I also take care of scrubbing."

[Figure: state of the PEB structures]
   free:      (1,7) (2,5) (3,2) (7,8)
   pq:        (3,4) (1,1)
   used:      (3,9) (6,6) (8,3)
   scrub:     (empty)
   erroneous: (empty)
drivers/mtd/ubi/wl.c
int ubi_wl_put_peb(struct ubi_device *ubi, int pnum, int torture);

1. Remove the PEB @pnum from one of the in-use RB-trees or from pq.
2. Schedule the PEB @pnum for erasure.
3. When the erasure completes without error, the PEB is put back into
   the free RB-tree.

ubi_thread: "Again, the erasure will be delegated to me."

[Figure: state of the PEB structures]
   free:      (1,7) (2,5) (3,2) (7,8) (6,6)
   pq:        (3,4) (1,1)
   used:      (3,9) (6,6)
   scrub:     (8,3)
   erroneous: (empty)
drivers/mtd/ubi/wl.c
int ubi_wl_flush(struct ubi_device *ubi);

drivers/mtd/ubi/wl.c
struct ubi_work {
     struct list_head list;
     int (*func)(struct ubi_device *ubi, struct ubi_work *wrk, int cancel);

     /* The below fields are only relevant to erasure works */
     struct ubi_wl_entry *e;
     int torture;
};

1. Flush all pending works

[Figure: ubi_thread serving a list of ubi_work items whose func is either erase_worker() or wear_leveling_worker()]
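The pattern of queued works with a function pointer can be sketched as follows. It is a minimal user-space model, not the kernel code: the `toy_` types are hypothetical, a singly linked list replaces struct list_head, and there is no locking or background thread.

```c
#include <assert.h>
#include <stddef.h>

struct toy_dev;

/* Hypothetical, simplified counterpart of struct ubi_work. */
struct toy_work {
    struct toy_work *next;
    int (*func)(struct toy_dev *dev, struct toy_work *wrk, int cancel);
};

struct toy_dev {
    struct toy_work *head;   /* oldest pending work */
    struct toy_work *tail;   /* newest pending work */
    int works_done;          /* bookkeeping for the example worker */
};

/* Append a work to the end of the pending list. */
static void toy_schedule(struct toy_dev *dev, struct toy_work *wrk)
{
    assert(wrk->func != NULL);
    wrk->next = NULL;
    if (dev->tail)
        dev->tail->next = wrk;
    else
        dev->head = wrk;
    dev->tail = wrk;
}

/* Run every pending work in FIFO order; stop at the first error. */
static int toy_flush(struct toy_dev *dev)
{
    while (dev->head) {
        struct toy_work *wrk = dev->head;

        dev->head = wrk->next;
        if (!dev->head)
            dev->tail = NULL;
        int err = wrk->func(dev, wrk, 0);
        if (err)
            return err;
    }
    return 0;
}

/* Example worker: only counts how often it ran. */
static int toy_count_worker(struct toy_dev *dev, struct toy_work *wrk, int cancel)
{
    (void)wrk;
    if (!cancel)
        dev->works_done++;
    return 0;
}
```

Scheduling two works that use toy_count_worker() and then calling toy_flush() leaves the list empty and works_done at 2.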
drivers/mtd/ubi/wl.c
static int wear_leveling_worker(struct ubi_device *ubi,
                struct ubi_work *wrk, int cancel);

static struct ubi_wl_entry *find_wl_entry(struct rb_root *root, int max);

if (!free || (!scrub && !used))
   return;

if (scrub) {
   e1 = pick the least worn out PEB from @scrub
   e2 = find_wl_entry(free, WL_FREE_MAX_DIFF)
} else {
   e1 = pick the least worn out PEB from @used
   e2 = find_wl_entry(free, WL_FREE_MAX_DIFF)

   if ((e2->ec - e1->ec) < UBI_WL_THRESHOLD)
      return;
}
ubi_eba_copy_leb(ubi, e1->pnum, e2->pnum, vid_hdr)

[Figure: state of the PEB structures]
   free:      (1,7) (2,5) (3,2) (6,6) (7,8)
   pq:        (3,4) (1,1)
   used:      (3,9)
   scrub:     (8,3)
   erroneous: (empty)
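The decision in the non-scrub branch can be made concrete with a small runnable sketch. It rests on several assumptions: the trees are replaced by arrays sorted by ascending erase counter, find_wl_entry() is read as "the most worn entry whose erase counter is still below the lowest one plus max", and the two constants are small hypothetical values rather than the kernel's.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical small values chosen for the example only. */
#define TOY_WL_THRESHOLD     4
#define TOY_WL_FREE_MAX_DIFF (2 * TOY_WL_THRESHOLD)

struct toy_entry {
    int ec;
    int pnum;
};

/*
 * From entries sorted by ascending ec, return the most worn one whose
 * erase counter is still below (lowest ec + max).
 */
static const struct toy_entry *toy_find_wl_entry(const struct toy_entry *sorted,
                                                 size_t n, int max)
{
    assert(n > 0);
    const struct toy_entry *best = &sorted[0];
    int limit = sorted[0].ec + max;

    for (size_t i = 1; i < n && sorted[i].ec < limit; i++)
        best = &sorted[i];
    return best;
}

/* Return 1 if the data on used PEB e1 should be moved to a free PEB. */
static int toy_should_move(const struct toy_entry *e1,
                           const struct toy_entry *free_sorted, size_t n)
{
    const struct toy_entry *e2 =
        toy_find_wl_entry(free_sorted, n, TOY_WL_FREE_MAX_DIFF);

    /* Moving only pays off when the free PEB is clearly more worn. */
    return (e2->ec - e1->ec) >= TOY_WL_THRESHOLD;
}
```

With the free PEBs (1,7) (2,5) (3,2) (6,6) (7,8) and the used PEB (3,9), the sketch picks (7,8) as the target; the difference 7 - 3 = 4 reaches the hypothetical threshold, so the data would be moved.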
drivers/mtd/ubi/wl.c
static int erase_worker(struct ubi_device *ubi, struct ubi_work *wl_wrk, int cancel)

 err = sync_erase(ubi, e, wl_wrk->torture);
 if (!err) {
    /* Erased fine: the PEB becomes free again */
    wl_tree_add(e, &ubi->free);
    serve_prot_queue(ubi);
    return ensure_wear_leveling(ubi);
 }

 /* Transient errors: try the erasure again later */
 if (err == -EINTR || err == -ENOMEM || err == -EAGAIN || err == -EBUSY)
    return schedule_erase(ubi, e, 0);
 else if (err != -EIO)
    goto out_ro;

 /* It is %-EIO, the PEB went bad */
 if (!ubi->bad_allowed)
    goto out_ro;

 if (ubi->beb_rsvd_pebs == 0)
    goto out_ro;

 err = ubi_io_mark_bad(ubi, pnum);
 return err;

out_ro:
 ubi_ro_mode(ubi); /* switch to read-only mode */
 return err;

[Figure: state of the PEB structures]
   free:      (1,7) (2,5) (3,2) (6,6) (7,8)
   pq:        (3,4) (1,1)
   used:      (3,9)
   scrub:     (8,3)
   erroneous: (empty)
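The error handling above boils down to a four-way classification, sketched here as a pure function. The enum and function names are hypothetical; the conditions follow the excerpt, and the errno constants come from the standard <errno.h>.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical labels for the four outcomes in the excerpt. */
enum toy_erase_action {
    TOY_PEB_FREED,      /* erase succeeded, PEB goes back to the free tree */
    TOY_RETRY_ERASE,    /* transient failure, schedule the erase again */
    TOY_MARK_BAD,       /* the PEB went bad and can be retired */
    TOY_READ_ONLY       /* nothing safe to do, switch to read-only mode */
};

static enum toy_erase_action toy_classify_erase(int err, int bad_allowed,
                                                int beb_rsvd_pebs)
{
    assert(err <= 0);   /* 0 on success, negative errno on failure */
    if (err == 0)
        return TOY_PEB_FREED;
    if (err == -EINTR || err == -ENOMEM || err == -EAGAIN || err == -EBUSY)
        return TOY_RETRY_ERASE;
    if (err != -EIO)
        return TOY_READ_ONLY;
    /* -EIO: the PEB went bad */
    if (!bad_allowed || beb_rsvd_pebs == 0)
        return TOY_READ_ONLY;
    return TOY_MARK_BAD;
}
```

For example, -EIO with bad blocks allowed and reserved PEBs left maps to TOY_MARK_BAD, while the same error with no reserved PEBs left maps to TOY_READ_ONLY.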
   Responsible for
     Scanning the flash media
     Checking UBI headers
     Providing complete information about the UBI flash image

   UBI on-flash data structures
     Erase Counter Header
     Volume Identifier Header
     Volume Table

   Temporary data structures during scanning process
       Scan Info
       Scan Volume
       Scan Erase Block
       Four lists: free, erase, corr, alien

   Unclean reboot
   Every good PEB has a 64-byte Erase Counter Header
   Every good mapped PEB has a 64-byte Volume Identifier Header
   A “layout volume” contains two copies of the Volume Table




[Figure: LEBs laid out over PEBs]
   LEBs: (0,0) (0,1) ... (0,P) (1,0) (1,1) (2,0) (2,1) (2,2) ... (2,Q)
   PEBs: 0 1 2 3 4 5 6 7 8 ... N
   Every good PEB has a 64-byte Erase Counter Header
drivers/mtd/ubi/ubi-media.h
struct ubi_ec_hdr {
   __be32 magic;          /* EC header magic number (%UBI_EC_HDR_MAGIC) */
   __u8   version;        /* version of UBI implementation */
   __u8   padding1[3];    /* reserved for future, zeroes */
   __be64 ec;             /* the erase counter */
   __be32 vid_hdr_offset; /* where the VID header starts */
   __be32 data_offset;    /* where the user data start */
   __be32 image_seq;      /* image sequence number */
   __u8   padding2[32];   /* reserved for future, zeroes */
   __be32 hdr_crc;        /* erase counter header CRC checksum */
} __attribute__ ((packed));
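The 64-byte size and the big-endian erase counter can be checked with a small user-space sketch. The struct below is a hypothetical byte-array mirror of the fields listed above, used so that the example does not depend on compiler packing or host endianness; it is not the kernel definition.

```c
#include <assert.h>
#include <stdint.h>

/* Byte-array mirror of the EC header fields (hypothetical, for illustration). */
struct toy_ec_hdr {
    uint8_t magic[4];
    uint8_t version;
    uint8_t padding1[3];
    uint8_t ec[8];              /* big-endian erase counter */
    uint8_t vid_hdr_offset[4];
    uint8_t data_offset[4];
    uint8_t image_seq[4];
    uint8_t padding2[32];
    uint8_t hdr_crc[4];
};

/* 4 + 1 + 3 + 8 + 4 + 4 + 4 + 32 + 4 = 64 */
_Static_assert(sizeof(struct toy_ec_hdr) == 64, "EC header must be 64 bytes");

/* Store a 64-bit value most significant byte first. */
static void toy_put_be64(uint8_t out[8], uint64_t v)
{
    for (int i = 7; i >= 0; i--) {
        out[i] = (uint8_t)(v & 0xFF);
        v >>= 8;
    }
}

/* Load a 64-bit value stored most significant byte first. */
static uint64_t toy_get_be64(const uint8_t in[8])
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v = (v << 8) | in[i];
    return v;
}

/* Bump the stored erase counter by one. */
static void toy_ec_increment(struct toy_ec_hdr *hdr)
{
    uint64_t ec = toy_get_be64(hdr->ec);

    assert(ec < UINT64_MAX);
    toy_put_be64(hdr->ec, ec + 1);
}
```

The same field-by-field sum gives 64 bytes for the VID header as well.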
   Every good mapped PEB has a 64-byte Volume Identifier Header
drivers/mtd/ubi/ubi-media.h
struct ubi_vid_hdr {
   __be32 magic;        /* VID magic number (%UBI_VID_HDR_MAGIC) */
   __u8   version;      /* version of UBI implementation */
   __u8   vol_type;     /* volume type (%UBI_VID_DYNAMIC or %UBI_VID_STATIC) */
   __u8   copy_flag;    /* for wear-leveling reasons */
   __u8   compat;       /* compatibility of this volume */
   __be32 vol_id;       /* ID of this volume */
   __be32 lnum;         /* LEB number */
   __u8   padding1[4];  /* reserved for future, zeroes */
   __be32 data_size;    /* bytes of data this LEB contains */
   __be32 used_ebs;     /* total number of used LEBs in this volume */
   __be32 data_pad;     /* padded bytes at the end of this PEB */
   __be32 data_crc;     /* CRC of the data stored in this LEB */
   __u8   padding2[4];  /* reserved for future, zeroes */
   __be64 sqnum;        /* sequence number */
   __u8   padding3[12]; /* reserved for future, zeroes */
   __be32 hdr_crc;      /* VID header CRC checksum */
} __attribute__ ((packed));
   A “layout volume” contains two copies of the Volume Table

drivers/mtd/ubi/ubi-media.h
struct ubi_vtbl_record {
   __be32 reserved_pebs;            /* physical eraseblocks reserved for this volume */
   __be32 alignment;                /* volume alignment */
   __be32 data_pad;                 /* padded bytes for the requested alignment */
   __u8   vol_type;                 /* %UBI_VID_DYNAMIC or %UBI_VID_STATIC */
   __u8   upd_marker;               /* if volume update was started but not finished */
   __be16 name_len;                 /* volume name length */
   __u8   name[UBI_VOL_NAME_MAX+1]; /* volume name */
   __u8   flags;                    /* volume flags (%UBI_VTBL_AUTORESIZE_FLG) */
   __u8   padding[23];              /* reserved for future, zeroes */
   __be32 crc;                      /* CRC32 checksum of the record */
} __attribute__ ((packed));
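[Figure: temporary data structures built during scanning]
   Scan Info
       Volumes: one Scan Volume per volume, including the internal "layout volume"
           each Scan Volume holds Scan Erase Blocks (SEB), e.g. SEB 0, SEB 1, SEB 2, ...
       Lists of Scan Erase Blocks: corr, free, erase, alien
   Every Scan Erase Block refers to one PEB (the figure shows SEB 100, 101, 205 and 522 as examples)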
    The EC header is written to a PEB right after the PEB is erased
drivers/mtd/ubi/wl.c
static int sync_erase(struct ubi_device *ubi, struct ubi_wl_entry *e, int torture)
{
    unsigned long long ec = e->ec;

    [... Deleted ...]

    err = ubi_io_sync_erase(ubi, e->pnum, torture);
    if (err < 0)
       goto out_free;
    ec += err;
    if (ec > UBI_MAX_ERASECOUNTER) {
       /*
        * Erase counter overflow. Upgrade UBI and use 64-bit
        * erase counters internally.
        */
       ubi_err("erase counter overflow at PEB %d, EC %llu", e->pnum, ec);
       err = -EINVAL;
       goto out_free;
    }

    dbg_wl("erased PEB %d, new EC %llu", e->pnum, ec);
    ec_hdr->ec = cpu_to_be64(ec);
    err = ubi_io_write_ec_hdr(ubi, e->pnum, ec_hdr);
    [... Deleted ...]
}
   Map a LEB L to PEB P
        Write VID header (with lnum L) to P

   Unmap a LEB L from PEB P
        Schedule P for erasure

   Remap a LEB L from PEB P0 to PEB P1
        Schedule P0 for erasure
        Write VID header (with lnum L) to P1

   Copy a PEB P0 which is mapped to L to PEB P1
        Write VID header (with lnum L) to P1
        Copy contents of P0 to P1
        Schedule P0 for erasure
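After an unclean reboot in the middle of a remap or copy, two PEBs can carry a VID header for the same LEB. The sketch below shows one plausible way to choose between them, using the sqnum and copy_flag fields of the VID header. The rule is an assumption that is not spelled out in these slides, and the real scanning code has more cases; all `toy_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical summary of what a scan learned about one PEB. */
struct toy_seb {
    int pnum;
    uint64_t sqnum;     /* sequence number from the VID header */
    int copy_flag;      /* set when the PEB was written by a copy operation */
    int data_crc_ok;    /* result of checking data_crc against the contents */
};

/* Of two PEBs claiming the same LEB, return the one to keep. */
static const struct toy_seb *toy_pick_copy(const struct toy_seb *a,
                                           const struct toy_seb *b)
{
    assert(a->sqnum != b->sqnum);
    const struct toy_seb *newer = a->sqnum > b->sqnum ? a : b;
    const struct toy_seb *older = newer == a ? b : a;

    /* A copy interrupted by power loss may be incomplete: fall back. */
    if (newer->copy_flag && !newer->data_crc_ok)
        return older;
    return newer;
}
```

In this model the newer PEB normally wins, but an interrupted copy whose data fail the CRC check loses to the older, still intact PEB.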
        Whenever the volume table needs an update
            (the following steps refer to LEBs of the "layout volume")
            Unmap LEB 0
            Write updated table to LEB 0
            Unmap LEB 1
            Write updated table to LEB 1
drivers/mtd/ubi/vtbl.c
    int ubi_change_vtbl_record(struct ubi_device *ubi, int idx,
                               struct ubi_vtbl_record *vtbl_rec)
    {
        [... Deleted ...]
        layout_vol = ubi->volumes[vol_id2idx(ubi, UBI_LAYOUT_VOLUME_ID)];
        [... Deleted ...]

        memcpy(&ubi->vtbl[idx], vtbl_rec, sizeof(struct ubi_vtbl_record));
        /* Update both copies, one after the other */
        for (i = 0; i < UBI_LAYOUT_VOLUME_EBS; i++) {
            err = ubi_eba_unmap_leb(ubi, layout_vol, i);
            if (err)
                return err;
            err = ubi_eba_write_leb(ubi, layout_vol, i, ubi->vtbl, 0,
                                    ubi->vtbl_size, UBI_LONGTERM);
            if (err)
                return err;
        }
        return 0;
    }
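Why the two copies are updated one after the other can be shown with a small crash simulation. The model is hypothetical: each copy holds either a version number or "unmapped", and a crash may happen after any of the four steps. The point it illustrates, which is a reading of the procedure rather than a statement from the slides, is that at least one readable copy exists at every point.

```c
#include <assert.h>

#define TOY_UNMAPPED (-1)

/* Two copies of the table, each holding a positive version or TOY_UNMAPPED. */
struct toy_layout {
    int copy[2];
};

/*
 * Apply the first `steps` of the four update steps:
 * unmap copy 0, write copy 0, unmap copy 1, write copy 1.
 */
static void toy_update(struct toy_layout *l, int new_version, int steps)
{
    assert(steps >= 0 && steps <= 4);
    for (int s = 0; s < steps; s++) {
        int idx = s / 2;

        l->copy[idx] = (s % 2 == 0) ? TOY_UNMAPPED : new_version;
    }
}

/* After a crash, return the newest readable version, or TOY_UNMAPPED if none. */
static int toy_recover(const struct toy_layout *l)
{
    return l->copy[0] > l->copy[1] ? l->copy[0] : l->copy[1];
}
```

Starting from two copies at version 1 and interrupting an update to version 2 after 0 to 4 steps, toy_recover() returns 1 for the first two crash points and 2 for the rest; it never reports that both copies are gone.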
   More information about MTD and UBI is available
    on the MTD website
     http://www.linux-mtd.infradead.org/
Introduction to UBI

  • 1.
  • 2. Overview  Sub-systems  Kernel API & EBA subsystem  Wear-leveling subsystem  Scanning subsystem 2
  • 3. UBI – Unsorted Block Images  A volume management system  Provides static and dynamic volumes  Wear-leveling across whole flash device  Transparent bad blocks management  Read disturbance handling  Merged in the mainline Linux kernel since v2.6.22 3
  • 4. [Diagram: one MTD device split into two MTD partitions. The first MTD partition holds the bootloader. The second MTD partition is the UBI device, made of PEBs 0 ... N. On top of it, a static UBI volume (kernel image) exposes LEBs (0,0) ... (0,P), and a dynamic UBI volume (root filesystem, UBIFS) exposes LEBs (1,0) ... (1,Q).]
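The diagram's (volume, LEB) to PEB association can be pictured as a per-volume lookup table. The following is only a minimal userspace sketch of that idea: `toy_volume`, `toy_map_leb()`, `toy_leb_to_peb()` and the `TOY_*` constants are hypothetical names made up for illustration, not the kernel's EBA data structures.

```c
#include <assert.h>

#define TOY_MAX_LEBS 8
#define TOY_LEB_UNMAPPED (-1)

/* Hypothetical per-volume table: index is the LEB number, value is the PEB number. */
struct toy_volume {
    int leb_count;
    int eba_tbl[TOY_MAX_LEBS];
};

/* Start with every LEB unmapped. */
static void toy_volume_init(struct toy_volume *vol, int leb_count)
{
    assert(leb_count >= 0 && leb_count <= TOY_MAX_LEBS);
    vol->leb_count = leb_count;
    for (int i = 0; i < TOY_MAX_LEBS; i++)
        vol->eba_tbl[i] = TOY_LEB_UNMAPPED;
}

/* Associate LEB lnum of this volume with PEB pnum. */
static void toy_map_leb(struct toy_volume *vol, int lnum, int pnum)
{
    assert(lnum >= 0 && lnum < vol->leb_count);
    vol->eba_tbl[lnum] = pnum;
}

/* Return the PEB backing LEB lnum, or TOY_LEB_UNMAPPED. */
static int toy_leb_to_peb(const struct toy_volume *vol, int lnum)
{
    assert(lnum >= 0 && lnum < vol->leb_count);
    return vol->eba_tbl[lnum];
}
```

Because users only ever name LEBs, the layer underneath is free to change which PEB backs a LEB (for wear-leveling or scrubbing) without the user noticing.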
  • 5. [Diagram: UBI layers, top to bottom]
         • UBI Kernel API
         • UBI Initialization, UBI Erase Block Association Subsystem
         • UBI Scanning Subsystem, UBI Wear-leveling Subsystem
         • UBI I/O Subsystem
         • MTD Layer
  • 6. LEB operations:
         • Read from an unmapped LEB
         • Read from a mapped LEB
         • Write to a mapped LEB
         • Write to an unmapped LEB
         • Map a LEB
         • Unmap a LEB
         • Erase a LEB
       Functions by layer:
         • Filesystem: fs_read(), fs_write()
         • UBI KAPI: ubi_leb_read(), ubi_leb_write(), ubi_leb_map(), ubi_leb_erase(), ubi_leb_unmap()
         • UBI EBA: ubi_eba_read_leb(), ubi_eba_write_leb(), ubi_eba_map_leb(), ubi_eba_copy_leb(), ubi_eba_unmap_leb()
         • UBI WL: ubi_wl_get_peb(), ubi_wl_put_peb(), ubi_wl_scrub_peb(), ubi_wl_flush()
         • UBI IO: ubi_io_read(), ubi_io_write(), ubi_io_sync_erase(), ubi_io_mark_bad(), ubi_io_read_data(), ubi_io_write_data(), ubi_io_read_ec_hdr(), ubi_io_write_ec_hdr(), ubi_io_read_vid_hdr(), ubi_io_write_vid_hdr()
  • 7.
  • 8. Responsible for
         • Management of PEBs
         • Wear-leveling
         • Scrubbing (read disturbance)
       Works in terms of PEBs and erase counters
       Knows nothing about LEBs, volumes, etc.
       Internal data structures
         • Four RB-trees and one queue
       External interfaces
         • ubi_wl_get_peb()
         • ubi_wl_put_peb()
         • ubi_wl_scrub_peb()
         • ubi_wl_flush()
  • 9. All good PEBs are managed with four RB-trees and one queue.

       drivers/mtd/ubi/ubi.h

           struct ubi_device {
                   ...
                   struct rb_root used;
                   struct rb_root erroneous;
                   struct rb_root free;
                   struct rb_root scrub;
                   struct list_head pq[UBI_PROT_QUEUE_LEN];
                   ...
           };

       [Diagram: good PEBs are either free PEBs (free: 1,1 1,7 2,5 3,2 7,8) or in-use PEBs (pq: 3,4; used: 3,9 6,6 8,3; scrub and erroneous: empty).]

       Note: These RB-trees use (ec, pnum) pairs as keys.
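Keying the trees by (ec, pnum) means entries sort by erase counter first, with the PEB number as a tie-breaker, so the least worn PEB is always the smallest key. A minimal sketch of such an ordering follows; `toy_wl_key`, `toy_key_cmp()` and `toy_least_worn()` are hypothetical names, and a plain array scan stands in for the RB-tree.

```c
#include <assert.h>

/* Hypothetical key: the slide only states that trees are keyed by (ec, pnum). */
struct toy_wl_key {
    unsigned long long ec; /* erase counter */
    int pnum;              /* physical eraseblock number */
};

/* Returns <0, 0 or >0: order by erase counter, then by PEB number. */
static int toy_key_cmp(const struct toy_wl_key *a, const struct toy_wl_key *b)
{
    if (a->ec != b->ec)
        return a->ec < b->ec ? -1 : 1;
    if (a->pnum != b->pnum)
        return a->pnum < b->pnum ? -1 : 1;
    return 0;
}

/* Linear scan for the smallest key; a real tree would take the leftmost node. */
static const struct toy_wl_key *toy_least_worn(const struct toy_wl_key *keys, int n)
{
    assert(n > 0);
    const struct toy_wl_key *min = &keys[0];
    for (int i = 1; i < n; i++)
        if (toy_key_cmp(&keys[i], min) < 0)
            min = &keys[i];
    return min;
}
```

With the free-tree contents from the diagram, (1,1) sorts before (1,7) because the erase counters tie and the PEB number decides.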
  • 10. drivers/mtd/ubi/wl.c

            int ubi_wl_get_peb(struct ubi_device *ubi, int dtype)
            int ubi_wl_put_peb(struct ubi_device *ubi, int pnum, int torture)
            int ubi_wl_scrub_peb(struct ubi_device *ubi, int pnum)
            int ubi_wl_flush(struct ubi_device *ubi)

        [Diagram: ubi_wl_get_peb(), ubi_wl_put_peb(), ubi_wl_scrub_peb(), ubi_wl_flush() and ubi_thread act on free (1,1 1,7 2,5 3,2 7,8), pq (3,4), used (3,9 6,6 8,3), scrub and erroneous.]
  • 11. drivers/mtd/ubi/wl.c

            int ubi_wl_get_peb(struct ubi_device *ubi, int dtype)

        1. Pick a PEB from the free RB-tree according to the hint @dtype
             • longterm
             • shortterm
             • unknown
        2. Move the picked PEB to the pq queue
             • Why pq? Why not used? To keep newly allocated PEBs from being moved due to wear-leveling.

        [Diagram: free 1,1 1,7 2,5 3,2 7,8 (regions labelled shortterm, unknown, longterm); pq 3,4; used 3,9 6,6 8,3; scrub; erroneous; ubi_thread]
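One way to picture the protection queue is as a ring of slots that advances once per completed erase: a freshly allocated PEB enters at the slot furthest from the head and only becomes a wear-leveling candidate once the head reaches it. The sketch below illustrates that idea under assumptions; `toy_wl`, `toy_protect()`, `toy_serve_prot_queue()` and `PROT_QUEUE_LEN` are hypothetical stand-ins, and the real `pq` is an array of lists rather than per-PEB slot numbers.

```c
#include <assert.h>

#define PROT_QUEUE_LEN 4 /* hypothetical; stands in for UBI_PROT_QUEUE_LEN */
#define MAX_PEBS 16

enum peb_state { PEB_FREE, PEB_PROTECTED, PEB_USED };

struct toy_wl {
    enum peb_state state[MAX_PEBS];
    int slot[MAX_PEBS]; /* queue slot of each protected PEB */
    int head;           /* slot that is served on the next erase */
};

/* A freshly allocated PEB goes to the slot just behind the head,
 * so it stays protected for PROT_QUEUE_LEN erase cycles. */
static void toy_protect(struct toy_wl *wl, int pnum)
{
    assert(pnum >= 0 && pnum < MAX_PEBS);
    wl->state[pnum] = PEB_PROTECTED;
    wl->slot[pnum] = (wl->head + PROT_QUEUE_LEN - 1) % PROT_QUEUE_LEN;
}

/* Called once per completed erase: release the head slot, then advance. */
static void toy_serve_prot_queue(struct toy_wl *wl)
{
    for (int p = 0; p < MAX_PEBS; p++)
        if (wl->state[p] == PEB_PROTECTED && wl->slot[p] == wl->head)
            wl->state[p] = PEB_USED;
    wl->head = (wl->head + 1) % PROT_QUEUE_LEN;
}
```

In this toy model a PEB protected right now survives three erase cycles untouched and is released to "used" on the fourth.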
  • 12. drivers/mtd/ubi/wl.c

            int ubi_wl_scrub_peb(struct ubi_device *ubi, int pnum)

        1. Move the PEB @pnum from pq/used to scrub
        2. Schedule a wear-leveling request

        ubi_thread: "Besides wear-leveling, I also take care of scrubbing."

        [Diagram: free 1,7 2,5 3,2 7,8; pq 3,4 1,1; used 3,9 6,6 8,3; scrub; erroneous]
  • 13. drivers/mtd/ubi/wl.c

            int ubi_wl_put_peb(struct ubi_device *ubi, int pnum, int torture)

        1. Remove the PEB @pnum from one of the in-use RB-trees or pq.
        2. Schedule the PEB @pnum for erasure.
        3. When the erasure completes without error, the PEB is put back into the free RB-tree.

        ubi_thread: "Again, the erasure will be delegated to me."

        [Diagram: free 1,7 2,5 3,2 7,8 6,6; pq 3,4 1,1; used 3,9 6,6; scrub 8,3; erroneous]
  • 14. drivers/mtd/ubi/wl.c

            int ubi_wl_flush(struct ubi_device *ubi)

        drivers/mtd/ubi/wl.c

            struct ubi_work {
                    struct list_head list;
                    int (*func)(struct ubi_device *ubi, struct ubi_work *wrk, int cancel);
                    /* The below fields are only relevant to erasure works */
                    struct ubi_wl_entry *e;
                    int torture;
            };

        1. Flush all pending works

        [Diagram: ubi_thread consumes a queue of ubi_work items, each handled by erase_worker() or wear_leveling_worker().]
  • 15. drivers/mtd/ubi/wl.c

            static int wear_leveling_worker(struct ubi_device *ubi, struct ubi_work *wrk, int cancel)
            static struct ubi_wl_entry *find_wl_entry(struct rb_root *root, int max)

        Pseudocode:

            if (!free || (!scrub && !used))
                    return
            if (scrub) {
                    e1 = pick the least worn out PEB from the @scrub
                    e2 = find_wl_entry(free, WL_FREE_MAX_DIFF)
            } else {
                    e1 = pick the least worn out PEB from the @used
                    e2 = find_wl_entry(free, WL_FREE_MAX_DIFF)
                    if ((e2->ec - e1->ec) < UBI_WL_THRESHOLD)
                            return;
            }
            ubi_eba_copy_leb(ubi, e1->pnum, e2->pnum, vid_hdr)

        [Diagram: free 1,7 2,5 3,2 6,6 7,8; pq 3,4 1,1; used 3,9; scrub 8,3; erroneous]
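The decision in the pseudocode boils down to one question: is a copy from e1 (the source) to e2 (a free PEB) worth doing? A hedged sketch of just that test is below. `toy_should_move()` is a hypothetical helper and `TOY_WL_THRESHOLD` is an illustrative value standing in for UBI_WL_THRESHOLD; the slide does not give the real one.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_WL_THRESHOLD 4096 /* illustrative value, stands in for UBI_WL_THRESHOLD */

/*
 * e1_ec: erase counter of the source PEB (least worn in scrub or used).
 * e2_ec: erase counter of the free PEB chosen as the target.
 * scrubbing: true when the source came from the scrub tree.
 */
static bool toy_should_move(long long e1_ec, long long e2_ec, bool scrubbing)
{
    assert(e1_ec >= 0 && e2_ec >= 0);
    if (scrubbing)
        return true; /* scrubbing always copies, whatever the counters say */
    /* Plain wear-leveling copies only when the gap reaches the threshold. */
    return (e2_ec - e1_ec) >= TOY_WL_THRESHOLD;
}
```

The threshold keeps the worker from shuffling data between PEBs whose wear is already close, which would only add erasures.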
  • 16. drivers/mtd/ubi/wl.c

            static int erase_worker(struct ubi_device *ubi, struct ubi_work *wrk, int cancel)

        Simplified body:

            err = sync_erase(ubi, e, wl_wrk->torture);
            if (!err) {
                    wl_tree_add(e, &ubi->free);
                    serve_prot_queue(ubi);
                    return ensure_wear_leveling(ubi);
            }
            if (err == -EINTR || err == -ENOMEM || err == -EAGAIN || err == -EBUSY)
                    return schedule_erase(ubi, e, 0);
            else if (err != -EIO)
                    goto out_ro;
            /* It is %-EIO, the PEB went bad */
            if (!ubi->bad_allowed)
                    goto out_ro;
            if (ubi->beb_rsvd_pebs == 0)
                    goto out_ro;
            err = ubi_io_mark_bad(ubi, pnum);
            return err;
        out_ro:
            ubi_ro_mode(ubi); /* switch to read-only mode */
            return err;

        [Diagram: free 1,7 2,5 3,2 6,6 7,8; pq 3,4 1,1; used 3,9; scrub 8,3; erroneous]
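The error handling above is a four-way classification of the erase result. The sketch below restates it as a pure function so that each branch can be checked in isolation; `erase_action` and `toy_classify_erase()` are hypothetical names, while the errno values and the two conditions come from the slide.

```c
#include <assert.h>
#include <errno.h>

enum erase_action {
    ERASE_DONE,     /* success: PEB goes back to the free tree */
    ERASE_RETRY,    /* transient error: schedule the erase again */
    ERASE_MARK_BAD, /* -EIO and bad blocks can be absorbed */
    ERASE_GO_RO     /* anything else: switch to read-only mode */
};

static enum erase_action toy_classify_erase(int err, int bad_allowed,
                                            int beb_rsvd_pebs)
{
    assert(err <= 0); /* 0 or a negative errno, as in the kernel convention */
    if (err == 0)
        return ERASE_DONE;
    if (err == -EINTR || err == -ENOMEM || err == -EAGAIN || err == -EBUSY)
        return ERASE_RETRY;
    if (err != -EIO)
        return ERASE_GO_RO;
    /* -EIO: the PEB went bad */
    if (!bad_allowed || beb_rsvd_pebs == 0)
        return ERASE_GO_RO;
    return ERASE_MARK_BAD;
}
```

Note that a bad PEB is only tolerated while reserved PEBs for bad-block handling remain; once they run out, the device drops to read-only.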
  • 17.
  • 18. Responsible for
          • Scanning the flash media
          • Checking UBI headers
          • Providing complete information about the UBI flash image
        UBI on-flash data structures
          • Erase Counter Header
          • Volume Identifier Header
          • Volume Table
        Temporary data structures during scanning process
          • Scan Info
          • Scan Volume
          • Scan Erase Block
          • Four lists: free, erase, corr, alien
        Unclean reboot
  • 19. • Every good PEB has a 64-byte Erase Counter Header
        • Every good mapped PEB has a 64-byte Volume Identifier Header
        • A “layout volume” contains two copies of the Volume Table

        [Diagram: LEBs (0,0) (0,1) ... (0,P), (1,0) (1,1), (2,0) (2,1) (2,2) ... (2,Q) laid over PEBs 0 1 2 ... N]
  • 20. Every good PEB has a 64-byte Erase Counter Header

        drivers/mtd/ubi/ubi-media.h

            struct ubi_ec_hdr {
                    __be32 magic;           /* EC header magic number (%UBI_EC_HDR_MAGIC) */
                    __u8   version;         /* version of UBI implementation */
                    __u8   padding1[3];     /* reserved for future, zeroes */
                    __be64 ec;              /* the erase counter */
                    __be32 vid_hdr_offset;  /* where the VID header starts */
                    __be32 data_offset;     /* where the user data start */
                    __be32 image_seq;       /* image sequence number */
                    __u8   padding2[32];    /* reserved for future, zeroes */
                    __be32 hdr_crc;         /* erase counter header CRC checksum */
            } __attribute__ ((packed));
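The 64-byte claim can be checked by adding up the fields: 4 + 1 + 3 + 8 + 4 + 4 + 4 + 32 + 4 = 64. The sketch below mirrors the layout in userspace with fixed-width integer types and lets the compiler verify the size. `toy_ec_hdr` and `toy_put_be64()` are hypothetical; the `__be32`/`__be64` fields are modelled as plain integers, with a helper showing the big-endian byte order they imply on flash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the slide's layout; not the kernel definition. */
struct toy_ec_hdr {
    uint32_t magic;
    uint8_t  version;
    uint8_t  padding1[3];
    uint64_t ec;
    uint32_t vid_hdr_offset;
    uint32_t data_offset;
    uint32_t image_seq;
    uint8_t  padding2[32];
    uint32_t hdr_crc;
} __attribute__((packed));

/* Compile-time check of the "64-byte" statement. */
_Static_assert(sizeof(struct toy_ec_hdr) == 64, "EC header must be 64 bytes");

/* Store a 64-bit value most significant byte first (big-endian). */
static void toy_put_be64(uint8_t *dst, uint64_t v)
{
    assert(dst != NULL);
    for (int i = 0; i < 8; i++)
        dst[i] = (uint8_t)(v >> (56 - 8 * i));
}
```

The packed attribute is a GCC/Clang extension, as in the slide; without it the compiler would be free to insert padding and the size check could fail.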
  • 21. Every good mapped PEB has a 64-byte Volume Identifier Header

        drivers/mtd/ubi/ubi-media.h

            struct ubi_vid_hdr {
                    __be32 magic;        /* VID magic number (%UBI_VID_HDR_MAGIC) */
                    __u8   version;      /* version of UBI implementation */
                    __u8   vol_type;     /* volume type (%UBI_VID_DYNAMIC or %UBI_VID_STATIC) */
                    __u8   copy_flag;    /* for wear-leveling reasons */
                    __u8   compat;       /* compatibility of this volume */
                    __be32 vol_id;       /* ID of this volume */
                    __be32 lnum;         /* LEB number */
                    __u8   padding1[4];  /* reserved for future, zeroes */
                    __be32 data_size;    /* bytes of data this LEB contains */
                    __be32 used_ebs;     /* total number of used LEBs in this volume */
                    __be32 data_pad;     /* padded bytes at the end of this PEB */
                    …
                    __be32 data_crc;     /* CRC of the data stored in this LEB */
                    __u8   padding2[4];  /* reserved for future, zeroes */
                    __be64 sqnum;        /* sequence number */
                    __u8   padding3[12]; /* reserved for future, zeroes */
                    __be32 hdr_crc;      /* VID header CRC checksum */
            } __attribute__ ((packed));
  • 22. A “layout volume” contains two copies of the Volume Table

        drivers/mtd/ubi/ubi-media.h

            struct ubi_vtbl_record {
                    __be32 reserved_pebs;            /* physical eraseblocks reserved for this volume */
                    __be32 alignment;                /* volume alignment */
                    __be32 data_pad;                 /* padded bytes for the requested alignment */
                    __u8   vol_type;                 /* %UBI_VID_DYNAMIC or %UBI_VID_STATIC */
                    __u8   upd_marker;               /* if volume update was started but not finished */
                    __be16 name_len;                 /* volume name length */
                    __u8   name[UBI_VOL_NAME_MAX+1]; /* volume name */
                    __u8   flags;                    /* volume flags (%UBI_VTBL_AUTORESIZE_FLG) */
                    …
                    __u8   padding[23];              /* reserved for future, zeroes */
                    __be32 crc;                      /* CRC32 checksum of the record */
            } __attribute__ ((packed));
  • 23. [Diagram: Scan Info references the scanned volumes (Scan Volume entries, including the internal “layout volume”), each holding Scan Erase Blocks (SEBs), plus four SEB lists: corr, free, erase and alien. Each SEB corresponds to one PEB.]
  • 24. The EC header is written to a PEB right after the PEB is erased

        drivers/mtd/ubi/wl.c

            static int sync_erase(struct ubi_device *ubi, struct ubi_wl_entry *e, int torture)
            {
                    unsigned long long ec = e->ec;
                    [... Deleted ...]
                    err = ubi_io_sync_erase(ubi, e->pnum, torture);
                    if (err < 0)
                            goto out_free;
                    ec += err;
                    if (ec > UBI_MAX_ERASECOUNTER) {
                            /*
                             * Erase counter overflow. Upgrade UBI and use 64-bit
                             * erase counters internally.
                             */
                            ubi_err("erase counter overflow at PEB %d, EC %llu", e->pnum, ec);
                            err = -EINVAL;
                            goto out_free;
                    }
                    dbg_wl("erased PEB %d, new EC %llu", e->pnum, ec);
                    ec_hdr->ec = cpu_to_be64(ec);
                    err = ubi_io_write_ec_hdr(ubi, e->pnum, ec_hdr);
                    [... Deleted ...]
            }
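The counter update in sync_erase() is small enough to isolate: the non-negative return value of the erase is added to the old counter, and the result is rejected if it exceeds a maximum. The sketch below reproduces just that arithmetic. `toy_next_ec()` is a hypothetical helper, and `TOY_MAX_ERASECOUNTER` is set to 0x7FFFFFFF as an assumption about UBI_MAX_ERASECOUNTER, since the slide does not show its value.

```c
#include <assert.h>
#include <errno.h>

/* Assumed limit; stands in for UBI_MAX_ERASECOUNTER. */
#define TOY_MAX_ERASECOUNTER 0x7FFFFFFFULL

/*
 * ec: erase counter before the erase.
 * erasures: how many erase operations were performed (the non-negative
 *           return value of the erase call in the slide's code).
 * On success stores the new counter in *out and returns 0;
 * on overflow leaves *out untouched and returns -EINVAL.
 */
static int toy_next_ec(unsigned long long ec, int erasures,
                       unsigned long long *out)
{
    assert(erasures >= 0);
    ec += (unsigned long long)erasures;
    if (ec > TOY_MAX_ERASECOUNTER)
        return -EINVAL;
    *out = ec;
    return 0;
}
```

Only after this check passes is the new value converted to big-endian and written into the EC header, so the on-flash counter never exceeds the limit.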
  • 25. Map a LEB L to PEB P
          • Write VID header (with lnum L) to P
        Unmap a LEB L from PEB P
          • Schedule P for erasure
        Remap a LEB L from PEB P0 to PEB P1
          • Schedule P0 for erasure
          • Write VID header (with lnum L) to P1
        Copy a PEB P0, which is mapped to L, to PEB P1
          • Write VID header (with lnum L) to P1
          • Copy contents of P0 to P1
          • Schedule P0 for erasure
  • 26. Whenever the volume table needs an update
        (the following applies to the “layout volume”):
          • Unmap LEB 0
          • Write updated table to LEB 0
          • Unmap LEB 1
          • Write updated table to LEB 1

        drivers/mtd/ubi/vtbl.c

            int ubi_change_vtbl_record(struct ubi_device *ubi, int idx, struct ubi_vtbl_record *vtbl_rec)
            {
                    [... Deleted ...]
                    layout_vol = ubi->volumes[vol_id2idx(ubi, UBI_LAYOUT_VOLUME_ID)];
                    [... Deleted ...]
                    memcpy(&ubi->vtbl[idx], vtbl_rec, sizeof(struct ubi_vtbl_record));
                    for (i = 0; i < UBI_LAYOUT_VOLUME_EBS; i++) {
                            err = ubi_eba_unmap_leb(ubi, layout_vol, i);
                            if (err)
                                    return err;
                            err = ubi_eba_write_leb(ubi, layout_vol, i, ubi->vtbl, 0, ubi->vtbl_size, UBI_LONGTERM);
                            if (err)
                                    return err;
                    }
                    return 0;
            }
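Updating the two copies strictly one after the other means that, wherever power is lost, at least one LEB still holds a complete table (either the old or the new one). The toy model below walks through the four steps and lets that property be checked at every possible interruption point. `copy_state`, `toy_update_vtbl()` and `toy_valid_copies()` are hypothetical names; the model ignores partially written tables, which a real implementation would have to detect (for example with the per-record CRC).

```c
#include <assert.h>

enum copy_state { COPY_OLD, COPY_UNMAPPED, COPY_NEW };

/*
 * Run the 4-step update (unmap 0, write 0, unmap 1, write 1), stopping
 * after `steps` steps (0..4) to simulate power loss at that point.
 */
static void toy_update_vtbl(enum copy_state copy[2], int steps)
{
    assert(steps >= 0 && steps <= 4);
    for (int s = 0; s < steps; s++) {
        int i = s / 2; /* which LEB of the layout volume is touched */
        copy[i] = (s % 2 == 0) ? COPY_UNMAPPED : COPY_NEW;
    }
}

/* Count the copies that still hold a complete table (old or new). */
static int toy_valid_copies(const enum copy_state copy[2])
{
    int n = 0;
    for (int i = 0; i < 2; i++)
        if (copy[i] != COPY_UNMAPPED)
            n++;
    return n;
}
```

If both copies were unmapped first and written afterwards, an interruption between the two phases would leave no valid table at all; the interleaved order avoids that window.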
  • 27. Everything about MTD and UBI can be found on the MTD website: http://www.linux-mtd.infradead.org/