In Case of Failure
           ELAG 2011 Prague
      Patrick Hochstenbach * Ghent University
       Email: Patrick.Hochstenbach@UGent.be
               Twitter: @hochstenbach
 https://github.com/phochste/ELAG-2001-Bootcamp
BOM-VL/Archipel
http://www.slideshare.net/hochstenbach/20081007-workshop-bomvl-wp3
Life expectancies of media
[Figure: storage life (1 to 50 years) against required retention period for magnetic tape (I-D1, Data D-2, Data D-3, 3480, 3490/3490e, DLT, DDS / 4mm, QIC / QIC-wide, Data 8mm / Data VHS), optical disk (CD-ROM, WORM, CD-R, M-O), paper (newspaper with high lignin, high quality with low lignin, "permanent" buffered) and microfilm (medium-term film, archival quality silver)]
“Storage Media Life Expectancies” - Van Bogart, 1998
Growth of digital data
Capacity of desktop computers
[Figure: hard drive capacity over time]
http://commons.wikimedia.org/wiki/File:Hard_drive_capacity_over_time.png HanKwang (2008)
Growth in formats
[Figure: timeline of the growth in file formats]
Formats of formats
MIME type image/tiff:
•  TIFF (all versions)
•  TIFF/IT
•  TIFF G4/LZW/UNC
•  Digital Negative Format (DNG)
•  GeoTIFF
•  Pyramid TIFF
•  ...

Source: PRONOM Technical Registry [http://www.nationalarchives.gov.uk/pronom/]
Short & long term risks
[Figure: timeline of short- and long-term risks]
Best practices
1. Create a preservation plan
2. Backup and replicate your data
3. Store preservation metadata
4. Store technical metadata
5. Store representation metadata
6. Don’t trust software
7. Store descriptive metadata
Preservation Plan
• Preservation policies (what to preserve)
• Legal obligations
• Organizational & Technical constraints
• User requirements
• Context
• http://plato.ifs.tuwien.ac.at:8080/plato
Risk Analysis
Random error
[Figure: a set of sample values (3, 1, 2, 1, 4, 1, 1) and their average, 1.9]
Systematic error
[Figure: the same sample values (average 1.9) with an additional value of 75]
MTBF
MTBF = Mean Time Between Failure
[Figure: timeline marked 3, 2, 10 and 5]

   MTBF = Total time / Number of failures = 40 hours / 3 failures = 13.3 hrs
MTTF
MTTF = Mean Time To Failure
[Figure: four units failing after 3, 2, 10 and 5 hours]

   MTTF = Total time / Number of units = 20 hours / 4 units = 5 hrs
MTTF = 2 M hours = 228 years!
AFR = 1/MTTF = 0.004 = 0.4 %
R(t) = exp(-t/θ), with θ = MTTF
R(5) = exp(-5/228) = 0.98 = 98%

50 disks = 0.98^50 = 0.36 = 36%
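The arithmetic on this slide can be reproduced with a few lines of Python. This is a minimal sketch, assuming a constant failure rate (exponential lifetimes), independent disks and a 365-day year. The function names are illustrative and not part of the bootcamp material.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8760 hours, leap years ignored


def hours_to_years(hours: float) -> float:
    return hours / HOURS_PER_YEAR


def annual_failure_rate(mttf_years: float) -> float:
    """AFR approximated as 1/MTTF, as on the slide."""
    return 1.0 / mttf_years


def reliability(t_years: float, mttf_years: float) -> float:
    """R(t) = exp(-t / MTTF): chance that one unit survives t years."""
    return math.exp(-t_years / mttf_years)


def fleet_reliability(n_units: int, t_years: float, mttf_years: float) -> float:
    """Chance that all n independent units survive t years."""
    return reliability(t_years, mttf_years) ** n_units


mttf = hours_to_years(2_000_000)
print(f"MTTF     = {mttf:.0f} years")
print(f"AFR      = {annual_failure_rate(mttf):.2%}")
print(f"R(5)     = {reliability(5, mttf):.3f}")
print(f"50 disks = {fleet_reliability(50, 5, mttf):.3f}")
```

Without the intermediate rounding of R(5) to 0.98, the 50-disk figure comes out near 0.33 rather than 0.36.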
Experiments

• Simulate 100 disks with an MTTF of 200 years using
  Processing. What happens if the AFR is not
  0.4% but 4% (hint: what is the MTTF in that
  case)?
• Given an MTTF of 200 years and 50 disks,
  what is the reliability after 1, 2 and 5 years?
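The first experiment asks for a simulation in Processing. As a rough alternative, the sketch below does a comparable Monte Carlo run in Python. It assumes exponentially distributed, independent disk lifetimes and a 5-year horizon. The run count, seed and function names are arbitrary choices made for illustration.

```python
import math
import random


def survivors(n_disks, mttf_years, horizon_years, rng):
    """Count disks whose simulated lifetime exceeds the horizon."""
    rate = 1.0 / mttf_years  # failures per disk-year
    return sum(rng.expovariate(rate) > horizon_years for _ in range(n_disks))


def mean_survivors(n_disks, mttf_years, horizon_years, runs=2000, seed=42):
    """Average survivor count over many simulated fleets."""
    rng = random.Random(seed)
    total = sum(survivors(n_disks, mttf_years, horizon_years, rng)
                for _ in range(runs))
    return total / runs


for mttf in (200, 25):  # 25 years corresponds to an AFR of 4% (1 / 0.04)
    simulated = mean_survivors(100, mttf, horizon_years=5)
    analytic = 100 * math.exp(-5 / mttf)
    print(f"MTTF {mttf:>3} y: simulated {simulated:.1f}, "
          f"analytic {analytic:.1f} of 100 disks alive after 5 years")
```

The simulated averages should land close to the analytic value 100 * exp(-t/MTTF), which is a quick sanity check on the simulation.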
Experiments
• Amazon S3 claims an AFR per object of
  0.000000001% [1]. What is the MTTF?

• There are 100 billion objects in S3. Given an
  estimated average size of 1 MB, how big is S3?

• What is the chance (reliability) that none of these 100
  billion objects is lost in 1 year?

[1] http://aws.amazon.com/s3/faqs/#How_reliable_is_Amazon_S3#How_durable_is_Amazon_S3
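One way to approach these three questions is sketched below. It assumes independent objects with a constant failure rate, AFR = 1/MTTF, 1 MB taken as 10^6 bytes and 1 PB as 10^15 bytes. The variable names are illustrative.

```python
import math

afr = 0.000000001 / 100      # 0.000000001 % as a fraction: 1e-11 per object-year
mttf_years = 1 / afr         # about 1e11 years per object

n_objects = 100e9            # 100 billion objects
avg_size_bytes = 1e6         # 1 MB, taken as 10^6 bytes (assumption)
total_pb = n_objects * avg_size_bytes / 1e15  # petabytes of 10^15 bytes

# Independent objects with a constant failure rate: R = exp(-N * AFR * t)
p_no_loss_1y = math.exp(-n_objects * afr * 1)

print(f"MTTF per object : {mttf_years:.1e} years")
print(f"Estimated size  : {total_pb:.0f} PB")
print(f"P(no loss, 1 y) : {p_no_loss_1y:.2f}")
```

Under these assumptions the expected number of lost objects per year is N * AFR = 1, so the chance of losing none is exp(-1).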
http://db.usenix.org/events/fast07/tech/schroeder/schroeder_html/index.html
Schroeder & Gibson
1 yr   3-5yr
Experiments
• Assume a single stored byte has a lifetime equal to
  that of the universe (13 billion years).
  What is the probability that one terabyte
  (1 trillion bytes) will survive 100 years?
• Discuss
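The probabilities in this exercise are too small to print directly, so the sketch below works with the base-10 logarithm instead. It assumes independent bytes, each with reliability exp(-t/lifetime), and evaluates two sizes, 10^9 and 10^12 bytes, for comparison. The names are illustrative.

```python
import math

BYTE_LIFETIME_YEARS = 13e9   # lifetime of one byte, as given in the exercise
HORIZON_YEARS = 100


def log10_survival(n_bytes: float) -> float:
    """log10 of the chance that all n_bytes independent bytes survive."""
    ln_r = -n_bytes * HORIZON_YEARS / BYTE_LIFETIME_YEARS  # natural log of R
    return ln_r / math.log(10)


for n_bytes in (1e9, 1e12):
    print(f"{n_bytes:.0e} bytes: survival probability is about "
          f"10^{log10_survival(n_bytes):.1f}")
```

Even with a per-byte lifetime of billions of years, the sheer number of bytes drives the survival probability towards zero.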
http://www.hpl.hp.com/techreports/tandem/TR-85.7.html
Serial Failures
    87 years

    75 years

    50 years


     31 years
Serial Failures
    • A B C D ....            SYSTEM

    1/SYSTEM = 1/A + 1/B + 1/C + 1/D + ...

E.g. : components : 1, 100, 1000, 10000 years
          System: 0.989 years
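The reciprocal rule for components in series is easy to check numerically. A minimal sketch, with an illustrative function name:

```python
def series_mttf(component_mttfs):
    """MTTF of a chain that fails as soon as any one component fails."""
    return 1.0 / sum(1.0 / m for m in component_mttfs)


print(f"{series_mttf([1, 100, 1000, 10000]):.3f} years")  # slide example
print(f"{series_mttf([87, 75, 50, 31]):.1f} years")       # four components above
```

The weakest component dominates: the system MTTF is always below the smallest component MTTF.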
Parallel Failures
[Figure: one component = 200 years; two components in parallel = ?? years]
Parallel Failures

   SYSTEM = { A, B, C, D } in parallel

   SYSTEM = A * B * C * D

   E.g. : components : 200, 200
        System: 40000 years
Composite Failures
[Figure: two blocks of 40000 years each, connected in series = SYSTEM]

   1/SYSTEM = 1/40000 + 1/40000

   SYSTEM = 20000 years
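The composite calculation combines the two rules from the preceding slides. The sketch below follows the slides' simplified product rule for parallel components. A fuller model would also account for repair time, so treat the parallel figure as illustrative. The function names are assumptions.

```python
import math


def series_mttf(component_mttfs):
    """Reciprocal rule: the chain fails when any component fails."""
    return 1.0 / sum(1.0 / m for m in component_mttfs)


def parallel_mttf_slide_rule(component_mttfs):
    """Product rule as used on the slides (a simplification)."""
    return math.prod(component_mttfs)


mirror = parallel_mttf_slide_rule([200, 200])  # one mirrored pair
system = series_mttf([mirror, mirror])         # two pairs in series

print(f"mirror = {mirror:.0f} years")
print(f"system = {system:.0f} years")
```

Nesting the two functions in this way extends to larger setups, such as the Tandem example in the experiments.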
Experiments
• Calculate the composite failure of the
  Tandem example (administration, software,
  hardware, environment)
• How would you make this setup more
  reliable? Calculate the effect
• What is the MTTF of a 5-way mirror of
  7K3000 disks?
http://old.hki.uni-koeln.de/people/herrmann/forschung/heydegger_archiving2008_40.pdf
Bit Errors

                      0110001011




                      0010101001

BER = Bit Error Rate = 3/10 = 0.3 = 30 %
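The bit error rate of the example above is the number of differing positions divided by the length. A minimal sketch, assuming the first string is what was written and the second is what was read back:

```python
def bit_error_rate(sent: str, received: str) -> float:
    """Fraction of bit positions that differ between two equal-length strings."""
    if len(sent) != len(received):
        raise ValueError("bit strings must have the same length")
    errors = sum(a != b for a, b in zip(sent, received))
    return errors / len(sent)


print(bit_error_rate("0110001011", "0010101001"))  # 0.3
```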
Bit Errors

• Soft error - repeat the operation
• Hard error - after some repeats data is lost
• Typical disk BER = 10^-5 to 10^-6 (every 10 KB
  to 100 KB read)
Bit Errors
 Drive Type        Hard Error (BER)   Bits read per error
 Consumer SATA     10^-14             10^14 =~ 10 TB
 Enterprise SATA   10^-15             10^15 =~ 100 TB
 Enterprise SAS    10^-16             10^16 =~ 1 PB
 *) BER-s are in bit = 1/8 byte

1 sector error for every 10 TB -> 1 PB read
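The conversion from a per-bit error rate to the volume of data read per error can be sketched as follows, assuming 8 bits per byte and 1 TB = 10^12 bytes. The function name is illustrative.

```python
def terabytes_read_per_error(ber_per_bit: float) -> float:
    """Average volume read (in TB) before one hard error is expected."""
    bits_per_error = 1.0 / ber_per_bit
    return bits_per_error / 8 / 1e12  # 8 bits per byte, 10^12 bytes per TB


drives = (("Consumer SATA", 14), ("Enterprise SATA", 15), ("Enterprise SAS", 16))
for name, exponent in drives:
    tb = terabytes_read_per_error(10.0 ** -exponent)
    print(f"{name:<16} BER 10^-{exponent}: one error per ~{tb:,.1f} TB read")
```

The exact values are 12.5 TB, 125 TB and 1.25 PB, which the slide rounds to 10 TB, 100 TB and 1 PB.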
Experiments
•   Collect a few sample documents from the web
    (images, documents, executables, etc.); flip one or
    more random bits; explain the resulting effect.

•   Use the visual defects experiment to measure the
    effect of flipping bits on image files with various
    compressions.

•   Open and save an image file. Measure the visual
    effects.

•   Calculate the checksum of the files and repeat the
    experiments. Check the results.
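A starting point for the bit-flipping and checksum experiments might look like the sketch below. It works on an in-memory byte string rather than a real file, and the helper names and sample payload are made up for illustration.

```python
import hashlib
import random


def flip_random_bit(data: bytes, rng: random.Random) -> bytes:
    """Return a copy of data with exactly one randomly chosen bit inverted."""
    if not data:
        raise ValueError("cannot flip a bit in empty data")
    position = rng.randrange(len(data) * 8)
    damaged = bytearray(data)
    damaged[position // 8] ^= 1 << (position % 8)
    return bytes(damaged)


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


original = b"In Case of Failure: sample payload"
damaged = flip_random_bit(original, random.Random(1))

print("original:", sha256_hex(original))
print("damaged :", sha256_hex(damaged))
print("match   :", sha256_hex(original) == sha256_hex(damaged))  # False
```

To run the experiment on real files, read them with `open(path, "rb")` and write the damaged bytes to a copy, never to the original.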
File Formats

• The goal of digital preservation is not
  preserving the bits and bytes but the means
  to access and use the information
  represented by them.
File Formats

Bits -> (Software + Environment) -> Information
File Formats
hypothetical 3-bit format


   110110010111010


                            Width = bit [1 .. 3]
                            Height = bit [4 .. 6]
                            Data = bit [7 .. 15]
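The hypothetical format can be decoded in a few lines, which also shows why the bits alone are not enough. The slide does not state the bit order of the width and height fields, so the sketch below tries both readings. Only the least-significant-bit-first reading yields a width and height (3 x 3) that match the 9 data bits. The function names and the rendering as `#` and `.` are assumptions.

```python
def parse_3bit_format(bits: str, lsb_first: bool):
    """Split the bit string into width (bits 1-3), height (4-6), data (7-15)."""
    def field(chunk: str) -> int:
        return int(chunk[::-1] if lsb_first else chunk, 2)

    width = field(bits[0:3])
    height = field(bits[3:6])
    data = bits[6:15]
    return width, height, data


def render(width: int, data: str):
    """Cut the data bits into rows of `width` pixels."""
    return ["".join("#" if b == "1" else "." for b in data[i:i + width])
            for i in range(0, len(data), width)]


bits = "110110010111010"
for lsb_first in (False, True):
    width, height, data = parse_3bit_format(bits, lsb_first)
    fits = width * height == len(data)
    print(f"lsb_first={lsb_first}: {width} x {height}, fits data: {fits}")

print("\n".join(render(3, parse_3bit_format(bits, True)[2])))
```

The same 15 bits give a 6 x 6 or a 3 x 3 image depending on a convention that is stored nowhere in the file, which is exactly the knowledge a preservation archive has to record.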
File Formats
With software you have only two options:


1. The software works and is maintained
2. The software doesn’t work and is not
   maintained
File Formats
  1. The software works and is maintained

• Your designated community has the
  software tools
• Your archive has the software tools
• In both cases you need to provide
  information about which software is needed and
  the steps required to get access to the data
File Formats
2. The software doesn’t work and is not maintained


  • Archive the source code of the original
     software
  • Emulate the original software
Experiments
• Experiment with different text-encoding
  demo files to discover the bit content of
  these files.
• Use DROID and JHOVE to characterize and
  validate the demo files.
• Invalidate the files using truncation and bit
  errors. Check the results.
• Use migration and emulation to get access
  to the demo.wp file.
Metadata

• Descriptive Metadata
• Administrative Metadata
• Structural Metadata
• Rights Metadata
• Representation Metadata
Packaging

• Digital objects are composite structures
• Need to be described, validated and
  accessed as a whole
• Complex Objects
Package Formats

• METS
• MPEG-21/DIDL
• LOM/IMS
• BagIt
• TIPR RXP
BagIt

• Library of Congress & California Digital
  Library
• NDIIPP
• Generic Format
Experiments

• Create a bag using the Bagger toolkit. Add
  Dublin Core descriptive metadata.
• Save the bag as a ZIP file and deposit it to
  the demo archive.
• As an archivist, access the deposit and validate
  its contents.
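To see what the Bagger toolkit produces, it can help to build a bag by hand. The sketch below is not the Bagger tool or a complete BagIt implementation. It writes only the core pieces of a bag (a `bagit.txt` declaration, a `data/` payload directory and a SHA-256 payload manifest) and then re-checks the checksums. The function names, file names and version string are assumptions, and the Dublin Core metadata and the ZIP step are left out.

```python
import hashlib
import tempfile
from pathlib import Path


def create_bag(bag_dir: Path, payload: dict) -> None:
    """Write payload files under data/ plus bagit.txt and a SHA-256 manifest."""
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True)
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8")
    lines = []
    for name, content in sorted(payload.items()):
        (data_dir / name).write_bytes(content)
        lines.append(f"{hashlib.sha256(content).hexdigest()}  data/{name}")
    (bag_dir / "manifest-sha256.txt").write_text(
        "\n".join(lines) + "\n", encoding="utf-8")


def validate_bag(bag_dir: Path) -> bool:
    """Recompute every payload checksum and compare it with the manifest."""
    manifest = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        checksum, rel_path = line.split(maxsplit=1)
        actual = hashlib.sha256((bag_dir / rel_path).read_bytes()).hexdigest()
        if actual != checksum:
            return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    bag = Path(tmp) / "demo-bag"
    create_bag(bag, {"hello.txt": b"hello"})
    print(validate_bag(bag))   # True: payload untouched
    (bag / "data" / "hello.txt").write_bytes(b"hellp")
    print(validate_bag(bag))   # False: payload changed after bagging
```

The validation step is the same check the archivist performs on deposit: any change to the payload after bagging shows up as a checksum mismatch.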
Conclusions

ELAG2011 Bootcamp

  • 1. In Case of Failure ELAG 2011 Prague Patrick Hochstenbach * Ghent University Email: Patrick.Hochstenbach@UGent.be Twitter: @hochstenbach https://github.com/phochste/ELAG-2001-Bootcamp
  • 3. Life expectancies of media (chart; axis: Retention Period - Required Storage Life, from 1 year to 50 years; media groups: magnetic tape, optical disk, paper, microfilm; media shown include Data 8mm / Data VHS, QIC / QIC-wide, DDS / 4mm, 3480, 3490/3490e, DLT, Data D-2, Data D-3, I-D1, CD-ROM, CD-R, WORM, M-O, newspaper (high lignin), high quality (low lignin), "permanent" (buffered), medium-term film, archival quality (silver)). Source: "Storage Media Life Expectancies", Van Bogart, 1998
  • 4. Growth of digital data Capacity of desktop computers http://commons.wikimedia.org/wiki/File:Hard_drive_capacity_over_time.png HanKwang (2008)
  • 5. Growth in formats (timeline figure; labels not recoverable)
  • 6. Formats of formats MIME type image/tiff: • TIFF (all versions) • TIFF/IT • TIFF G4/LZW/UNC • Digital Negative Format (DNG) • GeoTIFF • Pyramid TIFF • ... Source: PRONOM Technical Registry [http://www.nationalarchives.gov.uk/pronom/]
  • 7. Short & long term risks (timeline figure; labels not recoverable)
  • 15. Best practices 1. Create a preservation plan 2. Backup and replicate your data 3. Store preservation metadata 4. Store technical metadata 5. Store representation metadata 6. Don’t trust software 7. Store descriptive metadata
  • 16. Preservation Plan • Preservation policies (what to preserve) • Legal obligations • Organizational & Technical constraints • User requirements • Context • http://plato.ifs.tuwien.ac.at:8080/plato
  • 19. Random error (figure)
  • 22. Systematic error (figure)
  • 27. MTBF = Mean Time Between Failures. MTBF = Total time / Number of failures = 40 hours / 3 failures = 13.3 hrs (time-axis figure with labels 3, 2, 10, 5)
  • 28. MTTF = Mean Time To Failure. MTTF = Total time / Number of units = 20 hours / 4 units = 5 hrs (time-axis figure with labels 3, 2, 10, 5)
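The two definitions above can be checked with a few lines of Python. This is a minimal sketch: the function names are illustrative, and reading the figure labels 3, 2, 10, 5 as the four unit lifetimes in hours is an assumption (they do add up to the 20 hours quoted on the slide).

```python
def mtbf(total_time_hours, failures):
    """Mean Time Between Failures: observed operating time per failure."""
    return total_time_hours / failures


def mttf(lifetimes_hours):
    """Mean Time To Failure: average lifetime of non-repairable units."""
    return sum(lifetimes_hours) / len(lifetimes_hours)


# Slide 27: 40 hours of operation, 3 failures.
print(round(mtbf(40, 3), 1))
# Slide 28: four units; lifetimes assumed to be the figure labels 3, 2, 10, 5.
print(mttf([3, 2, 10, 5]))
```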
  • 34. MTTF = 2 M hours = 228 years! AFR = 1/MTTF = 0.004 = 0.4 %. R(t) = exp(-t/ϴ). R(5) = exp(-5/228) = 0.98 = 98%. 50 disks = 0.98^50 = 0.36 = 36%
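The chain of numbers on this slide can be reproduced as follows. A minimal sketch, assuming a constant failure rate (the exponential model R(t) = exp(-t/ϴ) with ϴ = MTTF) and independent disks; the function name is illustrative. Note that the slide rounds R(5) to 0.98 before raising it to the 50th power; without that intermediate rounding the 50-disk figure comes out closer to 33% than 36%.

```python
import math

HOURS_PER_YEAR = 24 * 365


def reliability(t_years, mttf_years, n_units=1):
    """Probability that all n_units survive t_years (exponential model)."""
    return math.exp(-n_units * t_years / mttf_years)


mttf_years = 2_000_000 / HOURS_PER_YEAR   # about 228 years
afr = 1 / mttf_years                      # annualized failure rate, about 0.4 %
r_one = reliability(5, mttf_years)        # one disk, 5 years
r_fifty = reliability(5, mttf_years, 50)  # all 50 disks, 5 years

print(f"MTTF = {mttf_years:.0f} years, AFR = {afr:.2%}")
print(f"R(5) = {r_one:.2f}, 50 disks = {r_fifty:.2f}")
```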
  • 35. Experiments • Simulate 100 disks with an MTTF of 200 years using Processing. What happens if the AFR is not 0.4% but 4% (hint: what is the MTTF in that case)? • Given an MTTF of 200 years and 50 disks, what is the reliability after 1, 2 and 5 years?
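The first experiment asks for a simulation in Processing; the same idea can be sketched in Python. This is only an illustration under stated assumptions: lifetimes are drawn from an exponential distribution with mean MTTF (constant failure rate), the MTTF of 200 is taken to be in years, and `simulate_failures` and its fixed seed are hypothetical choices, not part of the bootcamp material.

```python
import math
import random


def simulate_failures(n_disks, mttf_years, horizon_years, seed=2011):
    """Count disks whose simulated lifetime ends before horizon_years."""
    rng = random.Random(seed)
    # expovariate takes the rate (1 / mean); each draw is one disk lifetime.
    lifetimes = [rng.expovariate(1.0 / mttf_years) for _ in range(n_disks)]
    return sum(1 for life in lifetimes if life < horizon_years)


for mttf_years in (200, 25):   # AFR about 0.5 % versus 4 % (MTTF = 1 / AFR)
    failed = simulate_failures(100, mttf_years, horizon_years=5)
    expected = 100 * (1 - math.exp(-5 / mttf_years))
    print(f"MTTF {mttf_years}: {failed} of 100 failed in 5 years "
          f"(expected about {expected:.1f})")
```

With only 100 disks the counts vary noticeably from seed to seed, which is part of the point of the exercise.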
  • 36. Experiments • Amazon S3 claims an AFR per object of 0.000000001% [1]. What is the MTTF? • There are 100 billion objects in S3. Given an estimated average size of 1 MB how big is S3? • What is the chance (reliability) none of these 100 billion objects are lost in 1 year? [1] http://aws.amazon.com/s3/faqs/#How_reliable_is_Amazon_S3#How_durable_is_Amazon_S3
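The S3 questions follow from the same formulas. A minimal sketch, assuming the quoted AFR is a constant per-object failure rate, objects fail independently, and 1 MB = 10^6 bytes; the figures are the ones quoted in the exercise, not measured values.

```python
import math

afr_per_object = 0.000000001 / 100   # 0.000000001 % as a fraction: 1e-11 per year
n_objects = 100e9                    # 100 billion objects
object_size_bytes = 1e6              # assumed: 1 MB = 10^6 bytes

mttf_years = 1 / afr_per_object                      # MTTF = 1 / AFR
total_petabytes = n_objects * object_size_bytes / 1e15
# Probability that none of the objects is lost within one year.
r_all = math.exp(-n_objects * afr_per_object * 1)

print(f"MTTF per object: {mttf_years:.0e} years")
print(f"Total size: {total_petabytes:.0f} PB")
print(f"R(1 year, all objects): {r_all:.2f}")
```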
  • 40. Figure (labels: 1 yr, 3-5 yr)
  • 42. Experiments • Take the lifetime of the universe (13 billion years) as the lifetime of one storage byte. What is the probability that one terabyte (1 trillion bytes) will survive 100 years? • Discuss
  • 49. Serial Failures 87 years 75 years 50 years 31 years
  • 50. Serial Failures: components A, B, C, D, ... in series. 1/SYSTEM = 1/A + 1/B + 1/C + 1/D + ... E.g. components: 1, 100, 1000, 10000; system: 0.989 years
  • 51. Parallel Failures = 200 years = ?? years
  • 52. Parallel Failures: components A, B, C, D in parallel. SYSTEM = A * B * C * D. E.g. components: 200, 200; system: 40000 years
  • 53. Composite Failures = ?? years
  • 54. Composite Failures (figure: = 40,000 years). 1/SYSTEM = 1/40,000 + 1/40,000, so SYSTEM = 20,000 years
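The serial, parallel and composite rules from the slides can be combined in a few lines. This is a sketch of the slides' simplified arithmetic, not a general reliability model: the serial rule adds failure rates (1/MTTF), and the parallel rule multiplies the MTTF values exactly as the slide does. The function names, and the reading of the composite example as two mirrored pairs of 200-year components placed in series, are assumptions.

```python
import math


def serial_mttf(components):
    """Serial chain: the system fails when any one component fails."""
    return 1 / sum(1 / c for c in components)


def parallel_mttf(components):
    """Parallel (redundant) set, using the slides' product rule."""
    return math.prod(components)


print(round(serial_mttf([1, 100, 1000, 10000]), 3))   # slide 50 example
mirror = parallel_mttf([200, 200])                    # slide 52 example
print(mirror)
print(round(serial_mttf([mirror, mirror])))           # slide 54 example
```

The serial result is dominated by the weakest component, which is why a chain containing a 1-year part ends up just under 1 year.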
  • 55. Experiments • Calculate the composite failure of the Tandem example (administration, software, hardware, environment) • How would you make this setup more reliable? Calculate the effect • What is the MTTF of a 5-way mirror of 7K3000 disks?
  • 58. Bit Errors 0110001011 0010101001 BER = Bit Error Rate = 3/10 = 0.3 = 30 %
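The bit error rate in this example is simply the fraction of positions at which the two bit strings differ. A minimal sketch, assuming the two strings on the slide are the written and the read-back pattern; `bit_error_rate` is an illustrative name.

```python
def bit_error_rate(sent, received):
    """Fraction of bit positions that differ between two equal-length strings."""
    if len(sent) != len(received):
        raise ValueError("bit strings must have the same length")
    errors = sum(1 for a, b in zip(sent, received) if a != b)
    return errors / len(sent)


print(bit_error_rate("0110001011", "0010101001"))
```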
  • 59. Bit Errors • Soft error - repeat the operation • Hard error - after some repeats data is lost • Typical disk BER = 10^-5 to 10^-6 (every 10 KB to 100 KB read)
  • 60. Bit Errors: hard error rate by drive type *)
      Consumer SATA: 10^-14 (10^14 bits =~ 10 TB)
      Enterprise SATA: 10^-15 (10^15 bits =~ 100 TB)
      Enterprise SAS: 10^-16 (10^16 bits =~ 1 PB)
      *) BERs are per bit (1 bit = 1/8 byte). 1 sector error for every 10 TB to 1 PB read
  • 61. Experiments • Collect a few sample documents from the web (images, documents, executables, etc.); flip one or more random bits; explain the resulting effect • Use the visual defects experiment to measure the effect of flipping bits on image files with various compressions • Open and save an image file. Measure the visual effects. • Calculate the checksum of the files and repeat the experiments. Check results.
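The bit-flipping and checksum experiments can be tried on any byte string before moving on to real files. A minimal sketch, assuming SHA-256 as the checksum (the slides do not name an algorithm); `flip_random_bit`, the sample text and the seed are illustrative.

```python
import hashlib
import random


def flip_random_bit(data, rng):
    """Return a copy of data with exactly one randomly chosen bit inverted."""
    damaged = bytearray(data)
    position = rng.randrange(len(damaged) * 8)
    damaged[position // 8] ^= 1 << (position % 8)
    return bytes(damaged)


original = b"In Case of Failure, ELAG 2011 Prague"
damaged = flip_random_bit(original, random.Random(61))

print(original)
print(damaged)
# A single flipped bit changes the checksum, so the damage is detectable.
print(hashlib.sha256(original).hexdigest())
print(hashlib.sha256(damaged).hexdigest())
```

To repeat this on a file, read it with `open(path, "rb").read()` and write the damaged bytes to a copy, never to the original.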
  • 63. File Formats • The goal of digital preservation is not preserving the bits and bytes but the means to access and use the information represented by them.
  • 64. File Formats Software Bits Information + Environment
  • 65. File Formats hypothetical 3-bit format 110110010111010 Width = bit [1 .. 3] Height = bit [4 .. 6] Data = bit [7 .. 15]
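The hypothetical format shows that the bits alone are not enough: a reader also has to know how the fields are encoded. The sketch below decodes the example under one loudly stated assumption, namely that each 3-bit field stores its least significant bit first. The slide does not specify the bit order; this reading is chosen because it gives width x height = 3 x 3 = 9, matching the 9 data bits, whereas most-significant-bit-first would give 6 x 6 = 36. `parse_three_bit_format` is an illustrative name.

```python
def parse_three_bit_format(bits):
    """Split the bit string into width, height and rows of pixel data."""
    # Assumption: each 3-bit field stores its least significant bit first.
    width = int(bits[0:3][::-1], 2)
    height = int(bits[3:6][::-1], 2)
    data = bits[6:]
    if len(data) != width * height:
        raise ValueError("data length does not match width x height")
    rows = [data[r * width:(r + 1) * width] for r in range(height)]
    return width, height, rows


width, height, rows = parse_three_bit_format("110110010111010")
print(width, height)
for row in rows:
    print(row.replace("0", ".").replace("1", "#"))
```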
  • 66. File Formats With software you have only two options: 1. The software works and is maintained 2. The software doesn’t work and is not maintained
  • 67. File Formats 1. The software works and is maintained • Your designated community has the software tools • Your archive has the software tools • In both cases you need to provide information which software you need and the steps required to get access to the data
  • 68. File Formats 2. The software doesn’t work and is not maintained • Archive the source code of the original software • Emulate the original software
  • 69. Experiments • Experiment with different text encoding demo files to discover the bit content of these files. • Use DROID and JHOVE to characterize and validate the demo files. • Invalidate the files using truncation and bit errors. Check the results. • Use migration and emulation to get access to the demo.wp file.
  • 71. Metadata • Descriptive Metadata • Administrative Metadata • Structural Metadata • Rights Metadata • Representation Metadata
  • 72. Packaging • Digital objects are composite structures • Need to be described, validated and accessed as a whole • Complex Objects
  • 73. Package Formats • METS • MPEG-21/DIDL • LOM/IMS • BagIt • TIPR RXP
  • 74. BagIt • Library of Congress & California Digital Library • NDIIPP • Generic Format
  • 75. BagIt
  • 76. Experiments • Create a bag using the Bagger toolkit. Add Dublin Core descriptive metadata. • Save the bag as a ZIP file and deposit it to the demo archive. • As archivist, access the deposit and validate its contents.
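To see what the archivist's validation step checks, it helps to build a tiny bag by hand. The sketch below is a stripped-down illustration of the BagIt layout (a `bagit.txt` declaration, a `data/` payload directory and a checksum manifest), not a replacement for Bagger and not a full implementation of the specification; it skips the Dublin Core metadata and the ZIP step. `make_bag`, `validate_bag`, the version string 0.97 and the file names are assumptions chosen for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def make_bag(bag_dir, payload):
    """Write payload files under data/ plus bagit.txt and an MD5 manifest."""
    bag = Path(bag_dir)
    (bag / "data").mkdir(parents=True)
    (bag / "bagit.txt").write_text(
        "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n")
    lines = []
    for name, content in payload.items():
        (bag / "data" / name).write_bytes(content)
        lines.append(f"{hashlib.md5(content).hexdigest()}  data/{name}\n")
    (bag / "manifest-md5.txt").write_text("".join(lines))


def validate_bag(bag_dir):
    """Return True if every payload file matches its manifest checksum."""
    bag = Path(bag_dir)
    for line in (bag / "manifest-md5.txt").read_text().splitlines():
        checksum, path = line.split(maxsplit=1)
        if hashlib.md5((bag / path).read_bytes()).hexdigest() != checksum:
            return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    bag_dir = Path(tmp) / "demo-bag"
    make_bag(bag_dir, {"slides.txt": b"In Case of Failure"})
    print(validate_bag(bag_dir))            # intact bag
    (bag_dir / "data" / "slides.txt").write_bytes(b"In Case of Failurf")
    print(validate_bag(bag_dir))            # payload changed after bagging
```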