Electronic Records Transfer
         Guidance
          at NARA
      Kevin De Vorsey
  kevin.devorsey@nara.gov
Current Guidance (Developed Between 2002-2004)




Reflects NARA’s capabilities at the time.
                 National Archives and Records Administration
Current Guidance: limited scope that does not
address all format types




Current Guidance Products
 Demonstrate a preference for open, standards-based formats.
 Require that agencies transform or normalize data into acceptable
  formats prior to transfer*.
 Have proven an obstacle to the steady transfer of records.
 Are referred to at many points of the lifecycle.
 Cannot adapt to records with different retention periods.

Project Scope
 In scope:
    This project seeks to support the work of federal agencies by
     providing flexible and realistic electronic records file format
     guidance on all electronic records types for use when transferring
     permanent records to NARA in accordance with the Federal
     Records Act.
    This project will identify and recommend changes, but producing
     any additional guidance (including guidance for all types of
     metadata), revising business processes, and developing standard
     operating procedures are beyond the scope of this project.

 Out of scope:
   Format guidance for areas of the record lifecycle other than
    transfer to NARA
   Guidance on physical media
   Records of the Executive Office of the President and the records
    of the United States Congress. These bodies are not covered
    by the Federal Records Act and are therefore excluded from
    consideration for this project.
Project Phases
 Phase 1: Planning and Preparation
   July 1 – December 30, 2011
 Phase 2: Conduct Informational Meetings
   February 6 – August 12, 2012
   Internal NARA SMEs
   Future Perfect Conference
   Agency representatives
 Phase 3: Develop and Publish Guidance Product
   May 29 – September 7, 2012
 Phase 4: Evaluation and Completion
   December 12 – December 21, 2012
Electronic Records Lifecycle
[Diagram. Labels shown: System design/planning; Record Creation; Scheduling/Appraisal; Maintenance; Migration; Decommissioning; Transfer Planning; Destruction Planning; Transfer; Processing; Transformation; Ingest; Preservation & Access Planning; Preservation/Maintenance; Access; Regular Requests (FOIA, etc.); Public]
Revised Guidance Content Categories

  Electronic Textual Records
  Digital Still Images
  Digital Audio Records
  Digital Moving Image Records
  Structured Data
  Geospatial Records
  CAD and Vector Graphics
  Web Records
  E-mail Records


Relevant Content Categories Definitions
     Structured Data – the broad category of data that is stored in defined fields. It
      includes:
        Databases – Database formats are organized collections of associated data that conform
          to a logical structure. They are determined by “data models” that describe the
          specific data structures used to model an application; these generally include
          navigational, relational, and hybrid models.
        Spreadsheets – Spreadsheets are electronic simulations of paper accounting
          worksheets for financial plans, budgets, etc. Personal-computer and server-based
          spreadsheet programs [e.g. Microsoft Excel, Lotus 1-2-3, OpenOffice Calc] can create
          both proprietary files and software-independent files, including text or XML.
          Cloud-based spreadsheets [e.g. Google Documents] offer export options such as .xls,
          .csv, .txt, .ods, PDF, and HTML, as well as import and conversion options for common
          spreadsheet formats including .xls, .csv, and .ods.
        Statistical Data – Statistical data is the result of scientific quantitative research and
          analyses. Statistical data formats contain collections of data presented in both tabular
          and non-tabular form. Datasets are formatted either as strings of characters within a
          markup language [e.g. XML] or as software-dependent proprietary files produced by
          commercial statistical and qualitative data analysis tools [e.g. SAS and SPSS].
        Scientific Data – Scientific data is research data collected by instruments during the
          scientific process. Scientific data formats are either domain-specific, used
          within a single field of study [e.g. Flexible Image Transport System (FITS)], or
          multi-domain, useful for transferring scientific data between domains [e.g. Common Data
          Format (CDF), HDF5].
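
To illustrate the distinction drawn above between proprietary and software-independent files, here is a minimal Python sketch that exports one database table to CSV text. It is an illustration only, not a NARA procedure: the function name export_table_to_csv, the budget table, and its sample rows are all hypothetical, and SQLite stands in for whatever database system an agency actually uses.

```python
import csv
import io
import sqlite3


def export_table_to_csv(conn: sqlite3.Connection, table: str) -> str:
    """Return the rows of `table` as CSV text with a header row."""
    # Check the name against the schema rather than trusting the caller's string.
    known = {row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")}
    if table not in known:
        raise ValueError(f"unknown table: {table}")

    cursor = conn.execute(f'SELECT * FROM "{table}"')
    header = [column[0] for column in cursor.description]

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)   # column names travel with the data
    writer.writerows(cursor)  # one CSV line per database row
    return buffer.getvalue()


# Hypothetical sample data, used only to exercise the function.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE budget (item TEXT, amount INTEGER)")
conn.executemany("INSERT INTO budget VALUES (?, ?)",
                 [("travel", 1200), ("supplies", 300)])
csv_text = export_table_to_csv(conn, "budget")
print(csv_text)
```

A flat export like this keeps the values and column names but drops data types, keys, and relationships between tables, which is one reason the question of what additional information should accompany the data matters.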
Relevant Content Categories Definitions
  Geospatial – Geospatial data includes files created by
   geographic information systems (GIS) or other software
   applications for computer-based spatial analysis.
   The data may be contained within a database to enable
   analysis across datasets (e.g. geo-database), united
   within a complex file format structure in which one geospatial
   file is composed of several distinct but related formats
   (e.g. shapefile), or contained within a single file (e.g.
   GML).
  Computer Aided Design (CAD) and Vector Graphics –
   Non-raster vector graphics formats use mathematical
   expressions to create and manipulate computer graphics
   and animations. Computer Aided Design (CAD) programs are
   vector programs used in engineering and manufacturing
   design to create animations and represent the three-dimensional
   surfaces of inanimate objects. CAD and
   vector graphics programs can output binary and XML
Record Categories Held in Systems
[Diagram. Labels shown: Geospatial (Geospatial Data System Records); CAD/CAM (CAD/CAM System Generated Records); Database (Database System Generated Records); NARA/ERA]
Considerations*
 What part(s) of the system represents the record?
 Do we want to bring in the entire system?
 Could ERA cope with the formats, file size, and/or
  volume of files?
 If we only want a subset or can only accept an export
  then what is the “best” format for the electronic record
  type in question?
 What additional information should accompany the
  data?
 How should we validate and verify this data?


*These influence the transfer guidance, but changes to existing work
  processes are outside the scope of this project.
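
One common way to approach the validation and verification question above is a fixity check: record a checksum for every file before transfer and recompute it on receipt. The Python sketch below shows the idea under stated assumptions. The function names, the in-memory manifest layout, and the choice of SHA-256 are illustrative and are not taken from NARA or ERA documentation.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large transfers need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*")) if path.is_file()
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return relative paths that are missing or whose checksum differs."""
    problems = []
    for name, expected in manifest.items():
        target = root / name
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(name)
    return problems


# Hypothetical transfer directory with one sample file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    sample = root / "records.csv"
    sample.write_text("item,amount\ntravel,1200\n", encoding="utf-8")
    manifest = build_manifest(root)
    clean = verify_manifest(root, manifest)    # nothing changed yet
    sample.write_text("item,amount\ntravel,9999\n", encoding="utf-8")
    damaged = verify_manifest(root, manifest)  # altered file is reported
```

This sketch detects missing or altered files only. It does not flag extra files that are absent from the manifest, and it says nothing about whether a file is a valid instance of its format.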
Goals
  Provide clear, concise, and consistent direction to
   agencies regarding formats that are acceptable
   for use when transferring records to NARA.
  Develop a flexible and extensible framework that
   can adapt to future needs.
  Balance preference for open formats with the
   business needs of agencies and NARA.
  Support digital continuity across the lifecycle of
   electronic records.



Stress Sustainability*
     Disclosure: the degree to which complete specifications and tools for
      validating technical integrity exist.
     Adoption: the degree to which the format is used by
      creators, disseminators, or users.
     Transparency: the degree to which the digital representation is open to
      direct analysis with basic tools, including human readability using a
      text-only editor.
     Self-documentation: the degree to which a format contains the metadata
      needed to render the data as usable information.
     External dependencies: the degree to which a format
      depends on particular hardware, operating system, or software for
      rendering or use.
     Impact of patents: Patents related to a digital format may inhibit the
      ability of archival institutions to sustain content in that format.
     Technical protection mechanisms: To preserve digital content and
      provide service to users and designated communities decades
      hence, NARA must be able to replicate the content on new
      media, migrate and normalize it in the face of changing technology, and
      disseminate it to researchers.

  *adapted from http://www.digitalpreservation.gov/formats/
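
The transparency factor mentions human readability in a text-only editor. As a rough illustration, the Python sketch below tests whether a file's bytes decode as UTF-8 and consist almost entirely of printable characters. The function name, the 0.95 threshold, and the two sample byte strings are assumptions made for this example. The check is a crude proxy, not an established measure of transparency.

```python
def looks_like_plain_text(data: bytes, threshold: float = 0.95) -> bool:
    """Crude proxy for transparency: does the content read as text?"""
    if not data:
        return False  # nothing to inspect
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return False  # not valid UTF-8, so not readable as-is in a text editor
    # Count printable characters plus ordinary line breaks and tabs.
    readable = sum(ch.isprintable() or ch in "\n\r\t" for ch in text)
    return readable / len(text) >= threshold


# Hypothetical samples: a small XML record and some arbitrary binary bytes.
xml_sample = b'<?xml version="1.0"?><record><title>Budget</title></record>'
binary_sample = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A, 0x00, 0xFF])

print(looks_like_plain_text(xml_sample))
print(looks_like_plain_text(binary_sample))
```

A file can pass this check and still be hard to interpret without documentation, so the result speaks to transparency only, not to self-documentation or the other factors.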
Recognize That Formats …

   Influence usability
   Affect behavior and performance
   Influence NARA’s capability to preserve
   complex records like databases, video, and
   GIS




Concluding Thoughts
 NARA should:
  expand the types of formats that NARA accepts
  balance the business requirements of agencies with
   NARA’s preservation and access needs
  minimize the need for agencies to transform records
   prior to transfer
  develop guidance across the lifecycle of electronic
   records to support digital continuity




Thank You!

               Kevin L. De Vorsey
Supervisory Electronic Records Format Specialist
       Electronic Records Format Section
   Policy Analysis and Enforcement Division
       Office of the Chief Records Officer
                 Agency Services
  National Archives and Records Administration
       kevin.devorsey@nara.gov
