GRANT AGREEMENT: 601138 | SCHEME FP7 ICT 2011.4.3
Promoting and Enhancing Reuse of Information throughout the Content Lifecycle taking account of Evolving Semantics [Digital
Preservation]
“This project has received funding from the European Union’s Seventh Framework
Programme for research, technological development and demonstration under
grant agreement no. 601138”.
Information Packaging Techniques
An overview of methods and standards
Anna-Grit Eggers (University of Goettingen)
Information Packaging Techniques
a) Simple Archive Container Formats
b) Structured Packaging for Archiving
c) Metadata Schemes
Contents
Information Packaging
Simple Archive
Container Formats
● Sole purpose: packing files together.
● File containers are often combined with a compression
option to reduce the needed disk space for storage.
● All files in the containers are stored equally as
payload, and the containers have to be unzipped to
fully access the packed files.
Simple Archive Container Formats
● The ZIP format was introduced by PKWARE in 1989:
https://www.pkware.com/support/zip-app-note
◦ Public domain
◦ Supports various compression algorithms
◦ Each file in the archive container is compressed individually => possible to unzip single
files from the container
◦ Option to aggregate files without using compression
◦ Preserves the original file paths and offers optional encryption
ZIP
◦ Splitting: archive containers can be divided into several smaller parts
◦ Flexibility: add or extract single files from a zip archive without
touching the other stored files
• + advantage: possibility to frequently change packages
• - disadvantage: causes overhead in the form of an additional file list
which is stored together with the content.
◦ Loss prevention: ZIP stores a CRC-32 checksum for each file, so corruption
can be detected. If one file becomes corrupt, the other files usually
remain accessible.
ZIP (cont.)
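The ZIP properties listed above (per-file compression, single-file access, appending, CRC checks) can be tried out with a minimal sketch using Python's standard `zipfile` module. The file names and helper function names are hypothetical and chosen only for illustration; the archive is kept in memory so nothing is written to disk.

```python
import io
import zipfile
from typing import Dict, Optional


def build_zip(files: Dict[str, bytes]) -> bytes:
    """Pack the given files into an in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path, content in files.items():
            # Each member is compressed on its own and keeps its path.
            archive.writestr(path, content)
    return buffer.getvalue()


def read_member(zip_bytes: bytes, path: str) -> bytes:
    """Read one member without unpacking the others."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return archive.read(path)


def append_member(zip_bytes: bytes, path: str, content: bytes) -> bytes:
    """Add one member; existing members are left as they are."""
    buffer = io.BytesIO(zip_bytes)
    with zipfile.ZipFile(buffer, "a", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(path, content)
    return buffer.getvalue()


def first_corrupt_member(zip_bytes: bytes) -> Optional[str]:
    """Return the name of the first member failing its CRC-32 check, else None."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return archive.testzip()


if __name__ == "__main__":
    package = build_zip({"docs/readme.txt": b"hello", "img/logo.bin": b"\x00\x01"})
    package = append_member(package, "notes/new.txt", b"added later")
    print(read_member(package, "docs/readme.txt"))
    print(first_corrupt_member(package))
```

The file list that makes single-file access possible is the overhead mentioned on the slide: it is rewritten whenever a member is appended.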
● A widespread container format in UNIX (ustar, pax), LINUX (GNU tar), and BSD
(bsdtar) environments
● Can be enabled on Windows Operating Systems for example by software
libraries such as LibArchive
● Writes files sequentially into one file, called ‘tarball’
● Was originally used for tape drives
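● TAR is usually combined with a compression algorithm like gzip or bzip2
● In contrast to ZIP, TAR has no central file index: extracting a single file means reading the archive sequentially, and a compressed tarball has to be decompressed up to that file.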
TAR
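A minimal sketch of the sequential nature of a tarball, using Python's standard `tarfile` module. Because the archive has no central index, the helper below walks through the members in order until it reaches the requested one. File names and function names are hypothetical, and gzip is just one possible compression choice.

```python
import io
import tarfile
from typing import Dict


def build_tarball(files: Dict[str, bytes]) -> bytes:
    """Write the given files sequentially into a gzip-compressed tarball."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for path, content in files.items():
            info = tarfile.TarInfo(name=path)
            info.size = len(content)
            tar.addfile(info, io.BytesIO(content))
    return buffer.getvalue()


def read_member(tar_bytes: bytes, path: str) -> bytes:
    """Scan the tarball from the start until the requested member is found."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:gz") as tar:
        for member in tar:  # members are visited in storage order
            if member.name == path and member.isfile():
                extracted = tar.extractfile(member)
                return extracted.read()
    raise KeyError(path)


if __name__ == "__main__":
    tarball = build_tarball({"a.txt": b"alpha", "dir/b.txt": b"beta"})
    print(read_member(tarball, "dir/b.txt"))
```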
Information Packaging
Structured Packaging for
Archiving
Creation of an information container in which the packed information can be
stored in a well-defined and structured way.
Structured Packaging for Archiving
● A standard for storing files and their metadata in a well-defined directory structure
● Developed by the California Digital Library Digital Preservation Group and the Library of
Congress
● Often used for preservation purposes, e.g. by Tate (UK).
● Data files are stored in a data directory
● Their checksums are saved in a manifest file
● The metadata, or tags, are listed together with their checksums in a tag-manifest file.
● A further file, bagit.txt, stores the BagIt version and the character encoding of the tag files.
● BagIt is often combined with a simple archiving format, such as TAR or ZIP, for the
serialisation of the bag directory, or used only as a directory structure technique for
sensitive content.
● See: http://www.cdlib.org/cdlinfo/2008/07/02/bagit-transferring-digital-content/
BagIt
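The bag layout described above can be sketched with the Python standard library alone. This is an illustration under assumptions, not a complete BagIt implementation: it assumes the file names and line formats of BagIt version 1.0 (`bagit.txt`, `manifest-sha256.txt`, `tagmanifest-sha256.txt`), uses SHA-256 as the checksum algorithm, and the helper names and payload are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file's content."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def make_bag(bag_dir: Path, payload: Dict[str, bytes]) -> None:
    """Create a bag: payload under data/, plus declaration and manifests."""
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    for rel_path, content in payload.items():
        target = data_dir / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(content)

    # Bag declaration: BagIt version and encoding of the tag files.
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n", encoding="utf-8"
    )

    # Payload manifest: one "<checksum>  <path>" line per data file.
    data_files = sorted(p for p in data_dir.rglob("*") if p.is_file())
    lines = [f"{sha256_of(p)}  {p.relative_to(bag_dir).as_posix()}" for p in data_files]
    (bag_dir / "manifest-sha256.txt").write_text("\n".join(lines) + "\n", encoding="utf-8")

    # Tag manifest: checksums of the tag files themselves.
    tag_lines = [f"{sha256_of(bag_dir / name)}  {name}"
                 for name in ("bagit.txt", "manifest-sha256.txt")]
    (bag_dir / "tagmanifest-sha256.txt").write_text(
        "\n".join(tag_lines) + "\n", encoding="utf-8"
    )


def verify_bag(bag_dir: Path) -> bool:
    """Recompute payload checksums and compare them with the manifest."""
    manifest = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        digest, rel_path = line.split(maxsplit=1)
        if sha256_of(bag_dir / rel_path) != digest:
            return False
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        bag = Path(tmp) / "example_bag"
        make_bag(bag, {"report.txt": b"hello", "img/scan.bin": b"\x00\x01"})
        print(sorted(p.name for p in bag.iterdir()))
        print(verify_bag(bag))
```

The resulting directory can then be serialised with TAR or ZIP if a single file is needed for transfer.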
● Container files, which contain file aggregations serving a specific purpose
● Often used to store all files belonging to a video, and to group them as a single
self-describing video file.
● Popular examples for video containers are AVI and Ogg Media.
Compound Documents
● The source code of a computer program is often stored together with other project-related
resources, such as images, in a package with a well-defined directory structure.
● Structured source code packages are often executable (=> run the computer program).
● Examples: Java’s JARs, Ruby Gems and Python Eggs.
◦ The JAR format is derived from the ZIP format.
◦ A JAR can be seen as a compound document similar to the video containers,
because the Java program which is represented by the JAR can be
executed by running the JAR.
◦ It contains a well-defined path structure and an optional manifest file,
which can be regarded as a metadata file.
◦ The boundary to the following category, metadata schemes, is therefore fluid.
Structured Source Code Packages
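Since the JAR format is derived from ZIP, a JAR-like package can be inspected with any ZIP library. The sketch below uses Python's `zipfile` to write an archive with a `META-INF/MANIFEST.MF` entry and to read the manifest back as key/value metadata. The class name `example.Main` and the placeholder class bytes are hypothetical, so the result illustrates the layout only and is not a runnable Java program. The simple parser also ignores manifest continuation lines.

```python
import io
import zipfile
from typing import Dict

MANIFEST_PATH = "META-INF/MANIFEST.MF"
# Hypothetical manifest content for illustration.
MANIFEST_TEXT = "Manifest-Version: 1.0\r\nMain-Class: example.Main\r\n\r\n"


def build_jar_like(entries: Dict[str, bytes]) -> bytes:
    """Write a ZIP archive that follows the JAR path conventions."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as jar:
        jar.writestr(MANIFEST_PATH, MANIFEST_TEXT)
        for path, content in entries.items():
            jar.writestr(path, content)
    return buffer.getvalue()


def read_manifest(jar_bytes: bytes) -> Dict[str, str]:
    """Return the manifest's main attributes as a dictionary."""
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        text = jar.read(MANIFEST_PATH).decode("utf-8")
    attributes = {}
    for line in text.splitlines():
        if ": " in line:
            key, value = line.split(": ", 1)
            attributes[key] = value
    return attributes


if __name__ == "__main__":
    jar_bytes = build_jar_like({"example/Main.class": b"placeholder bytes"})
    print(read_manifest(jar_bytes))
```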
Information Packaging
Metadata schemes
● Mostly used in combination with packaging
● But can also be kept alongside the described content and linked to it
● Or embedded with the content
● XML is the format most commonly used to define a scheme for a particular domain of use.
Metadata schemes
● METS stands for Metadata Encoding and Transmission Standard; it is maintained by the METS
Editorial Board
● It provides an XML schema for encoding different types of metadata
● It simplifies the administration and exchange of digital objects between data collections.
● A METS file serves as a hub file that links the digital object with all files belonging to it
and with the metadata to create a digital entity.
● A METS XML-file consists of:
◦ Header: Contains metadata of the METS file itself, like the creation date and the authors.
◦ Descriptive metadata: Embeds descriptive metadata or provides links to external metadata documents.
◦ Administrative metadata: Stores the data concerning storage, rights and creation.
◦ File section: Manages a list of all files belonging to the digital object (DO).
◦ Structural map: Describes the inner structure of the DO and provides the linkage between data and metadata.
◦ Structural links: Provides hyperlinks and is useful for the archiving of websites.
◦ Behaviour: Associates executable behaviours with the content of the DO.
● See: http://www.loc.gov/standards/mets/
METS
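How the sections of a METS file hang together can be sketched with Python's `xml.etree.ElementTree`. The element names (`metsHdr`, `dmdSec`, `amdSec`, `fileSec`, `structMap`) and the namespace come from the METS schema; the identifiers, date, agent name and URLs are hypothetical, the optional structural links and behaviour sections are left out, and the output is not validated against the schema.

```python
import xml.etree.ElementTree as ET
from typing import Dict

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def mets(tag: str) -> str:
    """Qualified name of an element in the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def build_mets(files: Dict[str, str]) -> ET.Element:
    """Build a skeleton METS document for the given file IDs and locations."""
    href = f"{{{XLINK_NS}}}href"
    root = ET.Element(mets("mets"))

    # Header: metadata about the METS file itself.
    header = ET.SubElement(root, mets("metsHdr"), {"CREATEDATE": "2015-01-01T00:00:00"})
    agent = ET.SubElement(header, mets("agent"), {"ROLE": "CREATOR"})
    ET.SubElement(agent, mets("name")).text = "Example Archive"

    # Descriptive metadata: here a link to an external document.
    dmd = ET.SubElement(root, mets("dmdSec"), {"ID": "DMD1"})
    ET.SubElement(dmd, mets("mdRef"),
                  {"LOCTYPE": "URL", "MDTYPE": "DC", href: "https://example.org/dc.xml"})

    # Administrative metadata: left empty in this sketch.
    ET.SubElement(root, mets("amdSec"), {"ID": "AMD1"})

    # File section and structural map, linked through file IDs.
    group = ET.SubElement(ET.SubElement(root, mets("fileSec")), mets("fileGrp"))
    struct_map = ET.SubElement(root, mets("structMap"))
    div = ET.SubElement(struct_map, mets("div"), {"TYPE": "object", "DMDID": "DMD1"})
    for file_id, location in files.items():
        file_el = ET.SubElement(group, mets("file"), {"ID": file_id})
        ET.SubElement(file_el, mets("FLocat"), {"LOCTYPE": "URL", href: location})
        ET.SubElement(div, mets("fptr"), {"FILEID": file_id})
    return root


if __name__ == "__main__":
    document = build_mets({"FILE1": "data/report.pdf", "FILE2": "data/table.csv"})
    print(ET.tostring(document, encoding="unicode"))
```

The `DMDID` and `FILEID` attributes are what turns the structural map into the link between data and metadata.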
● ORE (Object Reuse and Exchange) is a standard by the Open Archives Initiative (OAI).
● It implements two new types of resources: Aggregations and Resource Maps.
● An Aggregation is a representation of a set of associated web resources.
◦ It is an abstract resource in the Semantic Web sense and hence has no representation of its own.
● A Resource Map belongs to an Aggregation.
◦ It holds a machine-readable description of the Aggregation and a list of associated resources. In
addition, it describes the relationships and properties relevant to all resources and has some
metadata for itself.
● Both resources are addressed by an HTTP URI in the Web.
◦ Aggregations can be used by applications to visualise all associated resources processing them
as a collection.
◦ This simplifies the exchange and archiving of resource sets.
◦ Resource Maps can be serialised in various formats: Atom XML, RDF/XML, and RDFa.
● See: http://www.openarchives.org/ore/
OAI-ORE
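The relationship between a Resource Map and its Aggregation can be written down as a handful of RDF triples. The sketch below uses the ORE vocabulary terms `ore:ResourceMap`, `ore:Aggregation`, `ore:describes` and `ore:aggregates`; all `example.org` URIs are hypothetical. For readability it prints N-Triples, which is a choice made for this illustration and not one of the serialisations named on the slide.

```python
from typing import List, Tuple

ORE = "http://www.openarchives.org/ore/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

Triple = Tuple[str, str, str]


def resource_map_triples(rem_uri: str, aggregation_uri: str,
                         resource_uris: List[str]) -> List[Triple]:
    """Describe an Aggregation and its resources from the Resource Map's view."""
    triples = [
        (rem_uri, RDF_TYPE, ORE + "ResourceMap"),
        (rem_uri, ORE + "describes", aggregation_uri),
        (aggregation_uri, RDF_TYPE, ORE + "Aggregation"),
    ]
    # One ore:aggregates statement per associated web resource.
    triples += [(aggregation_uri, ORE + "aggregates", uri) for uri in resource_uris]
    return triples


def to_ntriples(triples: List[Triple]) -> str:
    """Serialise URI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples) + "\n"


if __name__ == "__main__":
    statements = resource_map_triples(
        "https://example.org/rem",
        "https://example.org/aggregation",
        ["https://example.org/paper.pdf", "https://example.org/data.csv"],
    )
    print(to_ntriples(statements))
```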
● Developed by the PREservation Metadata: Implementation Strategies
(PREMIS) working group; maintained by the PREMIS Editorial Committee, hosted by the Library of Congress
● It supports the preservation and long-term usability of digital objects and
their metadata
● The Data Dictionary is a specification for metadata handling in digital
archiving systems.
● The data model provides five entities: intellectual entity, object, event, agent and
rights.
● See: https://www.era.lib.ed.ac.uk/bitstream/handle/1842/3339/Higgins PREMIS_V-2-1-2009-03.pdf?sequence=1&isAllowed=y
PREMIS Data Dictionary
● Used to describe and bundle research data in a way that supports citation and
sharing in a machine-readable fashion.
● The initiative includes a number of techniques that have a set of principles in
common:
◦ Identity
◦ Aggregation
◦ Annotation
● The metadata is described in the RO ontology.
● Bundling can be done using different techniques, including the RO Bundle format and
BagIt.
● See: http://www.researchobject.org/
Research Object (RO)
● The Long-term preservation Metadata for Electronic Resources project provides an XML schema
particularly for long-term preservation purposes, based on the preservation implementation
schema by the National Library of New Zealand.
● The schema was developed by the DNB (Deutsche Nationalbibliothek) as a schema for technical
metadata.
● It is used, in combination with METS, for defining the packaging format UOF.
● It is designed for cooperating with standard exchange formats, and can be integrated in METS.
● The LMER-schema consists of the following sections:
◦ Object: The object with a URN as persistent identifier.
◦ Process: Protocol of technical changes.
◦ Metadata: Metadata for each file that belongs to the digital object.
◦ Metadata modifications: Protocol of changes of the metadata.
● See: http://www.dnb.de/DE/Standardisierung/LMER/lmer_node.html
LMER
● Timothy DiLauro and Jonathan Petters introduced the Data Conservancy
Package Tool at the International Digital Curation Conference (IDCC) 2015
(http://www.dcc.ac.uk/sites/default/files/documents/IDCC15/196.pdf).
● The tool facilitates the creation of packages for research data objects in the
conservation domain
● It provides a user interface for the definition of packages.
● It focusses on curation activities.
● See: http://dataconservancy.org/wp-content/uploads/2014/10/DCSDOCPKG-PackageToolsDocumentationHome-Full.pdf
The Data Conservancy Package Tool

Contenu connexe

Tendances

Organic.Edunet Repository Tools
Organic.Edunet Repository ToolsOrganic.Edunet Repository Tools
Organic.Edunet Repository Tools
Hannes Ebner
 
Nobel Prizes as Linked Open Data
Nobel Prizes as Linked Open DataNobel Prizes as Linked Open Data
Nobel Prizes as Linked Open Data
MetaSolutions AB
 
Dublin Core Metadata Initiatives
Dublin Core Metadata InitiativesDublin Core Metadata Initiatives
Dublin Core Metadata Initiatives
Shriram Pandey
 
IFLA 2012 - OCLC Linked Data round table
IFLA 2012 - OCLC Linked Data round tableIFLA 2012 - OCLC Linked Data round table
IFLA 2012 - OCLC Linked Data round table
Figoblog
 
Ckan tutorial odw2013 131109
Ckan tutorial odw2013 131109Ckan tutorial odw2013 131109
Ckan tutorial odw2013 131109
Chengjen Lee
 

Tendances (11)

Organic.Edunet Repository Tools
Organic.Edunet Repository ToolsOrganic.Edunet Repository Tools
Organic.Edunet Repository Tools
 
Nobel Prizes as Linked Open Data
Nobel Prizes as Linked Open DataNobel Prizes as Linked Open Data
Nobel Prizes as Linked Open Data
 
LDCache - a cache for linked data-driven web applications
LDCache - a cache for linked data-driven web applicationsLDCache - a cache for linked data-driven web applications
LDCache - a cache for linked data-driven web applications
 
Dublin Core Metadata Initiatives
Dublin Core Metadata InitiativesDublin Core Metadata Initiatives
Dublin Core Metadata Initiatives
 
Open data easy, explicit and fast
Open data easy, explicit and fastOpen data easy, explicit and fast
Open data easy, explicit and fast
 
IFLA 2012 - OCLC Linked Data round table
IFLA 2012 - OCLC Linked Data round tableIFLA 2012 - OCLC Linked Data round table
IFLA 2012 - OCLC Linked Data round table
 
DBpedia/association Introduction The Hague 12.2.2016
DBpedia/association Introduction The Hague 12.2.2016DBpedia/association Introduction The Hague 12.2.2016
DBpedia/association Introduction The Hague 12.2.2016
 
Networked digital library through harvesting
Networked digital library through harvestingNetworked digital library through harvesting
Networked digital library through harvesting
 
Retooling a Research Data Repository: data.depositar.io
Retooling a Research Data Repository: data.depositar.ioRetooling a Research Data Repository: data.depositar.io
Retooling a Research Data Repository: data.depositar.io
 
Alphabet soup: CDM, VRA, CCO, METS, MODS, RDF - Why Metadata Matters
Alphabet soup: CDM, VRA, CCO, METS, MODS, RDF - Why Metadata MattersAlphabet soup: CDM, VRA, CCO, METS, MODS, RDF - Why Metadata Matters
Alphabet soup: CDM, VRA, CCO, METS, MODS, RDF - Why Metadata Matters
 
Ckan tutorial odw2013 131109
Ckan tutorial odw2013 131109Ckan tutorial odw2013 131109
Ckan tutorial odw2013 131109
 

Similaire à PERICLES Information Packaging Techniques

Wed batsakis tut_challenges of preservations
Wed batsakis tut_challenges of preservationsWed batsakis tut_challenges of preservations
Wed batsakis tut_challenges of preservations
eswcsummerschool
 

Similaire à PERICLES Information Packaging Techniques (20)

The Oxford Common File Layout: A common approach to digital preservation
The Oxford Common File Layout: A common approach to digital preservationThe Oxford Common File Layout: A common approach to digital preservation
The Oxford Common File Layout: A common approach to digital preservation
 
Wed batsakis tut_challenges of preservations
Wed batsakis tut_challenges of preservationsWed batsakis tut_challenges of preservations
Wed batsakis tut_challenges of preservations
 
Archivematica and the digital archival chain of custody
Archivematica and the digital archival chain of custodyArchivematica and the digital archival chain of custody
Archivematica and the digital archival chain of custody
 
BatIg
BatIgBatIg
BatIg
 
The ECM world from the point of view of Alfresco - Linux Day 2013 - Rome
The ECM world from the point of view of Alfresco - Linux Day 2013 - RomeThe ECM world from the point of view of Alfresco - Linux Day 2013 - Rome
The ECM world from the point of view of Alfresco - Linux Day 2013 - Rome
 
A Strategy for Improving the Performance of Small Files in Openstack Swift
 A Strategy for Improving the Performance of Small Files in Openstack Swift  A Strategy for Improving the Performance of Small Files in Openstack Swift
A Strategy for Improving the Performance of Small Files in Openstack Swift
 
Presentation 16 may keynote karin bredenberg
Presentation 16 may keynote karin bredenbergPresentation 16 may keynote karin bredenberg
Presentation 16 may keynote karin bredenberg
 
Tdr Overview Pres Advocates
Tdr Overview Pres AdvocatesTdr Overview Pres Advocates
Tdr Overview Pres Advocates
 
Integrating an electronic lab notebook with a university it environment rdmf ...
Integrating an electronic lab notebook with a university it environment rdmf ...Integrating an electronic lab notebook with a university it environment rdmf ...
Integrating an electronic lab notebook with a university it environment rdmf ...
 
Oxford Common File Layout (OCFL)
Oxford Common File Layout (OCFL)Oxford Common File Layout (OCFL)
Oxford Common File Layout (OCFL)
 
Introducingthe anu datacommons
Introducingthe anu datacommonsIntroducingthe anu datacommons
Introducingthe anu datacommons
 
IRJET- Distributed Decentralized Data Storage using IPFS
IRJET- Distributed Decentralized Data Storage using IPFSIRJET- Distributed Decentralized Data Storage using IPFS
IRJET- Distributed Decentralized Data Storage using IPFS
 
Survey of clustered_parallel_file_systems_004_lanl.ppt
Survey of clustered_parallel_file_systems_004_lanl.pptSurvey of clustered_parallel_file_systems_004_lanl.ppt
Survey of clustered_parallel_file_systems_004_lanl.ppt
 
Repositories and digital preservation
Repositories and digital preservationRepositories and digital preservation
Repositories and digital preservation
 
ArchivesSpace-Archivematica-DSpace Workflow Integration
ArchivesSpace-Archivematica-DSpace Workflow IntegrationArchivesSpace-Archivematica-DSpace Workflow Integration
ArchivesSpace-Archivematica-DSpace Workflow Integration
 
OpenAIRE webinar: Principles of Research Data Management, with S. Venkatarama...
OpenAIRE webinar: Principles of Research Data Management, with S. Venkatarama...OpenAIRE webinar: Principles of Research Data Management, with S. Venkatarama...
OpenAIRE webinar: Principles of Research Data Management, with S. Venkatarama...
 
Archival Technologies
Archival TechnologiesArchival Technologies
Archival Technologies
 
HKU Data Curation MLIM7350 Student Project: Data Curation Workshop
HKU Data Curation MLIM7350 Student Project: Data Curation WorkshopHKU Data Curation MLIM7350 Student Project: Data Curation Workshop
HKU Data Curation MLIM7350 Student Project: Data Curation Workshop
 
Handout for Metadata for your Digital Collections
Handout for Metadata for your Digital CollectionsHandout for Metadata for your Digital Collections
Handout for Metadata for your Digital Collections
 
Reference Model for an Open Archival Information Systems (OAIS): Overview and...
Reference Model for an Open Archival Information Systems (OAIS): Overview and...Reference Model for an Open Archival Information Systems (OAIS): Overview and...
Reference Model for an Open Archival Information Systems (OAIS): Overview and...
 

Plus de PERICLES_FP7

Filling the Digital Preservation Gap - Acting on Change
Filling the Digital Preservation Gap - Acting on ChangeFilling the Digital Preservation Gap - Acting on Change
Filling the Digital Preservation Gap - Acting on Change
PERICLES_FP7
 
Capability gap - Preservation isn't just throwing tools at the problem - Acti...
Capability gap - Preservation isn't just throwing tools at the problem - Acti...Capability gap - Preservation isn't just throwing tools at the problem - Acti...
Capability gap - Preservation isn't just throwing tools at the problem - Acti...
PERICLES_FP7
 

Plus de PERICLES_FP7 (20)

Digital Ecosystem and Process Compiler - IDCC17
Digital Ecosystem and Process Compiler - IDCC17Digital Ecosystem and Process Compiler - IDCC17
Digital Ecosystem and Process Compiler - IDCC17
 
Technical Appraisal of Complex Digital Objects in Evolving Environments - IDC...
Technical Appraisal of Complex Digital Objects in Evolving Environments - IDC...Technical Appraisal of Complex Digital Objects in Evolving Environments - IDC...
Technical Appraisal of Complex Digital Objects in Evolving Environments - IDC...
 
Technical appraisal and change impact analysis - IDCC17 workshop
Technical appraisal and change impact analysis - IDCC17 workshopTechnical appraisal and change impact analysis - IDCC17 workshop
Technical appraisal and change impact analysis - IDCC17 workshop
 
ForgetIT: human memory inspired Information Model
ForgetIT: human memory inspired Information ModelForgetIT: human memory inspired Information Model
ForgetIT: human memory inspired Information Model
 
Data quality, preservation and access: a DANS perspective
Data quality, preservation and access: a DANS perspectiveData quality, preservation and access: a DANS perspective
Data quality, preservation and access: a DANS perspective
 
Proactive Evolution management in Data-centric SW ecosystems - Acting on Chan...
Proactive Evolution management in Data-centric SW ecosystems - Acting on Chan...Proactive Evolution management in Data-centric SW ecosystems - Acting on Chan...
Proactive Evolution management in Data-centric SW ecosystems - Acting on Chan...
 
Digital Preservation in the era of Big Data - The Diachron Platform - Acting ...
Digital Preservation in the era of Big Data - The Diachron Platform - Acting ...Digital Preservation in the era of Big Data - The Diachron Platform - Acting ...
Digital Preservation in the era of Big Data - The Diachron Platform - Acting ...
 
Detecting Semantic Drift for ontology maintenance - Acting on Change 2016
Detecting Semantic Drift for ontology maintenance - Acting on Change 2016Detecting Semantic Drift for ontology maintenance - Acting on Change 2016
Detecting Semantic Drift for ontology maintenance - Acting on Change 2016
 
Filling the Digital Preservation Gap - Acting on Change
Filling the Digital Preservation Gap - Acting on ChangeFilling the Digital Preservation Gap - Acting on Change
Filling the Digital Preservation Gap - Acting on Change
 
Risk assessment for preservation in the active life of complex digital object...
Risk assessment for preservation in the active life of complex digital object...Risk assessment for preservation in the active life of complex digital object...
Risk assessment for preservation in the active life of complex digital object...
 
Technical Appraisal Tool, MICE - Acting on Change 2016
Technical Appraisal Tool, MICE - Acting on Change 2016Technical Appraisal Tool, MICE - Acting on Change 2016
Technical Appraisal Tool, MICE - Acting on Change 2016
 
PERICLES Workflow for the automated updating of Digital Ecosystem Models with...
PERICLES Workflow for the automated updating of Digital Ecosystem Models with...PERICLES Workflow for the automated updating of Digital Ecosystem Models with...
PERICLES Workflow for the automated updating of Digital Ecosystem Models with...
 
Capability gap - Preservation isn't just throwing tools at the problem - Acti...
Capability gap - Preservation isn't just throwing tools at the problem - Acti...Capability gap - Preservation isn't just throwing tools at the problem - Acti...
Capability gap - Preservation isn't just throwing tools at the problem - Acti...
 
Automatic policy application and change management - Acting on Change 2016
Automatic policy application and change management - Acting on Change 2016Automatic policy application and change management - Acting on Change 2016
Automatic policy application and change management - Acting on Change 2016
 
Reproducibile scientific workflows - Acting on Change 2016
Reproducibile scientific workflows - Acting on Change 2016Reproducibile scientific workflows - Acting on Change 2016
Reproducibile scientific workflows - Acting on Change 2016
 
Pro-active solutions for higher reproducibility of scientific experiments - A...
Pro-active solutions for higher reproducibility of scientific experiments - A...Pro-active solutions for higher reproducibility of scientific experiments - A...
Pro-active solutions for higher reproducibility of scientific experiments - A...
 
PERICLES Policy management & ontology supported preservation - Acting on Chan...
PERICLES Policy management & ontology supported preservation - Acting on Chan...PERICLES Policy management & ontology supported preservation - Acting on Chan...
PERICLES Policy management & ontology supported preservation - Acting on Chan...
 
PERICLES Modelling Policies - Acting on Change 2016
PERICLES Modelling Policies - Acting on Change 2016PERICLES Modelling Policies - Acting on Change 2016
PERICLES Modelling Policies - Acting on Change 2016
 
PERICLES Ecosystem Modelling (NCDD use case) - Acting on Change 2016
PERICLES Ecosystem Modelling (NCDD use case) - Acting on Change 2016PERICLES Ecosystem Modelling (NCDD use case) - Acting on Change 2016
PERICLES Ecosystem Modelling (NCDD use case) - Acting on Change 2016
 
PERICLES Process Compiler - ‘Eye of the Storm: Preserving Digital Content in ...
PERICLES Process Compiler - ‘Eye of the Storm: Preserving Digital Content in ...PERICLES Process Compiler - ‘Eye of the Storm: Preserving Digital Content in ...
PERICLES Process Compiler - ‘Eye of the Storm: Preserving Digital Content in ...
 

Dernier

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Dernier (20)

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptx
 

PERICLES Information Packaging Techniques

  • 1. GRANT AGREEMENT: 601138 | SCHEME FP7 ICT 2011.4.3 Promoting and Enhancing Reuse of Information throughout the Content Lifecycle taking account of Evolving Semantics [Digital Preservation] “This project has received funding from the European Union’s Seventh Framework Programme for research, technological development and demonstration under grant agreement no601138”. Information Packaging Techniques An overview of methods and standards Anna-Grit Eggers (University of Goettingen)
  • 2. Information Packaging Techniques a) Simple Archive Container Formats b) Structured Packaging for Archiving c) Metadata Schemes Contents
  • 4. ● Sole purpose: packing files together. ● File containers are often combined with a compression option to reduce the needed disk space for storage. ● All files in the containers are stored equally as payload, and the containers have to be unzipped to fully access the packed files. Simple Archive Container Formats
  • 5. ● The ZIP format was introduced by PKWARE in 1989: https://www.pkware.com/support/zip-app-note ◦ Public domain ◦ Supports various compression algorithms Dateiicon von WinZIP ◦ Each single file in the archive container is compressed => possible to unzip single files from the container ◦ Option to aggregate files without using compression ◦ Preserves the original file paths and offers optional encryption ZIP
  • 6. ◦ Size reduction: Archive containers can be divided into parts ◦ Flexibility: add or extract single files from a zip archive without touching the other stored files • + advantage: possibility to frequently change packages • - disadvantage: causes overhead in the form of an additional file list which is stored together with the content. ◦ Loss prevention: ZIP uses cyclic redundancy checks. In case a file becomes corrupt, the other files would be still flawlessly accessible. ZIP (cont.)
  • 7. ● A widespread container format in UNIX (ustar, pax), LINUX (GNU tar), and BSD (bsdtar) environments ● Can be enabled on Windows Operating Systems for example by software libraries such as LibArchive ● Writes files sequentially into one file, called ‘tarball’ ● Was originally used for tape drives ● TAR is combined with a compression algorithm like gzip or bzip2 ● In contrast to ZIP, TAR doesn’t allow extracting single files from the container. TAR
  • 9. Creation of an information container, in which the packed information can be stored in a well-defined and structured way. Structured Packaging for Archiving
  • 10. ● A standard for storing files and their metadata in a well-defined directory structure ● Developed by the California Digital Library Digital Preservation Group and the Library of Congress ● Often used for preservation purposes, e.g. by Tate (UK). ● Data files are stored in a data directory ● Their checksums are saved in a manifest file ● The metadata, or tags, are listed together with their checksums in a tag-manifest file. ● A further BagIt file stores the used BagIt version and the file encoding. ● BagIt is often combined with a simple archiving format, such as TAR or ZIP, for the serialisation of the bag directory, or used only as directory structure technique for sensible content. ● See: http://www.cdlib.org/cdlinfo/2008/07/02/bagit-transferring-digital-content/ BagIt
  • 11. ● Container files, which contain file aggregations serving a specific purpose ● Often used to store all files belonging to a video, and to group them as a single self-describing video file. ● Popular examples for video containers are AVI and Ogg Media. Xiph.Org Foundation Compound Documents
  • 12. ● The source code of a computer program is often stored together with other project-related resources, such as images, in a package with a well-defined directory structure. ● Structured source code packages are often executable (=> run the computer program). ● Examples: Java’s JARs, Ruby Gems and Python Eggs. ◦ The JAR format is derived from the ZIP format. ◦ A JAR can be seen as a compound document similar to the video containers, because the Java program represented by the JAR can be executed by running the JAR. ◦ It contains a well-defined path structure and an optional manifest file, which can be regarded as a metadata file. ◦ The boundary to the subsequent category, metadata schemes, is therefore fluid. Structured Source Code Packages
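Since a JAR is a ZIP archive with a conventional path structure, its manifest can be read with any ZIP library. The sketch below builds a JAR-like archive in memory and parses `META-INF/MANIFEST.MF`. The class name `example.Main` and the placeholder class bytes are assumptions, so the archive is not actually runnable, and the simple parser ignores manifest continuation lines.

```python
import io
import zipfile

# Minimal manifest; 'example.Main' is a hypothetical entry point.
MANIFEST = "Manifest-Version: 1.0\r\nMain-Class: example.Main\r\n\r\n"

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as jar:
    jar.writestr("META-INF/MANIFEST.MF", MANIFEST)
    # Placeholder bytes only, not a valid compiled class.
    jar.writestr("example/Main.class", b"\xca\xfe\xba\xbe")


def read_manifest(jar_bytes):
    """Return the manifest headers of a JAR as a dictionary."""
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        text = jar.read("META-INF/MANIFEST.MF").decode("utf-8")
    headers = {}
    for line in text.splitlines():
        if ": " in line:
            key, value = line.split(": ", 1)
            headers[key] = value
    return headers


manifest_headers = read_manifest(buffer.getvalue())
```

Here the manifest plays the role of a small metadata file that travels inside the package.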
  • 14. ● Metadata is mostly used in combination with packaging ● It can also be kept beside the described content and linked to it ● Or embedded in the content ● Most commonly, XML is used to define a schema for a given domain. Metadata schemes
  • 15. ● METS stands for Metadata Encoding and Transmission Standard; it is maintained by the METS Editorial Board ● It provides an XML schema for encoding different types of metadata ● It simplifies the administration and exchange of digital objects between data collections. ● A METS file serves as a hub file that links the digital object (DO) with all files belonging to it and with its metadata, creating a single digital entity. ● A METS XML file consists of: ◦ Header: Contains metadata about the METS file itself, such as the creation date and the authors. ◦ Descriptive metadata: Embeds descriptive metadata or provides links to external metadata documents. ◦ Administrative metadata: Stores the data concerning storage, rights and creation. ◦ File section: Lists all files belonging to the DO. ◦ Structural map: Describes the inner structure of the DO and provides the linkage between data and metadata. ◦ Structural links: Records hyperlinks between parts of the structure, which is useful for archiving websites. ◦ Behaviour: Associates executable behaviours with the content. ● See: http://www.loc.gov/standards/mets/ METS
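The section layout listed above can be sketched as a skeleton document using Python's `xml.etree.ElementTree`. This is a simplified illustration: the IDs, date, and file locations are invented, the optional structural-link and behaviour sections are omitted, and the output has not been validated against the METS schema.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def q(tag):
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def build_mets(file_href, metadata_href):
    root = ET.Element(q("mets"))
    # Header: metadata about the METS file itself.
    ET.SubElement(root, q("metsHdr"), {"CREATEDATE": "2015-01-01T00:00:00"})
    # Descriptive metadata: here a link to an external record.
    dmd = ET.SubElement(root, q("dmdSec"), {"ID": "DMD1"})
    ET.SubElement(dmd, q("mdRef"), {
        "LOCTYPE": "URL", "MDTYPE": "DC", f"{{{XLINK_NS}}}href": metadata_href,
    })
    # Administrative metadata (left empty in this sketch).
    ET.SubElement(root, q("amdSec"), {"ID": "AMD1"})
    # File section: one file in one group.
    group = ET.SubElement(ET.SubElement(root, q("fileSec")), q("fileGrp"))
    file_el = ET.SubElement(group, q("file"), {"ID": "FILE1"})
    ET.SubElement(file_el, q("FLocat"), {
        "LOCTYPE": "URL", f"{{{XLINK_NS}}}href": file_href,
    })
    # Structural map: ties the file to its descriptive metadata.
    div = ET.SubElement(ET.SubElement(root, q("structMap")), q("div"), {"DMDID": "DMD1"})
    ET.SubElement(div, q("fptr"), {"FILEID": "FILE1"})
    return root


mets_root = build_mets("data/report.txt", "metadata/report-dc.xml")
section_names = [child.tag.split("}")[1] for child in mets_root]
mets_xml = ET.tostring(mets_root, encoding="unicode")
```

The `fptr` element in the structural map points at the file ID, and the `div` points at the descriptive metadata ID, which is how the hub file links data and metadata.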
  • 16. ● ORE is a standard for Object Reuse and Exchange by the Open Archives Initiative (OAI). ● It defines two new types of resources: Aggregations and Resource Maps. ● An Aggregation represents a set of associated web resources. ◦ It is an abstract resource in the Semantic Web sense and hence has no representation of its own. ● A Resource Map belongs to an Aggregation. ◦ It holds a machine-readable description of the Aggregation and a list of the associated resources. In addition, it describes the relationships and properties relevant to all resources and carries some metadata about itself. ● Both resources are addressed by an HTTP URI on the Web. ◦ Aggregations can be used by applications to visualise all associated resources and to process them as a collection. ◦ This simplifies the exchange and archiving of resource sets. ◦ A Resource Map can be serialised in various formats: Atom XML, RDF/XML, and RDFa. ● See: http://www.openarchives.org/ore/ OAI-ORE
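The relationship between a Resource Map and its Aggregation can be written down as a handful of RDF statements. The sketch below builds them as plain tuples and prints them in an N-Triples-like form purely for readability; that output form is an assumption of this example, not one of the serialisations named on the slide, and all `example.org` URIs are invented.

```python
ORE = "http://www.openarchives.org/ore/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def resource_map_triples(rem_uri, aggregation_uri, resource_uris):
    """Describe an Aggregation and its Resource Map as (s, p, o) triples."""
    triples = [
        (rem_uri, RDF_TYPE, ORE + "ResourceMap"),
        (rem_uri, ORE + "describes", aggregation_uri),
        (aggregation_uri, RDF_TYPE, ORE + "Aggregation"),
    ]
    # One 'aggregates' statement per associated web resource.
    triples += [(aggregation_uri, ORE + "aggregates", uri) for uri in resource_uris]
    return triples


def to_ntriples(triples):
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


triples = resource_map_triples(
    "http://example.org/rem/1",
    "http://example.org/aggregation/1",
    ["http://example.org/paper.pdf", "http://example.org/dataset.csv"],
)
ntriples_text = to_ntriples(triples)
```

An application that dereferences the Resource Map URI can follow the `describes` and `aggregates` statements to treat the two files as one collection.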
  • 17. ● Developed by the PREservation Metadata: Implementation Strategies (PREMIS) working group; the standard is maintained at the Library of Congress ● It supports the preservation and long-term usability of digital objects and their metadata ● The Data Dictionary is a specification for metadata handling in digital archiving systems. ● The data model provides five entities: intellectual entity, object, event, agent and rights. ● See: https://www.era.lib.ed.ac.uk/bitstream/handle/1842/3339/Higgins PREMIS_V-2-1-2009- 03.pdf?sequence=1&isAllowed=y PREMIS Data Dictionary
  • 18. ● Used to describe and bundle research data in a way that supports citation and sharing in a machine-readable fashion. ● The initiative includes a number of techniques that have a set of principles in common: ◦ Identity ◦ Aggregation ◦ Annotation ● The metadata is described in the RO ontology. ● Bundling can be done using different techniques, including the RO bundling and BagIt. ● See: http://www.researchobject.org/ Research Object (RO)
  • 20. ● The Long-term preservation Metadata for Electronic Resources (LMER) project provides an XML schema specifically for long-term preservation purposes, based on the preservation metadata schema of the National Library of New Zealand. ● The schema was developed by the DNB (Deutsche Nationalbibliothek) as a schema for technical metadata. ● It is used, in combination with METS, to define the packaging format UOF. ● It is designed to work with standard exchange formats and can be integrated into METS. ● The LMER schema consists of the following sections: ◦ Object: The object, with a URN as persistent identifier. ◦ Process: Log of technical changes. ◦ Metadata: Metadata for each file that belongs to the digital object. ◦ Metadata modifications: Log of changes to the metadata. ● See: http://www.dnb.de/DE/Standardisierung/LMER/lmer_node.html LMER
  • 21. ● Timothy DiLauro and Jonathan Petters introduced the Data Conservancy Package Tool at the International Digital Curation Conference (IDCC) 2015 (http://www.dcc.ac.uk/sites/default/files/documents/IDCC15/196.pdf). ● The tool facilitates the creation of packages for research data objects in the conservation domain. ● It provides a user interface for the definition of packages. ● It focusses on curation activities. ● See: http://dataconservancy.org/wp-content/uploads/2014/10/DCSDOCPKG- PackageToolsDocumentationHome-Full.pdf The Data Conservancy Package Tool