PRESERVING
                   DIGITAL PUBLIC
                   TELEVISION
Part of the NDIIPP Program of the Library of Congress

                              A STATUS REPORT
                                       Kara Van Malssen
            Senior Research Scholar & Metadata Specialist
                                      New York University

                                       February 24, 2010
                               SMPTE-NY Section Meeting
NDIIPP =
National Digital Information
Infrastructure and Preservation
Program of the Library of Congress
www.digitalpreservation.gov




                           Image by Smiley Man with a Hat via Flickr http://www.flickr.com/photos/smileymanwithahat/2477365291/
PRESERVING DIGITAL PUBLIC TELEVISION
PARTNERS
WNET
WGBH
NYU
PBS
Library of Congress
by massdistraction via Flickr http://www.flickr.com/photos/sharynmorrow/3718174646/in/set-72157621271414097/




DIGITAL ARCHAEOLOGY?
PDPTV GOALS
Design and build an OAIS-compliant preservation repository for born digital public television
Implement and recommend standards for metadata, wrapper and encoding formats, production workflow practices
Recommend selection criteria for long-term retention
Examine and recommend strategies for long-term sustainability
“An OAIS is an archive, consisting of an organization of people and systems, that has accepted the responsibility to preserve information and make it available for a designated community.... this distinguishes it from other uses of the term ‘archive.’”
- Reference Model for an Open Archival Information System, ISO 14721:2003
[Figure: OAIS Functional Model. The Producer submits a SIP to Ingest. Ingest passes descriptive info to Data Management and an AIP to Archival Storage. Access answers the Consumer's queries and orders with result sets and a DIP. Preservation Planning and Administration span these functions, under Management.]
some of the REPOSITORY TECHNOLOGIES
PROJECT SPECIFIC CODE
METADATA MODEL
by bredgur via Flickr: http://www.flickr.com/photos/bredgur/1323025528/




PHASE   2006-2008
SUBMISSION
[Diagram, built up across several slides: WNET, WGBH, and PBS each submit to the NYU repository. The submissions fall into four SIP classes (A, B, C, D) that pair SD or HD essence files with differing metadata sets (labeled 1, 2, and 3 in the diagram).]
REPOSITORY INGEST TASKS
1. Aggregate content
2. Normalize filenames *
3. Aggregate & map descriptive metadata to PBCore *
4. Extract Technical Metadata
5. Map Technical Metadata to PBCore *
6. Generate PBCore *
7. Hunt down creating app info
8. Determine playback reqs *
9. Generate PREMIS *
10. Generate checksums
11. Generate METS *
12. INGEST
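The checksum step in the task list above can be sketched as follows. This is a minimal illustration, not the project's actual code: the choice of SHA-256, the function names `file_checksum` and `build_manifest`, and the manifest layout (relative path mapped to hex digest) are all assumptions made for the example.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def file_checksum(path: Path, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of a file, reading in chunks so large video files need little memory."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(package_dir: Path, algorithm: str = "sha256") -> dict[str, str]:
    """Map each file's path (relative to the package root) to its checksum.

    The manifest layout is hypothetical; a real SIP would record these values
    in its own metadata format.
    """
    return {
        path.relative_to(package_dir).as_posix(): file_checksum(path, algorithm)
        for path in sorted(package_dir.rglob("*"))
        if path.is_file()
    }
```

Recomputing the manifest later and comparing it with the stored one is one simple way to confirm that a package arrived, or has been kept, bit for bit.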
by kapoue via Flickr http://www.flickr.com/photos/kapoue/2563697039/




PHASE   2008-2009
SUBMISSION
[Diagram, built up across several slides: WNET submits to the NYU repository in two SIP classes. SIP Class A: HD essence with PBCore metadata. SIP Class B: source files with PBCore metadata.]
REPOSITORY INGEST TASKS
1. Normalize filenames
2. Generate PREMIS
3. Generate METS
4. Validate checksum
5. INGEST
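The slides do not say what the filename normalization rule was, so the sketch below uses a made-up one purely for illustration: ASCII only, lowercase, and underscores in place of spaces and punctuation. The function name `normalize_filename` and the rule itself are assumptions, not the project's convention.

```python
import re
import unicodedata


def normalize_filename(name: str) -> str:
    """Apply a hypothetical rule: ASCII only, lowercase, underscores as separators."""
    stem, dot, extension = name.rpartition(".")
    if not dot:  # no extension present
        stem, extension = name, ""
    # Decompose accented characters, then drop anything that is still not ASCII.
    ascii_stem = unicodedata.normalize("NFKD", stem).encode("ascii", "ignore").decode("ascii")
    # Collapse each run of characters outside A-Z, a-z, 0-9 into one underscore.
    safe_stem = re.sub(r"[^A-Za-z0-9]+", "_", ascii_stem).strip("_").lower()
    return f"{safe_stem}.{extension.lower()}" if extension else safe_stem
```

For example, under this assumed rule `normalize_filename("Nature Ep 12 (Final).MXF")` gives `nature_ep_12_final.mxf`. If the naming convention is enforced in production, this step becomes a check rather than a repair.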
PHASE 1: 12 processing steps
PHASE 2: 5 processing steps
What changed?
[preservation-ready]
FILE-BASED WORKFLOW




      by Brian Daniel Eisenberg via Flickr http://www.flickr.com/photos/pplpwrd/2673102206/
File & Folder naming conventions for production and post-production
                by drpritch via Flickr http://www.flickr.com/photos/drpritch/305053820/
Standard formats & settings for recording, editing, broadcast, archiving
(i.e. no transcoding during workflow)

Wrapper   Codec   Workflow stage
MXF       DV100   Recording
MXF       DV100   Transfer to HDD
QT        DV100   Ingest & Re-wrap
QT        DV100   Edit
QT        DV100   Playout
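A standard like the one above is easy to check automatically once technical metadata has been extracted. The sketch below compares a file's metadata with a house profile built from the wrappers and codec on this slide and the settings in the editor's notes (1080i, 16:9, 29.97). The field names, the dictionary layout, and the function name `profile_violations` are assumptions for illustration; a real workflow would fill the dictionary from whatever its extraction tool reports.

```python
from __future__ import annotations

# Hypothetical house profile: allowed values for each technical field.
HOUSE_PROFILE: dict[str, set[str]] = {
    "wrapper": {"MXF", "QT"},
    "codec": {"DV100"},
    "frame_size": {"1080i"},
    "aspect_ratio": {"16:9"},
    "frame_rate": {"29.97"},
}


def profile_violations(technical_metadata: dict[str, str]) -> list[str]:
    """Return one message per field that is missing or outside the house profile."""
    problems = []
    for field, allowed in HOUSE_PROFILE.items():
        value = technical_metadata.get(field)
        if value not in allowed:
            problems.append(f"{field}: expected one of {sorted(allowed)}, got {value!r}")
    return problems
```

An empty list means the file matches the profile. Any message points to a file that was probably transcoded or recorded with non-standard settings somewhere in the workflow.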
Technical metadata extraction in house
mediainfo.sourceforge.net/en
PBCore records created in-house
pbcore.vermicel.li
Archiving can integrate
seamlessly into file-based
broadcast workflows if the right
practices are introduced early on
a few more
LESSONS
LEARNED
3
DIGITAL PRESERVATION
      REQUIREMENTS:
1. Bit Preservation
2. Accessibility of Content
3. Organizational Commitment
1
BIT PRESERVATION
ONE COPY IS NO COPY


by NightRPStar via Flickr http://www.flickr.com/photos/ninjanoodles/153893226/
“(rules define how many copies to make, and which locations to put these in, with a typical strategy being 3 copies in 3 geographically separate locations)”
- M. Addis, et al. “Sustainable Archiving and Storage Management of Audiovisual Digital Assets,” SMPTE Motion Imaging Journal, Nov/Dec 2009
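A rule of this kind can be written down and checked. The sketch below is one possible reading of "3 copies in 3 geographically separate locations", with an added fixity comparison across copies. The `Copy` record, the location labels, and the function name are hypothetical, and each distinct label is assumed to be a geographically separate site.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    location: str  # hypothetical site label, assumed to be a distinct geographic site
    checksum: str  # fixity value most recently computed for this copy


def replication_problems(copies: list[Copy], min_copies: int = 3, min_locations: int = 3) -> list[str]:
    """Return the ways a set of copies falls short of the replication rule."""
    problems = []
    if len(copies) < min_copies:
        problems.append(f"only {len(copies)} copies, need {min_copies}")
    locations = {copy.location for copy in copies}
    if len(locations) < min_locations:
        problems.append(f"only {len(locations)} locations, need {min_locations}")
    if len({copy.checksum for copy in copies}) > 1:
        problems.append("copies disagree on checksum")
    return problems
```

A checksum disagreement says that at least one copy has changed, but not which one. Comparing each copy with the value recorded at ingest resolves that.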
Photo by quapan via Flickr http://www.flickr.com/photos/hinkelstone/2435823037/




Consider federated storage models
    for cost and sustainability reasons
2
CONTENT ACCESSIBILITY
by Shira Golding via Flickr http://www.flickr.com/photos/boojee/3743753784/
Define minimum metadata
          creation and collection
requirements & rules, throughout
     the production & broadcast
                        workflow
by ScrapyGraphics via Flickr http://www.flickr.com/photos/scrapygraphics/2515645664/




including file and folder naming
conventions
Metadata should be
standards-based
by DG Jones via Flickr http://www.flickr.com/photos/dgjones/1225183400/




a few words about
file formats...
“Businesses may use different encoding formats for different business processes, but should strive to avoid transcoding wherever possible, because it introduces a generation and thus reduces quality.”
- Peter Thomas, “File Formats in Television Archiving and Content Exchange,” SMPTE Motion Imaging Journal, Nov/Dec 2009
3   ORGANIZATIONAL
      COMMITMENT
“Benign neglect is the default stewardship, collection policy. Physical world, even more so in the digital world.”
- Cathy Marshall, Senior Researcher, Microsoft Research. Keynote at Code4Lib, February 23, 2010, via Twitter @jschneider
TRUSTWORTHY
REPOSITORIES AUDIT
AND CERTIFICATION:
   CRITERIA AND
    CHECKLIST
coming soon...

THE AMERICAN
    ARCHIVE
  of public broadcasting
THANK
 YOU
   www.thirteen.org/ptvdigitalarchive
          kvm211@nyu.edu
http://www.slideshare.net/kvanmalssen

Editor's notes

  1. Early context for the project was that public television was supposed to be depositing one copy of all programs at LC. That hasn’t been happening.
  2. WNET and WGBH combined produce about 60% of nationally broadcast programming in the US, but they also produce and distribute local programming. PBS is only a distributor, not a content creator. NYU had expertise in building digital libraries (which public broadcasting did not) and an existing preservation repository (PR). LC is the funder, but also a repository.
  3. Why are we doing this? - Jacques Cousteau story.
  4. We’re not dealing with digitizing things.
  5. An archive in this sense is not a server. It is processes, procedures, people with a mission of preservation, for which it is responsible into the indefinite future.
  6. “The PR is designed as a set of loosely-coupled components communicating over stable interfaces.” Storage Resource Broker = supports shared collections that can be distributed across multiple organizations and heterogeneous storage systems. Can be used as a Data Grid Management System. DSpace = open source software, used for the Archival Storage, Data Management, and Dissemination functions of OAIS.
  7. These steps are needed just to process any one of those SIPs. These were the basic steps, but they were slightly different for each SIP class because of different metadata, different file formats, different PREMIS.
  8. Processing is the same for both SIP classes (production masters and source files).
  9. The “p” word was never used. It actually made sense to make changes during transition, and no practices were too entrenched yet, but they didn’t want to say it was for archival reasons, just that this was the way it was going to be done.
  10. Settings: Frame Size (1080i), Aspect Ratio (16:9), Frame Rate (29.97), Data Rate (117 Mbps)
  11. Including technical & descriptive metadata
  12. Consider a grid or distributed but federated system
  13. Combine Grid solutions with local storage for the most frequently accessed materials. Make sure your approach is managed. Take a look at the AVATARm project in the UK for more info.
  14. One word. You are going to need a lot of it. Video files are not self-describing. Filenames are not good for search and retrieval, and file-level metadata is not searchable.
  15. If there are no cataloging rules for descriptive metadata, everyone will input differently. Combine cataloging rules with controlled vocabularies.
  16. Because: you won’t have to reinvent the wheel (elements, definitions, vocab), facilitates exchange, there is a support community.
  17. A few things about file formats: There is still no standard format for video preservation, especially for born digital, because it is born compressed. The most important thing to do is choose an open, widely supported encoding format that can be used in all systems in your core business processes without transcoding. MXF or QT (FCP) containers.
  18. Preservation does not happen in a vacuum. There must be ongoing commitment, funds, staffing, reviewed and updated policies and procedures, etc.