Rachel Lovinger @rlovinger
Confab, 22 May, 2015
Image via Bond
2
©2015 All rights reserved.
• Experience Director, Content Strategy;
Razorfish New York
• Co-editor of scatter/gather, a content
strategy blog:
http://scattergather.razorfish.com
• Author of Nimble: A Razorfish Report
on Publishing in the Digital Age (June
2010): http://nimble.razorfish.com
• Twitter: @rlovinger
Metadata is a Love Note to the Future
4
5
6
7
8
9
10
11
is
HARDCORE
12
2006
2009
2008
2012
2011
2010
13
Metadata = Context
Context enables Connections
How does one convey that in a concise and powerful way?
14
Photo by Jesse Chan-Norris
Metadata Is A
Love note
To the Future
16
Tweet and photo by Erin Kissane, Tumblr by Austin Kleon
429 notes
82 retweets
17
Photo and shirt by Sarah
18
Photo by Rachel Lovinger
19
Content Strategy for Mobile by Karen McGrane
21
• Nearly 60,000
files archived
• Mostly from
1980-1995
• Collected and
curated since
1998
• Almost no
metadata
Textfiles.com
22
Who needs a database?
23
Metadata Skeptic transformed into… Metadata Warrior
Photos by Jason Scott and Rachel Lovinger
24
Photo by Rachel Lovinger
25
• Me?
Photo by Rachel Lovinger
ENTERTAINMENT WEEKLY
Metadata for Journalism Products
27
~3 years online content
~10 years magazine content
28
Imported from text files to CMS
29
Semi-structured information
allowed us to map the files to
content types and site sections,
and add some metadata (author,
published date, keywords, etc.)
10 years
x 50 issues per year
x 100 files per issue (approx.)
50,000 estimated articles
30
Once in the CMS, we could add
photos, links, formatting, etc.
31
For the content already in the
CMS, keywords had been
manually typed in by authors
• 6790 “different” keywords
• Removed 12% during clean up
• Typos
• Redundant
• Not Useful
33
• Star Wars: Episode I -- The Phantom Menace
• Episode 1
• Episode I
• Phantom Menace
• Star Wars Episode I The Phantom Menace
• Star Wars Episode I: The Phantom Menace
• Star Wars prequel
• Star Wars: Episode 1 -- The Phantom Menace
• Star Wars: Episode i -- the Phantom Menace
• Star Wars: Episode I: The Phantom Menace
• Star Wars: Episode I--The Phantom Menace
• Star Wars: Episode I--The Phantom Menance
• Star Wars: Episode One -- The Phantom Menace
• Star Wars: The Phantom Menace
• Star Wars: The Phantom Menace -- Episode I
• The Phantom Menace
• The Phanton Menace
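The variants above differ mostly in punctuation, capitalization, episode numbering and the odd typo. A minimal Python sketch of how such keywords could be collapsed onto one controlled term follows. It is an illustration only, not the cleanup process EW actually used: the canonical label, the function names and the 0.9 similarity cutoff are all assumptions.

```python
import re
from difflib import get_close_matches

# Hypothetical canonical label; the deck does not say which form EW chose.
CANONICAL = "Star Wars: Episode I - The Phantom Menace"

def normalize(keyword: str) -> str:
    """Lowercase, unify episode numbering, and strip punctuation."""
    text = keyword.lower()
    text = re.sub(r"\bepisode (1|one)\b", "episode i", text)
    text = re.sub(r"[^a-z0-9 ]+", " ", text)  # drop colons, dashes, etc.
    return " ".join(text.split())              # collapse repeated spaces

def group_variants(keywords, canonical=CANONICAL, cutoff=0.9):
    """Split keywords into near-matches of the canonical term and the rest."""
    target = normalize(canonical)
    matched, unmatched = [], []
    for kw in keywords:
        # Fuzzy comparison tolerates small typos such as "Menance".
        hit = get_close_matches(normalize(kw), [target], n=1, cutoff=cutoff)
        (matched if hit else unmatched).append(kw)
    return matched, unmatched
```

With this cutoff, full-title variants and small typos are grouped, while short forms such as "The Phantom Menace" or "Star Wars prequel" are left unmatched. Those still need a person to decide whether they mean the same thing.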
34
• TAFKAP?
35
• TAFKAP?
• The Artist
• Artist Formerly Known as Prince
• The Artist Formerly Known As Prince
• The Artist formerly known as Prince
• the Artist Formerly Known as Prince
• The Artist Formerly Known as Prince (PKA)
37
• The magazine was published once a week
• The website published new
articles several times a day
• Plus: Over 50,000 past articles!
• How could we better use all
that content?
38
If you like James Bond, we wanted it to be easy for you to
discover everything we had.
Cover Story
Interview
Photo Gallery
Etc.
39
Entertainment Weekly
Journalism
IMDb-like
Information
40
41
We put the terms in our controlled vocabulary into categories, to make
them more distinct and meaningful.
For example:
• Book > Product > Harry Potter and the Goblet of Fire
• Movie > Product > Harry Potter and the Goblet of Fire
• Person > Individual > Daniel Radcliffe
• Person > Individual > J.K. Rowling
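One way to picture the categorized vocabulary above is as terms identified by their full path rather than by label alone. The Python sketch below is a hypothetical model, not EW's actual schema: the field names `media_type` and `kind` and the `lookup` helper are assumptions made for illustration.

```python
from typing import NamedTuple

class Term(NamedTuple):
    media_type: str  # top-level category, e.g. "Book", "Movie", "Person"
    kind: str        # second level, e.g. "Product", "Individual"
    label: str       # the name shown to readers

vocabulary = {
    Term("Book", "Product", "Harry Potter and the Goblet of Fire"),
    Term("Movie", "Product", "Harry Potter and the Goblet of Fire"),
    Term("Person", "Individual", "Daniel Radcliffe"),
    Term("Person", "Individual", "J.K. Rowling"),
}

def lookup(label: str) -> list[Term]:
    """Return every term sharing a label, ordered by category path."""
    return sorted(t for t in vocabulary if t.label == label)
```

Looking up "Harry Potter and the Goblet of Fire" returns two distinct terms, one Book and one Movie, which is the distinction a flat keyword list cannot make.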
42
Capsule
Movie
Review
Preview
Movie Review
DVD Review
43
• Relationships
defined for each
media type
• Managed
separately from
the article content
• The full set of
metadata was
available to all
articles
44
• Standard relationships
• For example, for Movie:
- Lead Performers
- Director
- Writer
- Release Date
- EW Grade
- Etc.
• Select a related category for
each relationship, as applicable
• Some allow multiple values
45
• Authors just
selected the
primary category
• Related metadata
pulled in
automatically
• Updates appeared
on all articles
*Metadata categories and
relationships were managed
by a dedicated data librarian
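The arrangement described above, where articles point at a category and the related metadata lives in one place, can be sketched in Python as follows. This is a simplified illustration under assumed names (`Category`, `Article`, `related_metadata`), not the real CMS design, and the movie, director and grade values are placeholders.

```python
from dataclasses import dataclass, field

@dataclass
class Category:
    name: str
    # Relationship name -> values, e.g. "Director" -> ["..."]
    relationships: dict[str, list[str]] = field(default_factory=dict)

@dataclass
class Article:
    headline: str
    primary_category: Category  # held by reference, never copied

    def related_metadata(self) -> dict[str, list[str]]:
        """Pull related metadata from the shared category record."""
        return self.primary_category.relationships

# Placeholder values, for illustration only.
movie = Category("Example Movie", {"Director": ["Example Director"]})
review = Article("Movie review", movie)
preview = Article("Preview", movie)

# The data librarian updates the category once...
movie.relationships["EW Grade"] = ["B+"]
# ...and both articles now report the new grade.
```

Because the articles store only a reference, a single edit to the category is visible everywhere it is used, which is the "updates appeared on all articles" behavior.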
46
47
• “Best Results” linked directly to
an aggregated page based on
the category.
• For example:
- “Cats & Dogs” vs. “The Truth
About Cats & Dogs”
- The Green Mile (Movie) vs. The
Green Mile (Book)
49
• Wal-Mart sold gallon jars of
Vlasic pickles for $2.97.
• A popular item – priced so low
it nearly put Vlasic out of
business.
• By achieving their goals, they
put themselves in a position
they might not survive.
See: http://www.fastcompany.com/47593/wal-mart-you-dont-know
50
• We wanted people to
discover older content, and
they did!
• By 2006, we had 16 years of
magazine and web content.
• Other Time Inc. publications
were interested in using our
categorization system, too.
51
Not well-suited for our expensive
and frequent database calls.
52
Our webservers were optimized to
serve up the latest “issue” of content.
EW.com made 40% of Time Inc.’s database calls,
but drew only 25% of the total traffic
53
A 2007 redesign removed the “third column” entirely.
54
The creator of Freebase (a semi-semantic UGC site for structured
content, now read-only) said EW.com was way ahead of its time.
The making of a
METADATA WARRIOR
57
Who needs a database?
58
“The hardest part of
[recording] history is to be
there when it happens.”
Photo by Rachel Lovinger
59
60
• An informal post on August 4th
• Notification sent out September 30th
• Shut down October 31st
61
“What happened to my web page on my husband, Bob Champine,
that took me many years to put together on his career and which
meant a lot to me and to the aviation community. I noticed with 9.0
I lost the left margin and the picture of him exiting the X-1. I need to
restore it to the internet as it is history. Please tell me what to do. I
will be glad to retype it, I just don’t want it lost to the world. I need
help. Gloria Champine”
62
Illustration from “Fire in the Library,” MIT Technology Review
63
“Archive Team is a loose collective of rogue archivists,
programmers, writers and loudmouths dedicated to
saving our digital heritage. Since 2009 this variant
force of nature has caught wind of shutdowns,
shutoffs, mergers, and plain old deletions - and done
our best to save the history before it's lost forever.”
64
65
66
67
68
69
70
71
72
• In 6 months Archive Team saved 900 GB
• Estimated 4-5 TB total
• Other people saved additional pages,
but probably ¼ is gone forever
• For many people, Geocities was their
first web presence
73
74
75
76
Those screenshots were automatically generated from
Geocities sites rescued by Archive Team in 2009
See more at One Terabyte of Kilobyte Age Photo Op:
http://oneterabyteofkilobyteage.tumblr.com/
77
Due to lack of metadata:
• The rescued data was less useful
• Really bulky files
• Case-sensitive filenames difficult to access and read
• Not in a web-ready format (WARC)
• The process was less efficient and more error prone
• Poor tracking of completed activity
• Lots of duplication of data
• Took way too long (6 months vs. 3 days)
• Could have gotten all the data in a month (estimated)
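The tracking and duplication problems listed above are the kind of thing a small manifest can address. The Python sketch below is a hypothetical illustration, not Archive Team's tooling: the `Manifest` class and its method names are assumptions. It records which URLs have been fetched and uses a SHA-256 digest to notice when the same bytes arrive twice.

```python
import hashlib

class Manifest:
    """Tracks completed downloads and detects duplicate page bodies."""

    def __init__(self):
        self.done = {}     # url -> sha256 digest of the saved body
        self.by_hash = {}  # sha256 digest -> first url saved with that body

    def needs_fetch(self, url: str) -> bool:
        """True if this URL has not been recorded yet."""
        return url not in self.done

    def record(self, url: str, body: bytes) -> bool:
        """Record a fetched page; return True if its body is new and should be stored."""
        digest = hashlib.sha256(body).hexdigest()
        self.done[url] = digest
        is_new = digest not in self.by_hash
        self.by_hash.setdefault(digest, url)
        return is_new
```

A crawler would call `needs_fetch` before each request and write the body to disk only when `record` returns True.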
78
79
Mission:
The Internet Archive’s purposes include offering permanent access
for researchers, historians, scholars, people with disabilities, and the
general public to historical collections that exist in digital format.
Photo by Ulf Benjaminsson
80
81
82
83
Save the history before it's lost
forever
Offer permanent access to
historical collections that exist in
digital format
84
Internet Archive contains: web pages, texts, videos, audio files,
software, and images. (Plus concerts and collections)
• Media Type makes it Readable or Playable
• Emulator (for software) makes it Executable
• Subject Keywords make it Findable
86
• Is it Accurate?
• Is it Credible?
• What is the Source? (machines or people)
• It’s a lot of Effort. Do we have enough people and time?
88
Additional processing takes place, depending on the type
89
• Description and keywords are required, but open fields
• Other metadata is optional
90
91
• Metadata attributes
determined by the
community
92
• For user-generated content, it’s just easier for people not to add metadata.
• Internet Archive will never have enough people on staff to do it
properly.
93
Crowdsource manual creation of metadata
Photo by Pascal
94
• Too small a pool of volunteers, and
their drive didn’t last long
• Tools didn’t provide immediate
feedback/satisfaction. They had
to email their inputs and wait.
Photo by psyberartist
95
• 10 most common words + 10
most common 2-word phrases
• Applied to 200,000 items
• Much more scalable
• Heavily machine assisted: a
person can validate data and
create collections
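The word-and-phrase counting described above can be approximated in a few lines of Python. This is a rough sketch, not the Internet Archive's actual script: the stopword list, the tokenizing rule and the function name `suggest_topics` are assumptions.

```python
import re
from collections import Counter

# Assumed stopword list; the deck does not say how common words were filtered.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it", "for", "on", "with"}

def suggest_topics(text: str, n: int = 10) -> list[str]:
    """Return the n most common words followed by the n most common 2-word phrases."""
    tokens = re.findall(r"[a-z]+", text.lower())
    words = [t for t in tokens if t not in STOPWORDS]
    # Pair only words that were adjacent in the original text.
    pairs = [f"{a} {b}" for a, b in zip(tokens, tokens[1:])
             if a not in STOPWORDS and b not in STOPWORDS]
    top_words = [w for w, _ in Counter(words).most_common(n)]
    top_pairs = [p for p, _ in Counter(pairs).most_common(n)]
    return top_words + top_pairs
```

The output is only a list of suggestions. As the slide notes, a person still validates the results and builds collections from them.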
Photo by James St. John
96
97
“Controversial, but roughly as
good as a bored intern.”
98
Topics:
switch, atari, antenna, game, cable, terminals, console, television, video, program,
power supply, console unit, video computer, game program, computer system,
atari game, power switch, switch box, atari video, screw terminals
99
Having the stuff is vital, the
most important thing. But
it’s also vital to have a
system by which these
things are described.
“If a person can’t get the
information they need, then
we’re failing.”
Photo by Rachel Lovinger
101
• Jason had been converted into a
metadata advocate
But I realized that…
• Content strategists who care
about the long game should
think like historians,
archivists and futurists, too.
NATURALIS BIODIVERSITY
CENTER
Metadata from the past
103
• Dutch leader in academic research and education on
biodiversity and taxonomy.
• Has a collection of 37 million natural history objects.
104
Describe, understand and explore biodiversity for human
wellbeing and the future of our planet.
They do this with:
• Accessible collections
• Contributions to global
scientific research
• Awe of natural history
• Openly shared knowledge
105
• From 2010 to June 2015
• 250 staff members & 450 volunteers
• Digitizing 7 million objects in detail
• Adding metadata for the other 30 million objects
106
• Information is
more easily
discovered,
studied, and used.
• Scientists
worldwide can
access it directly
online, without
assistance.
• Some of this data
has never been
available in digital
form before.
107
• Scientific name
• Where it was found
• When it was found
• Who found it
“Objects [in the collection] have no scientific value
without this information.” - Suzanne de Jong-Kole
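The four fields above amount to a minimal record structure. A Python sketch of such a record, with a check for the gaps that would leave an object without scientific value, could look like this. The class and field names are assumptions, not Naturalis's registration schema, and the sample values are placeholders.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SpecimenLabel:
    scientific_name: str
    collected_where: str
    collected_when: str  # kept verbatim, as written on the label
    collected_by: str

    def missing_fields(self) -> list[str]:
        """Names of fields that are empty or contain only whitespace."""
        return [name for name, value in vars(self).items() if not value.strip()]

# Placeholder values, for illustration only.
partial = SpecimenLabel("Example species", "", "1550", " ")
```

Here `partial.missing_fields()` reports that the place and the collector are missing, so the record would be flagged for follow-up rather than silently accepted.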
108
109
Employees enter data, verbatim, into the collection registration system.
110
This allows them to retrieve the physical specimen if requested.
111
• Vele Handen = Many Hands
• People helped transcribe
handwritten labels
• In 9 months, volunteers transcribed
200,000 labels, of which about half
were usable.
112
The person who collected the specimen wrote the metadata on the label.
This could be a professional researcher, or a non-professional enthusiast.
113
Darwin’s Finches
114
The oldest is this Spanish
pepper from 1550!
115
When they wrote this metadata, they had no idea that nearly
half a millennium later people would be “digitizing” it.
116
The ‘love note’ is when
you behave selflessly for
a partner – or customer –
that doesn’t exist yet.
A drawing Jason drew in my notebook in high
school, 20+ years before we ever dated.
Rachel Lovinger @rlovinger
Image via Bond

Contenu connexe

Tendances

DITA, Semantics, Content Management, Dynamic Documents, and Linked Data – A M...
DITA, Semantics, Content Management, Dynamic Documents, and Linked Data – A M...DITA, Semantics, Content Management, Dynamic Documents, and Linked Data – A M...
DITA, Semantics, Content Management, Dynamic Documents, and Linked Data – A M...Paul Wlodarczyk
 
How to Choose The Right Database on AWS - Berlin Summit - 2019
How to Choose The Right Database on AWS - Berlin Summit - 2019How to Choose The Right Database on AWS - Berlin Summit - 2019
How to Choose The Right Database on AWS - Berlin Summit - 2019Randall Hunt
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeDatabricks
 
DITA and Metadata on an Enterprise Scale
DITA and Metadata on an Enterprise ScaleDITA and Metadata on an Enterprise Scale
DITA and Metadata on an Enterprise ScaleKristen Eberlein
 
Logical Data Fabric: Architectural Components
Logical Data Fabric: Architectural ComponentsLogical Data Fabric: Architectural Components
Logical Data Fabric: Architectural ComponentsDenodo
 
Data Lake Architecture – Modern Strategies & Approaches
Data Lake Architecture – Modern Strategies & ApproachesData Lake Architecture – Modern Strategies & Approaches
Data Lake Architecture – Modern Strategies & ApproachesDATAVERSITY
 
Dublin Core In Practice
Dublin Core In PracticeDublin Core In Practice
Dublin Core In PracticeMarcia Zeng
 
Taxonomy 101: Classifying DITA Tasks
Taxonomy 101: Classifying DITA TasksTaxonomy 101: Classifying DITA Tasks
Taxonomy 101: Classifying DITA TaskseasyDITA
 
Making Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse TechnologyMaking Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse TechnologyMatei Zaharia
 
Schema.org Structured data the What, Why, & How
Schema.org Structured data the What, Why, & HowSchema.org Structured data the What, Why, & How
Schema.org Structured data the What, Why, & HowRichard Wallis
 
Moving to Databricks & Delta
Moving to Databricks & DeltaMoving to Databricks & Delta
Moving to Databricks & DeltaDatabricks
 
Introduction to Metadata
Introduction to MetadataIntroduction to Metadata
Introduction to MetadataEUDAT
 
Design Guidelines for Data Mesh and Decentralized Data Organizations
Design Guidelines for Data Mesh and Decentralized Data OrganizationsDesign Guidelines for Data Mesh and Decentralized Data Organizations
Design Guidelines for Data Mesh and Decentralized Data OrganizationsDenodo
 
Architect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureArchitect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureDatabricks
 
Advantages of metadata
Advantages of metadataAdvantages of metadata
Advantages of metadataAzeem Sultan
 
Data discovery through federated dataset catalogs
Data discovery through federated dataset catalogsData discovery through federated dataset catalogs
Data discovery through federated dataset catalogsValeria Pesce
 
Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)Nathan Bijnens
 
DataOps - Big Data and AI World London - March 2020 - Harvinder Atwal
DataOps - Big Data and AI World London - March 2020 - Harvinder AtwalDataOps - Big Data and AI World London - March 2020 - Harvinder Atwal
DataOps - Big Data and AI World London - March 2020 - Harvinder AtwalHarvinder Atwal
 

Tendances (20)

DITA, Semantics, Content Management, Dynamic Documents, and Linked Data – A M...
DITA, Semantics, Content Management, Dynamic Documents, and Linked Data – A M...DITA, Semantics, Content Management, Dynamic Documents, and Linked Data – A M...
DITA, Semantics, Content Management, Dynamic Documents, and Linked Data – A M...
 
Microsoft Purview
Microsoft PurviewMicrosoft Purview
Microsoft Purview
 
How to Choose The Right Database on AWS - Berlin Summit - 2019
How to Choose The Right Database on AWS - Berlin Summit - 2019How to Choose The Right Database on AWS - Berlin Summit - 2019
How to Choose The Right Database on AWS - Berlin Summit - 2019
 
Data Mesh 101
Data Mesh 101Data Mesh 101
Data Mesh 101
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 
DITA and Metadata on an Enterprise Scale
DITA and Metadata on an Enterprise ScaleDITA and Metadata on an Enterprise Scale
DITA and Metadata on an Enterprise Scale
 
Logical Data Fabric: Architectural Components
Logical Data Fabric: Architectural ComponentsLogical Data Fabric: Architectural Components
Logical Data Fabric: Architectural Components
 
Data Lake Architecture – Modern Strategies & Approaches
Data Lake Architecture – Modern Strategies & ApproachesData Lake Architecture – Modern Strategies & Approaches
Data Lake Architecture – Modern Strategies & Approaches
 
Dublin Core In Practice
Dublin Core In PracticeDublin Core In Practice
Dublin Core In Practice
 
Taxonomy 101: Classifying DITA Tasks
Taxonomy 101: Classifying DITA TasksTaxonomy 101: Classifying DITA Tasks
Taxonomy 101: Classifying DITA Tasks
 
Making Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse TechnologyMaking Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse Technology
 
Schema.org Structured data the What, Why, & How
Schema.org Structured data the What, Why, & HowSchema.org Structured data the What, Why, & How
Schema.org Structured data the What, Why, & How
 
Moving to Databricks & Delta
Moving to Databricks & DeltaMoving to Databricks & Delta
Moving to Databricks & Delta
 
Introduction to Metadata
Introduction to MetadataIntroduction to Metadata
Introduction to Metadata
 
Design Guidelines for Data Mesh and Decentralized Data Organizations
Design Guidelines for Data Mesh and Decentralized Data OrganizationsDesign Guidelines for Data Mesh and Decentralized Data Organizations
Design Guidelines for Data Mesh and Decentralized Data Organizations
 
Architect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureArchitect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh Architecture
 
Advantages of metadata
Advantages of metadataAdvantages of metadata
Advantages of metadata
 
Data discovery through federated dataset catalogs
Data discovery through federated dataset catalogsData discovery through federated dataset catalogs
Data discovery through federated dataset catalogs
 
Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)
 
DataOps - Big Data and AI World London - March 2020 - Harvinder Atwal
DataOps - Big Data and AI World London - March 2020 - Harvinder AtwalDataOps - Big Data and AI World London - March 2020 - Harvinder Atwal
DataOps - Big Data and AI World London - March 2020 - Harvinder Atwal
 

En vedette

10 Things I Learned in 10 Years as a Content Strategist
10 Things I Learned in 10 Years as a Content Strategist10 Things I Learned in 10 Years as a Content Strategist
10 Things I Learned in 10 Years as a Content StrategistRachel Lovinger
 
Introduction to metadata management
Introduction to metadata managementIntroduction to metadata management
Introduction to metadata managementOpen Data Support
 
What is Metadata?
What is Metadata?What is Metadata?
What is Metadata?Adgistics
 
Metadata For Catalogers (introductions)
Metadata For Catalogers (introductions)Metadata For Catalogers (introductions)
Metadata For Catalogers (introductions)robin fay
 
Content Modelling Workshop Preview
Content Modelling Workshop PreviewContent Modelling Workshop Preview
Content Modelling Workshop PreviewRachel Lovinger
 
Content Auditing: Unearthing the Substance of Your Brand
Content Auditing: Unearthing the Substance of Your BrandContent Auditing: Unearthing the Substance of Your Brand
Content Auditing: Unearthing the Substance of Your BrandRachel Lovinger
 
Taxonomies and Metadata in Information Architecture
Taxonomies and Metadata in Information ArchitectureTaxonomies and Metadata in Information Architecture
Taxonomies and Metadata in Information ArchitectureAccess Innovations, Inc.
 
Metadata in data warehouse
Metadata in data warehouseMetadata in data warehouse
Metadata in data warehouseSiddique Ibrahim
 
Metadata and Terminology Registries
Metadata and Terminology RegistriesMetadata and Terminology Registries
Metadata and Terminology RegistriesMarcia Zeng
 
SKOS - 2007 Open Forum on Metadata Registries - NYC
SKOS - 2007 Open Forum on Metadata Registries - NYCSKOS - 2007 Open Forum on Metadata Registries - NYC
SKOS - 2007 Open Forum on Metadata Registries - NYCjonphipps
 
Empowering Your Audience Ambassadors with Semantic Publishing
Empowering Your Audience Ambassadors with Semantic Publishing Empowering Your Audience Ambassadors with Semantic Publishing
Empowering Your Audience Ambassadors with Semantic Publishing Rachel Lovinger
 
Content in the Age of Promiscuous Reuse
Content in the Age of Promiscuous ReuseContent in the Age of Promiscuous Reuse
Content in the Age of Promiscuous ReuseRachel Lovinger
 
Making of The DEFCON Documentary
Making of The DEFCON DocumentaryMaking of The DEFCON Documentary
Making of The DEFCON DocumentaryRachel Lovinger
 
Journey Towards Datameaningfulness
Journey Towards DatameaningfulnessJourney Towards Datameaningfulness
Journey Towards DatameaningfulnessRachel Lovinger
 

En vedette (20)

Does metadata matter?
Does metadata matter?Does metadata matter?
Does metadata matter?
 
10 Things I Learned in 10 Years as a Content Strategist
10 Things I Learned in 10 Years as a Content Strategist10 Things I Learned in 10 Years as a Content Strategist
10 Things I Learned in 10 Years as a Content Strategist
 
Introduction to metadata management
Introduction to metadata managementIntroduction to metadata management
Introduction to metadata management
 
What is Metadata?
What is Metadata?What is Metadata?
What is Metadata?
 
Metadata For Catalogers (introductions)
Metadata For Catalogers (introductions)Metadata For Catalogers (introductions)
Metadata For Catalogers (introductions)
 
Content Modelling Workshop Preview
Content Modelling Workshop PreviewContent Modelling Workshop Preview
Content Modelling Workshop Preview
 
Content Auditing: Unearthing the Substance of Your Brand
Content Auditing: Unearthing the Substance of Your BrandContent Auditing: Unearthing the Substance of Your Brand
Content Auditing: Unearthing the Substance of Your Brand
 
Metadata in Business Intelligence
Metadata in Business IntelligenceMetadata in Business Intelligence
Metadata in Business Intelligence
 
Taxonomy And Metadata
Taxonomy And MetadataTaxonomy And Metadata
Taxonomy And Metadata
 
Taxonomies and Metadata in Information Architecture
Taxonomies and Metadata in Information ArchitectureTaxonomies and Metadata in Information Architecture
Taxonomies and Metadata in Information Architecture
 
Metadata in data warehouse
Metadata in data warehouseMetadata in data warehouse
Metadata in data warehouse
 
About Scanning and Metadata Standards - NEMO 2010
About Scanning and Metadata Standards - NEMO 2010About Scanning and Metadata Standards - NEMO 2010
About Scanning and Metadata Standards - NEMO 2010
 
"Love notes to the future": research data, metadata and long term re-use
"Love notes to the future": research data, metadata and long term re-use"Love notes to the future": research data, metadata and long term re-use
"Love notes to the future": research data, metadata and long term re-use
 
Metadata and Terminology Registries
Metadata and Terminology RegistriesMetadata and Terminology Registries
Metadata and Terminology Registries
 
SKOS - 2007 Open Forum on Metadata Registries - NYC
SKOS - 2007 Open Forum on Metadata Registries - NYCSKOS - 2007 Open Forum on Metadata Registries - NYC
SKOS - 2007 Open Forum on Metadata Registries - NYC
 
Empowering Your Audience Ambassadors with Semantic Publishing
Empowering Your Audience Ambassadors with Semantic Publishing Empowering Your Audience Ambassadors with Semantic Publishing
Empowering Your Audience Ambassadors with Semantic Publishing
 
Content in the Age of Promiscuous Reuse
Content in the Age of Promiscuous ReuseContent in the Age of Promiscuous Reuse
Content in the Age of Promiscuous Reuse
 
Making of The DEFCON Documentary
Making of The DEFCON DocumentaryMaking of The DEFCON Documentary
Making of The DEFCON Documentary
 
Journey Towards Datameaningfulness
Journey Towards DatameaningfulnessJourney Towards Datameaningfulness
Journey Towards Datameaningfulness
 
BBC Design Research Framework
BBC Design Research FrameworkBBC Design Research Framework
BBC Design Research Framework
 

Similaire à Metadata is a Love Note to the Future

Twitter Realtime Social Data @StartupFest
Twitter Realtime Social Data @StartupFestTwitter Realtime Social Data @StartupFest
Twitter Realtime Social Data @StartupFestSylvain Carle
 
H2O World - Clustering & Feature Extraction on Text - Seth Redmore
H2O World - Clustering & Feature Extraction on Text - Seth RedmoreH2O World - Clustering & Feature Extraction on Text - Seth Redmore
H2O World - Clustering & Feature Extraction on Text - Seth RedmoreSri Ambati
 
UXSG2014 Lightning Talks - Selfish accessibility (Adrian Roselli)
UXSG2014 Lightning Talks - Selfish accessibility (Adrian Roselli)UXSG2014 Lightning Talks - Selfish accessibility (Adrian Roselli)
UXSG2014 Lightning Talks - Selfish accessibility (Adrian Roselli)ux singapore
 
Development of the CyberCemetery (2011)
Development of the CyberCemetery (2011)Development of the CyberCemetery (2011)
Development of the CyberCemetery (2011)Dr. Starr Hoffman
 
The Challenge of Digital Sources in the Web Age: Common Tensions Across Three...
The Challenge of Digital Sources in the Web Age: Common Tensions Across Three...The Challenge of Digital Sources in the Web Age: Common Tensions Across Three...
The Challenge of Digital Sources in the Web Age: Common Tensions Across Three...Digital History
 
How To Create Content
How To Create ContentHow To Create Content
How To Create ContentAmy Vernon
 
PyData Texas 2015 Keynote
PyData Texas 2015 KeynotePyData Texas 2015 Keynote
PyData Texas 2015 KeynotePeter Wang
 
How to Regularly – and Without a Lot of Extra Effort – Find, Capture and Shar...
How to Regularly – and Without a Lot of Extra Effort – Find, Capture and Shar...How to Regularly – and Without a Lot of Extra Effort – Find, Capture and Shar...
How to Regularly – and Without a Lot of Extra Effort – Find, Capture and Shar...NetSquared Vancouver
 
IWMW 2004: Socrates Building an intranet for the UK Research Councils
IWMW 2004: Socrates Building an intranet for the UK Research CouncilsIWMW 2004: Socrates Building an intranet for the UK Research Councils
IWMW 2004: Socrates Building an intranet for the UK Research CouncilsIWMW
 
Webinar - The Changing Landscape of Library Privacy - 2016-06-15
Webinar - The Changing Landscape of Library Privacy - 2016-06-15Webinar - The Changing Landscape of Library Privacy - 2016-06-15
Webinar - The Changing Landscape of Library Privacy - 2016-06-15TechSoup
 
The Digital 4 Ps of Marketing Campaigns Dave Drodge
The Digital 4 Ps of Marketing Campaigns Dave DrodgeThe Digital 4 Ps of Marketing Campaigns Dave Drodge
The Digital 4 Ps of Marketing Campaigns Dave DrodgeDavid Drodge
 
PTTP09 London Film Fest Workshop
PTTP09 London Film Fest WorkshopPTTP09 London Film Fest Workshop
PTTP09 London Film Fest WorkshopBrian Newman
 
Building Thought Leadership through Content Curation
Building Thought Leadership through Content CurationBuilding Thought Leadership through Content Curation
Building Thought Leadership through Content CurationCorinne Weisgerber
 
Asian Digital Storytelling Congress, Singapore
Asian Digital Storytelling Congress, SingaporeAsian Digital Storytelling Congress, Singapore
Asian Digital Storytelling Congress, SingaporeBarrie Stephenson
 

Similaire à Metadata is a Love Note to the Future (20)

Twitter Realtime Social Data @StartupFest
Twitter Realtime Social Data @StartupFestTwitter Realtime Social Data @StartupFest
Twitter Realtime Social Data @StartupFest
 
H2O World - Clustering & Feature Extraction on Text - Seth Redmore
H2O World - Clustering & Feature Extraction on Text - Seth RedmoreH2O World - Clustering & Feature Extraction on Text - Seth Redmore
H2O World - Clustering & Feature Extraction on Text - Seth Redmore
 
Selected Thoughts on Modern Discovery and Access
Selected Thoughts on Modern Discovery and AccessSelected Thoughts on Modern Discovery and Access
Selected Thoughts on Modern Discovery and Access
 
UXSG2014 Lightning Talks - Selfish accessibility (Adrian Roselli)
UXSG2014 Lightning Talks - Selfish accessibility (Adrian Roselli)UXSG2014 Lightning Talks - Selfish accessibility (Adrian Roselli)
UXSG2014 Lightning Talks - Selfish accessibility (Adrian Roselli)
 
Development of the CyberCemetery (2011)
Development of the CyberCemetery (2011)Development of the CyberCemetery (2011)
Development of the CyberCemetery (2011)
 
The Challenge of Digital Sources in the Web Age: Common Tensions Across Three...
The Challenge of Digital Sources in the Web Age: Common Tensions Across Three...The Challenge of Digital Sources in the Web Age: Common Tensions Across Three...
The Challenge of Digital Sources in the Web Age: Common Tensions Across Three...
 
Metadata is a Love Note to the Future

  • 1. Rachel Lovinger @rlovinger. Confab, 22 May, 2015. Image via Bond
  • 2. ©2015 All rights reserved.
      • Experience Director, Content Strategy; Razorfish New York
      • Co-editor of scatter/gather, a content strategy blog: http://scattergather.razorfish.com
      • Author of Nimble: A Razorfish Report on Publishing in the Digital Age (June 2010): http://nimble.razorfish.com
      • Twitter: @rlovinger
  • 11. is HARDCORE
  • 12. 2006 2009 2008 2012 2011 2010
  • 13. Metadata = Context. Context enables Connections. How does one convey that in a concise and powerful way?
  • 14. Photo by Jesse Chan-Norris
  • 15. Metadata Is a Love Note to the Future
  • 16. Tweet and photo by Erin Kissane, Tumblr by Austin Kleon. 429 notes, 82 retweets
  • 17. Photo and shirt by Sarah
  • 18. Photo by Rachel Lovinger
  • 19. Content Strategy for Mobile by Karen McGrane
  • 21. Textfiles.com
      • Nearly 60,000 files archived
      • Mostly from 1980-1995
      • Collected and curated since 1998
      • Almost no metadata
  • 22. Who needs a database?
  • 23. Metadata Skeptic transformed into… Metadata Warrior. Photos by Jason Scott and Rachel Lovinger
  • 24. Photo by Rachel Lovinger
  • 25. Me? Photo by Rachel Lovinger
  • 27. ~3 years online content, ~10 years magazine content
  • 28. Imported from text files to CMS
  • 29. Semi-structured information allowed us to map the files to content types and site sections, and add some metadata (author, published date, keywords, etc.)
      10 years × 50 issues per year × 100 files per issue (approx.) = 50,000 estimated articles
  • 30. Once in the CMS, we could add photos, links, formatting, etc.
  • 31. For the content already in the CMS, keywords had been manually typed in by authors
      • 6790 “different” keywords
      • Removed 12% during clean up
      • Typos
      • Redundant
      • Not Useful
  • 33.
      • Star Wars: Episode I -- The Phantom Menace
      • Episode 1
      • Episode I
      • Phantom Menace
      • Star Wars Episode I The Phantom Menace
      • Star Wars Episode I: The Phantom Menace
      • Star Wars prequel
      • Star Wars: Episode 1 -- The Phantom Menace
      • Star Wars: Episode i -- the Phantom Menace
      • Star Wars: Episode I: The Phantom Menace
      • Star Wars: Episode I--The Phantom Menace
      • Star Wars: Episode I--The Phantom Menance
      • Star Wars: Episode One -- The Phantom Menace
      • Star Wars: The Phantom Menace
      • Star Wars: The Phantom Menace -- Episode I
      • The Phantom Menace
      • The Phanton Menace
  • 34. TAFKAP?
  • 35.
      • TAFKAP?
      • The Artist
      • Artist Formerly Known as Prince
      • The Artist Formerly Known As Prince
      • The Artist formerly known as Prince
      • the Artist Formerly Known as Prince
      • The Artist Formerly Known as Prince (PKA)
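As a rough illustration of how redundant free-typed keywords like those above might be surfaced during clean-up, here is a minimal Python sketch. It is not the process EW used; the `normalize` and `group_variants` helpers are hypothetical, and the approach (fold case, collapse punctuation) is only one possible first pass.

```python
import re
from collections import defaultdict


def normalize(keyword: str) -> str:
    """Fold case and collapse punctuation/whitespace so near-identical
    spellings share one key. Limitation: non-ASCII letters are dropped."""
    folded = keyword.casefold()
    folded = re.sub(r"[^a-z0-9]+", " ", folded)
    return folded.strip()


def group_variants(keywords):
    """Map each normalized key to the original spellings that produced it."""
    groups = defaultdict(list)
    for kw in keywords:
        groups[normalize(kw)].append(kw)
    return dict(groups)


# A handful of the variants listed on the slide.
keywords = [
    "Star Wars: Episode I -- The Phantom Menace",
    "Star Wars Episode I The Phantom Menace",
    "Star Wars: Episode i -- the Phantom Menace",
    "Star Wars: Episode I--The Phantom Menace",
    "Star Wars: Episode I--The Phantom Menance",
    "The Phantom Menace",
]
groups = group_variants(keywords)

for key, spellings in groups.items():
    print(f"{key!r}: {len(spellings)} spelling(s)")
```

Normalization of this kind only merges punctuation and capitalization variants. Typos such as "Menance" and aliases such as "Episode 1" or "TAFKAP" still land in separate groups, which is one reason a controlled vocabulary maintained by a person is still needed.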
  • 37.
      • The magazine was published once a week
      • The website published new articles several times a day
      • Plus: Over 50,000 past articles!
      • How could we better use all that content?
  • 38. If you like James Bond, we wanted it to be easy for you to discover everything we had: cover story, interview, photo gallery, etc.
  • 41. We put our controlled vocabulary into categories, to make them more distinct and meaningful. For example:
      • Book > Product > Harry Potter and the Goblet of Fire
      • Movie > Product > Harry Potter and the Goblet of Fire
      • Person > Individual > Daniel Radcliffe
      • Person > Individual > J.K. Rowling
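One way to picture the categorized vocabulary is as terms that carry their category path with them, so the same title can exist once as a book and once as a movie. The sketch below is an assumption for illustration only: the `Term` class and its field names are hypothetical and do not describe the actual EW.com data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    kind: str     # top-level category, e.g. "Book", "Movie", "Person"
    subtype: str  # second level, e.g. "Product", "Individual"
    label: str    # the display name of the term

    def path(self) -> str:
        """Render the term in the 'Category > Subtype > Label' form."""
        return f"{self.kind} > {self.subtype} > {self.label}"


book = Term("Book", "Product", "Harry Potter and the Goblet of Fire")
movie = Term("Movie", "Product", "Harry Potter and the Goblet of Fire")

vocabulary = {
    book,
    movie,
    Term("Person", "Individual", "Daniel Radcliffe"),
    Term("Person", "Individual", "J.K. Rowling"),
}

# Same label, different category: the two terms stay distinct.
print(book.label == movie.label)  # True
print(book == movie)              # False
print(movie.path())
```

Because equality depends on the whole path rather than the label alone, an article about the film is never confused with one about the novel.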
  • 43.
      • Relationships defined for each media type
      • Managed separately from the article content
      • The full set of metadata was available to all articles
  • 44.
      • Standard relationships
      • For example, for Movie:
          - Lead Performers
          - Director
          - Writer
          - Release Date
          - EW Grade
          - Etc.
      • Select a related category for each relationship, as applicable
      • Some allow multiple values
  • 45.
      • Authors just selected the primary category
      • Related metadata pulled in automatically
      • Updates appeared on all articles
      *Metadata categories and relationships were managed by a dedicated data librarian
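The key design point in these slides is that relationships live with the category, not with each article. A minimal Python sketch of that idea follows; the store layout, the `related_metadata` helper, and every value in it are made-up placeholders, not the real CMS.

```python
# Hypothetical in-memory stand-in for the separately managed metadata store.
# All names and values here are placeholders for illustration.
category_metadata = {
    "Movie > Product > Example Film": {
        "Director": ["Example Director"],
        "Lead Performers": ["Performer A", "Performer B"],  # multiple values allowed
    }
}

# Articles store only the primary category chosen by the author.
articles = [
    {"title": "Cover Story", "primary_category": "Movie > Product > Example Film"},
    {"title": "Photo Gallery", "primary_category": "Movie > Product > Example Film"},
]


def related_metadata(article):
    """Look up relationships at read time instead of copying them onto the article."""
    return category_metadata.get(article["primary_category"], {})


# A single edit by the data librarian...
category_metadata["Movie > Product > Example Film"]["Writer"] = ["Example Writer"]

# ...is visible on every article tagged with that category.
for article in articles:
    print(article["title"], "->", related_metadata(article)["Writer"])
```

Looking the relationships up by category at read time is what lets one update appear on all articles, at the cost of an extra lookup per page view.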
  • 47.
      • “Best Results” linked directly to an aggregated page based on the category.
      • For example:
          - “Cats & Dogs” vs. “The Truth About Cats & Dogs”
          - The Green Mile (Movie) vs. The Green Mile (Book)
  • 49.
      • Wal-Mart sold gallon jars of Vlasic pickles for $2.97.
      • A popular item, priced so low it nearly put Vlasic out of business.
      • By achieving their goals, they put themselves in a position they might not survive.
      See: http://www.fastcompany.com/47593/wal-mart-you-dont-know
  • 50.
      • We wanted people to discover older content, and they did!
      • By 2006, we had 16 years of magazine and web content.
      • Other Time Inc. publications were interested in using our categorization system, too.
  • 51. Not well-suited for our expensive and frequent database calls.
  • 52. Our webservers were optimized to serve up the latest “issue” of content. We made 40% of Time Inc.’s database calls, but had only 25% of the total traffic.
  • 53. A 2007 redesign removed the “third column” entirely.
  • 54. The creator of Freebase (a semi-semantic UGC site for structured content, now read-only) said EW.com was way ahead of its time.
  • 57. Who needs a database?
  • 58. “The hardest part of [recording] history is to be there when it happens.” Photo by Rachel Lovinger
  • 60.
      • An informal post on August 4th
      • Notification sent out September 30th
      • Shut down October 31st
  • 61. “What happened to my web page on my husband, Bob Champine, that took me many years to put together on his career and which meant a lot to me and to the aviation community. I noticed with 9.0 I lost the left margin and the picture of him exiting the X-1. I need to restore it to the internet as it is history. Please tell me what to do. I will be glad to retype it, I just don’t want it lost to the world. I need help. Gloria Champine”
  • 62. Illustration from “Fire in the Library,” MIT Technology Review
  • 63. “Archive Team is a loose collective of rogue archivists, programmers, writers and loudmouths dedicated to saving our digital heritage. Since 2009 this variant force of nature has caught wind of shutdowns, shutoffs, mergers, and plain old deletions - and done our best to save the history before it's lost forever.”
  • 72.
      • In 6 months Archive Team saved 900 GB
      • Estimated 4-5 TB total
      • Other people saved additional pages, but probably ¼ is gone forever
      • For many people, Geocities was their first web presence
  • 73-75. [Screenshots]
  • 76. Those screenshots were automatically generated from Geocities sites rescued by Archive Team in 2009. See more at One Terabyte of Kilobyte Age Photo Op: http://oneterabyteofkilobyteage.tumblr.com/
  • 77. Due to lack of metadata:
      • The rescued data was less useful
      • Really bulky files
      • Case-sensitive filenames difficult to access and read
      • Not in a web-ready format (WARC)
      • The process was less efficient and more error prone
      • Poor tracking of completed activity
      • Lots of duplication of data
      • Took way too long (6 months vs. 3 days)
      • Could have gotten all the data in a month (estimated)
  • 79. Mission: The Internet Archive’s purposes include offering permanent access for researchers, historians, scholars, people with disabilities, and the general public to historical collections that exist in digital format. Photo by Ulf Benjaminsson
  • 83.
      • Save the history before it's lost forever
      • Offer permanent access to historical collections that exist in digital format
  • 84. Internet Archive contains: web pages, texts, videos, audio files, software, and images. (Plus concerts and collections)
      • Media Type makes it Readable or Playable
      • Emulator (for software) makes it Executable
      • Subject Keywords make it Findable
  • 86.
      • Is it Accurate?
      • Is it Credible?
      • What is the Source? (machines or people)
      • It’s a lot of Effort. Do we have enough people and time?
  • 88. Additional processing takes place, depending on the type
  • 89.
      • Description and keywords are required, but open fields
      • Other metadata is optional
  • 92.
      • For user-generated content, it’s just easier for people not to add metadata.
      • Internet Archive will never have enough people on staff to do it properly.
  • 93. Crowdsource manual creation of metadata. Photo by Pascal
  • 94.
      • Too small a pool of volunteers, and their drive didn’t last long
      • Tools didn’t provide immediate feedback/satisfaction. They had to email their inputs and wait.
      Photo by psyberartist
  • 95.
      • 10 most common words + 10 most common 2-word phrases
      • Applied to 200,000 items
      • Much more scalable
      • Heavily machine assisted: a person can validate data and create collections
      Photo by James St. John
  • 97. “Controversial, but roughly as good as a bored intern.”
  • 98. Topics: switch, atari, antenna, game, cable, terminals, console, television, video, program, power supply, console unit, video computer, game program, computer system, atari game, power switch, switch box, atari video, screw terminals
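The "10 most common words + 10 most common 2-word phrases" idea can be sketched in a few lines of Python. This is a guess at the general technique, not the Internet Archive's actual code: the `suggest_topics` function, the tokenizer, and the small stopword list are all assumptions (the slides do not say whether common words were filtered out).

```python
import re
from collections import Counter

# Assumed stopword list; the talk does not say how common words were handled.
STOPWORDS = frozenset(
    {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it", "on", "for", "with"}
)


def suggest_topics(text, top_n=10):
    """Return the top_n most frequent words followed by the top_n most
    frequent two-word phrases found in the text."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
    word_counts = Counter(words)
    # Phrases are built after stopword removal, so a pair may span a dropped word.
    phrase_counts = Counter(f"{a} {b}" for a, b in zip(words, words[1:]))
    top_words = [w for w, _ in word_counts.most_common(top_n)]
    top_phrases = [p for p, _ in phrase_counts.most_common(top_n)]
    return top_words + top_phrases


# Made-up sample text in the spirit of a scanned console manual.
sample = (
    "Connect the switch box to the antenna terminals. "
    "The switch box has a power switch. Set the power switch to off."
)
topics = suggest_topics(sample, top_n=2)
print(topics)
```

Output like this is only a suggestion list. As the slide notes, the approach is machine assisted, with a person validating the data afterwards.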
  • 99. Having the stuff is vital, the most important thing. But it’s also vital to have a system by which these things are described. “If a person can’t get the information they need, then we’re failing.” Photo by Rachel Lovinger
  • 101. Jason had been converted into a metadata advocate. But I realized that… content strategists who care about the long game should think like historians, archivists and futurists, too.
  • 103.
      • Dutch leader in academic research and education on biodiversity and taxonomy.
      • Has a collection of 37 million natural history objects.
  • 104. Describe, understand and explore biodiversity for human wellbeing and the future of our planet. They do this with:
      • Accessible collections
      • Contributions to global scientific research
      • Awe of natural history
      • Openly shared knowledge
  • 105.
      • From 2010 to June 2015
      • 250 staff members & 450 volunteers
      • Digitizing 7 million objects in detail
      • Adding metadata for the other 30 million objects
  • 106.
      • Information is more easily discovered, studied, and used.
      • Scientists worldwide can access it directly online, without assistance.
      • Some of this data has never been available in digital form before.
  • 107.
      • Scientific name
      • Where it was found
      • When it was found
      • Who found it
      “Objects [in the collection] have no scientific value without this information.” - Suzanne de Jong-Kole
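The four fields above behave like a minimum record: without all of them, a specimen entry is incomplete. A small Python sketch of such a completeness check follows. The field names and the `missing_fields` helper are hypothetical and are not taken from the museum's collection registration system; the record values are placeholders.

```python
# Hypothetical field names for the four essentials listed on the slide.
REQUIRED_FIELDS = ("scientific_name", "found_where", "found_when", "found_by")


def missing_fields(record):
    """Return the required fields that are absent or blank in a record."""
    return [
        field
        for field in REQUIRED_FIELDS
        if not str(record.get(field) or "").strip()
    ]


# Placeholder records, not real specimen data.
complete = {
    "scientific_name": "Example species",
    "found_where": "Example location",
    "found_when": "Example date",
    "found_by": "Example collector",
}
incomplete = {"scientific_name": "Example species", "found_where": "  "}

print(missing_fields(complete))    # []
print(missing_fields(incomplete))  # ['found_where', 'found_when', 'found_by']
```

A check like this only confirms that something was entered. Whether the transcribed label is accurate still needs a person to judge.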
  • 109. Employees enter data, verbatim, into the collection registration system.
  • 110. This allows them to retrieve the physical specimen if requested.
  • 111.
      • Vele Handen = Many Hands
      • People helped transcribe handwritten labels
      • In 9 months, people transcribed 200,000, of which about half were usable.
  • 112. The person who collected the specimen wrote the metadata on the label. This could be a professional researcher, or a non-professional enthusiast.
  • 114. The oldest is this Spanish pepper from 1550!
  • 115. When they wrote this metadata, they had no idea that nearly half a millennium later people would be “digitizing” it.
  • 116. The ‘love note’ is when you behave selflessly for a partner, or customer, that doesn’t exist yet. A drawing Jason drew in my notebook in high school, 20+ years before we ever dated.