Big Data = Bigger Metadata
O’Reilly Strata Conference
February 29, 2012
Pivot/Skate, etc…
   Founded 2003
    Poor man’s GIS
    Panamap

   Refounded 2006
    Neighborhood boundaries
    Mass transit data


   Refocused 2009
    SaaS for mapping + on-demand data
Achtung!

     NoSQL is no panacea
           Big Data isn’t about data
           Big Data isn’t new
           Big Data doesn’t present a Boolean quandary
           With power comes responsibility
            AWS bills
            Lady Gaga tweets
            Innumeracy (correlation v causation)
Big v Important

  Big                         Important
        Heterogeneous            Well-defined schema
        Raw                      High value (not free)
        Distributed              Test-driven
        Streaming/real time      Relational
        Search for meaning       Historical
        Time-sensitive           Enterprise-focused
        Philosophical
Data Exhaust


     Analytics                  Probes




                 Social Media            Gov 2.0
Platforms




 Commoditization of compute and storage
A Brief History of Metadata




       Callimachus            Library of Alexandria, Egypt
A Brief History of Metadata

                              “Pinakes” (lists)
                                  Title
                                  Category
                                  Author
                                  Author birthplace
                                  Father
                                  Word count




       Callimachus
A Brief History of Metadata
A Brief History of Metadata
A Brief History of Metadata




Card catalog room,
Library of Congress c. 1920
A Brief History of Metadata

 Dewey Decimal System goes electronic in 1967
Out with the Old, in with the New




Archiving card catalogs
after digitization
Why Can’t We Be Together?


      Metadata              Data
Exponential Growth in Data


         Unprecedented rate of data creation, 1995-today
Data




       Pinakes        Catalog        Taxonomy       Database
       300 BC         1595 AD        1876           1970
Oh, How I’ve Missed You


The reunification of metadata
and the artifact
Together At Last
GIS Data is Unevolved




               +        =
Enter the Data Curator


Part social scientist, part librarian,
part statistician, part RDBMS wiz
DIKW Model
    Data
        Fact, Signal, Symbol
    Information
        Structural v Functional
        Symbolic v Subjective
    Knowledge
        Processed
        Procedural
        Propositional
Popularity (Google Trends)
Words to Live By




                   dx/dt
Thank you!
ian@urbanmapping.com
@urbanmapping




                        R.I.P.
                       Schema

Big Data = Bigger Metadata

Editor’s notes

  1. Some background on Urban Mapping. It wasn’t a straightforward path, but it’s very relevant. We started close to 10 years ago with a printed map that reveals different layers of thematic imagery (streets, subways, neighborhoods) depending on the viewing angle. We all know what happened to print, so I shifted the business to a new medium: in 2006 or so we collected much of the same data, but now using a spatial database as opposed to regular old vector artwork in Adobe Illustrator. The writing was on the wall for licensing content to local web publishers, so we shifted again. This time we moved upstream: we continue to develop our own data, but have greatly expanded that effort to include commercial data and deliver it through our own mapping service. We do this for customers in various market segments, like Tableau Software, for whom we perform a few geo-services such as hosting the base map and overlaying data.
  2. I can be a bit of a curmudgeon, and I hope a cautionary point of view has a place. Let’s talk about what Big Data is not; I’ll talk later about what it is. The first thing to note is that Big Data isn’t really about data at all. But I am. It’s about tools and processes to manage and exploit info-nuggets. There’s nothing revolutionary about saying this, but I wanted to make it explicit. Second, Big Data isn’t especially new: Wall St and Walmart have been processing data and deriving value from it for decades, but they don’t talk about it. Why? Because they make money doing so and don’t need to alert the competition. Anybody hear of Teradata? Whenever companies want to talk about what they are doing, it’s usually a red flag for me, meaning the technology, the industry or something else hasn’t sufficiently evolved. But I’m also not saying Big Data is a rehash of enterprise software. More on that later… Finally, Big Data has democratized access to powerful tools at little cost. This doesn’t necessarily mean everybody knows how to use these tools. There can be some blowback, such as high credit card bills, analysis without direction or objective, and a lack of knowledge about basic statistics.
  3. There’s been exponential growth in data, and it comes from any number of places. Some are shown here: mobile devices as probes, with vast capabilities to record all kinds of environmental variables; open government; social media; and a desire for analytics, which has been rebranded as business intelligence.
  4. Processing and storage costs drop like rocks. Enterprise software has been offering big solutions to banking and others for decades, but now, with incredibly low barriers to entry, virtually anybody can participate.
  5. Callimachus (Kal-i-um-akus) was a noted poet at the Library of Alexandria in the 3rd century BC.
  6. He created the Pinakes (pin-a-keez), or Lists, a way of organizing works in the library. He embarked on the effort to organize 120k scrolls by title, author, birthplace, father, education, summary of contents and other info. This was the first effort to systematically create a bibliographic system, and a direct link to metadata two millennia later.
  7. In 1595, Johan van der Does published the Nomenclator, the first instance of a printed catalog of library holdings. It represented a significant advancement over the lists of Callimachus, but it took close to two millennia to get here.
  8. The modern cataloging system: the Dewey Decimal System, created in 1876. Its father was Melville Dewey. The Dewey Decimal System attempted to organize all knowledge into ten main classes. Each class is further subdivided into ten divisions, and each division into ten sections, giving ten main classes, 100 divisions and 1,000 sections. It allows for infinite hierarchy, and is numerical and faceted (linking content from different areas). Other systems followed: Universal Decimal Classification, Library of Congress, etc…
  9. This photo is from the Card Division at the Library of Congress in the 1920s. The amount of physical metadata is astounding: millions of library cards with metadata.
  10. The next major advancement was in the late 1960s. Early attempts at electronic indexing focused on a taxonomy of keywords and related information. This was efficient for reporting on what the system contained, but it also kept the long-running divorce between artifact and metadata. The Online Computer Library Center (OCLC) was created as a nonprofit to further access to library resources across institutions and decrease costs. The OCLC acquired the Dewey Decimal System and, as any standards body does, sought to perpetuate its existence over the decades. Then the internet happened.
  11. That meant out with the old, in with the new. This photo shows library cards going into storage. Not sure why they’d even be archived after the transition to databases was made, but that’s for another time.
  12. So this is the situation. Beginning in the late 60s, electronically stored metadata began to grow. The library cards (at left) went away, but the bifurcation was complete: total separation of the thing from the description of the thing. And it sort of made sense: IT was in its infancy, so storage and processing costs were high. Publishers also exerted a great deal of control over how they permitted libraries to index works and make them available.
  13. To put the last 2000 years in perspective: Callimachus created the first crude schema, leaving a place for metadata to be stored. The Nomenclator gave us the first bibliographic catalog, printed and bound, produced annually. The Dewey Decimal System was born in 1876 and was the basis of an extensive metadata system for published works. Then… the internet happened. In the top right you see the corner of a cloud. That’s my way of representing what happens next. The volume of data produced grows exponentially, overtaking 2000-plus years of history in no time.
  14. So how about the bifurcation/divorce I mentioned? The web brought the artifact and metadata together again.
  15. Google Books. Sure, we have the Dewey Decimal type stuff along with ISBN, retail price, etc… but we also threw in the whole damn book: full text search. Amazon does it too.
  16. In my industry, the state of metadata is horrendous. We’re stuck in the green screen days. Proprietary data formats and slow-moving vendors don’t help. While I’m the first person to admit GIS needs to get off its ass and change, radically, there’s also something the real-time streaming web can learn from us.
  17. We hear about the rise of the curator: part social scientist, part librarian, part RDBMS wiz and part statistician. This is increasingly important across all industries. When dealing with a torrent of data, domain experts will be required to help make sense of it.
  18. The Knowledge Hierarchy, as it is sometimes known, has been used to represent relationships between the stuff that turns into something meaningful. You could look at this as going from a letter to a sentence to a paragraph, or from an ingredient to a recipe to a meal, or something else. The details don’t matter here, but I think about the fundamental building block of data. One geocoded tweet has little or no value on its own. Contrast that with per capita income for this ZIP code. By amassing enough geocoded tweets, it’s clear we can get to something meaningful, but I don’t know how many tweets that is. I do know that per capita income can directly inform my marketing plans for selling a new shampoo.
  19. With that, here’s some more wet blanket for everybody. Using Google Trends, I looked at a number of terms that might indicate the old-fashioned RDBMS, SQL way of life, and most seem to follow the blue line, which represents the term ‘metadata.’ Big Data, coincidentally, first appears a few months before the first Strata conference in 2011. ‘Curation’ has a longer life but doesn’t show the surge of Big Data, and everybody’s favorite, ‘data scientist,’ doesn’t register as much more than a rounding error. I’m not using Google Trends to fully substantiate my argument, but I do hope you take a dose of skepticism before fully embracing ‘this.’
  20. In closing, I’d like to leave you with an emergent cliché. It’s also my measure of how geeky an audience I have: one person’s metadata is another person’s data.