History of the Info: Part II
Nick Ducoff, CEO and Co-Founder, Infochimps
Early 2000s
Mid 2000s
Present Day
3000 BC – Recording
3000 BC – Recording; 1200 BC – Aggregating
3000 BC – Recording; 1200 BC – Aggregating; 300 BC – Storing at Scale
3000 BC – Recording; 1200 BC – Aggregating; 300 BC – Storing at Scale; 300s AD – Random Access
3000 BC – Recording; 1200 BC – Aggregating; 300 BC – Storing at Scale; 300 AD – Random Access; 1400 AD – Mass Distribution
3000 BC – Recording; 1200 BC – Aggregating; 300 BC – Storing at Scale; 300 AD – Random Access; 1400 AD – Mass Distribution; 1700 AD – Infographics
1930s – Computation theory (Turing)
1940s – Information theory (Shannon)
1950s – Computer languages (1GL, 2GL, 3GL)
1960s – Standardized metadata (Avram)
1970s – Relational databases (IBM)
1980s – WWW (Al Gore)
1990s – Internet archive (Kahle)
3000 BC – Recording; 1200 BC – Aggregating; 300 BC – Storing at Scale; 300 AD – Random Access; 1400 AD – Mass Distribution; 1700 AD – Infographics
Tables on web pages, Open APIs, Commercial data sources

Augmentation
Name | ZIP | Average Rent
Walter Cureton | 78701 | $400-$599
Ivy Caldwell | 94103 | >$1500
Regina Wootton | 10027 | $1000-$1499

Completion
Name | Address | City | ZIP
Brian James | 901 Red River | Austin | 78701
Terri Becraft | 262 7th St. | San Francisco | 94103
Paz Brummit | 603 W. 114th St. | New York | 10027

Normalization
Name | Address | Normalized Address
Cecil Bartz | 901 red river austin texas | 901 Red River, Austin, TX 78701
Genaro Luz | 702 w. 32nd st austin | 702 W. 32nd St., Austin, TX 78705
Ruth Brown | 114th + broadway, nyc | W. 114th St. & Broadway, New York, NY 10027
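
As a rough illustration of the augmentation example on this slide, the sketch below adds an average-rent field to each person record by looking up the ZIP code. The names `RENT_BY_ZIP` and `augment` and the field names are hypothetical, chosen for this example only. The slide does not say how the lookup is actually performed, so a plain in-memory dictionary stands in for whatever data source would be used.

```python
# Hypothetical lookup table: ZIP code -> average rent band.
# The values are copied from the slide's augmentation example.
RENT_BY_ZIP = {
    "78701": "$400-$599",
    "94103": ">$1500",
    "10027": "$1000-$1499",
}


def augment(records, lookup, key="zip", new_field="average_rent"):
    """Return copies of the records with new_field filled in from lookup.

    Records whose key has no match in lookup get None for new_field.
    """
    return [{**record, new_field: lookup.get(record[key])} for record in records]


people = [
    {"name": "Walter Cureton", "zip": "78701"},
    {"name": "Ivy Caldwell", "zip": "94103"},
    {"name": "Regina Wootton", "zip": "10027"},
]

augmented = augment(people, RENT_BY_ZIP)
for row in augmented:
    print(row["name"], row["zip"], row["average_rent"])
```

Keeping ZIP codes as strings avoids losing leading zeros, and returning new dictionaries leaves the input records unchanged.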
[email_address]

The History of Data


Editor's notes

  1. Internet brought offline businesses online
  2. Social networks created massive amounts of data
  3. Social networks created massive amounts of data
  4. Babylon was the first society to systematically record knowledge, including the first census, which counted and recorded people and commodities for taxation and other purposes.
  5. The library at Thebes was the first known effort to gather many sources of knowledge and make them available in one place.
  6. Charged with collecting all the world's knowledge, the Library of Alexandria collected what is thought to have been nearly half a million objects.
  7. The codex replaces scrolls, enabling random access to information, or browsing.
  8. Gutenberg’s printing press enables mass production and distribution of information
  9. William Playfair invents the line, bar and pie charts, paving the way for Charles Minard’s famous graphical representation of Napoleon’s March
  10. Alan Turing showed that any reasonable computation could be done by programming a machine. Claude Shannon solved the engineering problem of transmitting information over a noisy channel. Computer languages advanced quickly from first-generation languages to third-generation languages such as COBOL. Henriette Avram created the machine-readable cataloging (MARC) system to tag books with metadata. Relational databases enabled storage and lookup of data at scale. Tim Berners-Lee created the WWW, which led to mass adoption of the internet and quickly grew to billions of pages, prompting Brewster Kahle to begin systematically capturing and storing the information.
  11. There are 1.8 ZB of data, but it is still hard to find the pieces you want.
  12. Aggregated, organized, accessible. When you can easily identify, understand and access the pieces, you can build anything.
  13. Map by Charles Joseph Minard portrays the losses suffered by Napoleon's army in the Russian campaign of 1812
  14. Better BI decisions and data-driven apps