F# Data
Making structured data first-class citizens
Tomas Petricek, University of Cambridge
Project homepage: http://fsharp.github.io/FSharp.Data
Get in touch: @tomaspetricek | tomas@tomasp.net
F# Software Foundation
http://www.fsharp.org
software stacks, trainings, teaching, F# user groups, snippets, mac and linux, cross-platform, tutorials, F# community, open-source, MonoDevelop, contributions, research, support, consultancy, mailing list
F# Data type providers
First-class data
CSV, REST, WorldBank…
R Type provider
Statistics & visualization
5000 tested packages
www.fslab.org
Deedle data frame
Data exploration
Indexing and aggregation
F# Charting library
Simple & composable
Interactive style
www.fslab.org
What are type providers?
Integrating WorldBank and R
http://youtu.be/7r2-B-5H_io
The confusion of languages
What are type providers?
Type provider research questions
Data vs. schema
Laziness
Mapping to types
Schema inference
Schema inference
Loading Titanic data from CSV
http://youtu.be/yjBdZduc0ko
Inferring primitive types
null, int, bool, string, decimal, float
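
The idea of inferring a primitive type for a CSV column can be sketched in a few lines. The Python below is only an illustration, not F# Data's implementation: the widening order (int to decimal to float), the fallback to string, and the rule that an empty cell makes the column optional are assumptions made for this sketch, and the names `classify`, `unify` and `infer_column` are hypothetical.

```python
import re

# Patterns for the numeric literals this sketch recognises (assumed, simplified).
INT_RE = re.compile(r"[+-]?\d+")
DECIMAL_RE = re.compile(r"[+-]?\d+\.\d+")
FLOAT_RE = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)[eE][+-]?\d+")

# Assumed widening order: every int is a decimal, every decimal is a float.
NUMERIC_ORDER = ["int", "decimal", "float"]


def classify(cell):
    """Name the narrowest primitive type that can represent one CSV cell."""
    text = cell.strip()
    if text == "":
        return "null"
    if text.lower() in ("true", "false"):
        return "bool"
    if INT_RE.fullmatch(text):
        return "int"
    if DECIMAL_RE.fullmatch(text):
        return "decimal"
    if FLOAT_RE.fullmatch(text):
        return "float"
    return "string"


def unify(a, b):
    """Find a common type for two non-null primitive types."""
    if a == b:
        return a
    if a in NUMERIC_ORDER and b in NUMERIC_ORDER:
        return max(a, b, key=NUMERIC_ORDER.index)
    return "string"  # fallback: any cell can be read as text


def infer_column(cells):
    """Infer one type for a whole column; a null cell makes it optional."""
    optional = False
    result = None
    for cell in cells:
        kind = classify(cell)
        if kind == "null":
            optional = True
        elif result is None:
            result = kind
        else:
            result = unify(result, kind)
    if result is None:
        return "null"
    return result + " option" if optional else result


# Hypothetical age column with one missing value and one fractional value.
print(infer_column(["22", "38", "", "35.5"]))  # prints "decimal option"
```

Folding `unify` over the cells means the result does not depend on how many rows are sampled, only on which kinds of values appear in the sample.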
Structure inference
Working with XML and JSON data
http://youtu.be/_DjX0ybaXZY
http://youtu.be/SkZBzlREOMo
Inferring structured types
person { name : string } and person { name : string, age : int }
unify to person { name : string, age : int option }
[ { num : int } ] and [ { str : string } ]
unify to [ { num : int option, str : string option } ]
int and { value : int }
unify to int + { value : int }
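
The three examples on this slide can be reproduced with a small unification function. The Python below is a sketch under stated assumptions rather than F# Data's algorithm: shapes are encoded as plain strings and tuples, and the names `unify`, `optional`, `strip_option` and `show` are hypothetical helpers introduced for illustration.

```python
# Shape encoding assumed for this sketch:
#   primitive: "int", "string", ...
#   record:    ("record", name_or_None, {field: shape})
#   list:      ("list", element_shape)
#   option:    ("option", shape)
#   union:     ("union", (shape_a, shape_b))


def is_kind(shape, kind):
    return isinstance(shape, tuple) and shape[0] == kind


def optional(shape):
    """Wrap a shape in option unless it already is one."""
    return shape if is_kind(shape, "option") else ("option", shape)


def strip_option(shape):
    return shape[1] if is_kind(shape, "option") else shape


def unify(a, b):
    """Find one shape that can represent values of both a and b."""
    if a == b:
        return a
    if is_kind(a, "option") or is_kind(b, "option"):
        return optional(unify(strip_option(a), strip_option(b)))
    if is_kind(a, "record") and is_kind(b, "record") and a[1] == b[1]:
        fa, fb = a[2], b[2]
        fields = {}
        for key in list(fa) + [k for k in fb if k not in fa]:
            if key in fa and key in fb:
                fields[key] = unify(fa[key], fb[key])
            else:
                # A field missing on one side becomes optional.
                fields[key] = optional(fa[key] if key in fa else fb[key])
        return ("record", a[1], fields)
    if is_kind(a, "list") and is_kind(b, "list"):
        return ("list", unify(a[1], b[1]))
    return ("union", (a, b))  # no common shape: keep both alternatives


def show(shape):
    """Render a shape in the notation used on the slide."""
    if isinstance(shape, str):
        return shape
    if shape[0] == "record":
        body = ", ".join(f"{k} : {show(v)}" for k, v in shape[2].items())
        prefix = f"{shape[1]} " if shape[1] else ""
        return f"{prefix}{{ {body} }}"
    if shape[0] == "list":
        return f"[ {show(shape[1])} ]"
    if shape[0] == "option":
        return f"{show(shape[1])} option"
    return " + ".join(show(s) for s in shape[1])


person_a = ("record", "person", {"name": "string"})
person_b = ("record", "person", {"name": "string", "age": "int"})
print(show(unify(person_a, person_b)))
# prints "person { name : string, age : int option }"
```

Treating a missing field as optional, and falling back to a union only when nothing else fits, keeps the inferred type as close as possible to what each individual sample looks like.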
Does it scale?
Query movies using Apiary provider
http://youtu.be/-Am2uRUv39c
Conclusions
Inference from small-scale samples works!
Schema is (very) often missing
But data is (very) often regular
Check out F# Data and contribute!
Project homepage: http://fsharp.github.io/FSharp.Data
Get in touch: @tomaspetricek | tomas@tomasp.net
