Processing of scientific data
From field capture to web delivery

Hector Quintero Casanova

Postgraduate in e-Science

Why e-Science? Data-intensive
● GMEP ticks all the boxes:
  ✔ Highly multidisciplinary: social, landscape, water, birds, plants...
  ✔ Large volumes of data: covers the whole of Wales.
  ✔ Cross-organisational collaboration: 13 institutions.

Why e-Science? Metadata
● NERC's data policy says it all
  – “It is essential that metadata are submitted”
● Metadata = context information about data
  – Provenance = who, when, where, how
  – Workflow = how. Essential if using models
● Exposes data relationships → traceability
● Enables reproducing outcome → repeatability
● Exactly what information depends on the stage.
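
A minimal sketch of how provenance and workflow metadata might be recorded so that data relationships can be traced. The class names, the `derived_from` field, the `trace` helper and all dataset names and dates are hypothetical and chosen for illustration only; nothing here reflects an actual GMEP or NERC schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Provenance:
    """Who, when, where and how a piece of data was produced."""
    who: str
    when: datetime
    where: str
    how: str


@dataclass(frozen=True)
class Dataset:
    """A named dataset plus the context needed to interpret it."""
    name: str
    provenance: Provenance
    # Workflow metadata: names of the datasets this one was derived from.
    derived_from: tuple = ()


def trace(name, catalogue):
    """Follow derived_from links and list every ancestor of a dataset."""
    ancestors = []
    stack = list(catalogue[name].derived_from)
    while stack:
        current = stack.pop()
        if current not in ancestors:
            ancestors.append(current)
            stack.extend(catalogue[current].derived_from)
    return ancestors


# Hypothetical three-step chain: field samples -> cleaned data -> model output.
stamp = datetime(2013, 6, 1, tzinfo=timezone.utc)  # made-up date
catalogue = {
    d.name: d
    for d in [
        Dataset("soil_samples",
                Provenance("field team", stamp, "Wales", "core sampling")),
        Dataset("cleaned_samples",
                Provenance("analyst", stamp, "lab", "outlier removal"),
                derived_from=("soil_samples",)),
        Dataset("model_output",
                Provenance("modeller", stamp, "cluster", "model run"),
                derived_from=("cleaned_samples",)),
    ]
}

print(trace("model_output", catalogue))
```

Because each record names its inputs, the chain from a model output back to the raw field data can be recovered mechanically, which is the traceability the slide refers to.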

Data collection
● Raw data from the field
  – Metadata: method, calibration, place, units...

Data analysis
● Information products: e.g. data from models
  – Metadata: name, conditions, where it applies

Data analysis
● Workflow metadata avoids costly reruns
  – Identify model output needed → reuse
● But not enough for cross-organisation collaboration
  – 13 institutions in Glastir.
  – Differences in storage structure, metadata definitions...
● Need extra layer(s) for seamless access
  – Web already offers tools needed.
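
The reuse idea above can be sketched as a lookup keyed on workflow metadata: if a model has already been run with the same parameters and inputs, the stored output is returned instead of rerunning. This is only an illustration under assumptions; `workflow_key`, `OutputStore`, the model name "runoff" and its parameters are all hypothetical, and a real system would persist outputs rather than keep them in memory.

```python
import hashlib
import json


def workflow_key(model, params, inputs):
    """Derive a stable identifier from the workflow metadata of a run."""
    record = {"model": model, "params": params, "inputs": sorted(inputs)}
    canonical = json.dumps(record, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class OutputStore:
    """In-memory store of model outputs indexed by workflow metadata."""

    def __init__(self):
        self._outputs = {}
        self.runs = 0  # counts how many times a model was actually executed

    def get_or_run(self, model, params, inputs, run):
        key = workflow_key(model, params, inputs)
        if key not in self._outputs:
            self.runs += 1
            self._outputs[key] = run(params, inputs)
        return self._outputs[key]


def fake_model(params, inputs):
    # Stand-in for an expensive model run (hypothetical).
    return {"year": params["year"], "n_inputs": len(inputs)}


store = OutputStore()
first = store.get_or_run("runoff", {"year": 2013}, ["rainfall"], fake_model)
again = store.get_or_run("runoff", {"year": 2013}, ["rainfall"], fake_model)
other = store.get_or_run("runoff", {"year": 2014}, ["rainfall"], fake_model)
print(store.runs)
```

The second call matches the first run's metadata and so triggers no new execution; only a change in parameters or inputs does.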

Publication: linked data
● HTTP for generic retrieval of resources
● URIs for unique identification of those resources
  – E.g. http://www.ceh.ac.uk
● Both can be used to build web services
  – Amount to remote functions.
  – E.g. seamless recording of workflows across institutions.
● Semantics for automated reasoning
  – Acts as standardised metadata aimed at machines.
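
As a rough illustration of "semantics for automated reasoning", the sketch below stores statements as (subject, predicate, object) triples identified by URIs and derives extra type statements by following subclass links. The `rdf:type` and `rdfs:subClassOf` URIs are the standard W3C ones; every `http://example.org/` URI, the class names, and the functions `infer_types` and `types_of` are hypothetical. A real deployment would more likely use an RDF library and a reasoner than this hand-rolled loop.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
EX = "http://example.org/"  # placeholder namespace, not a real service

# Each resource is named by a URI; each statement is one triple.
triples = {
    (EX + "dataset/42", RDF_TYPE, EX + "SoilSurvey"),
    (EX + "SoilSurvey", SUBCLASS_OF, EX + "FieldSurvey"),
    (EX + "FieldSurvey", SUBCLASS_OF, EX + "Dataset"),
}


def infer_types(statements):
    """Add (x type B) whenever (x type A) and (A subClassOf B) both hold."""
    inferred = set(statements)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(inferred):
            if p != RDF_TYPE:
                continue
            for s2, p2, o2 in list(inferred):
                if p2 == SUBCLASS_OF and s2 == o:
                    new = (s, RDF_TYPE, o2)
                    if new not in inferred:
                        inferred.add(new)
                        changed = True
    return inferred


def types_of(resource, statements):
    """List every type of a resource, stated or inferred, sorted by URI."""
    return sorted(
        o for s, p, o in infer_types(statements)
        if s == resource and p == RDF_TYPE
    )


for uri in types_of(EX + "dataset/42", triples):
    print(uri)
```

Only one type is stated for the dataset, yet a machine can conclude it is also a field survey and a dataset, without any institution-specific knowledge of how the record is stored.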

… We've come full circle!

¿?

Thank you
www.hqcasanova.com

Hector Quintero Casanova
Postgraduate in e-Science
