Web delivery of giant climate data sets to
facilitate open science
James Hiebert
Pacific Climate Impacts Consortium
University of Victoria
Victoria, BC Canada
FOSS4G NA
May 22, 2013
Introduction
How to serve...
high-resolution,
downscaled,
climate data
on the web?
Pacific Climate Impacts Consortium
(PCIC)
● Regional climate services provider
● Non-profit hosted at the University of Victoria
● Fill a niche between pure research and
applied science
Consortium Partners
High-resolution
How to serve...
high-resolution,
downscaled,
climate data
on the web?
Global Climate Models (GCMs)
IPCC AR4 Report, 2007
GCM Resolution
IPCC AR4 Report, 2007
Global to Regional Scale
Downscaling
How to serve...
high-resolution,
downscaled,
climate data
on the web?
Downscaling
GCMs
(100-500 km)
RCMs
(15-50 km)
Downscaling
(1-4 km)
Impacts Models
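To get a feel for what these resolution ranges mean for data size, here is a small arithmetic sketch. It uses only the pixel sizes quoted above; the function name `cells_per_coarse_pixel` is made up for illustration, and it assumes square pixels.

```python
def cells_per_coarse_pixel(coarse_km, fine_km):
    """Number of fine square pixels needed to tile one coarse square pixel."""
    return (coarse_km / fine_km) ** 2

# Pixel sizes taken from the quoted ranges (GCMs 100-500 km, downscaled 1-4 km).
for coarse_km, fine_km in [(100, 4), (500, 1)]:
    n = cells_per_coarse_pixel(coarse_km, fine_km)
    print(f"{coarse_km} km -> {fine_km} km: {n:,.0f} fine cells per coarse cell")
```

Depending on which ends of the ranges are paired, one GCM pixel corresponds to anywhere from hundreds to hundreds of thousands of downscaled pixels.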
Moving on
How to serve...
high-resolution,
downscaled,
climate data
on the web?
Technical Challenges
● Data volume
● Irregular grids and projections
● Large dimensionality
Challenges: Data Volume
n emission scenarios × m GCMs × o RCMs × p downscaling methods × variables × space × time
Big data
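A back-of-the-envelope sketch of how that product grows. Every number below is a made-up placeholder, not a PCIC figure; the point is only that the factors multiply.

```python
from math import prod

# All counts are hypothetical placeholders chosen for illustration only.
factors = {
    "emission scenarios": 3,
    "GCMs": 10,
    "RCMs": 1,
    "downscaling methods": 2,
    "variables": 3,
    "grid cells (space)": 1000 * 500,
    "time steps (daily, 150 years)": 150 * 365,
}

BYTES_PER_VALUE = 4  # assuming 32-bit floats

n_values = prod(factors.values())
print(f"{n_values:,} values, roughly {n_values * BYTES_PER_VALUE / 1e12:.1f} TB")
```

Adding one more scenario, model, or variable scales the whole total, which is why the volume gets out of hand so quickly.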
Challenges: Grids and Projections
Challenges: Large dimensionality
Demo
FOSS Components
Off the shelf
● PostgreSQL
● Python/h5py
● TileCache
● ncWMS
● OpenLayers
By PCIC (or heavily modified)
● Pydap-3.2
● Python/pupynere
● The metadata database indexing and management
● The whole UI
FOSS Components
[Diagram: OpenStreetMap, OpenLayers, ncWMS, PyDAP; labels: "from", "POST to"]
Big data, big RAM, BadRequest.
Oh my!
Hurry up and wait
Generators: 70's tech that works today!
● a function which yields execution rather than returning
● yields values one at a time, on-demand
● low memory footprint
● faster: less calling overhead
● elegant!
Generator Example
from itertools import islice
def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b
# print the first 10 values of the Fibonacci sequence
for x in islice(fibonacci(), 10):
    print(x)
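The same idea applies to serving large arrays: instead of building the whole response in memory, yield it one piece at a time. The following is a minimal sketch, not PyDAP's actual implementation; `read_time_step` and `stream_time_steps` are hypothetical names and the "data" is synthetic.

```python
def read_time_step(t, n_cells=4):
    """Hypothetical stand-in for reading one time step from a large file."""
    return [float(t)] * n_cells

def stream_time_steps(n_steps):
    """Yield one encoded time step at a time instead of the whole array."""
    for t in range(n_steps):
        row = read_time_step(t)
        yield (",".join(str(v) for v in row) + "\n").encode("ascii")

# Only one time step is held in memory at any moment; a WSGI server can
# iterate over a generator like this and send each chunk as it is produced.
for chunk in stream_time_steps(3):
    print(chunk)
```

Because the first chunk is available immediately, the user starts receiving data right away rather than waiting for the full result to be assembled.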
Back to the demo
Conclusions
● Governments use climate model output to proactively
plan adaptation strategies
● Climate model output is pretty big
● Don't make the user wait for their data
● Generators are awesome
● Our work should be available in PyDAP later this year
Acknowledgments and Questions
Thanks to:
● My small dev team
● PCIC's consortium members
● Roberto De Almeida (for PyDAP)
● You! (OSS developers and supporters)

Editor's notes

  1. My name is James Hiebert from the Pacific Climate Impacts Consortium in British Columbia, and I'm going to speak about my experience putting high-resolution, downscaled, climate data on the web. Give me a few slides to unpack that statement back to front.
  2. First of all, I work for the Pacific Climate Impacts Consortium (PCIC), a regional climate service provider in British Columbia hosted at the University of Victoria. We are a publicly funded organization whose mandate is to conduct and facilitate regional climate research and deliver regional climate services.
  3. As such we work directly with local and provincial governments, providing information on the physical science of climate change in support of the development of adaptation strategies.
  4. Our stakeholders use our climate projections to answer a wide array of pertinent questions. Examples include whether future river flows can support hydro power...
  5. whether future storm intensity will necessitate larger culverts, storm drains, or bridges...
  6. to what degree sea level rise might inundate our homes and farmland (this photo was taken just a couple of minutes' walk from my office)...
  7. or whether our forests might become more susceptible to fire...
  8. or outbreaks of disease. Our stakeholders are potentially making policy decisions and engineering decisions for incredibly expensive infrastructure based on the results of our impacts models. We have a responsibility to openly provide the results of each phase of the climate modeling pipeline. This provides a higher degree of transparency, and also facilitates a greater degree of stakeholder engagement with a tighter feedback loop. Think Agile development, but for science.
  9. To answer these kinds of questions requires us to have information at a very small scale, which brings us to “high-resolution”. As applied to climate data, the term “high-resolution” is a little difficult to define. For impacts models, our users typically require landscape level information or better. So “high” is a relative term that usually means as high as you can get. But allow me a brief interlude to first explain where climate data comes from and how we get to landscape level information.
  10. Modeling centers around the world develop and run what are called Global Climate Models (GCMs). GCMs are derived from fundamental physical laws and include a variety of features that affect the climate system.
  11. They run at a relatively coarse scale, one pixel representing a few hundred kilometers square. The output from these GCMs can then be taken and transformed to a finer resolution, sometimes using statistical relationships, sometimes by running the model again on a smaller domain.
  12. This transformation step is a process referred to as downscaling. The result of this downscaling step is what is used to drive impacts models and that is our target data for distribution.
  13. This is a very simplified flow diagram of the climate data pipeline, but to be clear, a lot of sophisticated physics, climate science, statistics, machine learning and other domains go into each of these steps, each of which has its own set of scientific questions to be answered and associated uncertainties.
  14. So hopefully I have explained these parts well enough for you to get by and we can move on to putting the data on the web.
  15. It turns out that it's hard to do! There are a variety of technical challenges that are worth mentioning.
  16. If you were paying attention through that last slide, you may have thought, “hey, that sounds like a lot of moving parts”. If you consider that climate scientists are modeling some number of future scenarios for human emissions, multiply that by all of the different global climate models, by all of the different regional climate models, by all of the different types of downscaling, by some number of measured quantities, by time, by space... all of a sudden we're talking about a lot of data. You see, unlike observations, we're not limited by the sensors or satellites that we can afford to deploy. We can just create data out of thin air... or at least out of a model that is churning away at simulating the earth and the ocean. We're not even limited by what has happened in the past. We are projecting the future. And not just one future, but many, many realizations of the future, as modeled by many people and many modeling centers.
  17. Additional challenges include that model grids are often irregular, curvilinear, or in projections that aren't supported by the FOSS that we want to use. And in simulating global climate, the poles and high latitudes are often very important to the climate system. But those same places are often overlooked by popular, well-supported projections like Web Mercator, since few people live there.
  18. Also, the geospatial component of the data is not the only component. Maps are usually what we use in this community as our final outputs, but with all the different permutations of our data, this would end up being tens of thousands of maps, or billions if you consider the full time dimension.
  19. With those challenges in mind, let's get back to the web. I'm going to go through a demo of what we developed to deliver data to our users, from the point of view of a hypothetical user of our climate services. This user, Alice, is an engineer with BC's Ministry of Transportation. She is working in one of BC's remote coastal communities, connected to the outside world by one ferry and a single road, and is assessing its vulnerability to extreme precipitation and the effects on roads, culverts, bridges, and other critical infrastructure. Alice wants plausible future climate scenarios for the watersheds around the highway. We give the user a map to select their region of interest, with an overlay of the climate raster to show where information is available. On the right-hand side, we give our user a tree of scenarios to select from, corresponding to different greenhouse gas concentration pathways, different sets of GCMs, and different downscaling methods. There's a time selector in case they're only interested in a subset of the future. Perhaps they only want an analysis of the past plus the next forty years, to correspond to the projected lifespan of their bridge. Then the user just has to select an output format and hit download, and the data starts streaming. Now, the datasets that a user could download are potentially very large. Each scenario over the full spatiotemporal domain is around 150 GB, which isn't ridiculous to serve up as a static file over HTTP. However, as soon as you want to write a web app around it and allow dynamic responses, subset requests, etc., all of a sudden you have a lot of I/O-bound computation to do before your HTTP response goes back, which ideally should happen in less than one second. So, while we're waiting for the data to download, I'll go through the FOSS components that we used off the shelf and some of the more “creative” things that we had to do to make all this work.
  20. To give a sense of the breakdown between how much plug-and-play we were able to do versus how much custom development work we did, here is a comparison. Everything in the left column we have been able to use out of the box to meet our needs. On the right you'll see a couple of Python packages to which we made heavy modifications, and then there's our data management schemas and the user interface.
  21. Placing those components on the UI... the front-end is obviously OpenLayers, with a basemap based on OpenStreetMap data rendered by Mapnik, served through TileCache. The climate overlay layer is being served through ncWMS, an aptly named program that takes NetCDF input files and serves them as WMS layers. The data service layer is an OPeNDAP server named PyDAP which has been heavily modified by us and will eventually be released as open source.
  22. One of the technical problems that we ran up against was that all of the available OPeNDAP data servers load their responses entirely into RAM before sending them out. So if you want to serve up large data sets, the size of your response is limited by your available RAM divided by the number of concurrent responses that you are prepared to serve. If you try to make a request to, say, a THREDDS OPeNDAP server that's larger than the JVM's allocated memory, the user will just get back a BadRequest error. For some applications this may be fine, or even desirable, but for the purposes of serving large data sets, the network pipe is usually the bottleneck. Rather than annoy and frustrate the user by forcing them to carve up their data requests to be arbitrarily small, we wanted to allow as large a request as the users were prepared to accept.
  23. One of my big pet peeves is when you go to a website to get data, you home in on what you want, the anticipation builds, you're ready for your data, and then you get some dialog box suggesting that “We'll e-mail you when your data is ready!” It's the biggest letdown to focus in on this dataset that you want and need and then to have to turn your attention to something else while the server puts you, the user, on indefinite hold until it's ready. I don't intend any disrespect to the websites which I have shown, but at PCIC, one of our key requirements was to avoid this tragic UX mistake, give the user an instant response, and begin streaming the data immediately.
  24. Enter generators and coroutines. Generators are a control structure in which a function, rather than returning, yields execution and hands back values one at a time, on demand. They have the performance advantage of a low memory footprint: if you want to return something large, you don't have to do so all at once. They also tend to be slightly faster, because you avoid a lot of the calling overhead of stack manipulation. Generators have been around for a good thirty-five years, but have been experiencing a bit of a renaissance lately. If you program in Python, they are extremely easy to use, and with the advent of big data applications, they have a lot of utility.
  25. For those who aren't familiar, here's a quick example to understand generators. Generating a Fibonacci sequence is kind of the quintessential toy example. The generator function, fibonacci(), is defined at the top. You'll notice that it's an infinite loop, because the sequence is, by definition, infinite. But rather than building up the values in memory, it just has a simple and elegant “yield” statement right inside the loop. The calling loop down below actually pulls items from the function, one at a time, and then does whatever it needs to do with them. It's fast, efficient, and actually fairly elegant, readable code, too. So you can see, for something like a web application serving big datasets, this is perfect, because we can provide a very low latency response and then stream the data to the user as our high-latency operations like disk reads take place. None of the OPeNDAP servers out there supported streaming, so our development team has made substantial additions to the pydap OPeNDAP server to use generators for serving large datasets. We hope to open source these changes later in the year. All of the code will be available soon enough, so I won't go into it, but know that generators were the key enabling technology.
  26. With the generators discussion behind us, let's turn back to our demo. We're probably done downloading the data, and it looks like we are. Our engineer has now quickly downloaded gigabytes of custom-selected, climate scenario output with just a few clicks. The data is fully attributed with metadata, with units, with references and citations to the methods used to perform the downscaling. In the spirit of open science, having all of the metadata directly attached to the data is actually a pretty big deal, because it ensures data provenance is trackable even if further operations are performed on the data later on, which is highly likely. From here, it's relatively easy for her to plug the numbers into whatever impacts model she wants to run.
  27. With that, I'll leave you with my simple conclusions. Governments use our downscaled climate model output to plan for the effects of climate change on their infrastructure. There's so much model output that data delivery is a non-trivial problem. We've tried to make it as easy as possible for our users to narrow the data down to what they actually need, and we stream it to them right away. And hopefully later this year, we'll be able to open source our work and make it available to the community.
  28. Finally, I'll close by thanking all of you FOSS developers out there, because without this community, none of this would have been at all possible. I'll be happy to answer any questions at this time.
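
Notes 22 to 25 describe replacing "load the whole response into RAM" with a generator that yields the response piece by piece. The sketch below illustrates that general idea only; it is not PCIC's pydap code, which the talk does not show. The `fibonacci()` generator follows the slide, while `read_chunks`, `make_streaming_app`, and the 64 KiB chunk size are hypothetical names and values chosen for illustration. It assumes a plain WSGI server, which accepts any iterable of bytes as a response body.

```python
import io
from itertools import islice


def fibonacci():
    """Infinite generator, as on the slide: yields one value at a time."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


def read_chunks(fileobj, chunk_size=64 * 1024):
    """Yield successive byte chunks from a file-like object.

    Only one chunk is held in memory at a time, so the memory cost does
    not grow with the size of the underlying data.
    (Hypothetical helper, not part of pydap.)
    """
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk


def make_streaming_app(open_source, chunk_size=64 * 1024):
    """Build a WSGI app that streams whatever open_source() returns.

    open_source is a hypothetical zero-argument callable returning a
    binary file-like object (for example, an opened data file).
    """
    def app(environ, start_response):
        # Headers go out immediately; the body follows chunk by chunk
        # as the server iterates over the returned generator.
        start_response("200 OK",
                       [("Content-Type", "application/octet-stream")])
        return read_chunks(open_source(), chunk_size)
    return app


def demo():
    """Consume both generators lazily, without a real web server."""
    first_ten = list(islice(fibonacci(), 10))
    app = make_streaming_app(lambda: io.BytesIO(b"x" * 2500), chunk_size=1000)
    body = app({}, lambda status, headers: None)
    return first_ten, [len(chunk) for chunk in body]
```

Because the app returns a generator instead of a fully built bytes object, the server can send the status line and headers right away and read from disk only as fast as the client consumes the stream. A real deployment would also need to handle errors raised mid-stream, which this sketch leaves out.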