SlideShare une entreprise Scribd logo
1  sur  14
Large Scale Scientific Data Notes From The Field Rob Gillen Computer Science Research Oak Ridge National Laboratory Planet Technologies, Inc.
ORNL is DOE’s largest scienceand energy laboratory ,[object Object]
Nation’s largest concentrationof open source materials research
$1.3B budget
4,350 employees
3,900 researchguests annually
$350 million investedin modernization
Nation’s most diverse energy portfolio
Operating the world’s most intense pulsed neutron source
Managing the billion-dollar U.S. ITER project,[object Object]
World’s most powerful open scientific computing facility
Jaguar XT operating at  >1.64 petaflops
Exascale system by the end of the next decade
Focus on computationally intensive projects of large scale and high scientific impact

Contenu connexe

Plus de Rob Gillen

What's in a password
What's in a password What's in a password
What's in a password Rob Gillen
 
How well do you know your runtime
How well do you know your runtimeHow well do you know your runtime
How well do you know your runtimeRob Gillen
 
Software defined radio and the hacker
Software defined radio and the hackerSoftware defined radio and the hacker
Software defined radio and the hackerRob Gillen
 
So whats in a password
So whats in a passwordSo whats in a password
So whats in a passwordRob Gillen
 
Hiding in plain sight
Hiding in plain sightHiding in plain sight
Hiding in plain sightRob Gillen
 
DevLink - WiFu: You think your wireless is secure?
DevLink - WiFu: You think your wireless is secure?DevLink - WiFu: You think your wireless is secure?
DevLink - WiFu: You think your wireless is secure?Rob Gillen
 
You think your WiFi is safe?
You think your WiFi is safe?You think your WiFi is safe?
You think your WiFi is safe?Rob Gillen
 
Anatomy of a Buffer Overflow Attack
Anatomy of a Buffer Overflow AttackAnatomy of a Buffer Overflow Attack
Anatomy of a Buffer Overflow AttackRob Gillen
 
Intro to GPGPU with CUDA (DevLink)
Intro to GPGPU with CUDA (DevLink)Intro to GPGPU with CUDA (DevLink)
Intro to GPGPU with CUDA (DevLink)Rob Gillen
 
A Comparison of AWS and Azure - Part2
A Comparison of AWS and Azure - Part2A Comparison of AWS and Azure - Part2
A Comparison of AWS and Azure - Part2Rob Gillen
 
A Comparison of AWS and Azure - Part 1
A Comparison of AWS and Azure - Part 1A Comparison of AWS and Azure - Part 1
A Comparison of AWS and Azure - Part 1Rob Gillen
 
Intro to GPGPU Programming with Cuda
Intro to GPGPU Programming with CudaIntro to GPGPU Programming with Cuda
Intro to GPGPU Programming with CudaRob Gillen
 
Hands On with Amazon Web Services (StirTrek)
Hands On with Amazon Web Services (StirTrek)Hands On with Amazon Web Services (StirTrek)
Hands On with Amazon Web Services (StirTrek)Rob Gillen
 
Windows Azure: Lessons From The Field
Windows Azure: Lessons From The FieldWindows Azure: Lessons From The Field
Windows Azure: Lessons From The FieldRob Gillen
 
Amazon Web Services for the .NET Developer
Amazon Web Services for the .NET DeveloperAmazon Web Services for the .NET Developer
Amazon Web Services for the .NET DeveloperRob Gillen
 
05561 Xfer Research 01
05561 Xfer Research 0105561 Xfer Research 01
05561 Xfer Research 01Rob Gillen
 
Azure: Lessons From The Field
Azure: Lessons From The FieldAzure: Lessons From The Field
Azure: Lessons From The FieldRob Gillen
 

Plus de Rob Gillen (18)

What's in a password
What's in a password What's in a password
What's in a password
 
How well do you know your runtime
How well do you know your runtimeHow well do you know your runtime
How well do you know your runtime
 
Software defined radio and the hacker
Software defined radio and the hackerSoftware defined radio and the hacker
Software defined radio and the hacker
 
So whats in a password
So whats in a passwordSo whats in a password
So whats in a password
 
Hiding in plain sight
Hiding in plain sightHiding in plain sight
Hiding in plain sight
 
DevLink - WiFu: You think your wireless is secure?
DevLink - WiFu: You think your wireless is secure?DevLink - WiFu: You think your wireless is secure?
DevLink - WiFu: You think your wireless is secure?
 
You think your WiFi is safe?
You think your WiFi is safe?You think your WiFi is safe?
You think your WiFi is safe?
 
Anatomy of a Buffer Overflow Attack
Anatomy of a Buffer Overflow AttackAnatomy of a Buffer Overflow Attack
Anatomy of a Buffer Overflow Attack
 
Intro to GPGPU with CUDA (DevLink)
Intro to GPGPU with CUDA (DevLink)Intro to GPGPU with CUDA (DevLink)
Intro to GPGPU with CUDA (DevLink)
 
AWS vs. Azure
AWS vs. AzureAWS vs. Azure
AWS vs. Azure
 
A Comparison of AWS and Azure - Part2
A Comparison of AWS and Azure - Part2A Comparison of AWS and Azure - Part2
A Comparison of AWS and Azure - Part2
 
A Comparison of AWS and Azure - Part 1
A Comparison of AWS and Azure - Part 1A Comparison of AWS and Azure - Part 1
A Comparison of AWS and Azure - Part 1
 
Intro to GPGPU Programming with Cuda
Intro to GPGPU Programming with CudaIntro to GPGPU Programming with Cuda
Intro to GPGPU Programming with Cuda
 
Hands On with Amazon Web Services (StirTrek)
Hands On with Amazon Web Services (StirTrek)Hands On with Amazon Web Services (StirTrek)
Hands On with Amazon Web Services (StirTrek)
 
Windows Azure: Lessons From The Field
Windows Azure: Lessons From The FieldWindows Azure: Lessons From The Field
Windows Azure: Lessons From The Field
 
Amazon Web Services for the .NET Developer
Amazon Web Services for the .NET DeveloperAmazon Web Services for the .NET Developer
Amazon Web Services for the .NET Developer
 
05561 Xfer Research 01
05561 Xfer Research 0105561 Xfer Research 01
05561 Xfer Research 01
 
Azure: Lessons From The Field
Azure: Lessons From The FieldAzure: Lessons From The Field
Azure: Lessons From The Field
 

Dernier

Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 

Dernier (20)

Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 

Azure Sample for Climate Analysis

  • 1. Large Scale Scientific Data Notes From The Field Rob Gillen Computer Science Research Oak Ridge National Laboratory Planet Technologies, Inc.
  • 2.
  • 3. Nation’s largest concentrationof open source materials research
  • 7. $350 million investedin modernization
  • 8. Nation’s most diverse energy portfolio
  • 9. Operating the world’s most intense pulsed neutron source
  • 10.
  • 11. World’s most powerful open scientific computing facility
  • 12. Jaguar XT operating at >1.64 petaflops
  • 13. Exascale system by the end of the next decade
  • 14. Focus on computationally intensive projects of large scale and high scientific impact
  • 15. Just upgraded to ~225,000 cores
  • 16. Addressing key science and technology issues
  • 20. Bioenergy3 Managed by UT-Battellefor the Department of Energy
  • 21. Initial Context Studying the intersection of HPC/scientific computing and the cloud Data locality is a key issue for us Cloud computing looks to fill a niche in pre- and post-processing as well as generalized mid-range compute This project is an introductory or preparatory step into the larger research project
  • 22. Sample Application Goals Make CMIP3 data more accessible/consumable Prototype the use of cloud computing for post-processing of scientific data Answer the questions: Can cloud computing be used effectively for large-scale data How accessible is the programming paradigm Note: focus is on the mechanics, not the science (could be using number of foobars in the world rather than temp simulations)
  • 23. Technologies Utilized Windows Azure (tables, blobs, queues, web roles, worker roles) OGDI (http://ogdisdk.cloudapp.net/) C#, F#, PowerShell, DirectX, SilverLight, WPF, Bing Maps (Virtual Earth), GDI+, ADO.NET Data Services
  • 24. Two-Part Problem Get the data into the cloud/exposed in such a way as to be consumable by generic clients in Internet-friendly formats Provide some sort of visualization or sample application to provide context/meaning to the data.
  • 25. Context: 35 Terabytes of numbers - How much data is that? A single latitude/longitude map at typical climate model resolution represents about ~40 KB. If you wanted to look at all 35 TB in the form of these latitude/longitude plots and if.. Every 10 seconds you displayed another map and if You worked 24 hours a day 365 days each year, You could complete the task in about 200 years.
  • 26. Dataset Used 5 GB worth of NetCDF files Contributing Sources NOAA Geophysical Fluid Dynamics Laboratory, CM2.0 Model NASA Goddard Institute for Space Studies, C4x3 NCAR Parallel Climate Model (Version 1) Climate of the 20th Century Experiment, run 1, daily Surface Air Temperature (tas) Maximum Surface Air Temperature (tasmax) Minimum Surface Air Temperature (tasmin) > 1.1 billion unique values (lat/lon/temp pairs) 0.014 % of total set
  • 27. Application Workflow Source file are uploaded to blob storage Each source file is split into 1000’s of CSV files stored in blob storage Process generates a Load Table command for each CSV created Load Table workers process jobs and load CSV data into Azure Tables. Once a CSV file has been processed, a Create Image job is created
  • 28. Application Workflow Create Image workers process queue, generating a heat map image for each time set Once all data is loaded and images are created, a video is rendered based on the resulting images and used for inclusion in visualization applications.
  • 29. Current Data Loaded > 1.1 billion table entries (lat/lon/value) > 250,000 blobs > 75 GB (only blob)
  • 30. Data Load Review Results for first subset Averaged 2:30/time period 40,149 time periods 24 per worker hour 1,672.8 worker-hours 14 active workers 119.5 calendar hours 328,900,608 total entities Near-linear scale out This represents 0.003428 % of total set
  • 31. WPF Data Visualization Application Demo

Notes de l'éditeur

  1. For updates to this content please download the latest Azure Services Platform Training Kit from: http://www.azure.com