“Toward A National Big Data Superhighway”
Closing Keynote
Internet2 Global Summit
Washington, DC
April 26, 2017
Dr. Larry Smarr
Director, California Institute for Telecommunications and Information Technology
Harry E. Gruber Professor,
Dept. of Computer Science and Engineering
Jacobs School of Engineering, UCSD
http://lsmarr.calit2.net
Abstract
Research in data-intensive fields is increasingly multi-investigator and multi-institutional,
depending on ever more rapid access to ultra-large heterogeneous and widely
distributed datasets. The Pacific Research Platform (PRP) is an NSF-funded research
project which extends NSF-funded campus Science DMZs to a regional model, built on
the CENIC/Pacific Wave backbone, establishing a science-driven, high-capacity,
data-centric "freeway system." The PRP spans all 10 campuses of the University of
California, as well as the major California private research universities, four
supercomputer centers, and several universities outside California. Fifteen multi-campus
data-intensive application teams, including particle physics, astronomy/astrophysics,
earth sciences, biomedicine, and scalable multimedia, act as drivers of the PRP,
providing feedback over the five years to the technical design staff. Over the next three
years, PRP will examine sustainable methods for expanding such regional networks to a
national scale.
Vision: Creating a West Coast “Big Data Freeway”
Connected by CENIC/Pacific Wave to Internet2 & GLIF
Use Lightpaths to Connect
Big Data Generators and Consumers,
Creating a “Big Data” Freeway
Integrated With High Performance Global Networks
“The Bisection Bandwidth of a Cluster Interconnect,
but Deployed on a 20-Campus Scale.”
This Vision Has Been Building for Over a Decade
NSF’s OptIPuter Project: Using Supernetworks
to Meet the Needs of Data-Intensive Researchers
OptIPortal: Termination Device for the OptIPuter Global Backplane
Calit2 (UCSD, UCI), SDSC, and UIC Leads—Larry Smarr PI
Univ. Partners: NCSA, USC, SDSU, NW, TA&M, UvA, SARA, KISTI, AIST
Industry: IBM, Sun, Telcordia, Chiaro, Calient, Glimmerglass, Lucent
2003-2009
$13,500,000
In August 2003, Jason Leigh and his students used RBUDP to blast data
from NCSA to SDSC over the TeraGrid DTFnet, achieving 18Gbps file transfer
out of the available 20Gbps
LS Slide 2005
DOE ESnet’s Science DMZ: A Scalable Network
Design Model for Optimizing Science Data Transfers
• A Science DMZ integrates 4 key concepts into a unified whole:
– A network architecture designed for high-performance applications,
with the science network distinct from the general-purpose network
– The use of dedicated systems as data transfer nodes (DTNs)
– Performance measurement and network testing systems that are
regularly used to characterize and troubleshoot the network
– Security policies and enforcement mechanisms that are tailored for
high performance science environments
http://fasterdata.es.net/science-dmz/
Science DMZ
Coined 2010
The DOE ESnet Science DMZ and the NSF “Campus Bridging” Taskforce Report Formed the Basis
for the NSF Campus Cyberinfrastructure Network Infrastructure and Engineering (CC-NIE) Program
Based on Community Input and on ESnet’s Science DMZ Concept,
NSF Has Funded Over 100 Campuses to Build Local Big Data Freeways
Red 2012 CC-NIE Awardees
Yellow 2013 CC-NIE Awardees
Green 2014 CC*IIE Awardees
Blue 2015 CC*DNI Awardees
Purple Multiple Time Awardees
Source: NSF
I Believe as Greg Bell Has Said
We Should Engineer the Network as an Instrument of Discovery
It is all about the end users!
We Must Optimize The Instrument
For Multi-Campus Collaborating Application Teams
How CC-NIE Prism@UCSD Grant Transforms Big Data Microbiome Science:
Preparing for Knight/Smarr 1 Million Core-Hour Analysis
12 Cores/GPU
128 GB RAM
3.5 TB SSD
48TB Disk
10Gbps NIC
Knight Lab
FIONA
10Gbps
Gordon
Prism@UCSD
Data Oasis
7.5PB,
200GB/s
Knight 1024 Cluster
In SDSC Co-Lo
CHERuB
100Gbps
Emperor & Other Vis Tools
64Mpixel Data Analysis Wall
120Gbps
40Gbps
1.3Tbps
The Next Logical Step:
Build a Regional DMZ by Connecting West Coast Campus DMZs
• May 2014 LS Gives Invited Presentation to UC IT Leadership Council
– Strong Support from UC and UCOP CIOs
• July 2014 LS Gives Invited Talk to CENIC Annual Retreat
– CENIC/PW Agrees to Act as Backplane
– CIO Support Extends to CA Private Research Universities
• December 2014 UCOP CIO and VPRs Provide PRP “Momentum Money”
• January 2015 Kickoff of PRPv0 by Network Engineers
– Begins Conference Calls Every Two Weeks, Now Weekly
• March 2015 LS Invited “Blue Sky” Presentation to UC VCR/CIO Summit
– NSF PRP Proposal Submitted With Letters of Commitment From:
– 50 Researchers from 15 Campuses
– 32 IT/Network Organization Leaders
The Pacific Research Platform:
a Working End-to-End Science-Driven Regional DMZ-Connector
NSF CC*DNI Grant
$5M 10/2015-10/2020
PI: Larry Smarr, UC San Diego Calit2
Co-PIs:
• Camille Crittenden, UC Berkeley CITRIS,
• Tom DeFanti, UC San Diego Calit2,
• Philip Papadopoulos, UCSD SDSC,
• Frank Wuerthwein, UCSD Physics and SDSC
(GDC)
PRP is Built on CENIC/Pacific Wave
Our Prototype System – Built for Scientists
Out of a Bunch of Independently Managed Networks
• Challenge:
– Campus DMZs, Regional (e.g., CENIC), National (Internet2), International
Networks (e.g., GLIF) are Individually-Architected Systems
• How Do They Work Together with Predictable Performance?
•  PRP is Focused on Disk-to-Disk Data Movement
– From the Eyes of Domain Scientists
– End-to-End for Their Data is Their Only Real Metric of Concern (As it Should Be)
Source: Phil Papadopoulos
PRP Science DMZ Data Transfer Nodes (DTNs) -
Flash I/O Network Appliances (FIONAs)
UCSD Designed FIONAs
To Solve the Disk-to-Disk
Data Transfer Problem
at Full Speed
on 10G, 40G and 100G Networks
FIONAs: 10/40G, $8,000
FIONette: 1G, $1,000
Phil Papadopoulos, SDSC &
Tom DeFanti, Joe Keefe & John Graham, Calit2
John Graham, Calit2
More Than 30 PRP Installed FIONAs:
Customized to the Needs of Application Teams
• Data Transfer Nodes
– 1, 10, 40, and 100Gb/s NICs
• Storage Transfer Nodes
– Up to 160TB of Rotating Disks
– Nonvolatile Memory Disks (NVMe - 10x Faster than Flash)
– ½ PB Flash Disk (at SC15, on Loan From Vendor)
• Compute Transfer Nodes
– 12-48 Intel CPU Cores
– 1-8 GPUs (Delivers Up to 500,000 GPU Core Hours/Day)
• Visualization Transfer Nodes
– 3-45 Tiled displays (up to 180 Megapixels, 2D & 3D)
– 360-Megapixel SunCAVE Coming Soon
PRP Continues to Expand Rapidly While Increasing Connectivity:
1 1/2 Years of Progress – 12 Sites to 24 Sites
January 29, 2016
Connected 24 DMZ FIONAs
at 10G and 40G
April 24, 2017
Source: John Graham, Calit2
We Measure FIONA Disk-to-Disk Throughput with 10GB File Transfer
4 Times Per Day in Both Directions for All PRP Sites
See Time Lapse Movie Jan 2016 to Today
http://prp-maddash.calit2.optiputer.net/optiputer/optiputer.mp4
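The measurement described on this slide can be sketched as simple arithmetic. The block below is only an illustration: it assumes decimal units (1 GB = 10^9 bytes), a full mesh in which every site tests against every other site, and a made-up 10-second transfer time. The function names are hypothetical and are not taken from the PRP tooling.

```python
def throughput_gbps(size_bytes: float, seconds: float) -> float:
    """Average disk-to-disk throughput of one transfer, in gigabits per second."""
    return size_bytes * 8 / seconds / 1e9


def daily_mesh_load(sites: int, runs_per_day: int, size_bytes: float):
    """Return (transfers per day, bytes moved per day), assuming a full mesh."""
    ordered_pairs = sites * (sites - 1)  # every pair, tested in both directions
    transfers = ordered_pairs * runs_per_day
    return transfers, transfers * size_bytes


TEST_FILE_BYTES = 10e9  # the 10GB test file from the slide (decimal units assumed)

# Hypothetical measurement: the file reaches the remote disk in 10 seconds.
print(throughput_gbps(TEST_FILE_BYTES, 10.0))  # 8.0

# 24 sites, 4 runs per day, as stated on the slides.
transfers, volume = daily_mesh_load(24, 4, TEST_FILE_BYTES)
print(transfers, volume / 1e12)  # 2208 transfers, about 22.08 TB per day
```

Under the full-mesh assumption, the test traffic alone would be roughly 22 TB per day; if only a subset of site pairs is tested, the totals shrink accordingly.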
We Have Held a Number of
PRP Science Engagement Workshops
Source: Camille Crittenden, UC Berkeley
UC San Diego
UC Merced
UC Davis UC Berkeley
PRP’s First 1.5 Years:
Connecting Campus Application Teams and Devices
We Scale the Working PRP by Providing Multi-Campus Application Teams
With Disk-to-Disk Measurements
UIC
UCSD
UCI
U Hawaii
USC
NCAR
SDSU
LHC Researchers Look to PRP to Fix the Last Mile Architecture in California:
Data and Compute Resources Can Both Be Shared
PRP provides an Implementation of All This on a Single FIONA,
PRP helps Integrate Local Resources into This FIONA.
login nodes
compute
scheduler
compute cluster
storage cluster
DTN
CTN
WAN
CTN = compute transfer node
DTN = data transfer node
Science DMZ
Source: Frank Wuerthwein, UCSD, SDSC
>360 California Scientists Are Researching
Particle Physics Big Data Analysis
• ATLAS
– UCB/LBNL (63)
– SLAC/Stanford (51)
– UCSC (30)
– UCI (32)
• Total of 176 members listed in
ATLAS HR database at CERN
• CMS (Members)
– Caltech (29)
– LLNL (3)
– UCD (41)
– UCLA (17)
– UCR (25)
– UCSD (36)
– UCSB (35)
• Total of 186 members listed in CMS
HR database at CERN
Source: Frank Wuerthwein, UCSD, SDSC
LHC Computing and Data Resources
10 Institutions
• ATLAS Institutions
– SLAC “T2”
– NERSC (used by both)
– UCSC T3
– UCI T3
• CMS Institutions
– Caltech T2
– UCSD T2
– SDSC (used by both)
– UCD T3
– UCR T3
– UCSB T3
Lots of Potential Network Traffic for LHC on PRP
Source: Frank Wuerthwein, UCSD, SDSC
100 Gbps FIONA at UCSC Connects the UCSC Hyades Cluster
to the NERSC Supercomputer at LBNL
Supporting UCSC Remote Access
to Large Data Subsets
of the Dark Energy Spectroscopic Instrument (DESI)
and AGORA Galaxy Simulation Data
Produced at NERSC.
250 images per night
800GB per night
Shawfeng Dong, UCSC Cyberengineer
UCSC Feb 7, 2017
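Taking the 800GB-per-night figure on this slide at face value, a rough lower bound on the nightly transfer time can be worked out for the link speeds the deck mentions for FIONAs (10G, 40G, 100G). This is a sketch under idealized assumptions: decimal units, a fully utilized link, and no protocol or disk overhead. The helper name is hypothetical.

```python
def transfer_seconds(gigabytes: float, link_gbps: float) -> float:
    """Ideal time to move `gigabytes` of data over a link of `link_gbps`."""
    return gigabytes * 8 / link_gbps


NIGHTLY_GB = 800  # per-night volume from the slide

for link in (10, 40, 100):
    print(f"{link:>3} Gb/s: {transfer_seconds(NIGHTLY_GB, link):.0f} s")
#  10 Gb/s: 640 s
#  40 Gb/s: 160 s
# 100 Gb/s: 64 s
```

Real disk-to-disk transfers will take longer than these ideal figures, which is why the deck stresses measured end-to-end performance.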
40G FIONAs
20x40G PRP-connected
WAVE@UC San Diego
PRP Now Enables
Distributed Virtual Reality
PRP
WAVE @UC Merced
Transferring 5 CAVEcam Images from UCSD to UC Merced:
2 Gigabytes now takes 2 Seconds (8 Gb/sec)
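The 8 Gb/sec figure follows directly from the numbers on the slide. A minimal check, assuming decimal gigabytes and an illustrative helper name:

```python
def rate_gbps(gigabytes: float, seconds: float) -> float:
    """Convert a transfer of `gigabytes` in `seconds` to gigabits per second."""
    return gigabytes * 8 / seconds


print(rate_gbps(2, 2))  # 8.0 Gb/s for the five CAVEcam images together
print(2 / 5)            # 0.4 GB per image on average
```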
PRP Will Link the Laboratories of
the Pacific Earthquake Engineering Research Center
http://peer.berkeley.edu/
PEER Labs: UC Berkeley, Caltech, Stanford,
UC Davis, UC San Diego, and UC Los Angeles
John Graham Installing FIONette at PEER Feb 10, 2017
Cancer Genomics Hub (UCSC) is Housed in SDSC:
Large Data Flows to End Users at UCSC, UCB, UCSF, …
1G
8G
Data Source: David Haussler,
Brad Smith, UCSC
15G
Jan 2016
30,000 TB
Per Year
NIH’s Cancer Genomics Database Moved
So the PRP Deployed a FIONA to Chicago’s MREN
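To put the 30,000 TB per year figure in network terms, it can be converted to an average sustained rate. The sketch below assumes decimal terabytes and a 365-day year, and it spreads the traffic evenly, which real flows do not do.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def average_gbps(terabytes_per_year: float) -> float:
    """Average rate in gigabits per second for a given yearly volume."""
    bits = terabytes_per_year * 1e12 * 8
    return bits / SECONDS_PER_YEAR / 1e9


print(round(average_gbps(30_000), 2))  # 7.61
```

An average of roughly 7.6 Gb/s means peaks will run well above that, so a 10G path would leave little headroom.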
The Prototype PRP Has Attracted
New Application Drivers: More in Next Larry and Scott Talks
Scott Sellars, Marty Ralph
Center for Western Weather and Water Extremes
Frank Vernon - Expansion of HPWREN
Tom Levy, Cultural Heritage
Cryo EM
GPU JupyterHub:
2 x 14-core CPUs
256GB RAM
1.2TB FLASH
3.8TB SSD
Nvidia K80 GPU
Dual 40GbE NICs
And a Trusted Platform
Module
GPU JupyterHub:
1 x 18-core CPU
128GB RAM
3.8TB SSD
Nvidia K80 GPU
Dual 40GbE NICs
And a Trusted Platform
Module
PRP UC-JupyterHub Backbone
UCB
Next Step: Deploy Across PRP
UCSD
Source: John Graham, Calit2
Atmospheric
Rivers
(fall and winter)
Southwest
Monsoon
(summer & fall)
Great Plains Convection
(spring and summer)
Front Range Upslope
(rain/snow)
Funded collaborations
CW3E
Based at UCSD/Scripps Oceanography
CW3E-North
at Sonoma
County Water
Agency
Key Phenomena Causing Extreme Precipitation in the Western U.S. (Ralph et al. 2014)
Director: F. Martin Ralph Website: cw3e.ucsd.edu
Data is at the heart of what we do!
• High resolution numerical models
• Satellite images
• Ground based weather stations
• Weather radar
• Historical climate data
Big Data Collaboration with:
Source: Scott Sellars, CW3E
Collaboration on Atmospheric Water
Between UC San Diego and UC Irvine
Director, Soroosh Sorooshian, UCSD Website http://chrs.web.uci.edu
Calit2’s FIONA
SDSC’s COMET
Calit2’s FIONA
Pacific Research Platform (10-100 Gb/s)
GPUs
GPUs
Complete workflow time: 20 days → 20 hrs → 20 Minutes!
UC, Irvine UC, San Diego
Improvement of Over 1000x With PRP
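The "over 1000x" claim can be checked from the three workflow times quoted on the slide. A small sketch, with all durations converted to minutes:

```python
MINUTES_PER_HOUR = 60
MINUTES_PER_DAY = 24 * MINUTES_PER_HOUR

# Workflow times from the slide, in minutes.
before = 20 * MINUTES_PER_DAY   # 20 days  -> 28,800 minutes
middle = 20 * MINUTES_PER_HOUR  # 20 hours ->  1,200 minutes
after = 20                      # 20 minutes

print(before / middle)  # 24.0   (days to hours)
print(middle / after)   # 60.0   (hours to minutes)
print(before / after)   # 1440.0 (overall speedup)
```

The overall factor of 1440 is consistent with the slide's "Over 1000x".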
Cryo-electron Microscopy (cryo-EM)
Has Driven a “Resolution Revolution” in the Last Five Years
Exposure (every 60 seconds):
X & Y dimensions: 7420 x 7676 Pixels
Frames per Movie: 10 - 50
Size: 3 - 10 GB per Movie
Every 24 hours:
Number of Movies: ~1400
Data Size: ~5 TB
Typical Datasets:
Length of Time: 2 - 6 Days
Total size: 10 - 30 TB
Each Cryo-EM ‘Image’ is Actually a Movie
Source: Michael A. Cianfrocco,
Elizabeth Villa, & Andres Leschziner, UCSD
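The cryo-EM numbers on this slide are mutually consistent, which a few lines of arithmetic can confirm. The sketch assumes decimal units and uses only the figures quoted above; the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 3600

exposure_interval_s = 60  # one movie every 60 seconds
movies_per_day_slide = 1400  # "~1400" from the slide
daily_tb = 5                 # "~5 TB" from the slide

# Upper bound on movies per day if the microscope never pauses.
max_movies_per_day = SECONDS_PER_DAY // exposure_interval_s
print(max_movies_per_day)  # 1440

# Average movie size implied by the daily totals.
avg_movie_gb = daily_tb * 1000 / movies_per_day_slide
print(round(avg_movie_gb, 2))  # 3.57, inside the 3 - 10 GB range

# Sustained rate needed to stream one day's data as it is produced.
sustained_gbps = daily_tb * 1e12 * 8 / SECONDS_PER_DAY / 1e9
print(round(sustained_gbps, 2))  # 0.46

# Dataset size for a 2 - 6 day session.
print(2 * daily_tb, 6 * daily_tb)  # 10 30
```

A single microscope therefore needs under 1 Gb/s on average, but about 20 instruments streaming at once would approach 10 Gb/s.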
Using PRP to Connect Cryo-EM across California
With End Users and Computational Facilities
Long term:
‣ Partner with Cryo-EM Facilities to Stream Data
Straight from Microscopes (over PRP) to SDSC
‣ Perform All Cryo-EM Analysis (from Micrographs
to 3D Models) via Web Browser on SDSC
‣ Expand Computing to Other XSEDE Resources
(e.g. Xstream) and DOE’s NERSC
Short term:
‣ Provide 2D and 3D Analysis on Particle Stacks on
Comet at SDSC
Source: Michael A. Cianfrocco, UCSD
*
*
SDSC
NERSC
Xstream
3 Supercomputer Centers
cosmic-cryoem.org
~20 Microscopes in CA
UCLA
UC Davis
UC Santa Cruz
SF Bay
UC Berkeley, LBNL,
UCSF, Stanford
San Diego
UCSD, TSRI, Salk*
Linking Cultural Heritage and Archaeology Datasets
at UCB, UCLA, UCM and UCSD with CAVEkiosks
48 Megapixel CAVEkiosk
UCSD Library
48 Megapixel CAVEkiosk
UCB Library
24 Megapixel CAVEkiosk
UCM Library
PRP is the Platform Chosen for 2017 Expansion
of HPWREN, Connected to CENIC, into Orange and Riverside Counties
• PRP CENIC 100G Link
UCSD to SDSU
– DTN FIONAs Endpoints
– Data Redundancy
– Disaster Recovery
– High Availability
– Network Redundancy
• Anchor to CENIC at UCI
– PRP FIONA Connects to
CalREN-HPR Network
– Data Replication Site
• Potential Future UCR
CENIC Anchor
UCR
UCI
UCSD
SDSU
Source: Frank Vernon,
Greg Hidley, UCSD
Proposed Cognitive Hardware and Software Ecosystem
On the Pacific Research Platform
• Working With 30 CSE Machine Learning Researchers
– Goal is 320 Game GPUs in 32-40 FIONAs at 10 PRP Campuses
– PRP Couples FIONAs with GPUs into a Condor-Managed Cloud
• PRP Access to Emerging Processors
– IBM TrueNorth, KnuEdge, FPGA, and Qualcomm Snapdragon
• Software Including a Wide Range of Open ML Algorithms
• Metrics for Performance of Processors and Algorithms
Source: Tom DeFanti, Calit2
FIONA with 8 Game GPUs
We are Now Investigating
How the PRP Prototype Might Be Extended to National-Scale
From the text of the PRP cooperative agreement:
After approximately 18 (or TBD) months, a site visit and comprehensive review of
progress towards meeting project milestones and goals and overall performance and
management processes will take place, including user community relationships,
scientific impacts, and the status of the project as a model for potential future
national-scale, network-aware, data-focused cyberinfrastructure attributes,
approaches, and capabilities.
Expanding to National Research Platform and Global Research Platform
Via CENIC/Pacific Wave, Internet2, and International Links
PRP’s Current
International
Partners
Korea Shows Distance is Not the Barrier
to Above 5Gb/s Disk-to-Disk Performance
PRP Working on Connecting Guam
via the University of Oregon-Based Network Startup Resource Center
The PRP shipped a FIONette
to CENIC’s John Hess
to be Installed in Guam Mid-May
To support projects in:
• Geography
• Climate History
• Guam EPSCoR
• The UOG Marine Laboratory
“During the quarter century that this group has been helping to build internet infrastructure
around the world, there’s hardly a place on the planet that has not been touched
by the great work of the Network Startup Resource Center,” -- Larry Smarr.
PRP is Partnering with the Advanced CyberInfrastructure –
Research and Education Facilitators (ACI-REF) NSF Grant to Explore Extension
PRP Connected
• ACI-REF has also spawned the 28-member Campus Research Computing
consortium (CaRC), funded by the NSF as a Research Coordination Network (RCN).
• CaRC is dedicated to sharing best practices, expertise, and resources, enabling
the advancement of campus-based research computing activities around the nation.
Jim Bottum, Principal Investigator
ACI-REF
CaRC
Announcing the First National Research Platform
Workshop August 7-8, 2017
Co-Chairs:
Larry Smarr, Calit2
& Jim Bottum, Internet2
See pacificresearchplatform.org
for Registration Information
Toward a National Research Platform
PRP has 3 FTEs to Connect ~25 Campuses.
How Many are Needed to Expand to a NRP
Serving Researchers at 250 Campuses in Dozens of Fields?
What is the Path Forward?
As Internet2 Board of Trustees Member
John Evans Said to Me Last Night:
“We Are Near an Inflection Point.”
Our Support:
• US National Science Foundation (NSF) awards CNS-0821155, CNS-1338192,
CNS-1456638, ACI-1540112, and ACI-1541349
• University of California Office of the President CIO
• UCSD Chancellor’s Integrated Digital Infrastructure Program
• UCSD Next Generation Networking initiative
• Calit2 and Calit2 Qualcomm Institute
• CENIC, PacificWave and StarLight
• DOE ESnet

 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 

Toward A National Big Data Superhighway

  • 1. “Toward A National Big Data Superhighway” Closing Keynote Internet2 Global Summit Washington, DC April 26, 2017 Dr. Larry Smarr Director, California Institute for Telecommunications and Information Technology Harry E. Gruber Professor, Dept. of Computer Science and Engineering Jacobs School of Engineering, UCSD http://lsmarr.calit2.net
  • 2. Abstract Research in data-intensive fields is increasingly multi-investigator and multi-institutional, depending on ever more rapid access to ultra-large heterogeneous and widely distributed datasets. The Pacific Research Platform (PRP) is an NSF-funded research project which extends NSF-funded campus Science DMZs to a regional model, built on the CENIC/Pacific Wave backbone, establishing a science-driven high-capacity data-centric "freeway system." The PRP spans all 10 campuses of the University of California, as well as the major California private research universities, four supercomputer centers, and several universities outside California. Fifteen multi-campus data-intensive application teams, including particle physics, astronomy/astrophysics, earth sciences, biomedicine, and scalable multimedia, act as drivers of the PRP, providing feedback over the five years to the technical design staff. Over the next three years, PRP will examine sustainable methods for expanding such regional networks to a national scale.
  • 3. Vision: Creating a West Coast “Big Data Freeway” Connected by CENIC/Pacific Wave to Internet2 & GLIF Use Lightpaths to Connect Big Data Generators and Consumers, Creating a “Big Data” Freeway Integrated With High Performance Global Networks “The Bisection Bandwidth of a Cluster Interconnect, but Deployed on a 20-Campus Scale.” This Vision Has Been Building for Over a Decade
  • 4. NSF’s OptIPuter Project: Using Supernetworks to Meet the Needs of Data-Intensive Researchers OptIPortal– Termination Device for the OptIPuter Global Backplane Calit2 (UCSD, UCI), SDSC, and UIC Leads – Larry Smarr PI Univ. Partners: NCSA, USC, SDSU, NW, TA&M, UvA, SARA, KISTI, AIST Industry: IBM, Sun, Telcordia, Chiaro, Calient, Glimmerglass, Lucent 2003-2009 $13,500,000 In August 2003, Jason Leigh and his students used RBUDP to blast data from NCSA to SDSC over the TeraGrid DTFnet, achieving 18Gbps file transfer out of the available 20Gbps LS Slide 2005
  • 5. DOE ESnet’s Science DMZ: A Scalable Network Design Model for Optimizing Science Data Transfers • A Science DMZ integrates 4 key concepts into a unified whole: – A network architecture designed for high-performance applications, with the science network distinct from the general-purpose network – The use of dedicated systems as data transfer nodes (DTNs) – Performance measurement and network testing systems that are regularly used to characterize and troubleshoot the network – Security policies and enforcement mechanisms that are tailored for high performance science environments http://fasterdata.es.net/science-dmz/ Science DMZ Coined 2010 The DOE ESnet Science DMZ and the NSF “Campus Bridging” Taskforce Report Formed the Basis for the NSF Campus Cyberinfrastructure Network Infrastructure and Engineering (CC-NIE) Program
  • 6. Based on Community Input and on ESnet’s Science DMZ Concept, NSF Has Funded Over 100 Campuses to Build Local Big Data Freeways Red 2012 CC-NIE Awardees Yellow 2013 CC-NIE Awardees Green 2014 CC*IIE Awardees Blue 2015 CC*DNI Awardees Purple Multiple Time Awardees Source: NSF
  • 7. I Believe as Greg Bell Has Said We Should Engineer the Network as an Instrument of Discovery It is all about the end users! We Must Optimize The Instrument For Multi-Campus Collaborating Application Teams
  • 8. How CC-NIE Prism@UCSD Grant Transforms Big Data Microbiome Science: Preparing for Knight/Smarr 1 Million Core-Hour Analysis 12 Cores/GPU 128 GB RAM 3.5 TB SSD 48TB Disk 10Gbps NIC Knight Lab FIONA 10Gbps Gordon Prism@UCSD Data Oasis 7.5PB, 200GB/s Knight 1024 Cluster In SDSC Co-Lo CHERuB 100Gbps Emperor & Other Vis Tools 64Mpixel Data Analysis Wall 120Gbps 40Gbps 1.3Tbps
  • 9. The Next Logical Step: Build a Regional DMZ by Connecting West Coast Campus DMZs • May 2014 LS Gives Invited Presentation to UC IT Leadership Council – Strong Support from UC and UCOP CIOs • July 2014 LS Gives Invited Talk to CENIC Annual Retreat – CENIC/PW Agrees to Act as Backplane – CIO Support Extends to CA Private Research Universities • December 2014 UCOP CIO and VPRs Provide PRP “Momentum Money” • January 2015 Kickoff of PRPv0 by Network Engineers – Begins Every Two Week Conference Calls, Now Weekly • March 2015 LS Invited “Blue Sky” Presentation to UC VCR/CIO Summit – NSF PRP Proposal Submitted With Letters of Commitment From: – 50 Researchers from 15 Campuses – 32 IT/Network Organization Leaders
  • 10. The Pacific Research Platform: a Working End-to-End Science-Driven Regional DMZ-Connector NSF CC*DNI Grant $5M 10/2015-10/2020 PI: Larry Smarr, UC San Diego Calit2 Co-PIs: • Camille Crittenden, UC Berkeley CITRIS, • Tom DeFanti, UC San Diego Calit2, • Philip Papadopoulos, UCSD SDSC, • Frank Wuerthwein, UCSD Physics and SDSC (GDC) PRP is Built on CENIC/Pacific Wave
  • 11. Our Prototype System – Built for Scientists Out of a Bunch of Independently Managed Networks • Challenge: – Campus DMZs, Regional (e.g., CENIC), National (Internet2), International Networks (e.g., GLIF) are Individually-Architected Systems • How Do They Work Together with Predictable Performance? • PRP is Focused on Disk-to-Disk Data Movement – From the Eyes of Domain Scientists – End-to-End for Their Data is Their Only Real Metric of Concern (As it Should Be) Source: Phil Papadopoulos
  • 12. PRP Science DMZ Data Transfer Nodes (DTNs) - Flash I/O Network Appliances (FIONAs) UCSD Designed FIONAs To Solve the Disk-to-Disk Data Transfer Problem at Full Speed on 10G, 40G and 100G Networks FIONAS—10/40G, $8,000 FIONette—1G, $1,000 Phil Papadopoulos, SDSC & Tom DeFanti, Joe Keefe & John Graham, Calit2 John Graham, Calit2
  • 13. More Than 30 PRP Installed FIONAs: Customized to the Needs of Application Teams • Data Transfer Nodes – 1, 10, 40, and 100Gb/s NICs • Storage Transfer Nodes – Up to 160TB of Rotating Disks – Nonvolatile Memory Disks (NVMe - 10x Faster than Flash) – ½ PB Flash Disk (at SC15, on Loan From Vendor) • Compute Transfer Nodes – 12-48 Intel CPU Cores – 1-8 GPUs (Delivers Up to 500,000 GPU Core Hours/Day) • Visualization Transfer Nodes – 3-45 Tiled displays (up to 180 Megapixels, 2D & 3D) – 360-Megapixel SunCAVE Coming Soon
  • 14. PRP Continues to Expand Rapidly While Increasing Connectivity: 1 1/2 Years of Progress – 12 Sites to 24 Sites January 29, 2016 Connected 24 DMZ FIONAs at 10G and 40G April 24, 2017 Source: John Graham, Calit2
  • 15. We Measure FIONA Disk-to-Disk Throughput with 10GB File Transfer 4 Times Per Day in Both Directions for All PRP Sites See Time Lapse Movie Jan 2016 to Today http://prp-maddash.calit2.optiputer.net/optiputer/optiputer.mp4
  • 16. We Have Held a Number of PRP Science Engagement Workshops Source: Camille Crittenden, UC Berkeley UC San Diego UC Merced UC Davis UC Berkeley
  • 17. PRP’s First 1.5 Years: Connecting Campus Application Teams and Devices
  • 18. We Scale the Working PRP by Providing Multi-Campus Application Teams With Disk-to-Disk Measurements UIC UCSD UCI U Hawaii USC NCAR SDSU
  • 19. LHC Researchers Look to PRP to Fix the Last Mile Architecture in California: Data and Compute Resources Can Both Be Shared PRP provides an Implementation of All This on a Single FIONA; PRP helps Integrate Local Resources into This FIONA. login nodes compute scheduler compute cluster storage cluster DTN CTN WAN CTN = compute transfer node DTN = data transfer node Science DMZ Source: Frank Wuerthwein, UCSD, SDSC
  • 20. >360 California Scientists Are Researching Particle Physics Big Data Analysis • ATLAS – UCB/LBNL (63) – SLAC/Stanford (51) – UCSC (30) – UCI (32) • Total of 176 members listed in ATLAS HR database at CERN • CMS (Members) – Caltech (29) – LLNL (3) – UCD (41) – UCLA (17) – UCR (25) – UCSD (36) – UCSB (35) • Total of 186 members listed in CMS HR database at CERN Source: Frank Wuerthwein, UCSD, SDSC
  • 21. LHC Computing and Data Resources 10 Institutions • ATLAS Institutions – SLAC “T2” – NERSC (used by both) – UCSC T3 – UCI T3 • CMS Institutions – Caltech T2 – UCSD T2 – SDSC (used by both) – UCD T3 – UCR T3 – UCSB T3 Lots of Potential Network Traffic for LHC on PRP Source: Frank Wuerthwein, UCSD, SDSC
  • 22. 100 Gbps FIONA at UCSC Connects the UCSC Hyades Cluster to the NERSC Supercomputer at LBNL Supporting UCSC Remote Access to Large Data Subsets of the Dark Energy Spectroscopic Instrument (DESI) and AGORA Galaxy Simulation Data Produced at NERSC. 250 images per night 800GB per night Shawfeng Dong, UCSC Cyberengineer UCSC Feb 7, 2017
  • 23. 40G FIONAs 20x40G PRP-connected WAVE@UC San Diego PRP Now Enables Distributed Virtual Reality PRP WAVE @UC Merced Transferring 5 CAVEcam Images from UCSD to UC Merced: 2 Gigabytes now takes 2 Seconds (8 Gb/sec)
  • 24. PRP Will Link the Laboratories of the Pacific Earthquake Engineering Research Center http://peer.berkeley.edu/ PEER Labs: UC Berkeley, Caltech, Stanford, UC Davis, UC San Diego, and UC Los Angeles John Graham Installing FIONette at PEER Feb 10, 2017
  • 25. Cancer Genomics Hub (UCSC) is Housed in SDSC: Large Data Flows to End Users at UCSC, UCB, UCSF, … 1G 8G Data Source: David Haussler, Brad Smith, UCSC 15G Jan 2016 30,000 TB Per Year
  • 26. NIH’s Cancer Genomics Database Moved So the PRP Deployed a FIONA to Chicago’s MREN
  • 27. The Prototype PRP Has Attracted New Application Drivers-More in Next Larry and Scott Talks Scott Sellars, Marty Ralph Center for Western Weather and Water Extremes Frank Vernon - Expansion of HPWREN Tom Levy, Cultural Heritage Cryo EM
  • 28. GPU JupyterHub: 2 x 14-core CPUs 256GB RAM 1.2TB FLASH 3.8TB SSD Nvidia K80 GPU Dual 40GbE NICs And a Trusted Platform Module GPU JupyterHub: 1 x 18-core CPUs 128GB RAM 3.8TB SSD Nvidia K80 GPU Dual 40GbE NICs And a Trusted Platform Module PRP UC-JupyterHub Backbone UCB Next Step: Deploy Across PRP UCSD Source: John Graham, Calit2
  • 29. Atmospheric Rivers (fall and winter) Southwest Monsoon (summer & fall) Great Plains Convection (spring and summer) Front Range Upslope (rain/snow) Funded collaborations CW3E Based at UCSD/Scripps Oceanography CW3E-North at Sonoma County Water Agency Key Phenomena Causing Extreme Precipitation in the Western U.S. (Ralph et al. 2014) Director: F. Martin Ralph Website: cw3e.ucsd.edu Data is at the heart of what we do! • High resolution numerical models • Satellite images • Ground based weather stations • Weather radar • Historical climate data Big Data Collaboration with: Source: Scott Sellars, CW3E Collaboration on Atmospheric Water Between UC San Diego and UC Irvine Director, Soroosh Sorooshian, UCSD Website http://chrs.web.uci.edu
  • 30. Calit2’s FIONA SDSC’s COMET Calit2’s FIONA Pacific Research Platform (10-100 Gb/s) GPUs GPUs Complete workflow time: 20 days → 20 hrs → 20 Minutes! UC, Irvine UC, San Diego Improvement of Over 1000x With PRP
  • 31. Cryo-electron Microscopy (cryo-EM) Has Driven a “Resolution Revolution” in the Last Five Years Exposure (every 60 seconds): X & Y dimensions: 7420 x 7676 Pixels Frames per Movie: 10 - 50 Size: 3 - 10 GB per Movie Every 24 hours: Number of Movies: ~1400 Data Size: ~5 TB Typical Datasets: Length of Time: 2 - 6 Days Total size: 10 - 30 TB Each Cryo-EM ‘Image’ is Actually a Movie Source: Michael A. Cianfrocco, Elizabeth Villa, & Andres Leschziner, UCSD
  • 32. Using PRP to Connect Cryo-EM across California With End Users and Computational Facilities Long term: ‣ Partner with Cryo-EM Facilities to Stream Data Straight from Microscopes (over PRP) to SDSC ‣ Perform All Cryo-EM Analysis (from Micrographs to 3D Models) via Web Browser on SDSC ‣ Expand Computing to Other XSEDE Resources (e.g. Xstream) and DOE’s NERSC Short term: ‣ Provide 2D and 3D Analysis on Particle Stacks on Comet at SDSC Source: Michael A. Cianfrocco, UCSD * * SDSC NERSC Xstream 3 Supercomputer Centers cosmic-cryoem.org ~20 Microscopes in CA UCLA UC Davis UC Santa Cruz SF Bay UC Berkeley, LBNL, UCSF, Stanford San Diego UCSD, TSRI, Salk*
  • 33. Linking Cultural Heritage and Archaeology Datasets at UCB, UCLA, UCM and UCSD with CAVEkiosks 48 Megapixel CAVEkiosk UCSD Library 48 Megapixel CAVEkiosk UCB Library 24 Megapixel CAVEkiosk UCM Library
  • 34. PRP is the Platform Chosen for 2017 Expansion of HPWREN, Connected to CENIC, into Orange and Riverside Counties • PRP CENIC 100G Link UCSD to SDSU – DTN FIONAs Endpoints – Data Redundancy – Disaster Recovery – High Availability – Network Redundancy • Anchor to CENIC at UCI – PRP FIONA Connects to CalREN-HPR Network – Data Replication Site • Potential Future UCR CENIC Anchor UCR UCI UCSD SDSU Source: Frank Vernon, Greg Hidley, UCSD
  • 35. Proposed Cognitive Hardware and Software Ecosystem On the Pacific Research Platform • Working With 30 CSE Machine Learning Researchers – Goal is 320 Game GPUs in 32-40 FIONAs at 10 PRP Campuses – PRP Couples FIONAs with GPUs into a Condor-Managed Cloud • PRP Access to Emerging Processors – IBM TrueNorth, KnuEdge, FPGA, and Qualcomm Snapdragon • Software Including a Wide Range of Open ML Algorithms • Metrics for Performance of Processors and Algorithms Source: Tom DeFanti, Calit2 FIONA with 8-Game GPUs
  • 36. We are Now Investigating How the PRP Prototype Might Be Extended to National-Scale From the text of the PRP cooperative agreement: After approximately 18 (or TBD) months, a site visit and comprehensive review of progress towards meeting project milestones and goals and overall performance and management processes will take place, including user community relationships, scientific impacts, and the status of the project as a model for potential future national-scale, network-aware, data-focused cyberinfrastructure attributes, approaches, and capabilities.
  • 37. Expanding to National Research Platform and Global Research Platform Via CENIC/Pacific Wave, Internet2, and International Links PRP’s Current International Partners Korea Shows Distance is Not the Barrier to Above 5Gb/s Disk-to-Disk Performance
  • 38. PRP Working on Connecting Guam via the University of Oregon-Based Network Startup Resource Center The PRP shipped a FIONette to CENIC’s John Hess to be Installed in Guam Mid-May To support projects in: • Geography • Climate History • Guam EPSCoR • The UOG Marine Laboratory “During the quarter century that this group has been helping to build internet infrastructure around the world, there’s hardly a place on the planet that has not been touched by the great work of the Network Startup Resource Center,” -- Larry Smarr.
  • 39. PRP is Partnering with the Advanced CyberInfrastructure – Research and Education Facilitators (ACI-REF) NSF Grant to Explore Extension PRP Connected • ACI-REF has also spawned the 28-member Campus Research Computing consortium (CaRC), funded by the NSF as a Research Coordination Network (RCN). • CaRC is dedicated to sharing best practices, expertise, and resources, enabling the advancement of campus-based research computing activities around the nation. Jim Bottum, Principal Investigator ACI-REF CaRC
  • 40. Announcing the First National Research Platform Workshop August 7-8, 2017 Co-Chairs: Larry Smarr, Calit2 & Jim Bottum, Internet2 See pacificresearchplatform.org for Registration Information
  • 41. Toward a National Research Platform PRP has 3 FTEs to Connect ~25 Campuses. How Many are Needed to Expand to a NRP Serving Researchers at 250 Campuses in Dozens of Fields? What is the Path Forward? As Internet2 Board of Trustees Member John Evans Said to Me Last Night: “We Are Near an Inflection Point.”
  • 42. Our Support: • US National Science Foundation (NSF) awards CNS-0821155, CNS-1338192, CNS-1456638, ACI-1540112, and ACI-1541349 • University of California Office of the President CIO • UCSD Chancellor’s Integrated Digital Infrastructure Program • UCSD Next Generation Networking initiative • Calit2 and Calit2 Qualcomm Institute • CENIC, PacificWave and StarLight • DOE ESnet
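
Several slides above quote data volumes and transfer times (slides 23, 25, 30, and 31). The following is a minimal back-of-envelope sketch, not part of the original talk, that converts those quoted figures into average rates. It assumes decimal units (1 GB = 1e9 bytes, 1 TB = 1e12 bytes) and a 365-day year; the helper name `rate_gbps` is hypothetical and chosen only for illustration.

```python
# Back-of-envelope checks on figures quoted in the slides.
# Assumptions (not from the talk): decimal units and a 365-day year.

GB = 1e9                 # bytes in a gigabyte (decimal)
TB = 1e12                # bytes in a terabyte (decimal)
DAY = 24 * 60 * 60       # seconds in a day
YEAR = 365 * DAY         # seconds in a year


def rate_gbps(num_bytes: float, seconds: float) -> float:
    """Average rate in gigabits per second for num_bytes moved in seconds."""
    return num_bytes * 8 / seconds / 1e9


# Slide 23: 2 gigabytes of CAVEcam images in 2 seconds.
cavecam_gbps = rate_gbps(2 * GB, 2)

# Slide 25: 30,000 TB per year, expressed as a sustained average.
cghub_avg_gbps = rate_gbps(30_000 * TB, YEAR)

# Slide 31: about 5 TB of cryo-EM movies every 24 hours.
cryoem_avg_gbps = rate_gbps(5 * TB, DAY)

# Slide 30: workflow time reduced from 20 days to 20 minutes.
speedup = (20 * DAY) / (20 * 60)

if __name__ == "__main__":
    print(f"CAVEcam transfer:      {cavecam_gbps:.2f} Gb/s")
    print(f"Genomics hub average:  {cghub_avg_gbps:.2f} Gb/s")
    print(f"Cryo-EM daily average: {cryoem_avg_gbps:.3f} Gb/s")
    print(f"Workflow speedup:      {speedup:.0f}x")
```

Under these assumptions the CAVEcam figure works out to the 8 Gb/s stated on slide 23, and the 20-day to 20-minute reduction is a factor of 1440, consistent with the "Over 1000x" claim on slide 30. The yearly and daily volumes are averages only; peak flows, such as the 15G figure on slide 25, will be higher than the sustained rate.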