Cyberinfrastructure for Einstein's Equations and Beyond
1. Cyberinfrastructure for
Einstein’s Equations
and Beyond
Gabrielle Allen
Professor, Astronomy
Research Professor, Computer Science
Associate Dean, College of Education
Senior Research Scientist, NCSA
University of Illinois Urbana-Champaign
3. National Center for
Supercomputing Applications
“NCSA will be a home for
addressing complex research
problems in science and society,
powered by the development
and application of advanced
and comprehensive digital
environments.”
5. Complex Problems
World: multiscale, multiphysics, data-
driven
E.g. Neutron Stars, Plants, Viruses, …
General Challenges
Determine correct scale to describe a
physical event and the correct
governing equations
Determine how different phenomena
interact - often at different scales
Determine data inputs (experimental,
observational, …)
Design simple but effective interfaces
that can be implemented in software
(Find, fund & motivate team)
7. Main Research Areas
LIGO Scientific Consortium, NANOGrav Consortium,
Einstein Toolkit Consortium
Connected to Dark Energy Survey, Large Synoptic
Survey Telescope projects (NCSA main data hub)
Analytic and numerical relativity, black hole and
neutron star astrophysics, computational
astrophysics, gravitational wave source modeling,
high performance & high throughput computing,
applications of deep learning, multimessenger
astrophysics scenarios
Scientific software, cybersecurity & identity
management, network engineering, open data
repositories and open science
8. Gravitational Waves
Changes in the curvature of
spacetime that propagate as
waves at the speed of light
Transport energy as gravitational
radiation
Predicted by Einstein (1916)
Theory of General Relativity
Only indirectly observed for 100 years,
until 2016
Observations by new gravitational
wave detectors provide a new
window on the universe …
gravitational wave astronomy!
9. Gravitational Wave Physics
Instruments
Models & Simulation
Theory
Scientific Discovery!
Gµν = 8π Tµν
Colliding black holes & neutron
stars, supernovae collapse,
gamma-ray bursts, big bang, …
10. GW150914
Observed by LIGO
Sept 14, 2015
1 billion light years away
Initial black holes
36 and 29 solar masses
Final black hole
62 solar masses
Difference radiated as
gravitational waves
E = mc²
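As a rough check of the numbers on this slide, the radiated energy can be estimated from the mass difference. This is a minimal sketch using the rounded masses quoted above; the physical constants are standard values added here, not taken from the slides.

```python
# Rough energy estimate for GW150914 from the rounded masses on the slide.
M_SUN_KG = 1.98847e30      # solar mass in kilograms (standard value, added here)
C_M_PER_S = 2.99792458e8   # speed of light in metres per second

initial_masses = (36.0, 29.0)   # solar masses, from the slide
final_mass = 62.0               # solar masses, from the slide

# Mass that did not end up in the final black hole.
radiated_mass = sum(initial_masses) - final_mass

# E = m c^2, with the mass converted to kilograms.
radiated_energy_j = radiated_mass * M_SUN_KG * C_M_PER_S**2

print(f"Mass radiated: {radiated_mass:.1f} solar masses")
print(f"Energy radiated: {radiated_energy_j:.2e} J")
```

With these rounded inputs, about 3 solar masses, or roughly 5 × 10^47 J, left the system as gravitational waves.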
New discoveries!
Existence of binary black hole
systems
Direct detection of
gravitational waves
First observation of binary
black hole merger
B.P. Abbott et al. (LIGO Scientific Collaboration and Virgo
Collaboration), Phys. Rev. Lett. 116, 061102
12. Cactus: www.CactusCode.org
Open source component framework for HPC
Modular system with high level abstractions
Components (“thorns”) defined by parameters, variables,
methods
Cactus “flesh” binds together
Cactus Computational Toolkit: general thorns
Different application areas
Numerical relativity, CFD, coastal science, petroleum,
quantum gravity, cosmology, …
13. Building a Computational
Numerical Relativity Community
Cactus came from the relativity community
European project with 10 sites developed community
open code base
Each group had different expertise
Cactus allowed developing shared interfaces/standards
Easy to add a component, share components
Supports both collaboration and competition
EU Network for Gravitational Wave Sources: 2001
14. Key Features
Cactus framework provides scheduling, application
APIs for parallel operations
Driver thorn provides scheduling, load balancing,
parallelization
Application thorns deal only with local part of parallel
mesh
Different thorns can be used to provide the same
functionality, easily swapped.
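The division of labour above can be illustrated with a toy sketch. This is not the Cactus API: the names `Flesh`, `drive`, `forward_euler_thorn` and `backward_euler_thorn` are hypothetical, and the "parallel mesh" is just a Python list split into patches. It only shows the idea that application thorns see a local patch, the driver handles decomposition, and two thorns offering the same functionality can be swapped.

```python
from typing import Callable, Dict, List

Patch = List[float]
Thorn = Callable[[Patch, float], Patch]


def forward_euler_thorn(patch: Patch, dt: float) -> Patch:
    """Hypothetical application thorn: explicit step for du/dt = -u."""
    return [u - dt * u for u in patch]


def backward_euler_thorn(patch: Patch, dt: float) -> Patch:
    """Alternative thorn offering the same functionality (implicit step)."""
    return [u / (1.0 + dt) for u in patch]


def drive(thorn: Thorn, grid: Patch, dt: float, n_patches: int) -> Patch:
    """Toy driver: split the grid, call the thorn per patch, reassemble."""
    size = -(-len(grid) // n_patches)  # ceiling division
    patches = [grid[i:i + size] for i in range(0, len(grid), size)]
    return [u for patch in patches for u in thorn(patch, dt)]


class Flesh:
    """Toy stand-in for the binding layer: registers thorns and schedules them."""

    def __init__(self) -> None:
        self.thorns: Dict[str, Thorn] = {}

    def register(self, role: str, thorn: Thorn) -> None:
        self.thorns[role] = thorn  # re-registering a role swaps the thorn

    def run(self, grid: Patch, dt: float, steps: int, n_patches: int) -> Patch:
        for _ in range(steps):
            grid = drive(self.thorns["evolve"], grid, dt, n_patches)
        return grid


flesh = Flesh()
flesh.register("evolve", forward_euler_thorn)
explicit = flesh.run([1.0] * 10, dt=0.1, steps=5, n_patches=3)

flesh.register("evolve", backward_euler_thorn)  # swap without touching the driver
implicit = flesh.run([1.0] * 10, dt=0.1, steps=5, n_patches=3)

print(explicit[0], implicit[0])
```

Neither thorn knows how many patches exist or how they are distributed; that knowledge lives only in the driver.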
15. Adaptive Mesh Refinement: Carpet
Set of Cactus thorns
Developed by Erik Schnetter
Berger-Oliger style adaptive
mesh refinement with sub-
cycling in time
High order differencing (4,6,8)
Domain decomposition
Hybrid MPI-OpenMP
2002-03: Design of Cactus
means many groups, even
competing ones, suddenly
had AMR with little code
change
AEI (Rezzolla,
Kaehler)
E. Schnetter
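The order of time steps in Berger-Oliger sub-cycling can be sketched as a short recursion. This is an illustrative sketch only, not Carpet code: the function name `subcycle`, the refinement factor of 2 and the three levels are assumptions, and the actual update, prolongation and restriction steps are replaced by a log entry and a comment.

```python
from typing import List, Tuple


def subcycle(level: int, t: float, dt: float, max_level: int,
             refinement: int, log: List[Tuple[int, float, float]]) -> None:
    """Advance `level` by one step of size dt, then let finer levels catch up."""
    log.append((level, t, t + dt))  # stand-in for the actual time update
    if level < max_level:
        fine_dt = dt / refinement
        for k in range(refinement):
            # Finer levels take `refinement` smaller steps per coarse step;
            # their boundaries would be interpolated in time from this level.
            subcycle(level + 1, t + k * fine_dt, fine_dt,
                     max_level, refinement, log)
        # A real code would now restrict fine data back onto this level.


steps: List[Tuple[int, float, float]] = []
subcycle(level=0, t=0.0, dt=1.0, max_level=2, refinement=2, log=steps)
for level, t_start, t_end in steps:
    print(f"{'  ' * level}level {level}: {t_start:.2f} -> {t_end:.2f}")
```

Each coarse step is followed by two steps on the next finer level, so the finest level takes four steps for every step on the coarsest one.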
17. Einstein Toolkit Consortium
“Developing and supporting open software for relativistic
astrophysics. Provide the core computational tools to enable new
science, broaden community, facilitate interdisciplinary
research and take advantage of emerging petascale
computers and advanced cyberinfrastructure.”
Consortium: 126 members, 75 sites, 14 countries
Sustainable community model:
Maintainers across 6 sites: oversee
technical developments, quality
control, V&V, distributions /releases
Whole consortium engaged in
directions, support, development
Open development meetings
Six month releases
http://www.einsteintoolkit.org
18. Components
Currently
150 Cactus thorns: Initial data, evolution, analysis, AMR, …
Tools, viz, etc
Provide extensible standard interface for general relativity
variables (e.g. variables, parameters, data model for output)
Examples and tutorials
Complete open production codes for black holes, neutron stars
New users: Test account on supercomputer
Community support: active mail list
http://www.einsteintoolkit.org
21. New - DataVault
Building community open data repository for
numerical relativity data based on yt platform and
Whole Tale
Waveforms
Simulation data
Analysis scripts
DNNs
Desired features:
Community driven
Searchable
Provenance info
Reproducibility
Citable
Analysis
22. Computational Research
Grid and distributed computing
Automatic code generation
Parallel I/O and checkpointing
GPU computing
Remote visualization
Interactive steering
Shared data repositories, reproducibility,
citability, etc.
26. Designing DNNs
Used the Wolfram Language (Mathematica) neural network
framework, which is built on MXNet.
Tesla, GTX1080, and P100 GPUs
(Innovative Systems Lab at NCSA)
Simple designs: 3 convolutional
layers and 2 fully connected
layers.
arXiv:1701.00008, Deep Neural Networks To Enable Real-time
Multimessenger Astrophysics, Daniel George, Eliu A. Huerta
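To make "3 convolutional layers and 2 fully connected layers" concrete, here is a minimal NumPy sketch of a forward pass through a network of that shape. It is not the authors' Wolfram Language model: the channel counts, kernel sizes, ReLU activations, max pooling, input length and two-class output are all assumptions chosen for illustration, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)


def conv1d(x, w, b):
    """Valid 1D cross-correlation. x: (batch, c_in, n); w: (c_out, c_in, k)."""
    windows = np.lib.stride_tricks.sliding_window_view(x, w.shape[2], axis=2)
    # windows has shape (batch, c_in, n - k + 1, k)
    return np.einsum("bcnk,ock->bon", windows, w) + b[None, :, None]


def max_pool(x, size):
    """Non-overlapping max pooling along the last axis."""
    n = (x.shape[2] // size) * size
    return x[:, :, :n].reshape(x.shape[0], x.shape[1], -1, size).max(axis=3)


def relu(x):
    return np.maximum(x, 0.0)


def init(shape):
    return rng.normal(scale=0.1, size=shape)


# Hypothetical layer sizes: 1 -> 8 -> 16 -> 32 channels, then 416 -> 64 -> 2.
params = {
    "conv": [(init((8, 1, 16)), np.zeros(8)),
             (init((16, 8, 8)), np.zeros(16)),
             (init((32, 16, 8)), np.zeros(32))],
    "fc": [(init((416, 64)), np.zeros(64)),
           (init((64, 2)), np.zeros(2))],
}


def forward(x, params):
    """x: (batch, 1024) time series -> (batch, 2) class probabilities."""
    h = x[:, None, :]                       # add a channel axis
    for w, b in params["conv"]:             # 3 convolutional layers
        h = max_pool(relu(conv1d(h, w, b)), 4)
    h = h.reshape(h.shape[0], -1)           # flatten to (batch, 416)
    (w1, b1), (w2, b2) = params["fc"]       # 2 fully connected layers
    logits = relu(h @ w1 + b1) @ w2 + b2
    logits = logits - logits.max(axis=1, keepdims=True)
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)  # softmax


batch = rng.normal(size=(8, 1024))  # 8 noise-only inputs processed at once
probs = forward(batch, params)
print(probs.shape)
```

The whole batch goes through each layer as one array operation, which is the property that lets a GPU process many inputs at once.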
27. Speed Up
Real-time analysis (milliseconds).
Thousands of inputs can be
processed at once on a GPU.
Dedicated inference chips can
offer additional speed-up.
28. Detection & Parameter Estimation
29. New Types of GWs
Eccentric, Spinning
Not included in training.
Same accuracy of
detection.
DNNs learned to generalize.
Missed by current methods.
30. Future Directions
On-site GPU based analysis
Direct stream to GPUs, continuously
retrain with real-time noise
characteristics
Big data and new hardware
Petascale data, DGX-1 at LIGO
Hanford
Extending to new signals
8-dimensions, eccentric, spin-
precessing
Distributed computing
Einstein@home, MXNet, smartphones
Future GW missions
NANOGrav, eLISA
Transient detection with
telescopes
DES, LSST, JWST, WFIRST
Scope for improvements
Larger template banks, non-
Gaussian noise
Deeper networks, complex designs,
RNNs
Multitask learning, Source modeling
31. Conclusion
Open source toolkits such as the Cactus Framework and
Einstein Toolkit developed over last 20 years have been
essential to provide computational cyberinfrastructure and
also to build communities
Developing simulation software for supercomputers requires
continuous attention to software design, optimization and
scaling, data I/O etc.
Community is now developing catalogues of simulated
gravitational waveforms; researchers need data science
to curate, manipulate, reproduce, analyze, cite, etc.
Deep convolutional networks are showing great promise as an
alternative to the conventional matched-filter approach,
providing new possibilities for real-time MMA
Same methodologies can be applied to other disciplines,
crops-in-silico, learning sciences, etc.
32. NCSA Colloquia Series
Started in 2014 to bring leaders in computational and data science
to Illinois --- now 37 posted!
Colloquia are all posted online via YouTube
Great speakers in many different disciplines! E.g.
"Archiving Capacity and Data Infrastructure: Holes, Goals, Roles and
Responsibility" — Margaret Hedstrom, University of Michigan
“Big Data Visual Analysis” — Chris Johnson, University of Utah
“Toward Reliable and Reproducible Inference in Big Data” -- Victoria
Stodden, University of Illinois
“From Data to Knowledge with Workflows and Provenance” -- Bertram
Ludäscher, University of Illinois
Search for NCSA Colloquia Series
https://www.youtube.com/playlist?list=PLO8UWE9gZTlAgHZPaxQbpUNY0T26zeL_f
35. Open Discussion: Collaboration
with International Partners
Many discussions about how to collaborate
Ideas:
Find projects of mutual interest and identify groups to
coordinate hackathons/bootcamps, follow up with visit
National Data Service (Nationaldataservice.org)
Whole Tale (wholetale.org), Brown Dog (http://browndog.ncsa.illinois.edu)
Data sets from Pakistan of interest to US researchers and vice versa
Work with HEC and other agencies to support visits and
educational fellowships
E.g. Conacyt-Illinois model
Special development or funding of online courses
Would like to collect ideas and put a concept together
Editor's notes
Science research needed from many disciplines
Software and Hardware – software needs to be collaboratively developed and used, and needs to run on modern computing infrastructures (largest computers, clouds, multicore, mobile devices)
Community – Need to build an inclusive community, which takes into account cultures, and prepares new workforce
Outreach – Work toward societal impact and public outreach; open science and accessibility are an important part of this
Positive side – this is similar in all disciplines, and underlying numerical methods are the same
Increased fidelity for modeling complex problems depends on multiphysics and multidisciplinary coupling as well as resolution and accuracy of component solvers
This challenges typical monolithic software packages: coupling needs attention, and refactoring may be needed to enable more modularity
Summer Schools
Undergraduate Research
Open codes
The future of astronomy is multimessenger astrophysics.
This means that we observe events through multiple messengers such as gravitational waves, light, neutrinos, cosmic rays, etc.
We want to hear events first through GWs (like LIGO), then turn our telescopes (such as the upcoming NASA JWST) around to see them, and then feel them via astro-particles.
Getting all this different information simultaneously would lead to groundbreaking insights about the nature of the universe.
Mention Ed was involved in LIGO and IceCube.
GWs are very weak. We got lucky with the first detection. It was very loud.
However, the 2nd detection and the majority of expected signals are very weak as shown below. They don’t even show up in a spectrogram.
So how do we detect and extract parameters of these signals which are much weaker than the background noise?
The current approach used by LIGO for weak signals is template-matching also called matched-filtering. A template bank of about 300,000 signals is used.
This is so computationally expensive that current analysis is limited to a small class of signals. Furthermore it takes several days to accurately reconstruct the parameters of an event.
So is there a better way to do this, so that we can quickly find GW events and look at them through our telescopes to enable this scenario of multimessenger astrophysics?
To give an analogy to current LIGO analysis methods, suppose you wanted to discover words in a stream of characters.
What LIGO does is similar to comparing every subsequence with every word in a dictionary (template bank).
However, notice that we are able to pick out these words immediately.
Our brains do not compare against a database of all words.
This suggests that the same approach can work for GW analysis by using deep learning.
The template bank can be used once for training a deep neural network, after which it will be able to recognize signals instantly.
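The template-matching idea described in these notes can be sketched in a few lines of NumPy. This is a toy illustration, not the LIGO pipeline: the three sine-Gaussian "templates", the white Gaussian noise, the injected amplitude and the helper name `make_template` are all assumptions, and a real search would also whiten the data and maximize over phase.

```python
import numpy as np

rng = np.random.default_rng(0)


def make_template(freq, length=128, width=16.0):
    """Unit-norm sine-Gaussian burst; a stand-in for a real waveform template."""
    t = np.arange(length) - length / 2
    h = np.exp(-t**2 / (2 * width**2)) * np.cos(2 * np.pi * freq * t)
    return h / np.linalg.norm(h)


# Tiny hypothetical template bank (a real bank holds hundreds of thousands).
bank = [make_template(f) for f in (0.05, 0.10, 0.20)]

# Unit-variance white noise with one template hidden inside it.
true_index, true_offset, amplitude = 1, 1000, 20.0
data = rng.normal(size=4096)
data[true_offset:true_offset + 128] += amplitude * bank[true_index]

# Slide every template over the data. With unit-norm templates and
# unit-variance noise, each output sample is a signal-to-noise ratio.
snr = np.array([np.correlate(data, h, mode="valid") for h in bank])

best_index, best_offset = np.unravel_index(np.argmax(snr), snr.shape)
print(best_index, best_offset, snr[best_index, best_offset])
```

The cost grows with the number of templates times the number of offsets, which is the "compare every subsequence with every word in the dictionary" step from the analogy.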
We used the new DNN functionality in Mathematica which was an added layer over MXNet.
We explored only simple designs of DNNs and still achieved excellent results.
There is a lot of scope for finding better designs.
For example, recurrent neural networks, particularly LSTM networks, are promising, as they have been shown to be superior for voice recognition tasks (which are closely related to GW analysis).
Much faster than matched-filtering. Takes only milliseconds. Can be run on a laptop or smartphone.
Template banks were GBs but the DNN was only 4MB. This means it learned patterns in the data.
This is actually a lower bound, since we did not account for several additional steps needed for matched-filtering such as FFTs, aligning the signal templates, etc. In practice, DNNs would be much faster.
Both classification and regression.
Outperformed all other methods by a huge margin.
What was most surprising was that these DNNs were able to detect completely different kinds of signals even though we did not use them for training.
We only expected interpolation, but this was able to extrapolate.
Current matched-filtering pipelines would miss these.
Emphasize that these kinds of signals were not at all included in the training process.
The same accuracy for detection but slightly larger errors for parameter estimation.
Our plan is to generate catalogs of eccentric simulations for training to further improve this accuracy.
We are working on generating a catalog of eccentric simulations on Blue Waters.
We are using the DGX-1 at LIGO Hanford Lab, to build a production-grade pipeline which will be continuously retrained with the latest LIGO data
There is also no limit to the size of template banks that can be used for training. So we can extend this to all types of GW events, which is not possible with current matched-filtering methods.
Same technique can be applied to detect transients in image data. So we can have a unified deep learning framework for analyzing data from all observation instruments to enable multimessenger astrophysics.
Multitask learning allows a single DNN to perform detection, classification to sub-categories, and parameter estimation.