The story of how Globus helped the Arecibo Observatory save 50+ years of data for posterity and future research. Presented at the GlobusWorld 2021 conference by Julio Alvarado Negron.
GlobusWorld 2021: Saving Arecibo Observatory Data
1. Arecibo
Observatory
Data Movement:
so much more than data
2021.05.12
Julio Alvarado Negron
Big Data Program Manager @ Arecibo Observatory
George B. Robb III,
EPOC - Performance Chaser
ESnet - Infrastructure Team
Globus World 2021
2. What is Big Data?
A collection of data that is huge in volume and yet
growing exponentially with time. In short, such data is so
large and complex that few traditional data
management tools are able to store or process it
efficiently.
Examples
The New York Stock Exchange generates about
1TB of new trade data per day. Facebook generates over
500TB of data daily. A jet engine generates over 10TB of
data in 30 minutes of flight.
AO has the capability to generate over 80TB per
day, with a total of over 3PB of data stored.
Big Data @ AO
- Implement Data Management and Governance
practices
- Facilitate community access to Arecibo’s data
- Enable access to High-Performance Computing
- Implement best practices and lessons learned from
partner observatories and the research community
Big Data Overview
3. A Full Spectrum Pioneer of Sciences Since 1963
Is the study of radio waves
produced by astronomical objects
such as the Sun, planets, pulsars,
stars, etc. Arecibo radio telescope
sensitivity allows astronomers to
detect faint radio signals from
far-off regions of the universe.
Fast Radio Bursts, Pulsars,
Spectral line, Exoplanets, VLBI.
Radio Astronomy
Is the investigation of the earth's
gaseous envelope. The Arecibo
Radio Telescope can measure the
growth and decay of disturbances
in the ionosphere (altitudes above 30
miles). The "big dish" is also used
to study plasma physics processes
in the electrically charged regions
where radio waves are influenced
most.
Atmospheric Sciences
The Arecibo Observatory was the
world's most powerful planetary
radar system. The 305 meter
Arecibo telescope equipped with a
1 MW transmitter at S-band (12.6
cm, 2380 MHz) was used for
studies of small bodies in the solar
system, terrestrial planets, and
planetary satellites including the
Moon.
Near Earth Asteroids
characterization, Surface Structure
(spacecraft landing)
Planetary Radar
4. ALFA
The Arecibo L-band Feed Array (ALFA) is a seven-feed system that allows large-scale surveys of the sky to be
conducted with unprecedented sensitivity using the 305-m Arecibo telescope in Puerto Rico. ALFA, operating near 1.4 GHz,
consists of a cluster of seven cooled dual-polarization feeds, a fiber-optical transmission system, and digital back-end signal
processors.
Most of these projects are considered “surveys” due to their nature. The telescope is left static in one position while the Earth
rotates, allowing it to “drift scan” the sky above Arecibo.
ALFA could generate an aggregate of 875MB/s, or about 76TB per day.
Knowing the Sources and Discoveries
Using ALFA for ALFALFA
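The daily volume quoted for ALFA follows directly from its aggregate rate. A minimal sketch of that conversion, assuming decimal units (1 TB = 1,000,000 MB) and a rate sustained for a full 24 hours; the function name is illustrative and does not come from the slides:

```python
# Convert a sustained data rate into a daily volume (decimal units assumed).
SECONDS_PER_DAY = 24 * 60 * 60


def daily_volume_tb(rate_mb_per_s: float) -> float:
    """Return terabytes per day for a sustained rate given in MB/s."""
    return rate_mb_per_s * SECONDS_PER_DAY / 1_000_000


# 875 MB/s is the aggregate ALFA rate quoted on the slide.
alfa_tb_per_day = daily_volume_tb(875)
print(f"{alfa_tb_per_day:.1f} TB/day")  # prints "75.6 TB/day"
```

The result rounds to the 76TB per day stated on the slide.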
5. Knowing the Sources and Discoveries
Venus Characterization
Venus is covered in a thick layer of clouds, but Arecibo’s radar beams were able to cut through that haze and
bounce off of the rocky planet’s surface, allowing researchers to map the terrain.
In the figures, we can compare the first large-scale view of Venus (1971) with the 2015 image taken with improved
equipment.
- Arecibo Discoveries
6. Knowing the Sources and Discoveries
Fast Radio Bursts
Fast radio bursts, or FRBs, are brief, brilliant blasts of radio waves with unknown origins. The first FRB known to
give off multiple bursts was FRB 121102, which Arecibo first spotted in 2012 and again in 2015.
Arecibo’s discovery backed up the theory from the Parkes telescope in Australia that FRBs are events
that originate beyond the Milky Way.
Radio bursts are observed over a window of 90 days, followed by a silent period of 67 days. The same behaviour then repeats
every 157 days.
- Arecibo Discoveries
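The 157-day cycle described for FRB 121102 (90 active days plus 67 silent days) can be expressed as simple modular arithmetic. A minimal sketch, assuming days are counted from the start of some active window; that reference point and the function name are hypothetical, since the slide gives no epoch:

```python
ACTIVE_DAYS = 90   # window in which bursts are observed (from the slide)
SILENT_DAYS = 67   # quiet period that follows (from the slide)
PERIOD_DAYS = ACTIVE_DAYS + SILENT_DAYS  # 157-day cycle


def in_active_window(days_since_window_start: float) -> bool:
    """True if the given day falls inside an active window.

    Day 0 is assumed to be the first day of an active window
    (a hypothetical reference point, not given in the source).
    """
    phase = days_since_window_start % PERIOD_DAYS
    return phase < ACTIVE_DAYS


for day in (0, 89, 90, 156, 157, 200):
    print(day, in_active_window(day))
```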
9. First Cable Snaps
On August 10th, 2020, a first cable snapped,
causing damage to the dish.
Second Cable Snaps
On November 6th, 2020, a second cable
snapped, causing major damage to the
dish.
December’s Checkmate
A main support cable broke from
Tower 4, causing the platform to fall
onto the dish.
The team got together and
realized that data safety and
integrity were a priority.
A Sequence of Snaps
10. The Big Picture
Arecibo Observatory holds over
3PB of data onsite. This amount is
spread between active hard drives,
offline disks and the tape library.
Arecibo also has copies of data
stored at various institutions across the
globe, which we refer to as offsite
data.
Not enough fiber
Arecibo’s Internet connection is
limited to 1Gbps due to the condition of
the infrastructure to the site.
With the existing connection,
transferring 3PB would take over 24
months.
The Data in Numbers and Infra Limitations
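A back-of-the-envelope check helps put the 1Gbps limitation in perspective. The sketch below assumes decimal units and treats 24 months as 730 days; the slide does not show how its estimate was derived, so the "implied rate" is only what that figure would correspond to, not a measured value:

```python
SECONDS_PER_DAY = 86_400
PB = 10**15  # bytes, decimal units assumed

archive_bytes = 3 * PB


def transfer_days(size_bytes: float, rate_bytes_per_s: float) -> float:
    """Days needed to move size_bytes at a constant rate."""
    return size_bytes / rate_bytes_per_s / SECONDS_PER_DAY


# Best case: the 1 Gbps link fully and exclusively used (125 MB/s).
ideal_days = transfer_days(archive_bytes, 1e9 / 8)

# Effective rate that a 24-month (assumed 730-day) transfer would imply.
implied_rate = archive_bytes / (730 * SECONDS_PER_DAY)

print(f"ideal line rate: {ideal_days:.0f} days")
print(f"rate implied by 24 months: {implied_rate / 1e6:.1f} MB/s")
```

Even the ideal case is roughly nine months of uninterrupted transfer; a link shared with daily operations and subject to protocol overhead would be considerably slower, which is consistent with an estimate measured in years.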
11. The Call for Help
Right after the collapse, the team
at Arecibo understood the urgency of
adding redundancy and safekeeping the
data. Immediately, we reached out to the
Office of Research at UCF. From there,
the logistics were funneled
through the research community.
Getting the Teams Together
In a matter of days, Arecibo got
connected to working teams, BIG THANKS:
- EPOC/ESnet - transfer optimization
and hardware
- CICoE - data management practices
- TACC - high performance
computing and storage
- Univ of Puerto Rico HPCf - 10Gbps
connectivity (I2 - AMPATH)
- Engine-4 - 10Gbps connectivity
- Globus - data transfer optimization
THANK YOU!
The SOS Call
12. Data Migration
After the working groups had worked intensely
to establish the processes and mechanisms, the
team at Arecibo proceeded to load the data onto the
NAS boxes.
Those boxes are being taken to our partners,
University of Puerto Rico at Mayaguez (UPRM) and
Engine-4 (E4) in Bayamon. From there, the data is
uploaded to TACC via 10Gbps links.
UPRM has a 10Gbps link via AMPATH (I2),
and E4 has a 10Gbps link via a commercial route.
Benchmarks
Before using Globus, the team relied on
rsync to move the data from Arecibo and the
partners. That resulted in an average transfer speed of
47MBps over a 10Gbps link.
Once Globus Connect Personal was installed
and configured on the NAS, the Effective Speed
reported has been sustained at over 200MBps.
The Data Transfer
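The two benchmark figures can be compared against each other and against the nominal capacity of the link. A minimal sketch, assuming decimal units (10 Gbps = 1250 MB/s) and taking 200 MB/s as the lower bound reported with Globus; the variable names are illustrative:

```python
RSYNC_MB_S = 47          # average rsync throughput reported on the slide
GLOBUS_MB_S = 200        # lower bound reported with Globus Connect Personal
LINK_MB_S = 10_000 / 8   # 10 Gbps expressed in MB/s (1250 MB/s)

speedup = GLOBUS_MB_S / RSYNC_MB_S
rsync_utilization = RSYNC_MB_S / LINK_MB_S
globus_utilization = GLOBUS_MB_S / LINK_MB_S

print(f"speedup: {speedup:.1f}x")                       # prints "speedup: 4.3x"
print(f"rsync link use: {rsync_utilization:.1%}")       # prints "rsync link use: 3.8%"
print(f"Globus link use: {globus_utilization:.1%}")     # prints "Globus link use: 16.0%"
```

Under these assumptions the move to Globus is at least a fourfold improvement, while still leaving most of the 10Gbps link's nominal capacity unused.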
Arecibo Data Transfer Project
Step 1: Data transported to Partner
Step 2: Data uploaded to Computing Center
Step 3: Storage transported back to AO