3. “I need to easily, quickly, & reliably move or
mirror portions of my data to other places.”
Research Computing HPC Cluster
Campus Home Filesystem
Lab Server
Desktop Workstation
Personal Laptop
XSEDE Resource
Public Cloud
4. “I need to get data from a scientific
instrument to my analysis server.”
Next Gen Sequencer
MRI
Advanced Light Source
Light Sheet Microscope
5. “I need to easily and securely share
my data with my colleagues at other
institutions.”
6. “I need a good place to store / backup
/ archive my (big) research data, at a
reasonable price.”
Campus Store Mass Store Public Cloud Archive
7. “I need to publish my data so that
others can find it and use it.”
Scholarly Publication
Reference Dataset
Active Research Collaboration
9. Globus delivers:
Big data transfer, sharing,
publication, and discovery…
…directly from your own
storage systems
10. It’s about the user experience…
…for your photos
…for your e-mail
…for your entertainment
…for your research data
11. Globus is Software-as-a-Service (SaaS)
• Web, command line, and REST interfaces
• Reduced IT operational costs
• New features automatically available
• Consolidated support & troubleshooting
• Easy to add your laptop, server, cluster,
supercomputer, etc. with Globus Connect
12. Reliable, secure, high-performance
file transfer and replication
• “Fire-and-forget” transfers
• Automatic fault recovery
• Seamless security integration
• Powerful GUI and APIs
[Diagram: Data Source, Data Destination]
Step 1: User initiates transfer request
Step 2: Globus moves and syncs files
Step 3: Globus notifies user
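The three steps on this slide (request, move and sync, notify) can be sketched on a local filesystem. This is only an illustration of the idea, not the Globus implementation or API: the names `checksum`, `copy_with_retry`, and `sync_directory`, the retry count, and the use of SHA-256 comparison as the sync criterion are all assumptions made for this example.

```python
import hashlib
import shutil
from pathlib import Path


def checksum(path: Path) -> str:
    """Return the SHA-256 hex digest of a file."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def copy_with_retry(src: Path, dst: Path, max_attempts: int = 3) -> int:
    """Copy src to dst, retrying on I/O errors or a checksum mismatch.

    Returns the number of attempts used; raises after max_attempts failures.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copyfile(src, dst)
            if checksum(src) == checksum(dst):
                return attempt
            last_error = OSError(f"checksum mismatch for {dst}")
        except OSError as exc:
            last_error = exc
    raise RuntimeError(f"giving up on {src}") from last_error


def sync_directory(source: Path, destination: Path) -> dict:
    """Mirror source into destination, skipping files that already match."""
    report = {"copied": [], "skipped": []}
    for src in sorted(p for p in source.rglob("*") if p.is_file()):
        rel = src.relative_to(source)
        dst = destination / rel
        if dst.exists() and checksum(src) == checksum(dst):
            report["skipped"].append(rel.as_posix())
        else:
            copy_with_retry(src, dst)
            report["copied"].append(rel.as_posix())
    # The returned summary stands in for the "notify user" step.
    return report
```

Running `sync_directory` a second time on unchanged data copies nothing, which is the property that makes a repeated or interrupted request safe to resubmit.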
13. Simple, secure sharing off existing
storage systems
[Diagram: Data Source]
Step 1: User A selects file(s) to share, selects user or group, and sets permissions
Step 2: Globus tracks shared files; no need to move files to cloud storage!
Step 3: User B logs in to Globus and accesses shared file
• Easily share large data with any user or group
• No cloud storage required
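The sharing model on this slide, where files stay on the owner's storage and only permissions are recorded, can be sketched as a small access-rule check. This is a toy model, not the Globus API: `AccessRule`, `SharedEndpoint`, the `"r"`/`"rw"` permission strings, and the user and group names are hypothetical.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass(frozen=True)
class AccessRule:
    principal: str      # a user name or a group name
    path: str           # folder on the owner's storage being shared
    permissions: str    # "r" (read) or "rw" (read and write)


@dataclass
class SharedEndpoint:
    owner: str
    groups: dict = field(default_factory=dict)   # group name -> set of users
    rules: list = field(default_factory=list)

    def share(self, principal, path, permissions="r"):
        """Step 1: the owner records who may access which folder."""
        self.rules.append(AccessRule(principal, path, permissions))

    def _principals(self, user):
        """The user plus every group the user belongs to."""
        names = {user}
        names.update(g for g, members in self.groups.items() if user in members)
        return names

    def can_access(self, user, path, mode="r"):
        """Step 3: check a request against the recorded rules only."""
        if user == self.owner:
            return True
        target = PurePosixPath(path)
        principals = self._principals(user)
        for rule in self.rules:
            if rule.principal not in principals:
                continue
            root = PurePosixPath(rule.path)
            inside = target == root or root in target.parents
            if inside and mode in rule.permissions:
                return True
        return False
```

For example, after `ep.share("lab", "/data/run1", "r")`, any member of the hypothetical `lab` group can read files under `/data/run1` but cannot write to them, and no file has been copied anywhere.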
14. Curated publication of data, with
relevant metadata for discovery
• Identify
• Describe
• Curate
• Verify
• Protect
• Preserve
Step 1: Researcher assembles data set; describes it using metadata
Step 2: Curator reviews and approves; data set published on campus or other storage
Step 3: Peers, public search and discover data sets; access using Globus
[Diagram: Published Data Store, Metadata]
• Search
• Browse
• Access
16. Typical publication workflow
[Diagram: Univ. of Chicago, Argonne, IIT, UIUC; Scientist; Shared Endpoint; Argonne Curator; Argonne Storage System]
1. Publish Data
2. Describe Submission
3. Assemble Dataset (Transfer Data)
4. Curate Dataset
5. Search
6. Download
17. Globus’ view of data publishing
[Diagram: a Community manages Collections; each Collection holds Datasets; each Dataset is Data plus Metadata]
Collection policies:
Metadata
Access Control
License
Storage
Curation Workflow
19. Globus Platform-as-a-Service
…
Identity, Group, and
Profile Management
Globus Toolkit
Globus APIs
Globus Connect
Data Publication & Discovery
File Sharing
File Transfer & Replication
20. We are a non-profit, delivering a
production-grade service to the
non-profit research community
Our challenge:
Sustainability
21. Globus Provider Subscriptions
• Standard Provider
– Host shared endpoints
– Data publication*
– Management console
– Usage reports
– Priority support
– Mass Storage System optimization
– Integration support
• Branded Web Site
• Alternate Identity Provider (InCommon is standard)
globus.org/provider-plans
22. Thank you to our sponsors!
U.S. Department of Energy
Editor's Notes
A number of common patterns, each supported by Globus, plus services deployed on various campus resources. This explains why there are so many endpoints!
A scientist has just published a paper associated with his research on nanoscale materials. He now wants to go ahead and publish the data associated with this publication.
Using the Globus publication system, he is able to select the Argonne community, and the Center for Nanoscale Materials (CNM) collection.
He elects to publish his data
He describes the submission with both publication (Dublin Core) and scientific metadata.
The CNM collection has been preconfigured with its own storage provided at Argonne
As part of this submission, a unique endpoint is created for the scientist; the endpoint is created so that only the scientist can write to it
The scientist now “assembles” his dataset on this endpoint by transferring files from one or more locations (e.g. lab server, personal laptop, supercomputing facility)
He can assemble this dataset over a long period of time and can return to the submission workflow when he is happy with the submission.
The CNM collection has also been preconfigured with a workflow requiring that an Argonne curator must approve the submission
The curator is able to view and edit the metadata and files of the dataset
Once approved, the submission is published in the CNM collection with a DOI
Other users (with permission to view the collection) can then discover published datasets by their DOI, or find them by their metadata using the Globus discovery interface.
These users can choose to browse published datasets and download datasets to other resources (local campus systems, personal computers, etc.)
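The submission workflow described in these notes can be read as a small state machine: describe, assemble, curate, publish. The sketch below is a hypothetical model, not the Globus publication service: the `State` names, the `Submission` class, and the caller-supplied `identifier` argument (standing in for the DOI) are assumptions made for illustration.

```python
from enum import Enum


class State(Enum):
    DESCRIBED = "described"
    ASSEMBLING = "assembling"
    IN_CURATION = "in curation"
    PUBLISHED = "published"


class Submission:
    def __init__(self, scientist, curators, metadata):
        self.scientist = scientist
        self.curators = set(curators)
        self.metadata = dict(metadata)
        self.files = []
        self.identifier = None          # set only on publication
        self.state = State.DESCRIBED

    def add_file(self, user, name):
        """Only the submitting scientist may write, and only before curation."""
        if user != self.scientist:
            raise PermissionError("only the submitting scientist may write")
        if self.state not in (State.DESCRIBED, State.ASSEMBLING):
            raise RuntimeError(f"cannot add files while {self.state.value}")
        self.files.append(name)
        self.state = State.ASSEMBLING

    def submit(self, user):
        """The scientist returns to the workflow once the dataset is ready."""
        if user != self.scientist or self.state is not State.ASSEMBLING:
            raise RuntimeError("nothing to submit")
        self.state = State.IN_CURATION

    def approve(self, curator, identifier):
        """A curator approves; the dataset is published under an identifier."""
        if curator not in self.curators:
            raise PermissionError("approval requires a curator")
        if self.state is not State.IN_CURATION:
            raise RuntimeError("submission is not awaiting curation")
        self.identifier = identifier
        self.state = State.PUBLISHED
```

Keeping the write check and the state check separate mirrors the notes: the endpoint is writable only by the scientist, and publication happens only after a curator approves.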
Collection is a set of Datasets
Dataset is data + metadata
Collection is published/managed by a Community
Policies on a Collection
Metadata
Access control
Curation workflow
License
Storage
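The Community, Collection, Dataset, and policy relationships listed above map naturally onto a few record types. The sketch below is one possible encoding, not the Globus data model: every class name, field name, and default value is an assumption chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    files: list          # the data
    metadata: dict       # the metadata describing it


@dataclass
class Policies:
    required_metadata: tuple = ()      # metadata keys every dataset must carry
    access_control: str = "public"     # who may view the collection
    curation_required: bool = True     # curation workflow on or off
    license: str = ""
    storage: str = ""                  # where published data is kept


@dataclass
class Collection:
    name: str
    policies: Policies
    datasets: list = field(default_factory=list)

    def add(self, dataset: Dataset) -> None:
        """Enforce the collection's metadata policy before accepting a dataset."""
        missing = [k for k in self.policies.required_metadata
                   if k not in dataset.metadata]
        if missing:
            raise ValueError(f"missing required metadata: {missing}")
        self.datasets.append(dataset)

    def search(self, **criteria) -> list:
        """Return datasets whose metadata matches every given key and value."""
        return [d for d in self.datasets
                if all(d.metadata.get(k) == v for k, v in criteria.items())]


@dataclass
class Community:
    name: str
    collections: dict = field(default_factory=dict)   # name -> Collection

    def add_collection(self, collection: Collection) -> None:
        self.collections[collection.name] = collection
```

Attaching `Policies` to the `Collection` rather than to each `Dataset` follows the notes, which place metadata, access control, curation workflow, license, and storage rules at the collection level.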
Highlight XSEDE’s planned adoption of user, group and profile management
Talk about Globus being part of UChicago + ANL, and give other context on how this work came about and how it is funded