Moving Gigantic Files In and Out of the Repository
Jeff Potts
Metaversant Group, Inc.
Learn. Connect. Collaborate.
What’s the Deal with Large Files?
• Alfresco can manage files of any size, but getting large files into and out of
the repo is often problematic
• They take way too long to transfer
– Sessions time out
– Machines go to sleep
– Incomplete files get transferred
– Users think, “Is this thing hung?” and then cancel
• End-users must actively monitor transfers in most cases
This talk is a technical case study
about an approach to significantly
improving large file transfers
About Noble Research Institute
• Research organization focused on
improving agriculture for all mankind
– Research
– Producer Relations
– Applied agricultural systems and
stewardship
– Education
• About 400 employees from all over
the world
• Headquartered in Ardmore,
Oklahoma
• https://www.noble.org
About Metaversant Group, Inc.
• Consulting firm focused on solving business problems with open source
Content Management, Workflow, & Search technology
• Founded in 2010
• Clients all over the world in a variety of industries, including:
– Airlines
– Manufacturing
– Construction
– Financial Services
– Higher Education
– Life Sciences
– Professional Services
https://www.metaversant.com
The Problem
• Researchers work with very large files
• Typical size ranges from a few GB to hundreds of GB
• Source of the files is mixed
– Generated internally (e.g., by gene sequencing machines)
– Acquired as data sets from other research institutions
• Data governance team wants everything in Alfresco
• Large size makes moving files in and out of Alfresco difficult
What We Tried
• Desktop Sync
• CMIS update content stream
– Versions are created, somewhat painful to disable auto-versioning
• Increasing timeouts
– Losing battle when files are multiple gigabytes
• Using Alfresco FTP
– Usually requires thick client installed
– Not preferred by end-users
• Resumable upload Share customization
– Actually worked pretty well
– Only handles uploads, not downloads
Sidebar: Resumable Upload Details
• Share customization (closed source)
• Leverages resumable.js, see http://www.resumablejs.com/
• Utilizes the HTML5 File API
• If an upload stalls or ends prematurely, the end-user can restart where it
left off
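The customization itself is closed source, so the following is only a minimal sketch of the general chunk-and-resume idea behind tools like resumable.js. It is not the resumable.js API; the function names, chunk size, and the set of acknowledged chunk indexes are all assumptions made for illustration.

```python
from typing import List, Set, Tuple


def chunk_ranges(file_size: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Split [0, file_size) into (start, end) byte ranges of at most chunk_size."""
    return [
        (start, min(start + chunk_size, file_size))
        for start in range(0, file_size, chunk_size)
    ]


def remaining_chunks(file_size: int, chunk_size: int, received: Set[int]) -> List[int]:
    """Return the indexes of chunks the server has not acknowledged yet.

    `received` is a hypothetical record of chunk indexes already stored
    server-side; a restarted upload only needs to send what is left.
    """
    total = len(chunk_ranges(file_size, chunk_size))
    return [index for index in range(total) if index not in received]
```

With a 10-byte file and 4-byte chunks, a restart after chunks 0 and 1 were acknowledged would only need to send chunk 2.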
Inescapable math related to moving large files
• How long does it take to move 25 GB of data?
– Ethernet (10 Mbit/s): 333.33 minutes
– Fast Ethernet (100 Mbit/s): 33.33 minutes
– Gigabit Ethernet (1 Gbit/s): 3.33 minutes
– 10 Gigabit Ethernet (10 Gbit/s): 0.33 minutes
– 100 Gigabit Ethernet (100 Gbit/s): 0.03 minutes
• Assumes full bandwidth is available
• Network only, does not account for disk or other non-network latencies on
either end
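The figures above can be reproduced with a few lines of arithmetic. This sketch assumes decimal units (1 GB = 1,000 MB = 8,000 Mbit) and, like the slide, full use of the link with no disk or protocol overhead; the function and dictionary names are illustrative only.

```python
def transfer_minutes(size_gb: float, link_mbit_per_s: float) -> float:
    """Minutes needed to move size_gb (decimal gigabytes) over the given link."""
    size_mbit = size_gb * 1000 * 8  # GB -> MB -> Mbit
    return size_mbit / link_mbit_per_s / 60


LINKS_MBIT_PER_S = {
    "Ethernet": 10,
    "Fast Ethernet": 100,
    "Gigabit Ethernet": 1_000,
    "10 Gigabit Ethernet": 10_000,
    "100 Gigabit Ethernet": 100_000,
}

if __name__ == "__main__":
    for name, speed in LINKS_MBIT_PER_S.items():
        print(f"{name}: {transfer_minutes(25, speed):.2f} minutes")
```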
It’s not the actual import/export
that’s killing us; it’s the movement of
so many bytes over the network
Technologies That Move Large Files
• BitTorrent
– Looked at BitTorrent Sync, which became Resilio Sync
– Performance increases when multiple people have the same file
– Primarily peer-to-peer with an emphasis on desktop-to-desktop or between
devices
• GridFTP
– Extends FTP to add parallelism
– Multiple implementations, including at least one that is commercially supported
– Works between servers, desktop-to-server, and between devices
GridFTP was created to move large files to clusters
• Extension of FTP
• Defined by the Open Grid Forum (http://www.ogf.org)
• Designed specifically to facilitate transfers of large files and large sets of
files
• Uses multiple parallel streams to move data over TCP
• One of several methods that a product called Globus uses to move data
between endpoints
• More information at http://toolkit.globus.org/toolkit/docs/6.0/gridftp/
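To illustrate why parallel streams help, here is a minimal sketch that divides a file into stripes and copies them concurrently. It is not GridFTP and does not use any GridFTP API: it works on local files with threads, and every name in it (`stripe_ranges`, `copy_stripe`, `parallel_copy`) is hypothetical. A real implementation would stream each stripe in smaller buffers rather than reading it whole.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def stripe_ranges(size: int, streams: int) -> List[Tuple[int, int]]:
    """Divide [0, size) into `streams` contiguous (offset, length) stripes."""
    base, extra = divmod(size, streams)
    ranges, offset = [], 0
    for i in range(streams):
        length = base + (1 if i < extra else 0)
        ranges.append((offset, length))
        offset += length
    return ranges


def copy_stripe(src: str, dst: str, offset: int, length: int) -> None:
    """Copy one stripe; each worker opens its own file handles."""
    with open(src, "rb") as fin, open(dst, "r+b") as fout:
        fin.seek(offset)
        fout.seek(offset)
        fout.write(fin.read(length))  # whole stripe in memory: sketch only


def parallel_copy(src: str, dst: str, streams: int = 4) -> None:
    """Copy src to dst using `streams` concurrent workers."""
    size = os.path.getsize(src)
    # Pre-size the destination so every worker can write at its own offset.
    with open(dst, "wb") as fout:
        fout.truncate(size)
    with ThreadPoolExecutor(max_workers=streams) as pool:
        futures = [
            pool.submit(copy_stripe, src, dst, offset, length)
            for offset, length in stripe_ranges(size, streams)
        ]
        for future in futures:
            future.result()  # re-raise any worker error
```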
Globus provides data migration tools to researchers
• Non-profit business within the University of Chicago
• Focused on providing low-cost tools to researchers doing data-intensive
research
• Globus is a SaaS offering that acts as a middleman to coordinate transfers of data
between endpoints
• Publishes a list of public endpoints
• Provides API and services such as authentication
• Sync between two endpoints typically uses the GridFTP protocol
• It is possible to use GridFTP without leveraging Globus
– See http://toolkit.globus.org/toolkit/docs/latest-stable/admin/install/
Globus/GridFTP helps move bytes
over the network. The Alfresco Bulk File System
Import Tool (BFSIT) does fast imports once the
files are on the server
High-Level Approach: Two Step Import
• First Step: Globus Connect Personal to Globus Endpoint
• Second Step: Alfresco Bulk File System Import
• Both steps use the Shared Mount
High-Level Approach: Two Step Export
• First Step: Write file(s) to File System
• Second Step: Globus Endpoint to Globus Connect Personal
• Both steps use the Shared Mount
With the high-level approach
determined, it was time to work on
the details
Where to Put the UI?
• Considered Share
– But researchers were already looking for a more streamlined interface
• Considered ADF
– But it was too new at the time
– Wasn’t the right fit for this particular application
• Decided on custom Spring Boot application
– Needed an app anyway
– Could bring in ADF later if desired
Custom Globus Alfresco Transfers application
Simple Scope
• Start transfer jobs
• See the status of transfer jobs
• Publish and subscribe to queues used to
coordinate multi-step transfers
• Authentication
– Authenticates against Alfresco
– Accounts linked to Globus via OAuth
Built With
• Spring Boot
• Angular 4
• Bootstrap 3
• Apache ActiveMQ
• Apache Maven
Solution Components
• Alfresco Enterprise Edition, Clustered
• Globus Server Endpoint
• Both point to the same shared mount
• Globus SaaS communicates with
– Globus Server Endpoint
– Each individual’s Globus Connect Personal
• Globus SaaS provides a REST API
• Spring Boot application used to create transfer jobs
• Coordinates the transfers
• Persists transfer job and user objects to PostgreSQL
• Everything is asynchronous
• Apache ActiveMQ acts as the message broker, persists queues
Queues and Listeners
• Alfresco Import Listener: Given a file path, imports it into a specified node ref using BFSIT
• Alfresco Export Listener: Given a node ref, exports it to a specified file path
• Globus Inbound Transfer Listener: Given an endpoint ID and a path, transfers it to the Noble endpoint
• Globus Outbound Transfer Listener: Given a path on the Noble endpoint, transfers it to a specified path on an endpoint
• Transfer Status Listener: Persists status changes; kicks off next step
Diagram labels: AMP; Globus Alfresco Transfers Spring Boot App
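The coordination pattern described here, where worker listeners report completion and a status listener kicks off the next step, can be sketched with an in-memory queue. This is an illustration under stated assumptions, not the application's code: the real system uses Apache ActiveMQ and Spring Boot, and the message identifiers, `NEXT_STEP` table, and `run_job` function below are hypothetical.

```python
import queue
from typing import Dict, List, Optional

# What the status listener queues next for each message it sees.
# Message names are hypothetical, loosely modeled on the slide labels.
NEXT_STEP: Dict[str, Dict[str, Optional[str]]] = {
    "import": {
        "start": "do_globus_transfer",
        "globus_transfer_done": "do_alfresco_import",
        "alfresco_import_done": None,  # job complete
    },
    "export": {
        "start": "do_alfresco_export",
        "alfresco_export_done": "do_globus_transfer",
        "globus_transfer_done": None,  # job complete
    },
}

# Stand-ins for the worker listeners: each does its work, then reports done.
WORKERS: Dict[str, str] = {
    "do_globus_transfer": "globus_transfer_done",
    "do_alfresco_import": "alfresco_import_done",
    "do_alfresco_export": "alfresco_export_done",
}


def run_job(direction: str) -> List[str]:
    """Drive one job through its steps; return the messages seen, in order."""
    messages: "queue.Queue[str]" = queue.Queue()
    seen: List[str] = []
    messages.put("start")
    while not messages.empty():
        message = messages.get()
        seen.append(message)
        if message in WORKERS:
            # A worker listener handles the request and reports completion.
            messages.put(WORKERS[message])
        else:
            # The status listener records the change and queues the next step.
            next_message = NEXT_STEP[direction][message]
            if next_message is not None:
                messages.put(next_message)
    return seen
```

Keeping the step order in a lookup table means the listeners themselves stay unaware of whether they are part of an import or an export.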
Importing into Alfresco
Transfer to Alfresco (1)
1. Save Transfer Job
2. Put message on a queue: “Do Globus transfer”
Transfer to Alfresco (2)
1. See message: “Do Globus transfer”
2. Start transfer
3. Perform the transfer
4. Put message on the queue: “Globus transfer done”
Transfer to Alfresco (3)
1. See message: “Globus transfer done”
2. Update status
3. Queue message: “Do Alfresco import”
Transfer to Alfresco (4)
1. See message: “Do Alfresco import”
2. BFSIT import
3. Queue message: “Alfresco import done”
4. See message: “Alfresco import done”
5. Update status
Downloading from Alfresco
Transfer from Alfresco (1)
1. Save Transfer Job
2. Put message on a queue: “Do Alfresco export”
Transfer from Alfresco (2)
1. See message: “Do Alfresco export”
2. Custom export
3. Queue message: “Alfresco export done”
Transfer from Alfresco (3)
1. See message: “Alfresco export done”
2. Update status
3. Queue message: “Do Globus transfer”
Transfer from Alfresco (4)
1. See message: “Do Globus transfer”
2. Initiate transfer
3. Do transfer
4. Queue message: “Globus transfer done”
5. See message: “Globus transfer done”
6. Set status
How did we do?
Metrics: Multi-file* Upload/Download
Method                     Upload time   Upload rate            Download time       Download rate
Out-of-the-box             5 minutes     612 MB/min             6.4 minutes         476.6 MB/min
Globus Alfresco Transfers  2 minutes     1530 MB/min            3.6 minutes         1020 MB/min
Improvement                60% faster    150% more throughput   53% faster          114% more throughput
*Four files totaling 3,060 MB
Metrics: Single-file* Upload/Download
Method                     Upload time   Upload rate            Download time       Download rate
Out-of-the-box             7.2 minutes   616.2 MB/min           DNF**               DNF**
Globus Alfresco Transfers  3.6 minutes   1220.4 MB/min          5.1 minutes         862.9 MB/min
Improvement                50% faster    98% more throughput    Infinitely faster   Infinitely greater throughput
*Single file of size 4,418 MB
**Alfresco throws an exception at around 1 GB
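The "faster" and "more throughput" figures follow directly from payload size and elapsed time. The sketch below shows the arithmetic using the multi-file upload row (3,060 MB in 5 minutes versus 2 minutes); the function names are illustrative, and because the reported times are rounded, recomputing other rows may differ slightly from the slide.

```python
def rate_mb_per_min(size_mb: float, minutes: float) -> float:
    """Average throughput in MB per minute."""
    return size_mb / minutes


def percent_faster(before_min: float, after_min: float) -> float:
    """Reduction in elapsed time, as a percentage of the original time."""
    return (before_min - after_min) / before_min * 100


def percent_more_throughput(before_min: float, after_min: float) -> float:
    """Increase in throughput for the same payload, as a percentage."""
    return (before_min / after_min - 1) * 100


if __name__ == "__main__":
    size_mb, before, after = 3060, 5.0, 2.0  # multi-file upload row
    print(f"{rate_mb_per_min(size_mb, before):.0f} MB/min before")
    print(f"{rate_mb_per_min(size_mb, after):.0f} MB/min after")
    print(f"{percent_faster(before, after):.0f}% faster")
    print(f"{percent_more_throughput(before, after):.0f}% more throughput")
```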
Results
• Transfers can now be done as “fire-and-forget” jobs
• Any number of files, any size
• Streamlined, purpose-built UI keeps researchers focused
• Integrates with existing sync technology that researchers like
• Reduced transfer time by 50–60%
• Increased transfer rate by 100–150%
Futures
• Improve download by doing a move from content store rather than a write
• Send files to/from any Globus endpoint, including external
– Currently the transfer source/target is Globus Connect Personal on Noble
workstations
• Security hardening
• Set metadata on multiple files during import
• Auditing/usage reports
• Possible new requirements
– Scheduled/recurring transfers
– Share integration
– ADF integration
Thank You!
https://www.metaversant.com
https://ecmarchitect.com
@jeffpotts01

08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 

Dernier (20)

Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Moving Gigantic Files Into and Out of the Alfresco Repository

  • 1. Moving Gigantic Files In and Out of the Repository
      Jeff Potts, Metaversant Group, Inc.
  • 2. What’s the Deal with Large Files?
      • Alfresco can manage files of any size, but getting large files into and out of the repo is often problematic
      • They take way too long to transfer
          – Sessions time out
          – Machines go to sleep
          – Incomplete files get transferred
          – Users think, “Is this thing hung?” and then cancel
      • End users must actively monitor transfers in most cases
  • 3. This talk is a technical case study about an approach to significantly improving large file transfers
  • 4. About Noble Research Institute
      • Research organization focused on improving agriculture for all mankind
          – Research
          – Producer Relations
          – Applied agricultural systems and stewardship
          – Education
      • About 400 employees from all over the world
      • Headquartered in Ardmore, Oklahoma
      • https://www.noble.org
  • 5. About Metaversant
      • Consulting firm focused on solving business problems with open source Content Management, Workflow, & Search technology
      • Founded in 2010
      • Clients all over the world in a variety of industries, including:
          – Airlines
          – Manufacturing
          – Construction
          – Financial Services
          – Higher Education
          – Life Sciences
          – Professional Services
      • https://www.metaversant.com
  • 6. The Problem
      • Researchers work with very large files
      • Typical size ranges from a few GB to hundreds of GB
      • Source of the files is mixed
          – Generated internally (e.g., by gene sequencing machines)
          – Acquired as data sets from other research institutions
      • Data governance team wants everything in Alfresco
      • Large size makes moving files in and out of Alfresco difficult
  • 7. What We Tried
      • Desktop Sync
      • CMIS update content stream
          – Versions are created, and auto-versioning is somewhat painful to disable
      • Increasing timeouts
          – Losing battle when files are multiple gigabytes
      • Using Alfresco FTP
          – Usually requires a thick client to be installed
          – Not preferred by end users
      • Resumable upload Share customization
          – Actually worked pretty well
          – Only handles uploads, not downloads
  • 8. Sidebar: Resumable Upload Details
      • Share customization (closed source)
      • Leverages resumable.js, see http://www.resumablejs.com/
      • Utilizes the HTML5 File API
      • If an upload stalls or ends prematurely, the end user can restart where it left off
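The "restart where it left off" behavior in the sidebar can be pictured as simple chunk bookkeeping. The sketch below is a generic illustration only, not the resumable.js API and not the closed-source Share customization: the function names, the 1-based chunk numbering, and the idea that the server reports which chunks it already holds are all assumptions made for the example.

```python
import math


def missing_chunks(file_size: int, chunk_size: int, uploaded: set[int]) -> list[int]:
    """Return the 1-based chunk numbers that still need to be sent.

    `uploaded` is the (hypothetical) set of chunk numbers the server
    already reports as received before the upload stalled.
    """
    total = math.ceil(file_size / chunk_size)
    return [n for n in range(1, total + 1) if n not in uploaded]


def chunk_range(n: int, file_size: int, chunk_size: int) -> tuple[int, int]:
    """Return the [start, end) byte range covered by 1-based chunk n."""
    start = (n - 1) * chunk_size
    end = min(start + chunk_size, file_size)
    return start, end


if __name__ == "__main__":
    # A 10-byte file in 4-byte chunks where only chunk 1 arrived.
    for n in missing_chunks(10, 4, {1}):
        print(n, chunk_range(n, 10, 4))
```

On restart, the client only re-sends the byte ranges of the missing chunks instead of starting the whole file over.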
  • 9. Inescapable math related to moving large files
      • How long does it take to move 25 GB of data?
          – Ethernet (10 Mbit/s): 333.33 minutes
          – Fast Ethernet (100 Mbit/s): 33.33 minutes
          – Gigabit Ethernet (1 Gbit/s): 3.33 minutes
          – 10 Gigabit Ethernet (10 Gbit/s): 0.33 minutes
          – 100 Gigabit Ethernet (100 Gbit/s): 0.03 minutes
      • Assumes full bandwidth is available
      • Network only; does not account for disk or other non-network latencies on either end
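The figures on slide 9 can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes decimal units (1 GB = 8,000 Mbit) and, like the slide, full use of the link with no disk or protocol overhead; the function name is just for illustration.

```python
def transfer_minutes(size_gb: float, link_mbit_per_s: float) -> float:
    """Ideal transfer time in minutes for a file of size_gb gigabytes.

    Uses decimal units: 1 GB = 1,000 MB = 8,000 Mbit.
    """
    size_mbit = size_gb * 8 * 1000
    seconds = size_mbit / link_mbit_per_s
    return seconds / 60


# Link speeds from the slide, in Mbit/s.
LINKS = {
    "Ethernet": 10,
    "Fast Ethernet": 100,
    "Gigabit Ethernet": 1_000,
    "10 Gigabit Ethernet": 10_000,
    "100 Gigabit Ethernet": 100_000,
}

if __name__ == "__main__":
    for name, rate in LINKS.items():
        print(f"{name}: {transfer_minutes(25, rate):.2f} minutes")
```

Changing the first argument shows how quickly the best case grows for files in the hundreds of gigabytes.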
  • 10. It’s not the actual import/export that’s killing us, it’s the movement of so many bytes over the network
  • 11. Technologies That Move Large Files
      • BitTorrent
          – Looked at BitTorrent Sync, which became Resilio Sync
          – Performance increases when multiple people have the same file
          – Primarily peer-to-peer with an emphasis on desktop-to-desktop or between devices
      • GridFTP
          – Extends FTP to add parallelism
          – Multiple implementations, including at least one that is commercially supported
          – Works between servers, desktop-to-server, and between devices
  • 12. GridFTP was created to move large files to clusters
      • Extension of FTP
      • Defined by the Open Grid Forum (http://www.ogf.org)
      • Designed specifically to facilitate transfers of large files and large sets of files
      • Uses multiple parallel streams to move data over TCP
      • One of several methods that a product called Globus uses to move data between endpoints
      • More information at http://toolkit.globus.org/toolkit/docs/6.0/gridftp/
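The idea of moving one file over several parallel streams can be illustrated locally. The sketch below is not GridFTP and involves no network: it only shows how a file can be split into byte ranges that independent workers copy at the same time. The function names and the default of four streams are assumptions for the example.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def split_ranges(size: int, streams: int) -> list[tuple[int, int]]:
    """Split [0, size) into up to `streams` contiguous (offset, length) ranges."""
    base, extra = divmod(size, streams)
    ranges, offset = [], 0
    for i in range(streams):
        length = base + (1 if i < extra else 0)
        if length:
            ranges.append((offset, length))
        offset += length
    return ranges


def copy_range(src: str, dst: str, offset: int, length: int) -> None:
    """Copy one byte range; each worker uses its own file handles."""
    with open(src, "rb") as fin, open(dst, "r+b") as fout:
        fin.seek(offset)
        fout.seek(offset)
        # Reads the whole range at once for brevity; a real tool would buffer.
        fout.write(fin.read(length))


def parallel_copy(src: str, dst: str, streams: int = 4) -> None:
    """Copy src to dst using several workers, one per byte range."""
    size = os.path.getsize(src)
    with open(dst, "wb") as f:
        f.truncate(size)  # pre-size the target so workers can seek into it
    with ThreadPoolExecutor(max_workers=streams) as pool:
        futures = [pool.submit(copy_range, src, dst, off, ln)
                   for off, ln in split_ranges(size, streams)]
        for fut in futures:
            fut.result()  # re-raise any worker error
```

Each range is independent, which is what lets a parallel transfer keep making progress on several connections at once.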
  • 13. Globus provides data migration tools to researchers
      • Non-profit business within the University of Chicago
      • Focused on providing low-cost tools to researchers doing data-intensive research
      • Globus is a SaaS offering that acts as a middleman to coordinate transfers of data between endpoints
      • Publishes a list of public endpoints
      • Provides an API and services such as authentication
      • Sync between two endpoints typically uses the GridFTP protocol
      • It is possible to use GridFTP without leveraging Globus
          – See http://toolkit.globus.org/toolkit/docs/latest-stable/admin/install/
  • 14. Globus/GridFTP helps move bytes over the network. Alfresco's Bulk File System Import Tool (BFSIT) does fast imports once the files are on the server
  • 15. High-Level Approach: Two-Step Import
      First step: Globus Personal Connect to Globus Endpoint (diagram label: Shared Mount)
  • 16. High-Level Approach: Two-Step Import
      Second step: Alfresco Bulk File System Import (diagram label: Shared Mount)
  • 17. High-Level Approach: Two-Step Export
      First step: Write file(s) to file system (diagram label: Shared Mount)
  • 18. High-Level Approach: Two-Step Export
      Second step: Globus Endpoint to Globus Personal Connect (diagram label: Shared Mount)
  • 19. With the high-level approach determined, it was time to work on the details
  • 20. Where to Put the UI?
      • Considered Share
          – But researchers were already looking for a more streamlined interface
      • Considered ADF
          – But it was too new at the time
          – Wasn’t the right fit for this particular application
      • Decided on a custom Spring Boot application
          – Needed an app anyway
          – Could bring ADF in later if desired
  • 21. Custom Globus Alfresco Transfers application
      Simple Scope
      • Start transfer jobs
      • See the status of transfer jobs
      • Publishes and subscribes to queues used to coordinate multi-step transfers
      • Authentication
          – Authenticates against Alfresco
          – Accounts linked to Globus via OAuth
      Built With
      • Spring Boot
      • Angular 4
      • Bootstrap 3
      • Apache ActiveMQ
      • Apache Maven
  • 22. Solution Components
      • Alfresco Enterprise Edition, Clustered
      • Globus Server Endpoint
      • Both point to the same shared mount
  • 23. Solution Components
      • Globus SaaS communicates with
          – Globus Server Endpoint
          – Each individual’s Globus Personal Connect
      • Globus SaaS provides a REST API
  • 24. Solution Components
      • Spring Boot application used to create transfer jobs
      • Coordinates the transfers
      • Persists transfer job and user objects to PostgreSQL
  • 25. Solution Components
      • Everything is asynchronous
      • Apache ActiveMQ acts as the message broker and persists queues
  • 26. Queues and Listeners
      • Alfresco Import Listener: given a file path, imports it into a specified node ref using BFSIT
      • Alfresco Export Listener: given a node ref, exports it to a specified file path
      • Globus Inbound Transfer Listener: given an endpoint ID and a path, transfers it to the Noble endpoint
      • Globus Outbound Transfer Listener: given a path on the Noble endpoint, transfers it to a specified path on an endpoint
      • Transfer Status Listener: persists status changes and kicks off the next step
      Diagram labels: AMP; Globus Alfresco Transfers Spring Boot App
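The Transfer Status Listener's job of persisting a status change and then kicking off the next step can be pictured as a small routing table. This is a minimal sketch, not the application's code: the queue names, event names, and the in-memory `statuses` and `outbox` stand-ins for PostgreSQL and ActiveMQ are all hypothetical.

```python
from dataclasses import dataclass

# (direction, completed step) -> queue that should receive the next message.
# None means the job has no further steps. All names are hypothetical.
NEXT_STEP = {
    ("import", "globus_transfer_done"): "alfresco.import",
    ("import", "alfresco_import_done"): None,
    ("export", "alfresco_export_done"): "globus.outbound",
    ("export", "globus_transfer_done"): None,
}


@dataclass
class StatusEvent:
    job_id: int
    direction: str  # "import" or "export"
    event: str      # name of the step that just finished


def handle_status(event: StatusEvent, statuses: dict, outbox: list):
    """Record the status change, then queue the next step if there is one."""
    statuses[event.job_id] = event.event  # stand-in for a database write
    next_queue = NEXT_STEP[(event.direction, event.event)]
    if next_queue is None:
        statuses[event.job_id] = "complete"
    else:
        outbox.append((next_queue, event.job_id))  # stand-in for a broker send
    return next_queue
```

Keeping the routing in one place means the import and export listeners only need to announce that they finished; they do not need to know what comes next.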
  • 34. Transfer to Alfresco (1)
      1. Save transfer job
      2. Put message on a queue: “Do Globus transfer”
  • 35. Transfer to Alfresco (2)
      1. See message: “Do Globus transfer”
      2. Start transfer
      3. Perform the transfer
      4. Put message on the queue: “Globus transfer done”
  • 36. Transfer to Alfresco (3)
      1. See message: “Globus transfer done”
      2. Update status
      3. Queue message: “Do Alfresco transfer”
  • 37. Transfer to Alfresco (4)
      1. See message: “Do Alfresco import”
      2. BFSIT import
      3. Queue message: “Alfresco import done”
      4. See message: “Alfresco import done”
      5. Update status
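The four "Transfer to Alfresco" slides describe a chain of messages. The sketch below simulates that chain in a single process with Python's standard `queue` module in place of ActiveMQ; the real Globus transfer and bulk import are replaced by comments, and the message names and status values are assumptions made for the example.

```python
import queue


def run_import(job_id: int, source_path: str) -> tuple[list[str], str]:
    """Walk one import job through the message chain; return messages seen and final status."""
    bus = queue.Queue()
    seen = []
    job = {"id": job_id, "source": source_path, "status": "new"}

    # (1) The app saves the transfer job and queues the first message.
    job["status"] = "queued"
    bus.put("do_globus_transfer")

    while not bus.empty():
        message = bus.get()
        seen.append(message)
        if message == "do_globus_transfer":
            # (2) The Globus transfer to the shared mount would run here.
            bus.put("globus_transfer_done")
        elif message == "globus_transfer_done":
            # (3) Update status, then ask Alfresco to import.
            job["status"] = "transferred"
            bus.put("do_alfresco_import")
        elif message == "do_alfresco_import":
            # (4) The bulk file system import would run here.
            bus.put("alfresco_import_done")
        elif message == "alfresco_import_done":
            job["status"] = "complete"
    return seen, job["status"]


if __name__ == "__main__":
    print(run_import(1, "/data/example.fastq"))
```

Because every step only reacts to a message and emits another, nobody has to keep a session open while the bytes move, which is the point of the asynchronous design.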
  • 45. Transfer from Alfresco (1)
      1. Save transfer job
      2. Put message on a queue: “Do Alfresco export”
  • 46. Transfer from Alfresco (2)
      1. See message: “Do Alfresco export”
      2. Custom export
      3. Queue message: “Alfresco export done”
  • 47. Transfer from Alfresco (3)
      1. See message: “Alfresco export done”
      2. Update status
      3. Queue message: “Do Globus transfer”
  • 48. Transfer from Alfresco (4)
      1. See message: “Do Globus transfer”
      2. Initiate transfer
      3. Do transfer
      4. Queue message: “Globus transfer done”
      5. See message
      6. Set status
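On the export side, the status updates in the steps above imply that a job record moves through a fixed sequence of states. The following sketch shows one way such a record could guard against out-of-order updates; the class, the status names, and the example node ref and path are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    QUEUED = "queued"
    EXPORTED = "alfresco export done"
    TRANSFERRED = "globus transfer done"


# Each status may only advance to the one listed here.
ALLOWED = {
    Status.QUEUED: Status.EXPORTED,
    Status.EXPORTED: Status.TRANSFERRED,
}


@dataclass
class ExportJob:
    node_ref: str
    target_path: str
    status: Status = Status.QUEUED
    history: list = field(default_factory=list)

    def advance(self, new_status: Status) -> None:
        """Move to new_status, rejecting skipped or repeated steps."""
        if ALLOWED.get(self.status) is not new_status:
            raise ValueError(
                f"cannot go from {self.status.name} to {new_status.name}")
        self.history.append(self.status)
        self.status = new_status


if __name__ == "__main__":
    job = ExportJob("workspace://SpacesStore/example", "/shared/export/example")
    job.advance(Status.EXPORTED)
    job.advance(Status.TRANSFERRED)
    print(job.status.value, [s.value for s in job.history])
```

A status screen can then render `history` directly, so users see where a long-running job stands without watching the transfer.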
  • 49. How did we do?
  • 50. Metrics: Multi-file* Upload/Download
      Method                      Upload time   Upload rate            Download time   Download rate
      Out-of-the-box              5 minutes     612 MB/min             6.4 minutes     476.6 MB/min
      Globus Alfresco Transfers   2 minutes     1530 MB/min            3 minutes       1020 MB/min
      Improvement                 60% faster    150% more throughput   53% faster      114% more throughput
      *Four files totaling 3,060 MB
  • 51. Metrics: Single-file* Upload/Download
      Method                      Upload time   Upload rate           Download time      Download rate
      Out-of-the-box              7.2 minutes   616.2 MB/min          DNF**              DNF**
      Globus Alfresco Transfers   3.6 minutes   1220.4 MB/min         5.1 minutes        862.9 MB/min
      Improvement                 50% faster    98% more throughput   Infinitely faster  Infinitely greater throughput
      *Single file of size 4,418 MB
      **Alfresco throws an exception at around 1 GB
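The improvement percentages can be rechecked from the raw sizes and times. This sketch uses the figures given in the editor's notes (for example 6.42 versus 3 minutes for the multi-file download, and 7.17 versus 3.62 minutes for the single-file upload); the function names are just for illustration.

```python
def rate_mb_per_min(size_mb: float, minutes: float) -> float:
    """Average throughput in MB per minute."""
    return size_mb / minutes


def percent_faster(baseline_min: float, new_min: float) -> float:
    """Reduction in elapsed time, as a percentage of the baseline."""
    return (baseline_min - new_min) / baseline_min * 100


def percent_more_throughput(baseline_rate: float, new_rate: float) -> float:
    """Increase in throughput, as a percentage of the baseline."""
    return (new_rate / baseline_rate - 1) * 100


if __name__ == "__main__":
    tests = [
        ("multi-file upload", 3060, 5.0, 2.0),
        ("multi-file download", 3060, 6.42, 3.0),
        ("single-file upload", 4418, 7.17, 3.62),
    ]
    for name, size_mb, before, after in tests:
        old = rate_mb_per_min(size_mb, before)
        new = rate_mb_per_min(size_mb, after)
        print(f"{name}: {percent_faster(before, after):.0f}% faster, "
              f"{percent_more_throughput(old, new):.0f}% more throughput")
```

Note that "percent faster" and "percent more throughput" measure the same change from two directions, which is why halving the time roughly doubles the rate.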
  • 52. Results
      • Transfers can now be done as “fire-and-forget” jobs
      • Any number of files, any size
      • Streamlined, purpose-built UI keeps researchers focused
      • Integrates with existing sync technology researchers like
      • Reduced transfer time by 50 to 60%
      • Increased transfer rate by 100 to 150%
  • 53. Futures
      • Improve download by doing a move from the content store rather than a write
      • Send files to/from any Globus endpoint, including external ones
          – Currently the transfer source/target is Globus Personal Connect on Noble workstations
      • Security hardening
      • Set metadata on multiple files during import
      • Auditing/usage reports
      • Possible new requirements
          – Scheduled/recurring transfers
          – Share integration
          – ADF integration

Editor's notes

  1. Learn more about Noble Research Institute at https://www.noble.org
  2. Learn more at https://www.metaversant.com
  3. App saves transfer job. Places a message on the queue.
  4. App saves transfer job. Places a message on the queue.
  5. Update status. Put message on Alfresco Import queue.
  6. Alfresco sees message. Initiates a Bulk File System Import. Places a message on the queue to update status. App sees message. Updates status to “Complete”.
  7. App saves transfer job. Places a message on the queue.
  8. Multi-file upload test (4 files, totaling 3,060 MB): GAT uploaded the files in 2 minutes versus 5 minutes out-of-the-box (60% improvement). Multi-file download test (4 files, totaling 3,060 MB): GAT downloaded the files in 3 minutes versus 6.42 minutes out-of-the-box (53% improvement).
  9. Single-file upload test (1 file, 4,418 MB): GAT uploaded the file in 3.62 minutes versus 7.17 minutes out-of-the-box (50% improvement). Single-file download test (1 file, 4,418 MB): GAT downloaded the file in 5.12 minutes versus multiple unsuccessful attempts out-of-the-box.