Stork 1.0 and Beyond
Data Scheduling for Large-scale Collaborative Science
Mehmet Balman
Louisiana State University, Baton Rouge, LA, USA
Presented at Condor Week 2009 April 20-April 23, 2009
Scheduling Data Placement Jobs
• Data Placement Activities
• Modular Architecture
– Data Transfer Modules for specific protocols/services
• Throttle the maximum number of transfer operations running
• Keep a log of data placement activities
• Add fault tolerance to data transfers
Job Submission
[
dest_url = "gsiftp://eric1.loni.org/scratch/user/";
arguments = "-p 4 -dbg -vb";
src_url = "file:///home/user/test/";
dap_type = "transfer";
verify_checksum = true;
verify_filesize = true;
set_permission = "755";
recursive_copy = true;
network_check = true;
checkpoint_transfer = true;
output = "user.out";
err = "user.err";
log = "userjob.log";
]
Agenda
• Error Detection and Error Classification
• Data Transfer Operations
– Dynamic Tuning
– Prediction Service
– Job Aggregation
• Data Migration using Stork
• Practical example in PetaShare Project
• Future Directions
Failure-Awareness
• Dynamic Environment:
• data transfers are prone to frequent failures
• what went wrong during data transfer?
• No access to the remote resources
• Messages get lost due to system malfunction
• Instead of waiting for a failure to happen
• Detect possible failures and malfunctioning services
• Search for another data server
• Use an alternate data transfer service
• Classify erroneous cases to make better decisions
Error Detection
• Use Network Exploration Techniques
– Check availability of the remote service
– Resolve host and determine connectivity failures
– Detect available data transfer services
– should be fast and efficient, so as not to burden system/network resources
• Error while transfer is in progress?
– Error_TRANSFER
• Retry or not?
• When to re-initiate the transfer?
• Use alternate options?
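The pre-transfer checks listed above (resolve the host, then test connectivity to the service) can be sketched roughly as follows. This is a minimal illustration, not Stork's implementation: the status labels, the function name `check_service`, and the injectable `resolve`/`connect` parameters are assumptions made for this example.

```python
import socket
from typing import Callable

# Hypothetical status labels; the slides do not list Stork's actual codes.
OK = "OK"
ERR_RESOLVE = "ERROR_RESOLVE_HOST"
ERR_CONNECT = "ERROR_CONNECT"


def check_service(host: str, port: int, timeout: float = 3.0,
                  resolve: Callable = socket.getaddrinfo,
                  connect: Callable = socket.create_connection) -> str:
    """Cheap probe before a transfer: one DNS lookup, then one TCP connect."""
    try:
        resolve(host, port)              # can the host name be resolved?
    except OSError:                      # socket.gaierror is an OSError
        return ERR_RESOLVE
    try:
        conn = connect((host, port), timeout)   # is the service port reachable?
    except OSError:
        return ERR_CONNECT
    conn.close()
    return OK
```

A scheduler could call `check_service("eric1.loni.org", 2811)` before starting a GridFTP transfer (2811 is the default GridFTP control port) and postpone the job or look for another server when the result is not `OK`. The probe opens a single short-lived connection, which keeps the load on system and network resources small.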
Error Classification
• Recover from Failure
• Retry failed operation
• Postpone scheduling of failed operations
• Early Error Detection
• Initiate transfer when the erroneous condition has recovered
• Or use alternate options
• Data transfer protocols do not always return appropriate error codes
• Using error messages generated by the data transfer protocol
• A better logging facility and classification
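One way to picture classification driven by protocol error messages is a small rule table that maps message text to a scheduling decision. The sketch below is an assumption-laden illustration: the `Action` names, the `RULES` table, and the message substrings are hypothetical and are not taken from Stork.

```python
from enum import Enum


class Action(Enum):
    RETRY = "retry now"
    POSTPONE = "postpone until the condition recovers"
    ALTERNATE = "use an alternate server or protocol"
    FAIL = "report to the user"


# Hypothetical rules: substring of the protocol's error message -> decision.
RULES = [
    ("connection refused", Action.ALTERNATE),
    ("timed out", Action.POSTPONE),
    ("connection reset", Action.RETRY),
    ("permission denied", Action.FAIL),
]


def classify(message: str, attempts: int, max_attempts: int = 3) -> Action:
    """Pick a scheduling decision from an error message and the retry count."""
    if attempts >= max_attempts:
        return Action.FAIL               # give up after too many attempts
    text = message.lower()
    for needle, action in RULES:
        if needle in text:
            return action
    return Action.RETRY                  # unknown errors are simply retried
```

For example, `classify("421 Connection timed out", attempts=0)` yields `Action.POSTPONE`, while any message on the third attempt yields `Action.FAIL`.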
Error Reporting
Failure-Aware Scheduling
SCOOP data - Hurricane Gustav Simulations
Hundreds of files (250 data transfer operations)
Small (100MB) and large files (1G, 2G)
New Transfer Modules
• Verify the successful completion of the operation by
checking checksum and file size.
• For GridFTP, the Stork transfer module can recover from a
failed operation by restarting from the last transmitted
file. In case of a retry after a failure, the scheduler informs
the transfer module to recover and restart the transfer
using the information from a rescue file created by the
checkpoint-enabled transfer module.
• Replacing Globus RFT (Reliable File Transfer)
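The rescue-file idea can be sketched as follows: progress is recorded after every file, so a retried job skips what was already transmitted. This is only an illustration under stated assumptions. The JSON rescue-file format, the function name `transfer_with_rescue`, and the `copy_one` callback are hypothetical, and the slides do not describe the real file layout.

```python
import json
import os
from typing import Callable, Iterable, List


def transfer_with_rescue(files: Iterable[str],
                         copy_one: Callable[[str], None],
                         rescue_path: str) -> List[str]:
    """Copy files in order, skipping those the rescue file marks as done.

    Returns the names actually copied during this call.
    """
    done: List[str] = []
    if os.path.exists(rescue_path):
        with open(rescue_path) as fh:
            done = json.load(fh)         # progress left by a failed attempt
    completed = set(done)
    sent: List[str] = []
    for name in files:
        if name in completed:
            continue                     # already transmitted earlier
        copy_one(name)                   # may raise; earlier progress is kept
        done.append(name)
        sent.append(name)
        with open(rescue_path, "w") as fh:
            json.dump(done, fh)          # checkpoint after every file
    if os.path.exists(rescue_path):
        os.remove(rescue_path)           # whole job finished, nothing to rescue
    return sent
```

If `copy_one` raises partway through, the rescue file stays on disk, and calling the function again with the same arguments resumes with the first file that was not completed.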
Tuning Data Transfers
• Latency Wall
– Buffer Size Optimization
– Parallel TCP Streams
– Concurrent Transfers
• User level end-to-end Tuning
Parallelism
• (1) the number of parallel data streams connected to a data transfer
service, for increasing the utilization of network bandwidth
• (2) the number of concurrent data transfer operations that are
initiated at the same time, for better utilization of system resources
Parameter Estimation
• come up with a good estimation for the
parallelism level
– Network statistics
– Extra measurement
– Historical data
• Might not reflect the best possible current
settings (Dynamic Environment)
Optimization Service
Dynamic Tuning
Average Throughput using Parallel Streams
Experiments in LONI (www.loni.org) environment - transfer file to QB from Linux machine
Dynamic Setting of Parallel Streams
Experiments in LONI (www.loni.org) environment - transfer file to QB from IBM machine
Dynamic Setting of Parallel Streams
Experiments in LONI (www.loni.org) environment - transfer file to QB from Linux machine
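The slides do not spell out how the number of parallel streams is adjusted at run time. As one plausible illustration only, the sketch below doubles the stream count while the measured throughput keeps improving by a minimum margin. The function name `tune_streams`, the `measure` callback, and the 5% threshold are all assumptions, not the method used in Stork.

```python
from typing import Callable


def tune_streams(measure: Callable[[int], float],
                 max_streams: int = 32,
                 min_gain: float = 0.05) -> int:
    """Double the stream count while throughput improves by at least min_gain.

    measure(n) is assumed to run a short sample transfer with n parallel
    streams and return the observed throughput (any consistent unit).
    """
    best_n = 1
    best_rate = measure(best_n)
    n = 2
    while n <= max_streams:
        rate = measure(n)
        if rate < best_rate * (1.0 + min_gain):
            break                        # no meaningful gain: stop probing
        best_n, best_rate = n, rate
        n *= 2
    return best_n
```

Because the probe measures live transfers instead of relying on historical data, it follows the current state of a dynamic environment, at the cost of a few sample transfers.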
Job Aggregation
• data placement jobs are combined and processed as a
single transfer job.
• Information about the aggregated job is stored in the job queue and
it is tied to a main job which is actually performing the transfer
operation, such that it can be queried and reported separately.
• Hence, aggregation is transparent to the user
• We have seen vast performance improvement, especially
with small data files,
• simply by combining data placement jobs based on their
source or destination addresses.
– decreasing the amount of protocol usage
– reducing the number of independent network connections
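Combining jobs by source or destination address can be sketched as a grouping step: jobs that share the same endpoint hosts are batched, up to a maximum aggregation count per batch. The function name `aggregate` and the representation of a job as a `(src_url, dest_url)` pair are assumptions made for this illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple
from urllib.parse import urlparse

Job = Tuple[str, str]  # (src_url, dest_url), a simplified stand-in for a job


def aggregate(jobs: List[Job], max_count: int) -> List[List[Job]]:
    """Group jobs by (source host, destination host), max_count per batch."""
    groups: Dict[Tuple[str, str], List[Job]] = defaultdict(list)
    for src, dest in jobs:
        key = (urlparse(src).netloc, urlparse(dest).netloc)
        groups[key].append((src, dest))
    batches: List[List[Job]] = []
    for members in groups.values():
        # Split each host pair into chunks of at most max_count jobs.
        for i in range(0, len(members), max_count):
            batches.append(members[i:i + max_count])
    return batches
```

Each batch can then be handed to a single transfer operation, so one connection serves several small files, while the individual jobs remain listed in the batch and can still be reported separately.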
Job Aggregation
[Figure: total time (sec) versus max aggregation count (0 to 40), with one curve each for a single job at a time and for 2, 4, 8, 16, and 32 parallel jobs]
Experiments on LONI (Louisiana Optical Network Initiative):
1024 transfer jobs from Ducky to Queenbee (rtt avg 5.129 ms) - 5MB
data file per job
PetaShare
• Distributed Storage for Data Archive
• Global Namespace among distributed resources
• Client tools and interfaces
• Pcommands
• Petashell
• Petafs
• Windows Browser
• Web Portal
• Spans seven Louisiana research institutions
• Manages 300TB of disk storage, 400TB of tape
Broader Impact
Fast and Efficient Data Migration in PetaShare
Future Directions
Stork: Central Scheduling Framework
• Performance bottleneck
– Hundreds of jobs submitted to a single batch scheduler, Stork
• Single point of failure
Future Directions
Distributed Data Scheduling
• Interaction between data schedulers
• Manage data activities with lightweight agents in each site
• Better parameter tuning and reordering of data placement jobs
– Job Delegation
– peer-to-peer data movement
– data and server striping
– make use of replicas for multi-source downloads
Questions?
Team:
Tevfik Kosar kosar@cct.lsu.edu
Mehmet Balman balman@cct.lsu.edu
Dengpan Yin dyin@cct.lsu.edu
Jia "Jacob" Cheng jacobch@cct.lsu.edu
www.petashare.org www.cybertools.loni.org www.storkproject.org www.cct.lsu.edu

A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 

Dernier (20)

Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance Audit
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 

Balman stork cw09

  • 2. Scheduling Data Placement Jobs
      • Data Placement Activities
      • Modular Architecture
          – Data Transfer Modules for specific protocols/services
      • Throttle maximum transfer operations running
      • Keep a log of data placement activities
      • Add fault tolerance to data transfers
  • 3. Job Submission
      [
        dest_url = "gsiftp://eric1.loni.org/scratch/user/";
        arguments = "-p 4 dbg -vb";
        src_url = "file:///home/user/test/";
        dap_type = "transfer";
        verify_checksum = true;
        verify_filesize = true;
        set_permission = "755";
        recursive_copy = true;
        network_check = true;
        checkpoint_transfer = true;
        output = "user.out";
        err = "user.err";
        log = "userjob.log";
      ]
  • 4. Agenda
      • Error Detection and Error Classification
      • Data Transfer Operations
          – Dynamic Tuning
          – Prediction Service
          – Job Aggregation
      • Data Migration using Stork
      • Practical example in PetaShare Project
      • Future Directions
  • 5. Failure-Awareness
      • Dynamic Environment:
          • Data transfers are prone to frequent failures
          • What went wrong during data transfer?
          • No access to the remote resources
          • Messages get lost due to system malfunction
      • Instead of waiting for a failure to happen
          • Detect possible failures and malfunctioning services
          • Search for another data server
          • Alternate data transfer service
      • Classify erroneous cases to make better decisions
  • 6. Error Detection
      • Use Network Exploration Techniques
          – Check availability of the remote service
          – Resolve host and determine connectivity failures
          – Detect available data transfer services
          – Should be fast and efficient so as not to burden system/network resources
      • Error while transfer is in progress?
          – Error_TRANSFER
              • Retry or not?
              • When to re-initiate the transfer
              • Use alternate options?
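The pre-transfer checks listed on this slide (resolve the host, then test whether the service port accepts connections) can be sketched with the Python standard library. This is only an illustration, not Stork's actual network_check implementation: the function name `check_service` and the three result strings are assumptions made for the example.

```python
import socket

def check_service(host, port, timeout=2.0):
    """Cheap reachability probe (hypothetical helper, not part of Stork).

    Returns "dns_failure" if the name cannot be resolved,
    "connect_failure" if no resolved address accepts a TCP connection,
    and "ok" otherwise.
    """
    try:
        candidates = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns_failure"
    for family, socktype, proto, _canonname, address in candidates:
        try:
            with socket.socket(family, socktype, proto) as probe:
                probe.settimeout(timeout)  # keep the probe fast
                probe.connect(address)
                return "ok"
        except OSError:
            continue  # try the next resolved address, if any
    return "connect_failure"
```

A scheduler could run such a probe before dispatching a job and postpone the job, or look for an alternate server, when the result is not "ok". The probe opens a single short-lived connection, in keeping with the slide's requirement not to burden system or network resources.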
  • 7. Error Classification
      • Recover from Failure
      • Retry failed operation
      • Postpone scheduling of a failed operation
      • Early Error Detection
      • Initiate transfer when the erroneous condition has recovered
      • Or use alternate options
      • Data transfer protocols do not always return appropriate error codes
      • Using error messages generated by the data transfer protocol
      • A better logging facility and classification
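Since protocols do not always return usable error codes, one way to picture the classification step is a small rule table that maps error-message text to a scheduling decision. The sketch below is hypothetical: the `Action` names mirror the options on this slide, but the message substrings and the `classify` function are assumptions for illustration, not Stork's real rules.

```python
from enum import Enum

class Action(Enum):
    RETRY = "retry the failed operation"
    POSTPONE = "postpone and reschedule later"
    ALTERNATE = "use an alternate server or protocol"
    FAIL = "give up and report to the user"

# Hypothetical message fragments; a real deployment would derive these
# from the error messages its transfer protocols actually emit.
RULES = (
    ("timed out", Action.RETRY),
    ("connection reset", Action.RETRY),
    ("connection refused", Action.POSTPONE),
    ("no route to host", Action.POSTPONE),
    ("unknown host", Action.ALTERNATE),
    ("protocol not supported", Action.ALTERNATE),
    ("permission denied", Action.FAIL),
)

def classify(message, default=Action.RETRY):
    """Return the first matching action for an error message."""
    text = message.lower()
    for fragment, action in RULES:
        if fragment in text:
            return action
    return default
```

Keeping the rules in a table rather than in branching code makes it easy to extend the classification as new failure messages show up in the logs.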
  • 9. Failure-Aware Scheduling
      SCOOP data - Hurricane Gustav simulations
      Hundreds of files (250 data transfer operations)
      Small (100MB) and large files (1G, 2G
  • 10. New Transfer Modules
      • Verify the successful completion of the operation by checking checksum and file size.
      • For GridFTP, the Stork transfer module can recover from a failed operation by restarting from the last transmitted file. In case of a retry after a failure, the scheduler informs the transfer module to recover and restart the transfer using the information from a rescue file created by the checkpoint-enabled transfer module.
      • Replacing Globus RFT (Reliable File Transfer)
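The two ideas on this slide, verification by size and checksum and restart from a rescue file, can be sketched together. This is a simplified local-copy illustration under stated assumptions: it uses `shutil.copyfile` in place of GridFTP, a plain text file of completed names as the rescue file, and SHA-256 as the checksum. None of these choices, nor the function names, come from the slides.

```python
import hashlib
import os
import shutil

def sha256_of(path, chunk_size=1 << 20):
    """Checksum a file in chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verified(src, dst):
    """True when size and checksum of the copy match the source."""
    return (os.path.getsize(src) == os.path.getsize(dst)
            and sha256_of(src) == sha256_of(dst))

def transfer_all(names, src_dir, dst_dir, rescue_path, copy=shutil.copyfile):
    """Copy files one by one, recording each verified file in a rescue file.

    On a retry, files already listed in the rescue file are skipped, so the
    transfer resumes after the last successfully transmitted file.
    Returns the names copied during this call.
    """
    done = set()
    if os.path.exists(rescue_path):
        with open(rescue_path) as f:
            done = {line.strip() for line in f if line.strip()}
    copied = []
    with open(rescue_path, "a") as rescue:
        for name in names:
            if name in done:
                continue
            src = os.path.join(src_dir, name)
            dst = os.path.join(dst_dir, name)
            copy(src, dst)
            if not verified(src, dst):
                raise IOError(f"verification failed for {name}")
            rescue.write(name + "\n")
            rescue.flush()  # persist progress before the next file
            copied.append(name)
    return copied
```

Note that this sketch restarts at file granularity only; it does not resume a partially transferred file.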
  • 12. Tuning Data Transfers
      • Latency Wall
          – Buffer Size Optimization
          – Parallel TCP Streams
          – Concurrent Transfers
      • User-level end-to-end Tuning
      Parallelism
      • (1) the number of parallel data streams connected to a data transfer service for increasing the utilization of network bandwidth
      • (2) the number of concurrent data transfer operations that are initiated at the same time for better utilization of system resources.
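Buffer size optimization is usually framed in terms of the bandwidth-delay product (BDP): to keep a link full, the sender needs roughly bandwidth × round-trip time of data in flight. The sketch below computes it. The 10 Gbps link speed in the usage note is an assumption for illustration, and splitting the BDP evenly across parallel streams is a common rule of thumb rather than something the slides state.

```python
def bdp_bytes(bandwidth_bps, rtt_seconds):
    """Bandwidth-delay product in bytes (bits/s * s / 8)."""
    return int(round(bandwidth_bps * rtt_seconds / 8))

def per_stream_buffer(bandwidth_bps, rtt_seconds, streams):
    """Rule-of-thumb buffer per stream: BDP divided across streams, rounded up."""
    if streams < 1:
        raise ValueError("streams must be at least 1")
    total = bdp_bytes(bandwidth_bps, rtt_seconds)
    return -(-total // streams)  # ceiling division
```

For example, with an assumed 10 Gbps path and the 5.129 ms average RTT reported for the Ducky to Queenbee experiments, `bdp_bytes(10_000_000_000, 0.005129)` gives 6411250 bytes, about 6.4 MB. A default-sized TCP buffer is often far smaller than that, which is why either a larger buffer or several parallel streams are needed to fill such a path.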
  • 13. Parameter Estimation
      • Come up with a good estimation for the parallelism level
          – Network statistics
          – Extra measurement
          – Historical data
      • Might not reflect the best possible current settings (Dynamic Environment)
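Because estimates can go stale in a dynamic environment, an alternative is to adjust the parallelism level from observed throughput while the transfer runs. The following is a hypothetical hill-climbing sketch, not the algorithm behind the experiments in this talk: `measure` stands for any caller-supplied function that returns the throughput observed with a given number of streams, and the doubling step and 5% gain threshold are arbitrary assumptions.

```python
def tune_streams(measure, start=1, max_streams=32, min_gain=0.05):
    """Double the stream count while throughput keeps improving.

    measure(n) must return the observed throughput with n parallel streams.
    Stops when the gain over the best result so far drops below min_gain.
    Returns (best_stream_count, best_throughput).
    """
    best_n = start
    best_tp = measure(start)
    n = start * 2
    while n <= max_streams:
        tp = measure(n)
        if tp < best_tp * (1 + min_gain):
            break  # more streams no longer pay off
        best_n, best_tp = n, tp
        n *= 2
    return best_n, best_tp
```

In practice `measure` could transfer one chunk of the data set with the given setting and time it, so the probing itself contributes to the transfer instead of adding extra measurement traffic.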
  • 16. Average Throughput using Parallel Streams
      [Figure] Experiments in LONI (www.loni.org) environment - transfer file to QB from Linux machine
  • 17. Dynamic Setting of Parallel Streams
      [Figure] Experiments in LONI (www.loni.org) environment - transfer file to QB from IBM machine
  • 18. Dynamic Setting of Parallel Streams
      [Figure] Experiments in LONI (www.loni.org) environment - transfer file to QB from Linux machine
  • 19. Job Aggregation
      • Data placement jobs are combined and processed as a single transfer job.
      • Information about the aggregated job is stored in the job queue and is tied to a main job, which actually performs the transfer operation, so that it can be queried and reported separately.
      • Hence, aggregation is transparent to the user.
      • We have seen large performance improvements, especially with small data files, simply by combining data placement jobs based on their source or destination addresses.
          – decreasing the amount of protocol usage
          – reducing the number of independent network connections
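The grouping step can be sketched as follows. This is an illustration only: jobs are represented as hypothetical (src_url, dest_url) pairs, they are grouped by the pair of source and destination hosts (the slides say "source or destination", so this key is an assumption), and `max_count` plays the role of a maximum aggregation count.

```python
from collections import defaultdict
from urllib.parse import urlsplit

def aggregate(jobs, max_count):
    """Group (src_url, dest_url) jobs by host pair into batches.

    Each batch holds at most max_count jobs and could be handed to the
    transfer module as one aggregated job over a single connection.
    """
    if max_count < 1:
        raise ValueError("max_count must be at least 1")
    groups = defaultdict(list)
    for src, dst in jobs:
        key = (urlsplit(src).netloc, urlsplit(dst).netloc)
        groups[key].append((src, dst))
    batches = []
    for members in groups.values():
        for i in range(0, len(members), max_count):
            batches.append(members[i:i + max_count])
    return batches
```

Because each original job stays an identifiable entry inside its batch, a scheduler could still report status per job, which is what keeps the aggregation transparent to the user.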
  • 20. Job Aggregation
      [Figure] Total time (sec) versus max aggregation count, with one curve each for: single job at a time, 2 parallel jobs, 4 parallel jobs, 8 parallel jobs, 16 parallel jobs, 32 parallel jobs
      Experiments on LONI (Louisiana Optical Network Initiative): 1024 transfer jobs from Ducky to Queenbee (rtt avg 5.129 ms) - 5MB data file per job
  • 22. PetaShare
      • Distributed Storage for Data Archive
      • Global Namespace among distributed resources
      • Client tools and interfaces
          • Pcommands
          • Petashell
          • Petafs
          • Windows Browser
          • Web Portal
      • Spans seven Louisiana research institutions
      • Manages 300TB of disk storage, 400TB of tape
  • 24. Future Directions
      Stork: Central Scheduling Framework
      • Performance bottleneck
          – Hundreds of jobs submitted to a single batch scheduler, Stork
      • Single point of failure
  • 25. Future Directions
      Distributed Data Scheduling
      • Interaction between data schedulers
      • Manage data activities with lightweight agents in each site
      • Better parameter tuning and reordering of data placement jobs
          – Job Delegation
          – peer-to-peer data movement
          – data and server striping
          – make use of replicas for multi-source downloads
  • 26. Questions?
      Team:
      Tevfik Kosar kosar@cct.lsu.edu
      Mehmet Balman balman@cct.lsu.edu
      Dengpan Yin dyin@cct.lsu.edu
      Jia "Jacob" Cheng jacobch@cct.lsu.edu
      www.petashare.org
      www.cybertools.loni.org
      www.storkproject.org
      www.cct.lsu.edu