Our project is about how to schedule jobs among a group of machines. Our implementation is at the user level, but the same idea could be applied in the kernel of a distributed operating system. Long-running, short-running, memory-intensive, CPU-bound… we don’t know what kinds of jobs to expect. So how can the scheduler put jobs where they belong if it doesn’t know these things? Transition: Wouldn’t it be nice if the scheduler could just “handle it” – without the user having to specify the characteristics of their jobs in advance?
Our approach to this problem is DIOS, an adaptive distributed scheduler. Describe diagram: local schedulers (Hare) run on each machine, with queues of jobs. The global scheduler (Rhino) receives events from the Hares and sends down actions, like migrate or pause. Transition: So you must be thinking… wait, how are you going to just “gather application-specific info”?
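(Not on the slide, just for reference: a minimal sketch of what the Hare-to-Rhino events and Rhino-to-Hare actions might look like. The type and field names here are hypothetical, only meant to make the event/action flow concrete.)

    // Hypothetical Hare <-> Rhino messages (names and fields are illustrative,
    // not the actual DIOS wire format).
    #include <cstdint>
    #include <string>

    // Event a Hare (local scheduler) reports up to Rhino (global scheduler).
    struct HareEvent {
        std::string machine;        // which machine the event came from
        uint32_t    queueLength;    // jobs waiting in the local queue
        double      memoryPressure; // e.g. swap or page-fault rate seen locally
    };

    // Action Rhino sends back down to a Hare.
    enum class RhinoAction {
        None,    // keep going as-is
        Pause,   // pause a job on this machine
        Migrate, // move a job to a less loaded machine
    };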
The answer is that we’ll write a tool with Pin, a dynamic instrumentation framework. Describe diagram: as you can see from the diagram, and from this command up here, Pin is kind of like a miniature virtual machine. It takes in a pintool and the program binary, and runs the program in the context of Pin, inserting new code into the application as it runs, using the tool as the instructions for what code to execute and where to insert it. For example, a pintool to count the number of instructions executed in a program could insert code to increment a variable before every instruction. There are several points at which instrumentation can be introduced; our pintool uses routine-level and instruction-level instrumentation.
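(For reference: a minimal instruction-counting pintool along the lines of the example just described. This is essentially the standard Pin sample, not our actual tool; it would be invoked roughly as "pin -t inscount.so -- ./app".)

    #include "pin.H"
    #include <iostream>

    static UINT64 icount = 0;

    // Analysis routine: executed before every instruction in the application.
    VOID docount() { icount++; }

    // Instrumentation routine: tells Pin where to insert the analysis call.
    VOID Instruction(INS ins, VOID *v) {
        INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)docount, IARG_END);
    }

    // Called when the application exits.
    VOID Fini(INT32 code, VOID *v) {
        std::cerr << "Instructions executed: " << icount << std::endl;
    }

    int main(int argc, char *argv[]) {
        PIN_Init(argc, argv);                      // initialize Pin
        INS_AddInstrumentFunction(Instruction, 0); // instruction-level instrumentation
        PIN_AddFiniFunction(Fini, 0);
        PIN_StartProgram();                        // never returns
        return 0;
    }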
So we’ve established that Pin is a tool for what we want to do: dynamically instrument applications. But what code do we want to insert? What are we looking to get from our pintool? Since we are trying to detect and avoid memory contention between processes, it makes sense to study the memory behavior of the applications. To this end, we chose three things (describe them). The figure to the side shows how the pintool fits into our overall plan: it collects information for each application and reports the results to Hare, the local scheduler. Then Hare, which is also monitoring the memory subsystem of the local machine, reports to Rhino, and Rhino decides what to do.
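(For reference: the slide doesn’t spell out the three metrics, so this is only an illustrative memory-focused variation of the previous sketch, counting memory reads and writes, not necessarily the exact measurements we collect.)

    #include "pin.H"
    #include <iostream>

    static UINT64 memReads = 0, memWrites = 0;

    VOID CountRead()  { memReads++;  }
    VOID CountWrite() { memWrites++; }

    // Insert a counting call before every memory read and write instruction.
    VOID Instruction(INS ins, VOID *v) {
        if (INS_IsMemoryRead(ins))
            INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)CountRead, IARG_END);
        if (INS_IsMemoryWrite(ins))
            INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)CountWrite, IARG_END);
    }

    VOID Fini(INT32 code, VOID *v) {
        std::cerr << "reads: " << memReads << " writes: " << memWrites << std::endl;
    }

    int main(int argc, char *argv[]) {
        PIN_Init(argc, argv);
        INS_AddInstrumentFunction(Instruction, 0);
        PIN_AddFiniFunction(Fini, 0);
        PIN_StartProgram();
        return 0;
    }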
Considering our motivation, it was important to evaluate the system on a somewhat realistic workload. Since it seems that most long-running jobs on clusters are scientific applications, we wanted to use real scientific benchmarks. Describe benchmarks. To evaluate the scheduler, we measured the total runtime of groups of 100 jobs. We varied the parameters to the heatedplate program (dataset size and number of iterations) in order to vary the length of the jobs, and produced a set of jobs on a curve: a great many short-running jobs with a few long-running ones. Past work indicates that this is a common job-submission pattern in batch systems. Then, to evaluate our pintool, we measured the overhead of running each application under the pintool, and also tracked the information we collected over time to see if we could correlate it with interesting behavior or differences between programs.
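(For reference: one hypothetical way to generate that kind of “many short, few long” job mix by varying the heatedplate parameters. The distribution and parameter ranges below are purely illustrative, not the ones we actually used.)

    #include <algorithm>
    #include <cstdio>
    #include <random>

    int main() {
        std::mt19937 rng(42);
        // Heavy-tailed sample: most draws are small, a few are large.
        std::exponential_distribution<double> lengthFactor(2.0);
        for (int i = 0; i < 100; ++i) {
            double f = std::min(lengthFactor(rng), 4.0);       // cap the tail
            int size       = 64  + static_cast<int>(f * 256);  // dataset size (hypothetical range)
            int iterations = 100 + static_cast<int>(f * 5000); // iteration count (hypothetical range)
            std::printf("heatedplate %d %d\n", size, iterations);
        }
        return 0;
    }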
So here are our results from evaluating the distributed scheduler by itself. The good news is that we saw potential for improvement: just from using a simple policy to react to the presence of memory contention, the total runtime goes down. We might be able to get even better results on long-running jobs with better information about the running processes (like we could get from dynamic instrumentation!). So if you’re wondering why we’re showing you results for our scheduler with this simple policy, but not with our whole system including application-specific information… well, that brings me to The Bad.
Although our scheduler works perfectly well with the pintool, we discovered that the overhead introduced by Pin is just too much. Some of our overhead results are below: we show the time to run the application natively, with Pin alone (no pintool), with a tool that only counts instructions, and with our three metrics. The way we originally hoped to solve the overhead problem was to instrument only when we needed to, like when the scheduler decided the machine was performing badly. Then the relatively high overhead of running the analysis wouldn’t have much of an impact overall. However, we were unable to get the performance gains we hoped for: Pin doesn’t offer the ability to completely attach and detach from a running program, only to attach, and when we tried to add and remove instrumentation dynamically we discovered that we lost the gains from code caching. So while this idea could work with another system or with a future version of Pin, we couldn’t manage to bring the overhead down.
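(For reference: a rough sketch of the “instrument only when needed” idea. The toggle hook here is hypothetical; the relevant point is that changing instrumentation on the fly means calling PIN_RemoveInstrumentation(), which discards Pin’s previously generated code, and that is where the code-caching gains went.)

    #include "pin.H"

    static volatile bool gCollect = false;  // flipped when the scheduler wants data
    static UINT64 memOps = 0;

    VOID CountMemOp() { memOps++; }

    // Only insert analysis calls while collection is enabled.
    VOID Instruction(INS ins, VOID *v) {
        if (!gCollect) return;
        if (INS_IsMemoryRead(ins) || INS_IsMemoryWrite(ins))
            INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)CountMemOp, IARG_END);
    }

    // Hypothetical hook (e.g. triggered by a request from Hare): toggling forces
    // Pin to throw away cached translated code so the new instrumentation takes
    // effect, losing the benefit of the code cache each time.
    VOID SetCollect(bool on) {
        gCollect = on;
        PIN_RemoveInstrumentation();
    }

    int main(int argc, char *argv[]) {
        PIN_Init(argc, argv);
        INS_AddInstrumentFunction(Instruction, 0);
        PIN_StartProgram();
        return 0;
    }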
But on the bright side, we were able to collect some interesting information. This figure shows the variation over time of our memory-instruction measurements: the change in the number of memory instructions executed per window, which is why some of the values are negative. Note how similar the patterns of LU and heatedplate are; talk about how that’s probably because they are tightly looped and very repetitive, whereas Ocean is clearly performing a more irregular and complex analysis with some possibly distinct phases. Mention the possibility of using the variation in a metric like this to “predict the predictability”: to separate applications that are better left alone from those that are more likely to be safely handled by common heuristics, etc.
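(For reference: the plotted quantity is just each window’s memory-instruction count minus the previous window’s count. A small sketch of that computation, assuming the per-window counts have already been collected:)

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // For each fixed-size window we have a memory-instruction count; the metric
    // plotted is the change from the previous window, so values can go negative.
    std::vector<int64_t> WindowedDeltas(const std::vector<uint64_t>& perWindowCounts) {
        std::vector<int64_t> deltas;
        for (std::size_t i = 1; i < perWindowCounts.size(); ++i)
            deltas.push_back(static_cast<int64_t>(perWindowCounts[i]) -
                             static_cast<int64_t>(perWindowCounts[i - 1]));
        return deltas;
    }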