Monitoring
Considerations
Monitorama, 2013
John Allspaw
SVP, Technical Operations
Sunday, August 4, 13
I want to warn you that I will lift references from various sources this morning, and I’ll make
sure to point to those further readings I’ll touch on when I post slides.
You can feel free to view those readings as HOMEWORK. Unsurprisingly to anyone who knows
me, a large number of them will be in the field of Human Factors and Safety.
WHO HERE HAS EVER WRITTEN MONITORING SOFTWARE? (alerts, dashboards, graphs, metrics
collection, analysis, display, etc.)
“In the long term, Operations as a science
needs to be elevated.”
Chris Brown
Velocity London, 2012
We are at an interesting time in our field.
We are still naive.
We express indignation in terse remarks about our challenges.
We also believe that certainty is something we can attain through the use of technology alone.
This makes the field of web engineering as a whole ADORABLE.
Dr. Richard Cook, Velocity US 2012
http://www.youtube.com/watch?v=R_PDc0HFdP0
Dr. Cook explains how the research done in Human Factors and Systems Safety is highly relevant to the
operation of web infrastructures.
“Anytime you find a world in which you have high consequences, high-tempo operations, time pressure, and
lots of complexity...and people are called upon to manage that, you’re going to have these kinds of issues
arise.”
Aviation, patient safety, military, power generation and distribution, space travel, etc.: these fields
are attractive because we see something in them that is familiar.
While we have an opportunity to take ADVANTAGE of LESSONS LEARNED in other fields of
high-tempo/complexity/consequences, it behooves us to think on how we are DIFFERENT
from the other fields.
We also have an opportunity to SIDESTEP some of the quagmires those fields have found
themselves in.
This talk is a tiny effort in this direction.
LANGUAGE
In order to support this, I will argue that we need to start paying attention to our language.
1. OTHER DOMAINS ALREADY HAVE A LEXICON, WE CAN BORROW SOME TERMS FROM THEM
2. How we discuss our challenges can play a very large role in how we surmount them.
There are a number of concepts, words, and ideas that need to enter our lexicon, especially
when it comes to monitoring and the challenges that come with making sense of where, what,
how, and why complex systems behave.
BETTER
QUESTIONS
One of the OTHER things that has become clear to me is that as a field, we need to ask
BETTER QUESTIONS instead of quickly jumping to CORRECT ANSWERS or SOLUTIONS.
ASKING TERRIBLE QUESTIONS WILL GUARANTEE TERRIBLE SOLUTIONS.
I’m increasingly convinced that the road to progress on such a broad and complicated topic
as monitoring is paved with BETTER QUESTIONS, not NEWER TOOLS.
So you may hear me asking some questions today.
They may or may not be good questions, but I’ll take a stab at it anyway.
DOWN and IN
“Down and In”
As the years go by and we see the continued decline of storage prices, the explosion of
accessible processing power, we have an ever-expanding ability to zoom in deeply to the
ways servers and services talk to each other and process information.
WE CAN ZOOM IN ON THE RELATIONSHIPS and BEHAVIORS of SEEMINGLY DISPARATE PIECES
OF DATA...
... AND WE CAN DISCOVER AND DETECT DISRUPTIONS IN SOMETIMES SURPRISING PLACES.
THIS IS INTERESTING.
BUT IT IS ALSO WOEFULLY INCOMPLETE IF WE ARE TO MAKE ANY PROGRESS IN OPERATIONS.
UP and OUT
...it is INCOMPLETE because as we ZOOM OUT, what we find is a much-ignored environment
which includes one of the most powerful CONTEXT-SENSITIVE and INCREDIBLY ADAPTIVE
anomaly detection and response agents in the world:
HUMANS
Do we have ANOMALY DETECTION problems? Certainly. One can argue (I will, if you’d like,
later at the bar) that we will ALWAYS have them.
BUT: What I’m interested in is NOT how software can be used to detect anomalies
automatically.
(well, I’m interested, but I don’t doubt that you all will continue to get better at it)
... It is how people navigate this boundary between themselves and the machines they work
with.
The BOUNDARY between humans and machines, as we observe our use of tools, is a focus IN
and OF ITSELF.
If we have any hope of making progress in monitoring complex systems, we must take this
boundary into account.
BUT ABOUT HUMANS: A couple of observations with respect to tools and monitoring
in general.
1. We don’t use a single tool to gain insight into the architectures we build. And we
will not.
2. Teams of people are the NORM, which means communication and coordination
become as important (if not more important) than surfacing anomalies themselves.
3. We bring our BIASES, EXPECTATIONS, TRUST, and PERCEPTIONS to the table. No
tool or piece of automation will change that.
4. Understanding the breakdowns at these boundaries between people and machines
should be a part of how we approach design of tools and organizational behaviors.
LESS CODE
MORE PSYCHOLOGY
SPECIFICALLY:
ALGORITHMS ALONE WILL NOT DELIVER US TO A BETTER AND SAFER PLACE.
OODA Loop
Observe Orient Decide Act
credit: http://blog.b3k.us/ooda.html
WHO IS FAMILIAR WITH Col. John Boyd’s OODA Loop?
Observation and orientation are places where we can look to make progress.
When we get alerted, look at dashboards, graphs and logs, we’re looking to make sense of
the past and project into the future.
NOTE: Observe and Orient are not Unix commands, they are HUMAN ACTIVITIES.
We need to understand how people make sense of
what is going on
SO: Writing code to TELL COMPUTERS WHAT TO LOOK AT is quite different than making sure
that the code’s human supervisors are equipped or aided in what to look at when an alert
goes off.
How people make sense of what is going on (in diagnosis? In planning? In response? In
control?) is just plain HARD.
We need to understand how normal
work is getting done by normal people
in normal situations.
If we don’t understand how people consume, adapt to, work around, and make use of tools
under “normal” operating conditions, how can we have confidence that our designs will
perform under uncertain or escalating scenarios?
Work As Imagined
Work As Done
Our clues on how we THINK we work guide our design decisions.
But there is a gap between how we think we work, and how we actually work.
How large is this gap? How will we know when it’s too large?
Where is design?
“The system should therefore be designed so
that human adaptation is ENHANCED.”
Erik Hollnagel
Expertise and Technology: Cognition & Human-Computer
Cooperation, 1995
Design thought should be in tools, displays, controls, and processes.
What do we have to work with, though?
“It is the expertise of the human operator
that makes it possible to adapt the
performance of the joint system, in real
time, to unexpected events and
disturbances. Every working day, across the
whole spectrum of human enterprise, a large
number of near-misses are prevented from
turning into accidents only because human
operators intervene...
Whether we know it or not, we are ALL designers now, if we build tools intended to aid
monitoring.
I’m not just talking about UI and garden-variety HCI work, but those topics should be
considered table stakes.
Where is design?
http://www.perceptualedge.com/articles/visual_business_intelligence/
time_on_the_horizon.pdf
VISUAL PERCEPTIONS and UI approaches are integral to our field, so we should try to
understand them as deeply as we can.
Armed with the knowledge that every element of design can (and will) be misused (like these
Horizon Graphs), we are left with a dilemma:
How can we understand what can augment human capabilities without getting in the way,
and without having to first restart our careers as Human Factors experts?
WE FAKE IT UNTIL WE MAKE IT
http://www.perceptualedge.com/articles/Whitepapers/Dashboard_Design.pdf
Salience
For example, this illustration of the concept of SALIENCE, or the
“quality of an item that stands out relative to neighboring items”,
comes from a great whitepaper called
“Dashboard Design for Real-Time Situation Awareness” by Stephen
Few.
So SALIENCE is an important quality.
Principles of Display Design
• Principle of information need
• Principle of legibility
• Principle of display integration/proximity
• Principle of pictorial realism
• Principle of the moving part
• Principle of predictive aiding
• Principle of discriminability: status versus command
Wickens, Lee, Liu, Becker
An Introduction to Human Factors Engineering
Here is another great pointer on display design, from “AN INTRODUCTION TO HUMAN
FACTORS ENGINEERING”.
Cognition In The Wild
“It is notoriously difficult to generalize
laboratory findings to real-world
situations.”
So let’s leave design for a moment and talk about how we can VALIDATE our design choices.
We CANNOT hope to understand how people behave in real-world scenarios BY USING OUR
IMAGINATION alone.
How many of you work at a company where funnel or clickstream analysis is being done?
How many of you have done clickstream or funnel analysis on your monitoring dashboards, graphs,
and displays?
What sort of information might we find when we gather data on how people navigate metric data
during varying scenarios?
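As one rough way to start answering that question, here is a minimal sketch that counts which dashboards people move between. The log format, user names, and dashboard names are all invented for illustration; a real navigation log would look different.

```python
from collections import Counter

# Hypothetical navigation log: (user, dashboard viewed), in time order.
# The names and the log format are assumptions made for this sketch.
views = [
    ("ana", "overview"), ("ana", "db-latency"), ("ana", "db-slow-queries"),
    ("raj", "overview"), ("raj", "web-errors"), ("raj", "db-latency"),
    ("ana", "overview"), ("ana", "db-latency"),
]

def transition_counts(log):
    """Count how often each user moves from one dashboard to the next."""
    last_seen = {}
    counts = Counter()
    for user, dashboard in log:
        previous = last_seen.get(user)
        if previous is not None and previous != dashboard:
            counts[(previous, dashboard)] += 1
        last_seen[user] = dashboard
    return counts

transitions = transition_counts(views)
for (src, dst), n in transitions.most_common():
    print(f"{src} -> {dst}: {n}")
```

Comparing these counts between quiet days and outages is one possible way to see how navigation changes under pressure.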
ALERT
DESIGN
- Who has ever gotten a page and ignored it?
Endsley: At a safety expert conference, in a 300-person hall, only 3 people got up for a fire alarm.
- How many alerts were received in the past week that were not actionable? (no human action was
required?)
- How many alerts were received in the past week as a result of known work being done, but alerts
were not silenced during that period?
- How many alerts were received as a result of a previously silenced alert (because work was being
done) that was mistakenly un-silenced?
Jack Garman
Flight controller
NASA Mission Control
Apollo Program (Murray and Cox 1990)
“A program alarm could be triggered by trivial problems that
could be ignored altogether.
Or it could be triggered by problems that called for an immediate
abort.
How to decide which was which?
"We wrote ourselves little rules like
'If this alarm happens and it only happens once, don't worry
about it. If it happens repeatedly, but other indicators are okay,
don't worry about it.'"
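Written down as code, the "little rules" in that quote might look like the sketch below. The function name and arguments are hypothetical, and the quote only says when NOT to worry, so treating every other case as worth a human's attention is an assumption of this sketch.

```python
def worth_worrying(alarm_count, other_indicators_ok):
    """Encode the quoted rule of thumb; True means a human should look.

    Assumption: any case the quote does not cover is treated as worrying.
    """
    if alarm_count <= 1:
        return False  # "it only happens once, don't worry about it"
    if other_indicators_ok:
        return False  # "repeatedly, but other indicators are okay"
    return True

print(worth_worrying(1, other_indicators_ok=False))
print(worth_worrying(5, other_indicators_ok=True))
print(worth_worrying(5, other_indicators_ok=False))
```

Note how much context (how often, what else is going on) the operators had to supply themselves.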
Operator, interviewed.
The Three Mile Island
nuclear power plant, following the
accident. (Kemeny 1979)
“I would have liked to have thrown away the alarm panel. It
wasn't giving us any useful information.”
Comment by one operator at the Three Mile Island nuclear
power plant
to the official inquiry following the TMI accident (Kemeny 1979).
Physician, explaining how they
respond to a nuisance alarm on a
device in the operating room.
(Cook, Potter, Woods and McDonald 1991)
“When the alarm kept going off then we kept shutting it [the
device] off [and on] and when the alarm would go off [again],
we’d shut it off.”
“... so I just reset it [a device control] to a higher temperature. So
I kinda fooled it [the alarm]...”
SIGNAL
DETECTION
THEORY
Signal Detection Theory
- Too sensitive, and you’ll get false alarms
- Not sensitive enough, and you’ll get missed alarms
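A small sketch of that trade-off, using made-up readings that were labelled after the fact. The numbers, the thresholds, and the function name are assumptions for illustration only.

```python
# Hypothetical metric readings: (value, was_there_really_a_problem).
readings = [(35, False), (48, False), (52, False), (61, True),
            (66, False), (72, True), (80, True), (91, True)]

def alarm_outcomes(readings, threshold):
    """Return (false_alarms, missed_alarms) for an alert that fires
    whenever value >= threshold."""
    false_alarms = sum(1 for value, problem in readings
                       if value >= threshold and not problem)
    missed = sum(1 for value, problem in readings
                 if value < threshold and problem)
    return false_alarms, missed

for threshold in (40, 65, 85):
    fa, miss = alarm_outcomes(readings, threshold)
    print(f"threshold={threshold}: false alarms={fa}, missed alarms={miss}")
```

With this toy data, no single threshold removes both kinds of error: lowering it trades missed alarms for false alarms, and raising it does the reverse.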
ALERT DESIGN
Mica Endsley
Designing for Situation Awareness
What about the context people are in when they
experience a FALSE ALERT?
Or a MISSED ALERT?
[Figure: the cognitive processing of an alarm signal. Labels: Alarm Signal, Interpretation, Integration, Other Situational Information, Expectancies, Past History, Mental Model, Decision, Response. Source: Designing for Situation Awareness, Mica Endsley]
The cognitive processing of an alarm signal.
When we DESIGN ALERTS, we HAVE to think about the
various ways that the ALERT could be interpreted or
acted on. Often times, we will PUNT on aiding the
operator with CONTEXT.
Critical Care & Anesthesiology
• Monitors & alarms designed to “never miss”
• 566 deaths reported related to alarms
(2005-2008)
• Most associated with the silencing function
• ECRI’s #1 health technology hazard, 2012 & 2013
And you have complaints about Nagios’ “set downtime” feature?
The Emergency Care Research Institute (ECRI) recently
identified alarms as the “number one health technology hazard”
for 2012.
ALERT DESIGN
Confirmation
- Because false alarms are a problem, people will spend time not
reacting to an alert, but confirming that the alert is legit.
- Pilots delay responding to GPWS (Ground Proximity Warning
System) 73% of the time, because they’re looking out the window
to confirm it’s true, and how true it is.
What are ways we can SUPPORT CONFIRMATION or
VALIDATION in our alert design?
ALERT DESIGN
Expectancy
- People’s expectancies can also affect their interpretation of alerts.
- In many cases, people EXPECT the alert to go off, as the result of their own actions.
- In a study in 2001, 6% of operating room alarms were found to be expected or anticipated.
- This can become a nuisance, and further degrade the trust in the alerts.
- Example: disk space alerts that happen during a backup, and then recover.
- Example: someone on the team doing work, and not silencing the alerts temporarily.
BONUS: when the time period for which an alert was silenced passes, and the condition isn’t acceptable yet
(downtime expiring).
What are ways that we could SUPPORT EXPECTANCY in our alert design?
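One possible answer, sketched under heavy assumptions: instead of silencing an alert outright, the alert could be annotated with the planned work that explains it. The record format, field names, and hosts below are hypothetical.

```python
from datetime import datetime

# Hypothetical record of planned work; names and fields are assumptions.
planned_work = [
    {"check": "disk_space", "host": "db1", "reason": "nightly backup",
     "start": datetime(2013, 8, 4, 2, 0), "end": datetime(2013, 8, 4, 3, 0)},
]

def annotate_expectancy(alert, planned_work):
    """Return a copy of the alert with an 'expected_because' note
    when it matches a planned-work window, else None for that field."""
    for work in planned_work:
        if (alert["check"] == work["check"]
                and alert["host"] == work["host"]
                and work["start"] <= alert["fired_at"] <= work["end"]):
            return {**alert, "expected_because": work["reason"]}
    return {**alert, "expected_because": None}

during = annotate_expectancy(
    {"check": "disk_space", "host": "db1",
     "fired_at": datetime(2013, 8, 4, 2, 30)}, planned_work)
after = annotate_expectancy(
    {"check": "disk_space", "host": "db1",
     "fired_at": datetime(2013, 8, 4, 4, 0)}, planned_work)
print(during["expected_because"], after["expected_because"])
```

The second alert fires after the window has passed, which is the "downtime expiring" case: it no longer carries an explanation, so it should read as unexpected.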
ALERT DESIGN
• Signal:Noise can be difficult
• Easy to err on more false alarms
• Decay in trust
• Origins: Undetectable conditions
- Signal:Noise can be difficult to get right
- General view: err on the side of too many false alarms. This ignores the detrimental effect
of them on humans.
- Study in 1998 said: New ATC systems, missed alerts at 0.2%, false alarm rates at 65%.
- Underlying false alerts: not the functioning of algorithms themselves, but the CONDITIONS
AND FACTORS THAT THE ALARM SYSTEMS CANNOT DETECT OR INTERPRET
Ex: Cincinnati Airport: the terrain rises along a riverbank leading up to a runway, which triggers an alarm
because the system can’t detect that the terrain plateaus at the runway. Pilots familiar with
the airport ignore the alarms.
Information is not a
scarce resource.
Attention is.
Herb Simon, 1991
http://csel.eng.ohio-state.edu/productions/woodscta/media/diagnosis.pdf
Directed Attention
• Attention focusing
• Attention switching
• Dynamic Prioritization
We work in a COGNITIVELY NOISY WORLD, even when there is NOT an outage going on.
Alerts are ESSENTIALLY ATTENTION DIRECTORS.
The main challenge for DYNAMIC FAULT MANAGEMENT (HF term) in design is to support:
- ATTENTION FOCUSING
- ATTENTION SWITCHING
- DYNAMIC PRIORITIZATION
By getting to know how human attention works (and its relationship to context, perception,
etc.), we can hope to design better alerts.
Interrupts AND
Underspecification
1. “Here is the data I want you to see”
2. “Here is why I think you would find it interesting”
An alert is essentially an INTERRUPT.
TWO STATES:
1 - HERE IS THE DATA I WANT YOU TO SEE
2 - HERE IS WHY I THINK YOU WOULD FIND IT INTERESTING
What can we do to support #2?
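As a sketch of what supporting #2 could look like, here is an alert that carries its "why" alongside its data. The class, its fields, and the example values are all hypothetical, not any real tool's format.

```python
from dataclasses import dataclass, field

@dataclass
class Alert:
    # Part 1: "Here is the data I want you to see"
    metric: str
    value: float
    # Part 2: "Here is why I think you would find it interesting"
    why: str = ""
    related: list = field(default_factory=list)

    def render(self):
        """Format the alert so a missing 'why' is visible, not silent."""
        lines = [f"{self.metric} = {self.value}"]
        lines.append(f"why: {self.why}" if self.why else "why: (not specified)")
        lines.extend(f"see also: {item}" for item in self.related)
        return "\n".join(lines)

bare = Alert("checkout_errors_per_min", 42.0)
explained = Alert(
    "checkout_errors_per_min", 42.0,
    why="3x the usual rate for this hour; began 2 min after a deploy",
    related=["deploy log", "payment gateway latency graph"],
)
print(bare.render())
print(explained.render())
```

The design choice here is only that an underspecified alert announces that it is underspecified, rather than leaving the reader to guess.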
Paradox
Of
Directed Attention
An alert is essentially an interruption to everyday work, and there is a paradox at the heart of
DIRECTED ATTENTION.
1. We are always busy!
2. Shifting attention has a very real cost!
3. Not all signals are worth paying attention to; context-sensitivity will always vary
4. So how can you SKILLFULLY IGNORE a SIGNAL that should NOT SHIFT YOUR ATTENTION
WITHOUT first processing it... IN WHICH CASE IT HASN’T BEEN IGNORED.
“Given that the supervisory agent is loaded by various other task related demands, how does one
interpret information about the potential need to switch attentional focus without interrupting or
interfering with the tasks or lines of reasoning already under attentional control. We can state this
paradox in another way: how can one skillfully ignore a signal that should not shift attention within
the current context, without first processing it -- in which case it hasn't been ignored.” - David
Woods
David Woods has suggested some ways to break this paradox; he calls the idea PREATTENTIVE
REFERENCE.
I’ll let you discover his suggestions on your own.
Directed Attention
Sorting through
an avalanche of
data
Picking up on
subtle early
indications of a
fault
This idea of an alert DIRECTING OUR ATTENTION can exist in two views:
SORTING THROUGH AN AVALANCHE or PICKING UP SUBTLE/EARLY INDICATIONS...
So... which is it?
IT CAN BE BOTH!
“The critical point is that the challenge of fault management lies in sorting through an avalanche of raw data -- a data overload problem. This
is in contrast to the view that the performance bottleneck is the difficulty of picking up subtle early indications of a fault against the
background of a quiescent monitored process.”
Context Sensitivity
The background and context in which a SIGNAL arrives can play a huge role in how it can
HELP or HINDER us.
If the background is one of QUIET, contrast is HIGH. <- this is what most designers plan for
If the background is ONGOING DIAGNOSIS, then the SIGNAL can SUPPORT/CONTRADICT an existing
hypothesis.
If the background is EXECUTING A RESPONSE, then the SIGNAL can cue that the RESPONSE is WRONG
or INCOMPLETE.
In any case, the ALERT’s MEANING will change as CONTEXT and BACKGROUND changes.
Data Overload
This is simply a tough problem.
There are approaches to solve it, but none of them to date are effective given the rate at
which new pieces of data are being collected and stored.
There is significant agreement among those who study data overload phenomena that the
critical piece to understand is CONTEXT SENSITIVITY.
Some HF researchers have pointed at something that may help reduce the effects of data overload:
Depicting RELATIONSHIPS between data in a known FRAME of REFERENCE, as opposed to the
raw data.
What can we do as designers to aid surfacing those relationships?
How have I taken the
OPERATOR into account?
PEOPLE use monitoring tools.
Arguably, MACHINES use monitoring tools we build, as well.
But only PEOPLE can adapt and improvise with a given tool outside of the original intentions
of its designer.
Am I hurting or helping:
•Data overload or underload?
•Salience?
•Directed attention?
•Interruptibility?
When we design alerts and monitoring tools, we should be asking these questions.
In addition: HOW WILL WE KNOW WHEN THIS DESIGN WOULD HURT those things?
Joint Cognitive Systems
One final thought: what if, instead of the view that the BOUNDARY is a large barrier to be
hurdled only by our writing increasingly complex code...we view that boundary as a place for
an actual cooperative RELATIONSHIP?
Joint Cognitive Systems
What if we viewed an alerting system
as a PARTNER, instead of a subordinate?
What if we viewed alerting systems as PARTNERS, instead of subordinates or otherwise
dumb messengers delivering news to us?
What does the world look like if we designed alerts to COOPERATE with us?
If TRUST in alerting systems is such a big deal....
WHAT can we learn from how HUMANS learn to trust each other, and let that influence our
design decisions?
In other words: how can we design alerts that SUPPORT our confirming their legitimacy, or
our expectations when an alert will fire? Is context-sensitivity part of this?
We see some blunt versions of these notions:
1 - Time periods for alerts, so that people aren’t woken up for things that can wait until
morning (the machine has been given some context about our availability to pay attention to
an alert)
2 - Rough dependency relationships, so we don’t send a bazillion alerts when a known SPOF
dies
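Those two blunt notions can be sketched together in a few lines. The dependency map, check names, and quiet hours below are hypothetical; this is not how any particular monitoring tool implements them.

```python
from datetime import time

# Hypothetical dependency map: each check names the check it depends on.
depends_on = {"web1_http": "core_switch", "web2_http": "core_switch",
              "core_switch": None}

def should_page(check, failing, urgent, now,
                quiet_start=time(22, 0), quiet_end=time(8, 0)):
    """Decide whether to page a human for a failing check."""
    parent = depends_on.get(check)
    if parent is not None and parent in failing:
        return False  # the known single point of failure pages instead
    # Quiet hours wrap around midnight (22:00 to 08:00 by default).
    in_quiet_hours = now >= quiet_start or now < quiet_end
    if in_quiet_hours and not urgent:
        return False  # this can wait until morning
    return True

failing = {"core_switch", "web1_http", "web2_http"}
pages = [c for c in sorted(failing)
         if should_page(c, failing, urgent=True, now=time(3, 0))]
print(pages)
```

Both rules hand the machine a little context about US: when we can pay attention, and what we already know is broken.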
What other examples can we think of, where the COMPUTERS can attempt to understand,
predict, or observe US, as we work?
The End
My hope is that I’ve been able to ask BETTER QUESTIONS, and I can kick off this conference
with food for thought.
You can tell me how that food tastes at the bar later.
Can We Ever Escape From Data
Overload?
A Cognitive Systems Diagnosis
Woods, Patterson, Roth 1999
http://csel.eng.ohio-state.edu/productions/woodscta/media/diagnosis.pdf
The Alarm Problem and
Directed Attention in
Dynamic Fault Management
Woods 1995
http://csel.eng.ohio-state.edu/woods/foundations/directed%20att.pdf
Sunday, August 4, 13
http://csel.eng.ohio-state.edu/productions/woodscta/media/diagnosis.pdf
Sunday, August 4, 13

More Related Content

What's hot

Scheduling a process in oracle fusion
Scheduling a process in oracle fusionScheduling a process in oracle fusion
Scheduling a process in oracle fusionFeras Ahmad
 
ISO 29119 -The new international software testing standards
ISO 29119 -The new international software testing standardsISO 29119 -The new international software testing standards
ISO 29119 -The new international software testing standardsFareha Nadeem
 
Head first docker
Head first dockerHead first docker
Head first dockerHan Qin
 
Build CICD Pipeline for Container Presentation Slides
Build CICD Pipeline for Container Presentation SlidesBuild CICD Pipeline for Container Presentation Slides
Build CICD Pipeline for Container Presentation SlidesAmazon Web Services
 
Unit Tests And Automated Testing
Unit Tests And Automated TestingUnit Tests And Automated Testing
Unit Tests And Automated TestingLee Englestone
 
Jira as a Tool for Test Management
Jira as a Tool for Test ManagementJira as a Tool for Test Management
Jira as a Tool for Test ManagementMaija Laksa
 
Continuous integration
Continuous integrationContinuous integration
Continuous integrationhugo lu
 
Automating Deployment with Github and CodeDeploy
Automating Deployment with Github and CodeDeployAutomating Deployment with Github and CodeDeploy
Automating Deployment with Github and CodeDeployAmazon Web Services
 
Kubernetes in Higher Education
Kubernetes in Higher EducationKubernetes in Higher Education
Kubernetes in Higher Educationlaupow
 
Test-Driven Development
Test-Driven DevelopmentTest-Driven Development
Test-Driven DevelopmentJohn Blum
 
Introduction to Automation Testing
Introduction to Automation TestingIntroduction to Automation Testing
Introduction to Automation TestingArchana Krushnan
 
What We Learned from Porting PiggyMetrics from Spring Boot to MicroProfile
What We Learned from Porting PiggyMetrics from Spring Boot to MicroProfileWhat We Learned from Porting PiggyMetrics from Spring Boot to MicroProfile
What We Learned from Porting PiggyMetrics from Spring Boot to MicroProfileEd Burns
 

What's hot (20)

Scheduling a process in oracle fusion
Scheduling a process in oracle fusionScheduling a process in oracle fusion
Scheduling a process in oracle fusion
 
Integration Testing in Python
Integration Testing in PythonIntegration Testing in Python
Integration Testing in Python
 
ISO 29119 -The new international software testing standards
ISO 29119 -The new international software testing standardsISO 29119 -The new international software testing standards
ISO 29119 -The new international software testing standards
 
Head first docker
Head first dockerHead first docker
Head first docker
 
Testing Ansible
Testing AnsibleTesting Ansible
Testing Ansible
 
The Devops Handbook
The Devops HandbookThe Devops Handbook
The Devops Handbook
 
Build CICD Pipeline for Container Presentation Slides
Build CICD Pipeline for Container Presentation SlidesBuild CICD Pipeline for Container Presentation Slides
Build CICD Pipeline for Container Presentation Slides
 
Unit Tests And Automated Testing
Unit Tests And Automated TestingUnit Tests And Automated Testing
Unit Tests And Automated Testing
 
Jira as a Tool for Test Management
Jira as a Tool for Test ManagementJira as a Tool for Test Management
Jira as a Tool for Test Management
 
Continuous integration
Continuous integrationContinuous integration
Continuous integration
 
Automating Deployment with Github and CodeDeploy
Automating Deployment with Github and CodeDeployAutomating Deployment with Github and CodeDeploy
Automating Deployment with Github and CodeDeploy
 
Kubernetes in Higher Education
Kubernetes in Higher EducationKubernetes in Higher Education
Kubernetes in Higher Education
 
Jenkins Tutorial.pdf
Jenkins Tutorial.pdfJenkins Tutorial.pdf
Jenkins Tutorial.pdf
 
Test-Driven Development
Test-Driven DevelopmentTest-Driven Development
Test-Driven Development
 
Docker Ecosystem on Azure
Docker Ecosystem on AzureDocker Ecosystem on Azure
Docker Ecosystem on Azure
 
Introduction to Automation Testing
Introduction to Automation TestingIntroduction to Automation Testing
Introduction to Automation Testing
 
Oracle Payables Reference Guide
Oracle Payables Reference GuideOracle Payables Reference Guide
Oracle Payables Reference Guide
 
Automation Testing by Selenium Web Driver
Automation Testing by Selenium Web DriverAutomation Testing by Selenium Web Driver
Automation Testing by Selenium Web Driver
 
Migración Discoverer a Oracle BI
Migración Discoverer a Oracle BIMigración Discoverer a Oracle BI
Migración Discoverer a Oracle BI
 
What We Learned from Porting PiggyMetrics from Spring Boot to MicroProfile
What We Learned from Porting PiggyMetrics from Spring Boot to MicroProfileWhat We Learned from Porting PiggyMetrics from Spring Boot to MicroProfile
What We Learned from Porting PiggyMetrics from Spring Boot to MicroProfile
 

Viewers also liked

10+ Deploys Per Day: Dev and Ops Cooperation at Flickr
10+ Deploys Per Day: Dev and Ops Cooperation at Flickr10+ Deploys Per Day: Dev and Ops Cooperation at Flickr
10+ Deploys Per Day: Dev and Ops Cooperation at FlickrJohn Allspaw
 
Dev and Ops Collaboration and Awareness at Etsy and Flickr
Dev and Ops Collaboration and Awareness at Etsy and FlickrDev and Ops Collaboration and Awareness at Etsy and Flickr
Dev and Ops Collaboration and Awareness at Etsy and FlickrJohn Allspaw
 
Ops Meta-Metrics: The Currency You Pay For Change
Ops Meta-Metrics: The Currency You Pay For ChangeOps Meta-Metrics: The Currency You Pay For Change
Ops Meta-Metrics: The Currency You Pay For ChangeJohn Allspaw
 
Velocity EU 2012 Escalating Scenarios: Outage Handling Pitfalls
Velocity EU 2012 Escalating Scenarios: Outage Handling PitfallsVelocity EU 2012 Escalating Scenarios: Outage Handling Pitfalls
Velocity EU 2012 Escalating Scenarios: Outage Handling PitfallsJohn Allspaw
 
Ops Meta-Metrics: The Currency You Pay For Change
Ops Meta-Metrics: The Currency You Pay For ChangeOps Meta-Metrics: The Currency You Pay For Change
Ops Meta-Metrics: The Currency You Pay For ChangeJohn Allspaw
 
CitoEngine : Alert management and automation tool.
CitoEngine : Alert management and automation tool.CitoEngine : Alert management and automation tool.
CitoEngine : Alert management and automation tool.extremeunix
 
Contiuously Deploying Culture 2.0 - Agile Ísland
Contiuously Deploying Culture 2.0 - Agile ÍslandContiuously Deploying Culture 2.0 - Agile Ísland
Contiuously Deploying Culture 2.0 - Agile ÍslandRich Smith
 
Spur Infrastructure Performance With Proactive IT Monitoring
Spur Infrastructure Performance With Proactive IT MonitoringSpur Infrastructure Performance With Proactive IT Monitoring
Spur Infrastructure Performance With Proactive IT MonitoringCA Technologies
 
Puppet: What _not_ to do
Puppet: What _not_ to doPuppet: What _not_ to do
Puppet: What _not_ to doPuppet
 
Responding to Outages Maturely
Responding to Outages MaturelyResponding to Outages Maturely
Responding to Outages MaturelyJohn Allspaw
 
Automated Puppet Testing - PuppetCamp Chicago '12 - Scott Nottingham
Automated Puppet Testing - PuppetCamp Chicago '12 - Scott NottinghamAutomated Puppet Testing - PuppetCamp Chicago '12 - Scott Nottingham
Automated Puppet Testing - PuppetCamp Chicago '12 - Scott NottinghamPuppet
 
Metrics and Monitoring Infrastructure: Lessons Learned Building Metrics at Li...
Metrics and Monitoring Infrastructure: Lessons Learned Building Metrics at Li...Metrics and Monitoring Infrastructure: Lessons Learned Building Metrics at Li...
Metrics and Monitoring Infrastructure: Lessons Learned Building Metrics at Li...Grier Johnson
 
Continuous Deployment at Etsy — TimesOpen NYC
Continuous Deployment at Etsy — TimesOpen NYCContinuous Deployment at Etsy — TimesOpen NYC
Continuous Deployment at Etsy — TimesOpen NYCMike Brittain
 
Demystifying DevOps for Ops - Including Findings from the 2015 State of DevOp...
Demystifying DevOps for Ops - Including Findings from the 2015 State of DevOp...Demystifying DevOps for Ops - Including Findings from the 2015 State of DevOp...
Demystifying DevOps for Ops - Including Findings from the 2015 State of DevOp...Puppet
 
906702 Enhancing Business Processes Using Enterprise Information Systems
906702 Enhancing Business Processes Using Enterprise Information Systems906702 Enhancing Business Processes Using Enterprise Information Systems
906702 Enhancing Business Processes Using Enterprise Information Systemssiroros
 
How To Make Dev Ops Work @ Netlight Edge X Berlin
How To Make Dev Ops Work @ Netlight Edge X BerlinHow To Make Dev Ops Work @ Netlight Edge X Berlin
How To Make Dev Ops Work @ Netlight Edge X BerlinFerdinand von den Eichen
 
DevOps in the Amazon Cloud – Learn from the pioneersNetflix suro
DevOps in the Amazon Cloud – Learn from the pioneersNetflix suroDevOps in the Amazon Cloud – Learn from the pioneersNetflix suro
DevOps in the Amazon Cloud – Learn from the pioneersNetflix suroGaurav "GP" Pal
 
11. Huccet I Imaniye
11. Huccet I  Imaniye11. Huccet I  Imaniye
11. Huccet I ImaniyeAhmet Türkan
 

Considerations for Alert Design

  • 1. Monitoring Considerations Monitorama, 2013 John Allspaw SVP, Technical Operations Sunday, August 4, 13 I want to warn you that I will lift references from various sources this morning, and I’ll make sure to point to those further readings I’ll touch on when I post slides. You can feel free to view those readings as HOMEWORK. Unsurprisingly to anyone who knows me, a large amount of them will be in the field of Human Factors and Safety. WHO HERE HAS EVER WRITTEN MONITORING SOFTWARE? (alerts, dashboards, graphs, metrics collection, analysis, display, etc.)
  • 2. “In the long term, Operations as a science needs to be elevated.” Chris Brown, Velocity London, 2012. We are at an interesting time in our field. We are still naive. We express indignation in terse remarks about our challenges. We also believe that certainty is something we can attain through the use of technology alone. This makes the field of web engineering as a whole ADORABLE.
  • 3. Dr. Richard Cook, Velocity US 2012 http://www.youtube.com/watch?v=R_PDc0HFdP0 Dr. Cook explains how the research done in Human Factors and Systems Safety is relevant to the operation of web infrastructures. “Anytime you find a world in which you have high consequences, high-tempo operations, time pressure, and lots of complexity...and people are called upon to manage that, you’re going to have these kinds of issues arise.” Aviation, patient safety, military, power generation and distribution, space travel, etc.: these fields are attractive because we see something in them that is familiar. While we have an opportunity to take ADVANTAGE of LESSONS LEARNED in other fields of high-tempo/complexity/consequences, it behooves us to think about how we are DIFFERENT from the other fields. We also have an opportunity to SIDESTEP some of the quagmires those fields have found themselves in. This talk is a tiny effort in this direction.
  • 4. LANGUAGE In order to support this, I will argue that we need to start paying attention to our language. 1. OTHER DOMAINS ALREADY HAVE A LEXICON, WE CAN BORROW SOME TERMS FROM THEM 2. How we discuss our challenges can play a very large role in how we surmount them. There are a number of concepts, words, and ideas that need to enter our lexicon, especially when it comes to monitoring and the challenges that come with making sense of where, what, how, and why complex systems behave.
  • 5. BETTER QUESTIONS One of the OTHER things that has become clear to me is that as a field, we need to ask BETTER QUESTIONS instead of quickly jumping to CORRECT ANSWERS or SOLUTIONS. ASKING TERRIBLE QUESTIONS WILL GUARANTEE TERRIBLE SOLUTIONS. I’m increasingly convinced that the road to progress on such a broad and complicated topic as monitoring is paved with BETTER QUESTIONS, not NEWER TOOLS. So you may hear me asking some questions today. They may or may not be good questions, but I’ll take a stab at it anyway.
  • 6. DOWN and IN “Down and In” As the years go by and we see the continued decline of storage prices and the explosion of accessible processing power, we have an ever-expanding ability to zoom in deeply on the ways servers and services talk to each other and process information. WE CAN ZOOM IN ON THE RELATIONSHIPS and BEHAVIORS of SEEMINGLY DISPARATE PIECES OF DATA... ... AND WE CAN DISCOVER AND DETECT DISRUPTIONS IN SOMETIMES SURPRISING PLACES. THIS IS INTERESTING. BUT IT IS ALSO WOEFULLY INCOMPLETE IF WE ARE TO MAKE ANY PROGRESS IN OPERATIONS.
  • 7. UP and OUT ...it is INCOMPLETE because as we ZOOM OUT, what we find is a much-ignored environment which includes one of the most powerful CONTEXT-SENSITIVE and INCREDIBLY ADAPTIVE anomaly detection and response agents in the world: HUMANS
  • 8. Do we have ANOMALY DETECTION problems? Certainly. One can argue (I will, if you’d like, later at the bar) that we will ALWAYS have them. BUT: What I’m interested in is NOT how software can be used to detect anomalies automatically. (well, I’m interested, but I don’t doubt that you all will continue to get better at it)
  • 9. ... It is how people navigate this boundary between themselves and the machines they work with. The BOUNDARY between humans and machines, as we observe our use of tools, is a focus IN and OF ITSELF. If we have any hope of making progress in monitoring complex systems, we must take this boundary into account.
  • 10. BUT ABOUT HUMANS: A couple of observations with respect to tools and monitoring in general. 1. We don’t use a single tool to gain insight into the architectures we build. And we will not. 2. Teams of people are the NORM, which means communication and coordination become as important as (if not more important than) surfacing anomalies themselves. 3. We bring our BIASES, EXPECTATIONS, TRUST, and PERCEPTIONS to the table. No tool or piece of automation will change that. 4. Understanding the breakdowns at these boundaries between people and machines should be a part of how we approach the design of tools and organizational behaviors.
  • 11. LESS CODE MORE PSYCHOLOGY SPECIFICALLY: ALGORITHMS ALONE WILL NOT DELIVER US TO A BETTER AND SAFER PLACE.
  • 12. OODA Loop: Observe, Orient, Decide, Act (credit: http://blog.b3k.us/ooda.html) WHO IS FAMILIAR WITH Col. Boyd’s OODA Loop? Observation and orientation are places where we can look to make progress. When we get alerted, look at dashboards, graphs and logs, we’re looking to make sense of the past and project into the future. NOTE: Observe and Orient are not Unix commands, they are HUMAN ACTIVITIES.
  • 13. We need to understand how people make sense of what is going on. SO: Writing code to TELL COMPUTERS WHAT TO LOOK AT is quite different from making sure that the code’s human supervisors are equipped to know, or aided in knowing, what to look at when an alert goes off. How people make sense of what is going on (in diagnosis? In planning? In response? In control?) is just plain HARD.
  • 14. We need to understand how normal work is getting done by normal people in normal situations. If we don’t understand how people consume, adapt to, work around, and make use of tools under “normal” operating conditions, how can we have confidence that our designs will perform under uncertain or escalating scenarios?
  • 15. Work As Imagined, Work As Done. Our ideas about how we THINK we work guide our design decisions. But there is a gap between how we think we work, and how we actually work. How large is this gap? How will we know when it’s too large?
  • 16. Where is design? “The system should therefore be designed so that human adaptation is ENHANCED.” Erik Hollnagel, Expertise and Technology: Cognition & Human-Computer Cooperation, 1995. Design thought should be in tools, displays, controls, and processes. What do we have to work with, though? “It is the expertise of the human operator that makes it possible to adapt the performance of the joint system, in real time, to unexpected events and disturbances. Every working day, across the whole spectrum of human enterprise, a large number of near-misses are prevented from turning into accidents only because human operators intervene...
  • 17. Whether we know it or not, we are ALL designers now, if we build tools intended to aid monitoring. I’m not just talking about UI and garden-variety HCI work, but those topics should be considered table stakes.
  • 18. Where is design? http://www.perceptualedge.com/articles/visual_business_intelligence/time_on_the_horizon.pdf VISUAL PERCEPTION and UI approaches are integral to our field, so we should try to understand them as deeply as we can. Armed with the knowledge that every element of design can (and will) be misused (like these Horizon Graphs), we are left with a dilemma: How can we understand what can augment human capabilities without getting in the way, and without first having to restart our careers as Human Factors experts? WE FAKE IT UNTIL WE MAKE IT
  • 19. Salience http://www.perceptualedge.com/articles/Whitepapers/Dashboard_Design.pdf For example, this illustration of the concept of SALIENCE, or “quality of an item that stands out relative to neighboring items”, comes from a great whitepaper called “Dashboard Design for Real-Time Situation Awareness” by Stephen Few.
  • 21. Principles of Display Design • Principle of information need • Principle of legibility • Principle of display integration/proximity • Principle of pictorial realism • Principle of the moving part • Principle of predictive aiding • Principle of discriminability: status versus command Wickens, Lee, Liu, Becker An Introduction to Human Factors Engineering Sunday, August 4, 13 Here is another great pointer on display design, from “AN INTRODUCTION TO HUMAN FACTORS ENGINEERING”.
  • 22. Cognition In The Wild “It is notoriously difficult to generalize laboratory findings to real-world situations.” Sunday, August 4, 13 So let’s leave design for a moment and talk about how we can VALIDATE our design choices. We CANNOT hope to understand how people behave in real-world scenarios BY USING OUR IMAGINATION alone. How many of you work at a company where funnel or clickstream analysis is being done? How many of you have done clickstream or funnel analysis on your monitoring dashboards, graphs, and displays? What sort of information might we find when we gather data on how people navigate metric data during varying scenarios?
  • 23. ALERT DESIGN Sunday, August 4, 13 - Who has ever gotten a page and ignored it? Endsley: At a safety expert conference, in a 300-person hall, only 3 people got up for a fire alarm. - How many alerts were received in the past week that were not actionable? (no human action was required?) - How many alerts were received in the past week as a result of known work being done, but alerts were not silenced during that period? - How many alerts were received as a result of a previously silenced alert (because work was being done) that was mistakenly un-silenced?
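The questions above can be asked of real alert history. Here is a minimal sketch, assuming each past alert has been annotated by hand with a few flags; the `AlertRecord` fields and the `audit` function are hypothetical names for illustration, not part of any monitoring tool.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class AlertRecord:
    # All fields are hypothetical annotations added during a weekly review.
    name: str
    actionable: bool         # did a human actually need to do something?
    during_known_work: bool  # fired while planned work was under way
    was_silenced: bool       # a silence was in place when it fired


def audit(alerts):
    """Tally a week's alerts into the categories asked about above."""
    tally = Counter()
    for a in alerts:
        tally["total"] += 1
        if not a.actionable:
            tally["not_actionable"] += 1
        if a.during_known_work and not a.was_silenced:
            tally["known_work_not_silenced"] += 1
    return tally


week = [
    AlertRecord("disk_full", actionable=True, during_known_work=False, was_silenced=False),
    AlertRecord("disk_full", actionable=False, during_known_work=True, was_silenced=False),
    AlertRecord("cpu_high", actionable=False, during_known_work=False, was_silenced=False),
]
summary = audit(week)
print(dict(summary))
```

The counts matter less than the habit: the ratio of non-actionable alerts to total alerts is a rough proxy for how much trust the alerting system is spending.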
  • 24. Jack Garman Flight controller NASA Mission Control Apollo Program (Murray and Cox 1990) Sunday, August 4, 13 “A program alarm could be triggered by trivial problems that could be ignored altogether. Or it could be triggered by problems that called for an immediate abort. How to decide which was which? "We wrote ourselves little rules like 'If this alarm happens and it only happens once, don't worry about it. If it happens repeatedly, but other indicators are okay, don't worry about it.'"
  • 25. Operator, interviewed. The Three Mile Island nuclear power plant, following the accident. (Kemeny 1979) Sunday, August 4, 13 “I would have liked to have thrown away the alarm panel. It wasn't giving us any useful information." Comment by one operator at the Three Mile Island nuclear power plant to the official inquiry following the TMI accident (Kemeny 1979).
  • 26. Physician, explaining how they respond to a nuisance alarm on a device in the operating room. (Cook, Potter,Woods and McDonald 1991) Sunday, August 4, 13 "When the alarm kept going off then we kept shutting it [the device] off [and on] and when the alarm would go off [again], we’d shut it off.” “... so I just reset it [a device control] to a higher temperature. So I kinda fooled it [the alarm]...”
  • 27. SIGNAL DETECTION THEORY Sunday, August 4, 13 Signal Detection Theory - Too sensitive, and you’ll get false alarms - Not sensitive enough, and you’ll get missed alarms
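The trade-off can be made concrete with a small sketch. It assumes a single numeric metric, an alert that fires when the value meets a threshold, and hand-labeled ground truth for whether each reading was a real problem; the data and the `confusion_counts` name are invented for illustration.

```python
def confusion_counts(readings, threshold):
    """Count false alarms and missed alarms for a given alert threshold.

    readings: list of (value, is_real_problem) pairs.
    An alert fires when value >= threshold.
    """
    false_alarms = sum(1 for v, real in readings if v >= threshold and not real)
    misses = sum(1 for v, real in readings if v < threshold and real)
    return false_alarms, misses


# Invented sample data: metric value and whether it was a real problem.
readings = [
    (55, False), (62, False), (70, False), (74, True),
    (81, False), (88, True), (93, True),
]

for threshold in (60, 75, 90):
    false_alarms, misses = confusion_counts(readings, threshold)
    print(f"threshold={threshold}: false alarms={false_alarms}, misses={misses}")
```

Lowering the threshold trades misses for false alarms, and raising it does the reverse. No threshold on this data removes both, which is the point of the slide.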
  • 28. ALERT DESIGN Mica Endsley Designing for Situation Awareness Sunday, August 4, 13 What about the context people are in when they experience a FALSE ALERT? Or a MISSED ALERT?
  • 29. [Diagram: cognitive processing of an alarm signal. Labels: Alarm Signal, Interpretation, Integration, Other Situational Information, Expectancies, Past History, Mental Model, Decision, Response] Designing for Situation Awareness, Mica Endsley Sunday, August 4, 13 The cognitive processing of an alarm signal. When we DESIGN ALERTS, we HAVE to think about the various ways that the ALERT could be interpreted or acted on. Oftentimes, we will PUNT on aiding the operator with CONTEXT.
  • 30. Critical Care & Anesthesiology • Monitors & alarms designed to “never miss” • 566 deaths reported related to alarms (2005-2008) • Most associated with the silencing function • ECRI’s #1 health technology hazard, 2012 & 2013 And you have complaints about Nagios’ “set downtime” feature? Sunday, August 4, 13 The Emergency Care Research Institute (ECRI) recently identified alarms as the “number one health technology hazard” for 2012.
  • 31. ALERT DESIGN Confirmation Sunday, August 4, 13 - Because false alarms are a problem, people will spend time not reacting to an alert, but confirming that the alert is legit. - Pilots delay responding to GPWS (Ground Proximity Warning System) 73% of the time, because they’re looking out the window to confirm it’s true, and how true it is. What are ways we can SUPPORT CONFIRMATION or VALIDATION in our alert design?
  • 32. ALERT DESIGN Expectancy Sunday, August 4, 13 - People’s expectancies can also affect their interpretation of alerts. - In many cases, people EXPECT the alert to go off, as the result of their own actions. - In a study in 2001, 6% of operating room alarms were found to be expected or anticipated. - This can become a nuisance, and further degrade the trust in the alerts. - Example: disk space alerts that happen during a backup, and then recover. - Example: someone on the team doing work, and not silencing the alerts temporarily. BONUS: when the time period for which an alert is silenced passes, and the condition isn’t acceptable yet. (downtime expiring) What are ways that we could SUPPORT EXPECTANCY in our alert design?
  • 33. ALERT DESIGN • Signal:Noise can be difficult • Easy to err on more false alarms • Decay in trust • Origins: Undetectable conditions Sunday, August 4, 13 - Signal:Noise can be difficult to get right - General view: err on the side of too many false alarms. This ignores the detrimental effect of them on humans. - Study in 1998 said: New ATC systems, missed alerts at 0.2%, false alarm rates at 65%. - Underlying false alerts: not the functioning of algorithms themselves, but the CONDITIONS AND FACTORS THAT THE ALARM SYSTEMS CANNOT DETECT OR INTERPRET Ex: Cincinnati Airport - the terrain rises along a riverbank leading up to a runway, which causes an alarm because the system can’t detect that the terrain is going to plateau at the runway. Pilots familiar with the airport ignore the alarms.
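One possible way to support expectancy is to have the alert say whether it was expected. The sketch below assumes silences are recorded with a start and end time, and that an alert firing shortly after a silence ends is probably the "downtime expiring" case; the `Silence` type, the `classify` function, and the 600-second grace period are all hypothetical choices for illustration.

```python
from dataclasses import dataclass


@dataclass
class Silence:
    check: str
    start: int  # epoch seconds
    end: int    # epoch seconds


def classify(check, fired_at, silences, grace=600):
    """Label an alert as expected, just-unsilenced, or unexpected.

    grace is an assumed window (in seconds) after a silence ends during
    which a firing alert is treated as a likely leftover of planned work.
    """
    for s in silences:
        if s.check != check:
            continue
        if s.start <= fired_at <= s.end:
            return "expected"
        if s.end < fired_at <= s.end + grace:
            return "silence_just_expired"
    return "unexpected"


silences = [Silence("disk_space", start=1000, end=2000)]
print(classify("disk_space", 1500, silences))  # fired during planned work
print(classify("disk_space", 2300, silences))  # fired soon after the silence ended
print(classify("disk_space", 5000, silences))  # no related planned work
```

Attaching the label to the notification, rather than dropping the alert, leaves the decision with the operator while still giving them the context.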
  • 34. Information is not a scarce resource. Attention is. Herb Simon, 1991 Sunday, August 4, 13 http://csel.eng.ohio-state.edu/productions/woodscta/media/diagnosis.pdf
  • 35. Directed Attention • Attention focusing • Attention switching • Dynamic Prioritization Sunday, August 4, 13 We work in a COGNITIVELY NOISY WORLD, even when there is NOT an outage going on. Alerts are ESSENTIALLY ATTENTION DIRECTORS. The main challenge for DYNAMIC FAULT MANAGEMENT (HF term) in design is to support: - ATTENTION FOCUSING - ATTENTION SWITCHING - DYNAMIC PRIORITIZATION By getting to know how human attention works (and its relationship to context, perception, etc.), we can hope to design better alerts.
  • 36. Interrupts AND Underspecification 1. “Here is the data I want you to see” 2. “Here is why I think you would find it interesting” Sunday, August 4, 13 An alert is essentially an INTERRUPT. TWO STATES: 1 - HERE IS THE DATA I WANT YOU TO SEE 2 - HERE IS WHY I THINK YOU WOULD FIND IT INTERESTING What can we do to support #2?
  • 37. Paradox Of Directed Attention Sunday, August 4, 13 An alert is essentially an interruption to everyday work, and there is a paradox at the heart of DIRECTED ATTENTION. 1. We are always busy! 2. Shifting attention has a very real cost! 3. Not all signals are worth paying attention to; context-sensitivity will always vary 4. So how can you SKILLFULLY IGNORE a SIGNAL that should NOT SHIFT YOUR ATTENTION WITHOUT first processing it....IN WHICH CASE IT HASN’T BEEN IGNORED. “Given that the supervisory agent is loaded by various other task related demands, how does one interpret information about the potential need to switch attentional focus without interrupting or interfering with the tasks or lines of reasoning already under attentional control. We can state this paradox in another way: how can one skillfully ignore a signal that should not shift attention within the current context, without first processing it -- in which case it hasn't been ignored.” - David Woods David Woods has suggested some ways to break this paradox; he calls the approach PREATTENTIVE REFERENCE. I’ll let you discover his suggestions on your own.
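As one possible answer to that question, an alert could carry its "why" alongside its data. This is a minimal sketch with a hypothetical `Alert` type; the field names and the message layout are assumptions, not any tool's format.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    check: str
    value: float
    threshold: float
    why: str                                      # state 2: why this should interest you
    related: list = field(default_factory=list)   # optional context for the operator

    def render(self) -> str:
        """Build a notification that states both the data and the reason."""
        lines = [
            f"{self.check}: {self.value} (threshold {self.threshold})",
            f"Why this matters: {self.why}",
        ]
        lines += [f"Related: {r}" for r in self.related]
        return "\n".join(lines)


alert = Alert(
    check="disk_used_pct",
    value=91.0,
    threshold=90.0,
    why="At the current growth rate the volume may fill before the next cleanup.",
    related=["backup job currently running"],
)
print(alert.render())
```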
  • 38. Directed Attention Sorting through an avalanche of data Picking up on subtle early indications of a fault Sunday, August 4, 13 This idea of an alert DIRECTING OUR ATTENTION can exist in two views: SORTING THROUGH AN AVALANCHE or PICKING UP SUBTLE/EARLY INDICATIONS.... So....which is it? IT CAN BE BOTH! “The critical point is that the challenge of fault management lies in sorting through an avalanche of raw data -- a data overload problem. This is in contrast to the view that the performance bottleneck is the difficulty of picking up subtle early indications of a fault against the background of a quiescent monitored process.”
  • 39. Context Sensitivity Sunday, August 4, 13 The background and context in which a SIGNAL arrives can play a huge role in how it can HELP or HINDER us. If the background is one of QUIET, contrast is HIGH. <- this is what most designers plan for If the background is ONGOING DIAGNOSIS, then SIGNAL can SUPPORT/CONTRADICT existing hypothesis If the background is EXECUTING A RESPONSE, then SIGNAL can cue that the RESPONSE is WRONG or INCOMPLETE. In any case, the ALERT’s MEANING will change as CONTEXT and BACKGROUND changes.
  • 40. Data Overload Sunday, August 4, 13 This is simply a tough problem. There are approaches to solve it, but none of them to date are effective given the rate at which new pieces of data are being collected and stored. There is significant agreement among those who study data overload phenomena that the critical piece to understand is CONTEXT SENSITIVITY. Some HF researchers have pointed at something that may help reduce the effects of data overload: Depicting RELATIONSHIPS between data in a known FRAME of REFERENCE, as opposed to the raw data. What can we do as designers to aid surfacing those relationships?
  • 41. How have I taken the OPERATOR into account? Sunday, August 4, 13 PEOPLE use monitoring tools. Arguably, MACHINES use monitoring tools we build, as well. But only PEOPLE can adapt and improvise with a given tool outside of the original intentions of its designer.
  • 42. Am I hurting or helping: •Data overload or underload? •Salience? •Directed attention? •Interruptibility? Sunday, August 4, 13 When we design alerts and monitoring tools, we should be asking these questions. In addition: HOW WILL WE KNOW WHEN THIS DESIGN WOULD HURT those things?
  • 43. Joint Cognitive Systems Sunday, August 4, 13 One final thought: what if, instead of the view that the BOUNDARY is a large barrier to be hurdled only by our writing increasingly complex code...we view that boundary as a place for an actual cooperative RELATIONSHIP?
  • 44. Joint Cognitive Systems What if we viewed an alerting system as a PARTNER, instead of a subordinate? Sunday, August 4, 13 What if we viewed an alerting system as a PARTNER, instead of a subordinate or otherwise dumb messenger delivering news to us? What does the world look like if we designed alerts to COOPERATE with us? If TRUST in alerting systems is such a big deal.... WHAT can we learn from how HUMANS learn to trust each other, and let that influence our design decisions? In other words: how can we design alerts that SUPPORT our confirming their legitimacy, or our expectations of when an alert will fire? Is context-sensitivity part of this? We see some blunt versions of these notions: 1 - Time periods for alerts, so that people aren’t woken up for things that can wait until morning (the machine has been given some context about our availability to pay attention to an alert) 2 - Rough dependency relationships, so we don’t send a bazillion alerts when a known SPOF dies What other examples can we think of, where the COMPUTERS can attempt to understand, predict, or observe US, as we work?
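The two blunt versions above can be sketched in a few lines. This assumes a simple host-to-parent dependency map and fixed working hours of 09:00 to 18:00; the `route` function, the severity labels, and the dictionary shapes are hypothetical, chosen only to show the idea.

```python
def route(alert, hour, down_hosts, depends_on):
    """Decide what to do with an alert, using two pieces of context.

    alert:      dict with "host" and "severity" keys (assumed shape)
    hour:       local hour of day, 0-23
    down_hosts: set of hosts already known to be down
    depends_on: map of host -> the host it depends on
    """
    # Dependency context: if the parent is already down, this alert adds nothing new.
    parent = depends_on.get(alert["host"])
    if parent in down_hosts:
        return "suppress"
    # Availability context: non-critical alerts outside working hours can wait.
    if alert["severity"] != "critical" and not (9 <= hour < 18):
        return "hold_until_morning"
    return "page"


depends_on = {"web1": "switch1", "web2": "switch1"}

print(route({"host": "web1", "severity": "critical"}, 3, {"switch1"}, depends_on))
print(route({"host": "web1", "severity": "warning"}, 3, set(), depends_on))
print(route({"host": "web1", "severity": "critical"}, 3, set(), depends_on))
```

Both rules are static context handed to the machine in advance. The open question in the notes above is what it would take for the system to observe that context rather than be told it.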
  • 45. The End Sunday, August 4, 13 My hope is that I’ve been able to ask BETTER QUESTIONS, and I can kick off this conference with food for thought. You can tell me how that food tastes at the bar later.
  • 46. Can We Ever Escape From Data Overload? A Cognitive Systems Diagnosis Woods, Patterson, Roth 1999 http://csel.eng.ohio-state.edu/productions/woodscta/media/diagnosis.pdf Sunday, August 4, 13
  • 47. The Alarm Problem and Directed Attention in Dynamic Fault Management Woods 1995 http://csel.eng.ohio-state.edu/woods/foundations/directed%20att.pdf Sunday, August 4, 13