Baking-In Transparency

Saturday, October 8, 11
About Me
• Matt Simmons
• 11+ year System Administrator
• http://www.standalone-sysadmin.com
• @standaloneSA
• standalone.sysadmin@gmail.com

The Situation

Devs make things
• Small discrete programs
• Large complex programs
• Immense interconnected software suites

Ops makes things go
• Script using small discrete programs
• Administer large complex programs
• Cluster immense interconnected software suites

There is a direct relationship between the software that developers write and the software that gets implemented by operations.

The Problems

Software needs to be monitored
"When performance is measured, performance
improves. When performance is measured and
reported back, the rate of improvement accelerates."
--Pearson’s Law

Why?
“You can’t manage what you can’t measure”
--Robert Kaplan

Software needs to be managed
“Management by objective works - if you
know the objective. 90% of the time, you don’t.”
--Peter Drucker

Clearly we need to measure...
But what do we measure?
And what metrics do we use?
How do we obtain the measurements?

What do we measure?

Software Engineers measure...
• Programmer Productivity
  • Code size/efficiency
• Defect Density
  • Bugs / module size
• Requirement Stability
  • “Feature creep”

What do we measure?

Operations measures...
• Resource Utilization
  • Disk space, Bandwidth, etc.
• Infrastructure Stability
  • Service Uptime, MTBF, etc.
• Performance
  • CPU / Memory efficiency, etc.

What metrics do we use?

It depends.
Duh.

The metrics that Ops needs to monitor are not always easy to obtain...

...even though they’re really important

• Reliability
• Repeatability
• Root Cause Identification

...so not only is monitoring important...

Monitoring is hard.

Monitoring correctly is hard.

Why is monitoring hard?
• Monitoring Software Suites are complex
• Infrastructures are complex
• Processes and applications are opaque to our futile requests to determine and track internal state

Processes and applications are opaque to our futile requests to determine and track internal state

The Solution(s)

Dev/Ops working together gives

• Team Interrelationships
• Knowledge Sharing
• Cross Training
• Tool Sharing

But more specifically...
Methods of monitoring software can be
BUILT INTO THE SOFTWARE

How things are designed now
Question: A well-designed program encounters
an error. What happens?
Answer: It handles the error, and continues
processing requests

How things are designed now
Question: A poorly-designed program
encounters an error. What happens?
Answer: It crashes and burns

Question:
Which of those is easier to monitor?

Obviously, dying to alert the
monitoring system is overkill.
(pun firmly intended)
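One middle ground between recovering silently and crashing is to keep handling errors while counting them somewhere a monitoring system can look. The following is a minimal Python sketch of that idea, not something taken from the slides; the `Health` class, its field names, and `process_all` are hypothetical names chosen for illustration.

```python
from collections import Counter


class Health:
    """Hypothetical in-process record of errors that were handled, not fatal."""

    def __init__(self):
        self.handled_errors = Counter()  # exception class name -> count
        self.last_error = None           # text of the most recent handled error

    def record(self, exc):
        self.handled_errors[type(exc).__name__] += 1
        self.last_error = str(exc)

    def snapshot(self):
        """Return a plain dict that a monitoring check could read."""
        return {
            "handled_errors": dict(self.handled_errors),
            "last_error": self.last_error,
        }


def process_all(requests, handler, health):
    """Handle each request; on failure, record the error and keep going."""
    results = []
    for request in requests:
        try:
            results.append(handler(request))
        except Exception as exc:  # broad on purpose: the loop must survive
            health.record(exc)
    return results
```

The program still "handles the error and continues", but the fact that it did so is no longer invisible: `health.snapshot()` can be written to a log, a file, or any of the reporting channels discussed later.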

How do we make our statuses available
to the monitoring system, then?

It depends on the kind of software

Remember these?

• Small discrete programs
• Large complex programs
• Immense interconnected software suites

Small Discrete Programs

• Possibly a utility
• Usually scripted or run manually
• Typically short-term run time

Small Discrete Programs:
Monitoring

• Screen output
• Return codes
• Catch signals
  • Great example: ping & SIGQUIT
  • SIGUSR1 & SIGUSR2

Signal Handling in Perl

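# Write the current state to a file whenever the process receives SIGUSR1.
# drop_state_file() is assumed to be defined elsewhere in the program.
sub USR1_handler {
    drop_state_file();
}
$SIG{'USR1'} = \&USR1_handler;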
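The same pattern can be sketched in Python for comparison. This is an illustrative analogue rather than the author's code: the `STATE` dictionary, its keys, and the function names are assumptions, and the state is written as JSON simply because it is easy for a monitoring check to parse.

```python
import json
import os
import signal
import time

# Hypothetical counters the program updates as it works.
STATE = {"started": time.time(), "items_processed": 0}


def write_state(path):
    """Dump STATE to `path` as JSON via a temporary file and a rename."""
    tmp = f"{path}.tmp"
    with open(tmp, "w") as fh:
        json.dump(STATE, fh)
    os.replace(tmp, path)  # rename, so a reader never sees a half-written file


def install_usr1_handler(path):
    """Register a SIGUSR1 handler that drops the state file at `path`."""
    def handler(signum, frame):
        write_state(path)

    signal.signal(signal.SIGUSR1, handler)  # must run in the main thread
    return handler
```

With the handler installed, `kill -USR1 <pid>` asks the running program to report on itself without interrupting its work.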

Large Complex Programs

• Probably a daemon or interactive program
• Long running, needs to be stable
• Subject to resource change over time
• May need to retain state across restarts
• May have a web component

Large Complex Programs:
Reporting

• No screen output (except debugging)
• Logging
• SNMP Agent/Traps
  • (seriously, read ‘man snmpd.conf’)
• Named Pipes (FIFO)
• State Output to DB (if appropriate)

Net-SNMP Embedded Perl
# In snmpd.conf, each "perl" directive is written on one line.
perl use Data::Dumper;
perl sub myroutine { print "got called: ",Dumper(@_),"\n"; }
perl $agent->register('mylink', '.1.3.6.1.8765', \&myroutine);
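Of the reporting channels listed for large programs, a named pipe is one of the simplest to sketch without extra infrastructure. The Python example below is an illustration under stated assumptions: `publish_status` and the JSON line format are hypothetical, and the pipe is opened non-blocking so that a daemon is never stalled when nothing is reading from it.

```python
import errno
import json
import os


def publish_status(fifo_path, status):
    """Write one JSON status line to a named pipe.

    Returns False, without blocking, when no reader is attached.
    """
    if not os.path.exists(fifo_path):
        os.mkfifo(fifo_path, 0o644)
    try:
        fd = os.open(fifo_path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError as exc:
        if exc.errno == errno.ENXIO:  # nobody has the FIFO open for reading
            return False
        raise
    try:
        os.write(fd, (json.dumps(status) + "\n").encode())
    finally:
        os.close(fd)
    return True
```

The design choice worth noting is the early return: reporting should be a side channel, so the program keeps working whether or not the monitoring system is listening at that moment.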

Immense Interconnected Software Suites
(or Large Suites)

Large Suites

• Definitely retain state across restarts
• Probably requires centralized controller
• May use sockets to communicate
• Probably has a web component

Large Suites:
Reporting
Everything under “Large Programs”, plus...

• Monitoring coordinated by the “central” node or program
• Aggregation of state
• Provide layer of abstraction from any in-suite monitoring or reporting
• Provide XML/CSV in addition to human-parsable HTML pages
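As a rough illustration of state aggregation with machine-readable output, the Python sketch below has a central node collapse per-node states into one suite-level state and emit CSV. The state names, their ordering, and the trailing `suite` row are assumptions made for the example, not part of the original talk.

```python
import csv
import io

# Assumed state names, ordered from best to worst.
SEVERITY = {"ok": 0, "warning": 1, "critical": 2}


def aggregate(node_states):
    """Return the worst state reported by any node ('ok' if there are none)."""
    return max(node_states.values(), key=SEVERITY.__getitem__, default="ok")


def to_csv(node_states):
    """Render per-node states plus the aggregated suite state as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["node", "state"])
    for node in sorted(node_states):
        writer.writerow([node, node_states[node]])
    writer.writerow(["suite", aggregate(node_states)])
    return buf.getvalue()
```

A monitoring system then only needs to poll the central node and parse one document, rather than knowing how every component inside the suite reports its own health.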

What we’re really doing is IPC
So what other methods exist? Lots.

Unix IPC
• Sockets
• RPC
• Message Queues
• FIFO
• Shared Memory
• And Many More...

They shouldn’t all be used...

What is important is that you use SOMETHING
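To make "something" concrete, here is one possibility among the IPC methods listed: a status endpoint on a Unix domain socket, sketched in Python. The function names and the one-JSON-line-per-connection protocol are assumptions for illustration only.

```python
import json
import os
import socket


def open_status_socket(path):
    """Create a listening Unix domain socket at `path`."""
    if os.path.exists(path):
        os.unlink(path)  # remove a stale socket left by a previous run
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)
    server.listen(5)
    return server


def answer_one_query(server, status):
    """Accept one connection, send the status as a JSON line, and hang up."""
    conn, _ = server.accept()
    with conn:
        conn.sendall(json.dumps(status).encode() + b"\n")
```

In a real daemon, `accept()` would likely be driven from the main event loop or a dedicated thread so that answering the monitoring system never holds up normal work.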

What is best?
To crush your enemies, see them
driven before you, and to hear the
lamentation of their women?

What is best?
• An application that is easily and openly monitored
• A developer who considers monitoring in all phases of design and development
• A developer who writes their own monitoring checks
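A developer-written check can be very small. The Python sketch below assumes the application exposes a status dictionary with a `queue_depth` field (a hypothetical name) and borrows the exit-code convention used by Nagios-style plugins: 0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN. The thresholds are arbitrary placeholders.

```python
# Exit codes following the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def check_queue(status, warn=100, crit=500):
    """Turn an application status dict into (exit_code, one-line message)."""
    depth = status.get("queue_depth")
    if not isinstance(depth, int):
        return UNKNOWN, "UNKNOWN - status has no integer queue_depth"
    if depth >= crit:
        code = CRITICAL
    elif depth >= warn:
        code = WARNING
    else:
        code = OK
    return code, f"{LABELS[code]} - queue depth is {depth}"
```

A wrapper script would print the message and call `sys.exit(code)`, which ties back to the "return codes" point made for small discrete programs.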

Do us all a favor...
When you develop software, be it scripts, utilities,
programs, or suites, please please please...

Consider how we Ops folks will manage and monitor it.

Baking-In Transparency
Thank you for your time.

Matt Simmons
standaloneSA on Twitter
standalone.sysadmin@gmail.com
http://www.standalone-sysadmin.com