A presentation covering three personas (Developers, IT Ops, and Network Administrators) and how they can work together by leveraging the various management and monitoring toolsets in Azure.
6. Plan, Develop + Test, Release, Monitor + Learn
Development, Production
7. Lack of actionable and contextual information to resolve incidents
Prioritization and validation of investments not based on real data
Inability to quickly detect, diagnose and triage application issues
Lack of collaboration between development and operations
Plan, Develop + Test, Release, Monitor + Learn
Development, Production
8. Telemetry is collected at each tier: server backend, middleware, web service & browser
Telemetry arrives in the cloud where it is stored & processed with Machine Learning technology
Detect & Diagnose problems in Azure Portal; Ask ad-hoc queries in Analytics; Integrate, Extend & Customize
10. Ingestion, Exploration, Export & Correlation
SCOM MP
Application Insights
Open Source SDKs
Status Monitor
Azure Extensions
LogStash
collectd
Microsoft Azure Portal
Azure Monitor
Metrics & Search Explorer
Application Map
Application Insights Analytics portal
Visual Studio IDE
Power BI
Microsoft Azure dashboards
OMS Connector
Data Access APIs
Blob storage
Visual Studio Team Services
11. Sources of telemetry
1. Outside-in monitoring: URL pings and web tests from 16 global points of presence
2. Observed user behavior: How is the application being used?
3. Observed application behavior: No coding required – service dependencies, queries, response time, exceptions, logs, etc.
4. Developer traces and events: Whatever the developer would like to send to Application Insights
5. Infrastructure performance: System performance counters
12. Key capabilities
360° views for your app across availability, performance and usage
Fast and powerful troubleshooting, diagnostics and usage insights
Built-in analytics for any app, fully integrated with your development tools
16. Are you able to…
See which systems are hosting which service: including Windows, Linux, cloud, and on-prem systems
Monitor the whole distributed service: in one view, monitor each component of the services
Know the impact of changes: determine how a change to one server affects other connected components
Traditional monitoring looks into individual resources
[Diagram labels: Hypervisor (ESXi / Hyper-V), Public cloud (Azure / AWS), Virtual machines, Network, Database, Storage, Infrastructure Service, Active Directory, Service bus, Application or Services, Web sites, Application, Email, SharePoint. Dependencies?]
17. Service components that make up business applications:
What’s been missing…
• End-to-end view of the service
• Every tier / every service
[Diagram labels: End users, Web sites, App 1, App 2, Web tier, Apps, VM, Service bus, Active Directory, Database, Dedicated app DB, Network, Public cloud, SaaS services, 3rd-party SaaS, On-premises transaction systems, and more…]
18. Automatically discover all dependencies for any Windows or Linux system
View all TCP-connected processes, their bound ports and connections
View dynamic maps of your system topology, live and historical
Visualize any alerts or change events across all dependencies for a given machine
[Same infrastructure diagram as slide 16]
19. Discovery
Automatically build a common reference map of dependencies across servers, processes, and 3rd party services
Incident management
View cascading alerts, failed connections, load balancing issues, and rogue clients
Migration assurance
Identify connectivity failures, view computer and process inventory, and identify systems for decommissioning
Features
• Server, process, and port dependency maps
• Computer and Process Inventory in Log Analytics
• Log Analytics Alert correlation
• Change Tracking correlation
• Historical queries
• ARM API
• SCOM integration
20. Example: Finding the root cause of slow application performance
The Solution
• One view of complete system dependencies
• Maps dependencies across Azure & datacenter systems
• Monitors performance and finds problems
• Gives a complete service view of the enterprise system
21. Complete view of your complex IT infrastructure
Real-time dependency discovery and mapping: Discovers and maps server and process dependencies in real time, without any predefinition
Accelerate troubleshooting and root cause analysis: Automatically discover app and system dependencies to accelerate troubleshooting and root cause analysis
Expedite migration to the cloud: Take advantage of Service Map to expedite your app and workload migrations, making it easier to shift to the cloud
25. Network device vendor agnostic
Fault Detection & Localization
Measures packet loss and network latency
Automatically learns baseline thresholds.
Automatically discovers the subnets and network topology
Historical graphs of loss and latency
Integrates with OMS search for easy analytics and reporting
Works in cloud, on-premises or hybrid environments
Allows custom alert rule creation
28. Review / Q & A
• Application Insights
  • Proactively detect, triage & fix issues as they occur, before they start affecting your users
  • Answer tough questions instantly with powerful ad-hoc query language
  • Diagnose problems right from within your development environment and incorporate into your existing DevOps workflows
• OMS Service Map
  • Automatically build a common reference map of dependencies
  • Accelerate troubleshooting and root cause analysis
  • Expedite your app and workload migrations, making it easier to shift to the cloud
• OMS Network Performance Monitor
  • Monitors connections between office sites, datacentres, clouds and applications
  • Near real time monitoring of network performance parameters like loss and latency
  • Automatically learns baseline thresholds and discovers the subnets and network topology
29. Thank you
• Cloud Solutions Architect (Datacenter/Azure)
• System Center
• Operations Management Suite
• Azure (IaaS, PaaS, Recovery Services)
• 3x MVP - Cloud and Datacenter Management (CDM)
• Email: Adin.Ermie@outlook.com
• Twitter: @AdinErmie
• Blog: http://AdinErmie.com
Editor's notes
Three-persona journey
1: Developer
2: IT Ops
3: Network Admin
Persona 1: Developer
With the cloud and mobile app stores, the cost and time to enter the market have dropped dramatically. Business models get copied, it’s very easy to build an app or service that is similar to yours, and it’s hard to keep your differentiation.
Technology and requirements change continuously, and you need to stay on top of these changes to evolve your app and be successful.
Continuous delivery cycles are rapid (for example, we ship every week and some services ship daily), and the faster you move, the more chances there are for quality to go down.
Given all this, success requires data-driven decision making. You need the right data about your customers, about your live site, and all the aspects of your app, so you can make decisions quickly and adapt to be successful.
With this premise, let’s look at the modern ALM cycle
This is a continuous cycle
Plan: Prioritize investments
Dev/Test: Lack of collaboration between Devs and Ops
Release: Quickly detect, diagnose, triage issues
Monitor/Learn: Contextual info
People doing the planning are not exposed to the real-world data coming from customers and the usage of their apps, so they cannot validate their assumptions and make good prioritization decisions.
Developers don’t have full visibility into what happens in production, because ops have a different set of tools and processes, and this leads to a lack of collaboration between them.
Customers usually become aware of issues before us, and we don’t have enough info to decide whether an issue requires a hotfix or can wait until the next version
Once you have info that something is failing, you need the information that will help you resolve the issue in the fastest way
Application Insights is a service that will help you solve these problems throughout your application management lifecycle.
Telemetry from server backend, middleware, web, browser
Telemetry processed with Machine Learning
Diagnose and display in the portal
Ingestion: How do we get the data in
Exploration: How do we work with the data collected
Correlation: How do we visualize the data
What sources of telemetry are collected by Application Insights?
Outside-in
User behavior
Application behavior
Custom tracing and events
Infra performance
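The "outside-in" source in the list above can be illustrated with a small availability check. This is only a sketch of the general idea of a URL ping, not how Application Insights implements its web tests; the names `PingResult`, `http_status` and `ping` are hypothetical, and the rule that any 2xx/3xx status counts as "available" is an assumption.

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class PingResult:
    url: str
    ok: bool
    status: Optional[int]
    elapsed_ms: float


def http_status(url: str, timeout: float = 5.0) -> int:
    """Fetch the URL and return its HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx responses still carry a status code


def ping(url: str, fetch: Callable[[str], int] = http_status) -> PingResult:
    """Run one availability check; any exception counts as a failed ping."""
    start = time.perf_counter()
    try:
        status: Optional[int] = fetch(url)
        ok = status is not None and 200 <= status < 400  # assumed success rule
    except Exception:
        status, ok = None, False
    elapsed_ms = (time.perf_counter() - start) * 1000
    return PingResult(url, ok, status, elapsed_ms)
```

The `fetch` parameter is injectable so the check can be exercised without network access; a real outside-in test would run the same check on a schedule from several locations.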
Built-in analytics
Deep insights
360 view
Persona 2: IT Ops
What system is hosting what service
Monitor distributed service
Know change impact
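"Know change impact" can be read as a graph question: given which components depend on which, what is reachable upstream of the thing being changed? The sketch below shows that traversal on a made-up dependency table; the component names and the `impacted_by` helper are hypothetical and are not part of any Azure tool.

```python
from collections import deque

# Hypothetical dependency edges: each key depends on the listed components.
DEPENDS_ON = {
    "web-tier": ["app-tier"],
    "app-tier": ["app-db", "service-bus"],
    "reporting": ["app-db"],
    "app-db": [],
    "service-bus": [],
}


def impacted_by(changed: str, depends_on: dict) -> set:
    """Return every component that directly or indirectly depends on `changed`."""
    # Invert the edges so we can walk from a component to its dependents.
    dependents = {}
    for component, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(component)

    impacted = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in impacted:
                impacted.add(dependent)
                queue.append(dependent)
    impacted.discard(changed)  # only relevant if the graph has a cycle
    return impacted
```

With the sample table, a change to `app-db` affects `app-tier`, `reporting` and, through `app-tier`, `web-tier`.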
To track dependencies, IT infrastructure has relied on conventional techniques such as endless spreadsheets and extensive audits. But the proliferation of virtualized data centers, cloud, and microservices has made it increasingly difficult to track dependencies using these archaic methods.
The increased complexity of modern applications has changed how we define “applications”: as complex, multi-tier, business-service systems that may span multiple datacenters and cloud hosting environments. With this complexity come increased challenges for the teams supporting those applications, yet IT operations still faces a set of tools focused on individual aspects of application infrastructure, from infrastructure monitoring solutions to tools that provide code-level analysis.
To manage these complex applications and ensure they meet their SLAs, IT Ops needs an end-to-end view that ties together the different application components and infrastructure services that make up the critical business applications they support.
What if…
Auto discovery
Processes, Services
Dynamic map
Issues across dependencies
Discovery
Automatically build a common reference map of dependencies across servers, processes, and 3rd party services.
Incident Management
View cascading alerts, failed connections, load balancing issues, and rogue clients.
Migration Assurance
Ensure nothing is left behind, identify connectivity failures, view computer and process inventory, and identify systems for decommissioning.
Service Map presents a view of your servers as you think of them - as interconnected systems that deliver services and rely on other technologies. Service Map discovers and maps server and process dependencies in real-time, without any predefinition, and visualizes application components, service dependencies, and supporting infrastructure configuration.
This helps you eliminate the guesswork of problem isolation, identify surprise connections and broken links in your environment, and perform Azure migrations knowing that critical systems and endpoints won’t be left behind. The Service Map public preview supports Windows and Linux guests, in any cloud and on-prem.
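To make the idea of a process-level dependency map concrete, here is a minimal sketch that groups observed TCP connections by source process. It assumes the connection records have already been collected by some agent; the `Connection` record, the host and process names, and `build_dependency_map` are all hypothetical and do not reflect Service Map's actual data model.

```python
from collections import defaultdict
from typing import NamedTuple


class Connection(NamedTuple):
    """One observed outbound TCP connection (hypothetical record shape)."""
    src_host: str
    src_process: str
    dst_host: str
    dst_port: int


def build_dependency_map(connections):
    """Map each (host, process) to the sorted set of (host, port) it connects to."""
    dep_map = defaultdict(set)
    for conn in connections:
        dep_map[(conn.src_host, conn.src_process)].add((conn.dst_host, conn.dst_port))
    return {source: sorted(targets) for source, targets in dep_map.items()}


observed = [
    Connection("web01", "w3wp", "app01", 8080),
    Connection("web01", "w3wp", "app01", 8080),  # repeated connections collapse
    Connection("app01", "java", "db01", 1433),
]
dependency_map = build_dependency_map(observed)
```

Because the map is rebuilt from whatever connections are observed, no dependencies need to be declared up front, which is the point the notes make about discovery "without any predefinition".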
Hop-by-hop performance monitoring identifies the real-world response times across the entire application system. This allows you to identify where the bottleneck is occurring and gives clues about how to resolve the problem.
You can configure events and alerts to notify you before performance issues are experienced broadly.
Integration with dashboarding and visualization tools gives you the flexibility and data to provide a single pane of glass for monitoring the health of your application systems.
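The hop-by-hop idea can be sketched in a few lines: given a response time per hop, the bottleneck is the hop that contributes the largest share of the total. The hop names and timings below are invented for illustration, and `slowest_hop` is a hypothetical helper rather than a feature of any product.

```python
def slowest_hop(hops):
    """Return (name, ms, share_of_total) for the hop with the highest response time."""
    if not hops:
        raise ValueError("at least one hop is required")
    total = sum(ms for _, ms in hops)
    name, ms = max(hops, key=lambda hop: hop[1])
    share = ms / total if total else 0.0
    return name, ms, share


# Hypothetical measurements for one request path, in milliseconds.
request_path = [
    ("browser -> web tier", 40.0),
    ("web tier -> app tier", 35.0),
    ("app tier -> database", 425.0),
]
bottleneck = slowest_hop(request_path)
```

In this made-up path the database hop accounts for most of the end-to-end time, which is where troubleshooting would start.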
Accelerate troubleshooting
Discover: Build application and server dependency maps in minutes rather than weeks or months
No need for costly manual cataloging of servers, services, and applications
Assess: Identify critical servers and applications
Identify and fix existing configuration issues before migration
Save on post-migration cost by identifying under-utilized servers
Improve application performance by identifying over-utilized or under-provisioned VMs.
Validate: Verify proper migration with Comparison Reports
Quickly catch any performance issues in the new environment
No need to install new tools to monitor cloud resources
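The "Assess" step above talks about under-utilized and over-utilized servers. A simple way to picture this is a threshold check on average CPU utilization, sketched below. The 10% and 80% thresholds, the server names and the `classify_utilization` helper are assumptions chosen for illustration; the source does not state how the assessment is actually made.

```python
from statistics import mean


def classify_utilization(cpu_samples, low=10.0, high=80.0):
    """Classify a server from CPU utilization samples (percent).

    The `low` and `high` thresholds are illustrative assumptions.
    """
    if not cpu_samples:
        raise ValueError("at least one sample is required")
    average = mean(cpu_samples)
    if average < low:
        return "under-utilized"
    if average > high:
        return "over-utilized"
    return "normal"


# Hypothetical servers with hourly CPU samples.
servers = {
    "file01": [2.0, 3.0, 4.0],
    "web01": [35.0, 50.0, 42.0],
    "sql01": [88.0, 93.0, 95.0],
}
assessment = {name: classify_utilization(samples) for name, samples in servers.items()}
```

A real assessment would likely also consider memory, disk and network, and look at peaks as well as averages.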
Persona 3: Network Admin
- Monitors network links; these links can be between datacenters, office sites, clouds and even different tiers of an application. It measures packet loss and network latency on these links.
In other words, it checks whether the packets sent from one end are reaching the other end, and how long they take to get there.
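The two measurements described here reduce to simple arithmetic over a batch of probes: loss is the fraction of probes that got no reply, and latency is the average round-trip time of those that did. The sketch below assumes each probe is recorded as a round-trip time in milliseconds, or `None` when no reply arrived; `summarize_probes` is a hypothetical name.

```python
from typing import Optional, Sequence, Tuple


def summarize_probes(rtts_ms: Sequence[Optional[float]]) -> Tuple[float, Optional[float]]:
    """Return (loss_percent, average_latency_ms) for a batch of probes.

    A value of None means the probe received no reply.
    """
    if not rtts_ms:
        raise ValueError("at least one probe is required")
    received = [rtt for rtt in rtts_ms if rtt is not None]
    loss_percent = 100.0 * (len(rtts_ms) - len(received)) / len(rtts_ms)
    average_ms = sum(received) / len(received) if received else None
    return loss_percent, average_ms


loss, latency = summarize_probes([10.0, None, 14.0, 12.0])
```

For the four sample probes, one was lost, so loss is 25% and the average latency over the three replies is 12 ms.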
<Click>
So let's say you have two networks, network A and network B; NPM keeps track of the availability of the links and the quality of connectivity.
<Click>
And we do this using synthetic transactions, i.e. we have OMS agents installed on the two ends of the link, and these agents periodically exchange packets. We’ll see more on how the solution works in a moment.
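A synthetic transaction of this kind can be approximated with a timed TCP connection attempt. The sketch below measures how long a TCP handshake to a peer takes and treats a refusal or timeout as a lost probe. It is not the OMS agent's actual probing logic; the function name, the one-second timeout and the use of a full connection as the "packet exchange" are assumptions.

```python
import socket
import time
from typing import Optional


def tcp_probe(host: str, port: int, timeout: float = 1.0) -> Optional[float]:
    """Return the TCP connect time in milliseconds, or None if the probe failed."""
    start = time.perf_counter()
    try:
        # create_connection performs the TCP handshake; we close immediately after.
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return None
```

An agent would call something like this periodically against each peer and feed the results into loss and latency statistics.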
How many of you have used OMS before? OK, just to make sure that everyone is on the same page, here is a brief description of the OMS architecture.
The service has two parts: a cloud service, and agents that can be installed on machines; these machines can be on-premises or VMs in the cloud. The agents collect different types of monitoring data and upload it to the cloud service, which indexes and aggregates the data and then presents it to the user in a web console.
And here are some of the features of NPM.
NPM solution
OMS Agent and PowerShell script
Agent gets Intelligence Pack
Detects subnet
Learns of other OMS Agents
TCP probe between agents
Displayed in OMS
The OMS solution gallery is like a marketplace: users can pick their payment plan and choose the solutions they want. There are dozens of solutions available for different monitoring and management requirements, including solutions for SQL, Active Directory, update management, change tracking, etc.
App Ins:
Proactive detect
Diagnose
Service Map:
Map dependencies
Root cause
Migrations
NPM:
Connections
Network performance
Learns baseline thresholds
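The presentation does not say how NPM learns its baseline thresholds. One common approach, shown here purely as an assumption, is to flag a measurement when it exceeds the historical mean by some multiple of the standard deviation. The function names and the multiplier `k = 3` are hypothetical.

```python
from statistics import mean, pstdev


def baseline_threshold(history, k=3.0):
    """Assumed rule: threshold = mean + k * standard deviation of past samples."""
    if not history:
        raise ValueError("history must not be empty")
    return mean(history) + k * pstdev(history)


def is_anomalous(value, history, k=3.0):
    """True when the new measurement exceeds the learned threshold."""
    return value > baseline_threshold(history, k)


# Hypothetical latency history for one link, in milliseconds.
latency_history = [10.0, 12.0, 11.0, 9.0, 10.0, 12.0, 11.0, 9.0]
```

With this history the threshold sits a little under 14 ms, so a 20 ms reading would be flagged while an 11 ms reading would not.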