4. Humans are good at
learning, but we get lost
in volume and details…
5. ▶ Improve decision-making
▶ Uncover hidden trends or
relationships
▶ Alert on deviations
▶ Forecast or anticipate incidents
All of this requires diverse data
from across many silos. Lots
of unstructured, real-time data.
Why AI & Machine Learning?
6. Run the Business in Real Time
Data From the Past Real-Time Data Statistical Forecast
T – a few days T + a few days
Security Operations Center
IT Operations Center
Business Operations Center
Predictive
(Models)
Historical Reporting
(BI Tools, Data Lakes) Grey space
8. Deviation from past behavior
Deviation from peers
(aka Multivariate AD or Cohesive AD)
Unusual change in features
Predicting churn
Predicting events
Trend forecasting
Detecting influencing entities
Early warning of failure – predictive
maintenance
Identify peer groups
Event correlation
Reduce alert noise
Anomaly Detection Predictive Analytics Clustering
Splunk Customers Have ML Problems
9. The ML Process
Get and
explore data
Select and fit an
algorithm,
generating a model
Apply and
validate models
Surface model to
consumers to
solve problems
Problem: <Stuff in the world> causes big time and money expense. Value Hypothesis
Solution: Build ML model to forecast <possible incidents>, act pre-emptively and learn
Operationalize
11. Overview of AI Powered by ML at Splunk
CORE PLATFORM
SEARCH
PACKAGED PREMIUM
SOLUTIONS
MACHINE LEARNING
TOOLKIT
12. Search Includes Machine Learning
Core platform search is a powerful and highly flexible interface built with ML
13. Splunk IT Service Intelligence
Get Data
Define services,
entities and KPIs
Monitor and
troubleshoot
Analyze
and detect
Data-Defined, Data-Driven Service Insights
Adaptive Thresholds and Anomaly Detection
14. Splunk User Behavior Analytics
An out-of-the-box solution that helps organizations find
anomalous behavior, risky users and unknown threats
with the use of machine learning
15. ▶ Assistants: Guided model building, testing
and deployment for common objectives
▶ Showcases: Interactive examples for typical
IT, security, business and IoT use cases
▶ Algorithms: 25+ standard algorithms
included with the Toolkit
▶ ML Commands: New SPL commands to fit,
test and operationalize models
▶ Python for Scientific Computing Library:
Access to 300+ open source algorithms
Splunk Machine Learning Toolkit
Extends Splunk platform functions and provides a guided modeling environment
Build custom analytics for any use case
16. Custom Machine Learning – Success Formula
Identify use cases
Drive decisions
Set business/ops priorities
SPL
Data prep
Statistics/math background
Algorithm selection
Model building
Splunk ML Toolkit
facilitates and simplifies
via examples and guidance
Operational success
Data
Science
Expertise
Splunk
Expertise
Domain
Expertise
(IT, Security…)
17. Continuous Data Ingest at Scale
Develop Visualize Predict Alert Search
Engineers
Data Analysts
Security Analysts
Business Users
Native Inputs
TCP, UDP, Logs, Scripts, Wire, Mobile
Industrial Data
SCADA, AMI, Meter Reads
Modular Inputs
MQTT, AMQP, COAP, REST, JMS
HTTP Event Collector
Token Authenticated Events
Technology Partnerships
Kepware, AWS IoT, Cisco, Palo Alto
Maintenance
Info
Asset
Info
Data
Stores
External
Lookups/Enrichment
OT
Industrial Assets
IT
Consumer and
Mobile Devices Real Time
23. Machine Learning Customer Success
Network Incident Detection
Service Degradation Detection
Security/Fraud Prevention
Machine Learning
Consulting Services
Analytics App Built
on ML Toolkit
Optimizing operations and business results
Predict Gaming Outages
Fraud Prevention
Entertainment
Company
Cell Tower Incident Detection
Optimize Repair Operations
Prioritize Website Issues
and Predict Root Cause
26. ▶ Splunk Usergroup Zürich
▶ Regular Splunk User get-togethers
▶ Frequent Splunk Ninja Presentations (D/E)
▶ Meetings throughout all major
German-speaking cities (not only Zurich)
▶ Official language: German
▶ Not a sales thing
▶ Kick-off soon
▶ Join now:
▶ https://usergroups.splunk.com/group/splunk-user-group-zurich.html
Splunk Usergroup Zurich
http://bit.do/SPLUGZ
Editor's notes
Hi, my name is Dirk Nitschke and I'm working for Splunk as a Sales Engineer, primarily covering Germany.
The title of this presentation is "Get more from your machine data with Splunk and artificial intelligence".
Artificial intelligence is a broad field, and we are looking at the concept of "machine learning".
First of all, you can ask yourself why we need the help of machines to get more from machine data.
Humans are quite good at learning, good at applying what we have learned and good at adapting to new situations and making use of our experience.
However, it's getting harder for us when we have to process large amounts of data. Piles of data are simply too large and we are too slow.
Remembering multiple details for a short period of time is also hard for many of us. You may have heard of the magic number "7". Many people can hold a random sequence of 7 characters in their short-term memory (layer 1 cache ;-)), maybe 1 or 2 characters more or less. You have to practice or apply tricks to become better, e.g. you have to repeat stuff or put the character sequence into a different context like a story. That's because it's much easier for us to remember entire words, even when they are longer than 7 characters.
The current trend is to make decisions based not on a gut feeling but on data. Ideally this happens promptly, even in real time, and does not take weeks or even months.
Wouldn't it be better not to be reactive only but to anticipate new developments, so that you can be proactive and get an advantage over your competitors?
The data needed for this kind of decision making comes in many varieties, from different systems, and is in most cases unstructured.
Machine learning can help with making decisions based on large amounts of historical data by identifying trends, detecting anomalous behaviour or making predictions. And all this, if needed, promptly or in real time.
What do we need to achieve this goal? First of all, a platform to collect, store and analyze unstructured machine data that can deliver actionable insights.
Machine data is generated by different systems, devices and areas. Maybe you are running an IT or security operations center today, or even a business operations center.
Besides current data you need historical data. You take both to learn from it, detect patterns and take a look into the crystal ball.
By the way, data stored in other systems can also be used to augment data in Splunk. Consider our classical example of a webshop. Based on machine data you can see which products people put into their carts. If you combine this information with prices and costs of goods, you can immediately see the revenue you are making. And if you take a look at carts not being checked out, you see the revenue you are not making.
The good news is that Splunk provides such a platform for the collection and analysis of machine data at scale.
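The webshop enrichment described above can be sketched in a few lines of Python. This is only an illustration: the event fields (`session`, `product`, `action`), the price table and the function name `revenue_summary` are assumptions made up for this sketch, not part of any Splunk API.

```python
# Hypothetical cart events, as they might be extracted from webshop logs.
cart_events = [
    {"session": "s1", "product": "book", "action": "add"},
    {"session": "s1", "product": "book", "action": "purchase"},
    {"session": "s2", "product": "lamp", "action": "add"},
    {"session": "s3", "product": "pen", "action": "add"},
    {"session": "s3", "product": "pen", "action": "purchase"},
]

# Hypothetical price lookup, e.g. exported from an external product database.
prices = {"book": 20.0, "lamp": 35.0, "pen": 2.5}


def revenue_summary(events, price_lookup):
    """Return (revenue made, revenue missed) for the given cart events."""
    added = set()
    purchased = set()
    for event in events:
        key = (event["session"], event["product"])
        if event["action"] == "add":
            added.add(key)
        elif event["action"] == "purchase":
            purchased.add(key)
    # Enrich the machine data with prices from the external lookup.
    made = sum(price_lookup[product] for _, product in purchased)
    # Items that were put into a cart but never checked out.
    missed = sum(price_lookup[product] for _, product in added - purchased)
    return made, missed


made, missed = revenue_summary(cart_events, prices)
print(f"revenue made: {made:.2f}, revenue missed: {missed:.2f}")
```

The point of the sketch is the join: the machine data alone only says what happened, and the external lookup turns it into a business figure.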
But what is machine learning?
If we take a look at the formal definition, machine learning is about algorithms that perform a certain task and learn from experience / historical data to perform better in the future.
How do you do this in the real world? First of all, you define the problem you want to solve. It should be one that is worth solving, like preventing the outage of a production line to keep employees busy. Next you need to define the goal you want to achieve. E.g. we want to predict the outage of a production line, ideally early enough to be able to fix the problem before the production line actually fails.
Which data can help to solve the problem? Explore that data first. You may already be able to identify some patterns, or you may want to check data quality and completeness. If needed, clean the data, apply scaling or do whatever other preparation is required.
Next you define a so-called model which is based on some mathematical algorithm. This means you define the relationship between several variables or so-called features. In our case we want to describe a relation between the event "outage" and some other parameters. Next you test your model, which means you take existing data, apply the model to it and validate the accuracy of the results. Many algorithms have parameters that can be tweaked, so you may go back and generate a new model with different settings to finally select the "best" model. Last but not least you present your findings to the people that will use your model.
But this is not the end. Typically you are going to operationalize these steps. This means you get feedback from users of your model: feedback about model accuracy, changed boundary conditions, new requirements etc. This new information is put into your model again and it will learn and (hopefully) improve.
Which options to use machine learning are available in Splunk?
We want to make using machine learning as easy as possible for you. We provide three different varieties. Why? Because depending on your knowledge and use case you need different approaches.
First of all, we have machine learning in Splunk Core itself, then as part of our premium solutions ITSI and UBA, and last but not least in the so-called Machine Learning Toolkit app.
Let's take a closer look at these options.
The Splunk Processing Language (SPL) provides some search commands that can be used for the three previously described typical use cases of machine learning. For example:
* anomalydetection can be used to identify anomalies (guess what)
* predict can be used to predict values over time
* cluster can be used to group events. This command is actually the secret sauce behind the "patterns" tab in the UI.
There are some more – and by the way, you can also use classical statistical functions like average, standard deviation and friends to identify anomalies.
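The remark about average and standard deviation can be illustrated with a classic z-score check. The following is a minimal Python sketch, not the implementation behind any SPL command; the function name `zscore_outliers`, the threshold of 2.5 and the sample login counts are assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def zscore_outliers(values, threshold=2.5):
    """Return the indexes of values more than `threshold` standard
    deviations away from the mean of the whole series."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing can stand out.
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical hourly login counts with one obvious spike.
logins_per_hour = [52, 48, 50, 47, 53, 49, 51, 50, 180, 52, 48, 50]
print(zscore_outliers(logins_per_hour))
```

Note that a single extreme value also inflates the mean and standard deviation it is measured against, which is one reason more robust methods exist.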
Our premium solutions have machine learning built in, tailored for special use cases and easy to use.
Splunk ITSI is a premium solution designed for the end-to-end monitoring of services.
In ITSI you define services and key performance indicators that describe the health score of a service. Machine learning is integrated in three areas and can be used via the UI, which makes it pretty easy:
* Adaptive Threshold: a fixed threshold for a KPI is not always useful. Consider the number of logins to a system. We expect a very high number in the morning and after lunch on workdays but only a small number on Sunday morning. A fixed threshold would result in many false positive notifications on workdays and probably many false negatives on the weekend, which actually makes the entire KPI useless.
Wouldn't it be nice to define thresholds that are based on the usual or normal behaviour? A high threshold in the morning, a low threshold on the weekend? This is exactly what adaptive thresholding does in ITSI: we create some kind of tube along the usual behaviour to get different thresholds for different times of the day.
* Splunk ITSI also contains algorithms to identify anomalies, i.e. deviations from the expected behaviour.
* Last but not least, Splunk ITSI can group so-called notable events based on machine learning algorithms. This reduces the alert noise, which makes the number of alerts manageable again, and in particular adds service context to the alerts so that you can focus on alerts that are related to your most important business services.
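The "tube along the usual behaviour" from the adaptive threshold item can be sketched as one band per time slot. This is a simplified illustration and not the algorithm ITSI actually uses; the slot keys, the factor `k` and the function names are assumptions for this sketch.

```python
from collections import defaultdict
from statistics import mean, pstdev


def adaptive_thresholds(history, k=3.0):
    """Build a (lower, upper) band per time slot as mean +/- k * stddev.

    `history` is an iterable of (slot, value) pairs, where a slot is any
    hashable key such as ("Mon", 9) for Monday, 9 o'clock.
    """
    by_slot = defaultdict(list)
    for slot, value in history:
        by_slot[slot].append(value)
    bands = {}
    for slot, values in by_slot.items():
        mu, sigma = mean(values), pstdev(values)
        bands[slot] = (mu - k * sigma, mu + k * sigma)
    return bands


def is_anomalous(bands, slot, value):
    """True if the value falls outside the band learned for its slot."""
    lower, upper = bands[slot]
    return value < lower or value > upper


# Hypothetical login counts: busy Monday mornings, quiet Sunday mornings.
history = [
    (("Mon", 9), 950), (("Mon", 9), 1000), (("Mon", 9), 1050),
    (("Sun", 9), 18), (("Sun", 9), 20), (("Sun", 9), 22),
]
bands = adaptive_thresholds(history)
print(is_anomalous(bands, ("Mon", 9), 1020))  # usual Monday load
print(is_anomalous(bands, ("Sun", 9), 400))   # far above a usual Sunday
```

A single fixed threshold of, say, 500 logins would flag every Monday morning and miss the Sunday spike, whereas the per-slot bands treat each time of the week on its own terms.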
Splunk User Behavior Analytics (Splunk UBA) is another premium solution. It includes a large number of algorithms specialised in identifying unknown threats and risks, especially those caused by insiders. This supports security operations centers in proactively analyzing unusual user behaviour. An example would be a large number of file accesses.
If you want to have full flexibility and learn more about the different machine learning algorithms, you should use the Splunk Machine Learning Toolkit app. This app adds some new commands to SPL that give you access to more than 30 typical machine learning algorithms.
So-called assistants help you use these. They provide step-by-step instructions to create a model, test it and apply it. The assistants include prediction of numerical and categorical fields, detection of numerical and categorical anomalies, and data clustering.
And you also get lots of example data to try and learn.
Which variant to choose? We want to enable you to make use of machine learning, so we have some advice.
To use machine learning successfully you need some expertise. You need some Splunk knowledge, some expertise in the domain you are interested in, e.g. security or IT operations, and some expertise in data science. Well, finding people that have expertise in all three areas is pretty hard; they are unicorns. So you should choose based on your use case and available expertise.
If you have access to data scientists or want to learn more about it on your own, go with the Machine Learning Toolkit. If there is no data science expertise, one of our packaged solutions might be a good fit.
We talked about different sources of machine data. We haven't talked about the "how" and the "what". How can we collect the data, and what can Splunk do with the data?
As you know, Splunk is very flexible. Besides monitoring classical log files, data can be collected through different means. E.g. you can use a REST API to collect data. Applications can send data directly to Splunk using, e.g., the Splunk HTTP Event Collector. We also have interfaces to get data from cloud services like Amazon Web Services.
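As a rough sketch of how an application might hand an event to the HTTP Event Collector, the following Python snippet builds (but does not send) such a request. The host name, the all-zero token and the event fields are placeholders invented for this example; the endpoint path and the `Authorization: Splunk <token>` header follow the commonly documented HEC conventions, but check them against your own deployment.

```python
import json
import urllib.request


def build_hec_request(base_url, token, event, sourcetype="_json"):
    """Build a POST request for the HTTP Event Collector event endpoint."""
    payload = {"event": event, "sourcetype": sourcetype}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/services/collector/event",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # HEC authenticates each request with a token.
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder host and token, for illustration only.
request = build_hec_request(
    "https://splunk.example.com:8088",
    "00000000-0000-0000-0000-000000000000",
    {"action": "login", "user": "alice"},
)
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would actually send the event.
```

Building the request separately from sending it keeps the sketch runnable without a Splunk instance and makes the payload easy to inspect.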
Another interesting option is to listen to wire data. Splunk Stream is the key word for this kind of input.
All this data is indexed by Splunk. It can be searched and analyzed. Search results can be visualized and used to fire alerts. And they can be used as a source for prediction and anomaly detection.
The data set can be used by different users. Every user can get their own view of the data set.
Data in Splunk can be enriched or augmented with data from external sources, for example data stored in relational databases. On the other hand, Splunk can forward data to other systems.
Everything in Splunk is based on a search. And these searches can make use of machine learning functionality. Machine learning is a first-class citizen in Splunk's search language: some commands are already part of the core search commands, and add-ons like the Machine Learning Toolkit provide additional commands.
This means you can use machine learning in every search and, for example, create an alert once you identify an unusually high number of, say, logins to your system. Alerts can be sent by mail or messenger, or you can automatically generate a ticket in your ticketing system. For example, integrations with BMC Remedy or ServiceNow are available on Splunkbase.
I would now like to walk through the Machine Learning Toolkit with its showcases and assistants.
MLTK demo: You first land in the showcases. They are divided into several categories: predicting numeric values, detecting numeric outliers, ...
Each category briefly describes the kind of problem it addresses. We also see the individual examples that are available.
We pick one, Server Power Consumption. What happens now is this: we are taken to the assistant for predicting numeric fields, the sample data is loaded and some parameters are set.
In the upper section the data is loaded, here simply a CSV file. You can, however, use any Splunk search here to select the data you need.
Below that you can preprocess the data. It may make sense to scale the data. We do not need that here.
Then we select the algorithm to solve the problem. Next we choose the variable we want to predict and the variables we want to use for the prediction.
On the right we specify how to split the loaded data set: we can define a so-called training data set and a test data set. What does that mean?
The model is built from the training data. The test data set is then used to validate the model and to assess how well it describes the test data.
"Show SPL" shows us what would happen in the search language.
Preview Data: "predicted(ac_power)" shows us the result of the model applied to the data. The residual shows us the error.
Show SPL -> shows us the corresponding SPL. Scheduled Alert -> lets us define an alert right away!
Fit
Apply
Evaluate
Can be used in another search.
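The fit, apply and evaluate steps from the demo can be sketched outside of Splunk in plain Python. This is a minimal illustration with a one-variable least-squares fit on synthetic data, not what the Toolkit does internally; the field name `ac_power` comes from the demo, while the predictor `cpu_load`, the data and all function names are assumptions for this sketch.

```python
import random


def split(rows, train_fraction=0.7, seed=42):
    """Shuffle the rows and split them into a training and a test set."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def fit(rows):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def apply_model(model, x):
    """Predict y for a given x using the fitted (slope, intercept)."""
    slope, intercept = model
    return slope * x + intercept


# Synthetic rows of (cpu_load, ac_power): roughly 100 + 2 * cpu_load
# plus a small deterministic wobble standing in for measurement noise.
data = [(i * 5, 100 + 2 * (i * 5) + ((3 * i) % 5 - 2)) for i in range(20)]

train, test = split(data)                # define training and test set
model = fit(train)                       # fit: build the model
residuals = [y - apply_model(model, x) for x, y in test]  # apply and compare
rmse = (sum(r * r for r in residuals) / len(residuals)) ** 0.5  # evaluate
print(f"slope={model[0]:.2f} intercept={model[1]:.2f} rmse={rmse:.2f}")
```

Evaluating on rows the model has not seen during fitting is what makes the error estimate meaningful; scoring on the training rows alone would look too optimistic.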
Industry:
Technology
Splunk Use Cases:
IT Operations
Challenges:
Monitoring and response required for 24/7 customer access
Separate silos created a balkanized IT department
Needed to pare down thousands of alerts and events
Splunk Products:
Splunk Enterprise
Splunk ITSI
Data Sources:
Application
Device
Firewall
Network
Server
Case Study: https://www.splunk.com/en_us/customers/success-stories/leidos.html
Nasdaq is a global exchange operator.
They use the Splunk Enterprise Security premium solution for security investigations. With Splunk ES they have gained more than 50% efficiency in analysts' ability to track down data.
Splunk has also cut their security investigation time by 50%.
Splunk allows them to have a skill set that is common across the organization. It is reusable by analysts at different levels and gives a deep understanding of the organization’s overall security posture.
Our Early Adopter customers have had much success creating and operationalizing ML models. Some examples include:
Zillow makes hundreds of website updates daily, including content from several partners nationally. These updates can often cause issues in the site. Zillow built an ML model that predicts which of these changes is likely to result in an issue to allow the team to fix them proactively. Once a potential or actual issue has been identified, the model can also provide guidance on likely root cause and resolution.
TELUS has thousands of mobile phone towers across Canada; when one of these goes offline it can cause significant disruption for their customers. TELUS built a model to predict which towers are likely to fail so that they can proactively fix issues before they occur.
To summarize: Splunk provides the platform for collecting and analyzing machine data, also in real time. Using machine learning, you can gain additional insights and findings that can serve as the basis for decisions. The different delivery options adapt to your use cases.
Thank you! Please give feedback and rate this session on Pony Poll. The URL can be found on the right hand side – and is also encoded in the QR code.