4. Analytics Big Picture
Build complex reports without the search language
Provides more meaningful representation of underlying raw machine data
Acceleration technology delivers up to 1000x faster analytics over Splunk 5
Diagram: Pivot, Data Model, Analytics Store
5. Operational Intelligence Across the Enterprise
IT professional
– Create and share data models
– Accelerate data models and custom searches with the analytics store
– Create reports with pivot
Developer
– Leverage data models to abstract data
– Leverage pivot in custom apps
Analyst
– Create reports using pivot based on data models created by IT
Diagram: Pivot, Data Model, Analytics Store, Raw Data (sample event: [10/11/12 18:57:04 UTC] 000000b0)
18. What is a Data Model?
A data model is a search-time mapping of data onto a hierarchical structure
Encapsulates the knowledge needed to build a search
Pivot reports are built on top of data models
Data-independent
Screenshot here
19. A Data Model is a Collection of Objects
Screenshot here
30. Best Practices
Use event objects as often as possible
– Benefit from data model acceleration
Resist the urge to use search objects instead of event objects!
– Event-based searches can be optimized better
Minimize object hierarchy depth when possible
– Constraint-based filtering is less efficient deeper down the tree
Put the event object with the deepest tree (and most matching results) first
– Model-wide acceleration only for first event object and its descendants
44. Data Model on Disk
Each data model is a separate JSON file
Lives in <myapp>/local/data/models (or <myapp>/default/data/models for pre-installed models)
Has associated conf stanzas and metadata
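The on-disk layout above can be sketched as a small path lookup. This is only an illustration: the file name <model_name>.json, the helper names, and checking local before default are assumptions, not something the slide states.

```python
from pathlib import Path


def candidate_model_paths(app_dir, model_name):
    """Return the two places a data model JSON file may live inside an app.

    Assumption: the file is named <model_name>.json (hypothetical naming).
    """
    app = Path(app_dir)
    file_name = f"{model_name}.json"
    return [
        app / "local" / "data" / "models" / file_name,    # the usual location
        app / "default" / "data" / "models" / file_name,  # pre-installed models
    ]


def find_model_file(app_dir, model_name):
    """Return the first candidate path that exists on disk, or None.

    Assumption: local is checked before default.
    """
    for path in candidate_model_paths(app_dir, model_name):
        if path.is_file():
            return path
    return None
```

A lookup like this only reads the file. Per the slides that follow, editing or deleting the JSON by hand is not supported.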
45. Editing Data Model JSON
At your own risk!
Models edited via the UI are validated
Manually edited data models: NOT SUPPORTED
Exception: installing a new model by adding the file to <myapp>/<local OR default>/data/models is probably okay
46. Deleting a Data Model
Use the UI for appropriate cleanup
Potential for bad state if manually deleting model on disk
47. Interacting With a Data Model
Use data model builder and pivot UI – safest option!
Use REST API – for developers (see docs for details)
Use | datamodel and | pivot Splunk search commands
49. Data Model Acceleration
Diagram: data model acceleration flow, in three lanes
Admin or power user: Turn on acceleration via UI; setting written to conf file
Backend magic: Poll: are there new accelerated models? Kick off collection
Non-technical user: Run a pivot report
– Acceleration: run search using on-disk acceleration
– No acceleration: kick off ad-hoc acceleration and run search
50. Model-Wide Acceleration
Pivot search:
| tstats count AS "Count of HTTP_Success" from datamodel="WebIntelligence" where (nodename="HTTP_Request") (nodename="HTTP_Request.HTTP_Success") prestats=true | stats count AS "Count of HTTP_Success"
Only accelerates first event-based object and descendants
Does not accelerate search and transaction-based objects
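The rule that model-wide acceleration covers only the first event-based object and its descendants can be sketched in a few lines. This is an illustration of the rule as stated on the slide, not Splunk's implementation. The dictionary layout and the function name are hypothetical, and the object list is assumed to be in model order.

```python
def accelerated_objects(objects):
    """Return the names of objects covered by model-wide acceleration.

    objects: list of dicts in model order, each with
      'name', 'parent' (None for a root object) and
      'kind' ('event', 'search' or 'transaction').
    """
    # The first root object that is event-based is the only accelerated root.
    first_event_root = next(
        (o["name"] for o in objects
         if o["parent"] is None and o["kind"] == "event"),
        None,
    )
    if first_event_root is None:
        return []

    accelerated = [first_event_root]
    changed = True
    # Repeat until no new descendants are found, so the order in which
    # children are listed does not matter.
    while changed:
        changed = False
        for o in objects:
            if o["parent"] in accelerated and o["name"] not in accelerated:
                accelerated.append(o["name"])
                changed = True
    return accelerated
```

With HTTP_Request as the first event-based root and HTTP_Success as its child, both are returned. A transaction-based root, or a second event-based root, is left out.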
Welcome to SplunkLive [City].
Thank you for taking the time to attend today’s event.
How can you leverage Splunk?
Splunk 6 takes large-scale machine data analytics to the next level by introducing three breakthrough innovations:
Pivot – opens up the power of Splunk search to non-technical users with an easy-to-use drag and drop interface to explore, manipulate and visualize data
Data Model – defines meaningful relationships in underlying machine data and makes the data more useful to a broader base of non-technical users
Analytics Store – patent pending technology that accelerates data models by delivering extremely high performance data retrieval for analytical operations, up to 1000x faster than Splunk 5
Let’s dig into each of these new features in more detail.
How do the Analytics Store, Data Model and Pivot benefit users across the enterprise?
Let’s start with the IT Professional – this includes the Splunk Administrator or an advanced Splunk user who is familiar with SPL.
Using Splunk 6 they can:
Create data models
Share data models with other users – delivering a consistent view of the data
Accelerate data models using the Analytics Store
Create reports using Pivot (although being power users, they may prefer using SPL directly!)
Next we have the enterprise developer.
Using Splunk 6 they can:
Leverage data models built by IT, making searches more portable (using common Data Models ensures predictability of results)
Leverage the Pivot interface in custom enterprise apps
Finally, there are additional users that can now benefit – for example, the business or data analyst.
Using Splunk 6 they can:
Create reports, dashboards, charts and other visualizations using the Pivot interface and based on data models that provide an abstracted view of the raw data.
Splunk 6 is not meant to replace existing BI and Business Analytics tools, but it does provide new visibility, insights and intelligence from operational data that can be used by business analysts to augment these tools. Data from Splunk software can also be leveraged directly using the Splunk API and SDKs and integrated into existing business analytics tools. For example, the recently announced Pentaho Business Analytics for Splunk® Enterprise (http://apps.splunk.com/app/1554), enables business users to utilize Pentaho to rapidly visualize and gain additional insights from Splunk’s machine data platform using existing in-house skills.
- The Splunk search language is very expressive.
- Can perform a wide variety of tasks ranging from filtering to data munging and reporting
- There are various search commands for complex transformations and statistics (e.g. correlation, prediction etc)
What does the search do?
Basically, first it normalizes the individual accesses, which should be representable as a model object.
Next it aggregates by guid to create an "instance" object, which should be representable in a DM.
It calculates a field on that instance object, "type".
Then it builds a timechart of those, using a special "_time" value.
Low overhead to start but learning curve quickly gets steep
Obtaining website usage metrics should not require understanding Apache vs IIS format
Admins won’t know a priori what questions will be asked of the data, so they can’t provide canned dashboards for all scenarios
Backup search for example:
eventtype=pageview | eval stage_2=if(searchmatch("uri=/download*"), _time, null()) | eval stage_1=if(searchmatch("uri=/product*"), _time, null()) | eval stage_3=if(searchmatch("uri=*download_track*"), _time, null()) | stats min(stage_*) as stage_* by cookie | search stage_1=* | where isnull(stage_2) OR stage_2 >= stage_1 | where isnull(stage_3) OR stage_3 >= stage_2 | eval stage = case(isnull(stage_2), "stage_1", isnull(stage_3), "stage_2", 1==1, "stage_3") | stats count by stage | reverse | accum count as cumulative_count | reverse | streamstats current=f max(cumulative_count) as stage_1_count last(cumulative_count) as prev_count
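To show what the backup search is doing, here is a rough Python sketch of the same funnel logic, up to the cumulative counts. It is a reading of the search, not a translation Splunk would produce. The function name and the (cookie, uri, time) tuple layout are hypothetical, wildcard matching is case-sensitive here for simplicity, and a cookie with a stage 3 hit but no stage 2 hit is dropped on the assumption that a comparison against a null value fails.

```python
from fnmatch import fnmatchcase

# Wildcard patterns taken from the searchmatch() calls in the search.
STAGE_PATTERNS = {
    "stage_1": "/product*",
    "stage_2": "/download*",
    "stage_3": "*download_track*",
}


def funnel_counts(events):
    """events: iterable of (cookie, uri, time). Returns cumulative counts per stage."""
    first_seen = {}  # cookie -> {stage: earliest time that stage was hit}
    for cookie, uri, t in events:
        stages = first_seen.setdefault(cookie, {})
        for stage, pattern in STAGE_PATTERNS.items():
            if fnmatchcase(uri, pattern):
                stages[stage] = min(t, stages.get(stage, t))

    reached = {"stage_1": 0, "stage_2": 0, "stage_3": 0}
    for stages in first_seen.values():
        s1 = stages.get("stage_1")
        s2 = stages.get("stage_2")
        s3 = stages.get("stage_3")
        if s1 is None:
            continue  # never entered the funnel
        if s2 is not None and s2 < s1:
            continue  # stage 2 happened before stage 1
        if s3 is not None and (s2 is None or s3 < s2):
            continue  # stage 3 without, or before, stage 2
        reached["stage_1"] += 1
        if s2 is not None:
            reached["stage_2"] += 1
        if s3 is not None:
            reached["stage_3"] += 1
    return reached
```

Each count includes everyone who got at least that far, which is what the reverse, accum, reverse sequence in the search produces.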
What are the important “things” in your data?
E.g. WebIntelligence might have
HTTP_Access
HTTP_Success
User Session
How are they related?
There’s more than one “right” way to define your objects
Constraints filter down to a set of data
Attributes are the fields and knowledge associated with the object
Both are inherited!
A child object is a type of its parent object: e.g. An HTTP_Success object is a type of HTTP_Access
Adding a child object is essentially a way of adding a filter on the parent
A parent-child relationship makes it easy to do queries like “What percentage of my HTTP_Access events are HTTP_Success events?”
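A minimal sketch of how inherited constraints and attributes could behave, written in Python purely for illustration. The class, the attribute names, the access_combined sourcetype constraint and the 2xx definition of HTTP_Success are all assumptions made for the example. Real data models express constraints as searches, not Python functions.

```python
class ModelObject:
    """Toy stand-in for a data model object with inherited constraints."""

    def __init__(self, name, constraint, attributes, parent=None):
        self.name = name
        self.constraint = constraint      # predicate over one event dict
        self.attributes = list(attributes)
        self.parent = parent

    def matches(self, event):
        # A child is a filter on its parent: every ancestor must match too.
        if self.parent is not None and not self.parent.matches(event):
            return False
        return self.constraint(event)

    def all_attributes(self):
        # Attributes are inherited from the parent, then extended.
        inherited = self.parent.all_attributes() if self.parent else []
        return inherited + self.attributes


http_access = ModelObject(
    "HTTP_Access",
    lambda e: e.get("sourcetype") == "access_combined",  # hypothetical constraint
    ["uri", "status"],
)
http_success = ModelObject(
    "HTTP_Success",
    lambda e: 200 <= e.get("status", 0) < 300,            # hypothetical constraint
    ["bytes"],
    parent=http_access,
)


def percent_of_parent(child, events):
    """What percentage of the parent's events are also the child's events?"""
    parent_hits = [e for e in events if child.parent.matches(e)]
    child_hits = [e for e in parent_hits if child.matches(e)]
    return 100.0 * len(child_hits) / len(parent_hits) if parent_hits else 0.0
```

Calling percent_of_parent(http_success, events) answers the "what percentage" question from the note above, because HTTP_Success events are by construction a subset of HTTP_Access events.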
Constraints are essentially the search broken down into a hierarchy; attributes are the associated fields and knowledge
Search objects: arbitrary searches that include transforming commands to define the dataset they represent
Transaction objects: enable the creation of objects that represent transactions
Use fields that have already been added to the model via event or search objects
This is how we capture knowledge
Required: Only events that contain this field will be returned in Pivot
Optional: The field doesn't have to appear in every event
Hidden: The field will not be displayed to Pivot users when they select the object in Pivot
Use this for fields that are only being used to define another attribute, such as an eval expression
Hidden & Required: Only events that contain this field will be returned, and the field will be hidden from use in Pivot
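The four flag combinations above can be sketched as a filter plus a projection. This is an illustration of the described behavior only. The function name and the flag dictionary are hypothetical and do not reflect how Splunk stores attribute settings.

```python
def pivot_view(events, attributes):
    """Apply required/hidden attribute flags to a list of event dicts.

    attributes: dict of field name -> {'required': bool, 'hidden': bool}
    Returns the rows a Pivot user would see under these flags.
    """
    required = [name for name, flags in attributes.items() if flags.get("required")]
    visible = [name for name, flags in attributes.items() if not flags.get("hidden")]

    rows = []
    for event in events:
        # Required: only events that contain the field are returned.
        if all(event.get(name) is not None for name in required):
            # Hidden: the field is not shown, even if it was used to filter.
            rows.append({name: event[name] for name in visible if name in event})
    return rows
```

A field marked hidden and required still filters events but never appears in the returned rows.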
Be careful about lookup permissions – must be available in the context where you want to use them