1. Fast Data Mining
Real Time Knowledge Discovery for
Predictive Decision Making
Nino Guarnacci
nino.guarnacci@oracle.com
Copyright © 2013, Oracle and/or its affiliates. All rights reserved.
3. … but enterprises are now facing it too
• Services and web transaction data (to refine recommendations, detect trends, etc.)
• “Sensor” data:
• GPS in mobile phones
• RFIDs
• NFC
• SmartMeters
• Etc.
• Log file monitoring and analysis
• Security monitoring
Utilities deploying smart meters? 200x information flowing to the data center!
4. Survey figures shown on the slide: 93%, 89%, 67%
• Executives who would grade themselves C or lower in preparedness
• Executives who say drawing intelligence from data is a top organization priority
• Executives who believe their organization is losing revenue as a result of not being able to fully leverage information
Source: Oracle Research Study - From Overload to Impact: An Industry Scorecard on Big Data Business Challenges, July 2012
5. Obstacles to Managing Data Faster – Latency Gap
While Ensuring Accuracy, Efficiency, and Scale
[Figure: Business Value plotted over time, with milestones Business event → Data captured → Analysis completed → Action taken; the span from business event to action taken is the Action Time. Annotations: Fragmented event entities; The Gap.]
Source: Richard Hackathorn's Components of Action Time
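The four milestones on the latency-gap figure can be turned into a small calculation. The sketch below is only an illustration: the timestamps are invented, and the names `capture_latency`, `analysis_latency` and `decision_latency` are labels chosen here for the three intervals between consecutive milestones.

```python
from datetime import datetime

# Hypothetical timestamps for one business event (illustrative values only).
milestones = {
    "business_event": datetime(2013, 9, 12, 10, 0),
    "data_captured": datetime(2013, 9, 12, 10, 20),
    "analysis_completed": datetime(2013, 9, 12, 13, 0),
    "action_taken": datetime(2013, 9, 12, 17, 0),
}


def latency_breakdown(m):
    """Split the total action time into the gaps between the four milestones."""
    capture = m["data_captured"] - m["business_event"]
    analysis = m["analysis_completed"] - m["data_captured"]
    decision = m["action_taken"] - m["analysis_completed"]
    return {
        "capture_latency": capture,
        "analysis_latency": analysis,
        "decision_latency": decision,
        "action_time": capture + analysis + decision,
    }


breakdown = latency_breakdown(milestones)
for name, delta in breakdown.items():
    print(f"{name}: {delta.total_seconds() / 60:.0f} min")
```

Shrinking any one of the three intervals shortens the action time, which is the point the slide makes about closing the gap.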
7. What is Fast Data?
Turning High Velocity Data into Value
▪ It’s about getting more from in-flight data
▪ It’s about faster action, faster insights
▪ It’s about running your business in real-time
8. Oracle Fast Data Approach
Filter, Move, Transform, Analyze, and Act at High Velocity
FILTER & CORRELATE → MOVE & TRANSFORM → ANALYZE → ACT
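As a rough illustration of the four stages, the sketch below chains plain Python functions over a list of events. It is not Oracle's implementation: the event fields (`source`, `value`), the thresholds and the function names are all assumptions made for this example, and the first stage only filters (correlation is left out to keep it short).

```python
def filter_events(events, min_value):
    """FILTER: keep only events at or above a threshold."""
    return (e for e in events if e["value"] >= min_value)


def move_and_transform(events):
    """MOVE & TRANSFORM: reshape each event into the form the analysis expects."""
    return ({"source": e["source"].upper(), "value": float(e["value"])} for e in events)


def analyze(events):
    """ANALYZE: aggregate a total per source."""
    totals = {}
    for e in events:
        totals[e["source"]] = totals.get(e["source"], 0.0) + e["value"]
    return totals


def act(totals, alert_level):
    """ACT: return the sources whose total calls for a reaction."""
    return sorted(s for s, total in totals.items() if total > alert_level)


# Hypothetical smart-meter readings.
raw_events = [
    {"source": "meter-a", "value": 5},
    {"source": "meter-b", "value": 1},
    {"source": "meter-a", "value": 7},
    {"source": "meter-b", "value": 4},
]
stream = move_and_transform(filter_events(raw_events, min_value=3))
alerts = act(analyze(stream), alert_level=10)
print(alerts)
```

The first two stages are generators, so events flow through one at a time instead of being stored first.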
9. Oracle Fast Data Approach
Filter, Move, Transform, Analyze, and Act at High Velocity
FILTER & CORRELATE
[Diagram labels: Network Status; In-Memory Data Grid; Real Time Streams; Information]
• Parallel multiple streams: JMS, files, Coherence, DB, ...
• Different object types: text, Java object…
• High throughput for data aggregation and event querying
Coherence Data Grid holds the data and computes in parallel
10. Oracle Fast Data Approach
Filter, Move, Transform, Analyze, and Act at High Velocity
- Event Streams -
[Diagram: three Event-type inputs]
EPN (Event Processing Network) Elements
• Adapter
• Channel
• Cache
• POJO
• JSON
• Processor
• HTTP Pub/Sub
11. Oracle Event Processing
SLA Detection: Pattern Matching
STREAMS (three overlapping TRACE events of this form are shown)
<TRACE>
  <ID_TRACED_ENTITY>HH310665064IT</ID_TRACED_ENTITY>
  <TRACED_ENTITY>PACCO</TRACED_ENTITY>
  <WHAT_HAPPENED>ESI_SDA</WHAT_HAPPENED>
  <WHEN_HAPPENED>2013-09-12</WHEN_HAPPENED>
  <WHERE_HAPPENED_DETAIL>
    <OFFICE>
      <WHERE_DESCRIPTION>MONZA</WHERE_DESCRIPTION>
      <WHERE_ID>MZ</WHERE_ID>
    </OFFICE>
  </WHERE_HAPPENED_DETAIL>
</TRACE>
[Diagram labels: DATABASE, SPATIAL, TIME WINDOW; Match Pattern = R 7 ◆]
SELECT M.SLA_VIOLATED
FROM TRACE IN CHANNEL, ENTITIES, SPATIAL CONTEXT
MATCH_RECOGNIZE (
  MEASURES SLA_VIOLATED
  PATTERN (A B)
  DEFINE
    A (DELIVERY TIME - NOW) < 2 DAYS
    B DISTANCE BETWEEN (LOCATION, DESTINATION) > 600 KM
) AS M
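The pattern (A B) in the query above can be read as "an event satisfying A, immediately followed by an event satisfying B". The Python sketch below mimics that logic outside of Oracle Event Processing. Several things are assumptions made for illustration: the event field names, the approximate city coordinates, the second parcel id, the use of each event's own timestamp as NOW, and the haversine formula as the distance measure.

```python
import math
from datetime import datetime, timedelta


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def cond_a(event):
    """A: less than 2 days remain before the promised delivery time."""
    return event["delivery_time"] - event["when"] < timedelta(days=2)


def cond_b(event):
    """B: the parcel is still more than 600 km from its destination."""
    return haversine_km(event["location"], event["destination"]) > 600


def sla_violations(events):
    """Return parcel ids where an A event is immediately followed by a B event."""
    violated = []
    for prev, curr in zip(events, events[1:]):
        if prev["id"] == curr["id"] and cond_a(prev) and cond_b(curr):
            violated.append(curr["id"])
    return violated


# Approximate coordinates (assumed for the example).
MONZA, MILAN, PALERMO = (45.58, 9.27), (45.46, 9.19), (38.12, 13.36)
deadline = datetime(2013, 9, 13, 18, 0)


def trace(parcel_id, hour, destination):
    """Build one hypothetical TRACE event recorded at the Monza office."""
    return {"id": parcel_id, "when": datetime(2013, 9, 12, hour, 0),
            "delivery_time": deadline, "location": MONZA, "destination": destination}


events = [
    trace("HH310665064IT", 8, PALERMO),
    trace("HH310665064IT", 20, PALERMO),
    trace("HH000000001IT", 8, MILAN),   # hypothetical second parcel
    trace("HH000000001IT", 20, MILAN),
]
print(sla_violations(events))
```

Only the parcel bound for Palermo is reported: it is close to its deadline and still far more than 600 km from its destination, while the Milan parcel fails condition B.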
12. Oracle Event Processing
SLA Detection: Filtering & Correlation
ISTREAM(
  SELECT COUNT(*), START_OFFICE, WHERE_HAPPENED, LATITUDE, LONGITUDE
  FROM SPATIAL_CONTEXT, SLA_VIOLATED_OUT_CHANNEL
  PARTITION BY START_OFFICE, WHERE_HAPPENED
  WITHIN 1 HOUR
  GROUP BY START_OFFICE
  HAVING COUNT(*) > 5
)
▪ Aggregate and correlate received filter-events
▪ Partition probable SLA violations by trip path
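The windowed aggregation on this slide (more than 5 violations for the same start office within 1 hour) can be sketched in plain Python with one sliding window per office. This is an illustration of the idea only, not CQL semantics: the input is assumed to be a time-ordered list of (timestamp, start office) pairs, and the office id "RM" is invented.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

WINDOW = timedelta(hours=1)
THRESHOLD = 5


def hot_offices(violations):
    """Yield (time, office, count) whenever an office has more than THRESHOLD
    violations inside the trailing one-hour window."""
    recent = defaultdict(deque)  # start office -> timestamps still in the window
    for when, office in violations:  # assumed sorted by time
        window = recent[office]
        window.append(when)
        while window and when - window[0] > WINDOW:
            window.popleft()  # drop timestamps older than one hour
        if len(window) > THRESHOLD:
            yield when, office, len(window)


start = datetime(2013, 9, 12, 9, 0)
events = sorted(
    [(start + timedelta(minutes=m), "MZ") for m in range(0, 60, 10)]
    + [(start + timedelta(minutes=m), "RM") for m in (5, 95)]
)
alerts = list(hot_offices(events))
print(alerts)
```

The sixth "MZ" violation, 50 minutes after the first, is the one that pushes the count above 5 and produces the single alert.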
13. What is Oracle Data Mining?
Oracle Fast Data Approach (Filter, Move, Transform, Analyze, and Act at High Velocity): real-time stream analysis that correlates events from different sources, and manages and uses them as windows and slides of relational data.
Oracle Data Mining: automatically sifting through large amounts of data to find previously hidden patterns, discover valuable new insights and make predictions
• Identify the most important factors (Attribute Importance)
• Predict customer behavior (Classification)
• Predict or estimate a value (Regression)
• Find profiles of targeted people or items (Decision Trees)
• Segment a population (Clustering)
• Find fraudulent or “rare events” (Anomaly Detection)
• Determine co-occurring items in a “basket” (Associations)
14. Data Mining Provides
Better Information, Valuable Insights and Predictions
[Figure: Cell Phone Churners vs. Loyal Customers; axes: Income and Customer Months]
Insight & Prediction
Segment #1: IF CUST_MO > 14 AND INCOME < $90K, THEN Prediction = Cell Phone Churner, Confidence = 100%, Support = 8/39
Segment #3: IF CUST_MO > 7 AND INCOME < $175K, THEN Prediction = Cell Phone Churner, Confidence = 83%, Support = 6/39
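To make the Confidence and Support figures on these rules concrete, the sketch below evaluates the Segment #1 condition on a tiny invented dataset. It assumes that confidence is the share of matching customers who actually churned, and that support is the share of all customers who match the condition; the ten records are hypothetical and are not the 39 customers behind the slide.

```python
from fractions import Fraction

# Hypothetical customer records: (months as customer, income in $K, churned?)
customers = [
    (16, 60, True), (20, 85, True), (15, 40, True),
    (9, 150, True), (8, 120, False), (10, 170, True),
    (3, 50, False), (5, 200, False), (12, 300, False), (2, 95, False),
]


def segment_1(cust_mo, income):
    """Condition of Segment #1: CUST_MO > 14 AND INCOME < $90K."""
    return cust_mo > 14 and income < 90


def evaluate_rule(records, condition):
    """Return (confidence, support) for 'IF condition THEN churner'.

    Assumes at least one record matches the condition.
    """
    matched = [r for r in records if condition(r[0], r[1])]
    churners = sum(1 for r in matched if r[2])
    confidence = Fraction(churners, len(matched))
    support = Fraction(len(matched), len(records))
    return confidence, support


confidence, support = evaluate_rule(customers, segment_1)
print(f"confidence = {float(confidence):.0%}, support = {support}")
```

On this toy data three of the ten customers fall into the segment and all three churned, so the rule has full confidence but covers under a third of the population.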
15. A Real Fraud Example
My credit card statement: can you see the fraud?
Date      Time       Category  Merchant      Amount
May 22    1:14 PM    FOOD      Monaco Café   $127.38
May 22    7:32 PM    WINE      Wine Bistro   $28.00
…
June 14   2:05 PM    MISC      Mobil Mart    $75.00
June 14   2:06 PM    MISC      Mobil Mart    $75.00
June 15   11:48 AM   MISC      Mobil Mart    $75.00
June 15   11:49 AM   MISC      Mobil Mart    $75.00
May 28    6:31 PM    WINE      Acton Shop    $31.00
May 29    8:39 PM    FOOD      Crossroads    $128.14
June 16   11:48 AM   MISC      Mobil Mart    $75.00
June 16   11:49 AM   MISC      Mobil Mart    $75.00
Callouts on the slide:
• Total purchases exceed time period average
• Gas Station?
• Monaco?
• All same $75 amount?
• Pairs of $75?
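One of the clues on the statement, pairs of identical charges a minute apart, is easy to check mechanically. The sketch below applies that single hand-written rule to the rows from the slide; it is not a data mining model. The year (2013) and the 5-minute gap are assumptions, since the statement shows neither.

```python
from datetime import datetime, timedelta

YEAR = 2013  # assumption: the statement on the slide shows no year

rows = [
    ("May 22", "1:14 PM", "FOOD", "Monaco Café", 127.38),
    ("May 22", "7:32 PM", "WINE", "Wine Bistro", 28.00),
    ("June 14", "2:05 PM", "MISC", "Mobil Mart", 75.00),
    ("June 14", "2:06 PM", "MISC", "Mobil Mart", 75.00),
    ("June 15", "11:48 AM", "MISC", "Mobil Mart", 75.00),
    ("June 15", "11:49 AM", "MISC", "Mobil Mart", 75.00),
    ("May 28", "6:31 PM", "WINE", "Acton Shop", 31.00),
    ("May 29", "8:39 PM", "FOOD", "Crossroads", 128.14),
    ("June 16", "11:48 AM", "MISC", "Mobil Mart", 75.00),
    ("June 16", "11:49 AM", "MISC", "Mobil Mart", 75.00),
]


def parse(date, time):
    """Combine the statement's date and time columns into a datetime."""
    return datetime.strptime(f"{YEAR} {date} {time}", "%Y %B %d %I:%M %p")


def paired_charges(rows, max_gap=timedelta(minutes=5)):
    """Find consecutive charges with the same merchant and amount
    that occur within max_gap of each other."""
    charges = sorted((parse(d, t), merchant, amount)
                     for d, t, _category, merchant, amount in rows)
    pairs = []
    for first, second in zip(charges, charges[1:]):
        same_charge = first[1:] == second[1:]
        if same_charge and second[0] - first[0] <= max_gap:
            pairs.append((first[0], second[0], first[1], first[2]))
    return pairs


pairs = paired_charges(rows)
for first_time, second_time, merchant, amount in pairs:
    print(f"{merchant}: ${amount:.2f} at {first_time} and {second_time}")
```

The rule finds the three Mobil Mart pairs, but it only works because someone already knew what to look for, which is the limitation that motivates anomaly detection.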
16. “Essentially, all models are wrong,
…but some are useful.”
- George Box
(One of the most influential statisticians of the 20th century and a pioneer in the
areas of quality control, time series analysis, design of experiments and
Bayesian inference.)
17. You Can Think of It Like This…
Traditional SQL
• “Human-driven” queries
• Domain expertise
• Any “rules” must be defined and managed
• SQL Queries: SELECT, DISTINCT, AGGREGATE, WHERE, AND OR, GROUP BY, ORDER BY, RANK
+
Oracle Data Mining
• Automated knowledge discovery, model building and deployment
• Domain expertise to assemble the “right” data to mine
• ODM “Verbs”: PREDICT, DETECT, CLUSTER, CLASSIFY, REGRESS, PROFILE, IDENTIFY FACTORS, ASSOCIATE
18. Real-time Prediction for a Customer
• On-the-fly, single record apply with new data (e.g. from call center)
SELECT prediction_probability(CLAS_DT_5_2, 'Yes'
         USING 7800 AS bank_funds,
               125 AS checking_amount,
               20 AS credit_balance,
               55 AS age,
               'Married' AS marital_status,
               250 AS MONEY_MONTLY_OVERDRAWN,
               1 AS house_ownership)
FROM dual;
[Diagram labels: Social, Call, Branc, ECM, BI, Get, Web, Email, CRM, Mobile]
19. Predictive and Recommendation Analytics
Real Time Data Mining Modeling with Streaming Events
• Combine real-time event streaming data technologies with the industry-leading Oracle historical data mining:
– Oracle Data Mining
  • Rich set of algorithms for data mining
  • Predict customer behavior
  • Find profiles of targeted people or items, and determine important relationships
  • Immediately predict trends and themes for data in motion
  • Respond to prevent business threats and take advantage of opportunities
20. Oracle Data Mining in Action:
Technology Behind the America’s Cup Win
• “The USA holds 250 sensors to collect raw data: pressure sensors on the wing; angle sensors on the adjustable trailing edge of the wing sail to monitor the effectiveness of each adjustment, allowing the crew to ascertain the amount of lift it’s generating; and fiber-optic strain sensors on the mast and wing to allow maximum thrust without over bending them.
• But collecting data was only the beginning. ORACLE Racing also had to manage that data, analyze it, and present useful results…”
Source: http://www.sail-world.com/USA/Americas-Cup:-Oracle-Data-Mining-supports-crew-and-BMW-ORACLE-Racing/68834
21. Fast Data Mining Demo:
Fraud Prediction in action…
▪ Extract knowledge starting from a CSV file
▪ Run anomaly detection mining on stored data
▪ Put in place a real-time event processing flow
▪ Consume events from the In-Memory Data Grid
▪ Instantly obtain fraud predictions from streaming data
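The demo flow can be approximated end to end in a few lines of Python. The sketch below is a stand-in, not the demo itself: it swaps Oracle Data Mining's anomaly detection for a simple z-score threshold, uses an in-memory CSV string in place of the file, and uses a plain list in place of events consumed from the data grid. All data values, field names and the `z_limit` parameter are hypothetical.

```python
import csv
import io
import statistics

# Hypothetical historical transactions standing in for the demo's CSV file.
HISTORY_CSV = """card,amount
A,20.0
A,25.0
A,22.0
A,30.0
A,27.0
A,24.0
"""


def learn_profile(csv_text):
    """Mine the stored data: learn mean and standard deviation of past amounts."""
    reader = csv.DictReader(io.StringIO(csv_text))
    amounts = [float(row["amount"]) for row in reader]
    return statistics.mean(amounts), statistics.stdev(amounts)


def score_stream(events, profile, z_limit=3.0):
    """Yield (event, is_anomaly) for each incoming event as it arrives."""
    mean, stdev = profile
    for event in events:
        z = abs(event["amount"] - mean) / stdev
        yield event, z > z_limit


profile = learn_profile(HISTORY_CSV)
incoming = [{"card": "A", "amount": 26.0}, {"card": "A", "amount": 75.0}]
flags = [is_anomaly for _event, is_anomaly in score_stream(incoming, profile)]
print(flags)
```

The split mirrors the demo steps: the profile is learned once from stored data, and each streaming event is then scored on arrival without touching the history again.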
23. Thanks
Fast Data Mining
Real Time Knowledge Discovery for Predictive
Decision Making
Nino Guarnacci
nino.guarnacci@oracle.com