AFSUG Cafe BI - Durban 8 Nov 2011
2. Data Categories
Structured
– Supports automated processing
– Conforms with data models associated with databases and spreadsheets
– Granular data stored in fields
Unstructured
– Generally does not support automated processing
– No data model or not easily understood
– Insufficient metadata
– Noisy data communications such as an email message, blog or document
Event
– High volume of small data bits
– Huge volume
– Only act on exceptions
– Captured at source
© 2011 SAP AG. All rights reserved. 2
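The three data categories can be illustrated with a minimal Python sketch. All names and values here (Order, Reading, the sensor names, the limit of 90.0) are hypothetical and chosen only for illustration; they do not come from the presentation.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator

# Structured: granular values stored in named fields that follow a data model.
@dataclass
class Order:
    order_id: int
    customer: str
    amount: float

# Unstructured: free text with no data model and little metadata.
email_body = "Hi, my order arrived late again and nobody answered my call."

# Event: a high volume of small readings captured at source.
@dataclass
class Reading:
    sensor: str
    value: float

def exceptions(readings: Iterable[Reading], limit: float) -> Iterator[Reading]:
    """Yield only the readings above the limit; all others are ignored."""
    for reading in readings:
        if reading.value > limit:
            yield reading

order = Order(order_id=1001, customer="Acme", amount=250.0)
stream = [Reading("pump-1", 20.5), Reading("pump-1", 95.0), Reading("pump-2", 21.0)]
alerts = list(exceptions(stream, limit=90.0))
```

The order can be queried by field straight away, the event stream is reduced to its exceptions, and the email text needs further processing before it can be analyzed.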
9. What vs. Why and When
It’s generally said that…
structured data tells us “what”
and
event data tells “What” and “When”
and
unstructured data tells us “why”
10. From the Business Perspective
“If you are not analyzing text – if you’re
analyzing only transactional
information – you’re missing
opportunity or incurring risk.”
-- Seth Grimes, Alta Plana
11. Text Analytics Boosts Business Results
“Organizations embracing text
analytics all report having an
epiphany moment when they
suddenly knew more than before.”
-- Philip Russom, The Data Warehousing Institute
12. Text Analytics Expands Your Vision of Business Intelligence
“The bulk of information value is
perceived as coming from data in
relational tables. The reason is that
data that is structured is easy to mine
and analyze.”
-- Prabhakar Raghavan, Yahoo Research
13. [Diagram. Labels: Knowledge, Strategy, Intelligence, External Information, Information, Plan, Operate / Generates Data. SAP modules shown: PP, FI, HR, CO, SD, PM, MM]
15. Business Intelligence Reporting off Structured Data
How can you extend
your BI investments to
unstructured text data?
18. Workers Lose Productivity from Inadequate
Information Access
54% Lose Productivity
Source: Economist, ‘Enterprise Knowledge Workers Study’
19. The Goal: Be a Best Run Business
“77% of high performers have above average analytical capability”
[Chart: Low Performers 23%, High Performers 77%]
Source: Competing on Analytics, Thomas Davenport
20. IT Is Looking for Flexibility in Sharing Relevant Information
Organizations require:
• Trusted, consolidated, and actionable information
• From a variety of data sources
• Self-service access
21. [Diagram: RELEVANT INFORMATION and LESS RELIANCE ON IT, with labels Mobile Device, Large Scale, Business Suite, Microsoft Office, Self Service]
30. @foxnews: “FoxNews’ Chad
Pergram confirms Osama bin Laden
is dead usama osamabinladen”
35. From 10.45pm – 2.20am on
1st and 2nd May 2011, there was an
average of 3000 Tweets per second.
The highest sustained rate of
Tweets. Ever.
39. From the way we discover
information, to the way we share
information, to the way we
consume information and most
importantly, the way we connect
with others.
41. Meme. Noun.
An idea, behavior or style that
spreads from person to person in a
culture.
60. Text Data Processing Defined
From Unstructured Text to Structured Database:
1. Extract meaning
2. Transform into structured data for analysis
3. Cleanse and match
Once structured it can be…
– Integrated
– Queried
– Analyzed
– Visualized
– Reported against
Unlocks Key Information from Text Sources to Drive Business Insight
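The three steps of text data processing (extract meaning, transform into structured data, cleanse and match) can be sketched in plain Python. This is a toy illustration under stated assumptions, not how Data Services implements them: the regular expressions, the sample documents, and the function names extract, transform and cleanse_and_match are all hypothetical.

```python
import re
from collections import defaultdict

# Toy patterns (assumptions): a company is one word followed by "Corp" or "Inc";
# an invoice is the word "invoice" followed by digits.
COMPANY = re.compile(r"\b([A-Za-z]+ (?:corp|inc))\b", re.IGNORECASE)
INVOICE = re.compile(r"\binvoice (\d+)\b", re.IGNORECASE)

def extract(text):
    """Step 1: pull the raw company mention and invoice number out of free text."""
    company = COMPANY.search(text)
    invoice = INVOICE.search(text)
    return (company.group(1) if company else None,
            invoice.group(1) if invoice else None)

def transform(documents):
    """Step 2: turn each document into a flat row with named fields."""
    rows = []
    for doc_id, text in enumerate(documents):
        company, invoice = extract(text)
        rows.append({"doc_id": doc_id, "company": company, "invoice": invoice})
    return rows

def cleanse_and_match(rows):
    """Step 3: normalize spellings, then group rows that refer to the same record."""
    matched = defaultdict(list)
    for row in rows:
        if row["company"] is None or row["invoice"] is None:
            continue  # incomplete rows are dropped in this sketch
        name = " ".join(word.capitalize() for word in row["company"].split())
        matched[(name, int(row["invoice"]))].append(row["doc_id"])
    return dict(matched)

documents = [
    "I called ACME Corp. about invoice 4471.",
    "acme corp never replied about invoice 4471!",
    "Globex Inc shipped invoice 5120 late.",
]
records = cleanse_and_match(transform(documents))
```

After the last step the two differently spelled Acme complaints are matched to the same record, which is the point at which the data can be integrated, queried and reported against.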
61. Automate Research Analysis
Text data processing semantically
understands the meaning and context
of information, not just the words
themselves.
Applies linguistic and statistical
techniques to extract entities, concepts
and sentiments
Discerns facts and relationships that
were previously unprocessable
Allows you to deal with information
overload by mining very large corpora of
words and making sense of them without
having to read every sentence
62. SAP BusinessObjects Data Services
Data integration, data quality, data profiling, and text data processing
SAP BusinessObjects Data Services 4.0
– Data types shown: Structured Data, Unstructured Data
– Business UI (Information Steward) and Technical UI (Data Services)
– Unified Metadata
– One Runtime Architecture & Services: ETL, Data Quality, Profiling, Text Analytics
– One Administration Environment (Scheduling, Security, User Management)
– One Set of Source/Target Connectors
Provides access to all critical business data (regardless of data source, type,
or domain) enabling greater business insights and operational effectiveness
63. Text Data Processing on the Data Services Platform
Native Text Data Processing on the Data Services platform
with the Entity Extraction transform to extract:
– Predefined entities (like company, person, firm, city, country, …)
– Sentiment Analysis (e.g. Strong positive, Weak positive,
Neutral, Weak Negative, Strong Negative)
– Custom entities (customized via dictionaries)
Languages supported (for version 4.0)
– English
– German
– French
– Spanish
– Japanese
– Simplified Chinese
– …
(expanding to 31 languages in next releases)
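The idea of custom entities driven by a dictionary can be sketched as follows. This is a minimal illustration, not the Entity Extraction transform itself: the dictionary layout (entity type, then standard form, then variants), the sample entries, and the function names are assumptions made for this example.

```python
import re

# Hypothetical custom dictionary: entity type -> standard form -> variants.
CUSTOM_DICTIONARY = {
    "PRODUCT": {
        "SAP BusinessObjects Data Services": ["Data Services", "BODS"],
        "SAP NetWeaver BW": ["BW", "Business Warehouse"],
    },
}

def build_matcher(dictionary):
    """Compile one pattern for all surface forms and a lookup back to the standard form."""
    lookup = {}
    for entity_type, entries in dictionary.items():
        for standard_form, variants in entries.items():
            for surface in [standard_form, *variants]:
                lookup[surface.lower()] = (entity_type, standard_form)
    # Longest alternatives first, so a full name wins over a shorter variant inside it.
    alternatives = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(a) for a in alternatives) + r")\b",
        re.IGNORECASE,
    )
    return pattern, lookup

def extract_custom_entities(text, dictionary):
    """Return (entity type, standard form, text as written) for every dictionary hit."""
    pattern, lookup = build_matcher(dictionary)
    results = []
    for match in pattern.finditer(text):
        entity_type, standard_form = lookup[match.group(1).lower()]
        results.append((entity_type, standard_form, match.group(1)))
    return results

text = "We moved our BW loads to Data Services last year."
entities = extract_custom_entities(text, CUSTOM_DICTIONARY)
```

Mapping every variant back to one standard form is what lets later reports count "BW" and "Business Warehouse" as the same entity.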
64. Supported Entity Types for Extraction
Who: people, job title, and national identification numbers
What: companies, organizations, financial indexes, and products
When: dates, days, holidays, months, years, times, and time periods
Where: addresses, cities, states, countries, facilities, internet addresses, and phone numbers
How much: currencies and units of measure
Generic Concepts: “text data”, “global piracy”, and so on
Current Languages supported with Data Services 4.0: English, French, German,
Simplified Chinese, Spanish, Japanese (concepts only)
Some of the additional Languages coming: Arabic, Dutch, Farsi, Italian, Korean,
Japanese (with concepts), Portuguese, Russian
65. Pre-defined Extraction of Sentiments, Events, and Relationships
Voice of Customer
– Sentiments: strong positive, weak positive, neutral, weak negative, strong negative, problems
– Requests: customer requests
– Language Support: English, French, German, Spanish
Public Sector
– Such as person-organization, person-alias, travel events and security
Enterprise
– Mergers and acquisitions, as well as executive job changes
– Language Support: English, Simplified Chinese
These are starter packs that can be built upon for a specific deployment
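The five sentiment levels named above can be illustrated with a very small word-list scorer. This is a deliberately naive sketch: the word weights, the thresholds and the classify function are hypothetical, and a linguistic rule set such as the Voice of Customer pack would handle negation, context and grammar that this example ignores.

```python
# Hypothetical word weights; positive words score above zero, negative words below.
LEXICON = {
    "love": 2, "great": 2, "enjoy": 1, "good": 1,
    "slow": -1, "issue": -1, "terrible": -2, "hate": -2,
}

def classify(text):
    """Sum the word weights and map the total onto the five sentiment levels."""
    words = [word.strip(".,!?").lower() for word in text.split()]
    score = sum(LEXICON.get(word, 0) for word in words)
    if score >= 2:
        return "Strong positive"
    if score == 1:
        return "Weak positive"
    if score == 0:
        return "Neutral"
    if score == -1:
        return "Weak negative"
    return "Strong negative"

comments = [
    "I love this laptop, great screen!",
    "I enjoy the keyboard.",
    "The box arrived on Monday.",
    "Startup is slow.",
    "Terrible support, I hate it.",
]
labels = [classify(comment) for comment in comments]
```

Each comment ends up with exactly one of the five labels, which is the structured value a report can then count and chart.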
66. Understanding Sentiment
“Sentiment analysis or opinion
mining refers to the application of
natural language processing,
computational linguistics, and text
analytics to identify and extract
subjective information in source
materials.”
-- Wikipedia
67. Voice of the Customer
Apply text data processing to
enhance customer service and
satisfaction by understanding
customer opinions on blogs, forum
postings, and social media.
68. Social Media is Noisy
“The challenge lies in identifying
statistically valid data related to specific
business priorities from the mountain of
available content. You don’t want to
overthrow a key marketing campaign
because a few bloggers write snide
things.”
-- Leslie Owens, Text Analytics Takes
Business Insight To New Depths
socialimplications.com
69. Your Best Customer May Be Your Worst Enemy
When Unhappy Customers Strike
Back on the Internet
Double Deviation – customers have
been victims of not only a product or
service failure, but also failed
resolutions
Betrayal – primary driver of what causes
customers to complain online
-- Thomas M. Tripp and Yany Grégoire,
MIT Sloan Management Review
70. Opinions Do Matter
“78% of consumers trust peer
recommendations.”
-- The Broad Reach of Social Technologies,
Forrester Research
75. “Computer” in the Most Mentions Concepts report
76. “Enjoy” stance in the Positive Sentiments
77. “False” and “Issue” stances in the Negative Sentiments
78. Drilling down to further understand the complete context
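The reports shown on the preceding slides (most mentioned concepts, positive and negative sentiments, and the drill-down to context) amount to counting and filtering the rows that extraction produces. The sketch below assumes a hypothetical row layout of (document id, entity type, extracted text) and invented sample rows; it does not reproduce the actual report definitions.

```python
from collections import Counter

# Hypothetical extraction output: (document id, entity type, extracted text).
rows = [
    (1, "CONCEPT", "computer"),
    (1, "POSITIVE", "enjoy"),
    (2, "CONCEPT", "computer"),
    (2, "NEGATIVE", "issue"),
    (3, "CONCEPT", "battery"),
    (3, "NEGATIVE", "false"),
    (4, "CONCEPT", "computer"),
    (4, "NEGATIVE", "issue"),
]

def top_mentions(rows, entity_type, n=3):
    """Count how often each extracted text occurs for one entity type."""
    counts = Counter(text for _, kind, text in rows if kind == entity_type)
    return counts.most_common(n)

def drill_down(rows, entity_type, text):
    """List the documents behind one bar of the report."""
    return sorted({doc for doc, kind, value in rows if kind == entity_type and value == text})

concept_report = top_mentions(rows, "CONCEPT")
negative_report = top_mentions(rows, "NEGATIVE")
issue_documents = drill_down(rows, "NEGATIVE", "issue")
```

Drilling down returns the document ids behind a count, so the full comment text can be read in context.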
79. The data flow in the Data Services Designer