Big Data Analytics: Facts and Feelings
1. Big Data Analytics: Facts and Feelings
Seth Grimes
Alta Plana Corporation
@sethgrimes
TDWI – Washington DC
June 21, 2013
2. Big Data Analytics: Facts and Feelings
Theses
• We gain knowledge when we make connections.
• Data analysis is a process of connection discovery.
• The more data, the greater the possibilities.
• The more data, the greater the need to filter, reduce, and contextualize.
• Timeliness counts.
3. Big Data Analytics: Facts and Feelings
The World of Big Data
Machine data (e.g., logs, sensor outputs,
clickstreams).
Actions, interactions, and transactions:
geolocation and time.
Profiles: individual, demographic & behavioral.
Text, audio, images, and video.
Facts and feelings.
5. Big Data Analytics: Facts and Feelings
Imperatives for the 2010s:
Do more with more.
“It’s Not Information Overload. It’s Filter
Failure”: Clay Shirky, 2008.
• More sources & types of data.
• Greater data volumes.
• New hardware and methods.
Automate more, more intelligently.
• Analytics.
• Semantics.
Engage. Socialize.
6. Big Data Analytics: Facts and Feelings
I see three categories of data:
1. Quantities, whether measured, observed, or computed.
2. Content, which I’ll characterize as non-quantitative information.
3. Metadata (semantic & structural) describing quantities and content.
• Our concern is content, analytics & fusion.
• Structured/unstructured is a false
dichotomy.
• Where do relationships fit?
7. Big Data Analytics: Facts and Feelings
http://www.businessweek.com/magazine/content/04_19/b3882029_mz072.htm
En route.
9. Big Data Analytics: Facts and Feelings
Of course not. It’s a number, not data.
Size (Volume) is only one Big Data factor.
Other factors (standard definition) are
Velocity and Variety.
I reject reVisionist 3Vs extensions such as:
Variation/Variability.
Veracity.
Value.
These factors are the province of analytics.
10. Big Data Analytics: Facts and Feelings
Gary King, Harvard Univ. –
“Big Data isn’t about the data. It’s about
analytics.”
Me –
Analytics is a collection of tools and
techniques that extract insights from data.
I’d argue –
The Value in Big Data is in content, patterns,
and connections, derived via analytics.
11. Big Data Analytics: Facts and Feelings
Variability is an interpretive property. I say –
The sense of Big Data is in context and
intent…of both the data producer and the
data consumer, captured in metadata and
(also) derived via analytics.
12. Big Data Analytics: Facts and Feelings
As for Veracity, data is data. Consider:
“The Iraqi regime… possesses and produces
chemical and biological weapons.” –
George W. Bush, October 7, 2002.
13. Big Data Analytics: Facts and Feelings
Data… and more. Is this Big Data?
No, it’s a screen of aggregated query results.
14. Big Data Analytics: Facts and Feelings
The Big Data is behind it.
http://www.newyorker.com/online/blogs/culture/2012/05/google-knowledge-graph.html
15. Big Data Analytics: Facts and Feelings
And behind comparably-scaled/structured
systems.
16. Big Data Analytics: Facts and Feelings
Comparably-scaled/structured systems?
http://www.cambridgesemantics.com/semantic-university/semantic-search-and-the-semantic-web
17. Big Data Analytics: Facts and Feelings
Graphs model language; language models relationships.
18. Big Data Analytics: Facts and Feelings
Another view, using an old GATE image.
http://gate.ac.uk/hamish/talks/ibot-slidy.html
20. Big Data Analytics: Facts and Feelings
Text analytics applies natural-language
processing (NLP) techniques to discern –
Entities
Relationships
Context
Identity
– and get at the sense of “unstructured”
online, social, and enterprise information.
Semantic identity unites data of all types.
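To make "entities" and "relationships" concrete, here is a minimal sketch in Python. It is a toy, not how production text analytics works: real systems rely on trained NLP models, while this one assumes that capitalized word runs are entities and that only three hard-coded verbs signal a relationship. The patterns, function names, and sample sentence are all hypothetical.

```python
import re

# Assumption: a run of capitalized words is a candidate entity.
ENTITY = r"[A-Z][a-z]+(?: [A-Z][a-z]+)*"
# Assumption: only these three verbs express a relationship.
RELATION = re.compile(rf"({ENTITY}) (acquired|founded|joined) ({ENTITY})")

def extract_entities(text):
    """Return candidate entities, de-duplicated, in order of first appearance."""
    seen = []
    for match in re.findall(ENTITY, text):
        if match not in seen:
            seen.append(match)
    return seen

def extract_relations(text):
    """Return (subject, verb, object) triples for the hard-coded verbs."""
    return [m.groups() for m in RELATION.finditer(text)]

# Hypothetical sample text, invented for illustration.
text = "Acme Corp acquired Initech. Jane Doe founded Acme Corp."
print(extract_entities(text))
print(extract_relations(text))
```

The triples are the useful part: once "Acme Corp" is resolved to a single identity, each triple becomes an edge that can link text-derived facts to records from other sources.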
21. Big Data Analytics: Facts and Feelings
http://searchuserinterfaces.com/
Sensemaking:
“It is convenient to divide the entire
information access process into two
main components: information
retrieval through searching and
browsing, and analysis and
synthesis of results. This broader
process is often referred to in the
literature as sensemaking.
Sensemaking refers to an iterative
process of formulating a
conceptual representation from
a large volume of information.”
– Marti Hearst, 2009
22. Big Data Analytics: Facts and Feelings
Intelligent computing – sensemaking –
involves:
Big (and little) Data.
• Quantities.
• Content.
• Metadata.
Analytics.
Semantics.
Integration.
Facts and feelings.
23. Big Data Analytics: Facts and Feelings
Feelings: Sentiment detection, classification.
24. Big Data Analytics: Facts and Feelings
http://techpresident.com/news/21618/politico-facebook-sentiment-analysis-bogus
25. Big Data Analytics: Facts and Feelings
“Sentiment analysis is the task of identifying positive
and negative opinions, emotions, and evaluations.”
-- Wilson, Wiebe & Hoffmann, 2005, “Recognizing
Contextual Polarity in Phrase-Level Sentiment
Analysis”
“Sentiment analysis or opinion mining is the
computational study of opinions, sentiments and
emotions expressed in text… An opinion on a feature
f is a positive or negative view, attitude, emotion or
appraisal on f from an opinion holder.”
-- Bing Liu, 2010, “Sentiment Analysis and Subjectivity,” in
Handbook of Natural Language Processing
26. Big Data Analytics: Facts and Feelings
Sentiment may be of interest at multiple
levels.
Corpus / data space, i.e., across multiple sources.
Document.
Statement / sentence.
Entity / topic / concept.
Human language is noisy and chaotic!
Jargon, slang, irony, ambiguity, anaphora,
polysemy, synonymy, etc.
Context is key. Discourse analysis comes into
play.
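The levels above can be illustrated with a small Python sketch. It assumes a four-word polarity lexicon and plain substring matching for entities, both hypothetical and far cruder than real sentiment tools; it handles none of the irony, anaphora, or ambiguity just mentioned. The point is only to show how one text yields different answers at document, sentence, and entity level.

```python
import re
from collections import defaultdict

# Hypothetical mini-lexicon: +1 for positive words, -1 for negative words.
LEXICON = {"great": 1, "love": 1, "slow": -1, "terrible": -1}

def sentence_score(sentence):
    """Sum the lexicon polarities of the words in one sentence."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return sum(LEXICON.get(w, 0) for w in words)

def analyze(document, entities):
    """Score a document at document, sentence, and entity level."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", document) if s.strip()]
    by_sentence = [(s, sentence_score(s)) for s in sentences]
    by_entity = defaultdict(int)
    for s, score in by_sentence:
        for e in entities:
            # Assumption: an entity "owns" the sentiment of any sentence naming it.
            if e.lower() in s.lower():
                by_entity[e] += score
    doc_score = sum(score for _, score in by_sentence)
    return doc_score, by_sentence, dict(by_entity)

# Hypothetical review text, invented for illustration.
review = ("The camera is great. The battery is terrible and slow to charge. "
          "I love the screen!")
doc_score, by_sentence, by_entity = analyze(review, ["camera", "battery", "screen"])
print(doc_score)
print(by_sentence)
print(by_entity)
```

In this sample the document-level score nets out to zero, while the entity-level view shows a clearly negative battery and positive camera and screen, which is why the level of analysis matters.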
29. Big Data Analytics: Facts and Feelings
Prediction/Feeling/Wish... and Intent.
http://www.aiaioo.com/whitepapers/intention_analysis_use_cases.pdf
http://sentibet.com/
35. Big Data Analytics: Facts and Feelings
A Big Data analytics architecture (example).
http://www.geeklawblog.com/2011/12/lexis-advance-platform-launch-two.html