Generating Dynamic Social Networks from Large Scale Unstructured Data
Enterprise Software to Make Sense of Really Junky Data

Tim Estes - CEO, Digital Reasoning
 What We’ll Discuss

• What is a social network?
   • The web of relationships between entities that influences actions

• Why does it matter?
   • To reference Aesop: “You are known by the company you keep.”

• What’s required to build one algorithmically?
   • What’s similar, what’s the same, what’s connected
 What’s similar?

We use patented algorithms for deducing related terms from the data…

Bush                  White House          Nashville       Justin Timberlake  Britney Spears
--------------------  -------------------  --------------  -----------------  --------------
president bush        house                tenn            miley cyrus        britney spears
president george w    gov                  the predators   pussycat dolls     the album
administration        white                predators       bob dylan          x factor
bush administration   clinton              oakland         nine inch nails    my friends
george                the administration   milwaukee       rock star          mtv
george w              president-elect      st louis        the timberwolves   madonna
george bush           barack obama         carolina        sean preston       lady gaga
brown                 barack               a season        lanarkshire        singer
american              president george w   baltimore       ticket prices      a student
clinton                                    kentucky        nme
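
The deck does not disclose the patented algorithms behind these lists. As a generic stand-in, one common way to surface related terms is to compare co-occurrence vectors with cosine similarity. The sketch below assumes that approach; the tiny corpus `DOCS` and the function names are made up for illustration and are not Digital Reasoning's method.

```python
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_vectors(docs):
    """Map each term to a Counter of the terms that share a document with it."""
    vectors = defaultdict(Counter)
    for doc in docs:
        terms = set(doc)
        for term in terms:
            for other in terms:
                if term != other:
                    vectors[term][other] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def related_terms(term, vectors, top_n=3):
    """Rank the other terms by how similar their co-occurrence vector is to `term`'s."""
    target = vectors.get(term, Counter())
    scores = [
        (other, cosine(target, vec))
        for other, vec in vectors.items()
        if other != term
    ]
    # Highest similarity first; ties broken alphabetically for stable output.
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores[:top_n]


# Hypothetical toy corpus: each document is already reduced to a list of terms.
DOCS = [
    ["bush", "white house", "administration"],
    ["george bush", "white house", "administration"],
    ["bush", "president", "administration"],
    ["george bush", "president", "white house"],
    ["nashville", "predators", "tenn"],
    ["nashville", "tenn", "kentucky"],
]

if __name__ == "__main__":
    vectors = cooccurrence_vectors(DOCS)
    for term, score in related_terms("bush", vectors):
        print(f"{term}: {score:.2f}")
```

Terms that never share a neighbour, such as "bush" and "nashville" in this toy corpus, score zero, which is what keeps the columns in a table like the one above separate.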
 What’s the same?

Concept resolution:
  Roll up similar things into groups of the same (again, algorithmically)




                               Example: Tony Blair
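
To make "rolling up" concrete, here is a minimal sketch of one simple way to group name variants: strip titles, then merge mentions whose remaining tokens are a subset of another mention's tokens, using a union-find structure. This is an assumed stand-in, not the resolution algorithm the deck refers to; the `TITLES` list, the mention strings, and the function names are hypothetical.

```python
# Hypothetical list of honorifics and role words to ignore when comparing names.
TITLES = {"mr", "mrs", "ms", "dr", "prime", "minister", "president"}


def tokens(mention):
    """Lower-case a mention, drop periods and titles, return the remaining words."""
    words = mention.lower().replace(".", "").split()
    return frozenset(w for w in words if w not in TITLES)


def resolve(mentions):
    """Group mentions where one mention's tokens are contained in another's."""
    parent = {m: m for m in mentions}

    def find(m):
        # Follow parent links to the group representative, compressing the path.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for i, a in enumerate(mentions):
        for b in mentions[i + 1:]:
            ta, tb = tokens(a), tokens(b)
            if ta and tb and (ta <= tb or tb <= ta):
                parent[find(a)] = find(b)

    groups = {}
    for m in mentions:
        groups.setdefault(find(m), []).append(m)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical mentions as they might appear across documents.
MENTIONS = [
    "Tony Blair",
    "Mr. Blair",
    "Prime Minister Tony Blair",
    "Blair",
    "George Bush",
    "President Bush",
]

if __name__ == "__main__":
    for group in resolve(MENTIONS):
        print(group)
```

A rule this naive would also merge two different people who share a surname, so a real system would need context (co-occurring entities, dates, sources) before treating "Blair" as the same concept as "Tony Blair".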
 What’s connected?
Link analysis:
 Show who and what are connected (again, you guessed it, algorithmically)
                      Terrorist Leader Connections
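
As an illustration of link analysis in general, the sketch below counts how often two entities appear in the same document and then finds the shortest chain of links between two entities with a breadth-first search. It is a minimal example under stated assumptions: the documents in `DOCS` are invented (the entity names are borrowed from the Steve Jobs example later in the deck), and the function names are hypothetical rather than part of Synthesys.

```python
from collections import Counter, defaultdict, deque
from itertools import combinations


def build_links(docs):
    """Count, for every pair of entities, how many documents mention both."""
    weights = Counter()
    for entities in docs:
        for a, b in combinations(sorted(set(entities)), 2):
            weights[(a, b)] += 1
    return weights


def neighbors(weights):
    """Turn the weighted pairs into an undirected adjacency map."""
    adj = defaultdict(set)
    for a, b in weights:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def shortest_path(adj, start, goal):
    """Breadth-first search for the shortest chain of links from start to goal."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(adj.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # the two entities are not connected


# Hypothetical documents, each reduced to the entities it mentions.
DOCS = [
    ["Steve Jobs", "Walt Mossberg"],
    ["Walt Mossberg", "Kindle"],
    ["Steve Jobs", "Walt Mossberg", "Apple"],
]

if __name__ == "__main__":
    weights = build_links(DOCS)
    print(shortest_path(neighbors(weights), "Steve Jobs", "Kindle"))
```

The pair counts give a crude link strength, and the path search shows how two entities that never appear together can still be connected through a third.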
Let’s Put an Idea to the Test...

• With powerful analytics can you remove some or most of the need for a priori
  structure in designing and understanding social networks or other
  quasi-ontological schemas?
  YES

and

• Can you also do it with messy unstructured data?
  YES
But first...
       Why do we (Digital Reasoning) care?
Because it’s what we do for a living.
We make sense of the senseless.
• Our customers have critical needs
 - Digital Reasoning works primarily in the Defense and Intelligence
   Community, making sense of noisy, unstructured data and turning it
   into usable entity-centric systems supporting mission-critical
   intelligence.
• The data is big and bad
 - Little structure in content, topics all over the place, and totally different
   ontologies/schemas across the community.
• The times we live in create urgencies
 - We care because the better and faster we are at making sense of this
   kind of data, the safer our country is.
Why did we take a data-centric, deployed software model?

• Unique Environments
 - Given who our customers are... we can’t host their data. No one can.
   The solution had to be a pure deployed software model.
• Meaning in Hard to Reach Places
 - The data is basically a bunch of pieces that don’t want to be connected.
   People that don’t want to be found.
• Result?
 - Imagine trying to turn that kind of data in that type of architecture from a
   bunch of loose communication into a social network that has patterns of
   life, weightings of influence, and projections of probable future actions...
Here’s what it looks like in an architecture…
Now let’s show what can be learned with a little application of
      Entity-Oriented Analytics to a bunch of web data.
Test Case

• Web blog + Wikipedia data (collected by Fetch)
  -   6M blog URLs collected over more than 1 year
  -   16M unique blog messages
  -   No unifying theme, topic or author
  -   Tricky to get “good” big data from the open web. We ended up using 0.5% of
      the original source: 1TB became 4GB.
• No a priori structure, sparse metadata, nearly all meaning emerges
  from analysis
• Let’s see what we can find out...
Examining connections related to “Carl Icahn”



The data shows connections to and from Carl Icahn by:
• people
• periodicals
• topics
• companies

On closer examination the data tells us:
Carl Icahn “is backing” a startup company that “would build” products
related to Barack Obama
Let’s examine what connections we find to “Egypt”



Egypt is identified as a location, as an organization (country) and as an
unassigned entity with all related connections

On closer examination we see interesting connections in the blogs for Egypt,
Cairo, Issues and the phrase “powder keg”.

If we drill down into the actual blog entry we see the context of the
connections
How about connections to “Steve Jobs”?

The entities and connections in the blog data are vast, which is not
surprising. The large amount of authors and topics reflect the popularity of
Steve Jobs as a blog subject
Figure labels: Topics, Authors

One connection is interesting: “Steve Jobs” to “Walt Mossberg” to “Kindle”
Synthesys shows the reason for connection as “pricing”
Clicking on this word we see the context of the connection
Demo Platform

• Synthesys Platform Beta
   • elastic
   • user-driven
   • entity-oriented analytics on demand
Observations

• New innovations will be algorithmic and focused on turning hard-to-use
  data into dynamic, evolving knowledge that can automate
  machine execution
• Architectures/solutions will have to accommodate customers that
  don’t want to move their data to a Public Cloud
• It is a true statement... “If you can connect the dots, you can
  connect the people”
So why should you care?

• Because there is a lot of data that doesn’t belong on a shared grid,
  such as Top Secret data, Sensitive Corporate Data, and Personal
  Data.
• Because people may want to own (Personal Computing model)
  vs. rent (Mainframe model) analytics
• Because you may not want to convert your data to fit the model of
  the hosted solution or map to their ontology to get the answers
  you need.
To learn more…

• See us at:
 - Strata Science Fair (Wed evening 6:45PM)
 - Digital Reasoning Booth #305
 - www.digitalreasoning.com
Questions?



Automated Understanding, Trusted Decisions, True Intelligence

Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 

Tim Estes - Generating dynamic social networks from large scale unstructured data

  • 1. Generating Dynamic Social Networks from Large Scale Unstructured Data Enterprise Software to Make Sense of Really Junky Data Tim Estes - CEO, Digital Reasoning
  • 2.  What We’ll Discuss • What is a social network? • The web of relationships between entities that influences actions • Why does it matter? • To reference Aesop: “You are known by the company you keep.” • What’s required to build one algorithmically? • What’s similar, what’s the same, what’s connected
  • 3.  What’s similar? We use patented algorithms for deducing related terms from the data…
      Bush                  White House          Nashville       Justin Timberlake   Britney Spears
      president bush        house                tenn            miley cyrus         britney spears
      president george w    gov                  the predators   pussycat dolls      the album
      administration        white                predators       bob dylan           x factor
      bush administration   clinton              oakland         nine inch nails     my friends
      george                the administration   milwaukee       rock star           mtv
      george w              president-elect      st louis        the timberwolves    madonna
      george bush           barack obama         carolina        sean preston        lady gaga
      brown                 barack               a season        lanarkshire         singer
      american              president george w   baltimore       ticket prices       a student
      clinton                                    kentucky        nme
  • 4.  What’s the same? Concept resolution: Roll up similar things into groups of the same (again, algorithmically) Example: Tony Blair
  • 5.  What’s connected? Link analysis: Show who and what are connected (again, you guessed it, algorithmically) Terrorist Leader Connections
  • 6. Let’s Put an Idea to the Test... • With powerful analytics can you remove some or most of the need for a priori structure in designing and understanding social networks or other quasi-ontological schemas? YES • And can you also do it with messy unstructured data? YES
  • 7. But first... Why do we (Digital Reasoning) care?
  • 8. Because it’s what we do for a living. We make sense of the senseless. • Our customers have critical needs - Digital Reasoning works primarily in the Defense and Intelligence Community making sense of noisy, unstructured data and turning it into usable entity-centric systems supporting mission-critical intelligence. • The data is big and bad - Little structure in content, topics all over the place, and totally different ontologies/schemas across the community. • The times we live in create urgencies - We care because the better and faster we are at making sense of this kind of data, the safer our country is.
  • 9. Why did we take a data-centric, deployed software model? • Unique Environments - Given who our customers are... we can’t host their data. No one can. The solution had to be a pure deployed software model. • Meaning in Hard to Reach Places - The data is basically a bunch of pieces that don’t want to be connected. People that don’t want to be found. • Result? - Imagine trying to turn that kind of data in that type of architecture from a bunch of loose communication into a social network that has patterns of life, weightings of influence, and projections of probable future actions...
  • 10. Here’s what it looks like in an architecture…
  • 11. Now let’s show what can be learned with a little application of Entity-Oriented Analytics to a bunch of web data.
  • 12. Test Case • Web Blog + Wikipedia data (collected by Fetch) - 6M blog URLs collected over 1+ years - 16M unique blog messages - no unifying theme, topic or author - tricky to get “good” big data from the open web; ended up using 0.5% of the original source. 1TB became 4GB. • No a priori structure, sparse metadata, nearly all meaning emerges from analysis • Let’s see what we can find out...
  • 13. Examining connections related to “Carl Icahn”. The data shows connections to and from Carl Icahn by: • people • periodicals • topics • companies. On closer examination the data tells us: Carl Icahn “is backing” a startup company that “would build” products related to Barack Obama.
  • 14. Let’s examine what connections we find to “Egypt”. Egypt is identified as a location, as an organization (country) and as an unassigned entity with all related connections. On closer examination we see interesting connections in the blogs for Egypt, Cairo, Issues and the phrase “powder keg”. If we drill down into the actual blog entry we see the context of the connections.
  • 15. How about connections to “Steve Jobs”? (Panels: Topics, Authors.) The entities and connections in the blog data are vast – which is not surprising. The large amount of authors and topics reflect the popularity of Steve Jobs as a blog subject. One connection is interesting: “Steve Jobs” to “Walt Mossberg” to “Kindle”. Synthesys shows the reason for connection as “pricing”. Clicking on this word we see the context of the connection.
  • 16. Demo Platform • Synthesys Platform Beta • elastic • user-driven • entity-oriented analytics on demand
  • 17. Observations • New innovations will be algorithmic and focused on turning hard-to-use data into dynamic, evolving knowledge that can automate machine execution • Architectures/solutions will have to accommodate customers that don’t want to move their data to a Public Cloud • It is a true statement... “If you can connect the dots, you can connect the people”
  • 18. So why should you care? • Because there is a lot of data that doesn’t belong on a shared grid, such as Top Secret data, Sensitive Corporate Data, and Personal Data. • Because people may want to own (Personal Computing model) vs. rent (Mainframe model) analytics • Because you may not want to convert your data to fit the model of the hosted solution or map to their ontology to get the answers you need.
  • 19. To learn more… • See us at: - Strata Science Fair (Wed evening 6:45PM) - Digital Reasoning Booth #305 - www.digitalreasoning.com
  • 20. Questions? Automated Understanding, Trusted Decisions, True Intelligence
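
As a rough illustration of two of the building blocks named in slides 4 and 5 (concept resolution and link analysis), here is a toy Python sketch. It is not Digital Reasoning's patented method or the Synthesys API: the sample documents, the `ALIASES` table, and the function names are all hypothetical, and simple substring matching plus document-level co-occurrence counting stand in for whatever the real system does. The "what's similar" step from slide 3 is not modeled here.

```python
from collections import defaultdict
from itertools import combinations

# Toy documents, made up purely for illustration.
DOCS = [
    "president bush met tony blair at the white house",
    "prime minister blair praised president bush",
    "tony blair visited nashville",
]

# Hypothetical alias table: a stand-in for concept resolution,
# mapping surface forms to one canonical entity name.
ALIASES = {
    "tony blair": "Tony Blair",
    "prime minister blair": "Tony Blair",
    "president bush": "Bush",
    "white house": "White House",
    "nashville": "Nashville",
}


def resolve_entities(text):
    """Return the canonical entities whose alias occurs in the text."""
    return {canon for alias, canon in ALIASES.items() if alias in text}


def build_links(docs):
    """Count, per entity pair, how many documents mention both entities."""
    links = defaultdict(int)
    for doc in docs:
        # Sorting gives each pair one stable (a, b) key.
        for a, b in combinations(sorted(resolve_entities(doc)), 2):
            links[(a, b)] += 1
    return dict(links)


def neighbors(links, entity):
    """List entities linked to `entity`, strongest link first."""
    scored = []
    for (a, b), weight in links.items():
        if entity == a:
            scored.append((b, weight))
        elif entity == b:
            scored.append((a, weight))
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    links = build_links(DOCS)
    for name, weight in neighbors(links, "Tony Blair"):
        print(name, weight)
```

In this sketch both "tony blair" and "prime minister blair" roll up to the same node before any links are counted, which is why resolution has to happen ahead of link analysis: otherwise the connection to "Bush" would be split across two separate nodes.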