NLP and Graph Databases in Lumify
Charlie Greenbacker & Joe Kerner
Agenda
Introductions
Lumify Overview
Natural Language Processing
Graph Databases
photo: Columbia Pictures
About me: @greenbacker
Theories: popular tripe
Methods: sloppy
Conclusions: highly questionable
Best reason for
not finishing PhD
@ExploreAltamira
Lumify is an open source big data analysis and visualization platform built by Altamira engineers.
Key Lumify Concepts
Ontology: structure for organizing information (i.e., your data model)
Entities: any “thing” you want to represent (e.g., person, place, event)
Relationships: a link between two entities (e.g., leader-of, works-for, sibling-of)
Properties: data about an entity (e.g., first name, last name, date of birth)
Graph: collection of entities and the relationships between them
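The concepts above can be sketched as a small in-memory data model. This is only an illustration of how the five terms relate to one another, not Lumify's actual classes or storage layout; every class name, field name, and sample value below is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Any "thing" in the graph; `concept` names its type in the ontology."""
    id: str
    concept: str
    properties: dict = field(default_factory=dict)  # data about the entity


@dataclass
class Relationship:
    """A labeled, directed link between two entities."""
    source: str
    label: str
    target: str


@dataclass
class Graph:
    """A collection of entities and the relationships between them."""
    ontology: dict
    entities: dict = field(default_factory=dict)
    relationships: list = field(default_factory=list)

    def add_entity(self, entity):
        # The ontology (the data model) decides which concepts are allowed.
        if entity.concept not in self.ontology["concepts"]:
            raise ValueError(f"unknown concept: {entity.concept!r}")
        self.entities[entity.id] = entity

    def relate(self, source, label, target):
        if label not in self.ontology["relationships"]:
            raise ValueError(f"unknown relationship: {label!r}")
        if source not in self.entities or target not in self.entities:
            raise KeyError("both entities must be added before relating them")
        self.relationships.append(Relationship(source, label, target))

    def neighbors(self, entity_id):
        """Return (label, entity) pairs one hop away from entity_id."""
        return [(r.label, self.entities[r.target])
                for r in self.relationships if r.source == entity_id]


# Hypothetical ontology and sample data, for illustration only.
ONTOLOGY = {"concepts": {"person", "organization"},
            "relationships": {"leader-of", "works-for"}}

graph = Graph(ONTOLOGY)
graph.add_entity(Entity("p1", "person", {"first name": "Ada", "last name": "Example"}))
graph.add_entity(Entity("o1", "organization", {"name": "Example Org"}))
graph.relate("p1", "leader-of", "o1")
```

Calling `graph.neighbors("p1")` walks one hop from the person to the organization, which is the kind of traversal a graph view builds on.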
Live Demo
Who can Lumify help?
Intelligence Analyst
Lumify helps analysts fuse structured and unstructured data from myriad sources into actionable intelligence.
Police Investigator
Law enforcement personnel can use Lumify to explore criminal networks, uncover hidden connections, and develop leads.
Financial Analyst
Lumify analyzes financial data and transaction records to help detect fraud and identify possible insider threats.
photo: Ken Teegardin (https://flic.kr/p/9rn9Yh)
Research Staff
Scientists, law firms, news organizations, and others can track their research in Lumify to unearth latent knowledge and discover critical new insights.
photo: UK National Archives (http://bit.ly/1n9dhR8)
Why Lumify?
Free and Open Source
•  Distributed under the permissive Apache 2.0 license
•  No restrictions on modifications
•  No licensing or usage constraints
Built on Scalable Open Source Tech
•  Hadoop CDH 4
•  Accumulo
•  ElasticSearch
•  tesseract
•  CLAVIN
•  CMU Sphinx
•  OpenNLP
•  OpenCV
•  ffmpeg
•  Apache Storm
•  Secure Graph
•  custom code
Highly Secure
•  Separate security restrictions at the entity, property, and relationship level
•  Implemented in and enforced by Accumulo cell-level security
Joaquin Guzman Loera
DOB: 1957-04-04
POB: Badiraguarto
Nationality: Mexican
Zarka de Mexico
Founded: 2010-01-11
Location: Mexico City
Employees: 121
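A rough sketch of what property-level restriction means in practice: each stored value carries its own visibility label, and a reader only sees the values their authorizations satisfy. Accumulo's real visibility labels are boolean expressions that can combine tokens with AND and OR; the simplified model below assumes every token on a label is required. The `Cell` class, the `visible_cells` helper, and the `analyst` and `supervisor` tokens are all hypothetical, as are the labels attached to the slide's sample values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """One stored value plus the authorizations required to read it."""
    owner: str           # entity or relationship the value belongs to
    name: str            # property name
    value: str
    required: frozenset  # hypothetical label: every token here is required


def visible_cells(cells, authorizations):
    """Return only the cells whose required tokens the reader holds."""
    auths = frozenset(authorizations)
    return [c for c in cells if c.required <= auths]


# Sample values from the slide, with made-up visibility labels.
cells = [
    Cell("person-1", "Nationality", "Mexican", frozenset()),
    Cell("person-1", "DOB", "1957-04-04", frozenset({"analyst"})),
    Cell("person-1", "POB", "Badiraguarto", frozenset({"analyst", "supervisor"})),
]

public_view = [c.name for c in visible_cells(cells, [])]
analyst_view = [c.name for c in visible_cells(cells, ["analyst"])]
```

Two readers querying the same entity get different property sets, with no second copy of the data: the filtering happens per value rather than per record.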
Supported
•  Full-time development staff
•  Custom development and customization services
•  Commercial support offerings
AWS Compatible
•  Day-to-day development done on Amazon infrastructure
•  Primarily use EC2, VPC, S3, SES, CloudWatch
•  Altamira is an AWS consulting partner
Natural Language Processing in Lumify
Text Extraction
[Diagram: video, text docs, structured data, images, and audio feed a text extractor; images go through OCR (tesseract), audio through CMU Sphinx, and video through both CMU Sphinx and OCR (tesseract).]
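One way to picture the text extraction step is as a routing table from media type to the stages that turn it into text. The sketch below assumes that reading of the diagram (in particular, that video passes through both speech recognition and OCR) and uses stub stages in place of tesseract and CMU Sphinx. The stage names, `plan_extraction`, `extract_text`, and `STUB_STAGES` are hypothetical, not Lumify's API.

```python
# Hypothetical stage names; real stages would wrap tesseract, CMU Sphinx, etc.
ROUTES = {
    "image": ["ocr"],
    "audio": ["speech-to-text"],
    "video": ["speech-to-text", "ocr"],
    "text": ["extractor"],
    "structured": ["extractor"],
}


def plan_extraction(media_type):
    """Return the ordered list of stages for a media type."""
    try:
        return ROUTES[media_type]
    except KeyError:
        raise ValueError(f"no extraction route for media type: {media_type!r}")


def extract_text(media_type, payload, stages):
    """Run each planned stage on the payload and join the text they return."""
    pieces = [stages[name](payload) for name in plan_extraction(media_type)]
    return "\n".join(p for p in pieces if p)


# Stub stages that only describe what a real tool would produce.
STUB_STAGES = {
    "ocr": lambda payload: f"[ocr text from {payload}]",
    "speech-to-text": lambda payload: f"[transcript of {payload}]",
    "extractor": lambda payload: f"[text of {payload}]",
}

video_text = extract_text("video", "clip.mp4", STUB_STAGES)
```

Keeping the routes in a table means a new media type or tool is a one-line change, which suits a pipeline assembled from interchangeable open source components.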
Text Enrichment
•  Apache OpenNLP
   •  Named Entity Recognition
   •  Extracts names of entities from unstructured text
   •  Persons, Orgs, & Locations
   •  Highlighted in preview text
   •  User must confirm/resolve
•  CLAVIN
   •  Geospatial Entity Resolution
   •  Resolves extracted location names to gazetteer records
   •  Solves “Springfield problem”
   •  Disambiguates place names
   •  Turns text docs into maps!
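The two enrichment steps, spotting a place name and then resolving it to a single gazetteer record, can be illustrated with a toy example of the "Springfield problem". This is not how OpenNLP or CLAVIN work internally: the name spotting here is a plain dictionary lookup standing in for a trained NER model, and the disambiguation heuristic (prefer the candidate whose state is mentioned in the same text) is an assumption chosen for brevity. `GAZETTEER`, `extract_locations`, and `resolve` are hypothetical names.

```python
import re

# Toy gazetteer: place name -> candidate records (illustrative, not CLAVIN data).
GAZETTEER = {
    "Springfield": [
        {"name": "Springfield", "admin1": "Illinois"},
        {"name": "Springfield", "admin1": "Massachusetts"},
        {"name": "Springfield", "admin1": "Missouri"},
    ],
}


def extract_locations(text):
    """Stand-in for NER: find known place names by exact, whole-word match."""
    return [name for name in GAZETTEER
            if re.search(rf"\b{re.escape(name)}\b", text)]


def resolve(name, text):
    """Pick the candidate whose state is mentioned in the text, if exactly one is."""
    candidates = GAZETTEER.get(name, [])
    matches = [c for c in candidates
               if re.search(rf"\b{re.escape(c['admin1'])}\b", text)]
    # None means the text alone cannot decide; leave it for a human to resolve.
    return matches[0] if len(matches) == 1 else None


text = "The conference moved from Boston to Springfield, Massachusetts."
resolved = {name: resolve(name, text) for name in extract_locations(text)}
```

When the surrounding text gives no hint, `resolve` returns `None` rather than guessing, mirroring the slide's point that the user must confirm or resolve what the machine extracts.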
Enriched Text Documents
Machine-powered entity extraction and resolution, combined with human QA and supplementation, supports rich semantic analysis of raw text.
Drug Lord “El Chapo” Captured in Mexico
PUBLISHED DATE: 2014/02/22
SOURCE: Wikipedia
Audit
Add Property
Although Guzman had long hidden successfully in remote areas of the
Sierra Madre mountains, the arrested members of his security team told
the military he had begun venturing out to Culiacan and the beach town of
Mazatlan. A week prior to his capture, Guzman and Zambada were
reported to have attended a family reunion in Sinaloa. The Mexican military
followed the bodyguards’ tips to Guzman’s ex-wife’s house, but they had
trouble ramming the steel-reinforced front door, which allowed Guzman to
escape through a system of secret tunnels that connected six houses,
eventually moving south to Mazatlan. He planned to stay a few days in
Mazatlan to see his twin baby daughters before retreating to the
mountains.

On 22 February 2014, at around 6:40 a.m., Mexican authorities arrested
Guzman at a hotel in a beach front area in Mazatlan, Sinaloa, following an
operation by the Mexican Navy, with joint intelligence from the DEA and
Benefits to Users
Increases Discoverability: quickly find relevant data without reading
Helps Deal with Information Overload: machines process text faster than humans
Uncovers Hidden Connections: enables object-based analysis & investigations
Future NLP Integration
Support other NER tools: e.g., Stanford NER, SUTime, MITIE
Event/Relationship Extraction: e.g., OpenIE (formerly ReVerb)
Coreference Resolution: augmenting/extending GATE/ANNIE
Additional Text Analytics: e.g., frequency analysis, topic modeling, sentiment analysis
Multilingual Support: use non-English language models for NER, etc.
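Of the additional text analytics listed, frequency analysis is the simplest to illustrate. The sketch below counts terms in a document after dropping a few common words; the tokenizer, the stopword list, and the `term_frequencies` name are assumptions made for the example and say nothing about how Lumify would implement the feature.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = frozenset({"the", "a", "of", "in", "to", "and"})


def term_frequencies(text, stopwords=STOPWORDS):
    """Count lowercase word tokens, skipping stopwords."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in stopwords)


freqs = term_frequencies(
    "Guzman had long hidden in the mountains. Guzman moved south to Mazatlan."
)
top_term = freqs.most_common(1)[0]
```

Ranking terms this way gives a quick signal of what a document is about before any heavier analysis such as topic modeling or sentiment analysis.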
Graph Databases in Lumify
view part 2 of the presentation here:
github.com/altamiracorp/secure-graph-presentation
Questions?
more info: lumify.io
