Big Data Flows vs. Wicked Leaks
Jeff Jonas, IBM Distinguished Engineer
Chief Scientist, IBM Entity Analytics
[email_address]
December 1, 2010
Background
Data Volumes Exploding
“Every two days now we create as much information as we did from the dawn of civilization up until 2003.” - Eric Schmidt, CEO, Google
Big Data Flows: How Many Copies?
Organizations Are Getting Dumber
[Figure labels: Time, Computing Power Growth, Available Observation Space, Context, Sensemaking Algorithms, Enterprise Amnesia]
No Context [email_address]
Information in Context … and Accumulating
[Figure labels: Top 200 Customer, Job Applicant, Identity Thief, Termination “No-Rehire”, [email_address]]
Demonstration
Is This Voter Deceased?

VOTER: George F Balston; YOB: 1951; D/L: 4801; 13070 SW Karen Blvd Apt 7, Beaverton, OR 97005; Last voted: 2008
DECEASED PERSON: George Balston; YOB: 1951; SSN: 5598; DOD: 1995

When it comes to best practices in voter matching, a match on name and year of birth alone is insufficient proof that two records describe the same person. Many different people in the U.S. share a name and year of birth, so human review is required. Unfortunately, there are thousands and thousands of cases just like this, and state election offices don’t have the staff (or budget) to manually review such volumes.
Now Consider This Tertiary DMV Record

VOTER: George F Balston; YOB: 1951; D/L: 4801; 13070 SW Karen Blvd Apt 7, Beaverton, OR 97005; Last voted: 2008
DECEASED PERSON: George Balston; YOB: 1951; SSN: 5598; DOD: 1995
DMV: George F Balston; YOB: 1951; SSN: 5598; D/L: 4801; 3043 SW Clementine Blvd Apt 210, Beaverton, OR 97005

The DMV record contains enough features to match the voter record (name, year of birth and driver’s license), the deceased person record (name, year of birth and SSN), or both. For the sake of argument, let’s say it matches the voter best.
Is This Voter/DMV Person Deceased?

VOTER: George F Balston; YOB: 1951; D/L: 4801; 13070 SW Karen Blvd Apt 7, Beaverton, OR 97005; Last voted: 2008
DMV: George F Balston; YOB: 1951; SSN: 5598; D/L: 4801; 3043 SW Clementine Blvd Apt 210, Beaverton, OR 97005
DECEASED PERSON: George Balston; YOB: 1951; SSN: 5598; DOD: 1995

The voter/DMV record now shares a name, year of birth and SSN with the deceased person record. In voter matching best practices, this evidence would be sufficient to make a determination that this voter is in fact deceased. This case no longer needs human review.
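
The reasoning on these slides can be sketched in a few lines of Python. This is an illustrative sketch only, not the matching method used in the talk or in any IBM product. The `Record` class and the `name_key`, `decide` and `merge` helpers are hypothetical names, and the rule "name and year of birth plus one shared identifier is a match" is an assumption drawn from the slide text. The `ssn` and `dl` fields hold the partial values shown on the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    source: str
    name: str
    yob: int
    ssn: Optional[str] = None  # partial value as shown on the slide
    dl: Optional[str] = None   # partial value as shown on the slide


def name_key(name: str) -> tuple:
    # Assumption: compare first and last name only, ignoring middle initials.
    parts = name.lower().split()
    return (parts[0], parts[-1])


def decide(a: Record, b: Record) -> str:
    # Name and year of birth must agree before anything else is considered.
    if name_key(a.name) != name_key(b.name) or a.yob != b.yob:
        return "no match"
    # A shared identifier (SSN or driver's license) upgrades the evidence.
    shared_id = (a.ssn is not None and a.ssn == b.ssn) or (
        a.dl is not None and a.dl == b.dl
    )
    return "match" if shared_id else "human review"


def merge(a: Record, b: Record) -> Record:
    # The merged entity keeps every feature either record contributed.
    return Record(
        source=f"{a.source}+{b.source}",
        name=a.name,
        yob=a.yob,
        ssn=a.ssn or b.ssn,
        dl=a.dl or b.dl,
    )


voter = Record("VOTER", "George F Balston", 1951, dl="4801")
deceased = Record("DECEASED", "George Balston", 1951, ssn="5598")
dmv = Record("DMV", "George F Balston", 1951, ssn="5598", dl="4801")

print(decide(voter, deceased))      # human review
voter_dmv = merge(voter, dmv)       # voter and DMV share a driver's license
print(decide(voter_dmv, deceased))  # match
```

The voter record alone lacks an SSN, so it cannot be resolved against the deceased person record. Once the DMV record contributes its SSN to the merged entity, the same comparison succeeds without a reviewer.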
Context Accumulates

VOTER: George F Balston; YOB: 1951; D/L: 4801; 13070 SW Karen Blvd Apt 7, Beaverton, OR 97005; Last voted: 2008
DMV: George F Balston; YOB: 1951; SSN: 5598; D/L: 4801; 3043 SW Clementine Blvd Apt 210, Beaverton, OR 97005
DECEASED PERSON: George Balston; YOB: 1951; SSN: 5598; DOD: 1995

As features accumulate, it becomes easier to match future identity records. As events and transactions accumulate, detection of relevance improves. Here we can see that George, who died in 1995, voted in 2008.
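
Once the records resolve to one entity, noticing the relevant event is a simple check. The following minimal sketch assumes a hypothetical `Entity` class and `events_after_death` helper (neither comes from the talk) and compares years only, since the slides give no finer dates.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Entity:
    name: str
    year_of_death: Optional[int] = None
    # Each event is a (kind, year) pair accumulated from any source.
    events: List[Tuple[str, int]] = field(default_factory=list)


def events_after_death(entity: Entity) -> List[Tuple[str, int]]:
    # Without a known year of death there is nothing to flag yet.
    if entity.year_of_death is None:
        return []
    return [(kind, year) for kind, year in entity.events
            if year > entity.year_of_death]


george = Entity("George F Balston")
george.events.append(("voted", 2008))
before = events_after_death(george)  # nothing to flag: death not yet known

george.year_of_death = 1995          # learned from the deceased person record
after = events_after_death(george)
print(after)                         # [('voted', 2008)]
```

The 2008 vote only becomes relevant after the deceased person record joins the entity, which is the sense in which accumulated context improves detection.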
Flows vs. Leaks
Wicked Leaks, Prediction
Protecting Big Data from Wicked Leaks
