Chris Selland
VP Marketing
HP Vertica


© Copyright 2012 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.
Finding fraud in large,
diverse data sets
using big data analytics for fraud detection and prevention
Chris Selland, VP Marketing, HP Vertica
December 2012


“If you need a machine and don’t
buy it, you will ultimately find out
that you have paid for it and don’t
have it.”


Henry Ford


Fraud is a parasite to business and government
5% of gross business and government returns and operating expenses are lost to fraud

• Approaching $3.5B annually in the US for credit cards
• Medicare and Medicaid fraud is over 10X credit card fraud
• The US Treasury expects $65B in tax fraud over the next 5 years
• One banker at UBS Bank Switzerland singlehandedly stole $2B
• But the average fraud attack is $5K

… but the toll on individuals is what really matters

Every minute, 19 citizens' identities are stolen
The average victim spends 500 hours and $3,000 undoing the damage
And we as individual …
                 Taxpayers,
                     Consumers,
                         and Business owners …
                                 pay the rest of the bill

Big Data Application: Fraud Analysis
     The Problem:
     U.S. Government needed to detect patterns of fraud in federal health care
     programs

     The Solution:
     • Uses government supercomputer to detect fraud in near-real time on aggregated
       databases
     • Multiple petabytes of claims data (Medicare, Medicaid, DoD, Veterans Affairs, etc.)
     • Finds patterns to generate rules and identify anomalies
     • Boosted recovery of claims from $1 billion/year to $50 billion



Detecting fraud is a game of pattern recognition …

A simple foundation of statistical formulas …
• Bayesian …
• $B_1 X_{i1} + B_2 X_{i2} + \dots + B_n X_{in} + e_i = y_i$

… applied to financial and workflow process transactions, e.g.
• Credit card bills
• Supplier invoices
• Financial transactions
• Call records
• Claims records
• Approval chains
• …

… with modern technology we can …
• Capture transaction streams
• Build historical track records and ID anomalies
• Run analytics based on those centuries-old formulas
• … and detect fraudulent activities
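The regression formula on this slide can be turned into a simple anomaly check: estimate the coefficients B from historical transactions, then flag new transactions whose residual e_i is unusually large. The sketch below is only an illustration of that idea, not the method used in the deck. The synthetic data, the helper names (`solve_linear_system`, `fit_coefficients`, `predict`, `flag_anomalies`) and the three-standard-deviation threshold are all assumptions.

```python
import random
import statistics

def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit_coefficients(X, y):
    """Least-squares estimate of B via the normal equations (X^T X) B = X^T y."""
    n = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(n)] for i in range(n)]
    Xty = [sum(row[i] * y_i for row, y_i in zip(X, y)) for i in range(n)]
    return solve_linear_system(XtX, Xty)

def predict(row, B):
    """Fitted value B1*Xi1 + ... + Bn*Xin for one transaction."""
    return sum(b * x for b, x in zip(B, row))

def flag_anomalies(X, y, B, sigma, k=3.0):
    """True where the residual e_i = y_i - prediction exceeds k standard deviations."""
    return [abs(y_i - predict(row, B)) > k * sigma for row, y_i in zip(X, y)]

# Hypothetical history: each row is (intercept, items purchased, fraction of day elapsed)
rng = random.Random(0)
X_hist = [[1.0, float(rng.randint(1, 9)), rng.random()] for _ in range(500)]
y_hist = [5.0 + 12.0 * r[1] + 3.0 * r[2] + rng.gauss(0.0, 2.0) for r in X_hist]

B = fit_coefficients(X_hist, y_hist)
residuals = [y_i - predict(r, B) for r, y_i in zip(X_hist, y_hist)]
sigma = statistics.pstdev(residuals)

# Two new bills with identical features: one typical amount, one far above expectation
X_new = [[1.0, 3.0, 0.5], [1.0, 3.0, 0.5]]
y_new = [42.5, 400.0]
flags = flag_anomalies(X_new, y_new, B, sigma)
```

Only the second bill is flagged, because its amount sits far outside what the fitted history predicts for those features. A production system would refit as new history arrives, since the patterns keep changing.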

But combating fraud is a game of cat and mouse

The patterns keep changing
• Hackers are thwarted only to come back through a different security hole
• Novel scams are being thought up for loopholes and exceptions in business workflows

And the playing field keeps growing
• Volume and velocity of transactions
• Digitization of workflow records and approvals
• Source and type of transactions

With Vertica, you get a better mouse trap

Your proprietary systems can’t keep up
• Expensive to scale for today’s “Big Data” real-time transaction streams
• Difficult to modify legacy analytic code and architectures to keep up with changing patterns

With Vertica you stay a step ahead
• Rapidly create real-time high-speed transactional record datamarts on inexpensive platforms
• Open analytics platform for deep predictive fraudulent pattern modeling

Designed for answers from the very first line of code,
Vertica technology makes the difference
• Columnar storage and execution: Achieve best data query performance with unique Vertica column store
• Clustering: Add resources on the fly with linear scaling on the grid, commodity hardware
• Compression: Store more data, provide more views, 90% less storage required
• Continuous performance: Query and load 24x7 with zero administration
• Database design: Automated performance tuning
• Advanced analytics: Time-series, geospatial, click-stream and an SDK for more
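Why a column store tends to compress well can be illustrated with run-length encoding, one common columnar technique: the values of a single column often repeat, so long runs collapse into (value, count) pairs. This is a generic sketch with made-up data and hypothetical helper names (`rle_encode`, `rle_decode`). The slides do not say which encodings Vertica applies or how the 90% figure was measured.

```python
from itertools import groupby

def rle_encode(column):
    """Collapse consecutive repeated values into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(column)]

def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, count in pairs for _ in range(count)]

# Hypothetical transaction rows: (merchant_category, card_status)
rows = [("grocery", "active")] * 4 + [("fuel", "active")] * 3 + [("fuel", "blocked")]

# A row layout interleaves the fields; a column layout stores each field contiguously,
# which is what exposes the long runs.
category_column = [category for category, _ in rows]
status_column = [status for _, status in rows]

encoded_category = rle_encode(category_column)
encoded_status = rle_encode(status_column)
```

Here eight values per column shrink to two pairs each, and decoding restores the column exactly. Real gains depend on sort order and on how many distinct values a column has.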


Vertica architecture – every element purpose built
for pattern recognition in Big Data scenarios
     Industry Specific Fraud Detection & Prevention Use Case Scenarios


     Diagram labels: SAS · Cognos (IBM) · Hyperion (Oracle) · R-Function Library · User-Defined Analytics · MicroStrategy · Business Objects (SAP)

     Next Generation administration, cluster architecture, Standard interfaces
     True column store – RDBMS w/ columnar compression, concurrent load/query
     Real time massively parallel processing, performance and high availability
Communication Service Provider fraud

• Estimated impact of $40 billion per year worldwide
• Service providers continue to see increases in volume and variety
• Continuous improvement is necessary to alleviate fraud; the targets are constantly moving


Banking services fraud detection use case scenario
Identifying credit card fraud (skimming)

Historical reference dataset
• Credit card skimming record for merchants
• Merchant characteristics (size of store, popularity)
• Credit record for card holder
Real-time transactional data
• Credit card transaction and card status

Result: Merchant probability of skimming

Implemented for a large American bank on Vertica
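One way to read "merchant probability of skimming" is as a Bayesian posterior: the historical reference dataset supplies a prior for each merchant, and each real-time observation updates it. The sketch below applies Bayes' rule under that reading. The prior, the likelihood values, the evidence labels and the function names are invented for illustration; the slides do not describe the bank's actual model.

```python
def posterior_skimming(prior, p_evidence_given_skimming, p_evidence_given_clean):
    """Bayes' rule: P(skimming | evidence) for a single observation."""
    numerator = p_evidence_given_skimming * prior
    denominator = numerator + p_evidence_given_clean * (1.0 - prior)
    return numerator / denominator

def update_with_transactions(prior, evidence_stream, likelihoods):
    """Fold each observation into the running probability.

    Observations are treated as conditionally independent given the merchant's
    true state, which is a simplifying assumption.
    """
    p = prior
    for evidence in evidence_stream:
        p_skim, p_clean = likelihoods[evidence]
        p = posterior_skimming(p, p_skim, p_clean)
    return p

# Hypothetical prior from the historical reference dataset for this merchant
prior = 0.01

# Hypothetical likelihoods: (P(evidence | skimming), P(evidence | clean))
likelihoods = {
    "card_later_reported_fraud": (0.30, 0.02),
    "card_ok": (0.70, 0.98),
}

stream = ["card_ok", "card_later_reported_fraud", "card_later_reported_fraud"]
probability = update_with_transactions(prior, stream, likelihoods)
```

A clean card nudges the probability down slightly, while each card later reported as compromised pushes it up sharply, so a merchant with a low prior can still rise above 0.5 after a few such reports.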

Healthcare use case scenario: identifying improper
public/private insurance reimbursement payments
Historical reference dataset
• Reimbursement record for providers
• Reimbursement record for patient
• National, regional, local statistical analysis of treatment
  associated with reimbursement
• Provider characteristics (size of provider, popularity, etc.)
Real-time transactional data
• Credit card transaction and card status

Result: Provider’s probability of improper payments
Implemented for a large American insurance carrier on Vertica
Staying ahead of credit card skimming
Ability to incorporate the latest algorithms
without proprietary code, leveraging “Big Data”
and social media
• Front-end outlier detection in
  multivariate data streams
• Neural networks
• Social network analysis

Implemented analytics in TWENTY lines of code
Different use cases, data sources and industries
handled with the same pattern recognition scenario
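The first technique listed on this slide, outlier detection in multivariate data streams, can be sketched with a running mean and covariance plus a Mahalanobis-distance test. Mahalanobis distance is one common choice; the slides do not name the technique actually used. The class name, the two features (amount and transactions per hour), the threshold and the warm-up length are all assumptions, and the code is limited to two features to stay dependency-free.

```python
import random

class StreamingOutlierDetector2D:
    """Running mean/covariance over (x, y) pairs with a Mahalanobis-distance test."""

    def __init__(self, threshold=3.0, warmup=30):
        self.n = 0
        self.mean = [0.0, 0.0]
        self.m2 = [[0.0, 0.0], [0.0, 0.0]]  # running sums of co-deviations
        self.threshold = threshold
        self.warmup = warmup

    def distance(self, point):
        """Mahalanobis distance of point from the history seen so far."""
        cov = [[self.m2[i][j] / (self.n - 1) for j in range(2)] for i in range(2)]
        det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
        dx, dy = point[0] - self.mean[0], point[1] - self.mean[1]
        # d^2 = delta^T * inverse(cov) * delta, with the 2x2 inverse written out
        d2 = (cov[1][1] * dx * dx - 2 * cov[0][1] * dx * dy + cov[0][0] * dy * dy) / det
        return d2 ** 0.5

    def update(self, point):
        """Welford-style update of the mean and co-deviation sums."""
        self.n += 1
        delta = [point[i] - self.mean[i] for i in range(2)]
        for i in range(2):
            self.mean[i] += delta[i] / self.n
        delta2 = [point[i] - self.mean[i] for i in range(2)]
        for i in range(2):
            for j in range(2):
                self.m2[i][j] += delta[i] * delta2[j]

    def observe(self, point):
        """Score the point against history, then add it to history unless flagged."""
        is_outlier = self.n >= self.warmup and self.distance(point) > self.threshold
        if not is_outlier:
            self.update(point)
        return is_outlier

# Hypothetical stream: (amount, transactions per hour), where the two move together
rng = random.Random(1)
detector = StreamingOutlierDetector2D()
for _ in range(200):
    amount = rng.gauss(50.0, 10.0)
    detector.observe((amount, amount / 25.0 + rng.gauss(0.0, 0.3)))

typical = detector.observe((50.0, 2.0))     # consistent with the history
suspicious = detector.observe((70.0, 1.2))  # each value plausible alone, unlikely together
```

The second point is within about two standard deviations of the mean on each feature separately, yet it is flagged because the combination breaks the correlation seen in the history. That is the case a per-feature threshold would miss.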

Conclusion
Get in front of the games …
We can help you combat fraud by enabling you to …
• Incorporate all necessary data sets
• Incorporate the latest algorithms without proprietary
  code, leveraging “Big Data” and social media
• Be proactive rather than reactive
Where to find more information
•      bit.ly/VerticaFraud
•      www.vertica.com
•      my.vertica.com/evaluate/
•      cselland@vertica.com
•      +1.617.3864523
Download Now

Get the Mobile App
Download content from this session with the free Mobile App at: m.hp.com/events

Thank you




ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 

Finding fraud in large, diverse data sets

  • 1. Chris Selland VP Marketing HP Vertica © Copyright 2012 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.
  • 2. Finding fraud in large, diverse data sets: using big data analytics for fraud detection and prevention. Chris Selland, VP Marketing, HP Vertica. December 2012
  • 3. “If you need a machine and don’t buy it, you will ultimately find out that you have paid for it and don’t have it.” Henry Ford
  • 4. Fraud is a parasite to business and government. 5% of gross business and government returns and operating expenses are lost to fraud. Approaching $3.5B annually in the US for credit cards. Medicare and Medicaid fraud is over 10X credit card fraud. The US Treasury expects $65B in tax fraud over the next 5 years. One banker at UBS Bank Switzerland singlehandedly stole $2B. But the average fraud attack is $5K.
  • 5. … but the toll on individuals is what really matters. Every minute, 19 citizens’ identities are stolen. The average victim spends 500 hours and $3,000 undoing the damage. And we as individual taxpayers, consumers, and business owners pay the rest of the bill.
  • 6. Big Data Application: Fraud Analysis. The Problem: The U.S. Government needed to detect patterns of fraud in federal health care programs. The Solution: • Uses a government supercomputer to detect fraud in near-real time on aggregated databases • Multiple petabytes of claims data (Medicare, Medicaid, DoD, Veterans Affairs, etc.) • Finds patterns to generate rules and identify anomalies • Boosted recovery of claims from $1 billion/year to $50 billion
  • 7. Detecting fraud is a game of pattern recognition … A simple foundation of statistical formulas (Bayesian …; B1Xi1 + B2Xi2 + … + BnXin + ei = yi) … applied to financial transactions, e.g. • Credit card bills • Supplier invoices • Financial transactions • Call records • Claims records • Approval chains • … With modern technology and workflow process we can … capture transaction streams, build historical track records and ID anomalies, run analytics based on those 17th century formulas … and detect fraudulent activities.
  • 8. But combating fraud is a game of cat and mouse. The patterns keep changing: • Hackers are thwarted only to come back through a different security hole • Novel scams are being thought up for loopholes and exceptions in business workflows. And the playing field keeps growing: • Volume and velocity of transactions • Digitization of workflow records and approvals • Source and type of transactions
  • 9. With Vertica, you get a better mouse trap. Your proprietary systems can’t keep up: • Expensive to scale for today’s “Big Data” real-time transaction streams • Difficult to modify legacy analytic code and architectures to keep up with changing patterns. With Vertica you stay a step ahead: • Rapidly create real-time high-speed transactional record datamarts on inexpensive platforms • Open analytics platform for deep predictive fraudulent pattern modeling
  • 10. Designed for answers from the very first line of code, Vertica technology makes the difference. • Columnar storage: Achieve best data query performance with unique Vertica column store and execution • Clustering: Add resources on the fly with linear scaling on the grid, commodity hardware • Compression: Store more data, provide more views, 90% less storage required • Continuous performance: Query and load 24x7 with zero administration • Database design: Automated performance tuning • Advanced analytics: Time-series, geospatial, click-stream and an SDK for more
  • 11. Vertica architecture: every element purpose built for pattern recognition in Big Data scenarios. • Industry Specific Fraud Detection & Prevention Use Case Scenarios • SAS, Hyperion (Oracle), MicroStrategy, Business Objects (SAP), Cognos (IBM), R-Function Library, User-Defined Analytics • Next Generation administration, cluster architecture, Standard interfaces • True column store: RDBMS w/ columnar compression, concurrent load/query • Real time massively parallel processing, performance and high availability
  • 12. Communication Service Provider fraud. Estimated impact of $40 Billion p.a. worldwide. Service Providers continue to see increases in volume & variety. Continuous improvement necessary to alleviate: constantly moving targets.
  • 14. Banking services fraud detection use case scenario: identifying credit card fraud (skimming). Historical reference dataset: • Credit card skimming record for merchants • Merchant characteristics (size of store, popularity) • Credit record for card holder. Real-time transactional data: • Credit card transaction and card status. Result: Merchant probability of skimming. Implemented for a large American bank on Vertica.
  • 15. Use case scenario: identifying improper reimbursement payments in public/private healthcare insurance. Historical reference dataset: • Reimbursement record for providers • Reimbursement record for patient • National, regional, local statistical analysis of treatment associated with reimbursement • Provider characteristics (size of provider, popularity, etc.). Real-time transactional data: • Credit card transaction and card status. Result: Provider’s probability of improper payments. Implemented for a large American insurance carrier on Vertica.
  • 16. Staying ahead of credit card skimming. Ability to incorporate the latest algorithms without proprietary code, leveraging “Big Data” and social media: • Front-end outlier detection in multivariate data streams • Neural networks • Social network analysis. Implemented analytics in TWENTY lines of code. Different use cases, data sources and industries handled with the same pattern recognition scenario.
  • 17. Conclusion: Get in front of the games … We can help you combat fraud by enabling you to … • Incorporate all necessary data sets • Incorporate the latest algorithms without proprietary code, leveraging “Big Data” and social media • Be proactive versus reactive. Where to find more information: • bit.ly/VerticaFraud • www.vertica.com • my.vertica.com/evaluate/ • cselland@vertica.com • +1.617.3864523
  • 18. Download Now: Get the Mobile App. Download content from this session with the free Mobile App at: m.hp.com/events
  • 19. Thank you
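
As an illustration of the regression form quoted on slide 7 (B1Xi1 + … + BnXin + ei = yi), the following is a minimal Python sketch, not the method used in the deck. It fits a one-predictor least-squares line to a historical track record and flags records whose error term ei is unusually large. The function names, the sample data and the 2-sigma threshold are all assumptions made for the example.

```python
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1*x (assumes xs are not all equal)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def flag_anomalies(xs, ys, threshold=2.0):
    """Return indices whose residual exceeds `threshold` standard deviations.

    The threshold is a hypothetical tuning knob, not a value from the slides.
    """
    b0, b1 = fit_line(xs, ys)
    residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    sigma = pstdev(residuals)
    if sigma == 0:
        return []  # perfect fit, nothing stands out
    return [i for i, e in enumerate(residuals) if abs(e) > threshold * sigma]


# Made-up history: items on an invoice vs. invoice total, one inflated invoice.
items = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
totals = [10, 20, 30, 40, 250, 60, 70, 80, 90, 100]
suspects = flag_anomalies(items, totals)
```

A production model would use many predictors and would fit on data already known to be clean; the single-predictor version is only meant to show how the error term becomes an anomaly signal.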

Editor's notes

  1. 5% of gross business and government returns and operating expenses are lost to fraud. This works out to be around $2.9 trillion; this study was global, done in 2009 by the Association of Certified Fraud Examiners. Globally, fraudulent use of payment cards (including general purpose and private label credit cards, debit cards and prepaid payment cards) generated $7.6 billion in losses in 2010, up 10.2 percent from the previous year. The United States sustained a disproportionate share of those losses; while the U.S. registered 27 percent of worldwide payment card business in 2010, it reported nearly half (47 percent) of all losses, or $3.56 billion. But again, it’s not just a commercial industry problem. For example, in the US, as much as $26 billion could still be refunded to identity thieves in the next five years if the IRS does not do more to control the problem, the Treasury Inspector General estimated. Kweku Adoboli with Switzerland’s UBS bank was accused of stealing $2.03 billion through false accounting practices, but these large gross errors, like Bernie Madoff, are actually a drop in the bucket compared to overall fraud. The center of gravity, or median value, of a fraudulent scheme is $5K, so it’s really something that is perpetrated by average criminals and has detrimental impact on average individuals. [NEXT SLIDE]
  2. Most of the time fraudulent charges are passed on to us … Directly, when someone’s identity is stolen and they have to jump through several costly, time-consuming hurdles to reestablish their identity. And indirectly, when the cost of our medical care goes up, when governments have to borrow more because expected tax revenues are not available (remember that the black market economy and tax evasion in Greece are among the highest in the world), and even when the cost of goods rises, either directly in their pricing or through higher interest rates on our credit cards.
  3. Traditional BI environments are often designed with proprietary technology that is expensive. They were not designed to provide the speed and agility required to integrate the variety of data types we are dealing with today, analyze data in real-time, and generate the intelligence required by the fast-paced demands of today’s changing business environment. Where near real time, iterative, automated and low cost data analytics are not required, these legacy platforms will likely meet business requirements. The question is … is that the world that you live in?
  4. A vision is great, but technology makes our vision a reality. Vertica’s innovative technology makes the difference because it was designed from the very first line of code for the new demands of near real time data analytics. • Relational: software platform to store, manage, and analyze information • Native COLUMNAR architecture is core, and enables better joins and fundamentally faster analytics • Load and query simultaneously, dramatically increasing the velocity • MPP: highly scalable, elastic and fully PARALLEL, with commodity hardware and 90% less storage due to compression technology • SQL & NoSQL analytics capabilities • Simple installation & use with automatic setup and tuning
  5. Key Thoughts: Don’t dig in deep; just highlight that the core foundation of the product is MPP, Optimized HA, and a True Column Store(TM). The key idea is that everything ties into the innovation at the core of the product: every module, feature, function, connector, etc. The extensibility of the platform is ultimately due to the innovation at the core. Mention in passing that everyone will say “hey, we have a columnar db too”. But we are the only True Column Store. Highlight that our modular approach allows us to innovate more frequently than most, hence a new major release every 6-9 months.
  6. Key Thoughts: Telstra, Vodafone, Optus, Time Warner, Shaw, Bell Mobility are just some of the newer customers to begin using Vertica. Refer back to Comcast and Trane use cases (network devices and sensors; capture a vast amount of the market with those 2 use cases). CHALLENGES: Customer and product churn. Competitive market with mix of high and low margin products. Volume of data eclipses capabilities of legacy infrastructures. SOLUTIONS: Analyze portfolio for insight into churn and satisfaction. Prioritize infrastructure investments in high value, high margin infrastructure and applications via empirical data. Store, access, and monetize via new analytic paradigm. BENEFITS: Higher customer satisfaction, retention, and profitability. Alleviate high cost low value products and services. Dynamically manage and scale portfolio without sacrificing details of any customer, transaction, or product.
  7. Be proactive versus reactive: Analysis of transaction data can provide a retroactive means of detecting fraud, but real-time use of transaction data can proactively step in to stop fraud.
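
The note on being proactive versus reactive, together with slide 16's "front-end outlier detection in multivariate data streams", can be illustrated with a small streaming check. This is a hedged sketch in plain Python, not Vertica code and not the implementation from the deck: it keeps running per-feature statistics (Welford's online algorithm) and scores each transaction before it is accepted. The class names, the warm-up length and the 3-sigma threshold are assumptions, and treating each feature independently is a simplification that ignores correlations between features.

```python
import math


class RunningStats:
    """Welford's online mean and variance for one numeric feature."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def zscore(self, x):
        if self.n < 2:
            return 0.0
        std = math.sqrt(self.m2 / (self.n - 1))
        return 0.0 if std == 0 else (x - self.mean) / std


class StreamScorer:
    """Flags a transaction if any feature is far from its history (hypothetical)."""

    def __init__(self, n_features, threshold=3.0, warmup=5):
        self.stats = [RunningStats() for _ in range(n_features)]
        self.threshold = threshold  # assumed cut-off, in standard deviations
        self.warmup = warmup        # transactions to observe before flagging

    def check(self, txn):
        """Score first, then learn: flagged transactions do not update history."""
        seen = self.stats[0].n
        suspicious = seen >= self.warmup and any(
            abs(s.zscore(x)) > self.threshold for s, x in zip(self.stats, txn)
        )
        if not suspicious:
            for s, x in zip(self.stats, txn):
                s.update(x)
        return suspicious


# Made-up stream of (amount, item_count) pairs.
scorer = StreamScorer(n_features=2)
history = [(20, 1), (25, 2), (22, 1), (19, 1), (24, 2), (21, 1)]
warmup_flags = [scorer.check(t) for t in history]
blocked = scorer.check((900, 1))   # scored before it reaches the history
normal = scorer.check((23, 1))
```

Because the check runs before the update, a flagged transaction can be held at the point of sale rather than discovered in a later batch report, which is the proactive behaviour the note describes.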