SlideShare une entreprise Scribd logo
1  sur  20
THE FUTURE OF FRAUD FIGHTING
an intro to machine learning
table of contents
what is machinelearning?
why do we want machine learning?
what does machine learning do?
where does machine learningwork?
how accurate is machinelearning?
how does machine learning apply tofraud?
does ithave anyweaknesses?
fraudfighting in the future
references and furtherreading
introduction00
01
02
03
04
05
06
07
08
09
03
05
06
08
10
12
16
18
19
02
introduction
Machine learning
—abuzzwordtobe sure. Inthisdigital age, do you really
know what machinelearningis and how itcan beused?
Chances are good thatmachinelearningisalready
integraltoyour day-to-day life.Crazy,right?
The goal ofthisebook is tohelp you understand
machinelearningand itsapplicationinreducing fraud
on your website. With justa few keystrokes and the
righttools,you can stop fraudstersintheir tracks.
2
Fraud is a $200 billiondollar,trulyglobal industry. Cyber
criminals steal frommorethanone million people...every
day! Ifyou’re an individual hitby fraud, hopefullyyour bank
orinsurance companycan help you out.However,
businesses —especially small business owners —findthat
every dollar lost compoundstoa staggeringdegree.
AtSift Science, we thinkofourselves as Fraud Fighters,
wielding ourtrustysword ofmachine learningagainstthe
crooks thatwant torobyour site. Follow us as we journey
throughthe intricacies ofthe interwebstonotonlygraspthe
what ofmachine learningbutalso how toputittowork for
you fast.
01 what is machine learning?
So what exactly is machine learning?
Machine Learning =Statistics +Data +Software.
Machine learning(ML) falls underthe umbrellaof
computerscience.Atitscore, machinelearning refers to
the practice oftraining computersvia software to
recognize patternsand inferpredictions, emulatinga
human-likeability tolearn from “experience”. InML,the
“experience” thatmachines get is fromhumansnotonly
inputtingbutalso indicatinguseful data. But why doesn’t
data alone make formachinelearning?
3
01 what is machine learning?
Let’sthinkabout data itself.Chances are, your computer
has spreadsheets fullofnumbersand namesthat
encompass your pastsales and communications.
However, you cannot predictyour site’sfutureunless you
take your data tothe next level. Machine learningis that
nextstep.
With ML,computers —via specially created algorithmsand
mathematicalformulas—can learn fromhistorical data and
suggest likely future scenarios.The magic ofmachine
learningis thatafter the initial set-up,your machinelearning
system needn’tfurtherexplicit programmingoutside of
occasional training and course-correcting.
4
02 why do we want machinelearning?
What’s so great about machine learning?
IfI toldyou toimagine a tiger,your braincanconjure upan
image,right? What are some ofthe key elements that
connote tiger inyour brain? Big cat, stripes, tail —all of
these are relevantdescriptors.
But how does your brain distinguish between a tiger and a
lion (they are both big cats), or between a tiger and a
zebra(stripes forthewin!).
Yourability tonotonly learn patternsindata but
distinguish between and weigh the importanceof each
patternis essential toyour intelligence.This ability is
what engineers strive to emulate when building
machine learning systems.
ZEBRA
Machine learningallows fora reductioninhuman effort.
The timeittakes humanstoread, synthesize, categorize,
and evaluate data is significant —and that’sall before any
action takes place. MLteaches machinestoidentify and
gauge the importance of patternsinplace ofhumans.
Particularly foruse- cases where data mustbe analyzed
and acted upon ina shortamountoftime,having the
support of machinesallows humanstobe moreefficient
andact with moreconfidence.
TIGER
5
03 what does machine learning do?
Machine learning offers automated,
accurate predictions.
Depending on the purpose ofa specific machine
learningsystemoralgorithm(called models),ML
can turndense and confusing informationintoa
narrative thatsuggests the likelihoodofa future
action. By continually addingdata and experience,
a user further trainsthe MLsystem,makingits
predictions moreaccurate and morerelevant tothat
specific use-case.Atits core, machinelearning is a
3-partcycle.
6
03 what does machine learning do?
Train
The firststep of ofany MLsystem is totrain the model. Herein, the algorithm receives its
basic purpose: toextract patternsfromdata.Subsequent training can come fromthe user
and doesn’t require furthertechnical guidance.
Predict
Once the algorithmhas sufficientdata, itcan make predictions.These predictions
essentially answer the question, “what matchesthe datapattern?”.
Act
MLis only as accurate as itsdata. Inorder togrow even moreaccurate, the model needs
feedback, which is the Act phase. Inthisstep, the user indicates whether the predictionis
correct orincorrect,which inturntrains the model,restarting the cycle.
7
04 what are examples of machinelearning?
How does machine learning affect my
day-to-day life?
Excellent question!Do you use Gmail? Howabout
Netflix? Guess what -- bothofthose sites use machine
learningtooptimizeyour experience.
Netflix is a video-streaming site thatoffersviewers
access toTV shows, documentaries,movies, etc.in
exchange fora subscription fee. Netflix constantly asks
users toreview the shows and movies that they’ve
alreadywatched.
The purpose ofthose reviews (via a star-rating system) is
totrainthe Netflix recommendation model, based on what
you choose towatch and how toreact tothose choices.
Netflix wants tounderstandyour taste and preferences.
The Netflix’sSuggested for Youis machinelearning,
automatically gatheringdata totailorthe experience toyou.
8
Why does Netflix want to create such a
you-specific experience?
Netflix “estimates that75% ofviewer activity isdriven by
recommendations.”Those recommendations take into
account not justthe viewer’s historical taste but also the
viewer’s ongoing viewing trends,changing as viewer
selections change.Sometimes,we can confuse the
algorithms,like when a friendpicksthe movie formovie
night.A different user mayaccount forsome unexpected
futurerecommendations, but that’show you know thatthe
MLsystemisworking.
04 where does machine learning work?
ENGINEERS TRAIN
a “tastes” model
NETFLIXMAKES
movie recommendations
VIEWERS
select &rate movies
Retrainyour Netflix model by selecting the Not
Interested optionforsuggestions thatare out-of-line with
your tastes.There you go —you’re using machine
learning!
9
05 how accurate is machine learning?
Is machine learning actually accurate?
Yes, very! For manymachinelearningmodels,the data is
always incoming and users are constantly actingon the
predictions.Takeforexample your email.Inyour email
inbox,how oftendo you currently see spammessages?
Maybe once every few weeks? Andhow rarely do good
messages get stuck inyour spamfolder?
On the whole, your email account is pretty accurate in
weeding outspamorfake emails,and you haven’t done a
thingtoteach itwrong fromright.That’s because it’s
predictingspamwith machinelearning.
Ifyou didn’thave machinelearningworkingtokeep your
inboxclear,you’d see lotsmoreViagra offers, desperate
notes fromdeposed princes,and “You’ve won!” emails
daily. The 2013Symantec Intelligence Reportestimated
that spammade up68% ofallemail trafficin2012.Thatthe
MLalgorithms can catch the vast majorityofthese spam
messages is pretty impressive!
10
05 how accurate is machine learning?
How do these algorithms work?
Youremail MLsystemis always looking forpatterns in
data. Certain keywords appear inspam messages more
frequentlythangood messages.Termslike “Viagra” or
“belly fatloss,”indicators such as the message length or
email domain,and data fromthe sender’s email address
itself maysuggestspam.
Every timeyou click “Report Spam,”you’re actingto train
your email’s MLsystemtorecognizethe
ever-changing signs ofjunkmail.
When you finda good message stuck inyour Spam folder
and markas “Not Spam,”you’re doing the same. So even
when it’srightoutofthe box,machine learningcan be
exceptionallyaccurate. Likely, you receive all ofthe emails
thatyou want and don’toften see those spammessages --
that’smachinelearning atwork!
11
06 how does machine learning apply to fraud?
What can machine learning do against fraud?
As evidenced by the email spamexample,badusers
leave behindsigns.These calling cards offraudulent
activity can be used totrain a machinelearning model to
spot credit card fraud,fake accounts, referral fraud,and
othertypes ofillicit activity that plague websites. Having a
humanreview every new order ornewly created account
—knownas manual review —is too timeconsuming as
trafficand transactionalvolumeincrease.
The solution?Put a systeminplace thatidentifiesthe most
glaring examples offraud,while flagging those thatrequire
further,closerexamination.
Althoughcreating a simpleset ofbusiness rules
based on commonfraudulentcharacteristicsmay help
you at first,such a solutionisn’tscalable.
Why, you ask? Well, as you learn and identify fraud
signals, you have tomanuallycreate an If X then Y
system.Adding a new ruleforevery IP address or email
domain where fraudhas occurred isunscalable and
unreasonable.
Imaginethatyour onlinestore is hitby a fraudster from
Venezuela. Ifyou were usingan If X then Y, rules-based
system,you mightset upa businessrule todeny any and
all orders comingfromVenezuela.
But what ifonly 15%ofall Venezuelan orders are
fraudulent—you’dbe missingouton the profitsof that
other85% oforders,and creating a negative
customerexperience forthose goodshoppers.
12
06 how does machine learning apply to fraud?
Additionally,what ifthose manyindicators —shipping
address, IP address, social network data, shopper credit
card issuer —carry different weights? Let’sreturntoour
animal example tobetter visualize the impact ofstaticIf X
then Y systemson detecting onlinefraud.
IfI tell you, “Animal X is approaching;ithas stripes,” do
you automatically assumethatAnimal X is atiger and
therefore terrifying?Of course not!Itcould be any number
ofstripedanimals—a zebraora clown fishoratapir.
There are countless otherand moreimportantdata points
thatwould help you decide whether you shouldrunaway
orstick around tofinallymeet a tapir.Even ifa non-ML
systemhad data pointsthat included,“Animal X has
stripes,is orange and black, and can be foundintrees,”
thismystery animalcould still be a monarchbutterflyora
tree frog.But with sufficiently weighted data (Animal X
has a long tail,a loud roar, is a big cat), your brainwould
be able to deduce that,“Hey, I thinkthat’sa tiger.Time to
get outtahere!”
ZEBRA CLOWNFISH TAPIR
13
06 how does machine learning apply to fraud?
The ability totake indata and understandthe relative
values ofeach data point inrelationtothe whole is where
intellect comes into play. That is what machinelearning
can do forfrauddetection. With a trainedMLsystem,
analysts can accurately assess which orders are tigers
and which aretapirs.
Machine learningis a voice ofreason infraud detection.
When combinedwith humanintuition, statisticalinsights
can bring about betterdecisions and long termfinancial
savings.
Rather thancanceling orders fromgood customers
simplybecause they mayshare an insignificant attribute
with a fraudulentorcharged back transaction,machine
learningcan weigh the many otherindicators that
contributetoa customer’s profile.
The ability to take in data and understand the
relative values of each data point in relation to the
14
whole is where intellect comes into play.
”
“
06 how does machine learning apply to fraud?
Fraud is rarely cut and dry.ML gives users probabilities,
notblackand white judgements. Probabilities allow you to
understandthe nuances in your data, enabling you to
make better decisions and teach the ML model tomake
even moreaccurate predictionsinthefuture.
describes
15
Youneed human judgement to know if the person
you’re calling is a fraudster.
”
“
replicates
can tell what’s said
explains
creates
can tell what’s meant
Machine learning is only as strong as its data. Ifan
onlinebusiness owner is only predictingbased on her
own, limited pool ofshoppers,she must experience each
unique fraudattempt before being able topredict future
attacks.This methodis exceptionallyunsustainable.
With machinelearningoperatingon a global poolof data,
however, a shopowner inPortugal canbenefit fromthe
fraudexperience ofa shopowner in Portland or Odessa
orKyoto.
The vastness ofdata froma wider net ofexperience
allows predictive behavior without actual losses.That way,
shopowners thathave yet toexperience fraud froma
specific IP address or email address can benefit fromthe
knowledge ofthose whohave.
Perhaps you’ve heard ofthe networkeffect? Inorder for
machinelearningtoeffectively and accurately detect fraud
before ithappens,a criticalmassofdata and users must
be achieved. Unless you’re an Amazon.com,operating
with millionsofworldwide data points,usingmachine
learningwithout anyinput fromoutside ofyour business is
likely ineffectual.
Machine learningneeds a deep set ofdatainorder to
be trulymagical.
07 does it have any weaknesses?
16
Once connected toa vast network oforderand
transactionhistory,MLforfraudis unstoppable.
Certain fraudfightingtools can even trainfraud detection
systemsinreal time.What does thismean? Say a shop
owner inShanghai sees achargeback on an order placed
by Freddie Fraudster inHong Kong. Once that
Shanghainese shopowner marksFreddie as “bad” inthe
global MLfrauddetection system, immediately and
automatically Freddie’s details will be updated forevery
othershopowner on the networkworldwide. So when
Freddie tries tobuy somethingfroma Londoner’swebsite
2 minutes later,the Londonerwill see that Freddie is a bad
user and can cancel Freddie’s order before she too
suffers achargeback.
07 does it have any weaknesses?
17
Criminals are constantlycookingupnew ways torob hard-
working folks.Whether it’sdiluting good content with spam
messages, stealing credit card information,orscamminga
referral or coupon system,fraudis an ever-evolving, ever-
growing industry.
Thankfully,machinelearningenables the good guys to
stay one step ahead ofthe fraudsters.When properly
leveraged, ML bringstogether sets of informationso vast
and so deep thatthey would be otherwise
incomprehensible.Allowing manual reviewers tofocuson
only flagged data pointssaves timeand money.By
combining the power ofpattern recognitionwith human
intuition,machinelearningis drivingthe evolutionofhow
we processdata.
Machine learningis a necessary tool inthe arsenalof
today’s savvy business leaders. Whether used to stop
criminals before they hityour bottomline, project market
trends,orflag spam emails,ML is how we optimizeour
lives.
Interested inlearningmoreabout the power of machine
learning? Have a fraudproblemthatneeds an accurate
and innovativesolution?
We’re here to answer any questions—just drop us a
line at scientists@siftscience.com or visit our
website at https://siftscience.com.
08 fraud fighting in the future
18
Medium offersan excellent overview andexercises in
introductorymachinelearning.
Symantec’s regularintelligence reportsoffer
invaluable insightinto the impact offraud.
Wired Magazine’s article on the Netflixsuggestion
algorithms is an interestingread.
Sift Science offersa "Machine LearningDemystified"
overview onYouTube.
Business Insider’s machinelearningoverviewwith
embedded Nova video arehelpful.
References and further reading
19

Contenu connexe

En vedette

Entrepreneur + Developer Gangbang: Co-working
Entrepreneur + Developer Gangbang: Co-workingEntrepreneur + Developer Gangbang: Co-working
Entrepreneur + Developer Gangbang: Co-working
kamal.fariz
 

En vedette (9)

Omise fintech研究会
Omise fintech研究会Omise fintech研究会
Omise fintech研究会
 
[daddly] Stripe勉強会 運用編 2016/11/30
[daddly] Stripe勉強会 運用編 2016/11/30[daddly] Stripe勉強会 運用編 2016/11/30
[daddly] Stripe勉強会 運用編 2016/11/30
 
Entrepreneur + Developer Gangbang: Co-working
Entrepreneur + Developer Gangbang: Co-workingEntrepreneur + Developer Gangbang: Co-working
Entrepreneur + Developer Gangbang: Co-working
 
Payments using Stripe.com
Payments using Stripe.comPayments using Stripe.com
Payments using Stripe.com
 
Hiring Hacks: How Stripe Creatively Finds Candidates and Builds a Recruiting ...
Hiring Hacks: How Stripe Creatively Finds Candidates and Builds a Recruiting ...Hiring Hacks: How Stripe Creatively Finds Candidates and Builds a Recruiting ...
Hiring Hacks: How Stripe Creatively Finds Candidates and Builds a Recruiting ...
 
Bitcoin ,
Bitcoin ,Bitcoin ,
Bitcoin ,
 
Braintree SDK v.zero or "A payment gateway walks into a bar..." - Devfest Nan...
Braintree SDK v.zero or "A payment gateway walks into a bar..." - Devfest Nan...Braintree SDK v.zero or "A payment gateway walks into a bar..." - Devfest Nan...
Braintree SDK v.zero or "A payment gateway walks into a bar..." - Devfest Nan...
 
Payments integration: Stripe & Taxamo
Payments integration: Stripe & TaxamoPayments integration: Stripe & Taxamo
Payments integration: Stripe & Taxamo
 
Payments Made Easy with Stripe
Payments Made Easy with StripePayments Made Easy with Stripe
Payments Made Easy with Stripe
 

Dernier

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
 

Dernier (20)

08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 

Machine Learning: The Future of Fraud Fighting

  • 1. THE FUTURE OF FRAUD FIGHTING an intro to machine learning
  • 2. table of contents what is machinelearning? why do we want machine learning? what does machine learning do? where does machine learningwork? how accurate is machinelearning? how does machine learning apply tofraud? does ithave anyweaknesses? fraudfighting in the future references and furtherreading introduction00 01 02 03 04 05 06 07 08 09 03 05 06 08 10 12 16 18 19 02
  • 3. introduction Machine learning —abuzzwordtobe sure. Inthisdigital age, do you really know what machinelearningis and how itcan beused? Chances are good thatmachinelearningisalready integraltoyour day-to-day life.Crazy,right? The goal ofthisebook is tohelp you understand machinelearningand itsapplicationinreducing fraud on your website. With justa few keystrokes and the righttools,you can stop fraudstersintheir tracks. 2 Fraud is a $200 billiondollar,trulyglobal industry. Cyber criminals steal frommorethanone million people...every day! Ifyou’re an individual hitby fraud, hopefullyyour bank orinsurance companycan help you out.However, businesses —especially small business owners —findthat every dollar lost compoundstoa staggeringdegree. AtSift Science, we thinkofourselves as Fraud Fighters, wielding ourtrustysword ofmachine learningagainstthe crooks thatwant torobyour site. Follow us as we journey throughthe intricacies ofthe interwebstonotonlygraspthe what ofmachine learningbutalso how toputittowork for you fast.
  • 4. 01 what is machine learning? So what exactly is machine learning? Machine Learning =Statistics +Data +Software. Machine learning(ML) falls underthe umbrellaof computerscience.Atitscore, machinelearning refers to the practice oftraining computersvia software to recognize patternsand inferpredictions, emulatinga human-likeability tolearn from “experience”. InML,the “experience” thatmachines get is fromhumansnotonly inputtingbutalso indicatinguseful data. But why doesn’t data alone make formachinelearning? 3
  • 5. 01 what is machine learning? Let’sthinkabout data itself.Chances are, your computer has spreadsheets fullofnumbersand namesthat encompass your pastsales and communications. However, you cannot predictyour site’sfutureunless you take your data tothe next level. Machine learningis that nextstep. With ML,computers —via specially created algorithmsand mathematicalformulas—can learn fromhistorical data and suggest likely future scenarios.The magic ofmachine learningis thatafter the initial set-up,your machinelearning system needn’tfurtherexplicit programmingoutside of occasional training and course-correcting. 4
  • 6. 02 why do we want machinelearning? What’s so great about machine learning? IfI toldyou toimagine a tiger,your braincanconjure upan image,right? What are some ofthe key elements that connote tiger inyour brain? Big cat, stripes, tail —all of these are relevantdescriptors. But how does your brain distinguish between a tiger and a lion (they are both big cats), or between a tiger and a zebra(stripes forthewin!). Yourability tonotonly learn patternsindata but distinguish between and weigh the importanceof each patternis essential toyour intelligence.This ability is what engineers strive to emulate when building machine learning systems. ZEBRA Machine learningallows fora reductioninhuman effort. The timeittakes humanstoread, synthesize, categorize, and evaluate data is significant —and that’sall before any action takes place. MLteaches machinestoidentify and gauge the importance of patternsinplace ofhumans. Particularly foruse- cases where data mustbe analyzed and acted upon ina shortamountoftime,having the support of machinesallows humanstobe moreefficient andact with moreconfidence. TIGER 5
  • 7. 03 what does machine learning do? Machine learning offers automated, accurate predictions. Depending on the purpose ofa specific machine learningsystemoralgorithm(called models),ML can turndense and confusing informationintoa narrative thatsuggests the likelihoodofa future action. By continually addingdata and experience, a user further trainsthe MLsystem,makingits predictions moreaccurate and morerelevant tothat specific use-case.Atits core, machinelearning is a 3-partcycle. 6
  • 8. 03 what does machine learning do? Train The firststep of ofany MLsystem is totrain the model. Herein, the algorithm receives its basic purpose: toextract patternsfromdata.Subsequent training can come fromthe user and doesn’t require furthertechnical guidance. Predict Once the algorithmhas sufficientdata, itcan make predictions.These predictions essentially answer the question, “what matchesthe datapattern?”. Act MLis only as accurate as itsdata. Inorder togrow even moreaccurate, the model needs feedback, which is the Act phase. Inthisstep, the user indicates whether the predictionis correct orincorrect,which inturntrains the model,restarting the cycle. 7
  • 9. 04 what are examples of machinelearning? How does machine learning affect my day-to-day life? Excellent question!Do you use Gmail? Howabout Netflix? Guess what -- bothofthose sites use machine learningtooptimizeyour experience. Netflix is a video-streaming site thatoffersviewers access toTV shows, documentaries,movies, etc.in exchange fora subscription fee. Netflix constantly asks users toreview the shows and movies that they’ve alreadywatched. The purpose ofthose reviews (via a star-rating system) is totrainthe Netflix recommendation model, based on what you choose towatch and how toreact tothose choices. Netflix wants tounderstandyour taste and preferences. The Netflix’sSuggested for Youis machinelearning, automatically gatheringdata totailorthe experience toyou. 8
  • 10. Why does Netflix want to create such a you-specific experience? Netflix “estimates that75% ofviewer activity isdriven by recommendations.”Those recommendations take into account not justthe viewer’s historical taste but also the viewer’s ongoing viewing trends,changing as viewer selections change.Sometimes,we can confuse the algorithms,like when a friendpicksthe movie formovie night.A different user mayaccount forsome unexpected futurerecommendations, but that’show you know thatthe MLsystemisworking. 04 where does machine learning work? ENGINEERS TRAIN a “tastes” model NETFLIXMAKES movie recommendations VIEWERS select &rate movies Retrainyour Netflix model by selecting the Not Interested optionforsuggestions thatare out-of-line with your tastes.There you go —you’re using machine learning! 9
  • 11. 05 how accurate is machine learning? Is machine learning actually accurate? Yes, very! For manymachinelearningmodels,the data is always incoming and users are constantly actingon the predictions.Takeforexample your email.Inyour email inbox,how oftendo you currently see spammessages? Maybe once every few weeks? Andhow rarely do good messages get stuck inyour spamfolder? On the whole, your email account is pretty accurate in weeding outspamorfake emails,and you haven’t done a thingtoteach itwrong fromright.That’s because it’s predictingspamwith machinelearning. Ifyou didn’thave machinelearningworkingtokeep your inboxclear,you’d see lotsmoreViagra offers, desperate notes fromdeposed princes,and “You’ve won!” emails daily. The 2013Symantec Intelligence Reportestimated that spammade up68% ofallemail trafficin2012.Thatthe MLalgorithms can catch the vast majorityofthese spam messages is pretty impressive! 10
  • 12. 05 how accurate is machine learning? How do these algorithms work? Youremail MLsystemis always looking forpatterns in data. Certain keywords appear inspam messages more frequentlythangood messages.Termslike “Viagra” or “belly fatloss,”indicators such as the message length or email domain,and data fromthe sender’s email address itself maysuggestspam. Every timeyou click “Report Spam,”you’re actingto train your email’s MLsystemtorecognizethe ever-changing signs ofjunkmail. When you finda good message stuck inyour Spam folder and markas “Not Spam,”you’re doing the same. So even when it’srightoutofthe box,machine learningcan be exceptionallyaccurate. Likely, you receive all ofthe emails thatyou want and don’toften see those spammessages -- that’smachinelearning atwork! 11
  • 13. 06 how does machine learning apply to fraud? What can machine learning do against fraud? As evidenced by the email spamexample,badusers leave behindsigns.These calling cards offraudulent activity can be used totrain a machinelearning model to spot credit card fraud,fake accounts, referral fraud,and othertypes ofillicit activity that plague websites. Having a humanreview every new order ornewly created account —knownas manual review —is too timeconsuming as trafficand transactionalvolumeincrease. The solution?Put a systeminplace thatidentifiesthe most glaring examples offraud,while flagging those thatrequire further,closerexamination. Althoughcreating a simpleset ofbusiness rules based on commonfraudulentcharacteristicsmay help you at first,such a solutionisn’tscalable. Why, you ask? Well, as you learn and identify fraud signals, you have tomanuallycreate an If X then Y system.Adding a new ruleforevery IP address or email domain where fraudhas occurred isunscalable and unreasonable. Imaginethatyour onlinestore is hitby a fraudster from Venezuela. Ifyou were usingan If X then Y, rules-based system,you mightset upa businessrule todeny any and all orders comingfromVenezuela. But what ifonly 15%ofall Venezuelan orders are fraudulent—you’dbe missingouton the profitsof that other85% oforders,and creating a negative customerexperience forthose goodshoppers. 12
  • 14. 06 how does machine learning apply to fraud? Additionally,what ifthose manyindicators —shipping address, IP address, social network data, shopper credit card issuer —carry different weights? Let’sreturntoour animal example tobetter visualize the impact ofstaticIf X then Y systemson detecting onlinefraud. IfI tell you, “Animal X is approaching;ithas stripes,” do you automatically assumethatAnimal X is atiger and therefore terrifying?Of course not!Itcould be any number ofstripedanimals—a zebraora clown fishoratapir. There are countless otherand moreimportantdata points thatwould help you decide whether you shouldrunaway orstick around tofinallymeet a tapir.Even ifa non-ML systemhad data pointsthat included,“Animal X has stripes,is orange and black, and can be foundintrees,” thismystery animalcould still be a monarchbutterflyora tree frog.But with sufficiently weighted data (Animal X has a long tail,a loud roar, is a big cat), your brainwould be able to deduce that,“Hey, I thinkthat’sa tiger.Time to get outtahere!” ZEBRA CLOWNFISH TAPIR 13
  • 15. 06 how does machine learning apply to fraud? The ability totake indata and understandthe relative values ofeach data point inrelationtothe whole is where intellect comes into play. That is what machinelearning can do forfrauddetection. With a trainedMLsystem, analysts can accurately assess which orders are tigers and which aretapirs. Machine learningis a voice ofreason infraud detection. When combinedwith humanintuition, statisticalinsights can bring about betterdecisions and long termfinancial savings. Rather thancanceling orders fromgood customers simplybecause they mayshare an insignificant attribute with a fraudulentorcharged back transaction,machine learningcan weigh the many otherindicators that contributetoa customer’s profile. The ability to take in data and understand the relative values of each data point in relation to the 14 whole is where intellect comes into play. ” “
  • 16. 06 how does machine learning apply to fraud? Fraud is rarely cut and dry.ML gives users probabilities, notblackand white judgements. Probabilities allow you to understandthe nuances in your data, enabling you to make better decisions and teach the ML model tomake even moreaccurate predictionsinthefuture. describes 15 Youneed human judgement to know if the person you’re calling is a fraudster. ” “ replicates can tell what’s said explains creates can tell what’s meant
  • 17. Machine learning is only as strong as its data. Ifan onlinebusiness owner is only predictingbased on her own, limited pool ofshoppers,she must experience each unique fraudattempt before being able topredict future attacks.This methodis exceptionallyunsustainable. With machinelearningoperatingon a global poolof data, however, a shopowner inPortugal canbenefit fromthe fraudexperience ofa shopowner in Portland or Odessa orKyoto. The vastness ofdata froma wider net ofexperience allows predictive behavior without actual losses.That way, shopowners thathave yet toexperience fraud froma specific IP address or email address can benefit fromthe knowledge ofthose whohave. Perhaps you’ve heard ofthe networkeffect? Inorder for machinelearningtoeffectively and accurately detect fraud before ithappens,a criticalmassofdata and users must be achieved. Unless you’re an Amazon.com,operating with millionsofworldwide data points,usingmachine learningwithout anyinput fromoutside ofyour business is likely ineffectual. Machine learningneeds a deep set ofdatainorder to be trulymagical. 07 does it have any weaknesses? 16
  • 18. Once connected toa vast network oforderand transactionhistory,MLforfraudis unstoppable. Certain fraudfightingtools can even trainfraud detection systemsinreal time.What does thismean? Say a shop owner inShanghai sees achargeback on an order placed by Freddie Fraudster inHong Kong. Once that Shanghainese shopowner marksFreddie as “bad” inthe global MLfrauddetection system, immediately and automatically Freddie’s details will be updated forevery othershopowner on the networkworldwide. So when Freddie tries tobuy somethingfroma Londoner’swebsite 2 minutes later,the Londonerwill see that Freddie is a bad user and can cancel Freddie’s order before she too suffers achargeback. 07 does it have any weaknesses? 17
  • 19. Criminals are constantlycookingupnew ways torob hard- working folks.Whether it’sdiluting good content with spam messages, stealing credit card information,orscamminga referral or coupon system,fraudis an ever-evolving, ever- growing industry. Thankfully,machinelearningenables the good guys to stay one step ahead ofthe fraudsters.When properly leveraged, ML bringstogether sets of informationso vast and so deep thatthey would be otherwise incomprehensible.Allowing manual reviewers tofocuson only flagged data pointssaves timeand money.By combining the power ofpattern recognitionwith human intuition,machinelearningis drivingthe evolutionofhow we processdata. Machine learningis a necessary tool inthe arsenalof today’s savvy business leaders. Whether used to stop criminals before they hityour bottomline, project market trends,orflag spam emails,ML is how we optimizeour lives. Interested inlearningmoreabout the power of machine learning? Have a fraudproblemthatneeds an accurate and innovativesolution? We’re here to answer any questions—just drop us a line at scientists@siftscience.com or visit our website at https://siftscience.com. 08 fraud fighting in the future 18
  • 20. Medium offersan excellent overview andexercises in introductorymachinelearning. Symantec’s regularintelligence reportsoffer invaluable insightinto the impact offraud. Wired Magazine’s article on the Netflixsuggestion algorithms is an interestingread. Sift Science offersa "Machine LearningDemystified" overview onYouTube. Business Insider’s machinelearningoverviewwith embedded Nova video arehelpful. References and further reading 19