Produced in cooperation with: HP Technology Forum & Expo 2009
© 2009 Hewlett-Packard Development Company, L.P.
The information contained herein is subject to change without notice
Email Address Harvesting
Michael Lamont
Senior Software Engineer
June 17, 2009
Overview
• What is email address harvesting?
• How do spammers do it?
• What can you do about it?
• Examples of harvesting software
Mandatory Definition Slide
• Email address harvesting is the process used by
spammers to extract email addresses from public
sources.
• Common sources:
− Web sites
− Newsgroups
− Mailing lists
− Chat rooms
Mandatory “How Bad Is It?” Slide
• FTC: 86% of all email addresses posted on web
pages receive spam.
• FTC: 93% of all email addresses used in
newsgroups receive spam.
• PSC honeypot record: Address received spam 4
minutes after being included in a newsgroup post.
Address Lists
• Spammers use address harvesting to build giant
lists of addresses to send spam to.
• Most lists have 1-20 million addresses.
• Spammers sell/share their lists, so being on even
just one list will get you a lot of spam.
Evolution Of The Address List
• Somebody (probably not even a spammer)
harvests addresses from various sources.
• A “good” harvester scrubs the list.
• The harvester sells the list to lots of spammers.
• Once your address is on a list, it’s going to be
on one or more lists forever.
Harvesting From Web Sites
• Spammers usually use a spider program to
scrape addresses off of web pages.
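To make the exposure concrete, here is a minimal sketch of how a site owner might audit a page of their own for addresses that a spider could pick up. The regular expression is deliberately simple, and the names AddressAuditor and exposed_addresses are hypothetical; real harvesting tools may behave differently.

```python
import re
from html.parser import HTMLParser

# Deliberately simple pattern (an assumption); valid addresses can be more exotic.
ADDRESS_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


class AddressAuditor(HTMLParser):
    """Collects addresses found in visible page text and in mailto: links."""

    def __init__(self):
        super().__init__()
        self.found = set()

    def handle_starttag(self, tag, attrs):
        # Look inside href="mailto:..." attributes.
        for name, value in attrs:
            if name == "href" and value and value.lower().startswith("mailto:"):
                self.found.update(ADDRESS_RE.findall(value))

    def handle_data(self, data):
        # Look inside ordinary text between tags.
        self.found.update(ADDRESS_RE.findall(data))


def exposed_addresses(html_text):
    """Return the sorted list of addresses a simple scraper would see."""
    auditor = AddressAuditor()
    auditor.feed(html_text)
    auditor.close()
    return sorted(auditor.found)


page = (
    '<p>Contact <a href="mailto:jdoe@hp.com">John</a> or '
    "asmith@example.org, not jdoe at hp dot com.</p>"
)
print(exposed_addresses(page))
```

In this sample page the longhand form "jdoe at hp dot com" is not reported, while the mailto: link and the plain-text address are.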
Harvesting From Web Sites
• Web directories make it easy to get lots of
addresses
UseNet Newsgroups
• Email addresses are splattered all over:
− Message headers
− Signatures
− Attributions
• Spider programs exist to extract these addresses as well.
Mailing Lists
• Lots of list manager software provides a list of
every email address on a list.
• Spammers are happy to join a mailing list
temporarily to get access to a list of subscribers.
• Some clever spammers repost an innocuous newbie
question copied from the list archives with a
read-receipt request attached; every receipt that
comes back reveals a subscriber's address.
3rd Party Mailing Lists
• People you’ve provided your address to provide it
to 3rd parties (usually for profit).
• Example: Auto insurance quote
• Initial sale of list might be aboveboard, but lists
have a way of trickling down to less desirable
senders.
Web Browser Holes
• Newer browsers have eliminated most of these,
but they’re still common in older browsers.
• Extracting the email address from the From request
header (HTTP_FROM in CGI) that the browser sends to
the web server.
• JavaScript to extract email address from
browser’s configuration.
Web Browser Holes
• Force browser to fetch an image on a page by
anonymous FTP.
− Most browsers use the configured email address as the
password.
• JavaScript action that sends an email message in
the background on page load.
Chat Rooms
• Web bots monitor chat rooms and extract user
names.
• Lots of providers (AOL, Yahoo) use the same
profile names for both chat rooms and email.
• IRC used to be fertile harvesting ground, but less
savvy users have largely stopped using it.
Domain Contacts
• Every registered domain name has one or more
contact addresses.
• Addresses are publicly accessible (WHOIS)
• Addresses are almost always valid and read by a
real person on a regular basis.
Guessing
• Spammers “guess together” a list of email
addresses.
• The addresses are tested against one or more
email servers.
• Valid addresses are added to a list of addresses
to be spammed.
• Usually referred to as directory harvesting.
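From the defender's side, a directory harvesting run tends to show up as one client probing many recipients that do not exist. The sketch below flags such clients. It assumes the mail server's logs have already been reduced to (client IP, recipient was valid) pairs; that input format, the function name, and both thresholds are illustrative assumptions, not settings from any particular product.

```python
from collections import Counter


def suspected_harvesters(events, min_attempts=20, max_valid_ratio=0.1):
    """Flag clients that probe many recipients and mostly miss.

    events: iterable of (client_ip, recipient_was_valid) tuples (assumed format).
    """
    attempts = Counter()
    valid = Counter()
    for ip, was_valid in events:
        attempts[ip] += 1
        if was_valid:
            valid[ip] += 1
    # A client is suspicious when it tried often and rarely hit a real mailbox.
    return sorted(
        ip
        for ip, count in attempts.items()
        if count >= min_attempts and valid[ip] / count <= max_valid_ratio
    )


# One client guesses 50 recipients and hits once; another sends 30 valid ones.
events = [("203.0.113.9", False)] * 49 + [("203.0.113.9", True)]
events += [("198.51.100.7", True)] * 30
print(suspected_harvesters(events))
```

Only the guessing client is reported here; a client whose recipients are all valid is left alone.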
CAN-SPAM
• The federal CAN-SPAM Act explicitly makes email
address harvesting illegal.
• Some providers of the harvesting software
have ceased and desisted, but harvesting has
actually increased.
• Like most legal solutions, CAN-SPAM is
severely constrained by jurisdictional
boundaries.
Harvesting Prevention
• The harder it is for spammers to get your
address, the harder it is for them to spam you.
• “I don’t care – my spam filter is awesome. Bring
it on!”
• No filter is 100% accurate
• Filtering still places load on filtering system
and/or email server.
Prevention Methods
• Reformatting addresses
• Web forms
• JavaScript-generated mailto links
• Graphical addresses
• Throwaway addresses
Reformatting Addresses
• Prevents harvesting from web pages and
newsgroups.
• Simple examples include inserting bogus strings
into the address to make it invalid:
jdoe@NOSPAM.hp.com
jdoeREMOVEME@hp.com
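A minimal sketch of the bogus-string approach, assuming the token is inserted as an extra label at the front of the domain as in the first example above. The function names and the default token are hypothetical.

```python
def munge(address, token="NOSPAM"):
    """Insert a bogus label into the domain: jdoe@hp.com -> jdoe@NOSPAM.hp.com."""
    local, _, domain = address.rpartition("@")
    if not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    return f"{local}@{token}.{domain}"


def unmunge(address, token="NOSPAM"):
    """Undo munge(); this is what a human reader does by hand before replying."""
    local, _, domain = address.rpartition("@")
    prefix = token + "."
    if domain.startswith(prefix):
        domain = domain[len(prefix):]
    return f"{local}@{domain}"


print(munge("jdoe@hp.com"))
print(unmunge("jdoe@NOSPAM.hp.com"))
```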
Reformatting Addresses
• Writing the address out longhand can prevent
harvesters from recognizing it as an email
address:
jdoe at hp dot com
• Inserting extra whitespace can also help:
jdoe @ hp.com
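Both rewrites are mechanical, so a site could apply them when rendering pages. This is a small sketch under that assumption; spell_out and space_out are hypothetical helper names.

```python
def spell_out(address):
    """Write the address longhand: jdoe@hp.com -> 'jdoe at hp dot com'."""
    local, _, domain = address.rpartition("@")
    if not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    return f"{local} at {domain.replace('.', ' dot ')}"


def space_out(address):
    """Put whitespace around the @ sign: jdoe@hp.com -> 'jdoe @ hp.com'."""
    local, _, domain = address.rpartition("@")
    if not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    return f"{local} @ {domain}"


print(spell_out("jdoe@hp.com"))
print(space_out("jdoe@hp.com"))
```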
Reformatting Addresses
• Characters written as HTML numeric character
references (their ASCII codes) are decoded by most
web clients, but not by most spamware:
jdoe@p&#114;ocess&#046;com
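As a sketch of how such an encoding could be produced, the function below turns every character of an address into a decimal character reference and then checks that an HTML-aware client would decode it back. Encoding every character, rather than a few, is an assumption made here for simplicity; entity_encode is a hypothetical name.

```python
import html


def entity_encode(address):
    """Encode each character as a decimal HTML character reference."""
    return "".join(f"&#{ord(ch)};" for ch in address)


encoded = entity_encode("jdoe@hp.com")
print(encoded)

# A browser (or any HTML-aware decoder) recovers the original text.
print(html.unescape(encoded))
```

Note that any tool that decodes HTML before searching will also recover the address, so this only helps against software that scans the raw page source.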
Web Forms
• Provide an HTML form for web site visitors to
enter a message.
• When the form is submitted, the CGI script mails
the message to the appropriate recipient.
• Avoids displaying the actual address anywhere
on the site.
• Can still be abused, but it’s relatively difficult to
do.
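A minimal sketch of the server side of such a form, assuming the page submits only a short recipient key and the script looks up the real address itself. The RECIPIENTS table, the webform@example.com sender, and the function name are hypothetical, and actually sending the message (for example with smtplib) is left out.

```python
from email.message import EmailMessage

# Hypothetical mapping from a public key to the private address.
# The address itself never appears in the HTML that visitors receive.
RECIPIENTS = {"sales": "jdoe@hp.com"}


def build_message(recipient_key, sender_name, body):
    """Build the message a form handler would hand to the mail system."""
    if recipient_key not in RECIPIENTS:
        raise ValueError("unknown recipient")
    # Reject line breaks so a visitor cannot smuggle in extra headers.
    if any(ch in sender_name for ch in "\r\n"):
        raise ValueError("line breaks are not allowed in the sender name")
    msg = EmailMessage()
    msg["To"] = RECIPIENTS[recipient_key]
    msg["From"] = "webform@example.com"
    msg["Subject"] = f"Web form message from {sender_name}"
    msg.set_content(body)
    return msg


message = build_message("sales", "Ann Visitor", "Please send a quote.")
print(message["To"])
print(message["Subject"])
```

Restricting the form to known recipient keys is one way to limit the abuse mentioned above, since the script cannot be pointed at arbitrary addresses.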
JavaScript Generated mailtos
• Use JavaScript to dynamically generate the mailto:
link when the link is clicked.
<A HREF="javascript:void(window.location='mail'+'to:'+'jdoe'+'@'+'hp'+'.'+'com')">Click here to mail John Doe</A>
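Writing such links by hand gets tedious, so a page generator could emit them. The sketch below builds a link of the same kind, splitting the target into single-character JavaScript strings so the address never appears as one contiguous string in the page source. The function name is hypothetical, and wrapping the assignment in void(...) is a choice made here so the expression yields no value for the browser to display.

```python
import html
import json


def obfuscated_mailto(address, label):
    """Return an <a> tag whose mailto: target is assembled by JavaScript."""
    # json.dumps turns each character into a quoted JavaScript string literal.
    pieces = "+".join(json.dumps(ch) for ch in "mailto:" + address)
    href = f"javascript:void(window.location={pieces})"
    # Escape the quotes so the script fits inside a double-quoted attribute.
    return f'<a href="{html.escape(href, quote=True)}">{html.escape(label)}</a>'


link = obfuscated_mailto("jdoe@hp.com", "Click here to mail John Doe")
print(link)
```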
Graphical Addresses
• Displaying all or part of an email address as a
graphical image will throw off most harvesting
software.
• No known harvesting software is OCR-capable.
− Anecdotal reports of at least one large spam
organization trying to develop accurate OCR harvesters
Graphical Address Complexity
• Graphical @ sign:
− Probably sufficient to throw off most harvesters.
− Username and hostname are still in close proximity.
− Works easily for multiple users/multiple domains.
jdoe [image: @] hp.com
Graphical Address Complexity
• Graphical @hostname:
− Should prevent any harvester from working.
− Requires a different image for each email domain.
jdoe [image: @hp.com]
Graphical Address Complexity
• Graphical everything:
− For the truly paranoid.
− Completely unreadable by harvesters unless they’re
OCR-enabled.
− Requires either a lot of images or a script that can
dynamically generate them.
Throwaway Addresses
• Many people create an email account that they
use only for web pages and newsgroups.
• Some software products go further and let you
create an alias for every occasion.
• You still need a static address for business cards,
resumes, etc.
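A minimal sketch of the alias-per-occasion idea, assuming the mail system delivers "mailbox+tag@domain" to the same mailbox (plus-addressing, which not every server supports). The AliasBook class and its method names are hypothetical and do not describe any specific product.

```python
import secrets


class AliasBook:
    """Hands out one alias per occasion and remembers what each was for."""

    def __init__(self, mailbox, domain):
        self.mailbox = mailbox
        self.domain = domain
        self.purpose_by_alias = {}

    def create(self, purpose):
        # A random tag keeps aliases unguessable from the purpose alone.
        alias = f"{self.mailbox}+{secrets.token_hex(4)}@{self.domain}"
        self.purpose_by_alias[alias] = purpose
        return alias

    def leaked_by(self, alias):
        """Return the occasion an alias was created for, or None if unknown."""
        return self.purpose_by_alias.get(alias)

    def retire(self, alias):
        """Stop recognizing an alias once it starts attracting spam."""
        return self.purpose_by_alias.pop(alias, None)


book = AliasBook("jdoe", "example.com")
alias = book.create("newsgroup posts")
print(alias, "->", book.leaked_by(alias))
book.retire(alias)
print(book.leaked_by(alias))
```

When spam arrives at an alias, the lookup shows which occasion exposed it, and retiring the alias cuts that source off without touching the static address.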
Harvesting Software
• Tons of specialized software (spamware) used
by spammers to harvest addresses.
• Most spamware is developed in Eastern Europe
and Asia.
• We’re going to look at several of the most popular
packages.
List Harvester
• Harvests addresses from web sites.
• “Targeted” harvesting - in theory, the harvested
email addresses have something in common.
• Appears to be based in China.
• http://www.listharvester.com
• Price: $699 US
List Harvester - Method
• Performs a search for one or more keywords on
the user’s choice of search engine.
• Parses every site returned by the search engine
in order, looking for addresses and links.
• Follows links to other pages and parses them for
addresses as well.
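The parse-and-follow loop described above can be illustrated on a toy, in-memory site, which is also a handy way to audit which of your own pages expose an address first. In this sketch the pages dictionary stands in for real page fetches and the search-engine step is replaced by a list of start URLs; the function name, the data layout, and the address pattern are all assumptions, and the product's actual implementation is not known.

```python
import re
from collections import deque

# Deliberately simple pattern (an assumption); valid addresses can be more exotic.
ADDRESS_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def crawl(pages, start_urls, max_pages=100):
    """Breadth-first walk over pages: url -> (text, [linked urls]).

    Returns a dict mapping each address to the first page it was seen on.
    """
    queue = deque(start_urls)
    seen = set(start_urls)
    first_seen_on = {}
    visited = 0
    while queue and visited < max_pages:
        url = queue.popleft()
        visited += 1
        text, links = pages.get(url, ("", []))
        # Parse the page for addresses.
        for address in ADDRESS_RE.findall(text):
            first_seen_on.setdefault(address, url)
        # Queue links that have not been scheduled yet.
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return first_seen_on


site = {
    "/": ("Welcome", ["/about", "/contact"]),
    "/about": ("Founded by jdoe@hp.com", ["/", "/contact"]),
    "/contact": ("Write to jdoe@hp.com or asmith@example.org", []),
}
print(crawl(site, ["/"]))
```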
List Harvester - Screenshots
• Start screen
• Search terms entry
• Search parameters
• Search filters
• Parsing engine options
• Saving list of extracted addresses
• Harvesting in progress
Atomic Email Hunter
• Harvests addresses from web sites.
• Either scans an entire web site for addresses or
performs a “targeted search” like List Harvester.
• Based in Russia, most likely Moscow.
• http://www.massmailsoftware.com/
• Price: $79.85 US
Atomic Email Hunter - Screenshots
• Start screen
• Web download settings
• Address filtering settings
• Run
• Results
Fast Newsgroups Extractor
• Harvests addresses from newsgroups.
• Has a companion web site extractor that’s very
similar to Atomic Email Hunter.
• Based in Russia, most likely Moscow.
• http://www.lencom.com
• Price: $79.00 US
Fast Newsgroups Extractor - Method
• Lets user select one or more newsgroups to
extract content from.
• Downloads multiple messages simultaneously
from the NNTP server.
• Extracts addresses from the downloaded
messages.
• Has the ability to limit downloaded messages to
those that contain certain text in the subject.
Fast Newsgroups Extractor - Screenshots
• Start screen
• News server setup
• Newsgroup list download
• Newsgroup selection
• Harvesting job setup
• Run
Quick Review
• We talked about:
− What email address harvesting is
− What data sources are harvested
− How you can protect your addresses
− 3 software packages used by spammers to harvest
addresses
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 

Email Address Harvesting

  • 1. Produced in cooperation with: HP Technology Forum & Expo 2009 • © 2009 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. • Email Address Harvesting • Michael Lamont, Senior Software Engineer • June 17, 2009
  • 2. Overview • What is email address harvesting? • How do spammers do it? • What can you do about it? • Examples of harvesting software
  • 3. Mandatory Definition Slide • Email address harvesting is the process used by spammers to extract email addresses from public sources. • Common sources: − Web sites − Newsgroups − Mailing lists − Chat rooms
  • 4. Mandatory “How Bad Is It?” Slide • FTC: 86% of all email addresses posted on web pages receive spam. • FTC: 93% of all email addresses used in newsgroups receive spam. • PSC honeypot record: Address received spam 4 minutes after being included in a newsgroup post.
  • 5. Address Lists • Spammers use address harvesting to build giant lists of addresses to send spam to. • Most lists have 1-20 million addresses. • Spammers sell/share their lists, so being on even just one list will get you a lot of spam.
  • 6. Evolution Of The Address List • Somebody (probably not even a spammer) harvests addresses from various sources. • A “good” harvester scrubs the list. • The harvester sells the list to lots of spammers. • Once your address is on a list, it’s going to be on one or more lists forever.
  • 7. Harvesting From Web Sites • Spammers usually use a spider program to scrape addresses off of web pages.
  • 9. Harvesting From Web Sites • Web directories make it easy to get lots of addresses
  • 10. Harvesting From Web Sites 10 22 July 2014
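A minimal sketch of the address-matching step a spider performs, useful for auditing your own pages for exposed addresses. The pattern and the names `ADDRESS_PATTERN`, `find_addresses`, and `sample_page` are assumptions made for illustration; real harvesting software is likely to use different and more permissive rules.

```python
import re

# Deliberately simple pattern, assumed for illustration only.
ADDRESS_PATTERN = re.compile(
    r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+"
)

def find_addresses(page_text):
    """Return the unique address-like strings in page_text, in order of first appearance."""
    seen = []
    for match in ADDRESS_PATTERN.findall(page_text):
        if match not in seen:
            seen.append(match)
    return seen

# Hypothetical page fragment: the address appears in the link target and the link text.
sample_page = '<p>Contact <a href="mailto:jdoe@hp.com">jdoe@hp.com</a> for details.</p>'
found = find_addresses(sample_page)
```

Running `find_addresses` over the HTML you publish gives a rough idea of what a simple spider would collect from it.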
  • 11. UseNet Newsgroups • Spider programs exist to extract these addresses as well. • Email addresses are splattered all over: − Message headers − Signatures − Attributions
  • 12. Mailing Lists • Lots of list manager software provides a list of every email address on a list. • Spammers are happy to join a mailing list temporarily to get access to a list of subscribers. • Some clever spammers resend an innocuous newbie question taken from the list archives with a read-receipt request, so every receipt that comes back reveals a subscriber's address.
  • 13. 3rd Party Mailing Lists • People you’ve provided your address to provide it to 3rd parties (usually for profit). • Example: Auto insurance quote • Initial sale of list might be aboveboard, but lists have a way of trickling down to less desirable senders.
  • 14. Web Browser Holes • Newer browsers have eliminated most of these, but they’re still common in older browsers. • Extraction of email address from HTTP_FROM header that browser sends to web server. • JavaScript to extract email address from browser’s configuration.
  • 15. Web Browser Holes • Force browser to fetch an image on a page by anonymous FTP. − Most browsers use the configured email address as the password. • JavaScript action that sends an email message in the background on page load.
  • 16. Chat Rooms • Web bots monitor chat rooms and extract user names. • Lots of providers (AOL, Yahoo) use the same profile names for both chat rooms and email. • IRC used to be fertile harvesting ground, but less savvy users have largely stopped using it.
  • 17. Domain Contacts • Every registered domain name has one or more contact addresses. • Addresses are publicly accessible (WHOIS) • Addresses are almost always valid and read by a real person on a regular basis.
  • 18. Guessing • Spammers “guess together” a list of email addresses. • The addresses are tested against one or more email servers. • Valid addresses are added to a list of addresses to be spammed. • Usually referred to as directory harvesting.
  • 19. CAN-SPAM • Federal CAN-SPAM Act explicitly makes email address harvesting illegal. • Some providers of the harvesting software have ceased and desisted, but harvesting has actually increased. • Like most legal solutions, CAN-SPAM is severely constrained by jurisdictional boundaries.
  • 20. Harvesting Prevention • The harder it is for spammers to get your address, the harder it is for them to spam you. • “I don’t care – my spam filter is awesome. Bring it on!” • No filter is 100% accurate. • Filtering still places load on the filtering system and/or email server.
  • 21. Prevention Methods • Reformatting addresses • Web forms • JavaScript-generated mailto links • Graphical addresses • Throwaway addresses
  • 22. Reformatting Addresses • Prevents harvesting from web pages and newsgroups. • Simple examples include inserting bogus strings into the address to make it invalid: jdoe@NOSPAM.hp.com jdoeREMOVEME@hp.com
  • 23. Reformatting Addresses • Writing the address out longhand can prevent harvesters from recognizing it as an email address: jdoe at hp dot com • Inserting extra whitespace can also help: jdoe @ hp.com jdoe @ hp.com
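The two reformatting styles above can be produced mechanically. This is a minimal sketch; the function names `insert_marker` and `spell_out` are hypothetical, and it assumes each address contains exactly one meaningful "@".

```python
def insert_marker(address, marker="NOSPAM"):
    """Make the address invalid by putting a bogus label in front of the domain."""
    local, domain = address.split("@", 1)
    return f"{local}@{marker}.{domain}"

def spell_out(address):
    """Write the address longhand so it no longer looks like an email address."""
    local, domain = address.split("@", 1)
    return f"{local} at {domain.replace('.', ' dot ')}"

munged = insert_marker("jdoe@hp.com")
longhand = spell_out("jdoe@hp.com")
```

Either form relies on a human reader undoing the change by hand before sending mail.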
  • 24. Reformatting Addresses • ASCII-encoded characters in the address are decoded by most web clients, but not by most spamware: &#106;&#100;&#111;&#101;&#064;&#112;&#114;&#111;&#099;&#101;&#115;&#115;&#046;&#099;&#111;&#109;
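A short sketch of how such an encoded string can be generated. The character codes on the slide spell out jdoe@process.com; the function name `encode_as_entities` is an assumption, and `html.unescape` from the Python standard library stands in for the decoding a web client performs.

```python
import html

def encode_as_entities(address):
    """Replace every character with its zero-padded decimal character reference."""
    return "".join(f"&#{ord(ch):03d};" for ch in address)

encoded = encode_as_entities("jdoe@process.com")

# Decoding the references recovers the original address.
decoded = html.unescape(encoded)
```

The encoded form contains no literal "@", which is what a simple pattern-matching harvester looks for.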
  • 25. Web Forms • Provide an HTML form for web site visitors to enter a message. • When the form is submitted, the CGI script mails the message to the appropriate recipient. • Avoids displaying the actual address anywhere on the site. • Can still be abused, but it’s relatively difficult to do.
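One way the server-side step might look, sketched with Python's standard `email` package instead of a specific CGI framework. The `RECIPIENTS` table, the example.com addresses, the form field names, and `build_message` are all assumptions for illustration. The form carries only a short key, and the script maps it to the real address, so the address never appears in the page.

```python
from email.message import EmailMessage

# Hypothetical mapping from a public key to a private address; never sent to the browser.
RECIPIENTS = {
    "sales": "sales@example.com",
    "support": "support@example.com",
}

def build_message(form):
    """Build the email for a submitted form, or raise ValueError if the input is unusable."""
    recipient = RECIPIENTS.get(form.get("to", ""))
    if recipient is None:
        raise ValueError("unknown recipient key")
    subject = form.get("subject", "")
    if "\r" in subject or "\n" in subject:
        # Line breaks in a header value are a common way to abuse mail forms.
        raise ValueError("line breaks are not allowed in the subject")
    message = EmailMessage()
    message["From"] = "webform@example.com"  # assumed sender address
    message["To"] = recipient
    message["Subject"] = subject or "(no subject)"
    message.set_content(form.get("body", ""))
    return message

example = build_message({"to": "sales", "subject": "Question", "body": "Hello"})
```

The resulting message could then be handed to `smtplib.SMTP.send_message`. Because visitors cannot supply the recipient themselves, the form is harder to turn into a relay for arbitrary destinations.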
  • 27. JavaScript Generated mailtos • Use JavaScript to dynamically generate the mailto: link when the link is clicked. <A HREF='javascript:window.location="mail"+"to:"+"jdoe"+"@"+"hp"+"."+"com";void 0'>Click here to mail John Doe</A>
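For a site with many addresses, markup like this can be generated instead of written by hand. The sketch below is an assumption-laden illustration: `mailto_link` is a hypothetical helper, it handles only plain addresses, and it ends the script with the common `void 0` idiom so the link expression does not evaluate to a string.

```python
import html
import json
import re

def mailto_link(address, label):
    """Return anchor markup whose href assembles the mailto: URL at click time."""
    if "'" in address:
        # A single quote would end the href attribute early in this simple sketch.
        raise ValueError("address contains a single quote")
    # Split on "@" and "." but keep them as separate pieces.
    pieces = ["mail", "to:"] + [p for p in re.split(r"([@.])", address) if p]
    # json.dumps turns each piece into a double-quoted JavaScript string literal.
    expression = "+".join(json.dumps(piece) for piece in pieces)
    href = f"javascript:window.location={expression};void 0"
    return f"<a href='{href}'>{html.escape(label)}</a>"

link = mailto_link("jdoe@hp.com", "Click here to mail John Doe")
```

The complete address never appears as one contiguous string in the generated markup.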
  • 28. Graphical Addresses • Displaying all or part of an email address as a graphical image will throw off most harvesting software. • No known harvesting software is OCR-capable. − Anecdotal reports of at least one large spam organization trying to develop accurate OCR harvesters
  • 29. Graphical Address Complexity • Graphical @ sign: − Probably sufficient to throw off most harvesters. − Username and hostname are still in close proximity. − Works easily for multiple users/multiple domains. jdoe [image of @] hp.com
  • 30. Graphical Address Complexity • Graphical @hostname: − Should prevent any harvester from working. − Requires a different image for each email domain. jdoe [image of @hostname]
  • 31. Graphical Address Complexity • Graphical everything: − For the truly paranoid. − Completely unreadable by harvesters unless they’re OCR-enabled. − Requires either a lot of images or a script that can dynamically generate them.
  • 32. Throwaway Addresses • Many people create an email account that they use only for web pages and newsgroups. • Some software products go further and let you create an alias for every occasion. • You still need a static address for business cards, resumes, etc.
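A rough sketch of the alias-per-occasion idea. The `AliasBook` class, its method names, and the example.com domain are hypothetical, and the mail server configuration that would actually deliver or reject mail for each alias is not shown.

```python
import secrets

class AliasBook:
    """Keeps track of which throwaway alias was handed out for which occasion."""

    def __init__(self, domain):
        self.domain = domain
        self.aliases = {}  # alias address -> occasion it was created for

    def create(self, occasion):
        """Create a fresh random alias and remember what it was used for."""
        alias = f"{secrets.token_hex(4)}@{self.domain}"
        self.aliases[alias] = occasion
        return alias

    def source_of(self, alias):
        """Return the occasion an alias was created for, or None if it is unknown or retired."""
        return self.aliases.get(alias)

    def retire(self, alias):
        """Stop recognizing an alias once it starts attracting spam."""
        self.aliases.pop(alias, None)

book = AliasBook("example.com")
alias = book.create("newsgroup posts")
origin = book.source_of(alias)
```

When spam arrives at an alias, `source_of` shows where the address leaked, and `retire` discards the alias without affecting the static address.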
  • 33. Harvesting Software • Tons of specialized software (spamware) used by spammers to harvest addresses. • Most spamware developed in Eastern Europe and Asia. • We’re going to look at several of the most popular packages.
  • 34. List Harvester • Harvests addresses from web sites. • “Targeted” harvesting - in theory, the harvested email addresses have something in common. • Appears to be based in China. • http://www.listharvester.com • Price: $699 US
  • 35. List Harvester - Method • Performs a search for one or more keywords on the user’s choice of search engine. • Parses every site returned by the search engine in order, looking for addresses and links. • Follows links to other pages and parses them for addresses as well.
  • 40. List Harvester • Parsing engine options:
  • 41. List Harvester • Saving list of extracted addresses:
  • 43. Atomic Email Hunter • Harvests addresses from web sites. • Either scans an entire web site for addresses or performs a “targeted search” like List Harvester. • Based in Russia, most likely Moscow. • http://www.massmailsoftware.com/ • Price: $79.85 US
  • 44. Atomic Email Hunter • Start screen:
  • 45. Atomic Email Hunter • Web download settings:
  • 46. Atomic Email Hunter • Address filtering settings:
  • 49. Fast Newsgroups Extractor • Harvests addresses from newsgroups. • Has a companion web site extractor that’s very similar to Atomic Email Hunter. • Based in Russia, most likely Moscow. • http://www.lencom.com • Price: $79.00 US
  • 50. Fast Newsgroups Extractor - Method • Lets user select one or more newsgroups to extract content from. • Downloads multiple messages simultaneously from the NNTP server. • Extracts addresses from the downloaded messages. • Has the ability to limit downloaded messages to those that contain certain text in the subject.
  • 52. Fast Newsgroups Extractor • News server setup:
  • 53. Fast Newsgroups Extractor • Newsgroup list download:
  • 54. Fast Newsgroups Extractor • Newsgroup selection:
  • 55. Fast Newsgroups Extractor • Harvesting job setup:
  • 57. Quick Review • We talked about: − What email address harvesting is − What data sources are harvested − How you can protect your addresses − 3 software packages used by spammers to harvest addresses