A Quantitative Study of Forum Spamming Using Context-Based Analysis
Yi-Min Wang^, Ming Ma^, Yuan Niu*, Hao Chen*, Francis Hsu*
*UC Davis, ^Microsoft Research
A Look at the Web User Spammer
Why do we care about spam?
Why Web Forums?
How Spammers Operate
[Diagram. Nodes: Spammer, Spammer Domain, Doorway Pages (Splogs), Comment Spam, Search Engine, Search Results. Edge labels: 1. Creates; 2. Writes Splog URLs; 3. Propagates Splog URL; 4. Sends User to Doorway URL; 5. Redirects User; Returns.]
How to deal with the problem?
Context-based Analysis
Doorways & Redirections
Google search: Coach handbag
Redirection Analysis
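The bullet text of this slide did not survive extraction, but the speaker notes ask for a definition of "3rd party domain". One plausible reading, offered here only as an assumption, is any site a browser contacts while loading a doorway page other than the doorway's own site. The Python sketch below illustrates that reading; it is not the authors' tool. The function names are hypothetical, the example hosts other than the doorway URL are placeholders, and the "last two hostname labels" rule is a deliberate simplification that mishandles suffixes such as co.uk.

```python
from urllib.parse import urlsplit


def site_of(url: str) -> str:
    """Return a naive site key: the last two labels of the hostname (assumption)."""
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    return ".".join(labels[-2:]) if len(labels) >= 2 else host


def third_party_sites(doorway_url: str, contacted_urls: list[str]) -> list[str]:
    """List sites contacted while loading the doorway page, excluding its own site.

    Order of first appearance is preserved and duplicates are dropped.
    """
    own_site = site_of(doorway_url)
    seen: list[str] = []
    for url in contacted_urls:
        site = site_of(url)
        if site and site != own_site and site not in seen:
            seen.append(site)
    return seen


# Hypothetical recorded trace for the doorway URL shown later in the deck.
doorway = "http://www.welcometuscany.it/images/_notes/xc/26/Ringtones-Download.html"
trace = [
    "http://www.welcometuscany.it/style.css",
    "http://ads.example.com/landing",
    "http://track.example.net/r?x=1",
    "http://cdn.example.com/banner.js",
]
print(third_party_sites(doorway, trace))
```

A recorded trace like this would have to come from an instrumented browser; the sketch only covers the grouping step.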
Cloaking Analysis
Crawler-Browser Cloaking
Google search: ringtones download
[Two screenshots of www.welcometuscany.it/images/_notes/xc/26/Ringtones-Download.html: one with JavaScript disabled, one with JavaScript enabled.]
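The crawler-versus-browser comparison can be sketched in Python as follows. This is a minimal illustration, not the tool used in the study: the User-Agent strings, the word-level similarity measure, and the 0.5 threshold are all assumptions. A plain HTTP fetch also does not execute scripts, so a true "JavaScript enabled" view would have to be captured with a real browser and passed to the comparison function.

```python
import re
import urllib.request
from difflib import SequenceMatcher

CRAWLER_UA = "Wget/1.10"  # assumed crawler-like User-Agent string
BROWSER_UA = "Mozilla/5.0 (compatible; Konqueror/3.5)"  # assumed browser-like string


def fetch_as(url: str, user_agent: str, timeout: float = 10.0) -> str:
    """Fetch a URL while presenting the given User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def visible_words(html: str) -> list[str]:
    """Crudely drop script blocks and tags, then return lowercase words."""
    without_scripts = re.sub(r"(?is)<script.*?</script>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", without_scripts)
    return re.findall(r"[a-z0-9]+", text.lower())


def view_similarity(crawler_html: str, browser_html: str) -> float:
    """Similarity in [0, 1] between the word sequences of the two views."""
    matcher = SequenceMatcher(None, visible_words(crawler_html), visible_words(browser_html))
    return matcher.ratio()


def looks_cloaked(crawler_html: str, browser_html: str, threshold: float = 0.5) -> bool:
    """Flag the page when the two views differ more than the assumed threshold allows."""
    return view_similarity(crawler_html, browser_html) < threshold


# Hypothetical captured views of the same URL.
crawler_view = "<html><body><h1>Ringtones download</h1><p>free ringtones for every phone</p></body></html>"
browser_view = "<html><body><script>location='http://ads.example/'</script><p>sponsored links and advertising</p></body></html>"
print(looks_cloaked(crawler_view, browser_view))
```

In live use the two inputs could come from `fetch_as(url, CRAWLER_UA)` and `fetch_as(url, BROWSER_UA)`. Pages that change on every load would need a more tolerant comparison than this one.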
Click-Through Cloaking
[Screenshots. Panel labels: Cached page / scripting off / crawler view; Advertising page from click-throughs; Directly visiting the page.]
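The panel labels suggest that the page shown after a search-result click-through differs from the page shown on a direct visit. One way to probe for that, sketched below in Python under stated assumptions, is to request the same URL with and without a search-engine Referer header and compare the responses. The Referer format, the helper names, and the exact-hash comparison are all illustrative choices and are not taken from the slides. Cloaking that checks the referrer in client-side script would not be visible to this kind of static request.

```python
import hashlib
import urllib.request
from typing import Optional
from urllib.parse import urlencode


def search_referer(query: str) -> str:
    """Build a Referer value resembling a search-results URL (assumed format)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


def build_request(url: str, referer: Optional[str] = None) -> urllib.request.Request:
    """Prepare a request for a direct visit (no Referer) or a simulated click-through."""
    headers = {"User-Agent": "Mozilla/5.0"}
    if referer is not None:
        headers["Referer"] = referer
    return urllib.request.Request(url, headers=headers)


def fingerprint(body: bytes) -> str:
    """Hash a response body so two views can be compared cheaply."""
    return hashlib.sha256(body).hexdigest()


def differs_on_click_through(direct_body: bytes, click_body: bytes) -> bool:
    """True when the direct-visit and click-through responses are not byte-identical."""
    return fingerprint(direct_body) != fingerprint(click_body)


# Preparing both requests; sending them with urllib.request.urlopen is left to the caller.
direct = build_request("http://example.com/")
clicked = build_request("http://example.com/", search_referer("coach handbag"))
```

An exact hash flags any dynamic page, so in practice a similarity measure or repeated fetches would be needed to separate cloaking from ordinary variation.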
Three Perspectives
[Diagram. Nodes: Spammer, Spammer Domain, Doorway Pages (Splogs), Comment Spam, Search Engine, Search Results, Search User, Webhost. Edge labels: 1. Creates; 2. Writes Splog URLs; 3. Propagates Splog URL; 4. Sends User to Doorway URL; 5. Redirects User; Returns.]
Search User
Forum | Pages | Keywords
http://fs.fed.us/...mm/get/mmforumA.html | 175 | 102
http://www.comm.fsu.edu/interactive/forum/ | 134 | 82
http://www.usra.edu/phorum | 119 | 94
http://classicauthors.net/messageboard/list.php?f=1 | 117 | 97
http://samba.eecs.umich.edu/phorum/list.php?2 | 105 | 79
Honeyblogs
Honeyblog Activity
[Graph. One data label survives: 3142.]
Webhost Perspective
Blog Host | Examined URLs | Spam URLs | URLs Using Cloaking
Blogspot | 13,389 | 1,091 (8.1%) | 652
Blogspoint | 4,714 | 3,535 (75%) | 131
Blogstudio | 369 | 198 (54%) | 0
Blogsharing | 99 | 82 (83%) | 0
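The percentages in the Webhost Perspective table can be re-derived from the raw counts. The short Python check below recomputes the spam share for each blog host from the figures on the slide. It also computes the share of spam URLs that use cloaking, which is a derived quantity that does not appear in the deck. The names `WEBHOST_COUNTS` and `percent` are introduced here for illustration only.

```python
# (examined URLs, spam URLs, URLs using cloaking), copied from the slide.
WEBHOST_COUNTS = {
    "Blogspot": (13389, 1091, 652),
    "Blogspoint": (4714, 3535, 131),
    "Blogstudio": (369, 198, 0),
    "Blogsharing": (99, 82, 0),
}


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole, or 0.0 when whole is zero."""
    return 100.0 * part / whole if whole else 0.0


for host, (examined, spam, cloaked) in WEBHOST_COUNTS.items():
    spam_share = percent(spam, examined)        # matches the slide's parenthesised value
    cloaked_share = percent(cloaked, spam)      # derived here, not reported on the slide
    print(f"{host}: {spam_share:.1f}% spam, {cloaked_share:.1f}% of spam URLs cloaked")
```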
Also of note…
1 Yi-Min Wang et al. Automated Web Patrol with Strider HoneyMonkeys: Finding Web Sites That Exploit Browser Vulnerabilities. NDSS, 2006.
Related Work (Part 1)
Related Work (Part 2)
Conclusions
Future work
1 Yi-Min Wang et al. Spam Double-Funnel: Connecting Web Spammers with Advertisers. WWW 2007.

Editor's notes

  1. -- Put more details on slide -- SPAM SPAM
  2. Transition better to 2nd slide. More motivation: why SEO is important -- screenshot of search results -- More than spam, possible exploits -- more coherent story about comment spam -- Moderation nightmare
  3. Users want to see useful information. They want to participate in forums, they want to blog, go shopping without being bombarded by irrelevant ads. And of course, everyone has the right to surf the web without fear of being attacked by this or that exploit. Search engines try to point users to quality pages through good search results. They’re also partially motivated by money earned through ads.
  4. More reasons here… Define web forum
  5. More on trackbacks or pingbacks (how do they work? why do they exist?) -- similarity based on layout, COLOR backgrounds -- CAPTCHA can’t be used -- more difficult to moderate trackbacks/pingbacks
  6. Content-based analysis. We get all the doorway pages + the destination. End game is to direct traffic to the destination. Why we chose context-based analysis over content-based -- Define -- Related
  7. Thumbnails. Define 3rd-party domain here.
  8. More detail on the process of recording pages. 3rd-party domain: define “seeded known spammer domains”. Mention the double funnel -- blacklist, whitelist, spam policies
  9. Also do picture for crawler-browser
  10. 1st image: Konqueror masquerading as Wget (which doesn’t handle JavaScript). The 2nd image shows Konqueror sending the correct user-agent ID.
  11. Use circles/emphasize current graph. shrink
  12. First we look at the extent to which web forums are spammed, from the perspective of the web user. Presumably, this is because the spammer has been very busy in leaving his URLs all over the web. And again, the URLs being left about are doorway pages, which are more expendable than actual domains.
  13. WWWBoard, Hypernews, Ikonboard, Ezboard, Bravenet, Invision Board, phpBB, Phorum, and vBulletin. A mix of languages (Perl, PHP), hosted/non-hosted. 9 different software packages – highlight differences rather than names -- list all, but more readable (maybe red circles & graphically)
  14. Top 5 numbers. Show more non-spammy words -- .edu & .gov sites (why web forums as well) Why is this bad?? (for every perspective)
  15. Expand the graph. Growth keeps continuing. Spammers are still visiting. Exponential growth seen on all 3
  16. Change colors. Sum 3 lines -- Shift the number. Mark the important dates -- 2nd graph to show rate of change -- mention length of experiment
  17. Include percentages
  18. Put numbers here -- Google has resources
  19. Blogspoint + blogstudio share spammers *** numbers!! Graph/table showing all 4 webhosts Why isn’t spam consistent across Consistent metrics
  20. Why are .edu/.gov redirs troublesome
  21. Less time on this. Don’t read out loud Highlight how ours differs/relates, their shortcomings (cloaking).
  22. Move WWW paper info to: APPLICATION/FUTURE WORK/IMPACT Explain how useful results are to search engines/forum.
  23. Add citation: title, partial names, WWW 2007. Add homepage URL. Don’t mention morals.