White Hat Cloaking
1. White Hat Cloaking – Six Practical Applications Presented by Hamlet Batista
3. Crash course in white hat cloaking: When to cloak? How do we cloak? How can cloaking be detected? Risks and next steps. Practical scenarios where good cloaking makes sense. Practical scenarios and alternatives.
8. Practical scenario #4: Sites requiring massive site structure changes to improve index penetration. Regular users follow a link structure designed for ease of navigation. Search engine robots follow a link structure designed for ease of crawling and deeper index penetration of the most important content. [Diagram: the same five steps (Step 1 to Step 5) arranged in two different orders, one for users and one for robots.]
12. Robot detection by HTTP user agent: A very simple robot detection technique. Search robot HTTP request: 66.249.66.1 - - [04/Mar/2008:00:20:56 -0500] "GET /2007/11/13/game-plan-what-marketers-can-learn-from-strategy-games/ HTTP/1.1" 200 61477 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "-"
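The user agent check can be sketched in a few lines of Python. This is only an illustration, not the presenter's implementation: the token list `ROBOT_TOKENS`, the log pattern, and the function names are assumptions made for this example, and the token list is far from exhaustive.

```python
import re

# Assumption: a small, illustrative set of substrings that well-known
# crawlers include in their User-Agent header. Not exhaustive.
ROBOT_TOKENS = ("googlebot", "slurp", "msnbot", "bingbot")

# Assumption: Apache-style combined log format, as in the sample line on the slide.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def agent_from_log_line(line):
    """Return the User-Agent field of a log line, or None if it does not parse."""
    match = LOG_PATTERN.match(line)
    return match.group("agent") if match else None


def claims_to_be_robot(user_agent):
    """True when the User-Agent string contains a known crawler token."""
    lowered = user_agent.lower()
    return any(token in lowered for token in ROBOT_TOKENS)


SAMPLE = (
    '66.249.66.1 - - [04/Mar/2008:00:20:56 -0500] '
    '"GET /2007/11/13/game-plan-what-marketers-can-learn-from-strategy-games/ HTTP/1.1" '
    '200 61477 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "-"'
)

if __name__ == "__main__":
    print(claims_to_be_robot(agent_from_log_line(SAMPLE)))
```

Because any client can send any User-Agent string, a positive result here only means the visitor claims to be a robot; it does not prove it.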
13. Robot detection by HTTP cookie test: Another simple robot detection technique, but weaker. Search robot HTTP request: 66.249.66.1 - - [04/Mar/2008:00:20:56 -0500] "GET /2007/11/13/game-plan-what-marketers-can-learn-from-strategy-games/ HTTP/1.1" 200 61477 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "Missing cookie info"
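One way to read the cookie test: the site sets a cookie on the first response and treats a later request that fails to send it back as a possible robot. The sketch below assumes that reading. The cookie name `visitor_test` and the function names are hypothetical, and tracking whether a cookie was already issued to a given client is left to the caller.

```python
from http.cookies import SimpleCookie
from typing import Optional

# Hypothetical cookie name used only for this example.
TEST_COOKIE = "visitor_test"


def set_cookie_value() -> str:
    """Build the value of a Set-Cookie header that plants the test cookie."""
    jar = SimpleCookie()
    jar[TEST_COOKIE] = "1"
    jar[TEST_COOKIE]["path"] = "/"
    return jar.output(header="").strip()


def has_test_cookie(cookie_header: Optional[str]) -> bool:
    """True when the request's Cookie header carries the test cookie."""
    if not cookie_header:
        return False
    jar = SimpleCookie()
    jar.load(cookie_header)
    return TEST_COOKIE in jar


def fails_cookie_test(cookie_header: Optional[str], already_issued: bool) -> bool:
    """True when the cookie was sent to this client earlier but did not come back."""
    return already_issued and not has_test_cookie(cookie_header)
```

The test is weaker than the others because regular users who block cookies fail it too, so a failure is best treated as a hint rather than a verdict.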
14. Robot detection by JavaScript/CSS test: Another option for robot detection, using DHTML content. HTML code: <div id="header"><h1><a href="http://www.example.com" title="Example Site">Example site</a></h1></div> The CSS code is pretty straightforward: it swaps out anything in the h1 tag in the header with an image. CSS code: /* CSS Image replacement */ #header h1 {margin:0; padding:0;} #header h1 a { display: block; padding: 150px 0 0 0; background: url(path to image) top right no-repeat; overflow: hidden; font-size: 1px; line-height: 1px; height: 0px !important; height /**/:150px; }
15. Robot detection by IP address: A more robust robot detection technique. Search robot HTTP request: 66.249.66.1 - - [04/Mar/2008:00:20:56 -0500] "GET /2007/11/13/game-plan-what-marketers-can-learn-from-strategy-games/ HTTP/1.1" 200 61477 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "-"
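A minimal sketch of the IP address check, assuming the site keeps its own list of crawler networks. The single network below is an assumption chosen for illustration because it contains the sample address from the log line; a real list would have to be maintained and verified separately.

```python
import ipaddress

# Assumption: an illustrative list with one network that contains the sample
# address 66.249.66.1. A production list needs to be maintained and verified.
KNOWN_ROBOT_NETWORKS = [ipaddress.ip_network("66.249.64.0/19")]


def ip_in_known_robot_network(ip: str) -> bool:
    """True when the visitor's address falls inside one of the listed networks."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in KNOWN_ROBOT_NETWORKS)


if __name__ == "__main__":
    print(ip_in_known_robot_network("66.249.66.1"))
```

Unlike the User-Agent header, the source address of a completed HTTP request is hard to forge, which is why this check is more robust than the previous two.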
17. Robot detection by visitor behavior: Robots differ substantially from regular users when visiting a website.
18. Combining the best of all techniques: User behavior check: label as a possible robot any visitor with suspicious behavior. User agent check: label as a robot anything that identifies as such. IP address check: maintain a cache with a list of known search robots to reduce the number of verification attempts. Double DNS check: confirm it is a robot by doing a double DNS check; also confirm suspect robots.
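The combined flow can be sketched as follows, under stated assumptions. The double DNS check is read here as a reverse lookup of the visitor's IP followed by a forward lookup of the returned host name, which must resolve back to the same IP. The class name `RobotClassifier`, the `TRUSTED_SUFFIXES` tuple, and the single `googlebot` token are choices made for this example, not details from the slides. The resolver functions are parameters so the logic can be exercised without network access.

```python
import socket

# Assumption: host name suffixes accepted for the crawler being verified.
TRUSTED_SUFFIXES = (".googlebot.com", ".google.com")


def double_dns_check(ip, reverse=socket.gethostbyaddr, forward=socket.gethostbyname_ex):
    """Reverse-resolve the IP, then forward-resolve the host and compare (IPv4 only)."""
    try:
        host = reverse(ip)[0]
        if not host.lower().endswith(TRUSTED_SUFFIXES):
            return False
        return ip in forward(host)[2]
    except OSError:  # covers socket.herror and socket.gaierror
        return False


class RobotClassifier:
    """Cache verified results so each address is checked by DNS only once."""

    def __init__(self, verify=double_dns_check):
        self._verify = verify
        self._cache = {}  # ip -> bool (True means confirmed robot)

    def is_robot(self, ip, user_agent, suspicious_behavior=False):
        if ip in self._cache:  # known address: skip verification
            return self._cache[ip]
        claims = "googlebot" in user_agent.lower()  # user agent check
        if not (claims or suspicious_behavior):  # nothing to confirm
            return False
        verified = self._verify(ip)  # double DNS check for claimed and suspect robots
        self._cache[ip] = verified
        return verified
```

How `suspicious_behavior` gets set is left open here, since the slides do not spell out which behaviors count as suspicious.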
Editor's notes
Hi, my name is Hamlet Batista. Some of you know me from my blog, Hamlet Batista dot com. I'm sure that everybody here has been taught that cloaking is bad. Today I am here to tell you otherwise: I am here to convince you that you should cloak. Now, before you leave the room for fear of castigation by Google, let me share some practical scenarios where good cloaking makes sense. I will contrast cloaking with other recommended alternatives and show why cloaking is still a better option. Hopefully, by the end of my presentation, I will have convinced the search engineers on my panel as well.