SEO: FTW!
How to make the web a better place by improving
your site positioning on search engines
In 2013 the Internet will celebrate 20 years of publicly available search engines. Since the first of them (ALIWEB) was made available to the public through a user-friendly HTML page, SEO has come a long way and is today a well-established branch of the Web industry. But what has changed since the long-gone days of doorway pages, hidden text and forum spamming? Which rules are still valid, and which new ones have been adopted recently? How will new SEO standards make the Web a better place? What is still to come in the near future?
Follow me as I walk through the evolution of SEO as an organic entity and analyze its inner workings and techniques, from the basics to the pro tips. Some new concepts will be introduced, some old rules will be confirmed, some myths busted.
And as Search-based functionalities sneak into our daily lives thanks to smarter
and smaller Internet-enabled devices, new SEO challenges arise on the horizon
of an extremely dynamic scenario; new concepts like Mobile SEO, SEO 2.0 and
SEO for UGC (User Generated Content) will become familiar to you as we get to
the final section of the presentation.
Federico "Lox" Lucignano
Senior Mobile & SEO Solutions Developer
Wikia sp.z.o.o.
Witam | Welcome | Benvenuti
✔
Who are you (or better: Who am I)?
– Federico “Lox” Lucignano
• Senior Mobile & SEO Solutions Engineer @ Wikia
I'm a member of Wikia's amazing Engineering team; Wikia is a great place to work both as a developer and as an SEO/Mobile enthusiast.
Even though my personal interest in Search Engines and their inner workings dates back to the 20th century, my professional involvement in SEO began in 2005; during the last 6 years I've had the chance to work on SEO strategies for a wide variety of websites on different platforms, from small DVD rental services to big wedding portals to huge community services. Today I'm here to share this experience with you.
Before anyone comes up with two obvious questions let me clarify: YES! I'm
Italian (probably my name and surname betrayed me) and...
[next slide]
Pasta ain't my favourite dish :P
...NO! Despite a common misconception that famous Italian movies from the
50's and 60's spread around the world, in Italy we DO NOT eat pasta every
day :P
✔
Why are you here (or better: What is
this all about)?
– SEO, SEM, ATO, [infinite list of obscure
acronyms here]
• Evolution
• Techniques
• Standards
Witam | Welcome | Benvenuti
As mentioned, today we'll deal with SEO as an organic entity: we'll quickly go through recent developments, analyze the techniques SEO specialists and enthusiasts of any level are using with success on millions of websites, and take a sneak peek into what to expect in the near future.
This brings us to the next big question...
Why should we care about SEO?
Marketing and Monetization are not the only
reasons anymore: search engines are sneaking
into our daily activities, and they've even started
following us wherever we go...
How?
Let's take a quick look at some of the most
popular internet enabled applications/platforms
In simple words: making money is not the main reason behind SEO anymore!
That is already a concept of the past; in today's internet-connected world Search Engines have become the main entry point to the Web, just as vinyl records were the main entry point to the world of music for millions of fans until the introduction of tapes.
But how are Google & co. taking over our daily lives?
Simple: by introducing extremely easy-to-use search functionality in popular software applications and electronic devices.
Let's take a look at some examples.
[next slide]
Web browsers: Internet Explorer
Versions 7 & 8
search box in the upper
right corner of the UI
Version 9
Search integrated in the
Address bar
The first step was to integrate Web Search capabilities right into the UI of prime-time Web browsers, which are probably the most used software nowadays.
At first the search field was a standalone UI widget which would run a search only if the user interacted directly with it by inputting one or more keywords and then pressing a button (e.g. Internet Explorer 7 & 8, Firefox 3, Safari).
But in the latest versions this functionality got integrated into the URL bar; this means that if a user types anything there that is not a URL (even a mistyped or incomplete one), a search will be initiated on the (not always) user-customizable search engine.
This did the trick: Search Engines are now always there, waiting for your input!
Web browsers: Google Chrome
Omnibox
Serves both as the Address bar and as a
search field with realtime search results
and suggestions
To this, Chrome adds realtime as-you-type search suggestions; basically this implies that every time you type a letter the browser will ask the Search Engine for suggestions.
Web browsers: Firefox
Awesome bar
Serves both as the Address bar and as a
search field
Search box
At the top right corner in
the UI
Default Homepage
A Mozilla-customized
Google page
With the release of version 4, Mozilla couldn't have made it any clearer: they love Google.
By default the browser lets the user search the web via the old-fashioned stand-alone search field (on the right in the UI), via the so-called "Awesome bar" (which works in a pretty similar way to Chrome's "Omnibox" URL bar) and via the Firefox-themed default Google homepage. 3 in 1, who could resist such an offer?
I was amazed to discover that some of my friends are used to typing the domain name (without extension) into the first search field available to them, even though they know perfectly well the full URL of the website they want to reach, and THEN clicking the first result the Search Engine shows. If you think about it, it makes a lot of sense: not all computer users (and electronics consumers more in general) have a Master's degree in IT, and to most of them what matters is to get what they want with the minimum effort (in this case the effort of typing or remembering the full address of a website).
It doesn't matter that this way they're triggering a double roundtrip (one to get the search result, another to get to the real site); in the end, who cares about such a thing in the era of high-speed Internet connections? Exactly: the Search Engines. They DO care, since this way they get to know where the users want to go (even when they don't really need to search for anything)!
Desktop, Laptop and netbooks
Desktop Search box
Lets you find any file and run any action
on your system; of course it lets you run
searches on the web too
But there's a different breed of software that makes this integration even more pervasive: Desktop Search/Application Launchers.
These applications, which are pretty familiar to Linux and Mac users, let you run commands/tasks, open websites, search for files on your PC, play videos/music, make calculations and, most important of all, initiate Web searches with integrated results listings. All by just pressing a simple combination of keys and starting to type, whenever you want, no matter what other task your computer is already busy with (except games).
Mobile devices: iOS and Android
Android browser's omnibox
As in Chrome, serves both as the address
bar and a search bar with realtime suggestions
Mobile Safari's search box
Always available at the top-right corner of the UI
Android search widget
Similar to Google Desktop search, lets you
find any file and run actions on the system
along with web searches
What we said about Web browsers and Desktop Search also applies to modern (and smarter) mobile devices.
The Android default browser and Mobile Safari on iOS behave exactly like Chrome (omnibox) and Safari (standalone search field), while Android's Search Widget is a cut-down version of a Desktop Search client.
Here we go: the Search Engine can now follow you any time, wherever you go, ready to help you find what you need.
And much more...
Google refrigerator
Shows you realtime
suggestions about what
you'd like to drink! :D
I'm skipping a long list of other Internet-enabled devices like gaming consoles (my Nintendo Wii runs Opera), Media Centers (which unfortunately were born already old) and new-generation TVs/content delivery systems (like Apple TV); anything that can connect to the Internet can run a Web search.
One day even your fridge! ;)
A new definition of User eXperience
[Diagram: the facets of User eXperience (UX): Findable, Accessible, Usable, Valuable, Credible, Desirable]
With search functionality becoming so prominent, users are starting to use it constantly as the main way to get to content, even content they already know how to reach.
This new habit of "finding" content defines a new set of attributes required for content to be found easily; this is what has recently been named "Findability". It's one of those new "Internet slang" words, like Gamification... They sound so New Age, don't they?
This new set of attributes crosses the limits of plain SEO and requires all the parties involved in the Internet publishing industry (Design, Marketing, Copywriting, Development, etc.) to cooperate in a much deeper way than before; long gone are the days when a Designer was not supposed to be aware of what markup would be required to lay out his sketches and what impact the position of a piece of content across the page would have on the relevancy given to it.
Findability is actually the main goal of what is being defined as "SEO 2.0", or "Emotional SEO"; we'll come back to this later. First things first, let's start from the beginning: the title of this lecture...
[switch slide]
What's in an acronym?
✔
SEO
– White hats
– Black hats
✔
FTW
– For The Win!
– F#@$ The World!
SEO and FTW: how are they related?
•Both acronyms started to be used in the mid-90's
•Both acronyms are made up of 3 letters
•Both initially had a negative connotation
•Both evolved to outline positive concepts
•The difference is: the negative connotation of FTW never had any economic benefit, while SEO black hat techniques still generate money even though they only work for a short time. That's why the latter still sticks around.
Before we start
✔ SEO
✔ SEM
✔ SERP
✔ PR
✔ IM
✔ PV
✔ CPC
✔ ATO
✔ Robot / crawler / spider
✔ Page title
✔ Meta description
✔ Meta keywords
✔ Bounce rate
✔ Conversion rate
✔ Keyword density
✔ Web directory
✔ Natural / organic results
✔ Spamdexing
✔ IRYD
Before we start, let's quickly review the meaning of some common
terms/acronyms I will use during this talk.
TBD
If you don't know what the last item in the list stands for... that's good! It
doesn't exist, I just put it there to check if you are paying attention :) Actually
it's a quote from Transformers the movie, “I Rise, You Die” - Optimus Prime
A bit of history first: 20th century
✔
1993 – ALIWEB, the first public search engine
✔
1995 – Altavista, the first BIG search engine (later
absorbed by Yahoo)
✔
1997 – the SEO acronym appears for the first time
on a web page
✔
1997 – Search engines acknowledge Webmasters'
SEO efforts and start fighting back against spamdexing
✔
1998 – Page and Brin develop Backrub and later
found Google
But first a bit of history (courtesy of WikiPedia)
Webmasters and content providers began optimizing sites for search engines in the mid-
1990s, as the first search engines were cataloging the early Web. Initially, all webmasters
needed to do was submit the address of a page, or URL, to the various engines which would
send a "spider" to "crawl" that page, extract links to other pages from it, and return information
found on the page to be indexed.
Site owners started to recognize the value of having their sites highly ranked and visible in
search engine results, creating an opportunity for both white hat and black hat SEO
practitioners.
The first documented use of the term Search Engine Optimization was by John Audette and his company Multimedia Marketing Group, as documented by a web page from the MMG site from August 1997 on the Internet Wayback Machine (Document Number 19970801004204).
Early versions of search algorithms relied on webmaster-provided information such as the
keyword meta tag, or index files in engines like ALIWEB. Meta tags provide a guide to each
page's content.
By relying so much on factors such as keyword density which were exclusively within a
webmaster's control, early search engines suffered from abuse and ranking manipulation. To
provide better results to their users, search engines had to adapt to ensure their results pages
showed the most relevant search results, rather than unrelated pages stuffed with numerous
keywords by unscrupulous webmasters.
Graduate students at Stanford University, Larry Page and Sergey Brin, developed "backrub," a
search engine that relied on a mathematical algorithm to rate the prominence of web pages.
The number calculated by the algorithm, PageRank, is a function of the quantity and strength
of inbound links.
A bit of history first: 21st century
✔ 2004 – All the major Search Engines officially switch to a
PageRank-like algorithm and start to partially disclose details
through Webmaster-targeted portals
✔ 2005 – Google introduces personalized search results for
logged in users
✔ 2009 – Google introduces history, location and realtime
search features, Social “Bookmarking” gains consensus
✔ 2010 – The era of mobile search begins with the rise of
smarter mobile devices (iOS, Android)
✔ 2011 – Social networks and recommendation services
redefine the web landscape, Google announces +1 and
Recipes Search, Facebook embraces microformats
By 2004, search engines had incorporated a wide range of undisclosed factors in their ranking
algorithms to reduce the impact of link manipulation. Google says it ranks sites using more than 200
different signals. The leading search engines, Google, Bing, and Yahoo, do not disclose the
algorithms they use to rank pages. Notable SEO service providers, such as Rand Fishkin, Barry
Schwartz, Aaron Wall and Jill Whalen, have studied different approaches to search engine optimization,
and have published their opinions in online forums and blogs. SEO practitioners may also study patents
held by various search engines to gain insight into the algorithms.
In 2005 Google began personalizing search results for each user. Depending on their history of previous
searches, Google crafted results for logged in users. In 2008, Bruce Clay said that "ranking is dead"
because of personalized search. It would become meaningless to discuss how a website ranked,
because its rank would potentially be different for each user and each search.
In 2007 Google announced a campaign against paid links that transfer PageRank.[16] On June 15,
2009, Google disclosed that they had taken measures to mitigate the effects of PageRank sculpting by
use of the nofollow attribute on links. Matt Cutts, a well-known software engineer at Google, announced
that Google Bot would no longer treat nofollowed links in the same way, in order to prevent SEO service
providers from using nofollow for PageRank sculpting.[17] As a result of this change the usage of
nofollow leads to evaporation of pagerank. In order to avoid the above, SEO engineers developed
alternative techniques that replace nofollowed tags with obfuscated Javascript and thus permit
PageRank sculpting. Additionally several solutions have been suggested that include the usage of
iframes, Flash and Javascript. [18]
In December 2009 Google announced it would be using the web search history of all its users in order to
populate search results.[19]
Real-time-search was introduced in late 2009 in an attempt to make search results more timely and
relevant. Historically site administrators have spent months or even years optimizing a website to
increase search rankings. With the growth in popularity of social media sites and blogs the leading
engines made changes to their algorithms to allow fresh content to rank quickly within the search results.
[20]
SEO evolution so far: a retrospective
Bruce Clay: “Something must have gone totally wrong”
Retrospective on SEO evolution so far
Who is Bruce Clay? An international SEO consultant, Board of Directors member of SEMPO (Search Engine Marketing Professional Organization), member of the Web Analytics Association, the American Marketing Association and the International Internet Marketing Association, and author of the SEO Code of Ethics (published in 2001 and translated into 18 languages)
Back in 2005, when Google introduced per-user personalized Search Results, he declared "ranking is dead", since the position of each result in SERPs would be different for each user.
Mr. Clay belongs to the old generation of SEO consultants, those old white hats who saw in a mathematically calculated PageRank the panacea for all the evils afflicting the rising SEO industry. To those people, and even more to black hats, the idea that the ranking of a page would be more and more dependent on the result of actions outside the direct reach of a webmaster is frightening.
It's normal: Search Engine policies are taking power away from the hands of those guys and putting it back in the hands of the users (and their friends, see the social developments of the upcoming Google +1 and StumbleUpon). If my salary depended entirely on it, I would be scared too.
IMHO: the best has yet to come
The ugly
1993 - 1997
The bad
1998 - 2004
The good
2005 – till now
The best
Yet to come
The best has yet to come: my personal opinion
I'm no big authority in the SEO industry as Mr. Clay is, but like many other "no-ones" who are good at SEO I have my own strong opinion, which can't help but be the total opposite of this long-famed Guru's.
We went through the worst between 1993 (the first public search engines
appear) and 1997 (Spamdexing spreads as a standard SEO practice, the
Search Engines start to fight back).
Then the situation started to improve between 1998 (Backrub, the first
PageRank driven Search algorithm is implemented) and 2004 (all the major
search engines switch to a PageRank-like implementation, but still one that
could be tricked in many ways).
Starting from 2005 the big players began to introduce new concepts and, more importantly, STANDARDS; they also started to share some detailed information with Webmasters and SEO enthusiasts and began to punish bad practices with no mercy. The PageRank algorithm also became smarter and harder to trick, and it keeps improving day by day.
The best has yet to come: with Findability becoming a more prominent goal and Social Networks taking the stage, we'll probably see big changes in the not-so-far future. Actually, the first signs are appearing right now (Google +1, StumbleUpon, Digg, the new possibilities opened by the interaction between mobile and non-mobile devices and the environment [augmented reality] and, best of all, the SEMANTIC WEB [thanks to microformats]).
First things first: get crawled
✔ Be sure that your site has interesting content to
attract crawlers and check the server uptime
✔ Prepare at least a standard XML Sitemap
✔ Have a robots.txt file at the root of your site
✔ Subscribe to Webmaster's services for the main
search engines
✔ Setup an analytics service for monitoring your traffic
✔ Submit your sitemaps manually for fine-grained
control
Before starting
Let's quickly talk about crawlers...
[next slide]
Know the beast
Meet the crawler... ...and his daily meal
I'm amazed sometimes that some "SEO-aware" people still don't know exactly how a crawler works; in particular, they often have a really magical/mystical idea of what a crawler "sees" when doing its duty. So let me give you a clear description of it and its inner workings (even though I'm sure many in this room already know what I'm going to say).
A web crawler (or spider, or bot) is nothing more than an HTML parser on steroids: it's aware of the semantic meaning of the markup, can recognize tricks by watching out for suspicious CSS/JS at a basic level, and is even able to interpret a series of binary files like PDFs, DOCs and TXTs.
Specialized crawlers can do even more; e.g. Googlebot-Image is able to analyze an image's size, extract the EXIF data, analyze the colours, recognize faces, etc., no matter the file format.
Once you've attracted a crawler's attention you need to keep in mind that each crawling pass has a time limit; the crawler won't go through all your content at once, so be sure to avoid it losing time on non-relevant and unimportant stuff (i.e. make good use of sitemaps and robots.txt; don't let the crawler get stuck parsing a 25 MB PDF, have a description of the file in the page hosting the download link instead)
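To make that last point concrete, here's a minimal robots.txt sketch that keeps crawlers away from heavy, low-value files (the paths are placeholders; the * and $ wildcards are extensions supported by the major engines):

User-agent: *
# don't waste the crawl-time budget on big binaries
Disallow: /downloads/
Disallow: /*.pdf$
# describe the file in the HTML page hosting the download link instead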
First things first: get crawled
✔
Be sure that your site has interesting
content to attract crawlers and check
the server uptime
✔
Prepare at least a standard XML
Sitemap
Point 1
Do not submit your "under construction" website on your testing server to the search engines: in the early days of a new website the crawling rate is really low, and days could pass before your real content gets indexed and starts driving traffic.
Point 2
[next slide]
XML Sitemaps
Details @ http://sitemaps.org/
Google expanded the original Sitemaps protocol to include:
•Image data support in standard Sitemaps (giving the Googlebot-Image crawler image data, license and caption included, even before it starts crawling a site)
•News sitemaps (for websites accepted in Google News, Googlebot-News)
•Video sitemaps (same as images, the data can be included in the regular sitemap; the Googlebot-Video crawler will collect it before Googlebot starts crawling the site)
•Mobile sitemaps (this is a separate sitemap, even if your site uses the same URLs; Googlebot-Mobile and YahooSeeker/M1A1-R2D2)
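For reference, here's a minimal sketch of a standard XML Sitemap following the sitemaps.org protocol (the URL and values are placeholders):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- one <url> entry per page you want crawled -->
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2011-04-01</lastmod>
    <changefreq>weekly</changefreq> <!-- a hint, not a command -->
    <priority>0.8</priority> <!-- relative to the other URLs of your own site -->
  </url>
</urlset>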
First things first: get crawled
✔
Have a robots.txt file at the root of your
site
[next slide]
Robots.txt
User-agent: *
Disallow: /wikistats/
Disallow: /*action=history*
Allow: /Special:Sitemap*
Allow: /wiki/Special:Sitemap*
Disallow: /wiki/Special:*
Disallow: /Special:*
Sitemap: http://www.wikia.com/sitemap.xml
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
Details @ http://robotstxt.org
A good usage of the allow/disallow directives is e.g. for login/protected content and mobile content (disallow the normal crawler, allow the mobile crawler; we'll discuss it later)
WARNING: use it to disallow crawlers from reaching internal SERPs, as there's a high probability they would be flagged as search spam (at least by Google, which made an official announcement on the topic back in 2007)
Part of the robots.txt "protocol" is the ROBOTS HTML meta tag, which differs from the "rel=nofollow" directive introduced by Google for anchor tags.
Nofollowed links won't get any credit when Google ranks websites in the search results, thus removing the main incentive behind blog comment spam robots. The directive only affects the ranking; the Google robot may still follow the links and index them.
There are two important considerations when using the robots <META> tag:
•robots can ignore your <META> tag. Especially malware robots that scan the web for security vulnerabilities, and email address harvesters used by spammers, will pay no attention.
•the NOFOLLOW directive only applies to links on this page. It's entirely likely that a robot might find the same links on some other page without a NOFOLLOW (perhaps on some other site), and so still arrive at your undesired page.
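To make the difference concrete, here's how the two mechanisms look in markup (example.com is a placeholder):

<!-- page-wide: don't index this page and don't follow any of its links -->
<meta name="robots" content="noindex, nofollow">

<!-- per-link: don't pass any ranking credit through this specific link -->
<a href="http://example.com/untrusted" rel="nofollow">user-submitted link</a>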
First things first: get crawled
✔
Subscribe to Webmaster's services for
the main search engines
✔
Setup an analytics service for
monitoring your traffic
✔
Submit your sitemaps manually for fine-
grained control
Point 1
Webmaster services (or tools) are a great resource for monitoring the effects of your SEO efforts, submitting new content and tweaking the crawling rate. Always keep an eye on their documentation as it gets continuously updated (without notice!!!)
Point 2
Analytics let you see where your traffic comes from, which keywords brought it in, how visitors use your site and where they spend their time on it
Point 3
Some search engines (e.g. Yahoo) won't let you access some information/stats derived from sitemaps discovered only via robots.txt
REMEMBER TO VALIDATE YOUR SITEMAPS BEFORE SUBMISSION (you can use any XSD schema-based validator)
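For instance, a quick validation pass could look like this with xmllint, assuming you've fetched the official schema from sitemaps.org:

# download the official XSD, then validate the sitemap against it
wget http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd
xmllint --noout --schema sitemap.xsd sitemap.xml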
It's just the first step, do it right
Guess I'm getting all this crawling thing totally wrong...
Getting crawled by the major search engines is just a matter of following the simple steps I've described, but still many first-timers fail from the very beginning.
Be sure to start the right way ;)
(It's actually a real picture, don't ask me how the giraffe got up there...)
One step further: be valuable
✔
Write original, quality content
✔
Avoid pages with duplicated content
✔
Be sure your markup makes the most
out of HTML semantic structures
Point 1
Quality content: people want it, crawlers search for it, you'd better have some. If you think about it, it does make a lot of sense!
Point 2
Recently the term "duplicated content" was extended to cover what some content farms (now punished by the changes to Google's algorithm made in January 2011) do by copying and slightly changing quality content from other sources, but the original concept still applies: NEVER serve two pages with the same content and different URLs (unless you've done your URL normalization right, more about it later)
Point 3
[next slide]
Point 3
[next slide]
Be brave, embrace HTML5
HTML5 enhanced semantic structures
Enable you to literally model data, in a meaningful way
Using semantic structures helps reinforce the meaning of, and give priority to, different pieces of content on the same page (e.g. it's possible to use more than one h1 per page by wrapping them in sections/articles/headers/footers).
The era of "HTML constructs just for the sake of layout" is over: get rid of tables and multi-purpose divs. "Web Designers", you have been warned, get an HTML5 book and start reading!
While Google announced they've just started supporting HTML5 tags in a preliminary way, our tests confirm it is working, and it's working well:
[mention of Wikia's Image SEO project]
One more thing: the semantic web is still not here (we're just at the beginning), but it will be soon; using semantic structures makes your site future-proof.
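As a sketch of what this looks like in practice, here's a minimal HTML5 layout (a made-up blog page) where each sectioning element can carry its own heading:

<article>
  <header>
    <h1>SEO: FTW!</h1> <!-- heading scoped to the article -->
    <time datetime="2011-04-15">April 15, 2011</time>
  </header>
  <section>
    <h1>Get crawled</h1> <!-- another h1, scoped to this section -->
    <p>...</p>
  </section>
  <footer>Written by a hypothetical author</footer>
</article>
<nav>
  <!-- site navigation, clearly separated from the content above -->
</nav>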
One step further: be valuable
✔
Use good quality images, enrich them
with metadata and captions
[next slide]
Do your image homework
Athlete's epic failure at
Beijing Olympic Games
Filename:
/pictures/sport/epic_failure-Olympic_Games.png
HTML:
<figure>
  <img src="..." width="235" height="300"
       alt="epic failure at Olympic Games" />
  <figcaption>
    Athlete's <em>epic failure</em> at Beijing
    <em>Olympic Games</em>
  </figcaption>
</figure>
With image search becoming more and more popular (you would be impressed at how much traffic a blog gets from the images used in a post's headline), Image SEO becomes a need rather than a detail. Some useful tips:
•use a semantically meaningful folder hierarchy to physically store the file (the image crawler tries to make sense out of it)
•use keywords associated with the image content in the file name (separate multiple keywords/phrases with dashes), in the alt attribute (but totally avoid keyword stuffing, just describe the image briefly) and in the image caption
•use images in PNG, JPG and GIF formats, exactly in this order of preference (PNG gives the best results)
•use quality images (at least 235px × 300px, but the bigger the better; you can place a thumbnail in the page and link to the full-size version)
•once again: exploit HTML semantic structures to bind the caption to the image itself
•optionally, give emphasis to the keywords in the caption text
•always try to place the image in a block of text dealing with the same topic (or, in reverse, avoid using images that have nothing in common with the surrounding text)
•keep EXIF metadata if your image has it (date, location, camera model, etc.); sooner or later it will turn out to be useful for advanced searches
One step further: be valuable
✔
Give specific content that extra dose of
semantic meaning with Microformats
[next slide]
The “Micro” advantage
Choose your weapon
Microformats, Microdata or RDFa
It's all about adding semantic value
to content
<footer class="vcard">This article has
been written by <span class="fn">John
Doe</span></footer>
But it doesn't work for everything
Reviews, People, Products, Businesses
and organizations, Recipes, Events,
Video
I find Microformats to be an amazing idea: it takes so little to make your (existing) content so much more semantically valuable!
And you have plenty of choice:
- Microformats (based on the vcard/hcard standards)
- Microdata (exploiting HTML5's itemscope/itemprop attributes)
- RDFa (the most complex and verbose [XML-based], but probably the most flexible and open)
With the recent launch of Google Recipes Search, and with Facebook embracing hcard microformats on their event pages, this became the new focus in the SEO industry. The Semantic Web has never been so near...
[see the Google Recipes search result example at the bottom of the slide: it shows cooking time, ratings, reviews and amount of calories, all stored in the HTML markup of the target page via Microformats]
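As a sketch of the Microdata flavor, here's what minimal recipe markup could look like (the property names follow the data-vocabulary.org Recipe vocabulary that Google's rich snippets consumed at the time; the actual content is made up):

<div itemscope itemtype="http://data-vocabulary.org/Recipe">
  <h2 itemprop="name">Spaghetti alla Carbonara</h2>
  by <span itemprop="author">A. Hypothetical Chef</span>
  Total time: <span itemprop="totalTime">30 minutes</span>,
  <span itemprop="yield">serves 4</span>
</div>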
One step further: be valuable
✔
Keep updating your content (ad libitum)
[next slide]
If you can, get a trained update monkey
Done, Boss! Weekly update pushed to server!
The reason why blogs are easier to position than corporate websites, and also the reason why black hat SEO uses "splogs" (spam blogs) for backlinks, is that it's in the nature of this kind of platform to be continuously updated and to be less complex to crawl (thanks to a simplified site structure).
Crawlers love that; they crave fresh content more than a Transylvanian vampire craves virgin's blood.
Keep updating your content; the best rate is at least twice per week.
Improve your position: be smart
✔
Keep the total number of links on a
page below 50
[next slide]
PageRank, it's simpler than THEY think
PageRank
the TeleTubbies' way
PageRank is partially calculated on the real attributes/merits of a page (like content quality), but is also partially based on a "weighted voting system".
Each page is given the possibility to vote for other pages (internal or external, it doesn't matter); the total value of its vote is 1.0. This value gets divided among all the links on that page, but not equally: links using relevant keywords and placed higher in the page get more "weight". For example, a page with 10 links passes on average 1.0/10 = 0.1 of its vote through each of them, but a keyword-rich link at the top of the page will receive a bigger share than a generic "click here" buried in the footer.
Understanding how this works makes it quite clear why a sane SEO strategy won't let any page have more than 50 links, and it's also the reason why, when requesting backlinks from other sites, you should never pick pages that cross that number.
The more "weighted" votes a page gets, the higher its PageRank will be.
Improve your position: be smart
✔ Give a glimpse of the site structure and of the
latest updates on your Homepage
✔ Cross link your content to provide more links to
most important pages
✔ Normalize your URLs to avoid sequential content
being flagged as duplicate content
✔ If your site is about a local business/activity
remember to include information about the
location at least on the homepage
✔ Mind your domain
Point 1
On your site the Homepage is the page whose vote is given the highest weight: whatever you link from there will have a higher chance to rank well, so using the Homepage to link to relevant/fresh content is a good way to give those pages a boost while helping the new visitor start browsing your website; it also helps the returning visitor get the latest updates quickly.
Point 2
The more a page is linked across a website, the more relevance will be given to it. Keep your cross links updated and use them for really important/relevant stuff.
Point 3
Paginated listings of items (be it a list of books to buy or the list of articles contained in a category) can lead to duplicate content, especially if the user can change the sorting order at will. Place a <link rel="canonical" href="[Main URL here]" /> in the page header to tell the crawler that it's just another URL for accessing the same content. This is the most common scenario but not the only one (e.g. content accessible via different categories); see the sketch below.
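A minimal sketch of Point 3 with hypothetical URLs: every sorted/paginated variant of a listing declares the main listing URL as its canonical.

<!-- served on http://example.com/books?sort=price&page=2 -->
<head>
  <link rel="canonical" href="http://example.com/books" />
</head>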
Point 4
After Google announced the addition of location-based searches (via a geo-locatable device, Google Maps, or Google search itself), pages containing location information rank higher in that specific kind of SERP. Now that "personal search" is more than a reality, this is even more important.
Point 5
Try to get a domain name as short as possible; it should contain at least the main keyword you're targeting (this is getting harder over time, so when something new comes up, act quickly)
SEO like a PRO
✔
Be sure to include meta tags (for description
and keywords) on each page
[next slide]
Use META for fine-grained control on SERP
Old stuff never dies, but it changes over time. Today's Search Engines pay little (Yahoo) or no (Google) attention at all to Meta tags when it comes to granting PageRank. So you won't see those epic battles in big corporations, as depicted in the picture with the two zebras, anymore. That is a thing of the past.
But those tools still have a use in focused SEO strategies: they give you more control over SERP summaries/snippets, letting you tell the Search Engine for which keywords the Meta description content should be used as the snippet representing the page in the results listings. When these tags are not present (or the keyword used for searching doesn't match their contents) the Search Engine will extract a snippet directly from the page content, and most of the time it fails in a very bad way.
So the goal has switched from attracting the attention of the crawler to attracting the attention of the user.
When choosing keywords you should consider using long-tail ones: the total amount of traffic generated by a "cluster" of those keywords is the same as that of one or two more general keywords (a technique also known as cluster optimization)
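In markup this boils down to two tags in the page head; here's a sketch with made-up content (the shop name and keywords are invented):

<head>
  <title>DVD Rental in Poznań - Cheap Weekly Offers | ExampleShop</title>
  <meta name="description"
        content="Rent DVDs in Poznań from 5 PLN a week. Free delivery, no late fees." />
  <meta name="keywords" content="dvd rental, poznań, cheap movie rental" />
</head>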
SEO like a PRO
✔
Keep your Keyword Density below 3%
✔
Use domains at least 3 years old, and more
than 1
✔
Model your URLs
Point 1
KD is the percentage of times a keyword or phrase appears on a web page compared to the total number of words on the page; the optimum keyword density is 1 to 3 percent, and using a keyword more often than that could be considered search spam. For example, a keyword used 10 times on a 500-word page gives a KD of 10/500 = 2%. There are simple equations to calculate KD for both single keywords and keyword phrases (you can find them on Wikipedia), and there are also numerous automated tools that let you analyze all the pages of a website with a single click.
Point 2
Older domains are given trust (think of those scam/phishing sites that appear and disappear)
Point 3
[next slide]
Make your URLs sexy
http://mysite.com/index.php?itemid=3346 http://mysite.com/angelina-jolie-bikini
Which link would you click on?
This is a classic SEO joke, but the problem is real and still very common.
It doesn't matter whether you're writing your own Content Management System or using an existing one, you need to pay attention to the URLs it generates.
Avoid dynamic formats: using URL rewriting you can achieve much more interesting page addresses that catch the attention of the user (since they can understand them) and give valuable information to crawlers (even in a semantic way, e.g. mysite.com/equipment/sport/running/shoes/reebok-air-white).
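Here's a minimal sketch of how such a rewrite could be wired up with Apache's mod_rewrite (assuming an Apache setup; the paths and parameter names are illustrative):

# .htaccess: serve the sexy URL while internally mapping it to the dynamic one
RewriteEngine On
RewriteRule ^angelina-jolie-bikini$ /index.php?itemid=3346 [L]
# or, more generally, let the CMS resolve the slug itself
# (a real setup would add RewriteCond checks to skip existing files)
RewriteRule ^([a-z0-9/-]+)$ /index.php?slug=$1 [L,QSA]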
SEO like a PRO
✔ Be sure to include meta tags (description and
keywords) on each page
✔ Keep your Keyword Density below 3%
✔ Use domains at least 3 years old, and more than 1
✔ Model your URLs
✔ Make a wise use of 301 redirects
✔ Link to and get linked from relevant resources
✔ Make it to Web Directories
✔ Keep monitoring and tweaking, your job is never done
✔ PubSubHubbub indexing, only for the good guys
Point 1
Let's clarify it once and for all: 302 redirects are meant only for temporary changes; 301 redirects automatically transfer all the attributes of a URL to a new one, PR included. Use them wisely, they're a powerful tool for link sculpting.
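For example, on Apache both kinds of redirect can be declared with mod_alias in one line each (the URLs are hypothetical):

# .htaccess: permanent move, transfers the old URL's attributes, PR included
Redirect 301 /old-catalog http://www.example.com/new-catalog
# temporary move: the original URL keeps its attributes
Redirect 302 /homepage http://www.example.com/summer-campaign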
Point 2
In a Link Building strategy, getting backlinks from resources that are neither relevant nor related to your site's focus area is meaningless: no PR will come from there and they will smell of paid link spamming no matter what. Watch out!
Point 3
As for the previous point, Web Directories are considered to be both credible and relevant to any topic (given their nature); try to get your site included in DMOZ's collection of links, and submit it for review as early as possible.
Point 4
The SEO industry is in continuous evolution: ranking algorithms get tweaked at a steady rate, new standards rise and fall and, most of all, the traffic reaching your site changes continuously. Keep monitoring the search keywords that bring visitors to your site and tweak your strategy accordingly.
Point 5
[next slide]
Are you the Chosen One?
PubSubHubbub Google Indexing
This option is offered only to a strict selection of websites
Details @ http://code.google.com/p/pubsubhubbub
Slide transcript TBD.
This will be your obsession
Google logo, in all its glorious variants
As you get more and more familiar (and involved) with advanced SEO principles you'll be able to exploit them productively and dramatically improve your site's position in SERPs.
But there's still a lot more you could do! SEO has a more "commercial" variant: SEM.
SEM: if you've got $$$ to spend
✔
Paid search campaigns
✔
Paid links campaigns
✔
ATO
✔
Paid articles and featurettes (+backlink)
✔
Paid blogs
Search Engine Marketing is about being able to apply SEO techniques on someone else's site for your own advantage, and of course, as you can guess, this implies a considerable transfer of money.
But it does its dirty job pretty well.
Usually this is where you end up digging when you can't do anything more to improve your organic results and your previous SEO efforts have brought you an embarrassing amount of money. When you start doing this, you'll need to start dealing with the Dark Lord, a.k.a. ROI.
Just one simple rule: whatever you end up buying, be sure to do it on relevant resources related to your website's focus area (don't buy links on an automotive website if you sell vegetables, it's just a waste of money)
The power of Bling
If you've got enough
budget then the answer to
your SEM question is:
YES, YOU CAN!
In the end SEM is all about how big your budget is.
Black hats: meet the bad guys
✔ Keyword stuffing
✔ Hidden or invisible text
✔ Meta-tag stuffing
✔ Doorway pages
✔ Scraper sites
✔ Article spinning
✔ Link spam
✔ Link-building software
✔ Link farms
✔ Hidden links
✔ Sybil attack
✔ Spam blogs
✔ Page hijacking
✔ Buying expired domains
✔ Cookie stuffing
✔ Spam in blogs
✔ Comment spam
✔ Wiki spam
✔ Referrer log spamming
✔ Mirror websites
✔ URL redirection
✔ Cloaking
Scraper sites
Scraper sites are created using various programs designed to "scrape" search-engine results pages or other sources of content and create "content" for a website.
Article spinning
Article spinning involves rewriting existing articles. This process is undertaken by hired writers or
automated using a thesaurus database or a neural network.
Sybil attack
A Sybil attack is the forging of multiple identities for malicious intent. A spammer may create
multiple web sites at different domain names that all link to each other, such as fake blogs.
Page hijacking
Page hijacking is achieved by creating a rogue copy of a popular website which shows contents
similar to the original to a web crawler but redirects web surfers to unrelated or malicious websites.
Cookie stuffing
Cookie stuffing involves placing an affiliate tracking cookie on a website visitor's computer without
their knowledge, which will then generate revenue for the person doing the cookie stuffing.
Referrer log spamming
Referrer spam takes place when a spam perpetrator or facilitator accesses a web page (the
referee), by following a link from another web page (the referrer), so that the referee is given the
address of the referrer by the person's Internet browser.
Mirror websites
A mirror site is the hosting of multiple websites with conceptually similar content but using different
URLs. Some search engines give a higher rank to results where the keyword searched for appears
in the URL.
Know your enemy
The Google Almighty Search Patrolling Task
Force Team member
If you decide to go over to the "Dark Side", then be aware of who you're going to fight!
These guys spend their lives buried in the geekiest corner of the Googleplex; probably the last (and only) date they had was with their optometrist for a new pair of 80's-looking glasses, and this should tell you how much they hate the rest of the world sitting on the other side of the screen, especially the SEO smart-asses who go from club to club in their Porsche Carrera with the typical top model sitting on the passenger side...
They've been trained to kill and they've got clearance from the CIA to do that. If I were in your shoes, I would avoid catching their attention.
Sooner or later...
MHH?!? I sense a disturbance in the SERPS...
Sooner or later Search Engines catch the bad guys, here's usually what
happens...
[next slide]
...this is what happens usually
Sorry man, we've just changed our algorithm
And this is usually the explanation! They get huge PageRank penalties (which bury them in the depths of the SERPs, where no man has ever been before) or, in the worst cases, their sites get removed from the indexes.
[see what happened to J.C. Penney (link spamming and non-related paid backlinks) and BMW (doorway pages)]
SEO of the (near) future
✔
SEO for UGC
✔
Mobile SEO
✔
SEO and Social recommendations
✔
SEO and AR
I've left this part of the lecture a bit more open; I'll quickly describe each item on the list, but we can discuss them as you wish (this applies if you're attending a live lecture/talk)
Point 1
The old way: the site staff imposes a manual/automated SEO strategy on Community content. The new way: involve the community in the SEO strategy via engaging/game mechanics [explain the SEO dashboard project]
Point 2
Meta.txt, Mobile Sitemaps, the .mobi domain, Apps/Markets presence
Point 3
Facebook-Bing integration, Google +1, StumbleUpon, Digg
Point 4
[next slide]
Augmented Reality
Here it's the "environment" that determines the search results, through a combination of proximity detection, geolocation and image recognition
Slide transcript TBD.
Wait!
What about SEO 2.0?
Emotional SEO (SEO 2.0) 1/2
SEO
✔
Link building, manually adding them,
submitting static websites to directories,
link exchange, paying for links
✔
On site optimization for spiders.
Example: Repetitive page titles
concentrating (solely) on keywords
✔
Competition: You compete with others to
be on the first page/in the Google top 10
for keywords
✔
Barter: You give me a link and only then I
will give you one
✔
Hiding: We’re not doing SEO, we can’t
show our client list publicly, generic SEO
company
✔
keywords
SEO 2.0
✔
Getting links, via blogging, writing
pillar content, creating link bait,
socializing
✔
On site optimization for users.
Example: Kick ass post headlines
✔
Cooperation: You cooperate with each
other sharing fellow blogger’s post on
social media, you link to them
✔
Giving: I link to you regardless of whether
you link back, but in most cases you
will, more than once
✔
Being open: Welcome our new client
xyz, we are proud to work together
with them, Rand Fishkin, Lee Odden,
BlueGlass
✔
tags
Here's a brief comparison (courtesy of Tadeusz Szewczyk) of how SEO in the near future will differ from what we currently do and how we currently think.
Additional transcript TBD.
[continues on next slide]
Emotional SEO (SEO 2.0) 2/2
✔
optimization for links and rankings
✔
clicks, page views, visits
✔
DMOZ
✔
Main traffic sources: Google, Yahoo,
MSN
✔
one way communication
✔
top down, corporations and old
media decide what succeeds
✔
undemocratic, who pays most is on
top
✔
50% automated, half of the SEO
tasks can be done by SEO software
✔
technocratic
✔ optimization for traffic and engagement
✔ conversions, ROI, branding
✔ Delicious, Digg
✔ Main traffic sources: Facebook,
Twitter, StumbleUpon, niche social
news sites, blogs
✔ dialog, conversation
✔ bottom up, wisdom of crowds
determines true popularity via
participation
✔ democratic, who responds to popular
demand is on top
✔ 10% automated, most SEO 2.0 tasks
are about content and interaction
✔ emotional
Slide transcript TBD.
Links you might find interesting
Webmaster Tools
✔
http://www.google.com/webmasters/tools/
✔
https://siteexplorer.search.yahoo.com/mysites
✔
http://www.bing.com/webmaster/
References
✔
http://www.google.com/support/webmasters/
✔
http://help.yahoo.com/l/us/yahoo/search/webcrawler/index.html
News
✔
http://www.seroundtable.com/
✔
http://www.ysearchblog.com/
✔
http://www.bing.com/community/site_blogs/b/webmaster/default.aspx
✔
http://googlewebmastercentral.blogspot.com/
Here's a list of links definitely worth a mention (and a visit).
Before we finish...
Questions, anyone?
If you're not attending a live lecture/talk please feel free to contact me via:
Twitter: http://www.twitter.com/federico_lox
Facebook: http://www.facebook.com/flucignano
Tumblr: http://loxzone.tumblr.com
Thank you for your time
Thanks for your attention, I hope you enjoyed the lecture!
Thanks to
Wikia for letting me prepare this lecture
Politechnika Poznańska for hosting it
Dr Andrzej P. Urbański for organizing the event
DISCLAIMER
No animal, evil Dark Lord or U.S. President has
been harmed during the preparation of this
presentation.
Unlocking the Future of AI Agents with Large Language Models
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 

SEO: FTW!

  • 5. Why should we care about SEO? But how are Google & co. taking over our daily lives? Simply, by introducing extremely easy-to-use search functionality into popular software applications and electronic devices. Let's take a look at some examples. [next slide]
  • 6. Web browsers: Internet Explorer Versions 7 & 8: search box in the upper right corner of the UI. Version 9: Search integrated in the Address bar. The first step was to integrate Web Search capabilities right into the UI of prime-time Web browsers, which are probably the most frequently used software of all nowadays. At first the search field was a standalone UI widget which would run a search only if the user interacted directly with it by inputting one or more keywords and then pressing a button (e.g. Internet Explorer 7 & 8, Firefox 3, Safari). But in the latest versions this functionality got integrated into the URL bar; this means that if a user types anything there that is not a URL (even a mistyped or incomplete one), a search will be initiated on the (not always user-customizable) search engine. This did the trick: Search Engines are now always there, waiting for your input!
  • 7. Web browsers: Google Chrome Omnibox Serves both as the Address bar and as a search field with realtime search results and suggestions. To this Chrome adds realtime as-you-type search suggestions; basically this implies that every time you type a letter the browser will ask the Search Engine for those suggestions.
  • 8. Web browsers: Firefox Awesome bar Serves both as the Address bar and as a search field. Search box At the top right corner in the UI. Default Homepage A Mozilla-customized Google page. With the release of version 4, Mozilla couldn't make it more clear: they love Google. By default the browser will let the user search the web via the old-fashioned stand-alone search field (on the right in the UI), via the so-called “Awesome bar” (which works in a pretty similar way to Chrome's “Omnibox” URL bar) and via the Firefox-themed default Google homepage. 3 in 1, who could resist such an offer? I was amazed to discover that some of my friends are used to typing the domain name (without extension) into the first search field available to them, even though they know perfectly well the full URL of the website they want to reach, and THEN clicking the first result the Search Engine shows. If you think about it, it makes a lot of sense: not all computer users (and electronics consumers more in general) have a Master's degree in IT; to most of them what matters is to get what they want with the minimum effort (in this case the effort is to type or remember the full address of a website). It doesn't matter that this way they're triggering a double roundtrip (one to get the search result, another to get to the real site); in the end, who cares about such a thing in the era of high-speed Internet connections? Exactly: the Search Engines DO care, since this way they're able to know where users want to go (even when they don't really need to search anything)!
  • 9. Desktop, Laptop and netbooks Desktop Search box Lets you find any file and run any action on your system; of course it lets you run searches on the web too. But there's a different breed of software that makes this integration even more pervasive: Desktop Search/Application Launchers. These applications, which are pretty familiar to Linux and Mac users, let you run commands/tasks, open websites, search for files on your PC, play videos/music, make calculations and, most important of all, initiate Web searches together with integrated results listings. All by just pressing a simple key combination and starting to type, whenever you want, no matter what other task your computer is already busy with (except games).
  • 10. Mobile devices: iOS and Android Android browser's omnibox As in Chrome, serves both as the address bar and a search bar with realtime suggestions. Mobile Safari's search box Always available at the top-right corner in the UI. Android search widget Similar to Google Desktop search, lets you find any file and run actions on the system together with web searches. What was said for Web browsers and Desktop Search applies also to modern (and smarter) mobile devices. The Android default browser and Mobile Safari on iOS behave exactly like Chrome (omnibox) and Safari (standalone search field), while Android's Search Widget is a cut-down version of a Desktop Search client. Here we go: the Search Engine can now follow you any time, wherever you go, ready to help you find what you need.
  • 11. And much more... Google refrigerator Shows you realtime suggestions about what you'd like to drink! :D I'm skipping a long list of other Internet-enabled devices like gaming consoles (my Nintendo Wii runs Opera), Media Centers (which unfortunately were born already obsolete) and new-generation TVs/content delivery systems (like Apple TV); anything that can connect to the Internet has the possibility to run a Web search. One day even your fridge! ;)
  • 12. A new definition of User eXperience UX Accessible Findable Usable Valuable Credible Desirable With search functionality becoming so prominent, users are starting to rely on it constantly as the main way to get to content, even content they already know how to reach. This new habit of “finding” content defines a new set of attributes for content to be found easily; this is what has recently been named “Findability”. It's one of these new “Internet slang” words, like Gamification... They sound so New Age, don't they? This new set of attributes crosses the limits of plain SEO and requires all the parties involved in the Internet Publishing industry (Design, Marketing, Copywriting, Development, etc.) to cooperate in a much deeper way than before; long gone are the days when a Designer was not supposed to be aware of what markup would be required to lay out his sketches and what impact the position of a piece of content on the page would have on the relevancy given to it. Findability is actually the main goal of what is being defined as “SEO 2.0”, or “Emotional SEO”; we'll come back to this later. First things first, let's start from the beginning: the title of this lecture... [switch slide]
  • 13. What's in an acronym? ✔ SEO – White hats – Black hats ✔ FTW – For The Win! – F#@$ The World! SEO and FTW: how are they related? •Both acronyms started to be used in the mid-90's •Both acronyms are made up of 3 letters •Both had initially a negative connotation •Both evolved to outline positive concepts •The difference is: the negative connotation of FTW didn't have any economic benefit, while SEO Black hat techniques still generate money even though they work only for a short amount of time; that's why the latter still stick around.
  • 14. Before we start ✔ SEO ✔ SEM ✔ SERP ✔ PR ✔ IM ✔ PV ✔ CPC ✔ ATO ✔ Robot / crawler / spider ✔ Page title ✔ Meta description ✔ Meta keywords ✔ Bounce rate ✔ Conversion rate ✔ Keyword density ✔ Web directory ✔ Natural / organic results ✔ Spamdexing ✔ IRYD Before we start, let's quickly review the meaning of some common terms/acronyms I will use during this talk. TBD If you don't know what the last item in the list stands for... that's good! It doesn't exist, I just put it there to check if you are paying attention :) Actually it's a quote from the Transformers movie: “I Rise, You Die” - Optimus Prime
  • 15. A bit of history first: 20th century ✔ 1993 – ALIWEB, the first public search engine ✔ 1995 – Altavista, the first BIG search engine (later absorbed by Yahoo) ✔ 1997 – the SEO acronym appears for the first time on a web page ✔ 1997 – Search engines acknowledge Webmasters' SEO efforts and start fighting back against spamdexing ✔ 1998 – Page and Brin develop Backrub and later found Google But first a bit of history (courtesy of Wikipedia). Webmasters and content providers began optimizing sites for search engines in the mid-1990s, as the first search engines were cataloging the early Web. Initially, all webmasters needed to do was submit the address of a page, or URL, to the various engines, which would send a "spider" to "crawl" that page, extract links to other pages from it, and return information found on the page to be indexed. Site owners started to recognize the value of having their sites highly ranked and visible in search engine results, creating an opportunity for both white hat and black hat SEO practitioners. The first documented use of the term Search Engine Optimization was by John Audette and his company Multimedia Marketing Group, as documented by a web page from the MMG site from August 1997 on the Internet Wayback Machine (Document Number 19970801004204). Early versions of search algorithms relied on webmaster-provided information such as the keyword meta tag, or index files in engines like ALIWEB. Meta tags provide a guide to each page's content. By relying so much on factors such as keyword density, which were exclusively within a webmaster's control, early search engines suffered from abuse and ranking manipulation. To provide better results to their users, search engines had to adapt to ensure their results pages showed the most relevant search results, rather than unrelated pages stuffed with numerous keywords by unscrupulous webmasters. Graduate students at Stanford University, Larry Page and Sergey Brin, developed "Backrub," a search engine that relied on a mathematical algorithm to rate the prominence of web pages. The number calculated by the algorithm, PageRank, is a function of the quantity and strength of inbound links.
  • 16. A bit of history first: 21st century ✔ 2004 – All the major Search Engines officially switch to a PageRank-like algorithm and start to partially disclose details through Webmaster-targeted portals ✔ 2005 – Google introduces personalized search results for logged in users ✔ 2009 – Google introduces history, location and realtime search features, Social “Bookmarking” gains consensus ✔ 2010 – The era of mobile search begins with the rise of smarter mobile devices (iOS, Android) ✔ 2011 – Social networks and recommendation services redefine the web landscape, Google announces +1 and Recipes Search, Facebook embraces microformats By 2004, search engines had incorporated a wide range of undisclosed factors in their ranking algorithms to reduce the impact of link manipulation. Google says it ranks sites using more than 200 different signals. The leading search engines, Google, Bing, and Yahoo, do not disclose the algorithms they use to rank pages. Notable SEO service providers, such as Rand Fishkin, Barry Schwartz, Aaron Wall and Jill Whalen, have studied different approaches to search engine optimization, and have published their opinions in online forums and blogs. SEO practitioners may also study patents held by various search engines to gain insight into the algorithms. In 2005 Google began personalizing search results for each user. Depending on their history of previous searches, Google crafted results for logged in users. In 2008, Bruce Clay said that "ranking is dead" because of personalized search. It would become meaningless to discuss how a website ranked, because its rank would potentially be different for each user and each search. In 2007 Google announced a campaign against paid links that transfer PageRank.[16] On June 15, 2009, Google disclosed that they had taken measures to mitigate the effects of PageRank sculpting by use of the nofollow attribute on links. Matt Cutts, a well-known software engineer at Google, announced that Google Bot would no longer treat nofollowed links in the same way, in order to prevent SEO service providers from using nofollow for PageRank sculpting.[17] As a result of this change the usage of nofollow leads to evaporation of PageRank. In order to avoid the above, SEO engineers developed alternative techniques that replace nofollowed tags with obfuscated JavaScript and thus permit PageRank sculpting. Additionally several solutions have been suggested that include the usage of iframes, Flash and JavaScript.[18] In December 2009 Google announced it would be using the web search history of all its users in order to populate search results.[19] Real-time search was introduced in late 2009 in an attempt to make search results more timely and relevant. Historically site administrators have spent months or even years optimizing a website to increase search rankings. With the growth in popularity of social media sites and blogs the leading engines made changes to their algorithms to allow fresh content to rank quickly within the search results.[20]
  • 17. SEO evolution so far: a retrospective Bruce Clay: “Something must have gone totally wrong” Retrospective on SEO evolution so far Who is Bruce Clay? SEO international consultant, Board of Directors member of SEMPO (Search Engine Marketing Professional Organization), member of the Web Analytics Association, the American Marketing Association and the International Internet Marketing Association, author of the SEO Code of Ethics (translated into 18 languages, published in 2001). Back in 2005, when Google introduced per-user personalized Search Results, he declared “ranking is dead”, since the position of each result in SERPs would be different for each user. Mr. Clay belongs to the old generation of SEO consultants, those old White hats who saw in a mathematically calculated PageRank the “Panacea” for all the evil concerning the rising SEO industry; to those people, and even more to Black hats, the idea that the ranking of a page would be more and more dependent on actions out of the direct reach of a webmaster is frightening. It's normal: Search Engine policies are taking power away from the hands of those guys and putting it back in the hands of the users (and their friends, see the social developments of the upcoming Google +1 and Stumble Upon); if my salary depended entirely on it, I would be scared too.
  • 18. IMHO: the best has yet to come The ugly 1993 - 1997 The bad 1998 - 2004 The good 2005 – till now The best Yet to come The best has yet to come: my personal opinion I'm no big authority in the SEO industry as Mr. Clay is, but like many other “no-ones” who are good at SEO I have my personal, strong opinion, which can't but be totally opposite to that of these gurus of long fame. We went through the worst between 1993 (the first public search engines appear) and 1997 (spamdexing spreads as a standard SEO practice, the Search Engines start to fight back). Then the situation started to improve between 1998 (Backrub, the first PageRank-driven Search algorithm, is implemented) and 2004 (all the major search engines switch to a PageRank-like implementation, but still one that could be tricked in many ways). Starting from 2005 the big players started to introduce new concepts and, more important, STANDARDS; they also started to share some detailed information with Webmasters and SEO enthusiasts and began to punish the bad practices with no mercy. Also the PageRank algorithm became smarter and harder to trick, and keeps improving day by day. The best is yet to come: with Findability becoming a more prominent goal and Social Networks taking the stage we'll probably see big changes in the not so far future; actually the first signs are appearing right now (Google +1, Stumble Upon, Digg, the new possibilities opened by the interaction between mobile and non-mobile devices and the environment [augmented reality] and, best of all, the SEMANTIC WEB [thanks to microformats]).
  • 19. First things first: get crawled ✔ Be sure that your site has interesting content to attract crawlers and check the server uptime ✔ Prepare at least a standard XML Sitemap ✔ Have a robots.txt file at the root of your site ✔ Subscribe to Webmaster's services for the main search engines ✔ Setup an analytics service for monitoring your traffic ✔ Submit your sitemaps manually for fine-grained control Before starting Let's quickly talk about crawlers... [next slide]
  • 20. Know the beast Meet the crawler... ...and his daily meal I'm amazed sometimes that some “SEO-aware” people still don't exactly know how a crawler works; in particular, they have a really magical/mystical idea of what a crawler “sees” when doing its duty. So let me give you a clear description of it and its inner workings (even though I'm sure many in this room already know what I'm gonna say). A web crawler (or spider, or bot) is nothing more than an HTML parser on steroids: it's aware of the semantic meaning of the markup and can recognize tricks, watching out for tricky CSS/JS at a basic level; it is even able to interpret a series of binary files like PDFs, DOCs and TXTs. Specialized crawlers can do even more, e.g. Googlebot-image is able to analyze an image's size, extract the EXIF data, analyze the colours, recognize faces, etc., no matter the file format. Once you've attracted a crawler's attention you need to keep in mind that each crawling pass has a time limit; the crawler won't go through all your content at once, so be sure to avoid it losing time on irrelevant and unimportant stuff (i.e. make good use of sitemaps and robots.txt; avoid the crawler getting stuck parsing a 25MB PDF, and have a description of the file in your page near the download link instead)
  • 21. First things first: get crawled ✔ Be sure that your site has interesting content to attract crawlers and check the server uptime ✔ Prepare at least a standard XML Sitemap Point 1 Do not submit your “under construction” website on your testing server to search engines; in the early days of a new website the crawling rate is really low, and days could pass before your real content gets indexed and starts driving traffic. Point 2 [next slide]
  • 22. XML Sitemaps Details @ http://sitemaps.org/ Google expanded the original Sitemaps protocol to include: •Image data support in standard Sitemaps (giving the Googlebot-image crawler image data even before it starts crawling a site, with license and caption included) •News sitemaps (for websites accepted in Google News, Googlebot-news) •Video sitemaps (same as images, data can be included in the regular sitemap; the Googlebot-video crawler will collect that data before Googlebot starts crawling the site) •Mobile sitemaps (this is a separate sitemap, even if your site uses the same URLs; Googlebot-mobile and YahooSeeker/M1A1-R2D2)
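To make the protocol concrete, here is a minimal sketch of a standard sitemap entry extended with the Google image namespace; the example.com URLs, date and caption are placeholders, not taken from the slides:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <!-- the canonical address of the page -->
    <loc>http://www.example.com/articles/olympic-games.html</loc>
    <lastmod>2011-04-01</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
    <!-- image data handed to Googlebot-image before it even crawls the page -->
    <image:image>
      <image:loc>http://www.example.com/pictures/sport/epic_failure-Olympic_Games.png</image:loc>
      <image:caption>Athlete's epic failure at Beijing Olympic Games</image:caption>
    </image:image>
  </url>
</urlset>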
  • 23. First things first: get crawled ✔ Have a robots.txt file at the root of your site [next slide]
  • 24. Robots.txt User-agent: * Disallow: /wikistats/ Disallow: /*action=history* Allow: /Special:Sitemap* Allow: /wiki/Special:Sitemap* Disallow: /wiki/Special:* Disallow: /Special:* Sitemap: http://www.wikia.com/sitemap.xml <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"> Details @ http://robotstxt.org A good use of the allow/disallow directives is e.g. login/protected content and mobile content (disallow the normal crawler, allow the mobile crawler; see the sketch below, more on it later) WARNING: use it to disallow crawlers from reaching internal SERPs, as there's a high probability that they would be flagged as search spam (at least by Google, which made an official announcement on the topic back in 2007) Part of the robots.txt “protocol” is the ROBOTS HTML meta tag, which differs from the “rel=nofollow” directive introduced by Google for anchor tags. Those links won't get any credit when Google ranks websites in the search results, thus removing the main incentive behind blog comment spammers' robots. This directive only affects the ranking, and the Google robot may still follow the links and index them. There are two important considerations when using the robots <META> tag: •robots can ignore your <META> tag. Especially malware robots that scan the web for security vulnerabilities, and email address harvesters used by spammers, will pay no attention. •the NOFOLLOW directive only applies to links on this page. It's entirely likely that a robot might find the same links on some other page without a NOFOLLOW (perhaps on some other site), and so still arrive at your undesired page.
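A minimal sketch of the mobile use case mentioned above (the /m/ and /login/ paths are hypothetical): the mobile crawler gets to index the mobile version, while every other crawler is kept out of it.

# let the mobile crawler index the mobile pages
User-agent: Googlebot-Mobile
Allow: /m/

# keep everyone else out of the mobile version and the protected area
User-agent: *
Disallow: /m/
Disallow: /login/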
  • 25. First things first: get crawled ✔ Subscribe to Webmaster's services for the main search engines ✔ Setup an analytics service for monitoring your traffic ✔ Submit your sitemaps manually for fine-grained control Point 1 Webmaster's services (or tools) are a great resource for monitoring the effects of your SEO efforts, submitting new content and tweaking the crawling rate. Always keep an eye on their documentation as it gets continuously updated (without notice!!!) Point 2 Analytics lets you analyze where your traffic comes from, which keywords it uses, how visitors use your site and where they spend time on it Point 3 Some search engines (e.g. Yahoo) won't let you access some information/stats deriving from sitemaps discovered only via robots.txt REMEMBER TO VALIDATE YOUR SITEMAPS BEFORE SUBMISSION (you can use any XSD schema-based validator)
  • 26. It's just the first step, do it right Guess I'm getting all this crawling thing totally wrong... Getting crawled by the major search engines is just a matter of following the simple steps I've described, but still many first-timers fail from the very beginning. Be sure to start the right way ;) (It's actually a real picture, don't ask me how the Giraffe got over there...)
  • 27. One step further: be valuable ✔ Write original, quality content ✔ Avoid pages with duplicated content ✔ Be sure your markup makes the most out of HTML semantic structures Point 1 Quality content: people want it, crawlers search for it, you'd better have some. If you think about it, it does make a lot of sense! Point 2 Recently the term “duplicated content” was extended to what some content farms (now punished by the changes to Google's algorithm done in January 2011) do by copying and slightly changing quality content from other sources, but the original concept still applies: NEVER serve two pages with the same content and different URLs (unless you've done your URL normalization right, more about it later) Point 3 [next slide]
  • 28. Be brave, embrace HTML5 HTML5 enhanced semantic structures Enable you to literally model data in a meaningful way Using semantic structures helps in reinforcing the meaning of, and in giving priority to, different pieces of content in the same page (e.g. it's possible to use more than one h1 per page by wrapping them in sections/articles/headers/footers). The era of “HTML constructs just for the sake of layout” is finished, get rid of tables and multifunctional divs. “Web Designers”, you have been warned: get an HTML5 book and start reading! While Google announced they've just started supporting HTML5 tags in a preliminary way, our tests confirm it is working, and working well: [mention of Wikia's Image SEO project] One more thing: the semantic web is still not here (we're just at the beginning), but it will be soon; using semantic structures makes your site future-proof.
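A minimal sketch of what this looks like in markup — note the two h1 elements, each scoped by its own sectioning element (the content itself is invented):

<article>
  <header>
    <h1>Beijing Olympic Games</h1>
  </header>
  <section>
    <!-- a second h1 is fine here: the section gives it its own scope -->
    <h1>Opening ceremony</h1>
    <p>...</p>
  </section>
  <footer>Filed under Sport</footer>
</article>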
  • 29. One step further: be valuable ✔ Use good quality images, enrich them with metadata and captions [next slide]
  • 30. Do your image homework Athlete's epic failure at Beijing Olympic Games Filename: /pictures/sport/epic_failure-Olympic_Games.png HTML: <figure> <img src="..." width="235" height="300" alt="epic failure at Olympic Games" /> <figcaption> Athlete's <em>epic failure</em> at Beijing <em>Olympic Games</em> </figcaption> </figure> Caption With image search becoming more and more popular (you would be impressed at how much traffic a blog gets from the images used in the post's headline) Image SEO becomes a need rather than a detail; some useful tips: •use a semantically meaningful folder hierarchy to physically store the file (the image crawler tries to make sense out of it) •use keywords associated with the image content in the file name (separate multiple keywords/phrases with dashes), in the alt attribute (but totally avoid keyword stuffing, just use what describes the image briefly) and in the image caption •use images in PNG, JPG and GIF formats, exactly in this order (PNG gives the best results) •use quality images (at least 235px * 300px, but the bigger the better; you can place a thumbnail in the page and link to the full size version) •once again: exploit HTML semantic structures to bind the caption to the image itself •Optionally, give emphasis to the keywords in the caption text •always try to place the image in a block of text dealing with the same topic (or the reverse: avoid using images that have nothing in common with the surrounding text) •keep EXIF metadata if your image has it (date, location, camera model, etc.), sooner or later that will turn out to be useful for advanced searches
  • 31. One step further: be valuable ✔ Give specific content that extra dose of semantic meaning with Microformats [next slide]
  • 32. The “Micro” advantage Choose your weapon Microformats, Microdata or RDFa It's all about adding semantic value to content <footer class="vcard">This article has been written by <span class="fn">John Doe</span></footer> But it doesn't work for everything Reviews, People, Products, Businesses and organizations, Recipes, Events, Video I find Microformats to be an amazing idea; it takes so little to make your (existing) content so much more semantically valuable! And you have plenty of choice: - Microformats (based on vcard/hcard standards) - Microdata (exploiting HTML5's extremely flexible data attribute) - RDFa (the most complex and verbose [XML based] but probably the most flexible and open) With the recent launch of Google Recipes Search and with Facebook embracing hcard microformats on their events pages this became the new focus in the SEO Industry; the Semantic Web has never been so near... [see the Google Recipes search result example at the bottom of the slide, it shows cooking time, ratings, reviews and amount of calories; all this is stored in the HTML markup of the target page via Microformats]
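As a hedged sketch of the recipe case, this is roughly what an hRecipe-annotated fragment looks like (class names follow the hRecipe draft; the dish and values are invented):

<div class="hrecipe">
  <h2 class="fn">Spaghetti alla carbonara</h2>
  <p>Ready in <span class="duration">20 minutes</span>, serves <span class="yield">2</span>.</p>
  <ul>
    <!-- each ingredient gets its own annotated element -->
    <li class="ingredient">200 g spaghetti</li>
    <li class="ingredient">2 eggs</li>
    <li class="ingredient">100 g guanciale</li>
  </ul>
  <div class="instructions">Boil the pasta, fry the guanciale, combine off the heat with the eggs.</div>
</div>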
  • 33. One step further: be valuable ✔ Keep on updating contents (ad libitum) [next slide]
  • 34. If you can, get a trained update monkey Done, Boss! Weekly update pushed to server! The reason why blogs are easier to position than corporate websites, and also the reason why Black hat SEO uses “splogs” (spam blogs) for backlinks, is that it's in the nature of these kinds of platforms to be continuously updated and less complex to crawl (thanks to a simplified site structure). Crawlers love that; they crave fresh content more than a Transylvanian vampire craves virgin's blood. Keep updating your content; the best rate is at least twice per week.
  • 35. Improve your position: be smart ✔ Keep the total number of links on a page below 50 [next slide]
  • 36. PageRank, it's simpler than THEY think PageRank the TeleTubbies' way PageRank is partially calculated on real attributes/merits of a page (like the content quality) but is partially based on a “weighted voting system”. Each page is given the possibility to vote for other pages (internal or external, it doesn't matter); the total value of this vote is 1.0. This value gets divided among all the links on that page, but not equally: links using relevant keywords and placed higher in the page get more “weight”. Understanding how this works makes it quite clear why a sane SEO strategy won't let any page have more than 50 links, and is also the reason why, when requesting backlinks from other sites, you should never pick pages that cross that number. The more “weighted” votes a page gets, the higher its PageRank will be.
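A back-of-the-envelope sketch of that voting model in Python (the link weights are invented for illustration; real engines use hundreds of undisclosed signals):

# Toy model of the "weighted voting system": a page distributes a total vote
# of 1.0 among its outgoing links, proportionally to each link's weight.
links = {
    "keyword-rich link high in the page": 3.0,
    "generic link mid-page": 1.5,
    "footer link": 0.5,
}

total_weight = sum(links.values())
for anchor, weight in links.items():
    share = 1.0 * weight / total_weight
    print(f"{anchor}: {share:.2f}")

# Add 50+ more links and every share shrinks toward zero, which is why
# pages crossing that threshold make poor backlink sources.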
  • 37. Improve your position: be smart ✔ Give a glimpse of the site structure and of the latest updates on your Homepage ✔ Cross link your content to provide more links to most important pages ✔ Normalize your URLs to avoid sequential content being flagged as duplicate content ✔ If your site is about a local business/activity remember to include information about the location at least on the homepage ✔ Mind your domain Point 1 On your site the Homepage is the page whose vote is given the highest weight; whatever you link from there will have higher chances to rank better, so using it to link to relevant/fresh content is a good way to give a boost to those pages while helping the new visitor start browsing your website. It also helps the returning visitor get the latest updates quickly. Point 2 The more a page is linked across a website, the more relevance will be given to it. Keep your cross links updated and use them for really important/relevant stuff. Point 3 Paginated listings of items (be it a list of books to buy or the list of articles contained in a category) can lead to duplicate content, especially if the user can change the sorting order at will. Place a <link rel="canonical" href="[Main URL here]" /> in the page header to tell the crawler that it's just another URL to access the same content (see the sketch below). This is the most common scenario but not the only one (e.g. content accessible via different categories) Point 4 After Google announced the addition of Location-based searches (both via a Geo-locatable device, Google Maps, or Google search itself) pages containing location information rank higher in that specific kind of SERP. Now that “Personal search” is more than a reality, this is even more important. Point 5 Try to get a domain name as short as possible; it should contain at least the main keyword you're targeting (it's getting harder over time, so when something new comes up, act quickly)
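For the pagination/sorting case in Point 3, a minimal sketch (the example.com URLs are placeholders):

<!-- served at http://example.com/books?page=2&sort=price -->
<head>
  <!-- tells the crawler this is just another URL for the same listing -->
  <link rel="canonical" href="http://example.com/books" />
</head>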
  • 38. SEO like a PRO ✔ Be sure to include meta tags (for description and keywords) on each page [next slide]
  • 39. Use META for fine-grained control on SERP Old stuff never dies, but changes over time. Today's Search Engines pay little (Yahoo) or no (Google) attention at all to Meta tags when it comes to granting PageRank. So you won't see those epic battles in big corporations, as depicted in the picture with the two zebras, anymore. That is a thing of the past. But those tools still have a use in focused SEO strategies: they give you more control over SERP summaries/snippets, letting you tell the Search Engine for which keywords the Meta description content should be used as the snippet representing the page in the results listings. When these tags are not present (or the keyword used for searching does not match their contents) the Search Engine will extract a snippet directly from the page content, and most of the time it fails in a very bad way. So the goal has switched from attracting the attention of the crawler to attracting the attention of the user. When choosing keywords you should take into consideration using long tail ones: the total amount of traffic generated by a “cluster” of those keywords is the same as one/two more general keywords (it's a technique also known as cluster optimization)
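A minimal sketch of the two tags in question (the copy is invented; remember the description is now written for the user scanning the SERP, not for the crawler):

<head>
  <title>Reebok Air White running shoes | MySite</title>
  <meta name="description" content="Lightweight Reebok Air White running shoes, reviewed and compared with similar models." />
  <meta name="keywords" content="running shoes, reebok air white, lightweight running shoes for beginners" />
</head>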
  • 40. SEO like a PRO ✔ Keep your Keyword Density below 3% ✔ Use domains at least 3 years old, and more than 1 ✔ Model your URLs Point 1 KD is the percentage of times a keyword or phrase appears on a web page compared to the total number of words on the page; the optimum keyword density is 1 to 3 percent, and using a keyword more than that could be considered search spam. There are simple and specific equations to calculate KD for both single keywords and keyword phrases (you can find them on Wikipedia), but there are also numerous automated tools that let you analyze all the pages in a website with a single click. Point 2 Older domains are given trust (think of those scam/phishing sites that appear and disappear) Point 3 [next slide]
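The single-keyword equation is simply occurrences divided by total words, times 100; a minimal sketch in Python (the sample text is invented):

def keyword_density(text: str, keyword: str) -> float:
    # KD = (occurrences of the keyword / total number of words on the page) * 100
    words = text.lower().split()
    return words.count(keyword.lower()) / len(words) * 100

sample = "running shoes for trail running and for road running"
print(f"{keyword_density(sample, 'running'):.1f}%")  # 3 of 9 words -> 33.3%, far above the 3% limit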
  • 41. Make your URLs sexy http://mysite.com/index.php?itemid=3346 http://mysite.com/angelina-jolie-bikini Which link would you click on? This is a classic SEO joke, but the problem is real and still very common. It doesn't matter if you're writing your own Content Management System or using an existing one: you need to pay attention to the URLs it generates. Avoid dynamic formats; using URL rewriting you can achieve far more interesting page addresses that catch the attention of the user (since he can understand them) and give valuable information to crawlers (even in a semantic way, e.g. mysite.com/equipment/sport/running/shoes/reebok-air-white).
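A minimal sketch of how such an address could be mapped back to a dynamic script with Apache's mod_rewrite (the "path" parameter name is hypothetical):

# .htaccess — serve pretty URLs from the existing dynamic endpoint
RewriteEngine On
# /equipment/sport/running/shoes/reebok-air-white -> index.php?path=equipment/sport/...
RewriteRule ^equipment/([a-z0-9/-]+)$ index.php?path=$1 [L,QSA]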
  • 42. SEO like a PRO ✔ Be sure to include meta tags (description and keywords) on each page ✔ Keep your Keyword Density below 3% ✔ Use domains at least 3 years old, and more than 1 ✔ Model your URLs ✔ Make a wise use of 301 redirects ✔ Link to and get linked from relevant resources ✔ Make it to Web Directories ✔ Keep monitoring and tweaking, your job is never done ✔ PubSubHub indexing, only for the good guys Point 1 Let's clarify it once and for all: 302 redirects are meant only for temporary changes; 301 redirects automatically transfer all the attributes of a URL to a new one, PR included (a minimal example follows below). Use them wisely, they're a powerful tool for link sculpting. Point 2 In a Link Building strategy, getting Backlinks from resources that are neither relevant nor related to your site's focus area is meaningless; no PR will come from there, and they will smell of paid link spamming no matter what. Watch out! Point 3 As for the previous point, Web Directories are considered to be both Credible and Relevant to any topic (given their nature); try to get your site included in Dmoz's collection of links, submitting it as early as possible for review. Point 4 The SEO industry is in continuous evolution: ranking algorithms get tweaked at a steady rate, new standards rise and fall and, most of all, the traffic reaching your site changes continuously. Keep monitoring the search keywords that bring visitors to your site and tweak your strategy accordingly. Point 5 [next slide]
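For Point 1, a minimal sketch of a permanent redirect in Apache configuration (the old and new paths are placeholders):

# .htaccess — 301 permanently moves the URL and transfers its attributes, PR included
Redirect 301 /old-catalogue.html http://mysite.com/equipment/sport/running/shoes/reebok-air-white
# Use 302 only for genuinely temporary moves; it does not transfer the URL's attributes.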
  • 43. Are you the Chosen One? PubSubHub Google Indexing This option is offered only to a strict selection of websites Details @ http://code.google.com/p/pubsubhubbub Slide transcript TBD.
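The transcript for this slide was left TBD; as background, a hedged sketch of the protocol's publish ping — the publisher POSTs to the hub, which then fetches the feed and pushes the fresh entries to subscribers (the feed URL is a placeholder; the hub shown is the public reference hub):

import urllib.parse
import urllib.request

# Notify the hub that our feed has fresh content (PubSubHubbub "publish" mode).
data = urllib.parse.urlencode({
    "hub.mode": "publish",
    "hub.url": "http://mysite.com/feed.xml",
}).encode()
urllib.request.urlopen("http://pubsubhubbub.appspot.com/", data)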
  • 44. This will be your obsession Google logo, in all its glorious variants As you get more and more familiar (and involved) with advanced SEO principles, you'll be able to exploit them productively and dramatically improve your site's position in SERPs. But there's still a lot you could do! SEO has a more “commercial” variant, SEM.
  • 45. SEM: if you've got $$$ to spend ✔ Paid search campaigns ✔ Paid links campaigns ✔ ATO ✔ Paid articles and featurettes (+backlink) ✔ Paid blogs Search Engine Marketing is about being able to apply SEO techniques on someone else's site for your advantage, and of course, as you can guess, this implies a substantial transfer of money. But it does its dirty job pretty well. Usually this is where you end up digging when you can't do anything more to get better in the organic results and your previous SEO efforts brought you an embarrassing amount of money. When you start doing this, then you'll need to start dealing with the Dark Lord, a.k.a. ROI. Just one simple rule: whatever you end up buying, be sure to do it on relevant resources related to your website's focus area (don't buy links on an automotive website if you sell vegetables, it's just a waste of money)
  • 46. The power of Bling If you've got enough budget then the answer to your SEM question is: YES, YOU CAN! In the end SEM is all about how big your budget is.
  • 47. Black hats: meet the bad guys ✔ Keyword stuffing ✔ Hidden or invisible text ✔ Meta-tag stuffing ✔ Doorway pages ✔ Scraper sites ✔ Article spinning ✔ Link spam ✔ Link-building software ✔ Link farms ✔ Hidden links ✔ Sybil attack ✔ Spam blogs ✔ Page hijacking ✔ Buying expired domains ✔ Cookie stuffing ✔ Spam in blogs ✔ Comment spam ✔ Wiki spam ✔ Referrer log spamming ✔ Mirror websites ✔ URL redirection ✔ Cloaking Scraper sites Scraper sites are created using various programs designed to "scrape" search-engine results pages or other sources of content and create "content" for a website. Article spinning Article spinning involves rewriting existing articles. This process is undertaken by hired writers or automated using a thesaurus database or a neural network. Sybil attack A Sybil attack is the forging of multiple identities for malicious intent. A spammer may create multiple web sites at different domain names that all link to each other, such as fake blogs. Page hijacking Page hijacking is achieved by creating a rogue copy of a popular website which shows contents similar to the original to a web crawler but redirects web surfers to unrelated or malicious websites. Cookie stuffing Cookie stuffing involves placing an affiliate tracking cookie on a website visitor's computer without their knowledge, which will then generate revenue for the person doing the cookie stuffing. Referrer log spamming Referrer spam takes place when a spam perpetrator or facilitator accesses a web page (the referee), by following a link from another web page (the referrer), so that the referee is given the address of the referrer by the person's Internet browser. Mirror websites A mirror site is the hosting of multiple websites with conceptually similar content but using different URLs. Some search engines give a higher rank to results where the keyword searched for appears in the URL.
  • 48. Know your enemy The Google Almighty Search Patrolling Task Force Team member If you decide to go for the “Dark side” then be aware of who you're gonna fight! These guys spend their lives buried in the geekiest corner of the Googleplex; probably the last (and only) date they had was with their optometrist for a new pair of 80's-looking glasses. This should tell you how much they hate the rest of the world sitting on the other side of the screen, especially SEO smart-asses who go from club to club in their Porsche Carrera with the typical top model sitting on the passenger side... They've been trained to kill and they've got clearance from the CIA to do that. If I were in your shoes, I would avoid catching their attention.
  • 49. Sooner or later... MHH?!? I sense a disturbance in the SERPS... Sooner or later Search Engines catch the bad guys, here's usually what happens... [next slide]
  • 50. ...this is what happens usually Sorry man, we've just changed our algorithm And this is usually the explanation! They get huge PageRank penalties (which bury them in the depths of SERPs, where no man has ever gone before) or, in the worst cases, their sites get removed from the indexes. [see what happened to J.C. Penney (link spamming and unrelated paid backlinks) and BMW (doorway pages)]
  • 51. SEO of the (near) future ✔ SEO for UGC ✔ Mobile SEO ✔ SEO and Social recommendations ✔ SEO and AR I left this part of the lecture a bit more open; I'll quickly describe each item on the list, but we can discuss them as you wish (this is valid if you're attending a live lecture/talk) Point 1 Old way: site staff imposes a manual/automated SEO strategy on Community content. New way: involve the community in the SEO strategy via engaging/game mechanisms [explain SEO dashboard project] Point 2 Meta.txt, Mobile Sitemaps (see the sketch below), .mobi domain, Apps/Markets presence Point 3 Facebook-Bing integration, Google +1, Stumble Upon, Digg Point 4 [next slide]
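Of the mobile items in Point 2, Mobile Sitemaps are the most mechanical part; a minimal sketch following Google's mobile sitemap schema (the URL is a placeholder):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:mobile="http://www.google.com/schemas/sitemap-mobile/1.0">
  <url>
    <loc>http://m.mysite.com/article.html</loc>
    <!-- marks the URL as serving mobile content -->
    <mobile:mobile/>
  </url>
</urlset>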
  • 52. Augmented Reality It is the “environment” that determines the search results, in a combination of proximity detection, geolocation and image recognition. Slide transcript TBD.
  • 54. Emotional SEO (SEO 2.0) 1/2 SEO ✔ Link building, manually adding them, submitting static websites to directories, link exchange, paying for links ✔ On site optimization for spiders. Example: Repetitive page titles concentrating (solely) on keywords ✔ Competition: You compete with others to be on the first page/in the Google top 10 for keywords ✔ Barter: You give me a link and only then I will give you one ✔ Hiding: We’re not doing SEO, we can’t show our client list publicly, generic SEO company ✔ keywords SEO 2.0 ✔ Getting links, via blogging, writing pillar content, creating link bait, socializing ✔ On site optimization for users. Example: Kick ass post headlines ✔ Cooperation: You cooperate with each other sharing fellow blogger’s posts on social media, you link to them ✔ Giving: I link to you regardless of whether you link back, but in most cases you will, more than once ✔ Being open: Welcome our new client xyz, we are proud to work together with them, Rand Fishkin, Lee Odden, BlueGlass ✔ tags Here's a brief comparison (courtesy of Tadeusz Szewczyk) of how SEO in the near future will differ from what we currently do and how we currently think. Additional transcript TBD. [continues on next slide]
  • 55. Emotional SEO (SEO 2.0) 2/2 SEO ✔ optimization for links and rankings ✔ clicks, page views, visits ✔ DMOZ ✔ Main traffic sources: Google, Yahoo, MSN ✔ one way communication ✔ top down, corporations and old media decide what succeeds ✔ undemocratic, who pays most is on top ✔ 50% automated, half of the SEO tasks can be done by SEO software ✔ technocratic SEO 2.0 ✔ optimization for traffic and engagement ✔ conversions, ROI, branding ✔ Delicious, Digg ✔ Main traffic sources: Facebook, Twitter, StumbleUpon, niche social news sites, blogs ✔ dialog, conversation ✔ bottom up, wisdom of crowds determines true popularity via participation ✔ democratic, who responds to popular demand is on top ✔ 10% automated, most SEO 2.0 tasks are about content and interaction ✔ emotional Slide transcript TBD.
  • 56. Links you might find interesting Webmaster Tools ✔ http://www.google.com/webmasters/tools/ ✔ https://siteexplorer.search.yahoo.com/mysites ✔ http://www.bing.com/webmaster/ References ✔ http://www.google.com/support/webmasters/ ✔ http://help.yahoo.com/l/us/yahoo/search/webcrawler/index.html News ✔ http://www.seroundtable.com/ ✔ http://www.ysearchblog.com/ ✔ http://www.bing.com/community/site_blogs/b/webmaster/default.aspx ✔ http://googlewebmastercentral.blogspot.com/ Here's a list of links definitely worth a mention (and a visit).
  • 57. Before we finish... Questions, anyone? If you're not attending a live lecture/talk please feel free to contact me via: Twitter: http://www.twitter.com/federico_lox Facebook: http://www.facebook.com/flucignano Tumblr: http://loxzone.tumblr.com
  • 58. Thank you for your time Thanks for your attention, I hope you enjoyed the lecture!
  • 59. Thanks to Wikia for letting me prepare this lecture Politechnika Poznańska for hosting it Dr Andrzej P. Urbański for organizing the event DISCLAIMER No animal, evil Dark Lord or U.S. President has been harmed during the preparation of this presentation.