SEO for the Semantic Web

How do the machines know
what Tasty Wheat
tasted like?
Mouse – The Matrix

Short SEO History
• Web1.0
• Web2.0
• Web3.0

Genesis
• A story of the Internet, by
• Solving the most important problems
• Greatly influenced by one man…

Tim Berners‐Lee

“the World Wide Web is Berners-Lee's
alone. He designed it. He loosed it on the
world. And he more than anyone else has
fought to keep it open, nonproprietary and
free.”
Time Magazine, 1999

The Problem
• Where can I find the information?

“Our ineptitude in getting at the record is
largely caused by the artificiality of the
systems of indexing.”
The Atlantic Monthly, 1945

Archie, 1990
• Indexed file names and
• Returned results based on pattern matching

Web1.0
• Means HTML
• Is born in 1991, with the help of
• Tim Berners‐Lee (TBL), who also founded
• WWW Consortium (W3C) at MIT, and also
• Created WWW Virtual Library – the 1st catalog

Yahoo Directory, 1994
• Vertical = categories... is like
• “Show me all the stuff and I’ll handle it”
• Manually indexed stuff, which was
• OK for starters, but…
• Websites quickly grew in number and
• Y! started charging money for one listing
• Increasingly more money...

WebCrawler, 1994
• First SE to fully search text
• Bought by AOL, then
• Sold to Excite, which
• Excite went bankrupt and
• WebCrawler ends up bought by InfoSpace

Other “Search Engines”
• 1994, reaches 60mil pages in ’96
• 1995, bought by Overture, bought by Y!
• 1996, meta search, bought by Lycos
• 1997, bought by IAC/InterActiveCorp
• 1999, bought by Overture, meaning Y!

Shopping fun, right?

, 1998
• Open Directory Project
• Each listing is checked and certified by a
volunteer
• The main source for Google Directory

Current State of Search Industry

Web1.0 Problems
• SE couldn’t understand text, so
• They said “why don’t you implement some
meta tags (description & keywords) so we can
get a glimpse of what you’re saying”
• The relevancy of a page with respect to a
keyword was determined by a few factors, so
• It was very easy to abuse and spam, therefore
• Search Results had poor quality
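
The description and keywords meta tags are plain name/content pairs in the page head, which is part of why they were so easy to stuff. A minimal sketch of how a crawler might read them, using only Python's standard `html.parser`. The sample page and the names `MetaTagReader` and `read_meta` are hypothetical, made up for illustration:

```python
from html.parser import HTMLParser


class MetaTagReader(HTMLParser):
    """Collects <meta name="..." content="..."> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content


# Hypothetical page: nothing stops the author from repeating a keyword.
SAMPLE_PAGE = """
<html><head>
<title>Cheap flights</title>
<meta name="description" content="Book cheap flights online.">
<meta name="keywords" content="flights, cheap flights, cheap, cheap, cheap">
</head><body></body></html>
"""


def read_meta(html):
    """Parse the page and return its meta tags as a dict."""
    reader = MetaTagReader()
    reader.feed(html)
    return reader.meta


meta = read_meta(SAMPLE_PAGE)
keywords = [k.strip() for k in meta["keywords"].split(",")]
print(meta["description"])
print(keywords.count("cheap"))  # how often the author repeated one keyword
```

A ranking that trusts these self-declared values can be gamed by whoever writes the page, which is the abuse the slide describes.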

Web2.0
• Is popularized by... Tim O’Reilly, yet
• TBL later said that “web2.0” is a stupid,
meaningless term and that he thought of it
first in ’96 anyway

Web2.0 means
• which grew apart because of
• PageRank (1998) invented by
• Larry & Sergey who adapted the algo from
• An MIT professor who had developed
• A nasty mathematical formula for positioning
keywords in a 3d space model based on the
relevancy that one kw holds … whatever

PageRank actually means
• That a link is a vote and
• Not all links are created equal, so
• It matters who links to you
• Just like in our real life society
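
The "link is a vote" idea can be sketched with the textbook power-iteration form of PageRank. This is a simplified illustration, not Google's production algorithm. The damping factor of 0.85 is the commonly cited value, and the tiny link graph and page names are hypothetical:

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {page: [pages it links to]}."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline share of rank.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over everyone.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical graph: "blog" and "doorway" each have exactly one inbound link,
# but "blog" is linked from the well-connected "home" page.
WEB = {
    "home": ["blog"],
    "blog": ["home"],
    "about": ["home"],
    "spam": ["doorway"],
    "doorway": ["home"],
}

ranks = pagerank(WEB)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page:8s} {score:.3f}")
```

Both "blog" and "doorway" receive a single vote, yet "blog" ends up with the higher score because its vote comes from a page that is itself well linked.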

• Read the content of pages really well, just that
• Pages were crappy:
– Non‐standard coding
– Ugly tech (like applets)
– Senseless IA
• So Google said: “don’t do evil and try to nicely
format the info, according to W3C standards”
(remember TBL)

SEO
• Is a multitude of practices aimed at facilitating
the indexing of pages by search engines
• Evolves as the ranking algorithm changes, and
• Of course, the algorithm is kept secret.

SEO actually means

Courtesy of Kelly Ishikawa

SEO actually means
• An on‐going battle between bots & SEO guys
• Now 100+ factors influence ranking
• And I’d like to take the time to talk about each
one of them in the following…

My SEO Cheat Sheet
• Consider:
1. Page Titles
2. URLs (mod_rewrite)
3. Anchor Text
4. Website Architecture (IA)
5. Link titles & image alt text
6. Relevant content (text)
7. Sitemap.xml
8. Hosting
9. Freshness

Resources

Matt Cutts Blog

Mihai’s SEO Cheat Sheet :D

Web2.0 Problems
• © for pictures, articles, books, etc
• PPC fraud
• Privacy
• Search Engine SPAM
• Link bombing
• Paid links
• But more important...

Web2.0 Problems
• SE still don’t understand what the $#%@
you’re talking about
• Crawling a website’s interface to extract info is
almost insane

Web3.0
• Means semantic web
• Attention migrates from syntax/formatting to
semantics and
• Meta Data (data about the data) becomes...

Web3.0

Resource Description Framework & Microformats

Resource Description Framework
• A data model, commonly serialized as XML (RDF/XML)
• RDF = Subject + Predicate + Object
• S + P + O creates a Triple which
• Can describe almost anything in the universe
• Triples are connectable (eg: FOAF)
• RDFa = XHTML + RDF (W3C compliant)
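
The triple model is simple enough to sketch with plain Python tuples. The example below uses no RDF library. It only illustrates how subject + predicate + object statements connect when one triple's object is another triple's subject. The FOAF namespace URI is the real one, while the `ex:` people and the helper functions are hypothetical:

```python
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical people; each fact is one (subject, predicate, object) triple.
TRIPLES = {
    ("ex:alice", FOAF + "name", "Alice"),
    ("ex:alice", FOAF + "knows", "ex:bob"),
    ("ex:bob", FOAF + "name", "Bob"),
    ("ex:bob", FOAF + "knows", "ex:carol"),
    ("ex:carol", FOAF + "name", "Carol"),
}


def objects(subject, predicate, triples=TRIPLES):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def friends_of_friends(person):
    """Follow foaf:knows twice; objects of one triple become subjects of the next."""
    result = set()
    for friend in objects(person, FOAF + "knows"):
        result.update(objects(friend, FOAF + "knows"))
    result.discard(person)
    return sorted(result)


print(objects("ex:alice", FOAF + "knows"))
print(friends_of_friends("ex:alice"))
```

Because every statement has the same three-part shape, graphs published by different sites can be merged by simply taking the union of their triples.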

Microformats
• hCalendar
• hCard
• rel‐tag
• VoteLinks
• XFN
• Geo
• hResume
• hReview
• etc
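
Microformats work by reusing ordinary HTML `class` attributes, so a machine can pick structured data out of a normal page. A minimal sketch that reads three hCard properties (`fn`, `org`, `tel`) with Python's standard `html.parser`. It assumes each property element contains only text, which real hCard parsers do not require, and the sample markup and the `HCardReader` name are hypothetical:

```python
from html.parser import HTMLParser

# A small subset of hCard property names, chosen for this sketch.
HCARD_PROPERTIES = {"fn", "org", "tel"}


class HCardReader(HTMLParser):
    """Collects the text of elements whose class names an hCard property."""

    def __init__(self):
        super().__init__()
        self.card = {}
        self._current = None  # property whose text is being collected

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            if name in HCARD_PROPERTIES:
                self._current = name
                self.card[name] = ""

    def handle_data(self, data):
        if self._current:
            self.card[self._current] += data.strip()

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current property.
        self._current = None


# Hypothetical contact block marked up as an hCard.
SAMPLE_HCARD = """
<div class="vcard">
  <span class="fn">Jane Doe</span>,
  <span class="org">Example Pizza</span>,
  <span class="tel">+1-555-0100</span>
</div>
"""

reader = HCardReader()
reader.feed(SAMPLE_HCARD)
print(reader.card)
```

The same page still renders normally for people. The class names only add meaning for machines.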

SPARQL
• SPARQL Protocol and RDF Query Language
• Standardized on 15th Jan 08 (1 month ago) and
• Endorsed by?... TBL

“Trying to use the Semantic Web without
SPARQL is like trying to use a relational
database without SQL”
TBL

Potential
• With SPARQL you skip the presentation layer
• You can query ad‐hoc any API, so
• You don’t need to crawl in advance, therefore
• Information will be as fresh as it gets

And possibilities
• Query: “I can has pizza?”
• Returns:
– A friend of yours (XFN ‐ Facebook)
– has a colleague (FOAF ‐ LinkedIN) who
– said that they make good pizza (hReview ‐ yelp) at
– a restaurant nearby (geo – Gmaps)
– Tip: U2 in concert today (hCalendar ‐ upcoming)
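
The pizza query above is essentially a join across triples that come from different sources. The sketch below is not SPARQL itself. It is a small Python illustration of the graph pattern matching idea behind it, where terms starting with `?` are variables. The predicate names (`xfn:friend`, `hreview:rated_good`, `geo:near` and so on) and all the data are hypothetical shorthand:

```python
def match(pattern, triple, bindings):
    """Unify one pattern with one triple; return extended bindings or None."""
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the same value everywhere it appears.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def query(patterns, triples):
    """Return every set of variable bindings that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in triples:
                extended = match(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Hypothetical facts, imagined as coming from four different sites.
GRAPH = [
    ("me", "xfn:friend", "ana"),                      # social network (XFN)
    ("ana", "foaf:knows", "dan"),                     # professional network (FOAF)
    ("dan", "hreview:rated_good", "Pizzeria Roma"),   # review site (hReview)
    ("dan", "hreview:rated_good", "Far Away Pizza"),
    ("Pizzeria Roma", "geo:near", "me"),              # maps (geo)
]

PIZZA_QUERY = [
    ("me", "xfn:friend", "?friend"),
    ("?friend", "foaf:knows", "?colleague"),
    ("?colleague", "hreview:rated_good", "?place"),
    ("?place", "geo:near", "me"),
]

results = query(PIZZA_QUERY, GRAPH)
print(results)
```

Only the nearby restaurant survives the last pattern, so the answer combines a friend, a colleague, a review, and a location without any page being crawled for its layout.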

Perhaps now we can see
• Why Social Networking Communities are
worth so much, even though most of them
don’t have a revenue model
– Facebook
– LinkedIN
– Meebo
– Bebo
– Pipu...
• They/We are the databases of the future

Thanks!

“Most of the right choices in SEO come from
asking: What’s the best thing for the user?”
Matt Cutts

Mihai Gheza
Creative Commons Attribution‐Noncommercial‐Share Alike 3.0 Unported License.
