Optimizing Search Engines (A Mathematical Point of View)
The following topics are covered here:
- A basic introduction to search engine optimization.
- Introduction to Google and Bing webmaster tools.
- Use of the Google Toolbar to see the PageRank of each page (how Google's search engine scores the importance of each page).
- The PageRank algorithm (I will focus mostly on this point).
- How it is useful in real SEO, and practical implementation of SEO.
- Google bomb.
2. Introduction to SEO
Search engine optimization (SEO): a site gains search visibility in two ways:
Pay per click (paid listings).
The search engine's "natural" or unpaid ("organic") search results.
Search engine optimization (SEO) is the process of
affecting the visibility of a website or a web page in
a search engine's "natural" or un-paid ("organic") search
results. In general, the earlier (or higher ranked on the
search results page), and more frequently a site appears
in the search results list, the more visitors it will receive
from the search engine's users (Wikipedia)
3. Optimizing search engines ("organic") search results
Factors that affect the search visibility of your web content:
Title: The title tag should be written carefully. Search engines do not use the title tag 100% of the time; occasionally, Google pulls the title from the anchor text of a link to that page. Make sure the words in your title match the words used in links to the page.
Snippet: The snippet is the description of the page that appears beneath the title. Google may pull this from the page's meta description tag. Put relevant sentences there; this helps the page match search queries.
Bolding: Google bolds the query words wherever they appear in the search result.
Cached link: If the page is down or loading slowly, a searcher can still get to the information via the cache. If the page is accidentally deleted, the webmaster can retrieve the data from the cache to recreate the page. The cache also shows when the page was crawled.
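To check what a search engine may pick up as the title and snippet, one can read the title tag and the meta description straight from a page's HTML. The following is a minimal sketch using Python's standard `html.parser` module; the class name `TitleAndDescription`, the helper `title_and_description`, and the sample page are hypothetical and exist only for illustration.

```python
from html.parser import HTMLParser


class TitleAndDescription(HTMLParser):
    """Collects the <title> text and the meta description of one HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attributes.get("name") or "").lower() == "description":
            self.description = attributes.get("content")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text that sits inside <title>...</title> is kept.
        if self._in_title:
            self.title += data


def title_and_description(html):
    """Return (title, meta description) for the given HTML text."""
    parser = TitleAndDescription()
    parser.feed(html)
    return parser.title.strip(), parser.description


# Hypothetical page used only for illustration.
SAMPLE_PAGE = """
<html><head>
<title>PageRank explained</title>
<meta name="description" content="A short introduction to PageRank.">
</head><body><p>Hello</p></body></html>
"""

title, description = title_and_description(SAMPLE_PAGE)
print(title)
print(description)
```

A page with no meta description returns `None` for the second value, which is a hint that the search engine will have to build the snippet from the page text instead.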
4. Optimizing search engines ("organic") search results
Metadata: Other items in a search result include the URL, the page size, etc.
Sitelinks: Links to other pages of the same site, shown beneath the main result, help searchers reach the right page.
5. Introduction to webmaster tools (Google and Bing search engines)
Want to keep unwanted pages of your site out of search results?
List the URLs you do not want crawled in robots.txt, which you can check through webmaster tools.
All of these operations can be done through HTML meta tags and the webmaster tools provided by the search engines.
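A robots.txt file is a plain list of rules, so its effect can be tested before publishing it. Below is a small sketch using Python's standard `urllib.robotparser`; the rules, the `example.com` URLs, and the helper name `is_crawl_allowed` are assumptions made for illustration, not taken from a real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: block two folders for every crawler.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/
"""


def is_crawl_allowed(robots_txt, url, user_agent="*"):
    """Return True if the given rules let user_agent crawl url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_crawl_allowed(ROBOTS_TXT, "https://example.com/index.html"))
print(is_crawl_allowed(ROBOTS_TXT, "https://example.com/private/notes.html"))
```

With these rules the first URL is allowed and the second is blocked. Note that robots.txt only asks crawlers not to fetch a URL; the meta tags mentioned above are the usual way to control indexing of a page that is crawled.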
6. Agenda:
Facts.
Understanding Page Rank Algorithm.
Simple calculation of PageRank.
Analysis of PageRank Algorithm.
Case Discussion.
Practical Implementation.
References.
7. Facts
Developed by Larry Page and Sergey Brin at Stanford University in 1996.
Trademark of Google.
Patented by Stanford University.
Backbone of Google search technology.
8. Understanding the simple PR algorithm
Every inbound link increases the weight of a page.
PageRank is based on the number of pages that link to that page.
The highest PageRank is 1, but in practice it is shown as a number between 0 and 10 (using a logarithmic scale).
A higher PageRank therefore leads to a better position in the SERP (search engine results page) listing.
Calculated from the nature and number of backlinks.
Indicated on the Google Toolbar.
9. Definition of the PageRank algorithm
Assume a small universe of four web pages: A, B, C and D. If the only links were from B, C and D to A, the PageRank of A would be:
PR(A) = PR(B) + PR(C) + PR(D)
Suppose instead that page B has links to pages C and A, page C has a link to page A, and page D has links to all three pages. Each page splits its PageRank equally among its outbound links, so:
PR(A) = PR(B)/2 + PR(C)/1 + PR(D)/3
Let L() denote the number of outbound links of a page. Then
PR(A) = PR(B)/L(B) + PR(C)/L(C) + PR(D)/L(D), and the general summation is:
10. Understanding PageRank
PR(u) = ∑ PR(v)/L(v), summed over every v ∈ Bu
i.e. the PageRank value of a page u depends on the PageRank value of each page v in the set Bu (the set of all pages linking to page u), divided by the number L(v) of links from page v.
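The summation can be tried on the four-page universe A, B, C, D described earlier. The sketch below applies the formula once, using exact fractions; the starting value of 1/4 per page and the function name `simple_pagerank_step` are assumptions chosen for illustration.

```python
from fractions import Fraction

# Link structure of the four-page example: page -> pages it links to.
LINKS = {
    "A": [],
    "B": ["C", "A"],
    "C": ["A"],
    "D": ["A", "B", "C"],
}


def simple_pagerank_step(links, rank):
    """Apply PR(u) = sum of PR(v)/L(v), over all pages v linking to u, once."""
    new_rank = {page: Fraction(0) for page in links}
    for page, targets in links.items():
        for target in targets:
            # Page `page` splits its rank equally among its L(page) links.
            new_rank[target] += rank[page] / len(targets)
    return new_rank


# Assumed starting point: every page begins with rank 1/4.
start = {page: Fraction(1, 4) for page in LINKS}
after_one_step = simple_pagerank_step(LINKS, start)
for page, value in after_one_step.items():
    print(page, value, float(value))
```

Under these assumptions page A receives 1/8 + 1/4 + 1/12 = 11/24, which is about 0.458. Page A has no outbound links here, so its rank is not passed on; handling such pages is one reason the damped formula is used in practice.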
Introduction to the damping factor (by Sergey Brin) (d = 0.85):
The PageRank theory holds that an imaginary surfer who is randomly clicking on links will eventually stop clicking. The probability, at any step, that the person will continue is the damping factor d; its value is generally assumed to be 0.85.
So the generalized PageRank formula is:
PR(pi) = (1 - d)/N + d ∑ PR(pj)/L(pj), summed over every page pj that links to pi, where N is the total number of pages.
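The generalized formula can be applied repeatedly until the values stop changing. The following is a minimal sketch of that iteration, not a production implementation: the three-page graph, the function name `pagerank`, and the fixed count of 100 iterations are assumptions, and the sketch assumes every page has at least one outbound link.

```python
def pagerank(links, d=0.85, iterations=100):
    """Iterate PR(p_i) = (1 - d)/N + d * sum(PR(p_j)/L(p_j)) a fixed number of times.

    `links` maps each page to the list of pages it links to.
    Assumes every page has at least one outbound link.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start from a uniform guess
    for _ in range(iterations):
        new_rank = {page: (1.0 - d) / n for page in pages}
        for page, targets in links.items():
            for target in targets:
                new_rank[target] += d * rank[page] / len(targets)
        rank = new_rank
    return rank


# Hypothetical three-page graph used only for illustration.
EXAMPLE_LINKS = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}

ranks = pagerank(EXAMPLE_LINKS)
for page, value in sorted(ranks.items()):
    print(page, round(value, 4))
print("sum:", round(sum(ranks.values()), 6))
```

In this graph page C collects links from both A and B and ends with the highest value, while the values as a whole keep summing to 1 because of the division by N.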
11. Understanding PageRank
A simple example: Consider a small universe (a set of N pages) where we have only two web pages, A and B, each linking to the other. This example uses the form (1 - d) + d(...), without dividing by N, so the PageRank values sum to N rather than to 1.
First guess: Say the initial PageRank of each page is 1.0 and d = 0.85.
PR(A) = (1 - d) + d(PR(B)/1) and PR(B) = (1 - d) + d(PR(A)/1). We get
PR(A) = 0.15 + 0.85 * 1 = 1 and PR(B) = 0.15 + 0.85 * 1 = 1
Second guess: Say the initial PageRank of each page is 40 and d = 0.85.
PR(A) = (1 - d) + d(PR(B)/1) and PR(B) = (1 - d) + d(PR(A)/1). We get
12. Understanding PageRank
First calculation:
PR(A) = 0.15 + 0.85 * 40 = 34.15,
PR(B) = 0.15 + 0.85 * 34.15 = 29.1775
Second calculation:
PR(A) = 0.15 + 0.85 * 29.1775 = 24.950875
PR(B) = 0.15 + 0.85 * 24.950875 = 21.35824375 and so on ...
On the kth calculation: The values settle once the sum of the PageRanks of all pages equals the number of pages in the set; those values are the PageRanks of the pages.
The average PageRank approaches 1 and never crosses it.
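The two-page calculation can be repeated mechanically to see where it settles. The sketch below follows the same update order as the slides (PR(A) first, then PR(B) using the new PR(A)); the function name `iterate_two_pages` and the choice of 200 rounds are assumptions for illustration.

```python
def iterate_two_pages(initial, d=0.85, rounds=2):
    """Repeat PR(A) = (1 - d) + d*PR(B), then PR(B) = (1 - d) + d*PR(A)."""
    pr_a = pr_b = initial
    history = []
    for _ in range(rounds):
        pr_a = (1 - d) + d * pr_b  # uses the latest PR(B)
        pr_b = (1 - d) + d * pr_a  # uses the PR(A) just computed
        history.append((pr_a, pr_b))
    return history


# First two calculations, starting from the guess of 40 for both pages.
for pr_a, pr_b in iterate_two_pages(40.0):
    print(round(pr_a, 8), round(pr_b, 8))

# Many more rounds: both values approach 1, so their sum approaches 2.
final_a, final_b = iterate_two_pages(40.0, rounds=200)[-1]
print(round(final_a, 6), round(final_b, 6))
```

The first two rounds give 34.15 and 29.1775, then 24.950875 and 21.35824375, and the later rounds keep shrinking toward 1 for each page, whatever the starting guess.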
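13. Linear system of equations
Assume that in a small set x we have pages 1, 2, 3 and 4. Then the transition matrix, whose entry in row i and column j is the share of importance that page j donates to page i, will be
\[
A = \begin{pmatrix}
0 & 0 & 1 & \tfrac{1}{2} \\
\tfrac{1}{3} & 0 & 0 & 0 \\
\tfrac{1}{3} & \tfrac{1}{2} & 0 & \tfrac{1}{2} \\
\tfrac{1}{3} & \tfrac{1}{2} & 0 & 0
\end{pmatrix}
\]
Please note some observations here:
Page 1: donates 1/3 + 1/3 + 1/3 = 1 and gains 1 + 1/2 = 1.5 importance.
Page 2: donates 1/2 + 1/2 = 1 and gains 1/3 = 0.33 importance.
Page 3: donates 1 and gains 1/3 + 1/2 + 1/2 = 1.33 importance.
Page 4: donates 1/2 + 1/2 = 1 and gains 1/3 + 1/2 = 0.83 importance.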
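The donations and gains are the column sums and row sums of the transition matrix, so they can be recomputed from a list of links. The sketch below assumes a link structure inferred from the shares listed for pages 1 to 3 (page 1 links to 2, 3 and 4; page 2 to 3 and 4; page 3 to 1; page 4 to 1 and 3); that structure and the function name `transition_matrix` are assumptions, not something stated explicitly in the slides.

```python
from fractions import Fraction

# Assumed link structure: page -> pages it links to.
LINKS = {1: [2, 3, 4], 2: [3, 4], 3: [1], 4: [1, 3]}


def transition_matrix(links):
    """Build the matrix whose entry [i][j] is the share page j donates to page i."""
    pages = sorted(links)
    index = {page: position for position, page in enumerate(pages)}
    matrix = [[Fraction(0)] * len(pages) for _ in pages]
    for source, targets in links.items():
        for target in targets:
            matrix[index[target]][index[source]] = Fraction(1, len(targets))
    return matrix


A = transition_matrix(LINKS)
size = len(A)
donates = [sum(A[i][j] for i in range(size)) for j in range(size)]  # column sums
gains = [sum(row) for row in A]  # row sums

for page in range(size):
    print(f"Page {page + 1}: donates {donates[page]}, gains {gains[page]}")
```

Every page donates exactly 1 in total, because it splits all of its importance among its outbound links; only the gains differ from page to page.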
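14. Solving the linear equations
Arranging the importance scores x1, x2, x3, x4 as a linear system of equations, we get
\[
\begin{aligned}
x_1 &= x_3 + \tfrac{1}{2}x_4 \\
x_2 &= \tfrac{1}{3}x_1 \\
x_3 &= \tfrac{1}{3}x_1 + \tfrac{1}{2}x_2 + \tfrac{1}{2}x_4 \\
x_4 &= \tfrac{1}{3}x_1 + \tfrac{1}{2}x_2
\end{aligned}
\]
that is, the single linear equation A x = x.
Solving this by the substitution method (substituting the value of x2, then x4 and x3), we get
\[
x_2 = \tfrac{1}{3}x_1, \qquad x_4 = \tfrac{1}{2}x_1, \qquad x_3 = \tfrac{3}{4}x_1
\]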
15. Solving the linear equations
The eigenvectors corresponding to the eigenvalue 1 are of the form
\[
x = c \begin{pmatrix} 12 \\ 4 \\ 9 \\ 6 \end{pmatrix}
\]
Here the value of x1 is not fixed, so we take c = x1/12 as a free constant and choose it so that the entries of the eigenvector sum to 1.
16. Solving the linear equations
We can choose the constant to be 1/31, so that the sum of the PageRanks is
PR(x) = 12/31 + 4/31 + 9/31 + 6/31 ≈ 0.387 + 0.129 + 0.290 + 0.194 = 1
(The PageRanks sum to 1, so no single PageRank can exceed 1.)
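The result can be checked in two ways: exactly, by multiplying the matrix with the vector, and numerically, by repeated multiplication starting from equal scores (power iteration). The sketch below assumes the four-page transition matrix with rows (0, 0, 1, 1/2), (1/3, 0, 0, 0), (1/3, 1/2, 0, 1/2) and (1/3, 1/2, 0, 0), which is consistent with the constants 1/12 and 1/31 used above; the helper name `multiply` and the count of 100 iterations are illustrative choices.

```python
from fractions import Fraction as F

# Assumed transition matrix for pages 1 to 4 (each column sums to 1).
A = [
    [F(0), F(0), F(1), F(1, 2)],
    [F(1, 3), F(0), F(0), F(0)],
    [F(1, 3), F(1, 2), F(0), F(1, 2)],
    [F(1, 3), F(1, 2), F(0), F(0)],
]


def multiply(matrix, vector):
    """Return the matrix-vector product as a list."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


# Exact check: x = (12, 4, 9, 6)/31 satisfies A x = x and sums to 1.
x = [F(12, 31), F(4, 31), F(9, 31), F(6, 31)]
print(multiply(A, x) == x, sum(x))

# Numerical check: power iteration from equal scores.
A_float = [[float(entry) for entry in row] for row in A]
scores = [0.25, 0.25, 0.25, 0.25]
for _ in range(100):
    scores = multiply(A_float, scores)
print([round(value, 3) for value in scores])
```

Both checks agree: page 1 has the highest score (about 0.387) and page 2 the lowest (about 0.129).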
17. How does PR help you?
How is it useful to me?
Earning many links to your web content can increase your search visibility, and a link to your page from a highly ranked page improves your position in search results even more.
18. Google bomb
The term Google bomb refers to creating large numbers of links that cause a web page to rank highly for searches on unrelated or off-topic keyword phrases, often for comical or satirical purposes.
Example of a Google bomb: search for “completely wrong” in Google.
19. References
I would like to thank Dr. Vinayak Joshi, Department of Mathematics, University of Pune, who introduced me to this algorithm and motivated me to deliver a session in 2009.
Wikipedia, "PageRank", http://en.wikipedia.org/wiki/PageRank
Department of Mathematics, Cornell University, Lectures 3 and 6.
Linear Algebra by Vivek Sahai and Vikas Bist.