Matt Cutts Explains How Google Search Works & Handles Spam
Matt Cutts explains the basics of how Google Search works.
About Search
Every day Google answers more than one billion questions from people around the globe in 181 countries and 146
languages. 15% of the searches we see every day are ones we’ve never seen before. Technology makes this possible because
we can create computer programs, called “algorithms”, that can handle the immense volume and breadth of search
requests. We’re just at the beginning of what’s possible, and we are constantly looking for better solutions. We have
more engineers working on search today than at any time in the past.
Search relies on human ingenuity, persistence and hard work. Just as an automobile engineer designs an engine for
good torque, fuel efficiency, low road noise and other qualities, Google’s search engineers design algorithms to return
timely, high-quality, on-topic answers to people’s questions.
Our algorithms attempt to rank the most relevant search results towards the top of the page, and less relevant search
results lower down the page.
Algorithms Rank Relevant Results Higher
For every search query performed on Google, whether it’s [hotels in Tulsa] or [New York Yankees scores], there are
thousands, if not millions, of web pages with helpful information. Our challenge in search is to return only the most
relevant results at the top of the page, sparing people from combing through the less relevant results below. Not every
website can come out at the top of the page, or even appear on the first page of our search results.
Today our algorithms rely on more than 200 unique signals, some of which you’d expect, like how often the search terms
occur on the webpage, whether they appear in the title, or whether synonyms of the search terms occur on the page. Google
has introduced many innovations in search to improve the answers you find. The first and best known is PageRank,
named for Larry Page (Google’s co-founder and CEO). PageRank works by counting the number and quality of links to
a page to determine a rough estimate of how important the website is. The underlying assumption is that more important
websites are likely to receive more links from other websites.
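The link-counting idea can be illustrated with a small sketch. This is a simplified textbook-style power iteration, not Google's production ranking; the function name, the damping value of 0.85, the iteration count and the toy domains are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate importance scores for a tiny link graph.

    links maps each page to the list of pages it links to.
    damping and iterations are illustrative assumptions.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on, split across its outgoing links,
                # so a link from a high-scoring page is worth more.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: three pages link to c.example.
toy_web = {
    "a.example": ["c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example"],
}
scores = pagerank(toy_web)
```

In this toy graph, c.example ends up with the highest score because it receives the most links, and a.example outranks b.example and d.example because its single incoming link comes from the highest-scoring page. That is the "number and quality" idea in miniature.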
Panda: Helping People Find More High-Quality Sites
To give you an example of the changes we make, recently we launched a pretty big algorithmic improvement to our
ranking—a change that noticeably impacts 11.8% of Google searches. This change came to be known as “Panda,” and
while it’s one of hundreds of changes we make in a given year, it illustrates some of the problems we tackle in search.
The Panda update was designed to improve the user experience by catching and demoting low-quality sites that did
not provide useful original content or otherwise add much value. At the same time, it provided better rankings for
high-quality sites: sites with original content and information such as research, in-depth reports, thoughtful analysis
and so on.
Market Pressure to Innovate
“[Google] has every reason to do whatever it takes to preserve its algorithm’s long-standing reputation for excellence. If
consumers start to regard it as anything less than good, it won’t be good for anybody—except other search engines.”
Harry McCracken, TIME, 3/3/2011
We rely on rigorous testing and evaluation methods to rapidly and efficiently make improvements to our algorithms.
A Peek Inside
“At any moment, dozens of these changes are going through a well-oiled testing process…Every time engineers want
to test a tweak, they run the new algorithm on a tiny percentage of random users, letting the rest of the site’s searchers
serve as a massive control group.” – Read more from Steven Levy’s in-depth story in Wired, 02/22/10
Testing and Evaluation
Google is constantly working to improve search. We take a data-driven approach and employ analysts, researchers
and statisticians to evaluate search quality on a full-time basis. Changes to our algorithms undergo extensive quality
evaluation before being released.
A typical algorithmic change begins as an idea from one of our engineers. We then implement that idea on a test
version of Google and generate before-and-after results pages. We typically present these before-and-after results
pages to “raters,” people who are trained to evaluate search quality. Assuming the feedback is positive, we may run
what’s called a “live experiment” where we try out the updated algorithm on a very small percentage of Google users,
so we can see data on how people seem to be interacting with the new results. For example, do searchers click the
new result #1 more often? If so, that’s generally a good sign. Despite all the work we put into our evaluations, the
process is so efficient at this point that in 2010 alone we ran:
13,311 precision evaluations: To test whether potential algorithm changes had a positive or negative
impact on the precision of our results
8,157 side-by-side experiments: Where we show a set of raters two different pages of results and ask
them to evaluate which ones are better
2,800 click evaluations: To see how a small sample (typically less than 1% of our users) responds to a
change
Based on all of this experimentation, evaluation and analysis, in 2010 we launched 516 improvements to search.
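The "live experiment" step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a description of Google's infrastructure: the hashing scheme, the function names, the 1% default and the click logs are all hypothetical.

```python
import hashlib


def in_experiment(user_id: str, experiment: str, fraction: float = 0.01) -> bool:
    """Deterministically place roughly `fraction` of users in the test group.

    Hashing the experiment name together with the user id keeps each user in
    the same group for the whole experiment. (Hypothetical scheme.)
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # value in [0, 1)
    return bucket < fraction


def top_result_click_rate(clicked_first):
    """Share of searches in which the searcher clicked result #1."""
    return sum(clicked_first) / len(clicked_first) if clicked_first else 0.0


# Made-up logs: True means the searcher clicked the first result.
control_clicks = [True, False, False, True, False, False, True, False]
treatment_clicks = [True, True, False, True, False, True, False, True]

lift = top_result_click_rate(treatment_clicks) - top_result_click_rate(control_clicks)
```

With these made-up logs the updated ranking has a higher click rate on result #1, which the text above calls "generally a good sign". A real evaluation would need far more data and a test of statistical significance before drawing any conclusion.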
Manual Control and the Human Element
In very limited cases, manual controls are necessary to improve the user experience:
1. Security Concerns: We take aggressive manual action to protect people from security threats online,
including malware and viruses. This includes removing pages from our index (including pages with credit
card numbers and other personal information that can compromise security), putting up interstitial
warning pages and adding notices to our results page to indicate that, “this site may harm your
computer.”
2. Legal Issues: We will also manually intervene in our search results for legal reasons, for example to
remove child sexual-abuse content (child pornography) or copyright infringing material (when notified
through valid legal process such as a DMCA takedown request in the United States).
3. Exception Lists: Like the vast majority of search engines, in some cases our algorithms falsely identify
sites and we sometimes make limited exceptions to improve our search quality. For example, our
SafeSearch algorithms are designed to protect kids from sexual content online. When one of these
algorithms mistakenly catches websites, such as essex.edu, we can make manual exceptions to prevent
these sites from being classified as pornography.
4. Spam: Google and other search engines publish and enforce guidelines to prevent unscrupulous actors
from trying to game their way to the top of the results. For example, our guidelines state that websites
should not repeat the same keyword over and over again on the page, a technique known as “keyword
stuffing.” While we use many automated ways of detecting these behaviors, we also take manual action
to remove spam.
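The exception-list idea in item 3 can be shown with a toy sketch: an automated classifier makes a false positive, and a small manually maintained list overrides it. The source does not say why essex.edu was caught, so the substring rule below is purely an illustrative assumption, as are all the names; this is not how SafeSearch works.

```python
# Hypothetical, deliberately crude rule: flag any domain containing a blocked term.
BLOCKED_TERMS = ("sex",)

# Manually maintained exceptions for known false positives.
EXCEPTIONS = {"essex.edu"}


def naive_flag(domain: str) -> bool:
    """Automated check: True if the domain contains a blocked term."""
    lowered = domain.lower()
    return any(term in lowered for term in BLOCKED_TERMS)


def is_filtered(domain: str) -> bool:
    """Apply the manual exception list first, then the automated check."""
    if domain.lower() in EXCEPTIONS:
        return False
    return naive_flag(domain)
```

Here `naive_flag("essex.edu")` returns `True` because of the accidental substring match, while `is_filtered("essex.edu")` returns `False` because the exception takes precedence. Domains not on the list are still judged by the automated rule.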
The Engineers Behind Search
“So behind every algorithm, and therefore behind every search result, is a team of people responsible for making sure
Google search makes the right decisions when responding to your query. Obviously, there’s no other way it could have
happened: Google is a living example of what’s possible when brilliant people devise a smart algorithm and marry it to
limitless computing resources.” – Tom Krazit, The human process behind Google’s algorithm, CNET, 09/07/10
Matt Cutts explains how Google deals with spam through a combination of algorithms and manual action, and how
websites can request reconsideration of their sites.
Fighting Spam
Ever since there have been search engines, there have been people dedicated to tricking their way to the top of the
results page. Common tactics include:
Cloaking: In this practice a website shows different information to search engine crawlers than to users.
For example, a spammer might put the words “Sony Television” on his site in white text on a white
background, even though the page is actually an advertisement for Viagra.
Keyword Stuffing: In this practice a website packs a page full of keywords over and over again to try and
get a search engine to think the page is especially relevant for that topic. Long ago, this could mean
simply repeating a phrase like “tax preparation advice” hundreds of times at the bottom of a site selling
used cars, but today spammers have gotten more sophisticated.
Paid Links: In this practice one website pays another website to link to its site in hopes it will improve
rankings based on PageRank. PageRank looks at links to try and determine the authoritativeness of a
site.
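As a rough illustration of the old-style keyword stuffing described above, the sketch below measures how much of a page's text consists of one repeated phrase. The function names and the 0.3 threshold are assumptions chosen for the example. As the text notes, spammers have become more sophisticated, so a check this simple would only catch the crudest cases.

```python
import re


def phrase_density(text: str, phrase: str) -> float:
    """Approximate share of the words in `text` taken up by `phrase`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    target = re.findall(r"[a-z0-9']+", phrase.lower())
    if not words or not target:
        return 0.0
    k = len(target)
    # Count every position where the phrase occurs as a run of whole words.
    hits = sum(1 for i in range(len(words) - k + 1) if words[i:i + k] == target)
    return min(1.0, hits * k / len(words))


def looks_stuffed(text: str, phrase: str, threshold: float = 0.3) -> bool:
    """Flag a page when one phrase exceeds an (arbitrary) share of its words."""
    return phrase_density(text, phrase) > threshold


# Hypothetical pages for illustration.
stuffed_page = "Quality used cars at fair prices. " + "tax preparation advice " * 100
normal_page = (
    "Our accountants offer tax preparation advice to small businesses "
    "and families every spring, by phone or in person."
)
```

With these inputs, `looks_stuffed(stuffed_page, "tax preparation advice")` is `True` and the same call on `normal_page` is `False`.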
Today, we estimate more than one million spam pages are created each hour. This is bad for searchers because it
means more relevant websites get buried under irrelevant results, and it’s bad for legitimate website owners because
their sites become harder to find. For these reasons, we’ve been working since the earliest days of Google to fight
spammers, helping people find the answers they’re looking for, and helping legitimate websites get traffic from search.

Matt Cutts Explains How Google Search Works & Handles Spam

  • 1. Matt Cutts explains the basics of how Google Search works. About Search Every day Google answers more than one billion questions from people around the globe in 181 countries and 146 languages. 15% of the searches we see everyday we’ve never seen before. Technology makes this possible because we can create computing programs, called “algorithms”, that can handle the immense volume and breadth of search requests. We’re just at the beginning of what’s possible, and we are constantly looking to find better solutions. We have more engineers working on search today than at any time in the past. Search relies on human ingenuity, persistence and hard work. Just as an automobile engineer designs an engine with good torque, fuel efficiency, road noise and other qualities – Google’s search engineers design algorithms to return timely, high-quality, on-topic, answers to people’s questions. Our algorithms attempt to rank the most relevant search results towards the top of the page, and less relevant search results lower down the page. Algorithms Rank Relevant Results Higher For every search query performed on Google, whether it’s [hotels in Tulsa] or [New York Yankees scores], there are thousands, if not millions of web pages with helpful information. Our challenge in search is to return only the most relevant results at the top of the page, sparing people from combing through the less relevant results below. Not every website can come out at the top of the page, or even appear on the first page of our search results. Today our algorithms rely on more than 200 unique signals, some of which you’d expect, like how often the search terms occur on the webpage, if they appear in the title or whether synonyms of the search terms occur on the page. Google has invented many innovations in search to improve the answers you find. The first and most well known is PageRank, named for Larry Page (Google’s co-founder and CEO). 
PageRank works by counting the number and quality of links to a page to determine a rough estimate of how important the website is. The underlying assumption is that more important websites are likely to receive more links from other websites.
Panda: Helping People Find More High-Quality Sites
To give you an example of the changes we make, recently we launched a pretty big algorithmic improvement to our ranking, a change that noticeably impacts 11.8% of Google searches. This change came to be known as “Panda,” and while it’s one of hundreds of changes we make in a given year, it illustrates some of the problems we tackle in search.
The Panda update was designed to improve the user experience by catching and demoting low-quality sites that did not provide useful original content or otherwise add much value. At the same time, it provided better rankings for high-quality sites: sites with original content and information such as research, in-depth reports, thoughtful analysis and so on.

Market Pressure to Innovate
“[Google] has every reason to do whatever it takes to preserve its algorithm’s long-standing reputation for excellence. If consumers start to regard it as anything less than good, it won’t be good for anybody – except other search engines.”
Harry McCracken, TIME, 3/3/2011

We rely on rigorous testing and evaluation methods to rapidly and efficiently make improvements to our algorithms.

A Peek Inside
“At any moment, dozens of these changes are going through a well-oiled testing process…Every time engineers want to test a tweak, they run the new algorithm on a tiny percentage of random users, letting the rest of the site’s searchers serve as a massive control group.”
– Read more from Steven Levy’s in-depth story in Wired, 02/22/10

Testing and Evaluation
Google is constantly working to improve search. We take a data-driven approach and employ analysts, researchers and statisticians to evaluate search quality on a full-time basis. Changes to our algorithms undergo extensive quality evaluation before being released.
A typical algorithmic change begins as an idea from one of our engineers. We then implement that idea on a test version of Google and generate before and after results pages. We typically present these before and after results pages to “raters,” people who are trained to evaluate search quality. Assuming the feedback is positive, we may run what’s called a “live experiment” where we try out the updated algorithm on a very small percentage of Google users, so we can see data on how people seem to be interacting with the new results. For example, do searchers click the new result #1 more often? If so, that’s generally a good sign.
Despite all the work we put into our evaluations, the process is so efficient at this point that in 2010 alone we ran:
13,311 precision evaluations: To test whether potential algorithm changes had a positive or negative impact on the precision of our results
8,157 side-by-side experiments: Where we show a set of raters two different pages of results and ask them to evaluate which ones are better
2,800 click evaluations: To see how a small sample (typically less than 1% of our users) responds to a change
Based on all of this experimentation, evaluation and analysis, in 2010 we launched 516 improvements to search.

Manual Control and the Human Element
In very limited cases, manual controls are necessary to improve the user experience:
1. Security Concerns: We take aggressive manual action to protect people from security threats online, including malware and viruses. This includes removing pages from our index (including pages with credit card numbers and other personal information that can compromise security), putting up interstitial warning pages and adding notices to our results page to indicate that, “this site may harm your computer.”
2. Legal Issues: We will also manually intervene in our search results for legal reasons, for example to remove child sexual-abuse content (child pornography) or copyright infringing material (when notified through valid legal process such as a DMCA takedown request in the United States).
3. Exception Lists: Like the vast majority of search engines, in some cases our algorithms falsely identify sites and we sometimes make limited exceptions to improve our search quality. For example, our SafeSearch algorithms are designed to protect kids from sexual content online. When one of these algorithms mistakenly catches websites, such as essex.edu, we can make manual exceptions to prevent these sites from being classified as pornography.
4. Spam: Google and other search engines publish and enforce guidelines to prevent unscrupulous actors from trying to game their way to the top of the results. For example, our guidelines state that websites should not repeat the same keyword over and over again on the page, a technique known as “keyword stuffing.” While we use many automated ways of detecting these behaviors, we also take manual action to remove spam.

The Engineers Behind Search
“So behind every algorithm, and therefore behind every search result, is a team of people responsible for making sure Google search makes the right decisions when responding to your query. Obviously, there’s no other way it could have happened: Google is a living example of what’s possible when brilliant people devise a smart algorithm and marry it to limitless computing resources.”
– Tom Krazit, The human process behind Google’s algorithm, CNET, 09/07/10

Matt Cutts explains how Google deals with spam through a combination of algorithms and manual action, and how websites can request reconsideration of their sites.

Fighting Spam
Ever since there have been search engines, there have been people dedicated to tricking their way to the top of the results page. Common tactics include:
Cloaking: In this practice a website shows different information to search engine crawlers than to users. For example, a spammer might put the words “Sony Television” on his site in white text on a white background, even though the page is actually an advertisement for Viagra.
Keyword Stuffing: In this practice a website packs a page full of keywords over and over again to try and get a search engine to think the page is especially relevant for that topic. Long ago, this could mean simply repeating a phrase like “tax preparation advice” hundreds of times at the bottom of a site selling used cars, but today spammers have gotten more sophisticated.
Paid Links: In this practice one website pays another website to link to his site in hopes it will improve rankings based on PageRank. PageRank looks at links to try and determine the authoritativeness of a site.
Today, we estimate more than one million spam pages are created each hour. This is bad for searchers because it means more relevant websites get buried under irrelevant results, and it’s bad for legitimate website owners because their sites become harder to find. For these reasons, we’ve been working since the earliest days of Google to fight spammers, helping people find the answers they’re looking for, and helping legitimate websites get traffic from search.
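To make the keyword stuffing tactic described above more concrete, the sketch below measures what share of a page's words is taken up by one repeated phrase. It is only a naive illustration: the text does not describe how spam is actually detected, and the function name `phrase_density`, the tokenization rule, and the sample pages are assumptions.

```python
import re


def phrase_density(text, phrase):
    """Return the fraction of words in `text` used by repeats of `phrase`.

    A naive, illustrative measure only; occurrences are counted
    without overlap.
    """
    tokenize = lambda s: re.findall(r"[a-z0-9']+", s.lower())
    words = tokenize(text)
    target = tokenize(phrase)
    if not words or not target:
        return 0.0

    k = len(target)
    hits = 0
    i = 0
    while i <= len(words) - k:
        if words[i:i + k] == target:
            hits += 1
            i += k  # skip past this occurrence so matches never overlap
        else:
            i += 1
    return hits * k / len(words)


# Hypothetical pages: one uses the phrase once, one repeats it 100 times.
ordinary_page = "We offer tax preparation advice for small businesses in Tulsa."
stuffed_page = "Used cars for sale. " + "tax preparation advice " * 100

print(phrase_density(ordinary_page, "tax preparation advice"))
print(phrase_density(stuffed_page, "tax preparation advice"))
```

The stuffed page scores far higher than the ordinary one. A single ratio like this is easy to evade, which fits the point above that spammers have become more sophisticated than simple repetition.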