Notes for “Search Engine” Project
Common popular search engines: www.google.com, www.bing.com, www.yahoo.com,
www.ask.com

Other search engines: Wolfram Alpha, Dogpile, Swagbucks


Crawler-Based Search Engines


Crawler-based search engines, such as Google, create their listings automatically.
They "crawl" or "spider" the web, then people search through what they have found.


If you change your web pages, crawler-based search engines eventually find these
changes, and that can affect how you are listed. Page titles, body copy and other
elements all play a role.


Human-Powered Directories


A human-powered directory, such as the Open Directory, depends on humans for its
listings. You submit a short description to the directory for your entire site, or editors
write one for sites they review. A search looks for matches only in the descriptions
submitted.


Changing your web pages has no effect on your listing. Things that are useful for
improving a listing with a search engine have nothing to do with improving a listing
in a directory. The only exception is that a good site, with good content, might be
more likely to get reviewed for free than a poor site.


The Parts Of A Crawler-Based Search Engine


Crawler-based search engines have three major elements. First is the spider, also
called the crawler. The spider visits a web page, reads it, and then follows links to
other pages within the site. This is what it means when someone refers to a site
being "spidered" or "crawled." The spider returns to the site on a regular basis, such
as every month or two, to look for changes.


Everything the spider finds goes into the second part of the search engine, the index.
The index, sometimes called the catalog, is like a giant book containing a copy of
every web page that the spider finds. If a web page changes, then this book is
updated with new information.
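
The spider and index described above can be sketched in a few lines of Python. This is only a minimal illustration, not how any real search engine is built: the pages live in a made-up in-memory dictionary (FAKE_WEB) instead of being downloaded, and the names spider_site, LinkExtractor and fetch are hypothetical. The spider reads a page, stores a copy in the index, and follows only the links that stay within the same site.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def spider_site(start_url, fetch):
    """Visit start_url, then follow links that stay on the same site.

    `fetch` is a hypothetical callable that returns a page's HTML, or
    None if the page cannot be retrieved. Returns the index as a
    dictionary mapping each visited URL to a copy of its page.
    """
    site = urlparse(start_url).netloc
    index = {}
    queue = deque([start_url])
    seen = {start_url}
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        index[url] = html  # the index keeps a copy of the page
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if urlparse(target).netloc == site and target not in seen:
                seen.add(target)
                queue.append(target)
    return index


# A made-up three-page site plus one external page, held in memory.
FAKE_WEB = {
    "http://example.test/": '<a href="/about">About</a> <a href="http://other.test/">Other</a>',
    "http://example.test/about": '<a href="/contact">Contact</a> <a href="/">Home</a>',
    "http://example.test/contact": "<p>Write to us.</p>",
    "http://other.test/": "<p>Another site.</p>",
}

index = spider_site("http://example.test/", FAKE_WEB.get)
print(sorted(index))
```

Running the spider again later and overwriting index[url] with the fresh copy would correspond to the "book" being updated when a page changes.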


Sometimes it can take a while for new pages or changes that the spider finds to be
added to the index. Thus, a web page may have been "spidered" but not yet
"indexed." Until it is indexed -- added to the index -- it is not available to those
searching with the search engine.


Search engine software is the third part of a search engine. This is the program that
sifts through the millions of pages recorded in the index to find matches to a search
and rank them in order of what it believes is most relevant. You can learn more
about how search engine software ranks web pages on the aptly-named How Search
Engines Rank Web Pages page.
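
As a rough illustration of what this third part does, the Python sketch below finds the pages in a tiny index that match a search and orders them by a score. The scoring rule (count how often the search words occur on the page) is an assumption chosen for simplicity; real search engines use far more elaborate and mostly unpublished ranking methods. The function names and the sample pages are made up.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search(index, query):
    """Return the URLs of matching pages, best match first.

    `index` maps URL -> page text. The score is simply the number of
    times the query words occur on the page (an assumption made for
    this sketch, not a real engine's ranking formula).
    """
    terms = tokenize(query)
    scored = []
    for url, text in index.items():
        counts = Counter(tokenize(text))
        score = sum(counts[term] for term in terms)
        if score > 0:
            scored.append((score, url))
    # Highest score first; ties are broken by URL so the order is stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [url for _, url in scored]


# Made-up pages standing in for the index.
pages = {
    "http://example.test/a": "The Great Depression began in 1929.",
    "http://example.test/b": "Depression-era photographs of the Great Plains and the Great Depression.",
    "http://example.test/c": "A recipe for soup.",
}

print(search(pages, "great depression"))
```

Changing the scoring rule changes the order of the results, which is one simple way to see why two engines with the same pages in their index can still rank them differently.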


All crawler-based search engines have the basic parts described above, but there are
differences in how these parts are tuned. That is why the same search on different
search engines often produces different results. 1


Without search engines it would be very difficult to find information on the
internet. Imagine having to write a research paper on the Great Depression without
being able to get information quickly. You would have to spend hours hunting across
the internet for the information needed to write the paper. With search engines,
finding information on your topic becomes very easy.




1
    http://searchenginewatch.com/2168031
What is a Search Engine?
By definition, an Internet search engine is an information retrieval system that helps us find
information on the World Wide Web. The World Wide Web is a universe of information that is
accessible over the network, and it facilitates global sharing of information. But the WWW is
seen as an unstructured database, and it is growing exponentially into an enormous store of
information. Searching for information on the web is hence a difficult task. There is a need for
a tool to manage, filter and retrieve this ocean of information. A search engine serves this
purpose.
How does a Search Engine Work?

      •    Internet search engines are web search engines that search for and retrieve information on
           the web. Most of them use a crawler-indexer architecture and depend on their crawler
           modules. Crawlers, also referred to as spiders, are small programs that browse the web.
      •    Crawlers are given an initial set of URLs whose pages they retrieve. They extract the
           URLs that appear on the crawled pages and give this information to the crawl control
           module. The crawl control module decides which pages to visit next and gives their URLs
           back to the crawlers.
      •    The topics covered by different search engines vary according to the algorithms they use.
           Some search engines are programmed to search sites on a particular topic, while the
           crawlers in others may visit as many sites as possible.
      •    The crawl control module may use the link graph of a previous crawl or may use usage
           patterns to help in its crawling strategy.
      •    The indexer module extracts the words from each page the crawler visits and records the
           page's URL. The result is a large lookup table that gives, for each word, a list of URLs
           pointing to pages where that word occurs. The table lists only those pages that were
           covered in the crawling process.
      •    A collection analysis module is another important part of the search engine architecture.
           It creates a utility index. A utility index may provide access to pages of a given length or
           pages containing a certain number of pictures.
      •    During the process of crawling and indexing, a search engine stores the pages it
           retrieves. They are temporarily stored in a page repository. Search engines maintain a
           cache of pages they visit so that already visited pages can be retrieved faster.
      •    The query module of a search engine receives search requests from users in the form of
           keywords. The ranking module sorts the results.

      •    The crawler-indexer architecture has many variants. One is the distributed
           architecture of a search engine, which consists of gatherers
           and brokers. Gatherers collect indexing information from web servers, while the brokers
           provide the indexing mechanism and the query interface. Brokers update indices on the
           basis of information received from gatherers and other brokers. They can also filter
           information. Many search engines of today use this type of architecture. 2
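
The lookup table built by the indexer module and the keyword lookup done by the query module can be sketched as follows. This is a simplified Python illustration under stated assumptions: the crawled pages are a made-up dictionary, the function names are hypothetical, a page matches only if it contains every keyword, and the ranking step is left out (results are just sorted alphabetically).

```python
import re
from collections import defaultdict


def build_lookup_table(pages):
    """Map each word to the set of URLs of the pages where it occurs."""
    table = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            table[word].add(url)
    return table


def query(table, keywords):
    """Return the URLs of pages containing every keyword, sorted by URL.

    Requiring all keywords is an assumption of this sketch; a real
    query module may treat keywords differently.
    """
    words = re.findall(r"[a-z0-9]+", keywords.lower())
    if not words:
        return []
    result = set(table.get(words[0], set()))
    for word in words[1:]:
        result &= table.get(word, set())
    return sorted(result)


# Made-up pages standing in for the output of a crawl.
crawled = {
    "http://example.test/1": "Crawlers browse the web",
    "http://example.test/2": "The indexer records words",
    "http://example.test/3": "Crawlers and the indexer work together",
}

table = build_lookup_table(crawled)
print(query(table, "crawlers indexer"))
print(query(table, "the"))
```

Because the table is keyed by word, a query only touches the entries for its own keywords instead of rereading every stored page.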




2
    http://www.buzzle.com/articles/how-does-a-search-engine-work.html