Information Retrieval Systems, Fourth Year, Libraries, Beni Suef

Search Engines:
As Information Retrieval Systems

Search Engines
Search Directories
Meta Search Engines
Portals

Spider (crawler)
URL
Indexer
Searcher

The Parts of a Search Engine
Spider (or “crawler”)
Indexer
Search software (an algorithm)

The “spider” or “crawler”
The spider visits a web page, reads it, and then
follows links to other pages within the site. This is
what it means when someone refers to a site being
"spidered" or "crawled". This is also known as
“harvesting”. The spider returns to the site on a
regular basis, such as every month or two, to look for
changes.
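
The visit, read, and follow-links behaviour described above can be sketched in a few lines. This is only a minimal illustration, assuming a hypothetical in-memory "web" (the PAGES dictionary and its URLs are invented for the example) rather than real HTTP requests; a real spider would also schedule return visits to look for changes.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
PAGES = {
    "site/index": ("Welcome to the river guide", ["site/nile", "site/amazon"]),
    "site/nile": ("The Nile is a river in Africa", ["site/index"]),
    "site/amazon": ("The Amazon is a river in South America",
                    ["site/nile", "site/missing"]),
}

def crawl(start_url, pages=PAGES):
    """Visit start_url, then follow links breadth-first.

    Returns a dict {url: text} of every page that was read ("harvested").
    """
    harvested = {}
    queue = deque([start_url])
    seen = {start_url}
    while queue:
        url = queue.popleft()
        if url not in pages:  # dead link: nothing to read
            continue
        text, links = pages[url]
        harvested[url] = text  # "read" the page
        for link in links:  # follow links to other pages
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return harvested
```

Calling crawl("site/index") reads the three existing pages once each and skips the dead link.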

Robot Indexing Diagram
UCB SIMS 202, Sept. 2004
Avi Rappoport, Search Tools Consulting

The indexer
Everything the spider finds goes into the second part
of a search engine, the index. The index, sometimes
called the catalog, is like a giant book containing a
copy of every web page that the spider finds. If a web
page changes, then this book is updated with the new
information.
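
One common way to organise such an index is an inverted index, which maps each word to the pages that contain it. The sketch below is an assumption about how the "giant book" might be structured, not a description of any particular engine; the function names build_index and update_page are hypothetical.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Build an inverted index: word -> set of URLs containing that word."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index

def update_page(index, pages, url, new_text):
    """Re-index a single page after its content changes."""
    for urls in index.values():  # drop the old entries for this page
        urls.discard(url)
    pages[url] = new_text  # keep the stored copy current
    for word in tokenize(new_text):
        index[word].add(url)
```

Here pages plays the role of the stored copies of the web pages, and update_page shows the index being refreshed when the spider finds a changed page.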

Simple Index Diagram

Search Looks Simple

But It's Not
Index ahead of time
• Find files or records
• Open each one and read it
• Store each word in a searchable index
Provide search forms
• Match the query terms with words in the index
• Sort documents by relevance
Display results
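
The steps listed above (index ahead of time, match the query terms, sort by relevance) can be sketched as follows. The relevance score here is simply the number of times the query terms occur in a document, which is an assumption made for illustration; real engines use far more elaborate ranking. The document IDs and function names are hypothetical.

```python
import re
from collections import Counter

def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Index ahead of time: word -> {doc_id: number of occurrences}."""
    index = {}
    for doc_id, text in docs.items():
        for word, count in Counter(tokenize(text)).items():
            index.setdefault(word, {})[doc_id] = count
    return index

def search(index, query):
    """Match query terms against the index and sort documents by relevance.

    Relevance = total occurrences of the query terms (ties broken by doc_id).
    """
    scores = Counter()
    for term in tokenize(query):
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += count
    return sorted(scores, key=lambda d: (-scores[d], d))
```

Note that this sketch matches exact words only, so a query for "rivers" does not find a page that mentions only "river".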

Search is Mostly Invisible
Like an iceberg, 2/3 below water
Diagram labels: content, search functionality, user interface

How Do Search Engines Work?
1) They collect information from selected web sites.
2) They employ special software robots, called spiders, to
crawl web pages.
3) Spiders build lists of the words found in web sites.
   When a spider is building its lists, it is said to be Web crawling.
4) Spiders store the lists in the engine’s database.
5) The engine’s indexing software builds an index of words.
6) Information is matched against the query input and
retrieved (processing algorithm).

Search Processing

The Web: URL1, URL2, URL3, URL4
Crawler
Indexer
Search Engine Database
Query: "rivers?"
Result: "All About rivers" by S. I. Am

- Indexing
- Advanced Search


Diagram labels: Web; SE 1, SE 2, SE 3; Dispatcher; Display; User Interface; Knowledge; Personalize; Query; Feedback; User
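
The labels above suggest a dispatcher that sends one query to several search engines (SE 1, SE 2, SE 3) and merges their answers for display, as a meta search engine does. The sketch below is only one assumed way to do this: each engine is a hypothetical function returning a ranked list of URLs, and results are merged by summing reciprocal ranks. The Knowledge, Personalize, and Feedback components are not modelled.

```python
from collections import defaultdict

def dispatch(query, engines):
    """Send the query to every engine and merge the ranked result lists.

    Each engine is a callable: query -> ranked list of URLs (best first).
    A URL scores 1/rank in each list it appears in; scores are summed,
    so URLs returned near the top by several engines come first.
    """
    scores = defaultdict(float)
    for engine in engines:
        for rank, url in enumerate(engine(query), start=1):
            scores[url] += 1.0 / rank
    return sorted(scores, key=lambda u: (-scores[u], u))

# Hypothetical stand-ins for SE 1, SE 2, SE 3.
def se1(query):
    return ["a.example", "b.example"]

def se2(query):
    return ["b.example", "c.example"]

def se3(query):
    return ["b.example", "a.example"]
```

With these three stand-in engines, dispatch("rivers", [se1, se2, se3]) places b.example first because all three engines return it, twice in the top position.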


Ditto - www.ditto.com
Free Photo - www.freephoto.com
Amazing Image Machine - www.ncrtec.org/picture.htm
Pics 4 Learning - www.pics4learning.com

Traditional text-based image search engines
• Manual annotation of images
• Use text-based retrieval methods
Example annotations for one image: "Water lilies", "Flowers in a pond", <its biological name>
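
A minimal sketch of this text-based approach: each image carries manually written annotations such as "Water lilies", and retrieval is ordinary word matching over those annotations. The file names and the ANNOTATIONS table are hypothetical, and the image content itself is never examined.

```python
# Hypothetical manual annotations: image file -> list of text labels.
ANNOTATIONS = {
    "img_001.jpg": ["Water lilies", "Flowers in a pond"],
    "img_002.jpg": ["River at sunset"],
}

def search_images(query, annotations=ANNOTATIONS):
    """Return images whose annotations contain every word of the query."""
    terms = query.lower().split()
    hits = []
    for name, labels in annotations.items():
        words = " ".join(labels).lower().split()
        if all(term in words for term in terms):
            hits.append(name)
    return hits
```

For example, search_images("pond flowers") finds img_001.jpg, while a query using a word the annotator never typed finds nothing, which is the main weakness of manual annotation.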

QBIC – Search by color
** Images courtesy : Yong Rao
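
To give a feel for searching by color, the sketch below ranks images by how close their average color is to a query color. This is a deliberately simple stand-in and not QBIC's actual method; the images are hypothetical and represented as short lists of (R, G, B) pixels.

```python
def mean_color(pixels):
    """Average (R, G, B) over a list of pixels."""
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))

def color_distance(c1, c2):
    """Euclidean distance between two RGB colors."""
    return sum((a - b) ** 2 for a, b in zip(c1, c2)) ** 0.5

def search_by_color(query_color, images):
    """Rank image names from closest to farthest average color."""
    return sorted(
        images,
        key=lambda name: color_distance(query_color, mean_color(images[name])),
    )

# Hypothetical tiny "images": name -> list of RGB pixels.
IMAGES = {
    "pond": [(0, 100, 0), (0, 120, 20)],
    "sunset": [(250, 100, 0), (230, 80, 10)],
}
```

A query color near orange, such as (255, 90, 0), ranks the "sunset" image ahead of the "pond" image.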

QBIC – Search by shape

QBIC – Query by sketch


Find Sounds (specialized search engine) - www.findsounds.com
Daily Wav - www.dailywav.com
Sound America - www.soundamerica.com
Wav Central - www.wavecentral.com

45
written searching
spoken searching
browsing technique
terms browsing
items browsing

Barry’s Clipart Server - www.barrysclipart.com
Animated Gif Server - www.animatedgif.net
Animation Factory - www.animfactory.com

25 / 100 = 25%
25 / 200 = 12.5%
