ReSearch
(because research without search is just “re”)



                                                 Davide Eynard
                                                 eynard@elet.polimi.it

Table of contents

        Introduction
        ... (ellipsis left on purpose)
        Conclusions




                              Davide Eynard
ReSearch - 2008/06/06

This seminar is not...

“Le risorse elettroniche per la ricerca” (“Electronic resources for research”)
     a transversal course for the PhD students of Politecnico di
       Milano
     This (June 2008?) will be the fourth edition
     Very good material from previous editions is available at
       http://www.biblio.polimi.it/documenti
     Main topics:
               •   query languages
               •   online libraries, journals and ebooks
               •   tools to create and manage your bibliography
               •   search engines, deep Web, open archives, advanced browsing
               •   social publishing (blogs and RSS) and social bookmarking
               •   POLIsearch
               •   using the university proxy to access online resources
               •   notes on copyright issues
               •   search techniques (like PICO and SPICE)



So... why?


Searching (and now, in particular, being able to search effectively on
the Internet) is very important for our research and, more generally,
in our lives.

Even if they are interested, some students skip the course because it
does not give enough credits!

If you're interested in these topics, ask for a solution (i.e. an
increase in the credits, together with the teaching material).




So... what?

What is the real purpose of this lecture, then?
What are the contents?
Is this a short version of the PhD course?


                                    NAH!

There's so much material about search that we could prepare
ten complementary PhD courses...

Moreover, I already had some material I wanted to recycle
    http://searchlores.org – a precious source for seekers
    PowerBrowsing – an old project of mine


BUT I also have something new to tell you, I promise!



The Web

Search engines cover (at best) ¼ of the web

Different search engines may return different results (their
indexes overlap only partially)

Quality of results is measured in terms of precision and recall

See (for instance) here
[http://www.searchlores.org]
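
Precision and recall can be made concrete with a few lines of code. The following is a minimal sketch, assuming the set of relevant documents is known in advance (which is rarely true on the real Web); the function name and the sample document ids are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = fraction of retrieved documents that are relevant
    recall    = fraction of relevant documents that were retrieved
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: the engine returns 4 documents, 2 of which are
# among the 3 documents we actually wanted.
p, r = precision_recall(["a", "b", "c", "d"], ["a", "b", "e"])
print(f"precision={p:.2f} recall={r:.2f}")
```

A search engine that covers only part of the Web may still show good precision, while its recall is bounded by what it has indexed.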


The Internet

The Web vs. Not the Web

    The Web:      Blogs, Wikis, Forums, File sharing, Folksonomies, ...
    Not the Web:  IRC, Email, Usenet, IM, Emule, Bittorrent, P2P, ...





Search engines

How are search engines used?
    Mostly with queries of one or a few words
               • (which ones? Take a look at Zeitgeist!)
    Mostly you look at just the first hits
               • (check here and here)


Yet many useful operators are available...
    quotes
    allinanchor
    inurl
    filetype
    intitle
    related
    ... and of course boolean ones
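
The operators listed above can be combined in a single query string. The sketch below only assembles the string and a search URL; the helper names are hypothetical, the base URL is an assumption, and whether a given engine still honours each operator should be checked against its own documentation.

```python
from urllib.parse import urlencode


def build_query(phrase=None, words=(), **operators):
    """Assemble a query from an exact phrase, free words and operators.

    Each keyword argument becomes an `operator:value` token,
    e.g. filetype="pdf" -> filetype:pdf
    """
    parts = []
    if phrase:
        parts.append(f'"{phrase}"')  # quotes ask for the exact phrase
    parts.extend(words)
    for name, value in operators.items():
        parts.append(f"{name}:{value}")
    return " ".join(parts)


def search_url(query, base="http://www.google.com/search"):
    """URL-encode the query; `base` is an assumed search endpoint."""
    return base + "?" + urlencode({"q": query})


query = build_query(phrase="semantic web", words=["tutorial"],
                    filetype="pdf", intitle="ontology")
print(query)
print(search_url(query))
```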




True or false?

How true is boolean search?
    (that is, how truly boolean...)
    “I want this term or this other and not that one” is fine...
    ... but don't try to think in sets!




[Figure: Venn diagrams for the queries "semantic AND web" (sets "web" and
"semantic") and "semantic AND semantic" (set "semantic")]

... but it doesn't work like this!
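
For contrast, this is what "thinking in sets" would look like: a minimal sketch of a purely set-based boolean engine, with a toy corpus invented for the example. In this model repeating a term changes nothing, which is exactly the behaviour real engines do not always show.

```python
def build_index(documents):
    """Map each term to the set of ids of the documents containing it."""
    index = {}
    for doc_id, text in documents.items():
        for term in set(text.lower().split()):
            index.setdefault(term, set()).add(doc_id)
    return index


def boolean_and(index, *terms):
    """Pure set semantics: AND is the intersection of the posting sets."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*postings) if postings else set()


# Toy corpus (hypothetical)
DOCS = {1: "semantic web", 2: "semantic search", 3: "web design"}
INDEX = build_index(DOCS)

print(boolean_and(INDEX, "semantic", "web"))
print(boolean_and(INDEX, "semantic", "semantic"))
```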


Vector Space Model

In the VSM, documents are represented as vectors in a
multidimensional Euclidean space

The coordinate of document d on axis t is given by
    d_t = TF(d, t) * IDF(t)
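
The formula above leaves TF and IDF undefined, and several variants exist. The sketch below assumes the simplest ones: TF(d, t) is the raw count of t in d, and IDF(t) = log(N / df(t)), where N is the number of documents and df(t) the number of documents containing t. The corpus is invented for the example.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} vector per document.

    Assumed definitions (one of many variants):
      TF(d, t) = number of occurrences of t in d
      IDF(t)   = log(N / number of documents containing t)
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))  # count each term once per document
    idf = {term: math.log(n_docs / df[term]) for term in df}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({term: tf[term] * idf[term] for term in tf})
    return vectors


DOCS = ["semantic web search", "web search engine", "web"]
VECTORS = tf_idf(DOCS)
for doc, vec in zip(DOCS, VECTORS):
    print(doc, vec)
```

With these definitions a term that appears in every document (here "web") gets weight zero, so it does not help to tell documents apart.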


The epanaleptical approach

Some search engines are based on models that are much more similar
to the VSM than to sets+boolean.

Epanaleptical approach:
    just repeat the word many times
    if it's more than one word, surround them with quotes


Examples (nice academic drawbacks):
    semantic web
    semantic web + collaborative systems
    slam
    performance evaluation
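
Why repetition can matter in a vector-based model can be sketched as follows. This is an illustration under simplifying assumptions (raw term counts, no IDF, cosine similarity as the score), not a description of how any specific search engine ranks; the two toy documents are invented.

```python
import math
from collections import Counter


def vector(text):
    """Bag-of-words vector with raw term counts (no IDF, for simplicity)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


DOC_A = vector("semantic web web web")            # mostly about "web"
DOC_B = vector("semantic semantic semantic web")  # mostly about "semantic"

plain = vector("semantic web")
repeated = vector("semantic semantic semantic web")

# With the plain query both documents score the same;
# repeating "semantic" tilts the query vector towards DOC_B.
print(cosine(plain, DOC_A), cosine(plain, DOC_B))
print(cosine(repeated, DOC_A), cosine(repeated, DOC_B))
```

In a set-based boolean model the two queries would be identical; here the repeated term shifts the ranking.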





To google or not to google

Use Google to find anything
    “local” searches can be run from Google too
    try it with blogs, forums, wikis, etc.
               • phpbb trick
               • mediawiki trick


Use alternative search engines
    search for related:www.google.com





Search techniques

        Word search (+ suffixes)
        Webbits (here and here)
               • (and the “index of” trick)
        Concept related search and specific search engines
        Arrows: using communities of practice to enhance search
               • What are diy, gtd, seo, slam, etc.?
        Foster serendipity
               • check upper dirs
               • follow links
               • look at the status bar
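
"Check upper dirs" is easy to automate. A minimal sketch, assuming you simply want the list of parent directories of a URL to try by hand or with a script; the function name and the example URL are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def parent_urls(url):
    """List the parent directories of a URL, from the closest to the root."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    parents = []
    # Drop one trailing segment at a time: /a/b/c.html -> /a/b/ -> /a/ -> /
    for depth in range(len(segments) - 1, -1, -1):
        path = "/" + "/".join(segments[:depth])
        if not path.endswith("/"):
            path += "/"
        parents.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return parents


for candidate in parent_urls("http://example.org/a/b/c.html"):
    print(candidate)
```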





Exploit collaboration

Blogs/News
     Ok, I suppose you all know about RSS feeds...
               • You can recognize them
               • You can mash them up
               • You can use them for other media
        ... but how can you find interesting ones?
               • AideRSS technique
               • ... and a tutorial that explains how to use it
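
One way to "recognize" feeds programmatically is to look for the autodiscovery links many pages place in their HTML head. The sketch below assumes the common convention of a link element with rel="alternate" and an RSS or Atom MIME type; the class name, function name and sample page are made up for illustration.

```python
from html.parser import HTMLParser

FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect the href of every <link rel="alternate"> that points to a feed."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        mime = (attributes.get("type") or "").lower()
        href = attributes.get("href")
        if "alternate" in rel and mime in FEED_TYPES and href:
            self.feeds.append(href)


def find_feeds(html):
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds


SAMPLE_PAGE = """<html><head>
<link rel="stylesheet" href="style.css">
<link rel="alternate" type="application/rss+xml" title="News"
      href="http://example.org/feed.xml">
</head><body>Hello</body></html>"""

print(find_feeds(SAMPLE_PAGE))
```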





Exploit collaboration

Folksonomies
     del.icio.us
     ma.gnolia


Bibliography sharing
      bibsonomy
      CiteULike


Social networks/groups
    Ever searched for Facebook groups?





DIY

AKA Do It Yourself
    AKA means Also Known As
               • Also means... well, just jokin'!


In this case it means taking a personal, custom approach, using
ready-made tools or creating new ones.

How can you do it?
    Know thy enemy
               • WWW, HTTP, HTML (see powerbrowsing)
               • Human patterns
               • PC patterns
        Build models
        Exploit tools or regularities in contents





Web Technologies

There are some things you should know to make a well-behaved
bot:

   • HTTP
     ◦ GET and POST
     ◦ Referer
     ◦ UserAgent
     ◦ Cookie
     ◦ Proxy

   • HTML
     ◦ Form
     ◦ Dynamically generated code

Take a look at this tutorial, and at some DEI examples.
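
The HTTP items listed above map quite directly onto Python's standard library. The following is a minimal sketch, not taken from the tutorial mentioned on the slide: the bot name, the URLs and the proxy address are placeholders, and nothing is sent over the network until the opener's open() method is called.

```python
from http.cookiejar import CookieJar
from urllib.parse import urlencode
from urllib.request import (HTTPCookieProcessor, ProxyHandler, Request,
                            build_opener)

# Hypothetical identity: a well-behaved bot says who it is.
USER_AGENT = "ReSearchBot/0.1 (+http://example.org/bot-info)"


def make_request(url, form=None, referer=None):
    """Build a GET request, or a POST request if form data is given."""
    data = urlencode(form).encode("ascii") if form is not None else None
    headers = {"User-Agent": USER_AGENT}
    if referer:
        headers["Referer"] = referer
    return Request(url, data=data, headers=headers)


def make_opener(proxy=None):
    """Opener that keeps cookies between requests and optionally uses a proxy."""
    handlers = [HTTPCookieProcessor(CookieJar())]
    if proxy:
        handlers.append(ProxyHandler({"http": proxy, "https": proxy}))
    return build_opener(*handlers)


get_request = make_request("http://example.org/")
post_request = make_request("http://example.org/search",
                            form={"q": "semantic web"},
                            referer="http://example.org/")
print(get_request.get_method(), post_request.get_method())
print(post_request.data)
```

To actually fetch a page you would call make_opener().open(get_request); cookies set by the server are then sent back automatically on later requests made with the same opener.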


Tools and examples

Web tools
               •   Program Committee Searcher
               •   Changedetection
               •   Wayback machine
               •   Mashup tools
               •   SpeakinAbout


Client tools
               •   user agent switcher
               •   spiders/scrapers
               •   custom made tools ;-)
               •   Firefox search plugins





To conclude: did you know...

        that we have people working on very interesting stuff about
         searching, libraries and documents here (and, in the real world,
         about 100m from us?)
        that here you can find all the info you need to set up the
         university proxy, so you can access restricted document
         libraries from anywhere?
        that on the OPAC you can find recent doctoral theses ready to
         read, in pdf format?
        ... and that you have a lot of polimi-related news here?





That's all, folks!




                               Thank you!

                               Questions?








     Contact                                  Davide Eynard

                                                Tel. 02 2399 4010
                                                Fax 02 2399 3411


                                              eynard@elet.polimi.it
                                      http://www.dei.polimi.it/people/eynard



ReSearch - Searching for Researchers

  • 1. ReSearch (because research without search is just “re”) Davide Eynard eynard@elet.polimi.it
  • 2. Table of contents • Introduction • ... (ellipsis left on purpose) • Conclusions Davide Eynard ReSearch - 2008/06/06
  • 3. This seminar is not... “Le risorse elettroniche per la ricerca” (“Electronic resources for research”) • a transversal course for the PhD students of Politecnico di Milano • This (June 2008?) will be the fourth edition • Very good material from previous editions is available at http://www.biblio.polimi.it/documenti • Main topics: ◦ query languages ◦ online libraries, journals and ebooks ◦ tools to create and manage your bibliography ◦ search engines, deep Web, open archives, advanced browsing ◦ social publishing (blogs and RSS) and social bookmarking ◦ POLIsearch ◦ using the university proxy to access online resources ◦ notes on copyright issues ◦ search techniques (like PICO and SPICE)
  • 4. So... why? Searching (and now, in particular, being able to search effectively on the Internet) is very important for our research and, more generally, in our lives. Even if they are interested, some students skip the course because it does not give enough credits! If you're interested in these topics, ask for a solution (i.e. increasing the credits, together with the teaching material).
  • 5. So... what? What is the real purpose of this lecture, then? What are the contents? Is this a short version of the PhD course?
  • 6. So... what? What is the real purpose of this lecture, then? What are the contents? Is this a short version of the PhD course? NAH! There's so much material about search that we could prepare ten complementary PhD courses...
  • 7. So... what? What is the real purpose of this lecture, then? What are the contents? Is this a short version of the PhD course? NAH! There's so much material about search that we could prepare ten complementary PhD courses... Moreover, I already had some material I wanted to recycle • http://searchlores.org – a precious source for seekers • PowerBrowsing – an old project of mine
  • 8. So... what? What is the real purpose of this lecture, then? What are the contents? Is this a short version of the PhD course? NAH! There's so much material about search that we could prepare ten complementary PhD courses... Moreover, I already had some material I wanted to recycle • http://searchlores.org – a precious source for seekers • PowerBrowsing – an old project of mine BUT I also have something new to tell you, I promise!
  • 9. The Web • Search engines cover (at best) ¼ of the web • Different search engines may return different results (their coverage only partially overlaps) • Quality of results is measured in terms of precision and recall • See (for instance) http://www.searchlores.org
  • 10. The Internet: The Web (Blogs, Wikis, Forums, Folksonomies, ...) vs Not the Web (IRC, Email, Usenet, IM, File sharing, Emule, Bittorrent, P2P, ...)
  • 11. Search engines: How are search engines used? • Mostly with queries of one or a few words ◦ (which ones? Have a look at zeitgeist!) • Mostly you look just at the first hits ◦ (check here and here) Yet the main operators are available... • quotes • allinanchor • inurl • filetype • intitle • related • ... and of course boolean ones
  • 12. True or false? How true is boolean search? • (that is, how truly boolean...) • “I want this term or this other and not that one” is fine... • ... but don't try to think in sets! semantic AND web semantic AND semantic web semantic semantic ... but it doesn't work like this!
  • 13. Vector Space Model: In the VSM, documents are represented as vectors in a multidimensional Euclidean space. The coordinate of document d on axis t is given by d_t = TF(d,t) * IDF(t)
  • 14. The epanaleptical approach: Some search engines are based on models that are much more similar to the VSM than to sets+boolean. Epanaleptical approach: • just repeat the word many times • if it's more than one word, surround them with quotes Examples (nice academic drawbacks): • semantic web • semantic web + collaborative systems • slam • performance evaluation
  • 15. To google or not to google: Use Google to find anything • “local” searches can be run from Google too • try it with blogs, forums, wikis etc. ◦ phpbb trick ◦ mediawiki trick Use alternative search engines • search for related:www.google.com
  • 16. Search techniques • Word search (+ suffixes) • Webbits (here and here) ◦ (and the “index of” trick) • Concept-related search and specific search engines • Arrows: using communities of practice to enhance search ◦ What are diy, gtd, seo, slam, etc.? • Foster serendipity ◦ check upper dirs ◦ follow links ◦ look at the status bar
  • 17. Exploit collaboration: Blogs/News • Ok, I suppose you all know about RSS feeds... ◦ You can recognize them ◦ You can mash them up ◦ You can use them for other media • ... but how can you find interesting ones? ◦ AideRSS technique ◦ ... and a tutorial that explains how to use it
  • 18. Exploit collaboration: Folksonomies • del.icio.us • ma.gnolia Bibliography sharing • bibsonomy • CiteULike Social networks/groups • Ever searched for Facebook groups?
  • 19. DIY AKA Do It Yourself • AKA means Also Known As ◦ Also means... well, just jokin'! In this case it means taking a personal, custom approach, using ready-made tools or creating new ones. How can you do it? • Know thy enemy ◦ WWW, HTTP, HTML (see powerbrowsing) ◦ Human patterns ◦ PC patterns • Build models • Exploit tools or regularities in contents
  • 20. Web Technologies: There are some things you should know to make a well-behaved bot: • HTTP ◦ GET and POST ◦ Referer ◦ User-Agent ◦ Cookie ◦ Proxy • HTML ◦ Form ◦ Dynamically generated code Have a look at this tutorial, and at some DEI examples.
  • 21. Tools and examples: Web tools • Program Committee Searcher • Changedetection • Wayback Machine • Mashup tools • SpeakinAbout Client tools • user agent switcher • spiders/scrapers • custom-made tools ;-) • Firefox search plugins
  • 22. To conclude: did you know... • that we have people working on very interesting stuff about searching, libraries and documents here (and, in the real world, about 100 m from us)? • that here you can find all the info you need to set up the university proxy, so you can access restricted document libraries from anywhere? • that on the OPAC you can find recent doctoral theses ready to read, in PDF format? • ... and that you have a lot of polimi-related news here?
  • 23. That's all, folks! Thank you! Questions?
  • 24. Contact: Davide Eynard, Tel. 02 2399 4010, Fax 02 2399 3411, eynard@elet.polimi.it, http://www.dei.polimi.it/people/eynard
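
A footnote to slides 13 and 14: the weighting d_t = TF(d,t) * IDF(t), and the reason why repeating a query word can change the ranking, can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not how any real search engine works. The whitespace tokenizer, the raw-count TF, the IDF variant log(N / df), the cosine similarity used for ranking, the function names and the three-document corpus are all choices made here for illustration; none of them comes from the slides.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def idf_table(docs):
    """IDF(t) = log(N / df(t)): one common variant, assumed here."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))  # count each term once per document
    return {term: math.log(n / count) for term, count in df.items()}


def vectorize(text, idf):
    """Coordinate on axis t is TF(text, t) * IDF(t); TF is the raw count."""
    tf = Counter(tokenize(text))
    # Terms that never occur in the corpus have no axis and are ignored.
    return {term: tf[term] * idf[term] for term in tf if term in idf}


def cosine(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return (score, document index) pairs, best match first."""
    idf = idf_table(docs)
    q = vectorize(query, idf)
    scores = [(cosine(q, vectorize(doc, idf)), i) for i, doc in enumerate(docs)]
    return sorted(scores, reverse=True)


# Hypothetical toy corpus, made up for this sketch.
DOCS = [
    "semantic annotation tools",
    "web web crawling",
    "performance evaluation",
]

if __name__ == "__main__":
    for query in ("semantic web", "semantic semantic semantic web"):
        print(query, "->", rank(query, DOCS))
```

With this toy corpus the query "semantic web" puts document 1 (the "web" one) first, while "semantic semantic semantic web" puts document 0 first: repeating a word stretches the query vector along that word's axis, which is the effect the epanaleptical approach relies on. In a pure sets+boolean model the repetition would make no difference at all.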