SlideShare une entreprise Scribd logo
1  sur  11
BY
NANTHINI R O
II – MLIS
PONDICHERRY UNIVERSITY








Theory based approach to design various
aspects of information retrieval systems
Based on a set of principles and assumptions

Theory drives experiment by suggesting new
ways and means of doing tests
Experiment drives theory by justifying or
helping to improve the model


Cognitive or user centered
◦ Human information behaviour models
◦ Eg: Wilson’s model, Dervin’s model, Ellis’s model,
Bates’s model, Kulthau’s model, etc...



Structural or system centered
◦ Classical models based on logical and mathematical
principles
◦ Eg: Boolean search model, Vector Space model,
probabilistic model, etc...








Also called as ‘term vector model’ or ‘vector
processing model’
Represents both documents and queries by term
sets and compares global similarities between
queries and documents
used in information filtering, information
retrieval, indexing and relevancy rankings

first use was in the SMART Information Retrieval
System


term vectors are assigned for the keywords of the
documents and weights are provided according to
relevance



to compare different texts and retrieve relevant
records similar to the queries



terms are single words, keywords, or longer phrases



If words are chosen to be the terms, the
dimensionality of the vector is the number of words
in the vocabulary (the number of distinct words occurring in the corpus)


BASICS: (i and j are 2 documents, k – term, t – last term)

◦ Denotes the sum of the weights of all properties of
a vector

◦ Denotes the sum of products of corresponding term
weights for two vectors
◦ Denotes the sum of minimum component weights
of the corresponding two vectors


Similarity coefficients
◦ The Dice Coefficient

◦ The Jaccard Coefficient

acc. to Salton and McGill
Let the weights for the index terms assigned to two
documents i and j be as follows:

Doci = 3,2,1,0,0,0,1,1
Docj = 1,1,1,0,0,1,0,0
= 2 [(3*1)+(2*1)+(1*1)+(0*0)+(0*0)+(0*1)+(1*0)+(1*0)]
(3+2+1+0+0+0+1+1)+(1+1+1+0+0+1+0+0)
=12/12 = 1
= 6/(12-6)
= 1
Vector space model of information retrieval

Contenu connexe

Tendances

Information Retrieval Models
Information Retrieval ModelsInformation Retrieval Models
Information Retrieval ModelsNisha Arankandath
 
Information retrieval introduction
Information retrieval introductionInformation retrieval introduction
Information retrieval introductionnimmyjans4
 
Boolean,vector space retrieval Models
Boolean,vector space retrieval Models Boolean,vector space retrieval Models
Boolean,vector space retrieval Models Primya Tamil
 
Functions of information retrival system(1)
Functions of information retrival system(1)Functions of information retrival system(1)
Functions of information retrival system(1)silambu111
 
Indexing Techniques: Their Usage in Search Engines for Information Retrieval
Indexing Techniques: Their Usage in Search Engines for Information RetrievalIndexing Techniques: Their Usage in Search Engines for Information Retrieval
Indexing Techniques: Their Usage in Search Engines for Information RetrievalVikas Bhushan
 
CS6007 information retrieval - 5 units notes
CS6007   information retrieval - 5 units notesCS6007   information retrieval - 5 units notes
CS6007 information retrieval - 5 units notesAnandh Arumugakan
 
Latent Semantic Indexing For Information Retrieval
Latent Semantic Indexing For Information RetrievalLatent Semantic Indexing For Information Retrieval
Latent Semantic Indexing For Information RetrievalSudarsun Santhiappan
 
Probabilistic information retrieval models & systems
Probabilistic information retrieval models & systemsProbabilistic information retrieval models & systems
Probabilistic information retrieval models & systemsSelman Bozkır
 
Information retrieval 7 boolean model
Information retrieval 7 boolean modelInformation retrieval 7 boolean model
Information retrieval 7 boolean modelVaibhav Khanna
 
Information retrieval system
Information retrieval systemInformation retrieval system
Information retrieval systemLeslie Vargas
 
INFORMATION RETRIEVAL Anandraj.L
INFORMATION RETRIEVAL Anandraj.LINFORMATION RETRIEVAL Anandraj.L
INFORMATION RETRIEVAL Anandraj.Lanujessy
 
Information retrieval s
Information retrieval sInformation retrieval s
Information retrieval ssilambu111
 
The impact of web on ir
The impact of web on irThe impact of web on ir
The impact of web on irPrimya Tamil
 
Parallel and Distributed Information Retrieval System
Parallel and Distributed Information Retrieval SystemParallel and Distributed Information Retrieval System
Parallel and Distributed Information Retrieval Systemvimalsura
 
The semantic web
The semantic web The semantic web
The semantic web ap
 
Information Retrieval
Information RetrievalInformation Retrieval
Information Retrievalssbd6985
 
Ontology and Ontology Libraries: a Critical Study
Ontology and Ontology Libraries: a Critical StudyOntology and Ontology Libraries: a Critical Study
Ontology and Ontology Libraries: a Critical StudyDebashisnaskar
 

Tendances (20)

Information Retrieval Models
Information Retrieval ModelsInformation Retrieval Models
Information Retrieval Models
 
Information retrieval introduction
Information retrieval introductionInformation retrieval introduction
Information retrieval introduction
 
Boolean,vector space retrieval Models
Boolean,vector space retrieval Models Boolean,vector space retrieval Models
Boolean,vector space retrieval Models
 
Term weighting
Term weightingTerm weighting
Term weighting
 
Functions of information retrival system(1)
Functions of information retrival system(1)Functions of information retrival system(1)
Functions of information retrival system(1)
 
Indexing Techniques: Their Usage in Search Engines for Information Retrieval
Indexing Techniques: Their Usage in Search Engines for Information RetrievalIndexing Techniques: Their Usage in Search Engines for Information Retrieval
Indexing Techniques: Their Usage in Search Engines for Information Retrieval
 
CS6007 information retrieval - 5 units notes
CS6007   information retrieval - 5 units notesCS6007   information retrieval - 5 units notes
CS6007 information retrieval - 5 units notes
 
Latent Semantic Indexing For Information Retrieval
Latent Semantic Indexing For Information RetrievalLatent Semantic Indexing For Information Retrieval
Latent Semantic Indexing For Information Retrieval
 
Probabilistic information retrieval models & systems
Probabilistic information retrieval models & systemsProbabilistic information retrieval models & systems
Probabilistic information retrieval models & systems
 
Information retrieval 7 boolean model
Information retrieval 7 boolean modelInformation retrieval 7 boolean model
Information retrieval 7 boolean model
 
Information retrieval system
Information retrieval systemInformation retrieval system
Information retrieval system
 
INFORMATION RETRIEVAL Anandraj.L
INFORMATION RETRIEVAL Anandraj.LINFORMATION RETRIEVAL Anandraj.L
INFORMATION RETRIEVAL Anandraj.L
 
Information retrieval s
Information retrieval sInformation retrieval s
Information retrieval s
 
The impact of web on ir
The impact of web on irThe impact of web on ir
The impact of web on ir
 
Parallel and Distributed Information Retrieval System
Parallel and Distributed Information Retrieval SystemParallel and Distributed Information Retrieval System
Parallel and Distributed Information Retrieval System
 
CS8080 INFORMATION RETRIEVAL TECHNIQUES - IRT - UNIT - I PPT IN PDF
CS8080 INFORMATION RETRIEVAL TECHNIQUES - IRT - UNIT - I  PPT  IN PDFCS8080 INFORMATION RETRIEVAL TECHNIQUES - IRT - UNIT - I  PPT  IN PDF
CS8080 INFORMATION RETRIEVAL TECHNIQUES - IRT - UNIT - I PPT IN PDF
 
Inverted index
Inverted indexInverted index
Inverted index
 
The semantic web
The semantic web The semantic web
The semantic web
 
Information Retrieval
Information RetrievalInformation Retrieval
Information Retrieval
 
Ontology and Ontology Libraries: a Critical Study
Ontology and Ontology Libraries: a Critical StudyOntology and Ontology Libraries: a Critical Study
Ontology and Ontology Libraries: a Critical Study
 

Similaire à Vector space model of information retrieval

Research on ontology based information retrieval techniques
Research on ontology based information retrieval techniquesResearch on ontology based information retrieval techniques
Research on ontology based information retrieval techniquesKausar Mukadam
 
Types of case study
Types of  case studyTypes of  case study
Types of case studylaveleen
 
Chao Wrote Some trends that influence human resource are, Leade.docx
Chao Wrote Some trends that influence human resource are, Leade.docxChao Wrote Some trends that influence human resource are, Leade.docx
Chao Wrote Some trends that influence human resource are, Leade.docxsleeperharwell
 
Chao Wrote Some trends that influence human resource are, Leade.docx
Chao Wrote Some trends that influence human resource are, Leade.docxChao Wrote Some trends that influence human resource are, Leade.docx
Chao Wrote Some trends that influence human resource are, Leade.docxketurahhazelhurst
 
Achieving Highly Effective Personalized Learning through Learning Objects
Achieving Highly Effective Personalized Learning through Learning ObjectsAchieving Highly Effective Personalized Learning through Learning Objects
Achieving Highly Effective Personalized Learning through Learning ObjectsBabatunde Ishola
 
E-learning research methodological issues
E-learning research methodological issuesE-learning research methodological issues
E-learning research methodological issuesgrainne
 
Graduate Paper--Hierarchical clustring and topology for psychometrics paper
Graduate Paper--Hierarchical clustring and topology for psychometrics paperGraduate Paper--Hierarchical clustring and topology for psychometrics paper
Graduate Paper--Hierarchical clustring and topology for psychometrics paperColleen Farrelly
 
Reading Material: Qualitative Interview
Reading Material: Qualitative InterviewReading Material: Qualitative Interview
Reading Material: Qualitative Interviewfirdausabdmunir85
 
Data Mining for Education. Ryan S.J.d. Baker, Carnegie Mellon University
Data Mining for Education.  Ryan S.J.d. Baker, Carnegie Mellon UniversityData Mining for Education.  Ryan S.J.d. Baker, Carnegie Mellon University
Data Mining for Education. Ryan S.J.d. Baker, Carnegie Mellon Universityeraser Juan José Calderón
 
Lecture 1 research methods
Lecture 1 research methodsLecture 1 research methods
Lecture 1 research methodsAdina Dudau
 
Chapter 5 theory and methodology
Chapter 5 theory and methodology Chapter 5 theory and methodology
Chapter 5 theory and methodology grainne
 
The Case StudyMany disciplines use various forms of the ca.docx
The Case StudyMany disciplines use various forms of the ca.docxThe Case StudyMany disciplines use various forms of the ca.docx
The Case StudyMany disciplines use various forms of the ca.docxmamanda2
 
The Case StudyMany disciplines use various forms of the ca.docx
The Case StudyMany disciplines use various forms of the ca.docxThe Case StudyMany disciplines use various forms of the ca.docx
The Case StudyMany disciplines use various forms of the ca.docxarnoldmeredith47041
 
Writing the Theoretical and Conceptual Framework of a Quantitative Research
Writing the Theoretical and Conceptual Framework of a Quantitative ResearchWriting the Theoretical and Conceptual Framework of a Quantitative Research
Writing the Theoretical and Conceptual Framework of a Quantitative Researchschool
 
In house training 151114 qualitative research
In house training 151114 qualitative researchIn house training 151114 qualitative research
In house training 151114 qualitative researchHiram Ting
 

Similaire à Vector space model of information retrieval (20)

Research on ontology based information retrieval techniques
Research on ontology based information retrieval techniquesResearch on ontology based information retrieval techniques
Research on ontology based information retrieval techniques
 
Types of case study
Types of  case studyTypes of  case study
Types of case study
 
43144 12
43144 1243144 12
43144 12
 
Chao Wrote Some trends that influence human resource are, Leade.docx
Chao Wrote Some trends that influence human resource are, Leade.docxChao Wrote Some trends that influence human resource are, Leade.docx
Chao Wrote Some trends that influence human resource are, Leade.docx
 
Chao Wrote Some trends that influence human resource are, Leade.docx
Chao Wrote Some trends that influence human resource are, Leade.docxChao Wrote Some trends that influence human resource are, Leade.docx
Chao Wrote Some trends that influence human resource are, Leade.docx
 
Achieving Highly Effective Personalized Learning through Learning Objects
Achieving Highly Effective Personalized Learning through Learning ObjectsAchieving Highly Effective Personalized Learning through Learning Objects
Achieving Highly Effective Personalized Learning through Learning Objects
 
E-learning research methodological issues
E-learning research methodological issuesE-learning research methodological issues
E-learning research methodological issues
 
Graduate Paper--Hierarchical clustring and topology for psychometrics paper
Graduate Paper--Hierarchical clustring and topology for psychometrics paperGraduate Paper--Hierarchical clustring and topology for psychometrics paper
Graduate Paper--Hierarchical clustring and topology for psychometrics paper
 
Reading Material: Qualitative Interview
Reading Material: Qualitative InterviewReading Material: Qualitative Interview
Reading Material: Qualitative Interview
 
THE-USE-OF-THEORY.pptx
THE-USE-OF-THEORY.pptxTHE-USE-OF-THEORY.pptx
THE-USE-OF-THEORY.pptx
 
Data Mining for Education. Ryan S.J.d. Baker, Carnegie Mellon University
Data Mining for Education.  Ryan S.J.d. Baker, Carnegie Mellon UniversityData Mining for Education.  Ryan S.J.d. Baker, Carnegie Mellon University
Data Mining for Education. Ryan S.J.d. Baker, Carnegie Mellon University
 
Lecture 1 research methods
Lecture 1 research methodsLecture 1 research methods
Lecture 1 research methods
 
Chapter 5 theory and methodology
Chapter 5 theory and methodology Chapter 5 theory and methodology
Chapter 5 theory and methodology
 
Theoretical & framework
Theoretical & frameworkTheoretical & framework
Theoretical & framework
 
The Case StudyMany disciplines use various forms of the ca.docx
The Case StudyMany disciplines use various forms of the ca.docxThe Case StudyMany disciplines use various forms of the ca.docx
The Case StudyMany disciplines use various forms of the ca.docx
 
The Case StudyMany disciplines use various forms of the ca.docx
The Case StudyMany disciplines use various forms of the ca.docxThe Case StudyMany disciplines use various forms of the ca.docx
The Case StudyMany disciplines use various forms of the ca.docx
 
2. theoretical framework
2. theoretical framework2. theoretical framework
2. theoretical framework
 
Writing the Theoretical and Conceptual Framework of a Quantitative Research
Writing the Theoretical and Conceptual Framework of a Quantitative ResearchWriting the Theoretical and Conceptual Framework of a Quantitative Research
Writing the Theoretical and Conceptual Framework of a Quantitative Research
 
In house training 151114 qualitative research
In house training 151114 qualitative researchIn house training 151114 qualitative research
In house training 151114 qualitative research
 
Cdst12 ijtel
Cdst12 ijtelCdst12 ijtel
Cdst12 ijtel
 

Dernier

From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 

Dernier (20)

From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 

Vector space model of information retrieval

  • 1. BY NANTHINI R O II – MLIS PONDICHERRY UNIVERSITY
  • 2.     Theory based approach to design various aspects of information retrieval systems Based on a set of principles and assumptions Theory drives experiment by suggesting new ways and means of doing tests Experiment drives theory by justifying or helping to improve the model
  • 3.  Cognitive or user centered ◦ Human information behaviour models ◦ Eg: Wilson’s model, Dervin’s model, Ellis’s model, Bates’s model, Kulthau’s model, etc...  Structural or system centered ◦ Classical models based on logical and mathematical principles ◦ Eg: Boolean search model, Vector Space model, probabilistic model, etc...
  • 4.     Also called as ‘term vector model’ or ‘vector processing model’ Represents both documents and queries by term sets and compares global similarities between queries and documents used in information filtering, information retrieval, indexing and relevancy rankings first use was in the SMART Information Retrieval System
  • 5.  term vectors are assigned for the keywords of the documents and weights are provided according to relevance  to compare different texts and retrieve relevant records similar to the queries  terms are single words, keywords, or longer phrases  If words are chosen to be the terms, the dimensionality of the vector is the number of words in the vocabulary (the number of distinct words occurring in the corpus)
  • 6.  BASICS: (i and j are 2 documents, k – term, t – last term) ◦ Denotes the sum of the weights of all properties of a vector ◦ Denotes the sum of products of corresponding term weights for two vectors
  • 7. ◦ Denotes the sum of minimum component weights of the corresponding two vectors  Similarity coefficients ◦ The Dice Coefficient ◦ The Jaccard Coefficient acc. to Salton and McGill
  • 8. Let the weights for the index terms assigned to two documents i and j be as follows: Doci = 3,2,1,0,0,0,1,1 Docj = 1,1,1,0,0,1,0,0