SlideShare une entreprise Scribd logo
1  sur  6
Télécharger pour lire hors ligne
International Journal of Information Sciences and Techniques (IJIST) Vol.3, No.5, September 2013
DOI : 10.5121/ijist.2013.3501 1
Quantification of Portrayal Concepts using tf-idf
Weighting
S. Florence Vijila 1
and Dr. K. Nirmala 2
1
Research Scholar, Manonmaniam Sundaranar University, Tamil Nadu, India,
Assistant Professor , CSI Ewart Women’s Christian College, Melrosapuram , TamilNadu
2
Associate Prof. of Computer Applications, Quaid-e-Millath Govt. College for Women,
Chennai, Tamil Nadu, India
ABSTRACT
Term frequencies and inverse document frequencies have been successfully applied in determining
weighting for document rankings. However these have been more successful in text mining and in
extraction techniques used in the web. Concept mining has become increasingly popular in the research
and application areas of Computer Science. This paper attempts to demonstrate the limited usage of term
frequency and inverse document frequency for the application of weighting calculations for ranking
documents that are based on concept quantifications. The case study considered for experiment in this
paper, is based on concept terms of David Merrill’s First Principles of Instruction (FPI). Merrill’s FPI
applies cognitive structures explicitly for analyzing instructional materials. Therefore it is justified that the
terms categorized under each cognitive structure (or portrayal) of FPI can be taken as respective concept
of that portrayal. As question papers are representative of cognitive structures in a more clear and logical
way, four question papers on ‘C Language’ have been considered for the experimental study, that are
detailed in this paper. Manual method has been adopted for the computation of quantities of portrayals in
selected documents for the purpose of comparative study. As manual method is accurate, the values
(results) are considered as benchmark values. These benchmark values are considered for comparing with
normalized term frequencies that are derived (experimented) from automated extractions from the same
selected documents. The study is however limited to four documents only. Conclusions are drawn from this
experimental study, which will be of immense use to concept mining researchers as well as for instructional
designers.
KEYWORDS
concept keywords; document ranking; term frequency weighting; cognitive structures.
1. INTRODUCTION
Literatures are aplenty for document ranking using term weighting. Documents are retrieved
using keywords and documents are ranked according to keyword density (Masaru Ohba et al -
2005, SA-Kwang Song et al -2012). According to documented literature, ranking of documents
based on term weighting has been a key research issue in information retrieval, particularly
retrieving concepts in addition to keywords. Applying weighting schemes using term frequency
and inverse document frequency are crucial for ranking of documents (Sa-Kwang Song et al-
2012). A concept keyword is a word that represents a concept. But the nature of concept and the
International Journal of Information Sciences and Techniques (IJIST) Vol.3, No.5, September 2013
2
meaning a particular keyword carries are subjective judgment, as per these authors. The authors
have adapted term frequency for mining concepts. Both human selected concepts and automated
methodologies have been successfully adopted. But research publications point out that concept
mining ultimately depends only upon manual methods, when accuracy is needed to a high level.
In other words, automated methods would only provide approximate results.
The tf-idf weighting (term frequency-inverse document frequency, a.k.a.TF-IDF) is a numerical
statistic which reflects how important a word is to a document in a collection or corpus (Salton et
al - 1988). By convention, the tf-idf value increases proportionally to the number of times a word
appears in a document, but is offset by the frequency of the word in the corpus, which helps to
control for the fact that some words are generally more common than others. In view of this, the
factor itself may be considered as a weighting factor in information retrieval and text mining. This
paper hence attempts to determine the acceptability of tf-idf, through experimentation, on
quantifying concepts from documents. Comparative study will be carried out from the results of
manual methods.
In concept mining techniques, variations of the tf-idf weighting scheme are often used by some
search engines as a central tool in scoring and ranking a document’s relevance given by a user
query. Tf-idf can be successfully used for stop-words filtering in various subject fields including
text summarization and classification. This paper attempts to quantify well proven conceptual
terms, such as cognitive structures from instructional documents.
David Merrill (2007), in his ‘First Principles of Instruction’ (FPI) divides any instructional event
into four phases, which he calls ‘Activation’, ’Demonstration’, ’Application’ and ‘Integration’.
Central to this instructional model is a real-time problem-solving theme, called ‘problem’. Merrill
suggests that fundamental principles of instructional design should be relied on and these apply
regardless of any instructional design model used. Violating this would produce a decrement in
learning and performance. Each phase is a cognitive portrayal for learning a concept according to
this author. Therefore it would be very apt to consider cognitive structure or portrayal terms as
concepts for the purpose of quantification.
We propose to rank order four sets of concept keywords of these four cognitive structures
(portrayals) of Merrill’s FPI. The rank ordering would be carried out using normalized term
frequency (tf) and inverse document frequency (idf) on selected University question paper
documents of “C Language’. To ascertain the degree of accuracy (the objective of this paper), we
propose to compare the results with manual methods. Conclusions have been drawn from this
experimental study. The results will be of immense use to concept mining methodologists and for
instructional designers. The material of this paper forms a part of a whole research programme on
concept based quantification of portrayals by the authors. As these portrayals are taken for
concept quantification, the meaning of each portrayal is essential to comprehend before
attempting to experiment with.
2. ‘ACTIVATION’ PORTRAYAL ON EVALUATION DOCUMENT OF ‘C’
LANGUAGE
The ‘Activation’ portrayal of FPI specifies, “ Does the question direct students to recall,
remember, repeat or recognize basic and fundamental knowledge of ‘C’ from the relevant past
International Journal of Information Sciences and Techniques (IJIST) Vol.3, No.5, September 2013
3
behavior (experience) of the students that can be used as a foundation for the acquisition of new
(and further advanced) knowledge of ‘C’?“.
Based on the above definition and on the past ten year question paper patterns analyzed by the
authors, the following concept keywords have been specifically arrived at for the computation of
term frequency:
“list, define, tell, name, locate, identify, what, give, distinguish, acquire, underline, relate, state,
recall, select, repeat, recognize, reproduce and measure”.
3. ‘DEMONSTRATION’ PORTRAYAL ON EVALUATION DOCUMENT OF ‘C’
LANGUAGE
The ‘Demonstration’ portrayal of FPI specifies, “Does the question directs the students to
demonstrate of what has he learnt rather than merely provide information (unlike ’Activation’)
about what is needed as foundation for further techniques of ‘C’?“. Based on the above definition
and the past ten year question papers pattern analyzed by the authors, the following concept
keywords have been exclusively arrived at for the computation of term frequency:
“demonstrate, summarize, illustrate, interpret, contrast, predict, associate, distinguish,
identify, show, label, collect, experiment, recite, classify, discuss, select, compare,
translate, prepare, change, rephrase, differentiate, draw, explain, estimate, fill in, choose,
operate, perform, organize and write”.
4. ‘APPLICATION’ PORTRAYAL ON EVALUATION DOCUMENT OF ‘C’
LANGUAGE
The ‘Application’ portrayal of FPI specifies, ”Does the question provides an opportunity for
student to apply her acquired knowledge of ‘C’ to solve a problem?”. Based on this definition and
from analysis carried out authors on the past ten year question papers, the following concept
keywords have been exclusively arrived at for the computation of term frequencies:
“apply, calculate, find, solve, illustrate, make, predict, construct, assess, practice,
restrictive, classify, code, develop, generate, write”.
5. ‘INTEGRATION’ PORTRAYAL ON EVALUATION DOCUMENT OF ‘C’
LANGUAGE
The ‘Integration’ portrayal of FPI specifies, ”Does the question triggers and encourages the
student to integrate (transfer) her acquired knowledge of ‘C’ to a new situation in any critical and
complex situation?”. Based on this definition and from the study on past ten year question papers,
the following concept keywords have been arrived at exclusively for the computation of term
frequencies:
“analyze, solve, justify, infer, combine, integrate, plan, generalize, assess, decide,
rank, grade, recommend, contrast, survey, examine, investigate, compose, invent,
improve, imagine, hypothesize, predict, evaluate, rate, how and why”.
International Journal of Information Sciences and Techniques (IJIST) Vol.3, No.5, September 2013
4
Using these keywords, quantified benchmark values on the selected documents have been
arrived at through manual methods. These benchmark values would be compared with
automated term frequencies so as to arrive at conclusions.
6. COMPUTATION OF BENCHMARK VALUES BY MANUAL METHOD
Based on the definitions of the portrayals of FPI (four cognitive structures), quantification of
concept terms in terms of these four cognitive structures on the selected four question paper
documents (samples) were carried out using concept keywords as provided above. Thus, the
addition of all values (in percentages) would yield to 100%, as the method adapted is pure manual
by the authors. Therefore the values need to be exact and hence taken as benchmark values for
comparative studies. The values are provided in Table 1.
Portrayal Documents Average
Q1 Q2 Q3 Q4
Activation 48% 27% 30% 31% 34%
Demonstration 22% 28% 42% 33% 31%
Application 26% 36% 21% 29% 28%
Integration 4% 9% 7% 7% 7%
Table 1. Benchmark values in percentages computed manually
7. RESULTS AND DISCUSSIONS ON WEIGHTING USING TERM FREQUENCIES
The four selected question papers (documents) of the subject ‘C Language’ have been used for
the intended analysis. The normalized term frequency and inverse document frequencies on these
selected four concept keywords, namely ‘Activation’, ’Demonstration’, ’Application’ and
’Integration’ using their respective keywords have been used for the calculation of two
frequencies. These term frequencies are used for determining the weighting of these four
portrayals that we have computed through inverse document frequencies. The total number of
words, the most frequented word and the number of concepts (different keywords of each concept
or portrayal) are used for the calculation of weightings (Wikipedia 2012). They are computed and
presented for each document (i.e. question paper). They are presented in a nut shell in Table 2.
The basic variables for term frequency computations are:
Number of concept terms occurring in one document = i;
Number of most frequently occurred word in one document = n;
Normalized term frequency, tf = i/n ;_______________(1)
The basic data for term frequencies are presented in Table 2.
International Journal of Information Sciences and Techniques (IJIST) Vol.3, No.5, September 2013
5
Document
No.
Total
words
No. of most
frequented
word(n)
No. of concept terms appearing (i)
Activation Demonstration Application Integration
Q1 187 15 11 5 6 1
Q2 206 25 6 7 8 2
Q3 184 16 9 12 6 2
Q4 173 21 6 10 8 2
Total no. of concept terms
appearing
32 34 28 7
Table 2.Basic Data for Term Frequency
The total number of documents considered is N, which is 4.
The normalized term frequencies for each concept term of each document is presented in Table 3
(refer to equation 1).
Concept Normalized term frequency Total Average
Q1 Q2 Q3 Q4
Activation 0.7333 0.2400 0.5625 0.2857 1.8215 0.4554
Demonstration 0.3333 0.2800 0.7500 0.4762 1.8395 0.4599
Application 0.4000 0.3200 0.3750 0.0952 1.1902 0.2976
Integration 0.0667 0.0800 0.1250 0.952 0.3669 0.0917
Table 3. Normalized Term Frequencies
The inverse document frequency is computed as:
idf = log (N/K) ;_______________(2)
Where N = Total number of documents, which is 4 and
K: Number of occurrences of terms in all the documents considered (see total concept terms
appearing in Table 2).
The final results are :-
Activation = -0.903089
Demonstration = -0.92959
Application = -0.845098
Integration = -0.243038
The negative sign is due to the fact that the total number of documents considered is less than the
occurrences of terms in all the documents. The objective of this research work is to compare these
frequency values of concept terms with respect to the benchmarks that were calculated based on
manual methods. As the study is limited to comparisons, the idf is considered with modular
values. Thus weighting would be
tf.│log (N/K)│;_______________(3)
International Journal of Information Sciences and Techniques (IJIST) Vol.3, No.5, September 2013
6
Accordingly, the weighting for each concept term of each document is computed and presented in
Table 4
Concept Weighting of terms Average
Q1 Q2 Q3 Q4
Activation 0.6622 0.2167 0.5080 0.2580 0.4113
Demonstration 0.3098 0.2603 0.6972 0.4427 0.4275
Application 0.3380 0.2704 0.3169 0.0805 0.2515
Integration 0.0162 0.0194 0.0304 0.2313 0.0223
Table 4. Weighting of portrayals in each document
The above values yield important conclusions.
8. CONCLUSIONS
The largest and the least present cognitive portrayals in every document as per benchmark values
tally well with weighting of term frequency of all question paper documents considered. It is thus
concluded that normalized term frequencies may be used for quantifying concept terms in
individualized documents to an acceptable standards. However the benchmark values deviate
from that of weighting term frequency on average values. This clearly shows that unlike
quantification of keywords, quantifying concept words may not be reliable when inverse
document frequency is computed on small number of documents. The work may be extended to
large number of documents for further study.
REFERENCES
[1] Salton G, Buckley C (1988), "Term-weighting approaches in automatic text retrieval". Information
Processing and Management 24 (5): 513–523, 1988.
[2] Sa-kwang Song, and Sung Hyon Myaeng, (2012), “A novel term weighting scheme based on
discrimination power obtained from past retrieval results.”, Information Processing and
Management 48 (2012) 919–930, 2012.
[3] Masaru Ohba and Katsuhiko Gondow, (2005), “Toward Mining ‘Concept Keywords’ from
Identifiers in Large Software Projects”, ACM SIGSOFT Software Engineering Notes 30(4): 1-5,
2005.
[4] Merrill M.D., (2002), “First Principles of Instruction”, Englewood Cliffs, NJ: Educational
Technology Publications, 2002.
[5] Wikipedia (2012), http://en.wikipedia.org/wiki/Tf*idf, 2012
Authors
S.Florence Vijila is working as Assistant Professor in the Department t of Computer
Science at CSI Ewart Women’s Christian College,Tamil Nadu for the past 12 years.Her
qualification is M.CA,M.Phil,Now she is doing Ph.D in the area of “Data Mining”.Her
work have been published in the International conference on ”Innovations in
contemporary IT research “ proceedings.

Contenu connexe

Tendances

Data analysis chapter 18 from the companion website for educational research
Data analysis   chapter 18 from the companion website for educational researchData analysis   chapter 18 from the companion website for educational research
Data analysis chapter 18 from the companion website for educational researchYamith José Fandiño Parra
 
A Term Paper for the Course of Theories and Approaches in Language Teaching(...
A Term Paper for  the Course of Theories and Approaches in Language Teaching(...A Term Paper for  the Course of Theories and Approaches in Language Teaching(...
A Term Paper for the Course of Theories and Approaches in Language Teaching(...DawitDibekulu
 
SUM sem 3 Mb0050 – research methodology
SUM sem 3  Mb0050 – research methodologySUM sem 3  Mb0050 – research methodology
SUM sem 3 Mb0050 – research methodologyak007420
 
EDUCATIONAL RESEARCH II (II Bimetsre Abril Agosto 2011)
EDUCATIONAL RESEARCH II (II Bimetsre Abril Agosto 2011)EDUCATIONAL RESEARCH II (II Bimetsre Abril Agosto 2011)
EDUCATIONAL RESEARCH II (II Bimetsre Abril Agosto 2011)Videoconferencias UTPL
 
RES860 P6 IndividualProject - version101
RES860 P6 IndividualProject - version101RES860 P6 IndividualProject - version101
RES860 P6 IndividualProject - version101ThienSi Le
 
10 - qualitative research data collection ( Dr. Abdullah Al-Beraidi - Dr. Ib...
10 - qualitative research  data collection ( Dr. Abdullah Al-Beraidi - Dr. Ib...10 - qualitative research  data collection ( Dr. Abdullah Al-Beraidi - Dr. Ib...
10 - qualitative research data collection ( Dr. Abdullah Al-Beraidi - Dr. Ib...Rasha
 
Mining Opinions from University Students’ Feedback using Text Analytics
Mining Opinions from University Students’ Feedback using Text AnalyticsMining Opinions from University Students’ Feedback using Text Analytics
Mining Opinions from University Students’ Feedback using Text AnalyticsITIIIndustries
 
Qualitative data analysis - Student L
Qualitative data analysis - Student LQualitative data analysis - Student L
Qualitative data analysis - Student LLee Cox
 
615900072
615900072615900072
615900072picktru
 
Category & Training Texts Selection for Scientific Article Categorization in ...
Category & Training Texts Selection for Scientific Article Categorization in ...Category & Training Texts Selection for Scientific Article Categorization in ...
Category & Training Texts Selection for Scientific Article Categorization in ...Gan Keng Hoon
 
Mb0050 research methodology
Mb0050    research methodologyMb0050    research methodology
Mb0050 research methodologysmumbahelp
 
Mb0050 “research methodology answer
Mb0050 “research methodology  answerMb0050 “research methodology  answer
Mb0050 “research methodology answerRohit Mishra
 
Presentation on interpretation and report writing
Presentation on interpretation and report writingPresentation on interpretation and report writing
Presentation on interpretation and report writingSafiullah Rifat
 

Tendances (19)

Data Analysis
Data Analysis Data Analysis
Data Analysis
 
Data analysis chapter 18 from the companion website for educational research
Data analysis   chapter 18 from the companion website for educational researchData analysis   chapter 18 from the companion website for educational research
Data analysis chapter 18 from the companion website for educational research
 
A Term Paper for the Course of Theories and Approaches in Language Teaching(...
A Term Paper for  the Course of Theories and Approaches in Language Teaching(...A Term Paper for  the Course of Theories and Approaches in Language Teaching(...
A Term Paper for the Course of Theories and Approaches in Language Teaching(...
 
Report writing
Report writingReport writing
Report writing
 
Watanabe thesis
Watanabe thesisWatanabe thesis
Watanabe thesis
 
SUM sem 3 Mb0050 – research methodology
SUM sem 3  Mb0050 – research methodologySUM sem 3  Mb0050 – research methodology
SUM sem 3 Mb0050 – research methodology
 
Content analysis
Content analysisContent analysis
Content analysis
 
Independent Study Guide
Independent Study GuideIndependent Study Guide
Independent Study Guide
 
EDUCATIONAL RESEARCH II (II Bimetsre Abril Agosto 2011)
EDUCATIONAL RESEARCH II (II Bimetsre Abril Agosto 2011)EDUCATIONAL RESEARCH II (II Bimetsre Abril Agosto 2011)
EDUCATIONAL RESEARCH II (II Bimetsre Abril Agosto 2011)
 
RES860 P6 IndividualProject - version101
RES860 P6 IndividualProject - version101RES860 P6 IndividualProject - version101
RES860 P6 IndividualProject - version101
 
10 - qualitative research data collection ( Dr. Abdullah Al-Beraidi - Dr. Ib...
10 - qualitative research  data collection ( Dr. Abdullah Al-Beraidi - Dr. Ib...10 - qualitative research  data collection ( Dr. Abdullah Al-Beraidi - Dr. Ib...
10 - qualitative research data collection ( Dr. Abdullah Al-Beraidi - Dr. Ib...
 
Writing research report
Writing research reportWriting research report
Writing research report
 
Mining Opinions from University Students’ Feedback using Text Analytics
Mining Opinions from University Students’ Feedback using Text AnalyticsMining Opinions from University Students’ Feedback using Text Analytics
Mining Opinions from University Students’ Feedback using Text Analytics
 
Qualitative data analysis - Student L
Qualitative data analysis - Student LQualitative data analysis - Student L
Qualitative data analysis - Student L
 
615900072
615900072615900072
615900072
 
Category & Training Texts Selection for Scientific Article Categorization in ...
Category & Training Texts Selection for Scientific Article Categorization in ...Category & Training Texts Selection for Scientific Article Categorization in ...
Category & Training Texts Selection for Scientific Article Categorization in ...
 
Mb0050 research methodology
Mb0050    research methodologyMb0050    research methodology
Mb0050 research methodology
 
Mb0050 “research methodology answer
Mb0050 “research methodology  answerMb0050 “research methodology  answer
Mb0050 “research methodology answer
 
Presentation on interpretation and report writing
Presentation on interpretation and report writingPresentation on interpretation and report writing
Presentation on interpretation and report writing
 

En vedette

Detection of moving object using
Detection of moving object usingDetection of moving object using
Detection of moving object usingijistjournal
 
Suppression of common mode
Suppression of common modeSuppression of common mode
Suppression of common modeijistjournal
 
Fuzzy based hyperspectral image
Fuzzy based hyperspectral imageFuzzy based hyperspectral image
Fuzzy based hyperspectral imageijistjournal
 
Mark kubena resume ba bpm
Mark kubena resume ba bpmMark kubena resume ba bpm
Mark kubena resume ba bpmMark Kubena
 
Rob Wayne Business Analyst Resume
Rob Wayne Business Analyst ResumeRob Wayne Business Analyst Resume
Rob Wayne Business Analyst ResumeGO Logistic, LLC
 
Preetam_Resume_Business Analyst
Preetam_Resume_Business AnalystPreetam_Resume_Business Analyst
Preetam_Resume_Business AnalystPreetam Sahu
 
3.KIRAN Resume_Business Analyst_Insurance
3.KIRAN Resume_Business Analyst_Insurance3.KIRAN Resume_Business Analyst_Insurance
3.KIRAN Resume_Business Analyst_InsuranceKIRAN KUMAR
 
Daksha Desai Resume Business Analyst
Daksha Desai Resume   Business AnalystDaksha Desai Resume   Business Analyst
Daksha Desai Resume Business Analystdakshadesai1
 
Business-Analyst-Resume
Business-Analyst-ResumeBusiness-Analyst-Resume
Business-Analyst-ResumeRanjit nikam
 
Business Analyst Resume for Insurance industry
Business Analyst Resume for Insurance industryBusiness Analyst Resume for Insurance industry
Business Analyst Resume for Insurance industryMaria FutureThoughts
 

En vedette (15)

Detection of moving object using
Detection of moving object usingDetection of moving object using
Detection of moving object using
 
Resume
ResumeResume
Resume
 
Suppression of common mode
Suppression of common modeSuppression of common mode
Suppression of common mode
 
Fuzzy based hyperspectral image
Fuzzy based hyperspectral imageFuzzy based hyperspectral image
Fuzzy based hyperspectral image
 
Resume
ResumeResume
Resume
 
Mark kubena resume ba bpm
Mark kubena resume ba bpmMark kubena resume ba bpm
Mark kubena resume ba bpm
 
Aaron's Resume
Aaron's ResumeAaron's Resume
Aaron's Resume
 
Rob Wayne Business Analyst Resume
Rob Wayne Business Analyst ResumeRob Wayne Business Analyst Resume
Rob Wayne Business Analyst Resume
 
Preetam_Resume_Business Analyst
Preetam_Resume_Business AnalystPreetam_Resume_Business Analyst
Preetam_Resume_Business Analyst
 
3.KIRAN Resume_Business Analyst_Insurance
3.KIRAN Resume_Business Analyst_Insurance3.KIRAN Resume_Business Analyst_Insurance
3.KIRAN Resume_Business Analyst_Insurance
 
Daksha Desai Resume Business Analyst
Daksha Desai Resume   Business AnalystDaksha Desai Resume   Business Analyst
Daksha Desai Resume Business Analyst
 
EDI_gowtam
EDI_gowtamEDI_gowtam
EDI_gowtam
 
Business-Analyst-Resume
Business-Analyst-ResumeBusiness-Analyst-Resume
Business-Analyst-Resume
 
Business Analyst Resume for Insurance industry
Business Analyst Resume for Insurance industryBusiness Analyst Resume for Insurance industry
Business Analyst Resume for Insurance industry
 
AppleOne - Sample Resumes
AppleOne - Sample ResumesAppleOne - Sample Resumes
AppleOne - Sample Resumes
 

Similaire à Quantification of Portrayal Concepts using tf-idf Weighting

A Review Of Text Mining Techniques And Applications
A Review Of Text Mining Techniques And ApplicationsA Review Of Text Mining Techniques And Applications
A Review Of Text Mining Techniques And ApplicationsLisa Graves
 
Chao Wrote Some trends that influence human resource are, Leade.docx
Chao Wrote Some trends that influence human resource are, Leade.docxChao Wrote Some trends that influence human resource are, Leade.docx
Chao Wrote Some trends that influence human resource are, Leade.docxsleeperharwell
 
Chao Wrote Some trends that influence human resource are, Leade.docx
Chao Wrote Some trends that influence human resource are, Leade.docxChao Wrote Some trends that influence human resource are, Leade.docx
Chao Wrote Some trends that influence human resource are, Leade.docxketurahhazelhurst
 
A hybrid composite features based sentence level sentiment analyzer
A hybrid composite features based sentence level sentiment analyzerA hybrid composite features based sentence level sentiment analyzer
A hybrid composite features based sentence level sentiment analyzerIAESIJAI
 
ASSIGNMENT 2 - Research Proposal Weighting 30 tow.docx
ASSIGNMENT 2 - Research Proposal    Weighting 30 tow.docxASSIGNMENT 2 - Research Proposal    Weighting 30 tow.docx
ASSIGNMENT 2 - Research Proposal Weighting 30 tow.docxsherni1
 
11 - qualitative research data analysis ( Dr. Abdullah Al-Beraidi - Dr. Ibrah...
11 - qualitative research data analysis ( Dr. Abdullah Al-Beraidi - Dr. Ibrah...11 - qualitative research data analysis ( Dr. Abdullah Al-Beraidi - Dr. Ibrah...
11 - qualitative research data analysis ( Dr. Abdullah Al-Beraidi - Dr. Ibrah...Rasha
 
Semi Automated Text Categorization Using Demonstration Based Term Set
Semi Automated Text Categorization Using Demonstration Based Term SetSemi Automated Text Categorization Using Demonstration Based Term Set
Semi Automated Text Categorization Using Demonstration Based Term SetIJCSEA Journal
 
AN AUTOMATED MULTIPLE-CHOICE QUESTION GENERATION USING NATURAL LANGUAGE PROCE...
AN AUTOMATED MULTIPLE-CHOICE QUESTION GENERATION USING NATURAL LANGUAGE PROCE...AN AUTOMATED MULTIPLE-CHOICE QUESTION GENERATION USING NATURAL LANGUAGE PROCE...
AN AUTOMATED MULTIPLE-CHOICE QUESTION GENERATION USING NATURAL LANGUAGE PROCE...kevig
 
Exploring Semantic Question Generation Methodology and a Case Study for Algor...
Exploring Semantic Question Generation Methodology and a Case Study for Algor...Exploring Semantic Question Generation Methodology and a Case Study for Algor...
Exploring Semantic Question Generation Methodology and a Case Study for Algor...IJCI JOURNAL
 
A Meta-Analysis of the Effects of Writing and Writing Instruction on Reading ...
A Meta-Analysis of the Effects of Writing and Writing Instruction on Reading ...A Meta-Analysis of the Effects of Writing and Writing Instruction on Reading ...
A Meta-Analysis of the Effects of Writing and Writing Instruction on Reading ...Pubrica
 
Automated Thai Online Assignment Scoring
Automated Thai Online Assignment ScoringAutomated Thai Online Assignment Scoring
Automated Thai Online Assignment ScoringMary Montoya
 
The sarcasm detection with the method of logistic regression
The sarcasm detection with the method of logistic regressionThe sarcasm detection with the method of logistic regression
The sarcasm detection with the method of logistic regressionEditorIJAERD
 
Document Classification Using KNN with Fuzzy Bags of Word Representation
Document Classification Using KNN with Fuzzy Bags of Word RepresentationDocument Classification Using KNN with Fuzzy Bags of Word Representation
Document Classification Using KNN with Fuzzy Bags of Word Representationsuthi
 
please help i have 40 minsItem 1In the case below, the original .pdf
please help i have 40 minsItem 1In the case below, the original .pdfplease help i have 40 minsItem 1In the case below, the original .pdf
please help i have 40 minsItem 1In the case below, the original .pdfsantanadenisesarin13
 
USING TF-ISF WITH LOCAL CONTEXT TO GENERATE AN OWL DOCUMENT REPRESENTATION FO...
USING TF-ISF WITH LOCAL CONTEXT TO GENERATE AN OWL DOCUMENT REPRESENTATION FO...USING TF-ISF WITH LOCAL CONTEXT TO GENERATE AN OWL DOCUMENT REPRESENTATION FO...
USING TF-ISF WITH LOCAL CONTEXT TO GENERATE AN OWL DOCUMENT REPRESENTATION FO...cseij
 
Arabic text categorization algorithm using vector evaluation method
Arabic text categorization algorithm using vector evaluation methodArabic text categorization algorithm using vector evaluation method
Arabic text categorization algorithm using vector evaluation methodijcsit
 
A Comparative Study of Centroid-Based and Naïve Bayes Classifiers for Documen...
A Comparative Study of Centroid-Based and Naïve Bayes Classifiers for Documen...A Comparative Study of Centroid-Based and Naïve Bayes Classifiers for Documen...
A Comparative Study of Centroid-Based and Naïve Bayes Classifiers for Documen...IJERA Editor
 
Clustering Students of Computer in Terms of Level of Programming
Clustering Students of Computer in Terms of Level of ProgrammingClustering Students of Computer in Terms of Level of Programming
Clustering Students of Computer in Terms of Level of ProgrammingEditor IJCATR
 

Similaire à Quantification of Portrayal Concepts using tf-idf Weighting (20)

D1802023136
D1802023136D1802023136
D1802023136
 
A Review Of Text Mining Techniques And Applications
A Review Of Text Mining Techniques And ApplicationsA Review Of Text Mining Techniques And Applications
A Review Of Text Mining Techniques And Applications
 
Chao Wrote Some trends that influence human resource are, Leade.docx
Chao Wrote Some trends that influence human resource are, Leade.docxChao Wrote Some trends that influence human resource are, Leade.docx
Chao Wrote Some trends that influence human resource are, Leade.docx
 
Chao Wrote Some trends that influence human resource are, Leade.docx
Chao Wrote Some trends that influence human resource are, Leade.docxChao Wrote Some trends that influence human resource are, Leade.docx
Chao Wrote Some trends that influence human resource are, Leade.docx
 
A hybrid composite features based sentence level sentiment analyzer
A hybrid composite features based sentence level sentiment analyzerA hybrid composite features based sentence level sentiment analyzer
A hybrid composite features based sentence level sentiment analyzer
 
ASSIGNMENT 2 - Research Proposal Weighting 30 tow.docx
ASSIGNMENT 2 - Research Proposal    Weighting 30 tow.docxASSIGNMENT 2 - Research Proposal    Weighting 30 tow.docx
ASSIGNMENT 2 - Research Proposal Weighting 30 tow.docx
 
11 - qualitative research data analysis ( Dr. Abdullah Al-Beraidi - Dr. Ibrah...
11 - qualitative research data analysis ( Dr. Abdullah Al-Beraidi - Dr. Ibrah...11 - qualitative research data analysis ( Dr. Abdullah Al-Beraidi - Dr. Ibrah...
11 - qualitative research data analysis ( Dr. Abdullah Al-Beraidi - Dr. Ibrah...
 
Semi Automated Text Categorization Using Demonstration Based Term Set
Semi Automated Text Categorization Using Demonstration Based Term SetSemi Automated Text Categorization Using Demonstration Based Term Set
Semi Automated Text Categorization Using Demonstration Based Term Set
 
AN AUTOMATED MULTIPLE-CHOICE QUESTION GENERATION USING NATURAL LANGUAGE PROCE...
AN AUTOMATED MULTIPLE-CHOICE QUESTION GENERATION USING NATURAL LANGUAGE PROCE...AN AUTOMATED MULTIPLE-CHOICE QUESTION GENERATION USING NATURAL LANGUAGE PROCE...
AN AUTOMATED MULTIPLE-CHOICE QUESTION GENERATION USING NATURAL LANGUAGE PROCE...
 
Exploring Semantic Question Generation Methodology and a Case Study for Algor...
Exploring Semantic Question Generation Methodology and a Case Study for Algor...Exploring Semantic Question Generation Methodology and a Case Study for Algor...
Exploring Semantic Question Generation Methodology and a Case Study for Algor...
 
A Meta-Analysis of the Effects of Writing and Writing Instruction on Reading ...
A Meta-Analysis of the Effects of Writing and Writing Instruction on Reading ...A Meta-Analysis of the Effects of Writing and Writing Instruction on Reading ...
A Meta-Analysis of the Effects of Writing and Writing Instruction on Reading ...
 
Automated Thai Online Assignment Scoring
Automated Thai Online Assignment ScoringAutomated Thai Online Assignment Scoring
Automated Thai Online Assignment Scoring
 
The sarcasm detection with the method of logistic regression
The sarcasm detection with the method of logistic regressionThe sarcasm detection with the method of logistic regression
The sarcasm detection with the method of logistic regression
 
Document Classification Using KNN with Fuzzy Bags of Word Representation
Document Classification Using KNN with Fuzzy Bags of Word RepresentationDocument Classification Using KNN with Fuzzy Bags of Word Representation
Document Classification Using KNN with Fuzzy Bags of Word Representation
 
please help i have 40 minsItem 1In the case below, the original .pdf
please help i have 40 minsItem 1In the case below, the original .pdfplease help i have 40 minsItem 1In the case below, the original .pdf
please help i have 40 minsItem 1In the case below, the original .pdf
 
USING TF-ISF WITH LOCAL CONTEXT TO GENERATE AN OWL DOCUMENT REPRESENTATION FO...
USING TF-ISF WITH LOCAL CONTEXT TO GENERATE AN OWL DOCUMENT REPRESENTATION FO...USING TF-ISF WITH LOCAL CONTEXT TO GENERATE AN OWL DOCUMENT REPRESENTATION FO...
USING TF-ISF WITH LOCAL CONTEXT TO GENERATE AN OWL DOCUMENT REPRESENTATION FO...
 
Arabic text categorization algorithm using vector evaluation method
Arabic text categorization algorithm using vector evaluation methodArabic text categorization algorithm using vector evaluation method
Arabic text categorization algorithm using vector evaluation method
 
A Comparative Study of Centroid-Based and Naïve Bayes Classifiers for Documen...
A Comparative Study of Centroid-Based and Naïve Bayes Classifiers for Documen...A Comparative Study of Centroid-Based and Naïve Bayes Classifiers for Documen...
A Comparative Study of Centroid-Based and Naïve Bayes Classifiers for Documen...
 
Clustering Students of Computer in Terms of Level of Programming
Clustering Students of Computer in Terms of Level of ProgrammingClustering Students of Computer in Terms of Level of Programming
Clustering Students of Computer in Terms of Level of Programming
 
NLP Ecosystem
NLP EcosystemNLP Ecosystem
NLP Ecosystem
 

Plus de ijistjournal

International Journal of Information Sciences and Techniques (IJIST)
International Journal of Information Sciences and Techniques (IJIST)International Journal of Information Sciences and Techniques (IJIST)
International Journal of Information Sciences and Techniques (IJIST)ijistjournal
 
BRAIN TUMOR MRIIMAGE CLASSIFICATION WITH FEATURE SELECTION AND EXTRACTION USI...
BRAIN TUMOR MRIIMAGE CLASSIFICATION WITH FEATURE SELECTION AND EXTRACTION USI...BRAIN TUMOR MRIIMAGE CLASSIFICATION WITH FEATURE SELECTION AND EXTRACTION USI...
BRAIN TUMOR MRIIMAGE CLASSIFICATION WITH FEATURE SELECTION AND EXTRACTION USI...ijistjournal
 
Research Article Submission - International Journal of Information Sciences a...
Research Article Submission - International Journal of Information Sciences a...Research Article Submission - International Journal of Information Sciences a...
Research Article Submission - International Journal of Information Sciences a...ijistjournal
 
A MEDIAN BASED DIRECTIONAL CASCADED WITH MASK FILTER FOR REMOVAL OF RVIN
A MEDIAN BASED DIRECTIONAL CASCADED WITH MASK FILTER FOR REMOVAL OF RVINA MEDIAN BASED DIRECTIONAL CASCADED WITH MASK FILTER FOR REMOVAL OF RVIN
A MEDIAN BASED DIRECTIONAL CASCADED WITH MASK FILTER FOR REMOVAL OF RVINijistjournal
 
DECEPTION AND RACISM IN THE TUSKEGEE SYPHILIS STUDY
DECEPTION AND RACISM IN THE TUSKEGEE  SYPHILIS STUDYDECEPTION AND RACISM IN THE TUSKEGEE  SYPHILIS STUDY
DECEPTION AND RACISM IN THE TUSKEGEE SYPHILIS STUDYijistjournal
 
Online Paper Submission - International Journal of Information Sciences and T...
Online Paper Submission - International Journal of Information Sciences and T...Online Paper Submission - International Journal of Information Sciences and T...
Online Paper Submission - International Journal of Information Sciences and T...ijistjournal
 
A NOVEL APPROACH FOR SEGMENTATION OF SECTOR SCAN SONAR IMAGES USING ADAPTIVE ...
A NOVEL APPROACH FOR SEGMENTATION OF SECTOR SCAN SONAR IMAGES USING ADAPTIVE ...A NOVEL APPROACH FOR SEGMENTATION OF SECTOR SCAN SONAR IMAGES USING ADAPTIVE ...
A NOVEL APPROACH FOR SEGMENTATION OF SECTOR SCAN SONAR IMAGES USING ADAPTIVE ...ijistjournal
 
Call for Papers - International Journal of Information Sciences and Technique...
Call for Papers - International Journal of Information Sciences and Technique...Call for Papers - International Journal of Information Sciences and Technique...
Call for Papers - International Journal of Information Sciences and Technique...ijistjournal
 
Online Paper Submission - 4th International Conference on NLP & Data Mining (...
Online Paper Submission - 4th International Conference on NLP & Data Mining (...Online Paper Submission - 4th International Conference on NLP & Data Mining (...
Online Paper Submission - 4th International Conference on NLP & Data Mining (...ijistjournal
 
TOWARDS AUTOMATIC DETECTION OF SENTIMENTS IN CUSTOMER REVIEWS
TOWARDS AUTOMATIC DETECTION OF SENTIMENTS IN CUSTOMER REVIEWSTOWARDS AUTOMATIC DETECTION OF SENTIMENTS IN CUSTOMER REVIEWS
TOWARDS AUTOMATIC DETECTION OF SENTIMENTS IN CUSTOMER REVIEWSijistjournal
 
Research Articles Submission - International Journal of Information Sciences ...
Research Articles Submission - International Journal of Information Sciences ...Research Articles Submission - International Journal of Information Sciences ...
Research Articles Submission - International Journal of Information Sciences ...ijistjournal
 
ANALYSIS OF IMAGE WATERMARKING USING LEAST SIGNIFICANT BIT ALGORITHM
ANALYSIS OF IMAGE WATERMARKING USING LEAST SIGNIFICANT BIT ALGORITHMANALYSIS OF IMAGE WATERMARKING USING LEAST SIGNIFICANT BIT ALGORITHM
ANALYSIS OF IMAGE WATERMARKING USING LEAST SIGNIFICANT BIT ALGORITHMijistjournal
 
Call for Research Articles - 6th International Conference on Machine Learning...
Call for Research Articles - 6th International Conference on Machine Learning...Call for Research Articles - 6th International Conference on Machine Learning...
Call for Research Articles - 6th International Conference on Machine Learning...ijistjournal
 
Online Paper Submission - International Journal of Information Sciences and T...
Online Paper Submission - International Journal of Information Sciences and T...Online Paper Submission - International Journal of Information Sciences and T...
Online Paper Submission - International Journal of Information Sciences and T...ijistjournal
 
Colour Image Steganography Based on Pixel Value Differencing in Spatial Domain
Colour Image Steganography Based on Pixel Value Differencing in Spatial DomainColour Image Steganography Based on Pixel Value Differencing in Spatial Domain
Colour Image Steganography Based on Pixel Value Differencing in Spatial Domainijistjournal
 
Call for Research Articles - 5th International conference on Advanced Natural...
Call for Research Articles - 5th International conference on Advanced Natural...Call for Research Articles - 5th International conference on Advanced Natural...
Call for Research Articles - 5th International conference on Advanced Natural...ijistjournal
 
International Journal of Information Sciences and Techniques (IJIST)
International Journal of Information Sciences and Techniques (IJIST)International Journal of Information Sciences and Techniques (IJIST)
International Journal of Information Sciences and Techniques (IJIST)ijistjournal
 
Research Article Submission - International Journal of Information Sciences a...
Research Article Submission - International Journal of Information Sciences a...Research Article Submission - International Journal of Information Sciences a...
Research Article Submission - International Journal of Information Sciences a...ijistjournal
 
Design and Implementation of LZW Data Compression Algorithm
Design and Implementation of LZW Data Compression AlgorithmDesign and Implementation of LZW Data Compression Algorithm
Design and Implementation of LZW Data Compression Algorithmijistjournal
 
Online Paper Submission - 5th International Conference on Soft Computing, Art...
Online Paper Submission - 5th International Conference on Soft Computing, Art...Online Paper Submission - 5th International Conference on Soft Computing, Art...
Online Paper Submission - 5th International Conference on Soft Computing, Art...ijistjournal
 

Plus de ijistjournal (20)

International Journal of Information Sciences and Techniques (IJIST)
International Journal of Information Sciences and Techniques (IJIST)International Journal of Information Sciences and Techniques (IJIST)
International Journal of Information Sciences and Techniques (IJIST)
 
BRAIN TUMOR MRIIMAGE CLASSIFICATION WITH FEATURE SELECTION AND EXTRACTION USI...
BRAIN TUMOR MRIIMAGE CLASSIFICATION WITH FEATURE SELECTION AND EXTRACTION USI...BRAIN TUMOR MRIIMAGE CLASSIFICATION WITH FEATURE SELECTION AND EXTRACTION USI...
BRAIN TUMOR MRIIMAGE CLASSIFICATION WITH FEATURE SELECTION AND EXTRACTION USI...
 
Research Article Submission - International Journal of Information Sciences a...
Research Article Submission - International Journal of Information Sciences a...Research Article Submission - International Journal of Information Sciences a...
Research Article Submission - International Journal of Information Sciences a...
 
A MEDIAN BASED DIRECTIONAL CASCADED WITH MASK FILTER FOR REMOVAL OF RVIN
A MEDIAN BASED DIRECTIONAL CASCADED WITH MASK FILTER FOR REMOVAL OF RVINA MEDIAN BASED DIRECTIONAL CASCADED WITH MASK FILTER FOR REMOVAL OF RVIN
A MEDIAN BASED DIRECTIONAL CASCADED WITH MASK FILTER FOR REMOVAL OF RVIN
 
DECEPTION AND RACISM IN THE TUSKEGEE SYPHILIS STUDY
DECEPTION AND RACISM IN THE TUSKEGEE  SYPHILIS STUDYDECEPTION AND RACISM IN THE TUSKEGEE  SYPHILIS STUDY
DECEPTION AND RACISM IN THE TUSKEGEE SYPHILIS STUDY
 
Online Paper Submission - International Journal of Information Sciences and T...
Online Paper Submission - International Journal of Information Sciences and T...Online Paper Submission - International Journal of Information Sciences and T...
Online Paper Submission - International Journal of Information Sciences and T...
 
A NOVEL APPROACH FOR SEGMENTATION OF SECTOR SCAN SONAR IMAGES USING ADAPTIVE ...
A NOVEL APPROACH FOR SEGMENTATION OF SECTOR SCAN SONAR IMAGES USING ADAPTIVE ...A NOVEL APPROACH FOR SEGMENTATION OF SECTOR SCAN SONAR IMAGES USING ADAPTIVE ...
A NOVEL APPROACH FOR SEGMENTATION OF SECTOR SCAN SONAR IMAGES USING ADAPTIVE ...
 
Call for Papers - International Journal of Information Sciences and Technique...
Call for Papers - International Journal of Information Sciences and Technique...Call for Papers - International Journal of Information Sciences and Technique...
Call for Papers - International Journal of Information Sciences and Technique...
 
Online Paper Submission - 4th International Conference on NLP & Data Mining (...
Online Paper Submission - 4th International Conference on NLP & Data Mining (...Online Paper Submission - 4th International Conference on NLP & Data Mining (...
Online Paper Submission - 4th International Conference on NLP & Data Mining (...
 
TOWARDS AUTOMATIC DETECTION OF SENTIMENTS IN CUSTOMER REVIEWS
TOWARDS AUTOMATIC DETECTION OF SENTIMENTS IN CUSTOMER REVIEWSTOWARDS AUTOMATIC DETECTION OF SENTIMENTS IN CUSTOMER REVIEWS
TOWARDS AUTOMATIC DETECTION OF SENTIMENTS IN CUSTOMER REVIEWS
 
Research Articles Submission - International Journal of Information Sciences ...
Research Articles Submission - International Journal of Information Sciences ...Research Articles Submission - International Journal of Information Sciences ...
Research Articles Submission - International Journal of Information Sciences ...
 
ANALYSIS OF IMAGE WATERMARKING USING LEAST SIGNIFICANT BIT ALGORITHM
ANALYSIS OF IMAGE WATERMARKING USING LEAST SIGNIFICANT BIT ALGORITHMANALYSIS OF IMAGE WATERMARKING USING LEAST SIGNIFICANT BIT ALGORITHM
ANALYSIS OF IMAGE WATERMARKING USING LEAST SIGNIFICANT BIT ALGORITHM
 
Call for Research Articles - 6th International Conference on Machine Learning...
Call for Research Articles - 6th International Conference on Machine Learning...Call for Research Articles - 6th International Conference on Machine Learning...
Call for Research Articles - 6th International Conference on Machine Learning...
 
Online Paper Submission - International Journal of Information Sciences and T...
Online Paper Submission - International Journal of Information Sciences and T...Online Paper Submission - International Journal of Information Sciences and T...
Online Paper Submission - International Journal of Information Sciences and T...
 
Colour Image Steganography Based on Pixel Value Differencing in Spatial Domain
Colour Image Steganography Based on Pixel Value Differencing in Spatial DomainColour Image Steganography Based on Pixel Value Differencing in Spatial Domain
Colour Image Steganography Based on Pixel Value Differencing in Spatial Domain
 
Call for Research Articles - 5th International conference on Advanced Natural...
Call for Research Articles - 5th International conference on Advanced Natural...Call for Research Articles - 5th International conference on Advanced Natural...
Call for Research Articles - 5th International conference on Advanced Natural...
 
International Journal of Information Sciences and Techniques (IJIST)
International Journal of Information Sciences and Techniques (IJIST)International Journal of Information Sciences and Techniques (IJIST)
International Journal of Information Sciences and Techniques (IJIST)
 
Research Article Submission - International Journal of Information Sciences a...
Research Article Submission - International Journal of Information Sciences a...Research Article Submission - International Journal of Information Sciences a...
Research Article Submission - International Journal of Information Sciences a...
 
Design and Implementation of LZW Data Compression Algorithm
Design and Implementation of LZW Data Compression AlgorithmDesign and Implementation of LZW Data Compression Algorithm
Design and Implementation of LZW Data Compression Algorithm
 
Online Paper Submission - 5th International Conference on Soft Computing, Art...
Online Paper Submission - 5th International Conference on Soft Computing, Art...Online Paper Submission - 5th International Conference on Soft Computing, Art...
Online Paper Submission - 5th International Conference on Soft Computing, Art...
 

Dernier

Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfhans926745
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 

Dernier (20)

Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

Quantification of Portrayal Concepts using tf-idf Weighting

  • 1. International Journal of Information Sciences and Techniques (IJIST) Vol.3, No.5, September 2013 DOI : 10.5121/ijist.2013.3501 1 Quantification of Portrayal Concepts using tf-idf Weighting S. Florence Vijila 1 and Dr. K. Nirmala 2 1 Research Scholar, Manonmaniam Sundaranar University, Tamil Nadu, India, Assistant Professor , CSI Ewart Women’s Christian College, Melrosapuram , TamilNadu 2 Associate Prof. of Computer Applications, Quaid-e-Millath Govt. College for Women, Chennai, Tamil Nadu, India ABSTRACT Term frequencies and inverse document frequencies have been successfully applied in determining weighting for document rankings. However these have been more successful in text mining and in extraction techniques used in the web. Concept mining has become increasingly popular in the research and application areas of Computer Science. This paper attempts to demonstrate the limited usage of term frequency and inverse document frequency for the application of weighting calculations for ranking documents that are based on concept quantifications. The case study considered for experiment in this paper, is based on concept terms of David Merrill’s First Principles of Instruction (FPI). Merrill’s FPI applies cognitive structures explicitly for analyzing instructional materials. Therefore it is justified that the terms categorized under each cognitive structure (or portrayal) of FPI can be taken as respective concept of that portrayal. As question papers are representative of cognitive structures in a more clear and logical way, four question papers on ‘C Language’ have been considered for the experimental study, that are detailed in this paper. Manual method has been adopted for the computation of quantities of portrayals in selected documents for the purpose of comparative study. As manual method is accurate, the values (results) are considered as benchmark values. These benchmark values are considered for comparing with normalized term frequencies that are derived (experimented) from automated extractions from the same selected documents. The study is however limited to four documents only. Conclusions are drawn from this experimental study, which will be of immense use to concept mining researchers as well as for instructional designers. KEYWORDS concept keywords; document ranking; term frequency weighting; cognitive structures. 1. INTRODUCTION Literatures are aplenty for document ranking using term weighting. Documents are retrieved using keywords and documents are ranked according to keyword density (Masaru Ohba et al - 2005, SA-Kwang Song et al -2012). According to documented literature, ranking of documents based on term weighting has been a key research issue in information retrieval, particularly retrieving concepts in addition to keywords. Applying weighting schemes using term frequency and inverse document frequency are crucial for ranking of documents (Sa-Kwang Song et al- 2012). A concept keyword is a word that represents a concept. But the nature of concept and the
  • 2. International Journal of Information Sciences and Techniques (IJIST) Vol.3, No.5, September 2013 2 meaning a particular keyword carries are subjective judgment, as per these authors. The authors have adapted term frequency for mining concepts. Both human selected concepts and automated methodologies have been successfully adopted. But research publications point out that concept mining ultimately depends only upon manual methods, when accuracy is needed to a high level. In other words, automated methods would only provide approximate results. The tf-idf weighting (term frequency-inverse document frequency, a.k.a.TF-IDF) is a numerical statistic which reflects how important a word is to a document in a collection or corpus (Salton et al - 1988). By convention, the tf-idf value increases proportionally to the number of times a word appears in a document, but is offset by the frequency of the word in the corpus, which helps to control for the fact that some words are generally more common than others. In view of this, the factor itself may be considered as a weighting factor in information retrieval and text mining. This paper hence attempts to determine the acceptability of tf-idf, through experimentation, on quantifying concepts from documents. Comparative study will be carried out from the results of manual methods. In concept mining techniques, variations of the tf-idf weighting scheme are often used by some search engines as a central tool in scoring and ranking a document’s relevance given by a user query. Tf-idf can be successfully used for stop-words filtering in various subject fields including text summarization and classification. This paper attempts to quantify well proven conceptual terms, such as cognitive structures from instructional documents. David Merrill (2007), in his ‘First Principles of Instruction’ (FPI) divides any instructional event into four phases, which he calls ‘Activation’, ’Demonstration’, ’Application’ and ‘Integration’. Central to this instructional model is a real-time problem-solving theme, called ‘problem’. Merrill suggests that fundamental principles of instructional design should be relied on and these apply regardless of any instructional design model used. Violating this would produce a decrement in learning and performance. Each phase is a cognitive portrayal for learning a concept according to this author. Therefore it would be very apt to consider cognitive structure or portrayal terms as concepts for the purpose of quantification. We propose to rank order four sets of concept keywords of these four cognitive structures (portrayals) of Merrill’s FPI. The rank ordering would be carried out using normalized term frequency (tf) and inverse document frequency (idf) on selected University question paper documents of “C Language’. To ascertain the degree of accuracy (the objective of this paper), we propose to compare the results with manual methods. Conclusions have been drawn from this experimental study. The results will be of immense use to concept mining methodologists and for instructional designers. The material of this paper forms a part of a whole research programme on concept based quantification of portrayals by the authors. As these portrayals are taken for concept quantification, the meaning of each portrayal is essential to comprehend before attempting to experiment with. 2. ‘ACTIVATION’ PORTRAYAL ON EVALUATION DOCUMENT OF ‘C’ LANGUAGE The ‘Activation’ portrayal of FPI specifies, “ Does the question direct students to recall, remember, repeat or recognize basic and fundamental knowledge of ‘C’ from the relevant past
  • 3. International Journal of Information Sciences and Techniques (IJIST) Vol.3, No.5, September 2013 3 behavior (experience) of the students that can be used as a foundation for the acquisition of new (and further advanced) knowledge of ‘C’?“. Based on the above definition and on the past ten year question paper patterns analyzed by the authors, the following concept keywords have been specifically arrived at for the computation of term frequency: “list, define, tell, name, locate, identify, what, give, distinguish, acquire, underline, relate, state, recall, select, repeat, recognize, reproduce and measure”. 3. ‘DEMONSTRATION’ PORTRAYAL ON EVALUATION DOCUMENT OF ‘C’ LANGUAGE The ‘Demonstration’ portrayal of FPI specifies, “Does the question directs the students to demonstrate of what has he learnt rather than merely provide information (unlike ’Activation’) about what is needed as foundation for further techniques of ‘C’?“. Based on the above definition and the past ten year question papers pattern analyzed by the authors, the following concept keywords have been exclusively arrived at for the computation of term frequency: “demonstrate, summarize, illustrate, interpret, contrast, predict, associate, distinguish, identify, show, label, collect, experiment, recite, classify, discuss, select, compare, translate, prepare, change, rephrase, differentiate, draw, explain, estimate, fill in, choose, operate, perform, organize and write”. 4. ‘APPLICATION’ PORTRAYAL ON EVALUATION DOCUMENT OF ‘C’ LANGUAGE The ‘Application’ portrayal of FPI specifies, ”Does the question provides an opportunity for student to apply her acquired knowledge of ‘C’ to solve a problem?”. Based on this definition and from analysis carried out authors on the past ten year question papers, the following concept keywords have been exclusively arrived at for the computation of term frequencies: “apply, calculate, find, solve, illustrate, make, predict, construct, assess, practice, restrictive, classify, code, develop, generate, write”. 5. ‘INTEGRATION’ PORTRAYAL ON EVALUATION DOCUMENT OF ‘C’ LANGUAGE The ‘Integration’ portrayal of FPI specifies, ”Does the question triggers and encourages the student to integrate (transfer) her acquired knowledge of ‘C’ to a new situation in any critical and complex situation?”. Based on this definition and from the study on past ten year question papers, the following concept keywords have been arrived at exclusively for the computation of term frequencies: “analyze, solve, justify, infer, combine, integrate, plan, generalize, assess, decide, rank, grade, recommend, contrast, survey, examine, investigate, compose, invent, improve, imagine, hypothesize, predict, evaluate, rate, how and why”.
  • 4. International Journal of Information Sciences and Techniques (IJIST) Vol.3, No.5, September 2013 4 Using these keywords, quantified benchmark values on the selected documents have been arrived at through manual methods. These benchmark values would be compared with automated term frequencies so as to arrive at conclusions. 6. COMPUTATION OF BENCHMARK VALUES BY MANUAL METHOD Based on the definitions of the portrayals of FPI (four cognitive structures), quantification of concept terms in terms of these four cognitive structures on the selected four question paper documents (samples) were carried out using concept keywords as provided above. Thus, the addition of all values (in percentages) would yield to 100%, as the method adapted is pure manual by the authors. Therefore the values need to be exact and hence taken as benchmark values for comparative studies. The values are provided in Table 1. Portrayal Documents Average Q1 Q2 Q3 Q4 Activation 48% 27% 30% 31% 34% Demonstration 22% 28% 42% 33% 31% Application 26% 36% 21% 29% 28% Integration 4% 9% 7% 7% 7% Table 1. Benchmark values in percentages computed manually 7. RESULTS AND DISCUSSIONS ON WEIGHTING USING TERM FREQUENCIES The four selected question papers (documents) of the subject ‘C Language’ have been used for the intended analysis. The normalized term frequency and inverse document frequencies on these selected four concept keywords, namely ‘Activation’, ’Demonstration’, ’Application’ and ’Integration’ using their respective keywords have been used for the calculation of two frequencies. These term frequencies are used for determining the weighting of these four portrayals that we have computed through inverse document frequencies. The total number of words, the most frequented word and the number of concepts (different keywords of each concept or portrayal) are used for the calculation of weightings (Wikipedia 2012). They are computed and presented for each document (i.e. question paper). They are presented in a nut shell in Table 2. The basic variables for term frequency computations are: Number of concept terms occurring in one document = i; Number of most frequently occurred word in one document = n; Normalized term frequency, tf = i/n ;_______________(1) The basic data for term frequencies are presented in Table 2.
  • 5. International Journal of Information Sciences and Techniques (IJIST) Vol.3, No.5, September 2013 5 Document No. Total words No. of most frequented word(n) No. of concept terms appearing (i) Activation Demonstration Application Integration Q1 187 15 11 5 6 1 Q2 206 25 6 7 8 2 Q3 184 16 9 12 6 2 Q4 173 21 6 10 8 2 Total no. of concept terms appearing 32 34 28 7 Table 2.Basic Data for Term Frequency The total number of documents considered is N, which is 4. The normalized term frequencies for each concept term of each document is presented in Table 3 (refer to equation 1). Concept Normalized term frequency Total Average Q1 Q2 Q3 Q4 Activation 0.7333 0.2400 0.5625 0.2857 1.8215 0.4554 Demonstration 0.3333 0.2800 0.7500 0.4762 1.8395 0.4599 Application 0.4000 0.3200 0.3750 0.0952 1.1902 0.2976 Integration 0.0667 0.0800 0.1250 0.952 0.3669 0.0917 Table 3. Normalized Term Frequencies The inverse document frequency is computed as: idf = log (N/K) ;_______________(2) Where N = Total number of documents, which is 4 and K: Number of occurrences of terms in all the documents considered (see total concept terms appearing in Table 2). The final results are :- Activation = -0.903089 Demonstration = -0.92959 Application = -0.845098 Integration = -0.243038 The negative sign is due to the fact that the total number of documents considered is less than the occurrences of terms in all the documents. The objective of this research work is to compare these frequency values of concept terms with respect to the benchmarks that were calculated based on manual methods. As the study is limited to comparisons, the idf is considered with modular values. Thus weighting would be tf.│log (N/K)│;_______________(3)
  • 6. International Journal of Information Sciences and Techniques (IJIST) Vol.3, No.5, September 2013 6 Accordingly, the weighting for each concept term of each document is computed and presented in Table 4 Concept Weighting of terms Average Q1 Q2 Q3 Q4 Activation 0.6622 0.2167 0.5080 0.2580 0.4113 Demonstration 0.3098 0.2603 0.6972 0.4427 0.4275 Application 0.3380 0.2704 0.3169 0.0805 0.2515 Integration 0.0162 0.0194 0.0304 0.2313 0.0223 Table 4. Weighting of portrayals in each document The above values yield important conclusions. 8. CONCLUSIONS The largest and the least present cognitive portrayals in every document as per benchmark values tally well with weighting of term frequency of all question paper documents considered. It is thus concluded that normalized term frequencies may be used for quantifying concept terms in individualized documents to an acceptable standards. However the benchmark values deviate from that of weighting term frequency on average values. This clearly shows that unlike quantification of keywords, quantifying concept words may not be reliable when inverse document frequency is computed on small number of documents. The work may be extended to large number of documents for further study. REFERENCES [1] Salton G, Buckley C (1988), "Term-weighting approaches in automatic text retrieval". Information Processing and Management 24 (5): 513–523, 1988. [2] Sa-kwang Song, and Sung Hyon Myaeng, (2012), “A novel term weighting scheme based on discrimination power obtained from past retrieval results.”, Information Processing and Management 48 (2012) 919–930, 2012. [3] Masaru Ohba and Katsuhiko Gondow, (2005), “Toward Mining ‘Concept Keywords’ from Identifiers in Large Software Projects”, ACM SIGSOFT Software Engineering Notes 30(4): 1-5, 2005. [4] Merrill M.D., (2002), “First Principles of Instruction”, Englewood Cliffs, NJ: Educational Technology Publications, 2002. [5] Wikipedia (2012), http://en.wikipedia.org/wiki/Tf*idf, 2012 Authors S.Florence Vijila is working as Assistant Professor in the Department t of Computer Science at CSI Ewart Women’s Christian College,Tamil Nadu for the past 12 years.Her qualification is M.CA,M.Phil,Now she is doing Ph.D in the area of “Data Mining”.Her work have been published in the International conference on ”Innovations in contemporary IT research “ proceedings.