SlideShare une entreprise Scribd logo
1  sur  5
Télécharger pour lire hors ligne
International Journal of Advanced Research in Engineering and Technology (IJARET), ISSN 0976 –
6480(Print), ISSN 0976 – 6499(Online) Volume 5, Issue 6, June (2014), pp. 40-44 © IAEME
40
SPAM DETECTION IN SOCIAL BOOKMARKING SYSTEM– EMPIRICAL
EVALUATION
Mittal Sejpal1
, Priyank Thakkar2
Department of Computer Science & Engineering, Institute of Technology
Nirma University, Ahmedabad, 382481, Gujarat, India
ABSTRACT
Social Bookmarking web sites have recently become popular for collecting and sharing of
interesting web sites among users. People can add web pages to such sites, as bookmarks and allow
themselves as well as others to work on them. One of the key features of the social bookmarking
sites is the ability of annotating a web page when it is being bookmarked. The annotation usually
contains a set of words or phrases, which are collectively known as tags that could reveal the
semantics of the annotated web page. Efficient and effective search of web pages can then be
achieved via such tags. However, spam tags that are irrelevant to the content of web pages often
appear to deceive other users for malicious or commercial purposes. There are users who
intentionally assign spam tags. Manual detection of such users is very difficult.In this paper, main
focus is on the detection of spam users in Social Bookmarking System. Experimental Evaluation is
done using ECM PKDD discovery challenge 2008 data set. Experimentation using naïve Bayes and
K-Nearest Neighbour classifiers on all three Information Retrieval (IR) models (Boolean, bag-of-
words and TFIDF) gives promising results.
Keywords: Spam Detection, Social Bookmarking Systems, Feature Selection.
I. INTRODUCTION
Social Bookmarking Systems have gained high popularity now a day. With this growing
popularity, the spam users have started exploiting tags used during annotating process for malicious
and commercial purposes. In social bookmarking websites, users can store and access their
bookmarks online through a web interface. The stored information is sharable among users, allowing
for improved searching. Any system that is highly dependent on user-generated content is vulnerable
INTERNATIONAL JOURNAL OF ADVANCED RESEARCH
IN ENGINEERING AND TECHNOLOGY (IJARET)
ISSN 0976 - 6480 (Print)
ISSN 0976 - 6499 (Online)
Volume 5, Issue 6, June (2014), pp. 40-44
© IAEME: http://www.iaeme.com/IJARET.asp
Journal Impact Factor (2014): 7.8273 (Calculated by GISI)
www.jifactor.com
IJARET
© I A E M E
International Journal of Advanced Research in Engineering and Technology (IJARET), ISSN 0976 –
6480(Print), ISSN 0976 – 6499(Online) Volume 5, Issue 6, June (2014), pp. 40-44 © IAEME
41
to spam in one form or another [3]. Social bookmarking websites provide a large and continuously
growing pool of potential customers. In fact, a spam post is more attractive than a non-spam post [5].
Spamming in Social Bookmarking Systems has become a crucial challenge affecting both
users and service providers. Users face spamming obstacles while doing activities like web based
searching [10]. The main challenge is that the characteristics and behavior of spam user’s change
over time, hence, maintaining the rules for detecting spam is a very difficult task. It is very difficult
to manually detect spam users because of huge number of user’s data etc. [10].
The data set used in this work is taken from the ECML PKDD discovery challenge 2008
composed of tags, bookmarks and bibtex post of users. Researchers with the help of this data set can
now conduct supervised learning and reliable evaluation of the task.
In this paper, the objective is to automate the task of spam user detection in social
bookmarking systems. The paper is organized as follows. In Section II, related work is described. In
Section III, brief description of classification methods used is given. Section IV consists of
implementation methodology. The paper finally ends with conclusions in section V.
II. RELATED WORK
Spam Detection in Social Bookmarking system is a relatively new research area and the
literature is still sparse. A brief review of related work and the recent shift of attention toward social
spam are discussed here.
A method for spam detection using language models based on the intuitive notion that similar
users and posts tend to use the same language was proposed in [3]. Authors in [5], proposed a novel,
fast and accurate supervised learning method as a general text classification algorithm for linearly
separated data for the task of spam detection. A machine learning-based approach to automate spam
detection was proposed in [6].
An algorithm to identify spammers from the collaborating systems by employing a spam
score propagating technique was proposed in [7]. The problem of learning to classify texts was
addressed by exploiting information derived from both training and testing sets in [8]. To accomplish
this, clustering was used as a complementary step to text classification, and was applied not only to
the training set but also to the testing set [8].
III. CLASSIFICATION METHODS
In this paper, K-Nearest Neighbor (KNN) and Naïve Bayes (NB) is used for the task of spam
user detection in social bookmarking system.
KNN Classifier
KNN is an algorithm that is extremely easy to see yet works extraordinarily well in practice.
Likewise, it is astonishingly adaptable and it has already been applied in various domains. For the
test instance, it predicts the class based on the class of k-nearest neighbours from the training set.
That is, k instances from the training set which are most similar to the testing instance are utilized to
predict the class label of test instance.
To find the distance or similarity between instances, one can use different distance measures
such as Euclidean distance, Hamming distance, Cityblock distance, cosine similarity etc. Details can
be found in [12].
Naïve Bayes Classifier
The Naive Bayes Classifier technique is based on the so-called Bayesian theorem and is
particularly suited when the dimensionality of the inputs is high. It considers supervised learning
International Journal of Advanced Research in Engineering and Technology (IJARET), ISSN 0976 –
6480(Print), ISSN 0976 – 6499(Online) Volume 5, Issue 6, June (2014), pp. 40-44 © IAEME
42
from a probabilistic point of view. The task of classification can be regarded as estimating the class
posterior probabilities given a test example.
Naive-Bayes classifier assumes class conditional independence. Given test instance X,
Bayesian classifier predicts the probability of belonging to a specific class ‫ܥ‬௜. To anticipate
probability it utilizes idea of Bayes' theorem. Bayes' theorem is valuable in that it gives a method for
computing the posterior probability,ܲሺ‫ܺ|ܥ‬ሻ, from ܲሺ‫ܥ‬ሻ, ܲሺܺ|‫ܥ‬ሻ, and ܲሺܺሻ. Bayes' theorem states
that
ܲሺ‫ܺ|ܥ‬ሻ ൌ
ܲሺܺ|‫ܥ‬ሻܲሺ‫ܥ‬ሻ
ܲሺܺሻ
where,
ܲሺܺ|‫ܥ‬ሻ ൌ ܲሺ‫ݔ‬ଵ, ‫ݔ‬ଶ, ‫ݔ‬ଷ, … ‫ݔ‬௡│‫ܥ‬ሻ ൌ ෑ ܲሺ‫ݔ‬௜|‫ܥ‬ሻ
௡
௜ୀଵ
Here, ‫ݔ‬ଵ, ‫ݔ‬ଶ, … , ‫ݔ‬௡ are features describing X. An advantage of naive Bayes is that it only
requires a small amount of training data to estimate the parameters necessary for classification.
Based on the type of data, appropriate distribution is fitted to the data to evaluate ܲሺ‫ݔ‬௜|‫ܥ‬ሻ. Details
can be found in [11], [12].
IV. EXPERIMENTAL METHODOLOGY
Performance Parameters
Accuracy is used as the parameter for evaluating the performance of the classification
algorithms. Accuracy of a classifier on a given test set is the percentage of test set items that are
correctly classified by the classifier. In this paper, non-spammer user is considered as positive class
and spammer user is considered as negative class. Accordingly, True Positive (TP), False Negative
(FN), False Positive (FP) and True Negative (TN) are defined as under [11].
TP (True Positives): It refers to the number of positive items that are correctly labelled by the
classifier.
FP (False Positives): It refers to the number of negative items that are incorrectly labelled as positive.
TN (True Negatives): It refers to the number of negative items that are correctly labelled by the
classifier.
FN (False Negatives): It refers to the number of positive items that are mislabelled as negative.
Base on the above interpretations, accuracy is defined by the following equation.
‫ݕܿܽݎݑܿܿܣ‬ ൌ
ܶܲ ൅ ܶܰ
ܶܲ ൅ ܶܰ ൅ ‫ܲܨ‬ ൅ ‫ܰܨ‬
Dataset
The assessment of the proposed method is carried out using ECML PKDD Discovery
Challenge 2008 Data set.The Data set consists of 7171 unique users. The tables in the data set are:
International Journal of Advanced Research in Engineering and Technology (IJARET), ISSN 0976 –
6480(Print), ISSN 0976 – 6499(Online) Volume 5, Issue 6, June (2014), pp. 40-44 © IAEME
43
tas, tas_spam
bookmark, bookmark_spam
bibtex, bibtex_spam
user
The user table contained flag denoting whether the user is spammer or non-spammer. This is
helpful to us for supervised learning.
Preprocessing
Pre-processing of data set is necessary to improve the quality of data thereby helping to
improve subsequent classification processes. Each of the user is described by means of the tags
posted by him. First of all, all the special characters are removed and all the characters are converted
to lower-case letters. Stop words which are not important in describing tag profile of the user are
than removed. For each of the user, Boolean, bag-of-words and TFIDF profile is prepared. Details
can be found in [12].
V. RESULTS AND DISCUSSIONS
Table 1 shows the accuracy of NB and KNN for the task of bookmark spam user detection. It
can be seen that best results are achieved in case of naïve-Bayes classifier when multivariate
Bernoulli distribution is fitted to the data. The other observation is that KNN classifier works best for
all the representations while naïve-Bayes works best for the Boolean representation of the user
vector.
Table 1: Accuracy of all three IR models considering all 71911 attributes
Classifier
Accuracy for Different IR Models
Boolean Word Count TFIDF
Naïve Bayes 97.544 78.297 84.395
KNN 97.423 96.931 96.984
VI. CONCLUSIONS AND FUTURE WORK
This paper focuses on the task of detecting spam users in social bookmarking system. The
problem is modelled as classification task. Naïve Bayes and K-Nearest Neighbor classifiers are used
as the classification techniques. Experiments are carried out with three IR models (Boolean, bag-of-
words and TFIDF).It is evident from the results that KNN classifier works well for all the three IR
models while naïve-Bayes works the best for Boolean representation of the users.
The future work would be to work on combining the interrelationship between tags with
feature selection for better classification performance.
REFERENCES
[1] Benjamin Markines, Ciro Cattuto, Filippo Menczer, “Social Spam Detection”, Proceedings of
the 5th International Workshop on Adversarial Information Retrieval on the Web, pp. 41-48,
2009.
[2] Stephan Doerfel, Andreas Hotho, Robert Jaschke, Folke Mitzlaff, Juergen Mueller, “ECML
PKDD Discovery Challenge”, 2008.
International Journal of Advanced Research in Engineering and Technology (IJARET), ISSN 0976 –
6480(Print), ISSN 0976 – 6499(Online) Volume 5, Issue 6, June (2014), pp. 40-44 © IAEME
44
[3] Toine Bogers, Antal Van Den Bosch, “Using Language Modelling for Spam Detection in
Social Reference Manager Websites", Proceedings of the 9th Belgian-Dutch Information
Retrieval Workshop, pp. 87-94, 2009.
[4] Steven Young, Itamar Arel, Thomas P Karnowski, Derek Rose, “A Fast and Stable
Incremental Clustering Algorithm", Information Technology: New Generations (ITNG), pp.
204-209, 2010.
[5] AnestisGkanogiannis and Theodore Kalamboukis, “A Novel Supervised Learning Algorithm
and its use for Spam Detection in Social Bookmarking Systems", ECML PKDD Discovery
Challenge, 2008.
[6] Chanju Kim and Kyu-Baek Hwang, “Naive Bayes Classifier Learning with Feature Selection
for Spam Detection in Social Bookmarking", ECML PKDD Discovery Challenge, 2008.
[7] Ralf Krestel and Ling Chen, “Using Co-occurrence of Tags and Resources to Identify
Spammers”, ECML/PKDD Discovery Challenge Workshop, pp. 38-46, 2008.
[8] Antonia Kyriakopoulou and Theodore Kalamboukis, “Combining Clustering with
Classification for Spam Detection in Social Bookmarking Systems",ECML/PKDD Discovery
Challenge Workshop, pp. 47-54,2008.
[9] Beate Krause, Christoph Schmitz, Andreas Hotho, Gerd Stumme, “The Anti-Social Tagger –
Detecting Spam in Social Bookmarking Systems", Proceedings of the 4th International
Workshop on Adversarial Information Retrieval on the Web”, pp. 61-68, 2008.
[10] Amgad Madkour, Tarek Hefni, Ahmed Hefny, Khaled S Refaat, “Using Semantic Features to
Detect Spamming in Social Bookmarking Systems", ECML PKDD Discovery Challenge
Workshop, 2008.
[11] J. P. Jiawei Han, Micheline Kamber, “Data Mining Concepts and Techniques”. Morgan
Kaufmann, 3 Edition, July 2011.
[12] Zdravko Markov, Daniel T Larose, “Data Mining the Web: Uncovering Patterns in Web
Content, Structure, and Usage”, John Wiley & Sons, 2007.
[13] Radhika Kotecha, Vijay Ukani, Sanjay Garg, “An empirical analysis of multiclass
classification techniques in data mining”, NUiCONE 2011, pp. 1-5, 2011.
[14] Rutu Joshi and Priyank Thakkar, “Experimental Evaluation of Different Classification
Techniques for Web Page Classification”, International Journal of Advanced Research in
Engineering & Technology (IJARET), Volume 5, Issue 5, 2014, pp. 91 - 101, ISSN Print:
0976-6480, ISSN Online: 0976-6499.
[15] Priyank Thakkar, Samir Kariya and K Kotecha, “Web Page Clustering using Cemetery
Organization Behavior of Ants”, International Journal of Advanced Research in Engineering
& Technology (IJARET), Volume 5, Issue 1, 2014, pp. 7 - 17, ISSN Print: 0976-6480,
ISSN Online: 0976-6499.
[16] Prajakta Ozarkar and Dr. Manasi Patwardhan, “Efficient Spam Classification by Appropriate
Feature Selection”, International Journal of Computer Engineering & Technology (IJCET),
Volume 4, Issue 3, 2013, pp. 123 - 139, ISSN Print: 0976 – 6367, ISSN Online: 0976 – 6375.
[17] C.R. Cyril Anthoni and Dr. A. Christy, “Integration of Feature Sets with Machine Learning
Techniques for Spam Filtering”, International Journal of Computer Engineering &
Technology (IJCET), Volume 2, Issue 1, 2011, pp. 47 - 52, ISSN Print: 0976 – 6367,
ISSN Online: 0976 – 6375.
[18] R. Manickam, D. Boominath and V. Bhuvaneswari, “An Analysis of Data Mining: Past,
Present and Future”, International Journal of Computer Engineering & Technology (IJCET),
Volume 3, Issue 1, 2012, pp. 1 - 9, ISSN Print: 0976 – 6367, ISSN Online: 0976 – 6375.

Contenu connexe

Tendances

A Survey on Sentiment Categorization of Movie Reviews
A Survey on Sentiment Categorization of Movie ReviewsA Survey on Sentiment Categorization of Movie Reviews
A Survey on Sentiment Categorization of Movie ReviewsEditor IJMTER
 
Sentiment Analysis of Twitter Data
Sentiment Analysis of Twitter DataSentiment Analysis of Twitter Data
Sentiment Analysis of Twitter DataIRJET Journal
 
Extraction of Data Using Comparable Entity Mining
Extraction of Data Using Comparable Entity MiningExtraction of Data Using Comparable Entity Mining
Extraction of Data Using Comparable Entity Miningiosrjce
 
Feature selection, optimization and clustering strategies of text documents
Feature selection, optimization and clustering strategies of text documentsFeature selection, optimization and clustering strategies of text documents
Feature selection, optimization and clustering strategies of text documentsIJECEIAES
 
11.hybrid ga svm for efficient feature selection in e-mail classification
11.hybrid ga svm for efficient feature selection in e-mail classification11.hybrid ga svm for efficient feature selection in e-mail classification
11.hybrid ga svm for efficient feature selection in e-mail classificationAlexander Decker
 
Hybrid ga svm for efficient feature selection in e-mail classification
Hybrid ga svm for efficient feature selection in e-mail classificationHybrid ga svm for efficient feature selection in e-mail classification
Hybrid ga svm for efficient feature selection in e-mail classificationAlexander Decker
 
Prediction of User Rare Sequential Topic Patterns of Internet Users
Prediction of User Rare Sequential Topic Patterns of Internet UsersPrediction of User Rare Sequential Topic Patterns of Internet Users
Prediction of User Rare Sequential Topic Patterns of Internet UsersIRJET Journal
 
Text Extraction from Image Using GAMMA Correction Method.
Text Extraction from Image Using GAMMA Correction Method.Text Extraction from Image Using GAMMA Correction Method.
Text Extraction from Image Using GAMMA Correction Method.IRJET Journal
 
TEXT SENTIMENTS FOR FORUMS HOTSPOT DETECTION
TEXT SENTIMENTS FOR FORUMS HOTSPOT DETECTIONTEXT SENTIMENTS FOR FORUMS HOTSPOT DETECTION
TEXT SENTIMENTS FOR FORUMS HOTSPOT DETECTIONijistjournal
 
OpinionMiner: A Novel Machine Learning System for Web Opinion Mining and Extr...
OpinionMiner: A Novel Machine Learning System for Web Opinion Mining and Extr...OpinionMiner: A Novel Machine Learning System for Web Opinion Mining and Extr...
OpinionMiner: A Novel Machine Learning System for Web Opinion Mining and Extr...Content Savvy
 
On the benefit of logic-based machine learning to learn pairwise comparisons
On the benefit of logic-based machine learning to learn pairwise comparisonsOn the benefit of logic-based machine learning to learn pairwise comparisons
On the benefit of logic-based machine learning to learn pairwise comparisonsjournalBEEI
 
A Deep Analysis on Prevailing Spam Mail Filteration Machine Learning Approaches
A Deep Analysis on Prevailing Spam Mail Filteration Machine Learning ApproachesA Deep Analysis on Prevailing Spam Mail Filteration Machine Learning Approaches
A Deep Analysis on Prevailing Spam Mail Filteration Machine Learning Approachesijtsrd
 
View the Microsoft Word document.doc
View the Microsoft Word document.docView the Microsoft Word document.doc
View the Microsoft Word document.docbutest
 
IRJET- Text Document Clustering using K-Means Algorithm
IRJET-  	  Text Document Clustering using K-Means Algorithm IRJET-  	  Text Document Clustering using K-Means Algorithm
IRJET- Text Document Clustering using K-Means Algorithm IRJET Journal
 
FAST FUZZY FEATURE CLUSTERING FOR TEXT CLASSIFICATION
FAST FUZZY FEATURE CLUSTERING FOR TEXT CLASSIFICATION FAST FUZZY FEATURE CLUSTERING FOR TEXT CLASSIFICATION
FAST FUZZY FEATURE CLUSTERING FOR TEXT CLASSIFICATION cscpconf
 
Profile Analysis of Users in Data Analytics Domain
Profile Analysis of   Users in Data Analytics DomainProfile Analysis of   Users in Data Analytics Domain
Profile Analysis of Users in Data Analytics DomainDrjabez
 
News Reliability Evaluation using Latent Semantic Analysis
News Reliability Evaluation using Latent Semantic AnalysisNews Reliability Evaluation using Latent Semantic Analysis
News Reliability Evaluation using Latent Semantic AnalysisTELKOMNIKA JOURNAL
 
Social media recommendation based on people and tags (final)
Social media recommendation based on people and tags (final)Social media recommendation based on people and tags (final)
Social media recommendation based on people and tags (final)es712
 

Tendances (18)

A Survey on Sentiment Categorization of Movie Reviews
A Survey on Sentiment Categorization of Movie ReviewsA Survey on Sentiment Categorization of Movie Reviews
A Survey on Sentiment Categorization of Movie Reviews
 
Sentiment Analysis of Twitter Data
Sentiment Analysis of Twitter DataSentiment Analysis of Twitter Data
Sentiment Analysis of Twitter Data
 
Extraction of Data Using Comparable Entity Mining
Extraction of Data Using Comparable Entity MiningExtraction of Data Using Comparable Entity Mining
Extraction of Data Using Comparable Entity Mining
 
Feature selection, optimization and clustering strategies of text documents
Feature selection, optimization and clustering strategies of text documentsFeature selection, optimization and clustering strategies of text documents
Feature selection, optimization and clustering strategies of text documents
 
11.hybrid ga svm for efficient feature selection in e-mail classification
11.hybrid ga svm for efficient feature selection in e-mail classification11.hybrid ga svm for efficient feature selection in e-mail classification
11.hybrid ga svm for efficient feature selection in e-mail classification
 
Hybrid ga svm for efficient feature selection in e-mail classification
Hybrid ga svm for efficient feature selection in e-mail classificationHybrid ga svm for efficient feature selection in e-mail classification
Hybrid ga svm for efficient feature selection in e-mail classification
 
Prediction of User Rare Sequential Topic Patterns of Internet Users
Prediction of User Rare Sequential Topic Patterns of Internet UsersPrediction of User Rare Sequential Topic Patterns of Internet Users
Prediction of User Rare Sequential Topic Patterns of Internet Users
 
Text Extraction from Image Using GAMMA Correction Method.
Text Extraction from Image Using GAMMA Correction Method.Text Extraction from Image Using GAMMA Correction Method.
Text Extraction from Image Using GAMMA Correction Method.
 
TEXT SENTIMENTS FOR FORUMS HOTSPOT DETECTION
TEXT SENTIMENTS FOR FORUMS HOTSPOT DETECTIONTEXT SENTIMENTS FOR FORUMS HOTSPOT DETECTION
TEXT SENTIMENTS FOR FORUMS HOTSPOT DETECTION
 
OpinionMiner: A Novel Machine Learning System for Web Opinion Mining and Extr...
OpinionMiner: A Novel Machine Learning System for Web Opinion Mining and Extr...OpinionMiner: A Novel Machine Learning System for Web Opinion Mining and Extr...
OpinionMiner: A Novel Machine Learning System for Web Opinion Mining and Extr...
 
On the benefit of logic-based machine learning to learn pairwise comparisons
On the benefit of logic-based machine learning to learn pairwise comparisonsOn the benefit of logic-based machine learning to learn pairwise comparisons
On the benefit of logic-based machine learning to learn pairwise comparisons
 
A Deep Analysis on Prevailing Spam Mail Filteration Machine Learning Approaches
A Deep Analysis on Prevailing Spam Mail Filteration Machine Learning ApproachesA Deep Analysis on Prevailing Spam Mail Filteration Machine Learning Approaches
A Deep Analysis on Prevailing Spam Mail Filteration Machine Learning Approaches
 
View the Microsoft Word document.doc
View the Microsoft Word document.docView the Microsoft Word document.doc
View the Microsoft Word document.doc
 
IRJET- Text Document Clustering using K-Means Algorithm
IRJET-  	  Text Document Clustering using K-Means Algorithm IRJET-  	  Text Document Clustering using K-Means Algorithm
IRJET- Text Document Clustering using K-Means Algorithm
 
FAST FUZZY FEATURE CLUSTERING FOR TEXT CLASSIFICATION
FAST FUZZY FEATURE CLUSTERING FOR TEXT CLASSIFICATION FAST FUZZY FEATURE CLUSTERING FOR TEXT CLASSIFICATION
FAST FUZZY FEATURE CLUSTERING FOR TEXT CLASSIFICATION
 
Profile Analysis of Users in Data Analytics Domain
Profile Analysis of   Users in Data Analytics DomainProfile Analysis of   Users in Data Analytics Domain
Profile Analysis of Users in Data Analytics Domain
 
News Reliability Evaluation using Latent Semantic Analysis
News Reliability Evaluation using Latent Semantic AnalysisNews Reliability Evaluation using Latent Semantic Analysis
News Reliability Evaluation using Latent Semantic Analysis
 
Social media recommendation based on people and tags (final)
Social media recommendation based on people and tags (final)Social media recommendation based on people and tags (final)
Social media recommendation based on people and tags (final)
 

En vedette

Modified synthesis strategy for vowels and semi vowels klatt synthesize
Modified synthesis strategy for vowels and semi vowels klatt synthesizeModified synthesis strategy for vowels and semi vowels klatt synthesize
Modified synthesis strategy for vowels and semi vowels klatt synthesizeIAEME Publication
 
Web metrics-cell-carrier-buzz-on-the-web-12063
Web metrics-cell-carrier-buzz-on-the-web-12063Web metrics-cell-carrier-buzz-on-the-web-12063
Web metrics-cell-carrier-buzz-on-the-web-12063Lextant
 
Design implementation and comparison of various cmos charge pumps
Design implementation and comparison of various cmos charge pumpsDesign implementation and comparison of various cmos charge pumps
Design implementation and comparison of various cmos charge pumpsIAEME Publication
 
Designing of telecommand system using system on chip soc for spacecraft contr...
Designing of telecommand system using system on chip soc for spacecraft contr...Designing of telecommand system using system on chip soc for spacecraft contr...
Designing of telecommand system using system on chip soc for spacecraft contr...IAEME Publication
 

En vedette (9)

Modified synthesis strategy for vowels and semi vowels klatt synthesize
Modified synthesis strategy for vowels and semi vowels klatt synthesizeModified synthesis strategy for vowels and semi vowels klatt synthesize
Modified synthesis strategy for vowels and semi vowels klatt synthesize
 
Web metrics-cell-carrier-buzz-on-the-web-12063
Web metrics-cell-carrier-buzz-on-the-web-12063Web metrics-cell-carrier-buzz-on-the-web-12063
Web metrics-cell-carrier-buzz-on-the-web-12063
 
30420140503002
3042014050300230420140503002
30420140503002
 
10120140506002
1012014050600210120140506002
10120140506002
 
Design implementation and comparison of various cmos charge pumps
Design implementation and comparison of various cmos charge pumpsDesign implementation and comparison of various cmos charge pumps
Design implementation and comparison of various cmos charge pumps
 
Designing of telecommand system using system on chip soc for spacecraft contr...
Designing of telecommand system using system on chip soc for spacecraft contr...Designing of telecommand system using system on chip soc for spacecraft contr...
Designing of telecommand system using system on chip soc for spacecraft contr...
 
50120140507008
5012014050700850120140507008
50120140507008
 
50320140502004
5032014050200450320140502004
50320140502004
 
20120140506009
2012014050600920120140506009
20120140506009
 

Similaire à IJARET Spam Detection Social Bookmarking

IRJET- Personality Recognition using Multi-Label Classification
IRJET- Personality Recognition using Multi-Label ClassificationIRJET- Personality Recognition using Multi-Label Classification
IRJET- Personality Recognition using Multi-Label ClassificationIRJET Journal
 
An Approach for Malicious Spam Detection in Email with Comparison of Differen...
An Approach for Malicious Spam Detection in Email with Comparison of Differen...An Approach for Malicious Spam Detection in Email with Comparison of Differen...
An Approach for Malicious Spam Detection in Email with Comparison of Differen...IRJET Journal
 
IRJET - Student Pass Percentage Dedection using Ensemble Learninng
IRJET  - Student Pass Percentage Dedection using Ensemble LearninngIRJET  - Student Pass Percentage Dedection using Ensemble Learninng
IRJET - Student Pass Percentage Dedection using Ensemble LearninngIRJET Journal
 
Evaluating and Enhancing Efficiency of Recommendation System using Big Data A...
Evaluating and Enhancing Efficiency of Recommendation System using Big Data A...Evaluating and Enhancing Efficiency of Recommendation System using Big Data A...
Evaluating and Enhancing Efficiency of Recommendation System using Big Data A...IRJET Journal
 
Email Spam Detection Using Machine Learning
Email Spam Detection Using Machine LearningEmail Spam Detection Using Machine Learning
Email Spam Detection Using Machine LearningIRJET Journal
 
Detailed structure applicable to hybrid recommendation technique
Detailed structure applicable to hybrid recommendation techniqueDetailed structure applicable to hybrid recommendation technique
Detailed structure applicable to hybrid recommendation techniqueIAEME Publication
 
Experimental Result Analysis of Text Categorization using Clustering and Clas...
Experimental Result Analysis of Text Categorization using Clustering and Clas...Experimental Result Analysis of Text Categorization using Clustering and Clas...
Experimental Result Analysis of Text Categorization using Clustering and Clas...ijtsrd
 
IRJET- Scalable Content Aware Collaborative Filtering for Location Recommenda...
IRJET- Scalable Content Aware Collaborative Filtering for Location Recommenda...IRJET- Scalable Content Aware Collaborative Filtering for Location Recommenda...
IRJET- Scalable Content Aware Collaborative Filtering for Location Recommenda...IRJET Journal
 
IRJET-A Novel Technic to Notice Spam Reviews on e-Shopping
IRJET-A Novel Technic to Notice Spam Reviews on e-ShoppingIRJET-A Novel Technic to Notice Spam Reviews on e-Shopping
IRJET-A Novel Technic to Notice Spam Reviews on e-ShoppingIRJET Journal
 
A Brief Survey on Recommendation System for a Gradient Classifier based Inade...
A Brief Survey on Recommendation System for a Gradient Classifier based Inade...A Brief Survey on Recommendation System for a Gradient Classifier based Inade...
A Brief Survey on Recommendation System for a Gradient Classifier based Inade...Christo Ananth
 
Sentiment Analysis on Twitter Data
Sentiment Analysis on Twitter DataSentiment Analysis on Twitter Data
Sentiment Analysis on Twitter DataIRJET Journal
 
Survey in Online Social Media Skelton by Network based Spam
Survey in Online Social Media Skelton by Network based SpamSurvey in Online Social Media Skelton by Network based Spam
Survey in Online Social Media Skelton by Network based SpamIRJET Journal
 
Survey on Software Data Reduction Techniques Accomplishing Bug Triage
Survey on Software Data Reduction Techniques Accomplishing Bug TriageSurvey on Software Data Reduction Techniques Accomplishing Bug Triage
Survey on Software Data Reduction Techniques Accomplishing Bug TriageIRJET Journal
 
Feature Subset Selection for High Dimensional Data using Clustering Techniques
Feature Subset Selection for High Dimensional Data using Clustering TechniquesFeature Subset Selection for High Dimensional Data using Clustering Techniques
Feature Subset Selection for High Dimensional Data using Clustering TechniquesIRJET Journal
 
Vertical intent prediction approach based on Doc2vec and convolutional neural...
Vertical intent prediction approach based on Doc2vec and convolutional neural...Vertical intent prediction approach based on Doc2vec and convolutional neural...
Vertical intent prediction approach based on Doc2vec and convolutional neural...IJECEIAES
 
Bug Triage: An Automated Process
Bug Triage: An Automated ProcessBug Triage: An Automated Process
Bug Triage: An Automated ProcessIRJET Journal
 
IRJET- Opinion Mining using Supervised and Unsupervised Machine Learning Appr...
IRJET- Opinion Mining using Supervised and Unsupervised Machine Learning Appr...IRJET- Opinion Mining using Supervised and Unsupervised Machine Learning Appr...
IRJET- Opinion Mining using Supervised and Unsupervised Machine Learning Appr...IRJET Journal
 
A survey of memory based methods for collaborative filtering based techniques
A survey of memory based methods for collaborative filtering based techniquesA survey of memory based methods for collaborative filtering based techniques
A survey of memory based methods for collaborative filtering based techniquesIAEME Publication
 
IRJET- Analysis of Brand Value Prediction based on Social Media Data
IRJET-  	  Analysis of Brand Value Prediction based on Social Media DataIRJET-  	  Analysis of Brand Value Prediction based on Social Media Data
IRJET- Analysis of Brand Value Prediction based on Social Media DataIRJET Journal
 

Similaire à IJARET Spam Detection Social Bookmarking (20)

IRJET- Personality Recognition using Multi-Label Classification
IRJET- Personality Recognition using Multi-Label ClassificationIRJET- Personality Recognition using Multi-Label Classification
IRJET- Personality Recognition using Multi-Label Classification
 
An Approach for Malicious Spam Detection in Email with Comparison of Differen...
An Approach for Malicious Spam Detection in Email with Comparison of Differen...An Approach for Malicious Spam Detection in Email with Comparison of Differen...
An Approach for Malicious Spam Detection in Email with Comparison of Differen...
 
IRJET - Student Pass Percentage Dedection using Ensemble Learninng
IRJET  - Student Pass Percentage Dedection using Ensemble LearninngIRJET  - Student Pass Percentage Dedection using Ensemble Learninng
IRJET - Student Pass Percentage Dedection using Ensemble Learninng
 
Evaluating and Enhancing Efficiency of Recommendation System using Big Data A...
Evaluating and Enhancing Efficiency of Recommendation System using Big Data A...Evaluating and Enhancing Efficiency of Recommendation System using Big Data A...
Evaluating and Enhancing Efficiency of Recommendation System using Big Data A...
 
Email Spam Detection Using Machine Learning
Email Spam Detection Using Machine LearningEmail Spam Detection Using Machine Learning
Email Spam Detection Using Machine Learning
 
Detailed structure applicable to hybrid recommendation technique
Detailed structure applicable to hybrid recommendation techniqueDetailed structure applicable to hybrid recommendation technique
Detailed structure applicable to hybrid recommendation technique
 
Experimental Result Analysis of Text Categorization using Clustering and Clas...
Experimental Result Analysis of Text Categorization using Clustering and Clas...Experimental Result Analysis of Text Categorization using Clustering and Clas...
Experimental Result Analysis of Text Categorization using Clustering and Clas...
 
IRJET- Scalable Content Aware Collaborative Filtering for Location Recommenda...
IRJET- Scalable Content Aware Collaborative Filtering for Location Recommenda...IRJET- Scalable Content Aware Collaborative Filtering for Location Recommenda...
IRJET- Scalable Content Aware Collaborative Filtering for Location Recommenda...
 
IRJET-A Novel Technic to Notice Spam Reviews on e-Shopping
IRJET-A Novel Technic to Notice Spam Reviews on e-ShoppingIRJET-A Novel Technic to Notice Spam Reviews on e-Shopping
IRJET-A Novel Technic to Notice Spam Reviews on e-Shopping
 
A Brief Survey on Recommendation System for a Gradient Classifier based Inade...
A Brief Survey on Recommendation System for a Gradient Classifier based Inade...A Brief Survey on Recommendation System for a Gradient Classifier based Inade...
A Brief Survey on Recommendation System for a Gradient Classifier based Inade...
 
Sentiment Analysis on Twitter Data
Sentiment Analysis on Twitter DataSentiment Analysis on Twitter Data
Sentiment Analysis on Twitter Data
 
50120140505015 2
50120140505015 250120140505015 2
50120140505015 2
 
Survey in Online Social Media Skelton by Network based Spam
Survey in Online Social Media Skelton by Network based SpamSurvey in Online Social Media Skelton by Network based Spam
Survey in Online Social Media Skelton by Network based Spam
 
Survey on Software Data Reduction Techniques Accomplishing Bug Triage
Survey on Software Data Reduction Techniques Accomplishing Bug TriageSurvey on Software Data Reduction Techniques Accomplishing Bug Triage
Survey on Software Data Reduction Techniques Accomplishing Bug Triage
 
Feature Subset Selection for High Dimensional Data using Clustering Techniques
Feature Subset Selection for High Dimensional Data using Clustering TechniquesFeature Subset Selection for High Dimensional Data using Clustering Techniques
Feature Subset Selection for High Dimensional Data using Clustering Techniques
 
Vertical intent prediction approach based on Doc2vec and convolutional neural...
Vertical intent prediction approach based on Doc2vec and convolutional neural...Vertical intent prediction approach based on Doc2vec and convolutional neural...
Vertical intent prediction approach based on Doc2vec and convolutional neural...
 
Bug Triage: An Automated Process
Bug Triage: An Automated ProcessBug Triage: An Automated Process
Bug Triage: An Automated Process
 
IRJET- Opinion Mining using Supervised and Unsupervised Machine Learning Appr...
IRJET- Opinion Mining using Supervised and Unsupervised Machine Learning Appr...IRJET- Opinion Mining using Supervised and Unsupervised Machine Learning Appr...
IRJET- Opinion Mining using Supervised and Unsupervised Machine Learning Appr...
 
A survey of memory based methods for collaborative filtering based techniques
A survey of memory based methods for collaborative filtering based techniquesA survey of memory based methods for collaborative filtering based techniques
A survey of memory based methods for collaborative filtering based techniques
 
IRJET- Analysis of Brand Value Prediction based on Social Media Data
IRJET-  	  Analysis of Brand Value Prediction based on Social Media DataIRJET-  	  Analysis of Brand Value Prediction based on Social Media Data
IRJET- Analysis of Brand Value Prediction based on Social Media Data
 

Plus de IAEME Publication

IAEME_Publication_Call_for_Paper_September_2022.pdf
IAEME_Publication_Call_for_Paper_September_2022.pdfIAEME_Publication_Call_for_Paper_September_2022.pdf
IAEME_Publication_Call_for_Paper_September_2022.pdfIAEME Publication
 
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...IAEME Publication
 
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURSA STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURSIAEME Publication
 
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURSBROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURSIAEME Publication
 
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONSDETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONSIAEME Publication
 
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONSANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONSIAEME Publication
 
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINOVOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINOIAEME Publication
 
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...IAEME Publication
 
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMYVISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMYIAEME Publication
 
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...IAEME Publication
 
GANDHI ON NON-VIOLENT POLICE
GANDHI ON NON-VIOLENT POLICEGANDHI ON NON-VIOLENT POLICE
GANDHI ON NON-VIOLENT POLICEIAEME Publication
 
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...IAEME Publication
 
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...IAEME Publication
 
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...IAEME Publication
 
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...IAEME Publication
 
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...IAEME Publication
 
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...IAEME Publication
 
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...IAEME Publication
 
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...IAEME Publication
 
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENTA MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENTIAEME Publication
 

Plus de IAEME Publication (20)

IAEME_Publication_Call_for_Paper_September_2022.pdf
IAEME_Publication_Call_for_Paper_September_2022.pdfIAEME_Publication_Call_for_Paper_September_2022.pdf
IAEME_Publication_Call_for_Paper_September_2022.pdf
 
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
 
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURSA STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
 
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURSBROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
 
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONSDETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
 
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONSANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
 
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINOVOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
 
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
 
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMYVISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
 
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
 
GANDHI ON NON-VIOLENT POLICE
GANDHI ON NON-VIOLENT POLICEGANDHI ON NON-VIOLENT POLICE
GANDHI ON NON-VIOLENT POLICE
 
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
 
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
 
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
 
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
 
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
 
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
 
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
 
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
 
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENTA MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
 

Dernier

08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 

Dernier (20)

08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 

IJARET Spam Detection Social Bookmarking

  • 1. International Journal of Advanced Research in Engineering and Technology (IJARET), ISSN 0976 – 6480(Print), ISSN 0976 – 6499(Online) Volume 5, Issue 6, June (2014), pp. 40-44 © IAEME 40 SPAM DETECTION IN SOCIAL BOOKMARKING SYSTEM– EMPIRICAL EVALUATION Mittal Sejpal1 , Priyank Thakkar2 Department of Computer Science & Engineering, Institute of Technology Nirma University, Ahmedabad, 382481, Gujarat, India ABSTRACT Social Bookmarking web sites have recently become popular for collecting and sharing of interesting web sites among users. People can add web pages to such sites, as bookmarks and allow themselves as well as others to work on them. One of the key features of the social bookmarking sites is the ability of annotating a web page when it is being bookmarked. The annotation usually contains a set of words or phrases, which are collectively known as tags that could reveal the semantics of the annotated web page. Efficient and effective search of web pages can then be achieved via such tags. However, spam tags that are irrelevant to the content of web pages often appear to deceive other users for malicious or commercial purposes. There are users who intentionally assign spam tags. Manual detection of such users is very difficult.In this paper, main focus is on the detection of spam users in Social Bookmarking System. Experimental Evaluation is done using ECM PKDD discovery challenge 2008 data set. Experimentation using naïve Bayes and K-Nearest Neighbour classifiers on all three Information Retrieval (IR) models (Boolean, bag-of- words and TFIDF) gives promising results. Keywords: Spam Detection, Social Bookmarking Systems, Feature Selection. I. INTRODUCTION Social Bookmarking Systems have gained high popularity now a day. With this growing popularity, the spam users have started exploiting tags used during annotating process for malicious and commercial purposes. In social bookmarking websites, users can store and access their bookmarks online through a web interface. The stored information is sharable among users, allowing for improved searching. Any system that is highly dependent on user-generated content is vulnerable INTERNATIONAL JOURNAL OF ADVANCED RESEARCH IN ENGINEERING AND TECHNOLOGY (IJARET) ISSN 0976 - 6480 (Print) ISSN 0976 - 6499 (Online) Volume 5, Issue 6, June (2014), pp. 40-44 © IAEME: http://www.iaeme.com/IJARET.asp Journal Impact Factor (2014): 7.8273 (Calculated by GISI) www.jifactor.com IJARET © I A E M E
  • 2. International Journal of Advanced Research in Engineering and Technology (IJARET), ISSN 0976 – 6480(Print), ISSN 0976 – 6499(Online) Volume 5, Issue 6, June (2014), pp. 40-44 © IAEME 41 to spam in one form or another [3]. Social bookmarking websites provide a large and continuously growing pool of potential customers. In fact, a spam post is more attractive than a non-spam post [5]. Spamming in Social Bookmarking Systems has become a crucial challenge affecting both users and service providers. Users face spamming obstacles while doing activities like web based searching [10]. The main challenge is that the characteristics and behavior of spam user’s change over time, hence, maintaining the rules for detecting spam is a very difficult task. It is very difficult to manually detect spam users because of huge number of user’s data etc. [10]. The data set used in this work is taken from the ECML PKDD discovery challenge 2008 composed of tags, bookmarks and bibtex post of users. Researchers with the help of this data set can now conduct supervised learning and reliable evaluation of the task. In this paper, the objective is to automate the task of spam user detection in social bookmarking systems. The paper is organized as follows. In Section II, related work is described. In Section III, brief description of classification methods used is given. Section IV consists of implementation methodology. The paper finally ends with conclusions in section V. II. RELATED WORK Spam Detection in Social Bookmarking system is a relatively new research area and the literature is still sparse. A brief review of related work and the recent shift of attention toward social spam are discussed here. A method for spam detection using language models based on the intuitive notion that similar users and posts tend to use the same language was proposed in [3]. Authors in [5], proposed a novel, fast and accurate supervised learning method as a general text classification algorithm for linearly separated data for the task of spam detection. A machine learning-based approach to automate spam detection was proposed in [6]. An algorithm to identify spammers from the collaborating systems by employing a spam score propagating technique was proposed in [7]. The problem of learning to classify texts was addressed by exploiting information derived from both training and testing sets in [8]. To accomplish this, clustering was used as a complementary step to text classification, and was applied not only to the training set but also to the testing set [8]. III. CLASSIFICATION METHODS In this paper, K-Nearest Neighbor (KNN) and Naïve Bayes (NB) is used for the task of spam user detection in social bookmarking system. KNN Classifier KNN is an algorithm that is extremely easy to see yet works extraordinarily well in practice. Likewise, it is astonishingly adaptable and it has already been applied in various domains. For the test instance, it predicts the class based on the class of k-nearest neighbours from the training set. That is, k instances from the training set which are most similar to the testing instance are utilized to predict the class label of test instance. To find the distance or similarity between instances, one can use different distance measures such as Euclidean distance, Hamming distance, Cityblock distance, cosine similarity etc. Details can be found in [12]. Naïve Bayes Classifier The Naive Bayes Classifier technique is based on the so-called Bayesian theorem and is particularly suited when the dimensionality of the inputs is high. It considers supervised learning
  • 3. International Journal of Advanced Research in Engineering and Technology (IJARET), ISSN 0976 – 6480(Print), ISSN 0976 – 6499(Online) Volume 5, Issue 6, June (2014), pp. 40-44 © IAEME 42 from a probabilistic point of view. The task of classification can be regarded as estimating the class posterior probabilities given a test example. Naive-Bayes classifier assumes class conditional independence. Given test instance X, Bayesian classifier predicts the probability of belonging to a specific class ‫ܥ‬௜. To anticipate probability it utilizes idea of Bayes' theorem. Bayes' theorem is valuable in that it gives a method for computing the posterior probability,ܲሺ‫ܺ|ܥ‬ሻ, from ܲሺ‫ܥ‬ሻ, ܲሺܺ|‫ܥ‬ሻ, and ܲሺܺሻ. Bayes' theorem states that ܲሺ‫ܺ|ܥ‬ሻ ൌ ܲሺܺ|‫ܥ‬ሻܲሺ‫ܥ‬ሻ ܲሺܺሻ where, ܲሺܺ|‫ܥ‬ሻ ൌ ܲሺ‫ݔ‬ଵ, ‫ݔ‬ଶ, ‫ݔ‬ଷ, … ‫ݔ‬௡│‫ܥ‬ሻ ൌ ෑ ܲሺ‫ݔ‬௜|‫ܥ‬ሻ ௡ ௜ୀଵ Here, ‫ݔ‬ଵ, ‫ݔ‬ଶ, … , ‫ݔ‬௡ are features describing X. An advantage of naive Bayes is that it only requires a small amount of training data to estimate the parameters necessary for classification. Based on the type of data, appropriate distribution is fitted to the data to evaluate ܲሺ‫ݔ‬௜|‫ܥ‬ሻ. Details can be found in [11], [12]. IV. EXPERIMENTAL METHODOLOGY Performance Parameters Accuracy is used as the parameter for evaluating the performance of the classification algorithms. Accuracy of a classifier on a given test set is the percentage of test set items that are correctly classified by the classifier. In this paper, non-spammer user is considered as positive class and spammer user is considered as negative class. Accordingly, True Positive (TP), False Negative (FN), False Positive (FP) and True Negative (TN) are defined as under [11]. TP (True Positives): It refers to the number of positive items that are correctly labelled by the classifier. FP (False Positives): It refers to the number of negative items that are incorrectly labelled as positive. TN (True Negatives): It refers to the number of negative items that are correctly labelled by the classifier. FN (False Negatives): It refers to the number of positive items that are mislabelled as negative. Base on the above interpretations, accuracy is defined by the following equation. ‫ݕܿܽݎݑܿܿܣ‬ ൌ ܶܲ ൅ ܶܰ ܶܲ ൅ ܶܰ ൅ ‫ܲܨ‬ ൅ ‫ܰܨ‬ Dataset The assessment of the proposed method is carried out using ECML PKDD Discovery Challenge 2008 Data set.The Data set consists of 7171 unique users. The tables in the data set are:
  • 4. International Journal of Advanced Research in Engineering and Technology (IJARET), ISSN 0976 – 6480(Print), ISSN 0976 – 6499(Online) Volume 5, Issue 6, June (2014), pp. 40-44 © IAEME 43 tas, tas_spam bookmark, bookmark_spam bibtex, bibtex_spam user The user table contained flag denoting whether the user is spammer or non-spammer. This is helpful to us for supervised learning. Preprocessing Pre-processing of data set is necessary to improve the quality of data thereby helping to improve subsequent classification processes. Each of the user is described by means of the tags posted by him. First of all, all the special characters are removed and all the characters are converted to lower-case letters. Stop words which are not important in describing tag profile of the user are than removed. For each of the user, Boolean, bag-of-words and TFIDF profile is prepared. Details can be found in [12]. V. RESULTS AND DISCUSSIONS Table 1 shows the accuracy of NB and KNN for the task of bookmark spam user detection. It can be seen that best results are achieved in case of naïve-Bayes classifier when multivariate Bernoulli distribution is fitted to the data. The other observation is that KNN classifier works best for all the representations while naïve-Bayes works best for the Boolean representation of the user vector. Table 1: Accuracy of all three IR models considering all 71911 attributes Classifier Accuracy for Different IR Models Boolean Word Count TFIDF Naïve Bayes 97.544 78.297 84.395 KNN 97.423 96.931 96.984 VI. CONCLUSIONS AND FUTURE WORK This paper focuses on the task of detecting spam users in social bookmarking system. The problem is modelled as classification task. Naïve Bayes and K-Nearest Neighbor classifiers are used as the classification techniques. Experiments are carried out with three IR models (Boolean, bag-of- words and TFIDF).It is evident from the results that KNN classifier works well for all the three IR models while naïve-Bayes works the best for Boolean representation of the users. The future work would be to work on combining the interrelationship between tags with feature selection for better classification performance. REFERENCES [1] Benjamin Markines, Ciro Cattuto, Filippo Menczer, “Social Spam Detection”, Proceedings of the 5th International Workshop on Adversarial Information Retrieval on the Web, pp. 41-48, 2009. [2] Stephan Doerfel, Andreas Hotho, Robert Jaschke, Folke Mitzlaff, Juergen Mueller, “ECML PKDD Discovery Challenge”, 2008.
  • 5. International Journal of Advanced Research in Engineering and Technology (IJARET), ISSN 0976 – 6480(Print), ISSN 0976 – 6499(Online) Volume 5, Issue 6, June (2014), pp. 40-44 © IAEME 44 [3] Toine Bogers, Antal Van Den Bosch, “Using Language Modelling for Spam Detection in Social Reference Manager Websites", Proceedings of the 9th Belgian-Dutch Information Retrieval Workshop, pp. 87-94, 2009. [4] Steven Young, Itamar Arel, Thomas P Karnowski, Derek Rose, “A Fast and Stable Incremental Clustering Algorithm", Information Technology: New Generations (ITNG), pp. 204-209, 2010. [5] AnestisGkanogiannis and Theodore Kalamboukis, “A Novel Supervised Learning Algorithm and its use for Spam Detection in Social Bookmarking Systems", ECML PKDD Discovery Challenge, 2008. [6] Chanju Kim and Kyu-Baek Hwang, “Naive Bayes Classifier Learning with Feature Selection for Spam Detection in Social Bookmarking", ECML PKDD Discovery Challenge, 2008. [7] Ralf Krestel and Ling Chen, “Using Co-occurrence of Tags and Resources to Identify Spammers”, ECML/PKDD Discovery Challenge Workshop, pp. 38-46, 2008. [8] Antonia Kyriakopoulou and Theodore Kalamboukis, “Combining Clustering with Classification for Spam Detection in Social Bookmarking Systems",ECML/PKDD Discovery Challenge Workshop, pp. 47-54,2008. [9] Beate Krause, Christoph Schmitz, Andreas Hotho, Gerd Stumme, “The Anti-Social Tagger – Detecting Spam in Social Bookmarking Systems", Proceedings of the 4th International Workshop on Adversarial Information Retrieval on the Web”, pp. 61-68, 2008. [10] Amgad Madkour, Tarek Hefni, Ahmed Hefny, Khaled S Refaat, “Using Semantic Features to Detect Spamming in Social Bookmarking Systems", ECML PKDD Discovery Challenge Workshop, 2008. [11] J. P. Jiawei Han, Micheline Kamber, “Data Mining Concepts and Techniques”. Morgan Kaufmann, 3 Edition, July 2011. [12] Zdravko Markov, Daniel T Larose, “Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage”, John Wiley & Sons, 2007. [13] Radhika Kotecha, Vijay Ukani, Sanjay Garg, “An empirical analysis of multiclass classification techniques in data mining”, NUiCONE 2011, pp. 1-5, 2011. [14] Rutu Joshi and Priyank Thakkar, “Experimental Evaluation of Different Classification Techniques for Web Page Classification”, International Journal of Advanced Research in Engineering & Technology (IJARET), Volume 5, Issue 5, 2014, pp. 91 - 101, ISSN Print: 0976-6480, ISSN Online: 0976-6499. [15] Priyank Thakkar, Samir Kariya and K Kotecha, “Web Page Clustering using Cemetery Organization Behavior of Ants”, International Journal of Advanced Research in Engineering & Technology (IJARET), Volume 5, Issue 1, 2014, pp. 7 - 17, ISSN Print: 0976-6480, ISSN Online: 0976-6499. [16] Prajakta Ozarkar and Dr. Manasi Patwardhan, “Efficient Spam Classification by Appropriate Feature Selection”, International Journal of Computer Engineering & Technology (IJCET), Volume 4, Issue 3, 2013, pp. 123 - 139, ISSN Print: 0976 – 6367, ISSN Online: 0976 – 6375. [17] C.R. Cyril Anthoni and Dr. A. Christy, “Integration of Feature Sets with Machine Learning Techniques for Spam Filtering”, International Journal of Computer Engineering & Technology (IJCET), Volume 2, Issue 1, 2011, pp. 47 - 52, ISSN Print: 0976 – 6367, ISSN Online: 0976 – 6375. [18] R. Manickam, D. Boominath and V. Bhuvaneswari, “An Analysis of Data Mining: Past, Present and Future”, International Journal of Computer Engineering & Technology (IJCET), Volume 3, Issue 1, 2012, pp. 1 - 9, ISSN Print: 0976 – 6367, ISSN Online: 0976 – 6375.