Contact Us: #42/5, 1st Floor, 18th Cross, 21st Main, Vijayanagar, Bangalore-40
Landmark: Near Maruthi Mandir ; www.adritsolutions.com
ADRIT SOLUTIONS
Ph: 9845252155 ; 7676768124 Email: adritsolutions@gmail.com
JAVA IEEE 2016-15 Data Mining Projects
1. IEEE 2016: SPORE: A Sequential Personalized Spatial Item Recommender System
Abstract: With the rapid development of location-based social networks (LBSNs), spatial
item recommendation has become an important way of helping users discover interesting
locations to increase their engagement with location-based services. Although human
movement exhibits sequential patterns in LBSNs, most current studies on spatial item
recommendations do not consider the sequential influence of locations. Leveraging
sequential patterns in spatial item recommendation is, however, very challenging,
considering 1) users’ check-in data in LBSNs has a low sampling rate in both space and time,
which renders existing prediction techniques on GPS trajectories ineffective; 2) the
prediction space is extremely large, with millions of distinct locations as the next prediction
target, which impedes the application of classical Markov chain models; and 3) there is no
existing framework that unifies users’ personal interests and the sequential influence in a
principled manner. In light of the above challenges, we propose a sequential personalized
spatial item recommendation framework (SPORE) which introduces a novel latent variable
topic-region to model and fuse sequential influence with personal interests in the latent and
exponential space. The advantages of modeling the sequential effect at the topic-region level
include a significantly reduced prediction space, an effective alleviation of data sparsity and a
direct expression of the semantic meaning of users’ spatial activities. Furthermore, we design
an asymmetric Locality Sensitive Hashing (ALSH) technique to speed up the online top-k
recommendation process by extending the traditional LSH. We evaluate the performance of
SPORE on two real datasets and one large-scale synthetic dataset. The results demonstrate a
significant improvement in SPORE’s ability to recommend spatial items, in terms of both
effectiveness and efficiency, compared with the state-of-the-art methods.
2. IEEE 2016: Inverted Linear Quadtree: Efficient Top K Spatial Keyword Search
Abstract: With advances in geo-positioning technologies and geo-location services, there is
a rapidly growing number of spatio-textual objects collected in many applications such as
location based services and social networks, in which an object is described by its spatial
location and a set of keywords (terms). Consequently, the study of spatial keyword search
which explores both location and textual description of the objects has attracted great
attention from the commercial organizations and research communities. In the paper, we
study two fundamental problems in the spatial keyword queries: top k spatial keyword search
(TOPK-SK), and batch top k spatial keyword search (BTOPK-SK). Given a set of spatio-
textual objects, a query location and a set of query keywords, the TOPK-SK retrieves the
closest k objects each of which contains all keywords in the query. BTOPK-SK is the batch
processing of sets of TOPK-SK queries. Based on the inverted index and the linear quadtree,
we propose a novel index structure, called inverted linear quadtree (IL-Quadtree), which is
carefully designed to exploit both spatial and keyword based pruning techniques to
effectively reduce the search space. An efficient algorithm is then developed to tackle top k
spatial keyword search. To further enhance the filtering capability of the signature of linear
quadtree, we propose a partition based method. In addition, to deal with BTOPK-SK, we
design a new computing paradigm which partitions the queries into groups based on both
spatial proximity and the textual relevance between queries. We show that the IL-Quadtree
technique can also efficiently support BTOPK-SK. Comprehensive experiments on real and
synthetic data clearly demonstrate the efficiency of our methods.
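The TOPK-SK query defined in this abstract can be illustrated with a brute-force baseline: keep only the objects that contain every query keyword, then return the k closest ones. This is a minimal sketch for understanding the query semantics, not the IL-Quadtree index; the class and method names are hypothetical and Euclidean distance is an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;

// Brute-force baseline for TOPK-SK. All names are illustrative, not from the paper.
final class TopKSpatialKeyword {
    // A spatio-textual object: a location plus a set of keywords.
    static final class SpatialObject {
        final String id;
        final double x;
        final double y;
        final Set<String> keywords;

        SpatialObject(String id, double x, double y, Set<String> keywords) {
            this.id = id;
            this.x = x;
            this.y = y;
            this.keywords = keywords;
        }
    }

    // Returns the k objects closest to (qx, qy) that contain all query keywords.
    static List<SpatialObject> search(List<SpatialObject> objects, double qx, double qy,
                                      Set<String> queryKeywords, int k) {
        List<SpatialObject> matches = new ArrayList<>();
        for (SpatialObject o : objects) {
            if (o.keywords.containsAll(queryKeywords)) { // keyword filter
                matches.add(o);
            }
        }
        // Squared Euclidean distance preserves the distance ordering (assumed metric).
        matches.sort(Comparator.comparingDouble(
                (SpatialObject o) -> (o.x - qx) * (o.x - qx) + (o.y - qy) * (o.y - qy)));
        return new ArrayList<>(matches.subList(0, Math.min(k, matches.size())));
    }

    public static void main(String[] args) {
        List<SpatialObject> objects = List.of(
                new SpatialObject("a", 1, 1, Set.of("cafe", "wifi")),
                new SpatialObject("b", 2, 2, Set.of("cafe")),
                new SpatialObject("c", 5, 5, Set.of("cafe", "wifi")));
        for (SpatialObject o : search(objects, 0, 0, Set.of("cafe", "wifi"), 2)) {
            System.out.println(o.id);
        }
    }
}
```

The baseline scans every object per query, which is the cost that an index combining spatial and keyword pruning is meant to avoid.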
3. IEEE 2016: Truth Discovery in Crowdsourced Detection of Spatial Events
Abstract: The ubiquity of smartphones has led to the emergence of mobile crowdsourcing
tasks such as the detection of spatial events when smartphone users move around in their
daily lives. However, the credibility of those detected events can be negatively impacted by
unreliable participants with low-quality data. Consequently, a major challenge in quality
control is to discover true events from diverse and noisy participants’ reports. This truth
discovery problem is uniquely distinct from its online counterpart in that it involves
uncertainties in both participants’ mobility and reliability. Decoupling these two types of
uncertainties through location tracking will raise severe privacy and energy issues; whereas
simply ignoring missing reports or treating them as negative reports will significantly
degrade the accuracy of the discovered truth. In this paper, we propose a new method to
tackle this truth discovery problem through principled probabilistic modeling. In particular,
we integrate the modeling of location popularity, location visit indicators, truth of events and
three-way participant reliability in a unified framework. The proposed model is thus capable
of efficiently handling various types of uncertainties and automatically discovering truth
without any supervision or the need of location tracking. Experimental results demonstrate
that our proposed method out-performs existing state-of-the-art truth discovery approaches in
the mobile crowdsourcing environment.
4. IEEE 2016: Sentiment Analysis of Top Colleges in India Using Twitter Data
Abstract: In today’s world, opinions and reviews accessible to us are one of the most critical
factors in formulating our views and influencing the success of a brand, product or service.
With the advent and growth of social media in the world, stakeholders often take to
expressing their opinions on popular social media, namely Twitter. While Twitter data is
extremely informative, it presents a challenge for analysis because of its humongous and
disorganized nature. This paper is a thorough effort to dive into the novel domain of
performing sentiment analysis of people’s opinions regarding top colleges in India. Besides
taking additional preprocessing measures like the expansion of net lingo and removal of
duplicate tweets, a probabilistic model based on Bayes’ theorem was used for spelling
correction, which is overlooked in other research studies. This paper also highlights a
comparison between the results obtained by exploiting the following machine learning
algorithms: Naïve Bayes and Support Vector Machine and an Artificial Neural Network
model: Multilayer Perceptron. Furthermore, a contrast has been presented between four
different kernels of SVM: RBF, linear, polynomial and sigmoid.
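The four SVM kernels compared in this abstract can be written out directly. The sketch below uses the standard textbook forms; the parameter names gamma, coef0 and degree are assumptions borrowed from common SVM libraries, and the abstract does not state which values were used.

```java
// Standard forms of the four SVM kernels named in the abstract.
// Parameter names (gamma, coef0, degree) are assumptions, not taken from the paper.
final class SvmKernels {
    static double dot(double[] x, double[] y) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += x[i] * y[i];
        }
        return sum;
    }

    // Linear kernel: K(x, y) = x . y
    static double linear(double[] x, double[] y) {
        return dot(x, y);
    }

    // Polynomial kernel: K(x, y) = (gamma * x . y + coef0)^degree
    static double polynomial(double[] x, double[] y, double gamma, double coef0, int degree) {
        return Math.pow(gamma * dot(x, y) + coef0, degree);
    }

    // RBF kernel: K(x, y) = exp(-gamma * ||x - y||^2)
    static double rbf(double[] x, double[] y, double gamma) {
        double squaredDistance = 0.0;
        for (int i = 0; i < x.length; i++) {
            double d = x[i] - y[i];
            squaredDistance += d * d;
        }
        return Math.exp(-gamma * squaredDistance);
    }

    // Sigmoid kernel: K(x, y) = tanh(gamma * x . y + coef0)
    static double sigmoid(double[] x, double[] y, double gamma, double coef0) {
        return Math.tanh(gamma * dot(x, y) + coef0);
    }

    public static void main(String[] args) {
        double[] x = {1, 2};
        double[] y = {3, 4};
        System.out.println("linear     = " + linear(x, y));
        System.out.println("polynomial = " + polynomial(x, y, 1.0, 1.0, 2));
        System.out.println("rbf        = " + rbf(x, y, 0.5));
        System.out.println("sigmoid    = " + sigmoid(x, y, 0.01, 0.0));
    }
}
```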
5. IEEE 2016: FRAppE: Detecting Malicious Facebook Applications
Abstract: With 20 million installs a day [1], third-party apps are a major reason for the
popularity and addictiveness of Facebook. Unfortunately, hackers have realized the potential
of using apps for spreading malware and spam. The problem is already significant, as we find
that at least 13% of apps in our dataset are malicious. So far, the research community has
focused on detecting malicious posts and campaigns. In this paper, we ask the question:
given a Facebook application, can we determine if it is malicious? Our key contribution is in
developing FRAppE—Facebook’s Rigorous Application Evaluator—arguably the first tool
focused on detecting malicious apps on Facebook. To develop FRAppE, we use information
gathered by observing the posting behavior of 111K Facebook apps seen across 2.2 million
users on Facebook. First, we identify a set of features that help us distinguish malicious apps
from benign ones. For example, we find that malicious apps often share names with other
apps, and they typically request fewer permissions than benign apps. Second, leveraging these
distinguishing features, we show that FRAppE can detect malicious apps with 99.5%
accuracy, with no false positives and a low false negative rate (4.1%). Finally, we explore the
ecosystem of malicious Facebook apps and identify mechanisms that these apps use to
propagate. Interestingly, we find that many apps collude and support each other; in our
dataset, we find 1,584 apps enabling the viral propagation of 3,723 other apps through their
posts. Long-term, we see FRAppE as a step towards creating an independent watchdog for
app assessment and ranking, so as to warn Facebook users before installing apps.
6. IEEE 2016: Practical Approximate k-Nearest Neighbor Queries with Location and
Query Privacy
Abstract: In mobile communication, spatial queries pose a serious threat to user location
privacy because the location of a query may reveal sensitive information about the mobile
user. In this paper, we study approximate k nearest neighbor (kNN) queries where the mobile
user queries the location-based service (LBS) provider about approximate k nearest points of
interest (POIs) on the basis of his current location. We propose a basic solution and a generic
solution for the mobile user to preserve his location and query privacy in approximate kNN
queries. The proposed solutions are mainly built on the Paillier public-key cryptosystem and
can provide both location and query privacy. To preserve query privacy, our basic solution
allows the mobile user to retrieve one type of POIs, for example, approximate k nearest car
parks, without revealing to the LBS provider what type of points is retrieved. Our generic
solution can be applied to multiple discrete type attributes of private location-based queries.
Compared with existing solutions for kNN queries with location privacy, our solution is more
efficient. Experiments have shown that our solution is practical for kNN queries.
7. IEEE 2016: A Novel Pipeline Approach for Efficient Big Data Broadcasting
Abstract: Big-data computing is a new critical challenge for the ICT industry. Engineers and
researchers are dealing with data sets of petabyte scale in the cloud computing paradigm.
Thus, the demand for building a service stack to distribute, manage, and process
massive data sets has risen drastically. In this paper, we investigate
the Big Data Broadcasting problem for a single source node to broadcast a big chunk
of data to a set of nodes with the objective of minimizing the maximum completion time.
These nodes may be located in the same datacenter or across geo-distributed datacenters. This
problem is one of the fundamental problems in distributed computing and is known to be NP-
hard in heterogeneous environments. We model the Big-data broadcasting problem into a
LockStep Broadcast Tree (LSBT) problem. The main idea of the LSBT model is to define a
basic unit of upload bandwidth, r, such that a node with capacity c broadcasts data to a set of
⌊c/r⌋ children at the rate r. Note that r is a parameter to be optimized as part of the LSBT
problem. We further divide the broadcast data into m chunks. These data chunks can then
be broadcast down the LSBT in a pipeline manner. In a homogeneous network environment
in which each node has the same upload capacity c, we show that the optimal uplink rate r*
of LSBT is either c/2 or c/3, whichever gives the smaller maximum completion time. For
heterogeneous environments, we present an O(nlog2n) algorithm to select an optimal uplink
rate r* and to construct an optimal LSBT. Numerical results show that our approach performs
well with less maximum completion time and lower computational complexity than
other efficient solutions in literature.
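The LSBT idea in this abstract (a node of capacity c feeds ⌊c/r⌋ children at rate r, and the data is pipelined in m chunks) can be explored with a small calculator. The fan-out follows the abstract; the tree-depth helper and the completion-time formula are a simplified cost model assumed here for illustration, not the paper's analysis, and all names are hypothetical.

```java
// Sketch of the LSBT quantities. The cost model is an assumption, not the paper's analysis.
final class LockstepBroadcastSketch {
    // Number of children a node with upload capacity c can feed at rate r each.
    static int fanOut(double capacity, double rate) {
        if (rate <= 0 || capacity < rate) {
            throw new IllegalArgumentException("need 0 < rate <= capacity");
        }
        return (int) Math.floor(capacity / rate);
    }

    // Depth of a complete tree with the given fan-out that holds n nodes (source included).
    static int depth(long nodes, int fanOut) {
        long covered = 1;    // the source node
        long levelWidth = 1; // number of nodes on the deepest level so far
        int depth = 0;
        while (covered < nodes) {
            levelWidth *= fanOut;
            covered += levelWidth;
            depth++;
        }
        return depth;
    }

    // Assumed pipeline model: m equal chunks, one chunk per hop per step, so the
    // last chunk reaches the deepest node after (depth + m - 1) steps.
    static double completionTime(double dataSize, int chunks, double rate, int depth) {
        double stepTime = (dataSize / chunks) / rate;
        return (depth + chunks - 1) * stepTime;
    }

    public static void main(String[] args) {
        double c = 12.0;
        double size = 1200.0;
        int n = 100;
        int m = 50;
        // Compare the two candidate rates named in the abstract for the homogeneous case.
        for (double r : new double[] {c / 2, c / 3}) {
            int d = depth(n, fanOut(c, r));
            System.out.printf("r=%.1f depth=%d time=%.2f%n", r, d, completionTime(size, m, r, d));
        }
    }
}
```

Under this model a smaller r gives a wider, shallower tree but a slower per-hop transfer, which is the trade-off behind treating r as a parameter to optimize.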
8. IEEE 2016: VoteTrust: Leveraging Friend Invitation Graph to Defend against
Social Network Sybils
Abstract: Online social networks (OSNs) suffer from the creation of fake accounts that
introduce fake product reviews, malware and spam. Existing defenses focus on using
the social graph structure to isolate fakes. However, our work shows that Sybils could
befriend a large number of real users, invalidating the assumption behind social-graph-based
detection. In this paper, we present VoteTrust, a scalable defense system that
further leverages user-level activities. VoteTrust models the friend invitation interactions
among users as a directed, signed graph, and uses two key mechanisms to detect Sybils over
the graph: a voting-based Sybil detection to find Sybils that users vote to reject, and a Sybil
community detection to find other colluding Sybils around identified Sybils. Through
an evaluation on the Renren social network, we show that VoteTrust is able to prevent Sybils from
generating many unsolicited friend requests. We also deploy VoteTrust in Renren, and our
real experience demonstrates that VoteTrust can detect large-scale collusion among Sybils.
9. IEEE 2016: A Secure and Dynamic Multi-Keyword Ranked Search Scheme over
Encrypted Cloud Data
Abstract: Due to the increasing popularity of cloud computing, more and more data owners
are motivated to outsource their data to cloud servers for great convenience and reduced cost
in data management. However, sensitive data should be encrypted before outsourcing for
privacy requirements, which obsoletes data utilization like keyword-based document
retrieval. In this paper, we present a secure multi-keyword
ranked search scheme over encrypted cloud data, which simultaneously supports
dynamic update operations like deletion and insertion of documents. Specifically, the vector
space model and the widely-used TF x IDF model are combined in the index construction
and query generation. We construct a special tree-based index structure and propose a
“Greedy Depth-first Search” algorithm to provide efficient multi-keyword ranked search.
The secure kNN algorithm is utilized to encrypt the index and query vectors, and meanwhile
ensure accurate relevance score calculation between encrypted index and query vectors. In
order to resist statistical attacks, phantom terms are added to the index vector for
blinding search results. Due to the use of our special tree-based index structure, the
proposed scheme can achieve sub-linear search time and deal with the deletion and insertion
of documents flexibly. Extensive experiments are conducted to demonstrate the efficiency of
the proposed scheme.
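The relevance scoring that this abstract builds on (vector space model combined with TF×IDF) can be sketched in plaintext, without any encryption or tree index. The TF and IDF variants below (term frequency divided by document length, natural log of N/df) are one common choice assumed for illustration; the class and method names are hypothetical.

```java
import java.util.List;

// Plaintext TF x IDF relevance score. The exact TF and IDF variants are assumptions;
// the scheme in the abstract additionally encrypts the index and query vectors.
final class TfIdfRanker {
    // Term frequency: occurrences of the term divided by document length.
    static double tf(String term, List<String> doc) {
        if (doc.isEmpty()) {
            return 0.0;
        }
        long count = doc.stream().filter(term::equals).count();
        return (double) count / doc.size();
    }

    // Inverse document frequency: ln(N / df), or 0 when no document has the term.
    static double idf(String term, List<List<String>> corpus) {
        long df = corpus.stream().filter(d -> d.contains(term)).count();
        return df == 0 ? 0.0 : Math.log((double) corpus.size() / df);
    }

    // Relevance score: inner product of the document's TF values and the query's IDF values.
    static double score(List<String> query, List<String> doc, List<List<String>> corpus) {
        double total = 0.0;
        for (String term : query) {
            total += tf(term, doc) * idf(term, corpus);
        }
        return total;
    }

    public static void main(String[] args) {
        List<List<String>> corpus = List.of(
                List.of("cloud", "search", "cloud"),
                List.of("cloud", "storage"),
                List.of("secure", "search"));
        List<String> query = List.of("secure", "search");
        for (List<String> doc : corpus) {
            System.out.println(doc + " -> " + score(query, doc, corpus));
        }
    }
}
```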
10. IEEE 2016: SmartCrawler: A Two-Stage Crawler for Efficiently Harvesting Deep-
Web Interfaces
Abstract: As deep web grows at a very fast pace, there has been increased interest in
techniques that help efficiently locate deep-web interfaces. However, due to the large volume
of web resources and the dynamic nature of deep web, achieving wide coverage and high
efficiency is a challenging issue. We propose a two-stage framework, namely SmartCrawler,
for efficiently harvesting deep-web interfaces. In the first stage, SmartCrawler performs
site-based searching for center pages with the help of search engines, avoiding visiting a large
number of pages. To achieve more accurate results for a focused crawl, SmartCrawler ranks
websites to prioritize highly relevant ones for a given topic. In the second
stage, SmartCrawler achieves fast in-site searching by excavating most relevant links with an
adaptive link-ranking. To eliminate bias on visiting some highly relevant links in
hidden web directories, we design a link tree data structure to achieve wider coverage for a
website. Our experimental results on a set of representative domains show the agility and
accuracy of our proposed crawler framework, which efficiently retrieves deep-
web interfaces from large-scale sites and achieves higher harvest rates than other crawlers.
11. IEEE 2016: FiDoop: Parallel Mining of Frequent Itemsets Using MapReduce
Abstract: Existing parallel mining algorithms for frequent itemsets lack a mechanism that
enables automatic parallelization, load balancing, data distribution, and fault tolerance on
large clusters. As a solution to this problem, we design
a parallel frequent itemsets mining algorithm called FiDoop using the
MapReduce programming model. To achieve compressed storage and avoid building
conditional pattern bases, FiDoop incorporates the frequent items ultrametric tree, rather than
conventional FP trees. In FiDoop, three MapReduce jobs are implemented to complete
the mining task. In the crucial third MapReduce job, the mappers independently
decompose itemsets, and the reducers perform combination operations by constructing small
ultrametric trees and then mine these trees separately. We implement FiDoop on our
in-house Hadoop cluster. We show that FiDoop on the cluster is sensitive to data distribution
and dimensions, because itemsets with different lengths have different decomposition and
construction costs. To improve FiDoop's performance, we develop a workload balance metric
to measure load balance across the cluster's computing nodes. We develop FiDoop-HD, an
extension of FiDoop, to speed up the mining performance for high-dimensional data analysis.
Extensive experiments using real-world celestial spectral data demonstrate that our proposed
solution is efficient and scalable.
12. IEEE 2015: Discover the Expert: Context-Adaptive Expert Selection for Medical
Diagnosis
Abstract: In this paper, we propose an expert selection system that learns online the
best expert to assign to each patient depending on the context of the patient. In general,
the context can include an enormous number and variety of information related to the
patient's health condition, age, gender, previous drug doses, and so forth, but the most
relevant information is embedded in only a few contexts. If these most relevant contexts were
known in advance, learning would be relatively simple but they are not. Moreover, the
relevant contexts may be different for different health conditions. To address these
challenges, we develop a new class of algorithms aimed at discovering the most relevant
contexts and the best clinic and expert to use to make a diagnosis given a patient's contexts.
We prove that as the number of patients grows, the proposed context-adaptive algorithm
will discover the optimal expert to select for patients with a specific context. Moreover, the
algorithm also provides confidence bounds on the diagnostic accuracy of the expert it selects,
which can be considered by the primary care physician before making the final decision.
While our algorithm is general and can be applied in numerous medical scenarios, we
illustrate its functionality and performance by applying it to a real-world breast
cancer diagnosis data set. Finally, while the application we consider in this paper
is medical diagnosis, our proposed algorithm can be applied in other environments where
expertise needs to be discovered.
13. IEEE 2015: Active Learning for Ranking through Expected Loss Optimization
Abstract: Learning to rank arises in many data mining applications, ranging from web
search engines and online advertising to recommendation systems. In learning to rank, the
performance of a ranking model is strongly affected by the number of labeled examples in
the training set; on the other hand, obtaining labeled examples for training data is very
expensive and time-consuming. This presents a great need for the active learning approaches
to select most informative examples for ranking learning; however, in the literature there is
still very limited work to address active learning for ranking. In this paper, we propose a
general active learning framework, expected loss optimization (ELO), for ranking. The ELO
framework is applicable to a wide range of ranking functions. Under this framework, we
derive a novel algorithm, expected discounted cumulative gain
(DCG) loss optimization (ELO-DCG), to select most informative examples. Then, we
investigate both query and document level active learning for ranking and propose a two-stage
ELO-DCG algorithm which incorporates both query and document selection into
active learning. Furthermore, we show that it is flexible for the algorithm to deal with the
skewed grade distribution problem with the modification of the loss function. Extensive
experiments on real-world web search data sets have demonstrated great potential and
effectiveness of the proposed framework and algorithms.
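The discounted cumulative gain (DCG) that ELO-DCG is named after can be computed in a few lines. The sketch below uses a widely used definition, gain 2^rel − 1 with a log2(rank + 1) discount; whether the paper uses exactly this variant is an assumption, and the expected-loss selection step itself is not shown.

```java
// DCG under a common definition (assumed here): gain 2^rel - 1, discount log2(rank + 1).
final class DcgSketch {
    // gradesInRankOrder[i] is the relevance grade of the document at rank i + 1.
    static double dcg(int[] gradesInRankOrder, int k) {
        double total = 0.0;
        int limit = Math.min(k, gradesInRankOrder.length);
        for (int i = 0; i < limit; i++) {
            double gain = Math.pow(2, gradesInRankOrder[i]) - 1;
            double discount = Math.log(i + 2) / Math.log(2); // log2(rank + 1) with rank = i + 1
            total += gain / discount;
        }
        return total;
    }

    public static void main(String[] args) {
        int[] goodOrder = {3, 2, 0};
        int[] badOrder = {0, 2, 3};
        System.out.println("DCG@3 good order: " + dcg(goodOrder, 3));
        System.out.println("DCG@3 bad order:  " + dcg(badOrder, 3));
    }
}
```

Placing highly graded documents at the top yields a larger DCG, which is why the metric is sensitive to which examples receive labels.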
14. IEEE 2015: k-Nearest Neighbor Classification over Semantically Secure Encrypted
Relational Data
Abstract: Data Mining has wide applications in many areas such as banking, medicine,
scientific research and among government agencies. Classification is one of the commonly
used tasks in data mining applications. For the past decade, due to the rise of various privacy
issues, many theoretical and practical solutions to the classification problem have been
proposed under different security models. However, with the recent popularity of cloud
computing, users now have the opportunity to outsource their data, in encrypted form, as well
as the data mining tasks to the cloud. Since the data on the cloud is in encrypted form,
existing privacy-preserving classification techniques are not applicable. In this paper, we
focus on solving the classification problem over encrypted data. In particular, we propose a
secure k-NN classifier over encrypted data in the cloud. The proposed protocol protects the
confidentiality of data, privacy of user's input query, and hides the data access patterns. To
the best of our knowledge, our work is the first to develop a secure k-NN classifier
over encrypted data under the semi-honest model. Also, we empirically analyze the
efficiency of our proposed protocol using a real-world dataset under different parameter
settings.
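For reference, the plaintext computation that a secure k-NN classifier has to reproduce over ciphertexts is short: find the k records nearest to the query and take a majority vote over their class labels. The sketch below is that unencrypted baseline only; it offers none of the confidentiality properties described in the abstract, and the names and the squared Euclidean distance are assumptions.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.Map;

// Plaintext k-NN majority vote. This is the unencrypted baseline, not the secure protocol.
final class PlainKnnClassifier {
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Returns the majority label among the k nearest points, or null if there are no points.
    // Ties go to the label whose first vote came from the nearer record.
    static String classify(double[][] points, String[] labels, double[] query, int k) {
        Integer[] order = new Integer[points.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble(
                (Integer i) -> squaredDistance(points[i], query)));

        Map<String, Integer> votes = new LinkedHashMap<>();
        int limit = Math.min(k, order.length);
        for (int j = 0; j < limit; j++) {
            votes.merge(labels[order[j]], 1, Integer::sum);
        }

        String best = null;
        int bestVotes = 0;
        for (Map.Entry<String, Integer> e : votes.entrySet()) {
            if (e.getValue() > bestVotes) {
                best = e.getKey();
                bestVotes = e.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {0, 1}, {5, 5}, {6, 5}, {1, 0}};
        String[] labels = {"A", "A", "B", "B", "A"};
        System.out.println(classify(points, labels, new double[] {0.2, 0.2}, 3));
    }
}
```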
15. IEEE 2015: Generating Searchable Public-Key Ciphertexts With Hidden
Structures for Fast Keyword Search
Abstract: Existing semantically secure public-key searchable encryption schemes
take search time linear with the total number of the ciphertexts. This makes retrieval from
large-scale databases prohibitive. To alleviate this problem, this paper
proposes searchable public-key ciphertexts with hidden structures (SPCHS)
for keyword search as fast as possible without sacrificing semantic security of the encrypted
keywords. In SPCHS, all keyword-searchable ciphertexts are structured by hidden relations,
and with the search trapdoor corresponding to a keyword, the minimum information of the
relations is disclosed to a search algorithm as the guidance to find all
matching ciphertexts efficiently. We construct an SPCHS scheme from scratch in which
the ciphertexts have a hidden star-like structure. We prove our scheme to be semantically
secure in the random oracle (RO) model. The search complexity of our scheme is dependent
on the actual number of the ciphertexts containing the queried keyword, rather than the
number of all ciphertexts. Finally, we present a generic SPCHS construction from
anonymous identity-based encryption and collision-free full-identity malleable identity-
based key encapsulation mechanism (IBKEM) with anonymity. We illustrate two collision-
free full-identity malleable IBKEM instances, which are semantically secure and anonymous,
respectively, in the RO and standard models. The latter instance enables us to construct an
SPCHS scheme with semantic security in the standard model.
16. IEEE 2015: Research Directions for Engineering Big Data Analytics Software
Abstract: Many software startups and research and development efforts are actively trying to
harness the power of big data and create software with the potential to improve almost every
aspect of human life. As these efforts continue to increase, full consideration needs to be
given to the engineering aspects of big data software. Since these systems exist to make
predictions on complex and continuous massive datasets, they pose unique problems during
specification, design, and verification of software that needs to be delivered on time and
within budget. But, given the nature of big data software, can this be done?
Does big data software engineering really work? This article explores the details of big data
software, discusses the main problems encountered when engineering big data software, and
proposes avenues for future research.
17. IEEE 2015: Co-Extracting Opinion Targets and Opinion Words from Online
Reviews Based on the Word Alignment Model
Abstract: Mining opinion targets and opinion words from online reviews are important tasks
for fine-grained opinion mining, the key component of which involves
detecting opinion relations among words. To this end, this paper proposes a novel
approach based on the partially-supervised alignment model, which regards
identifying opinion relations as an alignment process. Then, a graph-based co-ranking
algorithm is exploited to estimate the confidence of each candidate. Finally, candidates with
higher confidence are extracted as opinion targets or opinion words. Compared to previous
methods based on the nearest-neighbor rules, our model captures opinion relations more
precisely, especially for long-span relations. Compared to syntax-based methods,
our word alignment model effectively alleviates the negative effects of parsing errors when
dealing with informal online texts. In particular, compared to the traditional
unsupervised alignment model, the proposed model obtains better precision because of the
usage of partial supervision. In addition, when estimating candidate confidence, we penalize
higher-degree vertices in our graph-based co-ranking algorithm to decrease the probability of
error generation. Our experimental results on three corpora with different sizes and languages
show that our approach effectively outperforms state-of-the-art methods.
18. IEEE 2015: Constructing a Global Social Service Network for Better Quality of
Web Service Discovery
Abstract: Web services have had a tremendous impact on the Web for supporting a
distributed service-based economy on a global scale. However, despite the outstanding
progress, their uptake on a Web scale has been significantly less than initially anticipated.
The isolation of services and the lack of social relationships among related services have
been identified as reasons for the poor uptake. In this paper, we propose connecting the
isolated service islands into a global social service network to enhance the services'
sociability on a global scale. First, we propose linked social service-specific principles based
on linked data principles for publishing services on the open Web as linked social services.
Then, we suggest a new framework
for constructing the global social service network following linked social service-specific
principles based on complex network theories. Next, an approach is proposed to enable the
exploitation of the global social service network, providing Linked Social Services as a
Service. Finally, experimental results show that our approach can solve
the quality of service discovery problem, improving both the service discovering time and the
success rate by exploring service-to-service based on the global social service network.
19. IEEE 2015: Privacy-Preserving Detection of Sensitive Data Exposure
Abstract: Statistics from security firms, research institutions and government organizations
show that the numbers of data-leak instances have grown rapidly in recent years. Among
various data-leak cases, human mistakes are one of the main causes of data loss. There exist
solutions that detect inadvertent sensitive data leaks caused by human mistakes and provide
alerts for organizations. A common approach is to screen content in storage and transmission
for exposed sensitive information. Such an approach usually requires the detection operation
to be conducted in secrecy. However, this secrecy requirement is challenging to satisfy in
practice, as detection servers may be compromised or outsourced. In this paper, we present
a privacy-preserving data-leak detection (DLD) solution to solve the issue where a special set
of sensitive data digests is used in detection. The advantage of our method is that it enables
the data owner to safely delegate the detection operation to a semi-honest provider without
revealing the sensitive data to the provider. We describe how Internet service providers can
offer their customers DLD as an add-on service with strong privacy guarantees. The
evaluation results show that our method can support accurate detection with a very small
number of false alarms under various data-leak scenarios.
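The idea of detecting leaks from digests rather than from the sensitive data itself can be sketched as follows: the data owner hashes overlapping substrings of the sensitive text and hands over only the digests, and the detector hashes the inspected content the same way and measures the overlap. The shingle length, the use of SHA-256 and the scoring rule are all assumptions made for this illustration; the abstract only says that a special set of digests is used, and this sketch does not provide the paper's privacy guarantees.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.HashSet;
import java.util.Set;

// Digest-based leak detection sketch. Shingling, SHA-256 and the score are assumptions.
final class DigestLeakDetector {
    // Hashes every overlapping n-character substring (shingle) of the text.
    static Set<String> shingleDigests(String text, int n) {
        Set<String> digests = new HashSet<>();
        try {
            MessageDigest sha = MessageDigest.getInstance("SHA-256");
            for (int i = 0; i + n <= text.length(); i++) {
                byte[] shingle = text.substring(i, i + n).getBytes(StandardCharsets.UTF_8);
                digests.add(Base64.getEncoder().encodeToString(sha.digest(shingle)));
            }
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 support is required of every Java platform implementation.
            throw new IllegalStateException(e);
        }
        return digests;
    }

    // Fraction of the sensitive digests that also occur in the inspected content.
    static double leakScore(Set<String> sensitiveDigests, String content, int n) {
        if (sensitiveDigests.isEmpty()) {
            return 0.0;
        }
        Set<String> contentDigests = shingleDigests(content, n);
        int hits = 0;
        for (String digest : sensitiveDigests) {
            if (contentDigests.contains(digest)) {
                hits++;
            }
        }
        return (double) hits / sensitiveDigests.size();
    }

    public static void main(String[] args) {
        int n = 8;
        // The owner computes digests locally and shares only this set with the provider.
        Set<String> digests = shingleDigests("4111-1111-1111-1111", n);
        System.out.println(leakScore(digests, "card: 4111-1111-1111-1111 exp 01/30", n));
        System.out.println(leakScore(digests, "hello world, nothing here", n));
    }
}
```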
20. IEEE 2015: Friendbook: A Semantic-Based Friend Recommendation System for
Social Networks
Abstract: Existing social networking services recommend friends to users based on
their social graphs, which may not be the most appropriate to reflect a user's preferences
on friend selection in real life. In this paper, we present Friendbook, a novel
semantic-based friend recommendation system for social networks, which recommends friends to
users based on their life styles instead of social graphs. By taking advantage of sensor-rich
smartphones, Friendbook discovers life styles of users from user-centric sensor data,
measures the similarity of life styles between users, and recommends friends to users if their
life styles have high similarity. Inspired by text mining, we model a user's daily life as life
documents, from which his/her life styles are extracted by using the Latent Dirichlet
Allocation algorithm. We further propose a similarity metric to measure the similarity of life
styles between users, and calculate users' impact in terms of life styles with a friend-matching
graph. Upon receiving a request, Friendbook returns a list of people with
highest recommendation scores to the query user. Finally, Friendbook integrates a feedback
mechanism to further improve the recommendation accuracy. We have
implemented Friendbook on the Android-based smartphones, and evaluated its performance
on both small-scale experiments and large-scale simulations. The results show that the
recommendations accurately reflect the preferences of users in choosing friends.
21. IEEE 2015: Secure Distributed Deduplication Systems with Improved Reliability
Abstract: Data deduplication is a technique for eliminating duplicate copies of data, and has
been widely used in cloud storage to reduce storage space and upload bandwidth. However,
there is only one copy for each file stored in the cloud even if such a file is owned by a huge
number of users. As a result, a deduplication system improves storage utilization while
reducing reliability. Furthermore, the challenge of privacy for sensitive data also arises when
they are outsourced by users to the cloud. Aiming to address the above security challenges, this
paper makes the first attempt to formalize the notion of a distributed reliable
deduplication system. We propose new distributed deduplication systems with
higher reliability in which the data chunks are distributed across multiple cloud servers. The
security requirements of data confidentiality and tag consistency are also achieved by
introducing a deterministic secret sharing scheme in distributed storage systems, instead of
using convergent encryption as in previous deduplication systems. Security analysis
demonstrates that our deduplication systems are secure in terms of the definitions specified in
the proposed security model. As a proof of concept, we implement the proposed systems and
demonstrate that the incurred overhead is very limited in realistic environments.
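To make the idea of deterministic secret sharing concrete, here is a small sketch under stated assumptions. It uses a Shamir-style (k, n) threshold scheme in which the polynomial coefficients are derived from a hash of the chunk rather than from fresh randomness, so identical chunks always produce identical shares and each server can deduplicate them. The abstract does not describe the paper's actual construction; the field size, the hash-based derivation, and the names `split` and `recover` are illustrative choices, not the authors' scheme.

```python
import hashlib

# Toy field: 2**127 - 1 is a Mersenne prime, so chunks must be under 16 bytes here.
PRIME = 2**127 - 1

def _coefficients(chunk, k):
    """Polynomial coefficients: the secret plus k-1 values derived from the chunk."""
    secret = int.from_bytes(chunk, "big")
    if secret >= PRIME:
        raise ValueError("chunk too large for this toy field")
    coeffs = [secret]
    for i in range(1, k):
        # Assumption: hash-derived coefficients replace random ones,
        # which makes the shares a deterministic function of the chunk.
        digest = hashlib.sha256(chunk + i.to_bytes(4, "big")).digest()
        coeffs.append(int.from_bytes(digest, "big") % PRIME)
    return coeffs

def split(chunk, k, n):
    """Split a chunk into n shares (one per server); any k of them recover it."""
    coeffs = _coefficients(chunk, k)
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares

def recover(shares, length):
    """Recover the chunk from k shares by Lagrange interpolation at x = 0."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for m, (xm, _) in enumerate(shares):
            if m != j:
                num = (num * -xm) % PRIME
                den = (den * (xj - xm)) % PRIME
        secret = (secret + yj * num * pow(den, -1, PRIME)) % PRIME
    return secret.to_bytes(length, "big")

chunk = b"hello chunk"
shares = split(chunk, k=3, n=5)
print(recover(shares[:3], len(chunk)))
```

Because the shares depend only on the chunk's content, two users uploading the same chunk send the same share to each server, and losing up to n - k servers still allows recovery. A deterministic scheme of this kind protects only chunks that are hard to guess, which is the usual caveat for content-derived protection.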

Vivazz, Mieres Social Housing Design Spain
 

Data mining java titles adrit solutions

  • 1. Contact Us: #42/5, 1st Floor, 18th Cross, 21st Main, Vijayanagar, Bangalore-40 Land Mark: Near Maruthi Mandir ; www.adritsolutions.com ADRIT SOLUTIONS Ph: 9845252155 ; 7676768124 Email: adritsolutions@gmail.com JAVA IEEE 2016-15 Data Mining Projects 1. IEEE 2016: SPORE: A Sequential Personalized Spatial Item Recommender System Abstract: With the rapid development of location-based social networks (LBSNs), spatial item recommendation has become an important way of helping users discover interesting locations to increase their engagement with location-based services. Although human movement exhibits sequential patterns in LBSNs, most current studies on spatial item recommendations do not consider the sequential influence of locations. Leveraging sequential patterns in spatial item recommendation is, however, very challenging, considering 1) users’ check-in data in LBSNs has a low sampling rate in both space and time, which renders existing prediction techniques on GPS trajectories ineffective; 2) the prediction space is extremely large, with millions of distinct locations as the next prediction target, which impedes the application of classical Markov chain models; and 3) there is no existing framework that unifies users’ personal interests and the sequential influence in a principled manner. In light of the above challenges, we propose a sequential personalized spatial item recommendation framework (SPORE) which introduces a novel latent variable topic-region to model and fuse sequential influence with personal interests in the latent and exponential space. The advantages of modeling the sequential effect at the topic-region level include a significantly reduced prediction space, an effective alleviation of data sparsity and a direct expression of the semantic meaning of users’ spatial activities. Furthermore, we design an asymmetric Locality Sensitive Hashing (ALSH) technique to speed up the online top-k recommendation process by extending the traditional LSH.
We evaluate the performance of SPORE on two real datasets and one large-scale synthetic dataset. The results demonstrate a significant improvement in SPORE’s ability to recommend spatial items, in terms of both effectiveness and efficiency, compared with the state-of-the-art methods.
  • 2. 2. IEEE 2016: Inverted Linear Quadtree: Efficient Top K Spatial Keyword Search Abstract: With advances in geo-positioning technologies and geo-location services, there is a rapidly growing number of spatio-textual objects collected in many applications such as location based services and social networks, in which an object is described by its spatial location and a set of keywords (terms). Consequently, the study of spatial keyword search, which explores both the location and the textual description of the objects, has attracted great attention from commercial organizations and research communities. In this paper, we study two fundamental problems in spatial keyword queries: top k spatial keyword search (TOPK-SK), and batch top k spatial keyword search (BTOPK-SK). Given a set of spatio-textual objects, a query location and a set of query keywords, TOPK-SK retrieves the closest k objects, each of which contains all keywords in the query. BTOPK-SK is the batch processing of sets of TOPK-SK queries. Based on the inverted index and the linear quadtree, we propose a novel index structure, called inverted linear quadtree (IL-Quadtree), which is carefully designed to exploit both spatial and keyword based pruning techniques to effectively reduce the search space. An efficient algorithm is then developed to tackle top k spatial keyword search. To further enhance the filtering capability of the signature of linear quadtree, we propose a partition based method. In addition, to deal with BTOPK-SK, we design a new computing paradigm which partitions the queries into groups based on both spatial proximity and the textual relevance between queries. We show that the IL-Quadtree technique can also efficiently support BTOPK-SK. Comprehensive experiments on real and synthetic data clearly demonstrate the efficiency of our methods. 3. 
IEEE 2016: Truth Discovery in Crowd sourced Detection of Spatial Events Abstract: The ubiquity of smartphones has led to the emergence of mobile crowd sourcing tasks such as the detection of spatial events when smartphone users move around in their daily lives. However, the credibility of those detected events can be negatively impacted by unreliable participants with low-quality data. Consequently, a major challenge in quality control is to discover true events from diverse and noisy participants’ reports. This truth discovery problem is uniquely distinct from its online counterpart in that it involves uncertainties in both participants’ mobility and reliability. Decoupling these two types of
  • 3. uncertainties through location tracking will raise severe privacy and energy issues; whereas simply ignoring missing reports or treating them as negative reports will significantly degrade the accuracy of the discovered truth. In this paper, we propose a new method to tackle this truth discovery problem through principled probabilistic modeling. In particular, we integrate the modeling of location popularity, location visit indicators, truth of events and three-way participant reliability in a unified framework. The proposed model is thus capable of efficiently handling various types of uncertainties and automatically discovering truth without any supervision or the need of location tracking. Experimental results demonstrate that our proposed method outperforms existing state-of-the-art truth discovery approaches in the mobile crowd sourcing environment. 4. IEEE 2016: Sentiment Analysis of Top Colleges in India Using Twitter Data Abstract: In today’s world, opinions and reviews accessible to us are one of the most critical factors in formulating our views and influencing the success of a brand, product or service. With the advent and growth of social media in the world, stakeholders often take to expressing their opinions on popular social media, namely Twitter. While Twitter data is extremely informative, it presents a challenge for analysis because of its humongous and disorganized nature. This paper is a thorough effort to dive into the novel domain of performing sentiment analysis of people’s opinions regarding top colleges in India. Besides taking additional preprocessing measures like the expansion of net lingo and removal of duplicate tweets, a probabilistic model based on Bayes’ theorem was used for spelling correction, which is overlooked in other research studies. 
This paper also highlights a comparison between the results obtained by exploiting the following machine learning algorithms: Naïve Bayes and Support Vector Machine and an Artificial Neural Network model: Multilayer Perceptron. Furthermore, a contrast has been presented between four different kernels of SVM: RBF, linear, polynomial and sigmoid. 5. IEEE 2016: FRAppE: Detecting Malicious Facebook Applications Abstract: With 20 million installs a day [1], third-party apps are a major reason for the popularity and addictiveness of Facebook. Unfortunately, hackers have realized the potential
  • 4. of using apps for spreading malware and spam. The problem is already significant, as we find that at least 13% of apps in our dataset are malicious. So far, the research community has focused on detecting malicious posts and campaigns. In this paper, we ask the question: given a Facebook application, can we determine if it is malicious? Our key contribution is in developing FRAppE (Facebook’s Rigorous Application Evaluator), arguably the first tool focused on detecting malicious apps on Facebook. To develop FRAppE, we use information gathered by observing the posting behavior of 111K Facebook apps seen across 2.2 million users on Facebook. First, we identify a set of features that help us distinguish malicious apps from benign ones. For example, we find that malicious apps often share names with other apps, and they typically request fewer permissions than benign apps. Second, leveraging these distinguishing features, we show that FRAppE can detect malicious apps with 99.5% accuracy, with no false positives and a low false negative rate (4.1%). Finally, we explore the ecosystem of malicious Facebook apps and identify mechanisms that these apps use to propagate. Interestingly, we find that many apps collude and support each other; in our dataset, we find 1,584 apps enabling the viral propagation of 3,723 other apps through their posts. Long-term, we see FRAppE as a step towards creating an independent watchdog for app assessment and ranking, so as to warn Facebook users before installing apps. 6. IEEE 2016: Practical Approximate k-Nearest Neighbor Queries with Location and Query Privacy Abstract: In mobile communication, spatial queries pose a serious threat to user location privacy because the location of a query may reveal sensitive information about the mobile user. 
In this paper, we study approximate k nearest neighbor (kNN) queries where the mobile user queries the location-based service (LBS) provider about approximate k nearest points of interest (POIs) on the basis of his current location. We propose a basic solution and a generic solution for the mobile user to preserve his location and query privacy in approximate kNN queries. The proposed solutions are mainly built on the Paillier public-key cryptosystem and can provide both location and query privacy. To preserve query privacy, our basic solution allows the mobile user to retrieve one type of POIs, for example, approximate k nearest car parks, without revealing to the LBS provider what type of points is retrieved. Our generic
  • 5. solution can be applied to multiple discrete type attributes of private location-based queries. Compared with existing solutions for kNN queries with location privacy, our solution is more efficient. Experiments have shown that our solution is practical for kNN queries. 7. IEEE 2016: A Novel Pipeline Approach for Efficient Big Data Broadcasting Abstract: Big-data computing is a new critical challenge for the ICT industry. Engineers and researchers are dealing with data sets of petabyte scale in the cloud computing paradigm. Thus, the demand for building a service stack to distribute, manage, and process massive data sets has risen drastically. In this paper, we investigate the Big Data Broadcasting problem for a single source node to broadcast a big chunk of data to a set of nodes with the objective of minimizing the maximum completion time. These nodes may be located in the same datacenter or across geo-distributed datacenters. This problem is one of the fundamental problems in distributed computing and is known to be NP-hard in heterogeneous environments. We model the Big-data broadcasting problem into a LockStep Broadcast Tree (LSBT) problem. The main idea of the LSBT model is to define a basic unit of upload bandwidth, r, such that a node with capacity c broadcasts data to a set of ⌊c/r⌋ children at the rate r. Note that r is a parameter to be optimized as part of the LSBT problem. We further divide the broadcast data into m chunks. These data chunks can then be broadcast down the LSBT in a pipeline manner. In a homogeneous network environment in which each node has the same upload capacity c, we show that the optimal uplink rate r* of LSBT is either c/2 or c/3, whichever gives the smaller maximum completion time. 
For heterogeneous environments, we present an O(n log² n) algorithm to select an optimal uplink rate r* and to construct an optimal LSBT. Numerical results show that our approach performs well with less maximum completion time and lower computational complexity than other efficient solutions in the literature. 8. IEEE 2016: VoteTrust: Leveraging Friend Invitation Graph to Defend against Social Network Sybils Abstract: Online social networks (OSNs) suffer from the creation of fake accounts that introduce fake product reviews, malware and spam. Existing defenses focus on using
  • 6. the social graph structure to isolate fakes. However, our work shows that Sybils could befriend a large number of real users, invalidating the assumption behind social-graph-based detection. In this paper, we present VoteTrust, a scalable defense system that further leverages user-level activities. VoteTrust models the friend invitation interactions among users as a directed, signed graph, and uses two key mechanisms to detect Sybils over the graph: a voting-based Sybil detection to find Sybils that users vote to reject, and a Sybil community detection to find other colluding Sybils around identified Sybils. Through evaluating on the Renren social network, we show that VoteTrust is able to prevent Sybils from generating many unsolicited friend requests. We also deploy VoteTrust in Renren, and our real experience demonstrates that VoteTrust can detect large-scale collusion among Sybils. 9. IEEE 2016: A Secure and Dynamic Multi-Keyword Ranked Search Scheme over Encrypted Cloud Data Abstract: Due to the increasing popularity of cloud computing, more and more data owners are motivated to outsource their data to cloud servers for great convenience and reduced cost in data management. However, sensitive data should be encrypted before outsourcing for privacy requirements, which obsoletes data utilization like keyword-based document retrieval. In this paper, we present a secure multi-keyword ranked search scheme over encrypted cloud data, which simultaneously supports dynamic update operations like deletion and insertion of documents. Specifically, the vector space model and the widely-used TF x IDF model are combined in the index construction and query generation. We construct a special tree-based index structure and propose a “Greedy Depth-first Search” algorithm to provide efficient multi-keyword ranked search. 
The secure kNN algorithm is utilized to encrypt the index and query vectors, and meanwhile ensure accurate relevance score calculation between encrypted index and query vectors. In order to resist statistical attacks, phantom terms are added to the index vector for blinding search results. Due to the use of our special tree-based index structure, the proposed scheme can achieve sub-linear search time and deal with the deletion and insertion of documents flexibly. Extensive experiments are conducted to demonstrate the efficiency of the proposed scheme.
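The combination of the vector space model with TF x IDF weighting mentioned in abstract 9 can be illustrated on plaintext data. The following is a minimal sketch only: it leaves out the encryption, the tree-based index and the phantom terms entirely, and it assumes one common convention (normalized term frequency in the document vector, inverse document frequency in the query vector, relevance as their inner product). The function names `build_index` and `ranked_search` and the toy documents are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def build_index(docs):
    """Build a plaintext vector-space index (assumed convention, see lead-in).

    Returns the sorted vocabulary, an IDF weight per keyword, and one
    length-normalized TF vector per document.
    """
    tokens = {doc_id: text.split() for doc_id, text in docs.items()}
    vocab = sorted({w for words in tokens.values() for w in words})
    n_docs = len(docs)

    # IDF: keywords that occur in fewer documents receive a larger weight.
    idf = {}
    for w in vocab:
        df = sum(1 for words in tokens.values() if w in words)
        idf[w] = math.log(1 + n_docs / df)

    # TF: log-scaled term count, normalized to unit length per document.
    index = {}
    for doc_id, words in tokens.items():
        counts = Counter(words)
        tf = [1 + math.log(counts[w]) if counts[w] else 0.0 for w in vocab]
        norm = math.sqrt(sum(v * v for v in tf)) or 1.0
        index[doc_id] = [v / norm for v in tf]
    return vocab, idf, index


def ranked_search(query_words, vocab, idf, index, k):
    """Return the ids of the k documents with the highest relevance score."""
    # Query vector: IDF weight for queried keywords, zero elsewhere.
    q = [idf[w] if w in query_words else 0.0 for w in vocab]
    scores = {
        doc_id: sum(a * b for a, b in zip(vec, q))
        for doc_id, vec in index.items()
    }
    # Sort by descending score; ties are broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))[:k]


if __name__ == "__main__":
    toy_docs = {
        "d1": "cloud storage cloud security",
        "d2": "keyword search over encrypted cloud data",
        "d3": "graph mining",
    }
    vocab, idf, index = build_index(toy_docs)
    print(ranked_search({"cloud", "security"}, vocab, idf, index, k=2))
```

In this sketch every document is scored, so search time is linear in the number of documents; the sub-linear search time claimed in the abstract depends on the tree-based index, which is not modeled here.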
  • 7. 10. IEEE 2016: SmartCrawler: A Two-Stage Crawler for Efficiently Harvesting Deep-Web Interfaces Abstract: As deep web grows at a very fast pace, there has been increased interest in techniques that help efficiently locate deep-web interfaces. However, due to the large volume of web resources and the dynamic nature of deep web, achieving wide coverage and high efficiency is a challenging issue. We propose a two-stage framework, namely SmartCrawler, for efficiently harvesting deep-web interfaces. In the first stage, SmartCrawler performs site-based searching for center pages with the help of search engines, avoiding visiting a large number of pages. To achieve more accurate results for a focused crawl, SmartCrawler ranks websites to prioritize highly relevant ones for a given topic. In the second stage, SmartCrawler achieves fast in-site searching by excavating the most relevant links with an adaptive link-ranking. To eliminate bias on visiting some highly relevant links in hidden web directories, we design a link tree data structure to achieve wider coverage for a website. Our experimental results on a set of representative domains show the agility and accuracy of our proposed crawler framework, which efficiently retrieves deep-web interfaces from large-scale sites and achieves higher harvest rates than other crawlers. 11. IEEE 2016: FiDoop: Parallel Mining of Frequent Itemsets Using MapReduce Abstract: Existing parallel mining algorithms for frequent itemsets lack a mechanism that enables automatic parallelization, load balancing, data distribution, and fault tolerance on large clusters. As a solution to this problem, we design a parallel frequent itemsets mining algorithm called FiDoop using the MapReduce programming model. 
To achieve compressed storage and avoid building conditional pattern bases, FiDoop incorporates the frequent items ultrametric tree, rather than conventional FP trees. In FiDoop, three MapReduce jobs are implemented to complete the mining task. In the crucial third MapReduce job, the mappers independently decompose itemsets, and the reducers perform combination operations by constructing small ultrametric trees and then mine these trees separately. We implement FiDoop on our in-house Hadoop cluster. We show that FiDoop on the cluster is sensitive to data distribution and dimensions, because itemsets with different lengths have different decomposition and
  • 8. construction costs. To improve FiDoop's performance, we develop a workload balance metric to measure load balance across the cluster's computing nodes. We develop FiDoop-HD, an extension of FiDoop, to speed up the mining performance for high-dimensional data analysis. Extensive experiments using real-world celestial spectral data demonstrate that our proposed solution is efficient and scalable. 12. IEEE 2015: Discover the Expert: Context-Adaptive Expert Selection for Medical Diagnosis Abstract: In this paper, we propose an expert selection system that learns online the best expert to assign to each patient depending on the context of the patient. In general, the context can include an enormous number and variety of information related to the patient's health condition, age, gender, previous drug doses, and so forth, but the most relevant information is embedded in only a few contexts. If these most relevant contexts were known in advance, learning would be relatively simple, but they are not. Moreover, the relevant contexts may be different for different health conditions. To address these challenges, we develop a new class of algorithms aimed at discovering the most relevant contexts and the best clinic and expert to use to make a diagnosis given a patient's contexts. We prove that as the number of patients grows, the proposed context-adaptive algorithm will discover the optimal expert to select for patients with a specific context. Moreover, the algorithm also provides confidence bounds on the diagnostic accuracy of the expert it selects, which can be considered by the primary care physician before making the final decision. While our algorithm is general and can be applied in numerous medical scenarios, we illustrate its functionality and performance by applying it to a real-world breast cancer diagnosis data set. 
Finally, while the application we consider in this paper is medical diagnosis, our proposed algorithm can be applied in other environments where expertise needs to be discovered. 13. IEEE 2015: Active Learning for Ranking through Expected Loss Optimization Abstract: Learning to rank arises in many data mining applications, ranging from web search engines and online advertising to recommendation systems. In learning to rank, the
  • 9. performance of a ranking model is strongly affected by the number of labeled examples in the training set; on the other hand, obtaining labeled examples for training data is very expensive and time-consuming. This presents a great need for active learning approaches to select the most informative examples for ranking learning; however, in the literature there is still very limited work to address active learning for ranking. In this paper, we propose a general active learning framework, expected loss optimization (ELO), for ranking. The ELO framework is applicable to a wide range of ranking functions. Under this framework, we derive a novel algorithm, expected discounted cumulative gain (DCG) loss optimization (ELO-DCG), to select the most informative examples. Then, we investigate both query and document level active learning for ranking and propose a two-stage ELO-DCG algorithm which incorporates both query and document selection into active learning. Furthermore, we show that it is flexible for the algorithm to deal with the skewed grade distribution problem with the modification of the loss function. Extensive experiments on real-world web search data sets have demonstrated great potential and effectiveness of the proposed framework and algorithms. 14. IEEE 2015: k-Nearest Neighbor Classification over Semantically Secure Encrypted Relational Data Abstract: Data Mining has wide applications in many areas such as banking, medicine, scientific research and among government agencies. Classification is one of the commonly used tasks in data mining applications. For the past decade, due to the rise of various privacy issues, many theoretical and practical solutions to the classification problem have been proposed under different security models. 
However, with the recent popularity of cloud computing, users now have the opportunity to outsource their data, in encrypted form, as well as the data mining tasks to the cloud. Since the data on the cloud is in encrypted form, existing privacy-preserving classification techniques are not applicable. In this paper, we focus on solving the classification problem over encrypted data. In particular, we propose a secure k-NN classifier over encrypted data in the cloud. The proposed protocol protects the confidentiality of data, privacy of user's input query, and hides the data access patterns. To the best of our knowledge, our work is the first to develop a secure k-NN classifier
  • 10. over encrypted data under the semi-honest model. Also, we empirically analyze the efficiency of our proposed protocol using a real-world dataset under different parameter settings. 15. IEEE 2015: Generating Searchable Public-Key Ciphertexts With Hidden Structures for Fast Keyword Search Abstract: Existing semantically secure public-key searchable encryption schemes take search time linear with the total number of the ciphertexts. This makes retrieval from large-scale databases prohibitive. To alleviate this problem, this paper proposes searchable public-key ciphertexts with hidden structures (SPCHS) for keyword search as fast as possible without sacrificing semantic security of the encrypted keywords. In SPCHS, all keyword-searchable ciphertexts are structured by hidden relations, and with the search trapdoor corresponding to a keyword, the minimum information of the relations is disclosed to a search algorithm as the guidance to find all matching ciphertexts efficiently. We construct an SPCHS scheme from scratch in which the ciphertexts have a hidden star-like structure. We prove our scheme to be semantically secure in the random oracle (RO) model. The search complexity of our scheme is dependent on the actual number of the ciphertexts containing the queried keyword, rather than the number of all ciphertexts. Finally, we present a generic SPCHS construction from anonymous identity-based encryption and collision-free full-identity malleable identity-based key encapsulation mechanism (IBKEM) with anonymity. We illustrate two collision-free full-identity malleable IBKEM instances, which are semantically secure and anonymous, respectively, in the RO and standard models. The latter instance enables us to construct an SPCHS scheme with semantic security in the standard model. 16. 
IEEE 2015: Research Directions for Engineering Big Data Analytics Software Abstract: Many software startups and research and development efforts are actively trying to harness the power of big data and create software with the potential to improve almost every aspect of human life. As these efforts continue to increase, full consideration needs to be given to the engineering aspects of big data software. Since these systems exist to make
  • 11. predictions on complex and continuous massive datasets, they pose unique problems during specification, design, and verification of software that needs to be delivered on time and within budget. But, given the nature of big data software, can this be done? Does big data software engineering really work? This article explores the details of big data software, discusses the main problems encountered when engineering big data software, and proposes avenues for future research. 17. IEEE 2015: Co-Extracting Opinion Targets and Opinion Words from Online Reviews Based on the Word Alignment Model Abstract: Mining opinion targets and opinion words from online reviews are important tasks for fine-grained opinion mining, the key component of which involves detecting opinion relations among words. To this end, this paper proposes a novel approach based on the partially-supervised alignment model, which regards identifying opinion relations as an alignment process. Then, a graph-based co-ranking algorithm is exploited to estimate the confidence of each candidate. Finally, candidates with higher confidence are extracted as opinion targets or opinion words. Compared to previous methods based on the nearest-neighbor rules, our model captures opinion relations more precisely, especially for long-span relations. Compared to syntax-based methods, our word alignment model effectively alleviates the negative effects of parsing errors when dealing with informal online texts. In particular, compared to the traditional unsupervised alignment model, the proposed model obtains better precision because of the usage of partial supervision. In addition, when estimating candidate confidence, we penalize higher-degree vertices in our graph-based co-ranking algorithm to decrease the probability of error generation. 
Our experimental results on three corpora with different sizes and languages show that our approach effectively outperforms state-of-the-art methods. 18. IEEE 2015: Constructing a Global Social Service Network for Better Quality of Web Service Discovery Abstract: Web services have had a tremendous impact on the Web for supporting a distributed service-based economy on a global scale. However, despite the outstanding
  • 12. progress, their uptake on a Web scale has been significantly less than initially anticipated. The isolation of services and the lack of social relationships among related services have been identified as reasons for the poor uptake. In this paper, we propose connecting the isolated service islands into a global social service network to enhance the services' sociability on a global scale. First, we propose linked social service-specific principles based on linked data principles for publishing services on the open Web as linked social services. Then, we suggest a new framework for constructing the global social service network following linked social service-specific principles based on complex network theories. Next, an approach is proposed to enable the exploitation of the global social service network, providing Linked Social Services as a Service. Finally, experimental results show that our approach can solve the quality of service discovery problem, improving both the service discovering time and the success rate by exploring service-to-service based on the global social service network. 19. IEEE 2015: Privacy-Preserving Detection of Sensitive Data Exposure Abstract: Statistics from security firms, research institutions and government organizations show that the numbers of data-leak instances have grown rapidly in recent years. Among various data-leak cases, human mistakes are one of the main causes of data loss. There exist solutions that detect inadvertent sensitive data leaks caused by human mistakes and provide alerts for organizations. A common approach is to screen content in storage and transmission for exposed sensitive information. Such an approach usually requires the detection operation to be conducted in secrecy. 
However, this secrecy requirement is challenging to satisfy in practice, as detection servers may be compromised or outsourced. In this paper, we present a privacy-preserving data-leak detection (DLD) solution to solve the issue where a special set of sensitive data digests is used in detection. The advantage of our method is that it enables the data owner to safely delegate the detection operation to a semi-honest provider without revealing the sensitive data to the provider. We describe how Internet service providers can
  • 13. offer their customers DLD as an add-on service with strong privacy guarantees. The evaluation results show that our method can support accurate detection with a very small number of false alarms under various data-leak scenarios. 20. IEEE 2015: Friendbook: A Semantic-Based Friend Recommendation System for Social Networks Abstract: Existing social networking services recommend friends to users based on their social graphs, which may not be the most appropriate to reflect a user's preferences on friend selection in real life. In this paper, we present Friendbook, a novel semantic-based friend recommendation system for social networks, which recommends friends to users based on their life styles instead of social graphs. By taking advantage of sensor-rich smartphones, Friendbook discovers life styles of users from user-centric sensor data, measures the similarity of life styles between users, and recommends friends to users if their life styles have high similarity. Inspired by text mining, we model a user's daily life as life documents, from which his/her life styles are extracted by using the Latent Dirichlet Allocation algorithm. We further propose a similarity metric to measure the similarity of life styles between users, and calculate users' impact in terms of life styles with a friend-matching graph. Upon receiving a request, Friendbook returns a list of people with the highest recommendation scores to the query user. Finally, Friendbook integrates a feedback mechanism to further improve the recommendation accuracy. We have implemented Friendbook on Android-based smartphones, and evaluated its performance on both small-scale experiments and large-scale simulations. The results show that the recommendations accurately reflect the preferences of users in choosing friends.
21. IEEE 2015: Secure Distributed Deduplication Systems with Improved Reliability Abstract: Data deduplication is a technique for eliminating duplicate copies of data, and has been widely used in cloud storage to reduce storage space and upload bandwidth. However, there is only one copy of each file stored in the cloud even if such a file is owned by a huge number of users. As a result, a deduplication system improves storage utilization while reducing reliability. Furthermore, the challenge of privacy for sensitive data also arises when
  • 14. they are outsourced by users to the cloud. Aiming to address the above security challenges, this paper makes the first attempt to formalize the notion of a distributed reliable deduplication system. We propose new distributed deduplication systems with higher reliability in which the data chunks are distributed across multiple cloud servers. The security requirements of data confidentiality and tag consistency are also achieved by introducing a deterministic secret sharing scheme in distributed storage systems, instead of using convergent encryption as in previous deduplication systems. Security analysis demonstrates that our deduplication systems are secure in terms of the definitions specified in the proposed security model. As a proof of concept, we implement the proposed systems and demonstrate that the incurred overhead is very limited in realistic environments.
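The trade-off described in the deduplication abstract (one stored copy saves space but reduces reliability, so chunks are spread over several servers) can be illustrated with a minimal sketch. This is not the paper's scheme: it uses plain replication of hash-tagged chunks instead of deterministic secret sharing, so it offers no confidentiality. The class name `DedupStore`, the parameters `num_servers` and `replicas`, and the tiny `CHUNK_SIZE` are all hypothetical choices made for illustration.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; deliberately tiny so the example is easy to follow


class DedupStore:
    """Toy deduplicating store spread over several in-memory 'servers'.

    Assumption: simple replication stands in for the secret sharing
    scheme mentioned in the abstract; chunks are stored in the clear.
    """

    def __init__(self, num_servers=4, replicas=2):
        if not 1 <= replicas <= num_servers:
            raise ValueError("replicas must be between 1 and num_servers")
        self.servers = [{} for _ in range(num_servers)]  # tag -> chunk bytes
        self.replicas = replicas
        self.files = {}  # file name -> ordered list of chunk tags

    def _placement(self, tag):
        # Deterministic placement: identical chunks always map to the
        # same servers, which is what makes deduplication possible.
        start = int(tag, 16) % len(self.servers)
        return [(start + i) % len(self.servers) for i in range(self.replicas)]

    def put(self, name, data):
        tags = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            tag = hashlib.sha256(chunk).hexdigest()  # content-derived tag
            for idx in self._placement(tag):
                # setdefault stores the chunk only if the tag is new there.
                self.servers[idx].setdefault(tag, chunk)
            tags.append(tag)
        self.files[name] = tags

    def get(self, name, failed=()):
        # Rebuild the file, skipping any servers listed as failed.
        out = bytearray()
        for tag in self.files[name]:
            for idx in self._placement(tag):
                if idx not in failed and tag in self.servers[idx]:
                    out += self.servers[idx][tag]
                    break
            else:
                raise IOError("chunk unavailable: " + tag)
        return bytes(out)

    def stored_chunks(self):
        # Total chunk copies held across all servers.
        return sum(len(server) for server in self.servers)
```

In this sketch, uploading the same content under two file names adds no new chunk copies, and with `replicas=2` a file can still be read when any single server is passed in `failed`. A scheme like the one the abstract describes would replace the cleartext replicas with shares so that no single server learns the data.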