We provide training on IEEE 2016-17 projects for Ph.D. scholars and M.Tech, B.E., MCA, BCA and Diploma students of all branches for their academic projects.
For more details, call or WhatsApp us at 7676768124 or 9545252155.
Email your base papers to adritsolutions@gmail.com
We are providing IEEE projects on
1) Cloud Computing, Data Mining and Big Data projects using Java
2) Image Processing and Video Processing (MATLAB), Signal Processing
3) NS2 (Wireless Sensor Networks, MANET, VANET)
4) Android Apps
5) Java, JEE, J2EE, J2ME
6) Mechanical Design projects
7) Embedded Systems and IoT projects
8) VLSI/Verilog projects (ModelSim and Xilinx, using FPGA)
For more details, please visit us at:
Adrit Solutions
Near Maruthi Mandir
#42/5, 18th Cross, 21st Main
Vijayanagar
Bangalore.
Contact Us: #42/5, 1st Floor, 18th Cross, 21st Main, Vijayanagar, Bangalore-40
Landmark: Near Maruthi Mandir; www.adritsolutions.com
ADRIT SOLUTIONS
Ph: 9845252155 / 7676768124; Email: adritsolutions@gmail.com
JAVA IEEE 2016 & 2015 Data Mining Projects
1. IEEE 2016: SPORE: A Sequential Personalized Spatial Item Recommender System
Abstract: With the rapid development of location-based social networks (LBSNs), spatial
item recommendation has become an important way of helping users discover interesting
locations to increase their engagement with location-based services. Although human
movement exhibits sequential patterns in LBSNs, most current studies on spatial item
recommendations do not consider the sequential influence of locations. Leveraging
sequential patterns in spatial item recommendation is, however, very challenging,
considering that: 1) users’ check-in data in LBSNs has a low sampling rate in both space and
time, which renders existing prediction techniques on GPS trajectories ineffective; 2) the
prediction space is extremely large, with millions of distinct locations as the next prediction
target, which impedes the application of classical Markov chain models; and 3) there is no
existing framework that unifies users’ personal interests and the sequential influence in a
principled manner. In light of the above challenges, we propose a sequential personalized
spatial item recommendation framework (SPORE) which introduces a novel latent variable
topic-region to model and fuse sequential influence with personal interests in the latent and
exponential space. The advantages of modeling the sequential effect at the topic-region level
include a significantly reduced prediction space, an effective alleviation of data sparsity and a
direct expression of the semantic meaning of users’ spatial activities. Furthermore, we design
an asymmetric Locality Sensitive Hashing (ALSH) technique to speed up the online top-k
recommendation process by extending the traditional LSH. We evaluate the performance of
SPORE on two real datasets and one large-scale synthetic dataset. The results demonstrate a
significant improvement in SPORE’s ability to recommend spatial items, in terms of both
effectiveness and efficiency, compared with the state-of-the-art methods.
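As a rough illustration of the asymmetric-transform idea behind ALSH (reducing maximum-inner-product search to nearest-neighbor search), here is a minimal sketch; the function names, the parameters m and U, and the brute-force distance scan standing in for the hash lookup are our assumptions, not the paper's code:

```python
import math

def _norm(v):
    return math.sqrt(sum(a * a for a in v))

def transform_items(items, m=3, U=0.83):
    # Scale every item so all norms fall below U < 1, then append
    # the norm powers ||x||^2, ||x||^4, ..., ||x||^(2^m).
    s = U / max(_norm(x) for x in items)
    out = []
    for x in items:
        xs = [a * s for a in x]
        n = _norm(xs)
        out.append(xs + [n ** (2 ** (i + 1)) for i in range(m)])
    return out

def transform_query(q, m=3):
    # Normalise the query and pad with m constant 1/2 entries.
    n = _norm(q)
    return [a / n for a in q] + [0.5] * m

def top_k_inner_product(items, q, k=1, m=3):
    # After the asymmetric transforms, the item with the largest inner
    # product with q is (approximately) the transformed item closest
    # to the transformed query in Euclidean distance.
    P = transform_items(items, m)
    Q = transform_query(q, m)
    def dist(u):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, Q)))
    order = sorted(range(len(items)), key=lambda i: dist(P[i]))
    return order[:k]
```

The padding terms make the squared distance equal a constant minus the (scaled) inner product, up to a vanishing residual, which is what lets an ordinary LSH index answer top-k inner-product queries.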
2. Contact Us: #42/5, 1st
Floor, 18th
Cross, 21st
Main, Vijayanagar, Bangalore-40
Land Mark: Near Maruthi Mandir ; www.adritsolutions.com
2. IEEE 2016: Inverted Linear Quadtree: Efficient Top K Spatial Keyword Search
Abstract: With advances in geo-positioning technologies and geo-location services, there are
a rapidly growing amount of spatio-textual objects collected in many applications such as
location based services and social networks, in which an object is described by its spatial
location and a set of keywords (terms). Consequently, the study of spatial keyword search
which explores both location and textual description of the objects has attracted great
attention from commercial organizations and research communities. In this paper, we
study two fundamental problems in spatial keyword queries: top k spatial keyword search
(TOPK-SK), and batch top k spatial keyword search (BTOPK-SK). Given a set of spatio-
textual objects, a query location and a set of query keywords, the TOPK-SK retrieves the
closest k objects each of which contains all keywords in the query. BTOPK-SK is the batch
processing of sets of TOPK-SK queries. Based on the inverted index and the linear quadtree,
we propose a novel index structure, called inverted linear quadtree (IL-Quadtree), which is
carefully designed to exploit both spatial and keyword based pruning techniques to
effectively reduce the search space. An efficient algorithm is then developed to tackle top k
spatial keyword search. To further enhance the filtering capability of the signature of linear
quadtree, we propose a partition based method. In addition, to deal with BTOPK-SK, we
design a new computing paradigm which partitions the queries into groups based on both
spatial proximity and the textual relevance between queries. We show that the IL-Quadtree
technique can also efficiently support BTOPK-SK. Comprehensive experiments on real and
synthetic data clearly demonstrate the efficiency of our methods.
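A linear quadtree addresses grid cells by bit-interleaved (Morton/Z-order) keys so that spatially close cells tend to get numerically close keys. The sketch below shows the key encoding plus a brute-force check of the TOPK-SK semantics (all query keywords must match, then rank by distance); the function names are ours:

```python
import heapq
import math

def interleave(x, y, bits=16):
    # Morton (Z-order) key: x bits go to even positions, y bits to odd.
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key

def topk_sk(objects, q_loc, q_keywords, k):
    # objects: list of (x, y, keyword_set). Keep only objects that
    # contain every query keyword, then return the k closest.
    qks = set(q_keywords)
    cand = [(math.dist(q_loc, (x, y)), i)
            for i, (x, y, kws) in enumerate(objects) if qks <= kws]
    return [i for _, i in heapq.nsmallest(k, cand)]
```

The real IL-Quadtree attaches a signature per quadtree node so whole subtrees can be pruned by keyword before any distance is computed; the brute-force scan above only fixes the query semantics.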
3. IEEE 2016: Truth Discovery in Crowdsourced Detection of Spatial Events
Abstract: The ubiquity of smartphones has led to the emergence of mobile crowdsourcing
tasks such as the detection of spatial events when smartphone users move around in their
daily lives. However, the credibility of those detected events can be negatively impacted by
unreliable participants with low-quality data. Consequently, a major challenge in quality
control is to discover true events from diverse and noisy participants’ reports. This truth
discovery problem is uniquely distinct from its online counterpart in that it involves
uncertainties in both participants’ mobility and reliability. Decoupling these two types of
uncertainties through location tracking will raise severe privacy and energy issues; whereas
simply ignoring missing reports or treating them as negative reports will significantly
degrade the accuracy of the discovered truth. In this paper, we propose a new method to
tackle this truth discovery problem through principled probabilistic modeling. In particular,
we integrate the modeling of location popularity, location visit indicators, truth of events and
three-way participant reliability in a unified framework. The proposed model is thus capable
of efficiently handling various types of uncertainties and automatically discovering truth
without any supervision or the need of location tracking. Experimental results demonstrate
that our proposed method outperforms existing state-of-the-art truth discovery approaches in
the mobile crowdsourcing environment.
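The paper's unified probabilistic model is more involved, but the core truth-discovery loop can be sketched as a simple alternation between weighted voting on events and re-scoring participant reliability (a common iterative scheme, not the paper's exact inference; names are ours):

```python
def discover_truth(reports, n_events, iters=10):
    # reports: {participant: {event_id: True/False}}.
    # Alternate between (a) a reliability-weighted vote on each event
    # and (b) re-scoring each participant by agreement with the vote.
    weight = {p: 1.0 for p in reports}
    truth = {}
    for _ in range(iters):
        # (a) weighted majority vote per event
        for e in range(n_events):
            score = sum((1 if r[e] else -1) * weight[p]
                        for p, r in reports.items() if e in r)
            truth[e] = score >= 0
        # (b) reliability = fraction of a participant's reports
        # that agree with the current vote
        for p, r in reports.items():
            agree = sum(1 for e, v in r.items() if truth[e] == v)
            weight[p] = agree / len(r) if r else 0.0
    return truth, weight
```

Note this toy version only uses submitted reports, sidestepping the missing-report and mobility-uncertainty issues that the paper's model handles explicitly.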
4. IEEE 2016: Sentiment Analysis of Top Colleges in India Using Twitter Data
Abstract: In today’s world, opinions and reviews accessible to us are one of the most critical
factors in formulating our views and influencing the success of a brand, product or service.
With the advent and growth of social media in the world, stakeholders often take to
expressing their opinions on popular social media, namely Twitter. While Twitter data is
extremely informative, it presents a challenge for analysis because of its humongous and
disorganized nature. This paper is a thorough effort to dive into the novel domain of
performing sentiment analysis of people’s opinions regarding top colleges in India. Besides
taking additional preprocessing measures like the expansion of net lingo and removal of
duplicate tweets, a probabilistic model based on Bayes’ theorem was used for spelling
correction, which is overlooked in other research studies. This paper also highlights a
comparison between the results obtained by exploiting two machine learning
algorithms, Naïve Bayes and Support Vector Machine, and an artificial neural network
model, the Multilayer Perceptron. Furthermore, a contrast has been presented between four
different kernels of SVM: RBF, linear, polynomial and sigmoid.
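Of the compared classifiers, multinomial Naïve Bayes is simple enough to sketch from scratch; this minimal version with add-one smoothing illustrates the approach (function names and the tiny training set are ours):

```python
import math
from collections import Counter

def train_nb(docs):
    # docs: list of (token_list, label). Collect per-label word counts,
    # label counts, and the vocabulary for smoothing.
    word_counts = {}
    label_counts = Counter()
    vocab = set()
    for tokens, label in docs:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return word_counts, label_counts, vocab

def classify_nb(model, tokens):
    # Pick the label maximising log P(label) + sum log P(token | label),
    # with add-one (Laplace) smoothing over the vocabulary.
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best, best_lp = None, -math.inf
    for label, lc in label_counts.items():
        wc = word_counts[label]
        denom = sum(wc.values()) + len(vocab)
        lp = math.log(lc / total)
        for t in tokens:
            lp += math.log((wc[t] + 1) / denom)
        if lp > best_lp:
            best, best_lp = label, lp
    return best
```

In the paper's pipeline this classifier would run after the preprocessing steps it describes (net-lingo expansion, duplicate removal, Bayes-based spelling correction).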
5. IEEE 2016: FRAppE: Detecting Malicious Facebook Applications
Abstract: With 20 million installs a day [1], third-party apps are a major reason for the
popularity and addictiveness of Facebook. Unfortunately, hackers have realized the potential
of using apps for spreading malware and spam. The problem is already significant, as we find
that at least 13% of apps in our dataset are malicious. So far, the research community has
focused on detecting malicious posts and campaigns. In this paper, we ask the question:
given a Facebook application, can we determine if it is malicious? Our key contribution is in
developing FRAppE—Facebook’s Rigorous Application Evaluator—arguably the first tool
focused on detecting malicious apps on Facebook. To develop FRAppE, we use information
gathered by observing the posting behavior of 111K Facebook apps seen across 2.2 million
users on Facebook. First, we identify a set of features that help us distinguish malicious apps
from benign ones. For example, we find that malicious apps often share names with other
apps, and they typically request fewer permissions than benign apps. Second, leveraging these
distinguishing features, we show that FRAppE can detect malicious apps with 99.5%
accuracy, with no false positives and a low false negative rate (4.1%). Finally, we explore the
ecosystem of malicious Facebook apps and identify mechanisms that these apps use to
propagate. Interestingly, we find that many apps collude and support each other; in our
dataset, we find 1,584 apps enabling the viral propagation of 3,723 other apps through their
posts. Long-term, we see FRAppE as a step towards creating an independent watchdog for
app assessment and ranking, so as to warn Facebook users before installing apps.
6. IEEE 2016: Practical Approximate k-Nearest Neighbor Queries with Location and
Query Privacy
Abstract: In mobile communication, spatial queries pose a serious threat to user location
privacy because the location of a query may reveal sensitive information about the mobile
user. In this paper, we study approximate k nearest neighbor (kNN) queries where the mobile
user queries the location-based service (LBS) provider about approximate k nearest points of
interest (POIs) on the basis of his current location. We propose a basic solution and a generic
solution for the mobile user to preserve his location and query privacy in approximate kNN
queries. The proposed solutions are mainly built on the Paillier public-key cryptosystem and
can provide both location and query privacy. To preserve query privacy, our basic solution
allows the mobile user to retrieve one type of POIs, for example, approximate k nearest car
parks, without revealing to the LBS provider what type of points is retrieved. Our generic
solution can be applied to multiple discrete type attributes of private location-based queries.
Compared with existing solutions for kNN queries with location privacy, our solution is more
efficient. Experiments have shown that our solution is practical for kNN queries.
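The Paillier cryptosystem on which the solutions build is additively homomorphic: multiplying two ciphertexts yields an encryption of the sum of the plaintexts, which is what allows the LBS provider to compute on queries it cannot read. A toy sketch with fixed small primes (illustration only, nowhere near a secure parameter size; names are ours):

```python
import math
import random

# Toy Paillier keypair with fixed tiny primes -- for illustration only.
P, Q = 293, 433
N = P * Q
N2 = N * N
LAM = math.lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)          # valid because we use g = N + 1

def encrypt(m):
    # c = (1 + N)^m * r^N mod N^2 for a random r coprime to N.
    r = random.randrange(1, N)
    while math.gcd(r, N) != 1:
        r = random.randrange(1, N)
    return (pow(N + 1, m, N2) * pow(r, N, N2)) % N2

def decrypt(c):
    # m = L(c^LAM mod N^2) * MU mod N, where L(x) = (x - 1) / N.
    return ((pow(c, LAM, N2) - 1) // N) * MU % N

def add_encrypted(c1, c2):
    # Homomorphic addition: multiply the ciphertexts.
    return (c1 * c2) % N2
```

Requires Python 3.9+ (`math.lcm`, modular inverse via three-argument `pow`).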
7. IEEE 2016: A Novel Pipeline Approach for Efficient Big Data Broadcasting
Abstract: Big-data computing is a new critical challenge for the ICT industry. Engineers and
researchers are dealing with data sets of petabyte scale in the cloud computing paradigm.
Thus, the demand for building a service stack to distribute, manage, and process
massive data sets has risen drastically. In this paper, we investigate
the Big Data Broadcasting problem for a single source node to broadcast a big chunk
of data to a set of nodes with the objective of minimizing the maximum completion time.
These nodes may locate in the same datacenter or across geo-distributed datacenters. This
problem is one of the fundamental problems in distributed computing and is known to be NP-
hard in heterogeneous environments. We model the Big-data broadcasting problem into a
LockStep Broadcast Tree (LSBT) problem. The main idea of the LSBT model is to define a
basic unit of upload bandwidth, r, such that a node with capacity c broadcasts data to a set of
⌊c/r⌋ children at the rate r. Note that r is a parameter to be optimized as part of the LSBT
problem. We further divide the broadcast data into m chunks. These data chunks can then
be broadcast down the LSBT in a pipeline manner. In a homogeneous network environment
in which each node has the same upload capacity c, we show that the optimal uplink rate r*
of LSBT is either c/2 or c/3, whichever gives the smaller maximum completion time. For
heterogeneous environments, we present an O(n log² n) algorithm to select an optimal uplink
rate r* and to construct an optimal LSBT. Numerical results show that our approach achieves
lower maximum completion time and lower computational complexity than
other efficient solutions in the literature.
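The homogeneous-case trade-off (r = c/2 vs. c/3) can be sketched numerically. The completion-time formula below is our assumed reading of the model: fanout ⌊c/r⌋, pipeline of m chunks, each chunk taking (data/m)/r per tree level:

```python
def lsbt_completion_time(n, c, r, data_size, m):
    # Fanout: a node with upload capacity c feeds floor(c/r) children
    # at rate r each.
    k = int(c // r)
    if k < 1:
        return float("inf")
    # Tree depth: smallest d with k^d >= n receivers (a chain if k == 1).
    if k == 1:
        depth = max(n - 1, 0)
    else:
        depth, reach = 0, 1
        while reach < n:
            reach *= k
            depth += 1
    # Pipeline of m chunks: the last chunk arrives after the pipeline
    # fills (depth steps) plus m - 1 further steps of (data_size/m)/r.
    return (depth + m - 1) * (data_size / m) / r

def best_uplink_rate(n, c, data_size, m):
    # The paper's homogeneous result: the optimum is c/2 or c/3,
    # whichever gives the smaller maximum completion time.
    return min((c / 2, c / 3),
               key=lambda r: lsbt_completion_time(n, c, r, data_size, m))
```

A larger r ships each chunk faster but supports fewer children (a deeper tree), which is exactly the tension the optimal r* balances.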
8. IEEE 2016: VoteTrust: Leveraging Friend Invitation Graph to Defend against
Social Network Sybils
Abstract: Online social networks (OSNs) suffer from the creation of fake accounts that
introduce fake product reviews, malware and spam. Existing defenses focus on using
the social graph structure to isolate fakes. However, our work shows that Sybils could
befriend a large number of real users, invalidating the assumption behind social-graph-based
detection. In this paper, we present VoteTrust, a scalable defense system that
further leverages user-level activities. VoteTrust models the friend invitation interactions
among users as a directed, signed graph, and uses two key mechanisms to detect Sybils over
the graph: a voting-based Sybil detection to find Sybils that users vote to reject, and a Sybil
community detection to find other colluding Sybils around identified Sybils. Through
evaluation on the Renren social network, we show that VoteTrust is able to prevent Sybils from
generating many unsolicited friend requests. We also deploy VoteTrust in Renren, and our
real experience demonstrates that VoteTrust can detect large-scale collusion among Sybils.
9. IEEE 2016: A Secure and Dynamic Multi-Keyword Ranked Search Scheme over
Encrypted Cloud Data
Abstract: Due to the increasing popularity of cloud computing, more and more data owners
are motivated to outsource their data to cloud servers for great convenience and reduced cost
in data management. However, sensitive data should be encrypted before outsourcing for
privacy requirements, which obsoletes data utilization like keyword-based document
retrieval. In this paper, we present a secure multi-keyword
ranked search scheme over encrypted cloud data, which simultaneously supports
dynamic update operations like deletion and insertion of documents. Specifically, the vector
space model and the widely-used TF x IDF model are combined in the index construction
and query generation. We construct a special tree-based index structure and propose a
“Greedy Depth-first Search” algorithm to provide efficient multi-keyword ranked search.
The secure kNN algorithm is utilized to encrypt the index and query vectors, and meanwhile
ensure accurate relevance score calculation between encrypted index and query vectors. In
order to resist statistical attacks, phantom terms are added to the index vector for
blinding search results. Due to the use of our special tree-based index structure, the
proposed scheme can achieve sub-linear search time and deal with the deletion and insertion
of documents flexibly. Extensive experiments are conducted to demonstrate the efficiency of
the proposed scheme.
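The plaintext core of the scheme, TF x IDF relevance scoring over an inverted index, can be sketched as follows; the encryption layer (secure kNN on the index and query vectors) is omitted, and the function names are ours:

```python
import math
from collections import Counter

def build_index(docs):
    # docs: {doc_id: token list}. Compute per-document term frequencies
    # and a global inverse-document-frequency table.
    tf = {d: Counter(toks) for d, toks in docs.items()}
    n = len(docs)
    df = Counter()
    for counts in tf.values():
        df.update(counts.keys())
    idf = {t: math.log(n / df[t]) for t in df}
    return tf, idf

def ranked_search(index, query_tokens, k=3):
    # Score each document by the sum of TF x IDF over the query terms,
    # then return the k best-scoring documents.
    tf, idf = index
    scores = {d: sum(c[t] * idf.get(t, 0.0) for t in query_tokens)
              for d, c in tf.items()}
    top = sorted(scores, key=lambda d: -scores[d])
    return [d for d in top if scores[d] > 0][:k]
```

In the actual scheme these vectors live inside an encrypted tree index, so the same relevance ordering is obtained without the server seeing terms or frequencies.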
10. IEEE 2016: SmartCrawler: A Two-Stage Crawler for Efficiently Harvesting Deep-
Web Interfaces
Abstract: As the deep web grows at a very fast pace, there has been increased interest in
techniques that help efficiently locate deep-web interfaces. However, due to the large volume
of web resources and the dynamic nature of deep web, achieving wide coverage and high
efficiency is a challenging issue. We propose a two-stage framework, namely SmartCrawler,
for efficiently harvesting deep-web interfaces. In the first stage, SmartCrawler performs site-
based searching for center pages with the help of search engines, avoiding visiting a large
number of pages. To achieve more accurate results for a focused crawl, SmartCrawler ranks
websites to prioritize highly relevant ones for a given topic. In the second
stage, SmartCrawler achieves fast in-site searching by excavating most relevant links with an
adaptive link-ranking. To eliminate bias on visiting some highly relevant links in
hidden web directories, we design a link tree data structure to achieve wider coverage for a
website. Our experimental results on a set of representative domains show the agility and
accuracy of our proposed crawler framework, which efficiently retrieves deep-
web interfaces from large-scale sites and achieves higher harvest rates than other crawlers.
11. IEEE 2016: FiDoop: Parallel Mining of Frequent Itemsets Using MapReduce
Abstract: Existing parallel mining algorithms for frequent itemsets lack a mechanism that
enables automatic parallelization, load balancing, data distribution, and fault tolerance on
large clusters. As a solution to this problem, we design
a parallel frequent itemsets mining algorithm called FiDoop using the
MapReduce programming model. To achieve compressed storage and avoid building
conditional pattern bases, FiDoop incorporates the frequent items ultrametric tree, rather than
conventional FP trees. In FiDoop, three MapReduce jobs are implemented to complete
the mining task. In the crucial third MapReduce job, the mappers independently
decompose itemsets, the reducers perform combination operations by constructing small
ultrametric trees, and the actual mining of these trees is performed separately. We implement FiDoop on our
in-house Hadoop cluster. We show that FiDoop on the cluster is sensitive to data distribution
and dimensions, because itemsets with different lengths have different decomposition and
construction costs. To improve FiDoop's performance, we develop a workload balance metric
to measure load balance across the cluster's computing nodes. We develop FiDoop-HD, an
extension of FiDoop, to speed up the mining performance for high-dimensional data analysis.
Extensive experiments using real-world celestial spectral data demonstrate that our proposed
solution is efficient and scalable.
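The MapReduce shape of frequent-itemset counting can be sketched in miniature: mappers emit candidate itemsets per transaction, reducers sum the counts and apply the support threshold. This toy (pairs only, single process) shows the data flow, not FiDoop's ultrametric-tree machinery; names are ours:

```python
from collections import Counter
from itertools import combinations

def map_phase(transaction):
    # Mapper: emit every 2-item subset of a transaction with count 1.
    return [(pair, 1)
            for pair in combinations(sorted(set(transaction)), 2)]

def reduce_phase(mapped, min_support):
    # Reducer: sum counts per itemset and keep the frequent ones.
    counts = Counter()
    for pairs in mapped:
        for key, v in pairs:
            counts[key] += v
    return {k: c for k, c in counts.items() if c >= min_support}

def frequent_pairs(transactions, min_support=2):
    return reduce_phase((map_phase(t) for t in transactions),
                        min_support)
```

On a real Hadoop cluster the shuffle between the two phases groups identical itemset keys onto the same reducer, which is where FiDoop's load-balancing concerns arise.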
12. IEEE 2015: Discover the Expert: Context-Adaptive Expert Selection for Medical
Diagnosis
Abstract: In this paper, we propose an expert selection system that learns online the
best expert to assign to each patient depending on the context of the patient. In general,
the context can include an enormous number and variety of information related to the
patient's health condition, age, gender, previous drug doses, and so forth, but the most
relevant information is embedded in only a few contexts. If these most relevant contexts were
known in advance, learning would be relatively simple but they are not. Moreover, the
relevant contexts may be different for different health conditions. To address these
challenges, we develop a new class of algorithms aimed at discovering the most relevant
contexts and the best clinic and expert to use to make a diagnosis given a patient's contexts.
We prove that as the number of patients grows, the proposed context-adaptive algorithm
will discover the optimal expert to select for patients with a specific context. Moreover, the
algorithm also provides confidence bounds on the diagnostic accuracy of the expert it selects,
which can be considered by the primary care physician before making the final decision.
While our algorithm is general and can be applied in numerous medical scenarios, we
illustrate its functionality and performance by applying it to a real-world breast
cancer diagnosis data set. Finally, while the application we consider in this paper
is medical diagnosis, our proposed algorithm can be applied in other environments where
expertise needs to be discovered.
13. IEEE 2015: Active Learning for Ranking through Expected Loss Optimization
Abstract: Learning to rank arises in many data mining applications, ranging from web
search engines and online advertising to recommendation systems. In learning to rank, the
performance of a ranking model is strongly affected by the number of labeled examples in
the training set; on the other hand, obtaining labeled examples for training data is very
expensive and time-consuming. This presents a great need for the active learning approaches
to select most informative examples for ranking learning; however, in the literature there is
still very limited work to address active learning for ranking. In this paper, we propose a
general active learning framework, expected loss optimization (ELO), for ranking. The ELO
framework is applicable to a wide range of ranking functions. Under this framework, we
derive a novel algorithm, expected discounted cumulative gain
(DCG) loss optimization (ELO-DCG), to select most informative examples. Then, we
investigate both query and document level active learning for ranking and propose a two-stage
ELO-DCG algorithm which incorporates both query and document selection into
active learning. Furthermore, we show that it is flexible for the algorithm to deal with the
skewed grade distribution problem with the modification of the loss function. Extensive
experiments on real-world web search data sets have demonstrated great potential and
effectiveness of the proposed framework and algorithms.
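DCG itself, the quantity whose expected loss ELO-DCG optimizes, is short to state in code. The second function is only a simple stand-in for the expected-loss idea (expected DCG shortfall of a predicted ranking versus the ideal one); both names are ours:

```python
import math

def dcg(relevances, k=None):
    # Discounted cumulative gain: gain (2^rel - 1) at rank i is
    # discounted by log2(i + 2), i.e. log2 of the 1-based position + 1.
    rels = relevances[:k] if k else relevances
    return sum((2 ** r - 1) / math.log2(i + 2)
               for i, r in enumerate(rels))

def expected_dcg_loss(pred_order, rel_probs):
    # Expected DCG shortfall of a predicted ranking, using expected
    # gains, versus ranking items by expected gain (the ideal order).
    exp_rel = [rel_probs[d] for d in pred_order]
    ideal = sorted(rel_probs.values(), reverse=True)
    exp_dcg = sum(g / math.log2(i + 2) for i, g in enumerate(exp_rel))
    ideal_dcg = sum(g / math.log2(i + 2) for i, g in enumerate(ideal))
    return ideal_dcg - exp_dcg
```

Under the active-learning view, examples whose labels would most reduce this expected loss are the most informative ones to send for annotation.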
14. IEEE 2015: k-Nearest Neighbor Classification over Semantically Secure Encrypted
Relational Data
Abstract: Data mining has wide applications in many areas such as banking, medicine,
scientific research and government agencies. Classification is one of the commonly
used tasks in data mining applications. For the past decade, due to the rise of various privacy
issues, many theoretical and practical solutions to the classification problem have been
proposed under different security models. However, with the recent popularity of cloud
computing, users now have the opportunity to outsource their data, in encrypted form, as well
as the data mining tasks to the cloud. Since the data on the cloud is in encrypted form,
existing privacy-preserving classification techniques are not applicable. In this paper, we
focus on solving the classification problem over encrypted data. In particular, we propose a
secure k-NN classifier over encrypted data in the cloud. The proposed protocol protects the
confidentiality of data, privacy of user's input query, and hides the data access patterns. To
the best of our knowledge, our work is the first to develop a secure k-NN classifier
over encrypted data under the semi-honest model. Also, we empirically analyze the
efficiency of our proposed protocol using a real-world dataset under different parameter
settings.
15. IEEE 2015: Generating Searchable Public-Key Ciphertexts With Hidden
Structures for Fast Keyword Search
Abstract: Existing semantically secure public-key searchable encryption schemes
take search time linear in the total number of ciphertexts. This makes retrieval from
large-scale databases prohibitive. To alleviate this problem, this paper
proposes searchable public-key ciphertexts with hidden structures (SPCHS)
for keyword search as fast as possible without sacrificing semantic security of the encrypted
keywords. In SPCHS, all keyword-searchable ciphertexts are structured by hidden relations,
and with the search trapdoor corresponding to a keyword, the minimum information of the
relations is disclosed to a search algorithm as the guidance to find all
matching ciphertexts efficiently. We construct an SPCHS scheme from scratch in which
the ciphertexts have a hidden star-like structure. We prove our scheme to be semantically
secure in the random oracle (RO) model. The search complexity of our scheme is dependent
on the actual number of the ciphertexts containing the queried keyword, rather than the
number of all ciphertexts. Finally, we present a generic SPCHS construction from
anonymous identity-based encryption and collision-free full-identity malleable identity-
based key encapsulation mechanism (IBKEM) with anonymity. We illustrate two collision-
free full-identity malleable IBKEM instances, which are semantically secure and anonymous,
respectively, in the RO and standard models. The latter instance enables us to construct an
SPCHS scheme with semantic security in the standard model.
16. IEEE 2015: Research Directions for Engineering Big Data Analytics Software
Abstract: Many software startups and research and development efforts are actively trying to
harness the power of big data and create software with the potential to improve almost every
aspect of human life. As these efforts continue to increase, full consideration needs to be
given to the engineering aspects of big data software. Since these systems exist to make
predictions on complex and continuous massive datasets, they pose unique problems during
specification, design, and verification of software that needs to be delivered on time and
within budget. But, given the nature of big data software, can this be done?
Does big data software engineering really work? This article explores the details of big data
software, discusses the main problems encountered when engineering big data software, and
proposes avenues for future research.
17. IEEE 2015: Co-Extracting Opinion Targets and Opinion Words from Online
Reviews Based on the Word Alignment Model
Abstract: Mining opinion targets and opinion words from online reviews are important tasks
for fine-grained opinion mining, the key component of which involves
detecting opinion relations among words. To this end, this paper proposes a novel
approach based on the partially-supervised alignment model, which regards
identifying opinion relations as an alignment process. Then, a graph-based co-ranking
algorithm is exploited to estimate the confidence of each candidate. Finally, candidates with
higher confidence are extracted as opinion targets or opinion words. Compared to previous
methods based on the nearest-neighbor rules, our model captures opinion relations more
precisely, especially for long-span relations. Compared to syntax-based methods,
our word alignment model effectively alleviates the negative effects of parsing errors when
dealing with informal online texts. In particular, compared to the traditional
unsupervised alignment model, the proposed model obtains better precision because of the
usage of partial supervision. In addition, when estimating candidate confidence, we penalize
higher-degree vertices in our graph-based co-ranking algorithm to decrease the probability of
error generation. Our experimental results on three corpora with different sizes and languages
show that our approach effectively outperforms state-of-the-art methods.
18. IEEE 2015: Constructing a Global Social Service Network for Better Quality of
Web Service Discovery
Abstract: Web services have had a tremendous impact on the Web for supporting a
distributed service-based economy on a global scale. However, despite the outstanding
progress, their uptake on a Web scale has been significantly less than initially anticipated.
The isolation of services and the lack of social relationships among related services have
been identified as reasons for the poor uptake. In this paper, we propose connecting the
isolated service islands into a global social service network to enhance the services'
sociability on a global scale. First, we propose linked social service-specific principles based
on linked data principles for publishing services on the open Web as linked social services.
Then, we suggest a new framework
for constructing the global social service network following linked social service-specific
principles based on complex network theories. Next, an approach is proposed to enable the
exploitation of the global social service network, providing Linked Social Services as a
Service. Finally, experimental results show that our approach can solve
the quality of service discovery problem, improving both the service discovery time and the
success rate by exploring service-to-service links over the global social service network.
19. IEEE 2015: Privacy-Preserving Detection of Sensitive Data Exposure
Abstract: Statistics from security firms, research institutions and government organizations
show that the numbers of data-leak instances have grown rapidly in recent years. Among
various data-leak cases, human mistakes are one of the main causes of data loss. There exist
solutions detecting inadvertent sensitive data leaks caused by human mistakes and to provide
alerts for organizations. A common approach is to screen content in storage and transmission
for exposed sensitive information. Such an approach usually requires the detection operation
to be conducted in secrecy. However, this secrecy requirement is challenging to satisfy in
practice, as detection servers may be compromised or outsourced. In this paper, we present
a privacy-preserving data-leak detection (DLD) solution to solve the issue where a special set
of sensitive data digests is used in detection. The advantage of our method is that it enables
the data owner to safely delegate the detection operation to a semi-honest provider without
revealing the sensitive data to the provider. We describe how Internet service providers can
offer their customers DLD as an add-on service with strong privacy guarantees. The
evaluation results show that our method can support accurate detection with a very small
number of false alarms under various data-leak scenarios.
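The digest idea can be sketched directly: the data owner fingerprints sensitive data as hashes of its n-character shingles, and the detection provider screens traffic against those digests without ever seeing the plaintext secrets. A minimal sketch (shingle size, scoring, and names are our choices, not the paper's construction):

```python
import hashlib

def shingle_digests(text, n=8):
    # Fingerprint sensitive data as the set of SHA-256 hashes of its
    # n-character shingles; only these digests leave the data owner.
    return {hashlib.sha256(text[i:i + n].encode()).hexdigest()
            for i in range(len(text) - n + 1)}

def leak_score(content, digests, n=8):
    # Fraction of the content's shingles that hit a sensitive digest,
    # computable by the provider without the plaintext secrets.
    if len(content) < n:
        return 0.0
    hits = total = 0
    for i in range(len(content) - n + 1):
        total += 1
        h = hashlib.sha256(content[i:i + n].encode()).hexdigest()
        if h in digests:
            hits += 1
    return hits / total
```

The paper's actual scheme hardens this basic shape so that even a compromised detection server learns little about the sensitive data from the digests.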
20. IEEE 2015: Friendbook: A Semantic-Based Friend Recommendation System for
Social Networks
Abstract: Existing social networking services recommend friends to users based on
their social graphs, which may not be the most appropriate to reflect a user's preferences
on friend selection in real life. In this paper, we present Friendbook, a novel semantic-
based friend recommendation system for social networks, which recommends friends to
users based on their life styles instead of social graphs. By taking advantage of sensor-rich
smartphones, Friendbook discovers life styles of users from user-centric sensor data,
measures the similarity of life styles between users, and recommends friends to users if their
life styles have high similarity. Inspired by text mining, we model a user's daily life as life
documents, from which his/her life styles are extracted by using the Latent Dirichlet
Allocation algorithm. We further propose a similarity metric to measure the similarity of life
styles between users, and calculate users' impact in terms of life styles with a friend-matching
graph. Upon receiving a request, Friendbook returns a list of people with
highest recommendation scores to the query user. Finally, Friendbook integrates a feedback
mechanism to further improve the recommendation accuracy. We have
implemented Friendbook on Android-based smartphones, and evaluated its performance
on both small-scale experiments and large-scale simulations. The results show that the
recommendations accurately reflect the preferences of users in choosing friends.
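Once each user has a lifestyle-topic vector (extracted by LDA in the paper), the matching step reduces to vector similarity. A minimal cosine-similarity sketch, with names, the threshold, and the toy profiles as our assumptions:

```python
import math

def cosine(u, v):
    # Cosine similarity of two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def recommend_friends(user, profiles, k=2, min_sim=0.5):
    # profiles: {name: lifestyle-topic vector}. Return up to k users
    # whose lifestyle vectors are most similar to `user`'s.
    me = profiles[user]
    others = [(cosine(me, vec), name)
              for name, vec in profiles.items() if name != user]
    others.sort(reverse=True)
    return [name for sim, name in others[:k] if sim >= min_sim]
```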
21. IEEE 2015: Secure Distributed Deduplication Systems with Improved Reliability
Abstract: Data deduplication is a technique for eliminating duplicate copies of data, and has
been widely used in cloud storage to reduce storage space and upload bandwidth. However,
there is only one copy of each file stored in the cloud even if such a file is owned by a huge
number of users. As a result, deduplication improves storage utilization while
reducing reliability. Furthermore, the challenge of privacy for sensitive data also arises when
they are outsourced by users to the cloud. Aiming to address the above security challenges, this
paper makes the first attempt to formalize the notion of distributed reliable
deduplication system. We propose new distributed deduplication systems with
higher reliability in which the data chunks are distributed across multiple cloud servers. The
security requirements of data confidentiality and tag consistency are also achieved by
introducing a deterministic secret sharing scheme in distributed storage systems, instead of
using convergent encryption as in previous deduplication systems. Security analysis
demonstrates that our deduplication systems are secure in terms of the definitions specified in
the proposed security model. As a proof of concept, we implement the proposed systems and
demonstrate that the incurred overhead is very limited in realistic environments.
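The "distribute chunks as shares across servers" idea can be illustrated with the simplest possible (n, n) XOR secret-sharing scheme: any n - 1 shares look random, and all n together recover the chunk. The paper uses a richer deterministic scheme with reliability thresholds; this sketch and its names are ours:

```python
import secrets
from functools import reduce

def _xor(a, b):
    # Bytewise XOR of two equal-length byte strings.
    return bytes(x ^ y for x, y in zip(a, b))

def make_shares(chunk, n):
    # (n, n) XOR sharing: n - 1 uniformly random shares, plus one
    # share that XORs with them back to the chunk. Any subset of
    # fewer than n shares is uniformly random on its own.
    rand = [secrets.token_bytes(len(chunk)) for _ in range(n - 1)]
    return rand + [reduce(_xor, rand, chunk)]

def recover(shares):
    # XOR all n shares together to reconstruct the chunk.
    return reduce(_xor, shares)
```

A threshold scheme (e.g. Shamir's) would additionally tolerate lost shares, which is closer to the reliability goal the paper targets.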