SlideShare une entreprise Scribd logo
1  sur  5
Télécharger pour lire hors ligne
Pratiksha Sapkal et al Int. Journal of Engineering Research and Applications
ISSN : 2248-9622, Vol. 4, Issue 2( Version 1), February 2014, pp.623-627

RESEARCH ARTICLE

www.ijera.com

OPEN ACCESS

Hiding Sensitive items using MDSRRC to Maintain Privacy in
Database
Pratiksha Sapkal, Minakshi Panchal, Manisha Pol, Madhumita Mane
1,2,3,4,

(Department of Information Technology, Pune University, India)

ABSTRACT
This paper focuses on the research in hiding sensitive association rules to maintain privacy and data quality in
database. In this paper we have proposed heuristic based algorithm named MDSRRC (Modified Decrease
Support of R.H.S. item of Rule Clusters) to hide the sensitive association rules with multiple items in consequent
(R.H.S) and antecedent (L.H.S). The algorithm selects the items and transactions based on certain criteria which
modify transactions to hide the sensitive information. Our evaluation shows that the proposed technique is highly
valuable and maintains dataset quality.
Keywords-Association rules, Privacy preserving data mining (PPDM), MDSRRC
contain some sensitive information which are on the
Right hand side or left hand side of the rule, so that
I.
INTRODUCTION
rules containing sensitive item can‟t be reveal. Our
Mining association rule techniques are
approached is based on modifying the database in a
wide employed in data mining to and relationship
way that confidence of the association rule can
between item sets. The corporate and many
bereduce with the help increase or decrease the
government organizations reveals their data or
support value of R.H.S or L.H.S correspondingly.
information for mutual benefit to search out some
As the confidence of the rule is reduce below a
useful data for some decision making purpose and
given threshold, it is hidden or we can say it will not
improve their business schemes. But this database
be disclosed.
may contain some confidential information and
DSRRC is not able to hide association rules with
which the organization does not need to reveal.
multiple items in the antecedent (L.H.S) and
consequent (R.H.S). To overcome this limitation,
The problem of concealment plays important role
the work proposed by Nikunj H. Domadiya and
once the corporate share their data for mutual profit
Udai Pratap Rao elt al. [1] is the improved version
however there is no one need to leak their private
of DSRRC i.e. MDSRRC, which uses count of
data. So before revealing the information, sensitive
things in consequent of the sensitive rules. It
patterns should be hidden and to resolve this issue
modifies the minimum number of transactions to
PPDM (Privacy preserving data mining) techniques
cover most sensitive rules and maintain data quality.
are helpful to boost the safety of database. These
approaches have in general the advantage to require
a minimum amount of input (usually the database,
II.
RELATED WORK
the information to protect and few other parameters)
There have been several methods proposed
and then a low effort is required to the user in order
for hiding sensitive patterns in dataset. In 1999, M.
to apply them. The selection of rules would require
Attalah and E. bernito[13] proposed the idea of
data mining process to beexecuted first. For
Disclosure limitation of sensitive rules. It discusses
association rules hiding, two basic approaches have
security risks on the database when reveals it in
been proposed. The first approach hides one rule at
public. They introduce algorithm for hiding sensitive
a time.
items with little impact on database. V.S. Verykios et
First selects transactions that contain the items in a
al, A. K. Elmagarmid et al. [12] present five different
give rule. It then tries to modify the transaction by
algorithm to hide the sensitive rules. These algorithm
transaction until the confidence or support of the
use hiding strategies which are based on decrease
rule fall below minimum confidence or minimum
support and confidence of the sensitive rule.
support. The modification is done by either
Furthermore, C. N. Modi and U. P. Rao et al.[3],
removing the items from the transaction or inserting
presented the algorithm for Maintaining privacy and
new items to the transactions. The second approach
data quality in privacy preserving association rule
deals with groups of restricted patterns or
mining. It improves the quality of DSRRC. Next,
association rules at a time. In our work we are
Motivation example shows importance of sensitive
concern of hiding certain association rules which
patterns in business applications.
www.ijera.com

623 | P a g e
Pratiksha Sapkal et al Int. Journal of Engineering Research and Applications
ISSN : 2248-9622, Vol. 4, Issue 2( Version 1), February 2014, pp.623-627
Let an Mobile store purchase mobiles from two
companies X and Y. Now X applies data mining
techniques and mines association rules applied to
related to B‟s product. X found that most of the
customers who buy mobile of Y also buy camera.
Now X offers some discount on camera if customer
purchases X‟s mobile. As a result the business of Y
goes down. Therefore discharging the database with
sensitive information cause the problem. This
scheme provides the order on sensitive rules hiding
in the database.
The proposed algorithm customizes least possible
number of transactions to hide supreme sensitive
rules and preserving data quality.
III.
PROBLEM STATEMENT
Association rule activity problem is defined
as: converting the original database into sanitized
database so that data mining techniques will not be
ready to mine sensitive rules from the database while
all non sensitive rules remain visible. Given
transactional database D,Minimum confidence,
Minimum support, and generated set of association
rules R from D, a subset SR of R as sensitive rules,
which database owner want to hide. Problem is to
and the sanitized database D' such that when mining
technique is applied on the D', all sensitive rules in
set SR will be hidden while all non sensitive rules
can be mined. The aim of association rule hiding is
to satisfy the following conditions:


It must not generate any new rules which are
not present in the database.

 Applying Apriori
This algorithm starts with mining the association
rule from the original dataset using association rule
mining algorithm. Apriori is an influential algorithm
for generating association rules. Let I={i1…..in} be
distinct literals called items. Given a database
D={T1.....Tm}is a set of transaction where each
transaction T is a set of items as TI(1≤ i≤m).The
association rule is define as XY,where YI ,XI
and YX=.X is called rule‟s antecedent(L.H.S)
and
Y
is
called
rule‟s
consequent
(R.H.S).Association rules generation is usually split
up into two separate steps.
1.

Minimum support is applied to find
frequentitems sets in a dataset. The support of
rule X→Y is calculated using the following
formula: Support(X→Y) =|XY|/|D|, where |D|
define the total number of the transactions in
the database D and |XY| is the number of
transactions which support item set XY. A rule
X→Y is mined from database if support
(X→Y) ≥ MST (minimum support threshold).

2.

Secondly, comparing those frequent item sets
with the minimum confidence constraints to
form mining association rules. The confidence
of rule iscalculated using following formula:
Confidence (X→Y) =|XY| / |X|, where |X| is
number of transactions which supportitem set
X. A rule X→Y is mined from database if
confidence(X→Y)≥MCT.

Sanitized database must facilitate mining of
all non-sensitive rules.



items present in the transaction and 0 for the items in
the transaction.

Database must not disclose any sensitive
rules.



www.ijera.com

3.

The Proposed algorithm named MDSRRC hides
sensitive association rules with fewer modifications
on database to maintain data quality and to reduce
the side effect of database. In this paper we have
apply the MDSRRC algorithm for the Gojee
transaction dataset

IV.

METHODOLOGY

The proposed methodology is categorized
into five major modules viz; Binarization, Apriori,
Sensitive rules generation, Rule Hiding and sanitized
database creation.
The Fig1 shows general architecture of the
proposed method.
 Binarization
We have integrated a pre-binarization step in order
to enhance the input dataset quality. Let put 1 to the
www.ijera.com

While the second step is straight forward, the
first step needs more attention. Finding all
frequent item sets in a database is difficult since
it involves finding all item sets. The standard
definition of association rule is given in [4].
 Sensitive Rule Generation :
The user specifies some rules as sensitivity as
sensitive rules (SR) from the rules generated by the
association rule mining algorithm. Then algorithm
counts occurrences of each item in R.H.S of
sensitive rules (SR) from the rules generated by the
association rule mining algorithm. The algorithm
counts occurrence of each item in R.H.S of
sensitivity rules. The algorithm finds Is= (is0,
is1…….isk)
k<_n, by arranging those items in
decreasing order of their counts. After that
sensitivity of each item is calculated. Occurrence of
each item in R.H.S of sensitivity rules. The
algorithm finds Is= (is0, is1…….isk)k<_n, by arranging
those items in decreasing order of their counts. After
that sensitivity of each item is calculated. Then
624 | P a g e
Pratiksha Sapkal et al Int. Journal of Engineering Research and Applications
ISSN : 2248-9622, Vol. 4, Issue 2( Version 1), February 2014, pp.623-627
transactions which support is 0 are sorted in
descending order of their sensitivities.
The algorithm counts occurrence of each item in
R.H.S of sensitivity rules. The algorithm finds Is=
(is0, is1…….isk) k<_n, by arranging those items in
decreasing order of their counts. After that
sensitivity of each item is calculated. Occurrence of
each item in R.H.S of sensitivity rules. The
algorithm finds Is= (is0, is1…….isk)k<_n, by arranging
those items in decreasing order of their counts. After
that sensitivity of each item is calculated. Then
transactions which support is 0 are sorted in
descending order of their sensitivities.
 Rule Hiding
This paper uses heuristic based approaches for
hiding the sensitive rule. The data distortion
mechanism changes the item value by a new value
in dataset. It alters „0‟ to „1‟ or „1‟ to „0‟ for selected
items in selected transactions to decrease the
confidence, by decreasing or increasing support of
items in sensitive rules. Sensitivity of Item is the
number of sensitivity rules which contain this item.
The sensitivity of transactions is the total of
sensitivities of all sensitive items which are
presented in this transaction.

www.ijera.com

Input Dataset

Binarization

Association rule
generation

Sensitive rule generation

Rule Hiding

Sanitized data
The Rule hiding process starts by selecting first
transaction form the sorted transaction with higher
sensitivity; delete item is0 from that transaction.
Then update support and confidence of all sensitive
rules and if any rules have support and confidence
below MST and MCT respectively then delete it
from SR. Finally update sensitivity of each item,
transaction and Is. Again select transaction with
higher sensitivity and delete is0 from it. This
process continues until all sensitive rules are hidden.
 Sanitized Database Generation
The modified transactions are updated in the
original database and new database is generated
which is called sanitized database D‟, which
preserves the privacy of sensitive information and
maintains database quality.

www.ijera.com

Fig1 .General Architecture

V. Example
To understand MDSRRC following
example is illustrated In Table I transactional
database D is shown. With 3 as MST and 40% as
MCT, The possible generated association rules by
Apriori
algorithm[4]
are:FishCrabSalad,FishSaladCrab,CrabFishS
alad,SaladFishCrab,Fishcrab,FishSalad,Crab
Fish,SaladFish,CrabSaladFish.Let
the
Database
Owner
Specify
CrabFishSalad,SaladFishCrab as sensitive
rules. The sensitivity of Crab=2, Salad=2, Fish=2.
Transaction with itssensitivity is shown in Table
II. Now algorithm findsfrequency of each item
presents in R.H.S of sensitive rules.Here frequency
of Crab=1,Salad=1,Fish=2 IS={Fish,Salad,Crab}.
In this example item „Fish‟ is selected as is0. Then it
sorts the transactionswhich supports is0 in
descending order of their sensitivity. Then Select
transaction with highest sensitivity and delete is0
item from that transaction. Update confidenceand
support of all the sensitive rules. Table III show
modified database D1after first deletion of item
from first transaction. Now update sensitivity of
each item. Updated count of eachitem for IS .Sort
transactions which support is0and delete the is0 from
transaction with highest sensitivity, then delete the
is0 from transaction with highest sensitivity. Now
625 | P a g e
Pratiksha Sapkal et al Int. Journal of Engineering Research and Applications
ISSN : 2248-9622, Vol. 4, Issue 2( Version 1), February 2014, pp.623-627
all sensitiverules are hidden. Final sanitize database
is shown in table III.
Table I. Transaction Database
TID

Items

TRN001
TRN002
TRN003
TRN004
TRN005
TRN006
TRN007
TRN008
TRN009
TRN010
TRN011
TRN012
TRN013

Tofutti,fish,steak
Steak,beef
Fish,steak
Fish,monkfish,potato
Fish,steak,halibut
Fish,crab,salad
Fish,potato,crab
Fish,salad
Fish,potato,crab,salad
Tofutti,fish,crab
Tofutti,steak,salad,shell
Tofutti,fish,crab,salad
Tofutti,steak,beef,potato

Binary matrix of
Items
1110000000
0011000000
0110000000
0100110000
0110001000
0100000110
0100010100
0100000010
0100010110
1100000100
1010000011
1100000110
1011010000

Table II.Transaction with Sensitivity
TID
Sensitivity
TRN001
2
TRN002
0
TRN003
2
TRN004
2
TRN005
2
TRN006
6
TRN007
4
TRN008
4
TRN009
6
TRN010
4
TRN011
2
TRN012
6
TRN013
0

www.ijera.com

evaluation by considering the performance
parameters which are given in [12] viz.
(a) HF (hiding failure): It is the percentage of the
sensitive data that remain exposed in the
sanitized dataset.
(b) MC (misses cost): It is the percentage of the
non-sensitive data that are hidden as a sideeffect of the sanitization process.
(c) AP (artificial patterns): It is the percentage of
the discovered patterns that are artifacts.
(d) DISS (dissimilarity): It is the difference
between the original and the sanitized datasets.
(e) SEF (side effect factor): It is the amount of
non-sensitive association rules that are
removed as an effect of the sanitization
process.
Experimental results show that MDSRRC
increase efficiency and reduce modification of
transactions in database. Performance comparison of
MDSRRC with algorithm DSRRC is given in Table
VI.
Table VI. Performance result

Table III.Sanitized Database D‟
TID
TRN001
TRN002

Items
Tofutti,steak
Steak,beef

TRN003
TRN004
TRN005
TRN006
TRN007
TRN008
TRN009
TRN010
TRN011
TRN012
TRN013

Fish,steak
Fish,monkfish,potato
Fish,steak,halibut
Fish,crab,salad
Fish,potato,crab
Fish,salad
Fish,potato,crab,salad
Tofutti,fish,crab
Tofutti,steak,salad,shell
Tofutti,fish,crab,salad
Tofutti,steak,beef,potato

40%
30%

VI. EXPERIMENTAL RESULT AND ANALYSIS
OF PROPOSED ALGORITHM

We compare MDSRRC algorithm with
DSRRC algorithm. These algorithms are used to
hide the Sensitive rules on database. After applying
bothAlgorithms on sample database we have done
www.ijera.com

20%
10%

DSRRC

0%

MDSRRC

Fig. 2. Performance Comparison between DSRRC,
MDSRRC

VI.CONCLUSION
The purpose of the Association rule hiding
techniques for privacy preserving data mining is to
hide certain crucial information so they cannot
discovered through association rule. In this paper,
we proposed an algorithm named MDSRRC which
hides sensitive association rules with fewer
626 | P a g e
Pratiksha Sapkal et al Int. Journal of Engineering Research and Applications
ISSN : 2248-9622, Vol. 4, Issue 2( Version 1), February 2014, pp.623-627
modifications on database to maintain data quality
and to reduce the side effect of database.
Functionality of proposed algorithm is shown using
sample database with three sensitive rules.
Experimental results show that proposed algorithm
works better than DSRRC. So MDSRRC hide
sensitive rules with minimum modifications on
database and maintain data quality.MDSRRC
algorithm can be extended to increase the efficiency
and reduce the side effects by minimizing the
modifications on database.

REFERENCES
[1]

Nikunj H. Domadiya and Udai Pratap Rao,
“Hiding Sensitive Association Rules to
Maintain Privacy and Data Quality in
Database”3rd IEEE International Advance
Computing Conference (IACC) 2013.
[2] V. S. Verykios, A. K. Elmagarmid, E.
Bertino, Y. Saygin, and E. Dasseni,
“Association
rule
hiding,”
IEEE
Transactions on Knowledge and Data
Engineering, vol. 16, pp. 434–447, 2004.
[3] C. N. Modi, U. P. Rao, and D. R. Patel,
“Maintaining privacy and data quality in
privacy preserving association rule mining,”
2010 Second International conference on
Computing, Communication and Networking
Technologies, pp. 1–6, Jul. 2010.
[4] J. Han, Data Mining: Concepts and
Techniques. San Francisco, CA, USA:
Morgan Kaufmann Publishers Inc., 2005.
[5] Y.-H. Wu, C.-M. Chiang, and A. L. Chen,
“Hiding sensitive association rules with
limited side effects,” IEEE Transactions on
Knowledge and Data Engineering, vol. 19,
pp. 29–42, 2007.
[6] S.-L. Wang, B. Parikh, and A. Jafari,
“Hiding informative association rule sets,”
Expert Systems with Applications, vol. 33,
no. 2, pp. 316 – 323, 2007.
[7] S.-L.Wang, D. Patel, A. Jafari, and T.-P.
Hong,
“Hiding
collaborative
recommendation association rules,” Applied
Intelligence, vol. 27, pp. 67–77, 2007.
[8] D.F. Gleich and L.-H. Lim.
Rank
aggregation via nuclear norm minimization.
In KDD, 2011.
[9] Y. Saygin, V. S. Verykios, and A. K.
Elmagarmid,
“Privacy
preserving
association rule mining.” in RIDE. IEEE
Computer Society, 2002, pp 151–158.
[10] C. N. Modi, U. P. Rao, and D. R. Patel, “An
Efficient Solution for Privacy Preserving
Association Rule
Mining,”
(IJCNS)
International Journal of Computer and
Network Security, vol. 2, no. 5, pp. 79–85,
2010.
www.ijera.com

www.ijera.com

[11] Wu and H. Wang, “Research on the privacy
preserving algorithm of association rule
mining in centralized database,” in
Proceedings of the 2008 International
Symposiums on Information Processing, ser.
ISIP’08. Washington, DC, USA: IEEE
Computer Society, 2008, pp. 131–134
[12] V. Verykios and A. Gkoulalas-Divanis, A
Survey of Association Rule Hiding Methods
for Privacy, ser. Advances in Database
Systems, C.
[13] M. Atallah, A. Elmagarmid, M. Ibrahim, E.
Bertino, and V. Verykios, “Disclosure
limitation of sensitive rules,” in Proceedings
of the 1999 Workshop on Knowledge and
Data Engineering Exchange, ser. KDEX
’99. Washington, DC, USA: IEEE Computer
Society, 1999, pp. 45–52.

627 | P a g e

Contenu connexe

Tendances

Performance Analysis of Hybrid Approach for Privacy Preserving in Data Mining
Performance Analysis of Hybrid Approach for Privacy Preserving in Data MiningPerformance Analysis of Hybrid Approach for Privacy Preserving in Data Mining
Performance Analysis of Hybrid Approach for Privacy Preserving in Data Miningidescitation
 
A Survey on Features and Techniques Description for Privacy of Sensitive Info...
A Survey on Features and Techniques Description for Privacy of Sensitive Info...A Survey on Features and Techniques Description for Privacy of Sensitive Info...
A Survey on Features and Techniques Description for Privacy of Sensitive Info...IRJET Journal
 
A Review Study on the Privacy Preserving Data Mining Techniques and Approaches
A Review Study on the Privacy Preserving Data Mining Techniques and ApproachesA Review Study on the Privacy Preserving Data Mining Techniques and Approaches
A Review Study on the Privacy Preserving Data Mining Techniques and Approaches14894
 
Privacy Preserving Data Mining
Privacy Preserving Data MiningPrivacy Preserving Data Mining
Privacy Preserving Data MiningVrushali Malvadkar
 
Privacy Preserving Distributed Association Rule Mining Algorithm for Vertical...
Privacy Preserving Distributed Association Rule Mining Algorithm for Vertical...Privacy Preserving Distributed Association Rule Mining Algorithm for Vertical...
Privacy Preserving Distributed Association Rule Mining Algorithm for Vertical...IJCSIS Research Publications
 
Privacy Preserving Approaches for High Dimensional Data
Privacy Preserving Approaches for High Dimensional DataPrivacy Preserving Approaches for High Dimensional Data
Privacy Preserving Approaches for High Dimensional Dataijtsrd
 
Cryptography for privacy preserving data mining
Cryptography for privacy preserving data miningCryptography for privacy preserving data mining
Cryptography for privacy preserving data miningMesbah Uddin Khan
 
78201919
7820191978201919
78201919IJRAT
 
Improving Cloud Security Using Data Mining
Improving Cloud Security Using Data MiningImproving Cloud Security Using Data Mining
Improving Cloud Security Using Data MiningIOSR Journals
 
Privacy preserving dm_ppt
Privacy preserving dm_pptPrivacy preserving dm_ppt
Privacy preserving dm_pptSagar Verma
 
An Robust Outsourcing of Multi Party Dataset by Utilizing Super-Modularity an...
An Robust Outsourcing of Multi Party Dataset by Utilizing Super-Modularity an...An Robust Outsourcing of Multi Party Dataset by Utilizing Super-Modularity an...
An Robust Outsourcing of Multi Party Dataset by Utilizing Super-Modularity an...IRJET Journal
 
Data mining and privacy preserving in data mining
Data mining and privacy preserving in data miningData mining and privacy preserving in data mining
Data mining and privacy preserving in data miningNeeda Multani
 
Cluster Based Access Privilege Management Scheme for Databases
Cluster Based Access Privilege Management Scheme for DatabasesCluster Based Access Privilege Management Scheme for Databases
Cluster Based Access Privilege Management Scheme for DatabasesEditor IJMTER
 
Maintaining Data Confidentiality in Association Rule Mining in Distributed En...
Maintaining Data Confidentiality in Association Rule Mining in Distributed En...Maintaining Data Confidentiality in Association Rule Mining in Distributed En...
Maintaining Data Confidentiality in Association Rule Mining in Distributed En...IJSRD
 
Isaca journal - bridging the gap between access and security in big data...
Isaca journal  - bridging the gap between access and security in big data...Isaca journal  - bridging the gap between access and security in big data...
Isaca journal - bridging the gap between access and security in big data...Ulf Mattsson
 
Privacy Preserving Data Mining
Privacy Preserving Data MiningPrivacy Preserving Data Mining
Privacy Preserving Data MiningROMALEE AMOLIC
 
Privacy preserving in data mining with hybrid approach
Privacy preserving in data mining with hybrid approachPrivacy preserving in data mining with hybrid approach
Privacy preserving in data mining with hybrid approachNarendra Dhadhal
 
Information Security in Big Data : Privacy and Data Mining
Information Security in Big Data : Privacy and Data MiningInformation Security in Big Data : Privacy and Data Mining
Information Security in Big Data : Privacy and Data Miningwanani181
 

Tendances (18)

Performance Analysis of Hybrid Approach for Privacy Preserving in Data Mining
Performance Analysis of Hybrid Approach for Privacy Preserving in Data MiningPerformance Analysis of Hybrid Approach for Privacy Preserving in Data Mining
Performance Analysis of Hybrid Approach for Privacy Preserving in Data Mining
 
A Survey on Features and Techniques Description for Privacy of Sensitive Info...
A Survey on Features and Techniques Description for Privacy of Sensitive Info...A Survey on Features and Techniques Description for Privacy of Sensitive Info...
A Survey on Features and Techniques Description for Privacy of Sensitive Info...
 
A Review Study on the Privacy Preserving Data Mining Techniques and Approaches
A Review Study on the Privacy Preserving Data Mining Techniques and ApproachesA Review Study on the Privacy Preserving Data Mining Techniques and Approaches
A Review Study on the Privacy Preserving Data Mining Techniques and Approaches
 
Privacy Preserving Data Mining
Privacy Preserving Data MiningPrivacy Preserving Data Mining
Privacy Preserving Data Mining
 
Privacy Preserving Distributed Association Rule Mining Algorithm for Vertical...
Privacy Preserving Distributed Association Rule Mining Algorithm for Vertical...Privacy Preserving Distributed Association Rule Mining Algorithm for Vertical...
Privacy Preserving Distributed Association Rule Mining Algorithm for Vertical...
 
Privacy Preserving Approaches for High Dimensional Data
Privacy Preserving Approaches for High Dimensional DataPrivacy Preserving Approaches for High Dimensional Data
Privacy Preserving Approaches for High Dimensional Data
 
Cryptography for privacy preserving data mining
Cryptography for privacy preserving data miningCryptography for privacy preserving data mining
Cryptography for privacy preserving data mining
 
78201919
7820191978201919
78201919
 
Improving Cloud Security Using Data Mining
Improving Cloud Security Using Data MiningImproving Cloud Security Using Data Mining
Improving Cloud Security Using Data Mining
 
Privacy preserving dm_ppt
Privacy preserving dm_pptPrivacy preserving dm_ppt
Privacy preserving dm_ppt
 
An Robust Outsourcing of Multi Party Dataset by Utilizing Super-Modularity an...
An Robust Outsourcing of Multi Party Dataset by Utilizing Super-Modularity an...An Robust Outsourcing of Multi Party Dataset by Utilizing Super-Modularity an...
An Robust Outsourcing of Multi Party Dataset by Utilizing Super-Modularity an...
 
Data mining and privacy preserving in data mining
Data mining and privacy preserving in data miningData mining and privacy preserving in data mining
Data mining and privacy preserving in data mining
 
Cluster Based Access Privilege Management Scheme for Databases
Cluster Based Access Privilege Management Scheme for DatabasesCluster Based Access Privilege Management Scheme for Databases
Cluster Based Access Privilege Management Scheme for Databases
 
Maintaining Data Confidentiality in Association Rule Mining in Distributed En...
Maintaining Data Confidentiality in Association Rule Mining in Distributed En...Maintaining Data Confidentiality in Association Rule Mining in Distributed En...
Maintaining Data Confidentiality in Association Rule Mining in Distributed En...
 
Isaca journal - bridging the gap between access and security in big data...
Isaca journal  - bridging the gap between access and security in big data...Isaca journal  - bridging the gap between access and security in big data...
Isaca journal - bridging the gap between access and security in big data...
 
Privacy Preserving Data Mining
Privacy Preserving Data MiningPrivacy Preserving Data Mining
Privacy Preserving Data Mining
 
Privacy preserving in data mining with hybrid approach
Privacy preserving in data mining with hybrid approachPrivacy preserving in data mining with hybrid approach
Privacy preserving in data mining with hybrid approach
 
Information Security in Big Data : Privacy and Data Mining
Information Security in Big Data : Privacy and Data MiningInformation Security in Big Data : Privacy and Data Mining
Information Security in Big Data : Privacy and Data Mining
 

Similaire à Cr4201623627

NEW ALGORITHM FOR SENSITIVE RULE HIDING USING DATA DISTORTION TECHNIQUE
NEW ALGORITHM FOR SENSITIVE RULE HIDING USING DATA DISTORTION TECHNIQUENEW ALGORITHM FOR SENSITIVE RULE HIDING USING DATA DISTORTION TECHNIQUE
NEW ALGORITHM FOR SENSITIVE RULE HIDING USING DATA DISTORTION TECHNIQUEcscpconf
 
An Effective Heuristic Approach for Hiding Sensitive Patterns in Databases
An Effective Heuristic Approach for Hiding Sensitive Patterns in  DatabasesAn Effective Heuristic Approach for Hiding Sensitive Patterns in  Databases
An Effective Heuristic Approach for Hiding Sensitive Patterns in DatabasesIOSR Journals
 
Output Privacy Protection With Pattern-Based Heuristic Algorithm
Output Privacy Protection With Pattern-Based Heuristic AlgorithmOutput Privacy Protection With Pattern-Based Heuristic Algorithm
Output Privacy Protection With Pattern-Based Heuristic Algorithmijcsit
 
Hiding sensitive association_rule.pdf.
Hiding sensitive association_rule.pdf.Hiding sensitive association_rule.pdf.
Hiding sensitive association_rule.pdf.Dhyanendra Jain
 
A literature review of modern association rule mining techniques
A literature review of modern association rule mining techniquesA literature review of modern association rule mining techniques
A literature review of modern association rule mining techniquesijctet
 
Analysis of Pattern Transformation Algorithms for Sensitive Knowledge Protect...
Analysis of Pattern Transformation Algorithms for Sensitive Knowledge Protect...Analysis of Pattern Transformation Algorithms for Sensitive Knowledge Protect...
Analysis of Pattern Transformation Algorithms for Sensitive Knowledge Protect...IOSR Journals
 
To allot secrecy-safe association rules mining schema using FP tree
To allot secrecy-safe association rules mining schema using FP treeTo allot secrecy-safe association rules mining schema using FP tree
To allot secrecy-safe association rules mining schema using FP treeUvaraj Shan
 
SECURED FREQUENT ITEMSET DISCOVERY IN MULTI PARTY DATA ENVIRONMENT FREQUENT I...
SECURED FREQUENT ITEMSET DISCOVERY IN MULTI PARTY DATA ENVIRONMENT FREQUENT I...SECURED FREQUENT ITEMSET DISCOVERY IN MULTI PARTY DATA ENVIRONMENT FREQUENT I...
SECURED FREQUENT ITEMSET DISCOVERY IN MULTI PARTY DATA ENVIRONMENT FREQUENT I...Editor IJMTER
 
An Efficient Compressed Data Structure Based Method for Frequent Item Set Mining
An Efficient Compressed Data Structure Based Method for Frequent Item Set MiningAn Efficient Compressed Data Structure Based Method for Frequent Item Set Mining
An Efficient Compressed Data Structure Based Method for Frequent Item Set Miningijsrd.com
 
Association rule mining.pptx
Association rule mining.pptxAssociation rule mining.pptx
Association rule mining.pptxmaha797959
 
Analysis and Implementation of Efficient Association Rules using K-mean and N...
Analysis and Implementation of Efficient Association Rules using K-mean and N...Analysis and Implementation of Efficient Association Rules using K-mean and N...
Analysis and Implementation of Efficient Association Rules using K-mean and N...IOSR Journals
 
DISCOVERY OF ACTIONABLE PATTERNS THROUGH SCALABLE VERTICAL DATA SPLIT METHOD ...
DISCOVERY OF ACTIONABLE PATTERNS THROUGH SCALABLE VERTICAL DATA SPLIT METHOD ...DISCOVERY OF ACTIONABLE PATTERNS THROUGH SCALABLE VERTICAL DATA SPLIT METHOD ...
DISCOVERY OF ACTIONABLE PATTERNS THROUGH SCALABLE VERTICAL DATA SPLIT METHOD ...IJDKP
 

Similaire à Cr4201623627 (20)

NEW ALGORITHM FOR SENSITIVE RULE HIDING USING DATA DISTORTION TECHNIQUE
NEW ALGORITHM FOR SENSITIVE RULE HIDING USING DATA DISTORTION TECHNIQUENEW ALGORITHM FOR SENSITIVE RULE HIDING USING DATA DISTORTION TECHNIQUE
NEW ALGORITHM FOR SENSITIVE RULE HIDING USING DATA DISTORTION TECHNIQUE
 
F212734
F212734F212734
F212734
 
F04713641
F04713641F04713641
F04713641
 
Dy33753757
Dy33753757Dy33753757
Dy33753757
 
Dy33753757
Dy33753757Dy33753757
Dy33753757
 
An Effective Heuristic Approach for Hiding Sensitive Patterns in Databases
An Effective Heuristic Approach for Hiding Sensitive Patterns in  DatabasesAn Effective Heuristic Approach for Hiding Sensitive Patterns in  Databases
An Effective Heuristic Approach for Hiding Sensitive Patterns in Databases
 
F0423038041
F0423038041F0423038041
F0423038041
 
Output Privacy Protection With Pattern-Based Heuristic Algorithm
Output Privacy Protection With Pattern-Based Heuristic AlgorithmOutput Privacy Protection With Pattern-Based Heuristic Algorithm
Output Privacy Protection With Pattern-Based Heuristic Algorithm
 
Hu3414421448
Hu3414421448Hu3414421448
Hu3414421448
 
Hiding sensitive association_rule.pdf.
Hiding sensitive association_rule.pdf.Hiding sensitive association_rule.pdf.
Hiding sensitive association_rule.pdf.
 
Bi4201403406
Bi4201403406Bi4201403406
Bi4201403406
 
A literature review of modern association rule mining techniques
A literature review of modern association rule mining techniquesA literature review of modern association rule mining techniques
A literature review of modern association rule mining techniques
 
K017167076
K017167076K017167076
K017167076
 
Analysis of Pattern Transformation Algorithms for Sensitive Knowledge Protect...
Analysis of Pattern Transformation Algorithms for Sensitive Knowledge Protect...Analysis of Pattern Transformation Algorithms for Sensitive Knowledge Protect...
Analysis of Pattern Transformation Algorithms for Sensitive Knowledge Protect...
 
To allot secrecy-safe association rules mining schema using FP tree
To allot secrecy-safe association rules mining schema using FP treeTo allot secrecy-safe association rules mining schema using FP tree
To allot secrecy-safe association rules mining schema using FP tree
 
SECURED FREQUENT ITEMSET DISCOVERY IN MULTI PARTY DATA ENVIRONMENT FREQUENT I...
SECURED FREQUENT ITEMSET DISCOVERY IN MULTI PARTY DATA ENVIRONMENT FREQUENT I...SECURED FREQUENT ITEMSET DISCOVERY IN MULTI PARTY DATA ENVIRONMENT FREQUENT I...
SECURED FREQUENT ITEMSET DISCOVERY IN MULTI PARTY DATA ENVIRONMENT FREQUENT I...
 
An Efficient Compressed Data Structure Based Method for Frequent Item Set Mining
An Efficient Compressed Data Structure Based Method for Frequent Item Set MiningAn Efficient Compressed Data Structure Based Method for Frequent Item Set Mining
An Efficient Compressed Data Structure Based Method for Frequent Item Set Mining
 
Association rule mining.pptx
Association rule mining.pptxAssociation rule mining.pptx
Association rule mining.pptx
 
Analysis and Implementation of Efficient Association Rules using K-mean and N...
Analysis and Implementation of Efficient Association Rules using K-mean and N...Analysis and Implementation of Efficient Association Rules using K-mean and N...
Analysis and Implementation of Efficient Association Rules using K-mean and N...
 
DISCOVERY OF ACTIONABLE PATTERNS THROUGH SCALABLE VERTICAL DATA SPLIT METHOD ...
DISCOVERY OF ACTIONABLE PATTERNS THROUGH SCALABLE VERTICAL DATA SPLIT METHOD ...DISCOVERY OF ACTIONABLE PATTERNS THROUGH SCALABLE VERTICAL DATA SPLIT METHOD ...
DISCOVERY OF ACTIONABLE PATTERNS THROUGH SCALABLE VERTICAL DATA SPLIT METHOD ...
 

Dernier

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????blackmambaettijean
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 

Dernier (20)

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 

Cr4201623627

  • 1. Pratiksha Sapkal et al Int. Journal of Engineering Research and Applications ISSN : 2248-9622, Vol. 4, Issue 2( Version 1), February 2014, pp.623-627 RESEARCH ARTICLE www.ijera.com OPEN ACCESS Hiding Sensitive items using MDSRRC to Maintain Privacy in Database Pratiksha Sapkal, Minakshi Panchal, Manisha Pol, Madhumita Mane 1,2,3,4, (Department of Information Technology, Pune University, India) ABSTRACT This paper focuses on the research in hiding sensitive association rules to maintain privacy and data quality in database. In this paper we have proposed heuristic based algorithm named MDSRRC (Modified Decrease Support of R.H.S. item of Rule Clusters) to hide the sensitive association rules with multiple items in consequent (R.H.S) and antecedent (L.H.S). The algorithm selects the items and transactions based on certain criteria which modify transactions to hide the sensitive information. Our evaluation shows that the proposed technique is highly valuable and maintains dataset quality. Keywords-Association rules, Privacy preserving data mining (PPDM), MDSRRC contain some sensitive information which are on the Right hand side or left hand side of the rule, so that I. INTRODUCTION rules containing sensitive item can‟t be reveal. Our Mining association rule techniques are approached is based on modifying the database in a wide employed in data mining to and relationship way that confidence of the association rule can between item sets. The corporate and many bereduce with the help increase or decrease the government organizations reveals their data or support value of R.H.S or L.H.S correspondingly. information for mutual benefit to search out some As the confidence of the rule is reduce below a useful data for some decision making purpose and given threshold, it is hidden or we can say it will not improve their business schemes. But this database be disclosed. may contain some confidential information and DSRRC is not able to hide association rules with which the organization does not need to reveal. multiple items in the antecedent (L.H.S) and consequent (R.H.S). To overcome this limitation, The problem of concealment plays important role the work proposed by Nikunj H. Domadiya and once the corporate share their data for mutual profit Udai Pratap Rao elt al. [1] is the improved version however there is no one need to leak their private of DSRRC i.e. MDSRRC, which uses count of data. So before revealing the information, sensitive things in consequent of the sensitive rules. It patterns should be hidden and to resolve this issue modifies the minimum number of transactions to PPDM (Privacy preserving data mining) techniques cover most sensitive rules and maintain data quality. are helpful to boost the safety of database. These approaches have in general the advantage to require a minimum amount of input (usually the database, II. RELATED WORK the information to protect and few other parameters) There have been several methods proposed and then a low effort is required to the user in order for hiding sensitive patterns in dataset. In 1999, M. to apply them. The selection of rules would require Attalah and E. bernito[13] proposed the idea of data mining process to beexecuted first. For Disclosure limitation of sensitive rules. It discusses association rules hiding, two basic approaches have security risks on the database when reveals it in been proposed. The first approach hides one rule at public. They introduce algorithm for hiding sensitive a time. items with little impact on database. V.S. Verykios et First selects transactions that contain the items in a al, A. K. Elmagarmid et al. [12] present five different give rule. It then tries to modify the transaction by algorithm to hide the sensitive rules. These algorithm transaction until the confidence or support of the use hiding strategies which are based on decrease rule fall below minimum confidence or minimum support and confidence of the sensitive rule. support. The modification is done by either Furthermore, C. N. Modi and U. P. Rao et al.[3], removing the items from the transaction or inserting presented the algorithm for Maintaining privacy and new items to the transactions. The second approach data quality in privacy preserving association rule deals with groups of restricted patterns or mining. It improves the quality of DSRRC. Next, association rules at a time. In our work we are Motivation example shows importance of sensitive concern of hiding certain association rules which patterns in business applications. www.ijera.com 623 | P a g e
  • 2. Pratiksha Sapkal et al Int. Journal of Engineering Research and Applications ISSN : 2248-9622, Vol. 4, Issue 2( Version 1), February 2014, pp.623-627 Let an Mobile store purchase mobiles from two companies X and Y. Now X applies data mining techniques and mines association rules applied to related to B‟s product. X found that most of the customers who buy mobile of Y also buy camera. Now X offers some discount on camera if customer purchases X‟s mobile. As a result the business of Y goes down. Therefore discharging the database with sensitive information cause the problem. This scheme provides the order on sensitive rules hiding in the database. The proposed algorithm customizes least possible number of transactions to hide supreme sensitive rules and preserving data quality. III. PROBLEM STATEMENT Association rule activity problem is defined as: converting the original database into sanitized database so that data mining techniques will not be ready to mine sensitive rules from the database while all non sensitive rules remain visible. Given transactional database D,Minimum confidence, Minimum support, and generated set of association rules R from D, a subset SR of R as sensitive rules, which database owner want to hide. Problem is to and the sanitized database D' such that when mining technique is applied on the D', all sensitive rules in set SR will be hidden while all non sensitive rules can be mined. The aim of association rule hiding is to satisfy the following conditions:  It must not generate any new rules which are not present in the database.  Applying Apriori This algorithm starts with mining the association rule from the original dataset using association rule mining algorithm. Apriori is an influential algorithm for generating association rules. Let I={i1…..in} be distinct literals called items. Given a database D={T1.....Tm}is a set of transaction where each transaction T is a set of items as TI(1≤ i≤m).The association rule is define as XY,where YI ,XI and YX=.X is called rule‟s antecedent(L.H.S) and Y is called rule‟s consequent (R.H.S).Association rules generation is usually split up into two separate steps. 1. Minimum support is applied to find frequentitems sets in a dataset. The support of rule X→Y is calculated using the following formula: Support(X→Y) =|XY|/|D|, where |D| define the total number of the transactions in the database D and |XY| is the number of transactions which support item set XY. A rule X→Y is mined from database if support (X→Y) ≥ MST (minimum support threshold). 2. Secondly, comparing those frequent item sets with the minimum confidence constraints to form mining association rules. The confidence of rule iscalculated using following formula: Confidence (X→Y) =|XY| / |X|, where |X| is number of transactions which supportitem set X. A rule X→Y is mined from database if confidence(X→Y)≥MCT. Sanitized database must facilitate mining of all non-sensitive rules.  items present in the transaction and 0 for the items in the transaction. Database must not disclose any sensitive rules.  www.ijera.com 3. The Proposed algorithm named MDSRRC hides sensitive association rules with fewer modifications on database to maintain data quality and to reduce the side effect of database. In this paper we have apply the MDSRRC algorithm for the Gojee transaction dataset IV. METHODOLOGY The proposed methodology is categorized into five major modules viz; Binarization, Apriori, Sensitive rules generation, Rule Hiding and sanitized database creation. The Fig1 shows general architecture of the proposed method.  Binarization We have integrated a pre-binarization step in order to enhance the input dataset quality. Let put 1 to the www.ijera.com While the second step is straight forward, the first step needs more attention. Finding all frequent item sets in a database is difficult since it involves finding all item sets. The standard definition of association rule is given in [4].  Sensitive Rule Generation : The user specifies some rules as sensitivity as sensitive rules (SR) from the rules generated by the association rule mining algorithm. Then algorithm counts occurrences of each item in R.H.S of sensitive rules (SR) from the rules generated by the association rule mining algorithm. The algorithm counts occurrence of each item in R.H.S of sensitivity rules. The algorithm finds Is= (is0, is1…….isk) k<_n, by arranging those items in decreasing order of their counts. After that sensitivity of each item is calculated. Occurrence of each item in R.H.S of sensitivity rules. The algorithm finds Is= (is0, is1…….isk)k<_n, by arranging those items in decreasing order of their counts. After that sensitivity of each item is calculated. Then 624 | P a g e
  • 3. Pratiksha Sapkal et al Int. Journal of Engineering Research and Applications ISSN : 2248-9622, Vol. 4, Issue 2( Version 1), February 2014, pp.623-627 transactions which support is 0 are sorted in descending order of their sensitivities. The algorithm counts occurrence of each item in R.H.S of sensitivity rules. The algorithm finds Is= (is0, is1…….isk) k<_n, by arranging those items in decreasing order of their counts. After that sensitivity of each item is calculated. Occurrence of each item in R.H.S of sensitivity rules. The algorithm finds Is= (is0, is1…….isk)k<_n, by arranging those items in decreasing order of their counts. After that sensitivity of each item is calculated. Then transactions which support is 0 are sorted in descending order of their sensitivities.  Rule Hiding This paper uses heuristic based approaches for hiding the sensitive rule. The data distortion mechanism changes the item value by a new value in dataset. It alters „0‟ to „1‟ or „1‟ to „0‟ for selected items in selected transactions to decrease the confidence, by decreasing or increasing support of items in sensitive rules. Sensitivity of Item is the number of sensitivity rules which contain this item. The sensitivity of transactions is the total of sensitivities of all sensitive items which are presented in this transaction. www.ijera.com Input Dataset Binarization Association rule generation Sensitive rule generation Rule Hiding Sanitized data The Rule hiding process starts by selecting first transaction form the sorted transaction with higher sensitivity; delete item is0 from that transaction. Then update support and confidence of all sensitive rules and if any rules have support and confidence below MST and MCT respectively then delete it from SR. Finally update sensitivity of each item, transaction and Is. Again select transaction with higher sensitivity and delete is0 from it. This process continues until all sensitive rules are hidden.  Sanitized Database Generation The modified transactions are updated in the original database and new database is generated which is called sanitized database D‟, which preserves the privacy of sensitive information and maintains database quality. www.ijera.com Fig1 .General Architecture V. Example To understand MDSRRC following example is illustrated In Table I transactional database D is shown. With 3 as MST and 40% as MCT, The possible generated association rules by Apriori algorithm[4] are:FishCrabSalad,FishSaladCrab,CrabFishS alad,SaladFishCrab,Fishcrab,FishSalad,Crab Fish,SaladFish,CrabSaladFish.Let the Database Owner Specify CrabFishSalad,SaladFishCrab as sensitive rules. The sensitivity of Crab=2, Salad=2, Fish=2. Transaction with itssensitivity is shown in Table II. Now algorithm findsfrequency of each item presents in R.H.S of sensitive rules.Here frequency of Crab=1,Salad=1,Fish=2 IS={Fish,Salad,Crab}. In this example item „Fish‟ is selected as is0. Then it sorts the transactionswhich supports is0 in descending order of their sensitivity. Then Select transaction with highest sensitivity and delete is0 item from that transaction. Update confidenceand support of all the sensitive rules. Table III show modified database D1after first deletion of item from first transaction. Now update sensitivity of each item. Updated count of eachitem for IS .Sort transactions which support is0and delete the is0 from transaction with highest sensitivity, then delete the is0 from transaction with highest sensitivity. Now 625 | P a g e
  • 4. Pratiksha Sapkal et al Int. Journal of Engineering Research and Applications ISSN : 2248-9622, Vol. 4, Issue 2( Version 1), February 2014, pp.623-627 all sensitiverules are hidden. Final sanitize database is shown in table III. Table I. Transaction Database TID Items TRN001 TRN002 TRN003 TRN004 TRN005 TRN006 TRN007 TRN008 TRN009 TRN010 TRN011 TRN012 TRN013 Tofutti,fish,steak Steak,beef Fish,steak Fish,monkfish,potato Fish,steak,halibut Fish,crab,salad Fish,potato,crab Fish,salad Fish,potato,crab,salad Tofutti,fish,crab Tofutti,steak,salad,shell Tofutti,fish,crab,salad Tofutti,steak,beef,potato Binary matrix of Items 1110000000 0011000000 0110000000 0100110000 0110001000 0100000110 0100010100 0100000010 0100010110 1100000100 1010000011 1100000110 1011010000 Table II.Transaction with Sensitivity TID Sensitivity TRN001 2 TRN002 0 TRN003 2 TRN004 2 TRN005 2 TRN006 6 TRN007 4 TRN008 4 TRN009 6 TRN010 4 TRN011 2 TRN012 6 TRN013 0 www.ijera.com evaluation by considering the performance parameters which are given in [12] viz. (a) HF (hiding failure): It is the percentage of the sensitive data that remain exposed in the sanitized dataset. (b) MC (misses cost): It is the percentage of the non-sensitive data that are hidden as a sideeffect of the sanitization process. (c) AP (artificial patterns): It is the percentage of the discovered patterns that are artifacts. (d) DISS (dissimilarity): It is the difference between the original and the sanitized datasets. (e) SEF (side effect factor): It is the amount of non-sensitive association rules that are removed as an effect of the sanitization process. Experimental results show that MDSRRC increase efficiency and reduce modification of transactions in database. Performance comparison of MDSRRC with algorithm DSRRC is given in Table VI. Table VI. Performance result Table III.Sanitized Database D‟ TID TRN001 TRN002 Items Tofutti,steak Steak,beef TRN003 TRN004 TRN005 TRN006 TRN007 TRN008 TRN009 TRN010 TRN011 TRN012 TRN013 Fish,steak Fish,monkfish,potato Fish,steak,halibut Fish,crab,salad Fish,potato,crab Fish,salad Fish,potato,crab,salad Tofutti,fish,crab Tofutti,steak,salad,shell Tofutti,fish,crab,salad Tofutti,steak,beef,potato 40% 30% VI. EXPERIMENTAL RESULT AND ANALYSIS OF PROPOSED ALGORITHM We compare MDSRRC algorithm with DSRRC algorithm. These algorithms are used to hide the Sensitive rules on database. After applying bothAlgorithms on sample database we have done www.ijera.com 20% 10% DSRRC 0% MDSRRC Fig. 2. Performance Comparison between DSRRC, MDSRRC VI.CONCLUSION The purpose of the Association rule hiding techniques for privacy preserving data mining is to hide certain crucial information so they cannot discovered through association rule. In this paper, we proposed an algorithm named MDSRRC which hides sensitive association rules with fewer 626 | P a g e
  • 5. Pratiksha Sapkal et al Int. Journal of Engineering Research and Applications ISSN : 2248-9622, Vol. 4, Issue 2( Version 1), February 2014, pp.623-627 modifications on database to maintain data quality and to reduce the side effect of database. Functionality of proposed algorithm is shown using sample database with three sensitive rules. Experimental results show that proposed algorithm works better than DSRRC. So MDSRRC hide sensitive rules with minimum modifications on database and maintain data quality.MDSRRC algorithm can be extended to increase the efficiency and reduce the side effects by minimizing the modifications on database. REFERENCES [1] Nikunj H. Domadiya and Udai Pratap Rao, “Hiding Sensitive Association Rules to Maintain Privacy and Data Quality in Database”3rd IEEE International Advance Computing Conference (IACC) 2013. [2] V. S. Verykios, A. K. Elmagarmid, E. Bertino, Y. Saygin, and E. Dasseni, “Association rule hiding,” IEEE Transactions on Knowledge and Data Engineering, vol. 16, pp. 434–447, 2004. [3] C. N. Modi, U. P. Rao, and D. R. Patel, “Maintaining privacy and data quality in privacy preserving association rule mining,” 2010 Second International conference on Computing, Communication and Networking Technologies, pp. 1–6, Jul. 2010. [4] J. Han, Data Mining: Concepts and Techniques. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 2005. [5] Y.-H. Wu, C.-M. Chiang, and A. L. Chen, “Hiding sensitive association rules with limited side effects,” IEEE Transactions on Knowledge and Data Engineering, vol. 19, pp. 29–42, 2007. [6] S.-L. Wang, B. Parikh, and A. Jafari, “Hiding informative association rule sets,” Expert Systems with Applications, vol. 33, no. 2, pp. 316 – 323, 2007. [7] S.-L.Wang, D. Patel, A. Jafari, and T.-P. Hong, “Hiding collaborative recommendation association rules,” Applied Intelligence, vol. 27, pp. 67–77, 2007. [8] D.F. Gleich and L.-H. Lim. Rank aggregation via nuclear norm minimization. In KDD, 2011. [9] Y. Saygin, V. S. Verykios, and A. K. Elmagarmid, “Privacy preserving association rule mining.” in RIDE. IEEE Computer Society, 2002, pp 151–158. [10] C. N. Modi, U. P. Rao, and D. R. Patel, “An Efficient Solution for Privacy Preserving Association Rule Mining,” (IJCNS) International Journal of Computer and Network Security, vol. 2, no. 5, pp. 79–85, 2010. www.ijera.com www.ijera.com [11] Wu and H. Wang, “Research on the privacy preserving algorithm of association rule mining in centralized database,” in Proceedings of the 2008 International Symposiums on Information Processing, ser. ISIP’08. Washington, DC, USA: IEEE Computer Society, 2008, pp. 131–134 [12] V. Verykios and A. Gkoulalas-Divanis, A Survey of Association Rule Hiding Methods for Privacy, ser. Advances in Database Systems, C. [13] M. Atallah, A. Elmagarmid, M. Ibrahim, E. Bertino, and V. Verykios, “Disclosure limitation of sensitive rules,” in Proceedings of the 1999 Workshop on Knowledge and Data Engineering Exchange, ser. KDEX ’99. Washington, DC, USA: IEEE Computer Society, 1999, pp. 45–52. 627 | P a g e