This document summarizes techniques for privacy-preserving data publishing. It discusses k-anonymity, which anonymizes data by suppressing or generalizing attributes until each record is identical to at least k-1 other records. The document also covers limitations of k-anonymity, including that it does not protect against background knowledge attacks. It proposes addressing these limitations by combining k-anonymity with generalization and slicing techniques.
2. Privacy preserving data publishing
A typical scenario for data collection and publishing is described in
the figure above. In the data collection phase, the data publisher collects
data from record owners (e.g., Alice and Bob).
3. Privacy preserving data publishing
In the data publishing phase, the data publisher releases the collected
data to a data miner or to the public, called the data recipient, who will
then conduct data mining on the published data. In this survey, data
mining has a broad sense, not necessarily restricted to pattern mining or
model building. For example, a hospital collects data from patients and
publishes the patient records to an external medical centre. In this
example, the hospital is the data publisher, patients are record owners,
and the medical centre is the data recipient.
4. Related work given in Base Paper
The base paper explains privacy preservation for published data using an
anonymization technique based on the record elimination method. It first
describes the types of attack on published data.
There are two major kinds of attack on published data:-
1> Record linkage (identity disclosure attack)
2> Attribute linkage (attribute disclosure attack)
Identity disclosure attack:-
Identity disclosure occurs when an individual is linked to a
particular record in the released table.
Attribute disclosure attack:-
Attribute disclosure occurs when new information about some
individual is revealed, i.e., the released data make it possible to infer the
characteristics of an individual more accurately than would be possible
before the data release. Attribute disclosure can occur with or without
identity disclosure. It has been recognized that even disclosure of false
attribute information may cause harm.
5. Related work given in Base Paper
In this paper the k-anonymity technique is used to preserve the privacy of
published data, and it is compared with the other techniques given below.
1> l-diversity
2> t-closeness
K-anonymity:- A database is said to be k-anonymous when attributes are
suppressed or generalized until each row is identical to at least k-1 other
rows.
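The definition above can be checked mechanically. The following is a minimal sketch, not the base paper's implementation. It assumes the comparison between rows is restricted to a chosen set of quasi-identifier attributes; the function name `is_k_anonymous` and the sample table are hypothetical.

```python
from collections import Counter

def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times."""
    # Count how many rows share each combination of quasi-identifier values.
    groups = Counter(tuple(row[a] for a in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())

# Hypothetical, already generalized records (illustration only).
table = [
    {"age": "20-29", "zip": "476**", "disease": "flu"},
    {"age": "20-29", "zip": "476**", "disease": "cancer"},
    {"age": "30-39", "zip": "479**", "disease": "flu"},
    {"age": "30-39", "zip": "479**", "disease": "flu"},
]

print(is_k_anonymous(table, ["age", "zip"], 2))  # each group has 2 rows
print(is_k_anonymous(table, ["age", "zip"], 3))  # no group has 3 rows
```

Each (age, zip) combination appears exactly twice here, so the table passes for k = 2 and fails for k = 3.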
The data anonymization techniques are:-
1> Generalization
2> Bucketization
3> Suppression
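Two of the techniques listed above can be sketched on a toy table. This is an illustrative sketch under stated assumptions, not the base paper's method: the helper names (`generalize_age`, `suppress_zip`, `anonymize`), the 10-year age ranges, and the sample records are all hypothetical. Bucketization is not shown.

```python
def generalize_age(age, width=10):
    """Generalization: replace an exact age with a range, e.g. 27 -> '20-29'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def suppress_zip(zip_code, keep=3):
    """Partial suppression: keep the first `keep` characters, mask the rest."""
    return zip_code[:keep] + "*" * (len(zip_code) - keep)

def anonymize(rows):
    """Apply both transformations to the quasi-identifiers of every row."""
    return [
        {**row, "age": generalize_age(row["age"]), "zip": suppress_zip(row["zip"])}
        for row in rows
    ]

# Hypothetical raw records (illustration only).
raw = [
    {"age": 23, "zip": "47677", "disease": "flu"},
    {"age": 27, "zip": "47602", "disease": "cancer"},
]

for row in anonymize(raw):
    print(row)
```

After the transformation both records share the quasi-identifier values ('20-29', '476**'), so they can no longer be told apart on those attributes.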
6. Limitations/drawbacks of the k-anonymization technique
Some limitations of the k-anonymity technique are given below:-
1> It does not hide whether a given individual is in the database.
2> It reveals individuals' sensitive attributes.
3> It does not protect against attacks based on background knowledge.
4> Mere knowledge of the k-anonymization algorithm can violate privacy.
5> It cannot be applied to high-dimensional data without complete loss of
utility.
6> Special methods are required if a dataset is anonymized and published
more than once.
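Limitation 2 above can be illustrated with a small check. The sketch below counts the distinct sensitive values in each group of rows that share the same quasi-identifier values. When a group has only one distinct value, anyone who can place a person in that group learns the sensitive attribute, even though the table is k-anonymous. The function name and the sample table are hypothetical.

```python
from collections import defaultdict

def distinct_sensitive_values(rows, quasi_identifiers, sensitive):
    """Map each quasi-identifier group to its number of distinct sensitive values."""
    groups = defaultdict(set)
    for row in rows:
        key = tuple(row[a] for a in quasi_identifiers)
        groups[key].add(row[sensitive])
    return {key: len(values) for key, values in groups.items()}

# Hypothetical 2-anonymous table (illustration only).
table = [
    {"age": "20-29", "zip": "476**", "disease": "flu"},
    {"age": "20-29", "zip": "476**", "disease": "cancer"},
    {"age": "30-39", "zip": "479**", "disease": "flu"},
    {"age": "30-39", "zip": "479**", "disease": "flu"},
]

counts = distinct_sensitive_values(table, ["age", "zip"], "disease")
for group, n in counts.items():
    status = "sensitive value exposed" if n == 1 else "ok"
    print(group, n, status)
```

In this table the ('30-39', '479**') group contains only "flu", so the disease of everyone in that group is revealed without identifying any single record.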
7. Literature survey on the paper "t-Closeness: Privacy Beyond
k-Anonymity and l-Diversity"
The t-closeness Principle:-
An equivalence class is said to have t-closeness if the distance between the
distribution of a sensitive attribute in this class and the distribution of the
attribute in the whole table is no more than a threshold t. A table is said to
have t-closeness if all equivalence classes have t-closeness.
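The principle above can be sketched as a check over equivalence classes. The text here does not say which distance measure is used, so the sketch assumes the variational distance (half the sum of absolute differences between the two distributions) purely for illustration. The t-closeness paper itself uses the Earth Mover's Distance. All function names and the sample table are hypothetical.

```python
from collections import Counter, defaultdict

def distribution(values):
    """Relative frequency of each value in a list."""
    counts = Counter(values)
    total = len(values)
    return {v: c / total for v, c in counts.items()}

def variational_distance(p, q):
    """Assumed distance measure: half the L1 distance between two distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

def satisfies_t_closeness(rows, quasi_identifiers, sensitive, t):
    """True if every equivalence class is within distance t of the whole table."""
    overall = distribution([row[sensitive] for row in rows])
    classes = defaultdict(list)
    for row in rows:
        classes[tuple(row[a] for a in quasi_identifiers)].append(row[sensitive])
    return all(
        variational_distance(distribution(values), overall) <= t
        for values in classes.values()
    )

# Hypothetical table (illustration only).
table = [
    {"age": "20-29", "zip": "476**", "disease": "flu"},
    {"age": "20-29", "zip": "476**", "disease": "cancer"},
    {"age": "30-39", "zip": "479**", "disease": "flu"},
    {"age": "30-39", "zip": "479**", "disease": "flu"},
]

print(satisfies_t_closeness(table, ["age", "zip"], "disease", 0.3))
print(satisfies_t_closeness(table, ["age", "zip"], "disease", 0.2))
```

In the whole table the distribution is flu 0.75 and cancer 0.25. Both equivalence classes are at distance 0.25 from it under the assumed measure, so the table satisfies the check for t = 0.3 but not for t = 0.2.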
8. Proposed work on the basis of the Base Paper
For the proposed work, building on the base paper, I want to use:-
1> The concept of generalization
2> Slicing