ISACA Houston - How to De-classify Data and Rethink Transfer of Data Between the US and EU
1. 1
How to De-classify Data
and Rethink
Transfer of Data between US
and EU
Ulf Mattsson
Chief Security Strategist
www.Protegrity.com
2. 2
Ulf Mattsson
• Chief Security Strategist at Protegrity; previously Head of Innovation at TokenEx, Chief Technology Officer at Atlantic BT and Compliance Engineering, and IT Architect at IBM
• Products and Services:
• Data Encryption, Tokenization, Data Discovery, Cloud Application Security Brokers (CASB), Web Application Firewalls (WAF), Robotics, and Applications
• Security Operation Center (SOC), Managed Security Services (MSSP)
• Inventor of more than 70 issued US patents and developed industry standards with ANSI X9 and PCI SSC
• Payment Card Industry (PCI) Security Standards Council (SSC):
1. Tokenization Task Force
2. Encryption Task Force, Point-to-Point Encryption Task Force
3. Risk Assessment SIG
4. eCommerce SIG
5. Cloud SIG, Virtualization SIG
6. Pre-Authorization SIG, Scoping SIG Working Group
• Tokenization Management and Security; Cloud Management and Security (Dec 2019, May 2020, May 2020)
3. 3
Agenda
1. Privacy Shield and Schrems II
2. When the GDPR applies to data
3. Re-identification attacks
4. Pseudonymization
• When to use pseudonymization or anonymization
• Compliance aspects
• Trans-border communication
• Best practices
• A framework
5. International privacy standards
6. Data de-classification process and workflow
7. Privacy protection of personal health information
4. 4
How have the following data privacy regulations impacted your organization?
Source: FTI Consulting, 2020, an independent global business advisory firm. More than 500 leaders of large private-sector companies, based in the U.S.
5. 5
Which of the following aspects of data privacy are you particularly concerned about?
Source: FTI Consulting, Corporate Data Privacy Today, 2020
6. 6
http://dataprotection.link/Zn1Uk#https://www.wsj.com/articles/coronavirus-paves-way-for-new-age-of-digital-surveillance-11586963028
American officials are drawing cellphone location data from mobile advertising firms to track the presence of crowds—but
not individuals. Apple Inc. and Alphabet Inc.’s Google recently announced plans to launch a voluntary app that health officials
can use to reverse-engineer sickened patients’ recent whereabouts—provided they agree to provide such information.
European nations monitor citizen
movement by tapping
telecommunications data that they say
conceals individuals’ identities.
The extent of tracking hinges on a series of tough choices:
• Make it voluntary or mandatory?
• Collect personal or anonymized data?
• Disclose information publicly or privately?
In Western Australia, lawmakers approved a bill last month to install surveillance gadgets in people’s homes to monitor those
placed under quarantine. Authorities in Hong Kong and India are using geofencing that draws virtual fences around
quarantine zones. They monitor digital signals from smartphones or wristbands to deter rule breakers and nab offenders, who
can be sent to jail. Japan’s most popular messaging app beams health-status questions to its users on behalf of the
government.
8. 8
Privacyshield.gov
Privacy Shield Program Overview
The EU-U.S. and Swiss-U.S. Privacy Shield Frameworks were designed by the U.S. Department of Commerce, and the
European Commission and Swiss Administration, respectively, to provide companies on both sides of the Atlantic with a
mechanism to comply with data protection requirements when transferring personal data from the European Union and
Switzerland to the United States in support of transatlantic commerce.
On July 12, 2016, the European Commission deemed the EU-U.S. Privacy Shield Framework adequate to enable data
transfers under EU law (see the adequacy determination).
On January 12, 2017, the Swiss Government announced the approval of the Swiss-U.S. Privacy Shield Framework as a valid
legal mechanism to comply with Swiss requirements when transferring personal data from Switzerland to the United States.
See the statements from the Swiss Federal Council and Swiss Federal Data Protection and Information Commissioner.
On July 16, 2020, the Court of Justice of the European Union issued a judgment declaring as “invalid” the European
Commission’s Decision (EU) 2016/1250 of 12 July 2016 on the adequacy of the protection provided by the EU-U.S. Privacy
Shield. As a result of that decision, the EU-U.S. Privacy Shield Framework is no longer a valid mechanism to comply with EU
data protection requirements when transferring personal data from the European Union to the United States.
This decision does not relieve participants in the EU-U.S. Privacy Shield of their obligations under the EU-U.S. Privacy
Shield Framework.
9. 9
Privacy Shield safeguards: Encryption
• The CJEU reaffirmed the validity of SCCs* but stated that companies must verify, on a case-by-case basis,
whether the law in the recipient country ensures adequate protection, under EU law, for personal data
transferred under SCCs and, where it doesn’t, that companies must provide additional safeguards or suspend
transfers.
• The ruling placed the same requirement on EU data protection authorities to suspend such transfers on a
case-by-case basis where equivalent protection cannot be ensured.
• Privacy professionals may need to consider whether relevant surveillance programs and authorities apply in
particular contexts. If they do, they could then assess whether those authorities include proportional
limitations in the given context, as well as whether effective judicial remedies exist.
• Alternatively, they might consider ways to limit the context itself through additional safeguards. Encryption,
for instance, might be a consideration.
https://iapp.org/news/a/the-schrems-ii-decision-eu-us-data-transfers-in-question/
*: Standard Contractual Clauses (SCC). Standard contractual clauses for data transfers between EU and non-EU countries.
10. 10
After Privacy Shield
Focus on five main areas to protect data privacy:
1. Accessible Data: It is critical that organizations be able to access and blend data from many different file types to have an
integrated view and understanding of what personal data they hold.
2. Identifying Data: No matter where personally identifiable information (PII) resides, many organizations rely on technology
capabilities like data filters, sampling techniques and sophisticated algorithms that can identify and extract personal data
from structured and unstructured data sources.
3. Proactive Governance: Organizations need to be able to enforce governance policies, monitor data quality and manage
business terms across the organization. They must also be able to assign owners to terms and link them to policies or
technical assets like reports or data sources. This can be accomplished with data quality, metadata management and
information cataloging technologies.
4. Ongoing Protection: For ongoing protection, role-based data masking and encryption technologies can secure sensitive
information, as well as dynamically blend data without moving it. This helps to minimize exposure of sensitive data.
5. Audits and Reviews: Technology that provides interactive reports to identify the users, files, data sources and types of PII
detected is essential. Audits should show who has accessed PII data and how it is being protected across the business.
https://www.cmswire.com/information-management/enterprise-data-strategies-in-the-aftermath-of-the-us-privacy-shield-defeat/
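The role-based data masking mentioned in point 4 can be sketched in a few lines of Python. This is a minimal illustration, not a product implementation; the field names, masking rules, and role names are all hypothetical.

```python
# Minimal sketch of role-based data masking (illustrative fields and roles).
# An analyst sees masked PII; a privacy officer sees the clear values.

MASK_RULES = {
    "email": lambda v: v[0] + "***@" + v.split("@")[1],
    "ssn": lambda v: "***-**-" + v[-4:],
}

# Roles allowed to see unmasked values, per field (assumption for illustration).
CLEAR_ROLES = {"email": {"privacy_officer"}, "ssn": {"privacy_officer"}}

def apply_masking(record: dict, role: str) -> dict:
    """Return a copy of the record with PII masked for unauthorized roles."""
    out = {}
    for field, value in record.items():
        rule = MASK_RULES.get(field)
        if rule and role not in CLEAR_ROLES.get(field, set()):
            out[field] = rule(value)  # mask this field for this role
        else:
            out[field] = value        # non-PII field or authorized role
    return out

record = {"name": "Folds", "email": "j.folds@example.com", "ssn": "123-45-6789"}
print(apply_masking(record, "analyst"))          # email and SSN masked
print(apply_masking(record, "privacy_officer"))  # clear values
```

Because masking is applied at read time per role, the same stored record can serve both audiences without copying the data.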
11. 11
Privacyshield.gov
Will the Privacy Shield continue to serve as a data transfer
mechanism under the EU General Data Protection Regulation
(GDPR)?
• Yes. Article 45 of the GDPR provides for the continuity of adequacy determinations made under
the EU’s 1995 Data Protection Directive, one of which was the adequacy decision on the EU-U.S.
Privacy Shield.
• The Privacy Shield was also designed with an eye to the GDPR, addressing both substantive and
procedural elements.
• For instance, the Privacy Shield includes an annual review, which was designed to address the
GDPR’s requirement for a mechanism for a periodic review, at least once every four years, of
relevant developments.
• It is important to note that Privacy Shield is not a GDPR compliance mechanism, but rather is a
mechanism that enables participating companies to meet the EU requirements for transferring
personal data to third countries, discussed in Chapter V of the GDPR.
13. 13
The advocate general's 'Schrems II' opinion: What it says and means
• On July 16, the Court of Justice of the European Union issued its long-awaited decision in the case Data Protection
Commission v. Facebook Ireland, Schrems.
• That decision invalidates the European Commission’s adequacy decision for the EU-U.S. Privacy Shield Framework,
on which more than 5,000 U.S. companies rely to conduct trans-Atlantic trade in compliance with EU data
protection rules.
• While the decision upholds the validity of standard contractual clauses, it requires companies and regulators to
conduct case-by-case analyses to determine whether foreign protections concerning government access to data
transferred meet EU standards.
• The decision reinforces the importance of data protection to global commerce and the critical role that privacy
professionals play in implementing protections in line with foreign legal requirements.
• For privacy professionals today, though, there may be more questions than answers.
https://iapp.org/news/a/the-advocate-generals-schrems-ii-opinion-what-it-says-and-means/
14. 14
After Schrems II
Contracts No Longer Enough For Data Transfer
It is critical to note that under the GDPR, pseudonymisation is defined as an outcome and not a technique.
Before the GDPR, pseudonymisation was widely understood to mean replacing direct identifiers with tokens and was
applied to individual fields within a data set.
• It was merely a Privacy Enhancing Technique (“PET”).
• In addition, instead of being applied only to individual fields, GDPR pseudonymisation, in combination with the GDPR
definition for personal data, now requires that the outcome should apply to a data set as a whole (the entire
collection of direct identifiers, indirect identifiers and other attributes).
This means that to achieve GDPR-compliant pseudonymisation, you must protect not only direct identifiers but also indirect
identifiers.
• You must also consider the degree of protection applied to all attributes in a data set.
• Further, to retain any value, you must do so while still preserving the data’s utility for its intended use.
• As a result, pre-GDPR approaches (using static tokens on a direct identifier, which is too often still incorrectly referred to
as “pseudonymisation”) will rarely, if ever, meet the heightened GDPR requirements to satisfy “appropriate safeguard”
requirements for lawful international data transfers under EU law.
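The shift from field-level tokens to data-set-wide protection can be sketched as follows. This is a simplified illustration, not a compliance recipe: direct identifiers get non-derivable tokens, while indirect identifiers (age, city) are generalized so they no longer single a person out. The field names, bands, and city-to-country map are all illustrative assumptions.

```python
import secrets

token_vault = {}  # identifier -> token; must be kept separately from the data set

def tokenize(value: str) -> str:
    """Replace a direct identifier with a random, consistent token."""
    if value not in token_vault:
        token_vault[value] = secrets.token_hex(8)  # not derivable from the value
    return token_vault[value]

def age_band(age: int) -> str:
    """Generalize an exact age (indirect identifier) into a coarse band."""
    return "<30" if age < 30 else "30-60" if age <= 60 else ">60"

CITY_TO_COUNTRY = {"Hamburg": "Germany", "Lyon": "France"}  # illustrative map

def pseudonymise(record: dict) -> dict:
    """Apply protection to the record as a whole, not just one field."""
    return {
        "patient": tokenize(record["patient"]),                 # direct identifier
        "age": age_band(record["age"]),                         # indirect identifier
        "region": CITY_TO_COUNTRY.get(record["city"], "Other"), # indirect identifier
        "disease": record["disease"],                           # payload kept for utility
    }

rec = {"patient": "173965429", "age": 57, "city": "Hamburg", "disease": "Gastric ulcer"}
print(pseudonymise(rec))
```

The key design point matches the text above: a static token on `patient` alone would leave `age` and `city` as a re-identification route, so the indirect identifiers are coarsened while the `disease` attribute retains analytic utility.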
16. 16
Case Study
Major healthcare enterprise, providing and coordinating services to
government sponsored programs.
Contracts with numerous physicians, hospitals and Federally Qualified Health Centers (FQHCs) across
many states in the USA.
• The company needed to improve patient outcomes to reduce overall cost per member utilizing
predictive analytics.
• However, governance policies dictated that analysts should not have access to sensitive Protected
Health Information (PHI) and Personally Identifiable Information (PII).
• This meant protecting data in Teradata, Oracle and SQL Server, as well as applications and files.
• In addition, recent security breaches by other companies in the industry drove a mandate to review
and secure sensitive data from external threats and unauthorized access.
17. 17
https://www.iso.org/standard/42807.html
Definitions in ISO 25237 International Health informatics standard
• De-identification process addresses three kinds of data:
• direct identifiers, which by themselves identify the patient;
• indirect identifiers, which provide correlation when used with other indirect or external knowledge; and
• non-identifying data, the rest of the data.
• Pseudonymization: particular type of de-identification that both removes the association with a data subject and adds an
association between a particular set of characteristics relating to the data subject and one or more pseudonyms
• Pseudonym: personal identifier that is different from the normally used personal identifier and is used with pseudonymized data to
provide dataset coherence linking all the information about a subject, without disclosing the real world person identity
• data protection: technical and social regimen for negotiating, managing and ensuring informational privacy, and security
• de-identification: general term for any process of reducing the association between a set of identifying data and the data
subject
• irreversibility: situation when, for any passage from identifiable to pseudonymous, it is computationally unfeasible to trace
back to the original identifier from the pseudonym
• data linking: matching and combining data from multiple databases
• linkage of information objects: process allowing a logical association to be established between different information objects
• primary use of personal data: uses and disclosures that are intended for the data collected
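The "irreversibility" and "pseudonym" definitions above can be illustrated with a keyed one-way function. This is a sketch under stated assumptions (HMAC-SHA-256, a demo key, 16-character truncation are all illustrative choices): the same subject always maps to the same pseudonym, preserving dataset coherence, while tracing back from pseudonym to identifier is computationally unfeasible without the secret key.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-only-key"  # in practice: held by a key-management service

def pseudonym(identifier: str) -> str:
    """Derive a one-way pseudonym from a personal identifier."""
    mac = hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256)
    return "P-" + mac.hexdigest()[:16]  # truncation length is illustrative

# Same input -> same pseudonym (coherence across the data set);
# different inputs -> different pseudonyms; no way back without the key.
print(pseudonym("patient-173965429"))
print(pseudonym("patient-173965429") == pseudonym("patient-173965429"))  # True
```

A reversible variant would instead keep an access-controlled lookup table of identifier-to-pseudonym pairs; the keyed-hash form trades reversibility for the irreversibility property defined above.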
18. 18
Re-identification attacks
https://www.iso.org/standard/42807.html
A key element in privacy risk assessment is to assess the effect of observational data that can be obtained by an attacker.
Observational data can consist of events recorded by the attacker, but can also consist of information that can be legally
obtained by the attacker.
It could be that the attacker is a generic user of the system who has, either by accident or unauthorized effort, obtained
extra data with which he should not have come into contact in the normal line of his duty.
• It is important to note that this information is usually outside the scope of the data model of an application.
• In order to create a methodology for privacy risk assessment, a formalized way of describing the privacy threat and the risk of re-identification is needed.
• A generic model of re-identification attacks, shown at its highest level of abstraction, consists of three major entities.
19. 19
Threat model, goals and means of the attacker
https://www.iso.org/standard/42807.html
There is the goal of the attack (what information is an attacker after?) and there are the means at his disposal.
• The latter is linked with the “value” that the information that could be recovered from the anonymous database has for
the attacker.
Privacy protection is about protecting personal information and not simply about protecting the identity linked to a specific
database record.
This subtle difference is reflected in the three different attacker goals that are specified in the model:
a) re-identification (full):
1) identify to whom a specific anonymous record belongs;
2) identify which anonymous record belongs to a certain
person;
b) information recovery (or partial re-identification);
c) database membership:
1) Is someone listed in the database?
2) Is someone not listed in the database?
20. 20
Re-identification example
https://www.iso.org/standard/42807.html
In this example, the anonymous database contains three records with four static variables, which can take the values A or B
(where a question mark indicates missing information).
The attacker can observe only two of these variables directly and correctly and knows all people who are listed in the
anonymous database.
The linkage rules for this situation are thus extremely simple, either a value is the same or it is not.
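The exact-match linkage rule described above can be sketched directly. The records and variable values below are illustrative stand-ins for the slide's example: the attacker observes two of the four variables for a known person and links any anonymous record whose values match exactly.

```python
# Sketch of the exact-match linkage rule: a record links to an observation
# only if every observed variable matches ("?" marks missing information).

anonymous_db = [
    {"id": "r1", "v1": "A", "v2": "B", "v3": "A", "v4": "?"},
    {"id": "r2", "v1": "B", "v2": "B", "v3": "?", "v4": "A"},
    {"id": "r3", "v1": "A", "v2": "A", "v3": "B", "v4": "B"},
]

def link(observed: dict) -> list:
    """Return ids of anonymous records consistent with the observation."""
    matches = []
    for rec in anonymous_db:
        if all(rec[k] == v for k, v in observed.items()):
            matches.append(rec["id"])
    return matches

# Observing v1=A and v2=B singles out one record -> full re-identification.
print(link({"v1": "A", "v2": "B"}))  # ['r1']
# Observing only v2=B leaves two candidates -> at most partial re-identification.
print(link({"v2": "B"}))             # ['r1', 'r2']
```

The second call shows why the threat model distinguishes full from partial re-identification: fewer observed variables widen the candidate set rather than eliminating the risk.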
21. 21
Linkage mechanisms
https://www.iso.org/standard/42807.html
Partial re-identification is an intermediate stage between recovery of all information (on a particular subject) within the
anonymous database and no recovery at all.
In other words, it is the situation in which full re-identification fails, but in which the re-identification processes
(algorithms) used still succeed in recovering some information from the anonymous database.
23. 23
Examples of data de-identification
Source: INTERNATIONAL STANDARD ISO/IEC 20889, Privitar, Anonos

Pseudonymization
| Field  | Privacy Action (PA) | PA Config | Variant Twin Output |
| Gender | Pseudonymise        |           | AD-lks75HF9aLKSa    |

Generalization - Aggregation/Binning
| Field | Privacy Action (PA) | PA Config        | Variant Twin Output |
| Age   | Integer Range Bin   | Step 10 + Pseud. | Age_KXYC            |
| Age   | Integer Range Bin   | Custom Steps     | 18-25               |

Generalization - Rounding
| Field   | Privacy Action (PA) | PA Config | Variant Twin Output |
| Balance | Nearest Unit Value  | Thousand  | 94000               |

Source data:
| Last name | Balance | Age | Gender |
| Folds     | 93791   | 23  | m      |

Generalization
Source data:
| Patient   | Age | Gender | Region  | Disease       |
| 173965429 | 57  | Female | Hamburg | Gastric ulcer |
Output data:
| Patient   | Age | Gender | Region  | Disease       |
| 173965429 | >50 | Female | Germany | Gastric ulcer |
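The binning, rounding, and generalization actions shown in these examples can be sketched in a few small functions. The step sizes and thresholds below are illustrative defaults, not the configuration of any particular product.

```python
# Sketch of the de-identification actions above; parameters are illustrative.

def bin_age(age: int, step: int = 10) -> str:
    """Integer range bin, e.g. 23 -> '20-29' with step 10."""
    lo = (age // step) * step
    return f"{lo}-{lo + step - 1}"

def round_to_unit(value: int, unit: int = 1000) -> int:
    """Round to the nearest unit value, e.g. 93791 -> 94000."""
    return round(value / unit) * unit

def generalize_age(age: int, threshold: int = 50) -> str:
    """Top-coding style generalization, e.g. 57 -> '>50'."""
    return f">{threshold}" if age > threshold else str(age)

print(bin_age(23))           # 20-29
print(round_to_unit(93791))  # 94000
print(generalize_age(57))    # >50
```

Each function trades precision for privacy in a controlled way, which is exactly the point of a "variant twin": the output remains useful for analytics while removing exact values.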
24. 24
Pseudonymization vs. Anonymization
Pseudonymization is recognized as an important method for privacy protection of personal health information.
• Such services may be used nationally, as well as for trans-border communication.
• Application areas include:
• indirect use of clinical data (e.g. research); clinical trials and post-marketing surveillance; pseudonymous
care; patient identification systems; public health monitoring and assessment; confidential patient-safety
reporting (e.g. adverse drug effects); comparative quality indicator reporting; peer review; consumer
groups; field service.
Anonymization
• Anonymization is the process and set of tools used where no longitudinal consistency is needed.
• The anonymization process is also used where pseudonymization has been used to address the remaining data
attributes.
• Anonymization utilizes tools like redaction, removal, blanking, substitution, randomization, shifting, skewing,
truncation, grouping, etc. Anonymization can lead to a reduced possibility of linkage.
• Each element allowed to pass should be justified. Each element should present the minimal risk, given the
intended use of the resulting data-set. Thus, where the intended use of the resulting data-set does not require
fine-grain codes, a grouping of codes might be used.
ISO 25237 Health informatics
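The "grouping of codes" idea above, alongside redaction, can be sketched as follows. The 3-character ICD-10-style truncation is an illustrative grouping rule, not a prescribed one; the field names are hypothetical.

```python
# Sketch of anonymization via code grouping and redaction: where the intended
# use does not require fine-grain diagnosis codes, pass only the code group.

def group_code(icd_code: str) -> str:
    """Map a fine-grain code such as 'K25.3' to its group 'K25'."""
    return icd_code.split(".")[0]

def redact(value: str) -> str:
    """Blanking/redaction for fields not justified in the output data set."""
    return "REDACTED"

record = {"name": "Folds", "diagnosis": "K25.3"}
out = {"name": redact(record["name"]), "diagnosis": group_code(record["diagnosis"])}
print(out)  # {'name': 'REDACTED', 'diagnosis': 'K25'}
```

This mirrors the justification principle stated above: each element allowed to pass presents the minimal risk given the intended use, and anything not justified is suppressed outright.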
25. 25
Example of Privacy protection of personal health information
Use Cases for vendor neutral archive (VNA) for Medical Imaging devices & Analytics:
• Diagnostic & reporting with real (Pseudonymized) data
• Clinical research by analyzing historical data with Anonymized data
• Real-time analytics and triggering actionable events for patients/physicians with real (Pseudonymized) data
• Training with Anonymized data
• Clinical trials and treatment with real (Pseudonymized) / Anonymized data
• Predictive analytics with real (Pseudonymized) / Anonymized data
Examples of Data Protection Enforcement points in the architecture: Application Protection, Cloud Gateway, Big Data Protection, and File Protection for Imaging Data.
26. 26
Protection throughout the lifecycle of data in Hadoop
Big Data Protection with Granular Field Level Protection for Google Cloud:
• Data Producers feed protected data fields to Big Data Analytics on Google Cloud; Data Users consume them.
• A Policy Enforcement Point (PEP) tokenizes or encrypts sensitive data fields.
• Enterprise privacy policies may be managed on-prem or on the Cloud Platform.
• Separation of Duties is maintained through Encryption Key Management.
28. 28
Risk reduction and truthfulness of standardized de-identification techniques and models
Source: INTERNATIONAL STANDARD ISO/IEC 20889
Data truthfulness at the record level is useful for cases involving traceable data principal specific patterns, such as for fraud detection, healthcare outcome assessments, etc.
(Columns: Data protected in Transit / Use / Storage; Data truthfulness at record level; Applicability to different types of attributes; Reduces the risk of Singling out / Linking / Inference.)

| Technique name | Category | Use Case / User Story | Transit | Use | Storage | Truthful at record level | Applicable to types of attributes | Singling out | Linking | Inference |
| Tokenization | Pseudonymization | Protects the data flow from attacks | Yes | Yes | Yes | Yes | Direct identifiers | No | Partially | No |
| Deterministic encryption | Cryptographic tools | Protects the data when not used in processing operations | Yes | No | Yes | Yes | All attributes | No | Partially | No |
| Order-preserving encryption | Cryptographic tools | Protects the data from attacks | Partially | Partially | Partially | Yes | All attributes | No | Partially | No |
| Homomorphic encryption | Cryptographic tools | Protects the data also when used in processing operations | Yes | Yes | Yes | Yes | All attributes | No | No | No |
| Masking | Suppression | Protects the data in dev/test and analytical applications | Yes | Yes | Yes | Yes | Local identifiers | Yes | Partially | No |
| Local suppression | Suppression | Protects the data in analytical applications | Yes | Yes | Yes | Yes | Identifying attributes | Partially | Partially | Partially |
| Record suppression | Suppression | Removes the data from the data set | Yes | Yes | Yes | Yes | All attributes | Yes | Yes | Yes |
| Sampling | Suppression | Exposes only a subset of the data for analytical applications | Partially | Partially | Partially | Yes | All attributes | Partially | Partially | Partially |
| Generalization | Generalization | Protects the data in dev/test and analytical applications | Yes | Yes | Yes | Yes | Identifying attributes | Partially | Partially | Partially |
| Rounding | Generalization | Protects the data in dev/test and analytical applications | Yes | Yes | Yes | Yes | Identifying attributes | No | Partially | Partially |
| Top/bottom coding | Generalization | Protects the data in dev/test and analytical applications | Yes | Yes | Yes | Yes | Identifying attributes | No | Partially | Partially |
| Noise addition | Randomization | Protects the data in dev/test and analytical applications | Yes | Yes | Yes | No | Identifying attributes | Partially | Partially | Partially |
| Permutation | Randomization | Protects the data in dev/test and analytical applications | Yes | Yes | Yes | No | Identifying attributes | Partially | Partially | Partially |
| Micro aggregation | Randomization | Protects the data in dev/test and analytical applications | Yes | Yes | Yes | No | All attributes | No | Partially | Partially |
| Differential privacy | Privacy models | Protects the data in analytical applications | No | Yes | Yes | No | Identifying attributes | Yes | Yes | Partially |
| K-anonymity | Privacy models | Protects the data in analytical applications | No | Yes | Yes | Yes | Quasi-identifiers | Yes | Partially | No |
29. 29
Risk reduction and truthfulness of standardized de-identification techniques and models
Source: INTERNATIONAL STANDARD ISO/IEC 20889
Reduce the risk of Singling out
Singling out: isolating some or all records belonging to a data principal in the dataset by observing a set of characteristics known to uniquely identify this data principal.
30. 30
Risk reduction and truthfulness of standardized de-identification techniques and models
Source: INTERNATIONAL STANDARD ISO/IEC 20889
Reduce the risk of Linking
Linking: act of associating a record concerning a data principal with a record concerning the same data principal in a separate dataset.
31. 31
Risk reduction and truthfulness of standardized de-identification techniques and models
Source: INTERNATIONAL STANDARD ISO/IEC 20889
Reduce the risk of Inference
Inference: act of deducing otherwise unknown information with non-negligible probability, using the values of one or more attributes or by correlating external data sources.
The deduced information can be the value of one or more attributes of a data principal, the presence or absence of a data principal in a dataset, or the value of one or more statistics for a population or segment of a population.
32. 32
Risk reduction and truthfulness of standardized de-identification techniques and models
Source: INTERNATIONAL STANDARD ISO/IEC 20889
Legend - T/U/S: data protected in Transit / in Use / in Storage; Truthful: data truthfulness at record level; SO/L/I: reduces the risk of Singling Out / Linking / Inference

Technique name | Use case / user story | T/U/S | Truthful | Applicable to types of attributes | SO/L/I

Cryptographic tools:
Pseudonymization (tokenization) | Protects the data flow from attacks | Yes/Yes/Yes | Yes | Direct identifiers | No/Partially/No
Deterministic encryption | Protects the data when not used in processing operations | Yes/No/Yes | Yes | All attributes | No/Partially/No
Order-preserving encryption | Protects the data from attacks | Partially/Partially/Partially | Yes | All attributes | No/Partially/No
Homomorphic encryption | Protects the data also when used in processing operations | Yes/Yes/Yes | Yes | All attributes | No/No/No

Suppression:
Masking | Protects the data in dev/test and analytical applications | Yes/Yes/Yes | Yes | Local identifiers | Yes/Partially/No
Local suppression | Protects the data in analytical applications | Yes/Yes/Yes | Yes | Identifying attributes | Partially/Partially/Partially
Record suppression | Removes the data from the data set | Yes/Yes/Yes | Yes | All attributes | Yes/Yes/Yes
Sampling | Exposes only a subset of the data for analytical applications | Partially/Partially/Partially | Yes | All attributes | Partially/Partially/Partially

Generalization:
Generalization | Protects the data in dev/test and analytical applications | Yes/Yes/Yes | Yes | Identifying attributes | Partially/Partially/Partially
Rounding | Protects the data in dev/test and analytical applications | Yes/Yes/Yes | Yes | Identifying attributes | No/Partially/Partially
Top/bottom coding | Protects the data in dev/test and analytical applications | Yes/Yes/Yes | Yes | Identifying attributes | No/Partially/Partially

Randomization:
Noise addition | Protects the data in dev/test and analytical applications | Yes/Yes/Yes | No | Identifying attributes | Partially/Partially/Partially
Permutation | Protects the data in dev/test and analytical applications | Yes/Yes/Yes | No | Identifying attributes | Partially/Partially/Partially
Micro-aggregation | Protects the data in dev/test and analytical applications | Yes/Yes/Yes | No | All attributes | No/Partially/Partially

Privacy models:
Differential privacy | Protects the data in analytical applications | No/Yes/Yes | No | Identifying attributes | Yes/Yes/Partially
K-anonymity | Protects the data in analytical applications | No/Yes/Yes | Yes | Quasi-identifiers | Yes/Partially/No
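The trade-offs in the table can be made concrete with a short sketch of one technique from each family; the key, the age-band size, and the noise scale below are invented for the example:

```python
import hashlib
import hmac
import random

# Hypothetical key - invented for the example, not from the slides.
SECRET = b"demo-pseudonymization-key"

def pseudonymize(identifier: str) -> str:
    """Cryptographic tool: map a direct identifier to a deterministic pseudonym."""
    return hmac.new(SECRET, identifier.encode(), hashlib.sha256).hexdigest()[:16]

def mask(card_number: str) -> str:
    """Suppression (masking): hide all but the last four digits."""
    return "*" * (len(card_number) - 4) + card_number[-4:]

def generalize_age(age: int, band: int = 10) -> str:
    """Generalization: replace an exact age with a coarser age band."""
    low = (age // band) * band
    return f"{low}-{low + band - 1}"

def add_noise(value: float, scale: float = 1.0) -> float:
    """Randomization (noise addition): the record is no longer truthful."""
    return value + random.uniform(-scale, scale)

record = {"name": "Alice Smith", "card": "4111111111111111", "age": 34, "salary": 52000.0}
protected = {
    "name": pseudonymize(record["name"]),          # same input -> same pseudonym (linkable)
    "card": mask(record["card"]),                  # "************1111"
    "age": generalize_age(record["age"]),          # "30-39"
    "salary": add_noise(record["salary"], 500.0),  # perturbed value
}
print(protected)
```

Note how the table's "truthfulness" column shows up in the code: the first three functions keep each record truthful (just less precise), while noise addition does not.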
35. 35
ISO/IEC 29101:2018 Privacy architecture framework
ISO - Actors and Systems
• An actor can be responsible for building the ICT (information and communication technology) systems that it uses, or not. For example, the PII principal can use a system built by, and under the responsibility of, the PII controller, or the ICT system of the PII principal can be part of the ICT system of the PII controller.
• In ICT systems employing peer-to-peer communications (communication between peer entities without central servers), every application can take the roles of all three listed actors.
• Information is both sent and received by each peer, so each peer can be a PII controller or processor for PII transferred by another party in the role of a PII principal.
• In social networking applications, PII can be processed by anyone with access to other people's profiles.
36. 36
Policy framework for operation of pseudonymization services
https://www.iso.org/standard/42807.html
Each data processing or collecting project that uses pseudonymization should have a data protection policy dealing with the pseudonymization aspects. This policy should include the following:
1. description of the processing in which pseudonymization plays a role;
2. identification of the controller of the personal data;
3. identification of the controller of the pseudonymized data;
4. description of the pseudonymization method;
5. identification of the entity carrying out the pseudonymization, including:
• protection, storage and handling of the pseudonymization “secrets” (usually a cryptographic key or a linking table);
• description of what will happen if the organization is discontinued;
• description of the domains and applications for which the secret will be used and/or how long it is valid;
6. detailed description of whether the pseudonymization is reversible and what authorization, by whom, is required;
7. definition of the limitations on the receiver of pseudonymized data (e.g. information actions, onward forwarding, retention policies).
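Such a policy can be captured as a structured record so completeness can be checked automatically; every field name below is illustrative, not mandated by the standard:

```python
# Hypothetical policy record covering the checklist above; the field names and
# values are invented for the example, not taken from ISO 25237.
pseudonymization_policy = {
    "processing_description": "monthly claims analytics on pseudonymized patient data",
    "personal_data_controller": "Hospital A",
    "pseudonymized_data_controller": "Research Institute B",
    "method": "keyed-hash pseudonyms (HMAC-SHA256)",
    "pseudonymizing_entity": "Pseudonymization Service C",
    "secret_handling": {
        "storage": "hardware security module",
        "if_discontinued": "escrow the secret, then destroy all working copies",
        "domains_and_validity": "claims and lab-results domains, valid 24 months",
    },
    "reversible": True,
    "reversal_authorization": "data protection officer of Hospital A",
    "receiver_limitations": ["no onward forwarding", "retention max 5 years"],
}

REQUIRED_FIELDS = {
    "processing_description", "personal_data_controller",
    "pseudonymized_data_controller", "method", "pseudonymizing_entity",
    "secret_handling", "reversible", "reversal_authorization",
    "receiver_limitations",
}

def policy_is_complete(policy: dict) -> bool:
    """Check that every checklist item has been addressed."""
    return REQUIRED_FIELDS <= policy.keys()

print(policy_is_complete(pseudonymization_policy))  # True
```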
37. 37
Trustworthy implementation - a trusted third party* performing a pseudonymizing transformation
https://www.iso.org/standard/42807.html
A trusted third party performing a pseudonymizing transformation is necessary for trustworthy implementation of the pseudonymization technique across multiple entities.
1. As one communicating party does not always trust the other, trust can be established indirectly because the two parties trust a third, independent party.
2. Both parties are bound by a code of conduct, as specified in a privacy and security policy agreement they conclude with the pseudonymization service.
3. Use of a pseudonymization service offers the only reliable protection against several types of attack on the pseudonymization process.
4. Complementary privacy-enhancing technologies (PETs) and data processing features can easily be implemented.
*: Security authority, or its agent, trusted by other entities with respect to security-related activities (ISO 25237:2017 Health informatics - Pseudonymization)
38. 38
Interoperability of trustworthy implementations of the pseudonymization
https://www.iso.org/standard/42807.html
For two independent pseudonymization service providers to be interoperable, they need:
• One or more mechanisms for exchanging the data between the entities in the model (source, pseudonymization service, target) and for controlling the operation. This is less of an issue, as existing protocols such as HTTP can be used.
• Key exchange mechanisms.
Interoperability means the providers can:
• integrate each other's data: data from the same data subject processed by any of the service providers should be linkable to each other without direct re-identification of the data subject;
• convert the pseudonymization results from one or more service providers in a controlled way without direct re-identification of the data subject.
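One way to read this interoperability requirement: a trusted third party keeps a per-provider secret and an internal linking table, so a pseudonym from one provider's domain can be converted to the other provider's domain without ever exposing the data subject's identifier. A minimal sketch, with keys and domain names invented for the example:

```python
import hashlib
import hmac

class TrustedPseudonymizationService:
    """Hypothetical TTP: one secret per provider domain plus an internal linking table."""

    def __init__(self, domain_keys: dict):
        self._keys = domain_keys
        self._link = {}  # (domain, pseudonym) -> identifier; never leaves the TTP

    def pseudonymize(self, domain: str, identifier: str) -> str:
        p = hmac.new(self._keys[domain], identifier.encode(), hashlib.sha256).hexdigest()[:16]
        self._link[(domain, p)] = identifier
        return p

    def convert(self, src_domain: str, dst_domain: str, pseudonym: str) -> str:
        """Controlled conversion between domains, done entirely inside the TTP."""
        identifier = self._link[(src_domain, pseudonym)]
        return self.pseudonymize(dst_domain, identifier)

ttp = TrustedPseudonymizationService({"provider_a": b"key-a", "provider_b": b"key-b"})
pa = ttp.pseudonymize("provider_a", "patient-123")
pb = ttp.pseudonymize("provider_b", "patient-123")
# Records from the two providers are linkable via the TTP, without re-identification:
assert ttp.convert("provider_a", "provider_b", pa) == pb
assert pa != pb  # the domains are not directly linkable to each other
```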
39. 39
Pseudonymization services - Trustworthy practices for operations
https://www.iso.org/standard/42807.html
A pseudonymization service:
1. should be strictly independent of the organizations supplying source data;
2. should be able to guarantee the security and trustworthiness of its methods by publishing its operating practices to its subscribers;
3. should be able to guarantee the security and trustworthiness of its software modules;
4. should be able to guarantee the security and trustworthiness of its operating environment, platforms and infrastructure (it should provide technical, physical, procedural and personnel controls in accordance with ISO 27799);
5. should implement monitoring and quality assurance services and programmes.
Further trustworthy practices cover:
6. cryptographic key management;
7. instantiation of the pseudonymization service;
8. internal audit procedures;
9. external audit procedures;
10. participants;
11. risk assessment, which should be conducted regarding access by the data source to the resulting pseudonyms; such restrictions should be specified in operational policies.
40. 40
https://www.iso.org/standard/42807.html
Preparation of data
The conceptual model for the use of pseudonymization services requires that the data be split into a part containing identifying data and another part containing nothing but anonymous data.
1. Data elements that will be used for linking, grouping, anonymous searching, matching, etc. shall be indicated and marked.
2. Depending on the privacy policy, elements that need specific transformations, e.g. changing absolute time references into relative time references or dates of birth into age groups, need similar marking.
3. Identifying elements that, according to the privacy policy, are not needed in the further processing in the target applications, shall be discarded.
4. The anonymous part of the raw personal data is put into the payload part of the personal data element.
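Step 2 above can be illustrated with two tiny transformations; the band size and the reference dates are assumptions for the example:

```python
from datetime import date

def dob_to_age_group(dob: date, on: date, band: int = 5) -> str:
    """Generalize a date of birth into an age group."""
    age = on.year - dob.year - ((on.month, on.day) < (dob.month, dob.day))
    low = (age // band) * band
    return f"{low}-{low + band - 1}"

def to_relative_days(event: date, reference: date) -> int:
    """Convert an absolute time reference into a relative one."""
    return (event - reference).days

print(dob_to_age_group(date(1986, 7, 1), on=date(2020, 5, 1)))         # "30-34"
print(to_relative_days(date(2020, 5, 4), reference=date(2020, 5, 1)))  # 3
```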
42. 42
Pseudonymize - identifying and payload data shall be separated
Entities in the de-classification process
The separation of identifying and payload data:
• Further processing steps take the identifying part as input and leave the payload unchanged.
• The pseudonymization process translates the given identifiers into a pseudonym. Pseudonymization can:
— map a given identifier always to the same pseudonym. Because the combination of preserving linkage between records belonging to the same identity and protecting the privacy of the data subjects is the main reason for using pseudonymization, this variant is used most often;
— map a given identifier to a different pseudonym:
— context dependent (context-spanning aspect of a pseudonym);
— time dependent (e.g. always varying, or changing over specified time intervals);
— location dependent (e.g. changing when the data comes from different places).
ISO/TS 25237:2008 Health informatics — Pseudonymization
Two types of pseudonymized data:
• Irreversible pseudonymization
• Reversible pseudonymization, by applying procedures restricted to duly authorized users
[Diagram: identifying data is mapped to tokens via a lookup table; payload data passes through unchanged]
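The separation described above can be sketched as follows; the token format and the authorization check are simplifications invented for the example:

```python
import secrets

class Pseudonymizer:
    """Sketch of the split: identifying data is replaced by a token from a lookup
    table, payload data passes through unchanged, and reversal is gated by a
    (deliberately simplistic) authorization check."""

    def __init__(self):
        self._lookup = {}   # token -> identifying data
        self._reverse = {}  # identifying data -> token (same identifier, same token)

    def pseudonymize(self, identifying: str, payload: dict) -> dict:
        token = self._reverse.get(identifying)
        if token is None:
            token = secrets.token_hex(8)
            self._reverse[identifying] = token
            self._lookup[token] = identifying
        return {"token": token, "payload": payload}

    def reidentify(self, token: str, authorized: bool) -> str:
        if not authorized:
            raise PermissionError("re-identification restricted to duly authorized users")
        return self._lookup[token]

p = Pseudonymizer()
r1 = p.pseudonymize("alice@example.com", {"visit": "2020-05-01", "dx": "J45"})
r2 = p.pseudonymize("alice@example.com", {"visit": "2020-06-12", "dx": "J45"})
assert r1["token"] == r2["token"]  # linkage between records is preserved
```

This implements the "same identifier, same pseudonym" variant; the context-, time-, or location-dependent variants would derive a different token per context instead of reusing the lookup-table entry.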
43. 43 pcisecuritystandards.org
Encryption Example for PCI DSS
[Diagram: an encryption process distributes encrypted cardholder data (CHD) and encryption keys across System 0 through System 4; systems that hold only encrypted CHD and no keys MAY NOT be in scope for PCI DSS]
“Where a third party receives
and/or stores only data
encrypted by another entity,
and where they do not have
the ability to decrypt the data,
the third party may be able to
consider the encrypted data
out of scope if certain
conditions are met.”
That is specific to a situation where the organization has no access to the key material and holds only encrypted PANs. For example, if a card swipe is encrypted at the PAD, traverses the organization's network, and then goes to the bank for authorization/settlement, and the organization never gets the clear-text PAN and has no access to the keys used between the PAD and the bank, then that organization may have no PCI responsibility.
In any situation where the organization has access to the key material, tokenization is the only method to reduce scope.
44. 44
Tokenization Example for PCI DSS
[Diagram: a tokenization process distributes tokens across System 0 through System 4; the lookup table stays with the tokenization system, and systems holding only tokens are NOT in scope]
The following are each in scope:
1. Systems performing tokenization of data
2. Tokens that are not isolated from the tokenization processes
3. Tokenized data that is present on a system or media that also contains the tokenization table
4. Tokens that are present in the same environment as the tokenization table
5. Tokens accessible to an entity that also has access to the tokenization table
pcisecuritystandards.org
46. 46
Lower Risk and Higher Productivity with More Access to More Data
[Chart: user productivity rises with access to data; clear data carries high risk while tokens carry low risk, so broader access to tokenized data keeps risk low]
47. 47
Why, What & How: balancing breaches and opportunities
[Diagram: security, privacy and compliance balanced through controls & tools, policies, regulations and risk management]
• Enable use of protected data to find new business opportunities.
• Protect that data in ways that are transparent to business processes and compliant with regulations.
Data Security: on-prem or as a service. Compliance with EU GDPR, California CCPA and a growing list of country-specific privacy regulations.
50. 50
Case Study
A major international bank performed a consolidation of all European operational data sources to Italy, holding Personally Identifiable Information (PII) in compliance with the EU cross-border data protection laws, specifically:
• Datenschutzgesetz 2000 (DSG 2000) in Austria, and
• Bundesdatenschutzgesetz in Germany.
This required access to Austrian and German customer data to be restricted to only requesters in each respective country.
Results:
• Achieved targeted compliance with EU cross-border data security laws
• Implemented country-specific data access restrictions
52. 52
CCPA redefines “personal information”
• CCPA states that “personal information” means information that identifies, relates to, describes, is capable of being associated with, or could reasonably be linked, directly or indirectly, with a particular consumer or household.
PwC,
Micro Focus
55. 55
Data flow mapping under GDPR
• If there is not already a documented workflow in place in your organisation, it can be worthwhile to send out a team to identify how the data is actually being gathered.
• This will enable you to see how your documented data flow differs from reality and what needs to be done to amend this.
If an organisation's theory about how its data is flowing differs from the reality, you have a breach and could be fined.
The organisation needs to look at how the data was captured, who is accountable for it, where it is located and who has access.
56. 56
Legal Compliance and Nation-State Attacks
• Many companies have information that is attractive to governments and intelligence services.
• Others worry that litigation may result in a subpoena for all their data.
Securosis, 2019
Multi-Cloud Data Privacy considerations
Jurisdiction
• Cloud service providers' redundancy is great for resilience, but regulatory concerns arise when moving data across regions that may have different laws and jurisdictions.
SecuPi
57. 57
Securosis, 2019
Multi-Cloud Key Management considerations
Consistency
• Most firms are quite familiar with their on-premises encryption and key management systems, so they often prefer to leverage the same tools and skills across multiple clouds.
• Firms often adopt a “best of breed” cloud approach.
Trust
• Some customers simply do not trust their vendors.
Vendor Lock-in and Migration
• A common concern is vendor lock-in and an inability to migrate to another cloud service provider.
• Some native cloud encryption systems do not allow customer keys to move outside the system, and some cloud encryption systems are based on proprietary interfaces.
• The goal is to maintain protection regardless of where data resides, including when moving between cloud vendors.
[Diagram: a cloud gateway spanning Google Cloud, AWS and Azure]
58. 58
References:
1. California Consumer Privacy Act, Oct 4, 2019, https://www.csoonline.com/article/3182578/california-consumer-privacy-act-what-you-need-to-know-to-be-compliant.html
2. CIS Controls V7.1 Mapping to NIST CSF, https://dataprivacylab.org/projects/identifiability/paper1.pdf
3. GDPR and Tokenizing Data, https://tdwi.org/articles/2018/06/06/biz-all-gdpr-and-tokenizing-data-3.aspx
4. GDPR VS CCPA, https://wirewheel.io/wp-content/uploads/2018/10/GDPR-vs-CCPA-Cheatsheet.pdf
5. General Data Protection Regulation, https://en.wikipedia.org/wiki/General_Data_Protection_Regulation
6. IBM Framework Helps Clients Prepare for the EU's General Data Protection Regulation, https://ibmsystemsmag.com/IBM-Z/03/2018/ibm-framework-gdpr
7. INTERNATIONAL STANDARD ISO/IEC 20889, https://webstore.ansi.org/Standards/ISO/ISOIEC208892018
8. INTERNATIONAL STANDARD ISO/IEC 27018, https://webstore.ansi.org/Standards/ISO/ISOIEC270182019
9. New Enterprise Application and Data Security Challenges and Solutions, https://www.brighttalk.com/webinar/new-enterprise-application-and-data-security-challenges-and-solutions/
10. Machine Learning and AI in a Brave New Cloud World, https://www.brighttalk.com/webcast/14723/357660/machine-learning-and-ai-in-a-brave-new-cloud-world
11. Emerging Data Privacy and Security for Cloud, https://www.brighttalk.com/webinar/emerging-data-privacy-and-security-for-cloud/
12. New Application and Data Protection Strategies, https://www.brighttalk.com/webinar/new-application-and-data-protection-strategies-2/
13. The Day When 3rd Party Security Providers Disappear into Cloud, https://www.brighttalk.com/webinar/the-day-when-3rd-party-security-providers-disappear-into-cloud/
14. Advanced PII/PI Data Discovery, https://www.brighttalk.com/webinar/advanced-pii-pi-data-discovery/
15. Emerging Application and Data Protection for Cloud, https://www.brighttalk.com/webinar/emerging-application-and-data-protection-for-cloud/
16. Data Security: On Premise or in the Cloud, ISSA Journal, December 2019, ulf@ulfmattsson.com
17. Webinars and slides, www.ulfmattsson.com
59. 59
ISO Privacy Standards - 11 Published International Privacy Standards
IS: International Standard; TR: Technical Report; TS: Technical Specification
Guidelines to help comply with ethical standards:
• 19608 TS (Requirements) - Guidance for developing security and privacy functional requirements based on 15408
• 20889 IS (Techniques) - Privacy enhancing de-identification terminology and classification of techniques
• 27018 IS (Cloud) - Code of practice for protection of PII in public clouds acting as PII processors
• 27550 TR (Process) - Privacy engineering for system life cycle processes
• 27701 IS (Management) - Security techniques: Extension to ISO/IEC 27001 and ISO/IEC 27002 for privacy information management - Requirements and guidelines
• 29100 IS (Framework) - Privacy framework
• 29101 IS (Framework) - Privacy architecture framework
• 29134 IS (Impact) - Guidelines for privacy impact assessment
• 29151 IS - Code of practice for PII protection
• 29190 IS - Privacy capability assessment model
• 29191 IS - Requirements for partially anonymous, partially unlinkable authentication