Data Privacy in the Cloud.pdf

Baseline Technical Measures for Data Privacy IN the Cloud
Meng-Chow Kang, PhD, CISSP
Adjunct Associate Professor
Chi-Hung Chi, PhD
Senior Principal Scientist
Kwok-Yan Lam, PhD
Professor
The School of Computer Science and Engineering
Nanyang Technological University
27 January 2023
Abstract
As the digital economy grows, individuals’ personal data is increasingly being collected, used, and disclosed, be it
through online social media, e-commerce, or e-government transactions. As the volume of such personal data
increases online, data breaches have become more prevalent, and consumers have increased their demands for
stronger privacy protection. Data privacy legislations are not new,– frameworks by the Organization for Economic
Co-operation and Development (OECD), European Union General Data Protection Regulations (EU GDPR)1
, and
Asia-Pacific Economic Cooperation (APEC) have existed as early as 1980. We have seen more policy developments
being introduced recently across the global. In ASEAN, the Singapore Personal Data Protection Act (SG PDPA) was
enacted more recently in 2012, and the latest being the Thailand Personal Data Protection Act (TH PDPA) that
came into force on 1 June 2022.
Against the backdrop of these legal frameworks, businesses and governments are also leveraging advanced cloud
services, such as AI/ML and Big Data Analytics to innovate and offer better customer services. As a result, more
personal data is being migrated to the cloud, increasing the need for technical measures to enable privacy by
design and by default and move beyond mere compliance with legal provisions. While new standards and
frameworks have emerged to guide the management of privacy risk on the use of cloud, there are limitations in
providing implementation guidance for technical controls to cloud using organizations. This paper thus seeks to fill
the gap and provide technical measures that organizations may adopt to develop technical baselines that simplify
their regulatory compliance across legal frameworks.
We review the principles from the OECD, APEC, the EU GDPR, the SG PDPA, and the newly enforced TH PDPA,
identifies a set of 31 technical measures that cloud using organizations may use to achieve common data
protection objectives including fulfilling data subject rights, minimizing personal data exposure, preparing to
respond to and recovering from data privacy breaches, ensuring security and quality of personal data, and
providing transparency and assurance. Our elaboration of the technical measures for achieving these data
protection objectives includes references to common cloud-enabled and cloud-native services from major CSPs to
provide implementation guidance.
NOTE: This paper is provided solely for informational purposes. It is not legal advice and should not be relied on as
legal advice. The target audience of the paper is systems architects, privacy engineers, information security and
data privacy managers of cloud using organizations, as well as policy makers and regulators.
1 Background and motivation
1.1 Purpose
The motivation for this paper is to address the organizational needs for privacy compliance and fill the gaps in the
existing standards. We adopt a principle-based methodology to identify a set of baseline technical measures
suitable for achieving the data protection objectives underlying their privacy principles. The paper further presents
guidance to cloud using organizations that cloud-native and cloud-enabled services may be used to implement the
1 The GDPR repealed and replaced the Data Protection Directive, which was enacted in October 1995.

2
baseline technical controls with reference to capabilities available from major Cloud Service Providers (CSPs)
including Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure.
1.2 Demands and expectations for better personal data protection
Data privacy breaches around the world have been making headlines in the past decade. Besides major breaches
reported in Europe[1], United Kingdom[2-5], and the United States[6, 7], personal data breaches have not spared
organizations in Asia, either. In 2018, in Thailand, two Thai banks had breaches involving over 120,000 customers’
personal data [8]. In the same year, SingHealth, a major healthcare cluster in Singapore, experienced a cyberattack
resulting in 1.5 million patients’ personal details illegally accessed and copied [9, 10]. In Indonesia, Tokopedia2
, a
major e-commerce platform had a cybersecurity incident that exposed over 15 million user account data in 2020.
In November 2022, Optus, one of leading telecom providers in Australia had about 10 million Australians’
identification information compromised, potentially affecting 40 percent of Australian population [11] [12] [13].
With these developments, and the increased collection and use of personal data in the digital space, consumers
and citizens around the world are demanding better privacy protection. Governments have also responded. On 25
May 2018, the European Union’s General Data Protection Regulation (EU GDPR) became enforceable, starting a
new era of higher protection of peoples’ privacy rights. Asia has been one of the most dynamic regions in the
world for privacy regulatory changes. Thailand’s Personal Data Protection Act (TH PDPA) has come in full effect
since 1 June 2022, as the fourth in the 10 ASEAN countries that have passed such a law after Singapore (2013),
Malaysia (2013), and the Philippines (2016). Indonesia passed its first Personal Data Protection Bill on 20th
September 2022, with a 2-year grace period for compliance. With these developments, organizations processing
personal data must ensure adequate data protection compliance across more economies that they do business
with. While the objectives of these regulations are similar, there are differences in their approach, scope,
coverage, and depth of requirements. Without a common baseline, organizations are further challenged with
identifying the technical specificity, which resolve to local customization, increasing the cost of compliance and
reduce operational efficiency.
1.3 Public cloud and related concerns and challenges
Compared with traditional on-premises IT infrastructure, it is easier, faster, more elastic, and cheaper to set up and
operate systems on the public cloud[14]. Cloud services have also lowered the barriers to access to advanced
technical capabilities that historically only big companies could afford, such as AI/ML, Big Data analytics, and
natural language processing[15].
As cloud service providers (CSP) build and operate public cloud services for consumption by cloud using
organizations (Cloud Users)3
, a shared responsibility model (SRM) is used to determine the roles and
responsibilities of the two entities in such an environment[16, 17]. Under this model, the CSP is responsible for the
cloud services and underlying infrastructure, systems, and processes that they build and operate—commonly
known as the “Of the Cloud” domain. The Cloud User is responsible for the applications and data that they build,
operate, and process using the cloud services that CSP provides—also known as the “In the Cloud” domain that
Cloud User manages and controls. The specific delineation depends on the cloud deployment model and the cloud
service level. There are three models - Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software
as a Service (SaaS)[18]. Cloud Users normally use a combination of at least two models across the services that
they use.
For privacy compliance for the data that Cloud Users upload to cloud services for processing, the CSP acts as the
data processor4
, taking the responsibilities for managing “Privacy of the Cloud”. The responsibilities for managing
2 Tokopedia is a subsidiary of a new holding company called GoTo, following a merger with Gojek on 17 May 2021.
3 Organizations that are using public cloud services for processing of personal and other data to deliver application services to
their customers.
4 Specific term used and definition may vary in each legal jurisdiction and standard. In this paper, we adopt the definition of
ISO 10667:2020 and ISO 30400:2021, in which “data processor refers to person (other than an employee of the data controller)

3
“Privacy in the Cloud” stay with the Cloud User, who is typically the data controller5
and ultimately responsible for
the security and privacy of the data that they upload to and process in the cloud. The Cloud User retains ownership
and control of their data throughout the data life cycle. “Privacy of the Cloud” refers to that the CSP manages and
updates the hardware, software and related infrastructure and processes to deliver their public cloud services.
“Privacy in the Cloud” refers to that Cloud Users can leverage the CSP’s compliance on the technology layers that
the CSP manages, and apply additional controls for the implementation, configurations and operations of their
systems and processes running on or supported by public cloud services.
Developing and operating in the cloud often differ from the on-premises IT environment. The shift towards the
DevOps and Continuous Integration and Continuous Deployment (CI/CD) models enables Cloud Users to build,
deploy and iterate faster in the cloud. Without embedding privacy requirements in the DevOps and CI/CD
processes, Cloud Users face challenges when tracking their personal data and ensuring required data protection in
place.
In general, there are three common concerns that Cloud Users have towards the public cloud – control over data,
access to their data, and enforcing data residency controls. Appendix 2 describes these concerns and CSPs’
responses and measures addressing the security and privacy of the cloud. In this paper, we provide guidance on
the technical measures applicable for addressing these concerns for data privacy in the cloud.
Awareness and compliance with technical regulatory requirements in the cloud are key for Cloud Users to
implement and operationalize common data protection principles. In addition, regardless of the use of cloud, there
are inherent concerns from personal data over-collection practices and over-use beyond permitted purposes.
1.4 Limitations of existing privacy control standards
Along with the regulatory changes, various standards organizations have published new or extended standards to
provide guidance for identifying data protection controls. For examples, the ISO/IEC 27701 Privacy Information
Management System (PIMS) [19], NIST Privacy Framework (NIST-PF)[20], and Cloud Security Alliance (CSA)’s
updated Cloud Control Matrix (CCM) [21]. Targeting at broad organizational level privacy program development,
these standards focus primarily on augmenting existing enterprise risk management frameworks.6
Data protection
regulatory requirements are identified and assessed as part of the risk assessment process. This requires a
jurisdiction-by-jurisdiction technical assessment, which is not scalable. The ISO series, which include ISO/IEC
27017[22] and ISO/IEC 27018[23], and the Cloud Security Alliance (CSA) standards are structured to support the
respective certification schemes[24, 25] for providing assurance of CSPs’ control environment. They do not directly
address the privacy in the cloud requirements from a data lifecycle perspective, and provide a set of baseline
technical measures and implementation guidance to Cloud Users for their personal data processing in cloud
application systems.
or organization that collects, processes and stores personal data on behalf of the data controller”. In ISO/IEC 29100:2011, which
ISO/IEC 27701:2019 follows, the term “PII processor” is used, which refers to “privacy stakeholder that processes personally
identifiable information (PII) on behalf of and in accordance with the instructions of a PII controller”.
5 Specific term used and definition may vary in each legal jurisdiction and standard. In this paper, we adopt the definition of
ISO 10667:2020 and ISO 30400:2021, in which “data controller refers to person or organization who determines the purposes
for which and the manner in which any personal data are to be collected, processed and stored”. In ISO/IEC 29100:2011, which
ISO/IEC 27701:2019 follows, the term “PII controller” is used, which refers to “privacy stakeholder (or privacy stakeholders) that
determines the purposes and means for processing personally identifiable information (PII) other than natural persons who use
data for personal purposes”.
6 The ISO/IEC 27701 is established as an extension to the ISO/IEC 27001 Information Security Management System (ISMS),
which requires the ISMS to be implemented before considering privacy management and controls. The NIST-PF adopts the NIST
Critical Infrastructure Cybersecurity Framework (NIST-CSF) Core-Profiles-Implementation Tiers model to narrow the focus on
privacy risks in the IT environment.

4
1.5 Organization’s requirements for privacy compliance at scale
As organizations seek to innovate and improve customer experience, adopting advanced cloud services7
such as
Artificial Intelligence and Machine Learning (AI/ML) and Big Data analytics services on demand are accelerating the
migration of personal data to the cloud. For organizations that operate at a regional or global level, beyond
updating Privacy Notice and internal policies, they need their technical approach for data protection to scale and
gain efficiency while ensuring effectiveness in compliance.
This paper aims to address the above challenges and requirements and provide technical measures suitable for
achieving organizations’ data privacy compliance needs in the cloud environment.
2 Scope and target audience
In this paper, we define Cloud Users as organizations that develop or consume software for handling personal data
that are built in-house or by third parties using public cloud services. Cloud Users distinguish from Cloud Service
Providers (CSPs) who provide public cloud services.
Our goal is to enable Cloud Users to manage their regulatory obligations as a Data Controller and/or Data
Processor via technical measures that are privacy principles based. The target audience includes systems
architects/designers, privacy engineers, privacy risk managers of Cloud Users. Policy makers and regulators may
also consider incorporating the technical measures, guidance and associated cloud services discussed in this paper
for protecting personal data processing in the cloud as they develop and enforce privacy regulations and
guidelines. Similarly, organizations’ leaders may consider the technical measures, guidance and associated cloud
services when evaluating the CSPs and available cloud services as part of their cloud adoption and migration
adoption strategy.
The study further presents cloud-native and cloud-enabled services that Cloud Users may use to implement the
technical baseline, with Amazon Web Services (AWS) as the primary reference for simplicity. Readers who are
using Google Cloud Platform (GCP) or Microsoft’s Azure Cloud may also refer to the cloud services mapping
provided by the respective CSP[26, 27].
3 Methodology – an overview
To enable privacy compliance in the cloud, we must identify the technical measures required that can be
implemented in the cloud. These technical measures must be relevant to most if not all applicable privacy
regulations that may be realized or supported by technical means.
The study therefore applies a principle-based approach to analyze the privacy regulations and framework starting
with identifying the privacy principles promulgated. The privacy principles are then categorized based on their
shared objectives. The categorization provides ease of reference and align those principles that have common
objectives to a common set of technical measures. The privacy principles are then analyzed against the data life
cycle stages (see Figure 2), which lead us to the list of technical measures required for achieving or supporting the
objectives of the privacy principles within the same category. Figure 1 depicts the high-level process involved in
this methodology. This approach can provide a more scalable and efficient method for identifying technical
measures required for compliance with the growing numbers of data protection laws and regulations.
7 In this paper, cloud services mean public or commercial cloud-enabled or cloud-native services that are provided by
commercial cloud service providers (CSP) such as Amazon Web Services (AWS), Microsoft Azuze, and Google Cloud Platform
(GCP).

5
Figure 1: Outline of the principles-based approach for translating privacy requirements to technical controls
4 Privacy regulations and principles
In the early 1980s, recognizing the threats to information privacy in the online world, the Organization for
Economic Co-operation and Development (OECD) promulgated the OECD Guidelines Governing the Protection of
Privacy and Transborder Flows of Personal Data8
, advocating eight fundamental privacy principles. In 2005, the
Asia-Pacific Economic Co-operation (APEC) economies endorsed an APEC Privacy Framework to encourage
development of information privacy protections and ensuring the free flow of information across APEC economies.
The APEC Privacy Framework has nine principles and is consistent with the core values of the OECD
Guidelines9
.While the two frameworks are not legally binding, they serve as a baseline whereby an economy or
jurisdiction may adopt and adapt to their local legislation.
Singapore, as a critical regional trade and financial hub in Asia, had its Personal Data Protection Act (SG PDPA)
becoming effective in 2013, a significant regulatory development for the regional economy. In 2016, the European
Union endorsed its General Data Protection Regulation (EU GDPR) (successor to the Data Protection
Directive95/46/EC), which became enforceable in May 2018. It has potent influence on many countries’ privacy
regulations including economies in Asia. Thailand’s Personal Data Protection Act (TH PDPA), which aligns with the
EU GDPR’s approach, has come in full effect since 1 June 2022, as one of the most recent privacy regulations in
Asia.
Through our analysis of the two frameworks and three regulations, we observed that the privacy requirements are
mostly principle-based. We can map regulatory requirements to one or more privacy principles. For example, the
requirement for data subject to be able to exercise their “rights to be forgotten” aligns with the retention
limitation principle. The requirement for organization to publish a Privacy Notice aligns with the openness
principle. The requirement for organization to limit their collection and of personal data aligns with the principle of
purpose specification, use limitation, and collection limitation. Following this observation and the methodology
discussed in Section 3 and depicted in Step 1 of Figure 1, we identified 19 privacy principles that relate to common
data subjects' rights and compliance obligations (see Table 1).
8 See Appendix 3 for a brief description of the OECD privacy principles.
9 See Appendix 4 for a brief description of the APEC privacy principles.
Regulation #1
requirements
Regulation #2
requirements
…
Regulation #N
requirements
OECD Privacy
Guidelines
APEC Privacy
Framework
Privacy Principles #1,
#2, #3, … #P
Principles with Common
Objective #1
Objective #2
Objective #M
…
Technical Measure #1.1
Technical Measure #1.X
…
Technical Measure #2.Y
…
Technical Measure #M.1
Technical Measure #M.2
Technical Measure #M.Z
…
Step 1 – Identify Principles Step 2 - Categorization Step 3 – Determine Technical Measures

6
Table 1: Mapping of data protection principles amongst OECD, APEC, EU GDPR, TH PDPA, and SG PDPA. Legend: O = OECD; A =
APEC; E = EU GDPR; T = TH PDPA; S = SG PDPA.
Data Protection Principles O A E T S
Notice X X X X
Choice X X X X
Consent X X X
Individual participation X X X X
Purpose specification X X X X
Access X X X X
Correction X X X X
Use limitation X X X X X
Retention limitation X X X
Data portability X X
Data residency X X X
Collection limitation X X X X X
Preventing harm (includes Notice) X
Breach notification X X X
Security safeguards X X X X X
Integrity of personal data X
Data quality X X X X
Openness X X X X X
Accountability X X X X X
These privacy principles form the basis for identifying the technical measures applicable to support the compliance
needs of the associated frameworks and regulations.
5 Privacy principles and technical control objectives
Considering the 19 privacy principles in Table 1, each principle has one or more objectives from a technical
perspective. Collectively, they aim to address and reduce the impact of data privacy threats and enable data
subjects to exercise their rights in the collection and use of their personal data. We can group the principles that
have similar or overlapping objectives to ease referencing, identification, and implementation of related technical
measures. Using this method, we identified the purpose specification and use limitation as belonging to similar
group of principles since both share the aim of requiring the use of personal data to be limited to the purpose
specified. Similarly, notice, choice, consent, and individual participation share the aim of ensuring data subject’s
rights to agree/disagree to the processing of their personal data are fully respected and enforceable. Data subjects
need to be notified before they can exercise those rights. We group together access and correction as data
subjects can only request for correction of their personal data after gaining access to them.
Continuing with this approach, we can categorize the groupings into five categories of privacy principles that share
common higher-level objectives. As shown in Table 2, the five categories are:
(1) fulfill data subject’s rights or data controller and processor data processing obligations,
(2) minimize the exposure of personal data,
(3) prepare for response and recovery from data breaches and incidents,
(4) ensure the security and quality of personal data across the data life cycle, and
(5) provide transparency over the functions and processing of personal data to enable data subject’s monitoring,
and data controllers’ improvement of security.
Table 2: Mapping of data protection principles to common technical control objectives
Data Protection Principles Common Data Protection Objectives
Notice, choice, consent, individual participation 1. Fulfill data subject rights
Access and correction
Purpose specification and use limitation

7
Data Protection Principles Common Data Protection Objectives
Retention limitation
Data portability
Data residency
Collection limitation 2. Minimize data exposure
Purpose specification and use limitation
Retention limitation
Preventing harm
Breach notification 3. Prepare for incident response & recovery
Preventing harm
Security safeguards 4. Ensure security and quality of personal data
across the data life cycle
Data integrity and quality
Openness 5. Provide transparency and assurance
Accountability
The preventing harm, purpose specification and use limitation principles exist in more than one category as their
definitions (as detailed in Appendix 3 and 4) include the objectives of the other principles in those categories.
The following sub-sections describe the objectives and discuss their expected outcomes in relation with the privacy
principles that they align with (see Appendix 3 and 4 for descriptions of these principles).
5.1 Fulfill data subject rights
A primary objective of data protection regulations is to protect the rights and safety of data subjects. The privacy
principles in this category, which include notice, choice, consent, individual participation, access and correction,
purpose specification, use limitation, retention limitation, data portability, and data residency share a common
objective of enabling data subjects to exercise their rights in the data life cycle stages. The desired outcome is to
tilt the balance of control from the data controller and processor such that data subjects have better control over
the processing of their personal data. Technical measures in this category should enable data subjects to exercise
their rights and assist organizations to fulfill their obligations more efficiently and effectively in operations, for
example, through automation of the process involve in handling and responding to data subject’s requests.
5.2 Minimize data exposure
The reported breaches highlighted in Section 1 involve many household names. They show that no organization
(large or small) is immune, and that securing digital systems is a challenging endeavor. Organizations faced with
known vulnerabilities need to balance their resources against other priorities, which often leave certain
weaknesses unresolved. Whereas threat actors only need to find one or a few weak points in the system to get
closer to or compromise their target. Cybersecurity professionals and government leaders have been advocating
the concept of “assumed breach” urging organizations to have online systems practice security by design and be
better prepared for such an eventual to minimize losses and reduce negative impact[28]. According to Verizon’s
Annual Data Breach Incident Report (DBIR) 2022[29], over-collection and accumulation of personal data beyond
their consented purposes was one problem that account for the massive breaches. Data minimization, which is one
of the technical measures that the EU GDPR advocates in the data protection by design and by default
requirements10
aligns with the design perspective of the concept of “assumed breach”. By minimizing the data
being processed, the objective is to reduce data exposure should the system or user be compromised. The
outcome desired is minimizing the potential losses and impact to the data subjects. These are the objective and
desired outcomes shared by the collection limitation, use limitation, retention limitation, purpose specification,
and preventing harm principles.
10 See Article 25 of EU GDPR on “Data protection by design and by default”.

8
5.3 Prepare for incident response & recovery
There is no absolute security. Following the concept of “assumed breach” highlighted in Section 5.2, an
organization’s preparation and readiness for incident response and recovery is critical for minimizing losses and
reducing impact to data subjects[30].
In addition, to meet regulatory requirements for data breach incident reporting and notification, organizations
typically have a certain timeframe (such as up to 72 hours) under a regulation to decide whether to report. Within
the timeframe, they must conduct an initial investigation and find out enough information to provide an
estimation of what has been compromised, whose (data subjects) data is involved, the amount of data involved,
and the value of those data to assess the risk and impact to the data subjects involved. An informed estimation of
the size and significance of the breach is essential for the reporting decision to be made, while the actual
investigation may continue over a longer period. Without preparation, organizations may miss the reporting
deadline, and cannot assess the losses and impact accurately, which could result in severe penalties from the
regulator, and loss of confidence of their customers. Preparing for responding to data breaches is therefore as
important as defending against them.
The preparation process, which should include conducting desktop exercises and scenario drills involving
personnel and systems will further identify potential weaknesses or gaps in the other four categories. This will help
strengthen the relevant programs for achieving the objectives of these privacy principles.
5.4 Ensure the security and quality of personal data across the data life cycle
One common thread across breaches studied in the Verizon’s DBIR[29] is security lapses regardless of using public
cloud or not, with top causes as inadequate security safeguards of the organization’s network and applications,
misconfiguration, and mis-delivery of information because of human errors. Data compromised in organization’s
internal data breaches include personal (81 percent), medical (18 percent), and financial (8 percent). The study
finds that data breaches are more prominent in on-premises IT environments given the bigger volume and longer
history of hosting personal data. In a separate report, the European Agency for Cybersecurity, ENISA’s analysis of
623 ransomware incidents shows that 58.2 percent of more than 136 terabytes of stolen data contains GDPR
personal data. This includes protected health information (PHI), passport and visas data, addresses, and COVID-19
status[29]. These findings further instill the importance of security in the protection of personal data against
unauthorized and inappropriate processing. The objectives of the security safeguard principle align with the data
integrity and data quality principles from the perspective of ensuring data confidentiality, integrity, and availability.
5.5 Provide transparency and assurance
Besides implementing and achieving the objectives of the privacy principles in the above categories, organizations
must be transparent in its policies and practices and provide assurance to demonstrate compliance. These are the
desired outcome of the principles in this category, which are openness and accountability. The objective of
openness is to provide transparency to data subjects with regards to the protection that organizations accord to
personal data. The objective of accountability is to ensure that organizations comply with measures that give effect
to the data privacy principles promulgated in the applicable regulations and frameworks.
Section 6 below describes the technical measures applicable for the implementing five categories of privacy
principles in the cloud. Where applicable, we include additional guidance for considerations in the design,
implementation, and operation of the technical measures.
6 Applicable technical measures
Based on the five categories of privacy principles and their common objectives in Table 2 and Section 5 above, our
next step (as discussed in Section 3 and shown in Figure 1) is to identify the technical measures that can be used to
implementing data privacy in the cloud. To do this, we use the data life cycle model (see Figure 2), which is a
common approach for designing privacy controls to identify the technical measures applicable at each stage of the
data life cycle, from collection to deletion. A data-centric approach using the data life cycle model ensures
organizations adequately address their data protection requirements throughout the data life cycle. A typical data

9
life cycle comprises six stages: (1) Collection, (2) Storage, (3) Access and use, (4) Transfer, share, or disclose, (5)
Archival, and (6) Deletion (as shown Figure 2).
Figure 2: Data life cycle stages
Table 3 summarizes the list of technical measures applicable for achieving the control objectives, developed based
on our execution of Step 3 depicted in Figure 1, as discussed in Section 3.
Table 3: Identification of underlying applicable technical measures over the data life cycle stages for meeting common technical
control objectives. Legend: C = Collection; S = Storage; U = Use and access; T = Transfer or share; A = Archival; D = Deletion.
Categories of Data
Protection Principles
Data Protection Principles Applicable Technical Measures C S U T A D
1. Fulfill data subject
rights
• Notice, choice, consent, individual
participation
• Access and correction
• Purpose specification and use
limitation
• Retention limitation
• Data portability
• Data residency
• Automated decision making
1.1 Automated notification X X X X X X
1.2 Consent administration and
tracking
X X X X X X
1.3 Data map/inventory X X X X X
1.4 Identity and access control with
just-in-time provisioning
X X X X X X
1.5 Portable data format standards X X
1.6 Portable data encryption X X
1.7 Location restriction X X X X X
1.8 Access to automated decision-
making data
X X X X X X
2. Minimize data
exposure
• Collection limitation
limitation
• Preventing harm
2.1 Data minimization X X X X X
2.2 De-identification X X
2.3 Data encryption X X X X X X
2.4 Disclosure control X X
2.5 Privacy preserving encryption X X X X
2.6 Secure data destruction X
3. Prepare for
incident response &
recovery
• Breach notification
• Preventing harm
3.1 Event logging X X X X X X
3.2 Incident response readiness X X X
3.3 Automated notification X X X X X X
3.4 High-availability resilience
architecture
X X X X
3.5 Backups X X
4. Security and
quality
• Security safeguards
• Preventing harm
• Data integrity and quality
4.1 Identity and access control X X X X X
4.2 Privilege management X X X X
4.4 Data integrity mechanisms X X X X X
4.5 Code integrity X X X X X X
4.6 Anti-malware / Threat detection X X

10
Categories of Data
4.7 Vulnerability management X X
5. Transparency and
assurance
• Openness
• Transparency
participation
• Accountability
5.1 Data map X X X X X X
5.2 Continuous governance X X X X X X
5.3 Compliance assessment,
attestation, and certifications
X X X X X X
5.4 Data lineage X X X X X X
5.5 Automated reasoning and formal
verification
X X X X X X
In the sub-sections below, we describe the technical measures and related objectives, and the applicable cloud
capabilities that Cloud Users can leverage to implement required control in the cloud.
6.1 Fulfilling data subject’s rights
This category of privacy principles is a primary responsibility of a Cloud User who uses personal data as a data
controller. It will also involve data processors who are engaged by a data controller in relevant processing
activities. The overarching objective is to embed privacy features and controls in systems supporting online
interactions with data subjects and fulfilling their privacy rights. As shown in Table 4Error! Reference source not
found., there are eight technical measures that are applicable to organizations when designing applications to fulfil
the 12 data subject rights’ related data protection principles.
Table 4: Technical controls applicable to fulfill data subject rights obligations in the data life cycle
Categories of Data
1. Fulfill data subject
rights
participation
• Access and correction
limitation
• Data portability
• Data residency
• Automated decision making
1.1 Automated notification X X X X
1.2 Consent administration and
tracking
X X X X X X
1.3 Data map X X X X X
1.4 Identity and access control with
just-in-time provisioning
X X X X X X
1.5 Portable data format standards X X
1.6 Portable data encryption X X
1.7 Location control X X X X X
1.8 Access to automated decision-
making data
X X X X X X
6.1.1 Automated notification
Objective: To establish and maintain capabilities to automatically notify data subject, Data Protection Officer
(DPO), incident responders, and/or third-party processors in accordance with the personal data related events
such as change of consent agreement, scope of use, transfer, sharing, and disclosure, retention period, and breach
notification.
Organizations processing personal data are required to notify data subjects about the purposes of collecting, using,
and disclosure during the collection. They also need to notify data subjects when the purpose, use, or disclosure
practices change materially. Cloud Users can use cloud services such as Amazon Simple Notification Services
(SNS)[31] triggered by the Amazon Lambda[32] service to send a reminder to the DPO or direct notification
messages to data subjects automatically as part of the consent seeking process, with event logs captured in a
database system or encrypted cloud storage to provide evidence for compliance.
In case of a data breach incident, once the data controller assessed that the breach is likely to cause risk of serious
harm on the data subject’s privacy, safety, or freedom, they are required to notify the affected data subjects to
prevent harm. Cloud Users must notify the data protection regulatory authority first, within 72 hours of initial
discovery of the breach. In preparation for incident response, system designers should incorporate such
requirements for breach notification as part of the automation as a reminder alert with regular follow-up 72-hour

11
countdown messages to the DPO when the security team activate the data breach incident response process. See
Section 6.3.2 for further discussion on incident response readiness.
While certain regulations such as the TH PDPA have not included the need for a data protection impact assessment
(DPIA) to be conducted, the condition for reporting to the regulatory authority, and notification to the data
subjects are consistent with EU GDPR and SG PDPA requirements.
6.1.2 Consent administration and tracking
Objective: To establish and maintain capabilities to enable data subjects’ individual autonomy and control
processing of their personal data. Individual autonomy means enabling data subjects to exercise their rights over
their personal data that is kept and processed by the data controller and data processor.
As data controllers implement systems to enable data subjects to submit request to exercise their rights such as
request for access, correction, deletion, and objection to the use, transfer, and/or sharing of their personal data11
.
Each request will trigger a chain of events within the controller or processor’s cloud-based systems. To enable
individual autonomy, organizations will need to develop supporting capabilities for each type of data subject
request.
Data subjects’ consent is central to the data life cycle process. Organizations’ systems processing personal data
need to capture initial consent and allow updates later. Consent administration provides the capability for users to
accept and change their agreement and related scope of processing of their personal data. When data subjects
change their mind, the system can automatically send notification messages (as discussed in Section 6.1.1) to
related applications and trigger the deletion or revocation of access to the data. For example, data subjects may
change their communication preferences for direct marketing and the consent they previously provided, including
the withdrawal of consent, changes to the level of restriction for the collection, use, disclosure, transfer, sharing or
retention, or objection to the processing of their personal data. Organizations’ systems must trace the locations
and collate the data involved and present the affected information for data subjects’ confirmation before
executing the data subject’s decision. Consent tracking is the monitoring of changes to data subjects’ consents. An
individual user’s request may also be logged to enable accountability but will also need to be appropriately de-
identified (see Section 6.2.2).
These technical measures are fundamental for providing data subjects with individual autonomy to exercise their
rights and control over their personal data.
6.1.3 Data map
Objective: To provide visibility on the processing of personal data in the organization by capturing and maintaining
an up-to-date record of the data elements, processing activities, location, flow, access authorization, and security
safeguards implemented to ensure personal data confidentiality, integrity (including quality) and availability across
the data life cycle.
The data map[33] (also known as data inventory, dataflow map, and metadata catalog) is a fundamental tool that
Cloud Users can use for gaining visibility to the data in the organization, and data that are shared with third
parties. It is also an essential tool for enabling the technical measures required for fulfilling data subject rights,
including automated notification (Section 6.1.1), consent administration and tracking (Section 6.1.2), identity and
access management (Section 6.1.4), providing data portability (Section 6.1.5) and data security and quality
(Sections 6.1.6 and 6.4), enforcing data location restriction (Section 6.1.7), and explaining automated decision-
making (Section 6.1.8).
Besides supporting the organization’s fulfillment of its obligations on data subject’s rights, the organization’s data
map is an essential component for implementing the technical measures required for minimizing data exposure
(Section 6.2) and providing transparency (Section 6.5) to its processing activities to demonstrate regulatory
compliance. For examples, Section 39 and 40 in TH PDPA require record keeping for data processing, like the data
11 The collection, storage, access, use, transfer, sharing, disclosure, archival, and deletion of personal data is collectively known
as processing of personal data in data protection regulations and literatures.

12
map and Record of Processing Activities (RoPA) requirement in the EU GDPR12
. Without a data map, organizations
will find it inefficient in its creation and maintenance of the RoPA required to provide transparency to data
subjects and regulators, and accountability to their processing actions on personal data.
Systems designers may leverage cloud services with automated data discovery and data classification capabilities
that are available in most cloud platforms to create and maintain the organization’s data map, alleviating the pain
from the manual mapping of an extensive set of data stores or data sources used in the organization. Automation
is necessary to reduce friction and efficiently execute the chain of follow-up actions from request to completion,
and across the data life cycle. Examples of cloud services that enable automated data discovery, classification, and
mapping include Ethyca[34], BigID[35], and PKWARE[36], which are available in major CSP platforms, supporting
both cloud and on-premises data governance needs.
6.1.4 Identity and access management with Just-in-Time access provisioning
Objective: To authenticate a data subject’s identity before granting access and executing data subject’s request to
protect the data subject against identity fraud and personal data against unauthorized access.
An identity and access management (IAM) system is fundamental to authenticating a data subject’s identity before
granting permissions to request and execute desired changes. As data subject requests are infrequent and the
volume of personal data records processed by the organization may be substantial, pre-registering every data
subject in anticipation of their request for exercising their rights may not be practical or cost-efficient. A more
pragmatic approach is to implement a just-in-time access provisioning system that makes use of the organization’s
knowledge about the data subject to verify the identity of the requestor (authentication) before granting access to
his/her personal data. The application system providing such access may integrate with cloud services such as
Amazon Cognito[37], Google Identity Platform[38], Microsoft Azure Directory B2C[39] and IBM Security Verify[40].
To enable data subject access to their personal data, Cloud Users must have a data inventory or map of where the
personal data lives and be able to present them in a format that the data subject can read and examine for
correctness and accuracy (see Section 6.1.3).
6.1.5 Portable data format standards
Objective: To ensure portable data format standards are used in web and mobile applications processing personal
data to enable safe transfer to other web and mobile application systems desired by data subject.
Data subjects may request their data to be ported to another service provider’s platform or application. While the
CSP may adhere and support data portability at the infrastructure, platform, or software services layer13
where an
organization’s applications operate, the organization may not meet a specific data subject’s requirement at the
application service layer, such as the mobile or web application that data subjects interact with directly. This is
because these user-facing applications may use a proprietary data format that the receiving applications may not
be able to interpret and ingest. To ensure data portability at the application level, systems must be able to export a
data subject’s personal data in a commonly used machine-readable format (e.g., JSON or XML) for transmission
and import to the receiving controller’s systems.
In addition to requesting porting of personal data to another application system or platform, applications
operating in different platforms may require a data subject’s identity information for verification or other
purposes—such as personalization—before granting access to the application resources. To enhance data subject’s
identity protection, organization as data controller should consider implementing an identity and access
management system that supports federated identity standard such as the Open Authorization (OAuth)
standard[41, 42] that enables cross domains Single Sign-On (SSO) for data subject’s identity verification and access
12 For more information on Article 30 of EU GDPR that relates to RoPA, see: https://gdpr.eu/article-30-records-of-processing-activities/
13 The European Commission has developed a set of Data Portability Codes of Conduct known as “Switching Cloud Providers and Porting Data”
(SWIPO). The SWIPO codes of conduct is voluntary and provides guidance on for the application of Article 6 of EU Free Flow of Non-Personal
Data Regulation. Major CSPs have signed up as member and declared adherence to these codes. For more information, see
https://swipo.eu/current-swipo-code-adherences/.

13
authorization without unnecessarily exposing their personal data to other service providers, including data
processors engaged by the data controller.
6.1.6 Portable data encryption
Objective: To ensure the confidentiality of personal data during data portability.
To enable transfer or porting of personal data from one application to another operating in a different platform,
when data encryption is used, organization must decrypt data subject’s data before exporting them to a portable
data format. Data confidentiality will be exposed should the application simply transfer the decrypted data to the
receiving application. To ensure confidentiality of personal data remains protected, system designer may use a
public key cryptography system to re-encrypt the personal data exported in the portable format with the receiving
application’s public key. Similarly, application designer designing for receiving data from third-party applications
should publish their public key online along with the data format and application programming interface (API) that
the sending application may use to ensure end-to-end portability with confidentiality protection[43].
System designers should also consider the need for key escrow and portability provision in the key management
system to facilitate data loss prevention, and disaster recovery in addition to the data portability requirement. For
example, see [44].
6.1.7 Location restriction
Objective: To ensure the processing of personal data by users and application systems is restricted to authorized
locations. The locations include the originating and destination geographical location where processing is initiated
and terminated.
When designing for data storage and transfer (or sharing) requirements, system designers need to determine the
physical data storage location, and whether the destinations for sharing or transfers meet the data residency
requirements for their organization under the data protection law of the country or countries. Data location
restriction is a feature configurable in most public cloud services platforms.
Systems architects and application designers should note that data location restriction is a technical measure that
avoids the risk of data exposure to geographical locations deemed less secure. It does not align with the principle
of data minimization and does not seek to reduce or minimize data exposure since implementing such a measure
will physically isolate the data from those locations that are deemed not suitable or permitted for the processing.
Often, this measure will limit data processing to very few or even to a single geographic location. This increases the
risks of a location-focused denial of service attacks and physical environmental disasters. When data location
restriction is used as a technical measure, the trade-off from an application architecture design perspective, is the
elimination of the option to leverage cross-geographical locations multi-region distributed systems architecture
design that will provide higher application and data resiliency against such risks. Application designers need to
weigh the risks of using data location restriction against the needs and benefits of higher systems and data
resiliency.
In some situations, it may be necessary for applications to use cross-geographic locations multi-region distributed
systems architecture to meet a higher level of resilience. The risk of data exposure can be minimized with the use
of secure logical separation[45] measures that are available in major cloud platforms. In such cases, where
regulatory compliance may be a concern, system designers may consider working with their legal counsel to get
approval from the regulator, consent from the data subjects or other valid mechanisms (such as certifications and
so on) to enable such architectural implementation.
6.1.8 Access to automated decision-making data
Objective: To enable data subject’s consent management and provide access to automated decision-making data
in the use of AI/ML capabilities that process personal data.
Under the EU GDPR, organizations are prohibited from using AI/ML to process EU residents’ personal data for
automate decision making purposes, unless data subject’s consent has been obtained beforehand. System
designers therefore must ensure that the application provides notification and get the consent from data subjects

14
when and/or before their data is collected for such purposes. Data subjects may also request for automated
decisions to be explained as a condition for their consent. Cloud applications using AI/ML for personal data
processing will need to provide data lineage to support such a request. For more information on data lineage
related measures and cloud services, see Section 6.5.4.
6.2 Minimizing personal data exposure
This category of data privacy principles includes collection limitation, use limitation, retention limitation, purpose
specification, and preventing harm is to reduce data exposure should the system or user be compromised. The
outcome desired in these privacy principles is minimizing the potential losses and impact to the data subject’s
privacy, freedom, and safety. We identified six technical measures that are applicable to this category (see Table
5).
Table 5: Technical controls applicable to minimize personal data exposure in the data life cycle.
Categories of Data
Data Protection Principles Applicable Technical C S U T A D
1. Minimize data
exposure
• Collection limitation
limitation
• Preventing harm
1.1 Data minimization X X X X X
1.2 De-identification X X
1.4 Disclosure control X X
1.5 Privacy preserving encryption X X X X
1.6 Secure data destruction X
6.2.1 Data minimization
Objective: To ensure application systems only process personal data that are compatible with the purpose
specified and consented by the data subject.
Data minimization has a wide span of influence throughout the data life cycle stages. System designers should see
it as a primary technical measure and apply it upfront in the data collection stage to limit the amount of data a
system will collect. The amount of data collected will determine the data storage, the data for access, use, transfer,
and archival to meet contractual and compliance requirements. System designers need to identify the types,
amount, location, and processing needs of personal data in the organization. Creating a data map (see Section
6.1.3) is essential during the process to ensure full visibility of the data location, flows, and processing activities
throughout the data life cycle.
Considering the data collection stage, system designers may use user interface controls for the application, such as
radio buttons, dialog boxes with pull-down list to limit the content data subjects (application users) can provide. By
limiting the use of free-form text fields, we could prevent data subjects from entering excessive sensitive contents
or personal data elements. Implementing text length limit and input validation will prevent over-collection of
personal data and serve as security guardrails against input fields related attacks such as buffer overflow and SQL
injection. Cloud services for automated data classification[33-36] and language processing[46, 47] can help to filter
text input for sensitive contents and personal data and trigger actions such as review and correction to enforce
additional data protection rules given the increased data sensitivity.
6.2.2 De-identification
Objective: To transform personal data into non-personally identifiable data to minimize data exposure and
violation of data subjects’ rights.
Post data collection, and prior to storage or archival, personal data may be transformed further by removing or
obscuring personal data and/or personal identifying data elements prior to the use, transfer, disclosure and/or

15
sharing. Commonly known as de-identification[48], it transforms data with two options: (1) pseudonymization14
,
and (2) anonymization15
. Pseudonymization, also known as tokenization, involves replacing personal data and/or
personal identifying data elements in such a way that the data can re-identify individuals with the help of an
identifier (i.e., a data element that links the personal data to the pseudo-identity of the data record).
Anonymization eliminates the possibility for re-identification by replacing personally identifiable data elements
(e.g., name, passport number, mobile number, home address) with unrelated or randomly generated data.
In situations where the application output does not require personal identifiers or related data elements, systems
may apply data redaction technique[46] to block the view or access of such identifiers or data elements. For
example, when an application collates event logs for analysis, the application may use an automated data
classification cloud service to identify personal data and perform redaction before ingesting it to a data store for
the analysts to access.
When processing highly sensitive personal data, data decomposition is another option. It involves a process that
reduces data sets into unrecognizable elements that have no significance on their own. The application then stores
these elements or fragments in a distributed virtual network in the cloud so that any compromise in one node
would yield only an insignificant data fragment. This technique requires a threat actor to compromise all nodes,
get all data fragments, and know the algorithm (or fragmentation scheme) to piece together the data coherently.
Public cloud virtualized network capability, such as the Amazon Virtual Private Cloud (VPC)[45] provides a native
environment ideal for implementing such a measure.
6.2.3 Data encryption
Objective: To protect the confidentiality and integrity of personal data against unauthorized access, use, and
disclosure during data at rest and in-transit and minimize data exposure should the system or users be
compromised.
A critical control at the collection and transfer (where transmission happens) and storage stage of the data life
cycle is data encryption. Cloud Users need to reduce the risk of unauthorized access, use, and disclosure by
application users and admins, and from the CSPs. System designers for individual application should use its own
unique encryption key and implement granular access control to keys for enforcing separation of duties between
applications, and between key users and key admins. For defending against threats from the CSP operators, Cloud
Users can “bring your own” key materials to be used for encryption and self-manage keys.
Major CSPs provide cloud-based cryptographic services that enable data to be encrypted using strong
cryptographic algorithms on-disk by default within their data centers. Cloud Users normally can configure such
cryptographic services in their cloud storage and database services. Specific CSP may or may not enable the
encryption by default. CSPs provide options for Cloud Users to choose CSP-managed, or self-managed
cryptographic keys with cloud-based hardware security modules (HSM) for such encryption. The self-managed key
option, also known as “Bring-Your-Own-Key” (BYOK) will provide an added layer of control and more flexibility in
choosing key generation techniques and key management practices that they prefer, at a cost of internal
management overhead and complexity[43, 49-51].
Beyond the CSP-provided cloud-based encryption and key management services for data at rest, system designers
also need to implement network and application layer encryption to protect data in transit. Network level
encryption can be enforced using virtual private network (VPN) between the application server and the receiver’s
client end-point system, either using the Transport Layer Security (TLS) or IP Security (IPSec) protocol
implementation. Applying encryption on data element or data record level provides added disclosure restriction
14 In Article 4(5) EU GDPR, 'pseudonymization' means the processing of personal data in such a manner that the personal data can no longer be
attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and
is subject to technical and organizational measures to ensure that the personal data are not attributed to an identified or identifiable natural
person.
15 EU GDPR Recital 26 defines anonymous information, as ‘…information which does not relate to an identified or identifiable natural person or
to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable’.
The GDPR does not apply to anonymized information.

16
and access control. However, such application-level control will have a performance trade-off. For example, Big
Data analytic applications may suffer a degraded performance with application layer encryption, while this
additional protection may not be necessary if they already apply data minimization and de-identification controls
prior to data injection for the analytic processing and/or other controls (i.e., access control and so on) reduced the
privacy risks to be lower than the threshold.
When implementing cryptographic mechanisms or using cryptographic services, whether it is in the cloud or on
premises, application designers and developers should take note of the common code design and coding errors
that could result in data leakages and application failures. For examples, see [52].
6.2.4 Disclosure control
Objective: To ensure application systems do not provide access to more data than required to fulfill its functional
purposes in relation to the use of its features or services by users or other applications.
Disclosure control is a measure to enforce the use limitation, purpose specification, and preventing harm
principles. Disclosure control may be achieved using fine-grained access control, role-based access control, and
resource-based access control. In database systems, disclosure control may be implemented with the use of
database views based on individual user’s authorization by their identity or role.
Data encryption also serves as access control, and just like access control, it provides for disclosure control from a
data-in-use perspective.
Besides considering the access permission and control policy on individual principal and data resources in the
system, system designers should establish coding standards and conduct code reviews for application and
infrastructure code during the development lifecycle to prevent data leakage. Typical vulnerabilities include poor
memory management, passing of unredacted or personally identifiable data contents to unauthorized functions or
external APIs, and mis-configured cloud services. For example, in a healthcare clinic system, the patient diagnostic
application that captures sensitive medical information only passes the patient’s name and contact number to the
appointment scheduler module for the clerk to schedule follow-up appointments for the patient.
6.2.5 Privacy preserving encryption
Objective: To ensure the confidentiality and integrity of highly sensitive personal data remain encrypted during
processing.
Where highly sensitive personal data is involved, to reduce the risk of unauthorized access and the resulting harm
to data subjects, privacy preserving encryption techniques such as Homomorphic Encryption and Secure Multi-
Party Computation may be applied as an enhanced disclosure control mechanism to fulfill such a need. At the
point of this study, such Cryptographic Computing techniques are still an emerging technology that is not
commercially available on public cloud platforms. Major CSPs have, however, announced investments in research
and development of in such technology area[53, 54], which system designers may monitor and adopt when
available.
6.2.6 Secure destruction
Objective: To ensure personal data is not recoverable after deletion.
At the deletion stage of the data life cycle, system designers must ensure that there is no remanence of personal
data that is recoverable once a data element, record, file, or database has been deleted. An effective method for
secure destruction is to apply data encryption and securely discard the data by re-encrypting or overriding it with
either random bits or all-zeros over the encryption key used. Applications also use similar random and/or all-zeros
bit-by-bit overriding technique for secure erasure, but this may have a performance impact if involving a large
amount of data.
System designers should note that transient data may contain personal data in a database application
environment and ensure their secure removal after use. In cloud database systems such as Amazon Dynamo DB,

17
system designers may set a low time-to-live configuration so that transient data gets deleted automatically upon
completion of transaction.
6.3 Preparation for responding to data breaches
This category of data privacy principles includes breach notification and preventing harm. The outcome desired is
timely reporting and notification to relevant regulatory authorities and notification of data subject should the
Cloud User or its associated data processors encounter a high-severity data breach incident.
Table 6 identifies a minimum list of applicable technical measures that Cloud Users should implement as part of
their preparation for incident response and recovery. Note that these are minimum applicable measures. To
develop a complete plan, Cloud Users need to consider different incident scenarios and develop response
playbooks and conduct simulations regularly to improve readiness. For an example of incident response
preparation in the cloud, see [55].
Table 6: Technical controls applicable to preparation for data breach incidents and the related data life cycle stages
Category of Data
5. Prepare for
incident response &
recovery
• Breach notification
• Preventing harm
5.1 Event logging X X X X X X
5.2 Incident response readiness X X X
5.3 High-availability resilience
architecture
X X X X
5.4 Backups X X
6.3.1 Event logging
Objective: To enable identification, analysis, notification, and investigation of malicious and unauthorized access to
application systems and processing of personal data.
Event logging is a fundamental technical measure for enabling visibility of the application environment in
operation. It is a required measure for providing security safeguards, including monitoring, detection, and respond
to security and privacy-impacting events in the application system. Event logs are essential for providing the meta-
data of applications and principals’ activities from data governance to incident investigation.
Cloud Users can use services such as AWS CloudTrail[56] and Google Cloud Audit Log[57] in the respective cloud
platform to log every API call and associated events in every cloud account, including data transfer to external
systems, or data stores accessed by third-party applications or users. Event logs processing and consolidation are
important for events analysis and detection of abnormal events and usages. Cloud Users can centralize the event
logs for monitoring and analysis with required retention and backup to ensure availability and durability.
Cloud services such as Amazon GuardDuty[58], Amazon CloudWatch[59] and Google Security Command Center[60]
are examples of capabilities that enable automated detection of malicious or suspicious activities in the cloud
environment. Cloud Users must also consider and implement required personal data de-identification when logs
need to be transferred to different geo-locations that have data residency restriction imposed by the data
subject’s country. For example, see [61].
6.3.2 Incident response readiness
Objective: To ensure that the organization that processes personal data are is adequately prepared and ready to
respond to data breach incidents upon their occurrence.
The key to incident response readiness is to contain the incident so that the blast radius of impact is minimized. It
also enables investigation to identify the causes of the incident and the data and resources that have been
affected. The former is to identify both short-term mitigations, and longer-term remediation to prevent a
recurrence of the incident. The latter is to enable valuation of the data and resource assets affected so that the
potential risk and impact to the data subjects can be qualified.

18
To prepare for containment and investigation, in the cloud environment, incident isolation virtual network (IIVN)
can be pre-architected and store as architecture code (e.g., using Amazon’s CloudFormation[62] template). Such
IIVN can be instantiated as part of the incident response process to move affected or suspicious servers and other
resources into the virtual network segment configured with heightened monitoring and logging to permit close
observation and in-depth investigation and containing the threats within. Affected cloud compute and storage
instances may be tagged and isolated automatically triggered by the detection of suspicious events[63].
To aid investigation, data breach impact assessment, and subsequent recovery, snapshots of critical server
instances may be taken at a regular interval These are images of those servers in operating conditions, which
enable in-depth investigation and digital forensic to be performed to identify changes over a period before the
incident detection and identify root or near-root causes and follow-up mitigation measures to be applied. For
more examples on preparing for forensic investigation in the cloud, see [64, 65].
6.3.3 High-availability resilience architecture
Objective: To ensure application systems that process personal data maintain a high level of availability to meet its
processing purposes, and resilience against potential malicious denial of service and natural disaster events.
Beyond readiness to response, organization should design for high availability and resiliency. In the cloud
environment, organization can leverage the highly available and resilience network and data infrastructure to
architect high availability applications that span across server and storage systems in multiple data centers in the
same country, or across different geographies. In the latter configuration, system designers must consider the data
location restriction and ensure personal data de-identification or redaction is used if such architecture is used
between regional locations that have data localization requirements.
As discussed in Section 6.1.7, application and data resilience should be considered with data location restriction
requirements and risk-managed using technical measures such as secure logical separation[45]. The need to
ensure regulatory compliance aside, in some situations, ensuring application and data resilience may outweigh the
risk of exposing to location-specific threats. it may be necessary for application systems to use cross-geographical
locations multi-region distributed systems architecture together with secure logical separation to achieve the
higher resilience requirement. In such cases, system designers may consider working with their legal counsel to get
exceptional approval from the regulator or consent from the data subjects to for high resilience.
6.3.4 Backups
Objective: To ensure personal data is recoverable to meet the organization or application systems recovery point
objective (RPO) and recovery time objective (RTO) after their destruction or corruption because of a malicious or
disastrous event.
Murphy’s Law states that anything that can fail will. Implementing data and application systems backups is
necessary to enable recovery from such eventual failure. The backup and recovery strategy must be based on the
organization or application systems RPO and RTO, which determines the interval for backups to be taken, and the
time required for recovery to be completed, respectively[66]. For examples of disaster recovery architecture
strategies in the cloud, see [67, 68]
System designers must also design-in the backup testing and recoverability verification processes to ensure they
are effective and reliable in achieving the required RPO[69]. Cloud Users should also consider data encryption and
integrity requirements in the design of personal data backup systems and solutions to ensure end-to-end
compliance with the data quality and data security principles[70]. For data that require a longer term retention
period, immutable cloud storage solution may be considered[71].
6.4 Ensure the security and quality of personal data across the data life cycle
This category of data protection principles includes security safeguards, preventing harm, and data integrity and
quality. Security is at the core of personal data protection. Without security, personal data can be accessed, used,
transferred, disclosed, or changed without authorization. So, it will compromise data quality and integrity. Threats
of unauthorized access come from different sources and on multiple technology layers. The applicable measures

19
include data-level protection, as well as for network, systems, applications, and users. These include preventative
measures for normal operations for preventing harm to data subjects and detective measures to ensure early
detection of malicious and undesirable events, minimizing the negative impact on Cloud Users (see Table 7). The
technical measures for preparing for incident response and recovery (as discussed in Sections 6.3) form part of the
overall security safeguards.
Table 7: Technical controls applicable to security safeguards in the data life cycle
Categories of Data
4. Security and
quality
• Security safeguards
• Preventing harm
• Data integrity and quality
4.1 Identity and access control X X X X X
4.2 Privilege management X X X X
4.4 Data integrity mechanisms X X X X X
4.5 Code integrity X X X X X X
4.6 Anti-malware / Threat detection X X
4.7 Vulnerability management X X
6.4.1 Identity and Access Management (IAM)
Objective: To ensure users and applications are authentic and authorized before permitting their processing of
personal data.
Section 6.1.4 discusses the need for IAM with just-in-time access provisioning capability to enable data subject’s
exercise of individual autonomy to their personal data. IAM as a technical measure also plays a key role in the
enablement of access control and enforcement of disclosure control (Section 6.2.4) in the data access, use, and
sharing/transfer stages for all principals (including users and application systems) processing personal data.
Through the identification of authorized principals and resources and assignment of roles and permission policies,
IAM enables the security team to identify malicious actors and unauthorized devices and resources in the network
environment.
Cloud IAM service is common across all major CSPs, with capabilities to support multi-factor authentication (MFA),
identity-based access control, and resource-based access control. Identity-based policies are attached to a user,
group, or role to specify what that identity can do in the system[37-39]. Resource-based policies are attached to a
resource, e.g., a data object, a file folder, a process queue, etc., specifying who has access to the resources and
what actions they can perform on it. Find different IAM policies and their use in [72]. Cloud Users must also ensure
security of identity accounts. As reported in Verizon’s DBIR[29], stolen credential is one of the four primary
security failures. In 2021, Microsoft detected and blocked more than 25.6 billion attempts to hijack enterprise
customer accounts by brute-forcing stolen passwords[73]. Cloud services such as the Amazon IAM supports multi-
factor authentications (MFA) with multiple MFA devices that further protect user accounts against unauthorized
use[74].
6.4.2 Privilege management
Objective: To restrict the use of privileged identity and access on a need-to-use basis, and ensure their use are
temporal and fully accountable.
Privileged access refers to roles in the organization that have access levels above that of a standard user. It is
required mainly for critical administration tasks in the network or cloud environment. For examples, Root or
Superuser account, Domain administration account, and emergency account. Privileged access can be assigned to
human users, applications, and machine identities. Resource or system accounts may include service accounts, and
database accounts. Managing privileged identities and accounts in the organization is a critical task in the
organization. The principle of least privilege is the foundation to prevent assigning excessive permissions to roles
and applications. Any legitimate use of privileged identities and access must be minimized and monitored closely
as their compromise can cause systems compromise and data exposure.
Cloud Users’ system designers should use escalation management system to manage and track the approval,
provisioning, and revocation of tightly scoped permissions with only necessary timeframe, high visibility, and

20
immutable accountability, for example, resolving a severity issue in system configuration or deploying an
emergency patch to address a critical vulnerability. Cloud IAM services normally provide privilege management
capabilities, including support for temporary elevated access needs[75]. Alternatively, organization may also
implement third-party vendor supported privilege management solutions to enable cross-platform management
from on-premises to the cloud environment[76].
6.4.3 Data encryption
Objective: To protect the confidentiality and integrity of personal data against unauthorized access, use, and
disclosure during data at rest and in-transit and minimize data exposure should the system or users be
compromised.
Data encryption is an important technical measure in the toolbox of mechanisms for providing the required
security safeguards on personal data and throughout the data life cycle stages. Besides protecting data
confidentiality and integrity (as discussed in Sections 6.2.3 and 6.4.4), it serves as an alternative or added
mechanism for enforcing access control, and minimize the risk of data exposure.
6.4.4 Data integrity
Objective: To protect the integrity of personal data against unauthorized modification during data at rest and in-
transit and enable verification of the authenticity of the original data.
Data integrity is about protecting data against unauthorized modification. Data integrity is an important attribute
of data quality. It means that the data is complete and accurate. Implementing data integrity measures enables
organizations to verify the authenticity of the original data so it can be trusted for onward processing needs. For
example, in a data analytic systems data integrity mechanism enables automated checks to ensure data quality
and reliability as part of the data cleaning process when data is ingested into the application. When data is ported
to another application or platform, the receiving entity could also use the data integrity mechanism to verify the
data authenticity, and hence its quality and reliability[77-79].
Data integrity control includes digital signature and cryptographic hash function (also known as message
authentication code) that are computed using a cryptographic key and related integrity enabling algorithm. Data
integrity control should be applied at the point of data collection, and whenever data gets updated. The integrity
code or digital signature should be paired with the original data to enable the integrity of the data to be verified
during the access, use, transfer, or sharing stages of the life cycle[80, 81].
Applications may also use encryption (as discussed in Sections 6.2.3 and 6.4.3) for ensuring data integrity. When a
perpetrator altered encrypted data, it will not decrypt into its original clear text, which enables such alteration to
be detected. If on-demand data integrity verification is required, a copy of the encrypted object will have to be
made available for decryption to verify that a data object is the same as the original data in the encrypted object.
Major CSPs offer cryptographic services that enable applications to provide personal data integrity throughout the
data life cycle. Through integration with other cloud services that provides for data filtering checks and message-
based functions, applications can automate data integrity and data quality verification as part of the data flow,
before data storage, and prior to data use or transfer by other connected application services[82, 83].
6.4.5 Code integrity
Objective: To ensure application code that process personal data is authorized and has gone through proper
testing and security reviews before transitioning into the production environment.
The integrity of application code is paramount to ensure the security of personal data. In the cloud environment,
common DevOps and CI/CD practices will create, change, and move code iteratively between the development and
operation environment. Modern applications can have multiple releases or updates each day. Without a proper
mechanism to ensure application code is authorized and has gone through proper testing and security reviews,
there is a high risk of vulnerable code that put personal data at risk. Designers and developers should understand
the common application design and coding errors and take measures to ensure they are addressed within the

21
software development lifecycle processes. A useful source for gaining such knowledge is the Open Source Web
Application Security Project (OWASP) Top 10 List[84].
Code signing and verification are readily available in cloud services. Once the code has successfully completed the
required testing and security reviews, the approved code can be digitally signed as part of the approval process.
Cloud Users’ systems can automatically verify the digital signature to confirm that the code is unaltered and from a
trusted publisher before checking it to the production environment. For examples, see [85, 86].
6.4.6 Anti-malware and threat detection
Objective: To prevent and detect unauthorized network and systems intrusion and unauthorize use of application
systems that process personal data by malware and other cybersecurity threat actors.
Where cloud network and cloud-based applications connect to the Internet or other third-party networks, there
will be opportunities for threat actors and malware to attempt to intrude or infiltrate into the cloud network.
Application users may also have their computing devices infected when they connect to the public network or
exchange files and emails with other third parties. Microsoft, for example, reported blocking 9.6 billion malware
and 35.7 billion phishing and other threats targeting enterprise and consumers in 2021[73]. Protecting against such
threats is similarly important in the cloud network as in on-premises environments.
Cloud Users must ensure secure design of the cloud network architecture and deploy anti-malware and other
threat prevention and detection services in the cloud environment[87, 88]. Both cloud native and third-party
security vendors’ solutions are available on major CSPs’ platform for implementing this critical security measure.
For examples, see [58, 89, 90].
6.4.7 Vulnerability management
Objective: To ensure the network, operating systems, application software and connected devices used in the
processing of personal data maintain the most updated software implementation available or secure workarounds
to prevent exploitation that could lead to data compromise.
Software vulnerability is a weakness whereby its exploitation by malicious actors could lead to the compromise of
systems and data. Vulnerability management involves managing the processes, and tools for identifying,
evaluating, treating, and reporting on security vulnerabilities and misconfigurations within an organization's
software and systems. It allows the organization to monitor its digital environment to identify potential risks, for
an up-to-the-minute view of the organization’s security posture.
As part of the organization’s cyber hygiene practices, virtual servers, databases, and desktop instances in the cloud
network require vulnerability management to ensure they are up to date to the latest patch level at the earliest
possible time when the updates are available. To ensure the development and deployment environment install the
latest security updates, the CI/CD pipeline should incorporate vulnerability scanning and analysis. For example, see
[91]. Organizations should also be prepared to respond to critical vulnerabilities. For example, the Log4j[92] (a
zero-day exploit) requires organizations to locate the vulnerability and implement workarounds to address the
eminent risk exposure[93].
Vulnerability management and automated patch testing and updates are common architecture patterns we can
find in major CSPs cloud platforms[94-97]. Cloud Users can implement automatic fail-over high-availability
architecture and blue-green deployment models efficiently in the cloud, which enables virtual systems and
network devices to be tested and patched “on-the-fly” with minimum impact (if any) on the production
systems[98, 99].
6.5 Providing transparency of processing and assurance of compliance
Transparency and openness are key privacy principles to ensure accountability. Notice is one means to provide
transparency and openness. They require Cloud Users to provide assurance and show their compliance and
accountability to privacy in a transparency manner. The assurance should cover all technology layers that are
managed by the CSPs and the Cloud Users. As shown in Table 8, there are five technical measures applicable for
enabling transparency and assurance of compliance in the data life cycle.

22
Table 8: Technical controls applicable to enable transparency and assurance in the data life cycle
Categories of Data
3. Transparency and
assurance
• Openness
• Transparency
• Notice, choice, consent,
individual participation
• Accountability
3.1 Data map X X X X X X
3.2 Continuous governance X X X X X X
3.3 Compliance assessment,
attestation, and certifications
X X X X X X
3.4 Data lineage X X X X X X
3.5 Automated reasoning and formal
verification
X X X X X X
6.5.1 Data map
Objective: To provide visibility on the processing of personal data in the organization by capturing and maintaining
an up-to-date record of the data elements, processing activities, location, flow, access authorization, and security
safeguards implemented to ensure personal data confidentiality, integrity (including quality) and availability across
the data life cycle.
Organizations cannot protect what they do not know. Knowing what data that the organization collects, stores,
uses, discloses, transfers/shares, archives and deletes are essential to their data protection strategy. As discussed
in Section 6.1.3, the data map is the fundamental tool that Cloud Users can use for gaining visibility to the data in
the organization and data that are shared with third parties. Without the data map, organization cannot provide
transparency to data subjects and the regulator, and accountability to their processing actions on personal data.
Systems designers may leverage cloud services with automated data discovery and data classification capabilities
to create and maintain the organization’s data map.
6.5.2 Continuous governance
Objective: To provide visibility and enable governance of the personal data processing activities in the
organization, and with third-party data processors throughout the data life cycle stages.
The data map captures a snapshot of the data held in the organization. Continuous oversight is to track the
subsequent use, disclosure, transfer, sharing, archival, and any new collection, storage, and deletion of personal
data in the organization systems. As discussed in Section 6.3.1, Cloud Users can log every API call and associated
events in every cloud account, including data transfer to external systems, or data stores accessed by third-party
applications or users. They can centralize the event logs for monitoring and analysis with required retention and
backup to ensure availability and durability. Using the event logs monitoring and analysis system, the DPO could
gain oversights of personal data processing activities and apply suitable decisions to manage the data flow and
access at the occurrence of exceptional events. With such an oversight system, the DPO may, for example, identify
opportunity for reducing data exposure and work with the application teams to apply data minimization and de-
identification measures, as discussed in Sections 6.2.1 and 6.2.2, redacting or de-identifying personal data when
downstream applications don’t need or are not allowed to access personal identifiers. A DataHub system, using
meta-data available from the variety of data sources, may be used to implement an oversight system in the
cloud[100, 101]. Cloud-based audit services may also leverage event logs as evidence for attesting compliance
status[102].
6.5.3 Compliance assessment, attestation, and certification
Objective: To provide transparency to data protection measures, including related processes and tools used, and
assurance of their effectiveness in operations.
Cloud Users should provide assurance to regulators, auditors, customers, or other interested parties via
compliance programs, which can leverage the CSPs’ programs and compliance evidence for CSP-managed controls.
Such an assurance program may include self-assessment and third party audits, such as the Systems and
Organization Controls (SOC) for Privacy, the ISO/IEC 27001 ISMS[25] and ISO/IEC 27701 PIMS [19] certifications,
Cloud Security Alliance (CSA) Security, Trust, Assurance, and Risk (STAR) Certification, and CSA STAR Attestation
programs[24]. Organizations perform assurance programs at periodic time intervals, which may leave systems and

23
process discrepancies that emerged between audits undetected. In the cloud environment, Cloud Users can
address such an assurance gap through continuous compliance monitoring and assessment services. Cloud Users
may use cloud services for continuous self-assessment by evaluating configurations and permission policies against
cloud logs and services’ control plane APIs to report on compliance status and any gaps identified[103].
6.5.4 Data lineage
Objective: To enable visibility of the processing of personal data from its origin to its current stage in the data life
cycle.
Note that data lineage is currently not an explicit requirement under existing data protection laws/regulations that
we have analyzed in this paper. However, as we analyze the EU GDPR requirement for automated decision making,
we envisage data subjects may also request for automated decisions to be explained as a condition for their
consent. Data lineage is a technical measure that Cloud Users may use to support such a requirement.
Data lineage describes what happens to data from the origin, and as it moves along the data life cycle, going
through different processing activities. It provides visibility into the analytics pipeline and traces errors back to
their sources. Data lineage helps ensure that accurate, complete, and trustworthy data is being used to drive
decisions that would affect data subjects' rights. It provides organizations continuous oversights when a large data
volume, diverse data sources and destinations are involved, and helps them assess the quality of a metric or
dataset, supporting data quality principle.
Combined data lineage with the inputs, entities, systems, and processes that influenced the data, which can
reproduce the data, organization will provide data provenance at the same time. Major CSPs support such
requirements with the use of cloud services. For examples, see [104, 105].
6.5.5 Automated reasoning and formal verification
Objective: To provide formal proof that the technical measures are implemented correctly for personal data
protection.
Formal verification uses mathematics and formal methods to prove the correctness of an algorithm or logic that a
system implements. Automated reasoning is a method of formal verification that automatically generates and
checks mathematical proofs to prove the correctness of systems. This approach, also known as Provable Security,
provides a deeper level of assurance of systems security and privacy design and implementation. However, current
data protection laws/regulations have not mandated the use of this approach for providing assurance of
compliance. Its use has not been pervasive among CSPs, except for Amazon Web Services (AWS), which has a
group dedicated to using automated reasoning to enhance assurance of their cloud services. Cloud Users on AWS
may use such tools along with the cloud services that are supported by automated reasoning. Using automated
reasoning capabilities, organization can analyze their IAM policies, storage access configurations, and other
resource policies that control access, use, and transfer of personal data and show their compliance with a higher
level of confidence in the correctness and security of their policy settings and configurations. For example, see
[106].
7 Conclusion
In this paper, we have demonstrated that the principle-based approach can provide a more scalable and efficient
method for identifying technical measures required for compliance with the growing numbers of data protection
laws and regulations. This addresses a key challenge for Cloud Users to ensure privacy in the cloud. By categorizing
principles between the various privacy regulatory frameworks by common objectives, we have identified baseline
technical measures in the cloud that Cloud Users can leverage to achieve privacy compliance across multiple
jurisdictions. This paper has also shown that it is possible to leverage cloud services to automate and streamline
privacy management in the organization overall. Further, given the broad range of regulatory frameworks covered
in this paper, adopting the technical measures identified sets a strong baseline for Cloud Users in the face of future
regulations.

24
In future iterations of this series, we plan to apply the principle-based approach to validate the technical measures
identified in this paper to newly published data protection regulation. We plan to also evaluate the practicality of
the technical measures for privacy compliance with use cases common to Cloud Users’ applications in the cloud.
8 Acknowledgements
We would like to acknowledge and thank Ivy Young, Yu Xin Sung, and Simon Hollander of Amazon Web Services,
Sam Goh of DataX, Dr Hing-Yan Lee and Daniele Catteddu of Cloud Security Alliance, Dr Prinya Hom-anek of
TMBThanachart Bank, Dr Zhan Wang of Envision Digital, Sim Xin Yi, Manny Bhullar and Rifdi Ridwan of Asia Cloud
Computing Association and all those who have provided feedback and contributed to this paper.

25
List of Appendices
1. The Data Life Cycle
2. Common concerns on security and privacy OF the cloud and CSPs’ assurance responses
3. OECD Privacy Principles
4. APEC Privacy Principles
5. Thailand Personal Data Protection Act (TH PDPA) Privacy Principles
6. An analysis and mapping of TH PDPA against EU GDPR and SG PDPA.

26
Appendix 1 – The Data Life Cycle
A typical data life cycle consists of five stages: (1) Collection, (2) Storage, (3) Access and use, (4) Transfer, share, or
disclosure, (5) Archival, and (6) Deletion. In EU GDPR and SG PDPA, the term “processing” includes all these stages.
Figure 3: Data life cycle stages
From a state-machine model perspective, there are three main data states: (a) data at rest, (b) data in transit, and
(c) data in use. Mapping the data states to the data life cycle stages, data collection is the input to the system,
processed data is the intermediate output that will transition into one of the three states, i.e., at rest is the Storage
or Archival stages, in-use is the Access/Use stage and in-transit is the Share/Transfer stage. Data destruction is the
final output, or end-of-life of the data, which also maps to the Deletion stage in the life cycle.
In this paper, we use the data life cycle and the data states as a reference framework to study and analyze the data
protection principles and regulatory requirements and identify the security and privacy controls applicable for
addressing the technical needs in the cloud.
Each of the six data life cycle stages may involve third-party providers, who may be vendors or suppliers. In most
cases, these third parties will be data processors or sub-processors, or sometimes, joint controllers. In either case,
the context of the applications and processing involved should determine the privacy principles and identify
technical controls applicable. Similar technical requirements and controls apply to Cloud Users no matter they are
single controller and processor, or joint-controllers and joint-processors.
In EU GDPR, “’processing’ means any operation or set of operations which is performed on personal data or on
sets of personal data, whether by automated means, such as collection, recording, organization, structuring,
storage, adaptation or alteration, retrieval, consultation, use, disclosure by transmission, dissemination or
otherwise making available, alignment or combination, restriction, erasure or destruction16
.”
In SG PDPA, processing means, “in relation to personal data, means the carrying out of any operation or set of
operations in relation to the personal data, and includes any of the following: (a) recording; (b) holding; (c)
organization, adaptation or alteration; (d) retrieval; (e) combination; (f) transmission; (g) erasure or destruction.”
Neither processing nor the other terms in the data life cycle are defined in the TH PDPA. Table below provides a
brief interpretation of the stages of the data life cycle.
16 https://gdpr.eu/article-4-definitions/

27
Table 9: Description of Data Life Cycle Stages
Stage Description
Collection This is the stage where data is requested from data subject or received from another
organization to input into the application system. It is also known as data creation or
recording, as the received data will get created or recorded in a data store in the system.
Storage Where data is kept in a data store, which may be an electronic file, or a database system
(including data warehouse, data lake, etc.) In storage is also known as “data at rest”.
Access and use Where individual or principal, which could be a person or an application system gain an
ability to read and/or update the data or use the data for certain purpose, which may be
computational, analytical, visualization, verification against some other data, etc.
Transfer, share, or
disclosure
In this stage, data is transmitted out of its storage within an organization to another
organization. Transfer, share, or disclosure are often used interchangeably. Data-in-
transit is commonly used in place of this definition. We may, however, categorize them
by the level of control the data owner has over the recipient(s). For example, transfer
could mean a one-to-one relationship that is bounded by an agreement between two
entities or principals to handle the received data in an agreed manner, including security
and privacy protection. Sharing could refer to one-to-few transmissions of data, which
may or may not be bounded by a sharing agreement for the onward processing or use of
the data. Disclosure could refer to a one-to-many transmission, in which the data owner
and recipients do not have an agreement in place for onward processing and protection
of the data. Organization should therefore clarify the use of these terms based on the
context involved.
Archival This refers to the long-term storage of the data. At this stage, the data is infrequently
used, and when in archival, retrieval may take a longer time to complete, which may
range from a few hours to a few days depending on the archival arrangement. Data
archival should comply with the retention policy, which should be aligned with the data
protection regulation.
Deletion When data is no longer needed or useful to serve its intended purposes, it should be
deleted unless there are specific legal compliance requirements for archival. This is the
end-of-life stage for the data involved. Deletion shall be performed using secure data
destruction methods. In practice, data encryption is used to minimize the need for
additional destruction mechanism to be applied, except when there are regulatory
requirements for specific methods to be used for data destruction.

Data Privacy in the Cloud.pdf

Recommandé

Recommandé

Contenu connexe

Tendances

Tendances (20)

Similaire à Data Privacy in the Cloud.pdf

Similaire à Data Privacy in the Cloud.pdf (20)

Plus de accacloud

Plus de accacloud (20)

Dernier

Dernier (20)

Data Privacy in the Cloud.pdf