2. Agenda
§ GDPR in detail
§ Rights of individuals
§ Data transfers
§ GDPR and CMS platforms
§ Existing systems
§ Future systems
§ Work surrounding technical
platforms
3. About Exove
§ Digital design and development
company in Finland, Estonia, the
UK, and Singapore
§ Full service portfolio from
business consulting and service
design to development and care
§ We serve both multinational giants
and new start-ups alike
We deliver digital growth
More about us:
§ www.exove.com
§ www.exove.com/gdpr
§ @exove
4. About Janne Kalliola
§ Founder and CEO of Exove
§ Continuent, First Hop, SSH,
Helsinki University of Technology
§ Been coding since 1983, first web
stuff in 1994
§ Worked with web publishing and
content management systems
since 1999
§ I’ve written three CMSs in the past
§ Worked with open source since
1998, with Drupal from 2007
More about me:
§ www.kallio.la
§ linkedin.com/in/jannekalliola
§ @plastic
6. General Data Protection Regulation
§ The new EU data protection act that harmonises the use of private data
across the EEA
§ The regulation has been heavily lobbied and it took several years to negotiate the
final version
§ Transition period ended in May 2018
§ The GDPR replaced the national laws and regulations based on the EU Data
Protection Directive (95/46/EC)
§ The GDPR is directly applicable in each member state
§ Will lead to a greater degree of data protection harmonization across EU nations
§ Member States have retained significant rights to legislate in certain areas
7. Key Concepts
§ Data Controller – company managing personal data
§ Data Processor – company handling data for a data controller
§ Data Subject – an individual person
§ Private Data – very broadly defined as any data that can be used to identify a
person directly or indirectly
§ Name, email, user account, phone number, address, IP address
§ Private data can be processed if and only if it is required to provide the
service
§ If the service can be provided to anonymous users, it cannot ask for private data
8. Two Data Handling Roles
Controller
§ The company collecting the data
and controlling its usage
§ Responsible for and able to
demonstrate compliance with
the regulation
§ Including also work done by
processors
Processor
§ A company that processes
personal data on behalf of a
controller
§ Must be contractually bound to
the controller and follow written
orders
§ Must return or delete data when
contract ends
9. Key Concepts – Special Category
§ Data that reveals racial or ethnic origin, political opinions, religious or
philosophical beliefs, or trade union membership, and the processing of
genetic data, biometric data for the purpose of uniquely identifying a
natural person, data concerning health or data concerning a natural
person’s sex life or sexual orientation
§ Special category data has stricter rules than generic private data
§ It can be processed, but there needs to be a reason to do so
10. Children
§ Children are identified as vulnerable individuals that require specific
protection
§ Consent given by person with parental responsibility for the child
§ Also national laws about children making contracts, etc.
11. Key Principles – Controllers and Processors
§ Accountability
§ Demonstrating compliance
§ Increased documentation obligations
§ Risk-based approach
§ Privacy by design and default
§ Privacy Impact Assessment and prior consultation where risk is high
§ Data Protection Officers
§ New breach reporting obligations
§ Detailed prescription of what must be included in outsourcing contracts
12. Key Principles – Individuals
§ Transparency and consent – The individuals need to know how and why
their data is used, and companies need to have valid reason for the data
usage
§ More extensive data subject rights
§ Restriction
§ Erasure
§ Portability
§ "Profiling"
§ Changing consent requirements (including in relation to children)
13. Key Concepts – Risk Based Approach
§ Authorities have few resources to supervise many companies with growing
amounts of data
§ Thus
§ The company is made accountable
§ The measures need to be in proportion to the risk involved, for example:
§ Appropriate
§ Effective
§ By design and default
14. Accountability
§ Organisations must be able to prove that they are following the
regulation, i.e. reversed burden of proof
§ Requires process documentation, paper trails of decisions, and in some cases
privacy impact assessments
15. Key Concepts – Applicability
§ The regulation applies to the private data of an EEA national
§ Notwithstanding the location of the person, data, or processing
§ Only one EEA national is enough to make the data processing regulated by
GDPR
16. Fines
§ There has been a lot of talk about ”fines” in GDPR, or administrative
sanctions
§ The maximum fines are high – 20M€ or 4% of global turnover, whichever is higher
§ In reality, big fines are probably exceptions and one needs to show utter
disregard of GDPR to get such sanctions
§ The scale of sanctions starts from a notification and turns into monetary sanctions
somewhere down the road
§ But the sanctions have made sure that everybody has taken GDPR seriously
18. The Rights of the Individuals
Article Description
13/14 Transparency, right to be informed
15 Access to personal data
16 Rectification of inaccurate data
17 Right to be forgotten
18 Right to restrict processing
20 Data portability
22 Automated decision making and right to human intervention
19. Rights Explained (1/2)
§ Access to data – The individuals must be able to see the data collected
about them
§ On request, which must be answered within a month (extensions apply in some
cases), in a commonly used electronic format
§ First copy must be free of charge
§ Rectification of inaccurate data – The individuals can ask for inaccurate data to
be corrected
§ Right of erasure – The individuals can ask data to be removed
§ Objection to processing – The individuals can stop specific kinds of processing,
for example, direct marketing
20. Rights Explained (2/2)
§ Portability – The individuals have the right to have their data ported to them
or to another service
§ Restricting processing – The individuals can ask to stop processing their
data for a period of time
§ Data can also be temporarily removed in this case
§ Profiling and automated decision-making – Profiling based on sensitive
data requires explicit consent and the individuals can request manual
intervention in automated decision-making that has significant
effects on them
21. Lawfulness of Processing
§ Data subject has given consent
§ Necessary for the performance of contract or to take steps prior to entering
into a contract
§ Necessary to protect vital interests of data subject
§ Necessary for legitimate interests of controller or 3rd party
§ Necessary for compliance with legal obligation to which the controller is
subject
§ Necessary for task carried out in the public interest or exercise of official
authority
22. Consent
§ Consent must be
§ Actively given
§ Separable from other written agreements
§ Clearly presented
§ As easily revoked as given
§ Additional requirements include an effective prohibition on "bundled" consents and
the offering of services which are contingent on consent to processing
§ Where consent is relied on controllers should be able to demonstrate that consent
was given by the data subject to the processing
§ In practice, consent metadata is necessary
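Consent metadata could be kept as one small record per purpose. The sketch below is only one possible shape, not a prescribed format; the `ConsentRecord` class and all of its field names are illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ConsentRecord:
    """Hypothetical consent metadata: who agreed to what, when, and where."""
    subject_id: str       # internal identifier of the data subject
    purpose: str          # one specific purpose per record, never bundled
    policy_version: str   # privacy policy version shown when consent was given
    source: str           # where consent was given, e.g. "signup-form"
    given_at: datetime
    withdrawn_at: Optional[datetime] = None

    def is_active(self) -> bool:
        return self.withdrawn_at is None

    def withdraw(self) -> None:
        # The record is kept so that past processing can still be demonstrated.
        if self.withdrawn_at is None:
            self.withdrawn_at = datetime.now(timezone.utc)


record = ConsentRecord("user-42", "newsletter", "v3", "signup-form",
                       given_at=datetime.now(timezone.utc))
record.withdraw()
```

Keeping the withdrawn record instead of deleting it is a design choice made here so that the controller can still show that consent existed while the data was processed.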
23. Consent – Implications for UX
§ Consent is more regulated than before
§ Needs to be specific and unambiguous, cannot be part of other written agreements
§ Must be active – i.e. no preticked checkboxes
§ Must be reversible – in other words, must be available in user profile or similar
§ Record of the given consent is required
§ Consent cannot be required for a service that works also without processing
personal data
§ Privacy policy is more important than before
§ Data has to have storage times, and a lot of other tidbits
24. Legitimate Interest
§ Consent is rather difficult to achieve & demonstrate
§ Other grounds for processing relatively narrow
§ Legitimate interests likely to be one of the most important grounds
25. Legitimate Interest
§ Controllers that rely on "legitimate interests"
should maintain a record of the assessment to
demonstrate that they have given proper
consideration to the rights and freedoms of
data subjects
§ When relying on "legitimate interests", this must
be set out in the information notices
§ Recommendation: perform risk assessment
and documentation
Examples of legitimate interest:
§ Processing for direct marketing purposes or
preventing fraud
§ Transmission of personal data within a group
of undertakings for internal administrative
purposes, including client and employee data
§ Processing for the purposes of ensuring
network and information security, including
preventing unauthorised access to e-
communications networks and stopping
damage to computer and e-communication
systems
§ Reporting possible criminal acts or threats to
public security to a competent authority
27. Data Transfers – Basic Principles
§ Transfers outside EEA (European Economic Area) are restricted, but not forbidden
§ Transfers require adequate level of data protection, such as following EU model
clauses or binding corporate rules inside a group of companies
§ Safe Harbor is now replaced with Privacy Shield, a new deal to self-certify US
companies to allow hosting data regulated by the GDPR
§ Number of safe countries whose regulation provides similar protection of personal
data as GDPR
§ Andorra, Argentina, Canada (only commercial organisations), Faroe Islands, Guernsey, Israel,
Isle of Man, Jersey, New Zealand, Switzerland, Uruguay and USA (if the recipient belongs to
the Privacy Shield)
§ Updated from time to time by European Commission
28. Data Transfers – Hidden Complexity
§ Modern IT architectures are complex and they are designed in a layered
fashion
§ Thus the complexity of the underlying systems may easily escape notice
§ The data flows should be designed and documented clearly
§ And this documentation must be kept up to date all the time
§ Reducing privacy complexity by restricting the data to essentials, using
encryption, hashes, pseudonymisation, etc. makes perfect sense
29. Data Transfers – APIs and Integrations
§ Be aware of what is sent over API and/or integrations with other systems
§ As the definition of private data is very broad, it is too easy to send also
private data through an integration point
§ If you provide the API end points, check the API thoroughly to see whether it
inadvertently provides some private data
§ There are no technical measures to control the flow or the destination of
the data after it has left the system
§ Users must be kept informed about the potential of their private data
being handled outside of the system, including also the locations
30. Data Transfers – They Are Needed
§ You cannot avoid data transfers in the modern networked economy
§ Cloud services and serverless paradigm multiply the interconnectivity
§ And each interconnecting point might be a source of data transfer
§ There is no point fighting back and trying to do everything by yourself
§ You will be so inefficient in rolling out new features that competition will crush
you
§ Instead, try to minimise the risks while reaping most of the benefits
32. Structured vs. Unstructured Data
§ Most of the data processed by computers is structured, in other words it
contains named fields that might have types
§ Structured data is easy to put into spreadsheets
§ Content management systems handle a lot of unstructured data – the
content
§ Unstructured data is easy to put into documents
§ This data is also under GDPR
33. Content and GDPR
§ Content easily contains a lot of personal information, such as names,
email addresses, phone numbers, and images of people
§ These cannot easily be exported from the system to satisfy end user
rights
§ Thus, one needs to be diligent
§ Best solutions are to make suitable content types and other structures that
move a lot of repeating data into structured data
§ For example, staff listing implemented as a list of persons and not freely
editable page
34. Content and Consent
§ Remember also to have consent from people to use their personal
information
§ Discussion forums, blog comments, etc.
§ This applies to your own personnel, too
§ Using names and photos in a staff listing needs consent or a legitimate
interest
§ Using company-provided email addresses or phone numbers does not
help, as people can still be identified using them – thus they
are also personal information
35. Analytics
§ Using analytics is ok in general
§ It is good to check what kind of data goes into analytics and how the
system processes them
§ Even if it does not store the data, it might temporarily be accessible by the
personnel of the analytics provider
§ And this needs to be covered in the contract between you and them
§ IP address is a typical piece of data transferred to analytics
§ Some solutions – such as Google Analytics – offer anonymisation of IP
address before sending it to the analytics
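If IP addresses are masked in your own code before anything is sent onward, the idea can be sketched as below. This is an assumption-laden illustration, not how any particular analytics product implements it: the function name `anonymise_ip` and the chosen prefix lengths (24 bits kept for IPv4, 48 bits for IPv6) are hypothetical choices.

```python
import ipaddress


def anonymise_ip(address: str) -> str:
    """Zero the host part: the last octet for IPv4, the last 80 bits for IPv6."""
    ip = ipaddress.ip_address(address)
    prefix = 24 if ip.version == 4 else 48  # assumed masking widths
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)
```

For example, `anonymise_ip("203.0.113.77")` gives `203.0.113.0`. Whether the masked value is anonymous enough still depends on what other data travels with it.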
36. Access and Error Logs
§ Content management systems generate various logs for administrative and
error management purposes
§ These logs have at least the IP address of the user and thus are also full of
personal data
§ The procedures for such logs need to be checked
§ Who has access to them
§ Whether they are exported to an analysis system
§ Your own or third-party CMS extensions may also write their own log files
§ Debug mode may cause more personal data to be written to the files
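One way to reduce personal data in application logs is to redact it before a record is written. The following is a minimal sketch using Python's standard `logging.Filter`; the regular expressions are deliberately simple assumptions and will not catch every email or IP address format.

```python
import logging
import re

# Simplified patterns, assumed good enough for illustration only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def redact(text: str) -> str:
    """Replace email addresses and IPv4 addresses with placeholders."""
    text = EMAIL.sub("[email]", text)
    return IPV4.sub("[ip]", text)


class RedactingFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message first, then redact the final string.
        record.msg = redact(record.getMessage())
        record.args = None
        return True  # keep the record, only its content changes


logger = logging.getLogger("cms")
logger.addFilter(RedactingFilter())
```

Attaching the filter to a handler instead of a logger would also cover records from child loggers; which one fits depends on the logging setup.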
37. URLs
§ Your system may transfer personal data in URLs, such as
§ https://example.com/person/?name=Janne+Kalliola&birthdate=...
§ All systems storing that URL – logs, analytics, etc. – suddenly may contain
way more personal data than you know and have defined in your
processes
§ Also transaction ids and other pieces of data that identify a single user
are considered personal data
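Where such URLs cannot be avoided, they can at least be scrubbed before being logged or passed to analytics. A minimal sketch follows; the parameter list `PERSONAL_PARAMS` is an assumption and would have to match the parameters your own system actually uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed names of query parameters that carry personal data.
PERSONAL_PARAMS = {"name", "birthdate", "email", "phone"}


def scrub_url(url: str) -> str:
    """Drop query parameters that are known to contain personal data."""
    parts = urlsplit(url)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key.lower() not in PERSONAL_PARAMS]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

A deny list like this only removes what it knows about; an allow list of harmless parameters would be the stricter variant.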
38. Staging and Development Environments
§ GDPR affects all systems, including staging and development
environments
§ In case of requests from users, the data in these systems needs to be included
in erasure, rectification, etc.
§ When data is copied from production to staging or development –
typically to debug issues – special care is needed
§ As people tend to have a more relaxed attitude towards these systems, the
probability of data leaks increases
40. Compliance
§ Digital marketing platforms must be GDPR compliant
§ This should not be a problem with any major platform provider, as without
compliance they would be quickly out of business
§ But it is a good thing to check
§ Your processes need to be compliant, too
§ This is typically harder
§ And also connections between platforms need to be compliant
41. Mass Mailing
§ It is still allowed to send cold emails to people under GDPR, with the
following requirements:
§ The recipient address is a business address
§ The recipients are targeted based on your business – the mail should benefit
the recipient
§ You need to inform the recipient how their personal data is processed
§ You need to include instructions on how to remove or change their data
§ The personal data is not processed any longer than it is necessary
42. Subscribers Added before GDPR
§ If you have asked permission at the very beginning and you have
received their consent, there is no need to ask the permission again
§ If the purpose of the processing has changed or will change, they need to
be informed and given an easy way to decide if they want to allow
processing their data or not
§ If you have bought subscriber lists, you need to know how the data was
obtained and be able to explain to individuals, how and why you got their
data
§ This also applies to cases where you outsource address collection to a partner
43. Tool Chains
§ Digital marketing tools are typically chained
§ The source of the data is in CRM
§ Then there are marketing automation systems, websites, etc.
§ When data must be removed or changed, it has to be done through the
whole chain
§ Or the systems should be implemented so that they do not store anything –
just use the data when it is received and then discard it right away
§ It is very important to define retention times for the personal data that did
not lead to a business relationship
45. Documentation Can Mislead
§ If the system’s documentation is from the era before GDPR, it does not focus
on data privacy much or at all
§ Further, the documentation is typically somewhat simplified view of the
architecture
§ Sometimes very simplified
§ Finally, it is most probably also outdated
48. Data Storage
§ Modern systems store data in multiple locations and multiple
times
§ Performance, scalability, error management, data security needs, etc.
§ Without thorough and detailed understanding of the architecture, some
data storages may not be known by anyone
§ But the data needs to be expunged from those, too, when requested or when
the data is not needed anymore
49. Auditing Storage
§ For each existing system, find out:
§ Where the personal data is stored
§ What are the retention times and criteria
§ If these have not been specified, start the work
§ ”Forever” is not a retention policy and it must change
§ Why the data is stored – there needs to be legitimate reason for keeping the
data
§ Also the metadata of consent needs to be stored
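The audit questions above can be captured in a simple inventory that is checked automatically. This is a sketch under assumptions: the `DataStore` structure and its field names are hypothetical, and a real inventory would likely live in documentation or a registry rather than in code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataStore:
    name: str
    personal_fields: List[str]
    legal_basis: Optional[str]     # why the data is stored
    retention_days: Optional[int]  # None means "forever" or not yet defined


def audit(stores: List[DataStore]) -> List[str]:
    """List the stores that hold personal data without a reason or retention time."""
    findings = []
    for store in stores:
        if not store.personal_fields:
            continue  # nothing personal stored here
        if store.retention_days is None:
            findings.append(f"{store.name}: no retention time defined")
        if not store.legal_basis:
            findings.append(f"{store.name}: no documented reason for storing")
    return findings
```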
50. Data Deletion – Real or Not?
§ Deletion of data is a complex task in a networked data model
§ Removing something may leave dangling pointers or otherwise render part of
the data unusable
§ Thus, deletion might have been implemented by marking the item
deleted or hidden
§ The user cannot see it and considers it removed
§ This, of course, does not work with GDPR – unless you have valid legal
reasons to keep the data
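The difference between hiding and erasing can be illustrated with a small in-memory SQLite database. The schema and the function names are assumptions made for this sketch; overwriting the personal fields while keeping the row is only one of several ways to avoid dangling references.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT,
                        deleted INTEGER DEFAULT 0);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL);
    INSERT INTO users (id, name, email)
        VALUES (1, 'Test Person', 'test@example.com');
    INSERT INTO orders (user_id, total) VALUES (1, 19.90);
""")


def soft_delete(user_id: int) -> None:
    # Only hides the row: the personal data is still stored.
    db.execute("UPDATE users SET deleted = 1 WHERE id = ?", (user_id,))
    db.commit()


def erase(user_id: int) -> None:
    # Overwrites the personal fields but keeps the row, so that
    # orders.user_id does not become a dangling reference.
    db.execute("UPDATE users SET name = NULL, email = NULL, deleted = 1 "
               "WHERE id = ?", (user_id,))
    db.commit()
```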
51. Residual Data
§ Modern architectures duplicate data frequently – also private data
§ Some of these duplications are not deleted when they are no longer
needed technically
§ Log files, especially audit and debug logs
§ Synchronization files
§ This is called residual data
§ And there can be plenty of it
53. § Varnish or CDN in the front
§ Web server logs
§ Platform logs
§ Local caches
§ Uploaded binary files
§ Maillog of all the sent emails
§ Backups of the servers
54. § SQL logs
§ Binary logs on all servers
§ Backups of binary logs
§ Database dumps made by
developers
§ Production dumps to staging
environment
55. § Integration platform logs and
local caches
§ Integration platform document
DB oplogs
§ SaaS messaging platform logs
and internal database
56. § Finally the actual data master,
its logs, backups and
development environment
57. What to Do?
§ Data flow mapping is crucial
§ The natural starting point is the data entry, typically a website or a mobile application
§ Map the flow of the data from the source to the storage
§ Also external integrations need to be documented
§ Reduce data, if possible
§ Tune log levels, synchronisation frequencies, etc.
§ Mark down or define retention policies for residual data
§ Log rotation, cron based removals, etc.
§ Have proper policies for the rest
§ For example, how to make database dump for testing, how to handle it, when to remove it,
etc.
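A cron-based removal of residual files could be as small as the sketch below. The function name, the `*.log` pattern, and the idea of using file modification time as the age are assumptions; established tools such as logrotate may be a better fit where they are available.

```python
import time
from pathlib import Path
from typing import List


def purge_old_files(directory: Path, max_age_days: int,
                    pattern: str = "*.log") -> List[Path]:
    """Delete files matching the pattern that are older than the retention time."""
    cutoff = time.time() - max_age_days * 86400
    removed = []
    for path in directory.glob(pattern):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

Returning the removed paths makes it easy to log what the job did, which in turn helps to demonstrate that the retention policy is followed.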
58. Special Categories
§ The private data falling under special categories – health, religion, union
membership, etc. – needs to be handled with extra care
§ Proper access control who can see and manipulate the data
§ Audit trail of all actions
§ Also, use tight scrutiny to check whether the special category data is
actually needed or not
§ It adds extra burden that might not bring good enough benefits
§ Or ask it when needed, use, and discard – no storing at all
59. Privacy Policies
§ The privacy policies of the systems need to be constantly upgraded when
the system, the processes, or the purpose of the processing changes
§ This is surprisingly frequent activity, if the system is under active development
§ Of course, the first step with existing systems is to check that the policies
actually exist and they are compliant with GDPR
§ This is more of a territory for lawyers
§ Just make sure that the document is not written in hard-to-understand legalese,
so that a layman can also understand it
60. Data Security and GDPR
§ Focus in the past has been on data security
§ GDPR is not about data security and it does not define data security
requirements
§ It requires adequate security
§ Adequacy depends on the situation, and no hard and fast rules can be given
§ Data security procedures have not taken data privacy into account that
much
62. By Design and by Default
§ Data protection by design and data protection by default is still very much
undefined
§ We will have new clues flowing in as there are more guidelines from authorities
and actual cases
§ Requirements for processes and daily handling of personal data are not
defined, nor have they gotten much focus in GDPR preparations
63. Architecture Planning under GDPR
§ When planning architectures of new systems, take the following into
consideration:
§ Allowing data subject rights in new services
§ Personal data design
§ Risk-based security built-in to new services
§ Data protection and security in maintaining new services
64. Personal Data Design
§ Create a personal data design for the new service
§ Do not collect anything for which you cannot design a use
§ Do not collect anything that can be considered a high risk
§ Limit technical data collection
65. Example High Level Personal Data Design
§ Before Use – unregistered usage tracking based on cookies, email address if on
mailing list, technical data
§ While Using – full contact details, profile, usage tracking, purchase history,
mailing list actions, technical data
§ After Usage – email address for mailing list and re-contact, purchase history
for 2 years
66. Example Use-Case Level Personal Data Design
§ Registration – full name, address, email, gender, IP number, user-agent,
anonymous cookies connected, phone number
§ Update Profile – avatar image, preferences, hobbies, age, household income,
children, marital status
§ Purchase – product details, cost, discounts, path to purchase
§ Contact to Customer Care – full call record, call transcript, phone number,
product reference, internal comms regarding support case
67. Minimising Use of Private Data
§ The amount of private data collected can – and should – be minimised
§ Requires good architectural skills
§ Several strategies, such as
§ Collect, use, discard – do not store for later use, works well with background
checks
§ Encryption – when data is passed through a system that is not using it
§ Hashing – storing one way hashes instead of real information, for example,
banned accounts
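The banned accounts case can be sketched with a keyed one-way hash, so that the list of banned addresses never needs to store the addresses themselves. The key handling, the function names, and the in-memory set are assumptions for illustration; a keyed hash is used here because plain hashes of email addresses are easy to guess by hashing candidate addresses.

```python
import hashlib
import hmac

# Assumption: a secret key kept outside the database, e.g. in configuration.
SECRET_KEY = b"replace-with-a-real-secret"

banned_fingerprints = set()


def fingerprint(email: str) -> str:
    """One-way, keyed fingerprint of a normalised email address."""
    normalised = email.strip().lower().encode("utf-8")
    return hmac.new(SECRET_KEY, normalised, hashlib.sha256).hexdigest()


def ban(email: str) -> None:
    banned_fingerprints.add(fingerprint(email))


def is_banned(email: str) -> bool:
    return fingerprint(email) in banned_fingerprints
```

The fingerprint still identifies a single person to anyone holding the key, so it may need to be treated as personal data as well.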
68. Risk-Based Approach to Security
§ Data security should be built in accordance with risk
§ Risk to the rights and freedoms of data subjects
§ Risk is not based on data only, but also context of the service
§ Risk should be knowingly analysed with the Product Owner and the
technical people
§ Risk analysis should be documented
§ Data security should be documented as functional requirements and non-
functional requirements; otherwise it does not happen
69. Risk-Based Approach to Security, Example
§ Limit the completeness of data sets
§ Denormalisation for performance – in other words, copying the same data to
several places to speed up data reading
§ Leakage of full or individually usable data set has higher impact than partial data –
for example, leaking addresses vs. leaking addresses and names
§ Risk of unencrypted data in transit
§ For example, email notifications – the risk grows when the service has higher
impact on individuals, such as banks, stock brokers, or dating services
§ Leaking data via user friendly features
§ For example, login boxes that inform whether an account exists or not
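The login box example can be avoided by answering identically whether or not the account exists. Below is a minimal sketch with a hypothetical in-memory user store; the names `USERS`, `login`, and `GENERIC_ERROR` and the PBKDF2 parameters are assumptions, not a recommended production setup.

```python
import hashlib
import hmac
import os


def _hash(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


# Hypothetical user store: email -> (salt, password hash).
_salt = os.urandom(16)
USERS = {"test@example.com": (_salt, _hash("correct horse", _salt))}

# Dummy credentials so that unknown accounts take the same code path.
_DUMMY_SALT = os.urandom(16)
_DUMMY_HASH = _hash("dummy", _DUMMY_SALT)

GENERIC_ERROR = "Invalid email or password."


def login(email: str, password: str) -> str:
    salt, stored = USERS.get(email, (_DUMMY_SALT, _DUMMY_HASH))
    # The hash is computed even for unknown accounts to keep timing similar.
    matches = hmac.compare_digest(_hash(password, salt), stored)
    if email in USERS and matches:
        return "Welcome."
    return GENERIC_ERROR  # same message for a wrong password and an unknown account
```

Registration and password reset forms can leak the same information and would need a similar treatment.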
71. Privacy Related Metadata
§ GDPR requires some metadata about private data, such as recording
giving the consent
§ The more you know about the allowed usage of the data, the more
benefits and possibilities it offers
§ When drafting personal data designs, discuss and document also the
needs of the metadata
§ Keep in mind that the metadata will most probably be also private data and it
must be treated accordingly
72. Managing Consents
§ As consent must be reversible at will and any time, it requires extra
thinking to make it right
§ Also, parts of the service might rely on another legal basis and they should
continue to operate even if consent is withdrawn
§ Further, there might be several consents asked throughout the service
lifecycle
§ If possible, unify consent checking in the code into a library
§ Document the consent checking to keep the system internally uniform – when
and what
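A unified consent check might look like the sketch below: every processing purpose is registered with its legal basis, and all code asks one function before processing. The purpose names, the `LegalBasis` enum, and the in-memory storage are hypothetical and only illustrate the idea.

```python
from enum import Enum
from typing import Dict, Set


class LegalBasis(Enum):
    CONSENT = "consent"
    CONTRACT = "contract"
    LEGAL_OBLIGATION = "legal_obligation"


# Assumed purposes of an example service and the basis each one relies on.
PURPOSES: Dict[str, LegalBasis] = {
    "newsletter": LegalBasis.CONSENT,
    "order_delivery": LegalBasis.CONTRACT,
    "bookkeeping": LegalBasis.LEGAL_OBLIGATION,
}

_consents: Dict[str, Set[str]] = {}  # user id -> purposes consented to


def give_consent(user_id: str, purpose: str) -> None:
    _consents.setdefault(user_id, set()).add(purpose)


def withdraw_consent(user_id: str, purpose: str) -> None:
    _consents.get(user_id, set()).discard(purpose)


def may_process(user_id: str, purpose: str) -> bool:
    """Single entry point for the question "may we process for this purpose?"."""
    if PURPOSES[purpose] is LegalBasis.CONSENT:
        return purpose in _consents.get(user_id, set())
    return True  # other legal bases do not depend on consent
```

With this shape, withdrawing consent for the newsletter stops only the newsletter, while order delivery keeps working on its own legal basis.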
73. Aggregation
§ Collecting all private data under a single service helps to tackle the individuals’ right to
check their data
§ Implementing this is somewhat straightforward
§ When an individual wants to change or remove data, things become trickier
§ Deletion is straightforward if there is a single identifier for the individual across systems – this
is rarely the case
§ Changing is more complex operation, especially if the data has almost but not quite duplicate
fields – for example, shipping address, billing address, address, registered address, etc.
§ The typical choices are
§ Do the changes manually, in other words add the request to queue and handle it later
§ Require other systems to expose API to control changes and deletion
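The two choices can be combined: systems that expose an API are called directly, and the rest end up in a manual work queue. The sketch below assumes each system is represented by an optional erasure function; the names `handle_erasure` and `manual_queue` are illustrative only.

```python
from typing import Callable, Dict, List, Optional

manual_queue: List[str] = []  # tasks for systems without an erasure API


def handle_erasure(user_id: str,
                   systems: Dict[str, Optional[Callable[[str], None]]]
                   ) -> Dict[str, str]:
    """Erase the user where an API exists, queue manual work elsewhere."""
    status = {}
    for name, eraser in systems.items():
        if eraser is None:
            manual_queue.append(f"erase {user_id} from {name}")
            status[name] = "queued"
        else:
            eraser(user_id)
            status[name] = "erased"
    return status
```

This assumes the same identifier works in every system, which, as noted above, is rarely the case; a mapping between identifiers would be needed otherwise.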
74. Automation
§ If some task occurs frequently, it might make sense to automate it
§ If your organisation receives only a few GDPR related requests per year,
documentation might be a better choice
§ The level of automation defines the cost
§ Simple scripts to clean an individual database
vs.
§ One button to remove all personal data from all systems
§ Automation is not a silver bullet, use it only when it makes sense
75. Good Development Practices
§ Peer reviews – helps to raise quality on other matters, too
§ Auditing of third party components – must be based on risk
§ Automated, controlled, and repeatable process for deployments
§ Remove all manual work
§ Encryption of data at rest and in transit
§ Automatic anonymisation when moving data from production to staging
or development
§ If not possible, have good and thorough processes that are also followed
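Automatic anonymisation during a production-to-staging copy can be sketched as a function that replaces personal fields with harmless values. The field names and replacement rules are assumptions; a real schema needs its own mapping, and free-text fields may hide personal data that such a mapping misses.

```python
from typing import Any, Callable, Dict, List

# Assumed field names; each maps to a function producing a harmless replacement.
REPLACEMENTS: Dict[str, Callable[[int], Any]] = {
    "name": lambda n: f"User {n}",
    "email": lambda n: f"user{n}@example.invalid",
    "phone": lambda n: None,
}


def anonymise(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return copies of the rows with personal fields replaced."""
    cleaned = []
    for number, row in enumerate(rows, start=1):
        copy = dict(row)  # the production rows are left untouched
        for field, make_value in REPLACEMENTS.items():
            if field in copy:
                copy[field] = make_value(number)
        cleaned.append(copy)
    return cleaned
```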
77. Privacy Policies
§ Privacy policy is the first and foremost tool to show your compliance with GDPR
§ It must be included in every service processing private data
§ Privacy policy must be kept up to date
§ Consider versioning it
§ Checking its validity should be in a release checklist
§ Also, all changes to private data handling should be documented – for example, written in the
change log
§ Based on these changes, it should be relatively easy to see whether the privacy policy needs
updates
§ The simpler the policy, the easier the update procedure
§ You cannot automate this
78. Privacy Policy – Contents
§ You need to define
§ Who is collecting the data
§ What information is collected and processed
§ Why it is collected – the purpose and legal basis for processing
§ Are there any transfers to third parties, and if yes to whom and where
§ How long the data is processed
§ How the individual can exercise their rights and raise complaints
79. Deployments
§ Badly done deployments lead to increased security and privacy risks
§ Automate everything that is humanly possible
§ Remove every need for human interaction
§ If possible, make sure that the deployment can be rolled back
80. Maintenance
§ Maintenance process of digital services should be governed by data
protection policies
§ Data security in maintenance is usually directed at attack vectors on a
platform – not preventing data leaks
§ Data security should also focus on preventing data leaks instead of
penetration protection
§ Of course, systems implemented well from privacy standpoint need to be
compromised before a leak can take place
§ Keep privacy debt in discussion when doing small-scale development
§ Quick fixes may have a very long and tedious tail
81. Backups
§ Data in backups is also under GDPR
§ There are no clear instructions how to deal with backups
§ One solution is to have shorter backup cycle than 30 days – the limit of
responding to queries of users
§ The integrity of the backups must be kept
§ In other words, they should not be tampered with when removing user’s data from the
system
§ Backups should have similar retention period as other data
§ And if you need to do a restore after removing or correcting user’s data, you
need to replay the changes
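Replaying changes after a restore implies keeping a record of the erasures made since the oldest retained backup. A minimal sketch with dictionaries standing in for the database and the backup; `erasure_log`, `erase_user`, and `restore` are hypothetical names.

```python
from typing import Dict, List

erasure_log: List[str] = []  # ids erased since the oldest retained backup


def erase_user(db: Dict[str, dict], user_id: str) -> None:
    db.pop(user_id, None)
    erasure_log.append(user_id)


def restore(backup: Dict[str, dict]) -> Dict[str, dict]:
    db = dict(backup)            # the backup itself stays untouched
    for user_id in erasure_log:  # replay erasures made after the backup
        db.pop(user_id, None)
    return db
```

The log itself contains identifiers, so it probably needs its own retention time, for example the length of the backup cycle.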
82. Data Portability
§ GDPR requires the controller to provide the data in an interchangeable
format, should one exist
§ Currently, there are few cases that provide interchangeable formats
§ The world might move towards more uniformity in the future
§ This requires, of course, a first mover that sees business benefits of having
interchangeable format
§ Or an open source project that does this with “the right thing to do”
mentality
§ Until then, it is sufficient to provide the data in machine readable format
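Providing the data in a machine readable format can be as simple as a JSON export. The sketch below assumes the personal data has already been collected into plain dictionaries; the structure of the payload and the function names are illustrative, not a standard.

```python
import json
from datetime import date, datetime
from typing import Any, Dict, List


def _encode(value: Any) -> str:
    # Dates are written in ISO 8601 so that other systems can parse them.
    if isinstance(value, (date, datetime)):
        return value.isoformat()
    raise TypeError(f"Cannot serialise {type(value).__name__}")


def export_user_data(profile: Dict[str, Any],
                     orders: List[Dict[str, Any]]) -> str:
    """Return the user's data as a JSON document."""
    payload = {"profile": profile, "orders": orders}
    return json.dumps(payload, indent=2, ensure_ascii=False, default=_encode)
```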