Agenda
Machine learning (ML) and AI (Artificial Intelligence)
Secure Data-sharing
• Secure multi-party computation (SMPC) and use cases
• Homomorphic encryption (HE) and use cases
• Zero trust architecture (ZTA) vs. Zero knowledge
• Trusted execution environments (TEE)
Regulations and Standards in Data Privacy
• International privacy standards
• Differential Privacy (DP) and K-Anonymity
Technologies for Data Privacy in AI, ML and Analytics
Source: http://dataprotection.link/Zn1Uk#https://www.wsj.com/articles/coronavirus-paves-way-for-new-age-of-digital-surveillance-11586963028
American officials are drawing cellphone location data from mobile advertising firms to track the presence of crowds, but not individuals.
• Apple Inc. and Alphabet Inc.’s Google: a voluntary app that health officials can use to trace infected patients’ recent whereabouts, provided the patients agree to share that information.
Collect personal or anonymized data?
In Western Australia, lawmakers approved a bill to install surveillance devices in people’s homes to monitor those placed under quarantine.
Authorities in Hong Kong and India are using geofencing that draws virtual fences around quarantine zones.
• They monitor digital signals from smartphones or wristbands to deter rule breakers and catch offenders, who can be sent to jail.
Global Map Of Privacy Rights And Regulations
Use Case: Insilico Medicine
https://insilico.com/
Since 2014: An alternative to animal testing for research and development programs in the pharmaceutical industry.
• Using artificial intelligence and deep-learning techniques, Insilico analyzes how a compound will affect cells, which drugs can be used to treat the cells, and what side effects are possible.
• The company provides machine learning services to pharmaceutical, biotechnology, and skin care companies.
• The company has multiple collaborations applying next-generation artificial intelligence technologies, such as generative adversarial networks (GANs) and reinforcement learning, to the generation of novel molecular structures with desired properties.
A comprehensive drug discovery engine, which utilizes millions of samples and multiple data types to discover signatures of disease and identify the most promising targets for billions of molecules that already exist or can be generated de novo with the desired set of parameters.
Swarm AI for Event Outcome Prediction
Source: https://www.marketresearchfuture.com/reports/machine-learning-market-2494, September 2019
Global Machine Learning Market
Machine learning is a part of artificial intelligence (AI) that gives computers the capability to learn without being explicitly programmed.
It has multiple uses in today’s technology market concerning safety and security, such as face detection, face recognition, image classification, speech recognition, antivirus, Google, antispam, genetics, signal diagnosis, and weather forecasting.
Source: Privacyshield.gov
Privacy Shield Program*
• On July 12, 2016, the European Commission deemed the EU-U.S. Privacy Shield Framework adequate to
enable data transfers under EU law (see the adequacy determination).
• On July 16, 2020, the Court of Justice of the European Union issued a judgment declaring as “invalid” the
European Commission’s Decision (EU) 2016/1250 of 12 July 2016 on the adequacy of the protection
provided by the EU-U.S. Privacy Shield.
As a result of that decision, the EU-U.S. Privacy Shield Framework is no longer a valid mechanism to comply
with EU data protection requirements when transferring personal data from the European Union to the
United States.
This decision does not relieve participants in the EU-U.S. Privacy Shield of their obligations under the EU-U.S. Privacy
Shield Framework.
*: The EU-U.S. and Swiss-U.S. Privacy Shield Frameworks were designed by the U.S. Department of Commerce, and the European Commission and Swiss Administration, respectively, to provide companies on both sides of the Atlantic with a mechanism to comply with data protection requirements when transferring personal data from the European Union and Switzerland to the United States in support of transatlantic commerce.
Privacy Shield safeguards: Encryption
• The CJEU reaffirmed the validity of SCCs* but stated that companies must verify, on a case-by-case basis,
whether the law in the recipient country ensures adequate protection.
• The ruling placed the same requirement on EU data protection authorities, which must suspend such transfers on a case-by-case basis where equivalent protection cannot be ensured.
• Privacy professionals may need to consider whether relevant surveillance programs and authorities apply in
particular contexts. If they do, they could then assess whether those authorities include proportional
limitations in the given context, as well as whether effective judicial remedies exist.
• Alternatively, they might consider ways to limit the context itself through additional safeguards.
• Encryption, for instance, might be a consideration.
https://iapp.org/news/a/the-schrems-ii-decision-eu-us-data-transfers-in-question/
*: Standard Contractual Clauses (SCC). Standard contractual clauses for data transfers between EU and non-EU countries.
Gartner MQ for Data Science and Machine Learning Platforms
https://www.kdnuggets.com/2020/02/gartner-mq-2020-data-science-machine-learning.html
Data and analytics pipeline,
including all the following areas:
1. Data ingestion
2. Data preparation
3. Data exploration
4. Feature engineering
5. Model creation and training
6. Model testing
7. Deployment
8. Monitoring
9. Maintenance
10. Collaboration
2020 vs 2019 changes
Source: Digikey, techbrij
Machine Learning Model Lifecycle - Example
1. Define the model: use the Sequential or Model class and add the layers
2. Compile the model: call the compile method and specify the loss, optimizer and metrics
3. Train the model: call the fit method and use training data
4. Evaluate the model: call the evaluate method and use testing data to evaluate the trained model
5. Get predictions: use the predict method on new data for predictions
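The five lifecycle steps can be sketched without any framework. The block below is a minimal, dependency-free illustration, not the Keras API the slide refers to: `TinyLogisticModel`, its fixed log loss, its plain SGD update and the toy data are all assumptions made for this example, and only the method names mirror the define / compile / fit / evaluate / predict sequence.

```python
import math


class TinyLogisticModel:
    """Hypothetical stand-in for a framework model: a single sigmoid unit."""

    def __init__(self, n_features):
        # Step 1, define: one weight per input feature plus a bias.
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = None

    def compile(self, learning_rate=0.5):
        # Step 2, compile: loss (log loss) and optimizer (plain SGD) are fixed in this sketch.
        self.learning_rate = learning_rate

    def _probability(self, x):
        z = sum(w * v for w, v in zip(self.weights, x)) + self.bias
        return 1.0 / (1.0 + math.exp(-z))

    def fit(self, inputs, labels, epochs=200):
        # Step 3, train: one SGD update per training example, repeated for each epoch.
        for _ in range(epochs):
            for x, y in zip(inputs, labels):
                error = self._probability(x) - y  # gradient of log loss w.r.t. z
                self.weights = [w - self.learning_rate * error * v
                                for w, v in zip(self.weights, x)]
                self.bias -= self.learning_rate * error

    def evaluate(self, inputs, labels):
        # Step 4, evaluate: accuracy on held-out data.
        predicted = [1 if p >= 0.5 else 0 for p in self.predict(inputs)]
        return sum(p == y for p, y in zip(predicted, labels)) / len(labels)

    def predict(self, inputs):
        # Step 5, predict: probabilities for new data.
        return [self._probability(x) for x in inputs]


# Toy, made-up data: label is 1 when the single feature is large.
train_x = [[0.0], [0.1], [0.2], [0.3], [0.7], [0.8], [0.9], [1.0]]
train_y = [0, 0, 0, 0, 1, 1, 1, 1]
test_x, test_y = [[0.05], [0.95]], [0, 1]

model = TinyLogisticModel(n_features=1)
model.compile(learning_rate=0.5)
model.fit(train_x, train_y, epochs=200)
accuracy = model.evaluate(test_x, test_y)
predictions = model.predict([[0.6]])
```

In a real framework the same five calls apply, but the loss, optimizer and metrics would be chosen at the compile step rather than hard-coded.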
Secure Data-sharing for Hybrid Cloud
Source: Gartner
Source: Netskope
Current use or plan to use:
Spending by Deployment Model, Digital Commerce Platforms, Worldwide
Protection throughout the lifecycle of data in Hadoop
• Tokenizes or encrypts sensitive data fields
• Enterprise Policies: privacy policies may be managed on-prem or on the Cloud Platform
• Policy Enforcement Point (PEP)
• Protected data fields
• Separation of Duties
• Encryption Key Management
• Big Data Analytics
• Data Producers and Data Users
• Google Cloud
Big Data Protection with Granular Field Level Protection for Google Cloud
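Field-level protection of this kind can be illustrated with a small vault-based tokenization sketch. Everything named here (`TokenVault`, `protect_record`, `SENSITIVE_FIELDS`, the sample record) is hypothetical and stands in for whatever product sits at the policy enforcement point. A production system would also need key management, access control and persistent storage, none of which is shown.

```python
import secrets


class TokenVault:
    """Hypothetical in-memory vault mapping random tokens to original values."""

    def __init__(self):
        self._value_by_token = {}
        self._token_by_value = {}

    def tokenize(self, value):
        # Reuse the existing token so the same value always maps to the same token.
        if value in self._token_by_value:
            return self._token_by_value[value]
        while True:
            # Random digits of the same length, so downstream formats still fit.
            token = "".join(secrets.choice("0123456789") for _ in value)
            if token != value and token not in self._value_by_token:
                break
        self._value_by_token[token] = value
        self._token_by_value[value] = token
        return token

    def detokenize(self, token):
        # Only callers with vault access can recover the clear value.
        return self._value_by_token[token]


# Hypothetical policy: which fields must be protected before analytics.
SENSITIVE_FIELDS = {"card_number"}


def protect_record(record, vault):
    """Return a copy of the record with sensitive fields replaced by tokens."""
    return {key: vault.tokenize(value) if key in SENSITIVE_FIELDS else value
            for key, value in record.items()}


record = {"customer": "C-1001", "card_number": "4111111111111111", "amount": 42}
vault = TokenVault()
protected = protect_record(record, vault)
```

Analytics jobs then work on `protected`, while detokenization stays with a separate role, which is one way to read the "Separation of Duties" label above.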
Legal Compliance and Nation-State Attacks
• Many companies have information that is attractive to governments and intelligence services.
• Others worry that litigation may result in a subpoena for all their data.
Securosis, 2019
Multi-Cloud Data Privacy considerations
Jurisdiction
• Cloud service provider redundancy is great for resilience, but regulatory concerns arise when moving data across regions, which may have different laws and jurisdictions.
SecuPi
Securosis, 2019
Examples of Hybrid Cloud considerations
Consistency
• Most firms are quite familiar with their on-premises encryption and key management systems, so they often prefer to leverage the same tools and skills across multiple clouds.
• Firms often adopt a “best of breed” cloud approach.
Trust
• Some customers simply do not trust their vendors.
Vendor Lock-in and Migration
• A common concern is vendor lock-in, and an inability to migrate to another cloud service provider.
[Diagram labels: Cloud Gateway; Google Cloud, AWS Cloud, Azure Cloud; S3; Salesforce]
Source: Gartner
Six Important Privacy-Preserving Computation Techniques
Increased need for data analytics drives requirements.
Internal and Individual Third-Party Data Sharing
[Diagram labels: Data Lake, ETL, Files, …; Policy Enforcement Point (PEP); Protected data fields; Encryption Key Management; External Data; Internal Data; Secure Multi Party Computation; Analytics, Data Science, AI and ML; Data Pipeline; Data Collaboration; Data Privacy; On-premises; Cloud]
http://homomorphicencryption.org
Use Cases for Secure Multi Party Computation & Homomorphic Encryption (HE)
Use case - Financial services industry
Confidential financial datasets are vital for gaining significant insights.
• Using this data requires navigating a minefield of private client information, as well as sharing data between independent financial institutions to create a statistically significant dataset.
• Data privacy regulations such as CCPA, GDPR and other emerging regulations around the world apply.
• Data residency controls must be maintained while enabling data sharing in a secure and private fashion.
Reduce and remove the legal, risk and compliance processes
• Collaboration across divisions, other organizations and jurisdictions where data cannot be relocated or shared
• Generating privacy-respecting datasets with higher analytical value for Data Science and Analytics applications.
Use case – Retail - Data for Secondary Purposes
Large aggregator of credit card transaction data.
Open a new revenue stream
• Using its data with its business partners: retailers, banks and advertising companies.
• They could help their partners achieve better ad conversion rate, improved customer satisfaction, and more timely offerings.
• Needed to respect user privacy and specific regulations. In this specific case, they wanted to work with a retailer.
• Allow the retailer to gain insights while protecting user privacy, and the credit card organization’s IP.
• An analyst at each organization’s office first used the software to link the data without exchanging any of the underlying data.
Data used to train the machine learning and statistical models.
• In this specific use case, logistic and linear regression models were trained using secure multi-party computation (SMC).
• In its simplest form, SMC splits a dataset into secret shares and lets you train a model without ever putting the pieces together.
• The information communicated between the peers is encrypted at all times and cannot be reverse engineered.
• The resulting machine learning model coefficients (the output of the training) were shared only with the partner identified as the receiver of such information.
With the augmented dataset, the retailer was able to get a better picture of its customers’ buying habits.
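The idea of secret shares can be sketched with additive sharing over a prime field. This is a minimal illustration, not the software used in the case study: the modulus, function names and input values are assumptions. It covers only linear operations on shares. Training a regression model also needs secure multiplication of shared values, which is omitted here.

```python
import secrets

PRIME = 2**61 - 1  # illustrative field modulus (a Mersenne prime)


def share(value, n_parties):
    """Split an integer into n additive shares that sum to value mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def reconstruct(shares):
    """Only the full set of shares reveals the value."""
    return sum(shares) % PRIME


def add_shared(a_shares, b_shares):
    """Each party adds its own two shares locally; no data is exchanged."""
    return [(a + b) % PRIME for a, b in zip(a_shares, b_shares)]


def scale_shared(shares, public_constant):
    """Each party multiplies its share by a publicly known constant."""
    return [(s * public_constant) % PRIME for s in shares]


# Hypothetical inputs held by two different organizations, shared among 3 parties.
card_spend = share(1200, 3)
basket_value = share(300, 3)

# Compute card_spend + 2 * basket_value without any party seeing either input.
combined = add_shared(card_spend, scale_shared(basket_value, 2))
result = reconstruct(combined)
```

Any subset of fewer than all shares is uniformly random, so a single party learns nothing about the other organization's value from the shares it holds.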
Use case: Bank - Internal Data Usage by Other Units
A large bank wanted to broaden access to its data lake without compromising data privacy, while preserving the data’s analytical value and keeping infrastructure costs reasonable.
• Current approaches to de-identifying data did not fulfill the compliance requirements and business needs, which had led to several bank projects being stopped.
• The issue with these techniques, such as masking, tokenization, and aggregation, was that they did not sufficiently protect the data without overly degrading data quality.
This approach allows creating privacy-protected datasets that retain their analytical value for Data Science and business applications.
A plug-in to the organization’s analytical pipeline enforced the compliance policies before data from the data lake was consumed by data science and business teams.
• The analytical quality of the data was preserved for machine learning purposes by using AI and leveraging privacy models like differential privacy and k-anonymity.
Improved data access for teams increased the business’s bottom line without adding excessive infrastructure costs, while reducing the risk of consumer information exposure.
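As an illustration of the k-anonymity model mentioned above, the sketch below measures k for a small table: the size of the smallest group of records that share the same quasi-identifier values. The function name, the records and the choice of quasi-identifiers are assumptions for this example and are not taken from the bank's deployment.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return k: the size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(tuple(record[q] for q in quasi_identifiers) for record in records)
    return min(groups.values())  # raises ValueError on an empty table


# Hypothetical, already generalized records.
records = [
    {"age": ">50", "region": "Germany", "disease": "Gastric ulcer"},
    {"age": ">50", "region": "Germany", "disease": "Influenza"},
    {"age": "18-25", "region": "Germany", "disease": "Asthma"},
]

k_all = k_anonymity(records, ["age", "region"])
k_without_outlier = k_anonymity(records[:2], ["age", "region"])
```

Here the single "18-25" record forms a group of one, so the table is only 1-anonymous. Generalizing or suppressing that record is what a policy plug-in would do before releasing the data.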
https://royalsociety.org
Secure Multi-Party Computation (MPC)
Private multi-party machine learning with MPC
Using MPC, different parties send encrypted messages to each other and obtain the model F(A,B,C) they wanted to compute without revealing their own private input, and without the need for a trusted central authority.
[Figure: two panels, "Central trusted authority" and "Secure Multi-Party machine learning". In each, parties A, B and C obtain F(A, B, C); the MPC panel operates on protected data fields.]
Medium.com
Example of Multi-party Computation: Average Salary #1
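The slide's figure is not part of this transcript, so the sketch below is an assumption about what such an example typically shows: a ring protocol in which the first party hides the running total behind a random mask. All names and salary values are illustrative. The protocol assumes honest-but-curious parties who do not collude; two neighbours comparing notes could recover the salary of the party between them.

```python
import secrets

MODULUS = 2**64  # all arithmetic wraps around so the masked total looks random


def average_salary(salaries):
    """Simulate the ring: each party only ever sees a masked running total."""
    mask = secrets.randbelow(MODULUS)  # known only to the first party
    running_total = mask
    for salary in salaries:
        # Each party adds its own salary and passes the result to the next party.
        running_total = (running_total + salary) % MODULUS
    # The total returns to the first party, which removes its mask.
    total = (running_total - mask) % MODULUS
    return total / len(salaries)


average = average_salary([50000, 62000, 71000])
```

No party reveals its own salary, yet all of them learn the average once the first party publishes it.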
Source: Gartner, Six Important Privacy-Preserving Computation Techniques, April 2020
Case Study – HE and Securely sharing sensitive information
An example from the healthcare domain.
The recent ability to fully map the human genome has opened endless possibilities for advances in
healthcare.
1. Data from DNA analysis can test for genetic abnormalities, empower disease-risk analysis, discover family history, and reveal the presence of an Alzheimer’s allele.
• But these studies require very large DNA sample sizes to detect accurate patterns.
2. However, sharing personal DNA data is a particularly problematic domain.
• Many citizens hesitate to share such personal information with third-party providers,
uncertain of if, how and to whom the information might be shared downstream.
3. Moreover, legal limitations designed to protect privacy restrict providers from sharing this data as
well.
4. HE techniques enable citizens to share their genome data while addressing key privacy concerns, without the traditional all-or-nothing trust threshold with third-party providers.
https://royalsociety.org
Homomorphic encryption (HE)
HE depicted in a client-server model
• The client sends encrypted data to a server, where a specific analysis is performed on the encrypted data, without decrypting that data.
• The encrypted result is then sent to the client, who can decrypt it to obtain the result of the analysis they wished to outsource.
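The client-server flow can be sketched with a toy version of the Paillier cryptosystem, which is additively homomorphic (it supports addition of ciphertexts and multiplication by a public constant, not arbitrary analyses). The tiny primes, function names and values below are assumptions chosen for readability; they are far too small to be secure, and fully homomorphic schemes work differently.

```python
import math
import secrets


def keygen(p=1009, q=1013):
    """Toy key generation with tiny primes; real keys use primes of 1024+ bits."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n, message):
    """Client side: c = (1 + n)^m * r^n mod n^2 with a fresh random r."""
    n_squared = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return ((1 + message * n) * pow(r, n, n_squared)) % n_squared


def decrypt(n, private_key, ciphertext):
    """Client side: m = L(c^lam mod n^2) * mu mod n, where L(x) = (x - 1) // n."""
    lam, mu = private_key
    l_value = (pow(ciphertext, lam, n * n) - 1) // n
    return (l_value * mu) % n


def server_add(n, c1, c2):
    """Server side: multiplying ciphertexts adds the hidden plaintexts."""
    return (c1 * c2) % (n * n)


def server_scale(n, ciphertext, constant):
    """Server side: exponentiation multiplies the hidden plaintext by a constant."""
    return pow(ciphertext, constant, n * n)


n, private_key = keygen()
encrypted_sum = server_add(n, encrypt(n, 20), encrypt(n, 22))  # server never sees 20 or 22
encrypted_scaled = server_scale(n, encrypted_sum, 3)
plain_sum = decrypt(n, private_key, encrypted_sum)
plain_scaled = decrypt(n, private_key, encrypted_scaled)
```

The server only handles ciphertexts and the public modulus `n`; the private key never leaves the client.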
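[Figure: the Client sends an encryption of x to the Server; the Server runs the Analysis on it and returns the encrypted F(x), which the Client decrypts.]
• Policy Enforcement Point (PEP)
• Protected data fields
• Encryption Key Management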
Trusted execution environments
Trusted Execution Environments (TEEs) provide secure computation capability through a combination of special-purpose
hardware in modern processors and software built to use those hardware features.
The special-purpose hardware provides a mechanism by which a process can run on a processor without its memory or execution state being visible to any other process on the processor,
• not even the operating system or other privileged code.
*: Source: http://publications.officialstatistics.org
Computation in a TEE is not performed on data while it remains encrypted.
• Typically, the memory space of each TEE (enclave) application is protected from access
• and AES-encrypted when and if it is stored off-chip.
Usability is low, and products/services are emerging in MS Azure, IBM’s cloud service and Amazon AWS (late 2020)*
Regulations and Standards in Data Privacy
Source: FTI Consulting, Corporate Data Privacy Today, 2020
Which of the following aspects of data privacy are you particularly concerned about?
TrustArc
Legal and regulatory risks are exploding
Case Study
A major international bank consolidated all European operational data sources to Italy.
The bank had to protect Personally Identifiable Information (PII) in compliance with the EU Cross Border Data Protection Laws, specifically
• Datenschutzgesetz 2000 (DSG 2000) in Austria, and
• Bundesdatenschutzgesetz in Germany.
This required access to Austrian and German customer data to be restricted to requesters in each respective country.
• Achieved targeted compliance with EU Cross Border Data Security laws
• Implemented country-specific data access restrictions
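A country-specific access restriction of this kind could be expressed as a simple policy check. The sketch below is an assumption about the shape of such a rule, not the bank's implementation: the country codes, function names and the choice to leave other countries unrestricted are all illustrative.

```python
# Hypothetical policy: records from these countries are visible only to
# requesters located in the same country.
RESTRICTED_COUNTRIES = {"AT", "DE"}


def may_access(record_country, requester_country):
    """Allow access unless the record's country restricts it to local requesters."""
    if record_country in RESTRICTED_COUNTRIES:
        return requester_country == record_country
    return True  # assumption: other countries carry no residency restriction


def visible_records(records, requester_country):
    """Filter a consolidated data source down to what this requester may see."""
    return [r for r in records if may_access(r["country"], requester_country)]


# Made-up customer rows in the consolidated data source.
customers = [
    {"id": 1, "country": "AT"},
    {"id": 2, "country": "DE"},
    {"id": 3, "country": "IT"},
]

austrian_view = [r["id"] for r in visible_records(customers, "AT")]
italian_view = [r["id"] for r in visible_records(customers, "IT")]
```

A requester in Italy, where the data was consolidated, still cannot read the Austrian or German rows in clear form under this rule.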
Lower Risk and Higher Productivity with More Access to More Data
[Chart: Risk and User Productivity (Low to High) plotted against Access to Data (Low to High). Tokens: low risk. Clear data: high risk.]
Examples of data de-identification
Source: INTERNATIONAL STANDARD ISO/IEC 20889, Privitar, Anonos

Source data:
Last name | Balance | Age | Gender
Folds | 93791 | 23 | m
… | … | … | …

Pseudonymization
Field | Privacy Action (PA) | PA Config | Variant Twin Output
Gender | Pseudonymise | | AD-lks75HF9aLKSa

Generalization: Aggregation/Binning
Field | Privacy Action (PA) | PA Config | Variant Twin Output
Age | Integer Range Bin | Step 10 + Pseud. | Age_KXYC
Age | Integer Range Bin | Custom Steps | 18-25

Generalization: Rounding
Field | Privacy Action (PA) | PA Config | Variant Twin Output
Balance | Nearest Unit Value | Thousand | 94000

Generalization
Source data:
Patient | Age | Gender | Region | Disease
173965429 | 57 | Female | Hamburg | Gastric ulcer
Output data:
Patient | Age | Gender | Region | Disease
173965429 | >50 | Female | Germany | Gastric ulcer
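The transformations in these examples can be sketched as small functions. This is an illustration only: the function names, the age bins beyond 18-25, the region lookup and the HMAC-based pseudonym are assumptions, and the pseudonym format differs from the sample output shown above.

```python
import hashlib
import hmac


def round_to_nearest(value, unit=1000):
    """Rounding: replace a value with the nearest multiple of the unit."""
    return ((value + unit // 2) // unit) * unit


def bin_age(age, bins=((18, 25), (26, 35), (36, 50))):
    """Binning with custom steps; only the 18-25 bin comes from the slide."""
    for low, high in bins:
        if low <= age <= high:
            return f"{low}-{high}"
    return f">{bins[-1][1]}" if age > bins[-1][1] else f"<{bins[0][0]}"


# Hypothetical lookup used to generalize a city or state to its country.
REGION_TO_COUNTRY = {"Hamburg": "Germany"}


def generalize_region(region):
    return REGION_TO_COUNTRY.get(region, "Unknown")


def pseudonymize(value, key):
    """Keyed pseudonym: stable for the same key, not reversible without a lookup table."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


balance_out = round_to_nearest(93791)           # source balance from the first table
age_out = bin_age(23)                           # source age from the first table
patient_age_out = bin_age(57)                   # patient age from the second table
region_out = generalize_region("Hamburg")
gender_out = pseudonymize("m", key=b"demo-key")  # demo key; manage real keys securely
```

Each function trades some precision for privacy, which is why the choice of unit, bins and lookup level is a policy decision rather than a technical one.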
Encryption and Privacy Models
[Diagram: techniques shown are Differential Privacy (DP), Format Preserving Encryption (FPE), Homomorphic Encryption (HE), K-anonymity model, Tokenization, Masking, Hashing, and Machine Learning (ML) and Secure Multi Party Computation (SMPC). Attribute labels: 2-way, 1-way, Algorithmic, Random, Noise added, Computing on encrypted data, Format Preserving. Speed labels: Fast, Slow, Very slow.]
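The "noise added" idea behind differential privacy can be sketched with the Laplace mechanism applied to a counting query. The function names, the data and the epsilon value are assumptions for illustration; a production system would also track the privacy budget across queries and use a hardened noise sampler.

```python
import random


def laplace_noise(scale, rng=random):
    """The difference of two exponential samples is Laplace-distributed with this scale."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_count(values, predicate, epsilon, rng=random):
    """Counting query with epsilon-differential privacy via the Laplace mechanism."""
    true_count = sum(1 for v in values if predicate(v))
    # Adding or removing one person changes a count by at most 1 (sensitivity 1),
    # so the noise scale is 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Made-up ages; the query asks how many people are over 50.
ages = [23, 57, 61, 34, 45, 70, 52, 19]
rng = random.Random(7)  # seeded only to make the sketch repeatable
noisy_over_50 = dp_count(ages, lambda age: age > 50, epsilon=1.0, rng=rng)
```

Smaller epsilon values give stronger privacy but noisier answers; the true count here is 4, and each call returns a value scattered around it.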
Data protection techniques: Deployment on-premises, and clouds
Columns: Data Warehouse, Centralized, Distributed, On-premises, Public Cloud, Private Cloud
Vault-based tokenization y y
Vault-less tokenization y y y y y y
Format preserving encryption y y y y y
Homomorphic encryption y y
Masking y y y y y y
Hashing y y y y y y
Server model y y y y y y
Local model y y y y y y
L-diversity y y y y y y
T-closeness y y y y y y
Privacy enhancing data de-identification terminology and classification of techniques
• De-identification techniques: Tokenization, Cryptographic tools, Suppression techniques
• Formal privacy measurement models: Differential Privacy, K-anonymity model
ISO Privacy Standards
11 Published International Privacy Standards
Guidelines to help comply with ethical standards
IS: International Standard; TR: Technical Report; TS: Technical Specification
Categories: Cloud, Framework, Management, Techniques, Impact, Requirements, Process
• 20889 IS Privacy enhancing de-identification terminology and classification of techniques
• 27018 IS Code of practice for protection of PII in public clouds acting as PII processors
• 27701 IS Security techniques - Extension to ISO/IEC 27001 and ISO/IEC 27002 for privacy information management - Requirements and guidelines
• 29100 IS Privacy framework
• 29101 IS Privacy architecture framework
• 29134 IS Guidelines for Privacy impact assessment
• 29151 IS Code of Practice for PII Protection
• 29190 IS Privacy capability assessment model
• 29191 IS Requirements for partially anonymous, partially unlinkable authentication
• 19608 TS Guidance for developing security and privacy functional requirements based on 15408
• 27550 TR Privacy engineering for system lifecycle processes
References A:
1. C. Gentry. “A Fully Homomorphic Encryption Scheme.” Stanford University. September 2009, https://crypto.stanford.edu/craig/craig-thesis.pdf
2. Status Report on the Second Round of the NIST Post-Quantum Cryptography Standardization Process, https://csrc.nist.gov/publications/detail/nistir/8309/final
3. ISO/IEC 29101:2013 (Information technology – Security techniques – Privacy architecture framework)
4. ISO/IEC 19592-1:2016 (Information technology – Security techniques – Secret sharing – Part 1: General)
5. ISO/IEC 19592-2:2017 (Information technology – Security techniques – Secret sharing – Part 2: Fundamental mechanisms)
6. Homomorphic Encryption Standardization, Academic Consortium to Advance Secure Computation, https://homomorphicencryption.org/standards-meetings/
7. Homomorphic Encryption Standardization, https://homomorphicencryption.org/
8. NIST Post-Quantum Cryptography PQC, https://csrc.nist.gov/Projects/Post-Quantum-Cryptography
9. UN Handbook on Privacy-Preserving Computation Techniques, http://publications.officialstatistics.org/handbooks/privacy-preserving-techniques-handbook/UN%20Handbook%20for%20Privacy-Preserving%20Techniques.pdf
10. ISO/IEC 29101:2013 Information technology – Security techniques – Privacy architecture framework, https://www.iso.org/standard/45124.html
11. Homomorphic encryption, https://brilliant.org/wiki/homomorphic-encryption/
12. Survey on Secure Search Over Encrypted Data on the Cloud, https://arxiv.org/abs/1811.09767
References B:
1. California Consumer Privacy Act, OCT 4, 2019, https://www.csoonline.com/article/3182578/california-consumer-privacy-act-what-you-need-to-know-to-be-compliant.html
2. CIS Controls V7.1 Mapping to NIST CSF, https://dataprivacylab.org/projects/identifiability/paper1.pdf
3. GDPR and Tokenizing Data, https://tdwi.org/articles/2018/06/06/biz-all-gdpr-and-tokenizing-data-3.aspx
4. GDPR VS CCPA, https://wirewheel.io/wp-content/uploads/2018/10/GDPR-vs-CCPA-Cheatsheet.pdf
5. General Data Protection Regulation, https://en.wikipedia.org/wiki/General_Data_Protection_Regulation
6. IBM Framework Helps Clients Prepare for the EU's General Data Protection Regulation, https://ibmsystemsmag.com/IBM-Z/03/2018/ibm-framework-gdpr
7. INTERNATIONAL STANDARD ISO/IEC 20889, https://webstore.ansi.org/Standards/ISO/ISOIEC208892018?gclid=EAIaIQobChMIvI-k3sXd5gIVw56zCh0Y0QeeEAAYASAAEgLVKfD_BwE
8. INTERNATIONAL STANDARD ISO/IEC 27018, https://webstore.ansi.org/Standards/ISO/ISOIEC270182019?gclid=EAIaIQobChMIleWM6MLd5gIVFKSzCh3k2AxKEAAYASAAEgKbHvD_BwE
9. New Enterprise Application and Data Security Challenges and Solutions, https://www.brighttalk.com/webinar/new-enterprise-application-and-data-security-challenges-and-solutions/
10. Machine Learning and AI in a Brave New Cloud World, https://www.brighttalk.com/webcast/14723/357660/machine-learning-and-ai-in-a-brave-new-cloud-world
11. Emerging Data Privacy and Security for Cloud, https://www.brighttalk.com/webinar/emerging-data-privacy-and-security-for-cloud/
12. New Application and Data Protection Strategies, https://www.brighttalk.com/webinar/new-application-and-data-protection-strategies-2/
13. The Day When 3rd Party Security Providers Disappear into Cloud, https://www.brighttalk.com/webinar/the-day-when-3rd-party-security-providers-disappear-into-cloud/
14. Advanced PII/PI Data Discovery, https://www.brighttalk.com/webinar/advanced-pii-pi-data-discovery/
15. Emerging Application and Data Protection for Cloud, https://www.brighttalk.com/webinar/emerging-application-and-data-protection-for-cloud/
16. Data Security: On Premise or in the Cloud, ISSA Journal, December 2019, ulf@ulfmattsson.com
17. Webinars and slides, www.ulfmattsson.com
Evolving international privacy regulations and cross border data transfer - g...
 

Protecting Data Privacy in Analytics and Machine Learning

  • 5. Agenda
    Machine learning (ML) and AI (Artificial Intelligence)
    Secure Data-sharing
    • Secure multi-party computation (SMPC) and use cases
    • Homomorphic encryption (HE) and use cases
    • Zero trust architecture (ZTA) vs. Zero knowledge
    • Trusted execution environments (TEE)
    Regulations and Standards in Data Privacy
    • International privacy standards
    • Differential Privacy (DP) and K-Anonymity
  • 6. Technologies for Data Privacy in AI, ML and Analytics
  • 7. Source: http://dataprotection.link/Zn1Uk#https://www.wsj.com/articles/coronavirus-paves-way-for-new-age-of-digital-surveillance-11586963028
    American officials are drawing cellphone location data from mobile advertising firms to track the presence of crowds, but not individuals.
    • Apple Inc. and Alphabet Inc.’s Google: a voluntary app that health officials can use to reverse-engineer sickened patients’ recent whereabouts, provided they agree to provide such information.
    Collect personal or anonymized data?
    In Western Australia, lawmakers approved a bill to install surveillance gadgets in people’s homes to monitor those placed under quarantine.
    Authorities in Hong Kong and India are using geofencing that draws virtual fences around quarantine zones.
    • They monitor digital signals from smartphones or wristbands to deter rule breakers and nab offenders, who can be sent to jail.
  • 8. Global Map Of Privacy Rights And Regulations
  • 13. Use Case: Insilico Medicine
    https://insilico.com/
    Since 2014: An alternative to animal testing for research and development programs in the pharmaceutical industry.
    • By using artificial intelligence and deep-learning techniques, Insilico is able to analyze how a compound will affect cells, what drugs can be used to treat the cells, and possible side effects.
    • The company provides machine learning services to different pharmaceutical, biotechnology, and skin care companies.
    • The company has multiple collaborations applying next-generation artificial intelligence technologies, such as generative adversarial networks and reinforcement learning, to the generation of novel molecular structures with desired properties.
    A comprehensive drug discovery engine, which utilizes millions of samples and multiple data types to discover signatures of disease and identify the most promising targets for billions of molecules that already exist or can be generated de novo with the desired set of parameters.
  • 14. Swarm AI for Event Outcome Prediction
  • 15. Global Machine Learning Market
    Machine learning is a part of artificial intelligence (AI) that gives computers the capability to learn without being explicitly programmed.
    It has multiple uses in today’s technology market concerning safety and security, such as face detection, face recognition, image classification, speech recognition, antivirus, Google, antispam, genetics, signal diagnosis, and weather forecasting.
    Source: https://www.marketresearchfuture.com/reports/machine-learning-market-2494, September 2019
  • 18. Privacy Shield Program* (Privacyshield.gov)
    • On July 12, 2016, the European Commission deemed the EU-U.S. Privacy Shield Framework adequate to enable data transfers under EU law (see the adequacy determination).
    • On July 16, 2020, the Court of Justice of the European Union issued a judgment declaring as “invalid” the European Commission’s Decision (EU) 2016/1250 of 12 July 2016 on the adequacy of the protection provided by the EU-U.S. Privacy Shield.
    As a result of that decision, the EU-U.S. Privacy Shield Framework is no longer a valid mechanism to comply with EU data protection requirements when transferring personal data from the European Union to the United States.
    This decision does not relieve participants in the EU-U.S. Privacy Shield of their obligations under the EU-U.S. Privacy Shield Framework.
    *: The EU-U.S. and Swiss-U.S. Privacy Shield Frameworks were designed by the U.S. Department of Commerce, and the European Commission and Swiss Administration, respectively, to provide companies on both sides of the Atlantic with a mechanism to comply with data protection requirements when transferring personal data from the European Union and Switzerland to the United States in support of transatlantic commerce.
  • 19. Privacy Shield safeguards: Encryption
    • The CJEU reaffirmed the validity of SCCs* but stated that companies must verify, on a case-by-case basis, whether the law in the recipient country ensures adequate protection.
    • The ruling placed the same requirement on EU data protection authorities to suspend such transfers on a case-by-case basis where equivalent protection cannot be ensured.
    • Privacy professionals may need to consider whether relevant surveillance programs and authorities apply in particular contexts. If they do, they could then assess whether those authorities include proportional limitations in the given context, as well as whether effective judicial remedies exist.
    • Alternatively, they might consider ways to limit the context itself through additional safeguards.
    • Encryption, for instance, might be a consideration.
    Source: https://iapp.org/news/a/the-schrems-ii-decision-eu-us-data-transfers-in-question/
    *: Standard Contractual Clauses (SCC). Standard contractual clauses for data transfers between EU and non-EU countries.
  • 20. Gartner MQ for Data Science and Machine Learning Platforms
    https://www.kdnuggets.com/2020/02/gartner-mq-2020-data-science-machine-learning.html
    Data and analytics pipeline, including all the following areas:
    1. Data ingestion
    2. Data preparation
    3. Data exploration
    4. Feature engineering
    5. Model creation and training
    6. Model testing
    7. Deployment
    8. Monitoring
    9. Maintenance
    10. Collaboration
    [Figure: 2020 vs 2019 changes]
  • 21. Machine Learning Model Lifecycle - Example (Sources: Digikey, techbrij)
    1. Define the model: use the Sequential or Model class and add the layers
    2. Compile the model: call the compile method and specify the loss, optimizer and metrics
    3. Train the model: call the fit method and use training data
    4. Evaluate the model: call the evaluate method and use testing data to evaluate the trained model
    5. Get predictions: use the predict method on new data for predictions
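The five lifecycle steps can be sketched without any ML library. The class below, TinyLinearModel, is a hypothetical stand-in written for illustration only: its method names echo the compile, fit, evaluate and predict steps listed on the slide, but it is a single linear unit trained by plain stochastic gradient descent, not the Sequential or Model class itself, and the data is made up.

```python
class TinyLinearModel:
    """Hypothetical stand-in that exposes the five lifecycle steps."""

    def __init__(self, n_features):
        # Step 1: define the model (one linear unit, weights start at zero).
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = None

    def compile(self, learning_rate=0.1):
        # Step 2: fix the training configuration (loss is mean squared error).
        self.learning_rate = learning_rate

    def predict(self, rows):
        # Step 5: compute predictions for a list of feature rows.
        return [sum(w * x for w, x in zip(self.weights, row)) + self.bias
                for row in rows]

    def fit(self, rows, targets, epochs=500):
        # Step 3: train with per-sample gradient descent on squared error.
        for _ in range(epochs):
            for row, target in zip(rows, targets):
                error = self.predict([row])[0] - target
                self.weights = [w - self.learning_rate * error * x
                                for w, x in zip(self.weights, row)]
                self.bias -= self.learning_rate * error

    def evaluate(self, rows, targets):
        # Step 4: mean squared error on held-out data.
        predictions = self.predict(rows)
        return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


# Made-up data following y = 2x + 1.
train_x = [[0.0], [0.25], [0.5], [0.75], [1.0]]
train_y = [1.0, 1.5, 2.0, 2.5, 3.0]

model = TinyLinearModel(n_features=1)
model.compile(learning_rate=0.1)
model.fit(train_x, train_y, epochs=500)
test_error = model.evaluate([[0.1], [0.9]], [1.2, 2.8])
new_prediction = model.predict([[2.0]])[0]
```

In a real project each step would map onto the corresponding call of the chosen framework; the order of the calls is the part that carries over.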
  • 25. [Figures: Current use or plan to use; Spending by Deployment Model, Digital Commerce Platforms, Worldwide. Sources: Gartner, Netskope]
  • 27. Big Data Protection with Granular Field Level Protection for Google Cloud
    Protection throughout the lifecycle of data in Hadoop
    Tokenizes or encrypts sensitive data fields
    Enterprise Policies: privacy policies may be managed on-prem or on the Cloud Platform
    Separation of Duties
    [Diagram labels: Policy Enforcement Point (PEP); Protected data fields (U); Encryption Key Management; Big Data Analytics; Data Producers; Data Users; Google Cloud]
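Field-level tokenization can be illustrated with a small in-memory example. This is a minimal sketch of the vault-based variant only: TokenVault, protect_record and the "tok_" prefix are hypothetical names chosen for illustration, and a real deployment would keep the vault in a hardened, access-controlled service rather than in a Python dictionary.

```python
import secrets


class TokenVault:
    """Hypothetical in-memory vault mapping clear values to random tokens."""

    def __init__(self):
        self._token_by_value = {}
        self._value_by_token = {}

    def tokenize(self, value):
        # Reuse the token if the value was seen before, so joins still work.
        if value not in self._token_by_value:
            token = "tok_" + secrets.token_hex(8)
            self._token_by_value[value] = token
            self._value_by_token[token] = value
        return self._token_by_value[value]

    def detokenize(self, token):
        # Only callers with vault access can recover the clear value.
        return self._value_by_token[token]


def protect_record(record, sensitive_fields, vault):
    """Replace only the sensitive fields; leave analytic fields in the clear."""
    return {key: vault.tokenize(value) if key in sensitive_fields else value
            for key, value in record.items()}


vault = TokenVault()
record = {"name": "Alice Example", "card": "4111111111111111", "amount": 120}
protected = protect_record(record, {"name", "card"}, vault)
```

The protected record can flow to data users and analytics jobs, while detokenization stays behind the policy enforcement point.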
  • 29. Multi-Cloud Data Privacy considerations (Securosis, 2019; SecuPi)
    Legal Compliance and Nation-State Attacks
    • Many companies have information that is attractive to governments and intelligence services.
    • Others worry that litigation may result in a subpoena for all their data.
    Jurisdiction
    • Cloud service provider redundancy is great for resilience, but regulatory concerns arise when moving data across regions that may have different laws and jurisdictions.
  • 30. Examples of Hybrid Cloud considerations (Securosis, 2019)
    Consistency
    • Most firms are quite familiar with their on-premises encryption and key management systems, so they often prefer to leverage the same tools and skills across multiple clouds.
    • Firms often adopt a “best of breed” cloud approach.
    Trust
    • Some customers simply do not trust their vendors.
    Vendor Lock-in and Migration
    • A common concern is vendor lock-in, and an inability to migrate to another cloud service provider.
    [Diagram labels: Cloud Gateway; Google Cloud; AWS Cloud; Azure Cloud; S3; Salesforce]
  • 32. Increased need for data analytics drives requirements.
    [Diagram labels: Data Lake, ETL, Files …; Policy Enforcement Point (PEP); Protected data fields (U); Encryption Key Management; External Data; Internal Data; Secure Multi Party Computation; Analytics, Data Science, AI and ML; Data Pipeline; Data Collaboration; Data Privacy; On-premises; Cloud; Internal and Individual Third-Party Data Sharing]
  • 33. Use Cases for Secure Multi Party Computation & Homomorphic Encryption (HE)
    http://homomorphicencryption.org
  • 34. Use case - Financial services industry
    Confidential financial datasets are vital for gaining significant insights.
    • The use of this data requires navigating a minefield of private client information, as well as sharing data between independent financial institutions to create a statistically significant dataset.
    • Data privacy regulations such as CCPA, GDPR and other emerging regulations around the world
    • Data residency controls, while enabling data sharing in a secure and private fashion.
    Reduce and remove the legal, risk and compliance processes
    • Collaboration across divisions, other organizations and across jurisdictions where data cannot be relocated or shared
    • Generating privacy-respectful datasets with higher analytical value for Data Science and Analytics applications.
  • 35. Use case – Retail - Data for Secondary Purposes
    Large aggregator of credit card transaction data. Open a new revenue stream
    • Using its data with its business partners: retailers, banks and advertising companies.
    • They could help their partners achieve better ad conversion rates, improved customer satisfaction, and more timely offerings.
    • Needed to respect user privacy and specific regulations.
    In this specific case, they wanted to work with a retailer.
    • Allow the retailer to gain insights while protecting user privacy and the credit card organization’s IP.
    • An analyst at each organization’s office first used the software to link the data without exchanging any of the underlying data.
    Data used to train the machine learning and statistical models.
    • In this specific use case, a logistic and a linear regression model were trained using secure multi-party computation (SMC).
    • In its simplest form, SMC splits a dataset into secret shares and enables you to train a model without needing to put the pieces together.
    • The information that is communicated between the peers is encrypted at all times and cannot be reverse engineered.
    • The resultant machine learning model coefficients (output of the training) were only shared with the partner identified as the receiver of such information.
    With the augmented dataset, the retailer was able to get a better picture of its customers’ buying habits.
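The "secret shares" idea mentioned above can be shown with additive secret sharing, one common way to split a value. This is a minimal sketch under stated assumptions: the modulus, the function names and the two made-up spending totals are illustrative, and it only covers splitting, local addition and reconstruction, not the regression training the use case describes.

```python
import secrets

MODULUS = 2**61 - 1  # assumed public modulus; all arithmetic is done mod this


def split_into_shares(value, n_parties):
    """Split value into n random-looking shares that sum to it mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last_share = (value - sum(shares)) % MODULUS
    return shares + [last_share]


def reconstruct(shares):
    """Only the sum of all shares reveals the value."""
    return sum(shares) % MODULUS


# Two data owners each split a private number across three compute parties.
card_spend_shares = split_into_shares(1250, 3)
store_spend_shares = split_into_shares(830, 3)

# Each compute party adds the two shares it holds, seeing neither input.
local_sums = [(a + b) % MODULUS
              for a, b in zip(card_spend_shares, store_spend_shares)]

# Combining the three local results yields the joint total.
joint_total = reconstruct(local_sums)
```

Because addition works share by share, linear steps of a computation can run locally at each party; multiplication and model training need additional protocol rounds that this sketch leaves out.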
  • 36. Use case: Bank - Internal Data Usage by Other Units
    A large bank wanted to broaden access to its data lake without compromising data privacy, while preserving the data’s analytical value, and at reasonable infrastructure cost.
    • Current approaches to de-identify data did not fulfill the compliance requirements and business needs, which had led to several bank projects being stopped.
    • The issue with these techniques, like masking, tokenization, and aggregation, was that they did not sufficiently protect the data without overly degrading data quality.
    This approach allows creating privacy-protected datasets that retain their analytical value for Data Science and business applications.
    A plug-in to the organization’s analytical pipeline enforces the compliance policies before the data is consumed from the data lake by data science and business teams.
    • The analytical quality of the data was preserved for machine learning purposes by using AI and leveraging privacy models like differential privacy and k-anonymity.
    Improved data access for teams increased the business’s bottom line without adding excessive infrastructure costs, while reducing the risk of consumer information exposure.
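The k-anonymity model named above has a simple operational reading: every combination of quasi-identifier values must be shared by at least k records. The check below is a sketch with made-up rows; the function name and the choice of age_band and region as quasi-identifiers are assumptions for illustration.

```python
from collections import Counter


def k_anonymity(rows, quasi_identifiers):
    """Return the size of the smallest group sharing the same quasi-identifiers."""
    groups = Counter(tuple(row[field] for field in quasi_identifiers)
                     for row in rows)
    return min(groups.values())


# Made-up, already generalized records.
rows = [
    {"age_band": "20-29", "region": "Germany", "balance": 94000},
    {"age_band": "20-29", "region": "Germany", "balance": 12000},
    {"age_band": "50-59", "region": "Austria", "balance": 51000},
    {"age_band": "50-59", "region": "Austria", "balance": 7000},
    {"age_band": "50-59", "region": "Austria", "balance": 3000},
]

k = k_anonymity(rows, ["age_band", "region"])
```

A pipeline plug-in could refuse to release a dataset whose k falls below a policy threshold, or generalize the quasi-identifiers further until it passes.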
  • 38. Secure Multi-Party Computation (MPC)
    https://royalsociety.org
    Private multi-party machine learning with MPC
    Using MPC, different parties send encrypted messages to each other, and obtain the model F(A,B,C) they wanted to compute without revealing their own private input, and without the need for a trusted central authority.
    [Diagram: "Central trusted authority" (parties A, B and C send their inputs to one authority, which returns F(A,B,C) to each) compared with "Secure Multi-Party machine learning" (A, B and C exchange protected data fields (U) directly and each obtains F(A,B,C))]
  • 39. Example of Multi-party Computation: Average Salary #1 (Source: Medium.com)
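The slide's figure is not preserved in the transcript, so the following is one common textbook version of the average-salary example, not necessarily the variant the slide shows. The first party adds a random mask to its salary, each party adds its own salary to the running total it receives, and the first party removes the mask at the end. The function name, modulus and salaries are made up for illustration.

```python
import secrets

MODULUS = 10**12  # assumed bound, larger than any plausible salary total


def average_salary(salaries):
    """Simulate the ring protocol: no party sees another party's salary."""
    # Party 1 picks a secret random mask and starts the running total with it.
    mask = secrets.randbelow(MODULUS)
    running_total = mask

    # Each party in turn adds its own salary to the masked total it received.
    for salary in salaries:
        running_total = (running_total + salary) % MODULUS

    # The total returns to party 1, who removes the mask and shares the average.
    total = (running_total - mask) % MODULUS
    return total / len(salaries)


average = average_salary([50000, 60000, 70000])
```

Each intermediate total looks random to the party that receives it, but two neighbors who collude can recover the salary of the party between them, which is why production MPC protocols use stronger constructions.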
  • 40. Six Important Privacy-Preserving Computation Techniques (Source: Gartner, April 2020)
  • 41. Case Study – HE and securely sharing sensitive information
    An example from the healthcare domain. The recent ability to fully map the human genome has opened endless possibilities for advances in healthcare.
    1. Data from DNA analysis can test for genetic abnormalities, empower disease-risk analysis, discover family history, and detect the presence of an Alzheimer’s allele.
       • But these studies require very large DNA sample sizes to detect accurate patterns.
    2. However, sharing personal DNA data is a particularly problematic domain.
       • Many citizens hesitate to share such personal information with third-party providers, uncertain of if, how and to whom the information might be shared downstream.
    3. Moreover, legal limitations designed to protect privacy restrict providers from sharing this data as well.
    4. HE techniques enable citizens to share their genome data while addressing key privacy concerns, without the traditional all-or-nothing trust threshold with third-party providers.
  • 42. Homomorphic encryption (HE)
    https://royalsociety.org
    HE depicted in a client-server model
    • The client sends encrypted data to a server, where a specific analysis is performed on the encrypted data, without decrypting that data.
    • The encrypted result is then sent to the client, who can decrypt it to obtain the result of the analysis they wished to outsource.
    [Diagram: Client sends "Encryption of x" to Server; Server runs Analysis and returns "Encrypted F(x)". Labels: Policy Enforcement Point (PEP); Protected data fields (U); Encryption Key Management]
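The client-server flow can be tried out with a toy version of the Paillier cryptosystem. Two caveats: Paillier is only additively homomorphic (the server can add encrypted numbers, not run arbitrary analyses as a fully homomorphic scheme would), and the tiny fixed primes below are an assumption made to keep the example readable, which makes it completely insecure. Function names are illustrative.

```python
import math
import secrets


def generate_keypair(p=1009, q=1013):
    """Toy key generation with small fixed primes (insecure, illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n, message):
    """Client side: c = (n+1)^m * r^n mod n^2 for a random r coprime to n."""
    n_squared = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, message, n_squared) * pow(r, n, n_squared)) % n_squared


def decrypt(n, private_key, ciphertext):
    """Client side: recover m with the private key (lam, mu)."""
    lam, mu = private_key
    u = pow(ciphertext, lam, n * n)
    return ((u - 1) // n * mu) % n


def server_add(n, ciphertext_a, ciphertext_b):
    """Server side: multiplying ciphertexts adds the hidden plaintexts."""
    return (ciphertext_a * ciphertext_b) % (n * n)


# The client encrypts two values and sends only ciphertexts to the server.
public_n, private_key = generate_keypair()
encrypted_a = encrypt(public_n, 1200)
encrypted_b = encrypt(public_n, 345)

# The server computes on encrypted data without ever holding the private key.
encrypted_sum = server_add(public_n, encrypted_a, encrypted_b)

# The client decrypts the returned result.
result = decrypt(public_n, private_key, encrypted_sum)
```

The server in this sketch never sees 1200, 345 or their sum in the clear; only the holder of the private key can read the result.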
  • 45. Trusted execution environments
    Trusted Execution Environments (TEEs) provide secure computation capability through a combination of special-purpose hardware in modern processors and software built to use those hardware features.
    The special-purpose hardware provides a mechanism by which a process can run on a processor without its memory or execution state being visible to any other process on the processor,
    • not even the operating system or other privileged code.
    Computation in a TEE is not performed on data while it remains encrypted.
    • Typically, the memory space of each TEE (enclave) application is protected from access
    • and AES-encrypted when and if it is stored off-chip.
    Usability is low, and products/services are emerging in MS Azure, IBM’s cloud service, and Amazon AWS (late 2020)*
    *: Source: http://publications.officialstatistics.org
  • 49. Which of the following aspects of data privacy are you particularly concerned about? (Source: FTI Consulting - Corporate Data Privacy Today, 2020)
  • 50. Legal and regulatory risks are exploding (Source: TrustArc)
  • 52. Case Study
    A major international bank performed a consolidation of all European operational data sources to Italy
    Personally Identifiable Information (PII) in compliance with the EU Cross Border Data Protection Laws, specifically
    • Datenschutzgesetz 2000 (DSG 2000) in Austria, and
    • Bundesdatenschutzgesetz in Germany.
    This required access to Austrian and German customer data to be restricted to only requesters in each respective country.
    • Achieved targeted compliance with EU Cross Border Data Security laws
    • Implemented country-specific data access restrictions
    [Diagram label: Data sources]
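A country-specific access restriction of the kind described can be expressed as a small filter. This is only a sketch of the rule as stated on the slide: the country codes, field names and the choice to leave records from other countries unrestricted are assumptions, and the case study does not say how the bank actually implemented the policy.

```python
RESTRICTED_COUNTRIES = {"AT", "DE"}  # Austria and Germany, per the case study


def rows_visible_to(requester_country, rows):
    """Return the customer rows a requester in the given country may read."""
    visible = []
    for row in rows:
        owner_country = row["country"]
        # Restricted data is readable only from inside its own country.
        if owner_country in RESTRICTED_COUNTRIES and owner_country != requester_country:
            continue
        visible.append(row)
    return visible


# Made-up consolidated data set.
customers = [
    {"customer_id": 1, "country": "AT"},
    {"customer_id": 2, "country": "DE"},
    {"customer_id": 3, "country": "IT"},
]

seen_from_germany = rows_visible_to("DE", customers)
seen_from_italy = rows_visible_to("IT", customers)
```

In practice such a rule would sit in a policy enforcement point in front of the consolidated data store, so that every query passes through it.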
  • 53. Lower Risk and Higher Productivity with More Access to More Data
    [Chart: Risk and User Productivity (Low to High) plotted against Access to Data (Low to High); series: Low Risk Tokens, High Risk Clear Data; annotation: More Access to Data]
  • 54. Examples of data de-identification (Source: INTERNATIONAL STANDARD ISO/IEC 20889, Privitar, Anonos)
    Source data:
      Last name | Balance | Age | Gender
      Folds     | 93791   | 23  | m
      …         | …       | …   | …
    Output data, by technique:
      Technique           | Field   | Privacy Action (PA) | PA Config        | Variant Twin Output
      Pseudonymization    | Gender  | Pseudonymise        |                  | AD-lks75HF9aLKSa
      Aggregation/Binning | Age     | Integer Range Bin   | Step 10 + Pseud. | Age_KXYC
      Aggregation/Binning | Age     | Integer Range Bin   | Custom Steps     | 18-25
      Rounding            | Balance | Nearest Unit Value  | Thousand         | 94000
    Generalization
    Source data:
      Patient   | Age | Gender | Region  | Disease
      173965429 | 57  | Female | Hamburg | Gastric ulcer
    Output data:
      Patient   | Age | Gender | Region  | Disease
      173965429 | >50 | Female | Germany | Gastric ulcer
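The de-identification actions in these examples are simple enough to sketch as functions. The code below reproduces the slide's rounding and generalization values (93791 to 94000, age 57 to >50, Hamburg to Germany); the function names, the HMAC-based pseudonym, the demo key and the region lookup table are assumptions for illustration and do not reflect how the cited products work internally.

```python
import hashlib
import hmac


def pseudonymise(value, secret_key):
    """Keyed, repeatable pseudonym (assumed construction: truncated HMAC-SHA256)."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def bin_age(age, step=10):
    """Aggregation/binning: replace an exact age with a fixed-width range."""
    low = (age // step) * step
    return f"{low}-{low + step - 1}"


def round_to_unit(value, unit=1000):
    """Rounding: nearest multiple of unit, with halves rounded up."""
    return ((value + unit // 2) // unit) * unit


REGION_TO_COUNTRY = {"Hamburg": "Germany"}  # hypothetical lookup table


def generalise_patient(patient):
    """Generalization: coarsen age and region, keep the other fields."""
    result = dict(patient)
    result["Age"] = ">50" if patient["Age"] > 50 else "<=50"
    result["Region"] = REGION_TO_COUNTRY.get(patient["Region"], patient["Region"])
    return result


key = b"demo-key-not-for-production"
gender_pseudonym = pseudonymise("m", key)
age_bin = bin_age(23)
rounded_balance = round_to_unit(93791)
patient_out = generalise_patient(
    {"Patient": 173965429, "Age": 57, "Gender": "Female",
     "Region": "Hamburg", "Disease": "Gastric ulcer"})
```

Each function trades precision for privacy in a different way, which is why de-identification policies usually assign an action per field rather than one action per data set.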
  • 55. Format Preserving Encryption and Privacy Models
    [Diagram] Techniques shown: Differential Privacy (DP); Format Preserving Encryption (FPE); Homomorphic Encryption (HE); K-anonymity model; Tokenization; Masking; Hashing; Machine Learning (ML) and Secure Multi Party Computation (SMPC)
    Attributes shown: 2-way; 1-way; Algorithmic; Random; Noise added; Computing on encrypted data; Format Preserving
    Speed ratings shown: Fast; Slow; Very slow; Fast; Fast
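The "noise added" attribute of differential privacy can be illustrated with the Laplace mechanism applied to a counting query. This is a minimal sketch, assuming a count with sensitivity 1 and an illustrative epsilon; the function names and the salary list are made up, and a production system would use a vetted DP library rather than hand-rolled sampling.

```python
import random


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample: the difference of two exponential samples."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(values, predicate, epsilon, rng):
    """Counting query released with noise of scale sensitivity / epsilon."""
    true_count = sum(1 for value in values if predicate(value))
    sensitivity = 1.0  # adding or removing one person changes a count by at most 1
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Made-up salaries; the query asks how many exceed 60000.
salaries = [42000, 58000, 61000, 75000, 90000, 120000]
rng = random.Random(0)  # seeded only to make the demonstration repeatable
noisy_answer = private_count(salaries, lambda s: s > 60000, epsilon=1.0, rng=rng)
```

A smaller epsilon means more noise and stronger privacy; averaging many releases of the same query would cancel the noise, so real deployments also track a privacy budget across queries.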
  • 57. Data protection techniques: Deployment on-premises, and clouds
    Columns: Data Warehouse | Centralized | Distributed | On-premises | Public Cloud | Private Cloud
    Marked "y" in all six columns: Vault-less tokenization; Masking; Hashing; Server model; Local model; L-diversity; T-closeness
    Marked "y" in five columns: Format preserving encryption
    Marked "y" in two columns: Vault-based tokenization; Homomorphic encryption
    (Which columns the partially marked rows cover is not recoverable from the flattened table.)
    Privacy enhancing data de-identification terminology and classification of techniques
    • De-identification techniques: Tokenization; Cryptographic tools; Suppression techniques
    • Formal privacy measurement models: Differential Privacy; K-anonymity model
  • 58. ISO Privacy Standards
    11 Published International Privacy Standards
    Guidelines to help comply with ethical standards
    Legend: IS: International Standard; TR: Technical Report; TS: Technical Specification
    Category labels: Framework; Management; Techniques; Impact; Requirements; Process; Cloud
    • 20889 IS: Privacy enhancing de-identification terminology and classification of techniques
    • 27018 IS: Code of practice for protection of PII in public clouds acting as PII processors
    • 27701 IS: Security techniques - Extension to ISO/IEC 27001 and ISO/IEC 27002 for privacy information management - Requirements and guidelines
    • 29100 IS: Privacy framework
    • 29101 IS: Privacy architecture framework
    • 29134 IS: Guidelines for Privacy impact assessment
    • 29151 IS: Code of Practice for PII Protection
    • 29190 IS: Privacy capability assessment model
    • 29191 IS: Requirements for partially anonymous, partially unlinkable authentication
    • 19608 TS: Guidance for developing security and privacy functional requirements based on 15408
    • 27550 TR: Privacy engineering for system lifecycle processes
  • 59. References A:
    1. C. Gentry. “A Fully Homomorphic Encryption Scheme.” Stanford University. September 2009, https://crypto.stanford.edu/craig/craig-thesis.pdf
    2. Status Report on the Second Round of the NIST Post-Quantum Cryptography Standardization Process, https://csrc.nist.gov/publications/detail/nistir/8309/final
    3. ISO/IEC 29101:2013 (Information technology – Security techniques – Privacy architecture framework)
    4. ISO/IEC 19592-1:2016 (Information technology – Security techniques – Secret sharing – Part 1: General)
    5. ISO/IEC 19592-2:2017 (Information technology – Security techniques – Secret sharing – Part 2: Fundamental mechanisms)
    6. Homomorphic Encryption Standardization, Academic Consortium to Advance Secure Computation, https://homomorphicencryption.org/standards-meetings/
    7. Homomorphic Encryption Standardization, https://homomorphicencryption.org/
    8. NIST Post-Quantum Cryptography PQC, https://csrc.nist.gov/Projects/Post-Quantum-Cryptography
    9. UN Handbook on Privacy-Preserving Computation Techniques, http://publications.officialstatistics.org/handbooks/privacy-preserving-techniques-handbook/UN%20Handbook%20for%20Privacy-Preserving%20Techniques.pdf
    10. ISO/IEC 29101:2013 Information technology – Security techniques – Privacy architecture framework, https://www.iso.org/standard/45124.html
    11. Homomorphic encryption, https://brilliant.org/wiki/homomorphic-encryption/
    12. Survey on Secure Search Over Encrypted Data on the Cloud, https://arxiv.org/abs/1811.09767
  • 60. References B:
    1. California Consumer Privacy Act, OCT 4, 2019, https://www.csoonline.com/article/3182578/california-consumer-privacy-act-what-you-need-to-know-to-be-compliant.html
    2. CIS Controls V7.1 Mapping to NIST CSF, https://dataprivacylab.org/projects/identifiability/paper1.pdf
    3. GDPR and Tokenizing Data, https://tdwi.org/articles/2018/06/06/biz-all-gdpr-and-tokenizing-data-3.aspx
    4. GDPR VS CCPA, https://wirewheel.io/wp-content/uploads/2018/10/GDPR-vs-CCPA-Cheatsheet.pdf
    5. General Data Protection Regulation, https://en.wikipedia.org/wiki/General_Data_Protection_Regulation
    6. IBM Framework Helps Clients Prepare for the EU's General Data Protection Regulation, https://ibmsystemsmag.com/IBM-Z/03/2018/ibm-framework-gdpr
    7. INTERNATIONAL STANDARD ISO/IEC 20889, https://webstore.ansi.org/Standards/ISO/ISOIEC208892018?gclid=EAIaIQobChMIvI-k3sXd5gIVw56zCh0Y0QeeEAAYASAAEgLVKfD_BwE
    8. INTERNATIONAL STANDARD ISO/IEC 27018, https://webstore.ansi.org/Standards/ISO/ISOIEC270182019?gclid=EAIaIQobChMIleWM6MLd5gIVFKSzCh3k2AxKEAAYASAAEgKbHvD_BwE
    9. New Enterprise Application and Data Security Challenges and Solutions, https://www.brighttalk.com/webinar/new-enterprise-application-and-data-security-challenges-and-solutions/
    10. Machine Learning and AI in a Brave New Cloud World, https://www.brighttalk.com/webcast/14723/357660/machine-learning-and-ai-in-a-brave-new-cloud-world
    11. Emerging Data Privacy and Security for Cloud, https://www.brighttalk.com/webinar/emerging-data-privacy-and-security-for-cloud/
    12. New Application and Data Protection Strategies, https://www.brighttalk.com/webinar/new-application-and-data-protection-strategies-2/
    13. The Day When 3rd Party Security Providers Disappear into Cloud, https://www.brighttalk.com/webinar/the-day-when-3rd-party-security-providers-disappear-into-cloud/
    14. Advanced PII/PI Data Discovery, https://www.brighttalk.com/webinar/advanced-pii-pi-data-discovery/
    15. Emerging Application and Data Protection for Cloud, https://www.brighttalk.com/webinar/emerging-application-and-data-protection-for-cloud/
    16. Data Security: On Premise or in the Cloud, ISSA Journal, December 2019, ulf@ulfmattsson.com
    17. Webinars and slides, www.ulfmattsson.com