Timing,
data access types and
degree of anonymization
in microdata dissemination
…
Rajiv Ranjan
NISR/UNDP-Rwanda
Reflections on data confidentiality, privacy, and curation
Regional Workshop on Microdata Dissemination Policy
Kigali, Rwanda: 27 – 29 August 2014
Scheme of the presentation
Confidentiality concerns
Access issues
Legal basis
Assurance
Challenges
Harmony
Governance
Practices
Confidentiality
Caveat
Microdata dissemination must maintain
confidentiality of individual units: people,
households or enterprises.
Individual data collected by statistical agencies for statistical compilation, whether
they refer to natural or legal persons, are to be strictly confidential and used
exclusively for statistical purposes.
Principle 6
United Nations Fundamental Principles of Official Statistics
http://unstats.un.org/unsd/dnss/gp/fundprinciples.aspx
Legal basis in Rwanda
Source: Law on the organisation of statistical activities in Rwanda. Chapter VI: Statistical Confidentiality, Article
17: Prohibited dissemination of information (N° 45/2013 of 16/06/2013)
Data collected by the institutions of the national
statistical system through surveys or any other
method of collection are protected by statistical
confidentiality. Statistical confidentiality implies
that the dissemination of such data as well as
statistical information which can be calculated from
them, shall be conducted in a way that those who
provided it are not identified whether directly or
indirectly.
Access
Access benefits
• Fosters diversity of research
• Increases transparency and accountability
• Mitigates duplication of data collection work
• Increases the quality of data
https://unstats.un.org/unsd/accsub-public/microdata.pdf
Access assurance in Rwanda
The anonymous basic databases on
individuals and other institutions
shall be accessible to researchers
who, however, shall be committed to :
1° make a written note, that they shall not communicate to any person
the contents of such databases without the written authorization of
the National Institute of Statistics of Rwanda;
2° give to the National Institute of Statistics of Rwanda, the findings of
their research.
Source: Law on the organisation of statistical activities in Rwanda. Chapter VI: Statistical Confidentiality, Article
19: Accessibility to anonymous basic database not to be published (N° 45/2013 of 16/06/2013)
Challenges
Balancing act
Disclosure risks Information loss
• In practice, the more the disclosure risks are reduced, the
lower will be the expected utility of the microdata sets.
• The objective remains to deal with the trade-off between
disclosure risks and information loss.
Source: Chris Skinner: Statistical Disclosure Control for Survey Data: http://personal.lse.ac.uk/skinnecj/SDC%20for%20survey%20data%20S3RI.pdf
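The trade-off above can be illustrated with a minimal sketch. It assumes a toy dataset and two simple proxies that are not from the presentation: the share of records that are unique on their quasi-identifiers (for disclosure risk) and the number of distinct age values kept after recoding (for information retained). All records, field choices and function names are hypothetical.

```python
from collections import Counter

# Toy records (hypothetical): (age, district) act as quasi-identifiers.
RECORDS = [
    (23, "Gasabo"), (24, "Gasabo"), (27, "Gasabo"), (31, "Gasabo"),
    (35, "Kicukiro"), (36, "Kicukiro"), (38, "Kicukiro"), (52, "Kicukiro"),
]


def band(age, width):
    """Recode an exact age into the lower bound of a band of the given width."""
    return (age // width) * width


def unique_share(records, width):
    """Risk proxy: share of records that are unique on (age band, district)."""
    keys = [(band(age, width), district) for age, district in records]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(keys)


def distinct_ages_kept(records, width):
    """Utility proxy: number of distinct age values left after recoding."""
    return len({band(age, width) for age, _ in records})


def tradeoff_table(records, widths=(1, 10, 100)):
    """Return (width, unique share, distinct ages kept) for each band width."""
    return [(w, unique_share(records, w), distinct_ages_kept(records, w))
            for w in widths]


if __name__ == "__main__":
    for width, risk, kept in tradeoff_table(RECORDS):
        print(f"band width {width:>3}: unique share {risk:.2f}, "
              f"distinct ages kept {kept}")
```

On this toy data, wider age bands lower the share of unique records while leaving fewer distinct age values for analysis, which is the balancing act in miniature.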
Challenges
[Emerging mash-ups]
Datasets are being reused and combined with other datasets in ways never before thought possible, including for uses that go beyond the original intent.
[Growing motives]
While there are promising research efforts underway to protect privacy, far more advanced efforts are presently in use to re-identify seemingly “anonymous” data.
[Improved access]
Access to datasets has eased their discoverability, and data could be used to re-identify previously de-identified datasets.
http://www.whitehouse.gov/sites/default/files/docs/big_data_privacy_report_5.1.14_final_print.pdf
Complicating the challenges
Disclosure risks Information loss
Machine readability, open standards and free for reuse
Post 2015
Images: (1) From the cover of ‘Open Data Now’, a book by Joel Gurin exploring how open data within public records will create new jobs, applications and other technology innovations. http://www.opendatanow.com (2) A project at PARIS21 on the data revolution for the post-2015 SDGs. http://www.paris21.org/node/1654
Harmony
Coexistence
“There is nothing inherently contradictory about hiding one piece of information while revealing another, so long as the information we want to hide is different from the information we want to disclose.”
- Felix T. Wu in Defining Privacy and Utility in Data Sets.
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2031808
Though not easy, it is possible and desirable for openness and privacy to co-exist.
Decision factors
Disclosure risks Information loss
Sensitivity of the dataset
Usage intent
Enabling dimensions
Tools & Methods (1)
• sdcMicro
• sdcMicroGUI
• Anonymization: deterministic, probabilistic
Governance
• Legal basis
• Policy backing
• Institutionalization
Practices
• Asserting user types
• Controlling release timing
• Categorizing access methods
• Varying the degree of anonymization
(1) http://cran.r-project.org/web/packages/sdcMicro/vignettes/sdc_guidelines.pdf
Governance
Law: Law on the organisation of statistical activities in Rwanda (Feb 14, 2006)
Policy: Microdata Release Policy @ National Institute of Statistics of Rwanda
Institutionalization: Microdata Release Committee & Data curation team @ NISR
Practices
User types served
Govt. (Policy makers and researchers)
International development agencies
Research and academic institutions
Students and professors
Others (scientific researchers)
Release timing
Within 6 – 24 months after the 1st release of aggregated data from a survey/census
Examples (months, on a 24-month scale):
• DHS 2010: 7
• EICV(3) 2010-2011: 7
• Census 2012: ?
• Seasonal Agri Survey 2013: ?
EICV: Integrated Household Living Conditions Survey
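The timing rule can be checked with simple date arithmetic. This is a minimal sketch, assuming the rule is read as "no earlier than 6 and no later than 24 months after the first release of aggregated data". The function names and the example dates are hypothetical and are not taken from any actual survey or NISR tool.

```python
from datetime import date


def add_months(d, months):
    """Shift a date by whole months; the day is capped at 28 so the result is always valid."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, min(d.day, 28))


def release_window(first_aggregate_release, earliest=6, latest=24):
    """Return the (start, end) dates of the microdata release window."""
    return (add_months(first_aggregate_release, earliest),
            add_months(first_aggregate_release, latest))


def within_window(first_aggregate_release, microdata_release):
    """True if the microdata release date falls inside the window."""
    start, end = release_window(first_aggregate_release)
    return start <= microdata_release <= end


if __name__ == "__main__":
    # Hypothetical dates, for illustration only.
    first_release = date(2012, 1, 15)
    print(release_window(first_release))
    print(within_window(first_release, date(2012, 8, 15)))
```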
Access methods
Web-based distribution
Types of files/access
• Open access (no restriction)
• Direct access or Public Use Files (some restrictions on use, but no screening of users)
• Research Use Files (or Scientific Use Files, or Licensed Files)
• Availability only in an enclave
• No access authorized
• Data not available
• Data available from external repo
Counts shown in the chart: 16, 1, 3 and 4 (total no. of studies = 24)
Degree of anonymization
Removing or modifying the identifying variables contained in the microdata
The usual practice at NISR is to release microdata as Public Use Files.
For example, in EICV3, the methods applied for anonymizing data were:
• Suppressing/deleting the records of direct identifiers (e.g. name of the head of household) and a few indirect identifiers (e.g. sub-national admin boundaries)
• Generalizing/replacing (recoding) some indirect identifiers with less specific but semantically consistent groupings of observation values (e.g. place of birth, occupation)
• Perturbing/distorting some indirect identifiers by randomizing the values (e.g. clusters)
e.g.: Recoding (Occupation)
Variations in the degree of anonymization (and resulting access files/types) may be considered depending on the sensitivity of the dataset and the use.
Integrated Household Living Conditions Survey (EICV): EICV3 was done in 2010-2011
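The three methods named on this slide (suppressing, generalizing, perturbing) can be sketched on toy data. This is an illustration under stated assumptions, not the procedure NISR applied to EICV3: the records, field names and occupation groups are hypothetical, and the perturbation shown (a random relabelling of cluster codes) is only one possible reading of "randomizing the values".

```python
import random

# Hypothetical household records; field names and values are illustrative only.
HOUSEHOLDS = [
    {"head_name": "A. Example", "sector": "S1",
     "occupation": "Primary school teacher", "cluster": 101},
    {"head_name": "B. Example", "sector": "S2",
     "occupation": "Maize farmer", "cluster": 102},
    {"head_name": "C. Example", "sector": "S3",
     "occupation": "Nurse", "cluster": 103},
]

# Hypothetical recoding table: detailed occupation -> broader group.
OCCUPATION_GROUPS = {
    "Primary school teacher": "Education",
    "Maize farmer": "Agriculture",
    "Nurse": "Health",
}


def suppress(record, fields=("head_name", "sector")):
    """Suppression: drop direct identifiers and selected indirect identifiers."""
    return {k: v for k, v in record.items() if k not in fields}


def generalize(record):
    """Generalization: recode occupation into a less specific group."""
    out = dict(record)
    out["occupation"] = OCCUPATION_GROUPS.get(record["occupation"], "Other")
    return out


def perturb_clusters(records, seed=0):
    """Perturbation: replace cluster codes with a random relabelling.

    Records that shared a cluster still share one, but the released
    codes are drawn from 1..n in shuffled order instead of the originals.
    """
    rng = random.Random(seed)
    originals = sorted({r["cluster"] for r in records})
    new_ids = list(range(1, len(originals) + 1))
    rng.shuffle(new_ids)
    mapping = dict(zip(originals, new_ids))
    return [{**r, "cluster": mapping[r["cluster"]]} for r in records]


def anonymize(records, seed=0):
    """Apply suppression, generalization and perturbation in sequence."""
    recoded = [generalize(suppress(r)) for r in records]
    return perturb_clusters(recoded, seed=seed)


if __name__ == "__main__":
    for row in anonymize(HOUSEHOLDS):
        print(row)
```

Varying which fields are suppressed or how coarse the recoding table is gives one way to produce files with different degrees of anonymization from the same source data.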
@rajiv_r_in
…
Thank you!
“87% of the U.S.
population can be
uniquely identified
by date of birth +
gender + zip”
Latanya Sweeney, CMU
latanyasweeney.org

VIP Model Call Girls Baramati ( Pune ) Call ON 8005736733 Starting From 5K to...
 
Call Girls in Chandni Chowk (delhi) call me [9953056974] escort service 24X7
Call Girls in Chandni Chowk (delhi) call me [9953056974] escort service 24X7Call Girls in Chandni Chowk (delhi) call me [9953056974] escort service 24X7
Call Girls in Chandni Chowk (delhi) call me [9953056974] escort service 24X7
 
Call Girls in Sarita Vihar Delhi Just Call 👉👉9873777170 Independent Female ...
Call Girls in  Sarita Vihar Delhi Just Call 👉👉9873777170  Independent Female ...Call Girls in  Sarita Vihar Delhi Just Call 👉👉9873777170  Independent Female ...
Call Girls in Sarita Vihar Delhi Just Call 👉👉9873777170 Independent Female ...
 
celebrity 💋 Nagpur Escorts Just Dail 8250092165 service available anytime 24 ...
celebrity 💋 Nagpur Escorts Just Dail 8250092165 service available anytime 24 ...celebrity 💋 Nagpur Escorts Just Dail 8250092165 service available anytime 24 ...
celebrity 💋 Nagpur Escorts Just Dail 8250092165 service available anytime 24 ...
 
Call Girls In datia Escorts ☎️7427069034 🔝 💃 Enjoy 24/7 Escort Service Enjoy...
Call Girls In datia Escorts ☎️7427069034  🔝 💃 Enjoy 24/7 Escort Service Enjoy...Call Girls In datia Escorts ☎️7427069034  🔝 💃 Enjoy 24/7 Escort Service Enjoy...
Call Girls In datia Escorts ☎️7427069034 🔝 💃 Enjoy 24/7 Escort Service Enjoy...
 
Just Call Vip call girls Wardha Escorts ☎️8617370543 Starting From 5K to 25K ...
Just Call Vip call girls Wardha Escorts ☎️8617370543 Starting From 5K to 25K ...Just Call Vip call girls Wardha Escorts ☎️8617370543 Starting From 5K to 25K ...
Just Call Vip call girls Wardha Escorts ☎️8617370543 Starting From 5K to 25K ...
 
Pimpri Chinchwad ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi R...
Pimpri Chinchwad ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi R...Pimpri Chinchwad ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi R...
Pimpri Chinchwad ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi R...
 
AHMR volume 10 number 1 January-April 2024
AHMR volume 10 number 1 January-April 2024AHMR volume 10 number 1 January-April 2024
AHMR volume 10 number 1 January-April 2024
 
SMART BANGLADESH I PPTX I SLIDE IShovan Prita Paul.pptx
SMART BANGLADESH  I    PPTX   I    SLIDE   IShovan Prita Paul.pptxSMART BANGLADESH  I    PPTX   I    SLIDE   IShovan Prita Paul.pptx
SMART BANGLADESH I PPTX I SLIDE IShovan Prita Paul.pptx
 

Microdata anonymization considerations

  • 1. Timing, data access types and degree of anonymization in microdata dissemination … Rajiv Ranjan NISR/UNDP-Rwanda Reflections on data confidentiality, privacy, and curation Regional Workshop on Microdata Dissemination Policy Kigali, Rwanda: 27 – 29 August 2014
  • 2. Scheme of the presentation: Confidentiality concerns; Access issues; Legal basis; Assurance; Challenges; Harmony; Governance; Practices; Timing, data access types and degree of anonymization in microdata dissemination
  • 4. Caveat: Microdata dissemination must maintain confidentiality of individual units: people, households or enterprises. "Individual data collected by statistical agencies for statistical compilation, whether they refer to natural or legal persons, are to be strictly confidential and used exclusively for statistical purposes." (Principle 6, United Nations Fundamental Principles of Official Statistics, http://unstats.un.org/unsd/dnss/gp/fundprinciples.aspx)
  • 5. Legal basis in Rwanda: "Data collected by the institutions of the national statistical system through surveys or any other method of collection are protected by statistical confidentiality. Statistical confidentiality implies that the dissemination of such data as well as statistical information which can be calculated from them, shall be conducted in a way that those who provided it are not identified whether directly or indirectly." Source: Law on the organisation of statistical activities in Rwanda. Chapter VI: Statistical Confidentiality, Article 17: Prohibited dissemination of information (N° 45/2013 of 16/06/2013)
  • 7. Access benefits: fosters diversity of research; increases transparency and accountability; mitigates duplication of data collection work; increases the quality of data. Source: https://unstats.un.org/unsd/accsub-public/microdata.pdf
  • 8. Access assurance in Rwanda: "The anonymous basic databases on individuals and other institutions shall be accessible to researchers who, however, shall be committed to: 1° make a written note, that they shall not communicate to any person the contents of such databases without the written authorization of the National Institute of Statistics of Rwanda; 2° give to the National Institute of Statistics of Rwanda, the findings of their research." Source: Law on the organisation of statistical activities in Rwanda. Chapter VI: Statistical Confidentiality, Article 19: Accessibility to anonymous basic database not to be published (N° 45/2013 of 16/06/2013)
  • 10. Balancing act: disclosure risks vs. information loss. In practice, the more the disclosure risks are reduced, the lower the expected utility of the microdata sets will be. The objective remains to manage the trade-off between disclosure risks and information loss. Source: Chris Skinner, Statistical Disclosure Control for Survey Data: http://personal.lse.ac.uk/skinnecj/SDC%20for%20survey%20data%20S3RI.pdf
  • 11. Challenges: [Emerging mash-ups] Datasets are being reused and combined with other datasets in ways never before thought possible, including for uses that go beyond the original intent. [Growing motives] While there are promising research efforts underway to protect privacy, far more advanced efforts are presently in use to re-identify seemingly “anonymous” data. [Improved access] Access to datasets has eased their discoverability, and data could be used to re-identify previously de-identified datasets. Source: http://www.whitehouse.gov/sites/default/files/docs/big_data_privacy_report_5.1.14_final_print.pdf
  • 12. Complicating the challenges (disclosure risks vs. information loss): machine readability, open standards and free for reuse; Post 2015. Images: (1) From the cover of ‘Open Data Now’, a book by Joel Gurin exploring how open data within public records will create new jobs, applications and other technology innovations, http://www.opendatanow.com; (2) A project at PARIS21 on data revolution for post 2015 SDGs, http://www.paris21.org/node/1654
  • 14. Coexistence: “There is nothing inherently contradictory about hiding one piece of information while revealing another, so long as the information we want to hide is different from the information we want to disclose.” Felix T. Wu in Defining Privacy and Utility in Data Sets, http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2031808. Though not easy, it is possible and desirable for openness and privacy to co-exist.
  • 15. Decision factors: disclosure risks; information loss; sensitivity of the dataset; usage intent
  • 16. Enabling dimensions. Governance: legal basis; policy backing; institutionalization. Practices: asserting user types; controlling release timing; categorizing access methods; varying the degree of anonymization. Tools & Methods [1] (anonymization): sdcMicro; sdcMicroGUI; deterministic; probabilistic. [1]: http://cran.r-project.org/web/packages/sdcMicro/vignettes/sdc_guidelines.pdf
  • 18. Law on the organisation of statistical activities in Rwanda (Feb 14, 2006) Law
  • 22. User types served: Govt. (policy makers and researchers); international development agencies; research and academic institutions; students and professors; others (scientific researchers)
  • 23. Release timing: within 6 – 24 months after the 1st release of aggregated data from a survey/census. Examples: DHS 2010; EICV(3) 2010-2011; Census 2012; Seasonal Agri Survey 2013. Chart values as extracted (months): 7, 7, ?, ?, 24. EICV: Integrated Household Living Conditions Survey
  • 25. Types of files/access (total no. of studies = 24): Open access (no restriction); Direct access or Public Use Files (some restrictions on use, but no screening of users); Research Use Files (or Scientific Use Files, or Licensed Files); Availability only in an enclave; No access authorized; Data not available; Data available from external repo. Chart values as extracted: 16, 1, 3, 4
  • 26. Degree of anonymization: removing or modifying the identifying variables contained in the microdata. The usual practice at NISR is to release microdata as Public Use Files. For example, in EICV3 (Integrated Household Living Conditions Survey, done in 2010-2011), the methods applied for anonymizing data were: (a) suppressing/deleting the records of direct identifiers (e.g. name of the head of HH) and a few indirect identifiers (e.g. sub-national admin boundaries); (b) generalizing/replacing (recoding) some indirect identifiers with less specific but semantically consistent groupings of observation values (e.g. place of birth, occupation); (c) perturbing/distorting some indirect identifiers by randomizing the values (e.g. clusters). Variations in the degree of anonymization (and resulting access files/types) may be considered depending on the sensitivity of the dataset and the use.
  • 28. @rajiv_r_in … Thank you! “87% of the U.S. population can be uniquely identified by date of birth + gender + zip” Latanya Sweeney, CMU latanyasweeney.org
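
The three methods named on slide 26 (suppressing, generalizing, perturbing) and the uniqueness concern in the Sweeney quote on slide 28 can be illustrated with a minimal Python sketch. This is not the NISR procedure and does not use sdcMicro: the records, field names, occupation groupings, the 10-year age bands and the "share of unique records" measure are all assumptions made up for illustration, and relabelling cluster codes is only one possible reading of "randomizing the values".

```python
import random
from collections import Counter

# Hypothetical toy records: every field name and value below is invented
# for illustration and does not come from EICV3 or any NISR dataset.
RECORDS = [
    {"head_name": "Person A", "district": "District 1", "age": 34,
     "occupation": "primary teacher", "cluster": 101},
    {"head_name": "Person B", "district": "District 1", "age": 37,
     "occupation": "secondary teacher", "cluster": 101},
    {"head_name": "Person C", "district": "District 2", "age": 52,
     "occupation": "maize farmer", "cluster": 102},
    {"head_name": "Person D", "district": "District 2", "age": 58,
     "occupation": "coffee farmer", "cluster": 102},
    {"head_name": "Person E", "district": "District 3", "age": 31,
     "occupation": "primary teacher", "cluster": 103},
    {"head_name": "Person F", "district": "District 3", "age": 55,
     "occupation": "maize farmer", "cluster": 103},
]

# Assumed mapping from detailed occupations to broader groups.
OCCUPATION_GROUPS = {
    "primary teacher": "education",
    "secondary teacher": "education",
    "maize farmer": "agriculture",
    "coffee farmer": "agriculture",
}


def suppress(record, fields):
    """Suppression: drop the listed identifier fields from a record."""
    return {key: value for key, value in record.items() if key not in fields}


def generalize(record, age_width=10):
    """Generalization: recode age into bands and occupation into groups."""
    low = (record["age"] // age_width) * age_width
    return dict(record,
                age=f"{low}-{low + age_width - 1}",
                occupation=OCCUPATION_GROUPS[record["occupation"]])


def perturb_clusters(records, rng):
    """Perturbation (one possible reading): randomly relabel cluster codes."""
    codes = sorted({record["cluster"] for record in records})
    shuffled = codes[:]
    rng.shuffle(shuffled)
    relabel = dict(zip(codes, shuffled))
    return [dict(record, cluster=relabel[record["cluster"]]) for record in records]


def unique_share(records, keys):
    """Share of records whose combination of `keys` occurs exactly once."""
    combos = [tuple(record[key] for key in keys) for record in records]
    counts = Counter(combos)
    return sum(1 for combo in combos if counts[combo] == 1) / len(records)


def anonymize(records, rng):
    """Apply suppression, then generalization, then perturbation."""
    kept = [suppress(record, ("head_name", "district")) for record in records]
    recoded = [generalize(record) for record in kept]
    return perturb_clusters(recoded, rng)


released = anonymize(RECORDS, random.Random(0))
risk_before = unique_share(RECORDS, ("district", "age", "occupation"))
risk_after = unique_share(released, ("age", "occupation"))
print(f"unique on quasi-identifiers: before={risk_before:.2f}, after={risk_after:.2f}")
```

On this toy data every record is unique on its indirect identifiers before anonymization and none is afterwards, but exact ages, detailed occupations and districts are no longer available to the user. That is the trade-off between disclosure risks and information loss described on slide 10.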

Editor's notes

  1. We often use the terms "confidentiality" and "privacy" interchangeably in our everyday lives. However, they mean distinctly different things. While confidentiality relates to information/data about an individual, privacy relates to a person and is a right rooted in common law. Privacy protects access to the person, whereas confidentiality protects access to the data. In the context of statistics, ‘confidentiality’ is the researcher’s agreement with the participant about how the participant’s identifiable private information will be handled, managed, and disseminated. Hence, confidentiality is an ethical duty. [Situations vary. In some cases the duty is easy and in some cases it is not.] How this duty is performed, by controlling the factors of (1) timing of data release, (2) data access types and (3) degree of anonymization, is the topic of my presentation.
  2. I’ll keep two parallel tracks during my presentation: a generic track and a Rwanda-specific track. While talking about generic stuff, I’ll often be jumping off and on to Rwanda-specific examples to illustrate my points.
  3. Let's dig deeper into the subject.
  4. In most cases of statistical practice, the caveat is: microdata dissemination must maintain confidentiality of individual units: people, households or enterprises. This is driven by Principle 6 of the UN Fundamental Principles of Official Statistics. However, while in some cases the principle facilitates the caveat, in others strict confidentiality is invoked as a reason not to share any microdata.
  5. In Rwanda, there is a strong legal basis facilitating the caveat. The law also provides for ‘PENALTIES’ in case of breach of statistical confidentiality.
  6. Regarding Principle 6 of the UN Fundamental Principles of Official Statistics, if access becomes the casualty, then it is a loss. Therefore, the broadly accepted rationale is: though confidentiality should be upheld, access to data should not be jeopardised. See some benefits:
  7. Access rationale is broadly accepted.
  8. In Rwanda, statistical law provides for the ‘assurance of access’.
  9. It is obvious that seemingly conflicting ideas may pose some challenges if applied simultaneously. It is, therefore, a balancing act.
  10. There is a constant struggle to minimize both.
  11. What has added to the misery?
  12. What recourse do we have? Is it possible to have harmony?
  13. Though not easy, it is possible and desirable for openness and privacy to co-exist.
  14. What are the decision factors?
  15. What helps?
  16. Takes away the pressure for microdata to appeal to ‘all’ / ‘normal’ users.
  17. At NISR there is only one dataset which has Licensed data files: the General Census of Population and Housing 2002. It is because the entire dataset is made available (though anonymized). The current Census, for which only 5% of the data will be released (after anonymization), will be released as Public Use Files.
  18. The challenge is quite big here (read in the context of Big Data). We are learning. And though simple means are currently in use, we intend to move towards more complex arrangements where the ‘balancing act’ is better optimized.