Sidra ali
Collecting a Citizen’s Digital Footprint for Health Data Mining
Oguzhan Gencoglu, Heidi Similä, Harri Honko, Minna Isomursu
Abstract
 This paper describes a case study for collecting digital
footprint data for the purpose of health data mining.
 The case study involved 20 subjects residing in Finland who
were instructed to collect data from registries which they
evaluated to be useful for understanding their health or
health behavior, current or past.
 11 subjects were active, sending 100 data requests to 49
distinct organizations in total.
 Our results indicate that there are still practical challenges in
collecting actionable digital footprint data.
 Out of the received data, 44 datasets (72.1%) were
delivered in paper format,
 4 (6.6%) in portable document format (PDF), and
 13 (21.3%) in structured digital form.
 The average time between sending an
information request and receiving a reply was
26.4 days.
Introduction
 A digital footprint, or digital shadow, refers to one's unique set of
traceable digital activities, actions, contributions and communications that are manifested on
the Internet or on digital devices.
 There are two main classifications for digital footprints:
 Passive digital footprints. A passive digital footprint is created when data is
collected without the owner knowing; it can be stored in many ways depending on the situation. In an
online environment, a footprint may be stored in an online database as a "hit". This footprint may record
the user's IP address, when it was created, and where the user came from, and it may be analyzed
later. In an offline environment, a footprint may be stored in files, which administrators can access
to view the actions performed on the machine, without being able to see who
performed them.
 Active digital footprints. Active digital footprints are created when a user deliberately releases
personal data in order to share information about themselves through
websites or social media.
 Digital footprints can tell a lot about the behavior, characteristics and preferences
of an individual [2] [3] [4] [5] [6], provided they are accessible in digitally digestible,
machine-readable form.
 Increasingly, data sets, open or closed, are being made available over an
application programming interface (API). Where accessible, a person’s digital
footprint is used today, for example, for personalized recommendation services,
person-, income- and even location-context [7].
 It has been proposed that digital footprint data, when properly gathered
and analyzed with modern data analytics, could provide significant opportunities
for new, more personalized and timely health services.
 Aggregated and analyzed data can help individuals themselves learn about
their health condition [10] [11].
 Better access to electronic health records can help communication
between carers, health professionals and other service providers [12].
 This can create opportunities for entirely new kinds of health and
wellbeing services, which create new business opportunities for
companies and help increase the efficiency of health interventions
through targeted care.
 In this paper, we examine the state of the practice of collecting a 2010s
citizen’s personal digital footprint for the purpose of health data mining.
 Our research question is "Can the digital footprint of an individual be collected successfully
today for health data mining?"
 For the purpose of the study, we recruited individuals to send information requests to
organizations of their own choice. They tried to maximize the number of responses.
 Our results summarize how successful our case subjects were in collecting their digital
footprint data:
 Did the organizations provide them access to their personal footprint data?
 In what format was the data presented to them?
 What procedures, roughly, would be needed to make
that data actionable, so that it could be used for
computerized health data mining by anyone attempting to
refine and analyze the data to provide insights and
health-related value?
 Our discussion summarizes our experience and suggests
further work on how such data can be examined to reveal
health behavior patterns.
METHODOLOGY
 A total of 20 volunteer participants were recruited among active researchers for this study.
 The participants were instructed to print, sign and mail the information request with
the covering letter to 5-10 target organizations of their own choice.
 A preliminary list of candidate sources for digital footprint information was collected
to serve as an example for the participants, although they were instructed to decide
themselves which data sources could be valuable for health data analytics.
 In order to follow the process, the participants kept a record of dates when the
information requests were sent, when the replies were received and in which format.
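The record each participant kept could be represented roughly as follows. This is only a sketch: the class name, the field names and the two sample entries are hypothetical and are not taken from the study, which states only that sending dates, reply dates and reply formats were logged.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class RequestRecord:
    """One information request as logged by a participant (hypothetical schema)."""
    organization: str
    category: str                        # e.g. "healthcare", "banking"
    sent: date                           # date the request was mailed
    replied: Optional[date] = None       # None while no reply has arrived
    reply_format: Optional[str] = None   # "paper", "pdf", "structured", or None

    def reply_days(self) -> Optional[int]:
        """Days between sending the request and receiving a reply, if any."""
        if self.replied is None:
            return None
        return (self.replied - self.sent).days


# Illustrative entries only; these are not data from the study.
log = [
    RequestRecord("Example Clinic", "healthcare", date(2014, 11, 10),
                  date(2014, 12, 5), "paper"),
    RequestRecord("Example Bank", "banking", date(2014, 11, 12)),
]

for record in log:
    print(record.organization, record.reply_days())
```

Keeping the reply date optional lets the same log cover requests that were never answered.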
 The participants asked for the data to be delivered to their home address or email.
 The information request form stated that delivery
via an API, a memory stick or a DVD was preferred over printed paper documents.
 After receiving the data, the participants were instructed to go through it
and decide which representative set of each individual register's data they were
willing to donate to the research program.
 Sensitive personal information was removed or edited when needed. Each
participant signed an informed consent form when handing over the data.
RESULTS AND DISCUSSION
 The number of voluntary participants, all residing in Finland, was 20 (18 natives, 2 foreigners) for the
study.
 11 (55.0%) individuals were active during a period of five months (11/2014-03/2015), sending 100
information requests (9.09 per person) to 49 distinct data sources (2.04 requests per registry) in total.
 With respect to their content, these data sources were classified by researchers into 15 categories, i.e.,
banking, education, energy, fitness, groceries, healthcare, housing, insurance, library, mobility,
municipality, police, retail, telecommunication and web.
 The average number of distinct data sources and of sent requests per category is 3.27 and
6.67, respectively.
 Both the maximum number of distinct data sources and the maximum number of sent requests belong to
the healthcare category, with 30 requests sent to 13 data sources.
 For each category, a detailed summary of number of data sources, number of sent requests, number
of received replies and number of replies resulting in an access to data can be seen from Table I.
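The averages quoted on this slide follow directly from the four reported counts. The short sketch below recomputes them; the variable names are illustrative, and only the counts themselves come from the slide.

```python
# Counts reported on the slide.
active_participants = 11
requests_sent = 100
data_sources = 49
categories = 15

requests_per_person = requests_sent / active_participants
requests_per_source = requests_sent / data_sources
sources_per_category = data_sources / categories
requests_per_category = requests_sent / categories

print(f"{requests_per_person:.2f} requests per active participant")
print(f"{requests_per_source:.2f} requests per data source")
print(f"{sources_per_category:.2f} data sources per category")
print(f"{requests_per_category:.2f} requests per category")
```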
 The overall response rate and the data response rate of the
study were 75.0% and 61.0%, respectively.
 As the main purpose of a digital footprint collection
process is eventually to perform data analysis on
each individual’s data,
the amount of collected data has a great effect on
analysis performance.
 The format of the collected data is also crucial for the analysis to be conducted properly.
 Even though more than half of the data sources provided some data to the individuals, in most
cases the format of the returned data is not analysis-friendly, and often not even digitized.
 The format of the delivered data can be categorized into three groups: paper format (hard
copy), portable document format (PDF), and spreadsheet/structured format, which includes
formats such as comma-separated values (CSV), Microsoft Excel file formats (XLS/XLSX) and
JavaScript Object Notation (JSON).
 The listed order is from least to most analysis-friendly. A detailed view of the format of the
collected data for different categories can be seen in Table II.
 Hard copy, i.e., paper format, accounts for the majority of the collected data, at 72.1%. Only
21.3% of the collected data can be considered structured. None of the data sources offered an API
for such a data ingestion process.
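The three-group categorization and the reported shares can be sketched as below. The function names, the use of file extensions as the classification rule, and the convention that a missing filename means a hard copy are all assumptions made for illustration; only the group definitions and the counts 44, 4 and 13 come from the slides.

```python
from pathlib import Path
from typing import Optional

# Extensions treated as spreadsheet/structured, following the formats named above.
STRUCTURED_SUFFIXES = {".csv", ".xls", ".xlsx", ".json"}


def classify_format(filename: Optional[str]) -> str:
    """Map a delivered item to one of the three format groups.

    Assumption: None stands for a hard copy that has no file at all.
    """
    if filename is None:
        return "paper"
    suffix = Path(filename).suffix.lower()
    if suffix == ".pdf":
        return "pdf"
    if suffix in STRUCTURED_SUFFIXES:
        return "structured"
    return "unknown"


def format_shares(counts: dict) -> dict:
    """Percentage share of each format group, rounded to one decimal."""
    total = sum(counts.values())
    return {fmt: round(100 * n / total, 1) for fmt, n in counts.items()}


# Dataset counts per format as reported in the abstract.
reported_counts = {"paper": 44, "pdf": 4, "structured": 13}
shares = format_shares(reported_counts)
print(shares)
```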
 When the process of transforming non-analysis-friendly data into analysis-friendly
form is considered, the drawbacks become more obvious.
 Data delivered in paper format first has to be printed and mailed, which
comes at a cost.
 As an individual can easily own hundreds of pages of data residing in several
data sources, logistics, security and storage problems arise.
 Then, the data has to be digitized by the recipient, for example by scanning.
Such a process is not only burdensome but also error-prone.
 After digitization, the data is in the form of PDFs or digital images, which have to be
fed into an optical character recognition (OCR) algorithm.
 As paper-form data is likely to contain artifacts (lines, logos, bright/dark spots due to
scanning, irrelevant text, folded or torn parts) that act as noise to the OCR system, the
likelihood of error increases.
 Furthermore, the OCR system has to be tuned specifically for the structure of the text on paper;
thus, parsing the relevant information becomes even more demanding.
 In addition, as there is no guarantee that a data source will deliver its paper data in the
same format in the future, such tasks are discouraged with respect to the reproducible research
paradigm.
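To illustrate why layout-specific parsing is fragile, the sketch below handles only the step after OCR: turning recognized text into records. It does not perform OCR itself. The line layout (date, description, amount), the sample text and all names are hypothetical; a real registry printout would need its own pattern, and that pattern would break if the registry changed its layout.

```python
import re
from typing import List, NamedTuple


class Entry(NamedTuple):
    date: str
    description: str
    amount: float


# Pattern tuned to one hypothetical layout: "YYYY-MM-DD  description  12,50".
LINE = re.compile(r"^(\d{4}-\d{2}-\d{2})\s+(.+?)\s+(\d+[.,]\d{2})$")


def parse_ocr_text(text: str) -> List[Entry]:
    """Keep only lines matching the expected layout; drop everything else."""
    entries = []
    for raw in text.splitlines():
        match = LINE.match(raw.strip())
        if match is None:
            continue  # page headers, logo text, scanning noise
        day, description, amount = match.groups()
        entries.append(Entry(day, description, float(amount.replace(",", "."))))
    return entries


# Made-up OCR output containing a page header and a noise line.
sample = """\
EXAMPLE REGISTRY     Page 1/3
2014-11-03  Laboratory visit   12,50
~~ . ' ..
2014-12-01  Pharmacy purchase  8,90
"""

entries = parse_ocr_text(sample)
for entry in entries:
    print(entry)
```

Lines that do not match are silently dropped here, which is exactly how recognition errors can turn into missing data.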
 Another interesting aspect of the data collection process is the responsiveness
of the data sources, i.e., how quickly each registry replies to requests.
 56 of the requests have both sending and reply dates recorded.
 On average, a reply (providing data or not) took 26.4 days to arrive.
 Average reply times for different categories can be seen in Table III.
 The average durations for data registries with a small number of recorded times
are given for the sake of completeness rather than to support conclusions.
 The average reply time for requests resulting in data reception was 29.6 days, while
replies that did not provide data arrived in 14.8 days on average.
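Averages of this kind can be computed from logged dates along the following lines. The records are invented for illustration and do not reproduce the study's figures; the tuple layout and the function name are likewise assumptions. Requests without a recorded reply date are skipped, mirroring the restriction to requests with both dates recorded.

```python
from datetime import date
from statistics import mean

# (sent, replied, data_received); illustrative values, not study data.
records = [
    (date(2014, 11, 3), date(2014, 12, 8), True),     # 35 days
    (date(2014, 11, 10), date(2014, 11, 24), False),  # 14 days
    (date(2014, 12, 1), date(2014, 12, 25), True),    # 24 days
    (date(2015, 1, 12), None, False),                 # no reply date recorded
]


def average_reply_days(records, with_data=None):
    """Mean reply time in days; optionally restricted by whether data was received."""
    days = [
        (replied - sent).days
        for sent, replied, got_data in records
        if replied is not None and (with_data is None or got_data == with_data)
    ]
    return mean(days) if days else None


print("all replies:  ", average_reply_days(records))
print("with data:    ", average_reply_days(records, with_data=True))
print("without data: ", average_reply_days(records, with_data=False))
```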
CONCLUSION
 One’s behavior is reflected in one’s actions, and those actions are recorded in great amounts in
today’s world as a digital footprint.
 As advancing data mining algorithms enable efficient harmonization of multi-modal data to
perform inferential, predictive and even causal analysis of people’s behavior, these digital
footprints are of considerable value for health data mining purposes.
 An expected rise in the demand for personal data from various data registries is likely to change
the current state of the information retrieval process presented in this paper.
 Our results show that utilizing digital footprints in services currently faces practical challenges.
Companies and institutions in control of individuals' data are not responsive or attentive
to the emerging value of digital footprints.
 Even in the Finnish context, where individuals have a legal right to access their personal data,
many organizations ignored the request or refused access to the data.
 Very few provided data in a format that could be easily digested by digital tools.
 Providing high-quality data to cutting-edge data mining and machine
learning systems is essential for high-performance predictive analysis, health
behavioral modeling and personalized services.
 To achieve this goal, controlled and secure data access via service web
portals or, even better, through machine-readable APIs is needed.
 Our work continues with exploration of the collected datasets in terms of validity,
suitability and information value for health data mining, leading to in-depth
analysis of how the digital footprint can be used in health services.
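As a rough illustration of what machine-readable delivery would make possible, the sketch below loads a JSON export into flat rows with no scanning or OCR step. The payload, its field names and the registry name are entirely hypothetical; none of the organizations in the study offered such an API.

```python
import json

# Hypothetical structured export; not a real registry response.
payload = """
{"registry": "example-pharmacy",
 "records": [
   {"date": "2015-01-14", "item": "vitamin D", "quantity": 1},
   {"date": "2015-02-02", "item": "ibuprofen", "quantity": 2}
 ]}
"""


def load_records(raw: str) -> list:
    """Flatten one export into rows tagged with the registry they came from."""
    document = json.loads(raw)
    return [
        {"registry": document["registry"], **record}
        for record in document["records"]
    ]


rows = load_records(payload)
for row in rows:
    print(row)
```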
REFERENCES
 [1] A. Sellen, Y. Rogers, R. Harper, and T. Rodden, “Reflecting human values in the
digital age,” Communications of the ACM, vol. 52, no. 3, pp. 58–66, 2009.
 [2] “World economic forum - rethinking personal data: Strengthening trust,”
2012.
 [3] D. Zhang, B. Guo, B. Li, and Z. Yu, “Extracting social and community
intelligence from digital footprints: an emerging research area,” in Ubiquitous
Intelligence and Computing. Springer, 2010, pp. 4–18.
 [4] C. Moiso and R. Minerva, “Towards a user-centric personal data ecosystem:
the role of the bank of individuals’ data,” in Intelligence in Next Generation
Networks (ICIN), 2012 16th International Conference on. IEEE, 2012, pp. 202–209.
 [5] A. Malhotra, L. Totti, W. Meira Jr, P. Kumaraguru, and V. Almeida, “Studying
user footprints in different online social networks,” in Proceedings of the 2012
International Conference on Advances in Social Networks Analysis and Mining
(ASONAM 2012). IEEE Computer Society, 2012, pp. 1065–1070.
 [6] N. Eagle and A. Pentland, “Reality mining: sensing complex social systems,”
Personal and Ubiquitous Computing, vol. 10, no. 4, pp. 255–268, 2006.
 [7] M. Venkataramanan, “My identity for sale,”
http://www.wired.co.uk/magazine/archive/2014/11/features/my-identity-for-sale/viewall, accessed: 2015-27-03.
 [8] “Mac basics: Notifications keep you informed,”
https://support.apple.com/en-lb/HT204079, accessed: 2015-27-03.
 [9] “Google now,” https://www.google.com/landing/now/, accessed: 2015-27-03.
 [10] J. H. Frost and M. P. Massagli, “Social uses of personal health information
within PatientsLikeMe, an online patient community: what can happen when patients
have access to one another's data,” Journal of Medical Internet Research, vol. 10, no.
3, 2008.
 [11] S. Kumar, W. Nilsen, M. Pavel, and M. Srivastava, “Mobile health: Revolutionizing
healthcare through transdisciplinary research,” Computer, no. 1, pp. 28–35, 2013.
 [12] C. Pagliari, D. Detmer, and P. Singleton, “Potential of electronic personal health
records,” BMJ: British Medical Journal, vol. 335, no. 7615, p. 330, 2007.
 [13] “Finnish legislation - personal data act, 523/199,” translation completed: 2001-31-
03.

 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 

Dernier (20)

Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 

Health data mining

  • 2. Collecting a Citizen’s Digital Footprint for Health Data Mining Oguzhan Gencoglu, Heidi Simil, Harri Honko, Minna Isomursu
  • 3. Abstract  This paper describes a case study for collecting digital footprint data for the purpose of health data mining.  The case study involved 20 subjects residing in Finland who were instructed to collect data from registries which they evaluated to be useful for understanding their health or health behavior, current or past.  11 subjects were active, sending 100 data requests to 49 distinct organizations in total.  Our results indicate that there are still practical challenges in collecting actionable digital footprint data.
  • 4. Abstract  Of the received data, 44 datasets (72.1%) were delivered in paper format, 4 (6.6%) in portable document format (PDF) and 13 (21.3%) in structured digital form.  The time between sending an information request and receiving a reply was 26.4 days on average.
  • 5. Introduction  A digital footprint, or digital shadow, refers to one’s unique set of traceable digital activities, actions, contributions and communications that are manifested on the Internet or on digital devices.  There are two main classifications for digital footprints:  Passive digital footprints: a passive digital footprint is created when data is collected without the owner knowing, and it can be stored in many ways depending on the situation. In an online environment, a footprint may be stored in an online database as a "hit". This footprint may track the user’s IP address, when it was created and where the user came from, with the footprint later being analyzed. In an offline environment, a footprint may be stored in files, which administrators can access to view the actions performed on the machine without being able to see who performed them.  Active digital footprints: an active digital footprint is created when personal data is released deliberately by a user for the purpose of sharing information about oneself by means of websites or social media.
  • 6. Introduction  Digital footprints can tell a lot about the behavior, characteristics and preferences of an individual [2] [3] [4] [5] [6], provided they are accessible in digitally digestible, machine-readable form.  Increasingly, data sets, open or closed, are being made available over an application programming interface (API). Where accessible, a person’s digital footprint is used today, for example, for personalized recommendation services based on person, income and even location context [7].  It has been proposed that digital footprint data, when properly gathered and analyzed with modern data analytics, could provide significant opportunities for new, more personalized and timely health services.  Aggregated and analyzed data can help individuals themselves learn about their health condition [10] [11].
  • 7. Introduction  Better access to electronic health records can help communication between carers, health professionals and other service providers [12].  This can create opportunities for entirely new kinds of health and wellbeing services, which create new business opportunities for companies and help increase the efficiency of health interventions through targeted care.  In this paper, we examine the state of the practice of collecting a 2010s citizen’s personal digital footprint for the purpose of health data mining.
  • 8. Introduction  Our research question is “Can the digital footprint of an individual be collected successfully today for health data mining?”  For the purpose of the study, we recruited individuals to send information requests to organizations of their own choice, aiming to maximize the number of responses.  Our results summarize how successful our case subjects were in collecting their digital footprint data:  Did the organizations provide them access to their personal footprint data?  In what format was the data presented to them?
  • 9. Introduction  What procedures, roughly, would be needed to make that data actionable, so that anyone attempting to refine and analyze it could use it for computerized health data mining to provide insights and health-related value?  Our discussion summarizes our experience and suggests further work on how such data can be examined to reveal health behavior patterns.
  • 10. METHODOLOGY  A total of 20 volunteer participants were recruited from among active researchers for this study.  The participants were instructed to print, sign and mail the information request with the covering letter to 5-10 target organizations of their own choice.  A preliminary list of candidate sources of digital footprint information was collected to serve as an example for the participants, although they were instructed to decide for themselves which data sources could be valuable for health data analytics.  To follow the process, the participants kept a record of the dates when the information requests were sent, when the replies were received and in which format.
  • 11. METHODOLOGY  Organizations were asked to deliver the data to each participant’s home address or email.  The information request form stated a preference for delivery via an API, a memory stick or a DVD rather than as printed paper documents.  After receiving the data, the participants were instructed to go through it and decide which representative set of each register’s data they were willing to donate to the research program.  Sensitive personal information was removed or edited when needed. Each participant signed an informed consent form when handing over the data.
  • 12. RESULTS AND DISCUSSION  The study had 20 voluntary participants, all residing in Finland (18 natives, 2 foreigners).  11 individuals (55.0%) were active during a period of five months (11/2014-03/2015), sending 100 information requests (9.09 per person) to 49 distinct data sources (2.04 per registry) in total.  With respect to their content, the researchers classified these data sources into 15 categories: banking, education, energy, fitness, groceries, healthcare, housing, insurance, library, mobility, municipality, police, retail, telecommunication and web.  The average number of distinct data sources and of sent requests per category is 3.27 and 6.67, respectively.  The healthcare category has both the largest number of distinct data sources and the largest number of sent requests, with 30 requests to 13 data sources.  For each category, Table I gives a detailed summary of the number of data sources, sent requests, received replies and replies resulting in access to data.
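The per-person, per-registry and per-category averages on this slide follow directly from the reported totals. A minimal sketch of that arithmetic in Python; the variable names are illustrative and do not come from the paper:

```python
# Totals as reported on the slide.
participants = 20
active_participants = 11
requests_sent = 100
data_sources = 49
categories = 15

# Derived figures quoted alongside the totals.
derived = {
    "active share (%)": 100 * active_participants / participants,
    "requests per active participant": requests_sent / active_participants,
    "requests per data source": requests_sent / data_sources,
    "data sources per category": data_sources / categories,
    "requests per category": requests_sent / categories,
}

for label, value in derived.items():
    print(f"{label}: {value:.2f}")
```

Rounded to two decimals, these reproduce the 55.0%, 9.09, 2.04, 3.27 and 6.67 quoted above.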
  • 13.
  • 14. RESULTS AND DISCUSSION  The overall response rate and the data response rate of the study were 75.0% and 61.0%, respectively.  As the main purpose of a digital footprint collection process is ultimately to perform data analysis on each individual’s data, the amount of collected data has a great effect on the analysis performance.
  • 15. RESULTS AND DISCUSSION  The format of the collected data is also crucial for the analysis to be conducted properly.  Even though more than half of the data sources provided some data to the individuals, in most cases the returned data was not in an analysis-friendly format, and often not even digitized.  The format of the delivered data can be categorized into three groups: paper format (hard copy), portable document format (PDF) and spreadsheet/structured format, which includes comma-separated values (CSV), Microsoft Excel file formats (XLS/XLSX) and JavaScript Object Notation (JSON).  The groups are listed from least to most analysis-friendly. A detailed view of the format of the collected data for the different categories can be seen in Table II.  Hard copy, i.e., paper format, accounts for the majority of the collected data at 72.1%. Only 21.3% of the collected data can be considered structured. None of the data sources offered an API for this kind of data ingestion.
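The format shares can be recomputed from the dataset counts given in the abstract (44 paper, 4 PDF, 13 structured). A short Python sketch of that calculation; the dictionary keys are illustrative labels, not identifiers from the study:

```python
# Dataset counts by delivery format, as given in the abstract,
# ordered from least to most analysis-friendly.
format_counts = {"paper": 44, "pdf": 4, "structured": 13}

total_datasets = sum(format_counts.values())

# Share of each format as a percentage of all received datasets.
format_shares = {
    fmt: 100 * count / total_datasets for fmt, count in format_counts.items()
}

print(f"total datasets: {total_datasets}")
for fmt, share in format_shares.items():
    print(f"{fmt}: {share:.1f}%")
```

The counts sum to 61 datasets, and the shares round to 72.1%, 6.6% and 21.3%.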
  • 16.
  • 17. RESULTS AND DISCUSSION  When the process of transforming non-analysis-friendly data into analysis-friendly form is considered, the drawbacks become more obvious.  First of all, data delivered in paper format has to be printed and mailed, which comes at a cost.  As an individual can easily own hundreds of pages of data residing in several data sources, logistics, security and storage problems arise.  Then, the data has to be digitized by the recipient, for example by scanning. Such a process is not only burdensome but also error-prone.  After digitization, the data is in the form of PDFs or digital images, which have to be fed into an optical character recognition (OCR) algorithm.
  • 18. RESULTS AND DISCUSSION  As paper-form data is likely to contain artifacts (lines, logos, bright or dark spots due to scanning, irrelevant text, folded or torn parts) that act as noise to the OCR system, the likelihood of error increases.  Furthermore, the OCR system has to be tuned specifically for the structure of the text on the paper; thus, parsing the relevant information becomes even more demanding.  In addition, as there is no guarantee that a data source will deliver its paper data in the same format in the future, such tasks are discouraged with respect to the reproducible research paradigm.
  • 19.
  • 20. RESULTS AND DISCUSSION  Another interesting aspect of the data collection process is the responsiveness of the data sources, i.e., how quickly each registry replies to requests.  56 of the requests have both sending and reply dates recorded.  On average, a reply (providing data or not) took 26.4 days to arrive.  Average reply times for the different categories can be seen in Table III.  The average durations for data registries with a small number of recorded times are given for the sake of completeness rather than to support conclusions.  The average reply time for requests resulting in data reception was 29.6 days, while replies that did not provide data arrived in 14.8 days on average.
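Averages of this kind can be derived from the participants' records of sending and reply dates. The Python sketch below shows one way to do so; the log entries are hypothetical examples invented for illustration, not data from the study, and the tuple layout is an assumption:

```python
from datetime import date
from statistics import mean

# HYPOTHETICAL log entries: (date sent, date replied or None, data received?).
# These are illustrative values only, not the study's records.
request_log = [
    (date(2014, 11, 3), date(2014, 12, 1), True),
    (date(2014, 11, 10), date(2014, 11, 24), False),
    (date(2015, 1, 5), date(2015, 2, 6), True),
    (date(2015, 1, 12), None, False),  # no reply date recorded, so excluded
]

# Only requests with both dates recorded contribute to the averages.
dated = [entry for entry in request_log if entry[1] is not None]

def mean_reply_days(entries):
    """Average number of days between sending a request and its reply."""
    return mean((replied - sent).days for sent, replied, _ in entries)

mean_all = mean_reply_days(dated)
mean_with_data = mean_reply_days([e for e in dated if e[2]])
mean_without_data = mean_reply_days([e for e in dated if not e[2]])

print(f"all replies: {mean_all:.1f} days")
print(f"replies with data: {mean_with_data:.1f} days")
print(f"replies without data: {mean_without_data:.1f} days")
```

Excluding requests without a recorded reply date mirrors the slide's restriction to the 56 requests that have both dates.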
  • 21.
  • 22. CONCLUSION  One’s behavior is reflected in one’s actions, and those actions are recorded in great amounts in today’s world as a digital footprint.  As advancing data mining algorithms enable efficient harmonization of multi-modal data to perform inferential, predictive and even causal analysis of people’s behavior, these digital footprints are of considerable value for health data mining purposes.  An expected rise in demand for personal data from various data registries is likely to change the current state of the information retrieval process presented in this paper.  Our results show that the utilization of digital footprints in services currently faces practical challenges. Companies and institutions in control of individuals’ data are not responsive and attentive to the emerging value of the digital footprint.  Even in the Finnish context, where individuals have a legal right to access their personal data, many organizations ignored the request or refused access to the data.  Very few provided data in a format that could be easily digested by digital tools.
  • 23. CONCLUSION  Providing high-quality data to cutting-edge data mining and machine learning systems is essential for high-performance predictive analysis, health behavioral modeling and personalized services.  To achieve this goal, controlled and secure data access via service web portals or, even better, through machine-readable APIs is needed.  Our work continues with exploration of the collected datasets in terms of validity, suitability and information value for health data mining, leading to an in-depth analysis of how the digital footprint can be used in health services.
  • 24. REFERENCES  [1] A. Sellen, Y. Rogers, R. Harper, and T. Rodden, “Reflecting human values in the digital age,” Communications of the ACM, vol. 52, no. 3, pp. 58–66, 2009.  [2] “World economic forum - rethinking personal data: Strengthening trust,” 2012.  [3] D. Zhang, B. Guo, B. Li, and Z. Yu, “Extracting social and community intelligence from digital footprints: an emerging research area,” in Ubiquitous Intelligence and Computing. Springer, 2010, pp. 4–18.
  • 25. REFERENCES  [4] C. Moiso and R. Minerva, “Towards a user-centric personal data ecosystem: the role of the bank of individuals’ data,” in Intelligence in Next Generation Networks (ICIN), 2012 16th International Conference on. IEEE, 2012, pp. 202–209.  [5] A. Malhotra, L. Totti, W. Meira Jr, P. Kumaraguru, and V. Almeida, “Studying user footprints in different online social networks,” in Proceedings of the 2012 International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2012). IEEE Computer Society, 2012, pp. 1065–1070.
  • 26. REFERENCES  [6] N. Eagle and A. Pentland, “Reality mining: sensing complex social systems,” Personal and Ubiquitous Computing, vol. 10, no. 4, pp. 255–268, 2006.  [7] M. Venkataramanan, “My identity for sale,” http://www.wired.co.uk/magazine/archive/2014/11/features/my-identity-for-sale/viewall, accessed: 2015-27-03.  [8] “Mac basics: Notifications keep you informed,” https://support.apple.com/en-lb/HT204079, accessed: 2015-27-03.  [9] “Google now,” https://www.google.com/landing/now/, accessed: 2015-27-03.
  • 27. REFERENCES  [10] J. H. Frost and M. P. Massagli, “Social uses of personal health information within PatientsLikeMe, an online patient community: what can happen when patients have access to one another’s data,” Journal of Medical Internet Research, vol. 10, no. 3, 2008.  [11] S. Kumar, W. Nilsen, M. Pavel, and M. Srivastava, “Mobile health: Revolutionizing healthcare through transdisciplinary research,” Computer, no. 1, pp. 28–35, 2013.  [12] C. Pagliari, D. Detmer, and P. Singleton, “Potential of electronic personal health records,” BMJ: British Medical Journal, vol. 335, no. 7615, p. 330, 2007.  [13] “Finnish legislation - personal data act, 523/1999,” translation completed: 2001-31-03.