Behavioral Big Data
& Healthcare Research
WiDS Taipei, 31 March 2019
Galit Shmueli 徐茉莉
Institute of Service Science
Behavioral Big Data
Researcher
Human Subjects
Research Question
In memory of
Prof Aya Cohen
1940-2019
My Academic Path
1991-1994 (BA, Statistics & Psychology): University of Haifa, Israel
1994-2000 (MSc + PhD, Statistics): Israel Institute of Technology, Faculty of IE & M
2000-2002: Carnegie Mellon University, Department of Statistics
2002-2012: University of Maryland, Smith School of Business
2011-2014: Indian School of Business, Hyderabad, India
2014-…: National Tsing Hua University, Institute of Service Science
My Research
‘Entrepreneurial’ statistical & data mining modeling
Interdisciplinary
Statistical Strategy
• To Explain or To Predict?
• Information Quality
• Data Mining for Causality
• Predicting with Causal Models
• Behavioral Big Data
What is Behavioral Big Data (BBD)
Special type of Big Data
• Behavioral: people’s measurable “everyday” behavior, interactions, self-reported opinions, thoughts, feelings
• Human and social aspects: intentions, deception, emotion, reciprocation, herding,…
When aware of data collection -> modified behavior (legal risks, embarrassment, unwanted solicitation)
BBD vs. Inanimate Big Data
Behavioral Big Data: Researcher, Human Subjects, Research Question
Inanimate Big Data: Researcher, Research Question
Human subjects in BBD:
1. Are aware of, and in ongoing interaction with, the BBD, so they “contaminate” BBD with intention, deception, emotion, herding…
2. Can be harmed by BBD
Figure 1: The types of physiological data points and the wearable sensors under development or on the market to monitor them.
Elenko, Underwood & Zohar (2015), “Defining Digital Medicine”, Nature Biotechnology 33, 456-461
BBD vs. Physiological Big Data
Physiological Big Data (human subjects):
• Individual bodies
• Physical measurements
• Medical systems set data collection timing
• Clinical trials: awareness & vested interest
BBD:
• Collection of connected people
• Measurable behaviors: actions, interactions, self-reported feelings, opinions, thoughts
• User chooses data generation content & timing
• Experiments: users unaware; not always in user’s best interest
Different research methods in life sciences and behavioral sciences
• Measurement instruments
• Models (latent variable models, social network analysis)
• Human subjects risks
Getting Closer
“The main products of the 21st century economy will not be textiles, vehicles, and weapons but bodies, brains, and minds”
https://www.ynharari.com/homo-deus-impact-digitalization-society/
“If you wear biometric sensors (such as a Fitbit band) and these sensors are connected to the computer, the computer will know exactly what your heart rate, blood pressure and adrenalin level are, and based on this information, it can identify your emotional state better than any human psychologist”
https://www.yediot.co.il/articles/0,7340,L-4948868,00.html
Physiological data translated to BBD
He’s part of a small but
growing group of people
who are wearing CGMs
to track—and then
hack—what goes on in
their own bodies.
Physiological data collection turns into BBD
BBD in Healthcare Research
• Landscape
• Players
• Value
Landscape of health-related BBD
Data from a typical hospital, about…
Patients
• Personal info
• Medical history (visits, tests, medication, hospitalization...)
• Scheduled events, billing
Physicians
• Scheduled + actual appointments, procedures, prescriptions,…
• Entries of patient info/data
Nurses
• Location, work hours,…
Pharmacy staff
• Speed of service
• Quality of service
Lab staff
• Speed of service
• Quality of service
Other staff
• Finance/accounting
• Cleaning
• Receptionists
• Volunteers
• Food court!
Data Collection Technologies:
• Medical devices
• HIT systems (EHR, HR for Health Info System)
• WiFi
---
Smart Hospital
• Cameras
• Sensors
• GPS
• IoT
New data #1: Recorded Interactions
Interactions between
• Patients – doctors/nurses
• Doctors – other doctors
• Patients – other patients
• Patient family – hospital staff
• Patients – social network “friends”
• ...
Chiu, C. C., Tripathi, A., Chou, K., Co, C., Jaitly, N., Jaunzeikare, D., ... & Tansuwan, J. (2017). Speech recognition for medical conversations. arXiv preprint arXiv:1711.07274.
Data:
• 90,000 conversations between doctors and patients during clinical visits.
• 151 types of medical visits for different purposes
• Each conversation is typically between a single doctor and a patient, sometimes also including a nurse or family member.
New data #2: Smart hospital, remote medical services
• Telemedicine / Telehealth
• Remote Patient Monitoring
• mHealth / eHealth
Mobile health apps and wearable devices that use artificial intelligence to help diagnose or even treat medical conditions pose a new regulatory challenge for the U.S. Food and Drug Administration.
This comes at a time when medical devices have
evolved from fairly self-contained gadgets into
implants and wearables that communicate
wirelessly with medical software on separate
computers or in the cloud. The definition of medical
device has also stretched as smartphone apps and
online services—often backed by machine-
learning algorithms—promise to deliver medical
diagnoses that once would have required a visit to a
doctor's office and specialized lab equipment.
This is where it becomes ethically challenging:
• Who’s collecting the data and for what purpose?
• Are users aware of the data collection and usage?
• What are users’ benefits & risks from sharing their data?
New data #3:
Health-related online behavior
Health-related BBD: Online
• Medical/health websites
• Online forums
• Social networks
• Search engines
Data voluntarily entered by users: personal details, photos, comments,
messages, search terms, likes, payment information, connections with “friends”
Passive footprints: duration on the website, pages browsed, sequence,
referring website, Internet browser, operating system, location, IP address
New data #4:
Health-related behavior self-logged on Apps
Every day, women manually log around 1.4M new data points including cycle history, ovulation and pregnancy test results, age, height, weight, and lifestyle statistics about sleep, activity, and nutrition. In addition, more data comes from wearable devices like Fitbit & Apple Watch.
Data voluntarily entered by users: health condition, symptoms, behaviors
(eating, exercise, sleep, sex, parking, feelings…)
Passive footprints: app log times, pages browsed, sequence, location…
Flo became the most downloaded app worldwide in its category within months after introducing neural networks to its prediction algorithm.
In addition to logging a menstruation and health diary, users can join a number of different themed groups including weight loss, clothing, fitness, relationships, and travel. These groups look and work much like a “message board”-style social network.
To date, Meet You has reportedly accumulated two million daily active users, 1.2 million daily active users of its social network, and over 800,000 daily posts.
Health-related BBD: Gaming
Sea Hero Quest, a mobile app that measures spatial navigation ability. Credit: Hugo Spiers et al.
Since its launch in May 2016, some 2.5 million people have played Sea Hero Quest.
New data #5:
Health-related behavior from IoT
New data #6:
Health-“unrelated” (implicit) behaviors
“Some hospitals are collecting new information from patients directly, while others have sought data from companies that sell consumer and financial information, or federal agencies that provide statistics on poverty, housing density and unemployment.”
The big obstacle: access to the data. Doctors and nurses have limited time to collect new data, and patients bombarded with questions about their lives may suffer “interview fatigue”.
Quantified self devices also collect…
Before starting Mindstrong, Paul Dagum, its founder and CEO, paid for two Bay Area–based studies to figure out whether there might be a systemic measure of cognitive ability, or disability, hidden in how we use our phones. 150 research subjects came into a clinic and underwent a standardized neurocognitive assessment.
Subjects went home with an app that measured the ways they touched their phone’s display (swipes, taps, and keyboard typing).
memory problems… can be spotted by looking at things including how rapidly you type and what errors you make (such as how frequently you delete characters), as well as by how fast you scroll down a list of contacts.
“thousands of people are using the app, and the company now has five years of clinical study data to confirm its science and technology.”
PRIVACY:
“while Mindstrong says it protects users’ data, collecting such data at all could be a scary prospect for many of the people it aims to help. Companies may be interested in, say, including it as part of an employee wellness plan, but most of us wouldn’t want our employers anywhere near our mental health data”
Microsoft Xbox 360 comes with a microphone, a camera and technology that recognizes a user's voice and face
• sign in and sign off
• games you played
• game-score statistics
• Xbox console hardware & operating performance data
• manufacturing codes from game discs
• network performance data
• data that indicates the quality of the Xbox service
to prevent cheating
• IP address
• operating system
• Xbox Live software version
to improve your experience
• Bing search terms
• samples of voice commands to perform search
• what you watched on Xbox One’s TV service
• music & videos you watched or listened to using Xbox Live
At home/school/work
At work
… provide a ride-hailing platform available specifically to healthcare providers, letting clinics, hospitals, rehab centers and more easily assign rides for their patients and clients from a centralized dashboard – without requiring that the rider even have the Uber app, or a smartphone.
Uber Health’s creation was rooted in some alarming statistics about patient care and healthcare client absentee rates.
Researchers Using Health BBD
Research Fields using Health BBD
Operations Researchers and Industrial Engineers
For: Hospital Management and Operations
(staffing, scheduling,…)
Medical/Healthcare Researchers & Clinicians
For: Improved Medical Treatment
(safety, effectiveness,…)
Information Systems Researchers
For: Improved Design & Use of Medical IS
(value of IS, effectiveness, standardization,…)
Marketing
Advertising
Insurance
Machine Learning
Social science
How Do Researchers Get
Health BBD?
1. Open/Publicly Available Data
Constantly refreshed or single data dump
API, web scraping
Hacked data
2. Partner with Company/Organization
• Both parties interested in research question
• Data purchase
• Personal connections, sabbaticals, internships
• Partnership between school and organization
• Third party (WCAI)
3. Crowdsourcing
4. China (!)
Research Using New Health BBD: Challenges
Behavioral Big Data, Researcher, Human Subjects, Research Question
Research Question:
• New Q: Lack of literature
• Old Q, new data: Operationalize new variables
Different (conflicting) Goals:
• Scientific vs. Clinical vs. Commercial
• Explain vs. Predict
Generalization Challenges:
• Unit of analysis vs. Unit of measurement
• Under/over-coverage
• Average effect vs. individual effect (ATE vs. Individual)
Data contaminated by:
• Users (self-selection, spill-over, knowledge of allocation, network)
• Company algorithms
Acquire + analyze data:
• Technical expertise
New ethical challenges:
• New risks (privacy, liability, security, HIPAA compliance)
• New modes of connection & information (social networks, forums, IoT, Apps)
• Researcher – Human Subjects: larger distance
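The "average effect vs. individual effect" challenge listed above can be made concrete with a toy calculation. The sketch below is only an illustration: the four users, their group labels, and both potential outcomes (`y0` without the intervention, `y1` with it) are made up, and in real data only one of the two outcomes is ever observed for a given person.

```python
# Hypothetical potential outcomes for four users (illustrative values only).
# y0 = outcome without the intervention, y1 = outcome with it.
users = [
    {"group": "A", "y0": 5.0, "y1": 7.0},
    {"group": "A", "y0": 6.0, "y1": 8.0},
    {"group": "B", "y0": 5.0, "y1": 3.0},
    {"group": "B", "y0": 6.0, "y1": 4.0},
]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def individual_effects(rows):
    """Per-user effect: difference between the two potential outcomes."""
    return [row["y1"] - row["y0"] for row in rows]


# Average treatment effect (ATE) over everyone.
ate = mean(individual_effects(users))

# The same average, computed separately within each group.
effect_by_group = {
    g: mean(individual_effects([u for u in users if u["group"] == g]))
    for g in sorted({u["group"] for u in users})
}

print("ATE:", ate)
print("Effect by group:", effect_by_group)
```

In this toy setup the average hides two subgroups that respond in opposite directions, which is one reason an average effect may say little about what happens to a specific person.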
Value
Two examples of high-profile studies using new health BBD
• Emotional contagion in social networks: Kramer et al. (PNAS, 2014)
• Detecting influenza epidemics using search engine query data: Ginsberg et al. (Nature, 2009)
Example #1
• No Ethics Board Review (IRB)
“[The work] was consistent with Facebook’s Data Use Policy, to which all users agree prior to creating an account on Facebook, constituting informed consent for this research.”
• PNAS editorial Expression of Concern
• Varied response from public, academia, press, ethicists, corporates
Where do Data Scientists get Ethics Training?
Example #2
• “Up-to-date influenza estimates may enable public health officials and health professionals to better respond to seasonal epidemics”
• BBD: automated search results for 50M keywords on Google.com (2003-2007). For each query: {query text, IP address}
• Fit 450M different models, correlating each query text with CDC data; combined the 45 queries with highest correlation
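The selection step in the last bullet (correlate each query with the CDC series, keep the top-ranked queries, combine them) can be sketched as follows. This is a simplified illustration, not the study's actual pipeline: the query names, the weekly counts, the `ili` series, and the choice of plain Pearson correlation and a simple sum are all assumptions made for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / math.sqrt(vx * vy)


def top_correlated_queries(query_series, target, k):
    """Rank queries by correlation with the target series and keep the top k."""
    scored = [(pearson(series, target), name) for name, series in query_series.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


def combine(query_series, selected):
    """Sum the selected query series week by week into one signal."""
    weeks = len(next(iter(query_series.values())))
    return [sum(query_series[q][t] for q in selected) for t in range(weeks)]


# Made-up weekly data: an illness rate and four hypothetical search queries.
ili = [1.0, 1.5, 3.0, 4.5, 3.5, 2.0, 1.2, 1.0]
queries = {
    "flu symptoms": [10, 14, 31, 44, 36, 21, 12, 11],
    "fever remedy": [5, 8, 14, 22, 18, 10, 7, 5],
    "basketball scores": [3, 6, 12, 20, 15, 9, 4, 2],  # seasonal, not illness-related
    "cat videos": [9, 9, 8, 9, 10, 9, 9, 8],
}

selected = top_correlated_queries(queries, ili, k=3)
signal = combine(queries, selected)
print(selected)
print(signal)
```

Note that a purely correlational screen like this one keeps any query that happens to move with the season, including the hypothetical "basketball scores" series, even though it has nothing to do with illness.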
Researchers: epidemiologists + data science academics
Dalton et al. (2016), “Flutracking weekly online community survey of influenza-like illness annual report, 2015”, Communicable Diseases Intelligence Quarterly Report
Challenge: Acquire data
• Does the algorithm detect “flu” or “winter”?
• Persistent over-estimation
• Performs worse than a projection from 3-week-lagged CDC data
• Never released the 45 terms used
• Lazer et al. recommend combining/calibrating Google Flu Trends (GFT) with CDC data
But most importantly…
Changes Google made to its search algorithm, to display potential diagnoses + recommend searches for treatment (more advertising) -> increased search
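The lagged-data comparison in the list above amounts to checking a model against a naive baseline. Below is a minimal sketch of such a check under stated assumptions: the `cdc` values are invented, the "model" is simply an estimate that overshoots by a fixed factor, and mean absolute error is an arbitrary choice of metric. It is not a reproduction of Lazer et al.'s analysis.

```python
def mean_absolute_error(estimates, truth):
    """Average absolute gap between paired estimates and true values."""
    pairs = list(zip(estimates, truth))
    return sum(abs(e - t) for e, t in pairs) / len(pairs)


def compare_to_lagged(estimates, truth, lag=3):
    """Return (model error, naive error) over the weeks where both are defined.

    The naive baseline predicts week t with the true value from week t - lag,
    mimicking a forecaster who only has `lag`-week-old official data.
    """
    model_error = mean_absolute_error(estimates[lag:], truth[lag:])
    naive_error = mean_absolute_error(truth[:-lag], truth[lag:])
    return model_error, naive_error


# Invented weekly illness rates and a model that persistently over-estimates.
cdc = [1.0, 1.2, 1.5, 2.0, 2.6, 3.0, 2.8, 2.2, 1.6, 1.2]
model = [1.8 * v for v in cdc]

model_mae, baseline_mae = compare_to_lagged(model, cdc, lag=3)
print("model MAE:", model_mae)
print("3-week-lag baseline MAE:", baseline_mae)
```

If the model's error is not clearly below the naive baseline's, the extra data and modeling effort add little over simply waiting for the official numbers.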
This type of BBD research is still popular
Uses Google searches to measure sensitive behaviors/opinions/thoughts on racism, self-induced abortion, depression, child abuse, hateful mobs, the science of humor, sexual preference, anxiety, son preference, and sexual insecurity, among many other topics.
Let’s Discuss Data Privacy
Data Privacy is a Big Issue Right Now
What we’ve learned… is that we need to take a more proactive role in a broader view of our responsibility. It’s not enough to just build tools, we need to make sure that they’re used for good
“patients as well as medical staff will be communicating in a non-private environment. It is very important to understand, monitor and control your own content for its privacy implications. More dangerous and needing control will be the reach of patient-to-patient identification and communication.”
Medical data privacy is typically regulated
What about BBD?
several Israeli hospitals have been conducting a pilot program which used AI software to assist in deciding whether patients should undergo surgery. However these patients were subjected to these tests without their knowledge. The software has been developed by a startup named MEDecide in Tel Aviv and used
Recent data privacy regulations are reshaping the collection & use of BBD
Using BBD for Research: Human Subjects
Institutional Review Board (IRB), or “ethics committee”
University-level committee designated to approve, monitor, and review biomedical and behavioral research involving humans.
Medical and behavioral researchers are aware of IRB. What about data science researchers?
The “Final Rule” (July 19, 2018): Update to the “Common Rule”
• New exemption category: Research involving “benign behavioral interventions”
• Exemption for secondary research using identifiable private information or identifiable biospecimens
No review needed under certain circumstances:
- publicly available data
- participant cannot readily be identified
- participant is regulated under HIPAA for purposes of “health-care operations,” “research,” or “public health activities”, but not where the investigator plans to report individual research results
• Am I respecting the rights of my data subjects?
• Are my data pseudonymized?
• Is my research “minimal risk?”
• Do I have broad consent for secondary analysis?
Greene, Shmueli, Ray, and Fell (2019), Adjusting to the GDPR: The Impact on Data Scientists and Behavioral Researchers
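For the "Are my data pseudonymized?" question above, one common approach is to replace direct identifiers with a keyed hash before analysis. The sketch below uses Python's standard `hmac` and `hashlib` modules; the record layout, the `user_id` field name, and the key value are hypothetical. It is an illustration rather than a compliance recipe: pseudonymized data can generally still be linked back to a person by whoever holds the key, so the key has to be stored separately from the data.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Map an identifier to a stable pseudonym using HMAC-SHA256."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def pseudonymize_records(records, id_field, secret_key):
    """Return copies of the records with the identifier field replaced."""
    result = []
    for record in records:
        clean = dict(record)  # copy, so the input list is left untouched
        clean[id_field] = pseudonymize(str(record[id_field]), secret_key)
        result.append(clean)
    return result


# Hypothetical key and records, for illustration only.
key = b"replace-with-a-secret-kept-outside-the-dataset"
records = [
    {"user_id": "alice@example.com", "steps": 8200},
    {"user_id": "bob@example.com", "steps": 4100},
    {"user_id": "alice@example.com", "steps": 9050},
]

pseudo = pseudonymize_records(records, "user_id", key)
for row in pseudo:
    print(row["user_id"][:12], row["steps"])
```

Because the same identifier always maps to the same pseudonym under a given key, records from one person can still be linked for longitudinal analysis without exposing the raw identifier.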
New healthcare BBD offers new research opportunities
• Health-related behavior
• Health-“unrelated” behavior
… and new challenges
Research Question:
• New Q: Lack of literature
• Old Q, new data: Operationalize new variables
Different (conflicting) Goals:
• Scientific vs. Clinical vs. Commercial
• Explain vs. Predict
Generalization Challenges:
• Unit of analysis vs. Unit of measurement
• Under/over-coverage
• Average effect vs. individual effect (ATE vs. Individual)
Data contaminated by:
• Users (self-selection, spill-over, knowledge of allocation, network)
• Company algorithms
Acquire + analyze data:
• Technical expertise
New ethical challenges:
• New risks (privacy, liability, security, HIPAA compliance)
• New modes of connection & information (social networks, forums, IoT, Apps)
Analytics
Humanity
Responsibility
Galit Shmueli 徐茉莉
Institute of Service Science
Shmueli, G. (2017), Research Dilemmas With Behavioral Big Data, Big Data, vol 5 issue 2, pp. 98-119

 
On Information Quality: Can Your Data Do The Job? (SCECR 2015 Keynote)
On Information Quality: Can Your Data Do The Job? (SCECR 2015 Keynote)On Information Quality: Can Your Data Do The Job? (SCECR 2015 Keynote)
On Information Quality: Can Your Data Do The Job? (SCECR 2015 Keynote)
 
Predictive Model Selection in PLS-PM (SCECR 2015)
Predictive Model Selection in PLS-PM (SCECR 2015)Predictive Model Selection in PLS-PM (SCECR 2015)
Predictive Model Selection in PLS-PM (SCECR 2015)
 


Behavioral Big Data & Healthcare Research: Talk at WiDS Taipei

  • 1. Behavioral Big Data & Healthcare Research WiDS Taipei, 31 March 2019 Galit Shmueli 徐茉莉 Institute of Service Science Behavioral Big Data Researcher Human Subjects Research Question In memory of Prof Aya Cohen 1940-2019
  • 2. 1994-2000 (MSc + PhD, Statistics) Israel Institute of Technology Faculty of IE & M 2000-2002 Carnegie Mellon University Department of Statistics 2002-2012 University of Maryland Smith School of Business 2011-2014 Indian School of Business Hyderabad, India 2014-… National Tsing Hua University Institute of Service Science My Academic Path My Research ‘Entrepreneurial’ statistical & data mining modeling Interdisciplinary Statistical Strategy • To Explain or To Predict? • Information Quality • Data Mining for Causality • Predicting with Causal Models • Behavioral Big Data 1991-1994 (BA, Statistics & Psychology) University of Haifa, Israel
  • 3. What is Behavioral Big Data (BBD) Special type of Big Data • Behavioral: people’s measurable “everyday” behavior, interactions, self-reported opinions, thoughts, feelings • Human and social aspects: Intentions, deception, emotion, reciprocation, herding,… When aware of data collection -> modified behavior (legal risks, embarrassment, unwanted solicitation)
  • 4. BBD vs. Inanimate Big Data Behavioral Big Data Researcher Human Subjects Research Question Inanimate Big Data Researcher Research Question 1. Aware, ongoing interaction with the BBD - “contaminate” BBD with intention, deception, emotion, herding… 2. Can be harmed by BBD
  • 5. Figure 1: The types of physiological data points and the wearable sensors under development or on the market to monitor them. Elenko, Underwood & Zohar (2015), “Defining Digital Medicine”, Nature Biotechnology 33, 456-461 Physiological Big Data Human Subjects
  • 6. BBD vs. Physio Big Data • Individual bodies • Physical measurements • Medical systems set data collection timing • Clinical trials: awareness & vested interest • Collection of connected people • Measurable behaviors: actions, interactions, self-reported feelings, opinions, thoughts • User chooses data generation content & timing • Experiments: users unaware; not always in user’s best interest Different research methods in life sciences and behavioral sciences • Measurement instruments • Models (latent variable models, social network analysis) • Human subjects risks
  • 8. “The main products of the 21st century economy will not be textiles, vehicles, and weapons but bodies, brains, and minds” https://www.ynharari.com/homo-deus-impact-digitalization-society/ “If you wear biometric sensors (such as a Fitbit band) and these sensors are connected to the computer, the computer will know exactly what your heart rate, blood pressure and adrenalin level are, and based on this information, it can identify your emotional state better than any human psychologist” https://www.yediot.co.il/articles/0,7340,L-4948868,00.html Physiological data translated to BBD
  • 9. He’s part of a small but growing group of people who are wearing CGMs to track—and then hack—what goes on in their own bodies. Physiological data collection turns into BBD
  • 13. Data from a typical hospital, about… Patients Personal info Medical history (visits, tests, medication, hospitalization...) Scheduled events, billing Physicians Scheduled + actual appointments, procedures, prescriptions,… Entries of patient info/data Nurses Location, work hours,… Pharmacy staff Speed of service Quality of service Lab staff Speed of service Quality of service Other staff Finance/accounting Cleaning Receptionists Volunteers Food court! Data Collection Technologies: • Medical devices • HIT systems (EHR, HR for Health Info System) • WiFi --- Smart Hospital • Cameras • Sensors • GPS • IoT
  • 14. Interactions between Patients – doctors/nurses Doctors – other doctors Patients – other patients Patient family – hospital staff Patients – social network ”friends” ... New data #1: Recorded Interactions
  • 15. Chiu, C. C., Tripathi, A., Chou, K., Co, C., Jaitly, N., Jaunzeikare, D., ... & Tansuwan, J. (2017). Speech recognition for medical conversations. arXiv preprint arXiv:1711.07274. Data: • 90,000 conversations between doctors and patients during clinical visits. • 151 types of medical visits for different purposes • Each conversation is typically between a single doctor and a patient, sometimes also including a nurse, or family member.
  • 16. Telemedicine / Telehealth Remote Patient Monitoring mHealth/ eHealth New data #2: smart hospital remote medical services
  • 17. Mobile health apps and wearable devices that use artificial intelligence to help diagnose or even treat medical conditions pose a new regulatory challenge for the U.S. Food and Drug Administration. This comes at a time when medical devices have evolved from fairly self-contained gadgets into implants and wearables that communicate wirelessly with medical software on separate computers or in the cloud. The definition of medical device has also stretched as smartphone apps and online services—often backed by machine-learning algorithms—promise to deliver medical diagnoses that once would have required a visit to a doctor's office and specialized lab equipment.
  • 18. This is where it becomes ethically challenging: Who’s collecting the data and for what purpose? Are users aware of the data collection and usage? What are users’ benefits & risks from sharing their data?
  • 19. New data #3: Health-related online behavior
  • 20. Health-related BBD: Online • Medical/health websites • Online forums • Social networks • Search engines Data voluntarily entered by users: personal details, photos, comments, messages, search terms, likes, payment information, connections with “friends” Passive footprints: duration on the website, pages browsed, sequence, referring website, Internet browser, operating system, location, IP address
  • 21. New data #4: Health-related behavior self-logged on Apps Every day, women manually log around 1.4 M new data points including cycle history, ovulation and pregnancy tests results, age, height, weight, lifestyle statistics about sleep, activity, and nutrition. In addition, more data comes from wearable devices like Fitbit & Apple Watch. Data voluntarily entered by users: health condition, symptoms, behaviors (eating, exercise, sleep, sex, parking, feelings…) Passive footprints: app log times, pages browsed, sequence, location…
  • 22. Flo became the most downloaded app worldwide in its category within months after introducing neural networks to its prediction algorithm. In addition to logging a menstruation and health diary, users can join a number of different themed groups including weight loss, clothing, fitness, relationships, and travel. These groups look and work much like “message board”-style social network To date, Meet You has reportedly accumulated two million daily active users, 1.2 million daily active users of its social network, and over 800,000 daily posts.
  • 23.
  • 24. Sea Hero Quest, a mobile app that measures spatial navigation ability. Credit: Hugo Spiers et al. Since its launch in May 2016, some 2.5 million people have played Sea Hero Quest Health-related BBD: Gaming
  • 25. New data #5: Health-related behavior from IoT
  • 26. New data #6: Health-“unrelated” (implicit) behaviors
  • 27. “Some hospitals are collecting new information from patients directly, while others have sought data from companies that sell consumer and financial information, or federal agencies that provide statistics on poverty, housing density and unemployment.” The big obstacle: access to the data. Doctors and nurses have limited time to collect new data and patients bombarded with questions about their lives may suffer “interview fatigue”
  • 28. This is where it becomes ethically challenging: Who’s collecting the data and for what purpose? Are users aware of the data collection and usage? What are users’ benefits & risks from sharing their data?
  • 30. Subjects went home with an app that measured the ways they touched their phone’s display (swipes, taps, and keyboard typing) Before starting Mindstrong, Paul Dagum, its founder and CEO, paid for two Bay Area–based studies to figure out whether there might be a systemic measure of cognitive ability—or disability—hidden in how we use our phones. 150 research subjects came into a clinic and underwent a standardized neurocognitive assessment memory problems… can be spotted by looking at things including how rapidly you type and what errors you make (such as how frequently you delete characters), as well as by how fast you scroll down a list of contacts.
  • 31. “thousands of people are using the app, and the company now has five years of clinical study data to confirm its science and technology.” PRIVACY: “while Mindstrong says it protects users’ data, collecting such data at all could be a scary prospect for many of the people it aims to help. Companies may be interested in, say, including it as part of an employee wellness plan, but most of us wouldn’t want our employers anywhere near our mental health data”
  • 32.
  • 33. Microsoft Xbox 360 comes with a microphone, a camera and technology that recognizes a user's voice and face • sign in and sign off • games you played • game-score statistics • Xbox console hardware & operating performance data • manufacturing codes from game discs • network performance data • data that indicates the quality of the Xbox service to prevent cheating • IP address • operating system • Xbox Live software version to improve your experience • Bing search terms • samples of voice commands to perform search • what you watched on Xbox One’s TV service • music & videos you watched or listened to using Xbox Live At home/school/work
  • 35.
  • 36. provide a ride-hailing platform available specifically to healthcare providers, letting clinics, hospitals, rehab centers and more easily assign rides for their patients and clients from a centralized dashboard – without requiring that the rider even have the Uber app, or a smartphone. Uber Health’s creation was rooted in some alarming statistics about patient care and healthcare client absentee rates.
  • 38. Research Fields using Health BBD Operations Researchers and Industrial Engineers For: Hospital Management and Operations (staffing, scheduling,…) Medical/Healthcare Researchers & Clinicians For: Improved Medical Treatment (safety, effectiveness,…) Information Systems Researchers For: Improved Design & Use of Medical IS (value of IS, effectiveness, standardization,…) Marketing Advertising Insurance Machine Learning Social science
  • 39. How Do Researchers Get Health BBD? 1. Open/Publicly Available Data Constantly refreshed or single data dump API, web scraping Hacked data 2. Partner with Company/Organization • Both parties interested in research question • Data purchase • Personal connections, sabbaticals, internships • Partnership between school and organization • Third party (WCAI) 3. Crowdsourcing 4. China (!)
  • 40. Research Using New Health BBD: Challenges Behavioral Big Data Researcher Human Subjects Research Question Scientific vs. Clinical vs. Commercial Explain vs. Predict Different (conflicting) Goals: Unit of analysis vs. Unit of measurement Under/over-coverage New risks (privacy, liability, security, HIPAA compliance) New ethical challenges: Generalization Challenges: Acquire + analyze data Users (self-selection, spill-over, knowledge of allocation, network) Company algorithms Average effect vs. individual effect Data contaminated by: New modes of connection & information (social networks, forums, IoT, Apps) ATE vs. Individual Technical expertise larger distance Old Q, new data: Operationalize new variables New Q: Lack of literature
  • 41. Value
  • 42. Two examples of high-profile studies using new health BBD Emotional contagion in social networks Kramer et al. (PNAS, 2014) Detecting influenza epidemics using search engine query data Ginsberg et al. (Nature, 2009)
  • 44. • No Ethics Board Review (IRB) “[The work] was consistent with Facebook’s Data Use Policy, to which all users agree prior to creating an account on Facebook, constituting informed consent for this research.” • PNAS editorial Expression of Concern • Varied response from public, academia, press, ethicists, corporates Where do Data Scientists get Ethics Training?
  • 45. New Q: Lack of literature Behavioral Big Data Explain vs. Predict Different (conflicting) Goals: Unit of analysis vs. Unit of measurement Under/over-coverage Generalization Challenges: Acquire + analyze data Users (self-selection, spill-over, knowledge of allocation, network) Company algorithms Average effect vs. individual effect Data contaminated by: ATE vs. Individual Technical expertise Old Q, new data: Operationalize new variables Scientific vs. Clinical vs. Commercial Researcher Human Subjects New risks (privacy, liability, security, HIPAA compliance) New ethical challenges: New modes of connection & information (social networks, forums, IoT) Research Question
  • 46. Example #2 • “Up-to-date influenza estimates may enable public health officials and health professionals to better respond to seasonal epidemics” • BBD: automated search results for 50M keywords on Google.com (2003-2007). For each query: {query text, IP address} • Fit 450M different models, correlating each query text with CDC data; Combined 45 queries with highest correlation
  • 47. Researchers: epidemiologists + data science academics Dalton et al. (2016), “Flutracking weekly online community survey of influenza-like illness annual report, 2015” Communicable diseases intelligence quarterly report Challenge: Acquire data
  • 48. • The algorithm detects “flu” or “winter”? • Persistent over-estimation • Performs worse than lagged CDC 3-week-old data • Never released 45 terms used • Lazer et al. recommend combining/ calibrating GFT with CDC data But most importantly…
  • 49. Changes made by Google’s search algorithm to display potential diagnoses + recommend search for treatment (more advertising) -> increased search
  • 50. This type of BBD research is still popular
  • 51. New Q: Lack of literature Average effect vs. individual effect Human Subjects Under/over-coverage New risks (privacy, liability, security, HIPAA compliance) New ethical challenges: Users (self-selection, spill-over, knowledge of allocation, network) New modes of connection & information (social networks, forums, IoT) ATE vs. Individual Old Q, new data: Operationalize new variables Explain vs. Predict Scientific vs. Clinical vs. Commercial Different (conflicting) Goals: Unit of analysis vs. Unit of measurement Research Question Generalization Challenges: Acquire + analyze data Technical expertise Company algorithms Data contaminated by: Behavioral Big Data Researcher
  • 52. Uses Google searches to measure sensitive behaviors/opinions/thoughts on racism, self-induced abortion, depression, child abuse, hateful mobs, the science of humor, sexual preference, anxiety, son preference, and sexual insecurity, among many other topics.
  • 53. New Q: Lack of literature Old Q, new data: Operationalize new variables Research Question Scientific vs. Clinical vs. Commercial Explain vs. Predict Different (conflicting) Goals: Unit of analysis vs. Unit of measurement Under/over-coverage Generalization Challenges: Acquire + analyze data Users (self-selection, spill-over, knowledge of allocation, network) Company algorithms Average effect vs. individual effect Data contaminated by: ATE vs. Individual Technical expertise Let’s Discuss Data Privacy Researcher larger distance Human Subjects Behavioral Big Data New risks (privacy, liability, security, HIPAA compliance) New ethical challenges: New modes of connection & information (social networks, forums, IoT, Apps)
  • 54. Data Privacy is a Big Issue Right Now Behavioral Big Data Researcher Human Subjects Research Question Scientific vs. Clinical vs. Commercial Explain vs. Predict Different (conflicting) Goals: Unit of analysis vs. Unit of measurement New risks (privacy, liability, security, HIPAA compliance) New ethical challenges: Generalization Challenges: Acquire + analyze data Users (self-selection, spill-over, knowledge of allocation, network) Company algorithms Average effect vs. individual effect Data contaminated by: New modes of connection & information (social networks, forums, IoT) ATE vs. Individual Technical expertise
  • 55. What we’ve learned… is that we need to take a more proactive role in a broader view of our responsibility. It’s not enough to just build tools, we need to make sure that they’re used for good
  • 56. “patients as well as medical staff will be communicating in a non-private environment. It is very important to understand, monitor and control your own content for its privacy implications. More dangerous and needing control will be the reach of patient-to-patient identification and communication.”
  • 57.
  • 58. Medical data privacy is typically regulated. What about BBD? Several Israeli hospitals have been conducting a pilot program which used AI software to assist in deciding whether patients should undergo surgery. However, these patients were subjected to these tests without their knowledge. The software has been developed by a startup named MEDecide in Tel Aviv and used
  • 59. Recent data privacy regulations are reshaping the collection & use of BBD
  • 60.
  • 61. Using BBD for Research: Human Subjects Institutional Review Board (IRB) “ethics committee” University-level committee designated to approve, monitor, and review biomedical and behavioral research involving humans. Medical and behavioral researchers are aware of IRB. What about data science researchers?
  • 62. The “Final Rule” (July 19, 2018): Update to the “Common Rule” New exemption category: Research involving “benign behavioral interventions” Exemption for secondary research using identifiable private information or identifiable biospecimens No review needed under certain circumstances: - publicly available data - participant cannot readily be identified - participant is regulated under HIPAA for purposes of “health-care operations,” “research,” or “public health activities”—but not where the investigator plans to report individual research results
  • 63. • Am I respecting the rights of my data subjects? • Are my data pseudonymized? • Is my research “minimal risk?” • Do I have broad consent for secondary analysis? Greene, Shmueli, Ray, and Fell (2019) Adjusting to the GDPR: The Impact on Data Scientists and Behavioral Researchers
  • 64.
  • 65. Health-“unrelated” behavior New healthcare BBD offers new research opportunities Health-related behavior
  • 66. … and new challenges Behavioral Big Data Researcher Human Subjects Research Question Scientific vs. Clinical vs. Commercial Explain vs. Predict Different (conflicting) Goals: Unit of analysis vs. Unit of measurement Under/over-coverage New risks (privacy, liability, security, HIPAA compliance) New ethical challenges: Generalization Challenges: Acquire + analyze data Users (self-selection, spill-over, knowledge of allocation, network) Company algorithms Average effect vs. individual effect Data contaminated by: New modes of connection & information (social networks, forums, IoT, Apps) ATE vs. Individual Technical expertise New Q: Lack of literature Old Q, new data: Operationalize new variables
  • 67. Analytics Humanity Responsibility Galit Shmueli 徐茉莉 Institute of Service Science Shmueli, G. (2017), Research Dilemmas With Behavioral Big Data, Big Data, vol 5 issue 2, pp. 98-119
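
Illustrative sketch for slide 46 (not part of the talk): the slide describes correlating each search query with CDC data and combining the queries with the highest correlation. The toy Python below shows only that ranking-and-combining idea. The query names, the numbers, and the helper functions `pearson`, `top_queries`, and `combined_signal` are all invented for illustration, and the plain sum used to combine queries is an assumption rather than the procedure used by Ginsberg et al.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no signal
    return cov / sqrt(var_x * var_y)


def top_queries(query_series, target, k):
    """Rank queries by correlation with the target series and keep the k best."""
    scored = [(pearson(series, target), q) for q, series in query_series.items()]
    scored.sort(reverse=True)
    return [q for _, q in scored[:k]]


def combined_signal(query_series, selected):
    """Sum the selected queries' weekly values into a single predictor."""
    n_weeks = len(next(iter(query_series.values())))
    return [sum(query_series[q][t] for q in selected) for t in range(n_weeks)]


# Toy data, invented for illustration: weekly query shares and a target illness rate.
ili_rate = [1.0, 2.0, 4.0, 3.0, 1.5]
queries = {
    "flu symptoms": [0.10, 0.21, 0.39, 0.30, 0.16],
    "fever remedy": [0.05, 0.09, 0.22, 0.15, 0.07],
    "basketball scores": [0.30, 0.10, 0.25, 0.05, 0.20],
}

selected = top_queries(queries, ili_rate, k=2)
signal = combined_signal(queries, selected)
```

Selecting purely on correlation keeps any query that happens to track the target series, including terms that are merely seasonal, which is the concern behind the “flu” or “winter” question on slide 48.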