PYA Principal Dr. Kent Bottles, who is also PYA Analytics’ Chief Medical Officer, presented “The Pros and Cons of Big Data in an ePatient World” at the ePatient Connections 2013 conference.
The Pros and Cons of Big Data in an ePatient World
1. The Pros and Cons of Big Data in an
ePatient World
Kent Bottles, MD
Chief Medical Officer, PYA Analytics
ePatient Connections/2013
Philadelphia, Pennsylvania
September 16, 2013
2. Big Data
Viktor Mayer-Schonberger & Kenneth Cukier, 2013
• Big data refers to things one can do at a large scale
that cannot be done at a smaller one, to extract
new insights or create new forms of value, in ways
that change markets, organizations, the
relationship between citizens and governments.
• Causality is replaced by correlation
• Not knowing why but only what
3. Big Data
Viktor Mayer-Schonberger & Kenneth Cukier, 2013
• Statistics allows richest findings using the smallest
amount of data
• Randomness trumped sample size
• 2007 300 exabytes of stored data
• 2013 1,200 exabytes of stored data
• 2013 only 2% is non-digital
4. Sizing Up Big Data
Steve Lohr, NY Times, June 20, 2013
• Bundle of technologies
– Web pages, browsing habits, sensor signals, social
media, GPS location data, genomic information,
surveillance videos
– Advances in data storage and processing
– Machine learning/AI software to find actionable
correlations from the big data
5. Sizing Up Big Data
Steve Lohr, NY Times, June 20, 2013
• Philosophy about how decisions should be made
– Decisions based on data and analysis
– Less based on experience and gut intuition
– Eliminates anchoring bias and confirmation bias
• Revolution in measurement
– Digital equivalent of the telescope
– Digital equivalent of the microscope
8. Jeffrey Hammerbacher
http://www.youtube.com/watch?v=OVBZTDREg7c
• All industries are being disrupted
– Moneyball, 538, Large Hadron Collider
• McKinsey: Big Data: The Next Frontier for
Competition
– $338 billion potential annual value to US healthcare
– $165 billion in clinical operations
– $105 billion in research and development
9. Jeffrey Hammerbacher
http://www.youtube.com/watch?v=OVBZTDREg7c
• Oracle: From Overload to Impact
– Healthcare executives say collecting & managing more
business information today than 2 years ago
– Average increase 85% per year
• Frost & Sullivan: US Hospital Health Data
Analytics Market
– 2011 10% of US hospitals use data analytic tools
– 2016 50% of US hospitals will use data analytic tools
10. Jeffrey Hammerbacher on Moneyball
www.youtube.com/watch?v=OVBZTDREg7c
• Triple Crown in MLB: Batting average, RBI, HR
• OPS (on base plus slugging)
• GPA (gross production average)
• TOB (times on base)
• The outcome is how many runs we score and allow; A’s
have Matt Stairs; Need stat that reflects both runs
produced at bat & runs saved by defense
• WAR (“Wins above replacement”)
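The batting stats named on this slide are simple arithmetic over box-score counts. A minimal Python sketch, assuming the commonly published definitions of OBP, SLG, OPS, GPA, and TOB; the function names and the sample stat line are invented for illustration and do not come from the talk:

```python
def times_on_base(hits, walks, hit_by_pitch):
    """TOB: every time the batter reached base safely."""
    return hits + walks + hit_by_pitch


def on_base_pct(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """OBP: times on base divided by the plate appearances that count."""
    reached = times_on_base(hits, walks, hit_by_pitch)
    return reached / (at_bats + walks + hit_by_pitch + sac_flies)


def slugging_pct(singles, doubles, triples, home_runs, at_bats):
    """SLG: total bases per at-bat."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def ops(obp, slg):
    """OPS: on-base plus slugging."""
    return obp + slg


def gpa(obp, slg):
    """GPA: weights OBP more heavily, then rescales to look like a batting average."""
    return (1.8 * obp + slg) / 4


# Hypothetical season line (not a real player's numbers).
singles, doubles, triples, home_runs = 90, 30, 5, 25
hits = singles + doubles + triples + home_runs
obp = on_base_pct(hits, walks=60, hit_by_pitch=5, at_bats=500, sac_flies=5)
slg = slugging_pct(singles, doubles, triples, home_runs, at_bats=500)
print(f"OBP={obp:.3f} SLG={slg:.3f} OPS={ops(obp, slg):.3f} GPA={gpa(obp, slg):.3f}")
```

WAR is left out on purpose: it depends on league-wide baselines and defensive estimates that cannot be derived from one player's box score.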
11. Big Data
Viktor Mayer-Schonberger & Kenneth Cukier, 2013
• To analyze & understand the world we used to test
hypotheses driven by theories
• Big data discards theories & causality for
correlations
• University of Ontario premature baby studies
• 1,260 data points per second
• Diagnose infections 24 hours before apparent
• Very constant vital signs indicate impending
infection
12. Big Data
Viktor Mayer-Schonberger & Kenneth Cukier, 2013
• Google Nature article predicts flu spread in USA
• Compared 50 million search terms with CDC data
on spread of flu from 2003 to 2008
• 450 million different mathematical models
• 45 search terms had strong correlation with spread
of flu
• H1N1 crisis in 2009 Google approach worked
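The core idea on this slide, scoring each search term by how closely its frequency tracks the CDC numbers, can be sketched in a few lines. This is only an illustration of correlation ranking, not Google's actual pipeline; the weekly figures, term names, and function names below are all made up:

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def rank_terms(term_series, flu_series, top_k):
    """Return the top_k terms whose search volume best tracks flu activity."""
    scored = [(pearson(series, flu_series), term)
              for term, series in term_series.items()]
    scored.sort(reverse=True)
    return [term for _, term in scored[:top_k]]


# Invented weekly data: CDC flu activity and search volume per term.
cdc_flu = [1.0, 1.5, 2.5, 4.0, 3.0, 1.8]
searches = {
    "flu symptoms": [10, 14, 26, 41, 29, 17],
    "cough medicine": [20, 24, 30, 38, 33, 25],
    "basketball scores": [50, 48, 52, 49, 51, 50],
}
top_terms = rank_terms(searches, cdc_flu, top_k=2)
print(top_terms)
```

Nothing in the ranking asks why a term tracks the flu, which is the "what, not why" point made earlier in the deck.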
13. New Tools to Combat Epidemics
Amy O’Leary, NY Times, June 20, 2013
• Google Flu overestimates spread of flu in 2013
• Google Flu does not track new diseases
• BioMosaic
– Combines airline records, disease reports, demographic
data
– Website and iPad app
– Showed 5 counties in Florida, 5 counties in NY were
most at risk from cholera epidemic in Haiti in 2010
14. New York City’s Office of Policy & Strategic
Planning
• 1 terabyte of data flows into office every day
• 95% success rate in identifying restaurants
dumping cooking oil into sewers
• Doubled the hit rate of finding stores selling
bootleg cigarettes
• Sped removal of trees toppled by Sandy
• Guided building inspectors to increase citation
rate from 13 to 80% for buildings likely to have
catastrophic house fires
15. Algorithms Mine Public Data
• Atul Butte combined data from 130 studies of
gene activity levels in diabetic & healthy tissue
• Butte identified new gene associated with Type 2
DM because it stood out in 78/130 studies
• Algorithm looking for drugs & diseases that had
opposing effects on gene expression
– Cimetidine for lung adenocarcinomas
– Topiramate for Crohn’s Disease
16. Algorithms Mine Public Data
• Russ Altman used algorithms to mine Stanford
Translational Research Integrated Database
Environment & FDA adverse event reports
database
• Patients taking SSRI antidepressants and thiazide
are at increased risk for long QT syndrome, a
serious cardiac arrhythmia
17. Big Data
Viktor Mayer-Schonberger & Kenneth Cukier, 2013
• GPS allows us to establish location quickly,
cheaply, and without requiring specialized
knowledge
• UPS uses geo-loc data from sensors, wireless
modules, and GPS on vehicles
• 2011 UPS shaved 30 million miles off routes,
saved 3 million gallons of fuel, and 30,000 metric
tons of carbon dioxide emissions
18. Big Data
Viktor Mayer-Schonberger & Kenneth Cukier, 2013
• Datafication of acts of living
• Zeo large database of sleep patterns
• Asthmapolis attaches sensor to inhaler; tracks location
via GPS to identify environmental triggers
• Fitbit and Jawbone
• iTrem monitors Parkinson’s tremors almost as
well as the tri-axial accelerometer used in
specialized office medical equipment
19. Big Data for Cancer Care
Ron Winslow, WSJ, March 27, 2013
• ASCO
• Database of hundreds of thousands of patients
• Prototype has collected 100,000 breast cancer
patients from 27 groups who have different EMRs
• “Recognition that big data is imperative for the
future of medicine” Lynn Etheredge
• Less than 5% of adult cancer patients participate
in randomized clinical trials
20. Big Data
Viktor Mayer-Schonberger & Kenneth Cukier, 2013
• Recombinant data
• Danish Cancer Society study on cell phone/cancer
• Cellphone users from 1987 to 1995 (358,403)
• Brain cancer patients (10,729)
• Registry of education and disposable income
• Combining the three databases found no increase in risk of
cancer for those who used cell phones
• Not based on sample size; based on N=all
21. Big Data
Viktor Mayer-Schonberger & Kenneth Cukier, 2013
• Multiple uses of same database
• Data exhaust: digital trail people leave in their
wake
• Google spell-checking system uses bad data to
improve search, autocomplete feature in Gmail,
Google Docs, and translation system
22. Big Data
Viktor Mayer-Schonberger & Kenneth Cukier, 2013
• Paralyzing privacy
– Notice and consent
– Cannot give informed consent for secondary uses
– Anonymization does not work
• AOL 2006 20 million search queries from 657,000 users: NY
Times identified user number 4417749 as Thelma Arnold
(“My goodness, it’s my whole personal life. I had no idea
somebody was looking over my shoulder”)
• Netflix Prize 100 million rental records from 500,000 users.
Mother and closeted lesbian in Midwest was reidentified
23. Big Data
Viktor Mayer-Schonberger & Kenneth Cukier, 2013
• Probability and punishment
– Minority Report: People are imprisoned not for what
they did, but for what they are foreseen to do, even
though they never actually commit the crime
– Blue CRUSH (Crime Reduction, Utilizing Statistical
History in Memphis, Tennessee)
– Homeland Security FAST (Future Attribute Screening
Technology)
– Big data based on correlation unsuitable tool to judge
causality and thus assign individual culpability
24. Big Data
Viktor Mayer-Schonberger & Kenneth Cukier, 2013
• Dictatorship of Data
– Relying on numbers when they are far more fallible
than we think
– Robert McNamara’s body count numbers in Vietnam
– Michael Eisen tried to buy The Making of a Fly on
Amazon in April 2011. Two established sellers offering
the book for $1,730,045 and $2,198,177. Two-week
escalation to a peak of $23,698,655.93 on April 18
– Unsupervised algorithms priced the books for the two
sellers.
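A runaway price like this needs nothing more than two repricing rules that react to each other. A minimal sketch of such a feedback loop; the two multipliers, the starting price, and the function name are illustrative assumptions, not figures from the talk:

```python
def simulate_price_loop(start_price, undercut=0.998, markup=1.27, days=30):
    """Two sellers reprice once a day, each looking only at the other's price.

    Seller A lists just below seller B; seller B then lists well above
    seller A. Because undercut * markup > 1, both prices grow every day.
    """
    price_b = start_price
    history = []
    for _ in range(days):
        price_a = price_b * undercut   # A slightly undercuts B
        price_b = price_a * markup     # B reprices above A
        history.append((round(price_a, 2), round(price_b, 2)))
    return history


history = simulate_price_loop(100.0)
for day, (price_a, price_b) in enumerate(history, start=1):
    if day % 10 == 0:
        print(f"day {day}: A=${price_a:,.2f} B=${price_b:,.2f}")
```

Each rule is reasonable on its own; the growth comes from the product of the two multipliers being above 1, and no step checks the result against a plausible price.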
25. Big Data
Viktor Mayer-Schonberger & Kenneth Cukier, 2013
• Regulatory shift from “privacy by consent” to
“privacy through accountability”
• “Differential privacy” through deliberately
blurring the data so hard to reidentify people
• Openness, certification, disprovability
• Algorithmists to perform “audits”
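One standard way to "blur" a released statistic is the Laplace mechanism from the differential privacy literature. The slide does not name a specific method, so treat this as one possible sketch: the function name, the epsilon value, and the example count are assumptions chosen for illustration.

```python
import random


def private_count(true_count, epsilon, rng, sensitivity=1.0):
    """Release a count with Laplace noise of scale sensitivity / epsilon.

    Smaller epsilon means more noise and stronger privacy. The difference
    of two independent exponential draws with the same rate follows a
    Laplace distribution centred on zero.
    """
    scale = sensitivity / epsilon
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return true_count + noise


rng = random.Random(0)  # fixed seed so the sketch is repeatable
released = private_count(true_count=1000, epsilon=0.5, rng=rng)
print(f"released count: {released:.1f}")
```

Any single released value is off by a few units, so one person's presence in the data is hard to infer, while the average over many releases stays close to the true count.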
26. What Big Data Can’t Do
David Brooks, NY Times, February 26, 2013
• Data struggles with the social
• Data struggles with context
• Data creates bigger haystacks (spurious
correlations that are statistically significant)
• Data has trouble with big problems
• Data favors memes over masterpieces
• Data obscures values
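The "bigger haystacks" point can be demonstrated with pure noise: test enough unrelated variables against one outcome and some will pass a significance threshold by chance. A small simulation under stated assumptions: every number is randomly generated, the function names are invented, and the cutoff of 0.632 is roughly the two-sided 5% critical value of Pearson's r for 10 observations.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def count_chance_hits(n_variables, n_obs, threshold, seed):
    """Count pure-noise variables whose |r| with a noise target exceeds threshold."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_obs)]
    hits = 0
    for _ in range(n_variables):
        noise = [rng.gauss(0, 1) for _ in range(n_obs)]
        if abs(pearson(noise, target)) > threshold:
            hits += 1
    return hits


hits = count_chance_hits(n_variables=1000, n_obs=10, threshold=0.632, seed=42)
print(f"{hits} of 1000 unrelated variables look 'significant'")
```

Adding more candidate variables raises the number of false hits proportionally, which is why a larger dataset does not by itself make a correlation more trustworthy.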
27. What Big Data Will Never Explain
http://www.newrepublic.com/article/112734/what-big-data-will-never-explain
• “To datafy a phenomenon,” they explain, “is to
put it in a quantified format so it can be tabulated
and analyzed.”
• Sentiment analysis mathematical model for grief
called Good Grief Algorithm
• “The mathematization of subjectivity will founder
upon the resplendent fact that we are ambiguous
beings. We frequently have mixed feelings, and
are divided against ourselves.”
28. The Hidden Biases of Big Data
http://blogs.hbr.org/cs/2013/04/the_hidden_biases_in_big_data.html
• Big Data vs. Data with Depth
• “With enough data, the numbers speak for themselves.”
Chris Anderson
• Can numbers actually speak for themselves? Sadly, they
can't. Data and data sets are not objective; they are
creations of human design. We give numbers their voice,
draw inferences from them, and define their meaning
through our interpretations.
• Hidden biases in both the collection and analysis stages
29. The Hidden Biases of Big Data
http://blogs.hbr.org/cs/2013/04/the_hidden_biases_in_big_data.html
• Google Flu Trends vs. CDC
– 11% vs. 6% of US population infected
– Media coverage affected Google Flu Trends
• Boston’s StreetBump smartphone app
– 20,000 potholes a year need to be patched
– Poor areas have fewer cell phones, less service
• Hurricane Sandy 20 million tweets + Foursquare
– Grocery shopping day before
– Night life peaked day after
– Illusion Manhattan was hub of disaster
30. Automate This
Christopher Steiner, 2012
• Dr. Bot
– Always be convenient and available
– Know all your strengths and weaknesses
– Know every risk factor past conditions might signal
– Know your complete medical history
– Know medical history of last 3 generations of family
– Never make careless mistake in prescription
31. Automate This
Christopher Steiner, 2012
• Dr. Bot
– Always be up-to-date on treatments and discoveries
– Never fall into bad habits or ruts
– Monitor you at all times
– Always be searching for the hint of a problem by
monitoring pulse, cholesterol, blood pressure, weight,
lung capacity, bone density, changes in the air you
expel
32. Computers Are Just Not That Smart
• Eric Horvitz, MD of Microsoft
• Medical kiosk avatar interviews mother & child
with diarrhea
• Avatar decides child does not need to go to ER
• Avatar makes appointment with clinic
• The moderator of AI panel thought the avatar was
much more compassionate than the human triage
nurses she has encountered in NYC ERs
33. Vinod Khosla (Sun Microsystems)
http://techcrunch.com/2012/01/10/doctors-or-algorithms/
• Being part of the health care system is a
disadvantage to disrupting the status quo
• Machine learning system will be cheaper, more
accurate, and more objective than physicians
• Machine expertise would need to be in the 80th
percentile of human physician expertise
34. Vinod Khosla (Sun Microsystems)
http://techcrunch.com/2012/01/10/doctors-or-algorithms/
• Do we need doctors or algorithms
• “Health is like witchcraft and just based on
tradition”
• 80% of physicians will be replaced by machines
• 80% of doctors are below the top 20%
• We will not need average doctors
• Still need “doctors like Gregory House who solve
biomedical puzzles beyond our best input ability”
35. Will Robots Steal Your Job?
http://www.slate.com/articles/technology/robot_invasion/2011/09/will_robots_steal_your_job_3.single.html
• “At this moment, there's someone training for your
job. He may not be as smart as you are—in fact, he
could be quite stupid—but what he lacks in
intelligence he makes up for in drive, reliability,
consistency, and price. He's willing to work for
longer hours, and he's capable of doing better work,
at a much lower wage. He doesn't ask for health or
retirement benefits, he doesn't take sick days, and he
doesn't goof off when he's on the clock. What's
more, he keeps getting better at his job.”
36. How Robots Will Replace Doctors
http://www.washingtonpost.com/blogs/ezra-klein/post/how-robots-will-replace-doctors/2011/08/25/gIQASA17AL_blog.html
• “We’re not sitting in that room wrapped in a
garment made of the finest recycled sandpaper
because we were hoping for a good conversation.
We’re there because we’re sick…, and we’re
hoping this arrogant, hurried, credentialed genius
can tell us what’s wrong. We go to doctors not
because they’re great empaths, but because we’re
hoping medical school has made them into the
closest thing the human race has developed into
robots.”