Applications & Implications of Big Data for Official Statistics - Emmanuel Letouzé
1. Applications &
Implications of
Big Data for
Official Statistics
Emmanuel Letouzé
Director & co-Founder
Data-Pop Alliance
DfID, London, February 26, 2015
2. 1. The Emergence of Big Data &
The Statistical Tragedy
Framing and surfacing of the issue
2. Big Data and Official Statistics:
Substitute, Complement, or “It’s complicated”?
3. The Case of the SDGs
A story of fish and fishermen
3. 1. The Emergence of Big Data
vs. The Statistical Tragedy
Framing and surfacing of the issue
4.
5. Hal Varian’s nowcasting, GDP
and light emissions paper.…
Line shows returns for “Big Data” on Google Trends between 2007 and 2014; 100=maximum value
“We are at
the beginning
of what I call
The Industrial
Revolution of
Data.”
Joe Hellerstein, November 19, 2008
Context: the Big Data rush
17. 1. Early warning
2. Real-time awareness
3. Real-time feedback
Source: Letouzé, 2012
“What can it be used for?”
—Taxonomy of applications (1)
18. 1. Descriptive
-e.g. maps, clouds
2. Predictive:
-forecasting
-inference
3. Prescriptive
-causal inference
Source: Letouzé, Vinck and Meier, 2013
“What can it be used for?”
—Taxonomy of applications (2)
19. National
Statistical
Institutes carry
out surveys
Telefonica team
used their data to
‘predict’ SELs from
Cell Phone Usage
Predict the present
(SELs for non-
surveyed regions)
and monitor the
future (track
changes over time)
Survey from “a
major city in
Latin America”
Source: “Prediction of Socio-Economic Levels Using Cell-
Phone Records” (Telefonica research, 2011)
‘Predicting’ socioeconomic levels?
27. Source: Letouzé, 2014, based on primary and secondary sources
What & how much do we know?
….and does it matter?
Poverty prevalence 1990-2030
Fragile States vs. Non-Fragile States
29. “Official statistics assumes a key role in ensuring
democracy and fostering social
progress… [should] ‘provide society with
knowledge of itself’”
Enrico Giovannini, former President of Istat
Co-Chair of IEG on the Data Revolution
“Knowledge is power; statistics is democracy”
Former President of Statistics Finland
30. (2) Official
Statistics as
systems—not
reducible to
producing (1)
(1) Official
Statistics as
data—entirely
defined as
product of (2)*
* According to Fundamental Principles of Official Statistics
What is/are official statistics?
31. Official
Statistics
(1) Ensure that
societies benefit from
“knowledge of itself”
(according to some
political and
technical standards)
(2) Ensure that
societies benefit
from the presence
of a deliberative
public space
What is/are the purpose(s)
of official statistics?
32. (2) Ensure that societies
benefit from the
presence of a
deliberative public
space
It’s complex—and it’s political
(1) Ensure that societies
benefit from “knowledge of
itself” (according to some
political and technical
standards)
Official
Statistics
Big Data
33. Source: Letouzé, 2014, based on primary and secondary sources
(How) can data reduce poverty?
Poverty prevalence 1990-2030
Fragile States vs. Non-Fragile States
34. …by that Gary King means: it’s about the analytics
Jonathan Glennie
The Guardian, Oct 3, 2013
35. 3. The case of the SDGs
A story of fish and fishermen
36. SDGs adopted by the OWG | Big data examples | What is monitored | How is monitored | Country(ies) | Year | Advantages of using big data
1. Poverty eradication | Satellite data to estimate poverty [i] | Poverty | Satellite images, night-lights | Global map | 2009 | International comparable data, which can be updated more frequently
 | Estimating poverty maps with cell-phone records [ii] | Poverty | Cell phone records | Cote d’Ivoire | 2013-4 |
 | Internet-based data to estimate consumer price index and poverty rates [iii] | Price indexes | Online prices at retailers websites | Argentina | 2013 | Cheaper data available at higher frequencies
 | Cell-phone records to predict socio-economic levels [iv] | Socio-economic levels | Cell phone records | City in Latin America | 2011 | Data available more regularly and cheaper than official data; informal economy better reflected
2. End hunger, achieve food security and improved nutrition, and promote sustainable agriculture | Mining Indonesian Tweets to understand food price crises [v] | Food price crises | Tweets | Indonesia | 2014 |
 | Uses indicators derived from mobile phone data as a proxy for food security indicators [vi] | Food security | Cell phone data and airtime credit purchases | A country in Central Africa | 2014 |
 | Use of remote-sensing data for drought assessment and monitoring | Drought | Remote sensing | Afghanistan, India, Pakistan [vii] | 2004 |
 | | | | China [viii] | 2008 |
3. Health | Internet-based data to identify influenza breakouts [ix] | Influenza | Google search queries | US | 2009 | Real-time data; captures disease cases not officially recorded; data available earlier than official data
 | Data from online searches to monitor influenza epidemics [x] | Influenza | Online searches data | China | 2013 |
 | Detecting influenza epidemics using twitter [xi] | Influenza | Twitter | Japan | 2011 |
 | Monitoring influenza outbreaks using twitter [xii] | Influenza | Twitter | US | 2013 |
 | Systems to monitor the activity of influenza-like-illness with the aid of volunteers via the internet [xiii], [xiv] | Influenza | Voluntary reporting through the internet | Belgium, Italy, Netherlands, Portugal, United Kingdom, United States | ongoing |
 | Cell-phone data to model malaria spread [xv] | Malaria | Cell-phone data | Kenya | 2012 |
 | Using social and news media to | Cholera | Social and news | Haiti | 2012 |
SDG monitoring & Big Data
Example of our work: Official statistics
Why is it important?
Africa's statistical tragedy
Plato's cave
Big data == solution?
Methods
Technical assistance
Data literacy
Ethics
Convening
Marcelo Giugale
Fix Africa's Statistics
Posted: 12/18/2012 4:17 pm EST Updated: 02/17/2013 5:12 am EST
How would you feel if you were on an airplane and the pilot made the following announcement: "This is your captain speaking. I'm happy to report that all of our engines checked fine, we have just climbed to 36,000 feet, will soon reach our cruising speed, and should get to our destination right on time.... I think. You see, the airline has not invested enough in our flight instruments over the past 40 years. Some of them are obsolete, some are inaccurate and some are just plain broken. So, to be honest with you, I'm not sure how good the engines really are. And I can only estimate our altitude, speed and location. Apart from that, sit back, relax and enjoy the ride." This is, in a nutshell, the story of statistics in Africa. Fueled by its many natural resources, the region is growing fast, is finally beginning to reduce poverty and seems headed for success. Or so we think, for there are major problems with its data, problems that call for urgent, game-changing action.
First, we don't really know how big (or small) many African economies are. In about half of them, the system of "national accounts" dates back to the 1960s (1968, to be precise); in the other half, it is from 1993. This means that measuring things like how much is produced, consumed or invested is done with methods from the times when computers were rare, the Internet did not exist and nobody spoke about "globalization." That is, the methodology ignores the fact that some industries have disappeared and new ones were born.
How badly does this skew the data? Well, to give you an idea, when Ghana used a newer methodology to update its accounts in 2010, it found out that its economy was about 60 percent bigger than it had previously thought -- and the country instantly became "middle-income" in the global ranking. [Old-timers have a neat way to tell when the size of an economy is underestimated in countries with weak institutions: if what the government collects in taxes is equivalent to more than a fifth of the country's "gross domestic product," then gross domestic product is probably larger than what the official numbers show.]
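The old-timers' rule of thumb in the bracketed aside above can be sketched in a few lines of Python. This is only an illustration: the function names and the figures (a reported GDP of 100 and tax revenue of 24, in arbitrary currency units) are hypothetical and are not taken from the article or from any country's accounts. Only the one-fifth threshold and the roughly 60 percent upward revision come from the text.

```python
def tax_to_gdp_ratio(tax_revenue, reported_gdp):
    """Tax revenue as a share of reported GDP."""
    return tax_revenue / reported_gdp


def gdp_may_be_underestimated(tax_revenue, reported_gdp, threshold=0.20):
    """Rule of thumb from the text: a tax take above one fifth of reported
    GDP, in a country with weak institutions, hints that GDP is understated."""
    return tax_to_gdp_ratio(tax_revenue, reported_gdp) > threshold


def rebased_gdp(old_gdp, upward_revision):
    """GDP after a proportional upward revision (0.60 means 60 percent)."""
    return old_gdp * (1 + upward_revision)


# Hypothetical figures in arbitrary currency units (assumptions, not real data).
reported_gdp = 100.0
tax_revenue = 24.0

flag_before = gdp_may_be_underestimated(tax_revenue, reported_gdp)
revised_gdp = rebased_gdp(reported_gdp, 0.60)  # a Ghana-sized revision
flag_after = gdp_may_be_underestimated(tax_revenue, revised_gdp)

print(f"ratio before rebasing: {tax_to_gdp_ratio(tax_revenue, reported_gdp):.2f}, flagged: {flag_before}")
print(f"ratio after rebasing:  {tax_to_gdp_ratio(tax_revenue, revised_gdp):.2f}, flagged: {flag_after}")
```

With these made-up numbers the ratio falls from 0.24 to 0.15 once GDP is revised upward, so the warning flag clears. The heuristic is a rough screen, not a measurement.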
Second, the latest poverty counts for Africa are, on average, five years old. So we only have guesstimates of how the global financial, food and fuel crises have impacted the distribution of income, wealth and opportunities in the region. This is because, to count the poor, you need "household surveys" -- those face-to-face, home visits where people are asked how much they earn, own, know and so on. In fifteen African countries, this has been done only once since 2000. Ironically, technology now allows for the surveys to be done not only more frequently, but continuously. You give families a cell-phone free of charge in exchange for them answering a questionnaire, say, twice a month. And you don't need to ask every household; about three thousand are enough -- that's the beauty of statistical sampling. So, why is it not done? Coming to that in a minute.
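The claim that about three thousand households are enough can be checked with the textbook margin-of-error formula for a proportion. The sketch below assumes simple random sampling, a 95 percent confidence level (z = 1.96) and the worst-case proportion of 0.5; the function name is illustrative. Real household surveys typically use clustered designs, which widen the margin, so treat the result as a lower bound on the uncertainty.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Half-width of the normal-approximation confidence interval for a
    proportion, assuming simple random sampling from a large population."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


# Compare the sample size mentioned in the text with smaller and larger ones.
for n in (300, 3000, 30000):
    print(f"n = {n:>6}: +/- {100 * margin_of_error(n):.1f} percentage points")
```

For 3,000 households the margin is about 1.8 percentage points, and it does not depend on how large the country's population is. Going from 3,000 to 30,000 respondents only brings it down to roughly 0.6 points, which is why a few thousand well-chosen households go a long way.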
Third, "industrial" surveys are even more infrequent than household surveys -- only a handful of African countries have done at least one in the last ten years. This is a pity. Knowing what your producers are doing -- and what keeps them from producing more -- is critical if you want to design policies that increase employment, productivity and economic growth. To be sure, academics, NGOs, development banks and business organizations carry out sporadic surveys of enterprises for one purpose or another -- from understanding how informal jobs are created to selling logistical services. But regular, comprehensive, nation-wide data is, at best, rare.
What's true for African employers is also true for African employees. "Labor market" surveys are few and far between -- most workers, remember, are informal and tend to shy away from answering questions by public officials. So when you ask about the unemployment rate in Africa, you are likely to be given a number that means little, is old, or both.
And how about the good old "census" -- that once-in-a-while count of a country's entire population? Censuses are the only time when we learn how many we are, how fast we are aging, where we live, how we live and lots of other information that helps governments make smart(er) decisions on things like health care, school construction or crime prevention. Experts say that you should have a census every ten years. Sixteen African countries have fallen behind that tempo -- which means that, at the moment, we don't know much about a third of those who live in the region. Counting people is particularly important in African countries that get income from extracting oil, gas or minerals, which is most of them. That income is supposed to be shared among provinces, counties and municipalities on the basis of their population size -- how that's done in practice if the data is outdated or wrong beats a statistician's guess.
How does one even start tackling a problem like that? The truth is that a lot of money has been invested in improving Africa's statistics. Most of that money came as donations from well-meaning rich countries, and went to fund "institutional development," that is, to train and equip national statistics offices. According to the "Partnership in Statistics for Development in the 21st Century" (aka "PARIS21"), between 2009 and 2011 alone, Africa received 700 million dollars to build up its capacity to collect data. That led to some progress, but dismally short of what is needed. Why? Mostly politics. Solution? A mix of democracy and technology.
It is much more difficult to mess with a country's statistics when people are free to complain about it. Democratization has brought a new appetite for information to the average African citizen, much of it expressed through a data-hungry media. The number of recalcitrant governments that gather data but refuse to release it, or release it when it is obsolete-late, is falling -- slowly but surely. There is even talk of inviting independent experts ("high-level technical commissions") to regularly vet whatever official figures are put out. And the legal walls that protect public statisticians from meddling politicians are hardening -- Senegal pioneered the trend in the early 2000s. This is good, because national statistics offices are like central banks: once you recruit the best technicians, you want to step back and let them do their job.
But what will really revolutionize African statistics is communication technology. The continent is embracing cellular telephony with gusto; it is only a question of time before its people can become regular respondents in censuses and surveys. [Disclaimer: this writer works with a team that is trying to do just that through a project code-named, you guessed, "Listening to Africa."] Satellite imagery can now be used to literally see and gauge, from outer space, economic activity in ports, highways and markets. And tracking what people do, search or talk about on the web -- their "data exhaust" -- gives you a sense of what they are up to as workers, consumers and investors. All of this is yet to be deployed in Africa. To fix its data problem, the region should not just bring up to par the statistical systems it currently has; it may also want to leapfrog into tomorrow's systems.
Big data == solution?
Technocratic bias
Algorithmic regulation
privacy, ethics.
Digital divide
Big data hubris? More data always better?
What have CDRs (call detail records) been used to predict?
Disease dynamics
Disaster response
Socio-economic indicators
Personality
Are official stats more than shadows in the cave?
UN call for a Data Revolution
Post 2015 agenda
Expert panel
Should cite drawing!
Allegory of the cave, Plato
Plato has Socrates describe a gathering of people who have lived chained to the wall of a cave all of their lives, facing a blank wall. The people watch shadows projected on the wall by things passing in front of a fire behind them, and begin to designate names to these shadows. The shadows are as close as the prisoners get to viewing reality. He then explains how the philosopher is like a prisoner who is freed from the cave and comes to understand that the shadows on the wall do not make up reality at all, as he can perceive the true form of reality rather than the mere shadows seen by the prisoners.
Example of stop and frisk: how far do we want a data-driven society? Who owns the data? Could big data, as a new class of assets, empower people by counteracting the natural tendency of capital to accumulate (cf. Piketty)?