How to Use Data for Good

Join our #DataTalk on Thursdays at 5 p.m. ET. This week, we learned from DataKind (Harnessing the Power of Data Science in the Service of Humanity); Real Impact Analytics; Elissa Redmiles, a Data Science for Social Good Summer Fellow at the University of Chicago; Nick Eng, Data Scientist at the Center for Data Science and Public Policy at the University of Chicago; Kevin Chen, Chief Scientist at the Experian North America Data Lab; and others.

How to Use Data for Good

  1. #DataTalk: How to Use Data for Good
  2. Join our #DataTalk on Thursdays at 5 p.m. ET. This week, we talked with DataKind; Real Impact Analytics; Elissa Redmiles, a Data Science for Social Good Summer Fellow at the University of Chicago; Nick Eng, Data Scientist at the Center for Data Science and Public Policy at the University of Chicago; and Kevin Chen, Chief Scientist at the Experian Data Lab. Check out the resources and tweets from this chat: ex.pn/dataforgood
  3. What is a “data for good” project?
  4. Real Impact Analytics (@RIAnalytics): Data for good is the use of big data to help policymakers and aid workers foster the social public good and maximize impact. ex.pn/datatalk #DataTalk
  5. Elissa Redmiles, Data Science for Social Good Summer Fellow at the University of Chicago (@eredmil1): Data for good projects use data and data science to help nonprofits better achieve their mission and assist their target audience.
  6. Nick Eng, Data Scientist, Center for Data Science at the University of Chicago (@nick_eng): Projects that use data to improve society as a whole, rather than any single individual. To be more specific: communities and cities.
  7. DataKind (@DataKind): Data and subject matter experts working together to use data to address humanitarian challenges. Collaboration is key!
  8. Kevin Chen, Chief Data Scientist, Experian Data Lab (@kevincchen): Projects that improve social equality by leveraging public and private data.
  9. Nick Eng (@nick_eng): Using data to help the underprivileged, especially those who might not know how data can be used as a tool.
  10. What are some favorite examples of how data has been used for good?
  11. Elissa Redmiles (@eredmil1): One of the projects that drew me to @DataSciFellows was the #NurseFamilyPartnership project, which used data science to predict which people are in need.
  12. DataKind (@DataKind): Using anonymous mobile location data in aggregated ways to identify mobility patterns and design better public transportation.
  13. Litterati (@Litterati): We leverage data to get smarter about our litter patterns.
  14. Real Impact Analytics (@RIAnalytics): Our favorite D4G example is the use of telecom data with the UN Global Pulse team in Uganda to detect and tackle food crises and prioritize actions against poverty. More specifically, we have developed a mapping of income inequality and income shocks in Africa using changes in pre-paid patterns.
  15. DataKind (@DataKind): Another neat one from the Data Science Bowl: convolutional neural nets to predict ocean health.
  16. Real Impact Analytics (@RIAnalytics): A second powerful example of D4G is the use of telecom mobility data to identify, prevent and treat contagious diseases such as Ebola, malaria and cholera. We have been able to identify micro-communities as well as mobility patterns. This helps identify key routes to block and assess the potential impact on the spread of a disease.
  17. Nick Eng (@nick_eng): Beyond predictive models and confidential datasets, products like clearstreets.org simplify our lives using open data.
  18. DataKind (@DataKind): @DataKindUK volunteers mapped public data to help @SSChospices find children in need of hospice care.
  19. Melissa Correia (@melissacorreia): Child welfare agencies are using sophisticated analyses to improve outcomes for kids in foster care.
  20. What challenges do organizations face when working on data philanthropy projects?
  21. DataKind (@DataKind): One challenge is defining a clear question upfront for the project that will help an organization maximize impact.
  22. Elissa Redmiles (@eredmil1): Organizational culture is very important. Having good data to analyze and resources directed toward analysis are key.
  23. Kevin Chen (@kevincchen): Finding a balance between retaining proprietary knowledge of data or technology and applying it to data for good projects can be hard.
  24. Nick Eng (@nick_eng): Implementation! Fancy models or cool visualizations are only step one. Making these tools part of the day-to-day is step two.
  25. Real Impact Analytics (@RIAnalytics): We can see three types of challenges: (i) design of the tools and apps; (ii) access to data; (iii) alignment of the ecosystem.
  26. Real Impact Analytics (@RIAnalytics): The operational challenge is mostly to repackage research insights so that they have a real impact on the decisions of aid workers in the field. Many tools are not simple enough for daily field use, are not actionable enough, and have usually not been designed around an actual worker’s needs.
  27. Real Impact Analytics (@RIAnalytics): The technical challenge is to be able to connect to relevant data sources, whether external data sources (e.g. WHO, World Bank) or telecom data sources.
  28. Real Impact Analytics (@RIAnalytics): The legal and regulatory challenge is to coordinate our approach with local regulators and secure the data handling process in terms of privacy, anonymization of data and remote access. All data must remain on the telecom operator's premises within the country. This last challenge can be partly addressed by securing a sustainable ecosystem involving all parties.
  29. Nick Eng (@nick_eng): And figuring out exactly what the problem is, and framing it. We don’t always know the domain. We need your help and feedback.
  30. IoT Channel (@IoTchannel): A key challenge is to protect user and client information and data so that it is not compromised or leaked.
  31. DataKind (@DataKind): There is also the challenge (and fun) of prepping and cleaning data before you dive in.
  32. What type of data can be used for data for good projects?
  33. DataKind (@DataKind): Time series, text, audio, geo, etc. We need to make sure privacy is preserved and that the data doesn’t promote discrimination.
  34. Elissa Redmiles (@eredmil1): Many different formats are usable: database data, Excel data and CSV data are all easy to process, but text and web data work, too.
  35. Real Impact Analytics (@RIAnalytics): Telecom data are particularly valuable in emerging markets, as they are collected systematically, locally and in real time. These data can be complemented by two data sources: (i) external or public databases, such as occurrences of a specific disease in a specific location; (ii) additional or ad-hoc data collected through a mobile application. The most important limitation is the possibility of re-identifying individual people from the shared insights or tools. This would dramatically undermine the scaling up of Data for Good.
  36. Elissa Redmiles (@eredmil1): Real good can be done with access to internal data without releasing this data publicly.
  37. Nick Eng (@nick_eng): Open data and APIs are a great start. Check out the new CitySDK from the Census.
  38. Elissa Redmiles (@eredmil1): We focus more on internal vs. external data and on data completeness than on formats.
  39. Nick Eng (@nick_eng): And when structured data isn’t available, you can get creative and make your own data (e.g. by scraping websites).
  40. DataKind (@DataKind): Totally agree with Nick Eng on getting creative with scraping websites and on not forgetting data sources like satellite imagery.
  41. DataKind (@DataKind): One example: @kvarshney worked with @Give_Directly using satellite imagery to target villages in need.
  42. What are some best practices for using data for good?
  43. Kevin Chen (@kevincchen): Garbage in, garbage out. Validate and carefully examine the data.
  44. Real Impact Analytics (@RIAnalytics): The best D4G solutions provide action-oriented insights to end-users, which are supported by science and easily accessible by mobile.
  45. Real Impact Analytics (@RIAnalytics): We need to understand the actual needs of the potential users, assess the correlation between available data and possible actions and outcomes, and adapt apps and algorithms accordingly.
  46. Real Impact Analytics (@RIAnalytics): We need to be able to refresh and operationalize the tools offering mobile access to insights; we need to technically secure access to the data and ensure privacy; and we need to be able to measure impact and correct the algorithms accordingly. Overall, trust is one of the key overarching success factors, as it allows a smooth decision flow and maximizes impact. Therefore, we need to build strong partnerships with international institutions to ensure the global impact and scalability of our actions.
  47. EWD Rozier (@PrarieScience): The biggest step for using data for good is finding a committed, involved partner who will help transition to practice.
  48. DataKind (@DataKind): Love this guide from @engrnroom. Great read on how to practice responsible development data.
  49. Nick Eng (@nick_eng): When doing a project, make sure it’s a constant partnership with your other stakeholders (e.g. nonprofits).
  50. Elissa Redmiles (@eredmil1): Talk to SMEs and find the domain knowledge you don’t have. Data is only half the puzzle.
  51. What are ways to use data for good, while protecting privacy?
  52. EWD Rozier (@PrarieScience): Right now, it’s very ad-hoc; to move forward we need new data privacy solutions that allow proofs of privacy preservation.
  53. Elissa Redmiles (@eredmil1): @DataSciFellows keeps data secure while letting the code for processing the data be open source.
  54. Kevin Chen (@kevincchen): Use the data in aggregates (e.g. finding activity patterns of city dwellers using aggregated mobile phone activity).
  55. Elissa Redmiles (@eredmil1): We keep data science code open source so that other nonprofit organizations can use these resources to process their own data.
  56. EWD Rozier (@PrarieScience): We’ve been working on solutions for homomorphisms for database operations to create a privacy-aware kernel for data science.
  57. DataKind (@DataKind): Shouting out @CrisisTextLine: they provide personalized care to those in crisis via text messages while protecting privacy.
  58. EWD Rozier (@PrarieScience): The hard part about privacy-preserving operations is the current limits on performable homomorphisms.
  59. Kevin Chen (@kevincchen): Adding noise to the data, bucketing the values (e.g. age), or using a coarser level of information (e.g. zip3 vs. zip5) when possible are a few ways.
  60. What type of data philanthropy would you like to see happen?
  61. Elissa Redmiles (@eredmil1): We need more public info showcasing the impact of using data science for good.
  62. Nick Eng (@nick_eng): And more scalable ways to help nonprofits determine how data can help them.
  63. Kevin Chen (@kevincchen): I would really like to see projects that use data to help understand, prevent, intervene in, and treat cancers.
  64. EWD Rozier (@PrarieScience): The biggest plausible projects I want to see are cities becoming more data savvy, like Chicago. Public access democratizes science.
  65. Real Impact Analytics (@RIAnalytics): We would like to co-design a sustainable, scalable and open ecosystem of mobile anti-poverty apps together with other developers, NGOs, international agencies and philanthropists. Different types of apps will be required, such as apps supporting NGOs in disaster relief or during a sudden outbreak of a contagious disease, or apps supporting ministries or public authorities in their decision-making by optimizing the targeting and impact of public policies. Most emerging countries lack data about their populations.
  66. Elissa Redmiles (@eredmil1): I also think it’s important for more corporations like Experian and IBM to raise awareness of #Data4Good projects.
  67. What are ways to support organizations and data scientists working in data philanthropy?
  68. Real Impact Analytics (@RIAnalytics): Philanthropists can best support D4G by joining the dialogue with app developers and end-users on the public questions to address. There is a clear need to fund specific apps and an operational platform to ensure that Data for Good becomes not only a science but also fosters operational impact. Securing such a platform with a first set of apps will generate spillovers and a positive dynamic among the communities of developers, NGOs and public institutions.
  69. EWD Rozier (@PrarieScience): Focus on open source tools. I will be controversial: we need to move away from prototyping in Python to a performable ecosystem.
  70. Elissa Redmiles (@eredmil1): Agree with @PrarieScience. @NSF funding for outcomes-based #DataScience projects is key, especially for training data scientists.
  71. Kevin Chen (@kevincchen): Corporations can provide funding and recognition to their data scientists to encourage participation in data philanthropy projects.
  72. Nick Eng (@nick_eng): Maybe start by finding your local #Data4Good community. Strength in numbers!
  73. DataKind (@DataKind): Funders can play a big role in supporting nonprofits to expand the use of data beyond reporting.
  74. Any final tips for data scientists who want to use data for good?
  75. Real Impact Analytics (@RIAnalytics): Our main tip is to collaborate, as an operational open ecosystem is critical to realizing our shared vision of a healthy and poverty-free world. Data for Good is at the crossroads of multiple skill sets, such as data science, software development, algorithm design, epidemiology, traffic modelling, field work involving poor communities in emerging markets, and telecom regulation. There is no chance one organization could offer all of these internally. Data for Good needs to offer both operational tools and scientific insights.
  76. Elissa Redmiles (@eredmil1): Don’t be discouraged by imperfect data!
  77. Nick Eng (@nick_eng): Data will always be messy. Especially from nonprofits.
  78. Kevin Chen (@kevincchen): Let the data speak. Interpret the results objectively.
  79. Nick Eng (@nick_eng): Start simple. Simple projects can sometimes make the biggest impact.
  80. Join our #DataTalk on Twitter on Thursdays at 5 p.m. ET. experian.com/datatalk
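
The privacy suggestions in slides 54 and 59 (use the data in aggregates, bucket values such as age, use zip3 instead of zip5, add noise) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a method described in the chat: the record fields (`age`, `zip`), the ten-year bucket width, the function names and the Laplace noise are all hypothetical choices, and the noise scale here is not calibrated to any formal privacy guarantee.

```python
import random
from collections import Counter


def bucket_age(age, width=10):
    """Replace an exact age with a range label such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def coarsen_zip(zip5):
    """Keep only the first three digits of a five-digit ZIP code (zip3)."""
    return zip5[:3]


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    if scale == 0:
        return 0.0
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_group_counts(records, noise_scale=1.0, seed=None):
    """Count records per (zip3, age bucket) group and perturb each count.

    Only coarsened, aggregated, noisy counts are returned; individual
    records are never part of the output.
    """
    rng = random.Random(seed)
    groups = Counter(
        (coarsen_zip(r["zip"]), bucket_age(r["age"])) for r in records
    )
    return {key: n + laplace_noise(noise_scale, rng) for key, n in groups.items()}


if __name__ == "__main__":
    # Hypothetical records, invented purely for illustration.
    records = [
        {"age": 34, "zip": "60614"},
        {"age": 37, "zip": "60611"},
        {"age": 52, "zip": "60614"},
    ]
    for group, count in sorted(noisy_group_counts(records, seed=0).items()):
        print(group, round(count, 2))
```

Wider age buckets, shorter ZIP prefixes or a larger `noise_scale` trade accuracy for less exposure of individuals; which setting is acceptable depends on the data and the project.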
