In this webinar, we look at how you obtain and use open data, the key role of search engines and how you establish trust in the data you find. The webinar will also look at the quality of data and how to clean and prepare data for analysis. Finally, the session will look at how you can quickly visualise cleaned data and the applications of this in the agriculture sector.
Using Open Data - David Tarrant
1. Content created by
The Open Data Institute
Using Open Data
Dr David Tarrant | @davetaz | The Open Data Institute
Agenda
Discovering open data
Quality and provenance
Data analysis and visualisation
Open data in policy cycles
Referencing data
Google advanced
site: Get results only from certain sites or domains
link: Find pages that link to a certain page
related: Find sites similar to one you already know
filetype: Find certain file types only
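As a rough illustration of how these operators combine, the sketch below builds a query string from free-text terms plus optional `site:` and `filetype:` restrictions. The function name `build_query` and the example search terms are assumptions made for this example, not part of the webinar.

```python
def build_query(terms, site=None, filetype=None):
    """Combine free-text terms with optional site: and filetype: operators."""
    parts = [terms.strip()]
    if site:
        # Restrict results to one site or domain.
        parts.append(f"site:{site}")
    if filetype:
        # Restrict results to one file type, e.g. csv or xls.
        parts.append(f"filetype:{filetype}")
    return " ".join(parts)


# Hypothetical search: CSV files about crop yields on one domain.
query = build_query("crop yield statistics", site="fao.org", filetype="csv")
print(query)
```

The resulting string can be pasted straight into the search box.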
Aggregators and portals
Collect together data from across the web into one place.
Examples: FAO, World Bank
Scraping
If you can’t obtain usable data (CSV, XLS) then you may have to resort to scraping.
Tools: pdftables.com, magic.import.io
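Where the data sits in an HTML table rather than a PDF, a small script can do the scraping. The following is a minimal sketch using only the Python standard library; the `TableScraper` class, the `table_to_csv` helper and the sample page are all invented for illustration, and real pages usually need more careful handling (nested tables, merged cells, and so on).

```python
import csv
import io
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_csv(html):
    """Return the table rows found in `html` as CSV text."""
    scraper = TableScraper()
    scraper.feed(html)
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(scraper.rows)
    return out.getvalue()


# Made-up sample page, for illustration only.
SAMPLE = """
<table>
  <tr><th>Country</th><th>Yield</th></tr>
  <tr><td>A</td><td>3.2</td></tr>
  <tr><td>B</td><td>4.1</td></tr>
</table>
"""
print(table_to_csv(SAMPLE))
```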
Open Data Certificate
http://certificates.theodi.org
Establishing trust in data
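Who
• Who collected it?
• Who owns it?
• Who publishes it?
• Who is the audience?
What
• What is it (title/description)?
• What type of data is it?
• What type of objects?
When
• When was it collected?
• When was it published?
• When was it updated?
• When is the next update due?
Where
• Where was it collected?
• Where is it used?
• Where is it described?
• Where is it located?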
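One way to apply these questions in practice is to treat them as a checklist against a dataset's metadata. The sketch below assumes the metadata is available as a simple dictionary; every field name in `CHECKLIST` is hypothetical, since real portals label these details differently.

```python
# Hypothetical field names; real portals label their metadata differently.
CHECKLIST = {
    "who": ["collector", "owner", "publisher", "audience"],
    "what": ["title", "description", "data_type"],
    "when": ["collected", "published", "updated", "next_update"],
    "where": ["collection_location", "known_uses", "documentation_url", "download_url"],
}


def missing_provenance(metadata):
    """Return {question: [fields that are absent or empty]} for a metadata dict."""
    gaps = {}
    for question, fields in CHECKLIST.items():
        absent = [field for field in fields if not metadata.get(field)]
        if absent:
            gaps[question] = absent
    return gaps


# Made-up record, for illustration only.
record = {
    "title": "Example crop yields",
    "publisher": "Example Ministry",
    "published": "2015-06-01",
    "download_url": "https://example.org/yields.csv",
}
for question, fields in missing_provenance(record).items():
    print(f"{question}: missing {', '.join(fields)}")
```

The more gaps the check reports, the more caution the dataset deserves before it is relied upon.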
OpenRefine
http://openrefine.org
A free power tool for cleaning messy data
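To give a feel for the kind of cleaning involved, the sketch below groups inconsistent spellings of the same value. It is loosely modelled on the key-collision "fingerprint" clustering that OpenRefine offers, but it is a simplification written for this example, not OpenRefine's own code; the function names and sample values are assumptions.

```python
import string
from collections import defaultdict


def fingerprint(value):
    """Normalise a string so that trivially different spellings collide."""
    value = value.strip().lower()
    # Drop ASCII punctuation, then sort and de-duplicate the words.
    value = value.translate(str.maketrans("", "", string.punctuation))
    tokens = sorted(set(value.split()))
    return " ".join(tokens)


def cluster(values):
    """Group values sharing a fingerprint; keep only groups with variants."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return {key: sorted(vs) for key, vs in groups.items() if len(vs) > 1}


# Made-up messy column, for illustration only.
names = ["United Kingdom", "united kingdom ", "Kingdom, United",
         "France", "FRANCE", "Kenya"]
for key, variants in cluster(names).items():
    print(key, "->", variants)
```

Each reported group is a candidate for merging into a single agreed spelling; a person should still review the groups before merging.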
Remember
• Not all data is structured
• Not all numeric data is structured
• Some text data is structured
Beware!
• Targets
• Fluctuation
• Chance
• Correlation != Causation
https://xkcd.com/925/
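The last warning is easy to demonstrate. The sketch below computes a Pearson correlation coefficient for two series that have nothing to do with each other except that both grow over time. The numbers and series names are synthetic, invented purely for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Synthetic numbers invented for this example: two unrelated
# quantities that both happen to rise year on year.
tractor_sales = [10, 12, 15, 19, 24, 30]
broadband_subscriptions = [100, 130, 170, 220, 280, 350]

r = pearson(tractor_sales, broadband_subscriptions)
print(f"r = {r:.3f}")
```

The coefficient comes out close to 1, yet neither series causes the other; a shared upward trend is enough to produce it.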
Analysing qualitative data
Entity recognition can help with coding and thematic network analysis.
Try Open Calais
Search: open calais
Visualisation
Not all data visualisations are good!
Picking the right visualisation
1) Audience
• Who is your audience and what do they expect?
2) Purpose
• What story are you trying to tell?
3) Data
• What types of visualisation suit the data?
Keep it simple!
Which country achieved the greatest crop yield in 2014?
Nothing wrong with a bar chart
Observe how you don’t need unnecessary clutter, like axes and labels you can’t read.
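As a small sketch of the idea, the code below renders a sorted horizontal bar chart as plain text, with each value printed next to its bar so that no axis is needed. The `bar_chart` function is an assumption made for this example, and the figures are placeholders, not real 2014 yields.

```python
def bar_chart(values, width=30):
    """Render a sorted horizontal bar chart with each value beside its bar."""
    longest = max(len(name) for name in values)
    top = max(values.values())
    lines = []
    # Largest value first, so the answer to "which is greatest?" is the top row.
    for name, value in sorted(values.items(), key=lambda kv: kv[1], reverse=True):
        bar = "#" * round(width * value / top)
        lines.append(f"{name.ljust(longest)}  {bar} {value}")
    return "\n".join(lines)


# Placeholder figures invented for this example, not real 2014 yields.
yields = {"Country A": 3.1, "Country B": 7.4, "Country C": 5.0}
print(bar_chart(yields))
```

Sorting the bars and labelling them directly means the reader can answer the question at a glance.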
Simple lines and interactivity
https://www.nytimes.com/interactive/2017/01/15/us/politics/you-draw-obama-legacy.html?_r=0
The policy cycle
Open data helps at every stage of the policy cycle!
Example policy
Agenda: To publish more open data on agriculture from universities.
Why? To increase the benefit from this data to improve agriculture worldwide.
But what is the benefit to those who already hold the data?
Understanding researchers
Universities are ranked on the quality of their research, which is linked to publication.
Therefore, if data publication can hold the same value and benefit, then we should see more data.
How research creates impact
1) The journal of publication
2) The number of citations the paper has
Doing the same for research data
1) Create reputable places to share data
2) Create a way to link/reference the data, including an index
3) Mandate the publication of research data
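For step 2, a persistent identifier such as a DOI gives a dataset a stable reference that can be cited like a paper. The sketch below formats a citation loosely following the "Creator (PublicationYear). Title. Publisher. Identifier" pattern that DataCite recommends. The `cite_dataset` function, the names and the DOI are placeholders invented for this example.

```python
def cite_dataset(creators, year, title, publisher, doi):
    """Format a dataset citation with the DOI written as a resolvable URL."""
    authors = "; ".join(creators)
    return f"{authors} ({year}). {title}. {publisher}. https://doi.org/{doi}"


# Placeholder values, for illustration only; this DOI does not resolve.
citation = cite_dataset(
    ["Example, A.", "Sample, B."],
    2016,
    "Example field trial yields",
    "Example University",
    "10.5072/example-dataset",
)
print(citation)
```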
https://blog.datacite.org/general-assembly-2016/
Recap
Discovering open data
Quality and provenance
Data analysis and visualisation
Open data in policy cycles
Referencing data
Thank you
Dr David Tarrant | @davetaz | The Open Data Institute
https://xkcd.com/552/