4. Name      Gender  Age  Height  Feeling
   Mandy     F       21   150cm   Swamped
   Shani     F       23   167cm   Nervous
   Zizo      F       25   167cm   Curious
   Ashleigh  F       22   163cm   Relaxed
   Danyal    M       22   156cm   Optimistic
   Jason     M       36   200cm   Flustered
   Hannah    F       35   167cm   Very excited
   Phumlani  M       24   180cm   Grumpy
   Milena    F       29   160cm   Excited
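As a rough illustration, the table above could be loaded into typed records before any analysis. This is only a sketch: the parsing rules (single-word names, a "cm" suffix on heights, a feeling that may span several words) and the field names are assumptions made for this example.

```python
# Rows copied from the table above; the parsing rules are assumptions.
RAW_ROWS = """\
Mandy F 21 150cm Swamped
Shani F 23 167cm Nervous
Zizo F 25 167cm Curious
Ashleigh F 22 163cm Relaxed
Danyal M 22 156cm Optimistic
Jason M 36 200cm Flustered
Hannah F 35 167cm Very excited
Phumlani M 24 180cm Grumpy
Milena F 29 160cm Excited
""".splitlines()


def parse_row(line):
    # The first four fields are single tokens; the feeling may contain spaces.
    name, gender, age, height, feeling = line.split(maxsplit=4)
    if height.endswith("cm"):
        height = height[:-2]
    return {
        "name": name,
        "gender": gender,
        "age": int(age),            # a number, so it can be compared and averaged
        "height_cm": int(height),   # unit moved into the (hypothetical) field name
        "feeling": feeling,         # free text
    }


people = [parse_row(row) for row in RAW_ROWS]
tallest = max(people, key=lambda person: person["height_cm"])
print(len(people), "records; tallest:", tallest["name"])
```

Converting age and height to numbers at load time makes it obvious which columns can be measured and which can only be described.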
5. Data types
● QUALITATIVE DATA: everything that refers to the
quality of something. Descriptions of colours, of the texture
and feel of an object, and of experiences, as well as
interviews, are all qualitative data.
● QUANTITATIVE DATA: data that refers to a number.
6. Data types
● DISCRETE DATA: numerical data with values which
are distinct and separate, i.e. they can be counted.
Examples include the number of kittens in a litter and
the number of patients in a doctor's surgery.
● CONTINUOUS DATA: numerical data with a
continuous range. You can count, order and measure
continuous data. Examples include height, weight,
temperature and the amount of sugar in an orange.
7. Data types
● CATEGORICAL DATA: puts the item you are
describing into a category. Examples include
gender, colour, size, etc.
● ORDINAL DATA: data which can be ranked (put in
order) or have a rating scale attached. You can count
and order, but not measure, ordinal data. Example: a
scale from 1 to 5.
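The difference between categorical and ordinal data can be sketched in a few lines. The sizes and their ranking below are hypothetical. Treating clothing sizes as ordinal is an assumption made for illustration, since a category only becomes ordinal once an order is agreed.

```python
from collections import Counter

# Hypothetical T-shirt sizes collected in a survey.
sizes = ["M", "S", "L", "M", "XL", "S", "M"]

# Categorical view: all we can do is count how often each category occurs.
counts = Counter(sizes)

# Ordinal view: an explicit rank (an assumption) lets us order the values...
SIZE_RANK = {"S": 0, "M": 1, "L": 2, "XL": 3}
ordered = sorted(sizes, key=lambda size: SIZE_RANK[size])

# ...and pick a middle value, but not compute a meaningful average,
# because the gap between "S" and "M" is not a measured quantity.
median_size = ordered[len(ordered) // 2]

print(counts.most_common(1), ordered, median_size)
```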
8. Data types quiz
Role: Drummer
❏ Continuous Data
❏ Categorical Data
❏ Quantitative Data
Year Born: 1963
❏ Qualitative Data
❏ Discrete Data
❏ Continuous Data
❏ Categorical Data
Name: Rick Allen
❏ Quantitative Data
❏ Qualitative Data
❏ Discrete Data
Size: M
❏ Ordinal Data
❏ Categorical Data
❏ Continuous Data
Height: 187cm
❏ Discrete Data
❏ Categorical Data
❏ Continuous Data
❏ Qualitative Data
Date: 5th of March 2014
❏ Discrete Data
❏ Categorical Data
❏ Continuous Data
13. Good practices and basic ethics
● Save an original copy of the data and do not touch it.
● Paper trail - keep a log of every step that you take in the
analysis.
● Do not change the original columns. Duplicate them and make
the changes in the copies.
● Keep several drafts and look at how your analysis
developed.
● Spend time understanding your data. Read the methodology.
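A minimal sketch of how these habits might look in practice: the original records stay untouched, cleaned values go into a new column, and every step is logged. The records, the "height_cm" column name and the log wording are all hypothetical.

```python
import copy

# Hypothetical raw records; in practice these would be loaded from the
# untouched original file.
original = [
    {"name": "Mandy", "height": "150cm"},
    {"name": "Jason", "height": "200cm"},
]

log = []  # paper trail: one entry per step taken

# Work on a deep copy so the original records are never modified.
working = copy.deepcopy(original)
log.append("Copied original records into a working set")

for row in working:
    # Keep the original column and write the cleaned value to a new one.
    row["height_cm"] = int(row["height"][:-2])
log.append("Added height_cm: removed the 'cm' suffix from height, converted to int")

for step_number, entry in enumerate(log, start=1):
    print(step_number, entry)
```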
14. Good practices and basic ethics
● Do not assume what the data is. Run an integrity check on each
column.
● Clean the data before interviewing it.
● Count the records. Cross-reference with the methodology.
Report any inconsistency and request the missing data or a
recount. Keep the total number of records in mind while analysing the data.
● If a result looks too good to be true, it probably is.
● Make a summary of the end results, as if you were writing a
press release. Look for mistakes.
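The column checks and the record count can be sketched as follows. The records, the validity rules and the expected total are hypothetical. In a real project the expected total would come from the methodology.

```python
# Hypothetical records; note the blank name and the age written in words.
records = [
    {"name": "Shani", "age": "23"},
    {"name": "Zizo", "age": "25"},
    {"name": "", "age": "22"},
    {"name": "Jason", "age": "thirty-six"},
]
EXPECTED_TOTAL = 5  # assumed figure taken from the methodology


def check_column(rows, column, is_valid):
    """Return the indexes of rows whose value in `column` fails `is_valid`."""
    return [i for i, row in enumerate(rows) if not is_valid(row.get(column, ""))]


bad_names = check_column(records, "name", lambda value: value.strip() != "")
bad_ages = check_column(records, "age", str.isdigit)
missing = EXPECTED_TOTAL - len(records)

print("rows with a blank name:", bad_names)
print("rows with a non-numeric age:", bad_ages)
print("records missing compared with the methodology:", missing)
```

Anything these checks flag is a question to take back to the data provider, not something to fix silently.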
15. Good practices and basic ethics
● Have somebody else verify your work, preferably
somebody who knows nothing about your project.
● Check your biases and look at your data from new
angles
● Look for context that would explain your results to
yourself and to your audience
● e.g. Egypt worst country for women’s rights
● Bounce your results against experts
17. Advanced search
● Google Advanced Search
● Wayback Machine – for the dead web (1996 onwards)
http://archive.org/web/
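As a small sketch, a Wayback Machine address for a given page and date can be assembled by hand. The function name is hypothetical, and the URL pattern (web.archive.org/web/ followed by a timestamp and the page address, or "*" to list all captures) reflects how archived pages are commonly addressed. It should be checked against the archive's own documentation.

```python
from datetime import date


def wayback_url(page_url, on=None):
    """Build a Wayback Machine address for `page_url`.

    With a date, the archive is expected to show the capture closest to it;
    without one, "*" asks for the overview of all captures.
    """
    stamp = on.strftime("%Y%m%d") if on else "*"
    return f"https://web.archive.org/web/{stamp}/{page_url}"


print(wayback_url("http://example.com/", date(2014, 3, 5)))
print(wayback_url("http://example.com/"))
```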
18. Search operators
● * (asterisk) – substitutes for a word and allows your search to
cover similar phrases
● Cache: - allows you to find web pages stored in Google’s
cache
● filetype: - will look for the specified file type
● Link: - helps you find all the sites that link to a particular
page
19. Search operators
● ‘ ‘ or “ “ (Quotation marks) – help you find the exact phrase
● + or AND – narrows down your search by returning only results
that contain all of the terms
● OR – expands the search by including either of two search
phrases
● - or NOT – tells the engine to exclude a term
● e.g. Monsanto -’agent orange’
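The operators can be combined into a single query string. The helper below is a hypothetical sketch that only assembles text. It does not call any search engine, and how each engine interprets the operators may differ.

```python
def build_query(terms=(), exact=(), any_of=(), exclude=(), filetype=None):
    """Assemble a search string from the operators described above."""
    parts = list(terms)
    # Quotation marks ask for the exact phrase.
    parts += [f'"{phrase}"' for phrase in exact]
    # OR widens the search to either alternative.
    if any_of:
        parts.append(" OR ".join(any_of))
    # A leading minus excludes a term; multi-word terms are quoted first.
    for term in exclude:
        parts.append(f'-"{term}"' if " " in term else f"-{term}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    return " ".join(parts)


print(build_query(terms=["Monsanto"], exclude=["agent orange"]))
print(build_query(exact=["data journalism"], filetype="pdf"))
```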
21. What makes a good visualisation
For each of these visualisations think of:
● What is the target audience?
● What is the key message?
● How successful are they in communicating the
message?
● What makes them stand out?
● How well are they explained?
● How simple or complex are they?
What do we mean by data when we do data journalism?
Whether you began with a question or not, you should always keep your eyes open for unexpected patterns, unusual results, or anything that surprises you. Often, the most interesting stories aren’t the ones you were looking for.
Discrete data is counted, Continuous data is measured
Discrete Data
Discrete Data can only take certain values.
Example: the number of students in a class (you can't have half a student).
Continuous Data
Continuous Data can take any value (within a range)
Examples:
A person's height: could be any value (within the range of human heights), not just certain fixed heights,
Time in a race: you could even measure it to fractions of a second,
A dog's weight,
The length of a leaf
Machine readable: data is machine readable if it is in a format that can be easily processed by a computer. Digital ≠ machine readable. Example: a PDF document containing tables of data is digital but not machine readable, because a computer would struggle to access the tabular information even though the tables are very human readable. The equivalent tables in a format such as a spreadsheet would be machine readable. In general, HTML and PDF are *not* machine-readable.
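To see what "easily processed by a computer" means, here is a sketch that reads a small table saved from a spreadsheet as CSV. The column names and values are hypothetical. The same table locked inside a PDF would first need extraction and clean-up.

```python
import csv
import io

# A small table as it might be exported from a spreadsheet (hypothetical data).
CSV_TEXT = """name,age,height_cm
Mandy,21,150
Jason,36,200
Milena,29,160
"""

# csv.DictReader turns each line into a dictionary keyed by the header row.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# Values arrive as text, so convert them before doing arithmetic.
average_age = sum(int(row["age"]) for row in rows) / len(rows)

print(len(rows), "rows; average age:", round(average_age, 1))
```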
COMPARE
COMPARE AND PUT IN CONTEXT:
Put in context the loss of men and women in the Afghan war as compared to Vietnam and the Second World War.
Show trends
SHOW TREND OVER TIME
Trend over time
Compares different presidencies
Show trend over time
Tell a story
Engage, captivate
Compares countries
Patterns
Personal angle
Show hierarchy
Personal angle - people get where they fit in the bigger picture
Compares, puts things into perspective