Data visualisation
DaViz, quality and interoperability
About me
● Web technology manager (EEA)
● M.Sc. in Computer Science
(Lund University, SWE)
● Surveyor (ITA)
● 15 years in IT and web development
(programming and project management)
● Junior Researcher: Machine vision for
surveillance cameras at Axis
● E-commerce websites for telecom industry
● Product Owner of DaViz and many powerful Plone Add-ons
● Technical manager for the EEA main portal and CMS
● Data Visualisation, Data Science, Open Data, Statistics, Semantic Web,
Linked Data, Usability and User Experience, Artificial Intelligence,
Agile/Lean management…
demarinis@eea.europa.eu
DaViz, what and why
desktop-based
web-based
Remove any visual clutter
before
after
Unsorted (Don’t) Sorted (Do)
Remove legend when not needed
There is no need to have a legend when there
is only one data category shown. What is
measured can be added to the title or axis.
Avoid pie charts and donuts
The human mind thinks linearly: we
can easily compare lengths/heights of
line segments but when it comes to
angles and areas most of us can't
judge them well.
Do you see what works best?
Avoid stacked bar charts
Don’t Do
Correlation does not imply causation
● See also: "Superimposing time series is the biggest source of silly
theories"
Per capita consumption of cheese correlates with the number of people who died by becoming tangled in
their bedsheets.
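To make the point concrete, here is a minimal sketch showing how two unrelated series can correlate strongly just because both drift upward over time. The numbers are made up for illustration (they are not the cheese or bedsheet figures), and `pearson` and `differences` are hypothetical helper names.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def differences(values):
    """Year-to-year changes, which remove a shared upward trend."""
    return [b - a for a, b in zip(values, values[1:])]

# Made-up illustrative series: both simply rise over ten years.
series_a = [10.0, 10.4, 10.9, 11.1, 11.8, 12.0, 12.7, 13.1, 13.4, 14.0]
series_b = [300, 318, 331, 352, 360, 377, 395, 401, 420, 433]

r_levels = pearson(series_a, series_b)
r_changes = pearson(differences(series_a), differences(series_b))
print(f"correlation of levels:  {r_levels:.2f}")
print(f"correlation of changes: {r_changes:.2f}")
```

Comparing the year-to-year changes rather than the raw levels is one simple way to check whether an apparent relationship is more than a shared trend. Even then, a correlation is not evidence of causation.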
Use map only when needed
The map on the right is trying to
show too much information at once.
Moreover, the data would be much
easier to compare with a basic bar
chart (below).
It is difficult to compare bar charts placed on a map, since they are not aligned. A plain bar chart
would make comparing countries much easier and more precise.
Countries with a relatively small area are hidden, while countries with large areas are
made more prominent (intentional?). Is a country's area really relevant
here? Is the geo-distribution important? How to compare properly?
Colors
● Different colors should be used for
different categories (e.g.,
male/female, types of fruit), not
different values in a range (e.g., age,
temperature).
● Do not use rainbows for range values
● If you want color to show a numerical
value, use a range that goes from
white to a highly saturated color in
one of the universal color categories.
no rainbows
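A minimal sketch of the white-to-saturated ramp described above, using plain linear interpolation in RGB. The function name `sequential_color` and the default end color are assumptions chosen for illustration. Charting libraries ship ready-made sequential palettes that are usually a better choice in practice.

```python
def sequential_color(value, vmin, vmax, end_rgb=(8, 81, 156)):
    """Map a value in [vmin, vmax] to a hex color between white and end_rgb.

    end_rgb is a hypothetical saturated blue; any single saturated hue works.
    """
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    # Position of the value in the range, clamped to [0, 1].
    t = max(0.0, min(1.0, (value - vmin) / (vmax - vmin)))
    white = (255, 255, 255)
    rgb = tuple(round(w + t * (e - w)) for w, e in zip(white, end_rgb))
    return "#{:02x}{:02x}{:02x}".format(*rgb)

# Example: temperatures from 0 to 40 mapped onto one hue.
for temperature in (0, 10, 20, 30, 40):
    print(temperature, sequential_color(temperature, 0, 40))
```

Because only the intensity changes, a higher value always reads as "more", which a rainbow scale does not guarantee.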
Don’t Do
Don’t forget 7%-10% of
your male audience
(color deficiency)
what color-deficient people see
original chart
Use Vischeck to test your images. If the chart is
readable in black and white, then it is even better!
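The black-and-white check can be roughly approximated in code. The sketch below converts each palette color to a gray level using the Rec. 601 luma weights and reports the two colors that end up closest in gray. It does not simulate color deficiency the way a tool like Vischeck does. The palette and the function names are assumptions for illustration.

```python
from itertools import combinations

def luma(hex_color):
    """Approximate gray level (0-255) of a '#rrggbb' color, Rec. 601 weights."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.299 * r + 0.587 * g + 0.114 * b

def closest_pair_in_gray(palette):
    """Return the two palette colors hardest to tell apart in grayscale."""
    return min(combinations(palette, 2),
               key=lambda pair: abs(luma(pair[0]) - luma(pair[1])))

# Hypothetical three-color categorical palette (red, green, blue).
palette = ["#d62728", "#2ca02c", "#1f77b4"]
first, second = closest_pair_in_gray(palette)
gap = abs(luma(first) - luma(second))
print(f"{first} and {second} differ by only {gap:.1f} gray levels")
```

A small gap suggests the two categories will be hard to distinguish when printed in black and white, so a different lightness, a pattern, or a direct label may be needed.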
Choose your chart type wisely
Online tools like the Data Visualization
Catalogue or a decision diagram [2006,
A. Abela] help you find the right chart for
your data.
Data provenance, trust, legitimacy
● Adding data source information gives your chart credibility
and builds trust
● When adding source info to your chart, distinguish
data source info from figure source info
● Disclose who financed the data visualisation work and
data collection
● Disclose your data and methodology -> reproducible
and verifiable
from: “Legitimacy, transparency,
reproducibility”, Andrea Saltelli, JRC, Head of the Econometrics and
Applied Statistics Unit
Show the level of confidence, build trust
Ask these questions before publishing your chart, and be
prepared for the critiques:
1. What was the source of your data?
2. How well do the sample data represent the population?
3. Does your data distribution include outliers? How did they
affect the results?
4. What assumptions are behind your analysis? Might certain
conditions render your assumptions and your model invalid?
5. Why did you decide on that particular analytical approach?
What alternatives did you consider?
6. How likely is it that the independent variables are actually
causing the changes in the dependent variable? Might other
analyses establish causality more clearly?
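For question 3, one common first check is the interquartile-range rule (values beyond 1.5 × IQR from the quartiles). The sketch below is only one convention among several, the data is made up, and `iqr_outliers` is a hypothetical helper name.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)  # three cut points: Q1, median, Q3
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Made-up measurements with one suspicious value.
measurements = [10, 12, 11, 13, 12, 11, 95]
print(iqr_outliers(measurements))
```

Flagged values are candidates to investigate and report, not values to delete silently. State in the chart notes how they were handled.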
Typical statistical error - EU trends
See online example
It is not statistically correct to make a trend analysis of data across time
when the data in question (or sample) is not representative of the whole.
E.g. EU12 is not representative of EU25 or EU28, so the data cannot
be used to state a trend for the entire EU as it is in 2014: the EU has changed!
very important info!
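A minimal sketch of the problem, with made-up values and placeholder country codes ("A" to "D", not real EU data). Summing whatever countries happen to be reported each year mixes membership changes into the trend, whereas restricting the series to countries present in every year gives a comparable series that must then be labelled as that fixed group rather than as "the EU". The function names are hypothetical.

```python
def common_countries(values_by_year):
    """Countries that have a value in every year."""
    return set.intersection(*(set(v) for v in values_by_year.values()))

def totals_by_year(values_by_year, countries=None):
    """Yearly totals, optionally restricted to a fixed set of countries."""
    totals = {}
    for year, values in sorted(values_by_year.items()):
        selected = values if countries is None else {c: values[c] for c in countries}
        totals[year] = sum(selected.values())
    return totals

# Made-up data: the reporting group grows over time.
data = {
    1995: {"A": 10, "B": 20},
    2005: {"A": 11, "B": 19, "C": 30},
    2014: {"A": 12, "B": 18, "C": 29, "D": 5},
}

naive = totals_by_year(data)  # mixes different groups
fixed_group = common_countries(data)
consistent = totals_by_year(data, fixed_group)  # same group every year
print("naive:     ", naive)
print("consistent:", consistent, "for", sorted(fixed_group))
```

In this toy data the naive totals appear to more than double, while the fixed group is flat. The apparent growth comes entirely from new members joining.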
Typical statistical error - including no data
See online example
We cannot say “20.9% of our colleagues are male”. We can only say “20.9% of the sample
we met are male”, which does not say much about the entire population (the entire
staff).
If we have used a proper sampling technique, e.g. randomly selecting the staff, we have a
sample (580 people) that is representative of the whole (1000 people) with a 95%
confidence level and a margin of error of 2.64%.
We can now say that 39.7% ± 2.64% of our staff are male, with a confidence level
of 95%, and that is a big difference from what we said on the previous slide (20.9%)!
https://www.checkmarket.com/market-research-resources/sample-size-calculator/
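The 2.64% figure can be reproduced with the standard margin-of-error formula for a proportion, including the finite population correction. This is a sketch under two assumptions not stated on the slide: a worst-case proportion of p = 0.5 and z = 1.96 for 95% confidence. The function name is hypothetical.

```python
from math import sqrt

def margin_of_error(sample_size, population_size, z=1.96, p=0.5):
    """Margin of error for a proportion from a simple random sample.

    Assumes worst-case p = 0.5 and z = 1.96 (95% confidence) by default,
    and applies the finite population correction.
    """
    standard_error = sqrt(p * (1 - p) / sample_size)
    correction = sqrt((population_size - sample_size) / (population_size - 1))
    return z * standard_error * correction

moe = margin_of_error(sample_size=580, population_size=1000)
print(f"margin of error: {moe * 100:.2f}%")
```

With 580 of 1000 staff sampled, this gives about 2.64%, matching the slide. The result only holds if the sample was drawn at random.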
Show the level of confidence
Tell your audience how confident you are in your assertions:
include error bars any time you use data to make an argument.
source: The importance of uncertainty, Berkeley Science
Review. http://sciencereview.berkeley.edu/importance-uncertainty/
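A minimal sketch of how error-bar limits for a mean could be computed before plotting. It uses the normal approximation (z = 1.96 for roughly 95% confidence). For small samples a t-based interval would be wider. The measurements are made up and `error_bar` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, stdev

def error_bar(values, z=1.96):
    """Return (mean, lower, upper) using a normal-approximation interval."""
    center = mean(values)
    half_width = z * stdev(values) / sqrt(len(values))
    return center, center - half_width, center + half_width

# Made-up repeated measurements of the same quantity.
measurements = [2, 4, 4, 4, 5, 5, 7, 9]
center, lower, upper = error_bar(measurements)
print(f"{center:.2f} (95% interval: {lower:.2f} to {upper:.2f})")
```

Whatever interval is drawn, say in the caption what it represents (standard error, confidence interval, or range), since the three look identical on a chart.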
Get it professionally reviewed
Have a statistician review
your analysis and your
representation. You will
be surprised at how
many corrections and
improvements you can
achieve.
Welcome to data science!
source: http://sciencereview.berkeley.edu/article/first-rule-data-science/
I shall not use visualization
to intentionally hide or
confuse the truth which it is
intended to portray. I will
respect the great power
visualization has in garnering
wisdom and misleading the
uninformed. I accept this
responsibility willfully and
without reservation, and
promise to defend this oath
against all enemies, both
domestic and foreign.
hippocratic oath for
data scientists
VisWeek2011, Jason Moore, A code for ethics for data visualisations
professionals
THANK YOU!
More resources: http://www.eea.europa.eu/data-and-maps/daviz/learn-more/
