36. [Bar chart: Oracle Google Searches - By Region, Normalized. Axis: 0 to 120. Regions: Russian Federation, Costa Rica, Ecuador, United Arab Emirates, Taiwan, United States, Guatemala, China, Jordan, Japan, El Salvador, South Korea, Hong Kong, Nigeria, Kenya, Sri Lanka, Singapore, Pakistan, India]
37. [Bar chart: Oracle Google Searches - By Region, Normalized. Axis: 0 to 120. Same regions as slide 36.]
Editor's Notes
Visualization - visual display of graphical information.
I am going to show how to be more effective at analyzing and communicating information using graphical methods.
Visualization is sometimes dismissed as a cop-out. Newbies and managers use graphs because they are not manly enough. Real DBAs use numbers and the command line!
In the excellent book “Lies, Damn Lies and Statistics” there is an entire chapter dedicated to graphs, and the author says something like: people use graphs because they are afraid of numbers, maybe because of a trauma from school.
This is a bit like saying that people use cars because they are too lazy to walk. Sometimes it’s true, but it ignores the fact that cars really are more efficient.
In the same way, graphs really are a more efficient way to display information. In fact, for reasons I’ll show soon, graphs are even more useful for experts than they are for beginners.
What I’ll talk about:
Why using graphics is so efficient
New graphical methods
Simple design principles
Structure = Trends, repetitions and outliers, etc.
High bandwidth information channel.
Apply pattern matching skills and prior knowledge to analysis of data.
We can easily find information in very ambiguous data. It’s an evolutionary thing.
Differences between color shades and between sizes of shapes are difficult to compare and quantify.
Average describes normal distributions quite well. Give height as an example for why average is a good descriptor for normal distribution.
Extremely skewed distribution! It’s not even close to normal. The average does not really describe how slow the export can get.
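To make the point about averages concrete, here is a minimal Python sketch. The export durations are made up for illustration and are not measurements from the talk; they only mimic a distribution where most runs are quick and a few are very slow.

```python
from statistics import mean, median

# Hypothetical export durations in minutes (illustrative, not real data):
# most runs take about half an hour, two runs are far slower.
export_minutes = [30, 32, 31, 29, 33, 30, 34, 31, 240, 400]

avg = mean(export_minutes)      # pulled upward by the two slow runs
mid = median(export_minutes)    # stays near the typical run
slowest = max(export_minutes)   # what the average hides entirely

print(f"mean={avg:.1f} median={mid:.1f} max={slowest}")
```

With data shaped like this, the mean sits far from both the typical run and the worst run, which is why a single average is a poor summary of a skewed distribution.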
That looks like a good description. But wait!
Sometimes the export doesn’t run at all. I can explain the outliers (both low and high): on those 5 days one NetApp head was down and we didn’t run exports, and when we did, performance was awful. Since I can explain the outliers, I know I can remove them.
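Removing outliers that have a known cause could be sketched roughly as follows. The records, the day numbers and the set of "explained" days are all hypothetical; the point is only that the filter is driven by an explanation, not by how extreme a value looks.

```python
from statistics import mean

# Hypothetical (day, export_minutes) records; None means the export did not run.
runs = [(1, 31), (2, 30), (3, None), (4, 240), (5, 33), (6, 29)]

# Days with a known, explained cause (assumed here: a storage head was down).
explained_days = {3, 4}

# Keep only runs that completed on days without a known incident.
normal_runs = [
    minutes
    for day, minutes in runs
    if day not in explained_days and minutes is not None
]

print(normal_runs, mean(normal_runs))
```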
Histogram. Looks kind of normal, but it is hard to tell.
qqnorm. Yep, it looks normal with some noise. You don’t see a consistent skew.
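For readers without R at hand, the idea behind a normal Q-Q plot can be sketched in Python with the standard library only. This is not a reimplementation of R's qqnorm: the helper name normal_qq_points, the plotting positions (i + 0.5) / n and the sample data are assumptions made for illustration.

```python
from statistics import NormalDist, mean, stdev

def normal_qq_points(sample):
    """Return (theoretical, observed) quantile pairs for a normal Q-Q plot."""
    ordered = sorted(sample)
    n = len(ordered)
    # Normal distribution fitted to the sample's own mean and standard deviation.
    fitted = NormalDist(mean(ordered), stdev(ordered))
    # Plotting positions (i + 0.5) / n stay strictly inside (0, 1),
    # so inv_cdf is always defined.
    return [(fitted.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(ordered)]

sample = [29, 30, 30, 31, 31, 32, 33, 34]  # hypothetical data
points = normal_qq_points(sample)
for theoretical, observed in points:
    print(f"{theoretical:6.2f} {observed:6.2f}")
```

If the sample is roughly normal, the pairs fall close to the line where theoretical equals observed; a consistent bend away from that line at one end indicates skew.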
Multiple Boxplots
Scatter plot
Less is more. Be clear and to the point. Do not distort or mislead. Think of your data as a fashion model: you look at her and photograph her from all positions and angles, but only the best photos appear in the magazine, often hiding as much as they reveal!