Data Visualization Inspiration: Analysis To Insights To Action, Faster!
Like a vast majority on planet Earth, I love data visualizations. Ok, so
perhaps as the author of two bestselling books on analytics I love it a little bit more!
There is something magical about taking an incredible amount of complexity and presenting it as
simply as we possibly can with the goal of letting the cogently presented insight drive action.
Magical.
A day-to-day manifestation of this love is on my Google+ or Facebook profiles where 75% of my
posts are related to my quick analysis and learnings from a visualization. Be it looking at 1.1 million
FCC net neutrality comments, things people around the world identify as their biggest threat, water
consumption of a burger patty vs. daily cooking, the religious gap on spanking children, or a simple
graph that raises profound questions about where we donate vs. diseases that kill us.
Data visualized is data understood. Better. Faster. More useful. It delivers world peace!
I'm exaggerating a tiny bit. (As is clear from the discussion on the preference of guns over
knowledge in 37 US states. But at least we're talking.)
In this post, I want to share some examples of data visualization I was excited about recently. In
each case the creator did something interesting that made me wonder how I can use their strategy
in my daily efforts in service of digital marketing and analytics.
We will look at six short stories. You are welcome to read them all at once (warning: once you start
you won't be able to stop!), or you can consume them one at a time. For four of the examples, I'll
also share how the visualization inspired me to apply the lessons to my web analytics data. In the
other two, I'll ask for your help in how you might connect the inspiration to your work as a
Marketer/Analyst.
Six stories, a total of eleven different data visualization techniques to inspire you to think different at
work when you play with data. Ready?
Short story #1: Treemaps, Sunbursts, Packed Trees, Oh My!
Our lives are dominated by columns and rows. [And sometimes they are indeed optimal: 7 Data
Presentation Tips: Think, Simplify, Calibrate, Visualize. You'll also see examples below.]
So a table like this one is par for the course for you.
Table comes in. You do your best to understand what is going on. Yes, you see the numbers, there are
lots of them. You scroll up and down. Nice. Some countries use a lot of oil. That's just the top 12
rows, there are another 196 rows of data. Sure, sure, sure. Long tail. Yippie!
But how can you be expected to understand it all? How can you understand enough to at least pick
directions you want to go down?
The table above is from Stats Monkey. Their approach is to actually present the data using a
Treemap (they call it a squaretree for some reason).
It is so much better!
You can suddenly see the forest and the trees. (Get it?)
The few dominating countries (USA! USA! USA!) are more clearly visible, and you get a much
stronger sense of proportions. Yes, you could see in the table that the US was bigger than China, but
the Treemap really brings the comparison home. You start to see weird things like Russia and India
are the same. Yes, it was in the table. But for a visual person like me, this is the ah-ha moment.
While you can't see the smaller consumers all that easily, you can hover your mouse and see the
details.
Additionally, you can go down to the little ones, now that you have the ability to easily do that, and
point and hover.
Three cool benefits: 1. Treemaps are a great way to visualize a lot of information. 2. They are really
good at showing the differences in the big head and the long tail. 3. They can form the foundation of
allowing data consumers to drilldown into the represented segments.
One of my favourite implementations of Treemaps is in the competitive intelligence tool Compete. It
shows all the incoming traffic to a site as a Treemap.
At a glance you can see all the big clusters of sources (close to the channels view in Google
Analytics).
You can hover over each box to get a sense of the key metrics. Number of visits, percentage of share
of total visits and the percentage change (which you can discern from the color of each box, in that
sense the Compete Treemap does not use color just for decoration).
If you are interested in any particular channel, Miscellaneous as an example, you can click on it
and... boom!
You see the big ones named, the hidden mysterious ones, you can unmask using a mouse hover.
It would be nice to see all the sites named, but it is kind of nice that it forces you to internalize the
big ones, likely where you can have the biggest impact, and then look at the small ones.
Net, net. A delightful way to take your 198 row table and present it in a manner that aids stronger
understanding of performance.
Let's go back to our table, and global oil consumption.
Stats Monkey also presents that table using the Sunburst visualization...
Compared to the Treemap, this visual shows fewer countries and fewer actual oil consumption
numbers, due to space limitations.
You can still hover your mouse and get the details of each country. Additionally you can click on any
country and just look at that one. Better than the table, but perhaps less optimal than the Treemap.
I want to use the above visual to share with you how much I adore the Sunburst visualization. I
believe it is best at describing sequences of events. It is best demonstrated in the example below,
which illustrates the path followed by a group of people on a website.
You get a confusing little thing, but the visualization is interactive. You simply move your mouse and
it illuminates the journey and how many people follow a particular path.
For example...
You can configure what the end is, in my case the end is people who converted. Now I can quite
literally follow the path to every conversion. I can find the biggest pools of customers who share a
behavior and go back and optimize my campaign strategy, my content strategy and indeed my
overall digital strategy.
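If you want to prepare your own data for a Sunburst like this, the underlying idea is simply counting how many visits share each path prefix. Here is a minimal sketch in Python. The `>`-separated path format, the page names and the counts are all made-up placeholders, not the output of any particular analytics tool.

```python
from typing import Dict


def build_path_tree(path_counts: Dict[str, int]) -> dict:
    """Fold 'step>step>step' paths into a nested tree of visit counts.

    Each node's 'visits' is the number of visits whose path passes through
    it, which is what would size the matching Sunburst arc.
    """
    root = {"name": "root", "visits": 0, "children": {}}
    for path, count in path_counts.items():
        node = root
        node["visits"] += count
        for step in path.split(">"):
            node = node["children"].setdefault(
                step, {"name": step, "visits": 0, "children": {}}
            )
            node["visits"] += count
    return root


# Placeholder paths and counts, invented for illustration only.
paths = {
    "home>product>convert": 120,
    "home>search>product>convert": 45,
    "home>product>exit": 300,
    "search>product>convert": 80,
}
tree = build_path_tree(paths)
home = tree["children"]["home"]
product = home["children"]["product"]
print("visits starting at home:", home["visits"])
print("home > product > convert:", product["children"]["convert"]["visits"])
```

The biggest pools of shared behavior are then just the nodes with the largest counts at each depth.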
I've used Sunbursts to do the same with keyword portfolios. No better way to optimize for all of
search behavior, rather than the absolutely silly obsession with a few keywords (it is fatal when
applied to single-session conversion scenarios!).
The Sunburst visual of our oil consumption is nice. But you can see how much more powerful
Sunbursts can be. Learn how to use them from this tutorial, which is linked off my most beloved data
visualization source d3js.org.
One last nice visual from our friends at Stats Monkey. This time around using a Packedcircle...
Pretty mesmerizing, right? It does serve practical value as well.
You can visualize the sizes a bit better. You have the ability to still get the details when you hover
your mouse.
For the Packedcircle they also provide a list of countries in the table, now you can choose the one
you want from the right side and go to the one you are most interested in.
You can see the country you choose zoomed in context of the others that are around that same level
of consumption.
The Treemap, Sunburst and Packedcircle demonstrate three possible paths you can take to go from
a table to something much more understandable and much more interactive. They make understanding
data incrementally better, and make drilling down and exploration much easier than the table.
You've seen the application of content consumption and keyword analysis using the Sunburst above,
and the use of Treemaps by Compete. I was inspired by the above work to apply the Treemap to our
day-to-day work. Let me share that with you.
There are many ways to create Treemaps online. I used infogr.am to create mine below. They have a
free option, you can try it yourself.
This Treemap illustrates the traffic sources and the number of Visitors. It is created using the All
Traffic Sources report in Google Analytics, and clicking Source (rather than the default
Source/Medium).
You can have a simple table that shows the visitors, but this is so much better in being able to show
so much more data, much more easily. It is also so much nicer in being able to illustrate the
proportional differences between each source.
As in all cases above, you can hover your mouse and get the specific number of Visitors.
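If you are curious what a tool is doing when it draws a Treemap, here is a minimal sketch in Python. It uses a simple greedy split (the largest remaining source takes a slice across the longer side of the space that is left), which is an assumption on my part and not necessarily the layout infogr.am or any other tool uses. The visitor counts are placeholders, not real Google Analytics data.

```python
from typing import Dict, List, Tuple

Rect = Tuple[str, float, float, float, float]  # (label, x, y, width, height)


def treemap_layout(values: Dict[str, float], width: float, height: float) -> List[Rect]:
    """Return one rectangle per label with area proportional to its value."""
    items = sorted(values.items(), key=lambda kv: kv[1], reverse=True)
    remaining = sum(v for _, v in items)
    x, y, w, h = 0.0, 0.0, float(width), float(height)
    rects: List[Rect] = []
    for label, value in items:
        if remaining <= 0:
            break
        share = value / remaining  # share of the space that is still free
        if w >= h:
            # Free space is wider than tall: cut a vertical slice off the left.
            slice_w = w * share
            rects.append((label, x, y, slice_w, h))
            x, w = x + slice_w, w - slice_w
        else:
            # Free space is taller than wide: cut a horizontal slice off the top.
            slice_h = h * share
            rects.append((label, x, y, w, slice_h))
            y, h = y + slice_h, h - slice_h
        remaining -= value
    return rects


# Placeholder visitor counts per source (not real data).
visitors = {"google": 5000, "(direct)": 2500, "twitter": 1500, "other": 1000}
layout = treemap_layout(visitors, width=100, height=60)
for label, x, y, w, h in layout:
    print(f"{label:10s} x={x:5.1f} y={y:5.1f} w={w:5.1f} h={h:5.1f} area={w * h:7.1f}")
```

Each rectangle's area is the source's share of total Visitors, which is why the proportional differences jump out in a way they never do in a table.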
When I'm creating a dashboard for a high level view, I would take the Treemap above and combine it
with the one below that illustrates the amount of Goal Value delivered by each source.
I am sure you have noticed that the size of each source in the Goal Value Treemap is different from
the Visitors one. This allows for very quick understanding of site performance and for asking
very good questions very, very quickly.
I also want you to appreciate that you can't actually show this in a table. You would sort the table by
count of Visitors, in which case some of the rows in the second Treemap would disappear, or you
would sort it by Goal Value, in which case some of the ones in the first one would disappear.
You can definitely have two different tables with this data. In my case, and this may vary, it is not as
easy to connect the dots (both on proportionality and deltas).
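The comparison between the two Treemaps can also be made explicit with a little arithmetic: each source's share of Visitors against its share of Goal Value. A minimal sketch in Python follows. The source names and all the numbers are placeholders I made up for illustration.

```python
from typing import Dict, List, Tuple


def shares(values: Dict[str, float]) -> Dict[str, float]:
    """Convert raw totals into each key's share of the overall total."""
    total = sum(values.values())
    return {key: value / total for key, value in values.items()}


def share_gaps(visitors: Dict[str, float], goal_value: Dict[str, float]) -> List[Tuple[str, float]]:
    """Goal Value share minus Visitor share per source, biggest over-performer first."""
    visitor_share = shares(visitors)
    value_share = shares(goal_value)
    sources = set(visitors) | set(goal_value)
    gaps = [(s, value_share.get(s, 0.0) - visitor_share.get(s, 0.0)) for s in sources]
    return sorted(gaps, key=lambda pair: pair[1], reverse=True)


# Placeholder numbers, not real data.
visitors = {"google": 5000, "(direct)": 2500, "twitter": 1500, "newsletter": 1000}
goal_value = {"google": 20000, "(direct)": 30000, "twitter": 2000, "newsletter": 28000}

gaps = share_gaps(visitors, goal_value)
for source, gap in gaps:
    print(f"{source:12s} {gap * 100:+6.1f} points of Goal Value share vs. Visitor share")
```

A source with a large positive gap delivers far more value than its traffic suggests, and a large negative gap is where the very good questions start.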
And that, my dear friends, is the power of simple visualization.
Let's look at some more.
Short story #2: Predictive Modeling, Quantifying Cost of Inaction.
This example is about the very sad reality of the Ebola epidemic and the sadder still inaction by
governments (like ours). The work of the New York Times team inspired me to do some predictive
modeling for inaction in our world of digital marketing.
Ebola is an extremely serious topic, and I do not mean to trivialize it in any way by using it in the
context of learning a digital analytics lesson. If this is upsetting to you, I do apologize sincerely in
advance.
I found the NYT interactive visualization to be extremely illuminating: How the Speed of Response
Defined the Ebola Crisis.
It shows the low and high estimates of infections from this terrible disease. It also allows
us to predict what would happen if we delay action, by moving the blue dot on the graph.
Ignore the black line for a moment (it shows the actual reported cases of infection). The graph below
shows the predictions of what would have happened if aggressive intervention started in June 2014.
The high and low estimates of cases, as you can see below, would have been much, much lower than
reality.
Countries with the money, resources and knowledge to deal with an Ebola type epidemic did not
come to the rescue of the African countries as fast as you might have expected. Empty words of
support and urgency were delivered (along with calls from the ill-informed to shut down flights etc.).
At that time these countries had models to predict the cost of inaction in human lives.
Just move the blue dot. Let's say to August.
Approximately 12k additional deaths. Very close to what happened in reality.
Large-scale intervention did not start until August, but thank goodness it did.
It is important to note that the high estimate includes deaths experts believe have been
underreported.
At that time we could also have moved the slider further ahead to model out the impact of inaction.
You can see that moving in August did have an impact: the black line, reported cases, is not as bad as
it could have been. That holds even assuming that 12k is a low number (lots of people don't report the
disease). It also does not include the other countries beyond Liberia and Sierra Leone where we
know infections and deaths have occurred.
If you want to see scary, move the blue dot to October.
I'm sure that information like this played a key role in getting our government, and likely others, to
jump and take action when they did. Thank goodness for predictive models.
[Sidebar]
If you wonder why sometimes governments move so slowly, among the many other reasons, attribute
some of the blame to the power of communication. The data used in the interactive visualization
above comes from the Centers for Disease Control and Prevention. This is one of the fourteen tabs in
their spreadsheet explaining all this stuff...
You can imagine how difficult it is to communicate what is going on - even after you give them credit
for the fact that they are likely rushed and are trying to do a lot more in the spreadsheet. If you have
an opportunity to do volunteer work for government agencies, please take the opportunity.
Meanwhile, if you want to play with the Ebola dataset, you can download it here: Generic Ebola
Response Modeling.
[/Sidebar]
This example got me thinking about applying the spirit of the visualization, without any of the
technical resources to create it, to the world of digital.
The inaction that upsets me the most is senior executives in companies brazenly disregarding our
recommendations for taking actions based on our web data analysis. It. Makes. Me. Mad!
Here's a simple predictive model (though that might be too pompous of a word to use here) to get
them to take action faster. Or at least think a lot harder about not investing the little amount of
money to take big action.
The core performance of the current website looks like this...
While we are applying it to a B2B case, it could just as easily be applied to a B2C / Ecommerce
scenario.
We've dug deep into the data and we've found some inefficiencies/suckiness in the digital
experience. We know the optimization that is required. Some fixes are straightforward, in the stop the
bleeding category, and can be made without much thought. A couple of other things, including the lead-gen
form itself, we would test to improve performance.
We need to invest a small amount of company time/resource and an additional small cost with our
Conversion Optimization Agency.
You present a verbose Word document/email with your recommendations. You wear a Superwoman
suit and, as an agency, present black slides with light grey text and deep shades of blue graphs to
make the case.
Nothing happens.
Why?
You did not make the whole thing painful enough. People respond to pain. No pain = inaction.
Do this... Create a table with the future Conversion Rate, resulting leads, use the value of each lead
to compute incremental value to the company... all extremely straightforward columns...
Then add the last column, the impact of not taking action for three months!
With the first, just stop sucking, we are predicting that the conversion rate can be moved to 2.5%. The
cost to get that done is $200k. It sounded scary. Now, the $4 mil in incremental value eases the
pain. And the leader, who by the way is smart, can look at the last column and understand the cost
of not starting on the project right away!
The second bold conversion rate is what we predict we can get to with just stop sucking and our first
two A/B tests on the product overview pages.
The third bold conversion rate is what we predict the full package of changes, including multi-variate
testing on the lead-gen page, will deliver.
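The columns described above are simple arithmetic, and can be sketched in a few lines of Python. This is a rough sketch under stated assumptions, not a copy of my spreadsheet: the 2.5% conversion rate, the $200k cost and the $238 lead value come from this post, while the annual visits, the current conversion rate and the second and third scenarios are placeholders. It also assumes the incremental value accrues evenly through the year.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Scenario:
    name: str
    conversion_rate: float  # predicted rate after the change, e.g. 0.025 for 2.5%
    cost: float             # cost of making the change


def cost_of_inaction(
    scenario: Scenario,
    annual_visits: float,
    current_rate: float,
    lead_value: float,
    delay_months: float,
) -> Tuple[float, float]:
    """Return (annual incremental value, value forgone while we wait)."""
    extra_leads = annual_visits * (scenario.conversion_rate - current_rate)
    annual_value = extra_leads * lead_value
    # Assumption: value arrives evenly, so each month of delay costs 1/12 of it.
    forgone = annual_value * delay_months / 12
    return annual_value, forgone


ANNUAL_VISITS = 1_000_000  # placeholder
CURRENT_RATE = 0.02        # placeholder
LEAD_VALUE = 238           # from the post

scenarios = [
    Scenario("Just stop sucking", 0.025, 200_000),        # rate and cost from the post
    Scenario("+ product page A/B tests", 0.030, 300_000),  # placeholder
    Scenario("+ lead-gen form MVT", 0.035, 400_000),       # placeholder
]

for s in scenarios:
    annual, forgone = cost_of_inaction(s, ANNUAL_VISITS, CURRENT_RATE, LEAD_VALUE, delay_months=3)
    print(f"{s.name:<26} cost ${s.cost:>9,.0f}  annual value ${annual:>11,.0f}  "
          f"3-month delay costs ${forgone:>10,.0f}")
```

The same function covers a higher lead value or a longer delay: change `lead_value` or `delay_months` and rerun.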
With this simple predictive model the need for your Superwoman costume and black slides is
reduced. Everyone can come together around a small table and discuss assumptions that went into
creating it, argue about which changes to start first, and who to assign various parts of the project.
Action, baby!
The big challenge in creating this model rests on your ability to compute the business impact of the
changes you are recommending. We as an ecosystem are not very good at this. But you can see how
incredibly valuable it is.
You can make small improvements to make the table better (remember, tables with lots of numbers
don't work as well as you might have assumed).
Highlight the rows, click on Conditional Formatting in Excel and choose Data Bars. I chose red to
imply red ink in our accounting system from not doing what we are proposing!
You know where your eyes are going to go. : )
You can play with the formatting options to get the one you like the most. I'm partial to using the
Color Scales in the Conditional Formatting section.
When we apply that, change the font color a smidgen, this is the resulting impact... a small
improvement...
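Outside Excel, the idea behind Data Bars is easy to reproduce: draw a bar in each cell whose length is the value relative to the largest value in the column. A minimal text-only sketch in Python, with placeholder row labels and amounts (this mimics the idea, it is not how Excel implements the feature):

```python
from typing import List, Tuple


def data_bars(rows: List[Tuple[str, float]], width: int = 20) -> List[str]:
    """Render each value with a bar scaled against the column's largest value."""
    largest = max(value for _, value in rows)
    lines = []
    for label, value in rows:
        filled = round(width * value / largest) if largest else 0
        lines.append(f"{label:<24}{value:>12,.0f}  {'#' * filled}")
    return lines


# Placeholder "cost of waiting three months" figures, invented for illustration.
cost_of_delay = [
    ("Just stop sucking", 300_000),
    ("+ product page tests", 600_000),
    ("+ lead-gen form tests", 1_200_000),
]
bars = data_bars(cost_of_delay)
print("\n".join(bars))
```

Even in plain text, you know where your eyes are going to go.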
Once you have the initial predictive model, you can start to play with other scenarios and model
them out as well.
For example, what would happen if we focused not only on improving the conversion rate from
making changes to the just stop sucking, product pages and the lead-gen pages, but also on improving the
value of each lead inquiry?
What if it moved from $238 currently to $275? This...
Even stronger sense of urgency around action, and hopefully an ever higher willingness on behalf of
the company to pay the Agency more, incentivize the internal teams with bonuses to take action
even faster because actual company profits are on the line so clearly!
Since you have the model, one more thing you could add... the cost of delaying action for six
months...
Can you imagine anyone in your company saying no to your recommendations based on your data
analysis? I honestly can't imagine even the biggest HiPPO saying no.
The onus is on you though to first compute business impact and wrap it into a predictive model.
[All the computations above are quite straightforward, but if you would like to have a copy of the
above spreadsheet just send me an email.]
Short story #3: Streamgraphs, Data Trends Diving Made Simple!
I'm giving the punchline away with my section titles, but stick with me. Pretend you did not read it.
Here is a lovely straightforward visualization. Worldwide Smartphone Sales by Operating System, in
thousands. Nothing complicated here. BlackBerry is close to not much, even if the red line seems to
be moving up (one challenge with this type of graph). So sad about Symbian.
The raw numbers used for sales make the above graph less insightful. The total number of smart
phones has exploded to such a degree that the decimation of other platforms, and their sad lost
opportunity even with an early start, is hard to see. All you can see is that Android is big, iOS is
doing wonderfully.
The fix is not that difficult though. The creators provide a lovely option called Expanded, click, boom!
Better, much better at seeing the trends.
Not only can you see more clearly how Android and iOS are doing, the massive scale of Nokia's
missed opportunity is also more clear. Ditto for BlackBerry. Windows Phone, an early starter, is also
visible now.
Additionally, you see that Android seems to be going through its almost predictable dip every x
months, with iOS predictably taking that share.
Each graph has its purpose, in my words above you can see the kind of insights that I was looking
for. That drives the graph I find most useful.
While the above graph was pretty lovely, it was the third option that I loved the most. Stream...
Now isn't that awesome?
I love the combination of two insights, one related to the raw growth of the entire space (so much
better than the Stacked option above) and the evolving shares of each player (incrementally better
than the Expanded option above).
You also have the capability of hovering your mouse at any time period and getting the actual
numbers, should you need them.
The chart above is rendered using the NVD3 JavaScript libraries. It is an excellent resource for those
with just a little technical aptitude. Here's my go to resource... NVD3 examples gallery of reusable
charts.
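The difference between the Stacked, Expanded and Stream options comes down to two choices: whether each time period is rescaled to 100%, and where the bottom of the stack starts. A minimal sketch in Python follows. The sales numbers are placeholders, and for Stream I use the simplest centered baseline (minus half the total), which is an assumption on my part and not necessarily the exact offset NVD3 applies.

```python
from typing import Dict, List, Tuple

Series = Dict[str, List[float]]
Band = Tuple[float, float]  # (lower edge, upper edge) of one layer at one time step


def stack(series: Series, mode: str = "stacked") -> Dict[str, List[Band]]:
    """Compute the layer edges for 'stacked', 'expanded' or 'stream' mode."""
    names = list(series)
    steps = len(next(iter(series.values())))
    bands: Dict[str, List[Band]] = {name: [] for name in names}
    for t in range(steps):
        total = sum(series[name][t] for name in names)
        # Expanded: rescale so every time step sums to 1 (pure share view).
        scale = 1 / total if mode == "expanded" and total else 1.0
        # Stream: start below zero so the whole stack is centered on the axis.
        lower = -total / 2 if mode == "stream" else 0.0
        for name in names:
            upper = lower + series[name][t] * scale
            bands[name].append((lower, upper))
            lower = upper
    return bands


# Placeholder unit sales per period, invented for illustration.
sales = {"Android": [10, 40, 90], "iOS": [10, 20, 30], "Other": [20, 20, 10]}

stacked = stack(sales, "stacked")
expanded = stack(sales, "expanded")
stream = stack(sales, "stream")
for mode, result in (("stacked", stacked), ("expanded", expanded), ("stream", stream)):
    print(mode, {name: layers[-1] for name, layers in result.items()})
```

Stacked keeps the raw growth, Expanded keeps only the shares, and Stream keeps the raw thickness of every layer while centering the shape, which is why it shows both insights at once.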
The above example inspired me to share with you my use of Streamgraph as a visualization.
This example illustrates the presence of key concepts in my social activity on Twitter...
Rather than looking at a flat table, or lines or some such ungodly thing, I can see the concepts that
become more or less important to me over time. For an analytics person my interest in Data seems
to have a real ebb and flow. And why did I not care at all end of June and July? What caused me to
really care in early Nov? All great questions, from a simple visualization.
I can of course focus in on just a certain time period...
Or obsess about just one of the concepts and follow its journey over time.
In the streams are also concepts that became very important at a certain period of time and then
died, some came back again later. The Streamgraph gives me a great ability to truly explore lots of
data in a visual that literally fits my small laptop screen. #awesomeness
If you are interested in exploring simple Streamgraph examples, our lovely friends at Microsoft
Research have made it easy. Please visit their site on data visualization apps for Office, and
download the app. They also have an app for Treemaps, and it includes using color, as in the
Compete case, to represent a metric (say Conversion Rate)!
Another excellent resource is the Google Chart Gallery. Trust me, you'll be impressed with what you
can do very quickly. Candlesticks, Scatter Plots, and Sankey charts! Don't forget the Sankey!!
In the next two stories, I want to share three examples of visuals that made me think. I'm hoping to
spark some ideas in your head as to how they might inspire you to do something different. Please
share your ideas via comments and help all of us learn from you. In the last story, we'll go through
another visualization exercise and end with a bang.
Short story #4: Multi-dimensional Slicing and Dicing!
This example is an interactive visualization on Luxury and Foreign Travelers in Rome.
On the x-axis you see the hotel stars (1-5) where the tourists stayed, and on the y-axis you see the
distance of their home country from Rome. It is pretty nice.
It is easy to see the outliers, countries that are bunched together, and wonder if you need to find a
job in Iceland as they can afford four-star hotels much more than others!
You can hover the mouse over any country and get a bit more detail about why they hold the position
they do along the trend line.
What is cool about this is that we can switch the dimensions we are looking at quite easily to arrive
at a more sophisticated understanding of what is going on.
Let's look at Length of Stay and Income Per Capita. Do people from richer countries stay longer in
Rome?
Korea (blue, red yin-yang flag on the left) has an income per capita of $31,000 per year, but people
stay a lot less than the wonderful Russians who make a lot less. And the Koreans stay for a shorter
duration.
You can find the Greeks and the Swiss very close together on the chart now. You can see
implications on marketing to individual countries if you are the Rome Tourism Board.
I like the option to slice by inequality index rating of the country (GINI rating).
It is a little surprising at first glance that countries with very high inequality move to the right.
China, South Africa, Egypt. Or, maybe it is not surprising (the outliers from those countries would
stay at nicer hotels I suppose).
I like the last option the best. How many tourists?
If you were the Rome Tourism Board, now all the sexy fun starts! You can understand your tourists
better, you can make more precise decisions about your media spend, the color of the red carpets
you might want to roll out, the languages you need to teach translators or wait staff at hotels. And so
much more.
You can possibly send this type of a self-contained dataset to the VPs and the C-Suite of a company.
But it is perhaps best created for the Directors and people who are making quarterly, six-monthly
horizon decisions. It is not great for day-to-day tactical analysts/optimizers/marketers.
You can see its power for executives under the C-Suite to slice and dice, bring their own business
knowledge and context and help make more informed decisions.
So. If you could create this type of an environment for your digital existence, what dimensions would
you use? What would you like to plot? If you were an SEO, or ran all Ecommerce for your company,
or were responsible for consumer experience?
My initial thought was to have Channels as the plotted dimension (where you see Country), with Average
Order Value, Assisted Conversions, Bounce Rates and % New Visits as the options to slice by. The last
choice (How many tourists) of course would be Unique Visitors (for the same reason).
What do you think? What would you do?
Short story #5: Segmented Stacked Square Charts.
Another example from my beloved New York Times. This time it was about the elderly, and the
challenges of people who live longer in bad shape: Bracing for the Falls of an Aging Nation.
[Regardless of your age, this is extremely well worth reading - for the benefit of your parents today
and yourself in the future.]
There was something mesmerizing about this graph that was in the article...
I admit it took a few minutes to figure out what was going on. The non-normal placement of the
starting point of the graph might have been it. Or the shades in the legend. But it did take a few
minutes.
I came to like it very much.
It shows some obvious things. The older you are, the more likely it is that you will have more falls
and visits to the ER. The fact that we had this step change suddenly in 7+ in 2008 gives you pause.
Then you can see that what likely happened is that the 6 became 7+. What happened? (The article
shares some ideas.)
Overall everything is getting worse and, sadly, these incidents are happening earlier and earlier.
I do like this as a nice stacked bar graph that includes a very relevant segmentation, broken into
small squares for a nice effect.
If you had to use something like this at work, what would you plot?
I think people don't worry about multi-session conversion enough, they don't obsess about people
enough. It is so silly but they are still obsessed about single session conversions! Makes me mad.
[See: Multi-Channel Attribution Modeling] Hence, I would plot calendar quarters on the top,
channels on the y-axis (not a number), and assisted conversions as the squares for each channel.
What do you think? Good idea? Crazy? What would you do?
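To make my suggestion a little more concrete, here is a rough text-only sketch in Python of that layout: quarters across the top, one row per channel, and one square for every full 100 assisted conversions. The channel names, the counts and the 100-per-square scale are all placeholders I invented for illustration.

```python
from typing import Dict, List


def square_rows(
    assists: Dict[str, List[int]],
    quarters: List[str],
    per_square: int = 100,
    cell_width: int = 8,
) -> List[str]:
    """One text row per channel, one '#' per full `per_square` assisted conversions."""
    label_width = 12
    rows = [" " * label_width + "".join(q.ljust(cell_width) for q in quarters)]
    for channel, counts in assists.items():
        cells = "".join(("#" * (n // per_square)).ljust(cell_width) for n in counts)
        rows.append(channel.ljust(label_width) + cells)
    return rows


# Placeholder assisted-conversion counts per channel and quarter.
quarters = ["Q1", "Q2", "Q3", "Q4"]
assists = {
    "Organic": [300, 400, 500, 700],
    "Email": [200, 200, 300, 300],
    "Social": [100, 300, 200, 400],
}
chart = square_rows(assists, quarters)
print("\n".join(chart))
```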
Let's close on a "let me inspire you to really do this so that you will rock a lot" note.
Short story #6: Conditional Formatting, Simple Strategies To Drive Big Focus!
We started with a table, global oil consumption, let's end with a table too. The magnificent
Information is Beautiful site recently shared this lovely infographic: What is the world's biggest cash
crop?
The right answer to any such question is.... It depends. Always, it depends.
The first visual shared on the site ran off this table.
[Among many other things, one thing I deeply appreciate about the site is that they almost always
share their data in a Google Docs file. This means that I, and you, can always download new real
world datasets to play with to perfect our own skills.]
Experts that they are, this is their wonderful visualization of the above data...
Depending on your definition of success (planted, yield, production or revenue), your answer might
be different. Most people might be surprised that cannabis/marijuana is the grand champion
(astonishing considering it is barely visible in the first three columns!). Poor cocaine, so roundly
defeated! And good old rice, not very far behind. Go, rice, go!
So many interesting things going on here, look at Sugar Cane.
But that is not why I wanted to close with this story.
The point I wanted to make is that when we see the work of amazing artists like David
McCandless and his team, it might seem like they are working at an unattainable level. Ok, yes they
are. But while you and I, normal people, can't get to their level, we can do more than we might
otherwise imagine.
We should try. We should find inspiration from them and we should see what we can do with the
tools we have access to.
In the above case, thanks to the table quickly dumped into Excel, we can use the same strategy of
using Conditional Formatting to create our version of the above graphic.
Not too bad for ten minutes worth of work. We even made the column titles smaller! :)
We can experiment with different options at our disposal. This is purely a matter of taste, but I
thought the Data Bars lead to a better visual. The green intensity drags our eyes to the handful of cells
we need to look at first. Wheat. Sugar cane. Marijuana!
All of the other numbers are there, but they are invisible, even as you can see them!
I see you are complaining that you don't like my table with all the white space and lovely row sizes.
And you don't like black font. And you want it all to fit in one page. Happy birthday!
Still works.
David and his team actually took this a couple notches higher and created a nice bubble chart with
Revenue. This is the end-state, and it is so nice...
It is harder for us to create this with our normal tools or do it quickly. But you and I can do a lot
more than we might believe. Yes the infographic will be shared a lot more in social channels. For us
though, the table might work just fine.
I encourage you to go the extra mile, not give in to the default outputs in our digital analytics
solutions, to find inspiration outside our space, as I did in all examples above, and get better every
day at communicating our ideas more effectively.
We've moved beyond our obsession with data capture, escaped the time suck of data reporting, we are
getting better every day at data analysis. This, data visualization, is our last frontier. The last thing
between us and the glorious glory to be achieved by driving intelligent, fast action based on our
insights.
Carpe diem!
As always, it is your turn now.
First do please share with all of us your ideas for what you would do with examples four and five.
And then... How much time do you, or your team, spend on data visualization on a day-to-day basis?
Does your company allow you to have the time to think and try various techniques? What are some
of your favourite data visualizations for digital marketing and analytics? Are there resources you use
to learn that you would like to share with all of us?
I would love to hear your ideas, critique, life lessons, specific tips and inspiration from this post.