Gramener collaborated with Nasscom to conduct an online masterclass session on "Storytelling With Data." Gramener's CEO, S Anand, led the masterclass and shared some important slides on how to make data stories and how to drive storytelling.
The slides talk about the structure of data stories and how to find meaning full insights from data. There are real-time examples of data analysis and visualizations we created a Gramener to communicate insights as stories.
This is an ultimate guide on data storytelling that offers tips to create data stories, things to keep in mind while making storylines, and choosing designs to make a design-led data story.
Know more about Gramener's data storytelling workshop for analysts and data scientists at https://gramener.com/data-storytelling-workshop
2. Data storytelling is a critical skill for data scientists, analysts & managers
2
But analysts present their work, not their message
Data scientists present their analysis – what they did,
and what they found. That’s not what the audience
needs.
Audiences need a message that tells them what to do,
and why. Told in an engaging way. As a story.
Share your data & analysis as data stories
Whenever you share inferences from data – whether
it’s as a presentation, or an email or document with
your analysis, or as a dashboard – craft it as a story.
This workshop will teach you the techniques of how to
convert an analysis into a memorable story – even if
you’ve never told a story before.
Storytelling has a 30X Return on Investment
Rob Walker and Joshua Glenn auctioned common
items like mugs, golf balls, toys, etc. The item
descriptions were stories purpose-written by 200+
contributing writers.
Items that were bought for $250 sold for over $8,000 –
a return of over 3,000% for storytelling!
Stories are memorable and viral
People remember stories. They’ll act on them.
People share stories. That enables collective action.
We analyze data to improve people’s decision making.
For this to be effective, data stories are needed more
than ever before.
3. With the growth of self-service BI, 85% of companies have lost track of how many
dashboards they generated
What QUESTION does
the dashboard answer?
Is the ANSWER evident
from the dashboard?
What ACTION should the
user take now?
BUT 3 THINGS
ARE UNCLEAR ON
MOST DASHBOARDS
3
5. This is a dataset (1975 – 1990) that
has been around for several years and
has been studied extensively. Yet, a
visualization can reveal patterns that
are neither obvious nor well known.
For example,
• Are birthdays uniformly distributed?
• Do doctors or parents exercise the C-section option to move dates?
• Is there any day of the month that has unusually high or low births?
• Are there any months with relatively high or low births?
Very high births in September.
But this is fairly well known. Most
conceptions happen during the
winter holiday season
Relatively few births during the
Christmas and Thanksgiving
holidays, as well as New Year
and Independence Day.
Most people prefer not
to have children on the
13th of any month, given
that it’s an unlucky day
Some special days like April
Fool’s day are avoided, but
Valentine’s Day is quite
popular
More births Fewer births … on average, for each day of the year (from 1975 to 1990)
Let’s look at 15 years of US Birth Data
Education
LINK
Fraud
6. The pattern in India is quite different
This is a birth date dataset that’s
obtained from school admission data
for over 10 million children. When we
compare this with births in the US, we
see none of the same patterns.
For example,
• Is there an aversion to the 13th or is there a local cultural nuance?
• Are holidays avoided for births?
• Which months have a higher propensity for births, and why?
• Are there any patterns not found in the US data?
Very few children are born in the
month of August, and thereafter.
Most births are concentrated in
the first half of the year
We see a large number of
children born on the 5th, 10th,
15th, 20th and 25th of each month
– that is, round numbered dates
Such round numbered patterns a
typical indication of fraud. Here,
birthdates are brought forward to
aid early school admission
More births Fewer births … on average, for each day of the year (from 2007 to 2013)
Education
LINK
Fraud
7. This adversely impacts children’s marks
It’s a well-established fact that older
children tend to do better at school in
most activities. Since many children
have had their birth dates brought
forward, these younger children suffer.
The average marks of children “born” on the 1st, 5th, 10th, 15th etc.. of
the month tend to score lower marks.
• Are holidays avoided for births?
• Which months have a higher propensity for births, and why?
• Are there any patterns not found in the US data?
Higher marks Lower marks … on average, for children born on a given day of the year (from 2007 to 2013)
Children “born” on round numbered days score lower marks on average,
due to a higher proportion of younger children
Education
LINK
Fraud
8. You have data.
You have analysis.
Now what?
Understanding the audience & intent
Finding insights
Storylining
Designing data stories
10. DO IT: Who might be an audience for your
analysis?
• Lookback at your recent analytics project.
• Who do you know that can use this analysis?
(Come up with a real or hypothetical personas)
CHECK IT: Verify these yourself
Is there a name for the individual?
Was the role specific enough? (Head of sales
instead of just executive)
The same data analysis can be relevant to
many people — each group is called persona.
• The trends in sales data for an organization is
relevant for a CEO, head of sales, region leads,
individual sales team member & every
employee.
• The analysis of polio cases in UP is relevant to
the Minister of health, polio campaign manager,
field workers, NGOs, journalists & general
public.
This section will dive deeper into defining a persona
Define your audience, they determine the story
11. DO IT: Start with your own hypothesis
• Pick one of the personas you had listed earlier.
• What problems do you think your persona is facing?
• How do you feel the persona will use the analysis?
• Frame it as a user scenario.
CHECK IT: Verify user scenario with a partner
Is it framed as “As a [persona], I’m in [situation]
where I face [problem], leading to
[consequence]. Solving it by [action] leads to
[impact]”
Would the persona relate to this user scenario if
they heard it?
List scenario(s) for each persona
For each persona, answer the following questions:
1. What situation are they currently in?
2. What problems do they face?
3. What is the consequence?
4. What action can they need to take using your analysis?
5. What is the impact of this action?
Combine these as a user scenario:
“As a [persona], I’m in [situation] where I face [problem], leading
to [consequence]. Solving it by [action] leads to [impact]”
• John: As a Marketing manager, I have to create region-wise
budget for the next quarter. I don’t know which regions give the
highest RoI, so my spend isn’t optimized. Solving it by prioritizing
the region will lead to maximum ROI.
Clear needs & future scenario leads to effective communication.
Know your audience’s needs, that helps align the message
Reference: SPIN Selling by Neil Rackham
13. Insights must be Big, Useful, and Surprising
Filter the analyses using these as a checklist
IS THE INSIGHT
BIG
IS THE INSIGHT
USEFUL
IS THE INSIGHT
SURPRISING
The analysis must, of course, be statistically significant.
But it should also be numerically significant.
We want a result that substantially changes the outcome.
What should the audience do after hearing the insight?
Can they take an action that improves their objective?
Even if it’s informational, what should they do next?
Is this something they didn’t know? Is it non-obvious?
Does it overturn a domain-driven belief or a gut feel?
Or does it bring consensus to a group with divided opinion?
14. Marking each analysis as Big, Useful or Surprising (High, Medium, Low)
14Only those that are high or medium on all aspects are insights
Insights Big Useful Surprising
Twice as many Detractors talk about our Product’s ease of use. Low Medium High
Typing with capitalization in a credit application indicates creditworthiness Low Low High
Almost 20% of all voice search queries are triggered by just 25 words Low High Medium
More engaged employees have fewer accidents Low High Low
About 50% of American small businesses do not have a website High Medium Low
The recommendation system influences about 80% of content streamed on
Netflix
Big Low Low
16. A business storyline
• Our NPS improved 6%
• It was 34% in 4Q18. Now it’s at 40% in 2Q19
• Despite lower satisfaction with our Support,
our NPS grew
• This increase in NPS was mainly due to better
Product Quality & Research
Gladiator’s storyline
• The Emperor asks General Maximus to take
control of Rome and give it back to people
• The ambitious Prince murders the emperor.
• Maximus is sold as a gladiator slave. His family
is murdered
• Maximus grows famous, fights the Prince in the
arena, and wins
• He joins his family in death. Rome is in the
hands of the people
Outlines are the backbone on which you flesh out your story.
This section explains how to create storylines
Storylines are plot outlines. They summarize the entire story
Notice “characters” in red. All stories
have characters, human or otherwise.
16
17. DO IT: Write your takeaway as one sentence
What’s the one thing you want the audience to
remember from your story?
What’s the one message that the audience
should take away?
CHECK IT: Verify these yourself
Is it a single, complete, sentence?
Does it deliver what you want the audience to
remember?
Will your audience care a lot about this?
Close your eyes. Think of a childhood tale.
Summarize the moral of the story in one line
We easily we remember these stories and their
summary as a moral several years later.
Close your eyes. Think of a business
presentation from last week. Can you easily
summarize the message in one line?
Stories are designed around a moral. A single
takeaway. An “elevator pitch”
It’s a one-sentence summary of the most important message for the audience.
1. Start with the takeaway (The elevator pitch. The moral of the story.)
17
18. DO IT: Write your supporting analysis
1. List all possible analysis
2. Re-word them as sentences
3. Strike off what’s not relevant
CHECK IT: Verify these yourself
Is each necessary? Does each analysis
support the takeaway?
Are they sufficient? Do the analyses prove
the takeaway?
What supports your takeaway from
“The Lion and the Mouse”?
http://www.read.gov/aesop/007.html
The lion was an Asiatic lion
The lion had a huge paw
The lion spared the mouse it caught
The lion was caught by a hunter’s net
It was stalking its prey when it got caught
The mouse was nibbling grass nearby
The mouse took few minutes to cut the net
Only include analysis that proves the takeaway.
Ensure that they fully prove the takeaway.
2. Find analysis that supports your takeaway. Ignore irrelevant content
There’s no right or wrong answer. Think
about how it supports your takeaway.
18
19. 3. Convert analysis into messages by adding context
19
DO IT: Add context to your analysis
1. Take each relevant analysis
2. Convert it to a message for the audience by
adding context
CHECK IT: Verify these yourself
Will your audience understand the messages
without explanation?
Will your audience understand why this
message is relevant?
Analysis doesn’t mean anything to people. When
it does, it’s a message. We do this by adding
context. Three ways to add context are:
1. Compare with similar numbers.
Our $15 mn sales is $3 mn more than last
year, $1 mn below budget, and twice our
nearest competitors.
2. Explain with analogies.
If we stopped producing, it’ll take 3 months to
dispose our excess inventory of $2 mn.
3. Add business interpretation.
Usage is correlated with discounts. For every
$1 discount, customer LTV increases by $24.
Frame each analysis as a message that the audience will understand and find relevant
20. 4. Structure the messages into a pyramid or a tree
20
Example of a business tree
Launch sales were 30% less than target due to
high competition
• Launch sales were projected at $20 mn in the
first month, but achieved only $14 mn
o Sales in every region were 20-50% lower.
o Only Philippines & Korea were on target
• Competitors discounted price by 35% - which
is unsustainable for them
o 80 store discounts increased from 15% to 35%
o The maximum sustainable discount is 20%
• Stores offered higher discounts saw less than
20% of our target sales
Construct a pyramid or tree-like outline
• Start with the takeaway at the root of the tree
• Add a message that supports the takeaway
• Add further details or supporting messages
• Messages must prove the first message, and
only the first message
• Strike off any message that isn’t required to
prove or support the takeaway
• Add next message that supports takeaway
• Add details to prove the second message
• Remaining messages for the takeaway
• Add details as required
Arrange messages hierarchically to prove & support the parent message
21. 5. Re-order the messages to increase memorability and motivation
21
Order messages into an emotionally contrasting,
motivating sequence.
Take this aspects-based flow:
• Our profits doubled. But our sales only grew
20%. Our gross margins stayed flat.
The “emotional arc” is falling,
and not motivating.
Here’s the same message re-ordered:
• Our sales grew mildly at 20%. Our margins
didn’t improve at all. But our profits doubled!
This emotional arc falls before
rising. This is more motivating.
Structure your supporting messages into a
memorable flow. Here are 7 flows that help:
1. Time: e.g. Past, Present, Future
Sales was $15 mn. Now it’s $18 mn. We
expect it to grow to $20 mn.
2. Place: e.g. NA, EU, APAC
3. Aspects: e.g. company, competition, context
4. Benefits: e.g. better, faster, cheaper
5. Scale: e.g. local, regional, global
6. Balance: e.g. pros, cons
7. Priority or climactic: least to most important
Remember: Emotional contrast requires bad news – it makes good look better
23. Visually representing data helps us to see patterns in the data quickly
23
• It’s hard to find patterns & derive insights from
raw data
• Statistics can summarize data, but may hide
patterns in how the data is spread
• We use visual encoding techniques to map
data to visual attributes
24. How the data should be interpreted decides the type of chart to be used
24
https://gramener.github.io/visual-vocabulary-vega/
Deviation
Change-
over-Time
Spatial Ranking
Correlation
Part-to-
Whole
Flow
Magnitude
Distribution
25. We use visual design cues to support our annotations & message
25
• Pre-attentive processing drives our attention
towards certain elements more than others.
• We can leverage these to highlight aspects
of the chart that are relevant to our story.
• For ex, when listing a set of countries, if the
relevant insight is for one country, we can
make it stand out as below:
Position is the most powerful encoding.
The eye and brain are naturally wired to detect mis-
alignment of the smallest order
1
Colour, when used in context, is powerful.
We can detect miniscule changes or variations in colour
when comparing an element with neighbouring elements.
This is what makes true colour (32-pixel colour, i.e. 4
billion) a necessity in computer graphics
2
Size is a useful differentiator.
The eye can detect moderate size variations at
moderate distances. Size also has a natural
interpretation: that of priority.
3
Several other
encodings are possible
Aesthetics such as angle,
shadows, shapes, patterns,
density, labelling, enclosures,
etc. can each be used to map
data.
4
26. DO IT: What can you understand from the
chart shown next?
Look at the chart that will be shown next.
List down what all you can understand as points.
CHECK IT: Verify these yourself
How many did aspects did you notice from the
final list of observations?
• Meaning or message behind a chart isn’t
always obvious.
• The same chart can be interpreted in several
ways by your audience.
• You must guide your audience to see the
message you want to show.
Your audience may not understand what you meant to show
27. Class Xth English Marks Distribution
0
5,000
10,000
15,000
20,000
0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100
28. 4 type of annotations help the audience understand your intent
0
5,000
10,000
15,000
20,000
0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100
Marks
# students
Teachers add marks to stop some students from failing
This chart shows Class 10 students’ English marks in Tamil Nadu, India, in
2011. The X-axis has the mark a student has scored. The Y-axis has the # of
students who scored that mark.
This is a bell curve. But the spike at 35 (the mark at which students pass) is
unusual. Teachers must be adding marks to some of the students who are
likely to fail by a small margin.
Large number of students score
exactly 35 marks
Few (but not 0) students score
between 30-35
What’s unusual
Large number of students
score 35 marks.
Few (but not 0) students score
between 30-35
Only some students get this benefit.
Identify a fair policy that will be applied consistently.
Summarize the chart in its title
Don’t describe the chart.
Don’t write the question to answer.
Write the answer itself. Like a headline.
Explain the chart
How should the user read it?
What do you say when you talk through it?
Explain what the visual is. Then the axes. Then
its contents. Then the inference.
Recommend an action
How should I act on this?
You need to change the audience.
(Otherwise, you made no difference.)
Highlight essential elements
What should the user focus their eyes on?
Point it out.
Interpret what they’re seeing – in words.
29. We can apply the same principles to the chart & surrounding elements
A chart is supported by
several elements to get
context of the data & insight.
Each element has a scope to
be well designed and help
improve the comprehension
of the chart.
Legends
Chart Area
Horizontal Axis
Horizontal Axis ValuesVertical Axis Values
Vertical Axis
Vertical Axis Label
Data Labels
Chart Title
Students
Horizontal Axis Label
31. With the growth of self-service BI, 85% of companies have lost track of how many
dashboards they generated
What QUESTION does
the dashboard answer?
Is the ANSWER evident
from the dashboard?
What ACTION should the
user take now?
BUT 3 THINGS
ARE UNCLEAR ON
MOST DASHBOARDS
31
32. Today we use dashboards to expose data. But users must explore & interpret it.
Quarterly Sales vs Target Product-wise growth
Country-wise revenue vs target Country-wise product growth (%)
- 2,000,000 4,000,000
Enterprise A
Enterprise B
Enterprise C
Consumer A
Consumer B
Consumer C
- 2,000,000 4,000,000 6,000,000 8,000,000
NA
MEA
EU
AU
AP Region Cons.. Cons.. Cons.. Enter.. Enter.. Enter..
AP 12 11 15 12 9 14
AU 15 10 22 17 13 18
EU 18 20 12 14 15 22
MEA 22 30 9 16 18 20
NA 7 4 3 9 10 12
33. We automate data stories. So users act, rather than interpret.
SERVICES REVENUE 5% BELOW TARGET, DESPITE 8% QOQ
GROWTH
STORY GUIDE
The visual on the right shows
our services revenue against
target. If we’re below target,
we must understand why.
The visuals below break up
the revenue by product and
region. Focus on the area with
the weakest performance.
Q1, 2017 Q2, 2017 Q3, 2017 Q4, 2017 Q1, 2018 Q2, 2018 Q3, 2018 Q4, 2018 Q1, 2019 Q2, 2019 Q3, 2019 Q4, 2019
Revenue Target
GROWTH DRIVEN BY
CONSUMER PRODUCTS, NOT
ENTERPRISEQ4 2020
Enterprise A
Enterprise B
Enterprise C
Consumer A
Consumer B
Consumer C
TARGET IMPACTED BY NA
SHORTFALL
Q4 2020
AP
MEA
EU
AU
NA
NA CONSUMER PRODUCTS
HAVE GROWN THE LEAST IN Q4
Q4 2020
Action: North America should grow consumer products. Leverage learning from other regions
Revenue is 5%
Below Target
Cons A Cons B Cons C Ent A Ent B Ent C
AP 12 11 15 12 9 14
AU 15 10 22 17 13 18
EU 18 20 12 14 15 22
MEA 22 30 9 16 18 20
NA 7 4 3 9 10 12
QoQ growth
is 8%
34. Here’s are some live data stories that applies these principles
Does access to new Technology facilitate Innovation? Does it
facilitate Entrepreneurship? The Global Information Technology
Report findings tell us that "innovation is increasingly based on
digital technologies and business models, which can drive
economic and social gains from ICTs...".
We were curious about whether the data on TCData360 could tell a
story about influential factors on innovation and entrepreneurship.
With over 1800 indicators, we focused on the Networked Readiness
Index, as it has indicators on entrepreneurship, technology, and
innovation.
LINK
Source: https://tcdata360.worldbank.org/stories/tech-entrepreneurship/
35. European brewery identified €15 m cost savings after consolidating vendors
35WATCH A 4-MINUTE VIDEOSEE LIVE DEMO
A leading European brewery’s plants purchased
commodity raw materials from several vendors
each – and had low volume discounts.
Plants also placed multiple orders placed every
week, leading to higher logistics cost.
When plant managers were shown the data, they
objected, saying “That’s not always the case.” Or,
“That’s the only way– no one else does better.”
Gramener built a custom analytics solution that
sourced their SAP order data, automatically
identified which plants ordered which commodities
the most from multiple vendors – and when.
It showed how each plant performed compared to
peers – shaming those with poor performance.
With this, they identified savings of €15 m — which
the plant managers couldn’t refute.
€15 m 40%
savings potential identified
annually
vendor based reduction
identified
36.
37. You have data.
You have analysis.
Now, create data stories!
Understanding the audience & intent
Finding insights
Storylining
Designing data stories
Source: How to Vanish Management Reporting Mania, Sep 2014: https://www.cfo.com/management-accounting/2014/09/vanquish-management-report-mania/
Add steps on the right side
Who do you know that will use this analysis? (If no one, make someone up).Edit exercise
Activities to be checked yourself
Add steps on the right side
Think of good anecdotes here
Add steps on the right side
Instructors: Give the audience 1 minute to write down a one-sentence takeaway. Ask 2 people to read it out. Apply the checklist. If they don’t meet the checklist, prompt them to revise it. Allow them to struggle through it before taking help.
Instructors: Give the audience 2 minutes to write down supporting material. Ask 1 person to read it out. Apply the checklist. Debate with them whether each point is necessary: “How does it support the takeaway?” Check if the analyses, when put together, logically prove the takeaway. There must be no alternative conclusion possible. If not, we need additional material to prove the takeaway.
Instructors: Give the audience 2 minutes to expand on the context for any one analysis. Ask 1-2 people to read it out. Apply the checklist. Debate with them how they could make the point clearer and meaningful to the audience.. Explore alternatives, using comparisons, analogies or interpretation.
Instructors: Ask 1-2 people from the audience to add supporting points to their takeaway or any message. Ask others to debate whether these points are necessary and sufficient to prove the parent message. Ask the audience if some of them are sub-bullets to a supporting point.
Instructors: The emotional arc is an interesting device and can attract questions. Clarify a few points. It’s not a graph of the data. It’s a graph of the EMOTION the graph triggers. There’s no right or wrong level to the emotion arc. It’s subjective, and audience driven. It’s important to ensure high contrast, and end with the feeling you want to deliver. That’s all. What most people are afraid to do is deliver bad news. Encourage the audience to deliver bad news as a vehicle to make the good news look better.Practice this by asking people for examples of related good and bad news. Then ask how they would word it so that the good news looks better.
For example, “Sales fell. Margins improved” can become “Despite lower sales, our profits have not worsened proportionally, thanks to a significant improvement in our margins.”
Add more visual encoding content here & move animation up
Make a longer ordered list of what cues should I use?
Source: Designing Data Visualizations by Noah Iliinsky and Julie Steele (O’Reilly).
Copyright 2011 Julie Steele and Noah Iliinsky, 978-1-449-31228-2.
Specificity? Structure?
Specificity? Structure?
Source: How to Vanish Management Reporting Mania, Sep 2014: https://www.cfo.com/management-accounting/2014/09/vanquish-management-report-mania/