Data-Driven Publishing: Using Big Data and smart analysis to make better decisions across the business -- Presented by Ken Brooks, Senior Vice President, Global Supply Chain Management, McGraw-Hill
At Publishers Launch Frankfurt, Frankfurt Book Fair, 8 October 2013
With more data from more internal and external sources available to publishers than ever before, and with ever-more powerful tools and service providers to crunch them, it is incumbent on C-level executives to build Big Data capabilities into their organizations. The possibilities, and the imperatives, will be the topic for Ken Brooks, who has held senior management positions at Bantam Doubleday Dell, Simon & Schuster, Barnes & Noble, and Cengage, and is both a master of data and experienced with all kinds of publishing.
Although there are service providers to do Big Data crunching, and any publisher might use them for some challenges, Brooks believes that learning to use available tools routinely will become a necessary skill set in most publishing houses. He says the key is to become more “data-driven” in analysis and decision-making, because data-driven decisions are possible in more ways than ever before and because publishing is particularly amenable to improvement through the skilled use of data.
Brooks also points out that routine Big Data analysis will become increasingly accurate and beneficial over time. He believes it is an emerging competitive tool of great importance and that the companies that get it soonest will gain great advantage. In this presentation, he will give publishers ideas about how to use Big Data across their enterprise: marketing, editorial, operations, and finance.
2. 2
The analytics / big data world
Data Sources • Content
• Engagement
• Transactions
• Customer / User
Data Management • Relational databases & SQL
• JSON / APIs
• NoSQL databases (Key-value, Object, Row-column, Graph)
• AWS, Azure, Google
• MapReduce / Hadoop
Analytics • Data visualization
• Statistical reasoning
• Algorithms (Regression / Classification, Parametric / non-parametric)
• Problem types (Outlier detection, Recommendation, Prediction)
3. 3
Start with key decisions and actions
Business Unit / Function Key decisions / actions
Strategic • How should I price?
• What channels should be used?
• What products should be offered?
Editorial • Which titles to acquire?
• What product features to offer?
Sales • Which customers to prioritize? How?
• What should an account buy?
• Net or gross sales commission?
• Territory allocation?
• Discounts or other incentives
Marketing • What promotions to run?
• Where to place publicity / ads?
• What loyalty programs to run?
Operations • When and how much to print?
• What vendors to use?
• Number and location of facilities?
6. 6
So how do you do it?
Business
Decisions
Should I increase prices this year?
7. 7
So how do you do it?
Business
Decisions
Should I increase prices this year?
A price increase of 5% will lead to a 2% fall in revenue due to elasticity.
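As a rough check on the figures in this slide, the sketch below backs out the price elasticity implied by a 5% price increase that produces a 2% fall in revenue. It assumes revenue is simply price times units and uses the simple percentage-change definition of elasticity; the function name is illustrative and does not come from the talk.

```python
def implied_elasticity(price_change: float, revenue_change: float) -> float:
    """Back out the price elasticity of demand from a price change and the
    resulting revenue change, assuming revenue = price * units.

    Both arguments are fractional changes, e.g. 0.05 for +5%, -0.02 for -2%.
    """
    # Revenue ratio = price ratio * unit ratio, so solve for the unit ratio.
    unit_ratio = (1 + revenue_change) / (1 + price_change)
    unit_change = unit_ratio - 1
    # Elasticity = fractional change in units / fractional change in price.
    return unit_change / price_change


if __name__ == "__main__":
    e = implied_elasticity(price_change=0.05, revenue_change=-0.02)
    print(f"Implied elasticity: {e:.2f}")  # prints "Implied elasticity: -1.33"
```

Under these assumptions units fall by about 6.7%, so demand is elastic (magnitude above 1) and the price increase lowers revenue.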
8. 8
So how do you do it?
Business
Knowledge
Business
Decisions
9. 9
So how do you do it?
Business
Knowledge
Analytics
Business
Decisions
10. 10
So how do you do it?
Business
Knowledge
Analytics
Business
Decisions
Data visualization
Statistical reasoning
Algorithms
Regression / Classification
Parametric / non-parametric
Problem types
Outlier detection
Recommendation
Prediction
11. 11
So how do you do it?
Business
Knowledge
Analytics Data
Business
Decisions
12. 12
So how do you do it?
Business
Knowledge
Analytics Data
Business
Decisions
Relational databases & SQL
JSON / APIs
NoSQL databases
Key-value
Object
Row-column
Graph
AWS, Azure, Google
MapReduce / Hadoop
13. 13
So how do you do it?
Business
Knowledge
Analytics Data
Business
Decisions
Good morning, I’m Ken Brooks, SVP of Global Supply Chain Management at McGraw-Hill Education. I’ve worked in most sectors of book publishing and across most of the functions. During that time I’ve spent a lot of time thinking about why, with a few notable exceptions, publishing doesn’t spend more time on analytics to support one-time and ongoing business decisions. I can’t tell you the number of meetings I’ve sat in where we’ve been discussing the lack of data, the quality of data, or why something unexpected was going on, when we could have been spending that time more effectively figuring out what to do about it. I’ve pretty much come to the conclusion that because there was so little change for so long, much of the decision making is intuitive, based on years of experience, and up until a few years ago that worked pretty well.
Subject: So I’m here today to talk about big data in publishing and how it can improve decision making.
Importance: In my view big data, or analytics, is *the* key to business efficiency and effectiveness, and it can be applied across the organization regardless of function. As I was discussing this with Mike Shatzkin before the event, he made a very good illustrative point. Anyone who knows Mike well knows that he’s a huge fan of baseball. In baseball the difference between a .300 hitter (a benchmark of quality) and a .250 hitter (a benchmark of mediocrity) over 500 official at-bats (basically a season) is 25 hits. That’s a hit per week. If our player plays about 5 games a week, that’s a hit every five games.
Nobody can see that with the naked eye. And nobody can see that by watching one or two games. This is a very important difference, and it is well below the margin of intuition.
Preview: In my 20 minutes today I plan to discuss:
• What big data is: the nuts and bolts of data sources, databases and analytical frameworks.
• How it fits: a framework for thinking about it. It’s not the nuts and bolts where the value comes in. It’s what questions you ask, decisions you make and actions you take as a result.
• How you do it / get started: what skills are needed and where you get them.
Transition: So what is big data?
According to a 2001 Gartner report, big data is characterized by the “3 V’s” of volume, velocity and variety. In publishing this comes from several sources, many of them brand new:
• Content: all of the content, metadata, etc.
• Customer / user: who the customer is and what they buy.
• Transaction: sales data, production data, etc.
• Engagement: how the customer engages with your content, such as Twitter feeds, reading metrics, etc.
Along with this data come all of the tools for managing it. These include NoSQL databases such as MongoDB and MarkLogic, and tools to handle the flood of data such as Hadoop, Pig and a variety of others with unlikely names.
On top of the tools are the various analytics approaches, ranging from those designed to tease out patterns visually, called data visualization tools, to various types of algorithms and analysis frameworks. Predictive analysis, outlier detection, recommender systems and collaborative filtering through machine learning tools come to mind as key examples.
But I’d like to add a couple more pieces of the puzzle. All of this is meaningless in the absence of the decisions you need to make or the actions you plan to take as a result. You’re doing this analysis and collecting this data in order to take more effective action: to find the .300 hitters among the mass of .250 and lower ideas that are out there. These actions can be ongoing, ad hoc, or real time. Some of the tools make possible on an ongoing basis what used to require months of work to generate.
And the decisions and actions need to fit into the business context, both strategic and functional, of your particular company and sector.
Transition: So what are some of these questions?
I’ve listed a few of the big questions by function here, starting with the strategic questions at the top.
• At the strategic level there are decisions around pricing, channels and products.
• In editorial the questions center on titles to be acquired and product features to be offered.
• For sales it’s which customers or channels to prioritize, how targets and territories should be set, and even the eternal question of gross versus net sales commissions.
• For operations one of the key issues is inventory.
Transition: Let me give you an example. Everyone knows that physical demand is falling. This is borne out in the unit sales numbers of most publishers. What do we do about it?
I know one CEO who was so exasperated by the operations staff and their inability to answer this question quantitatively that she just mandated a 20% reduction in spend for the coming budget year.
One thing is certain. If you keep ordering as you always have, you *will* end up with a write-off *and* you’ll be spending money on working capital that you should be using for new products.
Here’s the prescription:
• Do a title-level forecast and update it every month.
• Project inventory position so you know when you’ll run out.
• Plan printings to balance your costs: makeready, ordering, unit cost, storage and cost of capital.
• Utilize POD to fill in low quantities as needed.
• Once the internal approach is working, supplement it with customer data. It will quickly become apparent when customers are over-ordering.
For example, in higher ed we’ve been seeing a shift toward students buying their textbooks through Amazon rather than through the college bookstores. The bookstores are still ordering the way they always have, and Amazon’s algorithms work fine for trade books but don’t work well for “bursty” demand like college rush schedules. Both will over-order and, as a result, send back massive returns.
This is not difficult, but it does require an analytical, objective approach which can then be tempered by intuition. The base, though, is quantitative.
Transition: OK, so how do you do this? Well, it turns out that many of the levers are already in your hands…
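The “project inventory position” step in the prescription can be sketched in a few lines. This is a minimal illustration, assuming a simple month-by-month title-level forecast with no replenishment or returns; the function name and all figures are invented for the example.

```python
from typing import Optional


def project_run_out(on_hand: int, monthly_forecast: list[int]) -> Optional[int]:
    """Return the 1-based month in which stock is exhausted, or None if the
    forecast horizon ends with inventory still on hand."""
    remaining = on_hand
    for month, demand in enumerate(monthly_forecast, start=1):
        remaining -= demand  # projected inventory position at month end
        if remaining <= 0:
            return month
    return None


if __name__ == "__main__":
    # Hypothetical title: 1,000 units on hand, demand declining each month.
    forecast = [300, 280, 260, 240, 220, 200]
    print(project_run_out(1000, forecast))  # prints 4
```

Rerunning this each month against an updated forecast gives the date by which a printing, or a POD order for small quantities, has to land.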
It starts with the questions. This is the C-level skill required: what are the most important questions for the business, and what is the standard of proof for answers?
I listed a number of possibilities on a previous slide, but you’ll each have your own cut at those.
Transition: For example, there’s the perennial question about pricing…
Normally this would be phrased, “How *much* should I increase prices?”, but an increase is not always a foregone conclusion.
I’ve been in a number of meetings where the answer was driven by working backward from revenue projections and falling unit sales. And this is certainly one way to get an answer.
Transition: I would contend, however, that a better approach would be a bit more data-driven. It may come down to a bunch of people sitting around the table making the decision, but it shouldn’t start that way.
It should start with some analysis to understand competitive position and elasticity of demand, and it should be based on facts to supplement intuition. I could go on about estimates of uncertainty and further analyses, but the question is: how do you make this perspective part of your normal operations?
What kind of people do you need to do this and where do you get them? What do these people look like?
Transition: The first characteristic is business knowledge…
This lines up pretty directly with an understanding of the data sources and what they mean. It helps (a lot) if the person knows this going in, but I’ve seen very experienced publishing executives who don’t have a full understanding of all of the domains I mentioned, particularly when it comes to distinguishing who buys from the publisher from who the ultimate customer is, and how the ultimate customer uses your products and titles.
At the very minimum, however, business knowledge means:
• Knowing how businesses in general fit together: how is sales different from marketing, and what should you be expecting from each function?
• Knowing how the various business processes fit together and the underlying causalities that are relevant in your particular sector.
I’ve generally found this kind of background in individuals with MBAs, consulting backgrounds, or wide functional experience.
Transition: But business knowledge alone isn’t enough…
An orientation towards quantitative analysis is the bare minimum. And I’d take it past the traditional level of analysis that’s done by Finance. You have to get past the financials and be able to look at the levers that drive them: the financials are a *result* of business actions, not the cause of them in anything other than a philosophical sense.
Transition: In this world of big data, this can go much deeper than what you’d expect to see from an MBA or from Finance.
An MBA may have a grasp of the first element, data visualization, but I’d suggest that the further down the list you go, the more specialized the knowledge you’re looking for. Typically you’d find someone with a graduate-level education in Statistics, Operations Research or Computer Science who understands these tools, with a salary to go along with it.
Transition: You also need individuals who are competent in handling data…
One of the hardest parts of any analytics work is dealing with the wide variety of data and the various tools to acquire and clean it up. This is often called “data wrangling”.
I recently had a situation where I was interested in seeing, via sentiment analysis, whether there was an increase in customer complaints following a decision to outsource tech support. This requires an ability to capture a Twitter data feed and then bounce it against a dictionary of phrases with established sentiment scores.
Transition: This requires skills that begin to look like those associated with a Computer Science degree.
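The dictionary-based scoring described in this example can be sketched as follows. This is only an illustration: the phrase dictionary, its scores and the sample messages are all invented, and no real Twitter feed is accessed.

```python
# Hypothetical phrase-to-score dictionary; a real lexicon would be far larger.
SENTIMENT_SCORES = {
    "great": 2,
    "helpful": 1,
    "slow": -1,
    "on hold": -2,
    "useless": -3,
}


def score_text(text: str) -> int:
    """Sum the scores of every dictionary phrase found in the text.

    Uses naive substring matching, so "unhelpful" would wrongly count as
    "helpful"; a real system would tokenize and handle negation.
    """
    lowered = text.lower()
    return sum(score * lowered.count(phrase)
               for phrase, score in SENTIMENT_SCORES.items())


def average_sentiment(messages: list[str]) -> float:
    """Mean score across messages; 0.0 for an empty list."""
    if not messages:
        return 0.0
    return sum(score_text(m) for m in messages) / len(messages)


if __name__ == "__main__":
    # Invented sample messages standing in for a captured feed.
    before = ["Support was great and helpful", "A bit slow but helpful"]
    after = ["On hold for an hour, useless", "Slow and useless support"]
    print(average_sentiment(before), average_sentiment(after))  # prints 1.5 -4.5
```

Comparing the average before and after the outsourcing decision gives a first, rough signal of whether complaints have increased.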
At the basic level are the tools used by IT departments everywhere: relational databases and some variant of SQL to access them. These are then supplemented by:
• Skills with APIs to access remote data sources.
• Some of the more specialized, but increasingly common, NoSQL databases.
• Cloud services for storing and analyzing large data sets.
• Approaches to dealing with large datasets, like MapReduce or its open source implementation, Hadoop.
Transition: So where do you find all of these skills in a single person?
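For readers who have not met MapReduce, the pattern itself is small. The sketch below runs the three phases (map, shuffle, reduce) in plain Python on a single machine; it does not use Hadoop, and the record layout and sales lines are invented for illustration.

```python
from collections import defaultdict


def map_phase(lines):
    """Map: parse each raw 'title,channel,units' line into a (title, units) pair."""
    for line in lines:
        title, _channel, units = line.split(",")
        yield title, int(units)


def shuffle(pairs):
    """Shuffle: group all values emitted for the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: collapse each key's values to a single total."""
    return {key: sum(values) for key, values in grouped.items()}


if __name__ == "__main__":
    # Invented sales lines; a framework like Hadoop would spread these
    # across many machines and run the map and reduce steps in parallel.
    sales = [
        "Title A,bookstore,120",
        "Title B,online,80",
        "Title A,online,200",
    ]
    print(reduce_phase(shuffle(map_phase(sales))))
    # prints {'Title A': 320, 'Title B': 80}
```

The value of the pattern is that the map and reduce steps are independent per record and per key, which is what lets a framework distribute them over very large datasets.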
There really aren’t too many of these folks running around out there, and if you’ve been paying attention to the news on Big Data and Data Science, you’ll know that they are increasingly expensive.
But there’s a path that will get you the benefits of big data and analytics without getting too wrapped up in the details of data science:
• Start with the decisions you need to make and the actions you need to decide on.
• Follow this up with a standard that decisions will be made with the support of analysis and data.
• Look widely for approaches and data sources to provide objective criteria for your decisions.
• Expect this level of rigor from those around you.
• Encourage members of your staff to self-educate via MOOCs and online courses. You don’t have to spend anything at all to do this, but it will require time and patience.
• If you need more specialized tools than you have available, look for specialty consultants. They will save you immense amounts of time and money.
• Look to your IT team to build some of the data handling capabilities.
Transition: So that’s it for my prepared remarks today…
Review: In my 20 minutes today I quickly went through:
• What big data is all about.
• How it fits: I call this the framework for action.
• How you do it / get started.
Importance: In my view big data, or analytics, is *the* key to business efficiency and effectiveness, and it can be applied across the organization regardless of function. It’s how we identify those .300 hitter ideas versus those at .250 or below and support decisions below the margin of intuition.
Request: I’d like to request that in the weeks ahead you pick out some of the big questions that come up and ask yourself if perhaps there is a more analytical way to approach the answer and how you’d go about pursuing it. I’m available at the email address shown and you’ll be hearing from other folks today that will be happy to discuss those issues with you.