Data Science And Analytics Outsourcing – Vendors, Models, Steps by Ravi Kalakota - May 28, 2015
Data-driven business processes are not a nice-to-have but a need-to-have capability today. So, if
you’re an executive, manager, or team leader, one of your toughest assignments is managing and
organizing your analytics and reporting initiative.
The days of business as usual are over. Data generation costs are falling every day. The cost of
collection and storage is also falling. The speed of insight-to-action business requirement is
increasing. Systems of Record, Systems of Engagement, Systems of Insight are being transformed
with consumerization and digital.
With this tsunami of data and new applications, the bottleneck is clearly shifting from transaction
processing to Analytics & Insight-driven “sense-and-respond” Action. This slide from IBM’s Investor
Briefing summarizes the data-driven transformation underway in most businesses.
Better/Faster/Cheaper Analytics Execution
Industrialization of analytics is the new buzzword. Overcoming the jumble of point solutions is
a non-trivial challenge in a big firm, which typically has disparate vendors, disparate capabilities,
and different interfaces, all acquired over a long period of time.
To meet demand for faster/better/cheaper innovation around analytics, CFOs and CIOs are
rethinking their siloed sourcing strategies and fragmented tech budgets aligned against one-off
projects, and are looking at new ways of doing things: out-tasking, IT outsourcing, and business
process outsourcing of their Analytics and Data Science functions.
The “should we or shouldn’t we outsource data science” discussion is heating up in boardrooms and
executive suites. As analytics becomes core to the firm, C-level execs have to consolidate efforts for
delivering the same services to different groups within an organization.
As managers look to execute on outsourcing strategy they have many structural options depending
on which variable they want to optimize around (cost, quality, productivity, innovation, time to
● Outsource Analytics vs. Building a Shared Services Analytics Function at the LoB or
● Outsource the Analytics Platform development and support or keep it in-house in the IT
function or LoB
● Outsource the modeling and data science part or hire/build the capabilities in-house?
● Augment the current staff with domain specific expertise or hire FTEs?
● Centralize analytics in a shared services model or let the LoBs do their thing?
Build vs. Buy vs. Lease (as-a-service cloud solutions): what is the right configuration? The answer
depends on the organization: internal politics, credibility of IT leadership, ability to execute,
maturity of business requirements, and so on.
Why Outsource or Outtask Data Science or Analytics?
Data science and analytics capability is becoming table stakes in businesses that haven’t traditionally
been thought of as data-focused industries. Who would have thought that maintenance, online
dating or renting movies would be analytically intensive businesses?
However, enterprise IT is often slow to react. In many enterprise IT budgets, the cost of operations
(run the business) is the fastest growing line item, consuming 70+% of budget dollars. IT
organizations are being asked to respond (grow the business or change the business) by enabling
new analytics innovation opportunities, meeting regulatory demands, and building shared private
cloud infrastructure that is comparable to Amazon Web Services. Keeping up is a herculean task for IT.
The race to implement innovation is often a driver for outsourcing: acquiring the right mindset,
toolset, skillset and dataset to get the job done. LoB leaders, instead of waiting for IT, drive
top-line growth by seeking the most direct path to solutions that will support their initiatives. They
want solutions that deliver a quick ROI and can be implemented quickly and affordably, without a huge
drain on IT (a resource they may not have much control over).
The result of this approach is a fractured environment of pragmatic, one-off, get-something-to-market solutions.
A typical scenario in a large firm: business units leveraging different service providers, different
storage and processing technologies, and different front-end visualization tools. In some cases, I
have seen organizations with multiple teams contracting with different vendors within the same
business units to solve similar problems (e.g., customer retention/attrition, next best offer/action),
creating a nightmare for IT who have to support multiple overlapping solutions concurrently in
production and customer-facing organizations getting conflicting insights from the different
This scenario typically forces a centralization and subsequent outsourcing discussion. But the nature
of centralization (from BI Platform or Data Warehousing) is changing. See figure below.
What are Some Areas to Outsource?
The different areas of data sciences or analytics outsourcing (based on lifecycle of a project) include:
● Analytics Consulting (strategy, platform selection, model development, decision process
● Analytics Platform Deployment, Customization and Integration
● Analytics “as-a-service” platform strategies—by leveraging a common set of development,
production, and support capabilities
● Analytics Program Staffing — resource augmentation (salary and intellectual arbitrage),
project and program management
● Domain and Function Modeling Knowhow — depends on how standardized the tasks and
● Dashboard Populating and Creating – data collation, cleansing and dashboard creation
● Legacy BI modernization – a growing problem of enhancing or wrapping the old to produce
● Emerging technology areas like Mobile BI… using an “innovation-as-a-service” model
● Data Quality – With data increasingly critical to business strategy, the costs of poor quality
data, fragmentation, and lack of lineage take center stage.
For each area and business need (transformation vs. strategic vs. tactical) there are different vendors
that are a better fit. In this posting we examine the frequently asked executive questions around
Outtasking or Outsourcing Analytics (and Data Sciences): models of engagement, cost
models, etc. Also included is a list of Analytics Outsourcing Providers that I have been tracking.
Most of these firms are evolving their capabilities but are rooted in providing BI and Analytics
capabilities on a staffing or project basis.
Outsourced analytic providers serving many industries, including retail, telecommunications,
healthcare and others, provide clients with domain expertise in database-driven marketing and
The following figure from Genpact illustrates how a vendor thinks about Analytics Outsourcing.
What are the models of Analytics Outsourcing engagement?
Regardless of the services (data management, business intelligence and reporting, research,
advanced predictive analytics services, and analytics consulting services), you will have to pick a
● Project based model
● Competency based Staff Augmentation based on salary arbitrage
● Creating an Analytics center of excellence (CoE) staffed by your team and vendor team
● Creating a hybrid CoE (partly onshore in the corporation + partly offshore at a captive or
third party vendor)
● SLA or Outcome based is the most complex engagement model.
● Pay per use “as-a-service” Cloud models – providers are responding to the continuing
shortage of data scientists by offering data science know-how as a cloud service.
In some cases you will need mixed models. For instance, it’s important to keep in mind that 80% of
the costs for data-related projects get spent on data preparation, mostly on cleaning up data quality
issues. Unfortunately, data-related budgets at many companies tend to go into platforms and
frameworks, which can only be used after you have quality data.
Who makes the Outsourcing Decision?
Who handles the management and implementation of analytics in the enterprise?
CIO, CFO, CDO, LoB or marketing executives?
Most enterprises are struggling with the right operating model for analytics and data science. This
was relatively straightforward with BI and data management which was often under a global CIO or
Data science and analytics, while seen as a potential game changer, seem to have a fragmented set of
buyers: Line of Business, Function, or even IT. Who is on point to fund the project? It depends on
whether it’s a departmental initiative or a cross-silo initiative.
Analytics is increasingly business driven. Why? With the right architecture and execution, analytics
can have a powerful impact on customer engagement, frontline business units, and operations. Also,
as speed-to-market and innovation become critical, getting the right solutions and implementing
them is typically a business initiative done with outside vendors (often outside the purview of a
typical CIO or CTO).
A recent Deloitte study, “The Analytics Advantage,” highlights how diverse the initiative
ownership is. Executives in many different types of roles own the analytics initiatives within their
enterprises, and no clear title emerges as the dominant owner (see below).
Who are the industry leaders in this space?
This is a tough question to answer without more context around problem or use-case. But in general,
our survey of market leaders shows:
● Broad “super market” services firms with a broad array of capabilities – Accenture, IBM,
● The growing pure-play analytics firms include: Mu-Sigma, Opera, EXL Analytics
● Offshore vendors who have built their model around analytics – Genpact (spin out from GE)
● Domain specific vendors — Dunnhumby (retail analytics); Acxiom (database marketing)
What is the Range of Outsourcing Services Offered?
Increasingly, vendors are able to offer horizontal and vertical solutions effectively packaged in a
variety of configurations. Vendors are becoming more sophisticated as they gain experience
handling large, complex datasets. The services
range from Data Sciences -> expertise in various techniques -> toolsets -> vertical specific expertise.
What is the Range of Technical Skills?
There has been an explosion of innovation in the Hadoop Ecosystem. Companies are racing to adopt
new open source tools to gain a competitive advantage. Does your vendor have a deep enough bench
in these projects? Do they have the architecture skills to put together effective solutions around target
Technical toolkits around Big Data and Analytics include: RDBMS, Open source Hadoop distribution
(e.g., Apache Hadoop), Commercial Hadoop distribution (Cloudera, Microsoft, MapR, IBM, …),
Cloud-based Big Data platform (AWS, Rackspace, …), Cassandra, MongoDB, HBase, Hive, Kafka,
Pig, Search (Elasticsearch, Solr, Lucene, …), Spark, Storm, and ZooKeeper.
What is Data Science?
Data Science is an umbrella term that encapsulates the extraction of
timely, actionable information from diverse data sources. It covers data collection, data modeling
and analysis, and problem solving and decision making. It incorporates and builds on techniques
and theories from many fields, including mathematics, statistics, pattern recognition and learning,
advanced computing, visualization, and uncertainty modeling with the goal of extracting meaning
from data and creating data products.
The term data science is often used interchangeably with business analytics, although “data science”
is becoming more common. Data science seeks to use all available and relevant data to effectively tell a story that can be
easily understood by non-practitioners.
Data science is nothing new. But digital has increasingly created new opportunities where scientific
methods can be applied to massive, real world data sets.
See below for a partial list of Data Science and Analytics Services Providers…
24. Idiro Technologies (Predictive modelling specialising in Social Network Analysis, Big
Data) – www.idiro.com.
25. WNS Analytics (acquired Marketics) (Marketing, Consumer Behavior Analytics)
26. Opera Solutions (General – serves broad areas) –
27. Data Monitor (General – serves broad areas) – http://www.datamonitor.com/
28. Ipsos (Marketing Analytics) – http://www.ipsosasiapacific.com/
29. EXL Services (acquired Inductis) (General – focuses on broad areas)
30. Meritus (Marketing, Customer Analytics) – http://www.meritusglobal.com/
31. Modelytics (Financial, Lending, Collections, Recovery, Retail Banking)
32. Bridge i2i Analytics (Behavioral Modeling & Resource Planning)
33. Cytel (Clinical & Pharma Analytics) – http://www.cytel.co.in/index.shtml
34. Neural Techsoft (Financial & Risk Analytics) – http://www.neuraltechsoft.com/
35. Vehere Interactive (Telecom, Financial) – http://www.vehereinteractive.com/
36. Aegis Global (General – focuses on broad areas)
37. Datamatics (Financial, Insurance)
38. Marketelligent (CPG, Finance, Telecom Analytics)
39. TNS Global (Marketing Analytics) – http://www.tnsglobal.com
40. NettPositive Analytics (Marketing, Credit Risk Analytics)
41. Affine Analytics (Marketing Analytics) – http://www.affineanalytics.com/
42. EVALUESERVE (Financial, Life Sciences Analytics) – http://www.evalueserve.com
43. ZS Associates (Life Sciences, Pharma, Sales and Marketing) –
Issues to Consider in Picking an Analytics Service Provider?
● Who handles the data, how sensitive the data is, and how unusual (and competitive-advantage
based) the analytics are usually dictate the engagement model
● Capability of the team: Most firms and vendors are capable of report generation, descriptive
statistics or dashboard generation
● Ability to Analyze and interpret results: Moving to more complex predictive models requires
domain expertise and use case knowhow….most vendors claim to have this but very rarely
● How easy are they to work with? Do you have to spoon-feed them, or is ambiguity OK? Since
clients are looking for faster turn-arounds for more sophisticated insights on continuously
increasing amounts of data, vendors need to deliver solutions that will scale better with lower
cost of ownership to meet their clients’ internal service-level agreements.
● Experience with large complex data sets or the ability to mix and match different types of
● Emerging Technology Expertise… can they help innovate around new data sources like
Mobile or hyper-connected “Internet of Customers”.
What are the Different Resource Cost Models?
● Onshore consultants (Data scientists will be in the $250-350 per hour range); Specialized
domains (Risk Analytics) will carry a 30% premium ($300-$600 per hour fees).
● In hot geographic areas with a lot of startups, like San Francisco or New York, the rates will
be much higher (supply vs. demand).
● China, especially Shanghai, is a good place for analytical talent in my experience. India,
with its Indian Statistical Institutes (where sound engineering firm Bose came from),
also has good, cheap talent. We built an actuarial center of excellence in New Delhi which
● Offshore consultants (India will be around the $30-$75 per hour range; good for
dashboard creation and other commodity work… many people I spoke to are not sure
about offshore talent for generating complex analytical models and insights).
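To make the rate ranges above concrete, here is a minimal sketch of a blended hourly rate for a mixed onshore/offshore team. The rates are simply the midpoints of the ranges quoted above; the function name and the 200/800-hour split are hypothetical and only there for illustration.

```python
# Midpoints of the hourly ranges quoted above (USD).
ONSHORE_RATE = (250 + 350) / 2   # onshore data scientist
OFFSHORE_RATE = (30 + 75) / 2    # offshore consultant (India)


def blended_rate(onshore_hours: float, offshore_hours: float) -> float:
    """Return the hours-weighted average hourly rate for a mixed team."""
    total_hours = onshore_hours + offshore_hours
    if total_hours <= 0:
        raise ValueError("total hours must be positive")
    total_cost = onshore_hours * ONSHORE_RATE + offshore_hours * OFFSHORE_RATE
    return total_cost / total_hours


if __name__ == "__main__":
    # Hypothetical engagement: 200 onshore hours and 800 offshore hours.
    print(f"Blended rate: ${blended_rate(200, 800):.2f}/hour")
```

A calculation like this only captures labor cost; it says nothing about the quality concerns raised above for complex modeling work.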
Resource costs depend on domain expertise and analytics niche: Predictive analytics (Industry
specific); Behavioral analytics; Risk analytics; Sales & Marketing analytics, Social media analytics,
What are the different Pricing Models in Analytics Outsourcing?
The structure of the pricing for the outsourcing contract can be one of the following:
● Cost Plus. This approach pays the supplier for its actual costs, plus a predetermined profit
percentage. This plan allows little or no flexibility when business objectives and technology
change during the life of the contract, nor does it give any incentive for the supplier to
perform more effectively.
● Unit Pricing. This is a set rate determined by the supplier for a particular level of service,
and the client pays based on its usage. Paying for desktop maintenance based on the number
of users is an example of this approach.
● Fixed Price. Some buyers think this is the best approach, because they know exactly what
the supplier’s price will be, even in the future. But the problem with this approach is that if
the buyer does not adequately define the scope of the process and design effective metrics
before signing the contract, too often the result will be that the supplier claims a particular
service or service level is beyond the scope of the contract and then charges a premium for it.
● Variable Pricing. This plan involves use of a fixed price at the low end of the supplier’s
service, with variances based on higher service levels. Its effectiveness, again, depends on
adequately defining scope of process and metrics.
● Incentive-based (or performance-based) pricing. Here, the buyer provides incentives
to encourage the supplier to perform at peak level (or complete a one-time project ahead of
time, for example) by offering a bonus reward if the supplier performs well. This same plan
works in ensuring that the supplier must pay a penalty if it does not perform to at least the
“satisfactory” service level designated in the agreement. This plan is the one to use to ensure
the supplier’s excellence in performance.
● Risk/reward sharing. Here, the buyer and supplier each have an amount of money at risk
and each stand to gain a percentage of the profits if the supplier’s performance is optimum
and achieves the buyer’s objectives.
The buyer will select a supplier using a pricing model that best fits the business objectives the buyer
is trying to accomplish by outsourcing.
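Three of the pricing structures above can be sketched as simple fee calculations. This is an illustrative sketch only: the function names, parameters, and figures are hypothetical, and real contracts will define scope, metrics, and thresholds in far more detail.

```python
def cost_plus(actual_cost: float, margin_pct: float) -> float:
    """Cost Plus: the supplier's actual cost plus a predetermined profit percentage."""
    return actual_cost + actual_cost * margin_pct / 100


def unit_pricing(rate_per_unit: float, units_used: int) -> float:
    """Unit Pricing: a set rate per unit of service, billed on usage (e.g. per user)."""
    return rate_per_unit * units_used


def incentive_based(base_fee: float, achieved: float, satisfactory: float,
                    target: float, bonus: float, penalty: float) -> float:
    """Incentive-based: bonus at or above target, penalty below the satisfactory level."""
    if achieved >= target:
        return base_fee + bonus
    if achieved < satisfactory:
        return base_fee - penalty
    return base_fee


if __name__ == "__main__":
    # Hypothetical figures, for illustration only.
    print(cost_plus(100_000, 15))
    print(unit_pricing(40, 250))
    print(incentive_based(100_000, achieved=99.5, satisfactory=95.0,
                          target=99.0, bonus=10_000, penalty=20_000))
```

Fixed price, variable pricing, and risk/reward sharing are left out because their results depend mostly on how scope and metrics are defined in the contract rather than on a formula.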
What are the Measures of success?
● Effort based vs. Outcome based
● For repeated analytics like Dashboard generation – one can have SLA, Quality and Errors as
a measure of success.
How effective are vendors in scaling (upwards – more and downwards – less)?
● Depends on whether the vendor is an IT vendor like TCS, a Big 4 firm like Deloitte, or a pure-play
analytics vendor like Mu-Sigma. These vendors can ramp up from a standing start to 200
people in a few months.
● For simple use cases and simple analytics – most vendors can ramp up to 30-50 people easily
(made up of data management, cleansing/quality, BI report generation and Dashboards)
● Vendors can also ramp up around technology platforms like SAP, Oracle more easily than
around use-cases like marketing analytics.
● For more challenging use cases like recommendation engines or next best offer, which require
more sophisticated modeling (simulation, optimization, time series etc.) – most vendors
can probably assemble a small team but may not be able to scale easily beyond 10 people.
● Domain modeling expertise, Architects and skilled project managers tend to be the hardest
skills to find.
What are the Expected Benefits of Analytics Outsourcing?
● Specialization, Focus, Speed-to-market and Scale – tend to be the expected benefits.
● Vendors may have proprietary IP and tools (see below for landscape view of different
● Lower cost by leveraging economies of scale (often the sales pitch but seldom works in
● Better process quality through forced standardization (vendors force clients to standardize
which requires re-engineering the way things are done)
Outsourcing analytics and then just assuming that the specifics will take care of
themselves is a recipe for disaster. Managers must retain enough program management capability to
enforce processes, communicate with all parties, and keep track of critical details.
Vertical Industry Specific Domain Expertise
See this blog posting for Use Cases for Big Data and Analytics
The communications industry is characterized by intense competition and customer attrition, or
“churn.” Targeted marketing opportunities and the rapid response to behavior trends are paramount
to the success of communications service providers in retaining existing customers and attracting
new customers. Customer relationship management, or CRM, analyses need to be constantly and
quickly performed, to enable service providers to market to at-risk customers before they churn,
offer new products and services to those most likely to buy, and identify and manage key customer
relationships. Other key analytical needs of communications service providers include call data
record analysis for revenue assurance, billing and least-cost routing, fraud detection and network
Digital Media and Ecommerce
For online businesses, the process of collecting, analyzing and reporting data about page visits,
otherwise known as click stream analysis, is required for constant monitoring of website
performance and customer pattern changes. In addition to needing to address the operational and
customer relationship challenges faced by traditional retailers, digital media businesses must also
analyze hundreds of millions or even billions of click stream data records to track and respond to
customer behavior patterns in real time. Additionally, with online advertising becoming a major
revenue generator, many digital media businesses and their advertisers need to understand who is
looking at the advertisements and their actions as a result of viewing the advertisements. Fast
analysis of online activity can enable better cross-selling of products, prevent customers from
abandoning shopping carts or leaving the web site, and mitigate click stream fraud.
With thousands of products and millions of customers, many retailers need sophisticated systems to
track, manage and optimize customer and supplier relationships. Targeted marketing programs
often require the analysis of millions of customer transactions. To prevent supply shortages large
retailers must integrate and analyze customer transaction data, vendor delivery schedules and radio
frequency identification supply chain data. Other useful analyses for retail companies include
“market basket” analysis of the items customers buy in a given shopping session, customer loyalty
programs for frequent buyers, overstock/understock and supply chain optimization.
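As a minimal sketch of the “market basket” analysis mentioned above, the snippet below counts how often pairs of items appear together in the same shopping session and reports each pair's support (the fraction of baskets containing both items). The function name and the sample baskets are hypothetical; production systems typically use dedicated association-rule mining over far larger transaction sets.

```python
from collections import Counter
from itertools import combinations


def pair_support(baskets):
    """Return {(item_a, item_b): fraction of baskets containing both items}."""
    counts = Counter()
    for basket in baskets:
        # De-duplicate and sort so each pair has one canonical ordering.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    n = len(baskets)
    return {pair: count / n for pair, count in counts.items()}


if __name__ == "__main__":
    # Hypothetical shopping sessions, for illustration only.
    sample = [
        ["bread", "milk"],
        ["bread", "diapers", "beer"],
        ["milk", "diapers", "beer"],
        ["bread", "milk", "diapers"],
    ]
    ranked = sorted(pair_support(sample).items(), key=lambda kv: -kv[1])
    for pair, support in ranked:
        print(pair, support)
```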
See this blog posting for KPIs for Retail Industry.
Financial services institutions generate terabytes of data related to millions of client purchases,
banking transactions and contacts with marketing, sales and customer service across multiple
channels. This data contains crucial business information on client preferences and buying behavior,
and can reveal insights that enable stronger customer relationship management and increase the
lifetime value of the customer. In addition, risk management and portfolio management applications
require analysis of vast amounts of rapidly changing data for fraud prevention and loan analysis.
With extensive compliance and regulatory requirements, financial institutions are required to retain
an ever-increasing amount of data and need to make this data available for detailed reporting on a
See this blog posting for KPIs for Financial Services Industry.
As some of the largest creators and consumers of data, government agencies around the world need
to access, analyze and share vast amounts of up-to-date data quickly and efficiently. These agencies
face a broad range of challenges, including identifying terrorist threats and reducing fraud, waste
and abuse. Iterative analysis on many terabytes of data with high performance is crucial for
achieving these missions.
Health and Life Sciences
Healthcare providers seek to analyze terabytes of operational and patient care data to measure drug
effectiveness and interactions, improve quality of care and streamline operations through more
cost-effective services. Pharmaceutical companies rely on data analysis to speed new drug
development and increase marketing effectiveness. In the future, these companies plan to
incorporate large amounts of genomic data into their analyses in order to tailor drugs for more
See this blog post for Digital Health and Data.
See this blog post for Informatics in Healthcare.
The significant growth of enterprise data is fueling a need for additional storage and other
information technology infrastructure to maintain and manage it. These technology needs are being
further driven by a steady decline in data storage prices, which makes storing large data sets more
As the volume of data continues to grow, enterprises have recognized the value in analyzing such
data to significantly improve their operations and competitive position. They have also realized that
frequent analysis of data at a more detailed level is more meaningful than periodic analysis of
sampled data. In addition, companies are making analytic capabilities more widely available to a
broad range of users across the enterprise for both strategic and tactical decision-making.
These factors have driven the demand for next-generation data warehouse infrastructure like
Hadoop, NoSQL, and Spark that provides the critical framework for data-driven enterprise
decision-making by way of business intelligence.
Fortune recently reported, “Online help-wanted ads for data analysis mavens have shot up 46% since
April 2011, and 246% since April 2009, to over 31,000 openings now, according to job-market
The shortage of analysts is driving companies to consider outsourcing segments of this value
chain: “Raw Data -> Aggregated Data -> Intelligence -> Insights -> Decisions -> Operational
Impact -> Financial Outcomes -> Value creation.”
Clearly, choosing the right analytics providers (onshore or offshore) and structuring effective
business relationships that deliver continuous value require managers to have a clear understanding
of what they’re looking for and the potential risks involved.