Infochimps targets enterprises with stream-processing additions to 'big data' PaaS
Analyst: Matt Aslett
14 Nov, 2012



'Big data' PaaS provider Infochimps has updated its Infochimps Platform with the addition of
stream-processing capabilities to the Infochimps Data Delivery Service based on technologies
first developed at Twitter and LinkedIn. With its first paying customers on board, the company is
now seeking partnerships to help support its enterprise-focused PaaS offering.




   The 451 Take

   There's a big difference between offering Hadoop as a service to be configured, deployed and
   managed, and offering a managed service that masks the complexity of configuring and
   deploying Hadoop. We believe the latter will gain traction as more late adopters look to reap
   the benefits of Hadoop without investing upfront in the expertise and
   infrastructure required to support it. While Infochimps will need to establish the trust of its
   target customers, it is well-positioned with an easy-to-consume managed service for those
   without Hadoop expertise, as well as a stack of technologically interesting projects for the
   'devops' crowd.




Context

We first covered Infochimps earlier this year when the company pivoted from being a data
marketplace provider to releasing the technology that supported its data marketplace, both as
open source projects and as PaaS. The initial focus was on making it easier to deploy the Hadoop


Copyright 2012 - The 451 Group                                                                       1
data-processing framework via a Chef-based systems provisioning, deployment and updating tool
called IronFan. Infochimps has expanded since then with the addition in April of an operations
dashboard called Dashpot, and in August with the addition of the Apache Flume-based Data
Delivery Service (DDS) for integrating with existing data sources, as well as early data-streaming
functionality in DDS via extensions to Wukong, the company's Ruby framework for Hadoop. The latest addition
to the platform expands its support for stream processing through the integration of open source
stream-processing projects Storm and Kafka.


Initially developed by BackType and released as an open source project by Twitter in August 2011
following its acquisition of the social analytics provider, Storm is a stream-processing engine. Kafka,
meanwhile, is a distributed message queue originally developed by LinkedIn and used by the
company in a number of projects, including feeding all activity events to its data warehouse and
Hadoop, as well as keeping its search engine up to date with network activity in real time. Storm
and Kafka are used by Infochimps as the foundation of DDS, which is used to connect the
company's Hadoop-based PaaS with multiple existing data sources, enabling real-time integration
of relevant data for processing and analysis.


DDS is a key component of the Infochimps Platform that elevates it beyond a platform for Hadoop
deployment to being a potential big data management and analytics platform of choice. It is DDS
that will enable businesses to adopt the Infochimps Platform alongside existing data management
technologies and quickly gain insight from new and existing sources of data.


Infochimps' main selling point is lowering the barriers to adopting Hadoop. While there is a lot of
complex technology involved – such as IronFan, elastic Hadoop, DDS, elasticsearch, NoSQL and
NewSQL databases, Wukong and Dashpot – the platform is delivered as a service designed to mask
that complexity. The company maintains that it can take customers from nowhere to generating
business insight from the Infochimps Platform in 30 days, without the need to hire specialist
support and analytics staff, or invest in specialist infrastructure.


Infochimps has attracted nine paying customers since its platform went live in the second quarter,
with an average selling price of $200,000. The company charges customers per node per month for
what is currently a public cloud offering hosted on Amazon Web Services or Rackspace Cloud.
Infochimps has established relationships (soon to be announced) to deliver both private cloud and
virtual private cloud offerings supported in its customers' own datacenters or via their trusted
datacenter provider. The company is launching its cloud services across a network of tier four
datacenters in North America and will begin offering its big data cloud services in the first quarter
of 2013. The potential to support private cloud deployments will be aided by the fact that IronFan is
a key component in VMware's Serengeti project to make it easy to configure and deploy Hadoop on
virtual machines, while the Infochimps Platform also supports the OpenStack API.


The shift toward more enterprise-focused services and partnerships is being led by former Teradata
and StackIQ executive (and Xerox PARC EIR) Jim Kaskade, who joined the company as CEO in
August, replacing cofounder Joe Kelly, who became COO. Kaskade has also been busy lining up a
new major financing round. Infochimps had previously raised a total of $3m from investors
including DFJ Mercury, although that was during its previous incarnation as a data marketplace
provider. The company currently has 23 employees, up from 14 in March.


Competition

An increasing number of vendors offer Hadoop as a service, with Amazon and Google
being the biggest players at this point. While they therefore pose a competitive threat to
Infochimps, the value proposition is quite different: configuring, deploying and managing a
cloud-based Hadoop service still requires a degree of expertise, in contrast to Infochimps'
managed services approach. We've seen limited uptake of cloud-based Hadoop services to date,
with the main use case being development and testing. Indeed, we've noted before that if a
company begins to move toward a larger-scale deployment, the costs can be prohibitive enough to
require on-premises deployment. While Infochimps' service is initially based on the public cloud, it
has designs on supporting deployment choice. The company also believes that with the added
value of IronFan, DDS, Wukong, Dashpot and the rest, along with its managed services approach, it
has enough to justify the additional cost above that of running Hadoop on a public cloud service
with the required expertise.


Other Hadoop service providers include SunGard, Treasure Data, Qubole, Mortar Data and Guavus,
while Infochimps believes its closest competition will come from MetaScale, the Hadoop managed
services subsidiary of Sears Holdings, and tresata, the stealthy data platform provider founded by
former Bank of America managing director for big data and analytics Abhi Mehta. Other vendors are
trying to mask the complexity of configuring and deploying Hadoop by building it into larger
on-premises application stacks, so we might also expect would-be customers to consider the likes
of Drawn to Scale, Splice Machine or Digital Reasoning, depending on the specific application.
Infochimps must also be considered, to some extent, a rival to Hadoop distributors such as Cloudera,
Hortonworks, MapR, IBM and EMC, although there is also the potential for partnerships here, as
indicated by the fact that Cloudera CEO Mike Olson is an adviser to Infochimps.




SWOT Analysis

Strengths

We were already fans of the Chef-based cluster platform tuned for the needs of enterprises using
Hadoop. DDS adds all-important integration with existing tools that will help drive wider adoption.

Weaknesses

Managed services relationships are built on trust. While Infochimps has technological expertise, it
will need to establish itself before some would-be customers will consider it.

Opportunities

We are seeing an increasing need for technologies and services that mask the complexity of
configuring, deploying and managing Hadoop for late adopters. Infochimps has both.

Threats

The big services and software providers are unlikely to sit back and let demand for Hadoop managed
services go elsewhere. Expect the competition to increase with demand.




Reproduced by permission of The 451 Group; © 2012. This report was originally published within 451
 Research’s Market Insight Service. For additional information on 451 Research or to apply for trial access, go
 to: www.451research.com




Copyright 2012 - The 451 Group                                                                                    5

More Related Content

What's hot

Battling the disrupting Energy Markets utilizing PURE PLAY Cloud Computing
Battling the disrupting Energy Markets utilizing PURE PLAY Cloud ComputingBattling the disrupting Energy Markets utilizing PURE PLAY Cloud Computing
Battling the disrupting Energy Markets utilizing PURE PLAY Cloud ComputingEdwin Poot
 
Pouring the Foundation: Data Management in the Energy Industry
Pouring the Foundation: Data Management in the Energy IndustryPouring the Foundation: Data Management in the Energy Industry
Pouring the Foundation: Data Management in the Energy IndustryDataWorks Summit
 
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...Precisely
 
Microsoft and Hortonworks Delivers the Modern Data Architecture for Big Data
Microsoft and Hortonworks Delivers the Modern Data Architecture for Big DataMicrosoft and Hortonworks Delivers the Modern Data Architecture for Big Data
Microsoft and Hortonworks Delivers the Modern Data Architecture for Big DataHortonworks
 
Beyond Big Data: Data Science and AI
Beyond Big Data: Data Science and AIBeyond Big Data: Data Science and AI
Beyond Big Data: Data Science and AIDataWorks Summit
 
Georgia Azure Event - Scalable cloud games using Microsoft Azure
Georgia Azure Event - Scalable cloud games using Microsoft AzureGeorgia Azure Event - Scalable cloud games using Microsoft Azure
Georgia Azure Event - Scalable cloud games using Microsoft AzureMicrosoft
 
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...Mark Rittman
 
Hybrid Data Architecture: Integrating Hadoop with a Data Warehouse
Hybrid Data Architecture: Integrating Hadoop with a Data WarehouseHybrid Data Architecture: Integrating Hadoop with a Data Warehouse
Hybrid Data Architecture: Integrating Hadoop with a Data WarehouseDataWorks Summit
 
SplunkSummit 2015 - Real World Big Data Architecture
SplunkSummit 2015 -  Real World Big Data ArchitectureSplunkSummit 2015 -  Real World Big Data Architecture
SplunkSummit 2015 - Real World Big Data ArchitectureSplunk
 
Hortonworks roadshow
Hortonworks roadshowHortonworks roadshow
Hortonworks roadshowAccenture
 
Building intelligent applications, experimental ML with Uber’s Data Science W...
Building intelligent applications, experimental ML with Uber’s Data Science W...Building intelligent applications, experimental ML with Uber’s Data Science W...
Building intelligent applications, experimental ML with Uber’s Data Science W...DataWorks Summit
 
Privacy-Preserving AI Network - PlatON 2.0
Privacy-Preserving AI Network - PlatON 2.0 Privacy-Preserving AI Network - PlatON 2.0
Privacy-Preserving AI Network - PlatON 2.0 ShiHeng1
 
Big Data/Hadoop Option Analysis
Big Data/Hadoop Option AnalysisBig Data/Hadoop Option Analysis
Big Data/Hadoop Option Analysiszafarali1981
 
Moving Health Care Analytics to Hadoop to Build a Better Predictive Model
Moving Health Care Analytics to Hadoop to Build a Better Predictive ModelMoving Health Care Analytics to Hadoop to Build a Better Predictive Model
Moving Health Care Analytics to Hadoop to Build a Better Predictive ModelDataWorks Summit
 
10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data Lake10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data LakeVMware Tanzu
 
Cloud-Con: Integration & Web APIs
Cloud-Con: Integration & Web APIsCloud-Con: Integration & Web APIs
Cloud-Con: Integration & Web APIsSnapLogic
 

What's hot (20)

Battling the disrupting Energy Markets utilizing PURE PLAY Cloud Computing
Battling the disrupting Energy Markets utilizing PURE PLAY Cloud ComputingBattling the disrupting Energy Markets utilizing PURE PLAY Cloud Computing
Battling the disrupting Energy Markets utilizing PURE PLAY Cloud Computing
 
Pouring the Foundation: Data Management in the Energy Industry
Pouring the Foundation: Data Management in the Energy IndustryPouring the Foundation: Data Management in the Energy Industry
Pouring the Foundation: Data Management in the Energy Industry
 
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
 
Microsoft and Hortonworks Delivers the Modern Data Architecture for Big Data
Microsoft and Hortonworks Delivers the Modern Data Architecture for Big DataMicrosoft and Hortonworks Delivers the Modern Data Architecture for Big Data
Microsoft and Hortonworks Delivers the Modern Data Architecture for Big Data
 
Hybrid Cloud Strategy for Big Data and Analytics
Hybrid Cloud Strategy for Big Data and Analytics Hybrid Cloud Strategy for Big Data and Analytics
Hybrid Cloud Strategy for Big Data and Analytics
 
Beyond Big Data: Data Science and AI
Beyond Big Data: Data Science and AIBeyond Big Data: Data Science and AI
Beyond Big Data: Data Science and AI
 
Georgia Azure Event - Scalable cloud games using Microsoft Azure
Georgia Azure Event - Scalable cloud games using Microsoft AzureGeorgia Azure Event - Scalable cloud games using Microsoft Azure
Georgia Azure Event - Scalable cloud games using Microsoft Azure
 
OpenPOWER Update
OpenPOWER UpdateOpenPOWER Update
OpenPOWER Update
 
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
 
Hybrid Data Architecture: Integrating Hadoop with a Data Warehouse
Hybrid Data Architecture: Integrating Hadoop with a Data WarehouseHybrid Data Architecture: Integrating Hadoop with a Data Warehouse
Hybrid Data Architecture: Integrating Hadoop with a Data Warehouse
 
SplunkSummit 2015 - Real World Big Data Architecture
SplunkSummit 2015 -  Real World Big Data ArchitectureSplunkSummit 2015 -  Real World Big Data Architecture
SplunkSummit 2015 - Real World Big Data Architecture
 
Hortonworks roadshow
Hortonworks roadshowHortonworks roadshow
Hortonworks roadshow
 
Building intelligent applications, experimental ML with Uber’s Data Science W...
Building intelligent applications, experimental ML with Uber’s Data Science W...Building intelligent applications, experimental ML with Uber’s Data Science W...
Building intelligent applications, experimental ML with Uber’s Data Science W...
 
Privacy-Preserving AI Network - PlatON 2.0
Privacy-Preserving AI Network - PlatON 2.0 Privacy-Preserving AI Network - PlatON 2.0
Privacy-Preserving AI Network - PlatON 2.0
 
Solving Big Data Problems using Hortonworks
Solving Big Data Problems using Hortonworks Solving Big Data Problems using Hortonworks
Solving Big Data Problems using Hortonworks
 
Big Data/Hadoop Option Analysis
Big Data/Hadoop Option AnalysisBig Data/Hadoop Option Analysis
Big Data/Hadoop Option Analysis
 
Moving Health Care Analytics to Hadoop to Build a Better Predictive Model
Moving Health Care Analytics to Hadoop to Build a Better Predictive ModelMoving Health Care Analytics to Hadoop to Build a Better Predictive Model
Moving Health Care Analytics to Hadoop to Build a Better Predictive Model
 
10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data Lake10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data Lake
 
Hadoop dev 01
Hadoop dev 01Hadoop dev 01
Hadoop dev 01
 
Cloud-Con: Integration & Web APIs
Cloud-Con: Integration & Web APIsCloud-Con: Integration & Web APIs
Cloud-Con: Integration & Web APIs
 

Viewers also liked

Hw09 Cloudera Desktop In Detail
Hw09   Cloudera Desktop In DetailHw09   Cloudera Desktop In Detail
Hw09 Cloudera Desktop In DetailCloudera, Inc.
 
The Future of Data
The Future of DataThe Future of Data
The Future of Datablynnbuckley
 
Cloudera introduction
Cloudera introductionCloudera introduction
Cloudera introductionPhate334
 
Spark tuning2016may11bida
Spark tuning2016may11bidaSpark tuning2016may11bida
Spark tuning2016may11bidaAnya Bida
 
Introduction to YARN Apps
Introduction to YARN AppsIntroduction to YARN Apps
Introduction to YARN AppsCloudera, Inc.
 
Unlock Hadoop Success with Cloudera Navigator Optimizer
Unlock Hadoop Success with Cloudera Navigator OptimizerUnlock Hadoop Success with Cloudera Navigator Optimizer
Unlock Hadoop Success with Cloudera Navigator OptimizerCloudera, Inc.
 
A beginners guide to Cloudera Hadoop
A beginners guide to Cloudera HadoopA beginners guide to Cloudera Hadoop
A beginners guide to Cloudera HadoopDavid Yahalom
 
Hadoop administration using cloudera student lab guidebook
Hadoop administration using cloudera   student lab guidebookHadoop administration using cloudera   student lab guidebook
Hadoop administration using cloudera student lab guidebookNiranjan Pandey
 
Hadoop & Cloudera Workshop
Hadoop & Cloudera WorkshopHadoop & Cloudera Workshop
Hadoop & Cloudera WorkshopSerkan Sakınmaz
 
Data Science at Scale Using Apache Spark and Apache Hadoop
Data Science at Scale Using Apache Spark and Apache HadoopData Science at Scale Using Apache Spark and Apache Hadoop
Data Science at Scale Using Apache Spark and Apache HadoopCloudera, Inc.
 
Hadoop Workshop using Cloudera on Amazon EC2
Hadoop Workshop using Cloudera on Amazon EC2Hadoop Workshop using Cloudera on Amazon EC2
Hadoop Workshop using Cloudera on Amazon EC2IMC Institute
 

Viewers also liked (14)

Hadoop I/O Analysis
Hadoop I/O AnalysisHadoop I/O Analysis
Hadoop I/O Analysis
 
Hw09 Cloudera Desktop In Detail
Hw09   Cloudera Desktop In DetailHw09   Cloudera Desktop In Detail
Hw09 Cloudera Desktop In Detail
 
The Future of Data
The Future of DataThe Future of Data
The Future of Data
 
Cloudera introduction
Cloudera introductionCloudera introduction
Cloudera introduction
 
Spark tuning2016may11bida
Spark tuning2016may11bidaSpark tuning2016may11bida
Spark tuning2016may11bida
 
Introduction to YARN Apps
Introduction to YARN AppsIntroduction to YARN Apps
Introduction to YARN Apps
 
Yarns About Yarn
Yarns About YarnYarns About Yarn
Yarns About Yarn
 
Unlock Hadoop Success with Cloudera Navigator Optimizer
Unlock Hadoop Success with Cloudera Navigator OptimizerUnlock Hadoop Success with Cloudera Navigator Optimizer
Unlock Hadoop Success with Cloudera Navigator Optimizer
 
A beginners guide to Cloudera Hadoop
A beginners guide to Cloudera HadoopA beginners guide to Cloudera Hadoop
A beginners guide to Cloudera Hadoop
 
Hadoop administration using cloudera student lab guidebook
Hadoop administration using cloudera   student lab guidebookHadoop administration using cloudera   student lab guidebook
Hadoop administration using cloudera student lab guidebook
 
Hadoop & Cloudera Workshop
Hadoop & Cloudera WorkshopHadoop & Cloudera Workshop
Hadoop & Cloudera Workshop
 
Cloudera Desktop
Cloudera DesktopCloudera Desktop
Cloudera Desktop
 
Data Science at Scale Using Apache Spark and Apache Hadoop
Data Science at Scale Using Apache Spark and Apache HadoopData Science at Scale Using Apache Spark and Apache Hadoop
Data Science at Scale Using Apache Spark and Apache Hadoop
 
Hadoop Workshop using Cloudera on Amazon EC2
Hadoop Workshop using Cloudera on Amazon EC2Hadoop Workshop using Cloudera on Amazon EC2
Hadoop Workshop using Cloudera on Amazon EC2
 

Similar to 451 Research Impact Report

SnapLogic Raises $37.5M to Fuel Big Data Integration Push
SnapLogic Raises $37.5M to Fuel Big Data Integration PushSnapLogic Raises $37.5M to Fuel Big Data Integration Push
SnapLogic Raises $37.5M to Fuel Big Data Integration PushSnapLogic
 
InterSystems IRIS Data Platform : Machine learning on the way
InterSystems IRIS Data Platform : Machine learning on the wayInterSystems IRIS Data Platform : Machine learning on the way
InterSystems IRIS Data Platform : Machine learning on the wayRobert Bira
 
Moustafa Soliman "HP Vertica- Solving Facebook Big Data challenges"
Moustafa Soliman "HP Vertica- Solving Facebook Big Data challenges" Moustafa Soliman "HP Vertica- Solving Facebook Big Data challenges"
Moustafa Soliman "HP Vertica- Solving Facebook Big Data challenges" Dataconomy Media
 
BIG Data & Hadoop Applications in Social Media
BIG Data & Hadoop Applications in Social MediaBIG Data & Hadoop Applications in Social Media
BIG Data & Hadoop Applications in Social MediaSkillspeed
 
How pig and hadoop fit in data processing architecture
How pig and hadoop fit in data processing architectureHow pig and hadoop fit in data processing architecture
How pig and hadoop fit in data processing architectureKovid Academy
 
Hadoop Platforms - Introduction, Importance, Providers
Hadoop Platforms - Introduction, Importance, ProvidersHadoop Platforms - Introduction, Importance, Providers
Hadoop Platforms - Introduction, Importance, ProvidersMrigendra Sharma
 
Hadoop Reporting and Analysis - Jaspersoft
Hadoop Reporting and Analysis - JaspersoftHadoop Reporting and Analysis - Jaspersoft
Hadoop Reporting and Analysis - JaspersoftHortonworks
 
Hadoop data-lake-white-paper
Hadoop data-lake-white-paperHadoop data-lake-white-paper
Hadoop data-lake-white-paperSupratim Ray
 
DevOps and Modern Application Development in the Cloud: Red Hat, T-Systems, a...
DevOps and Modern Application Development in the Cloud: Red Hat, T-Systems, a...DevOps and Modern Application Development in the Cloud: Red Hat, T-Systems, a...
DevOps and Modern Application Development in the Cloud: Red Hat, T-Systems, a...Stefan Zosel
 
Apache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu BariApache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu Barijaxconf
 
IIA: The Current State of Hadoop in the Enterprise
IIA: The Current State of Hadoop in the EnterpriseIIA: The Current State of Hadoop in the Enterprise
IIA: The Current State of Hadoop in the EnterpriseCoy Dean
 
Capgemini Data Warehouse Optimization Using Hadoop
Capgemini Data Warehouse Optimization Using HadoopCapgemini Data Warehouse Optimization Using Hadoop
Capgemini Data Warehouse Optimization Using HadoopAppfluent Technology
 
The Big Picture on Big Data and Cognos
The Big Picture on Big Data and CognosThe Big Picture on Big Data and Cognos
The Big Picture on Big Data and CognosSenturus
 
Big data an elephant business opportunities
Big data an elephant   business opportunitiesBig data an elephant   business opportunities
Big data an elephant business opportunitiesBigdata Meetup Kochi
 
Building a Hybrid Data Pipeline for Salesforce and Hadoop
Building a Hybrid Data Pipeline for Salesforce and HadoopBuilding a Hybrid Data Pipeline for Salesforce and Hadoop
Building a Hybrid Data Pipeline for Salesforce and HadoopSumit Sarkar
 
ds_Pivotal_Big_Data_Suite_Product_Suite
ds_Pivotal_Big_Data_Suite_Product_Suiteds_Pivotal_Big_Data_Suite_Product_Suite
ds_Pivotal_Big_Data_Suite_Product_SuiteRobin Fong 方俊强
 
SnapLogic's Latest Elastic iPaaS Release Adds Hybrid Links for Spark, Cortana...
SnapLogic's Latest Elastic iPaaS Release Adds Hybrid Links for Spark, Cortana...SnapLogic's Latest Elastic iPaaS Release Adds Hybrid Links for Spark, Cortana...
SnapLogic's Latest Elastic iPaaS Release Adds Hybrid Links for Spark, Cortana...SnapLogic
 
Supporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataSupporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataWANdisco Plc
 

Similar to 451 Research Impact Report (20)

Why Hadoop as a Service?
Why Hadoop as a Service?Why Hadoop as a Service?
Why Hadoop as a Service?
 
SnapLogic Raises $37.5M to Fuel Big Data Integration Push
SnapLogic Raises $37.5M to Fuel Big Data Integration PushSnapLogic Raises $37.5M to Fuel Big Data Integration Push
SnapLogic Raises $37.5M to Fuel Big Data Integration Push
 
InterSystems IRIS Data Platform : Machine learning on the way
InterSystems IRIS Data Platform : Machine learning on the wayInterSystems IRIS Data Platform : Machine learning on the way
InterSystems IRIS Data Platform : Machine learning on the way
 
Moustafa Soliman "HP Vertica- Solving Facebook Big Data challenges"
Moustafa Soliman "HP Vertica- Solving Facebook Big Data challenges" Moustafa Soliman "HP Vertica- Solving Facebook Big Data challenges"
Moustafa Soliman "HP Vertica- Solving Facebook Big Data challenges"
 
BIG Data & Hadoop Applications in Social Media
BIG Data & Hadoop Applications in Social MediaBIG Data & Hadoop Applications in Social Media
BIG Data & Hadoop Applications in Social Media
 
How pig and hadoop fit in data processing architecture
How pig and hadoop fit in data processing architectureHow pig and hadoop fit in data processing architecture
How pig and hadoop fit in data processing architecture
 
Hadoop Platforms - Introduction, Importance, Providers
Hadoop Platforms - Introduction, Importance, ProvidersHadoop Platforms - Introduction, Importance, Providers
Hadoop Platforms - Introduction, Importance, Providers
 
Hadoop Reporting and Analysis - Jaspersoft
Hadoop Reporting and Analysis - JaspersoftHadoop Reporting and Analysis - Jaspersoft
Hadoop Reporting and Analysis - Jaspersoft
 
Hadoop data-lake-white-paper
Hadoop data-lake-white-paperHadoop data-lake-white-paper
Hadoop data-lake-white-paper
 
DevOps and Modern Application Development in the Cloud: Red Hat, T-Systems, a...
DevOps and Modern Application Development in the Cloud: Red Hat, T-Systems, a...DevOps and Modern Application Development in the Cloud: Red Hat, T-Systems, a...
DevOps and Modern Application Development in the Cloud: Red Hat, T-Systems, a...
 
Apache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu BariApache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu Bari
 
IIA: The Current State of Hadoop in the Enterprise
IIA: The Current State of Hadoop in the EnterpriseIIA: The Current State of Hadoop in the Enterprise
IIA: The Current State of Hadoop in the Enterprise
 
Capgemini Data Warehouse Optimization Using Hadoop
Capgemini Data Warehouse Optimization Using HadoopCapgemini Data Warehouse Optimization Using Hadoop
Capgemini Data Warehouse Optimization Using Hadoop
 
The Big Picture on Big Data and Cognos
The Big Picture on Big Data and CognosThe Big Picture on Big Data and Cognos
The Big Picture on Big Data and Cognos
 
Big data an elephant business opportunities
Big data an elephant   business opportunitiesBig data an elephant   business opportunities
Big data an elephant business opportunities
 
Building a Hybrid Data Pipeline for Salesforce and Hadoop
Building a Hybrid Data Pipeline for Salesforce and HadoopBuilding a Hybrid Data Pipeline for Salesforce and Hadoop
Building a Hybrid Data Pipeline for Salesforce and Hadoop
 
ds_Pivotal_Big_Data_Suite_Product_Suite
ds_Pivotal_Big_Data_Suite_Product_Suiteds_Pivotal_Big_Data_Suite_Product_Suite
ds_Pivotal_Big_Data_Suite_Product_Suite
 
Combining hadoop with big data analytics
Combining hadoop with big data analyticsCombining hadoop with big data analytics
Combining hadoop with big data analytics
 
SnapLogic's Latest Elastic iPaaS Release Adds Hybrid Links for Spark, Cortana...
SnapLogic's Latest Elastic iPaaS Release Adds Hybrid Links for Spark, Cortana...SnapLogic's Latest Elastic iPaaS Release Adds Hybrid Links for Spark, Cortana...
SnapLogic's Latest Elastic iPaaS Release Adds Hybrid Links for Spark, Cortana...
 
Supporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataSupporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big Data
 

More from Infochimps, a CSC Big Data Business

[Webinar] Getting to Insights Faster: A Framework for Agile Big Data
[Webinar] Getting to Insights Faster: A Framework for Agile Big Data[Webinar] Getting to Insights Faster: A Framework for Agile Big Data
[Webinar] Getting to Insights Faster: A Framework for Agile Big DataInfochimps, a CSC Big Data Business
 
Case Study: Digital Agency Turbocharges Social Listening and Insights with t...
Case Study: Digital  Agency Turbocharges Social Listening and Insights with t...Case Study: Digital  Agency Turbocharges Social Listening and Insights with t...
Case Study: Digital Agency Turbocharges Social Listening and Insights with t...Infochimps, a CSC Big Data Business
 


451 Research Impact Report

Infochimps targets enterprises with stream-processing additions to 'big data' PaaS

Analyst: Matt Aslett
14 Nov, 2012

'Big data' PaaS provider Infochimps has updated its Infochimps Platform with the addition of stream-processing capabilities to the Infochimps Data Delivery Service based on technologies first developed at Twitter and LinkedIn. With its first paying customer on board, the company is now seeking partnerships to help support its enterprise-focused PaaS offering.

The 451 Take

There's a big difference between offering Hadoop as a service to be configured, deployed and managed, and offering a managed service that masks the complexity of configuring and deploying Hadoop. We believe the latter will gain traction as more late adopters begin to look at adopting the benefits of Hadoop without investing upfront in the expertise and infrastructure required to support it. While Infochimps will need to establish the trust of its target customers, it is well-positioned with an easy-to-consume managed service for those without Hadoop expertise, as well as a stack of technologically interesting projects for the 'devops' crowd.

Context

We first covered Infochimps earlier this year when the company pivoted from being a data marketplace provider to releasing the technology that supported its data marketplace, both as open source projects and as PaaS. The initial focus was on making it easier to deploy the Hadoop data-processing framework via a Chef-based systems provisioning, deployment and updating tool called IronFan. Infochimps has expanded since then with the addition in April of an operations dashboard called Dashpot, and in August with the addition of the Apache Flume-based Data Delivery Service (DDS) for integrating with existing data sources, as well as early data-streaming functionality in DDS via extensions to Wukong, the company's Ruby for Hadoop.

The latest addition to the platform expands its support for stream processing through the integration of open source stream-processing projects Storm and Kafka. Initially developed by BackType and released as an open source project by Twitter in August 2011 following its acquisition of the social analytics provider, Storm is a stream-processing engine. Kafka, meanwhile, is a distributed message queue originally developed by LinkedIn and used by the company in a number of projects, including feeding all activity events to its data warehouse and Hadoop, as well as keeping its search engine up to date with network activity in real time.

Storm and Kafka are used by Infochimps as the foundation of DDS, which is used to connect the company's Hadoop-based PaaS with multiple existing data sources, enabling real-time integration of relevant data for processing and analysis. DDS is a key component of the Infochimps Platform that elevates it beyond a platform for Hadoop deployment to being a potential big data management and analytics platform of choice. It is DDS that will enable businesses to adopt the Infochimps Platform alongside existing data management technologies and quickly gain insight from new and existing sources of data.

Infochimps' main selling point is in lowering the barriers to adopting Hadoop. While there is a lot of complex technology involved – such as IronFan, elastic Hadoop, DDS, elasticsearch, NoSQL and NewSQL databases, Wukong and Dashpot – the platform is delivered as a service designed to mask that complexity. The company maintains that it can take customers from nowhere to generating business insight from the Infochimps Platform in 30 days, without the need to hire specialist support and analytics staff, or invest in specialist infrastructure.

Infochimps has attracted nine paying customers since its platform went live in the second quarter, with an average selling price of $200,000. The company charges customers per node per month for what is currently a public cloud offering hosted on Amazon Web Services or Rackspace Cloud. Infochimps has established relationships (soon to be announced) to deliver both private cloud and virtual private cloud offerings supported in its customers' own datacenters or via their trusted datacenter provider. The company is launching its cloud services across a network of tier four datacenters in North America and will begin offering its big data cloud services in the first quarter of 2013. The potential to support private cloud deployments will be aided by the fact that IronFan is a key component in VMware's Serengeti project to make it easy to configure and deploy Hadoop on virtual machines, while the Infochimps Platform also supports the OpenStack API.

The shift toward more enterprise-focused services and partnerships is being led by former Teradata and StackIQ executive (and Xerox PARC EIR) Jim Kaskade, who joined the company as CEO in August, replacing cofounder Joe Kelly, who became COO. Kaskade has also been busy lining up a new major financing round. Infochimps had previously raised a total of $3m from investors including DFJ Mercury, although that was during its previous incarnation as a data marketplace provider. The company currently has 23 employees, up from 14 in March.

Competition

There are an increasing number of vendors offering Hadoop as a service, with Amazon and Google being the biggest players at this point. While they therefore pose a competitive threat to Infochimps, the value proposition is quite different, since it still requires a degree of expertise to configure, deploy and manage a cloud-based Hadoop service in comparison to Infochimps' managed services approach. We've seen limited uptake of cloud-based Hadoop services to date, with the main use case being development and testing. Indeed, we've noted before that if a company begins to move toward a larger-scale deployment, the costs can be prohibitive enough to require on-premises deployment. While Infochimps' service is initially based on the public cloud, it has designs on supporting deployment choice. The company also believes that with the added value of IronFan, DDS, Wukong, Dashpot and the rest, along with its managed services approach, it has enough to justify the additional cost above that of running Hadoop on a public cloud service with the required expertise.

Other Hadoop service providers include SunGard, Treasure Data, Qubole, Mortar Data and Guavus, while Infochimps believes its closest competition will come from MetaScale, the Hadoop managed services subsidiary of Sears Holdings, and tresata, the stealthy data platform provider founded by former Bank of America managing director for big data and analytics Abhi Mehta.

Other vendors are trying to mask the complexity of configuring and deploying Hadoop by building it into larger on-premises application stacks, so we might also expect would-be customers to consider the likes of Drawn to Scale, Splice Machine or Digital Reasoning, depending on the specific application.

The company must also be considered a rival to some extent with Hadoop distributors such as Cloudera, Hortonworks, MapR, IBM and EMC, although there is also the potential for partnerships here, as indicated by the fact that Cloudera CEO Mike Olson is an adviser to Infochimps.

SWOT Analysis

Strengths

We were already fans of the Chef-based cluster platform tuned for the needs of enterprises using Hadoop. DDS adds all-important integration with existing tools that will help drive wider adoption.

Weaknesses

Managed services relationships are built on trust. While Infochimps has technological expertise, it will need to establish itself before some would-be customers will consider it.

Opportunities

We are seeing an increasing need for technologies and services that mask the complexity of configuring, deploying and managing Hadoop for late adopters. Infochimps has both.

Threats

The big services and software providers are unlikely to sit back and let demand for Hadoop managed services go elsewhere. Expect the competition to increase with demand.

Reproduced by permission of The 451 Group; © 2012. This report was originally published within 451 Research’s Market Insight Service. For additional information on 451 Research or to apply for trial access, go to: www.451research.com

Copyright 2012 - The 451 Group