Cloudera Search
Embracing Apache Solr into Cloudera’s Platform for Big Data

Eva Andreasson, Sr. Product Manager, Cloudera
Steven Noels, Co-founder and SVP of Products, NGDATA

Who is Cloudera?

What the Enterprise Requires
§ Only 100% open source Hadoop-based platform with both batch and real-time processing engines, enterprise-ready with native high availability
§ Suite of system and data management software
§ Comprehensive support and consulting services
§ Broadest Hadoop training and certification programs

Extensive Partner Ecosystem
§ Over 600 partners across hardware, software and services

The Leader in Big Data Management
§ Deliver a revolutionary data management platform powered by Apache Hadoop
§ World’s leading commercial vendor of Apache Hadoop
§ Enable organizations to improve operational efficiency and Ask Bigger Questions of all their data

Customers & Users Across Industries
§ More production deployments than all other vendors combined

Cloudera Enterprise
A revolutionary solution powered by Apache Hadoop
BRINGS STORAGE & COMPUTE TOGETHER
WORKS WITH EVERY TYPE OF DATA
CHANGES THE ECONOMICS OF DATA MANAGEMENT
[Diagram: INGEST, STORE, EXPLORE, PROCESS, ANALYZE, SERVE; components: CDH, CLOUDERA MANAGER, CLOUDERA NAVIGATOR, CLOUDERA SUPPORT]

About NGDATA
A Next Generation Customer Intelligence Company

NGDATA is the next generation Customer Intelligence company that enables actionable customer insights, personalized product offers and intimate customer experience with a unique combination of interactive Big Data management and machine learning technologies in one integrated solution.

VISION & EXPERTISE: Business Expertise; Enterprise Architectures; Big Data Technology; Machine Learning, Algorithms, Analytics; Customer Intelligence
SOLUTION (Lily): Customer Database; Enterprise Data; Reference Data; Customer Data; Customer Engagement; Governance and Risk Management; Insights, Trends and Analysis

Agenda
§ Why Search?
§ What is Cloudera Search?
§ Using Cloudera Search
§ Learn more

Why Search?

Cloudera’s Enterprise Strategy
An Integrated Part of the Hadoop System
One pool of data
One security framework
One set of system resources
One management interface

Search Simplifies Interaction
Explore
Navigate
Correlate
Experts know MapReduce. Savvy people know SQL. Everyone knows Search.

Benefits of Search

Improved Big Data ROI
• An interactive experience without technical knowledge
• Single data set for multiple computing frameworks

Faster time to insight
• Exploratory analysis, esp. unstructured data
• Broad range of indexing options to accommodate needs

Cost efficiency
• Single scalable platform; no incremental investment
• No need for separate systems, storage

Solid foundations and reliability
• Solr in production environments for years
• Hadoop-powered reliability and scalability

What is Cloudera Search?

Cloudera Search

Interactive search for Hadoop
• Full-text and faceted navigation
• Batch, near real-time, and on-demand indexing

Apache Solr integrated with CDH
• Established, mature search with vibrant community
• Separate runtime like MapReduce, Impala
• Incorporated as part of the Hadoop ecosystem

Open Source
• 100% Apache, 100% Solr
• Standard Solr APIs
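
Because the slide stresses standard Solr APIs, a full-text query with faceted navigation can be sketched as an ordinary request to Solr's `/select` handler. The snippet below only builds the request URL; the host `search-host`, the collection `logs`, and the facet fields `host` and `severity` are hypothetical names chosen for illustration, not values from the webinar.

```python
from urllib.parse import urlencode


def build_select_url(base_url, collection, text, facet_fields, rows=10):
    """Build a URL for Solr's standard /select handler with faceting enabled."""
    params = [
        ("q", text),        # full-text query string
        ("rows", rows),     # number of documents to return
        ("wt", "json"),     # response format
        ("facet", "true"),  # turn on faceted navigation
    ]
    # One facet.field parameter per field to drill down on.
    params += [("facet.field", field) for field in facet_fields]
    return f"{base_url}/{collection}/select?{urlencode(params)}"


# Hypothetical host, collection, and field names.
url = build_select_url(
    "http://search-host:8983/solr", "logs", "error AND timeout", ["host", "severity"]
)
print(url)
```

Sending this URL with any HTTP client would return matching documents plus per-field facet counts, which is what a drill-down UI consumes.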
  
Scalable and Robust Index Storage
[Diagram: HDFS; Lucene; Extraction; Mapping; Solr; ZooKeeper; SolrCloud; Querying API; Indexing API]

Solr and HDFS
• Scalable, cost-efficient index storage
• Higher availability
• Search and process data in one platform

Near Real Time Indexing at Ingest
[Diagram: Log File, Flume Agent, Indexer; Other Log File, Flume Agent, Indexer; HDFS]

Solr and Flume
• Data ingest at scale
• Flexible extraction and mapping
• Indexing at data ingest
• Document-level ACL

Streamlined Extraction and Mapping

Cloudera Morphlines
• Simple and flexible data transformation
• Reusable across multiple index workloads
• Over time, extend and re-use across platform workloads

[Diagram: syslog, Flume Agent, Solr sink, Command: readLine, Command: grok, Command: loadSolr, Solr; data passes as Event, Record, Record, Record, Document]
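
The readLine, grok, loadSolr chain in the diagram can be pictured as a sequence of small commands, each taking a record and handing a transformed record to the next. The sketch below is a plain-Python analogy of that idea and is not the Morphlines API (real morphlines are declared in a configuration file); the regular expression, the sample syslog line, and the in-memory `indexed` list standing in for Solr are all assumptions made for illustration.

```python
import re

# Hypothetical pattern for a classic syslog line; real grok dictionaries differ.
SYSLOG_PATTERN = re.compile(
    r"<(?P<priority>\d+)>(?P<timestamp>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) "
    r"(?P<hostname>\S+) (?P<program>[\w./-]+)(?:\[(?P<pid>\d+)\])?: (?P<message>.*)"
)


def read_line(event):
    """Turn a raw event body (bytes) into a record with a 'message' field."""
    return {"message": event.decode("utf-8").rstrip("\n")}


def grok(record):
    """Extract named fields from the line; return None to drop non-matching input."""
    match = SYSLOG_PATTERN.match(record["message"])
    if match is None:
        return None
    return {key: value for key, value in match.groupdict().items() if value is not None}


def make_load_solr(sink):
    """Return a command that 'loads' records into a list standing in for Solr."""
    def load_solr(record):
        sink.append(record)
        return record
    return load_solr


def run_pipeline(event, commands):
    """Pass the event through each command in order; stop if a command drops it."""
    record = event
    for command in commands:
        record = command(record)
        if record is None:
            return None
    return record


indexed = []
commands = [read_line, grok, make_load_solr(indexed)]
run_pipeline(b"<164>Feb  4 10:46:14 syslog sshd[607]: listening on 0.0.0.0 port 22.\n", commands)
print(indexed)
```

Keeping each step as an independent command is what makes the chain reusable: the same parsing steps could feed a different final command for another workload.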
  
Scalable Batch Indexing
[Diagram: Files in HDFS, Indexer, Index shard, Solr server (two parallel paths)]

Solr and MapReduce
• Flexible, scalable batch indexing
• Start serving new indices with no downtime
• On-demand indexing, cost-efficient re-indexing

Scalable Batch Indexing
• Mapper (several in parallel): Parse input into indexable document
• Arbitrary reducing steps of indexing and merging
• End-Reducer (shard 1): Index document, producing Index shard 1
• End-Reducer (shard 2): Index document, producing Index shard 2
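
The mapper and end-reducer roles above can be illustrated with a small single-process sketch. It is not the actual MapReduce indexing job: the `id<TAB>text` input format, the CRC32 partitioner, and the dictionary-based inverted index standing in for a Lucene shard are assumptions made to keep the example self-contained.

```python
import zlib
from collections import defaultdict


def map_parse(raw_line):
    """Mapper: parse one input line ('id<TAB>text') into an indexable document."""
    doc_id, text = raw_line.split("\t", 1)
    return {"id": doc_id, "text": text}


def shard_for(doc_id, num_shards):
    """Stable partitioner so a given id always lands on the same shard."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def reduce_index(docs):
    """End-reducer: build one shard's inverted index (term -> sorted doc ids)."""
    postings = defaultdict(set)
    for doc in docs:
        for term in doc["text"].lower().split():
            postings[term].add(doc["id"])
    return {term: sorted(ids) for term, ids in postings.items()}


def batch_index(lines, num_shards=2):
    """Run the map phase, partition documents by shard, then reduce per shard."""
    partitions = defaultdict(list)
    for line in lines:
        doc = map_parse(line)
        partitions[shard_for(doc["id"], num_shards)].append(doc)
    return {shard: reduce_index(partitions[shard]) for shard in range(num_shards)}


def search(shards, term):
    """Query every shard and merge the hits, as a distributed search would."""
    hits = set()
    for index in shards.values():
        hits.update(index.get(term.lower(), []))
    return sorted(hits)


lines = ["a1\tHadoop batch indexing", "b2\tSolr search on Hadoop", "c3\tFlume ingest"]
shards = batch_index(lines, num_shards=2)
print(search(shards, "hadoop"))
```

Because each end-reducer writes a complete shard independently, the shards can be built offline and then handed to the serving layer, which is the property the slide relies on for re-indexing.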
  
Searchable Real-Time Data
Indexing HBase
[Diagram: HDFS; HBase; interactive load; Indexer(s); Triggers on updates; Solr servers; Search]

Searchable Real-Time Data
HBase & Search
planet-sized tabular data, immediate access & updates + fast & flexible information discovery = BIG DATA DATA MANAGEMENT

HBase SEP Triggers & Indexer
• HBase replication mechanism for reliable indexing
• light-weight, zero impact on write performance
• easy to set up & integrate
• flexible, configuration-based mapping & content extraction

Many use cases
• indexes near-real-time HBase updates into Solr
• fielded search on HBase columns
• faceted search
• query by example
• datacube
• secondary indexes
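
The trigger-and-indexer idea can be sketched as a toy model: writes only append an event to a log, and a separate indexer later drains that log and maps selected columns to search fields through configuration. This is an illustration of the pattern, not the HBase or indexer API; the class names, the column names such as `info:name`, and the field names such as `name_s` are hypothetical.

```python
from collections import deque


class Table:
    """Toy row store that records every write as an event, like a replication log."""

    def __init__(self):
        self.rows = {}
        self.events = deque()

    def put(self, row_key, column, value):
        self.rows.setdefault(row_key, {})[column] = value
        self.events.append(row_key)  # the write path only appends an event


class Indexer:
    """Consumes write events asynchronously and keeps a search index up to date."""

    def __init__(self, table, mapping):
        self.table = table
        self.mapping = mapping  # configuration: column name -> search field name
        self.index = {}         # stand-in for Solr: document id -> document

    def drain(self):
        """Process all pending events and return how many were handled."""
        handled = 0
        while self.table.events:
            row_key = self.table.events.popleft()
            row = self.table.rows[row_key]
            doc = {"id": row_key}
            for column, field in self.mapping.items():
                if column in row:
                    doc[field] = row[column]
            self.index[row_key] = doc  # re-indexing a row replaces its document
            handled += 1
        return handled


table = Table()
indexer = Indexer(table, {"info:name": "name_s", "info:city": "city_s"})
table.put("u1", "info:name", "Ada")
table.put("u1", "info:city", "Ghent")
table.put("u1", "audit:ts", "123")  # not in the mapping, so it is never indexed
pending_before = len(indexer.index)
handled = indexer.drain()
print(indexer.index)
```

Decoupling the write from the indexing step is what keeps the write path cheap in this model; the cost is that the index lags slightly behind the table until the events are drained.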
  
Simple, Customizable Search Interface

Hue
• Simple UI
• Navigated, faceted drill down
• Customizable display
• Full text search, standard Solr API and query language

Simplified Management

Cloudera Manager
• Install, configure, deploy Solr services on the cluster
• Unified management and monitoring
• Resource management

Using Cloudera Search

Skybox
Scalable, efficient image search for analysis and process improvement
• Advanced parallel image processing on images stored in HDFS
• Before: difficult to interactively evaluate image quality and correlate with satellite logs
• Now: Index images and satellite logs at acquisition and on demand, interactively introspect image quality

Explorys Medical
Event, exploration, and data correlation to meet SLAs

"Hadoop has been Explorys' center of gravity for data management since the company's inception. The addition of Search to Cloudera's platform expands its usability by supporting more workloads and reducing data movement between infrastructure systems. Deploying Cloudera Search supports Explorys' mission to help healthcare providers deliver better, more cost efficient care through fast, flexible data analysis."
-- Michael Onders, SVP & CTO, Explorys

Patterns and Predictions
Proactive healthcare for returning military veterans
• Identify patterns in social media and perform analytics on term usage to improve suicide predictive capability
• Before: Social media data sets too large; traditional enterprise search
• Now: Near real-time correlation of medical records, notes, social media; access for doctors and non-tech staff

Questions
• Ask on the Q&A tab
• Recording will be available at cloudera.com
• After webinar, inquire at: info@cloudera.com
• Presenters contact info: eva@cloudera.com, stevenn@ngdata.com

Thank you for attending!

Download Cloudera Search
cloudera.com/downloads

Learn more about Cloudera Search, powered by Solr
cloudera.com/search

Learn more about NGDATA and Lily
www.ngdata.com

Cloudera Search Webinar: Big Data Search, Bigger Insights

More Related Content

What's hot

Welcome to Hadoop2Land!
Welcome to Hadoop2Land!Welcome to Hadoop2Land!
Welcome to Hadoop2Land!Uwe Printz
 
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for ImpalaSQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impalamarkgrover
 
Cloudera Impala - Las Vegas Big Data Meetup Nov 5th 2014
Cloudera Impala - Las Vegas Big Data Meetup Nov 5th 2014Cloudera Impala - Las Vegas Big Data Meetup Nov 5th 2014
Cloudera Impala - Las Vegas Big Data Meetup Nov 5th 2014cdmaxime
 
Hadoop meets Agile! - An Agile Big Data Model
Hadoop meets Agile! - An Agile Big Data ModelHadoop meets Agile! - An Agile Big Data Model
Hadoop meets Agile! - An Agile Big Data ModelUwe Printz
 
Architecting Applications with Hadoop
Architecting Applications with HadoopArchitecting Applications with Hadoop
Architecting Applications with Hadoopmarkgrover
 
Hive on kafka
Hive on kafkaHive on kafka
Hive on kafkaSzehon Ho
 
The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...
The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...
The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...lucenerevolution
 
Bikas saha:the next generation of hadoop– hadoop 2 and yarn
Bikas saha:the next generation of hadoop– hadoop 2 and yarnBikas saha:the next generation of hadoop– hadoop 2 and yarn
Bikas saha:the next generation of hadoop– hadoop 2 and yarnhdhappy001
 
Hadoop distributions - ecosystem
Hadoop distributions - ecosystemHadoop distributions - ecosystem
Hadoop distributions - ecosystemJakub Stransky
 
Application architectures with Hadoop – Big Data TechCon 2014
Application architectures with Hadoop – Big Data TechCon 2014Application architectures with Hadoop – Big Data TechCon 2014
Application architectures with Hadoop – Big Data TechCon 2014hadooparchbook
 
Hadoop 2 - Beyond MapReduce
Hadoop 2 - Beyond MapReduceHadoop 2 - Beyond MapReduce
Hadoop 2 - Beyond MapReduceUwe Printz
 
Securing Spark Applications by Kostas Sakellis and Marcelo Vanzin
Securing Spark Applications by Kostas Sakellis and Marcelo VanzinSecuring Spark Applications by Kostas Sakellis and Marcelo Vanzin
Securing Spark Applications by Kostas Sakellis and Marcelo VanzinSpark Summit
 
Apache Falcon : 22 Sept 2014 for Hadoop User Group France (@Criteo)
Apache Falcon : 22 Sept 2014 for Hadoop User Group France (@Criteo)Apache Falcon : 22 Sept 2014 for Hadoop User Group France (@Criteo)
Apache Falcon : 22 Sept 2014 for Hadoop User Group France (@Criteo)Cedric CARBONE
 
A brave new world in mutable big data relational storage (Strata NYC 2017)
A brave new world in mutable big data  relational storage (Strata NYC 2017)A brave new world in mutable big data  relational storage (Strata NYC 2017)
A brave new world in mutable big data relational storage (Strata NYC 2017)Todd Lipcon
 
NYC HUG - Application Architectures with Apache Hadoop
NYC HUG - Application Architectures with Apache HadoopNYC HUG - Application Architectures with Apache Hadoop
NYC HUG - Application Architectures with Apache Hadoopmarkgrover
 

What's hot (20)

Welcome to Hadoop2Land!
Welcome to Hadoop2Land!Welcome to Hadoop2Land!
Welcome to Hadoop2Land!
 
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for ImpalaSQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impala
 
Cloudera Impala - Las Vegas Big Data Meetup Nov 5th 2014
Cloudera Impala - Las Vegas Big Data Meetup Nov 5th 2014Cloudera Impala - Las Vegas Big Data Meetup Nov 5th 2014
Cloudera Impala - Las Vegas Big Data Meetup Nov 5th 2014
 
Hadoop meets Agile! - An Agile Big Data Model
Hadoop meets Agile! - An Agile Big Data ModelHadoop meets Agile! - An Agile Big Data Model
Hadoop meets Agile! - An Agile Big Data Model
 
Architecting Applications with Hadoop
Architecting Applications with HadoopArchitecting Applications with Hadoop
Architecting Applications with Hadoop
 
Kudu Cloudera Meetup Paris
Kudu Cloudera Meetup ParisKudu Cloudera Meetup Paris
Kudu Cloudera Meetup Paris
 
Hive on kafka
Hive on kafkaHive on kafka
Hive on kafka
 
The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...
The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...
The Search Is Over: Integrating Solr and Hadoop in the Same Cluster to Simpli...
 
Bikas saha:the next generation of hadoop– hadoop 2 and yarn
Bikas saha:the next generation of hadoop– hadoop 2 and yarnBikas saha:the next generation of hadoop– hadoop 2 and yarn
Bikas saha:the next generation of hadoop– hadoop 2 and yarn
 
Envelope
Envelope Envelope
Envelope
 
Spark Uber Development Kit
Spark Uber Development KitSpark Uber Development Kit
Spark Uber Development Kit
 
Hadoop distributions - ecosystem
Hadoop distributions - ecosystemHadoop distributions - ecosystem
Hadoop distributions - ecosystem
 
Application architectures with Hadoop – Big Data TechCon 2014
Application architectures with Hadoop – Big Data TechCon 2014Application architectures with Hadoop – Big Data TechCon 2014
Application architectures with Hadoop – Big Data TechCon 2014
 
Impala for PhillyDB Meetup
Impala for PhillyDB MeetupImpala for PhillyDB Meetup
Impala for PhillyDB Meetup
 
Hadoop 2 - Beyond MapReduce
Hadoop 2 - Beyond MapReduceHadoop 2 - Beyond MapReduce
Hadoop 2 - Beyond MapReduce
 
Securing Spark Applications by Kostas Sakellis and Marcelo Vanzin
Securing Spark Applications by Kostas Sakellis and Marcelo VanzinSecuring Spark Applications by Kostas Sakellis and Marcelo Vanzin
Securing Spark Applications by Kostas Sakellis and Marcelo Vanzin
 
Deep Learning using Spark and DL4J for fun and profit
Deep Learning using Spark and DL4J for fun and profitDeep Learning using Spark and DL4J for fun and profit
Deep Learning using Spark and DL4J for fun and profit
 
Apache Falcon : 22 Sept 2014 for Hadoop User Group France (@Criteo)
Apache Falcon : 22 Sept 2014 for Hadoop User Group France (@Criteo)Apache Falcon : 22 Sept 2014 for Hadoop User Group France (@Criteo)
Apache Falcon : 22 Sept 2014 for Hadoop User Group France (@Criteo)
 
A brave new world in mutable big data relational storage (Strata NYC 2017)
A brave new world in mutable big data  relational storage (Strata NYC 2017)A brave new world in mutable big data  relational storage (Strata NYC 2017)
A brave new world in mutable big data relational storage (Strata NYC 2017)
 
NYC HUG - Application Architectures with Apache Hadoop
NYC HUG - Application Architectures with Apache HadoopNYC HUG - Application Architectures with Apache Hadoop
NYC HUG - Application Architectures with Apache Hadoop
 

Similar to Cloudera Search Webinar: Big Data Search, Bigger Insights

AWS Partner Webcast - Hadoop in the Cloud: Unlocking the Potential of Big Dat...
AWS Partner Webcast - Hadoop in the Cloud: Unlocking the Potential of Big Dat...AWS Partner Webcast - Hadoop in the Cloud: Unlocking the Potential of Big Dat...
AWS Partner Webcast - Hadoop in the Cloud: Unlocking the Potential of Big Dat...Amazon Web Services
 
Big Data Retrospective - STL Big Data IDEA Jan 2019
Big Data Retrospective - STL Big Data IDEA Jan 2019Big Data Retrospective - STL Big Data IDEA Jan 2019
Big Data Retrospective - STL Big Data IDEA Jan 2019Adam Doyle
 
Hitachi Data Systems Hadoop Solution
Hitachi Data Systems Hadoop SolutionHitachi Data Systems Hadoop Solution
Hitachi Data Systems Hadoop SolutionHitachi Vantara
 
What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18
What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18
What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18Cloudera, Inc.
 
5 Comparing Microsoft Big Data Technologies for Analytics
5 Comparing Microsoft Big Data Technologies for Analytics5 Comparing Microsoft Big Data Technologies for Analytics
5 Comparing Microsoft Big Data Technologies for AnalyticsJen Stirrup
 
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017Stefan Lipp
 
Elasticsearch + Cascading for Scalable Log Processing
Elasticsearch + Cascading for Scalable Log ProcessingElasticsearch + Cascading for Scalable Log Processing
Elasticsearch + Cascading for Scalable Log ProcessingCascading
 
Govern This! Data Discovery and the application of data governance with new s...
Govern This! Data Discovery and the application of data governance with new s...Govern This! Data Discovery and the application of data governance with new s...
Govern This! Data Discovery and the application of data governance with new s...Cloudera, Inc.
 
Ankus, bigdata deployment and orchestration framework
Ankus, bigdata deployment and orchestration frameworkAnkus, bigdata deployment and orchestration framework
Ankus, bigdata deployment and orchestration frameworkAshrith Mekala
 
QuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing WebinarQuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing WebinarRTTS
 
大数据数据治理及数据安全
大数据数据治理及数据安全大数据数据治理及数据安全
大数据数据治理及数据安全Jianwei Li
 
Whither the Hadoop Developer Experience, June Hadoop Meetup, Nitin Motgi
Whither the Hadoop Developer Experience, June Hadoop Meetup, Nitin MotgiWhither the Hadoop Developer Experience, June Hadoop Meetup, Nitin Motgi
Whither the Hadoop Developer Experience, June Hadoop Meetup, Nitin MotgiFelicia Haggarty
 
Data Warehouse Optimization
Data Warehouse OptimizationData Warehouse Optimization
Data Warehouse OptimizationCloudera, Inc.
 
Simple, Modular and Extensible Big Data Platform Concept
Simple, Modular and Extensible Big Data Platform ConceptSimple, Modular and Extensible Big Data Platform Concept
Simple, Modular and Extensible Big Data Platform ConceptSatish Mohan
 
Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...
Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...
Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...Cloudera, Inc.
 
Case Study: Implementing Hadoop and Elastic Map Reduce on Scale-out Object S...
Case Study: Implementing Hadoop and Elastic Map Reduce on Scale-out Object S...Case Study: Implementing Hadoop and Elastic Map Reduce on Scale-out Object S...
Case Study: Implementing Hadoop and Elastic Map Reduce on Scale-out Object S...Cloudian
 
Solr + Hadoop: Interactive Search for Hadoop
Solr + Hadoop: Interactive Search for HadoopSolr + Hadoop: Interactive Search for Hadoop
Solr + Hadoop: Interactive Search for Hadoopgregchanan
 

Similar to Cloudera Search Webinar: Big Data Search, Bigger Insights (20)

AWS Partner Webcast - Hadoop in the Cloud: Unlocking the Potential of Big Dat...
AWS Partner Webcast - Hadoop in the Cloud: Unlocking the Potential of Big Dat...AWS Partner Webcast - Hadoop in the Cloud: Unlocking the Potential of Big Dat...
AWS Partner Webcast - Hadoop in the Cloud: Unlocking the Potential of Big Dat...
 
Big Data Retrospective - STL Big Data IDEA Jan 2019
Big Data Retrospective - STL Big Data IDEA Jan 2019Big Data Retrospective - STL Big Data IDEA Jan 2019
Big Data Retrospective - STL Big Data IDEA Jan 2019
 
Hortonworks.bdb
Hortonworks.bdbHortonworks.bdb
Hortonworks.bdb
 
Hitachi Data Systems Hadoop Solution
Hitachi Data Systems Hadoop SolutionHitachi Data Systems Hadoop Solution
Hitachi Data Systems Hadoop Solution
 
What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18
What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18
What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18
 
5 Comparing Microsoft Big Data Technologies for Analytics
5 Comparing Microsoft Big Data Technologies for Analytics5 Comparing Microsoft Big Data Technologies for Analytics
5 Comparing Microsoft Big Data Technologies for Analytics
 
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
 
CC -Unit4.pptx
CC -Unit4.pptxCC -Unit4.pptx
CC -Unit4.pptx
 
Elasticsearch + Cascading for Scalable Log Processing
Elasticsearch + Cascading for Scalable Log ProcessingElasticsearch + Cascading for Scalable Log Processing
Elasticsearch + Cascading for Scalable Log Processing
 
Govern This! Data Discovery and the application of data governance with new s...
Govern This! Data Discovery and the application of data governance with new s...Govern This! Data Discovery and the application of data governance with new s...
Govern This! Data Discovery and the application of data governance with new s...
 
Ankus, bigdata deployment and orchestration framework
Ankus, bigdata deployment and orchestration frameworkAnkus, bigdata deployment and orchestration framework
Ankus, bigdata deployment and orchestration framework
 
QuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing WebinarQuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing Webinar
 
大数据数据治理及数据安全
大数据数据治理及数据安全大数据数据治理及数据安全
大数据数据治理及数据安全
 
Twitter with hadoop for oow
Twitter with hadoop for oowTwitter with hadoop for oow
Twitter with hadoop for oow
 
Whither the Hadoop Developer Experience, June Hadoop Meetup, Nitin Motgi
Whither the Hadoop Developer Experience, June Hadoop Meetup, Nitin MotgiWhither the Hadoop Developer Experience, June Hadoop Meetup, Nitin Motgi
Whither the Hadoop Developer Experience, June Hadoop Meetup, Nitin Motgi
 
Data Warehouse Optimization
Data Warehouse OptimizationData Warehouse Optimization
Data Warehouse Optimization
 
Simple, Modular and Extensible Big Data Platform Concept
Simple, Modular and Extensible Big Data Platform ConceptSimple, Modular and Extensible Big Data Platform Concept
Simple, Modular and Extensible Big Data Platform Concept
 
Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...
Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...
Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...
 
Case Study: Implementing Hadoop and Elastic Map Reduce on Scale-out Object S...
Case Study: Implementing Hadoop and Elastic Map Reduce on Scale-out Object S...Case Study: Implementing Hadoop and Elastic Map Reduce on Scale-out Object S...
Case Study: Implementing Hadoop and Elastic Map Reduce on Scale-out Object S...
 
Solr + Hadoop: Interactive Search for Hadoop
Solr + Hadoop: Interactive Search for HadoopSolr + Hadoop: Interactive Search for Hadoop
Solr + Hadoop: Interactive Search for Hadoop
 

More from Cloudera, Inc.

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxCloudera, Inc.
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera, Inc.
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards FinalistsCloudera, Inc.
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Cloudera, Inc.
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Cloudera, Inc.
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Cloudera, Inc.
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Cloudera, Inc.
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Cloudera, Inc.
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Cloudera, Inc.
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Cloudera, Inc.
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Cloudera, Inc.
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Cloudera, Inc.
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformCloudera, Inc.
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Cloudera, Inc.
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Cloudera, Inc.
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Cloudera, Inc.
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Cloudera, Inc.
 

Cloudera Search Webinar: Big Data Search, Bigger Insights

  • 1. Cloudera Search: Embracing Apache Solr into Cloudera’s Platform for Big Data. Eva Andreasson, Sr. Product Manager, Cloudera; Steven Noels, Co-founder and SVP of Products, NGDATA
  • 2. Who is Cloudera? What the Enterprise Requires: § Only 100% open source Hadoop-based platform with both batch and real-time processing engines, enterprise-ready with native high availability § Suite of system and data management software § Comprehensive support and consulting services § Broadest Hadoop training and certification programs. Extensive Partner Ecosystem: § Over 600 partners across hardware, software and services. The Leader in Big Data Management: § Deliver a revolutionary data management platform powered by Apache Hadoop § World’s leading commercial vendor of Apache Hadoop § Enable organizations to improve operational efficiency and Ask Bigger Questions of all their data. Customers & Users Across Industries: § More production deployments than all other vendors combined
  • 3. Cloudera Enterprise: A revolutionary solution powered by Apache Hadoop. BRINGS STORAGE & COMPUTE TOGETHER; WORKS WITH EVERY TYPE OF DATA; CHANGES THE ECONOMICS OF DATA MANAGEMENT. Diagram labels: INGEST, STORE, EXPLORE, PROCESS, ANALYZE, SERVE; CDH, CLOUDERA MANAGER, CLOUDERA SUPPORT, CLOUDERA NAVIGATOR
  • 4. About NGDATA: A Next Generation Customer Intelligence Company. NGDATA is the next generation Customer Intelligence company that enables actionable customer insights, personalized product offers and intimate customer experience with a unique combination of interactive Big Data management and machine learning technologies in one integrated solution. Diagram labels (VISION & EXPERTISE, SOLUTION): Business Expertise; Enterprise Architectures; Big Data Technology; Machine Learning, Algorithms, Analytics; Customer Intelligence; Customer Database; Enterprise Data; Reference Data; Customer Data; Customer Engagement; Governance and Risk Management; Insights, Trends and Analysis; lily
  • 5. Agenda: § Why Search? § What is Cloudera Search? § Using Cloudera Search § Learn more
  • 7. Cloudera’s Enterprise Strategy: An Integrated Part of the Hadoop System. One pool of data; One security framework; One set of system resources; One management interface
  • 8. Search Simplifies Interaction: Explore, Navigate, Correlate. Experts know MapReduce. Savvy people know SQL. Everyone knows Search.
  • 9. Benefits of Search. Improved Big Data ROI: • An interactive experience without technical knowledge • Single data set for multiple computing frameworks. Faster time to insight: • Exploratory analysis, esp. unstructured data • Broad range of indexing options to accommodate needs. Cost efficiency: • Single scalable platform; no incremental investment • No need for separate systems, storage. Solid foundations and reliability: • Solr in production environments for years • Hadoop-powered reliability and scalability
  • 10. What is Cloudera Search?
  • 11. Cloudera Search. Interactive search for Hadoop: • Full-text and faceted navigation • Batch, near real-time, and on-demand indexing. Apache Solr integrated with CDH: • Established, mature search with vibrant community • Separate runtime like MapReduce, Impala • Incorporated as part of the Hadoop ecosystem. Open Source: • 100% Apache, 100% Solr • Standard Solr APIs
  • 12. Scalable and Robust Index Storage. Solr and HDFS: • Scalable, cost-efficient index storage • Higher availability • Search and process data in one platform. Diagram labels: HDFS, Lucene, Extraction, Mapping, Solr, Zookeeper, SolrCloud, Querying API, Indexing API
  • 13. Near Real Time Indexing at Ingest. Solr and Flume: • Data ingest at scale • Flexible extraction and mapping • Indexing at data ingest • Document-level ACL. Diagram labels: Log File, Flume Agent, Indexer, HDFS, Other
  • 14. Streamlined Extraction and Mapping. Cloudera Morphlines: • Simple and flexible data transformation • Reusable across multiple index workloads • Over time, extend and re-use across platform workloads. Diagram labels: syslog, Flume Agent, Solr sink, Command: readLine, Command: grok, Command: loadSolr, Solr; Event, Record, Record, Record, Document
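The readLine, grok and loadSolr commands named on this slide form a chain in which each step hands a record to the next. The following is only a rough Python analogy of that idea, not the Morphlines API or its configuration syntax: the function names, the simplified syslog pattern and the in-memory list standing in for Solr are all assumptions made for illustration.

```python
import re
from typing import Callable, Dict, Iterable, List, Optional

Record = Dict[str, str]
# A command takes a record and returns a transformed record, or None to drop it.
Command = Callable[[Record], Optional[Record]]

# Hypothetical pattern for a simplified syslog line: "<host> <program>: <text>".
SYSLOG_PATTERN = re.compile(r"^(?P<host>\S+) (?P<program>[^:\s]+): (?P<text>.*)$")


def read_line(raw: str) -> Record:
    """Stand-in for a readLine step: wrap one raw line in a record."""
    return {"message": raw.rstrip("\n")}


def grok(record: Record) -> Optional[Record]:
    """Stand-in for a grok step: extract named fields with a regex."""
    match = SYSLOG_PATTERN.match(record["message"])
    if match is None:
        return None  # drop lines that do not match the pattern
    return {**record, **match.groupdict()}


def make_load_solr(index: List[Record]) -> Command:
    """Stand-in for a loadSolr step: append the document to an in-memory list."""
    def load_solr(record: Record) -> Optional[Record]:
        index.append(record)
        return record
    return load_solr


def run_pipeline(lines: Iterable[str], commands: List[Command]) -> int:
    """Push every line through the chain; return how many reached the end."""
    loaded = 0
    for raw in lines:
        record: Optional[Record] = read_line(raw)
        for command in commands:
            record = command(record)
            if record is None:
                break  # a command dropped the record
        else:
            loaded += 1
    return loaded
```

Calling `run_pipeline(lines, [grok, make_load_solr(index)])` fills `index` with one dictionary per line that matched the pattern; swapping or adding commands changes the transformation without touching the driver loop, which is the reuse property the slide emphasizes.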
  • 15. Scalable Batch Indexing. Solr and MapReduce: • Flexible, scalable batch indexing • Start serving new indices with no downtime • On-demand indexing, cost-efficient re-indexing. Diagram labels: HDFS, Files, Indexer, Index shard, Solr server
  • 16. Scalable Batch Indexing. Mapper: Parse input into indexable document (shown three times). Arbitrary reducing steps of indexing and merging. End-Reducer (shard 1): Index document; Index shard 1. End-Reducer (shard 2): Index document; Index shard 2
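The mapper and end-reducer roles on this slide can be sketched in a few lines of plain Python. This is a toy, single-process illustration under stated assumptions: the tab-separated input format, the CRC-based partitioner and the dictionary used as an "index shard" are all hypothetical, and the intermediate merging steps mentioned on the slide are left out.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, List, Set

Document = Dict[str, str]


def map_parse(line: str) -> Document:
    """Mapper: parse one tab-separated input line into an indexable document."""
    doc_id, text = line.split("\t", 1)
    return {"id": doc_id, "text": text}


def shard_for(doc_id: str, num_shards: int) -> int:
    """Partitioner: a stable hash, so an id always lands on the same shard."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def reduce_index(docs: Iterable[Document]) -> Dict[str, Set[str]]:
    """End-reducer: build a toy inverted index (term -> document ids) for one shard."""
    inverted: Dict[str, Set[str]] = defaultdict(set)
    for doc in docs:
        for term in doc["text"].lower().split():
            inverted[term].add(doc["id"])
    return dict(inverted)


def batch_index(lines: Iterable[str], num_shards: int) -> List[Dict[str, Set[str]]]:
    """Run the map, partition and reduce phases and return one index per shard."""
    buckets: Dict[int, List[Document]] = defaultdict(list)
    for line in lines:
        doc = map_parse(line)
        buckets[shard_for(doc["id"], num_shards)].append(doc)
    return [reduce_index(buckets[shard]) for shard in range(num_shards)]
```

Because each shard is built independently from its own bucket of documents, the reducers could run in parallel, which is the property that makes this style of batch indexing scale with the cluster.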
  • 17. Searchable Real-Time Data: Indexing HBase. Diagram labels: HDFS, HBase, interactive load, Indexer(s), Triggers on updates, Solr server (shown five times). Equation graphic labels: Search, +, =, planet-sized tabular data, immediate access & updates, fast & flexible information discovery, BIG DATA DATA MANAGEMENT
  • 18. Searchable Real-Time Data: HBase & Search. HBase SEP Triggers & Indexer: • HBase replication mechanism for reliable indexing • light-weight, zero impact on write performance • easy to set up & integrate • flexible, configuration-based mapping & content extraction. Many use cases: • indexes near-real-time HBase updates into Solr • fielded search on HBase columns • faceted search • query by example • datacube • secondary indexes
  • 19. Simple, Customizable Search Interface. Hue: • Simple UI • Navigated, faceted drill down • Customizable display • Full text search, standard Solr API and query language
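Since the slides stress that the standard Solr API and query language are exposed, a faceted full-text query can be expressed as an ordinary HTTP request to a Solr select handler. The sketch below only builds the request URL; the host, port and collection name are hypothetical placeholders, while `q`, `rows`, `wt`, `facet` and `facet.field` are standard Solr query parameters.

```python
from typing import List
from urllib.parse import urlencode


def build_facet_query(
    base_url: str,
    collection: str,
    text: str,
    facet_fields: List[str],
    rows: int = 10,
) -> str:
    """Build a Solr select URL for a full-text query with field facets."""
    params = [
        ("q", text),          # full-text query string
        ("rows", str(rows)),  # number of documents to return
        ("wt", "json"),       # response format
        ("facet", "true"),    # turn faceting on
    ]
    # One facet.field parameter per field to drill down on.
    params += [("facet.field", field) for field in facet_fields]
    return f"{base_url}/solr/{collection}/select?{urlencode(params)}"


# Hypothetical host and collection, for illustration only.
url = build_facet_query("http://localhost:8983", "logs", "disk full", ["host", "program"])
```

The resulting URL can be fetched with any HTTP client; a UI such as the one described on the slide would issue equivalent requests and render the facet counts as drill-down links.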
  • 20. Simplified Management. Cloudera Manager: • Install, configure, deploy Solr services on the cluster • Unified management and monitoring • Resource management
  • 22. Skybox: Scalable, efficient image search for analysis and process improvement. • Advanced parallel image processing on images stored in HDFS • Before: difficult to interactively evaluate image quality and correlate with satellite logs • Now: Index images and satellite logs at acquisition and on demand, interactively introspect image quality
  • 23. Explorys Medical: Event, exploration, and data correlation to meet SLAs. "Hadoop has been Explorys' center of gravity for data management since the company's inception. The addition of Search to Cloudera's platform expands its usability by supporting more workloads and reducing data movement between infrastructure systems. Deploying Cloudera Search supports Explorys' mission to help healthcare providers deliver better, more cost efficient care through fast, flexible data analysis." -- Michael Onders, SVP & CTO, Explorys
  • 24. Patterns and Predictions: Proactive healthcare for returning military veterans. • Identify patterns in social media and perform analytics on term usage to improve suicide predictive capability • Before: Social media data sets too large; traditional enterprise search • Now: Near real-time correlation of medical records, notes, social media; access for doctors and non-tech staff
  • 25. Questions • Ask on the Q&A tab • Recording will be available at cloudera.com • After webinar, inquire at: info@cloudera.com • Presenters contact info: eva@cloudera.com, stevenn@ngdata.com. Thank you for attending! Download Cloudera Search: cloudera.com/downloads. Learn more about Cloudera Search, powered by Solr: cloudera.com/search. Learn more about NGDATA and Lily: www.ngdata.com