Driving Behavioral Change for Information Management through Data-Driven Gree...
Hadoop a Natural Choice for Data Intensive Log Processing
1. Apache Hadoop A Natural Choice for Data Intensive Multiform at Log Processing Date: 22 nd April’ 2011 Authored and Compiled By: Hitendra Kumar
2.
3.
4. XML Logs CSV SQL Objects, JSONs Binary Hadoop Distributed File System (HDFS) M A P C R E A T I O N Reduce Commodity Server Cloud (Scale Out) Hadoop Environment RDBMS import Reporting Dash Boards BI Applications Enterprise High Volume Data In-Flow Map-Reduce Process Consume Results Hadoop Processing How it works? How it works?
16. Example System (Web Portal) Tera-Bytes of data being populated to centralized storage and processed, every week-end!
17.
18. DB Server J2EE Application Server HTTP HTTP DB Server J2EE Application Server Apache Web Server Tomcat mod_jk Plug-In JBOSS - J2EE Application JBOSS – Portal Web Service JBOSS – jBPM JBOSS - Portal HTTP JDBC Web Portal Servers (Apache + App Server) Web Portal Deployment Landscape Shrading Function Hadoop Processing Web Portal – Deployment Landscape Web Portal – Deployment Landscape DB LB LB DB
19. Example – AOL Advertising Platform http://www.cloudera.com/blog/2011/02/an-emerging-data-management-architectural-pattern-behind-interactive-web-application/