31. Big Data is still a big problem for many companies.
● How do you collect, process and distribute it?
● How do you analyze it?
Hadoop promises an answer to these questions.
32. Hadoop
Apache Hadoop® is an open-source, Java-based framework for distributed storage and
processing of large data sets on commodity hardware. Hadoop enables businesses to quickly
gain insight from massive amounts of structured and unstructured data.
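To make "distributed processing" concrete: Hadoop's classic processing model is MapReduce. The following is a minimal single-process sketch of that model in plain Python, not the Hadoop API. The function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and each string in the input list merely stands in for a block of a file that a real cluster would store on a different node.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(block: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one block of text."""
    for word in block.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, as a framework would between phases."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(blocks: List[str]) -> Dict[str, int]:
    # Each block stands in for a chunk of a file held on a different node.
    emitted = (pair for block in blocks for pair in map_phase(block))
    return reduce_phase(shuffle(emitted))


if __name__ == "__main__":
    print(word_count(["big data is big", "data is everywhere"]))
```

On a real cluster the map calls would run in parallel next to the data, and the shuffle would move intermediate pairs across the network; here everything runs in one process to keep the idea visible.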
45. Business intelligence (BI) is a technology-driven process for analyzing data and presenting actionable
information to help corporate executives, business managers and other end users make more informed
business decisions.
53. What is Spark?
● Spark is a newer technology that can sit on top of the Hadoop Distributed File System (HDFS)
● It is characterized as “a fast and general engine for large-scale data processing.”
● Spark has three key features:
1. Speed: for iterative analysis such as logistic regression, Random Forests, or other advanced algorithms,
Spark has demonstrated up to a 100x increase in speed over Hadoop MapReduce, and it scales to hundreds of millions of rows.
2. Language support: Spark has native APIs for Java, Scala, and Python.
3. Generality: Spark integrates with SQL engines (Shark), machine learning (MLlib), and streaming
(Spark Streaming), and it can run under Hadoop's YARN cluster manager without requiring new software
to be installed on the cluster.
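Why iterative algorithms benefit so much can be sketched in plain Python. The example below is an illustrative, single-machine logistic regression trained by batch gradient descent; it is not Spark or MLlib code, and the names (train, predict, Row) and the toy data are assumptions made for the sketch. The point to notice is that every iteration makes a full pass over the same rows, so a system that keeps those rows in memory avoids re-reading them from disk each time.

```python
import math
from typing import List, Tuple

Row = Tuple[List[float], int]  # (features, label), where label is 0 or 1


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train(rows: List[Row], iterations: int = 200, lr: float = 0.5) -> List[float]:
    """Batch gradient descent for logistic regression on an in-memory dataset."""
    weights = [0.0] * len(rows[0][0])
    for _ in range(iterations):
        # Each iteration re-scans the whole dataset: this repeated pass is
        # what in-memory caching speeds up.
        gradient = [0.0] * len(weights)
        for features, label in rows:
            score = sum(w * x for w, x in zip(weights, features))
            error = sigmoid(score) - label
            for i, x in enumerate(features):
                gradient[i] += error * x
        weights = [w - lr * g / len(rows) for w, g in zip(weights, gradient)]
    return weights


def predict(weights: List[float], features: List[float]) -> int:
    score = sum(w * x for w, x in zip(weights, features))
    return 1 if sigmoid(score) >= 0.5 else 0


if __name__ == "__main__":
    # Toy data: first feature is a constant bias term, second is the signal.
    rows = [([1.0, -2.0], 0), ([1.0, -1.0], 0), ([1.0, 1.0], 1), ([1.0, 2.0], 1)]
    weights = train(rows)
    print(predict(weights, [1.0, 1.5]), predict(weights, [1.0, -1.5]))
```

With 200 iterations the dataset is read 200 times, which is why the cost of each read dominates the total running time at scale.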
64. Spark or Hadoop: Which Is the Best Big Data Framework?
● Hadoop, for many years, was the leading open source Big Data framework
● Since 2014, Spark has become the more popular of the two Apache Software Foundation tools.
● Spark does not include its own system for organizing files in a distributed way (a file system),
so it requires one provided by a third party. For this reason, many Big Data projects involve installing
Spark on top of Hadoop.
● Spark’s advanced analytics applications can make use of data stored using the Hadoop Distributed File
System (HDFS).
● Many of the big vendors (e.g., Cloudera) now offer Spark as well as Hadoop, so they are well placed
to advise companies on which is most suitable, on a job-by-job basis.
67. Top 6 Hadoop Vendors providing Big Data Solutions in Open Data Platform
69. Big Data Is a Big Market & Big Business: $50 Billion Market
by 2017
Big Data refers not only to the data itself but also to a set of
technologies that capture, store, manage and analyze large
and variable collections of data to solve complex problems.
95. 7 Ways Big Data Training Can
Change Your Organization
96. 1. Information Technology: Improving productivity with Big Data training
2. Product Development: Rethinking innovation across all stages of R&D
3. Finance: Training employees on big data platforms to handle financial modelling
4. Human Resources: Redefining HR employee capabilities
5. Supply Chain & Logistics: Training delivery teams on big data platforms
6. Operations, Support & Customer Service: Employee training on big data at every customer interaction
7. Marketing: Training employees on a systematic marketing approach with big data