Overview of Big Data Analytics
3. Big Data
Data so large that it becomes difficult to
process using traditional systems
4. Characteristics of Big Data
Velocity
• Data speed
Volume
• Data quantity
Variety
• Data types
5. Volume refers to the huge amount of data
being generated every minute.
90% of the data we have now was
created in just the past 2 years.
Cisco estimates internet traffic at
4.8 ZB per year.
3 billion people were projected to be online by
2015.
Volume
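To give the 4.8 ZB per year figure a more tangible scale, here is a rough conversion to a per-second rate. This is only a back-of-the-envelope sketch: it assumes the decimal definition of a zettabyte (10^21 bytes) and a 365-day year, and the function name bytes_per_second is made up for illustration.

```python
# Back-of-the-envelope conversion of the 4.8 ZB/year traffic estimate.
# Assumptions (not from the slides): 1 ZB = 10**21 bytes, 365-day year.
ZETTABYTE = 10**21
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def bytes_per_second(zettabytes_per_year: float) -> float:
    """Convert a yearly volume in zettabytes to an average byte rate."""
    return zettabytes_per_year * ZETTABYTE / SECONDS_PER_YEAR


rate = bytes_per_second(4.8)
print(f"{rate / 10**12:.0f} TB per second")  # prints "152 TB per second"
```

Under these assumptions the estimate works out to roughly 152 terabytes every second on average.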
6. Velocity refers to the speed at which
new data is generated and moves
around.
It includes real-time
systems such as online banking,
which need low response times.
Velocity
7. Variety refers to the various
data types we can now use.
Earlier, the focus was on neat,
structured data kept in the form of
tables in an RDBMS.
80% of the data available now is
unstructured:
data in the form of text,
videos, audio, and pictures.
Variety
12. Why Big Data
– Increase of storage capacities
– Increase of processing power
– Availability of data
– 90% of the data in the world today has been
created in the last two years alone
13. Big Data Analytics
• Examining large amounts of data
• Extracting appropriate information
• Gaining competitive advantage
• Making better business decisions
• Effective marketing, customer satisfaction, increased
revenue
14. Applications for Big Data Analytics
• Multi-channel sales
• Smarter Healthcare
• Finance
• Log Analysis
• Traffic Control
• Search Quality
• Manufacturing
• Trading Analytics
• Telecom
15. NoSQL: non-relational, or at least non-SQL, database
solutions such as HBase (also part of the Hadoop
ecosystem), Cassandra, MongoDB, Riak, CouchDB, and
many others.
Hadoop: an ecosystem of software packages,
including MapReduce, HDFS, and a whole host of other
software packages
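To make the "non-relational" idea in the NoSQL definition above a little more concrete, the sketch below contrasts a fixed-column row with schema-less documents. It uses plain Python data structures only and does not call any real database API; the records, field names, and the helper fields_used are invented for illustration.

```python
import json

# Relational style: every row must fit the same fixed set of columns.
COLUMNS = ("id", "name", "email")
rows = [(1, "Asha", "asha@example.com")]

# Document style (as in many NoSQL stores): each record describes itself
# and may carry different, even nested, fields.
documents = [
    {"id": 1, "name": "Asha", "email": "asha@example.com"},
    {"id": 2, "name": "Ravi", "tags": ["video", "audio"], "profile": {"city": "Pune"}},
]


def fields_used(docs):
    """Return the set of top-level field names seen across all documents."""
    return {key for doc in docs for key in doc}


print(sorted(fields_used(documents)))
print(json.dumps(documents[1]))
```

The second document adds fields the first one lacks without any schema change, which is the flexibility that makes such stores a natural fit for varied data.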
16. What is Hadoop?
Hadoop is an open-source,
Java-based programming framework
for processing and storing large data
sets
in a distributed computing environment.
17. References
• http://searchbusinessanalytics.techtarget.com/
Experts sound off on big data, analytics and its tools
• http://www.ibmbigdatahub.com/infographic/four-vs-big-data
Big data and analytics hub
• https://bigdatauniversity.com/bdu-wp/bdu-course/hadoop-fundamentals-i-version-3/
Hadoop fundamentals
Editor's Notes
Explain well.
Quote practical examples
NoSQL: an approach to data management and database design that's useful for very large sets of distributed data.
Hadoop: a free, Java-based programming framework that supports the processing of large data sets in a distributed computing environment.
MapReduce: a software framework that allows developers to write programs that process massive amounts of unstructured data in parallel across a distributed cluster of processors or stand-alone computers.
Map is a function that parcels out work to different nodes in the distributed cluster. Reduce is another function that collates the work and resolves the results into a single value.
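As a practical example of the Map and Reduce roles described above, here is a minimal word-count sketch. It runs in a single Python process and only imitates the pattern: in a real Hadoop cluster the chunks would live on different nodes and the framework would handle distribution and grouping. The function names map_phase, shuffle, reduce_phase, and word_count are illustrative and are not part of the Hadoop API.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(chunk: str) -> List[Tuple[str, int]]:
    """Map: turn one chunk of text into (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all emitted values by key, as the framework would between phases."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> int:
    """Reduce: collapse all values for one key into a single result."""
    return sum(values)


def word_count(chunks: List[str]) -> Dict[str, int]:
    # Each chunk stands in for the slice of data one node would process.
    mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
    grouped = shuffle(mapped)
    return {key: reduce_phase(key, values) for key, values in grouped.items()}


chunks = ["big data is big", "data moves fast"]
counts = word_count(chunks)
print(counts)  # {'big': 2, 'data': 2, 'is': 1, 'moves': 1, 'fast': 1}
```

Because each call to map_phase depends only on its own chunk, the map step can run in parallel on separate machines; only the grouping step needs to bring values for the same key together.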