The document provides information about Ahmed Banafa's background and experience, including extensive experience in operations and management with a research background in various techniques and analysis. It also lists that he has taught at several universities and has received several awards for his work. It provides a brief introduction to big data and defines it as large and complex structured and unstructured data that cannot be processed by traditional database tools. It also discusses some of the roots and key aspects of big data like volume, velocity, and variety.
2. Extensive experience in operations and management, with
research background in a variety of techniques and analysis.
Taught at several universities and colleges, including the
University of California, Berkeley, California State University-
East Bay, San Jose State University and University of
Massachusetts.
Recipient of several awards, including Distinguished Tenured
Staff Award of 2013, Business Program Instructor of the year for
2013 and 2014 and the Parthenon award for best instructor in
2012, 2010 and 2003, and Certificate of Honor for instructor of
the year from the City and County of San Francisco.
Included in the 2000 to 2001 "Who’s Who in Finance and
Industry."
Ahmed Banafa
14. Big Data?
The simplest definition of big data is large and complex
structured and unstructured data (images posted on
Facebook, email, text messages, GPS signals from mobile
phones, tweets, and other social media updates, etc.)
that cannot be processed by traditional database tools.
18. Starting from the basics
Statistics is using numbers to quantify the data.
Data mining is using statistics and programming
languages to find patterns hidden in the data.
Machine learning uses data mining to build models
to predict future outcomes.
Artificial intelligence uses models built by machine
learning to make machines act in an intelligent way
like playing a game or driving a car (e.g., IBM’s Watson
supercomputer and the driverless car by Google).
19. Big data analytics is the process of studying big data
to uncover hidden patterns and correlations to make
better decisions using technologies like NoSQL
databases, Hadoop, and MapReduce. The main goal of
big data analytics is to help organizations make better
business decisions.
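To make the MapReduce idea mentioned above concrete, here is a minimal single-machine sketch of the pattern: a map step emits key/value pairs, a shuffle step groups them by key, and a reduce step aggregates each group. This is not Hadoop's API. The function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample documents are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values under their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: collapse each key's values into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    """Run map, shuffle and reduce in sequence over all documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return reduce_phase(shuffle_phase(mapped))


if __name__ == "__main__":
    # Hypothetical input; a real cluster would split this across many servers.
    docs = ["big data is big", "data is everywhere"]
    print(word_count(docs))
```

In a real cluster the map and reduce steps run in parallel on different servers, which is where the speed on large data sets comes from. The sketch only shows the shape of the computation.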
21. Volume. Unstructured data streaming in from social
media. Increasing amounts of sensor and machine-to-
machine data being collected.
Velocity. Data is streaming in at unprecedented speed
and must be dealt with in a timely manner.
Variety. Data today comes in all types of formats,
from structured, numeric data in traditional databases
to information created from line-of-business
applications.
22. Big Data Analytics 3.0
Analytics 1.0: Business intelligence (BI).
Analytics 2.0: Used by online companies only
(Google, Yahoo, Facebook, etc.).
Analytics 3.0: A new resolve to apply powerful data-
gathering and analysis methods not just to a
company’s operations but also to its offerings—to
embed data smartness into the products and services
customers buy.
23. Attributes of Analytics 3.0:
The most important trait is that not only online firms,
but virtually any type of firm in any industry, can
participate in the data-driven economy.
Multiple data types: Organizations are combining
large and small volumes of data, internal and external
sources, and structured and unstructured formats to
yield new insights in predictive and prescriptive
models.
24. Technologies and methods are much faster: Big data
technologies include a variety of hardware/software
architectures, including clustered parallel servers
using Hadoop/MapReduce, in-memory analytics, and
so forth. All of these technologies are considerably
faster than previous generations.
25. Integrated and embedded: built into consumer-
oriented products and features.
Data science/analytics/IT teams will work together.
Chief analytics officers (CAO) are new leadership
positions.
27. Prescriptive analytics: There have always been three
types of analytics: descriptive, which reports on the past;
predictive, which uses models based on past data to
predict the future; and prescriptive, which uses models to
specify optimal behaviors and actions. Analytics 3.0
includes all types, but there is an increased emphasis
on prescriptive analytics.
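The difference between the three types of analytics can be sketched on a toy example. Everything below is an assumption made for illustration: the monthly sales figures, the straight-line trend used as the predictive model, and the margin and holding-cost values used to pick a stock level. Real predictive and prescriptive models are usually far richer.

```python
from statistics import mean

# Hypothetical units sold in four consecutive months.
sales = [100, 110, 120, 130]


def describe(history):
    """Descriptive: report on what has already happened."""
    return {"mean": mean(history), "last": history[-1]}


def predict_next(history):
    """Predictive: fit a least-squares line and extrapolate one period."""
    n = len(history)
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(history)
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, history))
        / sum((x - x_bar) ** 2 for x in xs)
    )
    intercept = y_bar - slope * x_bar
    return intercept + slope * n


def prescribe(forecast, stock_options, unit_margin=5.0, unit_holding_cost=2.0):
    """Prescriptive: choose the stock level with the best expected profit."""
    def profit(stock):
        sold = min(stock, forecast)
        unsold = max(stock - forecast, 0)
        return sold * unit_margin - unsold * unit_holding_cost

    return max(stock_options, key=profit)


if __name__ == "__main__":
    forecast = predict_next(sales)
    print(describe(sales))
    print(forecast)
    print(prescribe(forecast, [120, 140, 160]))
```

The descriptive step only summarizes the past, the predictive step produces a forecast, and the prescriptive step turns that forecast into a recommended action.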
28. Old and New!
Google announced acquisition of Nest (smart home
devices), a source of massive data from homes all over
the United States, confirming the direction of Analytics
3.0 by an online company at the leading edge of
Analytics 2.0.
31. Dark Data
Gartner defines dark data as the information assets
organizations collect, process and store during regular
business activities, but generally fail to use for other
purposes (for example, analytics, business
relationships and direct monetizing).
IDC stated that up to 90 percent of big data is dark
data.
32. Similar to dark matter in physics, dark data often
comprises most organizations’ universe of information
assets. Thus, organizations often retain dark data for
compliance purposes only. Storing and securing data
typically incurs more expense (and sometimes greater
risk) than value.
33. Dark data is a type of unstructured, untagged and
untapped data that is found in data repositories and
has not been analyzed or processed. It is similar to big
data but differs in how it is mostly neglected by
business and IT administrators in terms of its value.
Dark data is also known as dusty data.
34. Dark data, unlike dark matter, can be brought to light
and so can its potential ROI. And what’s more, a
simple way of thinking about what to do with the data
(through a cost-benefit analysis) can remove the
complexity surrounding the previously mysterious
dark data.
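A cost-benefit analysis of this kind can be sketched in a few lines. The cost categories, the field names and every number below are hypothetical and are not taken from the source. The point is only the comparison: analyze a dark data set when its expected benefit exceeds the cost of keeping and analyzing it.

```python
from dataclasses import dataclass


@dataclass
class DarkDataSet:
    """Hypothetical yearly figures for one dark data repository."""
    name: str
    storage_cost: float      # cost of keeping the data
    security_cost: float     # cost of securing it
    analysis_cost: float     # one-off cost of analyzing it
    expected_benefit: float  # estimated value of the resulting insights

    def net_value(self):
        """Expected benefit minus all costs of keeping and using the data."""
        total_cost = self.storage_cost + self.security_cost + self.analysis_cost
        return self.expected_benefit - total_cost

    def worth_analyzing(self):
        """True when analysis is expected to pay for itself."""
        return self.net_value() > 0


if __name__ == "__main__":
    # Illustrative values only.
    logs = DarkDataSet("server logs", 10_000, 5_000, 20_000, 50_000)
    scans = DarkDataSet("old scans", 8_000, 4_000, 30_000, 25_000)
    for data_set in (logs, scans):
        print(data_set.name, data_set.net_value(), data_set.worth_analyzing())
```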
36. Big Data as a Service: the next big thing?
Big data as a service (BDaaS) is a term typically used
to refer to services that offer analysis of large or
complex data sets using cloud-hosted services.
It is similar to software as a service (SaaS) and
infrastructure as a service (IaaS): BDaaS options
help businesses handle what the IT world calls big
data, the sophisticated aggregated data sets that
provide a lot of value for today’s companies.
38. Network Security Needs Big Data
ZTM: The "zero trust model" is an aggressive model of
network security that monitors every piece of data
possible, assuming that every file is a potential threat.
The convergence of Big Data and Network Security is a
direct product of “Applied Big Data,” and it’s a prime
example of using analytics technologies to tackle a
current business problem such as cyberattacks.
39. Google
They process 3.5 billion requests per day, and each
request queries a database of 20 billion web pages.
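To put the daily figure in perspective, it can be converted to an average rate per second. The short calculation below uses only the 3.5 billion requests per day quoted above and assumes the load is spread evenly over the day, which real traffic is not.

```python
REQUESTS_PER_DAY = 3_500_000_000   # figure quoted on the slide
SECONDS_PER_DAY = 24 * 60 * 60     # 86,400 seconds

# Average rate, assuming requests arrive evenly across the day.
requests_per_second = REQUESTS_PER_DAY / SECONDS_PER_DAY

if __name__ == "__main__":
    print(f"{requests_per_second:,.0f} requests per second on average")
```

That works out to roughly 40,500 requests per second on average.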
40. Amazon
Amazon has recently obtained a patent on a system
designed to ship goods to us before we have even
decided to buy them: predictive despatch.