2. Why analysis
● Humans can count only so far
● We understand summarized information
● We understand graphs faster
● We need to make decisions
● Wrong decisions lead to huge costs
3. Central Tendency
● What is the difference between mean and median?
● When to use what?
● What is expected value?
● When can mean be misleading?
Exercise- What is the average height of this class?
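The questions above can be tried out with a minimal R sketch. The salary figures are made up for illustration; they show how a single extreme value pulls the mean away from the median. The expected value is shown for a fair six-sided die, which is also an assumed example.

```r
# Hypothetical salaries (in thousands); one very large value acts as an outlier
salaries <- c(30, 32, 35, 38, 400)

salary_mean   <- mean(salaries)    # pulled upwards by the outlier
salary_median <- median(salaries)  # the middle value, unaffected by the outlier

# Expected value = sum of (each outcome * its probability)
faces <- 1:6
probs <- rep(1 / 6, 6)             # fair die: every face is equally likely
expected_value <- sum(faces * probs)

print(c(mean = salary_mean, median = salary_median, expected = expected_value))
```

Here the mean (107) is far above what a typical person in the group earns (35), which is the kind of case where the median is the safer summary.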
4. Grouped Means
Exercise-
What is the average height of the class?
What is the average height of the class by gender?
What is the average height of the class by team?
What is the average height of the class by dark or light colored clothing?
Cross tabs- exercise on the mtcars dataset
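One possible way to work through this exercise in R is sketched below. The class data frame (`class_df`, with its heights, genders and teams) is invented for illustration; `mtcars` is the dataset that ships with R.

```r
# Hypothetical class data, for illustration only
class_df <- data.frame(
  height_ft = c(5.2, 5.6, 5.9, 5.4, 6.1, 5.0),
  gender    = c("F", "M", "M", "F", "M", "F"),
  team      = c("A", "A", "B", "B", "A", "B")
)

overall   <- mean(class_df$height_ft)                                 # one mean for everyone
by_gender <- tapply(class_df$height_ft, class_df$gender, mean)        # mean per gender
by_team   <- aggregate(height_ft ~ team, data = class_df, FUN = mean) # mean per team

# Cross tab on mtcars: number of cars for each cylinder/gear combination
cross <- table(cyl = mtcars$cyl, gear = mtcars$gear)

print(by_gender)
print(by_team)
print(cross)
```

tapply() returns a named vector, while aggregate() returns a data frame; either is a reasonable choice for grouped means.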
5. Variance
What is the range? (max - min)
What is a quartile? (cut points that split the data into 4 quarters)
What is a decile? (cut points that split the data into 10 equal groups)
Standard deviation is rarely used in the business world
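These measures of spread can be computed in R with a short sketch, using the `mpg` column of the built-in `mtcars` dataset as an assumed example variable.

```r
x <- mtcars$mpg

value_range <- max(x) - min(x)                          # range as max - min
quartiles   <- quantile(x, probs = c(0.25, 0.5, 0.75))  # 3 cut points, 4 quarters
deciles     <- quantile(x, probs = seq(0.1, 0.9, by = 0.1))  # 9 cut points, 10 groups

variance <- var(x)   # sample variance
std_dev  <- sd(x)    # standard deviation = square root of the variance

print(value_range)
print(quartiles)
print(deciles)
```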
6. Frequency Analysis
contingency tables
Height range         Number of students   Cumulative number
Less than 5.0 feet   25                   25
5.0–5.5 feet         35                   60
5.5–6.0 feet         20                   80
6.0–6.5 feet         20                   100

         Dance   Sports   TV   Total
Men      2       10       8    20
Women    16      6        8    30
Total    18      16       16   50
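The two tables above can be rebuilt in R as a rough sketch. The counts are the ones shown on the slide; the object names (`students`, `activity`) are just illustrative.

```r
# Frequency table: students per height band, plus the running (cumulative) total
students <- c("Less than 5.0 ft" = 25, "5.0-5.5 ft" = 35,
              "5.5-6.0 ft" = 20, "6.0-6.5 ft" = 20)
cumulative <- cumsum(students)

# Contingency table: gender by preferred activity
activity <- as.table(matrix(
  c(2, 10, 8,
    16, 6, 8),
  nrow = 2, byrow = TRUE,
  dimnames = list(Gender = c("Men", "Women"),
                  Activity = c("Dance", "Sports", "TV"))
))

with_totals <- addmargins(activity)               # adds row and column totals ("Sum")
row_share   <- prop.table(activity, margin = 1)   # share of each activity within a gender

print(cbind(students, cumulative))
print(with_totals)
print(round(row_share, 2))
```

With raw, one-row-per-person data, the same tables would normally come from cut() and table() rather than from typed-in counts.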
10. Analytics
• What is analytics?
• Where is it used?
• How is it used?
• What are some good practices?
11. Analytics
• What is analytics? – The study of data, using software, to help
with decision making
• Where is it used?
• How is it used?
• What are some good practices?
12. Analytics
• What is analytics?
• Where is it used? – In industries such as Pharma,
BFSI, Telecom and Retail
• How is it used? – By applying statistics and software
• What are some good practices?
13. Analytics
• What is analytics?
• Where is it used?
• How is it used?
• What are some good practices?
– Learn one more new thing than your
competition every day. This is a fast-moving field.
– Etc.
16. Social Media Analytics
Some examples
http://decisionstats.com/2013/12/04/top-fourteen-interfaces-in-social-media-and-web-analytics-on-the-internet/
Some use cases
http://decisionstats.com/2014/05/10/analyzing-facebook-networks-using-rstats/
http://decisionstats.com/2013/09/11/using-twitter-data-with-r/
17. What is R?
http://www.r-project.org/
• Language
– Object-oriented
– Open Source
– Free
– Widely used
Object-oriented programming is built on the concept of "objects" that have data fields (attributes that describe the object)
and associated procedures known as methods. Objects, which are
usually instances of classes, interact with one another to build
applications and computer programs
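A minimal sketch of what this looks like in R, assuming the simple S3 class system. The `student` class, its fields and the `new_student` helper are hypothetical names used only for illustration.

```r
# Constructor for a hypothetical "student" class
new_student <- function(name, height_ft) {
  obj <- list(name = name, height_ft = height_ft)  # data fields (attributes)
  class(obj) <- "student"                          # tag the object with its class
  obj
}

# A method: the format() generic dispatches here for objects of class "student"
format.student <- function(x, ...) {
  paste0(x$name, " (", x$height_ft, " ft)")
}

s <- new_student("Asha", 5.4)
s_class <- class(s)
s_label <- format(s)
print(s_label)
```

In S3, methods belong to generic functions such as format() or print() rather than to the object itself, which differs from languages like Java or Python.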
18. Prerequisites
• Installation of R
http://cran.rstudio.com/bin/windows/base/
• RStudio
• R Packages
19. Prerequisites
• Installation of R
– Rtools
• RStudio
http://www.rstudio.com/products/rstudio/download/
• R Packages
install.packages(),
update.packages(),
library()
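A rough sketch of how the three package functions are typically used. The package name "ggplot2" is only an illustrative choice, and the install and update calls are left commented out because they need internet access.

```r
# Install a package from CRAN only if it is missing ("ggplot2" is an example name)
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  # install.packages("ggplot2")   # uncomment to download and install
  message("ggplot2 is not installed")
}

# update.packages() upgrades installed packages; ask = FALSE skips the prompts
# update.packages(ask = FALSE)

# library() attaches an installed package so its functions can be called directly
library(stats)
pkg_attached <- "package:stats" %in% search()
print(pkg_attached)
```

A package is installed once per machine, but library() has to be called in every new R session.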
21. Demo-
Basic Objects on R Console
• +
• -
• log
• exp
• *
• /
• ()
Functions- ls() lists the objects in the workspace
rm("foo") removes the object named foo
Assignment
Using = or <- assigns values to object names (-> also works, with the value on the left)
Hint- The up arrow recalls the last
typed command
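The console demo on this slide might look roughly like the following; the object names `foo`, `bar` and `total` are placeholders.

```r
foo <- 10                  # assignment with <-
bar = 3                    # assignment with = also works at the top level
total <- (foo + bar) * 2   # brackets control the order of operations
ratio <- foo / bar
one   <- log(exp(1))       # log() is the natural log, so log(exp(1)) is 1

print(ls())                # names of the objects currently in the workspace
rm("foo")                  # remove the object named foo
foo_still_there <- exists("foo", inherits = FALSE)
print(foo_still_there)
```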
23. Functions and Loops
• Function
functionajay <- function(a) { a^2 + 2*a + 1 }
Hint: Always match brackets
Each ( deserves a )
Each { deserves a }
Each [ deserves a ]
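Since the slide title also mentions loops, here is a small sketch that calls the function from the slide inside a for loop. The loop and the `results` vector are additions for illustration, not part of the original slide.

```r
# The function from the slide, written with braces
functionajay <- function(a) {
  a^2 + 2 * a + 1   # algebraically the same as (a + 1)^2
}

single <- functionajay(3)

# A for loop that applies the function to 1, 2, ..., 5
results <- numeric(5)
for (i in 1:5) {
  results[i] <- functionajay(i)
}

# R functions built from arithmetic are vectorised, so the loop is optional
vectorised <- functionajay(1:5)

print(single)
print(results)
```

Note how every opening (, { and [ in the sketch has its matching closing bracket.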
24. Demo-
Basic Objects on R Console
• +
• -
• log
• exp
• *
This is made clearer in the
next slide
Functions- class() gives the class
dim() gives the dimensions
nrow() gives the number of rows
ncol() gives the number of columns
length() gives the length
str() gives the structure
Hint- The up arrow recalls the last
typed command
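As an illustration, these inspection functions can be tried on the built-in `mtcars` data frame (an assumed example object; any data frame would do).

```r
obj_class <- class(mtcars)    # "data.frame"
obj_dim   <- dim(mtcars)      # rows and columns together
n_rows    <- nrow(mtcars)
n_cols    <- ncol(mtcars)
obj_len   <- length(mtcars)   # for a data frame, length is the number of columns

str(mtcars)                   # compact overview: type and first values of each column
print(obj_dim)
```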
25. Demo-
Datasets on R Console
Hint- use data() to list the datasets
available in the loaded packages
26. Demo-
Datasets on R Console
Hint- use data() to list the datasets
available in the loaded packages
library(FOO) loads the package "FOO"
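A short sketch of working with built-in datasets, using `mtcars` from the `datasets` package as the example.

```r
library(datasets)   # normally attached by default; shown here for completeness

# Names of the datasets that ship with the datasets package
available <- data(package = "datasets")$results[, "Item"]
has_mtcars <- "mtcars" %in% available

data(mtcars)              # load mtcars into the workspace
first_rows <- head(mtcars, 3)
print(first_rows)
```

Calling data() with no arguments at the console opens a listing of all datasets in the currently attached packages.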
37. From Databases
The RODBC package provides access to databases through
an ODBC interface.
The primary functions are
• odbcConnect(dsn, uid="", pwd="") Open a connection
to an ODBC database
• sqlFetch(channel, sqltable) Read a table from an ODBC
database into a data frame
Hint- a good site to learn R
http://www.statmethods.net
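The two RODBC functions above could be combined roughly as follows. This is only a sketch: it needs the RODBC package and a configured ODBC data source, and `read_table_via_odbc`, "my_dsn" and "my_table" are hypothetical names.

```r
# Read one table from an ODBC data source into a data frame
read_table_via_odbc <- function(dsn, sqltable, uid = "", pwd = "") {
  if (!requireNamespace("RODBC", quietly = TRUE)) {
    stop("The RODBC package is not installed")
  }
  channel <- RODBC::odbcConnect(dsn, uid = uid, pwd = pwd)  # open the connection
  if (!inherits(channel, "RODBC")) {
    stop("Could not connect to data source: ", dsn)
  }
  on.exit(RODBC::odbcClose(channel))   # close the connection when the function exits
  RODBC::sqlFetch(channel, sqltable)   # read the table into a data frame
}

# Example call (not run here, since it needs a real data source):
# df <- read_table_via_odbc("my_dsn", "my_table")
```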
39. From Web (aka Web Scraping)
• readLines
Hint : R is case sensitive;
readlines is not the same as readLines
Hint : Use head() and tail() to inspect objects
Other packages are XML and RCurl
Case Study- http://decisionstats.com/2013/04/14/using-r-for-cricket-analysis-rstats/
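A minimal sketch of readLines() with head() and tail(). To keep it runnable offline, it reads a small temporary file created for the purpose; readLines() accepts a URL in the same position.

```r
# Write a tiny HTML page to a temporary file (stand-in for a real web page)
tmp <- tempfile(fileext = ".html")
writeLines(c("<html>", "<body>", "<p>Hello</p>", "</body>", "</html>"), tmp)

page <- readLines(tmp)   # one character string per line of the source

first_lines <- head(page, 2)   # first two lines
last_lines  <- tail(page, 2)   # last two lines
print(first_lines)
print(last_lines)
```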
42. Data Selection: Demo
Questions- How do I use multiple conditions (AND, OR)?
Can I do away with the subset function?
How do I select a random sample?
Useful Link- http://decisionstats.com/2013/11/24/50-functions-to-clear-a-basic-interview-for-business-analytics-rstats/
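One way to answer these three questions, sketched on the built-in `mtcars` dataset. The particular conditions, the sample size of 5 and the seed are arbitrary choices for illustration.

```r
# AND: both conditions must hold (&)
both <- subset(mtcars, cyl == 4 & mpg > 30)

# OR: at least one condition must hold (|)
either <- subset(mtcars, cyl == 4 | mpg > 30)

# The same AND selection without subset(), using logical indexing
both_indexed <- mtcars[mtcars$cyl == 4 & mtcars$mpg > 30, ]

# Random sample of 5 rows; set.seed() makes the draw repeatable
set.seed(42)
sampled <- mtcars[sample(nrow(mtcars), 5), ]

print(both)
print(sampled)
```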
43. Data Exploration
• Missing values are represented by NA in R
• Demo
– is.na()
– na.omit()
– na.rm (an argument of functions such as mean() and sum())
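A small sketch of the three tools on a made-up vector with one missing value.

```r
x <- c(5, NA, 6, 7)   # hypothetical values; NA marks the missing one

missing_flags <- is.na(x)           # TRUE where a value is missing
n_missing     <- sum(missing_flags) # count of missing values

mean_with_na    <- mean(x)                # NA, because one value is unknown
mean_without_na <- mean(x, na.rm = TRUE)  # drop NAs before averaging

clean <- na.omit(x)                 # the vector with missing values removed

print(missing_flags)
print(mean_without_na)
```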
44. Data Visualization
Notes-
Explaining Basic Types of Graphs
Customizing Graphs
Graph Output
Advanced Graphs
Facets,
Grammar of Graphics
Data Visualization Rules
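A rough sketch covering the first three notes (basic graph types, customization and output) with base R graphics on `mtcars`. The titles, labels, colors and the PDF output file are illustrative choices. Facets and the grammar of graphics are usually explored with the ggplot2 package, which is not used here.

```r
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)   # graph output: send the following plots to a PDF file

# Basic graph types, each customised with a title, axis labels and color
hist(mtcars$mpg, main = "Distribution of mpg", xlab = "Miles per gallon", col = "grey")
boxplot(mpg ~ cyl, data = mtcars, main = "mpg by cylinders",
        xlab = "Cylinders", ylab = "Miles per gallon")
plot(mtcars$wt, mtcars$mpg, main = "mpg vs weight",
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon", pch = 19, col = "blue")

dev.off()       # close the device so the file is written
file_written <- file.exists(out_file)
print(file_written)
```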