28. 5. 2014
@rgavuliak
roman.gavuliak@gmail.com
R in practice
A freeware tool for analysis
R
• Free + open source
• Statistics + graphs
• "Programming language"
• Unlimited packages
• Vector operations
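To make the "vector operations" bullet concrete, here is a minimal sketch. The page-view numbers are made up for illustration and are not from the talk.

```r
# Hypothetical daily page views (made-up numbers for illustration only)
pageviews <- c(120, 95, 143, 110)

# Arithmetic is applied element by element, no explicit loop needed
doubled <- pageviews * 2

# Comparisons are vectorized too and return a logical vector
busy_days <- pageviews > 100

# Logical vectors can be used directly for filtering and counting
busy_values <- pageviews[busy_days]
n_busy <- sum(busy_days)

print(doubled)
print(busy_values)
print(n_busy)
```

The same pattern (compute on the whole vector, then filter with a logical vector) comes up repeatedly in the Twitter example later in the deck.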
R is ugly...
but popular!
Source: r4stats.com
Data types
• the classics (number, character, word)
• vector
• table (data.frame, data.table)
• list
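The data types listed above can be sketched in a few lines of base R. The object names (`payments`, `bundle`) and all values are illustrative assumptions; data.table is left out because it needs an extra package.

```r
x <- 1                                      # a single number (really a vector of length 1)
word <- "biela"                             # a character string
colours <- c("biela", "modra", "cervena")   # a character vector

# A data.frame is a table whose columns are vectors of equal length
payments <- data.frame(
  id = 1:3,
  amount = c(10.5, 20, 7.25),
  stringsAsFactors = FALSE
)

# A list can hold elements of different types and lengths
bundle <- list(number = x, colours = colours, table = payments)

# str() prints a compact overview of the structure
str(bundle)
```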
What goes into R?
• A <- 1
• B <- c("biela", "modra", "cervena")
• C <- read.csv("platby.csv")
• D <- dbSendQuery(con, "select * from xy"), where con is an open connection, e.g. from dbConnect(MySQL(), ...)
• Big data technologies (Hadoop, Cassandra...)
• Social networks
• NA
What to do with it
• > A + 1
[1] 2
• summary(C)
• model <- lm(pageviews ~ navstevnost, data = stranka)
• plot(stranka$navstevnost)
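The slide does not show the `stranka` data, so the following is only a sketch with made-up numbers that reuses the slide's column names (`navstevnost` for visits, `pageviews` for page views).

```r
# Made-up data for a hypothetical site: daily visits and page views
stranka <- data.frame(
  navstevnost = c(100, 150, 200, 250, 300, 350),
  pageviews   = c(305, 445, 610, 740, 905, 1045)
)

# Descriptive statistics for every column
print(summary(stranka))

# Linear regression of page views on visits
model <- lm(pageviews ~ navstevnost, data = stranka)
print(coef(model))

# Predict page views for a day with 400 visits
print(predict(model, newdata = data.frame(navstevnost = 400)))

# Scatter plot with the fitted line
plot(stranka$navstevnost, stranka$pageviews,
     xlab = "navstevnost", ylab = "pageviews")
abline(model, col = "red")
```

Passing `data = stranka` to `lm()` rather than using `stranka$...` inside the formula is what allows `predict()` to accept new data.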
How to: R and Twitter
#volilisme
The journey starts at
https://dev.twitter.com/
Connecting to Twitter
We create a dummy application
Connecting R to Twitter
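# RCurl supplies the certificate bundle used for the SSL connection
install.packages("RCurl")
library(RCurl)
options(RCurlOptions = list(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl")))
install.packages("twitteR")
library(twitteR)
library(ROAuth)  # provides OAuthFactory
reqURL <- "https://api.twitter.com/oauth/request_token"
accessURL <- "https://api.twitter.com/oauth/access_token"
authURL <- "https://api.twitter.com/oauth/authorize"
apiKey <- "xxxxxxxxxxxxxxxxxxxxxx"
apiSecret <- "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
# The slide omits the step that creates twitCred; with ROAuth it is typically built like this
twitCred <- OAuthFactory$new(consumerKey = apiKey, consumerSecret = apiSecret,
                             requestURL = reqURL, accessURL = accessURL, authURL = authURL)
twitCred$handshake(
  cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl")
)
registerTwitterOAuth(twitCred)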
Data
volbyEU <- searchTwitter("#volilisme", n = 10000, since = "2014-05-23")
tweety <- twListToDF(volbyEU)
head(tweety)
Data
people <- lookupUsers(tweety$screenName)
users <- twListToDF(people)
head(users, 3)
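One author usually appears several times in the tweet table, so it can help to look at the distinct screen names before looking users up. The sketch below runs without the Twitter API; the data frame `tweets_demo` and its values are made up and only mimic the `screenName` column of `tweety`.

```r
# Made-up stand-in for the tweet table returned by twListToDF()
tweets_demo <- data.frame(
  screenName   = c("anna", "boris", "anna", "cyril", "boris", "anna"),
  retweetCount = c(0, 2, 1, 0, 5, 0),
  stringsAsFactors = FALSE
)

# Distinct authors, in order of first appearance
unique_names <- unique(tweets_demo$screenName)

# Number of tweets per author, most active first
tweets_per_user <- sort(table(tweets_demo$screenName), decreasing = TRUE)

print(unique_names)
print(tweets_per_user)
```

Under that assumption, `unique_names` could be passed to `lookupUsers()` instead of the full column.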
What are we interested in?
1. Tweet popularity
2. User popularity
3. Tweet location
Tweet popularity
• Number of retweets
• Number of favorites
Tweet popularity
summary(tweety$retweetCount)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      0       0       0   4.008       3      20
summary(tweety$favoriteCount)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      0       0       0  0.3525       0       6
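In both summaries the median is 0 while the mean is higher, which suggests that a few popular tweets account for most of the activity. The sketch below shows one way to check this; the counts are made up and are not the talk's data.

```r
# Made-up retweet counts with many zeros and one popular tweet
retweets <- c(0, 0, 0, 0, 1, 3, 3, 5, 20)

# Same six-number summary as on the slide
print(summary(retweets))

# Share of tweets that were never retweeted
share_zero <- mean(retweets == 0)

# The three most retweeted tweets
top3 <- sort(retweets, decreasing = TRUE)[1:3]

print(share_zero)
print(top3)
```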
User popularity
• Number of followers
• Number of favorites
User popularity
summary(users$followersCount)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      4      71   186.5    1286   620.5   31151
summary(users$favoritesCount)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      0     9.5      63     315     316  223700
User popularity
plot(users$favoritesCount ~ users$followersCount, main = "Followers vs favorites",
     xlab = "Total followers", ylab = "Total favorites", col = "blue")
User popularity
plot(log(users$favoritesCount + 1) ~ log(users$followersCount + 1),
     xlab = "log Total followers", ylab = "log Total favorites", col = "orange",
     main = "Followers vs favorites", cex = 1.5)
User popularity
Linearny_model <- lm(log(users$favoritesCount + 1) ~ log(users$followersCount + 1))
summary(Linearny_model)
User popularity
abline(Linearny_model, col = "red")
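The log-log model above can be tried without Twitter access. The sketch below uses simulated data: `users_demo`, the seed, and the exponent 0.8 are assumptions chosen only to produce a plausible cloud of points. Adding 1 before taking the logarithm keeps users with zero followers or favorites in the model, since log(0) is -Inf.

```r
set.seed(42)

# Simulated users: follower counts spread over several orders of magnitude
followers <- round(exp(runif(50, min = 0, max = 10)))
favorites <- round(followers^0.8 * exp(rnorm(50, sd = 0.5)))
users_demo <- data.frame(followersCount = followers, favoritesCount = favorites)

# log(x + 1) transformation, as on the slides
users_demo$log_followers <- log(users_demo$followersCount + 1)
users_demo$log_favorites <- log(users_demo$favoritesCount + 1)

# Fit the linear model on the transformed columns
fit <- lm(log_favorites ~ log_followers, data = users_demo)
print(coef(fit))

# Scatter plot with the fitted regression line
plot(log_favorites ~ log_followers, data = users_demo,
     xlab = "log Total followers", ylab = "log Total favorites", col = "orange")
abline(fit, col = "red")
```

Storing the transformed values as columns keeps the formula short and the coefficient names readable.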
Tweet location
nrow(tweety[!is.na(tweety$longitude), ])
4 tweets have a location
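The count above can also be obtained without subsetting the whole data frame. The sketch below uses a made-up table, `tweets_demo`, whose coordinates are stored as character strings; that storage type is an assumption here, motivated by the deck's later as.numeric() conversion.

```r
# Made-up tweets; only two of them carry coordinates
tweets_demo <- data.frame(
  screenName = c("a", "b", "c", "d"),
  longitude  = c(NA, "17.1077", NA, "21.2611"),
  latitude   = c(NA, "48.1486", NA, "48.7164"),
  stringsAsFactors = FALSE
)

# Count geotagged tweets: TRUE counts as 1 when summed
n_located <- sum(!is.na(tweets_demo$longitude))

# Convert the coordinates to numbers before any plotting
tweets_demo$longitude <- as.numeric(tweets_demo$longitude)
tweets_demo$latitude  <- as.numeric(tweets_demo$latitude)

# Keep only rows where both coordinates are present
located <- tweets_demo[complete.cases(tweets_demo[, c("longitude", "latitude")]), ]

print(n_located)
print(located)
```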
Tweet location
table(tweety$longitude)
class(tweety$longitude)
# class() shows how the coordinates are stored; convert them to numeric before plotting
tweety$longitude <- as.numeric(tweety$longitude)
tweety$latitude <- as.numeric(tweety$latitude)
install.packages("ggmap")
library(ggplot2)
library(ggmap)
# Centre the map on the mean position of the geotagged tweets (longitude first, then latitude)
center <- c(lon = mean(tweety$longitude, na.rm = TRUE), lat = mean(tweety$latitude, na.rm = TRUE))
map <- get_map(location = center, zoom = 9, maptype = "terrain", source = "google")
vysledna_mapa <- ggmap(map)
# Label each geotagged tweet with the author's handle
vysledna_mapa <- vysledna_mapa +
  geom_text(data = tweety, aes(x = longitude, y = latitude, label = paste0("@", screenName)),
            colour = "purple", size = 5, hjust = 0, vjust = 0, na.rm = TRUE) +
  theme(legend.position = "none")
# Mark the tweet positions with points
vysledna_mapa <- vysledna_mapa +
  geom_point(data = tweety, aes(x = longitude, y = latitude), colour = "purple", size = 2, na.rm = TRUE)
vysledna_mapa
Conclusions
• Failure?
• Choosing the target audience
• Call to action
Sources
http://www.r-bloggers.com/r-text-mining-on-twitter-prayformh370-malaysia-airlines/
http://cran.r-project.org/web/packages/twitteR/twitteR.pdf
https://dev.twitter.com/docs/platform-objects/tweets
https://dev.twitter.com/docs/platform-objects/users
https://stackoverflow.com/questions/14095495/plotting-coordinates-of-multiple-points-at-google-map-in-r
How to learn R?
• R programming (https://www.coursera.org/course/rprog)
• https://www.datacamp.com/courses/introduction-to-r
• http://tryr.codeschool.com/
Thank you for your attention!
Questions? Comments?
@rgavuliak
roman.gavuliak@gmail.com
