2. Where is this Going?
●
Introduction to database connection methods
●
Examples from some common R packages (cheat sheets
a.k.a. eye charts)
●
Introduction to the sqldf package
14/12/10 Database Access Through R 2
3. Database access is about fitting round pegs into
square holes.
4. Issues to Consider when Choosing a Data
Access Method for Basic Analysis
●
How much work does it take to set up?
●
Lazy ways – GUIs like R Commander, Deducer, JGR,
Revolution R, Red-R . . . .
●
Diligent ways – Database or Protocol Specific R
Packages.
●
Speed
●
Stability
●
Platform
5. High-Level Database Connection Procedure
●
Open a database connection object using the
appropriate driver (ODBC, JDBC, etc.)
●
Authenticate the user and confirm the connection
●
Execute database tasks by referencing the appropriate
methods on the database object
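The three steps above can be sketched with DBI. This is a minimal illustration, not a recommended setup: it assumes the DBI and RSQLite packages are installed and uses an in-memory SQLite database so it runs without a server. The `pegs` table is a hypothetical name made up for the example, and since a file-less SQLite database has no user accounts, the authentication step reduces to checking that the connection is valid.

```r
library(DBI)

# 1. Open a connection object using a driver
#    (assumption: RSQLite's in-memory database stands in for a real server)
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# 2. Confirm the connection is usable
#    (in-memory SQLite has no user to authenticate)
stopifnot(dbIsValid(con))

# 3. Execute database tasks through methods on the connection object
dbExecute(con, "create table pegs (id integer, shape text)")
dbExecute(con, "insert into pegs values (1, 'round'), (2, 'square')")
pegs <- dbGetQuery(con, "select * from pegs order by id")
print(pegs)

# 4. Release the connection when finished
dbDisconnect(con)
```

With a server-based DBMS the same calls apply; only the driver and the arguments to `dbConnect()` (user, password, host, and so on) change.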
6. DBI Package
●
Defines a common database interface; separate driver
packages connect it to various database protocols
including Oracle, PostgreSQL, ODBC, SQLite, MySQL
library(DBI)
## choose the proper DBMS driver and connect to the server
drv <- dbDriver("ODBC")
con <- dbConnect(drv, "dsn", "usr", "pwd")
## the interface can also work at a higher level, importing tables as data.frames
## and exporting data.frames as DBMS tables
dbListTables(con)
dbListFields(con, "quakes")
## drop any earlier copy of the results table before writing the new one
if (dbExistsTable(con, "new_results"))
  dbRemoveTable(con, "new_results")
dbWriteTable(con, "new_results", new.output)
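The snippet above needs a configured DSN. As a self-contained variant of the same table round trip, the sketch below assumes the RSQLite package as the driver and an in-memory database; `new.output` is built here from the built-in `quakes` data purely as a stand-in for the slide's object, and `round_trip` is a hypothetical name.

```r
library(DBI)

# Assumption: in-memory SQLite replaces the ODBC data source
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Export a data frame as a DBMS table (quakes ships with base R)
dbWriteTable(con, "quakes", quakes)
dbListTables(con)
dbListFields(con, "quakes")

# Stand-in for the slide's new.output: mean magnitude per station count
new.output <- aggregate(mag ~ stations, data = quakes, FUN = mean)

# Replace any earlier copy of the results table, then write the new one
if (dbExistsTable(con, "new_results")) {
  dbRemoveTable(con, "new_results")
}
dbWriteTable(con, "new_results", new.output)

# Import the table back as a data frame
round_trip <- dbReadTable(con, "new_results")

dbDisconnect(con)
```

Because every call goes through the DBI generics, swapping the driver in `dbConnect()` should leave the rest of the code unchanged.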
7. RODBC
●
Provides access to ODBC-compliant databases,
including MSSQL, MS Access, and others
# connect to database
library(RODBC)
myconn <- odbcConnect("mydsn", uid = "Rob", pwd = "aardvark")
# query data from the database (the table name is passed as a string)
crimedat <- sqlFetch(myconn, "Crime")
pundat <- sqlQuery(myconn, "select * from Punishment")
# close database connection
close(myconn)
8. RJDBC
●
Uses the DBI interface on the front end and a JDBC
driver on the back end
# connect to the database
library(RJDBC)
drv <- JDBC("com.mysql.jdbc.Driver",
            "/etc/jdbc/mysql-connector-java-3.1.14-bin.jar", "`")
conn <- dbConnect(drv, "jdbc:mysql://localhost/test")
# access database tables
dbListTables(conn)
data(iris)
# write to and query tables
dbWriteTable(conn, "iris", iris)
dbGetQuery(conn, "select count(*) from iris")
d <- dbReadTable(conn, "iris")
9. RMySQL
●
MySQL driver for R that implements the DBI
standard.
## connect and authenticate to a MySQL Db
library(RMySQL)
con <- dbConnect(MySQL(), group = "lasers")
con2 <- dbConnect(MySQL(), user = "opto", password = "pure-light",
                  dbname = "lasers", host = "merced")
## list tables and fields in a table
dbListTables(con)
dbListFields(con, "table_name")
## import and export data frames
d <- dbReadTable(con, "WL")
dbWriteTable(con, "WL2", a.data.frame) ## table from a data.frame
dbWriteTable(con, "test2", "~/data/test2.csv") ## table from file
10. RpgSQL
●
PostgreSQL interface to R via RJDBC
library(RpgSQL)
# the user/password/dbname used here are actually the defaults
con <- dbConnect(pgSQL(), user = "postgres", password = "", dbname = "test")
# create table, populate it and display it
s <- 'create table tt("id" int primary key, "name" varchar(255))'
dbSendUpdate(con, s)
dbSendUpdate(con, "insert into tt values(1, 'Hello')")
dbSendUpdate(con, "insert into tt values(2, 'World')")
dbGetQuery(con, "select * from tt")
# transfer a data frame to pgSQL and then display it from the database
# dbWriteTable is case sensitive
dbWriteTable(con, "BOD", BOD)
# table names are lower cased unless double quoted
dbGetQuery(con, 'select * from "BOD"')
11. RMongo
●
Access to MongoDB from R, modeled on RMySQL.
Still in alpha as of Nov 3, 2010.
# connect to a database
library(RMongo)
mongo <- mongoDbConnect("eat2treat_development")
# show the collections
dbShowCollections(mongo)
# perform an 'all' query with a document limit of 2 and an offset of 0.
# the result is a data.frame object. Nested documents are not supported at the
# moment; they are returned as their string output.
results <- dbGetQuery(mongo, "nutrient_metadatas", "{}", 0, 2)
names(results)
results <- dbGetQuery(mongo, "nutrient_metadatas", '{"nutrient_definition_id": 307}')
12. A Few Words about the sqldf Package
●
sqldf provides a way to run SQL statements on R
data frames.
●
sqldf works with the SQLite, H2, and PostgreSQL
databases.
●
This package allows you to run most SQL commands
against an R data frame: selects, joins, ordering,
grouping, averaging, etc.
13. sqldf Example
# load sqldf into workspace and execute SELECT queries
library(sqldf)
sqldf("select * from iris limit 5")
sqldf("select count(*) from iris")
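sqldf("select Species, count(*) from iris group by Species")
# example of a JOIN
Abbr <- data.frame(Species = levels(iris$Species),
                   Abbr = c("S", "Ve", "Vi"))
# depending on the sqldf version, Sepal.Length is exposed as Sepal_Length (older
# releases) or must be written as "Sepal.Length" in double quotes (newer releases)
sqldf("select Abbr, avg(Sepal_Length)
       from iris natural join Abbr group by Species")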
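The other operations listed for sqldf (explicit joins, ordering, grouping, averaging) can be combined in one statement. The sketch below is an illustration rather than part of the original talk: it assumes sqldf with its default SQLite backend, uses the built-in `mtcars` data (whose column names contain no dots), and `cyl_labels` is a hypothetical lookup table invented for the example.

```r
library(sqldf)

# Hypothetical lookup table, made up for this illustration
cyl_labels <- data.frame(cyl = c(4, 6, 8),
                         label = c("four", "six", "eight"))

# Join, group, average, and order in a single SQL statement
res <- sqldf("select l.label, count(*) as n, avg(m.mpg) as mean_mpg
              from mtcars m
              join cyl_labels l on m.cyl = l.cyl
              group by l.label
              order by mean_mpg desc")
print(res)
```

sqldf looks up `mtcars` and `cyl_labels` by name in the calling environment, loads them into a temporary database, runs the query, and returns the result as a data frame.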
14. Thank You
“How are you going to run the universe if you can't
answer a few unsolvable problems?”