Ajax World2008 Eric Farrar

It’s 11 p.m. Do you know where your queries are?
Eric Farrar, Sybase iAnywhere

Outline

 What are ORMs and Active Records?
 Tradeoffs
 Playing Nice with your Database
 Managing Indexes
 Eager Loading and Client-Side Joins
 Lazy Loading
 Conclusion

Object-Relational Mapper

 Systems to bridge the gap between object-oriented languages
and relational databases
class Employee < ActiveRecord::Base
  belongs_to :office
end

class Office < ActiveRecord::Base
  has_one :employee
end

 Inherently difficult:
 Normalization (splitting data across tables)
 Databases can only store scalar values
 Add an extra layer of abstraction

Active Record Pattern

 The ‘meat’ of an ORM that handles the CRUD work
 Allows regular objects to be treated as persistent objects
 Ideally, totally abstracts all database interaction

my_office = Office.new
my_office.number = 123
me = Employee.new
me.name = 'Eric Farrar'
me.office = my_office
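
To make the CRUD work behind those assignments more concrete, here is a minimal sketch of the "create" half of the pattern. ToyRecord, its sql_log, and the bracket accessors are hypothetical names invented for this illustration; the real ActiveRecord::Base does considerably more (type casting, quoting, associations, callbacks), so treat this as a toy under those assumptions.

```ruby
# Toy stand-in for an Active Record base class. It only records the SQL
# it would send; it is NOT how ActiveRecord::Base is implemented.
class ToyRecord
  # Every statement "sent to the database" is appended here.
  SQL_LOG = []

  def self.sql_log
    SQL_LOG
  end

  # Rails-style naming: Office -> "offices" (naive pluralization).
  def self.table_name
    name.split("::").last.downcase + "s"
  end

  def initialize
    @attributes = {}
  end

  def [](column)
    @attributes[column]
  end

  def []=(column, value)
    @attributes[column] = value
  end

  # Create: turn the in-memory attributes into an INSERT statement.
  # Quoting is deliberately naive; a real ORM escapes values properly.
  def save
    columns = @attributes.keys.join(", ")
    values  = @attributes.values.map { |v| v.is_a?(String) ? "'#{v}'" : v.to_s }.join(", ")
    self.class.sql_log << "INSERT INTO #{self.class.table_name} (#{columns}) VALUES (#{values})"
    true
  end
end

class Office < ToyRecord; end

my_office = Office.new
my_office[:number] = 123
my_office.save

puts ToyRecord.sql_log
# INSERT INTO offices (number) VALUES (123)
```

The point of the sketch is that a plain-looking object assignment eventually becomes a SQL statement, which is the part the rest of the talk asks you to keep in mind.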

Examples of ORMs/Active Records

 LINQ (Language Integrated Query)
 Hibernate / NHibernate
 Django
 Ruby on Rails (ActiveRecord)
 Many more…

 For our purposes, we will use Rails' ActiveRecord for the
examples

Trade-offs

 Advantages
 Easy to learn
 Simplifies database creation and management
 No context switching between languages
 You don’t need to know about the database
 Disadvantages
 Performance suffers (up to 50% slower)
 Often uses lowest-common denominator solution
 Concurrency semantics often very difficult
 You don’t need to know about the database

Managing Indexes

 Indexes are used to make things quick to look up
 phone book vs. reverse look-up
 Indexes should be present on anything you will search for
 Searching for non-indexed properties will result in full table
scan
 By default, indexes are usually only put on primary keys
 Lack of indexes often will not appear during development
 Result will be a gradual slowdown (as data volume increases)
as opposed to avalanche failure
 Why not put an index on everything?
 Multi-column indexes vs. single column indexes
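
The phone-book analogy can be sketched in plain Ruby. The example below assumes 10,000 made-up employee rows held in memory and uses a Hash as a stand-in for an index; real database indexes are usually B-trees rather than hashes, so this only illustrates the difference between scanning every row and jumping straight to one.

```ruby
# 10,000 hypothetical employee rows, held in memory in place of a table.
employees = (1..10_000).map { |i| { id: i, name: "Employee #{i}", office_id: i % 500 } }

# Without an index: examine rows one by one until a match turns up.
def full_scan(rows, name)
  examined = 0
  match = rows.find do |row|
    examined += 1
    row[:name] == name
  end
  [match, examined]
end

# With an index: build a name -> row lookup once, then go straight to the row.
name_index = employees.each_with_object({}) { |row, idx| idx[row[:name]] = row }

row, examined = full_scan(employees, "Employee 10000")
indexed_row = name_index["Employee 10000"]

puts "full scan examined #{examined} rows"
puts "index lookup found id #{indexed_row[:id]} without scanning"
```

Building and maintaining name_index costs memory and extra work on every insert or update, which is one reason not to put an index on everything.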

Client-Side Join

 Objects are usually ‘related’ to each other
 belongs_to
 has_one
 has_many
 has_and_belongs_to_many
 ORMs use these relationships to allow object traversal
 ex. me.office
 Assuming 10,000 employees, how many queries will this code
produce?
Employee.find(:all).each do |e|
  puts e.office.number
end

“Man, this is heavy!”

 Answer: 10,001
Employee.find(:all).each do |e|   # <-- 1 query here
  puts e.office.number            # <-- 10,000 queries here
end

 Why? The application is doing the work of joining the data, not
the database. This is called a ‘client-side’ join
 This is solved by giving a hint to the ORM and the database
that you intend to use the ‘office’ property
Employee.find(:all, :include => :office).each do |e|
  puts e.office.number
end

 This pattern is called eager loading
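
The query counts above can be reproduced without a real database. The sketch below uses a hypothetical FakeDatabase class (not part of Rails) that simply counts the calls it receives. It assumes the ORM implements eager loading as one extra batched query; some ORMs and versions use a single JOIN instead, which would bring the count down to one.

```ruby
# In-memory stand-in for a database that counts the queries it receives.
class FakeDatabase
  attr_reader :query_count

  def initialize(employee_count)
    @offices   = (1..employee_count).map { |i| { id: i, number: 100 + i } }
    @employees = (1..employee_count).map { |i| { id: i, office_id: i } }
    @query_count = 0
  end

  # SELECT * FROM employees
  def all_employees
    @query_count += 1
    @employees
  end

  # SELECT * FROM offices WHERE id = ?
  def office_by_id(id)
    @query_count += 1
    @offices[id - 1]
  end

  # SELECT * FROM offices WHERE id IN (...)
  def offices_by_ids(ids)
    @query_count += 1
    wanted = ids.each_with_object({}) { |id, h| h[id] = true }
    @offices.select { |o| wanted[o[:id]] }
  end
end

# Client-side join: one query for the employees, then one per employee.
db = FakeDatabase.new(10_000)
db.all_employees.each { |e| db.office_by_id(e[:office_id])[:number] }
n_plus_one_queries = db.query_count

# Eager loading: fetch every needed office in a single batched query.
db = FakeDatabase.new(10_000)
employees = db.all_employees
offices = db.offices_by_ids(employees.map { |e| e[:office_id] })
office_lookup = offices.each_with_object({}) { |o, h| h[o[:id]] = o }
employees.each { |e| office_lookup[e[:office_id]][:number] }
eager_queries = db.query_count

puts "client-side join: #{n_plus_one_queries} queries"  # 10001
puts "eager loading:    #{eager_queries} queries"       # 2
```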

Inviting the Database to the Party

 Eager loading solves the N+1 problem, but it is still only half
way there
 In ORMs, the relations are defined inside the object models
 The ORM may know that Employees and Offices are related,
but the database doesn’t know that
 The database will obediently execute the query, but don’t
expect it to do anything clever
 Modern query optimizers will use every statistic available when
determining query paths
 Keeping them ignorant will result in bare-bones optimization
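
One common way to tell the database about a relationship is a foreign key constraint. The helper below is a hypothetical illustration that only builds the DDL string; it assumes the Rails naming conventions used in this talk (an employees.office_id column pointing at offices.id) and standard SQL syntax, which not every database accepts in exactly this form. In a Rails migration, a string like this could be passed to execute.

```ruby
# Hypothetical helper: builds the DDL that declares a foreign key, so the
# database itself (and its optimizer) knows the two tables are related.
def foreign_key_sql(child_table, column, parent_table)
  "ALTER TABLE #{child_table} " \
  "ADD CONSTRAINT fk_#{child_table}_#{column} " \
  "FOREIGN KEY (#{column}) REFERENCES #{parent_table} (id)"
end

puts foreign_key_sql("employees", "office_id", "offices")
# ALTER TABLE employees ADD CONSTRAINT fk_employees_office_id FOREIGN KEY (office_id) REFERENCES offices (id)
```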

Lazy Loading

 Eager loading deals with the case where you want more than
your class includes
 What if you want less?
 Suppose your Employee class includes a picture field that is a
high-resolution bitmap (~3 MB)
 A plain find(:all) will return every picture in order to fully
populate each object

Employee.find(:all).each do |e|
  puts e.name
end

 This innocent-looking code will return roughly 30 GB of data
(10,000 employees × ~3 MB each)

Be Lazy

 Instead, lazily load your object properties
Employee.find(:all, :select => "name").each do |e|
  puts e.name
end

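 In ORMs that support lazy attributes, accessing e.picture
works by issuing another database query (ActiveRecord instead
raises a missing-attribute error, so the record must be fetched
again with that column selected)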
 This simple example ignores potential problems with
concurrency
 Use locking
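
A rough sketch of what lazy loading a single property could look like follows. FakeStore and LazyEmployee are hypothetical classes written for this illustration, not ActiveRecord APIs, and the picture is a short placeholder string with its 3 MB size only accounted for in a byte counter. Concurrency is ignored here too.

```ruby
# Each picture is assumed to be about 3 MB.
PICTURE_BYTES = 3 * 1024 * 1024

# Stand-in for the employees table that tracks queries and bytes returned.
class FakeStore
  attr_reader :query_count, :bytes_sent

  def initialize
    @query_count = 0
    @bytes_sent = 0
  end

  # SELECT id, name FROM employees  (cheap: no picture column)
  def names(count)
    @query_count += 1
    (1..count).map do |i|
      name = "Employee #{i}"
      @bytes_sent += name.bytesize
      { id: i, name: name }
    end
  end

  # SELECT picture FROM employees WHERE id = ?
  def picture(id)
    @query_count += 1
    @bytes_sent += PICTURE_BYTES
    "picture-for-#{id}" # placeholder instead of a real 3 MB bitmap
  end
end

class LazyEmployee
  attr_reader :name

  def initialize(store, row)
    @store = store
    @id = row[:id]
    @name = row[:name]
  end

  # Fetched on first access only, then cached on the object.
  def picture
    @picture ||= @store.picture(@id)
  end
end

store = FakeStore.new
employees = store.names(10_000).map { |row| LazyEmployee.new(store, row) }
employees.each { |e| e.name }
queries_for_names = store.query_count

employees.first.picture
employees.first.picture # cached: no second picture query

puts "queries after listing names: #{queries_for_names}"   # 1
puts "queries after one picture:   #{store.query_count}"   # 2
puts "bytes sent so far: #{store.bytes_sent}"
puts "loading every picture up front would send #{10_000 * PICTURE_BYTES} bytes"
```

The trade-off is visible in the counters: listing names costs one small query, and each picture costs one more query only when it is actually needed.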

Conclusions

 ORMs and Active Records can provide large productivity
advantages, typically at the expense of performance
 ORMs should never be seen as an alternative to learning
about databases (although they can be a good introduction)
 At times, you will likely need to drop down to the database
level (profiling, etc) to diagnose problems
 Ideally, a programmer using an ORM will always consider how
their code will actually look once it hits the database
 Similarities to a C compiler
 You should be able to answer “Yes!” to the question, “Do you
know where your queries are?”
