A New Approach to Resolve Object-Relational Impedance Mismatch
Henrietta Dombrovskaya
Enova
Chicago IL
hdombrovskaya@enova.com
What Is Object-Relational Impedance Mismatch and Why Is It Bad?
Why should we care?
We should care because we care
about…
2013 Henrietta Dombrovskaya – Enova 2	
  
Database Application Performance
Everybody wants their application to function efficiently,
right?
After all – why choose to use databases in the first
place?
Because
The DBMS is specialized software designed
to manage data in the most efficient way.
Nevertheless, the most common complaint of application
developers is…
  
THE DATABASE IS
SLOW
WHY???
  
Let’s see…
Where do we usually start looking when we want
to see why database performance is
bad?
  
Our plan of action
Look at the longest-running queries and try to
optimize them
Look at the queries that are executed most
often, and make sure all of them are tuned and/or
running fast enough
Everything looks good, why is the app still slow?
Helpful reports:
Our performance monitoring tool (pgBadger) produces
several classes of reports, among them:
Longest-running queries (slowest queries)
Most frequently running queries
Queries that take up the most total time
(top offenders).
Based on these reports, is the database
really slow?
Almost all of the top 50 offenders, i.e. the queries
that consume the most total execution time, are very
short (fast) queries that are executed too many
times
For example
SELECT * FROM loans WHERE id=?
is executed 8,500,000 times during the day, each
time taking several milliseconds, for a total execution
time of about 2.5 hours.
  
You may say that is how many times the
screen is refreshed, but…
UK: the account home controller was accessed about 50,000
times during the day, and the customer support tool even
less.
However, the pgBadger log shows a total of almost 1.5 million
executions of different versions of the
SELECT sum_accounts
statement and a similar number of
SELECT * FROM loans.
Statistics show that some application controllers produce
over 1,000 database calls for each screen refresh.
How could this possibly be happening?!
Let’s take a step back…
First came a program…
•  Once upon a time there was a program…
•  Until the mid-1960s, only sequential-access storage was available; a program was
considered a stable object that could process different input data and produce
different output
[Diagram: Input, Program, Output]
•  Direct-access storage appeared in the early 1970s, allowing concurrent data access and
selective updates. At the same time, developers started to face the rising complexity and
cost of software development, along with concurrency control and data integrity problems, and
the need for specialized data-management software became apparent.
•  DBMSs emerged as specialized programs for centralized data management
[Diagram: Program, DBMS]
Since Then We Have…
-  Imperative programming languages, which tell how to do
things:
for (i:=1, i++, n) do
…
end;
and
-  Declarative data manipulation languages, which define
what to do:
SELECT first_name, last_name
FROM people
WHERE id=101
Database provides:
•  Independent data storage
•  Complex data structures
•  Data integrity
•  Concurrent access and data consistency
Application provides:
•  User interface
•  Application control flow
  
So…
Now everybody is happy, right?
…not really…
  
Why? Because…
Both imperative programming languages and
declarative query languages work perfectly to
accomplish the tasks they were designed to
accomplish.
The problems start when we try to make them work
together.
  
What is Impedance Mismatch?
These tools were designed to operate on objects of different size
(granularity): variables vs. tables
And with different means of identification: address vs. set of attribute values
More precisely:
•  Object incompatibility: database objects are “too big” for the
imperative programming language
•  This means that classes and methods for tables and sets should be
integrated with the database to work effectively (methods should be
executed by the DB engine)
•  But most architectures do not allow such integration, which
leads to reimplementation of database operations at the application
layer
  
When We Have an OO Application…
•  During one screen rendering, objects may belong to
different classes with different access methods
•  Object-Relational Mapping (ORM) maps a database
object to an in-memory application object:
–  solves the problem of abstracting away the details of data storage
–  does not provide effective means of manipulating data sets
This particular case of impedance mismatch is called ORIM:
object-relational impedance mismatch
  
ORIM Definition
“A set of conceptual and technical difficulties that is
encountered when a relational database management
system (RDBMS) is being used by a program written in
an object-oriented programming language or style,
particularly apparent when objects or class definitions
are mapped in a straightforward way to database
tables or relational schema”
Ambler, S., Agile Database Techniques: Effective
Strategies for the Agile Software Developer
  
Development Standards & Performance
Internal company standards and/or industry standards may
prevent us from creating efficient database applications.
Typical OO design recommendations - clearly separate the
following layers of classes:
•  End-user interface
•  Business logic
•  Relationships
•  Interaction with the database
The standard allows limited interaction between adjacent
levels, therefore equivalents of database operations may
be implemented within several different processes,
sometimes on different machines.
  
Translation:
•  As a database developer, you would assume that to get
all payments for all loans for a given customer, you need
to join the loans table with the payments table
•  But for an application developer, loans and payments
belong to different classes, so they are accessed
by different sets of methods
•  Which means that the join operation is replicated
“manually”…
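The difference can be sketched in plain Ruby. This is only an illustration under stated assumptions: `FakeDb`, its methods, and the sample rows are hypothetical stand-ins (not ActiveRecord and not the Enova schema), and the "database" is an in-memory object that merely counts round trips.

```ruby
# Hypothetical in-memory stand-in for a database; it only counts round trips.
class FakeDb
  attr_reader :calls

  def initialize(loans, payments)
    @loans = loans
    @payments = payments
    @calls = 0
  end

  # One call, like: SELECT * FROM loans WHERE customer_id = ?
  def loans_for(customer_id)
    @calls += 1
    @loans.select { |loan| loan[:customer_id] == customer_id }
  end

  # One call per loan, like: SELECT * FROM payments WHERE loan_id = ?
  def payments_for(loan_id)
    @calls += 1
    @payments.select { |pay| pay[:loan_id] == loan_id }
  end

  # One call in total, like: loans JOIN payments ON payments.loan_id = loans.id
  def payments_joined_for(customer_id)
    @calls += 1
    loan_ids = @loans.select { |loan| loan[:customer_id] == customer_id }
                     .map { |loan| loan[:id] }
    @payments.select { |pay| loan_ids.include?(pay[:loan_id]) }
  end
end

LOANS = [
  { id: 1, customer_id: 7 },
  { id: 2, customer_id: 7 },
  { id: 3, customer_id: 7 },
  { id: 4, customer_id: 8 }
].freeze

PAYMENTS = [
  { loan_id: 1, amount: 100 },
  { loan_id: 2, amount: 50 },
  { loan_id: 2, amount: 25 },
  { loan_id: 4, amount: 10 }
].freeze

# Object-by-object access: the join is replicated "manually" in the application.
def payments_one_by_one(db, customer_id)
  db.loans_for(customer_id).flat_map { |loan| db.payments_for(loan[:id]) }
end

# Set-based access: the database performs the join in a single round trip.
def payments_in_one_call(db, customer_id)
  db.payments_joined_for(customer_id)
end

manual_db = FakeDb.new(LOANS, PAYMENTS)
manual_rows = payments_one_by_one(manual_db, 7)

joined_db = FakeDb.new(LOANS, PAYMENTS)
joined_rows = payments_in_one_call(joined_db, 7)

puts "manual: #{manual_db.calls} calls, joined: #{joined_db.calls} call"
```

Both paths return the same rows for this sample customer, but the object-by-object path needs one call for the loans plus one per loan (4 here), while the joined path needs 1. The manual count grows with the number of loans; the joined count does not.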
Let’s look at some examples…
  
What We Have at Enova?
Enova is a Ruby on Rails shop that uses the
ActiveRecord Object Relational Mapping (ORM) library
to communicate with a Postgres database.
With ActiveRecord a database table or view is mapped
into a class and an object instance is tied to a single
row in the table.
ActiveRecord is the default “model” component of the
model-view-controller web-application framework Ruby
on Rails, and is also a stand-alone ORM package for
other Ruby applications.
  
How ActiveRecord Works
  
What This Means for Application/
Database Interaction
Due to the lack of awareness of the underlying
database interaction on the part of the object methods,
one controller performs multiple trips to the database
For example…
  
Account Presenter
def initialize(customer)
  @customer = customer
  @customer_extra = customer.customer_extra
  @person = customer.person
  @address = customer.person.try(:address)
  @company = customer.person.try(:company)
  @bank_account = customer.bank_account(true)
  @debit_card = customer.debit_card
  @customer_paydate = customer.customer_paydate(true)
  @paydate_schedule =
    customer.customer_paydate.try(:paydate_schedule)
  @customer_source = customer.customer_source
end
  
Corresponding Application Log
SELECT * FROM customers
  WHERE (customers.id = 12470535)

SELECT * FROM people
  WHERE (people.id = 61657007 AND (type = 'CustomerPerson'))
  AND ( (people.type = 'CustomerPerson' ) );

SELECT addresses.*, people_addresses.serial_number
FROM addresses
  INNER JOIN people_addresses ON addresses.id = people_addresses.address_id
  WHERE (people_addresses.person_id = 61657007
    AND (eff_end_date is NULL));
  
But Wait, There’s More!
SELECT * FROM approvals WHERE (customer_id = 12470535) ORDER BY processed_on desc LIMIT 1
SELECT * FROM customers WHERE (customers.id = 12470535)
SELECT * FROM loans WHERE (loans.id = 25563928)
SELECT * FROM loans WHERE (customer_id = 12470535 and status_cd in ('applied','approved','on_hold')) ORDER BY funding_date DESC LIMIT 1
SELECT * FROM loans WHERE (customer_id = 12470535 and status_cd in ('applied','approved',E'on_hold')) ORDER BY funding_date DESC LIMIT 1
SELECT * FROM loans WHERE (customer_id = 12470535 and status_cd in ('issued','issued_pmt_proc')) ORDER BY funding_date DESC LIMIT 1
SELECT * FROM loans WHERE (customer_id = 12470535 and status_cd in ('applied','approved',E'on_hold')) ORDER BY funding_date DESC LIMIT 1
SELECT * FROM customers WHERE (customers.person_id = 61657007)
SELECT count(*) AS count_all FROM loans WHERE (loans.customer_id = 12470535 AND (status_cd in ('applied','approved','on_hold','issued','issued_pmt_proc') and loan_type_cd = 'installment'))
SELECT * FROM loans WHERE (customer_id = 12470535 and status_cd in ('applied','approved','on_hold')) ORDER BY funding_date DESC LIMIT 1
SELECT * FROM loans WHERE (customer_id = 12470535 and status_cd in ('applied','approved','on_hold')) ORDER BY funding_date DESC LIMIT 1
  
What We Should See Instead…
SELECT *
FROM customers
WHERE (customers."id" = 12470535)

SELECT *
FROM people
WHERE (people."id" = 61657007
  AND (type = 'CustomerPerson'))
  AND ( (people."type" = 'CustomerPerson' ) );

SELECT addresses.*, people_addresses.serial_number
FROM addresses
  INNER JOIN people_addresses
  ON addresses.id = people_addresses.address_id
WHERE (people_addresses.person_id = 61657007
  AND (eff_end_date is NULL));
  
Why Does This Happen?
… because the app is written in an imperative way:
“First we need to select all loans,
then for each loan we need to select…”
  
There Are Some Ways To Improve
Eager loading generates almost correct SQL:
SELECT customers."id"
  ,customers."created_by"
  <..>,customers."person_id"
  <..>,customers_extra."estimated_monthly_living_expense"
  <..>,people."type"
  <...>,addresses."id"<...>
  ,companies."id" <...>
FROM customers
  LEFT OUTER JOIN customers_extra
    ON customers_extra.customer_id = customers.id
  LEFT OUTER JOIN people
    ON people.id = customers.person_id AND people."type" = 'CustomerPerson'
  LEFT OUTER JOIN people_addresses
    ON people_addresses.person_id = people.id
  LEFT OUTER JOIN addresses ON addresses.id = <…>
WHERE (customers."id" = 17674188)
  
But Then We Apply the Next Method
has_many :bank_accounts, :foreign_key => 'person_id' do
  def default_for_customer(customer, reload = false)
    @bank_account_cache = {} if @bank_account_cache.nil? || reload
    @bank_account_cache[customer.brand.code] ||= find(:first,
      :joins => "INNER JOIN payment_instruments.defaults pid
                 ON pid.payment_instrument_id =
                 payment_instruments.bank_accounts.payment_instrument_id",
      :conditions => ["pid.customer_id = ?", customer],
      :readonly => false)
  end
end
And we execute SELECT again!
  
If We Continue With Existing Frameworks
… this problem (ORIM) will never be solved, and
we will continue to lose money on timeouts!
New Approach to Resolve Object-Relational
Impedance Mismatch
Our new approach allows us to drastically reduce the
number of calls to the database, and to get
exactly the data we need.
An application view can be rendered using one
or two calls to the database instead of the 500-900
calls in the old app. In addition, it allows us to
optimize query execution without any
changes to the application and/or models.
  
The old methods are unaware of the data
which is already available – make them
data-aware!
Isn’t this contrary to the standard OO
approach?
Yes, but…
This is the only way to improve the App/DB
interaction.
Logic Split Methodology
•  Disassemble the method into atomic steps
•  Identify the steps that require data retrieval
•  Using knowledge of the relationships between database
objects, construct a single query
•  Execute it
•  Use the retrieved data in the other steps
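As a rough sketch of how these steps might look in application code: the class, the lambda, the field names, and the balance formula below are all hypothetical and chosen only for illustration. The point is the shape: one retrieval step, then pure computations on the retrieved row.

```ruby
# Hypothetical single-call data source: stands in for one SELECT (with joins)
# that returns everything the method needs for one loan.
FETCH_LOAN_DATA = lambda do |loan_id|
  { loan_id: loan_id, principal: 300.0, fees: 15.0, customer_balance: 40.0 }
end

class LoanCalculator
  attr_reader :fetch_count

  def initialize(fetcher)
    @fetcher = fetcher
    @fetch_count = 0
    @rows = {}
  end

  # All steps that need data retrieval are merged into one fetch per loan;
  # later steps reuse the same row instead of going back to the database.
  def data_for(loan_id)
    @rows[loan_id] ||= begin
      @fetch_count += 1
      @fetcher.call(loan_id)
    end
  end

  # Remaining steps are pure computations on the retrieved data.
  def amount_payable(loan_id)
    row = data_for(loan_id)
    row[:principal] + row[:fees]
  end

  # Illustrative formula only; it is not the business rule from the slides.
  def account_balance(loan_id)
    row = data_for(loan_id)
    [amount_payable(loan_id) - row[:customer_balance], 0].max
  end
end

calc = LoanCalculator.new(FETCH_LOAN_DATA)
balance = calc.account_balance(42)
puts "balance=#{balance} fetches=#{calc.fetch_count}"
```

Here `account_balance` calls `amount_payable`, and both read the row, yet the data source is hit only once, because the retrieval step has been separated from the calculation steps.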
  
Example: Amount_Outstanding
Definitions
AccountsOutstanding=
AccountsUncollected +
FeesOutstanding +
InterestOutstanding +
PrincipalAccounts +
AccountsDue
In turn:
•  AccountsUncollected =
uncollected_principal +
uncollected_installment_principal
  
Under the Hood: Database Calls
SELECT
  vl.value AS account
  ,SUM(CASE vl.value WHEN pt.debit_account_cd
            THEN pt.amount ELSE 0 END)
  - SUM(CASE vl.value WHEN pt.credit_account_cd
            THEN pt.amount ELSE 0 END) AS sum
FROM payment_transactions pt
JOIN valuelists vl ON vl.type_cd = 'transaction_account'
  AND vl.value IN (pt.debit_account_cd, pt.credit_account_cd)
AND loan_id=?
… and then the value for a specific account is selected.
  
Drawbacks
The method itself would allow retrieving all the information
related to one loan “in one shot”.
However, because the application developers are unaware
of the underlying layers, there appears to them to be no difference between:
–  Obtaining the values of all account balances one by one,
following normal object-oriented method logic, in an
imperative way
–  Obtaining them all simultaneously
Now for some results:
  
pgBadger Log
  
Modified Method
  
Under the Hood: Database Calls
SELECT loan_id
  , sum(CASE WHEN debit_account_cd = 'uncollected_principal'
             THEN pt.amount ELSE 0 END
      - CASE WHEN credit_account_cd = 'uncollected_principal'
             THEN pt.amount ELSE 0 END) AS uncollected_principal
<...>
  , sum(CASE WHEN debit_account_cd = 'uncollected_nsf_fees'
             THEN pt.amount ELSE 0 END
      - CASE WHEN credit_account_cd = 'uncollected_nsf_fees'
             THEN pt.amount ELSE 0 END) AS uncollected_nsf_fees
  , sum(CASE WHEN debit_account_cd = 'installment_principal'
             THEN pt.amount ELSE 0 END
      - CASE WHEN credit_account_cd = 'installment_principal'
             THEN pt.amount ELSE 0 END) AS installment_principal
FROM payment_transactions_committed pt
  INNER JOIN loans l ON l.id=pt.loan_id
WHERE customer_id={?}
GROUP BY loan_id
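To make the arithmetic of the sum(CASE …) columns concrete, here is a minimal Ruby sketch of the same debit-minus-credit calculation over a handful of rows. The sample transactions and the 'cash' account code are invented for illustration; only the three account names are taken from the query above.

```ruby
# Hypothetical sample rows shaped like payment transactions for one loan.
TRANSACTIONS = [
  { debit_account_cd: 'uncollected_principal', credit_account_cd: 'cash', amount: 200 },
  { debit_account_cd: 'cash', credit_account_cd: 'uncollected_principal', amount: 50 },
  { debit_account_cd: 'uncollected_nsf_fees', credit_account_cd: 'cash', amount: 30 },
  { debit_account_cd: 'cash', credit_account_cd: 'installment_principal', amount: 10 }
].freeze

ACCOUNTS = %w[uncollected_principal uncollected_nsf_fees installment_principal].freeze

# One pass over the rows yields every account balance at once:
# add the amount when the account is debited, subtract it when credited.
def balances(transactions, accounts)
  totals = accounts.map { |name| [name, 0] }.to_h
  transactions.each do |t|
    totals[t[:debit_account_cd]] += t[:amount] if totals.key?(t[:debit_account_cd])
    totals[t[:credit_account_cd]] -= t[:amount] if totals.key?(t[:credit_account_cd])
  end
  totals
end

result = balances(TRANSACTIONS, ACCOUNTS)
puts result
```

With these sample rows, uncollected_principal comes to 200 - 50 = 150, uncollected_nsf_fees to 30, and installment_principal to -10. All three balances fall out of a single pass, which is the effect the pivoted query achieves in one database call.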
  	
  
	
  
  
Execution Statistics From Dark Testing: Old
  
… and New
  
So… now everybody is happy?
…not really…
What do application developers say at
this point?
Wait! What about the business logic?!
  
What About the Business Logic?
We need some business logic to know how
to execute joins and selects: just enough
to minimize the number of DB calls.
Transformations and manipulations of the
selected results do not have to be executed
on the database side.
  
One More Time
ü Disassemble method into atomic steps,
ü Identify ones which require data retrieval
ü Using knowledge about database objects
relationships, construct a single query
ü Execute
ü Use retrieved data in other steps
  
Let’s review another example – account
balance calculation for OEC loans.
  
Account_Balance Method for OEC
Ø Obtain account principal balance
Ø Obtain outstanding fees and interest as of
next payment due date
Ø Calculate the interest credit (unearned
interest) for the number of days left before
the payment due date
Ø Obtain existing customer balance
Ø Calculate the total account balance using the
values obtained on steps 1-4.
  
Traditional Object-Oriented Approach
Create an Account_Balance method that calls the
following methods:
•  Principal_Balance
•  Interest_Amount
•  Fees_Amount
•  Customer_Balance
•  Interest_Credit
Each of these methods would interact with the database
independently.
  
Drilling Down Into Each Of the Steps
1.  Principal balance as described for account_outstanding requires a
single database call.
2.  Outstanding interest and fees require one database call each.
3.  Interest credit calculation:
3.1. Obtain the daily interest rate for this customer
3.2. Obtain base amount, which is used to calculate the total interest
3.3. Obtain the number of days for which the interest should be
credited:
3.3.1. Obtain the next payment due date
3.3.2. Calculate the number of days based on the obtained date and today’s date
3.4. Calculate amount of credit, based on results from the previous
three steps
4.  Customer balance can be obtained using one database call, same
as steps 1-3.
  
Combining Steps With Data Retrieval
The steps that require data retrieval are 1, 2, 3, 3.1, 3.2, 3.3.1, and 4,
which means that the database access
method should execute the following task:
For a given loan, retrieve the payment
transactions that show principal balance,
current interest, fees and customer balance,
and also retrieve the loan’s daily interest rate and
next payment due date.
  
SELECT statement
SELECT l.id AS loan_id
  ,sum( CASE WHEN debit_account_cd = 'principal'
              AND t.acct_date <= v_current_date
             THEN t.amount ELSE 0 END
      - CASE WHEN credit_account_cd = 'principal'
              AND t.acct_date <= v_current_date
             THEN t.amount ELSE 0 END ) AS amount_payable
  ,sum( CASE WHEN t.debit_account_cd = 'fees_provisional'
             THEN t.amount ELSE 0 END
      - CASE WHEN t.credit_account_cd = 'fees_provisional'
             THEN t.amount ELSE 0 END ) AS fees_provisional
  ,sum( CASE WHEN t.debit_account_cd = 'interest_provisional'
             THEN t.amount ELSE 0 END
      - CASE WHEN t.credit_account_cd = 'interest_provisional'
             THEN t.amount ELSE 0 END ) AS interest_provisional
  ,st.end_date AS next_closing_date
  ,l.daily_rate AS interest_rate
  ,sum( CASE WHEN t.debit_account_cd = 'customer_balance'
             THEN -t.amount ELSE 0 END
      - CASE WHEN t.credit_account_cd = 'customer_balance'
             THEN -t.amount ELSE 0 END ) AS customer_balance
FROM loans l
  LEFT OUTER JOIN payment_transactions_committed t ON l.id = t.loan_id
  LEFT OUTER JOIN statements st ON l.id = st.loan_id
WHERE l.id = {?}
GROUP BY l.id
  ,l.daily_rate
  ,st.end_date
  
Ruby Method: account_balance
def account_balance
  amount_payable = amount_outstanding + amount_charged_off
  amt = amount_payable - customer_balance
  provisional_interest = interest_provisional
  provisional_fees = fees_provisional
  if provisional_interest > 0.0
    amt = amt + provisional_fees +
          [provisional_interest - [unearned_interest, 0].max, 0].max
  else
    amt = amt + provisional_fees
  end
  amt = amt.round_near
  return [0, amt].max
end
  
Ruby Method: unearned_interest
def unearned_interest
  amt = -1 * (principal_amount + [(customer_balance -
    interest_provisional - fees_provisional), 0].max)
  # Guard before parsing: Date.parse raises on nil, so the nil check must come first.
  return 0 unless self.next_closing_date
  next_closing_date = Date.parse(self.next_closing_date)
  interest_period = [next_closing_date - Date.today + 1, 0].max
  interest_rate = oec_daily_rate.to_f
  interest = amt * interest_period * interest_rate
  interest = interest.round_down
  interest < 0.0 ? 0 : interest
end
  
How It Works
This method is used as a callback in a more complex method,
which returns Loan_Summary, where most of the values can be
selected directly from the database.
For compliance reasons, all calculations are executed in the
Ruby model.
We need a “virtual column” as one of the fields of the data
output; the Ruby callback can be inserted into the data set here.
The callback produces the required value, and the whole data set is
passed to the upper-level method for further processing and then
to the web application.
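One way to picture the “virtual column” idea is the sketch below. It is an assumption-laden illustration, not the Enova implementation: the helper `with_virtual_columns`, the row contents, and the callback formula are all hypothetical.

```ruby
# Hypothetical row, as it might come back from the single database call.
row = {
  loan_id: 42,
  principal_amount: 300.0,
  fees_provisional: 15.0,
  customer_balance: 40.0
}

# Hypothetical callback: computes a value from columns already in the row,
# so no further database call is needed. The formula is illustrative only.
account_balance = lambda do |r|
  [r[:principal_amount] + r[:fees_provisional] - r[:customer_balance], 0].max
end

# Attach "virtual columns" by running each callback against the fetched row.
# The original row is left untouched; a copy with the extra fields is returned.
def with_virtual_columns(row, callbacks)
  callbacks.each_with_object(row.dup) do |(name, callback), out|
    out[name] = callback.call(row)
  end
end

summary = with_virtual_columns(row, { account_balance: account_balance })
puts summary
```

The resulting hash carries the database columns plus one computed field, so the upper-level method sees a single data set while the calculation itself stays in Ruby.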
  
Data set fields: LOAN_ID, FUNDING_DATE, FUNDING_AMOUNT, FEE, STATE, APR,
LOAN_STATUS, AMOUNT_DUE, ACCOUNT_BALANCE, PRINCIPAL_AMOUNT,
CUSTOMER_BALANCE, INTEREST_PROVISIONAL, <…>, LANGUAGE_CD
[Diagram label: ACCOUNT_BALANCE METHOD]
Note that all data elements can still
be retrieved with a single select
statement, which would not be possible
within the standard ORM
framework.
On a Larger Scale
In our new application a single Postgres
function retrieves summaries for all
customer loans.
Here are some execution statistics:
Number of Database Calls/Customer
                     1    2    3    4    5    6    7    8     9
Old App avg_calls  121  196  373  449  632  765  819  923  1129
New App avg_calls    4    4    4    4    4    4    4    4     4
Execution Time/Customer
[Chart: New App avg_time vs. Old App avg_time for categories 1-9; vertical axis from 0 to 10]
Now
•  We save both time
•  And money!
Were we the first to notice the problem?
Definitely NOT!
Related Work
The problem of object-relational impedance
mismatch is a constant topic of discussion
when it comes to developing an efficient
application.
In recent years, multiple attempts have been made
to resolve this issue, with no significant
outcome.
  
First Came HORA
The HORA (Hybrid Object-Relational Architecture)
approach was first introduced in 1993 and became
the foundation of multiple ORM-based systems, such
as:
•  Java Persistence
•  ActiveJDBC
•  ADO.NET
•  Ruby on Rails ActiveRecord
•  Many others.
All implementations have the same ORIM issues.
  
Hibernate
Hibernate is a high-performance Object/Relational
persistence and query service, considered one of the
most flexible and powerful Object/Relational
solutions on the market.
•  Mapping from Java classes to database tables
and from Java data types to SQL data types.
•  Significantly reducing development time
•  Allowing the application developers to concentrate
on the business side of the project.
  
Hibernate Developers Claim
•  In contrast to other solutions it does not hide
“the power of SQL” from developers.
•  This claim is true in some sense, since the
solution does indeed allow us to write queries
similar to SQL queries.
  
The Caveat
•  Creating complex queries using Hibernate is
not an easy task
•  Similar to other ORM systems, Hibernate
steers developers toward solutions that seem more natural
to application developers
  
Newer Version of Active Record
•  Allows eager loading and some custom
queries, but has the same limitations as
Hibernate
•  Eager loading may cause excessive
application memory usage
  
Agile Technology
•  acknowledges the existence of ORIM;
•  acknowledges not only the technical impedance
mismatch, but also a cultural impedance
mismatch:
“The object-oriented paradigm is based on proven software
engineering principles. The relational paradigm, however, is
based on proven mathematical principles”.
•  while raising awareness of the problem, the
solutions proposed by Agile data technology
refer mostly to database schema changes
and/or more careful design and data refactoring,
and offer little in terms of dealing with inefficient
queries.
  
What’s next?
Future Work
Rewriting a whole application using our new
methodology of splitting logic between the application and
the database; we are in the middle of the process.
Problems:
–  A large amount of legacy code, where most of the
business logic is embedded in Ruby classes.
–  Virtually no business specifications, which means that we
have to extract the business logic from the existing code.
–  The legacy application is evolving, and existing models are
being modified all the time.
–  The human factor
Because of this…
  
Moving In Two Directions:
ü Continue to rewrite the larger parts of our
applications, shooting for having only a
couple of database queries per screen
rendering;
ü While doing this, clarifying our technology,
making it more transparent and easier to
use by application developers.
  
Acknowledgements
Many thanks to:
•  Enova CIO/CTO Fred Lee
•  Team Raven, and everybody who worked on the Aperture
project during its lifetime:
–  Cheetana Yogeesh
–  Allan Zoppa
–  Ana Lebon
–  Kevin Glowacz
–  Neha Bhardwaj
–  Jef Jonjevic
–  Marc Groulx
–  Richard Lee
–  Preeti Dhiman
–  Luda Baklanova
•  Sheldon Strauch and the entire DB Dev team
•  Chad Slaughter and Kurt Stephens
•  Ben Heilman and Donny Jekels

New Approach to Resolve Object-Relational Impedance Mismatch

  • 1. A New Approach to Resolve Object-Relational Impedance Mismatch Henrietta Dombrovskaya Enova Chicago IL hdombrovskaya@enova.com
  • 2. What is Object-Relational Impedance Mismatch and why is it bad? Why should we care? We should care because we care about... 2013 Henrietta Dombrovskaya – Enova
  • 3. Database Application Performance. Everybody wants their application to function efficiently, right? After all, why choose to use a database in the first place? Because the DBMS is specialized software designed to manage data in the most efficient way. Nevertheless, the most common complaint of application developers is...
  • 4. THE DATABASE IS SLOW. WHY???
  • 5. Let’s see… Where do we usually start looking when we want to find out why database performance is bad?
  • 6. Our plan of action: look at the longest-running queries and try to optimize them. Look at the queries that are executed most often, and make sure all of them are tuned and/or running fast enough. Everything looks good, so why is the app still slow?
  • 7. Helpful reports: our performance monitoring tool (pgBadger) produces several classes of reports, among them: longest-running queries (slowest queries); most frequently running queries; queries that take up the most time (top offenders).
  • 8. Based on these reports, is the database really slow? The top 50 offenders, i.e. the queries that consume the most total execution time, are almost exclusively very short (fast) queries that are executed too many times. For example, SELECT * FROM loans WHERE id=? is executed 8,500,000 times during the day, each time taking several milliseconds, with a total execution time of about 2.5 hours.
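The figures on this slide can be checked with a few lines of arithmetic. The sketch below uses only the two totals quoted above (8,500,000 executions and about 2.5 hours); the variable names are illustrative and not taken from the talk.

```ruby
# Back-of-the-envelope check of the figures quoted on the slide.
executions_per_day = 8_500_000
total_hours        = 2.5 # "about 2.5 hours", as quoted

total_seconds = total_hours * 3600
avg_ms        = total_seconds * 1000 / executions_per_day

# Average cost of a single execution implied by the two totals.
puts format("average per call: %.2f ms", avg_ms)

# Average number of executions per second over a 24-hour day.
per_second = executions_per_day / 86_400.0
puts format("executions per second: %.0f", per_second)
```

The two totals imply an average of roughly one millisecond per call, so no single execution is slow; the cost comes from the number of executions.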
  • 9. You may say that’s how many times the screen is refreshed, but… UK: the account home controller was accessed about 50,000 times during the day, and the customer support tool even less. However, the pgBadger log shows a total of almost 1.5 million executions of different versions of the SELECT sum_accounts statement and a similar number of SELECT * FROM loans. Statistics show that some application controllers produce over 1,000 database calls for each screen refresh.
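A rough ratio makes the gap between screen accesses and statement executions concrete. This sketch assumes, purely for illustration, that all of the quoted executions are attributed to the account home controller; the slide does not say so, and other controllers surely contribute.

```ruby
# Rough ratio of statement executions to controller accesses.
controller_hits      = 50_000    # "about 50,000" accesses per day
statement_executions = 1_500_000 # "almost 1.5 million" executions

ratio = statement_executions.to_f / controller_hits
puts "executions per controller access: #{ratio.round}"
```

Even under that generous assumption, a single statement runs about thirty times per access, which is far more than one query per screen refresh.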
  • 10. How could this possibly be happening?! Let’s take a step back…
  • 11. First came a program…
      •  Once upon a time there was a program…
      •  Until the mid-60s only sequential-access storage was available; a program was considered a stable object that could process different input data and produce different output. (Diagram: Input, Program, Output)
      •  Direct-access storage appeared in the early 70s, allowing concurrent data access and selective updates. At the same time developers started to face the rising complexity and cost of software development, concurrency control and data integrity problems, and the need for specialized data-management software became apparent.
      •  DBMSs emerged as specialized programs for centralized data management. (Diagram: Program, DBMS)
  • 12. Since Then We Have…
      -  Imperative programming languages, which tell how to do things:
           for (i:=1, i++, n) do … end;
      -  Declarative data manipulation languages, which define what to do:
           SELECT first_name, last_name FROM people WHERE id=101
  • 13. Database provides:
      •  Independent data storage
      •  Complex data structures
      •  Data integrity
      •  Concurrent access and data consistency
    Application provides:
      •  User interface
      •  Application control flow
  • 14. So… Now everybody is happy, right? …not really…
  • 15. Why? Because… Both imperative programming languages and declarative query languages work perfectly to accomplish the tasks they were designed to accomplish. The problems start when we try to make them work together.
  • 16. What is Impedance Mismatch? These tools were designed to operate with objects of different size (granularity), variables vs. tables, and with different means of identification, address vs. set of attribute values. More precisely:
      •  Object incompatibility: database objects are “too big” for the imperative programming language
      •  This means that classes and methods for tables and sets should be integrated with the database to work effectively (methods should be executed by the DB engine)
      •  But most architectures do not allow such integration, which leads to reimplementation of database operations at the application layer
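The difference in identification (address vs. set of attribute values) can be sketched in plain Ruby. The Person struct and the sample names below are made up for illustration and do not come from the talk.

```ruby
# Hypothetical record type; the attribute names are illustrative only.
Person = Struct.new(:id, :first_name, :last_name)

a = Person.new(101, "Ann", "Lee")
b = Person.new(101, "Ann", "Lee")

# The relational view: two rows with identical attribute values are the same fact.
same_values = (a == b)

# The object view: identity is the object's location in memory.
same_object = a.equal?(b)

# Set-oriented access selects by attribute value, not by reference.
people  = [a, Person.new(102, "Bob", "Ray")]
matches = people.select { |person| person.id == 101 }

puts "same values: #{same_values}, same object: #{same_object}, matches: #{matches.size}"
```

Two objects holding the same attribute values are equal as data but distinct as objects, which is one reason a row-to-object mapping needs extra bookkeeping.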
  • 17. When We Have an OO Application…
      •  During one screen rendering, objects may belong to different classes with different access methods
      •  Object-Relational Mapping (ORM) maps a database object to the in-memory application object:
          –  solves the problem of abstracting away the details of data storage
          –  does not provide effective means of manipulating data sets
    This particular case of impedance mismatch is called ORIM, object-relational impedance mismatch.
  • 18. ORIM Definition: “A set of conceptual and technical difficulties that is encountered when a relational database management system (RDBMS) is being used by a program written in an object-oriented programming language or style, particularly apparent when objects or class definitions are mapped in a straightforward way to database tables or relational schema” (Ambler, S., Agile Database Techniques: Effective Strategies for the Agile Software Developer)
  • 19. Development Standards & Performance: internal company standards and/or industry standards may prevent developers from creating an efficient database application. Typical OO design recommendations clearly separate the following layers of classes:
      •  End-user interface
      •  Business logic
      •  Relationships
      •  Interaction with the database
    The standard allows limited interaction between adjacent levels; therefore equivalents of database operations may be implemented within several different processes, sometimes on different machines.
  • 20. Translation:
      •  As a database developer you would assume that, to get all payments for all loans for a given customer, you need to join the loans table with the payments table
      •  But for an application developer loans and payments belong to different classes, and therefore they are accessed by different sets of methods
      •  Which means that the join operation is replicated “manually”…
    Let’s look at some examples…
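A minimal sketch of what replicating the join "manually" costs in round trips. FakeDb is a hypothetical in-memory stand-in, not a real library; its method names and the sample rows are assumptions made for illustration, and each public method counts as one trip to the database.

```ruby
# Hypothetical in-memory stand-in for a database; every query method
# counts as one round trip.
class FakeDb
  attr_reader :round_trips

  def initialize(loans, payments)
    @loans = loans
    @payments = payments
    @round_trips = 0
  end

  # SELECT * FROM loans WHERE customer_id = ?
  def loans_for_customer(customer_id)
    @round_trips += 1
    @loans.select { |loan| loan[:customer_id] == customer_id }
  end

  # SELECT * FROM payments WHERE loan_id = ?
  def payments_for_loan(loan_id)
    @round_trips += 1
    @payments.select { |payment| payment[:loan_id] == loan_id }
  end

  # SELECT payments.* FROM loans JOIN payments ON payments.loan_id = loans.id
  #  WHERE loans.customer_id = ?
  def payments_for_customer(customer_id)
    @round_trips += 1
    loan_ids = @loans.select { |loan| loan[:customer_id] == customer_id }
                     .map { |loan| loan[:id] }
    @payments.select { |payment| loan_ids.include?(payment[:loan_id]) }
  end
end

loans = [
  { id: 1, customer_id: 7 },
  { id: 2, customer_id: 7 },
  { id: 3, customer_id: 8 }
]
payments = [
  { id: 10, loan_id: 1, amount: 50 },
  { id: 11, loan_id: 1, amount: 25 },
  { id: 12, loan_id: 2, amount: 75 },
  { id: 13, loan_id: 3, amount: 40 }
]

# Application-side "join": one query for the loans, then one query per loan.
manual_db = FakeDb.new(loans, payments)
manual = manual_db.loans_for_customer(7).flat_map do |loan|
  manual_db.payments_for_loan(loan[:id])
end

# Database-side join: a single round trip returns the same rows.
join_db = FakeDb.new(loans, payments)
joined  = join_db.payments_for_customer(7)

puts "manual: #{manual_db.round_trips} round trips, join: #{join_db.round_trips}"
```

With N loans the application-side version needs N + 1 round trips, while the join needs one regardless of N.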
  • 21. What Do We Have at Enova? Enova is a Ruby on Rails shop that uses the ActiveRecord Object-Relational Mapping (ORM) library to communicate with a Postgres database. With ActiveRecord, a database table or view is mapped to a class, and an object instance is tied to a single row in the table. ActiveRecord is the default “model” component of the model-view-controller web-application framework Ruby on Rails, and is also a stand-alone ORM package for other Ruby applications.
  • 22. How ActiveRecord Works
  • 23. What This Means for Application/Database Interaction: because the object methods are unaware of the underlying database interaction, one controller performs multiple trips to the database. For example…
  • 24. Account Presenter
      def initialize(customer)
        @customer = customer
        @customer_extra = customer.customer_extra
        @person = customer.person
        @address = customer.person.try(:address)
        @company = customer.person.try(:company)
        @bank_account = customer.bank_account(true)
        @debit_card = customer.debit_card
        @customer_paydate = customer.customer_paydate(true)
        @paydate_schedule = customer.customer_paydate.try(:paydate_schedule)
        @customer_source = customer.customer_source
      end
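Why an initializer like this turns into many queries can be sketched with a toy lazy-loading object. LazyRecord below is a hypothetical class written for this illustration; it models lazy association loading in general and is not ActiveRecord's implementation. Each association is fetched with its own simulated query on first access, and a reload flag bypasses the per-object cache.

```ruby
# Hypothetical stand-in for an ORM object with lazily loaded associations.
class LazyRecord
  def initialize(associations, log)
    @associations = associations # association name => associated object
    @log = log                   # shared list of simulated queries
    @loaded = {}
  end

  # Reads an association; reload = true bypasses the per-object cache.
  def fetch(association, reload = false)
    if reload || !@loaded.key?(association)
      @log << "SELECT ... FROM #{association}" # one simulated round trip
      @loaded[association] = @associations[association]
    end
    @loaded[association]
  end
end

log = []
schedule = LazyRecord.new({}, log)
paydate  = LazyRecord.new({ paydate_schedule: schedule }, log)
person   = LazyRecord.new({ address: :address_row, company: :company_row }, log)
customer = LazyRecord.new(
  { customer_extra: :extra_row, person: person, customer_paydate: paydate }, log
)

# Mirrors part of the presenter's initializer, one line per assignment.
customer.fetch(:customer_extra)
customer.fetch(:person)
customer.fetch(:person).fetch(:address)
customer.fetch(:person).fetch(:company)
customer.fetch(:customer_paydate, true)
customer.fetch(:customer_paydate).fetch(:paydate_schedule)

puts "#{log.size} simulated queries for 6 assignments"
```

Every line reads like a cheap attribute access, yet each first access to an association is a separate trip to the database.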
  • 25. Corresponding Application Log
      SELECT * FROM customers
        WHERE (customers.id = 12470535)

      SELECT * FROM people
        WHERE (people.id = 61657007 AND (type = 'CustomerPerson'))
          AND ( (people.type = 'CustomerPerson' ) );

      SELECT addresses.*, people_addresses.serial_number
        FROM addresses
        INNER JOIN people_addresses ON addresses.id = people_addresses.address_id
        WHERE (people_addresses.person_id = 61657007
          AND (eff_end_date is NULL));
  • 26. But Wait, There’s More!
      SELECT * FROM approvals WHERE (customer_id = 12470535)
        ORDER BY processed_on desc LIMIT 1
      SELECT * FROM customers WHERE (customers.id = 12470535)
      SELECT * FROM loans WHERE (loans.id = 25563928)
      SELECT * FROM loans WHERE (customer_id = 12470535 and status_cd in ('applied','approved','on_hold'))
        ORDER BY funding_date DESC LIMIT 1
      SELECT * FROM loans WHERE (customer_id = 12470535 and status_cd in ('applied','approved',E'on_hold'))
        ORDER BY funding_date DESC LIMIT 1
      SELECT * FROM loans WHERE (customer_id = 12470535 and status_cd in ('issued','issued_pmt_proc'))
        ORDER BY funding_date DESC LIMIT 1
      SELECT * FROM loans WHERE (customer_id = 12470535 and status_cd in ('applied','approved',E'on_hold'))
        ORDER BY funding_date DESC LIMIT 1
      SELECT * FROM customers WHERE (customers.person_id = 61657007)
      SELECT count(*) AS count_all FROM loans WHERE (loans.customer_id = 12470535
        AND (status_cd in ('applied','approved','on_hold','issued','issued_pmt_proc')
        and loan_type_cd = 'installment'))
      SELECT * FROM loans WHERE (customer_id = 12470535 and status_cd in ('applied','approved','on_hold'))
        ORDER BY funding_date DESC LIMIT 1
      SELECT * FROM loans WHERE (customer_id = 12470535 and status_cd in ('applied','approved','on_hold'))
        ORDER BY funding_date DESC LIMIT 1
  • 27. What We Should See Instead…
    SELECT * FROM customers WHERE (customers."id" = 12470535)
    SELECT * FROM people WHERE (people."id" = 61657007 AND (type = 'CustomerPerson')) AND ((people."type" = 'CustomerPerson'));
    SELECT addresses.*, people_addresses.serial_number
      FROM addresses
      INNER JOIN people_addresses ON addresses.id = people_addresses.address_id
     WHERE (people_addresses.person_id = 61657007 AND (eff_end_date is NULL));
  • 28. Why Does This Happen? … because the app is written in an imperative way: “First we need to select all loans, then for each loan we need to select…”
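To make the cost of the imperative style concrete, here is a minimal sketch that counts database calls for both styles. `FakeDb` and its methods are hypothetical stand-ins invented for this illustration (an in-memory array instead of a real database); they are not part of the application described in this talk.

```ruby
# Hypothetical in-memory stand-in for a database; it only counts calls.
class FakeDb
  attr_reader :calls

  def initialize(loans)
    @loans = loans # array of hashes: { id:, customer_id: }
    @calls = 0
  end

  # One call per lookup, like: SELECT * FROM loans WHERE id = ?
  def loan_by_id(id)
    @calls += 1
    @loans.find { |loan| loan[:id] == id }
  end

  # One call returning ids only, like: SELECT id FROM loans WHERE customer_id = ?
  def loan_ids_for(customer_id)
    @calls += 1
    @loans.select { |loan| loan[:customer_id] == customer_id }.map { |loan| loan[:id] }
  end

  # One call for the whole set, like: SELECT * FROM loans WHERE customer_id = ?
  def loans_for(customer_id)
    @calls += 1
    @loans.select { |loan| loan[:customer_id] == customer_id }
  end
end

# Imperative style: "first select all loans, then for each loan select ..."
def imperative_fetch(db, customer_id)
  db.loan_ids_for(customer_id).map { |id| db.loan_by_id(id) }
end

# Set-based style: ask for everything in one call.
def set_based_fetch(db, customer_id)
  db.loans_for(customer_id)
end
```

With N loans, the imperative version issues N + 1 calls and the set-based version issues one, while both return the same rows.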
  • 29. There Are Some Ways To Improve
    Eager loading generates almost correct SQL:
    SELECT customers."id", customers."created_by" <..>, customers."person_id" <..>, customers_extra."estimated_monthly_living_expense" <..>, people."type" <...>, addresses."id" <...>, companies."id" <...>
      FROM customers
      LEFT OUTER JOIN customers_extra ON customers_extra.customer_id = customers.id
      LEFT OUTER JOIN people ON people.id = customers.person_id AND people."type" = 'CustomerPerson'
      LEFT OUTER JOIN people_addresses ON people_addresses.person_id = people.id
      LEFT OUTER JOIN addresses ON addresses.id = <…>
     WHERE (customers."id" = 17674188)
  • 30. But Then We Apply the Next Method
    has_many :bank_accounts, :foreign_key => 'person_id' do
      def default_for_customer(customer, reload = false)
        # Cache is per association proxy, keyed by brand code
        @bank_account_cache = {} if @bank_account_cache.nil? || reload
        @bank_account_cache[customer.brand.code] ||= find(:first,
          :joins => "INNER JOIN payment_instruments.defaults pid ON pid.payment_instrument_id = payment_instruments.bank_accounts.payment_instrument_id",
          :conditions => ["pid.customer_id = ?", customer],
          :readonly => false)
      end
    end
    And we execute SELECT again!
  • 31. If We Continue With Existing Frameworks … this problem (ORIM) will never be solved, and we will continue to lose money on timeouts!
  • 32. New Approach to Resolve Object-Relational Impedance Mismatch Our new approach drastically reduces the number of calls to the database and retrieves exactly the data we need. An application view can be rendered using one or two calls to the database instead of the 500-900 calls in the old app. In addition, it allows query execution to be optimized without any changes to the application and/or models.
  • 33. The old methods are unaware of the data which is already available – make them data-aware! Isn’t this contrary to the standard OO approach? Yes, but… This is the only way to improve the App/DB interaction.
  • 34. Logic Split Methodology
    • Disassemble the method into atomic steps
    • Identify the ones which require data retrieval
    • Using the knowledge about database object relationships, construct a single query
    • Execute
    • Use the retrieved data in the other steps
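The methodology can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions: `fetch_loan_row`, `total_due`, `overdue?`, `loan_summary` and the row keys are hypothetical names, and `query` stands for any callable that runs the single combined query and returns one row as a Hash.

```ruby
require 'date'

# "Execute": the single place that touches the database. `query` is any
# callable that runs the one combined query and returns a row as a Hash.
def fetch_loan_row(query, loan_id)
  query.call(loan_id)
end

# "Use the retrieved data in the other steps": pure functions of the row,
# with no further database access.
def total_due(row)
  row[:principal] + row[:fees] + row[:interest]
end

def overdue?(row, today)
  row[:due_date] < today
end

# The original method, reassembled: one retrieval step, then calculations.
def loan_summary(query, loan_id, today)
  row = fetch_loan_row(query, loan_id)
  { loan_id: loan_id, total_due: total_due(row), overdue: overdue?(row, today) }
end
```

Because every step after the retrieval works only on the row it is given, the number of database calls stays at one no matter how many derived values are added later.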
  • 36. Definitions
    AccountsOutstanding = AccountsUncollected + FeesOutstanding + InterestOutstanding + PrincipalAccounts + AccountsDue
    In turn:
    • AccountsUncollected = uncollected_principal + uncollected_installment_principal
  • 37. Under the Hood: Database Calls
    SELECT vl.value AS account
          ,SUM(CASE vl.value WHEN pt.debit_account_cd THEN pt.amount ELSE 0 END)
           - SUM(CASE vl.value WHEN pt.credit_account_cd THEN pt.amount ELSE 0 END) AS sum
      FROM payment_transactions pt
      JOIN valuelists vl ON vl.type_cd = 'transaction_account'
       AND vl.value IN (pt.debit_account_cd, pt.credit_account_cd)
       AND loan_id = ?
    …
    … and then the value for a specific account is selected.
  • 38. Drawbacks The method itself would allow retrieving all the information related to one loan “in one shot”. However, because the application developers are unaware of the underlying layers, to them there appears to be no difference between:
    – obtaining the values of all account balances one by one, following normal object-oriented method logic in an imperative way
    – obtaining them all simultaneously
    Now for some results:
  • 39. pgBadger Log
  • 40. Modified Method
  • 41. Under the Hood: Database Calls
    SELECT loan_id
          ,sum(CASE WHEN debit_account_cd = 'uncollected_principal' THEN pt.amount ELSE 0 END
             - CASE WHEN credit_account_cd = 'uncollected_principal' THEN pt.amount ELSE 0 END) AS uncollected_principal
          <...>
          ,sum(CASE WHEN debit_account_cd = 'uncollected_nsf_fees' THEN pt.amount ELSE 0 END
             - CASE WHEN credit_account_cd = 'uncollected_nsf_fees' THEN pt.amount ELSE 0 END) AS uncollected_nsf_fees
          ,sum(CASE WHEN debit_account_cd = 'installment_principal' THEN pt.amount ELSE 0 END
             - CASE WHEN credit_account_cd = 'installment_principal' THEN pt.amount ELSE 0 END) AS installment_principal
      FROM payment_transactions_committed pt
      INNER JOIN loans l ON l.id = pt.loan_id
     WHERE customer_id = {?}
     GROUP BY loan_id
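The query above computes every account balance in a single pass: for each account, amounts debited to it are added and amounts credited to it are subtracted, grouped by loan. The same logic can be sketched in plain Ruby over an in-memory list of transactions. The method name `balances_by_loan` and the `'cash'` account in the usage example are assumptions made for this illustration; amounts are integers to keep the arithmetic exact.

```ruby
# For each loan and each requested account, compute
#   sum(amount where debit_account_cd == account)
# - sum(amount where credit_account_cd == account)
# in one pass over the transactions (the Ruby analogue of the
# sum(CASE ... END - CASE ... END) columns with GROUP BY loan_id).
def balances_by_loan(transactions, accounts)
  result = Hash.new { |hash, loan_id| hash[loan_id] = accounts.to_h { |a| [a, 0] } }
  transactions.each do |t|
    row = result[t[:loan_id]]
    row[t[:debit_account_cd]]  += t[:amount] if row.key?(t[:debit_account_cd])
    row[t[:credit_account_cd]] -= t[:amount] if row.key?(t[:credit_account_cd])
  end
  result
end

# Usage: two loans, three tracked accounts.
transactions = [
  { loan_id: 1, debit_account_cd: 'uncollected_principal', credit_account_cd: 'cash', amount: 100 },
  { loan_id: 1, debit_account_cd: 'cash', credit_account_cd: 'uncollected_principal', amount: 30 },
  { loan_id: 2, debit_account_cd: 'uncollected_nsf_fees', credit_account_cd: 'cash', amount: 15 }
]
accounts = %w[uncollected_principal uncollected_nsf_fees installment_principal]
balances = balances_by_loan(transactions, accounts)
```

Accounts that are not in the requested list (here `'cash'`) are simply ignored, just as the SQL only emits the columns it names.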
  • 42. Execution Statistics From Dark Testing: Old
  • 43. … and New
  • 44. So… is everybody happy now? …Not really… What do application developers say at this point? Wait! What about the business logic?!
  • 45. What About the Business Logic? We need some business logic to know how to execute joins and selects: just enough to minimize the number of database calls. Transformations and manipulations of the selected results do not have to be executed on the database side.
  • 46. One More Time
    ✓ Disassemble the method into atomic steps
    ✓ Identify the ones which require data retrieval
    ✓ Using knowledge about database object relationships, construct a single query
    ✓ Execute
    ✓ Use the retrieved data in the other steps
  • 47. Let’s review another example – account balance calculation for OEC loans.
  • 48. Account_Balance Method for OEC
    1. Obtain the account principal balance
    2. Obtain outstanding fees and interest as of the next payment due date
    3. Calculate the interest credit (unearned interest) for the number of days left before the payment due date
    4. Obtain the existing customer balance
    5. Calculate the total account balance using the values obtained in steps 1-4
  • 49. Traditional Object-Oriented Approach
    Create an Account_Balance method which would call the following methods:
    • Principal_Balance
    • Interest_Amount
    • Fees_Amount
    • Customer_Balance
    • Interest_Credit
    Each of these methods would interact with the database independently.
  • 50. Drilling Down Into Each Of the Steps
    1. Principal balance, as described for account_outstanding, requires a single database call.
    2. Outstanding interest and fees require one database call each.
    3. Interest credit calculation:
       3.1. Obtain the daily interest rate for this customer
       3.2. Obtain the base amount, which is used to calculate the total interest
       3.3. Obtain the number of days for which the interest should be credited:
            3.3.1. Obtain the next payment due date
            3.3.2. Calculate the number of days based on the obtained date and today’s date
       3.4. Calculate the amount of credit, based on the results of the previous three steps
    4. Customer balance can be obtained using one database call, same as steps 1-3.
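Step 3 is the only one with real arithmetic, so a small worked sketch may help. It assumes the credit is simply base amount × daily rate × number of days, and it counts days inclusively (due date minus today plus one, never below zero), the same convention the `unearned_interest` method on a later slide uses. The method names `days_to_credit` and `interest_credit` are hypothetical, and the sample values use `Rational` only to keep the arithmetic exact.

```ruby
require 'date'

# Step 3.3.2: number of days for which interest is credited, counted
# inclusively and never negative.
def days_to_credit(next_due_date, today)
  [(next_due_date - today).to_i + 1, 0].max
end

# Step 3.4: credit = base amount * daily rate * number of days.
# daily_rate (3.1), base_amount (3.2) and next_due_date (3.3.1) are the
# values that have to come from the database.
def interest_credit(daily_rate, base_amount, next_due_date, today)
  base_amount * daily_rate * days_to_credit(next_due_date, today)
end

# Usage: a 0.1% daily rate on a base of 1000, ten days inclusive.
credit = interest_credit(Rational(1, 1000), 1000, Date.new(2013, 1, 10), Date.new(2013, 1, 1))
```

Only steps 3.1, 3.2 and 3.3.1 need data; steps 3.3.2 and 3.4 are pure calculations and need no database call of their own.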
  • 51. Combining Steps With Data Retrieval The steps that require data retrieval are 1, 2, 3, 3.1, 3.2, 3.3.1 and 4, which means that the database access method should perform the following task: for a given loan, retrieve the payment transactions which show principal balance, current interest, fees and customer balance, and also retrieve the loan’s daily interest rate and next payment due date.
  • 52. SELECT statement
    SELECT l.id AS loan_id
          ,sum(CASE WHEN debit_account_cd = 'principal' AND t.acct_date <= v_current_date THEN t.amount ELSE 0 END
             - CASE WHEN credit_account_cd = 'principal' AND t.acct_date <= v_current_date THEN t.amount ELSE 0 END) AS amount_payable
          ,sum(CASE WHEN t.debit_account_cd = 'fees_provisional' THEN t.amount ELSE 0 END
             - CASE WHEN t.credit_account_cd = 'fees_provisional' THEN t.amount ELSE 0 END) AS fees_provisional
          ,sum(CASE WHEN t.debit_account_cd = 'interest_provisional' THEN t.amount ELSE 0 END
             - CASE WHEN t.credit_account_cd = 'interest_provisional' THEN t.amount ELSE 0 END) AS interest_provisional
          ,st.end_date AS next_closing_date
          ,l.daily_rate AS interest_rate
          ,sum(CASE WHEN t.debit_account_cd = 'customer_balance' THEN -t.amount ELSE 0 END
             - CASE WHEN t.credit_account_cd = 'customer_balance' THEN -t.amount ELSE 0 END) AS customer_balance
      FROM loans l
      LEFT OUTER JOIN payment_transactions_committed t ON l.id = t.loan_id
      LEFT OUTER JOIN statements st ON l.id = st.loan_id
     WHERE l.id = {?}
     GROUP BY l.id, l.daily_rate, st.end_date
  • 53. Ruby Method: account_balance
    def account_balance
      amount_payable = amount_outstanding + amount_charged_off
      amt = amount_payable - customer_balance
      provisional_interest = interest_provisional
      provisional_fees = fees_provisional
      if provisional_interest > 0.0
        amt = amt + provisional_fees + [provisional_interest - [unearned_interest, 0].max, 0].max
      else
        amt = amt + provisional_fees
      end
      amt = amt.round_near
      return [0, amt].max
    end
  • 54. Ruby Method: unearned_interest
    def unearned_interest
      amt = -1 * (principal_amount + [(customer_balance - interest_provisional - fees_provisional), 0].max)
      # Guard before parsing: Date.parse raises on nil, so check first
      return 0 unless self.next_closing_date
      next_closing_date = Date.parse(self.next_closing_date)
      # Days left until the closing date, counted inclusively, never negative
      interest_period = [next_closing_date - Date.today + 1, 0].max
      interest_rate = oec_daily_rate.to_f
      interest = amt * interest_period * interest_rate
      interest = interest.round_down
      interest < 0.0 ? 0 : interest
    end
  • 55. How It Works This method is used as a callback in a more complex method, which returns Loan_Summary, where most of the values can be selected directly from the database. For compliance reasons all calculations are executed in the Ruby model. We need a “virtual column” as one of the fields of the data output; the Ruby callback can be inserted into the data set here. The callback produces the required value, and the whole data set is passed to the upper-level method for further processing and then to the web application.
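The “virtual column” idea can be sketched as follows: the rows come back from one query, and a Ruby callback adds a computed field to each row before the data set is handed on. This is an illustration only. `with_virtual_column` and `SIMPLE_BALANCE` are hypothetical names, and the callback is a deliberately simplified formula, not the actual account_balance calculation.

```ruby
# Adds a computed ("virtual") column to every row of a result set.
# rows:     array of Hashes as returned by the single database call
# name:     key of the new column
# callback: callable that receives a row and returns the computed value
def with_virtual_column(rows, name, callback)
  rows.map { |row| row.merge(name => callback.call(row)) }
end

# Hypothetical, simplified callback (NOT the real account_balance formula):
# principal plus provisional fees and interest, minus the customer balance,
# never below zero.
SIMPLE_BALANCE = lambda do |row|
  total = row[:principal_amount] + row[:fees_provisional] +
          row[:interest_provisional] - row[:customer_balance]
  [total, 0].max
end

# Usage: rows as they might come back from the database.
rows = [
  { loan_id: 1, principal_amount: 500, fees_provisional: 20,
    interest_provisional: 30, customer_balance: 50 }
]
summary = with_virtual_column(rows, :account_balance, SIMPLE_BALANCE)
```

The calculation stays in Ruby, as the compliance requirement demands, while the data retrieval stays in a single database call.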
  • 56. Diagram labels: LOAN_ID, FUNDING_DATE, FUNDING_AMOUNT, FEE, STATE, APR, LOAN_STATUS, AMOUNT_DUE, ACCOUNT_BALANCE, PRINCIPAL_AMOUNT, CUSTOMER_BALANCE, INTEREST_PROVISIONAL, <…>, LANGUAGE_CD; callback label: ACCOUNT_BALANCE METHOD. Note that all data elements can still be retrieved with a single SELECT statement, which would not be possible within the standard ORM framework.
  • 57. On a Larger Scale In our new application a single Postgres function retrieves summaries for all customer loans. Here are some execution statistics:
  • 58. Number of Database Calls/Customer
    x-axis category   Old App avg_calls   New App avg_calls
    1                 121                 4
    2                 196                 4
    3                 373                 4
    4                 449                 4
    5                 632                 4
    6                 765                 4
    7                 819                 4
    8                 923                 4
    9                 1129                4
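To put the chart numbers in perspective, the short sketch below divides the old average call count by the new one for each of the nine chart categories. The constant and method names are introduced only for this illustration; the figures are the ones shown on the slide.

```ruby
# Average database calls per customer, taken from the chart on this slide.
OLD_APP_AVG_CALLS = [121, 196, 373, 449, 632, 765, 819, 923, 1129].freeze
NEW_APP_AVG_CALLS = ([4] * 9).freeze

# Whole-number reduction factor (old / new, rounded down) per category.
def call_reduction_factors(old_calls, new_calls)
  old_calls.zip(new_calls).map { |old, new| old / new }
end

factors = call_reduction_factors(OLD_APP_AVG_CALLS, NEW_APP_AVG_CALLS)
puts factors.inspect
# prints [30, 49, 93, 112, 158, 191, 204, 230, 282]
```

In other words, the new app issues between roughly 30 and 280 times fewer calls, and the gap widens because the old count grows across categories while the new one stays flat at 4.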
  • 59. Execution Time/Customer [Chart: New App avg_time vs. Old App avg_time; x-axis categories 1-9, y-axis 0-10]
  • 60. Now • We save both time • And money!
  • 61. Were we the first to notice the problem? Definitely NOT!
  • 62. Related Work The problem of object-relational impedance mismatch is a constant discussion topic, when it comes to developing an efficient application. In recent years multiple attempts were made to try to resolve this issue with no significant outcome.
  • 63. First Came HORA
    The HORA (Hybrid Object-Relational Architecture) approach was first introduced in 1993 and became a foundation of multiple ORM-based systems, such as:
    • Java Persistence
    • ActiveJDBC
    • ADO.NET
    • Ruby on Rails ActiveRecord
    • Many others.
    All implementations have the same ORIM issues.
  • 64. Hibernate
    Hibernate is a high-performance Object/Relational persistence and query service, considered one of the most flexible and powerful Object/Relational solutions on the market.
    • Mapping from Java classes to database tables and from Java data types to SQL data types
    • Significantly reducing development time
    • Allowing the application developers to concentrate on the business side of the project
  • 65. Hibernate Developers Claim
    • In contrast to other solutions, it does not hide “the power of SQL” from developers.
    • This claim is true in some sense, since the solution indeed allows us to write queries similar to SQL queries.
  • 66. The Caveat
    • Creating complex queries using Hibernate is not an easy task
    • Similar to other ORM systems, Hibernate steers developers toward solutions which seem more natural to application developers
  • 67. Newer Version of Active Record
    • Allows eager loading and some custom queries, but has the same limitations as Hibernate
    • Eager loading may cause excessive application memory usage
  • 68. Agile Technology
    • Acknowledges the existence of ORIM.
    • Acknowledges not only the technical impedance mismatch, but also a cultural impedance mismatch: “The object-oriented paradigm is based on proven software engineering principles. The relational paradigm, however, is based on proven mathematical principles”.
    • While it raises awareness of the problem, the solutions proposed by Agile data technology refer mostly to database schema changes and/or more careful design and data refactoring; it offers little in terms of dealing with inefficient queries.
  • 69. What’s next?
  • 70. Future Work
    Rewriting a whole application using our new methodology of splitting logic between an application and a database; we are in the middle of the process. Problems:
    – A large amount of legacy code, where most of the business logic is embedded in Ruby classes.
    – Virtually no business specifications, which means that we have to extract the business logic from the existing code.
    – The legacy application is evolving and existing models are being modified all the time.
    – The human factor
    Because of this…
  • 71. Moving In Two Directions:
    ✓ Continue to rewrite the larger parts of our applications, aiming for only a couple of database queries per screen rendering;
    ✓ While doing this, clarify our technology, making it more transparent and easier for application developers to use.
  • 72. Acknowledgements
    Many thanks to:
    • Enova CIO/CTO Fred Lee
    • Team Raven, and everybody who worked on the Aperture project during its lifetime:
      – Cheetana Yogeesh
      – Allan Zoppa
      – Ana Lebon
      – Kevin Glowacz
      – Neha Bhardwaj
      – Jef Jonjevic
      – Marc Groulx
      – Richard Lee
      – Preeti Dhiman
      – Luda Baklanova
    • Sheldon Strauch and the entire DB Dev team
    • Chad Slaughter and Kurt Stephens
    • Ben Heilman and Donny Jekels