Accelerating OpenERP accounting:
   Precalculated period sums



                      Borja López Soilán
                      http://www.kami.es
Index
Current approach (sum of entries)
●   Current approach explained.
●   Performance analysis.
Proposal: “Precalculated period sums”
●   Alternative 1: Accumulated values using triggers
    –   Proposed by Ferdinand Gassauer (Chricar)
●   Alternative 2: Period totals using the ORM
    –   Proposed by Borja L.S. (@NeoPolus)
●   Current approach vs Precalculated period sums
Current approach: Sum of entries

Currently, each time you read the
credit/debit/balance of one account, OpenERP has
to recalculate it from the account entries
(move lines).

The magic is done by the “_query_get()” method
of account.move.line, which selects the lines to
consider, and the “__compute()” method of
account.account, which does the sums.
Inside the current approach
_query_get() filters: builds the “WHERE” part
of the SQL query that selects all the account
move lines involving a set of accounts.
●   Allows complex filters, but they usually look like
    “include non-draft entries from these periods for these
    accounts”.
__compute() sums: uses the filter to query for
the sums of debit/credit/balance for the current
account and its children.
●   Does just one SQL query for all the accounts. (nice!)
●   Has to aggregate the children values in Python.
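The aggregation step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual account.account code: the function name aggregate_with_children and the two input dictionaries (per-account sums as the grouped query would return them, and a parent-to-children map of the chart) are hypothetical.

```python
def aggregate_with_children(own_sums, children):
    """Return {account_id: (debit, credit, balance)} including descendants.

    own_sums: {account_id: (debit, credit)}, one row per account as a
              GROUP BY query would return it (hypothetical input shape).
    children: {account_id: [child_id, ...]} describing the chart of accounts.
    """
    totals = {}

    def visit(account_id):
        # Each account is computed once and then reused by its ancestors.
        if account_id in totals:
            return totals[account_id]
        debit, credit = own_sums.get(account_id, (0, 0))
        for child_id in children.get(account_id, []):
            child_debit, child_credit, _ = visit(child_id)
            debit += child_debit
            credit += child_credit
        totals[account_id] = (debit, credit, debit - credit)
        return totals[account_id]

    for account_id in set(own_sums) | set(children):
        visit(account_id)
    return totals


# Hypothetical mini chart: 4 > 43 > 430 > 4300 > 430000
chart = {4: [43], 43: [430], 430: [4300], 4300: [430000]}
sums = {430000: (100, 40), 43: (10, 0)}
result = aggregate_with_children(sums, chart)
```

Even with the SQL part reduced to a single query, the Python side still has to walk every descendant of the requested account, which is where the dependence on the number of children comes from.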
Sample query done by __compute
SELECT l.account_id AS id,
       COALESCE(SUM(l.debit), 0) AS debit,
       COALESCE(SUM(l.credit), 0) AS credit,
       COALESCE(SUM(l.debit), 0) - COALESCE(SUM(l.credit), 0) AS balance
FROM account_move_line l
WHERE l.account_id IN (2, 3, 4, 5, 6, ..., 1648, 1649, 1650, 1651)  -- account + children = lots of ids!
  AND l.state <> 'draft'
  AND l.period_id IN (SELECT id FROM account_period WHERE fiscalyear_id IN (1))
  AND l.move_id IN (SELECT id FROM account_move WHERE account_move.state = 'posted')
GROUP BY l.account_id
Sample query plan
                                 QUERY PLAN
---------------------------------------------------------------------------
 HashAggregate  (cost=57.83..57.85 rows=1 width=18)
   ->  Nested Loop Semi Join  (cost=45.00..57.82 rows=1 width=18)
         Join Filter: (l.period_id = account_period.id)
         ->  Nested Loop  (cost=45.00..57.52 rows=1 width=22)
               ->  HashAggregate  (cost=45.00..45.01 rows=1 width=4)
                     ->  Seq Scan on account_move  (cost=0.00..45.00 rows=1 width=4)
                           Filter: ((state)::text = 'posted'::text)
               ->  Index Scan using account_move_line_move_id_index on account_move_line l  (cost=0.00..12.49 rows=1 width=26)
                     Index Cond: (l.move_id = account_move.id)
                     Filter: (((l.state)::text <> 'draft'::text) AND (l.account_id = ANY ('{2,3,4,5, ...,1649,1650,1651}'::integer[])))
         ->  Index Scan using account_period_fiscalyear_id_index on account_period  (cost=0.00..0.29 rows=1 width=4)
               Index Cond: (account_period.fiscalyear_id = 1)

Ugh! Sequential scan on a table with (potentially) lots of records... :(
Performance Analysis
    Current approach big O 1/2
“Selects all the account move lines”
The query complexity depends on l, the
number of move lines for that account and
(recursive) children:
   O(query) = O(f(l))
“Has to aggregate the children values”
The complexity depends on c, the number of
children.
  O(aggregate) = O(g(c))
Current approach big O 2/2
O(__compute) = O(query) + O(aggregate)


O(__compute) = O(f(l)) + O(g(c))


What kind of functions are f and g?

Let's do some empirical testing (more fun than
maths, isn't it?)...
Let's test this chart... 1/2
The official Spanish
chart of accounts, when
empty:
  Has about 1600
  accounts.
  Has 5 levels.


(to test this chart of
accounts install the
l10n_es module)
Let's test this chart... 2/2
How many accounts below each level?

Account code                      Number of children (recursive)
Level 5 - 430000 (leaf account)   0
Level 4 - 4300                    1
Level 3 - 430                     6
Level 2 - 43                      43
Level 1 - 4                       192
Level 0 - 0 (root account)        1678

To get the balance of account “4” we need to sum the balances of 192 accounts!
OK, it looks like the number of children c has a
lot of influence, and the number of move lines l
has little or no influence: g(c) >> f(l)
Let's split them...
Now it is clear that g(c) is linear!
(note: the number of children grows exponentially)
O(g(c)) = O(c)
So the influence of l was small, but linear too!
O(f(l)) = O(l)
Big O - Conclusion
O(__compute) = O(l) + O(c)

c has an unexpectedly big influence on the
results
=> Bad performance on complex charts of
accounts!
c does not grow with time, but l does...
=> OpenERP accounting becomes slower and
slower with time! (though it's not as bad as expected)
Proposal: Precalculated sums
OpenERP recalculates the debit/credit/balance
from move lines each time.
Most accounting programs store the totals per
period (or the cumulative values) for each
account. Why?
●   Reading the debit/credit/balance becomes much
    faster.
●   ...and reading is much more data intensive than
    writing:
    –   Accounting reports read lots of accounts, lots of times.
    –   Accountants only update a few accounts at a time.
Is it really faster?
Precalculated sums per period mean:
●   O(p) query (get the debit/credit/balance of each
    period for that account) instead of O(l) query, with
    p being the number of periods, p << l.
    Using opening entries, or cumulative totals, p
    becomes constant => O(1)
●   If aggregated sums (with children values) are also
    precalculated, we don't have to do one
    O(c) aggregation per read.
It's O(1) for reading!!
    (but creating/editing entries is a bit slower)
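To make the O(l) versus O(p) argument concrete, here is a minimal sketch. Everything in it is hypothetical: move lines are modelled as plain tuples (account, period, debit, credit) and the precalculated sums as a dictionary, which is not how OpenERP stores them.

```python
def balance_from_lines(lines, account_id):
    """Current approach: scan every move line of the account, O(l)."""
    return sum(debit - credit
               for acc, _period, debit, credit in lines
               if acc == account_id)


def build_period_sums(lines):
    """Done at write time: keep one balance per (account, period)."""
    sums = {}
    for acc, period, debit, credit in lines:
        key = (acc, period)
        sums[key] = sums.get(key, 0) + debit - credit
    return sums


def balance_from_period_sums(period_sums, account_id, periods):
    """Proposed approach: one lookup per period, O(p) with p << l."""
    return sum(period_sums.get((account_id, p), 0) for p in periods)


# Hypothetical data: account 430000, two periods, three lines.
lines = [
    (430000, "1st", 200, 0),
    (430000, "1st", 50, 0),
    (430000, "2nd", 0, 30),
]
period_sums = build_period_sums(lines)
slow = balance_from_lines(lines, 430000)
fast = balance_from_period_sums(period_sums, 430000, ["1st", "2nd"])
```

Both functions return the same balance; the difference is only in how many rows have to be touched per read.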
Alternative 1: Accumulated values
         using triggers (I)
Proposed by Ferdinand Gassauer.
How does it work?
●   New object to store the accumulated
    debit/credit/balance per account and period (let's
    call it account.period.sum).
                                   Opening   1st         2nd   3rd    4th
     Move line values in period    400       +200, +50   +25   -400   +25, +200
     Value in table                400       650         675   275    500


●   Triggers on Postgres (PL/pgSQL) update the
    account_period_sum table each time an account
    move line is created/updated/deleted.
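The bookkeeping such a trigger has to do can be modelled in a few lines of Python. This is a sketch of the idea only, not Ferdinand's PL/pgSQL code: the period names, the PERIODS ordering and the apply_line helper are assumptions made for illustration.

```python
# Hypothetical, ordered list of periods (the prototype relies on period
# naming to know this order).
PERIODS = ["opening", "1st", "2nd", "3rd", "4th"]


def apply_line(cumulative, period, amount):
    """Add `amount` to the given period and to every later period.

    With cumulative totals, one move line changes the stored value of its
    own period and of all the periods that follow it.
    """
    start = PERIODS.index(period)
    for name in PERIODS[start:]:
        cumulative[name] = cumulative.get(name, 0) + amount


cumulative = {}
example_lines = [
    ("opening", 400),
    ("1st", 200), ("1st", 50),
    ("2nd", 25),
    ("3rd", -400),
    ("4th", 25), ("4th", 200),
]
for period, amount in example_lines:
    apply_line(cumulative, period, amount)

# Deleting a line is the same operation with the opposite sign.
after_delete = dict(cumulative)
apply_line(after_delete, "2nd", -25)
```

Fed with the move line values from the table above, the dictionary ends up holding the same "Value in table" row. The price of an O(1) read is that a write touches every later period.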
Alternative 1: Accumulated values
         using triggers (II)
How does it work?(cont.)
●   The data is calculated by accumulating the values from
    previous periods. (Ferdinand's prototype requires a special naming of
    periods for this).
●   Creates SQL views based on the
    account_period_sum table.
●   For reports that show data aggregated by period:
     –   New reports can be created that either directly use the
         SQL views, or use the account.period.sum object.
●   The account.account.__compute() method could be
    extended to optimize queries (modified to make use
    of the account_period_sum when possible) in the
    future.
Alternative 1: Accumulated values
          using triggers (III)
Good points
  ●   Triggers guarantee that the data is always in sync
      (even if somebody writes directly to the database!).
  ●   Triggers are fast.
  ●   Prototype available and working! - “used this method
      already in very big installations - some 100
      accountants some millions moves without any problems”
      (Ferdinand)
Bad points
  ●   Database dependent triggers.
  ●   Triggers are harder to maintain than Python code.
  ●   Makes some assumptions on period names
      (as OpenERP currently does not flag opening periods apart
      from closing ones).
Alternative 2: Period totals using the
               ORM (I)
 Proposed by Borja L.S. (@NeoPolus).
 How does it work?
  ●   New object to store the debit/credit/balance sums
      per account and period (and state):
                                     Opening   1st         2nd   3rd    4th
       Move line values in period    400       +200, +50   +25   -400   +25, +200
       Value in table                400       250         25    -400   225


  ●   Extends the account.move.line open object to
      update the account.sum objects each time a line is
      created/updated/deleted.
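Since there is no prototype yet, the following is only a sketch of what the create/update/delete hooks might do, written as plain Python rather than as an OpenObject model. The PeriodSums class, its method names, the dictionary-shaped lines and the "valid" state value are all assumptions for illustration.

```python
class PeriodSums:
    """Keeps debit/credit totals per (account, period, state)."""

    def __init__(self):
        self.totals = {}  # (account_id, period_id, state) -> [debit, credit]

    def _add(self, line, sign):
        key = (line["account_id"], line["period_id"], line["state"])
        entry = self.totals.setdefault(key, [0, 0])
        entry[0] += sign * line["debit"]
        entry[1] += sign * line["credit"]

    def on_create(self, line):
        self._add(line, +1)

    def on_unlink(self, line):
        self._add(line, -1)

    def on_write(self, old_line, new_line):
        # An update is handled as "remove the old values, add the new ones",
        # which also covers lines moved to another account or period.
        self._add(old_line, -1)
        self._add(new_line, +1)

    def balance(self, account_id, period_id, state="valid"):
        debit, credit = self.totals.get((account_id, period_id, state), (0, 0))
        return debit - credit


def make_line(period, amount, account_id=430000, state="valid"):
    """Hypothetical helper: positive amounts are debits, negative are credits."""
    return {"account_id": account_id, "period_id": period, "state": state,
            "debit": max(amount, 0), "credit": max(-amount, 0)}


sums = PeriodSums()
for period, amount in [("opening", 400), ("1st", 200), ("1st", 50),
                       ("2nd", 25), ("3rd", -400), ("4th", 25), ("4th", 200)]:
    sums.on_create(make_line(period, amount))

per_period = [sums.balance(430000, p)
              for p in ["opening", "1st", "2nd", "3rd", "4th"]]

# Editing the +50 line of the 1st period to +60 only touches that period.
sums.on_write(make_line("1st", 50), make_line("1st", 60))
```

Unlike the cumulative variant, a write here changes a single stored row, and a reader that wants the balance at the end of a period has to add up the periods before it.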
Alternative 2: Period totals using the
               ORM (II)
 How does it work?(cont.)
  ●   Extends account.account.__compute() method to
      optimize queries:
      –   If the query filters only by period/fiscal year/state, the
          data is retrieved from the account.sum object.
      –   If the query filters by dates, and one or more fiscal
          periods are fully included in that range, the data is
          retrieved from the account.sum objects (for the range
          covered by the periods) plus the account.move.lines (the
          range not covered by periods).
      –   Filtering by any other field (for example partner_id)
          causes a fallback to the normal __compute method.
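The three cases above amount to a small decision function. The sketch below is one possible reading of them, not existing code: the filter names (period_id, fiscalyear_id, state, date_from, date_to), the choose_strategy function and the representation of periods as sorted, contiguous (id, start, stop) tuples are assumptions.

```python
from datetime import date, timedelta

PERIOD_FILTERS = {"period_id", "fiscalyear_id", "state"}
DATE_FILTERS = {"date_from", "date_to"}


def choose_strategy(filters, periods):
    """Decide where the data should be read from.

    filters: {field_name: value} for the requested query.
    periods: [(period_id, date_start, date_stop)], sorted and contiguous.
    """
    keys = set(filters)
    if keys <= PERIOD_FILTERS:
        return {"strategy": "sums"}
    if DATE_FILTERS <= keys and keys <= PERIOD_FILTERS | DATE_FILTERS:
        start, end = filters["date_from"], filters["date_to"]
        covered = [p for p in periods if start <= p[1] and p[2] <= end]
        if covered:
            # Days before the first and after the last fully covered period
            # still have to be summed from the move lines.
            line_ranges = []
            first_start, last_stop = covered[0][1], covered[-1][2]
            if start < first_start:
                line_ranges.append((start, first_start - timedelta(days=1)))
            if last_stop < end:
                line_ranges.append((last_stop + timedelta(days=1), end))
            return {"strategy": "sums+lines",
                    "periods": [p[0] for p in covered],
                    "line_ranges": line_ranges}
    # Any other field (for example partner_id): normal __compute.
    return {"strategy": "lines"}


# Hypothetical quarterly periods for one fiscal year.
quarters = [
    (1, date(2010, 1, 1), date(2010, 3, 31)),
    (2, date(2010, 4, 1), date(2010, 6, 30)),
    (3, date(2010, 7, 1), date(2010, 9, 30)),
    (4, date(2010, 10, 1), date(2010, 12, 31)),
]
by_period = choose_strategy({"period_id": [1], "state": "posted"}, quarters)
by_dates = choose_strategy(
    {"date_from": date(2010, 2, 15), "date_to": date(2010, 9, 30)}, quarters)
by_partner = choose_strategy({"partner_id": 7}, quarters)
```

In the date example, the second and third quarters come from the precalculated sums, and only 15 February to 31 March has to be read from the move lines.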
Alternative 2: Period totals using the
              ORM (III)
Good points
  ●   Database independent.
  ●   Optimizes all the accounting.
  ●   Flexible.
  ●   No PL/pgSQL triggers required, just Python
      => Easier to maintain.
Bad points
  ●   Does not guarantee that the sums are in sync with the
      move lines.
      (but nobody should directly alter the database in the
      first place...)
  ●   Python is slower than using triggers.
  ●   No prototype yet! :)
      (But take a look at Tryton stock quantity computation)
Current approach VS Period sums
Current approach
Pros
  ●    No redundant data.
  ●    Simpler queries.
Cons
  ●    Slow.
       –   Reports and dashboard charts/tables are
           performance hungry.
  ●    Becomes even slower with time.
Precalculated sums
Pros
  ●    Fast, always.
  ●    Drill-down navigation.
Cons
  ●    Need to keep sums in sync with move lines.
  ●    More complex (__compute) or specific queries to
       make use of the precalculated sums.
Precalculated sums – Drill down navigation (Chricar prototype) 1/3
Precalculated sums – Drill down navigation (Chricar prototype) 2/3
Precalculated sums – Drill down navigation (Chricar prototype) 3/3
And one last remark...




...all this is applicable to the stock quantities
                 computation too!
