Enabling Applications with Informix' new OLAP functionality
Ajaykumar Gupte, IBM

This session discusses Informix's new OLAP functionality in detail and
how these functions can be leveraged in client applications.


  1. Enabling Applications with Informix' new OLAP functionality
     Ajaykumar Gupte, IBM
  2. Agenda
     • What is OLAP?
     • OLAP functions in Informix
       – the OVER clause
       – supported OLAP functions
     • Questions?
  3. What is OLAP?
     • On-Line Analytical Processing
     • Commonly used in Business Intelligence (BI) tools
       – ranking products, salesmen, items, etc.
       – exposing trends in sales from historic data
       – testing business scenarios (forecast)
       – sales breakdown or aggregates on multiple dimensions (Time, Region, Demographics, etc.)
  4. OLAP Functions in Informix
     • Informix supports a subset of the commonly used OLAP functions
     • Enables more efficient query processing from BI tools such as Cognos
  5. Example query with group by
       select customer_num, count(*)
       from orders
       where customer_num <= 110
       group by customer_num;

       customer_num  (count(*))
                101           1
                104           4
                106           2
                110           2

       4 row(s) retrieved.
  6. Example query with OLAP function
       select customer_num, ship_date, ship_charge,
              count(*) over (partition by customer_num)
       from orders
       where customer_num <= 110;

       customer_num  ship_date   ship_charge  (count(*))
                101  05/26/2008       $15.30           1
                104  05/23/2008       $10.80           4
                104  07/03/2008        $5.00           4
                104  06/01/2008       $10.00           4
                104  07/10/2008       $12.20           4
                106  05/30/2008       $19.20           2
                106  07/03/2008       $12.30           2
                110  07/06/2008       $13.80           2
                110  07/16/2008        $6.30           2

       9 row(s) retrieved.
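The difference between the two queries above can be sketched in a few lines of Python. This is an emulation of the semantics, not Informix code: the rows are copied from the slide output, and the function names `group_by_count` and `count_over_partition` are illustrative only.

```python
from collections import Counter

# (customer_num, ship_date, ship_charge) rows copied from the slide output
orders = [
    (101, "05/26/2008", 15.30),
    (104, "05/23/2008", 10.80),
    (104, "07/03/2008", 5.00),
    (104, "06/01/2008", 10.00),
    (104, "07/10/2008", 12.20),
    (106, "05/30/2008", 19.20),
    (106, "07/03/2008", 12.30),
    (110, "07/06/2008", 13.80),
    (110, "07/16/2008", 6.30),
]

def group_by_count(rows):
    """Emulates GROUP BY customer_num: one output row per group."""
    counts = Counter(r[0] for r in rows)
    return sorted(counts.items())

def count_over_partition(rows):
    """Emulates count(*) over (partition by customer_num): every input
    row is kept and annotated with the size of its partition."""
    counts = Counter(r[0] for r in rows)
    return [r + (counts[r[0]],) for r in rows]

print(group_by_count(orders))            # 4 rows, detail columns are gone
for row in count_over_partition(orders):
    print(row)                           # 9 rows, detail columns preserved
```

The point of the sketch is that GROUP BY collapses rows, while the OVER clause only annotates them.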
  7. Where do OLAP functions fit?
     Order of processing within a query block:
       1. Joins, group by, having, aggregation
       2. OLAP functions
       3. Final order by
  8. OLAP functions as predicates
     • Use a derived table query block to compute the OLAP function first
       select * from
         (select customer_num, ship_date, ship_charge,
                 count(*) over (partition by customer_num) as cnt
          from orders
          where customer_num <= 110)
       where cnt >= 3;
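Because OLAP functions are computed after the WHERE clause of their own query block, a filter on the function result has to live in an outer block. A minimal Python sketch of that two-phase evaluation follows; the rows are hypothetical and the names `derived_table` and `outer_filter` are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical (customer_num, ship_charge) rows
orders = [(101, 15.30), (104, 10.80), (104, 5.00), (104, 10.00),
          (104, 12.20), (106, 19.20), (106, 12.30)]

def derived_table(rows):
    """Inner query block: attach cnt = count(*) over (partition by customer_num)."""
    counts = Counter(cust for cust, _ in rows)
    return [(cust, charge, counts[cust]) for cust, charge in rows]

def outer_filter(rows, min_cnt):
    """Outer query block: WHERE cnt >= min_cnt, applied to finished window values."""
    return [r for r in rows if r[2] >= min_cnt]

result = outer_filter(derived_table(orders), 3)
print(result)  # only the four rows of customer 104 remain
```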
  9. OLAP function example
     • Running 3-month average sales for a particular product during a particular period
       select product_name,
              avg(sales) over (
                partition by region
                order by year, month
                rows between 1 preceding and 1 following )
       from total_sales
       where product_id = 105
         and year between 2001 and 2010;
  10. The over() Clause
        olap_func(arg) over (partition by clause
                             order by clause
                             window frame clause)
      • Defines the “domain” of the OLAP function calculation
        – partition by: divide into partitions
        – order by: ordering within each partition
        – window frame: sliding window within each partition
        – all clauses optional
  11. Partition By
        sum(x) over (
          partition by a, b
          order by c, d
          rows between 2 preceding and 2 following)
      Resulting partitions:
        a=1, b=1
        a=2, b=2
        a=1, b=2
        a=2, b=1
  12. Order By
        sum(x) over (
          partition by a, b
          order by c, d
          rows between 2 preceding and 2 following)
      Rows of partition a=1, b=2, ordered by c, d:
        c=1, d=1
        c=1, d=2
        c=1, d=3
        c=2, d=2
        c=2, d=4
        c=3, d=1
        c=4, d=1
        c=4, d=2
  13. Window Frame
        sum(x) over (
          partition by a, b
          order by c, d
          rows between 2 preceding and 2 following)
      The frame slides over the ordered rows of the partition:
        c=1, d=1
        c=1, d=2
        c=1, d=3
        c=2, d=2
        c=2, d=4
        c=3, d=1
        c=4, d=1
        c=4, d=2
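The three steps walked through on the last three slides (partition, order, frame) can be sketched as a small Python emulation. The data values are hypothetical, and `sum_over` is an illustrative name, not an Informix API.

```python
from collections import defaultdict

def sum_over(rows, preceding=2, following=2):
    """Emulates sum(x) over (partition by a, b order by c, d
    rows between <preceding> preceding and <following> following).
    rows are (a, b, c, d, x) tuples; returns rows with the sum appended."""
    partitions = defaultdict(list)
    for row in rows:                              # step 1: partition by a, b
        partitions[(row[0], row[1])].append(row)
    result = []
    for key in sorted(partitions):
        part = sorted(partitions[key], key=lambda r: (r[2], r[3]))  # step 2: order by c, d
        for i, row in enumerate(part):            # step 3: physical window frame
            lo = max(0, i - preceding)            # frame is clipped at the partition edges
            frame = part[lo:i + following + 1]
            result.append(row + (sum(r[4] for r in frame),))
    return result

# Hypothetical input rows, deliberately unsorted
rows = [
    (1, 2, 2, 2, 30), (2, 1, 1, 1, 5), (1, 2, 1, 1, 10),
    (1, 2, 3, 1, 40), (2, 1, 2, 1, 7), (1, 2, 1, 2, 20),
]
for r in sum_over(rows):
    print(r)
```

Rows never see values from another partition, and the frame shrinks near the start and end of each partition instead of wrapping around.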
  14. Partition By
      • Divides the result set of the query into partitions for computing an OLAP function
      • If the partition by clause is not specified, the entire result set is a single partition
        max(salary) over (partition by dept_id)
        sum(sales) over (partition by region)
        avg(price) over ()
  15. Order By
      • Ordering within each partition
      • Required for some OLAP functions
        – ranking, window frame clause
      • Supports ASC/DESC, NULLS FIRST/NULLS LAST
        rank() over (partition by dept order by salary desc)
        dense_rank() over (order by total_sales nulls last)
  16. Window Frame
      • Defines a sliding window within a partition
      • OLAP function value is computed from the rows in the sliding window
      • Order by clause is required
  17. Physical vs. Logical Window Frame
      • Physical window frame
        – ROWS keyword
        – count offset by position
        – fixed window size
        – order by one or more column expressions
      • Logical window frame
        – RANGE keyword
        – count offset by value
        – window size may vary
        – order by single column (numeric, date or datetime type)
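A small Python sketch may help show why a ROWS frame has a fixed size while a RANGE frame does not. It emulates `count(*)` with "between 2 preceding and 2 following" under both interpretations; the key values and function names are hypothetical.

```python
def count_rows_frame(keys, preceding, following):
    """Physical frame (ROWS): offsets count positions in the ordered partition."""
    n = len(keys)
    return [min(n, i + following + 1) - max(0, i - preceding) for i in range(n)]

def count_range_frame(keys, preceding, following):
    """Logical frame (RANGE): offsets are applied to the ORDER BY value itself."""
    return [sum(1 for other in keys if k - preceding <= other <= k + following)
            for k in keys]

days = [1, 2, 3, 10, 11, 20]  # hypothetical, already sorted ORDER BY values

print(count_rows_frame(days, 2, 2))   # [3, 4, 5, 5, 4, 3]
print(count_range_frame(days, 2, 2))  # [3, 3, 3, 2, 2, 1]
```

With ROWS the frame only shrinks at the partition edges; with RANGE a gap in the values (3 to 10, 11 to 20) shrinks the frame even in the middle of the partition.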
  18. Window Frame Examples
        avg(price) over (order by year, day
                         rows between 6 preceding and current row)
        count(*) over (order by ship_date
                       range between 2 preceding and 2 following)
      • Current row can be physically outside the window
        avg(sales) over (order by month
                         range between 3 preceding and 1 preceding)
        sum(sales) over (order by month
                         rows between 2 following and 5 following)
  19. Order By – Special Semantics
      • “cumulative” semantics in absence of window frame clause
        – for OLAP functions that allow a window frame clause
        – equivalent to “ROWS between unbounded preceding and current row”
        select sales, sum(sales) over (order by quarter)
        from sales
        where year = 2012

        sales  (sum)
          120    120
          135    255
          127    382
          153    535
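The cumulative behaviour amounts to a running total over the ordered rows. As a quick check of the slide's numbers, here is a Python sketch using the four quarterly sales values from the output above (the variable names are illustrative):

```python
from itertools import accumulate

# Quarterly sales for 2012, in quarter order, copied from the slide
sales = [120, 135, 127, 153]

# sum(sales) over (order by quarter) with no frame clause behaves like
# "rows between unbounded preceding and current row": a running sum.
running = list(accumulate(sales))
print(running)  # [120, 255, 382, 535]
```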
  20. Supported OLAP Functions
      • Ranking functions
        – RANK, DENSE_RANK (DENSERANK)
        – PERCENT_RANK, CUME_DIST, NTILE
        – LEAD, LAG
      • Numbering functions
        – ROW_NUMBER (ROWNUMBER)
      • Aggregate functions
        – SUM, COUNT, AVG, MIN, MAX
        – STDEV, VARIANCE, RANGE
        – FIRST_VALUE, LAST_VALUE
        – RATIO_TO_REPORT (RATIOTOREPORT)
  21. Ranking Functions
      • Partition by clause is optional
      • Order by clause is required
      • Window frame clause is NOT allowed
      • Duplicate value handling differs between rank() and dense_rank()
        – both give the same rank to all duplicates
        – after a group of duplicates, rank() skips the ranks the duplicates covered, while dense_rank() continues with the next consecutive rank
  22. RANK vs DENSE_RANK
        select emp_num, sales,
               rank() over (order by sales) as rank,
               dense_rank() over (order by sales) as dense_rank
        from sales;

        emp_num  sales  rank  dense_rank
            101  2,000     1           1
            102  2,400     2           2
            103  2,400     2           2
            104  2,500     4           3
            105  2,500     4           3
            106  2,650     6           4
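The tie handling can be made explicit with a short Python emulation over the already sorted sales values from the table above. This is a sketch of the semantics, not of how Informix implements the functions.

```python
def rank(sorted_values):
    """rank(): ties share a rank; the next distinct value takes its row position."""
    ranks = []
    for i, v in enumerate(sorted_values):
        if i > 0 and v == sorted_values[i - 1]:
            ranks.append(ranks[-1])        # duplicate: reuse previous rank
        else:
            ranks.append(i + 1)            # position-based, so gaps appear after ties
    return ranks

def dense_rank(sorted_values):
    """dense_rank(): ties share a rank; the next distinct value gets rank + 1."""
    ranks = []
    for i, v in enumerate(sorted_values):
        if i == 0:
            ranks.append(1)
        elif v == sorted_values[i - 1]:
            ranks.append(ranks[-1])
        else:
            ranks.append(ranks[-1] + 1)    # no gaps
    return ranks

sales = [2000, 2400, 2400, 2500, 2500, 2650]
print(rank(sales))        # [1, 2, 2, 4, 4, 6]
print(dense_rank(sales))  # [1, 2, 2, 3, 3, 4]
```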
  23. PERCENT_RANK and CUME_DIST
      • Calculate ranking information as a percentile
      • Return a value between 0 and 1
        select emp_num, sales,
               percent_rank() over (order by sales) as per_rank,
               cume_dist() over (order by sales) as cume_dist
        from sales;

        emp_num  sales  per_rank    cume_dist
            101  2,000       0    0.166666667
            102  2,400       0.2  0.500000000
            103  2,400       0.2  0.500000000
            104  2,500       0.6  0.833333333
            105  2,500       0.6  0.833333333
            106  2,650       1.0  1.000000000
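The slide does not spell out the formulas. Assuming the usual SQL definitions, percent_rank is (rank - 1) / (rows - 1) and cume_dist is (rows with a value less than or equal to the current one) / rows, which reproduces the output above. A Python sketch under that assumption:

```python
def percent_rank(values):
    """(rank - 1) / (n - 1), where rank is 1 + number of strictly smaller values."""
    n = len(values)
    out = []
    for v in values:
        rank = 1 + sum(1 for other in values if other < v)
        out.append((rank - 1) / (n - 1) if n > 1 else 0.0)
    return out

def cume_dist(values):
    """Fraction of rows whose value is less than or equal to the current value."""
    n = len(values)
    return [sum(1 for other in values if other <= v) / n for v in values]

sales = [2000, 2400, 2400, 2500, 2500, 2650]
print(percent_rank(sales))
print([round(v, 9) for v in cume_dist(sales)])
```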
  24. NTILE
      • Divides the ordered data set into N tiles, where N is given by the expression
      • The number of tiles must be an exact numeric value with scale zero
  25. NTILE Example
        select name, salary,
               ntile(5) over (partition by dept order by salary)
        from employee;

        name    salary  (ntile)
        John    35,000        1
        Jack    38,400        1
        Julie   41,200        2
        Manny   45,600        2
        Nancy   47,300        3
        Pat     49,500        4
        Ray     51,300        5
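Seven rows do not divide evenly into five tiles. The output above is consistent with the common convention that leftover rows go to the lowest-numbered tiles, one each. A Python sketch under that assumption (the function name and signature are illustrative):

```python
def ntile(n_rows, tiles):
    """Tile number for each of n_rows ordered rows, split into `tiles` buckets.
    Assumes leftover rows are handed to the first buckets, one each."""
    base, extra = divmod(n_rows, tiles)
    out = []
    for tile in range(1, tiles + 1):
        size = base + (1 if tile <= extra else 0)
        out.extend([tile] * size)
    return out

print(ntile(7, 5))  # [1, 1, 2, 2, 3, 4, 5]
```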
  26. LEAD and LAG
        LEAD(expr, offset, default)
        LAG(expr, offset, default)
      • Gives the LEAD/LAG value of the expression at the specified offset
      • offset is optional, defaults to 1 if not specified
      • default is optional, NULL if not specified
        – default is used when the offset goes beyond the current partition boundary
      • NULL handling
        – RESPECT NULLS (default)
        – IGNORE NULLS
  27. LEAD/LAG Example
        select name, salary,
               lag(salary) over (partition by dept order by salary),
               lead(salary, 1, 0) over (partition by dept order by salary)
        from employee;

        name    salary   (lag)   (lead)
        John    35,000           38,400
        Jack    38,400  35,000   41,200
        Julie   41,200  38,400   45,600
        Manny   45,600  41,200   47,300
        Nancy   47,300  45,600   49,500
        Pat     49,500  47,300   51,300
        Ray     51,300  49,500        0
  28. LEAD/LAG NULL handling
        select price,
               lag(price ignore nulls, 1) over (order by day),
               lead(price, 1) ignore nulls over (order by day)
        from stock_price;

        price  (lag)  (lead)
        18.25         18.37
        18.37  18.25  19.03
               18.37  19.03
               18.37  19.03
        19.03  18.37  18.59
        18.59  19.03  18.21
        18.21  18.59
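The effect of IGNORE NULLS can be sketched in Python with `None` standing in for NULL. The price list assumes that the two rows without a price in the output above are NULLs; `lag` and `lead` here are illustrative helpers for a single ordered partition, not Informix code.

```python
def lag(values, offset=1, default=None, ignore_nulls=False):
    """Value `offset` rows (offset >= 1) before each row of one ordered partition.
    With ignore_nulls, None entries are skipped when counting back."""
    out = []
    for i in range(len(values)):
        prior = values[:i]
        if ignore_nulls:
            prior = [v for v in prior if v is not None]
        out.append(prior[-offset] if len(prior) >= offset else default)
    return out

def lead(values, offset=1, default=None, ignore_nulls=False):
    """lead is lag applied to the reversed partition."""
    return lag(values[::-1], offset, default, ignore_nulls)[::-1]

prices = [18.25, 18.37, None, None, 19.03, 18.59, 18.21]
print(lag(prices, ignore_nulls=True))
print(lead(prices, ignore_nulls=True))

salaries = [35000, 38400, 41200, 45600, 47300, 49500, 51300]
print(lead(salaries, 1, 0))  # the default 0 fills the last row
```

With RESPECT NULLS (the default), the rows next to a NULL price would simply return NULL instead of reaching past it.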
  29. Numbering Functions
      • Partition by clause and order by clause are optional
      • Window frame clause is NOT allowed
      • Assigns a sequential row number to each row of the result set
        – duplicates still get distinct numbers when order by is specified
  30. ROW_NUMBER Example
        select row_number() over (order by sales),
               emp_num, sales
        from sales;

        (row_number)  emp_num  sales
                   1      101  2,000
                   2      102  2,400
                   3      103  2,400
                   4      104  2,500
                   5      105  2,500
                   6      106  2,650
  31. Aggregate Functions
      • Partition by, order by and window frame clauses are all optional
        – window frame clause requires order by clause
      • All currently supported aggregate functions
        – SUM, COUNT, MIN, MAX, AVG, STDEV, RANGE, VARIANCE
      • New aggregate functions
        – FIRST_VALUE/LAST_VALUE
        – RATIO_TO_REPORT
  32. Aggregate Function Example
        select price,
               avg(price) over (order by day
                                rows between 1 preceding and 1 following)
        from stock_price;

        price  (avg)
        18.25  18.31
        18.37  18.31
               18.37
               19.03
        19.03  18.81
        18.59  18.61
        18.21  18.40
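The averages above can be reproduced with a short Python sketch, assuming that the two rows without a price are NULLs and that avg() skips NULLs, as SQL aggregates normally do. The helper name is illustrative.

```python
def avg_rows_frame(values, preceding, following):
    """avg(x) over (order by ... rows between <preceding> preceding and
    <following> following) for one ordered partition; None stands for NULL."""
    out = []
    for i in range(len(values)):
        frame = values[max(0, i - preceding): i + following + 1]
        non_null = [v for v in frame if v is not None]   # aggregates skip NULLs
        out.append(round(sum(non_null) / len(non_null), 2) if non_null else None)
    return out

prices = [18.25, 18.37, None, None, 19.03, 18.59, 18.21]
print(avg_rows_frame(prices, 1, 1))
```

Note that the NULL rows still receive an average, because the frame is defined by row position and the neighbouring rows have values.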
  33. DISTINCT handling
      • DISTINCT is supported, but it cannot be combined with an order by clause or a window frame clause
        select emp_id, manager_id,
               count(distinct manager_id) over (partition by department)
        from employee;

        emp_id  manager_id  (count)
           101         103        3
           102         103        3
           103         100        3
           104         110        3
           105         110        3
  34. FIRST_VALUE and LAST_VALUE
      • Gives FIRST/LAST value of current partition
      • NULL handling
        – RESPECT NULLS (default)
        – IGNORE NULLS
  35. FIRST_VALUE/LAST_VALUE Example
        select price,
               price - first_value(price)
                 over (partition by year order by day) as diff_price
        from stock_price;

        price  diff_price
        18.25        0
        18.37        0.12
        19.03        0.78
        18.59        0.34
        18.21       -0.04
  36. RATIO_TO_REPORT
      • Computes the ratio of the current value to the sum of all values in the current partition or window frame
        select emp_num, sales,
               ratio_to_report(sales) over (partition by year order by sales)
        from sales;
  37. RATIO_TO_REPORT Example
        select year, sales,
               ratio_to_report(sales) over (partition by year)
        from sales;

        year  sales  (ratio_to_report)
        1998   2400             0.2308
        1998   2550             0.2452
        1998   2650             0.2548
        1998   2800             0.2692
        1999   2450             0.2311
        1999   2575             0.2429
        1999   2725             0.2571
        1999   2850             0.2689
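The ratios above are each row's sales divided by the total for its year. A Python sketch that recomputes them from the slide's data follows; rounding to four decimals is only there to match the displayed output, and the function name is illustrative.

```python
from collections import defaultdict

# (year, sales) rows copied from the slide output
rows = [(1998, 2400), (1998, 2550), (1998, 2650), (1998, 2800),
        (1999, 2450), (1999, 2575), (1999, 2725), (1999, 2850)]

def ratio_to_report(rows):
    """Emulates ratio_to_report(sales) over (partition by year)."""
    totals = defaultdict(int)
    for year, sales in rows:
        totals[year] += sales                 # partition totals
    return [(year, sales, round(sales / totals[year], 4)) for year, sales in rows]

for r in ratio_to_report(rows):
    print(r)
```

Within each partition the ratios sum to 1, up to rounding.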
  38. Nested OLAP Functions
      • An OLAP function can be nested inside another OLAP function
        select emp_id, salary,
               salary - first_value(salary)
                 over (order by rank() over (order by salary)) as diff_salary
        from employee;

        select sum(ntile(10) over (order by salary))
                 over (partition by department)
        from employee;
  39. OLAP functions and IWA
      • Queries containing OLAP functions can be accelerated by Informix Warehouse Accelerator (IWA)
      • IWA processes the majority of the query block
        – scan, join, group by, having, aggregation
      • The Informix server processes the OLAP functions based on the query result from IWA
  40. References
      • Links to OLAP functions in the Informix 12.1 documentation
        http://pic.dhe.ibm.com/infocenter/informix/v121/index.jsp?topic=%2Fcom.ibm.sqls.doc%2Fids_sqs_2583.htm
        http://pic.dhe.ibm.com/infocenter/informix/v121/index.jsp?topic=%2Fcom.ibm.acc.doc%2Fids_acc_queries1.htm
  41. Questions?
      gupte@us.ibm.com
