SlideShare une entreprise Scribd logo
1  sur  12
Télécharger pour lire hors ligne
INTERNATIONALComputer EngineeringCOMPUTER ENGINEERING
  International Journal of JOURNAL OF and Technology (IJCET), ISSN 0976-
  6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), © IAEME
                             & TECHNOLOGY (IJCET)
ISSN 0976 – 6367(Print)
ISSN 0976 – 6375(Online)
Volume 4, Issue 1, January- February (2013), pp. 301-312
                                                                             IJCET
© IAEME: www.iaeme.com/ijcet.asp
Journal Impact Factor (2012): 3.9580 (Calculated by GISI)                ©IAEME
www.jifactor.com




  QUERY EVALUATION OVER NETWORK OF DATA AGGREGATORS

                                Rajesh Bharati1, Amit V. Kore2
                  Department of Computer Engineering, DYPIET Pimpri, Pune
                  Department of Computer Engineering, DYPIET Pimpri, Pune


  ABSTRACT
          Continuous queries are used to monitor changes to time varying data and to provide
  results useful for online decision making. Typically a user desires to obtain the value of some
  aggregation function over distributed data items, for example, to know value of portfolio for a
  client; or the AVG of temperatures sensed by a set of sensors. In these queries a client
  specifies a coherency requirement as part of the query. Present a low-cost, scalable technique
  to answer continuous aggregation queries using a network of aggregators of dynamic data
  items. In such a network of data aggregators, each data aggregator serves a set of data items
  at specific coherencies. Just as various fragments of a dynamic web-page are served by one or
  more nodes of a content distribution network, our technique involves decomposing a client
  query into sub-queries and executing sub-queries on judiciously chosen data aggregators with
  their individual sub-query incoherency bounds. Provide a technique for getting the optimal
  set of sub-queries with their incoherency bounds which satisfies client query's coherency
  requirement with least number of refresh messages sent from aggregators to the client. For
  estimating the number of refresh messages, Build a query cost model which can be used to
  estimate the number of messages required to satisfy the client specified incoherency bound.
  Performance results using real-world traces show that our cost based query planning leads to
  queries being executed using less than one third the number of messages required by existing
  schemes

  I. INTRODUCTION
          These aggregation queries are long running queries as data is continuously
  changing and the user is interested in notifications when certain conditions hold. Thus,
  responses to these queries are refreshed continuously. In these continuous query
  applications, users are likely to tolerate some inaccuracy in the results. That is, the exact
  data values at the corresponding data sources need not be reported as long as the query
  results satisfy user specified accuracy requirements. For instance, a portfolio tracker
  may be happy with an accuracy of $10.i. Password should be easy to remember.

                                               301
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976        0976-
6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), © IAEME

Data incoherency: Data accuracy can be specified in terms of incoherency of a data item,
                       :
defined as the absolute difference in value of the data item at the data source and the value known to a
                                                                             ce
client of the data. Let v(t) denote the value of the it" data item at the data source at time t;
and let the value the data item known to the client be u(t). Then the data incoherency at
the client is given by | v(t)-u(t)| For a data item which needs to be refreshed at an
                                   u(t)|.
incoherency bound C a data refresh message is sent to the client as soon as data
incoherency exceeds C, i.e., | v(t)
                                  v(t)-u(t)| > C.
Network of data aggregators: Data refresh from data sources to clients can be don
                   aggregators:                                                    done
using push or pull based mechanisms. In a push based mechanism data sources send
update messages to clients on their own whereas in pull based mechanism data sources
send messages to the client only when the client makes a request. It is assume the push
based mechanism for data transfer between data sources and clients. For scalable
handling of push based data dissemination, network of data aggregators are pro
                                                                             pro-posed.
In such network of data aggregators, data refreshes occur from data sources to the
clients through one or more data aggregators.
      s




                   Fig. 1 Data dissemination network for multiple data items

Here it is assume that each data aggregator maintains its configured incoherency bounds
for various data items. From a data dissemination capability point of view, each data
                                                                  point
aggregator (DA) is characterized by a set of (d, c) pairs, where d, is a data item which the
DA can disseminate at an incoherency bound c. The configured incoherency bound of a
data item at a data aggregator can be maintained using any of following methods: (a)
The data source refreshes the data value of the DA whenever DA's incoherency bound is
about to get violated. This method has scalability problems. (b) Data aggregator(s) with
tighter incoherency bound help the DA to maintain its incoherency bound in a scalable
manner.

II. PROBLEM STATEMENT

       Value of a continuous weighted additive aggregation query, at time t, can be
calculated as:
                             nq

                V q (t ) =   ∑      ( V qi ( t ) × W   qi   )
                             i =1                                     (1)

                                                                302
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-
6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), © IAEME

Vq is the value of a client query q involving nq data items with the weight of the ith data
item being wqit , Suppose the result for the query given by Equation (1) needs to be
continuously provided to a user at the query incoherency bound Cq• Then, the
dissemination network has to ensure that:
                     nq

               |   ∑       ( v qi ( t ) − u qi ( t )) × w qi | ≤ C q
                    i =1                                                     (2)

       Whenever data values at sources change such that query incoherency bound is
violated, the updated value should be refreshed to the client If the network of
aggregators can ensure that the iO, data item has incoherency bound Cqit then the
following condition ensures that the query incoherency bound Cq is satisfied:

                   nq

               ∑ (C          qi   × wqi ) ≤ Cq
                   i =1                                                      (3)


III. DATA DISSEMINATION COST MODEL

    The presented model is to estimate the number of refreshes required to disseminate a data
item while maintaining a certain incoherency bound. There are two primary factors affecting
the number of messages that are needed to maintain the coherency requirement:
        (a) The coherency requirement itself and
        (b) Dynamics of the data
Consider a data item which needs to be disseminated at an incoherency bound C, i.e., new
value of the data item will be pushed if the value deviates by more than C from the last
pushed value. Thus the number of dissemination messages will be proportional to the
probability of |v (t) - u (t) |greater than C for data value vet) at the source/ aggregator and u(t)
at the client, at time t. A data item can be modeled as a discrete time random process where
each step is correlated with its previous step

a) Data Dynamics Model

       Specifically, the cost of data dissemination for a data item will be proportional to
data sumdiff defined as

               Rs = ∑ | si − si −1 |
                              i                                              (5)

Where si and si −1 are the sampled values of a data item 5 at ith and (i-1)th time instances
(i.e., consecutive ticks). In practice, sumdiffvalue for a data item can be calculated at the
data source by taking running average of difference between data values for consecutive
ticks.




                                                                       303
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-
6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), © IAEME

b) Incoherency Bound Model
        Data source pushes the data value whenever it differs from the last pushed value by an
amount more than C. Client estimates data value based on server specified parameters. The
source pushes the new data value whenever it differs from the (client) estimated value by an
amount more than C
    In both these cases, value at the source can be modeled as a random process with average
as the value known at the client In case (b), the client and the server estimate the data value as
the mean of the modeled random process whereas in case (a) deviation from the last pushed
value can be modeled as zero mean process. Using Chebyshev's inequality:

               P(| v(t ) − u(t ) |> C ) ∝ 1 / C 2                 (4)

Thus, it hypothesize that the number of data refresh messages is inversely proportional to the
square of the incoherency bound.

c) Combining data dissemination models
    Number of refresh messages is proportional to data sum- diff Rs and inversely
proportional to square of the incoherency bound. Further, it need not disseminate any
message when either data value is not changing ( Rs = 0) or incoherency bound is
unlimited (1/ C 2 =0). Thus, for a given data item 5, disseminated with an incoherency
bound C, the data dissemination cost is proportional to R / C 2 . In the next section, it use this
data dissemination cost model for developing cost model for additive aggregation queries

IV. COST MODEL FOR ADDITIVE AGGREGATION QUERIES

        Consider an additive query over two data items P and Q with weights Wp and Wq
respectively and then need to estimate its dissemination cost. H data items are disseminated
separately, the query sumdiff will be:

               Rdata = wpRp +wqRq = wp ∑ pi − pi−1 | +wq ∑ qi −qi−1 |
                                        |                 |
                                                                        (6)

Instead, if the aggregator uses the information that client is interested in a query over P and Q
(rather than their individual values), it creates and pushes a composite data item (WpP+WqQ)
then the query sumdiffwill be:

               Rquery = ∑| wp ( pi − pi−1 ) + wq (qi − qi−1 ) |
                                                                        (7)

Rquery is clearly less than or equal compared to Rdata . Thus IT need to estimate the sumdiff of an
aggregation query (i.e., Rquery ) given the sumdiff values of individual data items (i.e., Rp and
Rq). Only data aggregators are in a position to calculate Rquery as different data items may be
disseminated from different sources. Here develop the query cost model in two stages.




                                                    304
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-
6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), © IAEME

a) Modeling Correlation between Data Dynamics
        From Equations (6) and (7) it can see that if two data items are correlated such that as
the value of one data item increases that of the other data item also increase then R query will be
closer to R data . On the other hand if the data items are inversely correlated then                                   R query   will be
less compared to R data . Thus, intuitively, it can represent the relationship between Rquery and
sumdiff values of the individual data items using a correlation measure associated with the pair
of data items. Specifically, if p is the correlation measure then Rquery can be written as:


                                 2
                                             (w2 Rp + wq Rq + 2ρwp Rp wq Rq )
                                               p
                                                  2    2 2

                               R query   ∝
                                                    (w2 + wq + 2ρwp wq )
                                                      p
                                                           2                                          (8)

The correlation measure ρ is defined such that − 1 ≤                            ρ ≤1     . So,   R query    will always be less than
| wpR   p   + w q R q | and
                      always be more than | w p R p − w q R q | .The above relation can be better
understood from its similarity with the standard deviation of the sum of two random
variables. For data items P and Q, ρ can be calculated as:


                               ρ=
                                             ∑(p      i   − p i −1 )(q i − q i −1 )
                                                                                                      (9)
                                    (    ∑(p    i   − p i −1 ) 2    ∑ (q    i   − q i −1 ) 2

b) Query based Normalization
        Suppose there is need to compare the cost of two queries: a SUM query
involving two data items and an AVG query involving the same set of data items. Let
the query incoherency bound for the SUM and the AVG queries be C =2C and C2=C,
respectively. From Equation (8), sumdiff of the SUM query will be double that of the
AVG query. Hence, query evaluation cost (as per R / C 2 ) of the SUM query will be hall
that of the AVG query. But, intuitively, disseminating the AVG of two data items at a
given incoherency bound should require the same number of refresh messages as their
SUM with double the incoherency bound. Thus, there is a need to normalize query
costs. From a query execution cost point of view, a query with weights Wj and
incoherency bound C is the same as query with weights nWj and incoherency bound nc.
So, while normalizing need to ensure that both, query weights and incoherency bounds,
are multiplied by the same factor.
Normalized query sumdiff is given by

                    Rquery=(w2Rp +w2Rq +2ρwpRpw Rq)/(w2 +w2 +2ρwpw )
                     2
                             p
                               2
                                   q
                                     2
                                               q      p   q       q
                                                                                                       (10)

The value of the normalizing factor for Rquery should be


                                               1 / w 2 + w q + 2 ρw p w q
                                                     p
                                                           2




The value of the incoherency bound has to be adjusted by the same factor. Normalization
ensures that queries with arbitrary values of weights can be compared for execution cost

                                                                   305
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-
6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), © IAEME

estimates. Equation (10) can be extended to get query sumdiff for any general weighted
aggregation query given by Equation (1) as:
                                          nq                     nq        nq

                                         ∑       w qi R i2 +
                                                   2
                                                                 ∑ ∑             ρ ij w qi w qj R i R     j
                                  2       i =1                   i =1 j =1, j ≠ i
                                 RQ =              nq               nq     nq

                                                  ∑
                                                  i=1
                                                          2
                                                        w qi +    ∑ ∑
                                                                   i =1
                                                                                    ρ ij w qi w qj
                                                                          j =1, j ≠ i



V. QUERY PLANNING FOR WEIGHTED ADDITIVE AGGREGATION QUERIES

      For executing an incoherency bounded continuous query, a query plan is required.
The query planning problem can be stated as:

Inputs:
(1) A network of data aggregators in the form of a relation f (A, D, C) specifying the N
data aggregators ak ∈ A(1 ≤ k ≤ N ) , set Dk ⊆ D of data items disseminated by the data
aggregator a k and incoherency bound t kj , which the aggregator a k can ensure for each
data item   d kj ∈ D k
(2) Client query q and its incoherency bound C q . An additive aggregation query q can be
represented as    ∑w d   qi qi

Where wqi ; is the weight of the data item d qi ; for 1 ≤ i ≤ nq
Outputs:
(1) q k , for (1 ≤ k ≤ N ) , i.e., sub-query for each data aggregator ak .
(2) C qk , for (1 ≤ k ≤ N ) i.e., incoherency bounds for all the sub queries. Thus, to get a
query plan here need to perform following tasks:

1. Determining sub-queries: For the client query q get sub queries q k s for each data
   aggregator.
2. Dividing incoherency bound: Divide the query incoherency bound C q , among sub-queries
  to get C qk s .
For optimal query planning, above tasks are to be performed with the following
objective and constraints:

Optimization objective: Number of refresh messages is minimized. For a sub query q k , the
estimated number of refresh messages is given by kR / C where R qk is the sumdiff of       qk
                                                                                                     2
                                                                                                     qk


the sub query q k , C qk is the incoherency bound assigned to it and k, the proportionality
factor, is the same for all sub queries of a given query q. Thus total number of refresh
messages is estimated as:
                                                  N     R qk
                                 Z   q   = k∑               2
                                                 k =1   C   qk
                                                                                                      (12)


                                                                 306
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-
6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), © IAEME

Hence Z , needs to be minimized for minimizing the number of refreshes.
                q


Constraint 1: q k is executable at ak : Each DA has the data items required to execute the sub-
query allocated to it, i.e., for each data item d qki required for the sub query q , d qki ∈ D k
                                                                                   k

Constraint2: Query incoherency bound is satisfied: Query incoherency should be less than or
equal to the query incoherency bound. For additive aggregation queries, value of the
client query is the sum of sub-query values. As different sub-queries are disseminated
by different data aggregators, need to ensure that sum of sub-query incoherencies is less
than or equal to the query incoherency bound. Thus

                           ∑C   qk   ≤ Cq
                                                                           (13)

Constraint3: Sub-query incoherency bound is satisfied: Data incoherency bounds at      ak   ( t kj for
d   qki
         should be such that the sub-query incoherency bound C qk can be satisfied at
          ∈ D   k


that DA. The tightest incoherency bound T qk which the data aggregator a k can satisfy for
the given sub-query q k can be calculated as Tqk = ∑ (Wqi × t qj | d qi ≡ d qj )
                                                                     nqk

For satisfying this constraint 1 ensure C qk > T qk
Following is the outline of my approach for solving this constraint optimization problem
as detailed in the rest of this chapter: it proves that determining sub-queries while
minimizing Z q , as given by Equation (12), is NP hard. If the set of sub-queries ( q k ) is
already given, sub-query incoherency bounds C qk s can be optimally determined to
minimize Z q . As optimally dividing the query into sub queries is NP-hard and there is no
known approximation algorithm, it present two heuristics for determining sub-queries while
satisfying as many constraints as possible (Constraint1 and Constraint2 to be precise).
Then it present variation of the two heuristics for ensuring that sub-query incoherency
bound is satisfied (Constraint3). In particular, to get a solution of the query planning
problem, the heuristics presented are used for determining sub-queries. Then, using the
set of sub-queries, the method outlined is used for dividing in coherency bound

VI. OPTIMAL ALLOCATION OF QUERY INCOHERENCY BOUND
AMONG SUB -QUERIES

        If know the division of the client query into sub queries, using Equation (11) can
calculate sumdiffvalues of all the sub-queries. Thus, it need to minimize Z, given by Equation
(12) subject to Constaint2 (query incoherency bound is satisfied) and Constaint3 (sub-query
incoherency bound is satisfied). Then it can get a close form expression by solving Equation (12)
with Equation (13) using Lagrange Multiplier scheme. In that scheme minimizes

                    N                        N
                                    2
                    ∑ (R
                    k =1
                           qk   / C qk ) + λ (∑ C qk − C q )
                                            k =1




                                                               307
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-
6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), © IAEME

For a constraint λ to get values of C qk s as
                                  N
               C qk = C q Rqk 3 /(∑ Rqk 3 )
                           1/        1/

                                 k =1                     (14)

i.e., without the Constaint3, sub-query incoherency bounds should be allocated in proportion to
 R1/3
  qk
      . Use this expression to develop heuristics for optimally dividing the client query into
sub-queries. If it also considers Constaint3 then, it can model the problem of minimization of
Z, (while satisfying Constaint2 and Constaint3) as a (non-linear) convex optimization
problem. The non-linear convex optimization problem can be solved using various convex
optimization techniques available in the literature such as gradient descent method, barrier
method etc. It used gradient descent method (fmincon function in MATLAB) to solve this
nonlinear optimization problem to get the values of individual sub-query incoherency bounds
for a given set of sub queries. Here next describe two greedy heuristics to determine sub-
queries while using the formulations developed in this section.

a) Greedy Heuristics for Deriving the Sub-queries
        The algorithm gives the outline of greedy algorithm for deriving sub-queries. First,
get a set of maximal sub-queries (Mq) corresponding to all the data aggregators in the network.
The maximal sub-query for a data aggregator is defined as the largest part of the query which
can be disseminated by the DA (i.e., the maximal sub-query has all the query data items
which the DA can disseminate). For example, consider a client query 50dl +200d2+100d3.For
the data aggregators a1, and a2, the maximal sub-query for a1 will be ml=50d1 + 100d3, whereas
for a2 it will be m2=50d1 + 200d2. For the given client query (q) and relation consisting of data
aggregators, data-items, and data incoherency bounds (f(A,D,C))maximal sub queries can be
obtained for each data aggregator by forming sub-query involving all data items in the
intersection of query data items and those being disseminated by the DA. This operation can
be performed in O|q|.max| Dk |.
Where |q| is number of data items in the query max| Dk |, is the maximum number of data
items disseminated by any DA. For each sub-query m ∈ M q , its sumdiff Rm can be calculated
using Equation (11). Different criteria (ψ ) can be used to select a sub-query in each iteration
of various greedy heuristics. All data items covered by the selected sub-query are removed
from all the remaining sub queries in M q before performing the next iteration. It should be
noted that sub-queries for DAs can be null. Now it describe two criteria (ψ ) for the greedy
heuristics;
1) min-cost: estimate of query execution cost is minimized, and
2) max-gain: estimated gain due to executing the query using sub-queries is maximized.

a) Minimum Cost Heuristic
As there is need to minimize the query cost, a sub-query with minimum cost per data item can be
chosen in each iteration of the algorithm criterion ψ ≡ minimize ( R m / C m | m |) . But from
                                                                            2


Equation (14) it can see that the sub-query incoherency bounds should be allocated in
proportion to Rk / 3 . Using Equations (12) and (14)
               1




                                                308
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-
6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), © IAEME


                                                 N
                                    1
                  Z   1/3
                      q     =
                                C   2/3     ∑   k =1
                                                       R qk/ 3
                                                         1

                                    q



From Equation (15), it is clear that for minimizing the query execution cost it should select
the set of sub queries so that          is minimized. It can do that by using criterion ψ ≡
                                ∑ R                      1/3
                                                         qk



minimize ( Rm/ 3 / m |) in the greedy algorithm. Once IT get the optimal set of sub-queries it can
            1


use Equation (15) and Constraint 3 (Cqk ≥ Tqk) to optimally allocate the query incoherency
bound among them using any of the convex optimization techniques. But this method of first
deriving sub-queries and then allocating the incoherency bounds has a problem which is
described next

b) Maximum Gain Heuristic
       Now here presents an algorithm which instead of minimizing the estimated query
execution cost maximizes the estimated gains of executing the client query using sub
queries. In this algorithm, for each sub-query, here calculate the relative gain of executing
it by finding the sumdiff difference between cases when each data item is obtained
separately and when all the data items are aggregated as a single sub-query (i.e.,
maximal sub-query). Thus, the relative gain for a sub-query ∑ wi di , can be written as:


                                                       ∑w d
                                                         i
                                                                 i   i
                  Gm =                                                                        −1
                                ∑ w R + ∑∑ ρ
                                i
                                        2
                                        i        i
                                                  2

                                                             j   i
                                                                         ij   wi w j Ri R j


Where Ri is sumdiff of the data item di this algorithm can be implemented by using
criterion ψ ≡ maximize ( G m / | m |) to get the set of sub-queries and corresponding DAs.
Then it uses the convex optimization method to allocate incoherency bounds among sub
queries. To tackle the query satisfiability issue the query gain Equation (16) is modified
to:

                                        (∑ wiTi )
                                            i
                  G ' m = Gm −
                                         Cq Rm/ 3
                                             1




where T, is tightest incoherency bound that can be satisfied for the data item d; and R",
is the sub-query sumdiff Reasons for selecting the particular extended objective function are
same as ones outlined for the min-cost heuristic To summarize, for a given client query
and a network of data aggregators, first it get the maximal sub-queries for all data
aggregators. Here it uses heuristics described in this section to derive sub-queries. In
these heuristics extended objective functions are used to have the desired level of query
satisfiability.




                                                                              309
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-
6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), © IAEME

VII. QUERY PLANNING FOR MAX QUERIES

      This describes the optimal query planning for MAX queries. MIN queries can be
handled in the similar manner. A MAX query, where a client wants
the maximum of a specified set of data item values, can be written as:

                  Vq (t ) = max(vqi (t ),1 ≤ i ≤ nq )


For MAX queries, relationship between the query incoherency bound and required data
incoherency bounds. According to one such formulation, if the network of aggregators
can ensure that the ith data item has incoherency bound C; then the following condition
ensures that the query incoherency bound Cq is satisfied:

                  Ci ≤ Cq , ∀i,1 ≤ i ≤ nq

In these queries even if values of one or more data item change (changing their
individual incoherencies) it is possible that query incoherency remains unchanged.
Thus, for a given MAX query, it is possible to have an individual data (or sub-query)
incoherency bound which is more than the query incoherency bound. But such an
incoherency bound will depend on instantaneous values of data items thus changing
very dynamically.

a) Query Cost Model
        Let us consider a query Q= MAX (A, B), which is used for disseminating max of
data items A and B from a data aggregator. Let the sumdiff values of A and B is R. and Rb
respectively. For a MAX query, the query result is the maximum of data item values.
Thus the query dynamics is decided as per the dynamics of the data item with the
maximum value. Hence, the query sumdiff is nothing but weighted average of data
sumdiffs, weighted by fraction of time when the particular data item is maximum:

                         nq        nq

                  Rq ≈ ∑ Ri (
                         i =1
                                 ∏ p( x
                                j =1, j ≠ i
                                              i   > x j ) ≤ max(Ri | 1 ≤ i ≤ nq )


where P(Xit > Xj) is the probability that value of ith data item is more than value of jth data
item.
It has the values of data item sumdiffs but for getting the probabilities it need to have
exact values of data items. As query plan dependent on individual data values (instead
of data dynamics) will be too volatile, as a first approximation it uses upper bound of
the expression given by Equation (20) as query sumdiff. Approximation used is the
maximum                                                                                      of
sumdiffs of data items involved. Now it considers the optimized execution of MAX
queries using the above mentioned query cost model.




                                                         310
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-
6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), © IAEME

b) Greedy Heuristics
       It uses greedy algorithm for solving the query planning problem with different
set of sub-query selection criteria (ψ ). In the min-cost heuristic it selects the sub
query having minimum sub-query sumdiff per data item. For the MAX query, sub-
query sumdiff is nothing but the sumdiff of the most dynamic data item in the sub-
query. Thus, for the max-gain heuristic, the gain of each sub- query is calculated as
given in Equation (21):

result ← φ
while M q ≠ φ
{
    Choose a sub-query mi ∈ M q with criterion ψ
    result ← result ∪ mi ; M q ← M q − {mi } ;
    For each data item d ∈ mi
       For each mi ∈ M q
        mi ← m j − {d };
             if mi = φ      M q ← M q − {mi }
else calculate sumdiff for modified m j ;
return result
}

VIII. CONCLUSION

       Here, presented a cost based approach to minimize the number of refreshes
required to execute an incoherency bounded continuous query. It is assume the
existence of a network of data aggregators, where each DA is capable of
disseminating a set of data items at their pre-specified incoherency bounds.
Developed an important measure for data dynamics in the form of sumdiff which, is
a more appropriate measure compared to the widely used standard deviation based
measures.
       For optimal query execution it divides the query into sub-queries and
evaluates each sub-query at a judiciously chosen data aggregator. Performance
results show that by my method the query can be executed using less than one third
the messages required for existing schemes. It is showed that the following features
of the query planning algorithms improve performance:
1) Dividing the query into sub-queries (rather than data items) and executing them
at specifically chosen data aggregators.
2) Deciding the query plan using sumdiff based mechanism specifically by
maximizing sub-query gains. Executing queries such that more dynamic data items
are part of a larger sub-query.




                                                 311
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-
6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), © IAEME

REFERENCES

[1]   Rajeev Gupta, and Krithi Ramamritham,” Query Planning for Continuous
      Aggregation Queries over a Network of Data Aggregators” IEEE 2012
      Transactions on Knowledge and Data Engineering ,Volume 24,Issue:6
[2]   D. VanderMeer, A. Datta, K Dutta, H Thomas and K Ramamritham, "Proxy-
      Based Acceleration of Dynamically Generated Content on the World Wide Web",
      ACM Transactions on Database Systems (faDS) Vol. 29,June2004.
[3]   J. Dilley, B. Maggs, J. Parikh. H Prokop, R. Sitaraman and B. Weihl, "Globally
      Distributed Content Delivery", IEEE Internet Computing Sept 2002.
[4]   S. Rangarajan. S. Mukerjee and P. Rodriguez, "User Specific Request Redirection
      in a Content Delivery Network", 8th Inti. Workshop on Web Content Caching and
      Distribution (IWCW), 2003.
[5]   S. Shah, K Ramamritham, and P. Shenoy, "Maintaining Coherency of Dynamic
      Data in Cooperating Repositories", VIDB 2002.
[6]   T. H Cormen, Charles E. Leiserson, Ronald L Rivest, and Clifford Stein.
      Introduction to Algorithms. MIT Press and McGraw-Hill 2001.
[7]   Y. Zhou, B. Chin Ooit and Kian-Lean Tan, "Disseminating Streaming Data in a
      Dynamic Environment: An Adaptive and Cost Based Approach", The
      VIDBJoumal, Issue 17, pg.I465-1483, 2008.
[8]   Asokan M, “jQuery Websites Performance Analysis Based On Loading Time: An
      Overview” International journal of Computer Engineering & Technology (IJCET),
      Volume 4, Issue 1, 2013, pp. 211 - 217, Published by IAEME.
[9]    Bharathi M A, Vijaya Kumar B P and Manjaiah D.H, “Power Efficient Data
      Aggregation Based On Swarm Intelligence And Game Theoretic Approach In
      Wireless Sensor Network” International journal of Computer Engineering &
      Technology (IJCET), Volume 3, Issue 3, 2012, pp. 184 - 199, Published by IAEME.




                                           312

Contenu connexe

Tendances

Experimental study of Data clustering using k- Means and modified algorithms
Experimental study of Data clustering using k- Means and modified algorithmsExperimental study of Data clustering using k- Means and modified algorithms
Experimental study of Data clustering using k- Means and modified algorithmsIJDKP
 
A Study of Efficiency Improvements Technique for K-Means Algorithm
A Study of Efficiency Improvements Technique for K-Means AlgorithmA Study of Efficiency Improvements Technique for K-Means Algorithm
A Study of Efficiency Improvements Technique for K-Means AlgorithmIRJET Journal
 
A SECURE DIGITAL SIGNATURE SCHEME WITH FAULT TOLERANCE BASED ON THE IMPROVED ...
A SECURE DIGITAL SIGNATURE SCHEME WITH FAULT TOLERANCE BASED ON THE IMPROVED ...A SECURE DIGITAL SIGNATURE SCHEME WITH FAULT TOLERANCE BASED ON THE IMPROVED ...
A SECURE DIGITAL SIGNATURE SCHEME WITH FAULT TOLERANCE BASED ON THE IMPROVED ...csandit
 
Analysis and implementation of modified k medoids
Analysis and implementation of modified k medoidsAnalysis and implementation of modified k medoids
Analysis and implementation of modified k medoidseSAT Publishing House
 
IMPROVING SCHEDULING OF DATA TRANSMISSION IN TDMA SYSTEMS
IMPROVING SCHEDULING OF DATA TRANSMISSION IN TDMA SYSTEMSIMPROVING SCHEDULING OF DATA TRANSMISSION IN TDMA SYSTEMS
IMPROVING SCHEDULING OF DATA TRANSMISSION IN TDMA SYSTEMScsandit
 
Analysis of Pattern Transformation Algorithms for Sensitive Knowledge Protect...
Analysis of Pattern Transformation Algorithms for Sensitive Knowledge Protect...Analysis of Pattern Transformation Algorithms for Sensitive Knowledge Protect...
Analysis of Pattern Transformation Algorithms for Sensitive Knowledge Protect...IOSR Journals
 
10 Algorithms in data mining
10 Algorithms in data mining10 Algorithms in data mining
10 Algorithms in data miningGeorge Ang
 
Mining Of Big Data Using Map-Reduce Theorem
Mining Of Big Data Using Map-Reduce TheoremMining Of Big Data Using Map-Reduce Theorem
Mining Of Big Data Using Map-Reduce TheoremIOSR Journals
 
Dimensionality Reduction Techniques for Document Clustering- A Survey
Dimensionality Reduction Techniques for Document Clustering- A SurveyDimensionality Reduction Techniques for Document Clustering- A Survey
Dimensionality Reduction Techniques for Document Clustering- A SurveyIJTET Journal
 
Introducing Novel Graph Database Cloud Computing For Efficient Data Management
Introducing Novel Graph Database Cloud Computing For Efficient Data ManagementIntroducing Novel Graph Database Cloud Computing For Efficient Data Management
Introducing Novel Graph Database Cloud Computing For Efficient Data ManagementIJERA Editor
 
A location based least-cost scheduling for data-intensive applications
A location based least-cost scheduling for data-intensive applicationsA location based least-cost scheduling for data-intensive applications
A location based least-cost scheduling for data-intensive applicationsIAEME Publication
 
IRJET- A Survey of Text Document Clustering by using Clustering Techniques
IRJET- A Survey of Text Document Clustering by using Clustering TechniquesIRJET- A Survey of Text Document Clustering by using Clustering Techniques
IRJET- A Survey of Text Document Clustering by using Clustering TechniquesIRJET Journal
 
Paper id 25201498
Paper id 25201498Paper id 25201498
Paper id 25201498IJRAT
 
The International Journal of Engineering and Science (The IJES)
The International Journal of Engineering and Science (The IJES)The International Journal of Engineering and Science (The IJES)
The International Journal of Engineering and Science (The IJES)theijes
 
Ieeepro techno solutions 2013 ieee java project -building confidential and ...
Ieeepro techno solutions   2013 ieee java project -building confidential and ...Ieeepro techno solutions   2013 ieee java project -building confidential and ...
Ieeepro techno solutions 2013 ieee java project -building confidential and ...hemanthbbc
 

Tendances (17)

Experimental study of Data clustering using k- Means and modified algorithms
Experimental study of Data clustering using k- Means and modified algorithmsExperimental study of Data clustering using k- Means and modified algorithms
Experimental study of Data clustering using k- Means and modified algorithms
 
A Study of Efficiency Improvements Technique for K-Means Algorithm
A Study of Efficiency Improvements Technique for K-Means AlgorithmA Study of Efficiency Improvements Technique for K-Means Algorithm
A Study of Efficiency Improvements Technique for K-Means Algorithm
 
A SECURE DIGITAL SIGNATURE SCHEME WITH FAULT TOLERANCE BASED ON THE IMPROVED ...
A SECURE DIGITAL SIGNATURE SCHEME WITH FAULT TOLERANCE BASED ON THE IMPROVED ...A SECURE DIGITAL SIGNATURE SCHEME WITH FAULT TOLERANCE BASED ON THE IMPROVED ...
A SECURE DIGITAL SIGNATURE SCHEME WITH FAULT TOLERANCE BASED ON THE IMPROVED ...
 
Analysis and implementation of modified k medoids
Analysis and implementation of modified k medoidsAnalysis and implementation of modified k medoids
Analysis and implementation of modified k medoids
 
IMPROVING SCHEDULING OF DATA TRANSMISSION IN TDMA SYSTEMS
IMPROVING SCHEDULING OF DATA TRANSMISSION IN TDMA SYSTEMSIMPROVING SCHEDULING OF DATA TRANSMISSION IN TDMA SYSTEMS
IMPROVING SCHEDULING OF DATA TRANSMISSION IN TDMA SYSTEMS
 
Analysis of Pattern Transformation Algorithms for Sensitive Knowledge Protect...
Analysis of Pattern Transformation Algorithms for Sensitive Knowledge Protect...Analysis of Pattern Transformation Algorithms for Sensitive Knowledge Protect...
Analysis of Pattern Transformation Algorithms for Sensitive Knowledge Protect...
 
10 Algorithms in data mining
10 Algorithms in data mining10 Algorithms in data mining
10 Algorithms in data mining
 
Mining Of Big Data Using Map-Reduce Theorem
Mining Of Big Data Using Map-Reduce TheoremMining Of Big Data Using Map-Reduce Theorem
Mining Of Big Data Using Map-Reduce Theorem
 
Visualization of Crisp and Rough Clustering using MATLAB
Visualization of Crisp and Rough Clustering using MATLABVisualization of Crisp and Rough Clustering using MATLAB
Visualization of Crisp and Rough Clustering using MATLAB
 
Dimensionality Reduction Techniques for Document Clustering- A Survey
Dimensionality Reduction Techniques for Document Clustering- A SurveyDimensionality Reduction Techniques for Document Clustering- A Survey
Dimensionality Reduction Techniques for Document Clustering- A Survey
 
Introducing Novel Graph Database Cloud Computing For Efficient Data Management
Introducing Novel Graph Database Cloud Computing For Efficient Data ManagementIntroducing Novel Graph Database Cloud Computing For Efficient Data Management
Introducing Novel Graph Database Cloud Computing For Efficient Data Management
 
50120140505013
5012014050501350120140505013
50120140505013
 
A location based least-cost scheduling for data-intensive applications
A location based least-cost scheduling for data-intensive applicationsA location based least-cost scheduling for data-intensive applications
A location based least-cost scheduling for data-intensive applications
 
IRJET- A Survey of Text Document Clustering by using Clustering Techniques
IRJET- A Survey of Text Document Clustering by using Clustering TechniquesIRJET- A Survey of Text Document Clustering by using Clustering Techniques
IRJET- A Survey of Text Document Clustering by using Clustering Techniques
 
Paper id 25201498
Paper id 25201498Paper id 25201498
Paper id 25201498
 
The International Journal of Engineering and Science (The IJES)
The International Journal of Engineering and Science (The IJES)The International Journal of Engineering and Science (The IJES)
The International Journal of Engineering and Science (The IJES)
 
Ieeepro techno solutions 2013 ieee java project -building confidential and ...
Ieeepro techno solutions   2013 ieee java project -building confidential and ...Ieeepro techno solutions   2013 ieee java project -building confidential and ...
Ieeepro techno solutions 2013 ieee java project -building confidential and ...
 

En vedette

Overview of query evaluation
Overview of query evaluationOverview of query evaluation
Overview of query evaluationavniS
 
Classroom Observation Techniques
Classroom  Observation  TechniquesClassroom  Observation  Techniques
Classroom Observation TechniquesChuck Klinger
 
HCI 3e - Ch 7: Design rules
HCI 3e - Ch 7:  Design rulesHCI 3e - Ch 7:  Design rules
HCI 3e - Ch 7: Design rulesAlan Dix
 
HCI 3e - Ch 9: Evaluation techniques
HCI 3e - Ch 9:  Evaluation techniquesHCI 3e - Ch 9:  Evaluation techniques
HCI 3e - Ch 9: Evaluation techniquesAlan Dix
 

En vedette (6)

Overview of query evaluation
Overview of query evaluationOverview of query evaluation
Overview of query evaluation
 
Classroom Observation Techniques
Classroom  Observation  TechniquesClassroom  Observation  Techniques
Classroom Observation Techniques
 
Classroom Observation Techniques
Classroom Observation TechniquesClassroom Observation Techniques
Classroom Observation Techniques
 
HCI 3e - Ch 7: Design rules
HCI 3e - Ch 7:  Design rulesHCI 3e - Ch 7:  Design rules
HCI 3e - Ch 7: Design rules
 
HCI 3e - Ch 9: Evaluation techniques
HCI 3e - Ch 9:  Evaluation techniquesHCI 3e - Ch 9:  Evaluation techniques
HCI 3e - Ch 9: Evaluation techniques
 
Observation report-1
Observation report-1Observation report-1
Observation report-1
 

Similaire à Query evaluation over network of data aggregators

A Novel Approach To Answer Continuous Aggregation Queries Using Data Aggregat...
A Novel Approach To Answer Continuous Aggregation Queries Using Data Aggregat...A Novel Approach To Answer Continuous Aggregation Queries Using Data Aggregat...
A Novel Approach To Answer Continuous Aggregation Queries Using Data Aggregat...IJMER
 
An Improved Differential Evolution Algorithm for Data Stream Clustering
An Improved Differential Evolution Algorithm for Data Stream ClusteringAn Improved Differential Evolution Algorithm for Data Stream Clustering
An Improved Differential Evolution Algorithm for Data Stream ClusteringIJECEIAES
 
Compressive Data Gathering using NACS in Wireless Sensor Network
Compressive Data Gathering using NACS in Wireless Sensor NetworkCompressive Data Gathering using NACS in Wireless Sensor Network
Compressive Data Gathering using NACS in Wireless Sensor NetworkIRJET Journal
 
Synchronization and replication through ocmdbs
Synchronization and replication through ocmdbsSynchronization and replication through ocmdbs
Synchronization and replication through ocmdbsIAEME Publication
 
Data Accuracy Models under Spatio - Temporal Correlation with Adaptive Strate...
Data Accuracy Models under Spatio - Temporal Correlation with Adaptive Strate...Data Accuracy Models under Spatio - Temporal Correlation with Adaptive Strate...
Data Accuracy Models under Spatio - Temporal Correlation with Adaptive Strate...IDES Editor
 
Efficient processing of spatial range queries on wireless broadcast streams
Efficient processing of spatial range queries on wireless broadcast streamsEfficient processing of spatial range queries on wireless broadcast streams
Efficient processing of spatial range queries on wireless broadcast streamsijdms
 
IEEE Cloud computing 2016 Title and Abstract
IEEE Cloud computing 2016 Title and AbstractIEEE Cloud computing 2016 Title and Abstract
IEEE Cloud computing 2016 Title and Abstracttsysglobalsolutions
 
IRJET- Distributed Resource Allocation for Data Center Networks: A Hierar...
IRJET-  	  Distributed Resource Allocation for Data Center Networks: A Hierar...IRJET-  	  Distributed Resource Allocation for Data Center Networks: A Hierar...
IRJET- Distributed Resource Allocation for Data Center Networks: A Hierar...IRJET Journal
 
Improvement of limited Storage Placement in Wireless Sensor Network
Improvement of limited Storage Placement in Wireless Sensor NetworkImprovement of limited Storage Placement in Wireless Sensor Network
Improvement of limited Storage Placement in Wireless Sensor NetworkIOSR Journals
 
Prediction Based Efficient Resource Provisioning and Its Impact on QoS Parame...
Prediction Based Efficient Resource Provisioning and Its Impact on QoS Parame...Prediction Based Efficient Resource Provisioning and Its Impact on QoS Parame...
Prediction Based Efficient Resource Provisioning and Its Impact on QoS Parame...IJECEIAES
 
Implementation on Data Security Approach in Dynamic Multi Hop Communication
 Implementation on Data Security Approach in Dynamic Multi Hop Communication Implementation on Data Security Approach in Dynamic Multi Hop Communication
Implementation on Data Security Approach in Dynamic Multi Hop CommunicationIJCSIS Research Publications
 
PRIVACY PRESERVING CLUSTERING IN DATA MINING USING VQ CODE BOOK GENERATION
PRIVACY PRESERVING CLUSTERING IN DATA MINING USING VQ CODE BOOK GENERATIONPRIVACY PRESERVING CLUSTERING IN DATA MINING USING VQ CODE BOOK GENERATION
PRIVACY PRESERVING CLUSTERING IN DATA MINING USING VQ CODE BOOK GENERATIONcscpconf
 
Ieeepro techno solutions 2014 ieee dotnet project - cloud bandwidth and cos...
Ieeepro techno solutions   2014 ieee dotnet project - cloud bandwidth and cos...Ieeepro techno solutions   2014 ieee dotnet project - cloud bandwidth and cos...
Ieeepro techno solutions 2014 ieee dotnet project - cloud bandwidth and cos...ASAITHAMBIRAJAA
 
Ieeepro techno solutions 2014 ieee java project - cloud bandwidth and cost ...
Ieeepro techno solutions   2014 ieee java project - cloud bandwidth and cost ...Ieeepro techno solutions   2014 ieee java project - cloud bandwidth and cost ...
Ieeepro techno solutions 2014 ieee java project - cloud bandwidth and cost ...hemanthbbc
 
An effective reliable broadcast code dissemination in wireless sensor networks
An effective reliable broadcast code dissemination in wireless sensor networksAn effective reliable broadcast code dissemination in wireless sensor networks
An effective reliable broadcast code dissemination in wireless sensor networksIAEME Publication
 
IRJET- Secure Data Access on Distributed Database using Skyline Queries
IRJET- Secure Data Access on Distributed Database using Skyline QueriesIRJET- Secure Data Access on Distributed Database using Skyline Queries
IRJET- Secure Data Access on Distributed Database using Skyline QueriesIRJET Journal
 
Shceduling iot application on cloud computing
Shceduling iot application on cloud computingShceduling iot application on cloud computing
Shceduling iot application on cloud computingEman Ahmed
 

Similaire à Query evaluation over network of data aggregators (20)

A Novel Approach To Answer Continuous Aggregation Queries Using Data Aggregat...
A Novel Approach To Answer Continuous Aggregation Queries Using Data Aggregat...A Novel Approach To Answer Continuous Aggregation Queries Using Data Aggregat...
A Novel Approach To Answer Continuous Aggregation Queries Using Data Aggregat...
 
An Improved Differential Evolution Algorithm for Data Stream Clustering
An Improved Differential Evolution Algorithm for Data Stream ClusteringAn Improved Differential Evolution Algorithm for Data Stream Clustering
An Improved Differential Evolution Algorithm for Data Stream Clustering
 
Compressive Data Gathering using NACS in Wireless Sensor Network
Compressive Data Gathering using NACS in Wireless Sensor NetworkCompressive Data Gathering using NACS in Wireless Sensor Network
Compressive Data Gathering using NACS in Wireless Sensor Network
 
Synchronization and replication through ocmdbs
Synchronization and replication through ocmdbsSynchronization and replication through ocmdbs
Synchronization and replication through ocmdbs
 
Data Accuracy Models under Spatio - Temporal Correlation with Adaptive Strate...
Data Accuracy Models under Spatio - Temporal Correlation with Adaptive Strate...Data Accuracy Models under Spatio - Temporal Correlation with Adaptive Strate...
Data Accuracy Models under Spatio - Temporal Correlation with Adaptive Strate...
 
i2ct_submission_105
i2ct_submission_105i2ct_submission_105
i2ct_submission_105
 
[IJET-V1I4P2] Authors : Doddappa Kandakur; Ashwini B P
[IJET-V1I4P2] Authors : Doddappa Kandakur; Ashwini B P[IJET-V1I4P2] Authors : Doddappa Kandakur; Ashwini B P
[IJET-V1I4P2] Authors : Doddappa Kandakur; Ashwini B P
 
Efficient processing of spatial range queries on wireless broadcast streams
Efficient processing of spatial range queries on wireless broadcast streamsEfficient processing of spatial range queries on wireless broadcast streams
Efficient processing of spatial range queries on wireless broadcast streams
 
IEEE Cloud computing 2016 Title and Abstract
IEEE Cloud computing 2016 Title and AbstractIEEE Cloud computing 2016 Title and Abstract
IEEE Cloud computing 2016 Title and Abstract
 
IRJET- Distributed Resource Allocation for Data Center Networks: A Hierar...
IRJET-  	  Distributed Resource Allocation for Data Center Networks: A Hierar...IRJET-  	  Distributed Resource Allocation for Data Center Networks: A Hierar...
IRJET- Distributed Resource Allocation for Data Center Networks: A Hierar...
 
Improvement of limited Storage Placement in Wireless Sensor Network
Improvement of limited Storage Placement in Wireless Sensor NetworkImprovement of limited Storage Placement in Wireless Sensor Network
Improvement of limited Storage Placement in Wireless Sensor Network
 
Prediction Based Efficient Resource Provisioning and Its Impact on QoS Parame...
Prediction Based Efficient Resource Provisioning and Its Impact on QoS Parame...Prediction Based Efficient Resource Provisioning and Its Impact on QoS Parame...
Prediction Based Efficient Resource Provisioning and Its Impact on QoS Parame...
 
Implementation on Data Security Approach in Dynamic Multi Hop Communication
 Implementation on Data Security Approach in Dynamic Multi Hop Communication Implementation on Data Security Approach in Dynamic Multi Hop Communication
Implementation on Data Security Approach in Dynamic Multi Hop Communication
 
PRIVACY PRESERVING CLUSTERING IN DATA MINING USING VQ CODE BOOK GENERATION
PRIVACY PRESERVING CLUSTERING IN DATA MINING USING VQ CODE BOOK GENERATIONPRIVACY PRESERVING CLUSTERING IN DATA MINING USING VQ CODE BOOK GENERATION
PRIVACY PRESERVING CLUSTERING IN DATA MINING USING VQ CODE BOOK GENERATION
 
Q04606103106
Q04606103106Q04606103106
Q04606103106
 
Ieeepro techno solutions 2014 ieee dotnet project - cloud bandwidth and cos...
Ieeepro techno solutions   2014 ieee dotnet project - cloud bandwidth and cos...Ieeepro techno solutions   2014 ieee dotnet project - cloud bandwidth and cos...
Ieeepro techno solutions 2014 ieee dotnet project - cloud bandwidth and cos...
 
Ieeepro techno solutions 2014 ieee java project - cloud bandwidth and cost ...
Ieeepro techno solutions   2014 ieee java project - cloud bandwidth and cost ...Ieeepro techno solutions   2014 ieee java project - cloud bandwidth and cost ...
Ieeepro techno solutions 2014 ieee java project - cloud bandwidth and cost ...
 
An effective reliable broadcast code dissemination in wireless sensor networks
An effective reliable broadcast code dissemination in wireless sensor networksAn effective reliable broadcast code dissemination in wireless sensor networks
An effective reliable broadcast code dissemination in wireless sensor networks
 
IRJET- Secure Data Access on Distributed Database using Skyline Queries
IRJET- Secure Data Access on Distributed Database using Skyline QueriesIRJET- Secure Data Access on Distributed Database using Skyline Queries
IRJET- Secure Data Access on Distributed Database using Skyline Queries
 
Shceduling iot application on cloud computing
Shceduling iot application on cloud computingShceduling iot application on cloud computing
Shceduling iot application on cloud computing
 

Plus de IAEME Publication

IAEME_Publication_Call_for_Paper_September_2022.pdf
IAEME_Publication_Call_for_Paper_September_2022.pdfIAEME_Publication_Call_for_Paper_September_2022.pdf
IAEME_Publication_Call_for_Paper_September_2022.pdfIAEME Publication
 
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...IAEME Publication
 
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURSA STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURSIAEME Publication
 
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURSBROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURSIAEME Publication
 
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONSDETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONSIAEME Publication
 
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONSANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONSIAEME Publication
 
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINOVOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINOIAEME Publication
 
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...IAEME Publication
 
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMYVISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMYIAEME Publication
 
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...IAEME Publication
 
GANDHI ON NON-VIOLENT POLICE
GANDHI ON NON-VIOLENT POLICEGANDHI ON NON-VIOLENT POLICE
GANDHI ON NON-VIOLENT POLICEIAEME Publication
 
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...IAEME Publication
 
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...IAEME Publication
 
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...IAEME Publication
 
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...IAEME Publication
 
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...IAEME Publication
 
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...IAEME Publication
 
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...IAEME Publication
 
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...IAEME Publication
 
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENTA MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENTIAEME Publication
 

Plus de IAEME Publication (20)

IAEME_Publication_Call_for_Paper_September_2022.pdf
IAEME_Publication_Call_for_Paper_September_2022.pdfIAEME_Publication_Call_for_Paper_September_2022.pdf
IAEME_Publication_Call_for_Paper_September_2022.pdf
 
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
 
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURSA STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
 
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURSBROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
 
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONSDETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
 
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONSANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
 
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINOVOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
 
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
 
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMYVISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
 
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
 
GANDHI ON NON-VIOLENT POLICE
GANDHI ON NON-VIOLENT POLICEGANDHI ON NON-VIOLENT POLICE
GANDHI ON NON-VIOLENT POLICE
 
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
 
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
 
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
 
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
 
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
 
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
 
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
 
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
 
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENTA MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
 

Query evaluation over network of data aggregators

  • 1. INTERNATIONALComputer EngineeringCOMPUTER ENGINEERING International Journal of JOURNAL OF and Technology (IJCET), ISSN 0976- 6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), © IAEME & TECHNOLOGY (IJCET) ISSN 0976 – 6367(Print) ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), pp. 301-312 IJCET © IAEME: www.iaeme.com/ijcet.asp Journal Impact Factor (2012): 3.9580 (Calculated by GISI) ©IAEME www.jifactor.com QUERY EVALUATION OVER NETWORK OF DATA AGGREGATORS Rajesh Bharati1, Amit V. Kore2 Department of Computer Engineering, DYPIET Pimpri, Pune Department of Computer Engineering, DYPIET Pimpri, Pune ABSTRACT Continuous queries are used to monitor changes to time varying data and to provide results useful for online decision making. Typically a user desires to obtain the value of some aggregation function over distributed data items, for example, to know value of portfolio for a client; or the AVG of temperatures sensed by a set of sensors. In these queries a client specifies a coherency requirement as part of the query. Present a low-cost, scalable technique to answer continuous aggregation queries using a network of aggregators of dynamic data items. In such a network of data aggregators, each data aggregator serves a set of data items at specific coherencies. Just as various fragments of a dynamic web-page are served by one or more nodes of a content distribution network, our technique involves decomposing a client query into sub-queries and executing sub-queries on judiciously chosen data aggregators with their individual sub-query incoherency bounds. Provide a technique for getting the optimal set of sub-queries with their incoherency bounds which satisfies client query's coherency requirement with least number of refresh messages sent from aggregators to the client. For estimating the number of refresh messages, Build a query cost model which can be used to estimate the number of messages required to satisfy the client specified incoherency bound. Performance results using real-world traces show that our cost based query planning leads to queries being executed using less than one third the number of messages required by existing schemes I. INTRODUCTION These aggregation queries are long running queries as data is continuously changing and the user is interested in notifications when certain conditions hold. Thus, responses to these queries are refreshed continuously. In these continuous query applications, users are likely to tolerate some inaccuracy in the results. That is, the exact data values at the corresponding data sources need not be reported as long as the query results satisfy user specified accuracy requirements. For instance, a portfolio tracker may be happy with an accuracy of $10.i. Password should be easy to remember. 301
  • 2. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 0976- 6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), © IAEME Data incoherency: Data accuracy can be specified in terms of incoherency of a data item, : defined as the absolute difference in value of the data item at the data source and the value known to a ce client of the data. Let v(t) denote the value of the it" data item at the data source at time t; and let the value the data item known to the client be u(t). Then the data incoherency at the client is given by | v(t)-u(t)| For a data item which needs to be refreshed at an u(t)|. incoherency bound C a data refresh message is sent to the client as soon as data incoherency exceeds C, i.e., | v(t) v(t)-u(t)| > C. Network of data aggregators: Data refresh from data sources to clients can be don aggregators: done using push or pull based mechanisms. In a push based mechanism data sources send update messages to clients on their own whereas in pull based mechanism data sources send messages to the client only when the client makes a request. It is assume the push based mechanism for data transfer between data sources and clients. For scalable handling of push based data dissemination, network of data aggregators are pro pro-posed. In such network of data aggregators, data refreshes occur from data sources to the clients through one or more data aggregators. s Fig. 1 Data dissemination network for multiple data items Here it is assume that each data aggregator maintains its configured incoherency bounds for various data items. From a data dissemination capability point of view, each data point aggregator (DA) is characterized by a set of (d, c) pairs, where d, is a data item which the DA can disseminate at an incoherency bound c. The configured incoherency bound of a data item at a data aggregator can be maintained using any of following methods: (a) The data source refreshes the data value of the DA whenever DA's incoherency bound is about to get violated. This method has scalability problems. (b) Data aggregator(s) with tighter incoherency bound help the DA to maintain its incoherency bound in a scalable manner. II. PROBLEM STATEMENT Value of a continuous weighted additive aggregation query, at time t, can be calculated as: nq V q (t ) = ∑ ( V qi ( t ) × W qi ) i =1 (1) 302
  • 3. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976- 6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), © IAEME Vq is the value of a client query q involving nq data items with the weight of the ith data item being wqit , Suppose the result for the query given by Equation (1) needs to be continuously provided to a user at the query incoherency bound Cq• Then, the dissemination network has to ensure that: nq | ∑ ( v qi ( t ) − u qi ( t )) × w qi | ≤ C q i =1 (2) Whenever data values at sources change such that query incoherency bound is violated, the updated value should be refreshed to the client If the network of aggregators can ensure that the iO, data item has incoherency bound Cqit then the following condition ensures that the query incoherency bound Cq is satisfied: nq ∑ (C qi × wqi ) ≤ Cq i =1 (3) III. DATA DISSEMINATION COST MODEL The presented model is to estimate the number of refreshes required to disseminate a data item while maintaining a certain incoherency bound. There are two primary factors affecting the number of messages that are needed to maintain the coherency requirement: (a) The coherency requirement itself and (b) Dynamics of the data Consider a data item which needs to be disseminated at an incoherency bound C, i.e., new value of the data item will be pushed if the value deviates by more than C from the last pushed value. Thus the number of dissemination messages will be proportional to the probability of |v (t) - u (t) |greater than C for data value vet) at the source/ aggregator and u(t) at the client, at time t. A data item can be modeled as a discrete time random process where each step is correlated with its previous step a) Data Dynamics Model Specifically, the cost of data dissemination for a data item will be proportional to data sumdiff defined as Rs = ∑ | si − si −1 | i (5) Where si and si −1 are the sampled values of a data item 5 at ith and (i-1)th time instances (i.e., consecutive ticks). In practice, sumdiffvalue for a data item can be calculated at the data source by taking running average of difference between data values for consecutive ticks. 303
  • 4. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976- 6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), © IAEME b) Incoherency Bound Model Data source pushes the data value whenever it differs from the last pushed value by an amount more than C. Client estimates data value based on server specified parameters. The source pushes the new data value whenever it differs from the (client) estimated value by an amount more than C In both these cases, value at the source can be modeled as a random process with average as the value known at the client In case (b), the client and the server estimate the data value as the mean of the modeled random process whereas in case (a) deviation from the last pushed value can be modeled as zero mean process. Using Chebyshev's inequality: P(| v(t ) − u(t ) |> C ) ∝ 1 / C 2 (4) Thus, it hypothesize that the number of data refresh messages is inversely proportional to the square of the incoherency bound. c) Combining data dissemination models Number of refresh messages is proportional to data sum- diff Rs and inversely proportional to square of the incoherency bound. Further, it need not disseminate any message when either data value is not changing ( Rs = 0) or incoherency bound is unlimited (1/ C 2 =0). Thus, for a given data item 5, disseminated with an incoherency bound C, the data dissemination cost is proportional to R / C 2 . In the next section, it use this data dissemination cost model for developing cost model for additive aggregation queries IV. COST MODEL FOR ADDITIVE AGGREGATION QUERIES Consider an additive query over two data items P and Q with weights Wp and Wq respectively and then need to estimate its dissemination cost. H data items are disseminated separately, the query sumdiff will be: Rdata = wpRp +wqRq = wp ∑ pi − pi−1 | +wq ∑ qi −qi−1 | | | (6) Instead, if the aggregator uses the information that client is interested in a query over P and Q (rather than their individual values), it creates and pushes a composite data item (WpP+WqQ) then the query sumdiffwill be: Rquery = ∑| wp ( pi − pi−1 ) + wq (qi − qi−1 ) | (7) Rquery is clearly less than or equal compared to Rdata . Thus IT need to estimate the sumdiff of an aggregation query (i.e., Rquery ) given the sumdiff values of individual data items (i.e., Rp and Rq). Only data aggregators are in a position to calculate Rquery as different data items may be disseminated from different sources. Here develop the query cost model in two stages. 304
  • 5. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976- 6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), © IAEME a) Modeling Correlation between Data Dynamics From Equations (6) and (7) it can see that if two data items are correlated such that as the value of one data item increases that of the other data item also increase then R query will be closer to R data . On the other hand if the data items are inversely correlated then R query will be less compared to R data . Thus, intuitively, it can represent the relationship between Rquery and sumdiff values of the individual data items using a correlation measure associated with the pair of data items. Specifically, if p is the correlation measure then Rquery can be written as: 2 (w2 Rp + wq Rq + 2ρwp Rp wq Rq ) p 2 2 2 R query ∝ (w2 + wq + 2ρwp wq ) p 2 (8) The correlation measure ρ is defined such that − 1 ≤ ρ ≤1 . So, R query will always be less than | wpR p + w q R q | and always be more than | w p R p − w q R q | .The above relation can be better understood from its similarity with the standard deviation of the sum of two random variables. For data items P and Q, ρ can be calculated as: ρ= ∑(p i − p i −1 )(q i − q i −1 ) (9) ( ∑(p i − p i −1 ) 2 ∑ (q i − q i −1 ) 2 b) Query based Normalization Suppose there is need to compare the cost of two queries: a SUM query involving two data items and an AVG query involving the same set of data items. Let the query incoherency bound for the SUM and the AVG queries be C =2C and C2=C, respectively. From Equation (8), sumdiff of the SUM query will be double that of the AVG query. Hence, query evaluation cost (as per R / C 2 ) of the SUM query will be hall that of the AVG query. But, intuitively, disseminating the AVG of two data items at a given incoherency bound should require the same number of refresh messages as their SUM with double the incoherency bound. Thus, there is a need to normalize query costs. From a query execution cost point of view, a query with weights Wj and incoherency bound C is the same as query with weights nWj and incoherency bound nc. So, while normalizing need to ensure that both, query weights and incoherency bounds, are multiplied by the same factor. Normalized query sumdiff is given by Rquery=(w2Rp +w2Rq +2ρwpRpw Rq)/(w2 +w2 +2ρwpw ) 2 p 2 q 2 q p q q (10) The value of the normalizing factor for Rquery should be 1 / w 2 + w q + 2 ρw p w q p 2 The value of the incoherency bound has to be adjusted by the same factor. Normalization ensures that queries with arbitrary values of weights can be compared for execution cost 305
  • 6. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976- 6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), © IAEME estimates. Equation (10) can be extended to get query sumdiff for any general weighted aggregation query given by Equation (1) as: nq nq nq ∑ w qi R i2 + 2 ∑ ∑ ρ ij w qi w qj R i R j 2 i =1 i =1 j =1, j ≠ i RQ = nq nq nq ∑ i=1 2 w qi + ∑ ∑ i =1 ρ ij w qi w qj j =1, j ≠ i V. QUERY PLANNING FOR WEIGHTED ADDITIVE AGGREGATION QUERIES For executing an incoherency bounded continuous query, a query plan is required. The query planning problem can be stated as: Inputs: (1) A network of data aggregators in the form of a relation f (A, D, C) specifying the N data aggregators ak ∈ A(1 ≤ k ≤ N ) , set Dk ⊆ D of data items disseminated by the data aggregator a k and incoherency bound t kj , which the aggregator a k can ensure for each data item d kj ∈ D k (2) Client query q and its incoherency bound C q . An additive aggregation query q can be represented as ∑w d qi qi Where wqi ; is the weight of the data item d qi ; for 1 ≤ i ≤ nq Outputs: (1) q k , for (1 ≤ k ≤ N ) , i.e., sub-query for each data aggregator ak . (2) C qk , for (1 ≤ k ≤ N ) i.e., incoherency bounds for all the sub queries. Thus, to get a query plan here need to perform following tasks: 1. Determining sub-queries: For the client query q get sub queries q k s for each data aggregator. 2. Dividing incoherency bound: Divide the query incoherency bound C q , among sub-queries to get C qk s . For optimal query planning, above tasks are to be performed with the following objective and constraints: Optimization objective: Number of refresh messages is minimized. For a sub query q k , the estimated number of refresh messages is given by kR / C where R qk is the sumdiff of qk 2 qk the sub query q k , C qk is the incoherency bound assigned to it and k, the proportionality factor, is the same for all sub queries of a given query q. Thus total number of refresh messages is estimated as: N R qk Z q = k∑ 2 k =1 C qk (12) 306
  • 7. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976- 6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), © IAEME Hence Z , needs to be minimized for minimizing the number of refreshes. q Constraint 1: q k is executable at ak : Each DA has the data items required to execute the sub- query allocated to it, i.e., for each data item d qki required for the sub query q , d qki ∈ D k k Constraint2: Query incoherency bound is satisfied: Query incoherency should be less than or equal to the query incoherency bound. For additive aggregation queries, value of the client query is the sum of sub-query values. As different sub-queries are disseminated by different data aggregators, need to ensure that sum of sub-query incoherencies is less than or equal to the query incoherency bound. Thus ∑C qk ≤ Cq (13) Constraint3: Sub-query incoherency bound is satisfied: Data incoherency bounds at ak ( t kj for d qki should be such that the sub-query incoherency bound C qk can be satisfied at ∈ D k that DA. The tightest incoherency bound T qk which the data aggregator a k can satisfy for the given sub-query q k can be calculated as Tqk = ∑ (Wqi × t qj | d qi ≡ d qj ) nqk For satisfying this constraint 1 ensure C qk > T qk Following is the outline of my approach for solving this constraint optimization problem as detailed in the rest of this chapter: it proves that determining sub-queries while minimizing Z q , as given by Equation (12), is NP hard. If the set of sub-queries ( q k ) is already given, sub-query incoherency bounds C qk s can be optimally determined to minimize Z q . As optimally dividing the query into sub queries is NP-hard and there is no known approximation algorithm, it present two heuristics for determining sub-queries while satisfying as many constraints as possible (Constraint1 and Constraint2 to be precise). Then it present variation of the two heuristics for ensuring that sub-query incoherency bound is satisfied (Constraint3). In particular, to get a solution of the query planning problem, the heuristics presented are used for determining sub-queries. Then, using the set of sub-queries, the method outlined is used for dividing in coherency bound VI. OPTIMAL ALLOCATION OF QUERY INCOHERENCY BOUND AMONG SUB -QUERIES If know the division of the client query into sub queries, using Equation (11) can calculate sumdiffvalues of all the sub-queries. Thus, it need to minimize Z, given by Equation (12) subject to Constaint2 (query incoherency bound is satisfied) and Constaint3 (sub-query incoherency bound is satisfied). Then it can get a close form expression by solving Equation (12) with Equation (13) using Lagrange Multiplier scheme. In that scheme minimizes N N 2 ∑ (R k =1 qk / C qk ) + λ (∑ C qk − C q ) k =1 307
  • 8. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976- 6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), © IAEME For a constraint λ to get values of C qk s as N C qk = C q Rqk 3 /(∑ Rqk 3 ) 1/ 1/ k =1 (14) i.e., without the Constaint3, sub-query incoherency bounds should be allocated in proportion to R1/3 qk . Use this expression to develop heuristics for optimally dividing the client query into sub-queries. If it also considers Constaint3 then, it can model the problem of minimization of Z, (while satisfying Constaint2 and Constaint3) as a (non-linear) convex optimization problem. The non-linear convex optimization problem can be solved using various convex optimization techniques available in the literature such as gradient descent method, barrier method etc. It used gradient descent method (fmincon function in MATLAB) to solve this nonlinear optimization problem to get the values of individual sub-query incoherency bounds for a given set of sub queries. Here next describe two greedy heuristics to determine sub- queries while using the formulations developed in this section. a) Greedy Heuristics for Deriving the Sub-queries The algorithm gives the outline of greedy algorithm for deriving sub-queries. First, get a set of maximal sub-queries (Mq) corresponding to all the data aggregators in the network. The maximal sub-query for a data aggregator is defined as the largest part of the query which can be disseminated by the DA (i.e., the maximal sub-query has all the query data items which the DA can disseminate). For example, consider a client query 50dl +200d2+100d3.For the data aggregators a1, and a2, the maximal sub-query for a1 will be ml=50d1 + 100d3, whereas for a2 it will be m2=50d1 + 200d2. For the given client query (q) and relation consisting of data aggregators, data-items, and data incoherency bounds (f(A,D,C))maximal sub queries can be obtained for each data aggregator by forming sub-query involving all data items in the intersection of query data items and those being disseminated by the DA. This operation can be performed in O|q|.max| Dk |. Where |q| is number of data items in the query max| Dk |, is the maximum number of data items disseminated by any DA. For each sub-query m ∈ M q , its sumdiff Rm can be calculated using Equation (11). Different criteria (ψ ) can be used to select a sub-query in each iteration of various greedy heuristics. All data items covered by the selected sub-query are removed from all the remaining sub queries in M q before performing the next iteration. It should be noted that sub-queries for DAs can be null. Now it describe two criteria (ψ ) for the greedy heuristics; 1) min-cost: estimate of query execution cost is minimized, and 2) max-gain: estimated gain due to executing the query using sub-queries is maximized. a) Minimum Cost Heuristic As there is need to minimize the query cost, a sub-query with minimum cost per data item can be chosen in each iteration of the algorithm criterion ψ ≡ minimize ( R m / C m | m |) . But from 2 Equation (14) it can see that the sub-query incoherency bounds should be allocated in proportion to Rk / 3 . Using Equations (12) and (14) 1 308
  • 9. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976- 6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), © IAEME N 1 Z 1/3 q = C 2/3 ∑ k =1 R qk/ 3 1 q From Equation (15), it is clear that for minimizing the query execution cost it should select the set of sub queries so that is minimized. It can do that by using criterion ψ ≡ ∑ R 1/3 qk minimize ( Rm/ 3 / m |) in the greedy algorithm. Once IT get the optimal set of sub-queries it can 1 use Equation (15) and Constraint 3 (Cqk ≥ Tqk) to optimally allocate the query incoherency bound among them using any of the convex optimization techniques. But this method of first deriving sub-queries and then allocating the incoherency bounds has a problem which is described next b) Maximum Gain Heuristic Now here presents an algorithm which instead of minimizing the estimated query execution cost maximizes the estimated gains of executing the client query using sub queries. In this algorithm, for each sub-query, here calculate the relative gain of executing it by finding the sumdiff difference between cases when each data item is obtained separately and when all the data items are aggregated as a single sub-query (i.e., maximal sub-query). Thus, the relative gain for a sub-query ∑ wi di , can be written as: ∑w d i i i Gm = −1 ∑ w R + ∑∑ ρ i 2 i i 2 j i ij wi w j Ri R j Where Ri is sumdiff of the data item di this algorithm can be implemented by using criterion ψ ≡ maximize ( G m / | m |) to get the set of sub-queries and corresponding DAs. Then it uses the convex optimization method to allocate incoherency bounds among sub queries. To tackle the query satisfiability issue the query gain Equation (16) is modified to: (∑ wiTi ) i G ' m = Gm − Cq Rm/ 3 1 where T, is tightest incoherency bound that can be satisfied for the data item d; and R", is the sub-query sumdiff Reasons for selecting the particular extended objective function are same as ones outlined for the min-cost heuristic To summarize, for a given client query and a network of data aggregators, first it get the maximal sub-queries for all data aggregators. Here it uses heuristics described in this section to derive sub-queries. In these heuristics extended objective functions are used to have the desired level of query satisfiability. 309
  • 10. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976- 6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), © IAEME VII. QUERY PLANNING FOR MAX QUERIES This describes the optimal query planning for MAX queries. MIN queries can be handled in the similar manner. A MAX query, where a client wants the maximum of a specified set of data item values, can be written as: Vq (t ) = max(vqi (t ),1 ≤ i ≤ nq ) For MAX queries, relationship between the query incoherency bound and required data incoherency bounds. According to one such formulation, if the network of aggregators can ensure that the ith data item has incoherency bound C; then the following condition ensures that the query incoherency bound Cq is satisfied: Ci ≤ Cq , ∀i,1 ≤ i ≤ nq In these queries even if values of one or more data item change (changing their individual incoherencies) it is possible that query incoherency remains unchanged. Thus, for a given MAX query, it is possible to have an individual data (or sub-query) incoherency bound which is more than the query incoherency bound. But such an incoherency bound will depend on instantaneous values of data items thus changing very dynamically. a) Query Cost Model Let us consider a query Q= MAX (A, B), which is used for disseminating max of data items A and B from a data aggregator. Let the sumdiff values of A and B is R. and Rb respectively. For a MAX query, the query result is the maximum of data item values. Thus the query dynamics is decided as per the dynamics of the data item with the maximum value. Hence, the query sumdiff is nothing but weighted average of data sumdiffs, weighted by fraction of time when the particular data item is maximum: nq nq Rq ≈ ∑ Ri ( i =1 ∏ p( x j =1, j ≠ i i > x j ) ≤ max(Ri | 1 ≤ i ≤ nq ) where P(Xit > Xj) is the probability that value of ith data item is more than value of jth data item. It has the values of data item sumdiffs but for getting the probabilities it need to have exact values of data items. As query plan dependent on individual data values (instead of data dynamics) will be too volatile, as a first approximation it uses upper bound of the expression given by Equation (20) as query sumdiff. Approximation used is the maximum of sumdiffs of data items involved. Now it considers the optimized execution of MAX queries using the above mentioned query cost model. 310
  • 11. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976- 6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), © IAEME b) Greedy Heuristics It uses greedy algorithm for solving the query planning problem with different set of sub-query selection criteria (ψ ). In the min-cost heuristic it selects the sub query having minimum sub-query sumdiff per data item. For the MAX query, sub- query sumdiff is nothing but the sumdiff of the most dynamic data item in the sub- query. Thus, for the max-gain heuristic, the gain of each sub- query is calculated as given in Equation (21): result ← φ while M q ≠ φ { Choose a sub-query mi ∈ M q with criterion ψ result ← result ∪ mi ; M q ← M q − {mi } ; For each data item d ∈ mi For each mi ∈ M q mi ← m j − {d }; if mi = φ M q ← M q − {mi } else calculate sumdiff for modified m j ; return result } VIII. CONCLUSION Here, presented a cost based approach to minimize the number of refreshes required to execute an incoherency bounded continuous query. It is assume the existence of a network of data aggregators, where each DA is capable of disseminating a set of data items at their pre-specified incoherency bounds. Developed an important measure for data dynamics in the form of sumdiff which, is a more appropriate measure compared to the widely used standard deviation based measures. For optimal query execution it divides the query into sub-queries and evaluates each sub-query at a judiciously chosen data aggregator. Performance results show that by my method the query can be executed using less than one third the messages required for existing schemes. It is showed that the following features of the query planning algorithms improve performance: 1) Dividing the query into sub-queries (rather than data items) and executing them at specifically chosen data aggregators. 2) Deciding the query plan using sumdiff based mechanism specifically by maximizing sub-query gains. Executing queries such that more dynamic data items are part of a larger sub-query. 311
  • 12. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976- 6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 1, January- February (2013), © IAEME REFERENCES [1] Rajeev Gupta, and Krithi Ramamritham,” Query Planning for Continuous Aggregation Queries over a Network of Data Aggregators” IEEE 2012 Transactions on Knowledge and Data Engineering ,Volume 24,Issue:6 [2] D. VanderMeer, A. Datta, K Dutta, H Thomas and K Ramamritham, "Proxy- Based Acceleration of Dynamically Generated Content on the World Wide Web", ACM Transactions on Database Systems (faDS) Vol. 29,June2004. [3] J. Dilley, B. Maggs, J. Parikh. H Prokop, R. Sitaraman and B. Weihl, "Globally Distributed Content Delivery", IEEE Internet Computing Sept 2002. [4] S. Rangarajan. S. Mukerjee and P. Rodriguez, "User Specific Request Redirection in a Content Delivery Network", 8th Inti. Workshop on Web Content Caching and Distribution (IWCW), 2003. [5] S. Shah, K Ramamritham, and P. Shenoy, "Maintaining Coherency of Dynamic Data in Cooperating Repositories", VIDB 2002. [6] T. H Cormen, Charles E. Leiserson, Ronald L Rivest, and Clifford Stein. Introduction to Algorithms. MIT Press and McGraw-Hill 2001. [7] Y. Zhou, B. Chin Ooit and Kian-Lean Tan, "Disseminating Streaming Data in a Dynamic Environment: An Adaptive and Cost Based Approach", The VIDBJoumal, Issue 17, pg.I465-1483, 2008. [8] Asokan M, “jQuery Websites Performance Analysis Based On Loading Time: An Overview” International journal of Computer Engineering & Technology (IJCET), Volume 4, Issue 1, 2013, pp. 211 - 217, Published by IAEME. [9] Bharathi M A, Vijaya Kumar B P and Manjaiah D.H, “Power Efficient Data Aggregation Based On Swarm Intelligence And Game Theoretic Approach In Wireless Sensor Network” International journal of Computer Engineering & Technology (IJCET), Volume 3, Issue 3, 2012, pp. 184 - 199, Published by IAEME. 312