SlideShare une entreprise Scribd logo
1  sur  36
Télécharger pour lire hors ligne
Efficient Tabling of Structured Data
 Using Indexing and Program Transformation

         Christian Theil Have and Henning Christiansen

     Research group PLIS: Programming, Logic and Intelligent Systems
   Department of Communication, Business and Information Technologies
      Roskilde University, P.O.Box 260, DK-4000 Roskilde, Denmark


               PADL in Philadelphia, January 23, 2012




Christian Theil Have and Henning Christiansen   Efficient Tabling of Structured Data
Outline



1   Motivation and background
     The trouble with tabling of structured data


2   A workaround implemented in Prolog
      Examples
        Example: Edit Distance
        Example: Hidden Markov Model in PRISM


3   An automatic program transformation




        Christian Theil Have and Henning Christiansen   Efficient Tabling of Structured Data
Motivation and background


Motivation



   In the LoSt project we explore the use of Probabilistic Logic
   programming (with PRISM) for biological sequence analysis.
   We had some problems analyzing very long sequences..
   .. and identified the culprit: Tabling of structured data.
   Inspired by earlier work by Christiansen and Gallagher which address a
   similar problem related to tabling non-discrimininatory arguments.

   Henning Christiansen, John P. Gallagher
   Non-discriminating Arguments and their Uses.
   ICLP 2009.



      Christian Theil Have and Henning Christiansen   Efficient Tabling of Structured Data
Motivation and background    The trouble with tabling of structured data


The trouble with tabling of structured data

Tabling in logic programming is an established technique which,
    can give a significant speed-up of program execution.
    make it easier to write efficient programs in a declarative style.
    is similar to memoization in functional programming.
The idea:
    The system maintains a table of calls and their answers.
    when a new call is entered, check if it is stored in the table
    if so, use previously found solution.
Tabling is included in several recognized Prolog systems such as B-Prolog,
YAP and XSB.


       Christian Theil Have and Henning Christiansen             Efficient Tabling of Structured Data
Motivation and background    The trouble with tabling of structured data


The trouble with tabling of structured data


 An innocent looking                              call: last([1,2,3,4,5],X)
 predicate: last/2                                last([1,2,3,4,5],X)
 last([X],X).                                     last([1,2,3,4],X)
 last([_|L],X) :-                                 last([1,2,3],X)
      last(L,X).                                  last([1,2],X)
                                                  last([1],X)
     Traverses a list to find
     the last element.                            call table
     Time/space complexity:                       last([1,2,3,4,5],X).
     O(n).                                        last([1,2,3,4],X).
                                                  last([1,2,3],X).
     If we table last/2:
                                                  last([1,2],X).
     n + n − 1 + n − 2...1
                                                  last([1],X).
     ≈ O(n2 ) !

      Christian Theil Have and Henning Christiansen             Efficient Tabling of Structured Data
Motivation and background    The trouble with tabling of structured data


Wait a minute..

Tabling systems do employ some advanced techniques to avoid the
expensive copying and which may reduce memory consumption and/or
time complexity.
For instance,
    B-Prolog uses hashing of goals.
    XSB uses a trie data structure.
    YAP uses a trie structure, which is refined into a so-called global trie
    which applies a sharing strategy for common subterms whenever
    possible.
These techniques may reduce space consumption, but if there is no sharing
between the tables and the actual arguments of an active call, each
execution of a call may involve a full traversal (naive copying) of its
arguments.

       Christian Theil Have and Henning Christiansen             Efficient Tabling of Structured Data
Motivation and background    The trouble with tabling of structured data


Benchmarking last/2 for B-Prolog, Yap and XSB




To investigate, we benchmarked the tabled version of last/2 for each of
the major Prolog tabling engines.
To further investigate whether performance is data dependent, we
benchmarked with both repeated data and random data.




       Christian Theil Have and Henning Christiansen             Efficient Tabling of Structured Data
Motivation and background                               The trouble with tabling of structured data


Space usage, tabled last/2 (1)
                                                                                         b) Space usage




                                       2000000
                                                                                                                                                                     ●


                                                     ●       XSB (random data)                                                                                   ●

                                                     ●       B−Prolog (random data)                                                                          ●

                                                             Yap (random data)
                                                                                                                                                         ●
                                                             XSB (repeated data)
                                       1500000


                                                             B−Prolog (repeated data)                                                                ●



                                                             Yap (repeated data)
             space usage (kilobytes)




                                                                                                                                                 ●


                                                                                                                                             ●


                                                                                                                                         ●
                                       1000000




                                                                                                                                     ●


                                                                                                                                 ●


                                                                                                                             ●

                                                                                                                         ●

                                                                                                                     ●

                                                                                                                 ●
                                       500000




                                                                                                             ●

                                                                                                         ●

                                                                                                     ●

                                                                                                 ●
                                                                                             ●
                                                                                         ●
                                                                                     ●
                                                                                 ●
                                                                             ●
                                                                         ●
                                                                     ●
                                                                 ●
                                                             ●
                                                         ●
                                       0




                                                 ●   ● ●
                                                     ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●



                                                 0            500                1000                1500                2000                2500                3000

                                                                                             list length (N)
     Christian Theil Have and Henning Christiansen                                                                           Efficient Tabling of Structured Data
Motivation and background              The trouble with tabling of structured data


Space usage, tabled last/2 (2)
                                                       d) Space usage expanded for the four lower curves




                                       10000
                                                   ●       B−Prolog (random data)
                                                           XSB (repeated data)
                                                           B−Prolog (repeated data)
                                       8000

                                                           Yap (repeated data)
             space usage (kilobytes)

                                       6000
                                       4000
                                       2000




                                                                                                                           ●     ●
                                                                                                                 ●   ●
                                                                                                         ●   ●
                                                                                             ●     ●
                                                                                  ●     ●
                                                                        ●     ●
                                                               ●   ●
                                                       ●   ●
                                       0




                                               0                       5000                      10000                   15000

                                                                              list length (N)
     Christian Theil Have and Henning Christiansen                                                Efficient Tabling of Structured Data
Motivation and background                                   The trouble with tabling of structured data


Time usage, tabled last/2 (1)
                                                                                        a) Time usage

                                                                                                                                                                ●




                                    4
                                            ●           XSB (random data)                                                                                   ●

                                            ●           B−Prolog (random data)                                                                          ●

                                                        Yap (random data)
                                                                                                                                                    ●
                                                        XSB (repeated data)
                                                        B−Prolog (repeated data)                                                                ●
                                    3




                                                        Yap (repeated data)                                                                 ●
             time usage (seconds)




                                                                                                                                        ●
                                                                                                                                    ●




                                                                                                                                ●


                                                                                                                            ●
                                    2




                                                                                                                        ●

                                                                                                                    ●

                                                                                                                ●

                                                                                                            ●
                                                                                                        ●


                                                                                                    ●
                                    1




                                                                                                ●
                                                                                            ●
                                                                                        ●
                                                                                ●   ●


                                                                            ●
                                                                ●       ●
                                                                    ●
                                                            ●
                                        ●               ●
                                            ●       ●
                                                ●

                                        ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●
                                    0




                                        0                500                1000                1500                2000                2500                3000

                                                                                        list length (N)
     Christian Theil Have and Henning Christiansen                                                                      Efficient Tabling of Structured Data
Motivation and background                                    The trouble with tabling of structured data


Time usage, tabled last/2 (2)
                                                                                                                                                                                                   ●
                                              c) Time usage expanded for the three lower curves
                                                                                                                                                                                               ●




                                    1.0
                                                                                                                                                                                           ●

                                                                            ●            B−Prolog (random data)
                                                                                         XSB (repeated data)                                                                           ●

                                                                                         Yap (repeated data)                                                                       ●
                                    0.8


                                                                                                                                                                               ●

                                                                                                                                                                           ●
             time usage (seconds)




                                                                                                                                                                       ●
                                    0.6




                                                                                                                                                                   ●
                                                                                                                                                               ●


                                                                                                                                                           ●
                                                                                                                                                       ●
                                                                                                                                                   ●
                                    0.4




                                                                                                                                               ●
                                                                                                                                           ●
                                                                                                                                       ●
                                                                                                                                   ●
                                                                                                                               ●
                                                                                                                           ●
                                                                                                                       ●
                                                                                                                   ●
                                                                                                               ●
                                                                                                           ●
                                    0.2




                                                                                                      ●●
                                                                                                  ●
                                                                                             ●●
                                                                            ●            ●
                                                                                     ●
                                                                                ●●
                                                                        ●
                                                                   ●●
                                                          ●   ●●
                                                       ●●
                                                 ●●●
                                          ●●●●
                                    0.0




                                          0             10000                        20000                             30000                                   40000                                   50000

                                                                                         list length (N)
     Christian Theil Have and Henning Christiansen                                                                                     Efficient Tabling of Structured Data
A workaround implemented in Prolog


A workaround implemented in Prolog



We describe a workaround giving O(1) time and space complexity for table
lookups for programs with arbitrarily large ground structured data as input
arguments.
    A term is represented as a set of facts.
    A subterm is referenced by a unique integer serving as an abstract
    pointer.
    Matching related to tabling is done solely by comparison of such
    pointers.




       Christian Theil Have and Henning Christiansen   Efficient Tabling of Structured Data
A workaround implemented in Prolog


An abstract data type


The representation is given by the following predicates which all together
can be understood as an abstract datatype.
store term( +ground-term, pointer )
           The ground-term is any ground term, and the pointer
           returned is a unique reference (an integer) for that term.
retrieve term( +pointer , ?functor , ?arg-pointers-list)
           Returns the functor and a list of pointers to representations
           of the substructures of the term represented by pointer.
full retrieve term( +pointer , ?ground-term)
           Returns the term represented by pointer.




       Christian Theil Have and Henning Christiansen   Efficient Tabling of Structured Data
A workaround implemented in Prolog


ADT properties

Property 1
It must hold for any ground term s, that the query
     store term(s, P), full retrieve term(P, S),
assigns to the variable S a value identical to s.

Property 2
It must hold for any ground term s of the form f (s1 , . . . ,sn ) that
     store term(s, P), retrieve term(P, F, Ss),
assigns to the variable F the symbol f , and to Ss a list of ground values
[p1 ,. . .,pn ] such that additional queries
     full retrieve term(pi , Si ), i = 1, . . . , n
assign to the variables Si values identical to si .

        Christian Theil Have and Henning Christiansen   Efficient Tabling of Structured Data
A workaround implemented in Prolog


ADT example

Example

The following call converts the term f(a,g(b)) into its internal
representation and returns a pointer value in the variable P.

  store_term(f(a,g(b)),P).

After this, the following sequence of calls will succeed.

  retrieve_term(P,f,[P1,P2]),
  retrieve_term(P1,a,[]),
  retrieve_term(P2,g,[P21]),
  retrieve_term(P21,b,[]),
  full_retrieve_term(P,f(a,g(b))).


        Christian Theil Have and Henning Christiansen   Efficient Tabling of Structured Data
A workaround implemented in Prolog


Implementation with assert

One possible way of implementing the predicates introduced above is to
have store term/2 asserting facts for the retrieve term/3 predicate
using increasing integers as pointers.
Example

The call store term(f(a,g(b)),P) considered in example 1 may assign
the value 100 to P and as a side-effect assert the following facts.

  retrieve_term(100,f,[101,102]).
  retrieve_term(101,a,[]).
  retrieve_term(102,g,[103]).
  retrieve_term(103,b,[]).

Notice that Prolog’s indexing on first arguments ensures a constant lookup
time.

       Christian Theil Have and Henning Christiansen   Efficient Tabling of Structured Data
A workaround implemented in Prolog


Another level of abstraction: lookup pattern/2
Finally we introduce a utility predicate which may simplify the use of the
representation in application programs. It utilizes a special kind of terms,
called patterns, which are not necessarily ground and which may contain
subterms of the form lazy(variable).
lookup pattern( +pointer , +pattern)
           The pattern is matched in a recursive way against the term
           represented by the pointer p in the following way.
           – lookup pattern(p,X) is treated as
           full retrieve term(p,X).
           – lookup pattern(p,lazy(X)) unifies X with p.
           – For any other pattern =.. [F,X1 ,. . . ,Xn ] we call

                retrieve term(p, F, [P1 ,. . .,Pn ])

                followed by lookup pattern(Pi ,Xi ), i = 1, . . . , n.
       Christian Theil Have and Henning Christiansen   Efficient Tabling of Structured Data
A workaround implemented in Prolog


A lookup pattern/2 example


Example

Continuing the previous example, we get that after the call
store term(f(a,g(b)),P).

 lookup_pattern(100, f(X,lazy(Y)))

leads to X=a and Y=102.
The lookup pattern/2 predicate will be use to simplify the automatic
program transformation introduced later.
Further efficiency can be gained by compiling it out for each specific
pattern (i.e. replacing it with calls to retrieve term/2).



       Christian Theil Have and Henning Christiansen   Efficient Tabling of Structured Data
A workaround implemented in Prolog    Examples


Examples




We consider two applying the workaround to two example programs:
    Edit Distance implemented in Prolog
    Hidden Markov Models in PRISM
For each of these we measure the impact (in time and space) of a
transformed version using the workaround.




       Christian Theil Have and Henning Christiansen              Efficient Tabling of Structured Data
A workaround implemented in Prolog    Examples


Implementation of store term/2 and retrieve term/2

The programs have been transformed manually for these experiments
based on the pointer based representation previously introduced, but
simplified slightly for lists.
For store term/2 and retrieve term/2 we use the following simple
implementation:

store_term([],Index) :- assert(retrieve_term([],Index)).

store_term([X|Xs],Idx) :-
   Idx1 is Idx + 1,
   assert(retrieve_term(Idx,[X,Idx1])),
   store_term(Xs,Idx1).



       Christian Theil Have and Henning Christiansen              Efficient Tabling of Structured Data
A workaround implemented in Prolog    Examples


Edit Distance



Calculate the minimum number of edits (insertions, deletions and
replacements) to transform on list into another.
    a minimal edit-distance algorithm written in Prolog which is
    dependent on tabling for any non-trivial problem.
    The theoretical best time complexity of edit distance has been proven
    to be O(N 2 ).
    Measure for various problem sizes.




       Christian Theil Have and Henning Christiansen              Efficient Tabling of Structured Data
A workaround implemented in Prolog    Examples


Edit distance: Implementation in Prolog.


:- table edit/3.      edit([X|Xs],[Y|Ys],Dist) :-
                         edit([X|Xs],Ys,InsDist),
edit([],[],0).           edit(Xs,[Y|Ys],DelDist),
                         edit(Xs,Ys,TailDist),
edit([],[Y|Ys],Dist) :- (X==Y ->
   edit([],Ys,Dist1),        Dist = TailDist
   Dist is 1 + Dist1.        ;
                             % Minimum of insertion,
edit([X|Xs],[],Dist) :-      % deletion or substitution
   edit(Xs,[],Dist1),        sort([InsDist,DelDist,TailDist],
   Dist is 1 + Dist1.             [MinDist|_]),
                             Dist is 1 + MinDist).


      Christian Theil Have and Henning Christiansen              Efficient Tabling of Structured Data
A workaround implemented in Prolog    Examples


Edit distance, transformed (1).


 original version                                  transformed version
                                                   edit(XIdx,YIdx,0) :-
 edit([],[],0).                                       retrieve_term(XIdx,[]),
                                                      retrieve_term(YIdx,[]).


                                                   edit(XIdx,YIdx,Dist) :-
 edit([],[Y|Ys],Dist) :-                              retrieve_term(XIdx,[]),
                                                      retrieve_term(YIdx,[_,YIdxNext]),
    edit([],Ys,Dist1),                                edit(XIdx,YIdxNext,Dist1),
    Dist is 1 + Dist1.                                Dist is Dist1 + 1.


                                                   edit(XIdx,YIdx,Dist) :-
 edit([X|Xs],[],Dist) :-                              retrieve_term(YIdx,[]),
                                                      retrieve_term(XIdx,[_,XIdxNext]),
    edit(Xs,[],Dist1),                                edit(XIdxNext,YIdx,Dist1),
                                                      Dist is Dist1 + 1.
    Dist is 1 + Dist1.
                                  continued on next slide...



       Christian Theil Have and Henning Christiansen              Efficient Tabling of Structured Data
A workaround implemented in Prolog    Examples


Edit distance, transformed (2).




 original version                                         transformed version
 edit([X|Xs],[Y|Ys],Dist) :-                              edit(XIdx,YIdx,Dist) :-
                                                             retrieve_term(XIdx,[X,NextXIdx]),
                                                             retrieve_term(YIdx,[Y,NextYIdx]),
    edit([X|Xs],Ys,InsDist),                                 edit(XIdx,NextYIdx,InsDist),
    edit(Xs,[Y|Ys],DelDist),                                 edit(NextXIdx,YIdx,DelDist),
    edit(Xs,Ys,TailDist),                                    edit(NextXIdx,NextYIdx,TailDist),
    (X==Y ->                                                 (X==Y ->
        Dist = TailDist                                          Dist = TailDist
        ;                                                        ;
        sort([InsDist,DelDist,TailDist],                         sort([InsDist,DelDist,TailDist],
             [MinDist|_]),                                      [MinDist|_]),
        Dist is 1 + MinDist).                                    Dist is 1 + MinDist).




         Christian Theil Have and Henning Christiansen              Efficient Tabling of Structured Data
A workaround implemented in Prolog                                           Examples


Edit distance: Benchmarking results (time)
                                                                                                    ●
                                                                                                                                                                ●●
                                                                                                                                                                 ●
                                                                                                                                                                ●
                                                                                                    ●                                                           ●
                                                                                           ●                                                                ●
                                                                                                ●
                                                                                                ●

                                                                                                ●                                                          ●
                                                                                                                                                           ●     ●
                                                                                                                                                                ●
                                                                                                                                                                ●
                                                                                                ●
                                                                                                                                                               ●●
                                                                                               ●●                                                             ●
                                                                                                                                                             ●●
                                                                                                                                                              ●●




                                     1.0
                                                                                               ●                                                              ●
                                                                                                                                                           ●●
                                                                                                                                                            ●●
                                                                                                                                                             ●
                                                                                                                                                            ●
                                                                                          ●
                                                                                                                                                           ●●
                                                                                                                                                 ●         ●●
                                                                                                                                                          ●
                                                                                                                                                          ●
                                                                                                                                                          ●●
                                                                                           ●
                                                                                           ●
                                                                                                                                                         ●
                                                                                                                                                         ●
                                                                                          ●
                                                                                                                                                        ●
                                                                                                                                                        ●●
                                                                                                                                                       ●●
                                                                                          ●
                                                                                         ●                                                             ●
                                                                                                                                                       ●
                                                                                         ●                                                           ●
                                                                                                                                                     ●●
                                                                                                                                                      ●
                                                                                                                                                      ●
                                     0.8

                                                                                        ●                                                       ● ●● ●
                                                                                                                                                   ●
                                                                                                                                                   ●●
                                                                                   ● ●
                                                                                        ●                                                        ●●●
                                                                                                                                                  ●
                                                                                    ● ●                                                         ●●●
                                                                                                                                                ●●
              time usage (seconds)



                                                                                      ●
                                                                                      ●                                                 ●
                                                                                                                                        ●
                                                                                                                                                 ●
                                                                                                                                               ●●●
                                                                                     ●
                                                                                     ●
                                                                                                                                                ●
                                                                                                                                                ●
                                                                                                                                                ●
                                                                                    ●
                                                                                                                                               ●
                                                                                                                                               ●
                                                                                   ●                                                ●        ●●
                                                                                                                                              ●
                                                                                                                                              ●
                                                                                                                                        ●   ●●
                                                                                                                                             ●
                                                                                                                                          ●●●
                                                                                                                                            ●
                                                                                                                                    ● ● ●●
                                                                               ●
                                     0.6




                                                                              ●●
                                                                                                                                       ● ●
                                                                                                                                        ● ●●
                                                                              ●                                               ●     ● ● ●●
                                                                                                                                      ● ●●
                                                                               ●                                                     ● ●
                                                                                                                                     ● ●
                                                                               ●                                                 ●●
                                                                                                                                ●● ●
                                                                              ●
                                                                              ●
                                                                             ●                                                     ●
                                                                                                                                   ●
                                                                             ●                                                  ●●
                                                                                                                                 ●
                                                                                                                                 ●
                                                            ●
                                                            ●
                                                             ●              ●
                                                                            ●                                                   ● ●
                                                                           ●
                                                                           ●
                                                                         ●
                                                                          ●
                                                                          ●                                                   ●●
                                                                                                                               ●●
                                                              ●          ●                                                     ●
                                                                                                                               ●
                                     0.4




                                                              ●         ●
                                                                        ●                                                    ● ●
                                                                                                                             ● ●
                                                                                                                            ●●
                                                                       ●                                                     ●
                                                                                                                            ●●
                                                                                                                        ●● ●
                                                                ●      ●
                                                                      ●
                                                                      ●
                                                                                                                           ●
                                                                                                                           ●
                                                                                                                           ●
                                                               ●     ●
                                                                     ●                                                   ●
                                                                                                                        ●●
                                                                    ●
                                                                                                                     ● ● ●●
                                                                   ●●                                                 ●●
                                                                                                                      ●●
                                                                                                                     ●●
                                                                   ●                                                ● ●
                                                                                                                   ● ●●
                                                             ● ●
                                                             ● ●
                                                                  ●
                                                                  ●                                                 ● ●
                                                                                                                    ● ●
                                                                                                                     ●
                                                               ●●
                                                                                                  ●               ●●
                                                                                                                   ●
                                                                                                                 ●●
                                                                                                                ●●
                                                                                                                ●●
                                                                                                                 ●
                                                                                                                 ●
                                                       ●
                                                                                                             ●
                                                                                                             ●●
                                                                                                              ●
                                                                                                               ●
                                                                                                              ●●
                                                                                                               ●●       ●       XSB 3.3
                                                                 ●●
                                                                 ●●                         ●             ●●●
                                                                                                         ●●●●
                                     0.2




                                                                                                            ●
                                                                ●                         ●      ●●       ●●
                                                                                                          ●●
                                                           ●●
                                                            ●●●
                                                              ●
                                                               ●
                                                               ●
                                                                ●
                                                                                             ●● ● ●
                                                                                            ● ● ● ● ●●●●●
                                                                                                       ●
                                                                                                       ●
                                                                                                        ●
                                                                                                        ●●●             ●       B−Prolog 7.5#4
                                                                                                     ●●
                                                            ●
                                                        ●●
                                                          ●
                                                          ●●                                     ● ●●
                                                                                                ● ●● ●
                                                                                              ● ● ●● ●
                                                                                                     ●
                                                        ●●
                                                       ●●                                      ● ●  ●
                                              ● ●●●●●
                                                    ●●
                                            ● ● ●●●●●
                                             ● ●●
                                            ●● ●●
                                           ●● ●●
                                                 ●●●
                                                      ●
                                                      ●
                                                                                           ● ●●
                                                                                           ●● ●
                                                                                            ●●
                                                                                           ●●
                                                                                                                                Yap 6.2.1
                                           ●
                                                                                      ●●● ●
                                                                                      ●●
                                                                                     ●●●
                                                                                    ●●●
                                                                                    ●● ●
                                                                                   ●●●
                                                                                ●●●●
                                                                                        ●●
                                                                                         ●
                                                                                         ●
                                                                                                                                index(XSB)
                                                                                ●●●
                                                                                ●●●
                                                                                 ●●
                                                                               ●●●
                                                                             ●●●
                                                                            ●●●
                                                                            ●●●
                                                                            ●●●
                                                                             ●●
                                           ●●●●●●●●●●●●
                                           ●●●●●●●●●●●
                                           ●●●●●●●●●●●
                                            ●●●●●●●●●●●
                                            ●●●●●●●●●●
                                                                  ●●●●●
                                                                 ●●●●●
                                                                   ●●●●
                                                                         ●●●
                                                                         ●●●
                                                                          ●●
                                                                        ●●●
                                                                        ●●●
                                                                    ●●●●●
                                                                  ●●●●●                                                         index(B−Prolog)
                                     0.0




                                                                                                                                index(Yap)

                                           0            50                 100                          150      200      250           300          350

                                                                                                list length (N)
      Christian Theil Have and Henning Christiansen                                                                      Efficient Tabling of Structured Data
A workaround implemented in Prolog                    Examples


Edit distance: Benchmarking results (space)
                                                                                                                                   ●

                                                                                                                                  ●

                                                                                                                               ●

                                                                                                                              ●




                                        1e+05
                                                                                                                           ●

                                                                                                                          ●

                                                                                                                       ●

                                                    ●     XSB 3.3                                                  ●
                                                                                                                      ●



                                                    ●     B−Prolog 7.5#4                                          ●
                                                                                                               ●
                                                          Yap 6.2.1
                                        8e+04
                                                                                                              ●
                                                                                                           ●
                                                          index(XSB)                                      ●
              space usage (kilobytes)



                                                          index(B−Prolog)                             ●
                                                                                                       ●



                                                          index(Yap)                              ●
                                                                                                   ●


                                                                                               ●
                                        6e+04




                                                                                              ●
                                                                                                                    ●
                                                                                                                   ●
                                                                                                                  ●
                                                                                                                 ●
                                                                                                                ●
                                                                                                               ●
                                                                                                              ●
                                                                                                             ●
                                        4e+04




                                                                                                            ●
                                                                                                           ●
                                                                                                          ●
                                                                                                         ●
                                                                                                        ●
                                                                                                       ●
                                                                                                      ●
                                                                                                     ●
                                                                                                    ●
                                                                                                   ●
                                                                                                  ●
                                                                                                 ●
                                                                                                ●
                                                                                               ●
                                        2e+04




                                                                                              ●
                                                                                             ●
                                                                                            ●
                                                                                           ●
                                                                                          ●
                                                                                         ●
                                                                                        ●
                                                                                       ●
                                                                                      ●
                                                                                     ●
                                                                                    ●
                                                                                   ●
                                                                                  ●
                                                                                ●●
                                                                              ●●
                                                                            ●●
                                                                          ●●
                                                                        ●●
                                                                      ●●
                                        0e+00




                                                                    ●●
                                                                  ●●
                                                                 ●●
                                                               ●●                                                                          ●●●●●●●●
                                                                                                                                            ●●●●●●●
                                                                                                                      ●●●●●●●●●●●●●●●●●●●●●
                                                                                                                       ●●●●●●●●●●●●●●●●●●●●
                                                              ●●
                                                           ●●●
                                                          ●●●
                                                ●
                                                ●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
                                                ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
                                                   ●●●●●●●
                                                  ●●●●●●●




                                                0                          50                                                 100                    150

                                                                                list length (N)
      Christian Theil Have and Henning Christiansen                                                                                Efficient Tabling of Structured Data
A workaround implemented in Prolog    Examples


PRISM



PRISM ( PRogramming In Statistical Modelling )
    Extends Prolog (B-Prolog) with special goals representing random
    variables (msws).
    Semantics: Probabilistic Herbrand models.
    Supports various probabilistic inferences.
    Relies heavily on tabling for the efficiency of the probabilistic
    inferences.




       Christian Theil Have and Henning Christiansen              Efficient Tabling of Structured Data
A workaround implemented in Prolog    Examples


A Hidden Markov Model in PRISM


values(init,[s0,s1]).
                                                                       0.1
values(out(_),[a,b]).
values(tr(_),[s0,s1]).                                                                            0.8
                                                                       s0
hmm(L):-                                                              0.3: a
                                                         0.5                        0.9           s1
    msw(init,S),                                                      0.7: b
                                                                                    0.2         0.6: a
    hmm(S,L).                                  init
                                                                                                0.4: b
                                                                       0.5

hmm(_,[]).

hmm(S,[Ob|Y]) :-
    msw(out(S),Ob),
    msw(tr(S),Next),
    hmm(Next,Y).
      Christian Theil Have and Henning Christiansen              Efficient Tabling of Structured Data
A workaround implemented in Prolog    Examples


Transformed Hidden Markov Model


We only need to consider the recursive predicate, hmm/2.

original version                         transformed version
hmm(_,[]).                               hmm(S,ObsPtr):-
                                             retrieve_term(ObsPtr,[]).

hmm(S,[Ob|Y]) :-                         hmm(S,ObsPtr) :-
    msw(out(S),Ob),                         retrieve_term(ObsPtr,[Ob,Y]),
    msw(tr(S),Next),                        msw(out(S),Ob),
    hmm(Next,Y).                            msw(tr(S),Next),
                                            hmm(Next,Y).



       Christian Theil Have and Henning Christiansen              Efficient Tabling of Structured Data
A workaround implemented in Prolog                                                                                                  Examples


Hidden Markov Model: Benchmarking results (time)

                               b) Running time without indexed lookup                                                                                                                                                              a) Running time with indexed lookup
                         140




                                                                                                                                                                                           ●                                                                                                                                                                                                                ●




                                                                                                                                                                                                                        0.08
                                                                                                                                                                                                                                                                                                                                                                                                  ●●●
                                                                                                                                                                                       ●                                                                                                                                                                                                                ●
                                                                                                                                                                                                                                                                                                                                                                                              ●
                                                                                                                                                                                                                                                                                                                                                                                          ●
                                                                                                                                                                                                                                                                                                                                                                                      ●
                         120




                                                                                                                                                                                   ●                                                                                                                                                                                              ●

                                                                                                                                                                               ●                                                                                                                                                                                              ●
                                                                                                                                                                                                                                                                                                                                                                         ●●

                                                                                                                                                                           ●
Running time (seconds)




                                                                                                                                                                                               Running time (seconds)
                                                                                                                                                                                                                                                                                                                                                                 ●
                         100




                                                                                                                                                                                                                        0.06
                                                                                                                                                                       ●                                                                                                                                                                                             ●
                                                                                                                                                                                                                                                                                                                                                        ●
                                                                                                                                                                   ●                                                                                                                                                                                ●       ●●

                                                                                                                                                               ●                                                                                                                                                                                ●
                                                                                                                                                                                                                                                                                                                                            ●
                                                                                                                                                           ●                                                                                                                                                                            ●
                         80




                                                                                                                                                                                                                                                                                                                                    ●
                                                                                                                                                       ●                                                                                                                                                                        ●
                                                                                                                                                                                                                                                                                                                            ●
                                                                                                                                                   ●




                                                                                                                                                                                                                        0.04
                                                                                                                                               ●                                                                                                                                                                       ●●
                                                                                                                                                                                                                                                                                                                  ●●
                                                                                                                                                                                                                                                                                                              ●
                         60




                                                                                                                                           ●                                                                                                                                                              ●
                                                                                                                                       ●                                                                                                                                                              ●
                                                                                                                                   ●                                                                                                                                                             ●●
                                                                                                                               ●
                                                                                                                                                                                                                                                                                             ●
                                                                                                                           ●
                         40




                                                                                                                       ●                                                                                                                                                                 ●
                                                                                                                   ●                                                                                                                                                                 ●




                                                                                                                                                                                                                        0.02
                                                                                                               ●                                                                                                                                                                 ●
                                                                                                                                                                                                                                                                             ●
                                                                                                           ●                                                                                                                                                            ●●
                                                                                                       ●
                                                                                                   ●                                                                                                                                                                ●
                                                                                               ●                                                                                                                                                                ●
                         20




                                                                                           ●                                                                                                                                                                ●
                                                                                       ●
                                                                                   ●                                                                                                                                                                    ●
                                                                               ●
                                                                           ●                                                                                                                                                                       ●●
                                                                      ●●                                                                                                                                                                       ●
                                                             ●   ●●
                                                          ●●                                                                                                                                                                               ●
                                                      ●                                                                                                                                                                                ●
                                                ●●●
                                   ●●●●●●●●●●
                                                                                                                                                                                                                        0.00
                         0




                                                                                                                                                                                                                                   ●



                               0          1000                   2000                              3000                                    4000                                    5000                                        0                            1000                                 2000                           3000                                 4000                           5000

                                                             sequence length                                                                                                                                                                                                                 sequence length




                                      Christian Theil Have and Henning Christiansen                                                                                                                                                    Efficient Tabling of Structured Data
An automatic program transformation


An automatic program transformation



We introduce an automatic transformation from a tabled program to an
efficient version using our approach.
To support the transformation, the user must declare modes for which
predicate arguments that should be indexed.

table_index_mode(hmm(+))
table_index_mode(hmm(-,+))

Each clause whose head is covered by a table mode declaration is
transformed and all other clauses are left untouched.




       Christian Theil Have and Henning Christiansen   Efficient Tabling of Structured Data
An automatic program transformation


Transformation of HMM in PRISM

The transformation moves any term appearing in an indexed position in
the head of a clause into a call to the lookup pattern predicate, which is
added to the body. Variables in such terms are marked lazy when they do
not occur in any non-indexed argument inside the clause.

original program                   transformed program
hmm(_,[]).                         hmm(S,ObsPtr):-
                                      lookup_pattern(ObsPtr,[]).

hmm(S,[Ob|Y]) :-                   hmm(S,ObsPtr) :-
  msw(out(S),Ob),                    lookup_pattern(ObsPtr,[Ob | lazy(Y)]),
  msw(tr(S),Next),                   msw(out(S),Ob),
  hmm(Next,Y).                       msw(tr(S),Next),
                                     hmm(Next,Y).

       Christian Theil Have and Henning Christiansen   Efficient Tabling of Structured Data
An automatic program transformation
Each clause whose head predicate is covered by a table mode de
  Transformation algorithm
transformed using the procedure outlined in algorithm 1, and all o
are left untouched. The transformation moves any term appearing in


   for each clause H:-B in original program do
       if table index mode (M) matching H then
           for each argument Hi ∈ H, Mi ∈M do
               if Mi =’+’ then
                    
                   Hi ← MarkLazy(Hi , B)
                                             
                   B ← (lookup pattern(Vi , Hi ), B)
                   Hi ← V i
           end
   end
   where MarkLazy is defined as
   MarkLazy(Hi ,B) :
     P otentialLazy = variables in all goals G ∈ B
      where G has table index mode declaration of Structured Data
         Christian Theil Have and Henning Christiansen Efficient Tabling
B ← (lookup pattern(Vi , Hi ), B)
                 An automatic program transformation
                     Hi ← V i
         end
  Transformation algorithm: MarkLazy
    end
    where MarkLazy is defined as
    MarkLazy(Hi ,B) :
      P otentialLazy = variables in all goals G ∈ B
        where G has table index mode declaration
      N onLazy = variables in all goals G ∈ B
        where G has no table index mode declaration
      Lazy = P otentialLazy  N onLazy
      for each variable V ∈ Hi do
          if V ∈ Lazy then
              V ← lazy(V )
      end
                        Algorithm 1: Program transformation.


position in the head of a clause into a call to theStructured Data pattern
          Christian Theil Have and Henning Christiansen Efficient Tabling of
                                                                           lookup
An automatic program transformation


Conclusions

    All Prolog implementations handle tabling of structured data
    inefficiently.
    We presented a Prolog based program transformation that ensures
    O(1) time and space complexity of tabled lookups.
    The transformation is data invariant and works with all the existing
    tabling systems.
    The transformation makes it possible to scale to much larger problem
    instances.
Some limitations and problems to be solved:
    Only applies to ground input arguments.
    Abstract pointers in head of clauses circumvents usual pattern based
    indexing (constant time overhead).

       Christian Theil Have and Henning Christiansen   Efficient Tabling of Structured Data
An automatic program transformation


The road ahead...




Our program transformation should be seen as workaround, until such
optimizations find their way into the tabling systems. We hope that
Prolog implementors will pick up on this and integrate such optimizations
directly in the tabling systems, so that the user does not need to transform
his program, and need not worry about the underlying tabled
representation and its implicit complexity.




       Christian Theil Have and Henning Christiansen   Efficient Tabling of Structured Data

Contenu connexe

En vedette

Constraints and Global Optimization for Gene Prediction Overlap Resolution
Constraints and Global Optimization for Gene Prediction Overlap ResolutionConstraints and Global Optimization for Gene Prediction Overlap Resolution
Constraints and Global Optimization for Gene Prediction Overlap ResolutionChristian Have
 
Kyles Power Point
Kyles Power PointKyles Power Point
Kyles Power Pointkoldro
 
Efficient Probabilistic Logic Programming for Biological Sequence Analysis
Efficient Probabilistic Logic Programming for Biological Sequence AnalysisEfficient Probabilistic Logic Programming for Biological Sequence Analysis
Efficient Probabilistic Logic Programming for Biological Sequence AnalysisChristian Have
 

En vedette (6)

Constraints and Global Optimization for Gene Prediction Overlap Resolution
Constraints and Global Optimization for Gene Prediction Overlap ResolutionConstraints and Global Optimization for Gene Prediction Overlap Resolution
Constraints and Global Optimization for Gene Prediction Overlap Resolution
 
Kyles Power Point
Kyles Power PointKyles Power Point
Kyles Power Point
 
God Exists
God ExistsGod Exists
God Exists
 
Cat
CatCat
Cat
 
Who Am I
Who Am IWho Am I
Who Am I
 
Efficient Probabilistic Logic Programming for Biological Sequence Analysis
Efficient Probabilistic Logic Programming for Biological Sequence AnalysisEfficient Probabilistic Logic Programming for Biological Sequence Analysis
Efficient Probabilistic Logic Programming for Biological Sequence Analysis
 

Similaire à Efficient Tabling of Structured Data Using Indexing and Program Transformation

Chromatic Sparse Learning
Chromatic Sparse LearningChromatic Sparse Learning
Chromatic Sparse LearningDatabricks
 
QuickCheck - Software Testing
QuickCheck - Software TestingQuickCheck - Software Testing
QuickCheck - Software TestingJavran
 
Brief introduction on GAN
Brief introduction on GANBrief introduction on GAN
Brief introduction on GANDai-Hai Nguyen
 
DC02. Interpretation of predictions
DC02. Interpretation of predictionsDC02. Interpretation of predictions
DC02. Interpretation of predictionsAnton Kulesh
 
Lessons learned from designing a QA Automation for analytics databases (big d...
Lessons learned from designing a QA Automation for analytics databases (big d...Lessons learned from designing a QA Automation for analytics databases (big d...
Lessons learned from designing a QA Automation for analytics databases (big d...Omid Vahdaty
 
Inductive Triple Graphs: A purely functional approach to represent RDF
Inductive Triple Graphs: A purely functional approach to represent RDFInductive Triple Graphs: A purely functional approach to represent RDF
Inductive Triple Graphs: A purely functional approach to represent RDFJose Emilio Labra Gayo
 
20181212 - PGconfASIA - LT - English
20181212 - PGconfASIA - LT - English20181212 - PGconfASIA - LT - English
20181212 - PGconfASIA - LT - EnglishKohei KaiGai
 
Machine Learning Applications in Subsurface Analysis: Case Study in North Sea
Machine Learning Applications in Subsurface Analysis: Case Study in North SeaMachine Learning Applications in Subsurface Analysis: Case Study in North Sea
Machine Learning Applications in Subsurface Analysis: Case Study in North SeaYohanes Nuwara
 
Set Transfomer: A Framework for Attention-based Permutaion-Invariant Neural N...
Set Transfomer: A Framework for Attention-based Permutaion-Invariant Neural N...Set Transfomer: A Framework for Attention-based Permutaion-Invariant Neural N...
Set Transfomer: A Framework for Attention-based Permutaion-Invariant Neural N...Thien Q. Tran
 
Machine Learning using Python
Machine Learning using PythonMachine Learning using Python
Machine Learning using PythonSuraj Kumar Jana
 
Deep Convolutional GANs - meaning of latent space
Deep Convolutional GANs - meaning of latent spaceDeep Convolutional GANs - meaning of latent space
Deep Convolutional GANs - meaning of latent spaceHansol Kang
 
Precog & MongoDB User Group: Skyrocket Your Analytics
Precog & MongoDB User Group: Skyrocket Your Analytics Precog & MongoDB User Group: Skyrocket Your Analytics
Precog & MongoDB User Group: Skyrocket Your Analytics MongoDB
 
Pregel: A System For Large Scale Graph Processing
Pregel: A System For Large Scale Graph ProcessingPregel: A System For Large Scale Graph Processing
Pregel: A System For Large Scale Graph ProcessingRiyad Parvez
 
Software tookits for machine learning and graphical models
Software tookits for machine learning and graphical modelsSoftware tookits for machine learning and graphical models
Software tookits for machine learning and graphical modelsbutest
 
TDC2017 | São Paulo - Trilha Java EE How we figured out we had a SRE team at ...
TDC2017 | São Paulo - Trilha Java EE How we figured out we had a SRE team at ...TDC2017 | São Paulo - Trilha Java EE How we figured out we had a SRE team at ...
TDC2017 | São Paulo - Trilha Java EE How we figured out we had a SRE team at ...tdc-globalcode
 
spaGO: A self-contained ML & NLP library in GO
spaGO: A self-contained ML & NLP library in GOspaGO: A self-contained ML & NLP library in GO
spaGO: A self-contained ML & NLP library in GOMatteo Grella
 
Ledingkart Meetup #2: Scaling Search @Lendingkart
Ledingkart Meetup #2: Scaling Search @LendingkartLedingkart Meetup #2: Scaling Search @Lendingkart
Ledingkart Meetup #2: Scaling Search @LendingkartMukesh Singh
 
Multinomial Logistic Regression with Apache Spark
Multinomial Logistic Regression with Apache SparkMultinomial Logistic Regression with Apache Spark
Multinomial Logistic Regression with Apache SparkDB Tsai
 
Alpine Spark Implementation - Technical
Alpine Spark Implementation - TechnicalAlpine Spark Implementation - Technical
Alpine Spark Implementation - Technicalalpinedatalabs
 

Similaire à Efficient Tabling of Structured Data Using Indexing and Program Transformation (20)

Chromatic Sparse Learning
Chromatic Sparse LearningChromatic Sparse Learning
Chromatic Sparse Learning
 
CatBoost intro
CatBoost   introCatBoost   intro
CatBoost intro
 
QuickCheck - Software Testing
QuickCheck - Software TestingQuickCheck - Software Testing
QuickCheck - Software Testing
 
Brief introduction on GAN
Brief introduction on GANBrief introduction on GAN
Brief introduction on GAN
 
DC02. Interpretation of predictions
DC02. Interpretation of predictionsDC02. Interpretation of predictions
DC02. Interpretation of predictions
 
Lessons learned from designing a QA Automation for analytics databases (big d...
Lessons learned from designing a QA Automation for analytics databases (big d...Lessons learned from designing a QA Automation for analytics databases (big d...
Lessons learned from designing a QA Automation for analytics databases (big d...
 
Inductive Triple Graphs: A purely functional approach to represent RDF
Inductive Triple Graphs: A purely functional approach to represent RDFInductive Triple Graphs: A purely functional approach to represent RDF
Inductive Triple Graphs: A purely functional approach to represent RDF
 
20181212 - PGconfASIA - LT - English
20181212 - PGconfASIA - LT - English20181212 - PGconfASIA - LT - English
20181212 - PGconfASIA - LT - English
 
Machine Learning Applications in Subsurface Analysis: Case Study in North Sea
Machine Learning Applications in Subsurface Analysis: Case Study in North SeaMachine Learning Applications in Subsurface Analysis: Case Study in North Sea
Machine Learning Applications in Subsurface Analysis: Case Study in North Sea
 
Set Transfomer: A Framework for Attention-based Permutaion-Invariant Neural N...
Set Transfomer: A Framework for Attention-based Permutaion-Invariant Neural N...Set Transfomer: A Framework for Attention-based Permutaion-Invariant Neural N...
Set Transfomer: A Framework for Attention-based Permutaion-Invariant Neural N...
 
Machine Learning using Python
Machine Learning using PythonMachine Learning using Python
Machine Learning using Python
 
Deep Convolutional GANs - meaning of latent space
Deep Convolutional GANs - meaning of latent spaceDeep Convolutional GANs - meaning of latent space
Deep Convolutional GANs - meaning of latent space
 
Precog & MongoDB User Group: Skyrocket Your Analytics
Precog & MongoDB User Group: Skyrocket Your Analytics Precog & MongoDB User Group: Skyrocket Your Analytics
Precog & MongoDB User Group: Skyrocket Your Analytics
 
Pregel: A System For Large Scale Graph Processing
Pregel: A System For Large Scale Graph ProcessingPregel: A System For Large Scale Graph Processing
Pregel: A System For Large Scale Graph Processing
 
Software tookits for machine learning and graphical models
Software tookits for machine learning and graphical modelsSoftware tookits for machine learning and graphical models
Software tookits for machine learning and graphical models
 
TDC2017 | São Paulo - Trilha Java EE How we figured out we had a SRE team at ...
TDC2017 | São Paulo - Trilha Java EE How we figured out we had a SRE team at ...TDC2017 | São Paulo - Trilha Java EE How we figured out we had a SRE team at ...
TDC2017 | São Paulo - Trilha Java EE How we figured out we had a SRE team at ...
 
spaGO: A self-contained ML & NLP library in GO
spaGO: A self-contained ML & NLP library in GOspaGO: A self-contained ML & NLP library in GO
spaGO: A self-contained ML & NLP library in GO
 
Ledingkart Meetup #2: Scaling Search @Lendingkart
Ledingkart Meetup #2: Scaling Search @LendingkartLedingkart Meetup #2: Scaling Search @Lendingkart
Ledingkart Meetup #2: Scaling Search @Lendingkart
 
Multinomial Logistic Regression with Apache Spark
Multinomial Logistic Regression with Apache SparkMultinomial Logistic Regression with Apache Spark
Multinomial Logistic Regression with Apache Spark
 
Alpine Spark Implementation - Technical
Alpine Spark Implementation - TechnicalAlpine Spark Implementation - Technical
Alpine Spark Implementation - Technical
 

Dernier

Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 

Dernier (20)

Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 

Efficient Tabling of Structured Data Using Indexing and Program Transformation

  • 1. Efficient Tabling of Structured Data Using Indexing and Program Transformation Christian Theil Have and Henning Christiansen Research group PLIS: Programming, Logic and Intelligent Systems Department of Communication, Business and Information Technologies Roskilde University, P.O.Box 260, DK-4000 Roskilde, Denmark PADL in Philadelphia, January 23, 2012 Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 2. Outline 1 Motivation and background The trouble with tabling of structured data 2 A workaround implemented in Prolog Examples Example: Edit Distance Example: Hidden Markov Model in PRISM 3 An automatic program transformation Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 3. Motivation and background Motivation In the LoSt project we explore the use of Probabilistic Logic programming (with PRISM) for biological sequence analysis. We had some problems analyzing very long sequences.. .. and identified the culprit: Tabling of structured data. Inspired by earlier work by Christiansen and Gallagher which address a similar problem related to tabling non-discrimininatory arguments. Henning Christiansen, John P. Gallagher Non-discriminating Arguments and their Uses. ICLP 2009. Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 4. Motivation and background The trouble with tabling of structured data The trouble with tabling of structured data Tabling in logic programming is an established technique which, can give a significant speed-up of program execution. make it easier to write efficient programs in a declarative style. is similar to memoization in functional programming. The idea: The system maintains a table of calls and their answers. when a new call is entered, check if it is stored in the table if so, use previously found solution. Tabling is included in several recognized Prolog systems such as B-Prolog, YAP and XSB. Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 5. Motivation and background The trouble with tabling of structured data The trouble with tabling of structured data An innocent looking call: last([1,2,3,4,5],X) predicate: last/2 last([1,2,3,4,5],X) last([X],X). last([1,2,3,4],X) last([_|L],X) :- last([1,2,3],X) last(L,X). last([1,2],X) last([1],X) Traverses a list to find the last element. call table Time/space complexity: last([1,2,3,4,5],X). O(n). last([1,2,3,4],X). last([1,2,3],X). If we table last/2: last([1,2],X). n + n − 1 + n − 2...1 last([1],X). ≈ O(n2 ) ! Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 6. Motivation and background The trouble with tabling of structured data Wait a minute.. Tabling systems do employ some advanced techniques to avoid the expensive copying and which may reduce memory consumption and/or time complexity. For instance, B-Prolog uses hashing of goals. XSB uses a trie data structure. YAP uses a trie structure, which is refined into a so-called global trie which applies a sharing strategy for common subterms whenever possible. These techniques may reduce space consumption, but if there is no sharing between the tables and the actual arguments of an active call, each execution of a call may involve a full traversal (naive copying) of its arguments. Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 7. Motivation and background The trouble with tabling of structured data Benchmarking last/2 for B-Prolog, Yap and XSB To investigate, we benchmarked the tabled version of last/2 for each of the major Prolog tabling engines. To further investigate whether performance is data dependent, we benchmarked with both repeated data and random data. Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 8. Motivation and background The trouble with tabling of structured data Space usage, tabled last/2 (1) b) Space usage 2000000 ● ● XSB (random data) ● ● B−Prolog (random data) ● Yap (random data) ● XSB (repeated data) 1500000 B−Prolog (repeated data) ● Yap (repeated data) space usage (kilobytes) ● ● ● 1000000 ● ● ● ● ● ● 500000 ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0 500 1000 1500 2000 2500 3000 list length (N) Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 9. Motivation and background The trouble with tabling of structured data Space usage, tabled last/2 (2) d) Space usage expanded for the four lower curves 10000 ● B−Prolog (random data) XSB (repeated data) B−Prolog (repeated data) 8000 Yap (repeated data) space usage (kilobytes) 6000 4000 2000 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0 0 5000 10000 15000 list length (N) Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 10. Motivation and background The trouble with tabling of structured data Time usage, tabled last/2 (1) a) Time usage ● 4 ● XSB (random data) ● ● B−Prolog (random data) ● Yap (random data) ● XSB (repeated data) B−Prolog (repeated data) ● 3 Yap (repeated data) ● time usage (seconds) ● ● ● ● 2 ● ● ● ● ● ● 1 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0 0 500 1000 1500 2000 2500 3000 list length (N) Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 11. Motivation and background The trouble with tabling of structured data Time usage, tabled last/2 (2) ● c) Time usage expanded for the three lower curves ● 1.0 ● ● B−Prolog (random data) XSB (repeated data) ● Yap (repeated data) ● 0.8 ● ● time usage (seconds) ● 0.6 ● ● ● ● ● 0.4 ● ● ● ● ● ● ● ● ● ● 0.2 ●● ● ●● ● ● ● ●● ● ●● ● ●● ●● ●●● ●●●● 0.0 0 10000 20000 30000 40000 50000 list length (N) Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 12. A workaround implemented in Prolog A workaround implemented in Prolog We describe a workaround giving O(1) time and space complexity for table lookups for programs with arbitrarily large ground structured data as input arguments. A term is represented as a set of facts. A subterm is referenced by a unique integer serving as an abstract pointer. Matching related to tabling is done solely by comparison of such pointers. Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 13. A workaround implemented in Prolog An abstract data type The representation is given by the following predicates which all together can be understood as an abstract datatype. store term( +ground-term, pointer ) The ground-term is any ground term, and the pointer returned is a unique reference (an integer) for that term. retrieve term( +pointer , ?functor , ?arg-pointers-list) Returns the functor and a list of pointers to representations of the substructures of the term represented by pointer. full retrieve term( +pointer , ?ground-term) Returns the term represented by pointer. Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 14. A workaround implemented in Prolog ADT properties Property 1 It must hold for any ground term s, that the query store term(s, P), full retrieve term(P, S), assigns to the variable S a value identical to s. Property 2 It must hold for any ground term s of the form f (s1 , . . . ,sn ) that store term(s, P), retrieve term(P, F, Ss), assigns to the variable F the symbol f , and to Ss a list of ground values [p1 ,. . .,pn ] such that additional queries full retrieve term(pi , Si ), i = 1, . . . , n assign to the variables Si values identical to si . Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 15. A workaround implemented in Prolog ADT example Example The following call converts the term f(a,g(b)) into its internal representation and returns a pointer value in the variable P. store_term(f(a,g(b)),P). After this, the following sequence of calls will succeed. retrieve_term(P,f,[P1,P2]), retrieve_term(P1,a,[]), retrieve_term(P2,g,[P21]), retrieve_term(P21,b,[]), full_retrieve_term(P,f(a,g(b))). Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 16. A workaround implemented in Prolog Implementation with assert One possible way of implementing the predicates introduced above is to have store term/2 asserting facts for the retrieve term/3 predicate using increasing integers as pointers. Example The call store term(f(a,g(b)),P) considered in example 1 may assign the value 100 to P and as a side-effect assert the following facts. retrieve_term(100,f,[101,102]). retrieve_term(101,a,[]). retrieve_term(102,g,[103]). retrieve_term(103,b,[]). Notice that Prolog’s indexing on first arguments ensures a constant lookup time. Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 17. A workaround implemented in Prolog Another level of abstraction: lookup pattern/2 Finally we introduce a utility predicate which may simplify the use of the representation in application programs. It utilizes a special kind of terms, called patterns, which are not necessarily ground and which may contain subterms of the form lazy(variable). lookup pattern( +pointer , +pattern) The pattern is matched in a recursive way against the term represented by the pointer p in the following way. – lookup pattern(p,X) is treated as full retrieve term(p,X). – lookup pattern(p,lazy(X)) unifies X with p. – For any other pattern =.. [F,X1 ,. . . ,Xn ] we call retrieve term(p, F, [P1 ,. . .,Pn ]) followed by lookup pattern(Pi ,Xi ), i = 1, . . . , n. Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 18. A workaround implemented in Prolog A lookup pattern/2 example Example Continuing the previous example, we get that after the call store term(f(a,g(b)),P). lookup_pattern(100, f(X,lazy(Y))) leads to X=a and Y=102. The lookup pattern/2 predicate will be use to simplify the automatic program transformation introduced later. Further efficiency can be gained by compiling it out for each specific pattern (i.e. replacing it with calls to retrieve term/2). Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 19. A workaround implemented in Prolog Examples Examples We consider two applying the workaround to two example programs: Edit Distance implemented in Prolog Hidden Markov Models in PRISM For each of these we measure the impact (in time and space) of a transformed version using the workaround. Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 20. A workaround implemented in Prolog Examples Implementation of store term/2 and retrieve term/2 The programs have been transformed manually for these experiments based on the pointer based representation previously introduced, but simplified slightly for lists. For store term/2 and retrieve term/2 we use the following simple implementation: store_term([],Index) :- assert(retrieve_term([],Index)). store_term([X|Xs],Idx) :- Idx1 is Idx + 1, assert(retrieve_term(Idx,[X,Idx1])), store_term(Xs,Idx1). Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 21. A workaround implemented in Prolog Examples Edit Distance Calculate the minimum number of edits (insertions, deletions and replacements) to transform on list into another. a minimal edit-distance algorithm written in Prolog which is dependent on tabling for any non-trivial problem. The theoretical best time complexity of edit distance has been proven to be O(N 2 ). Measure for various problem sizes. Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 22. A workaround implemented in Prolog Examples Edit distance: Implementation in Prolog. :- table edit/3. edit([X|Xs],[Y|Ys],Dist) :- edit([X|Xs],Ys,InsDist), edit([],[],0). edit(Xs,[Y|Ys],DelDist), edit(Xs,Ys,TailDist), edit([],[Y|Ys],Dist) :- (X==Y -> edit([],Ys,Dist1), Dist = TailDist Dist is 1 + Dist1. ; % Minimum of insertion, edit([X|Xs],[],Dist) :- % deletion or substitution edit(Xs,[],Dist1), sort([InsDist,DelDist,TailDist], Dist is 1 + Dist1. [MinDist|_]), Dist is 1 + MinDist). Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 23. A workaround implemented in Prolog Examples Edit distance, transformed (1). original version transformed version edit(XIdx,YIdx,0) :- edit([],[],0). retrieve_term(XIdx,[]), retrieve_term(YIdx,[]). edit(XIdx,YIdx,Dist) :- edit([],[Y|Ys],Dist) :- retrieve_term(XIdx,[]), retrieve_term(YIdx,[_,YIdxNext]), edit([],Ys,Dist1), edit(XIdx,YIdxNext,Dist1), Dist is 1 + Dist1. Dist is Dist1 + 1. edit(XIdx,YIdx,Dist) :- edit([X|Xs],[],Dist) :- retrieve_term(YIdx,[]), retrieve_term(XIdx,[_,XIdxNext]), edit(Xs,[],Dist1), edit(XIdxNext,YIdx,Dist1), Dist is Dist1 + 1. Dist is 1 + Dist1. continued on next slide... Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 24. A workaround implemented in Prolog Examples Edit distance, transformed (2). original version transformed version edit([X|Xs],[Y|Ys],Dist) :- edit(XIdx,YIdx,Dist) :- retrieve_term(XIdx,[X,NextXIdx]), retrieve_term(YIdx,[Y,NextYIdx]), edit([X|Xs],Ys,InsDist), edit(XIdx,NextYIdx,InsDist), edit(Xs,[Y|Ys],DelDist), edit(NextXIdx,YIdx,DelDist), edit(Xs,Ys,TailDist), edit(NextXIdx,NextYIdx,TailDist), (X==Y -> (X==Y -> Dist = TailDist Dist = TailDist ; ; sort([InsDist,DelDist,TailDist], sort([InsDist,DelDist,TailDist], [MinDist|_]), [MinDist|_]), Dist is 1 + MinDist). Dist is 1 + MinDist). Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 25. A workaround implemented in Prolog Examples Edit distance: Benchmarking results (time) ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●● ● ●● ●● 1.0 ● ● ●● ●● ● ● ● ●● ● ●● ● ● ●● ● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ●● ● ● 0.8 ● ● ●● ● ● ●● ● ● ● ●●● ● ● ● ●●● ●● time usage (seconds) ● ● ● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ●● ● ●●● ● ● ● ●● ● 0.6 ●● ● ● ● ●● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●● ● ● ● ● 0.4 ● ● ● ● ● ● ● ●● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ●● ●● ●● ●● ●● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ●● ● ●● ● ●● ●● ●● ● ● ● ● ●● ● ● ●● ●● ● XSB 3.3 ●● ●● ● ●●● ●●●● 0.2 ● ● ● ●● ●● ●● ●● ●●● ● ● ● ● ●● ● ● ● ● ● ● ●●●●● ● ● ● ●●● ● B−Prolog 7.5#4 ●● ● ●● ● ●● ● ●● ● ●● ● ● ● ●● ● ● ●● ●● ● ● ● ● ●●●●● ●● ● ● ●●●●● ● ●● ●● ●● ●● ●● ●●● ● ● ● ●● ●● ● ●● ●● Yap 6.2.1 ● ●●● ● ●● ●●● ●●● ●● ● ●●● ●●●● ●● ● ● index(XSB) ●●● ●●● ●● ●●● ●●● ●●● ●●● ●●● ●● ●●●●●●●●●●●● ●●●●●●●●●●● ●●●●●●●●●●● ●●●●●●●●●●● ●●●●●●●●●● ●●●●● ●●●●● ●●●● ●●● ●●● ●● ●●● ●●● ●●●●● ●●●●● index(B−Prolog) 0.0 index(Yap) 0 50 100 150 200 250 300 350 list length (N) Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 26. A workaround implemented in Prolog Examples Edit distance: Benchmarking results (space) ● ● ● ● 1e+05 ● ● ● ● XSB 3.3 ● ● ● B−Prolog 7.5#4 ● ● Yap 6.2.1 8e+04 ● ● index(XSB) ● space usage (kilobytes) index(B−Prolog) ● ● index(Yap) ● ● ● 6e+04 ● ● ● ● ● ● ● ● ● 4e+04 ● ● ● ● ● ● ● ● ● ● ● ● ● ● 2e+04 ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●● ●● ●● ●● ●● 0e+00 ●● ●● ●● ●● ●●●●●●●● ●●●●●●● ●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●● ●● ●●● ●●● ● ●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●● ●●●●●●● 0 50 100 150 list length (N) Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 27. A workaround implemented in Prolog Examples PRISM PRISM ( PRogramming In Statistical Modelling ) Extends Prolog (B-Prolog) with special goals representing random variables (msws). Semantics: Probabilistic Herbrand models. Supports various probabilistic inferences. Relies heavily on tabling for the efficiency of the probabilistic inferences. Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 28. A workaround implemented in Prolog Examples A Hidden Markov Model in PRISM values(init,[s0,s1]). 0.1 values(out(_),[a,b]). values(tr(_),[s0,s1]). 0.8 s0 hmm(L):- 0.3: a 0.5 0.9 s1 msw(init,S), 0.7: b 0.2 0.6: a hmm(S,L). init 0.4: b 0.5 hmm(_,[]). hmm(S,[Ob|Y]) :- msw(out(S),Ob), msw(tr(S),Next), hmm(Next,Y). Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 29. A workaround implemented in Prolog Examples Transformed Hidden Markov Model We only need to consider the recursive predicate, hmm/2. original version transformed version hmm(_,[]). hmm(S,ObsPtr):- retrieve_term(ObsPtr,[]). hmm(S,[Ob|Y]) :- hmm(S,ObsPtr) :- msw(out(S),Ob), retrieve_term(ObsPtr,[Ob,Y]), msw(tr(S),Next), msw(out(S),Ob), hmm(Next,Y). msw(tr(S),Next), hmm(Next,Y). Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 30. A workaround implemented in Prolog Examples Hidden Markov Model: Benchmarking results (time) b) Running time without indexed lookup a) Running time with indexed lookup 140 ● ● 0.08 ●●● ● ● ● ● ● 120 ● ● ● ● ●● ● Running time (seconds) Running time (seconds) ● 100 0.06 ● ● ● ● ● ●● ● ● ● ● ● 80 ● ● ● ● ● 0.04 ● ●● ●● ● 60 ● ● ● ● ● ●● ● ● ● 40 ● ● ● ● 0.02 ● ● ● ● ●● ● ● ● ● ● 20 ● ● ● ● ● ● ● ●● ●● ● ● ●● ●● ● ● ● ●●● ●●●●●●●●●● 0.00 0 ● 0 1000 2000 3000 4000 5000 0 1000 2000 3000 4000 5000 sequence length sequence length Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 31. An automatic program transformation An automatic program transformation We introduce an automatic transformation from a tabled program to an efficient version using our approach. To support the transformation, the user must declare modes for which predicate arguments that should be indexed. table_index_mode(hmm(+)) table_index_mode(hmm(-,+)) Each clause whose head is covered by a table mode declaration is transformed and all other clauses are left untouched. Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 32. An automatic program transformation Transformation of HMM in PRISM The transformation moves any term appearing in an indexed position in the head of a clause into a call to the lookup pattern predicate, which is added to the body. Variables in such terms are marked lazy when they do not occur in any non-indexed argument inside the clause. original program transformed program hmm(_,[]). hmm(S,ObsPtr):- lookup_pattern(ObsPtr,[]). hmm(S,[Ob|Y]) :- hmm(S,ObsPtr) :- msw(out(S),Ob), lookup_pattern(ObsPtr,[Ob | lazy(Y)]), msw(tr(S),Next), msw(out(S),Ob), hmm(Next,Y). msw(tr(S),Next), hmm(Next,Y). Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 33. An automatic program transformation Each clause whose head predicate is covered by a table mode de Transformation algorithm transformed using the procedure outlined in algorithm 1, and all o are left untouched. The transformation moves any term appearing in for each clause H:-B in original program do if table index mode (M) matching H then for each argument Hi ∈ H, Mi ∈M do if Mi =’+’ then Hi ← MarkLazy(Hi , B) B ← (lookup pattern(Vi , Hi ), B) Hi ← V i end end where MarkLazy is defined as MarkLazy(Hi ,B) : P otentialLazy = variables in all goals G ∈ B where G has table index mode declaration of Structured Data Christian Theil Have and Henning Christiansen Efficient Tabling
  • 34. B ← (lookup pattern(Vi , Hi ), B) An automatic program transformation Hi ← V i end Transformation algorithm: MarkLazy end where MarkLazy is defined as MarkLazy(Hi ,B) : P otentialLazy = variables in all goals G ∈ B where G has table index mode declaration N onLazy = variables in all goals G ∈ B where G has no table index mode declaration Lazy = P otentialLazy N onLazy for each variable V ∈ Hi do if V ∈ Lazy then V ← lazy(V ) end Algorithm 1: Program transformation. position in the head of a clause into a call to theStructured Data pattern Christian Theil Have and Henning Christiansen Efficient Tabling of lookup
  • 35. An automatic program transformation Conclusions All Prolog implementations handle tabling of structured data inefficiently. We presented a Prolog based program transformation that ensures O(1) time and space complexity of tabled lookups. The transformation is data invariant and works with all the existing tabling systems. The transformation makes it possible to scale to much larger problem instances. Some limitations and problems to be solved: Only applies to ground input arguments. Abstract pointers in head of clauses circumvents usual pattern based indexing (constant time overhead). Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data
  • 36. An automatic program transformation The road ahead... Our program transformation should be seen as workaround, until such optimizations find their way into the tabling systems. We hope that Prolog implementors will pick up on this and integrate such optimizations directly in the tabling systems, so that the user does not need to transform his program, and need not worry about the underlying tabled representation and its implicit complexity. Christian Theil Have and Henning Christiansen Efficient Tabling of Structured Data