CTL Model Checking in Database Cloud
German Shegalov
Oracle Corp.
500 Oracle Parkway, Redwood Shores, CA 94065, USA
german.shegalov@oracle.com
Abstract: Modern software systems such as OS, RDBMS, JVM, etc. have reached enormous complexity by any metric, ranging from the number of lines of code to the program state explosion due to concurrency. Standard quality assurance methods do not yield strong correctness guarantees because even 100% code coverage, while a desirable metric, is not equivalent to state/execution path coverage. Whereas model checking provides rigorous correctness proofs, its computational complexity is often prohibitive for real-world systems. With advances in distributed computing frameworks such as MapReduce and the affordability of large computer clusters (e.g., offered as an on-demand cloud service), steadily larger systems can be verified using model checking. In this paper, we envision database vendors competing to achieve the highest possible degree of verification using massive scalability features. To this end, we show a way of implementing a CTL model checker as an SQL application that a database system will “tune” for the cloud.

I. INTRODUCTION
In this paper, we demonstrate a relatively simple way to turn a relational database system into a powerful verification tool. We show only a very basic technique of implementing a model checker inside a database system. The idea is to encourage database vendors to compete not only on performance-oriented benchmarks but on the merits of objective software quality as well. An interesting side effect of this is that the database system will be able to verify its own concurrency control and recovery protocols.
Several steps are required on the way towards software verification. First, the source code has to be converted into some abstract state transition model using, e.g., the ETL functionality. This is already a complex problem because finite abstractions for things like recursion and heap allocations need to be found. In this paper, we assume that some technology as in the Spin model checker is used to this end. Then each component architect will formulate safety and liveness properties in a temporal logic that can be verified by the system as the final step. These tasks by themselves should already embody a substantial stress test of the database system itself. Often, the source code is generated from state diagrams for protocols, or from grammars, and these specifications can be used directly instead.
In this paper, we focus on implementing a Model Checking Engine as an SQL application. The idea here is to use SQL as a scalability vehicle for massively parallel model checking. Dialects of SQL, the lingua franca of most relational databases, are already very powerful languages that have found interesting uses beyond the traditional OLTP and OLAP scopes, e.g., to solve puzzles [7]. And when SQL's expressiveness is not sufficient, we can resort to an efficient database-system-integrated virtual machine that will close the gap for any Turing-computable problem. The task of a model checker is to verify whether the system under test satisfies a property posed as a formula in a temporal logic such as CTL.

II. CTL BACKGROUND
Model checking is a formal method of software/hardware verification, an automated way of providing mathematical proofs [1]. In this paper, we deal with Computation Tree Logic (CTL) [3] model checking.
Along with the traditional Boolean operators, CTL defines the existential path quantifier E and the universal path quantifier A for the paths originating in some state s. Temporal aspects are expressed using the unary modalities neXt (refers to the successor state), Globally (all states along a path satisfy the formula), Finally (some state along a path satisfies the formula), and the binary modality Until (the left-hand formula is valid at least until a state is reached where the right-hand formula holds). The unary modalities are usually the most relevant in practice.
The set of CTL formulae over a finite set of atomic propositions P, denoted as CTL(P), is formally defined as follows using structural induction:
    p ∈ P implies p ∈ CTL(P)
    {p, q} ⊆ CTL(P) implies {¬p, p ∧ q, EX p, E (p U q), A (p U q)} ⊆ CTL(P)
Given the basic formulae defined above, the following shorthand syntax is provided as equivalent to formulae in the basic set:
    p ∨ q ≡ ¬(¬p ∧ ¬q)
    AX p ≡ ¬EX ¬p
    AF p ≡ A (true U p)
    EF p ≡ E (true U p)
    AG p ≡ ¬E (true U ¬p)
    EG p ≡ ¬A (true U ¬p)

CTL presumes that a computing system is represented as a Kripke structure K = (S, R, L), where S is the finite set of states, R ⊆ S × S is the state transition relation with (s, t) ∈ R if t is an immediate successor of s, and L: S × P → {true, false} is the labeling function that tells whether an atomic proposition holds in a state. A path is a potentially infinite sequence of successive states. In our toy example of Figure 1, AX P0, EG P0, EF P1, and AG (P0 ∨ P1) are true in S0.

    S0: P0    S1: P0, P1    S2: P1

Fig 1 A sample state-transition diagram with initial state S0 and two atomic formulae P0 and P1.
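The semantics of Section II can be made concrete with a small set-based sketch of explicit CTL checking on the toy example. This is only an illustration under stated assumptions, not the SQL implementation developed in the paper: the state ids and transitions are the ones listed in the relational instance of Section III, every name in the code (STATES, R, LABEL, ex, eu, au, and the shorthand helpers) is hypothetical, and a state without successors is assumed never to be added while computing A (p U q).

```python
# Illustrative sketch only: all names below (STATES, R, LABEL, ex, eu, au, ...)
# are hypothetical and not part of the paper's SQL implementation.

STATES = {0, 1, 2}                      # S0, S1, S2
R = {(0, 1), (1, 0), (1, 2)}            # transitions as listed in Section III
LABEL = {0: {"P0"}, 1: {"P0", "P1"}, 2: {"P1"}}


def atom(name):
    """States in which the atomic proposition `name` holds."""
    return {s for s in STATES if name in LABEL[s]}


def neg(x):
    """Complement of a state set (logical negation)."""
    return STATES - x


def succ(s):
    """Immediate successors of state s."""
    return {t for (u, t) in R if u == s}


def ex(x):
    """EX x: states with at least one successor in x."""
    return {s for s in STATES if succ(s) & x}


def eu(x, y):
    """E (x U y): start from y, repeatedly add x-states with a successor in the set."""
    result = set(y)
    while True:
        new = {s for s in x - result if succ(s) & result}
        if not new:
            return result
        result |= new


def au(x, y):
    """A (x U y): like eu, but all successors must already be in the set.

    Assumption: a state without successors is never added in the loop.
    """
    result = set(y)
    while True:
        new = {s for s in x - result if succ(s) and succ(s) <= result}
        if not new:
            return result
        result |= new


# Shorthand operators, following the equivalences of Section II.
def ax(x): return neg(ex(neg(x)))
def af(x): return au(STATES, x)
def ef(x): return eu(STATES, x)
def ag(x): return neg(eu(STATES, neg(x)))
def eg(x): return neg(au(STATES, neg(x)))


P0, P1 = atom("P0"), atom("P1")
checks = {
    "AX P0": 0 in ax(P0),
    "EG P0": 0 in eg(P0),
    "EF P1": 0 in ef(P1),
    "AG (P0 or P1)": 0 in ag(P0 | P1),
}
print(checks)
```

Under these assumptions all four entries of checks evaluate to True, which agrees with the claim made for S0 in the toy example. The two fixpoint loops, eu and au, correspond to the only non-trivial cases of the structural induction; everything else is plain set algebra.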
III. KRIPKE SCHEMA
In this section we present a way of representing the Kripke structure using database relations. The state transition diagram of Figure 1 translates to the instance of the Kripke schema outlined in Figure 2. As we incorporated the ids into the names in this example, we focus solely on the non-trivial relations valuation and transition. A negation not P is implied when the atomic proposition P is not shown in the state, and analogously when there is no corresponding entry in the valuation relation. The tuple (null, 0) in the transition relation specifies that the state with s_id = 0 is the initial state in this state transition system.

    create table state (
      s_id number primary key,
      s_nm varchar2(10)
    );
    insert into state values(0, 'S_0');
    insert into state values(1, 'S_1');
    insert into state values(2, 'S_2');

    create table atomic (
      a_id number primary key,
      a_nm varchar2(10)
    );
    insert into atomic values(0, 'P_0');
    insert into atomic values(1, 'P_1');

    create table valuation (
      s_id number references state(s_id),
      a_id number references atomic(a_id)
    );
    insert into valuation values(0, 0);
    insert into valuation values(1, 0);
    insert into valuation values(1, 1);
    insert into valuation values(2, 1);

    create table transition (
      src_id number references state(s_id),
      tgt_id number references state(s_id)
    );
    -- (null, 0) marks state 0 as the initial state
    insert into transition values(null, 0);
    insert into transition values(0, 1);
    insert into transition values(1, 0);
    insert into transition values(1, 2);

    valuation
    s_id    a_id
    0       0
    1       0
    1       1
    2       1

    transition
    src_id  tgt_id
    null    0
    0       1
    1       0
    1       2

Fig 2 A sample relational representation of the Kripke structure of Fig 1.

IV. MODEL CHECKER AS AN SQL APPLICATION
In this section we translate basic CTL formulae into executable SQL queries as an implementation of the basic explicit model checking algorithm [1]. For a p ∈ CTL(P) and s ∈ S with id state_id, let sql(state_id, p) denote an SQL statement that returns 'TRUE' for the attribute 'RESULT' if s satisfies p or 'FALSE' otherwise.
The SQL statements given below are written in Oracle 11gR2's SQL [6], as close as possible to ANSI SQL, and are just meant to give the reader a flavor of the idea; we claim neither their particular elegance nor efficiency.
The algorithm for constructing an SQL representation sql(state_id, p) of a CTL formula is given using structural induction over CTL(P).

A. sql(state_id, 'FALSE')
This formula is never satisfied: the query returns 'FALSE' when the state with state_id exists and NULL otherwise.

    select
      case count(*)
        when 0 then NULL
        else 'FALSE'
      end as result
      from state
     where state.s_id = state_id;

B. sql(state_id, 'TRUE')
This formula is always satisfied when the state with state_id exists; otherwise the query returns NULL.

    select
      case count(*)
        when 0 then NULL
        else 'TRUE'
      end as result
      from state
     where state.s_id = state_id;

C. sql(state_id, atomic_id)
An atomic propositional formula is satisfied when there is a tuple (state_id, atomic_id) in the relation valuation.

    select
      case count(*)
        -- 'FALSE' when no matching valuation tuple exists
        when 0 then 'FALSE'
        else 'TRUE'
      end as result
      from valuation
     where valuation.a_id = atomic_id
       and valuation.s_id = state_id;

D. sql(state_id, ¬p)
The negation of p is satisfied when p is false.

    with subq as (
      sql(state_id, p)
    )
    select
      case subq.result
        when 'TRUE' then 'FALSE'
        else 'TRUE'
      end as result
      from subq;

E. sql(state_id, p ∧ q)
The conjunction is satisfied when both p and q are satisfied.

    with subq_p as (
      sql(state_id, p)
    ), subq_q as (
      sql(state_id, q)
    )
    select
      case count(*)
        when 0 then 'FALSE'
        else 'TRUE'
      end as result
      from subq_p natural join subq_q
     where result = 'TRUE';

F. sql(state_id, EX p)
This formula is satisfied when the state state_id is in the set of predecessors of states satisfying p.

    select
      case count(*)
        when 0 then 'FALSE'
        else 'TRUE'
      end as result
      from transition t
     where t.src_id = state_id
       and 'TRUE' = (sql(t.tgt_id, p));

G. sql(state_id, E (p U q))
This formula is satisfied when state_id is in the set of states satisfying q, or when state_id is reachable through recursive reverse traversal from the set of states already known to satisfy the formula. In each recursive step we add states that satisfy p. Since the state transition diagram may be cyclic, we use the cycle detection clause.

    with subq_EpUq (rs) as (
      select s_id as rs
        from state
       where 'TRUE' = (sql(s_id, q))
      union all
      select t.src_id as rs
        from subq_EpUq
        join transition t
          on (
            subq_EpUq.rs = t.tgt_id
            and 'TRUE' = (sql(t.src_id, p))
          )
    )
    cycle rs set is_cycle to 'y' default 'n'
    select
      case count(*)
        when 0 then 'FALSE'
        else 'TRUE'
      end as result
      from subq_EpUq
     where rs = state_id;

H. sql(state_id, A (p U q))
This formula is computed similarly to the existentially quantified formula above, with the difference that in every recursive step we make sure not to add states that have at least one successor that is not in the result set of the previous step. Hence, each step needs more than one reference to the result set computed in the previous recursion step: one for the predecessor computation and one for the check whether all successors of the predecessor are already in the previous set. Therefore, this formula cannot be computed with plain recursive SQL as above. Instead we develop a PL/SQL stored function and use a temporary table to achieve the desired behavior.

    drop table temp;
    create global temporary table temp (
      rs number primary key
    )
    on commit preserve rows;

    create or replace function ApUq
      return varchar2
    as
      pragma autonomous_transaction;
      counter number;
      newstates number;
    begin
      -- seed the result set with all states satisfying q, as in G
      insert into temp (
        select s_id from state where 'TRUE' = (sql(s_id, q))
      );
      commit; -- autonomous transaction

      select count(*) into counter
        from temp
       where temp.rs = state_id;
      if counter > 0 then
        return 'TRUE';
      end if;
      loop
        newstates := 0;
        -- candidates: predecessors of known states that satisfy p
        for r1 in (
          select t1.src_id
            from temp
            join transition t1
              on (
                temp.rs = t1.tgt_id
                and 'TRUE' = (sql(t1.src_id, p))
              )
        )
        loop
          -- count successors of the candidate that are not yet in temp
          select count(*) into counter
            from transition t2
           where t2.src_id = r1.src_id
             and t2.tgt_id not in (
               select tt3.rs from temp tt3
             );
          if counter = 0 then
            if r1.src_id = state_id then
              return 'TRUE';
            else
              begin
                insert into temp values (r1.src_id);
                commit; -- autonomous transaction
                newstates := newstates + 1;
              exception
                when dup_val_on_index then
                  dbms_output.put_line('ignored duplicate');
              end;
            end if;
          end if;
        end loop;
        -- fixpoint reached without finding state_id
        if newstates = 0 then
          return 'FALSE';
        end if;
      end loop;
    end;

    select ApUq() as result from dual;

This sample implementation can be further optimized at different levels. From the model checking perspective, the basic explicit algorithm is known to be outperformed by symbolic model checking [4] using OBDD-encoded Boolean functions [5]. From the database perspective, we would start by looking at horizontal scalability features such as Parallel Pipelined Table Functions (PTF) in the case of Oracle [6], or similar techniques such as MapReduce [2], depending on the vendor's functionality. As seen in this section, the queries implementing a composite CTL formula might consist of many subqueries that can be run in parallel. Many existential queries will benefit from the ability to stream the query hits early, before the whole result set is formed, as can be done with PTF.

V. BENCHMARK PROPOSALS
In terms of self-verification it might be difficult to devise a vendor-independent metric for the model checking benchmark. One such metric could be the percentage of the source
code verified given a set of CTL propositions that apply to all products.
Fortunately, it is much easier to design an apples-to-apples benchmark if the verified system is a third-party product. We suggest that a substantial open-source project at the scale of Linux or MySQL be used as the system under verification.
As an example of the properties we want to verify, consider two-phase locking (2PL), where a transaction has distinct lock acquisition and release phases. With the events of lock acquisition and release by a transaction t encoded as t_acq and t_rel, respectively, we can state:
    AG(¬t_rel ∨ AX(AG ¬t_acq))

We envision the following benchmark metrics:
• The fraction of the source code verified
• The fraction of the formulae verified
• The monetary cost of the setup needed for verification
• The amount of energy spent per verification per source code line

VI. CONCLUSION
This paper advocates spending recent scalability gains in modern computing on finding rare and corner-case bugs in database systems to improve their quality by means of fully automated model checking. The fact that model checking can be implemented using the database system itself also presents an interesting test case in terms of traditional software testing. Further, we show a sample implementation of the basic explicit model checking algorithm using the combination of Oracle 11.2 SQL and PL/SQL. Then we point out a couple of optimization areas where the vendors can work on excelling in this benchmark. Last but not least, we suggest several benchmark metrics.

REFERENCES
[1] Clarke, E., Schlingloff, B.: Model Checking, in Handbook of Automated Reasoning, Volume 2, Elsevier and MIT, 1635-1790 (2001)
[2] Dean, J., Ghemawat, S.: MapReduce: Simplified Data Processing on Large Clusters, in Symposium on Operating System Design and Implementation (OSDI), San Francisco, CA (2004)
[3] Emerson, E.: Temporal and Modal Logic, in Handbook of Theoretical Computer Science, Volume B: Formal Models and Semantics, Elsevier and MIT, 995-1072 (1990)
[4] McMillan, K.: Symbolic Model Checking, Kluwer, Norwell, MA (1993)
[5] Meinel, C., Theobald, T.: Algorithms and Data Structures in VLSI Design: OBDD Foundations and Applications, Springer, Heidelberg (1998)
[6] Oracle Corp.: Oracle Database SQL Language Reference 11g Release 2 (11.2), http://download.oracle.com/docs/cd/E11882_01/server.112/e17118/toc.htm
[7] Sheffer, A.: Oracle RDBMS 11gR2 – Solving a Sudoku using Recursive Subquery Factoring, http://technology.amis.nl/blog/6404/oracle-rdbms-11gr2-solving-a-sudoku-using-recursive-subquery-factoring