July 27, 2008: "Four Ways to Represent Computer-Executable Rules". Presented at InterSymp 2008 conference sponsored by the International Institute for Advanced Studies
in Systems Research and Cybernetics (IIAS). Paper published in conference proceedings.
1. Cover Page
Four Ways to Represent Computer-Executable Rules
Author: Jeffrey G. Long (jefflong@aol.com)
Date: July 25, 2008
Forum: Talk presented at the InterSymp 2008 Conference, sponsored by the
International Institute for Advanced Studies in Systems Research and Cybernetics
(IIAS). Paper published in conference proceedings, available at
http://iias.info/pdf_general/Booklisting.pdf
Contents
Pages 1‐5: Preprint of Article
Pages 6‐26: Slides (but no text) for presentation
License
This work is licensed under the Creative Commons Attribution-NonCommercial
3.0 Unported License. To view a copy of this license, visit
http://creativecommons.org/licenses/by-nc/3.0/ or send a letter to Creative
Commons, 444 Castro Street, Suite 900, Mountain View, California, 94041, USA.
Uploaded June 24, 2011
2. Four Ways to Represent Computer-Executable Rules
Jeffrey G. Long
jefflong@aol.com
Abstract
Rules have long been used by society but have rarely been studied explicitly in their own right.
They are increasingly recognized as interesting and useful abstractions. The recent trend towards
business rules has brought the subject front-and-center in the business world, as have interests in
work process re-engineering over the past twenty years. Rules for computerized applications
currently are represented in three ways:
as software instructions
as production rules in the rulebase of an expert system
as pairs of XML tags.
Each of these has its strengths and weaknesses. This paper discusses these approaches and
briefly describes a proposed fourth approach, namely representing most rules in a relational
DBMS. I view this as an exercise in notational engineering, i.e. examining alternative
representations to select one that is “best” in some engineering sense.
Key Words: Business Rules; Software; Expert Systems; XML; Relational Databases
General Features of Rules
Any manner of representing rules must have several fundamental features, including:
what kind of events can initiate a cascade of rule executions
the sequence in which rules are to be inspected, if sequence matters (including loops)
the various conditions under which each rule is to be inspected and/or fired
what happens if no rule, one rule, or multiple rules are found that match selection criteria
how to resolve conflicts if multiple actions are prescribed
when and how to stop or complete a rule cascade.
To be a rule management system, such a system must also have metadata such as:
who created or updated the rule, and when
why the rule was created/updated
by what device the rule was created or updated (manually, by import, by software, etc.)
whether the rule can safely be changed without consulting others
what kind of further “research” ought to be done regarding a rule, if any (e.g. are there
questions about the rule? Might it be obsolete?).
Software Rules
Software rules are implemented as lines of code in a computer language such as Java. Such rules
are typically called “business logic” rather than “business rules,” and are specified in terms of
one of four standard programming constructs:
an ordered sequence of instructions
loops, used to specify conditional re-iterations of rules
If-Then-Else statements that select among two or more options
Case statements that select among multiple options.
The result of executing a software rule is that either (a) internal or external data values are
updated, or (b) program control goes to a portion of the program that is specified. From there,
further rules are found and executed. Because different situations often have similar but slightly
different rules, parameters are often specified whereby the code reads the parameter (typically
stored as a data element) and branches to another section of code based on the value of the
parameter. This allows software designers to anticipate predictable differences in the way
different users might want the system to work. An example of a parameter is the definition of a
fiscal year-end month, so that accounting systems can handle the fact that any month may be the
year-end of a fiscal year for a particular user.
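A minimal sketch of the two ideas above, written in Python rather than Java. The discount rule is hard-coded as an If-Then-Else statement, and the fiscal year-end month is a parameter that the code reads and branches on. The function names, the `FISCAL_YEAR_END_MONTH` setting, and the convention of naming a fiscal year after the calendar year in which it ends are illustrative assumptions, not code from any actual system; the 5% figure is borrowed from the discount example used later in this paper.

```python
from datetime import date

# Hypothetical parameter; in a real system it would be stored as a data element.
FISCAL_YEAR_END_MONTH = 6  # June


def discount_rate(customer_category: str, product_category: str) -> float:
    """Business logic hard-coded as an If-Then-Else statement."""
    if customer_category == "Premium" and product_category == "Regular":
        return 0.05
    else:
        return 0.0


def fiscal_year(d: date, year_end_month: int = FISCAL_YEAR_END_MONTH) -> int:
    """Return the fiscal year containing d (assumed to be named after the
    calendar year in which the fiscal year ends)."""
    # The code branches on the parameter value instead of assuming December.
    if d.month > year_end_month:
        return d.year + 1
    return d.year
```

Changing the discount from 5% to some other value, or adding a third customer category, requires a programmer to edit and redeploy this code, which is the maintenance cost discussed next.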
The ability to specify rules as software provides a very fine-grained ability to represent complex
and contingent rules. The downside of this is that there are always many such rules, typically
thousands or more, and as a result there are thousands to millions of lines of code in a typical
software application, or even a single software object. This large code corpus is difficult to
comprehend, and, since it must evolve with new rules, ensures significant life-cycle maintenance
costs. As with any complex system, changing one part of the system may have unanticipated
consequences for other parts. And since only programmers can update the code, there is always
the risk of miscommunication between the subject experts and the programmers.
Production Rules
An expert system has an inference engine that knows only the rules of inference, a rulebase
that specifies the rules (called productions), and an initial set of facts (the environment). Rules
are triggered by facts: all rules that match the current environment are selected.
Those rules are added to an agenda, any conflicts are resolved (often via rule prioritization), and
the remaining rules are fired. Firing a rule changes the facts (by asserting
new facts or withdrawing existing facts), which may then cause other rules to be fired. This
process continues until a specified end-point is reached, or until there are no more rules on the
agenda.
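The match, resolve, and fire cycle described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the algorithm of CLIPS, Jess, or any other engine: the `Rule` class and `run` function are hypothetical, conflicts are resolved by a simple numeric priority, one rule fires per cycle, and withdrawing facts is not modeled.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set, Tuple

Fact = Tuple[str, str, str]  # e.g. ("customer", "is", "premium")


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int                # higher priority wins a conflict
    conditions: FrozenSet[Fact]  # all must be present in the facts
    assertions: FrozenSet[Fact]  # facts added when the rule fires


def run(rules: List[Rule], facts: Set[Fact], max_cycles: int = 100) -> List[str]:
    """Match, select, and fire rules until the agenda is empty."""
    fired: List[str] = []
    for _ in range(max_cycles):  # a specified end-point, guarding against loops
        # Match: rules whose conditions hold and that would add something new.
        agenda = [r for r in rules
                  if r.conditions <= facts and not r.assertions <= facts]
        if not agenda:
            break
        # Select: resolve conflicts by priority.
        rule = max(agenda, key=lambda r: r.priority)
        # Fire: change the facts, which may enable other rules.
        facts |= rule.assertions
        fired.append(rule.name)
    return fired


rules = [
    Rule("good-customer-discount", 10,
         frozenset({("product", "is", "regular"), ("customer", "is", "premium")}),
         frozenset({("price-discount", "is", "5%")})),
    Rule("show-discount-on-invoice", 1,
         frozenset({("price-discount", "is", "5%")}),
         frozenset({("invoice", "shows", "discount")})),
]
facts = {("product", "is", "regular"), ("customer", "is", "premium")}
fired = run(rules, facts)
```

The second rule cannot fire until the first has asserted the discount fact, so control follows the state of the data rather than the order in which the rules were written.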
Production rules are formulated in an If-Then (sometimes If-Then-Else) format. There can be an
unlimited number of If-conditions, which specify the environmental conditions under
which the Then-action(s) will be taken, and an unlimited number of Then-actions. Rules are
typically stored in a text file which is loaded into memory at runtime, as are the initial facts. The
way rules are defined (formatted) has become important for rule interchange among different
systems, and the Object Management Group (OMG) released a Beta version of its
Production Rule Representation specification in November 2007.
This approach has shed light on the kind of thinking that an expert seems to do, namely to look
for salient features of a given environment, respond to those features with changes to the
environment, and then respond to the changed environment. Its downside is that when the
rulebase exceeds a few thousand rules the system may behave in an unexpected manner, for the
rule interactions are hard to anticipate, and the order of rule execution is important. Another
difficulty is that there are many (possibly thousands of) free-standing, independent rules to
manage, even when the rules are grouped into rulesets. Yet future expert systems will need to
manage not just thousands but hundreds of thousands, even millions, of rules.
XML Rules
Much work has been done in recent years towards the design and standardization of XML-based
Rule Markup Languages. These are intended to make rules more easily maintainable by non-
programmers; to serve the semantic web; and to define rules in a manner not tied to any
particular vendor’s technology. A primary driver has been the increasing need to communicate
and cooperate with numerous systems not only within an organization but now across organizations
(e.g. to customers, vendors, regulatory agencies, etc.). This has led to an interest in externalizing
certain rules outside of software so they may be more readily examined and changed.
The eXtensible Markup Language (XML) format has been widely adopted as a general
framework for the specification of rules (e.g. RuleML, R2ML). XML tags are used to demarcate
the beginning and the end of the operators and relations to check for a particular rule; these may be
nested and combined as necessary. Rules so demarcated may then be searched for and read by
multiple applications. There is a W3C Working Group dedicated to producing a Rule Interchange
Format (RIF), and the OMG is working on a variety of important areas and recently
released version 1.0 of its Semantics of Business Vocabulary and Business Rules.
One difficulty of this approach is that those who maintain the rules are still left with an enormous
number of free-standing, independent rules to manage. Integrity constraints are being developed,
but there is still no referential integrity, such that an update can cascade to all places where an
entity is referenced. Lastly, there is little query or reporting capability by which one can scan or
update rules quickly and easily. These problems are similar to the problems encountered with
the software representation of rules. An example of a simple RuleML rule implementation to
give a premium customer a 5% discount on any regular product is shown in Figure 1 below.
<imp>
  <_head>
    <atom>
      <_opr><rel>discount</rel></_opr>
      <var>customer</var>
      <var>product</var>
      <ind>5.0 percent</ind>
    </atom>
  </_head>
  <_body>
    <and>
      <atom>
        <_opr><rel>premium</rel></_opr>
        <var>customer</var>
      </atom>
      <atom>
        <_opr><rel>regular</rel></_opr>
        <var>product</var>
      </atom>
    </and>
  </_body>
</imp>
Figure 1: RuleML for a Price Discount Decision
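To illustrate how an application might search for and read a rule marked up this way, here is a short Python sketch that parses the text of Figure 1 with the standard library's `xml.etree.ElementTree`. The `read_atom` helper is hypothetical, and the sketch only extracts the relation names and arguments of the head and body; it does not evaluate the rule and is not a RuleML processor.

```python
import xml.etree.ElementTree as ET

RULE_XML = """
<imp>
  <_head>
    <atom>
      <_opr><rel>discount</rel></_opr>
      <var>customer</var>
      <var>product</var>
      <ind>5.0 percent</ind>
    </atom>
  </_head>
  <_body>
    <and>
      <atom>
        <_opr><rel>premium</rel></_opr>
        <var>customer</var>
      </atom>
      <atom>
        <_opr><rel>regular</rel></_opr>
        <var>product</var>
      </atom>
    </and>
  </_body>
</imp>
"""


def read_atom(atom):
    """Return (relation name, list of argument texts) for one <atom>."""
    relation = atom.find("_opr/rel").text
    args = [child.text for child in atom if child.tag in ("var", "ind")]
    return relation, args


root = ET.fromstring(RULE_XML.strip())
# The head is the conclusion of the rule; the body lists its conditions.
head = read_atom(root.find("_head/atom"))
body = [read_atom(a) for a in root.findall("_body/and/atom")]
```

Here `head` holds the `discount` relation with its three arguments, and `body` holds the `premium` and `regular` conditions that must both be satisfied.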
Ultra-Structure Rules
Since 1985 I’ve developed and used a fourth approach, called “Ultra-Structure”. This approach
removes all business rules that might ever change from the software, leaving only the control
logic for a “competency rule engine” as software. The rest of the rules are represented via
relational tables; there are no data or facts in the system, only rules. Rules can be converted from
their natural language form (e.g. a policy manual) into one or more rules having a canonical form
consisting of:
one or more “If” statements, defining conditions under which the rule should be inspected
one or more “Then-Consider” statements, defining additional considerations (before
deciding what to do next) and/or actions
one or more metarule data fields specifying who set up the rule, why, whether it can
safely be changed without consulting others, etc.
We can then categorize those rules into a small number of formats called “ruleforms” that are
defined by their form and meaning, such that any logically possible rule pertaining to that
application area (e.g. order processing) can be expressed in some table in the system. This has
the profound effect of reducing the myriad known (and future unknown) rules to a
manageably small number of tables, typically fewer than 100 for an enterprise system.
Lastly, we can implement each ruleform as a table. All rules having the same number of If-statements
and similar meanings are grouped together into one table, with the If-statements
(called factors) forming columns that constitute the primary key of the table (thereby
guaranteeing the uniqueness of each rule). Other columns in the table (called considerations)
represent the Then-Consider statements and the metadata about the rule. Thus, most business
rules are represented not as software, and not as data in XML tags, but as records (relations) in a
modern RDBMS. This approach questions decades of focus on software: software is
seen as more of a problem than a solution, and the focus is on rules represented as relational data.
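A minimal sketch of one ruleform as a relational table, using Python's built-in `sqlite3` module. The table name, column names, metadata values, and the choice to return 0.0 when no rule matches are all assumptions made for illustration; they are not taken from an actual Ultra-Structure system. The two factor columns form the primary key, and the remaining columns hold a consideration and some rule metadata.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE discount_rules (
        customer_type TEXT NOT NULL,   -- factor
        product_type  TEXT NOT NULL,   -- factor
        discount_pct  REAL NOT NULL,   -- consideration
        created_by    TEXT NOT NULL,   -- metadata
        reason        TEXT,            -- metadata
        PRIMARY KEY (customer_type, product_type)
    )
""")
conn.execute(
    "INSERT INTO discount_rules VALUES (?, ?, ?, ?, ?)",
    ("Premium", "Regular", 5.0, "analyst1", "Reward premium customers"),
)


def lookup_discount(customer_type, product_type):
    """Generic control logic: select the rule whose factors match, if any."""
    row = conn.execute(
        "SELECT discount_pct FROM discount_rules "
        "WHERE customer_type = ? AND product_type = ?",
        (customer_type, product_type),
    ).fetchone()
    return row[0] if row else 0.0  # assumed default when no rule matches


def try_duplicate():
    """The primary key rejects a second rule with the same factors."""
    try:
        conn.execute(
            "INSERT INTO discount_rules VALUES (?, ?, ?, ?, ?)",
            ("Premium", "Regular", 7.5, "analyst2", "Conflicting rule"),
        )
    except sqlite3.IntegrityError:
        return "rejected"
    return "accepted"
```

The lookup function knows nothing about customers or products beyond which table to read. Changing the discount is an `UPDATE` on one row, with no change to the code.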
By specifying business rules as records in a RDBMS, the only software that remains is control
logic that knows nothing about the world except what tables to look at, in what order, and what
to do based on rules selected for execution. Key benefits of this approach are that:
the amount of software required is reduced by a factor of 10 to 100
since this control logic is unlikely to change over time, the software and data structures
stay remarkably stable even as the rules continue to evolve
rules can evolve by simply changing data, without any software changes, so many kinds
of changes can be implemented immediately
subject experts and business managers can explain new rules to business analysts (not
only programmers), who can then directly update the rules through the RDBMS.
The key benefits of using a relational database for storing such rules are that the RDBMS:
provides access security and logging of changes
provides utilities for querying and reporting on large numbers (millions) of rules
guarantees referential integrity
can easily handle millions of rules as necessary.
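As a small illustration of the referential integrity point, the following Python sketch uses `sqlite3` with a foreign key declared `ON UPDATE CASCADE`. The schema and values are hypothetical. Renaming a customer type in one place propagates to every rule that references it, and a rule that references an unknown customer type is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customer_types (name TEXT PRIMARY KEY);
    CREATE TABLE discount_rules (
        customer_type TEXT NOT NULL
            REFERENCES customer_types(name) ON UPDATE CASCADE,
        product_type  TEXT NOT NULL,
        discount_pct  REAL NOT NULL,
        PRIMARY KEY (customer_type, product_type)
    );
    INSERT INTO customer_types VALUES ('Premium');
    INSERT INTO discount_rules VALUES ('Premium', 'Regular', 5.0);
    INSERT INTO discount_rules VALUES ('Premium', 'Clearance', 0.0);
""")

# Renaming the entity cascades to every rule that references it.
conn.execute("UPDATE customer_types SET name = 'Gold' WHERE name = 'Premium'")
renamed = conn.execute(
    "SELECT COUNT(*) FROM discount_rules WHERE customer_type = 'Gold'"
).fetchone()[0]


def insert_orphan():
    """A rule that references an undefined customer type is refused."""
    try:
        conn.execute(
            "INSERT INTO discount_rules VALUES ('Unknown', 'Regular', 1.0)"
        )
    except sqlite3.IntegrityError:
        return "rejected"
    return "accepted"
```

The same connection can also run ordinary `SELECT` queries over the rule tables, which is the querying and reporting capability listed above.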
This approach is not presented as a perfect solution to the software bottleneck. Still to be
addressed are (a) the need to determine when certain conditions that might arise have not been
anticipated by any rule in the system, (b) the difficulty conventional programmers have with
looking in two places (the “data” as well as the software) to understand the logic of a situation,
and (c) the semantics of data such that each data element (such as “order date”) really means the
same thing to all parties. The OMG is working to address this last issue with its new standard.
We recently used this approach to create and install an enterprise system for a US$175M
wholesale distributor.
References
Long, J., and Denning, D. (1995); Ultra-Structure: A design theory for complex systems and
processes; Communications of the ACM Vol. 38, No. 1 (pp. 105-120)
7. Four Ways to Represent
Computer-Executable Rules
Jeffrey G. Long
jefflong@aol.com
IIAS Baden-Baden Conference
July 2008
8. Minimum Requirements of Rule Management
The sequence in which rules are to be inspected, if sequence
matters (including loops)
The various conditions under which each rule is to be inspected
and/or fired
What happens if no rule, one rule, or multiple rules are found
that match selection criteria
How to resolve conflicts if multiple actions are prescribed
When and how to stop/end a rule cascade
Exceptions to rules are rules also.
9. Conventional Ways to Represent Rules
Software (e.g. Java, C#)
Production Rules (e.g. CLIPS, Jess)
XML (e.g. RuleML, JessML)
Natural languages
Mathematical functions
Chemical formulae
Music notation
10. Software Rules
If (premium customer) and (regular product)
– Then (discount is 5%)
– Else (discount is 0%)
Select Case (customer category)
– Case “Premium”
Select Case (product category)
– Case “Regular”
discount = 5%
11. Features of Software as a Notational System
Many valid ways to express a given rule
– both a strength and a weakness, depending on programmer
Seemingly easy to change
– but many times changes create new and unexpected
problems
The starting point, stopping point, and sequence of operations
are defined wholly and explicitly by the programmer
Control is based on program structure; rules (lines of code) are
data-insensitive and ordered
One missing bracket changes rule, can make it and entire
system inoperable (unexecutable)
12. XML Rules
<imp>
  <_head>
    <atom>
      <_opr><rel>discount</rel></_opr>
      <var>customer</var>
      <var>product</var>
      <ind>5.0 percent</ind>
    </atom>
  </_head>
  <_body>
    <and>
      <atom>
        <_opr><rel>premium</rel></_opr>
        <var>customer</var>
      </atom>
      <atom>
        <_opr><rel>regular</rel></_opr>
        <var>product</var>
      </atom>
    </and>
  </_body>
</imp>
13. XML Rule Markup Features
Vendor-independent standard. Other rule standardization
efforts include RIF, PRR, CL, SBVR; open source rules
communities include jBoss Rules, Jess, Prova, OO jDrew,
Mandarax, XSB, XQuery
Designed for use on Semantic Web
– distributed, (partially) open, heterogeneous environments
One missing bracket changes rule, can make it unexecutable
14. Production Rules
(defrule MAIN::good-customer-discount
(product is regular)
(customer is premium)
=>
(assert (price-discount is 5%)))
15. Production Rule Features
The knowledge (rules) and the data (facts and instances) are
separated, and the inference engine is used to apply the
knowledge to the data
Rules are data-sensitive and unordered; control is based on
data state
There are three phases: rule-matching, rule-selection, and
rule-execution
There are limited choices during rule selection, depending on
the inference engine used to resolve a conflict set
16. Real-World Rules are More Complex
Must be inspected from most specific circumstances
(exceptions) to most general (whole classes)
Have multiple circumstances (3-10 “factors”)
Each factor has many possible values (5+)
Circumstances trigger further inspection of complex
“considerations” (e.g. QOH)
After being selected, additional rules may need to determine
final outcome (e.g. lowest price)
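One way to picture inspection "from most specific to most general" is a lookup that tries the exact combination of factors first and then falls back to progressively broader combinations. The Python sketch below is only an assumption about how such an ordering could work: the `ANY` wildcard, the order in which partially specific combinations are tried, and the example multipliers are illustrative and not taken from an actual Ultra-Structure implementation.

```python
ANY = "*"  # hypothetical wildcard meaning "any value of this factor"

# Rules keyed by (customer type, product type); the value is a price multiplier.
RULES = {
    ("Premium", "Regular"): 0.95,  # exception: most specific circumstances
    ("Premium", ANY): 0.90,        # a whole class of customers
    (ANY, ANY): 1.00,              # most general default
}


def candidates(customer_type, product_type):
    """Yield factor combinations from most specific to most general."""
    yield (customer_type, product_type)
    yield (customer_type, ANY)
    yield (ANY, product_type)
    yield (ANY, ANY)


def price_multiplier(customer_type, product_type):
    """Return the multiplier of the first (most specific) matching rule."""
    for key in candidates(customer_type, product_type):
        if key in RULES:
            return RULES[key]
    raise LookupError("no rule anticipates this situation")
```

With only two factors there are four combinations to try. With the 3-10 factors mentioned on this slide, the number of combinations grows quickly, which is part of why hand-written branching logic becomes hard to manage.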
17. But They Don’t Easily Handle Many Rules
Having Multiple Factors and Multiple Values
[Flowchart: Order Entry leads to the decision “Product Type = Regular?”, then to “Customer Type = Premium?”; the Yes/No branches end in Price = Price * 1.00, Price = Price * 0.95, or Price = Price * 0.90]
18. Additional Management Requirements
Who created or updated the rule, and when was last update
Why the rule was created/updated
By what device the rule was created or updated (manually, by
import, by software, etc.)
Whether the rule can safely be changed by a person without
consulting others
What kind of further “research” ought to be done regarding a
rule, if any, e.g. are there questions about the rule? Might it be
obsolete?
19. Merge Tools & Techniques of:
Information Management
– databases, industrial-strength platforms
Knowledge Management
– repository for knowledge of organization, both human-
oriented and machine-oriented
Knowledge Engineering
– simulation of expert decision-making with continuous
decision process improvement
21. Ultra-Structure Provides Rules with Place-Value
Existing Options
– freedom of expression means complex syntax
– semantics is assigned largely by syntax
– result is great freedom but low manageability
Ultra-Structure
– expression of rules is constrained by ruleforms
– semantics is assigned positionally
– result is adequate freedom plus high manageability
23. Benefits
Rule-recognition not triggered by working memory state but by
events; different events involve different rules
Able to define and manage more complex rules
– multiple factors and multiple values per factor address need
for high number of possible permutations
– multiple considerations applied during rule-recognition
RDBMS permits better management of millions of rules
– using standard RDBMS tools, report-writers, etc.
– can be read and managed by subject experts
Can exchange tables of rules as data
24. Conclusion
The problems with rule management are primarily caused by
how we represent rules
This is a classic notation/representation problem
Ultra-Structure uses a new abstraction (i.e. ruleforms) to
provide a time-tested way of assigning meaning by column
25. References
J. Long, D. Denning (1995), “Ultra-Structure: A design theory for complex
systems and processes”; Communications of the ACM Vol. 38, No. 1 (pp. 105-
120)
H. Boley, S. Tabet, G. Wagner, “Design Rationale for RuleML: A Markup
Language for Semantic Web Rules” at citeseer.ist.psu.edu/boley01design.html
CLIPS Reference Manual (3/28/2008)
26. Other Articles by JL
Long, J., "Automated Identification of Sensitive Information in Documents Using
Ultra-Structure". In Proceedings of the 20th Annual ASEM Conference,
American Society for Engineering Management (October 1999)
Long, J., "Editor's Note." In Long, J. (guest editor), Semiotica Special Issue:
Notational Engineering, Volume 125-1/3 (1999)
Long, J., "A new notation for representing business and other rules." In Long, J.
(guest editor), Semiotica Special Issue: Notational Engineering, Volume 125-1/3
(pp. 215-227) (1999)
Long, J., "How could the notation be the limitation?" In Long, J. (guest editor),
Semiotica Special Issue: Notational Engineering, Volume 125-1/3 (1999)
27. Writings by Others
Shostko, A., “Design of an automatic course-scheduling system using Ultra-
Structure.” In Long, J. (guest editor), Semiotica Special Issue: Notational
Engineering, Volume 125-1/3 (1999)
Oh, Y., and Scotti, R., “Analysis and Design of a Database using Ultra-Structure
Theory (UST) – Conversion of a Traditional Software System to One
Based on UST,” Proceedings of the 20th Annual Conference, American Society
for Engineering Management (1999)
Parmelee, M., “Design For Change: Ontology-Driven Knowledgebase
Applications For Dynamic Biological Domains.” Master’s Paper for the M.S. in
I.S. degree, University of North Carolina, Chapel Hill (November 2002)
Maier, C., CoRE576 : An Exploration of the Ultra-Structure Notational System
for Systems Biology Research. Master’s Paper for the M.S. in I.S. degree,
University of North Carolina, Chapel Hill (April 2006)