UML Modeling Legal Tax Rules
1. A Model-Based Framework for Probabilistic Simulation of Legal Policies
.lu software verification & validation
Ghanem Soltana, Nicolas Sannier, Mehrdad Sabetzadeh,
and Lionel Briand
SnT Centre for Security, Reliability and Trust
University of Luxembourg, Luxembourg
2. How did this work come about?
• Collaboration with the Government of Luxembourg
- CTIE: Government’s IT Centre
- ACD: Tax Administration Department
• New tax system under development
• Develop tailored solutions for decision-support and software verification
3. Context
Using UML for Modeling Procedural Legal Rules:
Approach and a Study of Luxembourg’s Tax Law
Ghanem Soltana, Elizabeta Fourneret, Morayo Adedjouma,
Mehrdad Sabetzadeh, and Lionel Briand
SnT Centre for Security, Reliability and Trust, University of Luxembourg
{firstname.lastname}@uni.lu
Abstract. Many laws, e.g., those concerning taxes and social benefits,
need to be operationalized and implemented into public administration
procedures and eGovernment applications. Where such operationalization
is warranted, the legal frameworks that interpret the underlying […]
4. Context
[Figure: overview diagram. Labels: models of legal policies; simulation data; generates (optional); simulates; input to; impact of legal policy changes on variables of interest. Example charts: percentage of households and contribution to revenue by gross annual income (in Euros); percentage of households by annual income taxes due (in Euros), before and after the change.]
5. Objectives
• Simulating the impact of legal policy changes
• Enabling simulation even when simulation data is not available
[Figure: same overview diagram and example charts as on slide 4.]
6. Legal policy simulation in practice
Some existing simulation tools focused on taxation and social security:
• ASSERT: Assessing the effects of reforms in taxation
• SYSIFF: A micro-simulation model for the French tax system
• POLIMOD: A national static tax-benefit model for the UK
• EUROMOD: European benefit-tax model and social integration
9. Limitations of current simulation frameworks
• Legal policies are hard to validate
• Single-purpose models
• Unusable when simulation data is not available
10. Desiderata
• Legal policies should be captured in a precise and yet easy to understand manner
• Automated simulation/analysis should be possible even when data is not available
11. Working assumptions
• Legal policies are from prescriptive laws
- Taxation and social benefits
• No change in human behavior due to legal policy modifications
12. Our policy simulation framework
[Figure: framework overview. (1) Model legal policies: relevant legal texts -> domain model and policy models. Decision: is simulation data available? If yes, the existing simulation data is used. If no: (2) annotate domain model with probabilities -> annotated domain model (stereotypes <<s>>, <<p>>, <<m>>); (3) generate simulation data -> generated simulation data. (4) Perform simulation -> simulation results.]
13. Legal policy models
• A legal policy model captures the procedure envisaged by law for performing a certain activity
• Notation: Extended Activity Diagrams (ADs)
• Facilitates communication between legal and IT experts
• ADs are expressive, visual, precise, and executable
[Soltana et al., 2014]
14. Excerpt from the income tax law
Art. 105bis […] The commuting expenses deduction is defined as a function over
the distance between the principal towns of the municipalities of a taxpayer's
home and his place of work.
The distance is measured in units of distance expressing the kilometric distance
between [principal] towns. A ministerial regulation provides these distances.
The amount of the deduction is calculated as follows:
• If the distance exceeds 4 units but is less than 30 units, the deduction is 99€
per unit of distance.
• The first 4 units are not taken into account and the deduction for a distance
exceeding 30 units is limited to 2,574€.
* Translation from French text
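The arithmetic of the excerpt can be sketched in a few lines of Python. This is only one possible reading, assuming that the first 4 units are subtracted before the 99€ rate is applied and that the result is capped at 2,574€ (which equals 99 × 26, the rate applied to 30 − 4 units). The function and constant names are illustrative and come neither from the law nor from the slides.

```python
FLAT_RATE = 99            # euros per unit of distance (from the excerpt)
MINIMAL_DISTANCE = 4      # first units that are not taken into account
MAXIMAL_DEDUCTION = 2574  # cap in euros (from the excerpt)


def commuting_deduction(distance: int) -> int:
    """Return the deduction in euros for a distance given in units.

    Assumed reading: only units beyond the first 4 count, at 99 euros
    each, and the total never exceeds 2,574 euros.
    """
    counted_units = max(0, distance - MINIMAL_DISTANCE)
    return min(counted_units * FLAT_RATE, MAXIMAL_DEDUCTION)


if __name__ == "__main__":
    for d in (3, 10, 30, 45):
        print(d, commuting_deduction(d))
```

Under this reading the cap is reached exactly at 30 units, so the function is continuous across the 30-unit threshold.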
15. Example policy model
[Figure: extended activity diagram for the commuting expenses deduction, highlighting the procedure defined by the legal policy. Recoverable content:]
• «policy» «iterative», «context» TaxPayer, iterating over inc : Income
• Inputs «in» «fromrecord»:
- incomes, «query» OCL: self.incomes->select(i:Income | i.year = 2015)
- distance, «query» OCL: inc.distance
- prorata_period, «query» OCL: inc.prorata_period
• Inputs «in» «fromlaw»:
- flat_rate = 99€
- maximal_flat_rate = 2,574
- minimal_distance = 4
- maximal_distance = 30
- Sources: Art. 105bis of the LITL, 2013; Ministerial Regulation of February 6, 2012
• «decision» distance > minimal_distance
- no (false): «calculate» No deduction, «formula» 0 (zero)
- yes (true): «decision» distance < maximal_distance
- yes (true): «calculate» Normal rate per unit for declared distance, «formula» prorata_period * flat_rate * distance
- no (false): «calculate» Special flat rate for maximal distance, «formula» prorata_period * maximal_flat_rate
• «intermediate» expected_amount : MonetaryValue
• «update» Store simulation result {property: inc.taxCard.FD; value: expected_amount}
16. Example policy model
[Figure: the policy model of slide 15 again, highlighting the inputs from the legal policy («fromlaw» «in»): flat_rate = 99€, maximal_flat_rate = 2,574, minimal_distance = 4, maximal_distance = 30. Sources: Art. 105bis of the LITL, 2013; Ministerial Regulation of February 6, 2012.]
17. Example policy model
[Figure: the policy model of slide 15 again, highlighting the inputs from the simulation data («fromrecord» «in»): incomes (OCL: self.incomes->select(i:Income | i.year = 2015)), distance (OCL: inc.distance), prorata_period (OCL: inc.prorata_period).]
Domain model (partial)
• Classes: TaxPayer, Income (distance : DistanceUnit, prorata_period : Number), IncomeTaxDeduction, CommutingExpenseDeduction, Address, Service
• «enumeration» Constant: FLAT_RATE, MAXIMAL_FLAT_RATE, MAXIMAL_DISTANCE, MINIMAL_DISTANCE
• Associations (as labeled): earns, lives at, works at, is granted, is based on, is accomplished at, is paid for
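To make the data flow of the example policy model concrete, here is a minimal Python sketch of how the diagram might execute. The formulas and decisions are copied as they appear on the slide (note that the normal-rate formula there multiplies by the full distance). The classes `TaxCard`, `Income`, and `TaxPayer` are simplified, hypothetical stand-ins for the domain model, and the function names are assumptions, not the generated simulation code.

```python
from dataclasses import dataclass, field
from typing import List

# «fromlaw» inputs, with the values shown on the slide.
FLAT_RATE = 99
MAXIMAL_FLAT_RATE = 2574
MINIMAL_DISTANCE = 4
MAXIMAL_DISTANCE = 30


@dataclass
class TaxCard:
    FD: float = 0.0  # deduction for commuting expenses


@dataclass
class Income:
    year: int
    distance: int          # «fromrecord» input
    prorata_period: float  # «fromrecord» input
    taxCard: TaxCard = field(default_factory=TaxCard)


@dataclass
class TaxPayer:
    incomes: List[Income]


def expected_amount(inc: Income) -> float:
    """Follow the two «decision» nodes and apply one «calculate» formula."""
    if not inc.distance > MINIMAL_DISTANCE:
        return 0.0  # No deduction
    if inc.distance < MAXIMAL_DISTANCE:
        # Normal rate per unit for declared distance
        return inc.prorata_period * FLAT_RATE * inc.distance
    # Special flat rate for maximal distance
    return inc.prorata_period * MAXIMAL_FLAT_RATE


def apply_policy(taxpayer: TaxPayer, year: int = 2015) -> None:
    """«iterative»: visit each income of the given year and store the result."""
    for inc in (i for i in taxpayer.incomes if i.year == year):
        inc.taxCard.FD = expected_amount(inc)  # «update» step
```

Calling `apply_policy` on a taxpayer leaves incomes of other years untouched, which mirrors the OCL query that selects only the incomes of 2015.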
18. Simulation framework overview
[Figure: framework overview diagram, repeated. Steps: model legal policies; annotate domain model with probabilities; generate simulation data (when no simulation data is available); perform simulation.]
19. Related work on instance generation
• Exhaustive search:
- UML2CSP [Cabot et al., 2014]
- Alloy [Jackson, 2009]
• Non-exhaustive techniques:
- Metaheuristic-search [Ali et al., 2013]
- Predefined patterns [Gogolla et al., 2005]
- Mutation analysis [Di Nardo et al., 2015]
- Configurable random generation [Hartmann et al., 2014]
20. Limitations in existing work
Existing techniques cannot generate data that is suitable for our analysis needs
• Limitations: representativeness, scalability
21. Our solution to generate simulation data
• Random generation (addresses the scalability limitation)
• Guided by a profile for capturing probabilistic characteristics of the real population (addresses the representativeness limitation)
22. Relative frequencies
60% of income types are Employment, 20% are Pension, and the remaining 20% are Other
(Source: STATEC, Luxembourg)
• Income (abstract)
- Employment «probabilistic type» {frequency: 0.6}
- Pension «probabilistic type» {frequency: 0.2}
- Other «probabilistic type» {frequency: 0.2}
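As an illustration of how such relative frequencies could drive random generation, the following Python sketch draws concrete income types with the annotated frequencies as weights. It is not the authors' generator; the function names are hypothetical, and only the three frequencies come from the slide.

```python
import random
from collections import Counter

# Relative frequencies from the «probabilistic type» annotations.
INCOME_TYPE_FREQUENCIES = {"Employment": 0.6, "Pension": 0.2, "Other": 0.2}


def sample_income_types(n: int, rng: random.Random) -> list:
    """Draw n concrete subtypes of the abstract class Income."""
    types = list(INCOME_TYPE_FREQUENCIES)
    weights = list(INCOME_TYPE_FREQUENCIES.values())
    return rng.choices(types, weights=weights, k=n)


def observed_frequencies(sample: list) -> dict:
    """Relative frequency of each income type in a generated sample."""
    counts = Counter(sample)
    return {t: counts[t] / len(sample) for t in INCOME_TYPE_FREQUENCIES}


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so that runs are repeatable
    print(observed_frequencies(sample_income_types(5000, rng)))
```

With a few thousand draws, the observed frequencies should land close to the specified 0.6 / 0.2 / 0.2, up to random variation.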
27. Consistency constraints
The sound application of the profile’s stereotypes is enforced by several
consistency constraints:
• Completeness of the probabilistic information
• Well-formedness of the probabilistic information
• Mutually exclusive application of certain stereotypes
context probabilistic_value inv:
self.base_ObjectNode.getAppliedStereotypes()->select(s |
s.qualifiedName() = 'Profile::from_histogram' or
s.qualifiedName() = 'Profile::from_barchart' or
s.qualifiedName() = 'Profile::from_distribution' or
s.qualifiedName() = 'Profile::fixed_value')->size()=1
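The OCL invariant above requires that exactly one of the four value-source stereotypes is applied to an element. The same check can be sketched in Python, assuming that the applied stereotypes are available as a collection of qualified-name strings; the function name is hypothetical.

```python
# Qualified names taken from the OCL invariant.
VALUE_SOURCE_STEREOTYPES = {
    "Profile::from_histogram",
    "Profile::from_barchart",
    "Profile::from_distribution",
    "Profile::fixed_value",
}


def has_exactly_one_value_source(applied_stereotypes) -> bool:
    """Return True when exactly one of the four stereotypes is applied."""
    matching = [s for s in applied_stereotypes if s in VALUE_SOURCE_STEREOTYPES]
    return len(matching) == 1


if __name__ == "__main__":
    print(has_exactly_one_value_source(["Profile::from_histogram"]))
    print(has_exactly_one_value_source(
        ["Profile::from_histogram", "Profile::fixed_value"]))
```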
28. Simulation framework overview
[Figure: framework overview diagram, repeated. Steps: model legal policies; annotate domain model with probabilities; generate simulation data (when no simulation data is available); perform simulation.]
29. Fully automated data generation
[Figure: data generation pipeline. Inputs: policy models (set), annotated domain model, simulation unit (class), sample size. Steps: (1) slice domain model -> slice model; (2) identify traversal order -> traversal order; (3) classify path segments -> segments classification; (4) instantiate slice model -> simulation data (instance of slice model).]
30. Simulation framework overview
[Figure: framework overview diagram, repeated. Steps: model legal policies; annotate domain model with probabilities; generate simulation data (when no simulation data is available); perform simulation.]
31. Simulation process
[Figure: simulation process. Inputs: activity diagram(s) (legal rules), original and modified sets of legal policies, domain model, simulation data. Steps: generate simulation code -> simulation code; run simulator -> simulation results; visualize and analyze results, with feedback.]
33. Research questions
• RQ1: Do data generation and simulation run in reasonable time?
• RQ2: Does our data generator produce data that is consistent with
the specified characteristics of the population?
• RQ3: Are the results of different data generation runs consistent
(up to random variation)?
34. Case study
• Models for personal income taxes were
created (domain model + policy models)
• Six representative policy models were
selected (out of 18 policy models)
• All models were validated by legal experts
35. Probabilistic information
Statistic: Description
• Age: Distribution of taxpayers by age
• Income type: Relative distribution of different income types (employment, agriculture, business and trade, etc.)
• Income range: Distribution of the annual income ranges for taxpayers
• Invalidity rate: Percentage of invalid taxpayers
• Invalidity type: Relative distribution of different invalidity types
• Residence status: Relative distribution of resident versus non-resident taxpayers
• …
15 distributions (from census and synthesized data) were used to
specify the characteristics of Luxembourg’s population
Source: STATEC, Luxembourg
36. RQ1: Do data generation and simulation run in reasonable time?
Results for the generator
[Figure: execution time (in minutes, axis from 0 to 30) versus number of generated tax cases (0 to 10k), one curve per set of policy models: ID; ID + CIS; ID + CIS + PE; ID + CIS + PE + FD; ID + CIS + PE + FD + LD; ID + CIS + PE + FD + LD + CIP.]
- Deduction for invalidity (ID)
- Credit for salaried workers (CIS)
- Deduction for permanent expenses (PE)
- Deduction for commuting expenses (FD)
- Deduction for long-term debts (LD)
- Credits for pensioners (CIP)
37. RQ1: Do data generation and simulation run in reasonable time?
Results for the simulator
[Figure: execution time (in minutes, axis from 0 to 24) versus number of simulated tax cases (0 to 10k), one curve per set of policy models: ID; ID + CIS; ID + CIS + PE; ID + CIS + PE + FD; ID + CIS + PE + FD + LD; ID + CIS + PE + FD + LD + CIP.]
- Deduction for invalidity (ID)
- Credit for salaried workers (CIS)
- Deduction for permanent expenses (PE)
- Deduction for commuting expenses (FD)
- Deduction for long-term debts (LD)
- Credits for pensioners (CIP)
38. RQ2: Does our data generator produce data that is consistent with the specified characteristics?
[Figure: Euclidean distance (axis from 0 to 0.5) versus number of generated tax cases (100 to 10k). Series: distance for age histograms; distance for income histograms; distance for income type histograms; distance for aggregation of histograms.]
The generated sample starts to be representative at sizes above 2,000 units
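The kind of measurement plotted for RQ2 can be sketched as follows: compare the histogram of a generated sample with the specified one using the Euclidean distance. This is a simplified illustration, assuming histograms normalized to relative frequencies and reusing the income type frequencies from slide 22; how the study normalized or aggregated its histograms is not stated on the slide, and the function names are hypothetical.

```python
import math
import random
from collections import Counter

# Specified relative frequencies (income type example from slide 22).
SPECIFIED = {"Employment": 0.6, "Pension": 0.2, "Other": 0.2}


def euclidean_distance(p: dict, q: dict) -> float:
    """Euclidean distance between two histograms given as bin -> frequency."""
    keys = set(p) | set(q)
    return math.sqrt(sum((p.get(k, 0.0) - q.get(k, 0.0)) ** 2 for k in keys))


def generated_histogram(n: int, rng: random.Random) -> dict:
    """Relative frequencies observed in a random sample of size n."""
    sample = rng.choices(list(SPECIFIED), weights=list(SPECIFIED.values()), k=n)
    counts = Counter(sample)
    return {k: counts[k] / n for k in SPECIFIED}


if __name__ == "__main__":
    rng = random.Random(7)
    for n in (100, 1000, 2000, 5000, 10000):
        d = euclidean_distance(generated_histogram(n, rng), SPECIFIED)
        print(f"{n:>6} cases: distance = {d:.4f}")
```

The distance is expected to shrink, on average, as the sample grows, which is the trend the plot reports.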
39. RQ3: Are the results of different data generation runs consistent?
• 5 samples of 5000 tax cases
• Pairwise comparison of the generated samples using the Kolmogorov-Smirnov test
No evidence that the samples come from different populations
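A pairwise two-sample Kolmogorov-Smirnov comparison can be sketched in plain Python as below. The sketch compares the statistic D with the usual large-sample critical value c(α)·sqrt((n+m)/(n·m)), where c(α) = sqrt(−ln(α/2)/2). The samples in the usage example are hypothetical Gaussian draws, not the study's tax cases, and all function names are assumptions.

```python
import itertools
import math
import random


def ks_statistic(a, b) -> float:
    """Largest gap between the empirical CDFs of samples a and b."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def ks_critical_value(n: int, m: int, alpha: float = 0.05) -> float:
    """Large-sample approximation of the rejection threshold for D."""
    c = math.sqrt(-math.log(alpha / 2) / 2)
    return c * math.sqrt((n + m) / (n * m))


def pairwise_ks(samples, alpha: float = 0.05):
    """Return (i, j, D, rejected) for every pair of samples."""
    results = []
    for (i, a), (j, b) in itertools.combinations(enumerate(samples), 2):
        d = ks_statistic(a, b)
        results.append((i, j, d, d > ks_critical_value(len(a), len(b), alpha)))
    return results


if __name__ == "__main__":
    rng = random.Random(3)
    # Hypothetical stand-in for one numeric attribute of generated tax cases.
    samples = [[rng.gauss(0, 1) for _ in range(5000)] for _ in range(5)]
    for i, j, d, rejected in pairwise_ks(samples):
        print(f"samples {i} and {j}: D = {d:.4f}, rejected = {rejected}")
```

Five samples give ten pairs; a pair is flagged only when D exceeds the threshold, so an unflagged pair means there is no evidence against a common underlying population.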
40. Ongoing work
• Decision-support for the Government’s actual tax reforms
• Evaluating the accuracy of the simulation results
[Figure: percentage of taxpayers per tax class (1, 1.a, 2), before and after the change.]
[Figure: percentage of households by annual decrease / increase in taxes due (in Euros), split into "less taxes to pay" and "more taxes to pay".]
41. Summary
• Model-based simulation framework for legal policies
• A profile for expressing probabilistic characteristics of a
population
• An automated stochastic data generator
• Preliminary evaluation of scalability, representativeness,
and reproducibility is promising
• Applied to assess actual tax reforms
43. Model sizes
• The domain model has: 64 classes, 43 generalizations, 344 attributes,
and 53 associations
• The six policy models have an average of 35 elements
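44. Path segments classification illustration
[Figure: classes TaxCard, Income, and IncomeType, with the sample unit marked. Associations: taxCard (0..1) to income (1); incomeType (1) to income (*). Path segments numbered 1 to 3, each classified as safe or unsafe.]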
45. Traversal order illustration
[Figure: classes TaxCard, Income, and IncomeType, with the sample unit marked. Associations: taxCard (0..1) to income (1); incomeType (1) to income (*). Traversal steps numbered 1 to 3.]
«multiplicity»
{relativeTo: taxCard;
condition: self.incomeType.oclIsTypeOf(Other)
source: 0}
46. Simulation results
Taxpayer | AEP (old) | AEP (new) | Old Tax Class | New Tax Class | Income Type | Gross | Taxable | Taxes (new) | Taxes (old)
Resident_Tax_Payer 1 | 0 | 0 | One_A | One_A | Other | 21535,32 | 19150 | 0 | 0
Resident_Tax_Payer 2 | 0 | 0 | Two | One | Pension | 21588 | 21550 | 1218 | 0
Non_Resident_Tax_Payer 3 | 0 | 0 | Two | Two | Employment | 21600 | 19200 | 0 | 0
Resident_Tax_Payer 4 | 0 | 0 | Two | One | Employment | 21600 | 19200 | 790 | 14124 (with spouse)
Resident_Tax_Payer 5 | 4500 | 0 | Two | One_A | Employment | 21600 | 19200 | 0 | 3146 (with spouse)
Resident_Tax_Payer 6 | 0 | 0 | Two | One | Employment | 21612 | 19200 | 790 | 10283 (with spouse)
…
[Figure: percentage of households by annual income taxes due (in Euros), before and after the change.]
49. RQ1: Do data generation and simulation run in reasonable time?
Results of the generator
[Figure: execution time (in minutes, axis from 0 to 30) versus number of generated tax cases (0 to 10K), one curve per set of policy models.]
Policy models | Relevant fraction of the domain model
ID | 4%
ID+CIS | 5%
ID+CIS+PE | 7%
ID+CIS+PE+FD | 13%
ID+CIS+PE+FD+LD | 20%
ID+CIS+PE+FD+LD+CIP | 22%
- Credit for salaried workers (CIS)
- Credits for pensioners (CIP)
- Deduction for commuting expenses (FD)
- Deduction for invalidity (ID)
- Deduction for permanent expenses (PE)
- Deduction for long-term debts (LD)
50. Limitations of our data generator
• Does not consider constraints other than those specified in our profile
• Works only when there are no cyclic dependencies
• Multiplicities of some generated objects might not be satisfied