Multi Perspective Anomalies, by Jan W Veldsink, Master in the art of AI at Nyenrode, Rabobank, and Grio.
Machine Learning School in The Netherlands 2022.
1. NYENRODE. A REWARD FOR LIFE
Modular MBA - AISEC - Artificial Intelligence and Security
This fall at Nyenrode!
Welcome to Nyenrode
2. Jan W. Veldsink MSc STRICTLY CONFIDENTIAL
July 4 - 6, 2022
2nd Edition
4. TO BE SUCCESSFUL
“TO BE TRULY SUCCESSFUL, INFORMATION MUST FLOW
THROUGHOUT THE ORGANIZATION. WORKERS' IDEAS AND
KNOWLEDGE ARE CONVEYED TO ALL LEVELS OF THE
COMPANY, AND THE ORGANIZATION IS FULLY RESPONSIVE.
IN COMPANIES WHERE THIS KIND OF INTERNAL
COMMUNICATION SYSTEM IS IN PLACE, PRODUCTIVITY,
QUALITY, AND CUSTOMER SERVICE IMPROVE.”
(DEMING, 1982)
5. Predictive Services vision
• Every initiative is going to involve AI
• Every initiative is driven by DATA
• Every initiative is MODEL BASED
• Every initiative is realised NON CODING
6. Value Chain
Business requirement: Design and analyse model
Architecture: Take it to an architectural level
AI step: Model creation and Machine Learning
Feedback: Evaluation of Rules / ML / AI
Event: Signals and Business events
Detection: Applying task specific AI
Action: Any downstream action
Further diagram labels: Signals, Signal, Optimization, Alerts, Analysis, Feedback
Value creation, Knowledge creation
7. Model driven and NON-CODING
8. 2016: Artificial Intelligence positioning paper
9. Why BigML
Auditability
Repeatability
10. Artificial Intelligence
What is the definition?
“Artificial intelligence systems are software (and possibly also hardware) systems designed by
humans that, given a complex goal, act in the physical or digital dimension by perceiving their
environment through data acquisition, interpreting the collected (structured or unstructured) data,
reasoning on the knowledge or processing the information derived from these data, and
deriving/deciding the best action to take to achieve the given goal.” (EBA)
How does it work?
AI is trained by the Decision Engineer to perform cognitive tasks; these tasks can vary in their
degree of autonomy. The creation cycle is depicted in the ‘Ecosystem’.
What is it used for?
“We believe AI presents strong opportunities for prosperity and growth, both for society, and
specifically for financial services. The application of AI will be in the interest of consumers and
businesses, providing better, faster products and services, providing relevant or right information
at the right time. One of the many key advantages is better risk management as advanced data
analytics contributes to a better internal understanding of bank activities (e.g. provisioning and
capital models), operational risks and improved monitoring of compliancy.” (NVB)
What is the technology behind it?
Statistics, Machine learning, Logic programming, Linear programming, Knowledge based systems,
Autonomous intelligent Agents, Cybernetics / Computational intelligence and soft computing.
Related technologies
Robotics, Internet of Things, Robotic Process Automation
Ethical ‘Compass’ for AI
(Rabobank Compass aligned:)
Societal well-being
We act in the interest of our customers and we will
protect these interests where we can;
Respect for human autonomy
We have respect for the autonomy of the
individual;
Fair and explicable
We are ethical, transparent and approachable;
Justice
We aim at equal treatment. We will not treat two
persons with the same need / interest in a
different way.
Authors (Compliance / Legal): Patty Braam-Liu, Sander Smits, Martijn Duijvestein.
Society
Possibilities Artificial Intelligence
Ecosystem Transparent AI principles
A model is explainable when it is possible to generate explanations that allow
humans to understand (i) how an outcome or result is reached or (ii) on what
grounds the result is based (similar to a justification).
The model is interpretable, since the internal behaviour (representing how the
result is reached) can be directly understood by a human.
Fairness requires that the model ensures the protection of groups against unfair
bias (direct or indirect), discrimination and stigmatisation. Unfairness can affect in
particular smaller populations and vulnerable groups.
Bias is an inclination of prejudice towards or against a person, object, or position.
Bias (or biased outcomes) can occur in many ways and must be avoided at all times.
All the steps and choices made throughout the entire data analytics process need to
be clear, transparent and traceable to enable its oversight. This includes, amongst
others, model changes, data traceability and decisions made by the model.
An auditable solution, for which there are detailed audit logs for all phases of the
process that can be used to identify ‘who did what, when and why’, facilitates
oversight of the system, as it makes it possible to follow the whole process and gain
better insights.
As a general principle, the customer should be informed about any data processing
performed on his or her personal data. All data processing must be on lawful
grounds and be protected with proper technical and organisational measures.
A trustworthy system should respect customers’ rights and protect their
interests. Development and deployment of systems must therefore always comply
with law and not harm or diminish consumer rights.
Regulatory aspects
Non-discrimination – Article 1 of the Constitution (Grondwet),
European Convention on Human Rights (Europees Verdrag voor de
Rechten van de Mens) , Act on equal treatment.
Privacy – Profiling, transparency, automated decision making,
lawfulness, explainability, data minimization, accountability
TCF - Duty of care, comprehensibility, complaints procedure.
N.B. Ethics: the ‘moresprudentie’ (moral case law) of the Ethics Committee (Commissie Ethiek) also plays a role.
Date: 30/6/2020, v.0.5
AI helps financial inclusion by e.g.
creating access to credit for people
and businesses that are currently
shut out of the market, at the same
or lower risk costs, made possible by
considering new data sources.
Compliance angles
There are potential privacy issues with the use of AI, especially since
AI often applies profiling and automated decision making to data.
Furthermore, the transparency and explanation of the use of AI
models requires extensive and continuous in-depth knowledge.
Due to the automated nature of AI, the human factor in conflicts of interest (COI) is
mitigated. However, a COI can still exist between the outcomes
of AI and the vision and standards within the bank.
The use of AI can have significant impact on the detection of corruption.
The use of AI can contribute positively, for example by automating
alert triage, investigation and reporting, and by deploying holistic
market-surveillance activities.
The use of AI can have significant positive impact on the detection of
money laundering, financing of terrorism and sanctions
(Prospect) Customers are exposed to various risks in Treating Clients
Fairly with the use of AI because of the uncertainty of the outcome of AI
in relation to the irrational human brain.
AI is often associated with ‘black box’ technologies, in which decisions can have
a negative impact on certain customers or business partners:
discrimination, exclusion, etc.
The use of AI can have significant impact on the detection of fraud
Ethics
The standpoint of EU’s High Level Expert
Group on AI
Supervisors Initiatives
The standpoints
General Data Protection Regulation (EC)
Regulation on framework for free flow of
non-personal data in EU (EC)
White Paper on AI – A European approach
to excellence and trust (EC).
DNB - General Principles for the use of Artificial
Intelligence in the Financial Sector
EBA - Final Report on Big Data and Advanced Analytics
Autoriteit Persoonsgegevens - Toezicht op AI &
algoritmes
FSB - Artificial intelligence and machine learning in
financial services. Market developments and financial
stability implications (Financial Stability Board)
EU framework on algorithmic accountability and
transparency (study of European Parliament).
Sector guidelines
Institute of International Finance
Machine Learning in Credit Risk (IIF)
Legislation EU/NL
The current EU regulatory division
Ethics Guidelines for trustworthy AI
Discrimination
Biases
Governance
Public opinion
Transparency
Trust
Privacy
CLR & Technology Guild
Customers
More personalized
and better products
with more efficient
and customer
focused services
Rabobank
More and more
insight into
efficiency, risks
and compliance
11. What we evaluated
Tool | What | Evaluated | Verdict
Thetaray | Anomaly detection | Too difficult to handle. Limited capabilities | Not usable
Microsoft ML | ML and cloud platform | Platform for data science, limited intelligence | Not usable
Riskshield | Risk engine, knowledge based | Fit for purpose, the platform is for execution of rules and logic | No machine learning
DataRobot | Data science robot platform | Looks promising: automated ML and model validation claims, supervised and unsupervised | Was part of the RFP
H2O | ML application | Good and limited set of algorithms | Is too technical and limited
R/Python/Weka/.. own development | ML open source applications | Open to all the hard work, depending on knowledge and skills | Is too technical
BigML | Integral platform, supervised / unsupervised and unstructured | Easy to learn, great visuals, integration with Riskshield | RFP and POC
RapidMiner | Platform for data scientists | Hard to learn | Not usable
Dataiku | Platform to support data scientists | Too technical | Not usable
SAS | Part of a large suite | Very large implementations, data science oriented tooling | Not usable
IBM/SPSS | Used it for years | Not fit for new tasks | Not usable
12. POC BigML-timings
Timeline: MAR, APR, MAY, JUN, JUL
Phases: RFP, Installation, Education, Experiments 1, Experiments 2, Hardware plan, Reporting
Milestones: Hardware plan; Implemented BigML on Rabobank hardware; First results meeting; End report
13. POC BigML-timings
Same timeline as slide 12, now with a ‘Today’ marker and an added phase: Hardware Implementation
14. BigML - first results
Environment/Experiment | True positives | False positives
Thetaray | 10% | 5,000 / 1,000,000
R / Weka experiments | 70% | 350 / 1,000,000
Microsoft | 70% | 150 / 1,000,000
BigML | 90% | 5 / 1,000,000
• Fraud
• Ability to work with and predict fraud with our 2017 anonymized dataset. (Thetaray / Microsoft / Inform / CCR and Dataiku did not reach the results BigML did)
• CDD
• Reduction of 85% of the PEP alerts
22. FEC AND KYC the proper way
23. Dynamic Customer Behavioral Event Monitoring
It is a wicked problem
Dynamic Customer Behavioral Event monitoring to manage and mitigate risks for the Bank’s customers, the Financial system and our Own organization, including: Fraud, Misuse of the financial system and Financial Economic Crimes.
Projects: Project X, Project RDT, Project TM, Project Case management, Project Data Lineage, Project Prospero, Project Indica
Regulation and organization: WWFT, DNB-Guidelines, EBA-Guidelines, Org-Policies, Org-Standards, Org-Politics
Distinguish between Critical and ‘Safe to fail’ projects.
28. Multi perspective Anomaly
Customer’s path of life
Customer’s age
Customer’s industry segment
Customer’s cash intensiveness
Customer’s relation to high-risk countries
Customer newly onboarded
29. Some features: > 400 :-), different sets per Perspective
Cash related
Country related
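To make the idea of "different feature sets per perspective" concrete, here is a minimal sketch in which every customer receives one anomaly score per perspective, computed only from that perspective's features. The perspective names, the feature names and the scoring rule (a robust z-score against the median) are assumptions for illustration; the slides do not say how the per-perspective detectors are configured, and this stand-in is not BigML's anomaly detector.

```python
from statistics import median

# Hypothetical perspectives and feature names; the slides only state that
# each perspective has its own feature set (e.g. cash related, country related).
PERSPECTIVES = {
    "cash": ["cash_deposits", "cash_withdrawals"],
    "country": ["high_risk_country_transfers"],
}


def robust_z(values):
    """Distance of each value from the median, in units of the MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values) or 1.0  # avoid division by zero
    return [abs(v - med) / mad for v in values]


def score_by_perspective(customers, perspectives=PERSPECTIVES):
    """Return {perspective: [score per customer]}, using only that
    perspective's features; a customer's score is its largest robust z."""
    scores = {}
    for name, features in perspectives.items():
        per_feature = [robust_z([c[f] for c in customers]) for f in features]
        scores[name] = [max(col) for col in zip(*per_feature)]
    return scores


# Synthetic customers (not data from the talk).
customers = [
    {"cash_deposits": 100, "cash_withdrawals": 90, "high_risk_country_transfers": 0},
    {"cash_deposits": 110, "cash_withdrawals": 95, "high_risk_country_transfers": 1},
    {"cash_deposits": 105, "cash_withdrawals": 85, "high_risk_country_transfers": 0},
    {"cash_deposits": 5000, "cash_withdrawals": 92, "high_risk_country_transfers": 0},
    {"cash_deposits": 98, "cash_withdrawals": 88, "high_risk_country_transfers": 9},
]
scores = score_by_perspective(customers)
```

In this toy data the fourth customer stands out only from the cash perspective and the fifth only from the country perspective, which is the reason for keeping a separate score per perspective instead of one global score.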
31. The Peergrouping Anomaly pattern
32. Perspective New Customers
33. AI Architecture for Fraud / KYC
34. Some numbers: on a weekly basis we
• Build client profile in Riskshield: 1 hour
• Anomaly detection in RaboML (powered by BigML): 3 hours
• Assess # customers: > 10,000,000
• Number of weekly risk indicators => (Y/N) and explanation: 63,000,000
39. Based upon BigML’s explanations
40. Fairness of Assessment
41. Predicting with Sensitive Attributes
• Data without bias attribute GENDER: run the Anomaly Detector (batch anomaly score) to get data with Anomaly Score
• Dataset with two attributes: Anomaly Score and Gender Attribute
• Split: 80% Train, 20% Test
• Build Tree model, Target = GENDER: Trained Model (tree)
• Evaluate model on Test
• If Phi > 0.2 then BIAS!!
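The check on this slide can be sketched in a few lines: if a model can predict the sensitive attribute from the anomaly score alone, the score carries information about that attribute. The sketch below is a hedged illustration, not the BigML workflow from the talk: the tree is simplified to a depth-1 tree (a single threshold), the data is synthetic, and the helper names `fit_stump`, `sensitive_attribute_phi` and `make_rows` are hypothetical. Only the 80/20 split, the target and the 0.2 threshold come from the slide.

```python
import random
from math import sqrt


def phi(tp, fp, fn, tn):
    """Phi coefficient of a 2x2 confusion matrix (0.0 if undefined)."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def fit_stump(rows):
    """Depth-1 tree on one numeric feature: pick the threshold and direction
    that misclassify the fewest training rows.
    rows: list of (anomaly_score, gender) with gender coded 0/1."""
    best = None
    for threshold in sorted({s for s, _ in rows}):
        for above in (0, 1):  # label predicted when score > threshold
            errors = sum((above if s > threshold else 1 - above) != y
                         for s, y in rows)
            if best is None or errors < best[0]:
                best = (errors, threshold, above)
    _, threshold, above = best
    return lambda s: above if s > threshold else 1 - above


def sensitive_attribute_phi(rows, seed=0, train_share=0.8):
    """Train on 80%, evaluate on 20%, return phi on the test split."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * train_share)
    predict = fit_stump(rows[:cut])
    tp = fp = fn = tn = 0
    for s, y in rows[cut:]:
        p = predict(s)
        tp += p == 1 and y == 1
        fp += p == 1 and y == 0
        fn += p == 0 and y == 1
        tn += p == 0 and y == 0
    return phi(tp, fp, fn, tn)


def make_rows(n, shift, rng):
    """Synthetic (score, gender) rows; shift > 0 leaks gender into the score."""
    rows = []
    for _ in range(n):
        g = rng.randint(0, 1)
        rows.append((rng.random() + shift * g, g))
    return rows


rng = random.Random(42)
phi_fair = sensitive_attribute_phi(make_rows(1000, 0.0, rng))
phi_biased = sensitive_attribute_phi(make_rows(1000, 0.8, rng))
bias_detected = phi_biased > 0.2  # threshold taken from the slide
```

With scores that are independent of gender, the phi on the test split should stay close to zero; with the shifted scores it should land well above the 0.2 threshold.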
42. The result from actual scores
43. Detecting covariate drift
• Data Period 1: sample and add field Period with value P1
• Data Period 2: sample and add field Period with value P2
• Merge
• Split: Train 80%, Test 20%
• Build model: Target is Period, Missing splits = True
• Evaluate the trained model: check if phi > 0.2
• If so: Investigate
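The steps above can be sketched as a small domain classifier: label each record with its period, merge, and test whether the period can be predicted from the data. The sketch below makes several assumptions that are not in the talk: it skips the sampling step, uses a deterministic 80/20 split, replaces the tree with a one-field majority-vote lookup, and checks each field separately so that the drifting field can be named. The field names `sbi_class` and `country` and all data are hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt


def phi(tp, fp, fn, tn):
    """Phi coefficient of a 2x2 confusion matrix (0.0 if undefined)."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def label_and_merge(period1, period2):
    """Add a Period field (P1 / P2) to each record and merge both sets."""
    return ([dict(r, Period="P1") for r in period1]
            + [dict(r, Period="P2") for r in period2])


def drift_phi(rows, field):
    """Predict Period from a single field; return phi on the 20% test split.
    A missing value (None) counts as its own category, loosely mirroring
    the 'Missing splits = True' setting on the slide."""
    test = rows[::5]                                    # every 5th row
    train = [r for i, r in enumerate(rows) if i % 5]    # the other 80%
    votes = defaultdict(Counter)
    for r in train:
        votes[r.get(field)][r["Period"]] += 1
    default = Counter(r["Period"] for r in train).most_common(1)[0][0]
    tp = fp = fn = tn = 0
    for r in test:
        seen = votes.get(r.get(field))
        pred = seen.most_common(1)[0][0] if seen else default
        actual = r["Period"]
        tp += pred == "P2" and actual == "P2"
        fp += pred == "P2" and actual == "P1"
        fn += pred == "P1" and actual == "P2"
        tn += pred == "P1" and actual == "P1"
    return phi(tp, fp, fn, tn)


# Synthetic data: the industry mix flips between periods, the country mix
# stays the same.
period1 = [{"sbi_class": "retail" if i % 4 else "horeca",
            "country": "NL" if i % 2 else "DE"} for i in range(100)]
period2 = [{"sbi_class": "horeca" if i % 4 else "retail",
            "country": "NL" if i % 2 else "DE"} for i in range(100)]

merged = label_and_merge(period1, period2)
results = {f: drift_phi(merged, f) for f in ("sbi_class", "country")}
drifting = [f for f, value in results.items() if value > 0.2]
```

In this toy data only `sbi_class` separates the two periods, so it is the only field flagged for investigation; the unchanged `country` field gives a phi of zero.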
45. Did it work?
Phi-Coefficient = 0.347 > 0.2. Why: Countries on a list.
46. Did it work?
Phi-Coefficient = 0.4999 > 0.2: DRIFT! Why: SBI-class.
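For reference, the phi coefficient of a binary classifier (also known as the Matthews correlation coefficient) is computed from the confusion matrix of the evaluation, with TP, TN, FP and FN the counts of true positives, true negatives, false positives and false negatives. It ranges from -1 to 1, and a value near 0 means the model does no better than chance. The 0.2 cut-off is the threshold used on these slides, not part of the definition.

```latex
\phi = \frac{TP \cdot TN - FP \cdot FN}
            {\sqrt{(TP + FP)\,(TP + FN)\,(TN + FP)\,(TN + FN)}}
```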
48. PERSPECT: Simple
Business / Information
Vision, Development, Operation, Analysis
Domain, Activity, Case, Party
Infrastructure, Application, Information, Business
Services, Process, Scope
49. In a Journey towards resilience
50. MULTI PERSPECTIVE ANOMALY DETECTION
Jan W Veldsink MSc
jan@grio.nl / j.veldsink@nyenrode.nl