Ai and applications in the legal domain studium generale maastricht 20191101

Artificial Intelligence
Applications in the Legal Domain
Prof dr ir Jan C. Scholtes
Studium Generale Maastricht
AI Lecture Series
November 20, 2019
https://textmining.nu

https://www.linkedin.com/in/jscholtes/

In the short term we overestimate the role of technology; in the long term we underestimate it.
Amara’s law
Source: https://en.wikipedia.org/wiki/Roy_Amara

4
Who are the players in the Legal Technology marketplace?
Legal Tech
• Law Firms
• Big Four (but also BDO, GT, …)
• Alternative Legal Service Providers
• Legal Tech Service Providers
• Corporate Legal

5
Who are these Alternative Legal Service Providers?
Source: https://legal.thomsonreuters.com/content/dam/ewp-m/documents/legal/en/pdf/reports/alsp-report-final.pdf

6
International LegalTech Vendors & Service Providers
Source: Catalyst Investors

7
Selected US Legal & AI Scientific Publications & Research

9
Maastricht Law & Tech Lab
• Modelling legal complexity
• Regulation of disruptive technology
• Legal issues of data processing and automated decision-making

10
Source: ZyLAB Technologies BV, Amsterdam

11
Best AI for Legal Technology
• Predictive Analytics
• Reasoning
• Decision Support Analytics
• Machine Learning for classification
• Logic
• Expert Systems
• Search
• Content Extraction
• Natural Language Processing

12
How about …
• Blockchain
• Practice Management
• Document management & Workflow

13
Strength, Weakness & Risk of AI
Strength
• Memory
• Speed
• Force
• Vision
• Sensory
Weakness
• Judgement
• Knowledge
• Dealing with uncertainty and unexpected behavior
• Creativity
Risk: Bias, not respecting human values in search of efficiency, …

14
Selection of Legal Technology Applications
• Legal Research
– Case law
– IP & patents
– Knowledge management
• eDiscovery
– Document review and analysis
– Legal fact finding
– Answering regulatory & public records requests
– Internal investigations
– Compliance monitoring & auditing
– Criminal investigations
– GDPR
• Contract Law
– Contract review
– Due diligence in M&A and restructuring
– Smart contracts
• Legal Market Place
– Best lawyers for your case
– Best court for litigation
– Predicting the outcome of court decisions
• Digital Courts
– Online dispute resolution

15
1993: First large-scale usage of eDiscovery
• First fully digital court
• Data analytics & litigation support
• System copied to all United Nations-backed War Crime Tribunals and ongoing UN courts

16
1999: concern about discovery

17
2006: US Federal Rules of Civil Procedure
eDiscovery: December 1, 2006
Amended 2018

18
2002-2012: The law in the age of exabytes
• Sheer volumes
• Continuing exponential growth
• Disastrous effects on a legal system
• Information Inflation
• “Search alone is no longer good enough”

eDiscovery for enforcement, legal fact-finding, compliance, transparency & trust
eDiscovery
• Internal Investigations
• Regulatory requests
• Litigation & Arbitration
• Data privacy & protection
• Criminal Investigations
• FOIA/PRR

20
Why eDiscovery?
• In the future, all legal fact-finding will be done on electronic data sets (email, phones, tablets, hard disks, USB storage, cloud, social media, MS-SharePoint, file shares, O365…) and less and less (or not at all) from paper files.
• Information requests from 3rd parties will be about the content of such electronic data sets.
• Future legal professionals must be able to deal with large electronic data sets.
– Take decisions based on facts and not based on guesses and assumptions!
– Answer information requests in a timely, accurate and complete manner.
– Avoid high cost, reputation damage, regulatory measures, business disruption and stress!

21
Selection of Legal Technology Applications
• Legal Research
– Case law
– IP & patents
• eDiscovery
– Legal fact finding
– GDPR
• Contract Law
– Contract review
– Smart contracts
• Legal Market Place
• Digital Courts
– Online dispute resolution

22
Contract Law
Source: LawGeex - 2018

24
Legal Market Place
Source: NRC, 11 September 2019

26
On-line dispute resolution
• Verdict
• Why (justification of verdict)
• Transparency
Source: http://www.e-court.nl/

27
Which AI tools are relevant for Legal Technology?
• Logic
• Expert systems
• Reasoning
• Search
• Language
• Content extraction
• Text (document) classification
• Analytics for decision support
• Predictive analytics

29
Logic and Legal Reasoning
• Law of Detachment: if p, then q. We find out that p is true, therefore, q is true.
• Law of Syllogism: if p, then q. If q, then r. We find out that p is true, therefore, r is true.
Example: the US First Amendment protects certain kinds of expression from being banned. Nude dancing is a form of expression protected by the First Amendment. The government cannot ban people from dancing without clothing…
https://www.mtsu.edu/first-amendment/article/27/nude-dancing
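
The two inference laws above can be sketched as a tiny forward-chaining loop. This is only an illustration under stated assumptions, not a legal reasoning engine: the fact and rule labels (`nude_dancing`, `protected_expression`, `cannot_be_banned`) are hypothetical names invented here to encode the First Amendment example.

```python
def forward_chain(facts, rules):
    """Apply 'if p then q' rules until no new facts appear.

    One rule application is the Law of Detachment (modus ponens);
    chaining two rules reproduces the Law of Syllogism.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            if premise in known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical encoding of the slide's example (labels are illustrative).
rules = [
    ("nude_dancing", "protected_expression"),      # if p, then q
    ("protected_expression", "cannot_be_banned"),  # if q, then r
]
derived = forward_chain({"nude_dancing"}, rules)
print(sorted(derived))
```

Starting from the single fact p, the loop derives q by detachment and then r by chaining, which is the syllogism in executable form.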

30
Expert Systems – Symbolic AI
Rule sets, decision trees
Bayesian Networks
A method for explaining Bayesian networks for legal evidence with scenarios.
Vlek, C.S., Prakken, H., Renooij, S. et al. Artif Intell Law (2016) 24: 285.
https://doi.org/10.1007/s10506-016-9183-4

31
Arbeidsmarktresearch BV
University of Amsterdam

33
Source: http://probabilityandlaw.blogspot.com/2019/03/the-simonshaven-murder-case-modelled-as.html

34
Legal Search is A Major Challenge
• Full-text search has been around since the early 1960s.
• With Google we feel we can find anything immediately, but ‘popularity-driven’ search is not suited for legal applications.
• Most IT-provided search does not work for lawyers.

35
Why is Legal Search different?
• Completeness
• Incomplete search
• Find unknown unknowns
• Defensibility
• Transparency

36
Keyword Search

Documents to be indexed: Friends, Romans, countrymen.
Tokenizer → token stream: Friends Romans Countrymen
Linguistic modules → modified tokens: friend roman countryman
Indexer → inverted index (term → document IDs):
friend → 2, 4
roman → 1, 2
countryman → 13, 16
Source: https://nlp.stanford.edu/IR-book/information-retrieval-book.html
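
The tokenizer → linguistic modules → indexer chain can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `normalize` rule is a toy stand-in for a real stemmer or lemmatizer, and the two-document collection and its IDs are made up.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into alphabetic tokens (the 'Tokenizer' stage)."""
    return re.findall(r"[A-Za-z]+", text)


def normalize(token):
    """Toy 'linguistic module': lowercase and strip a plural 's'.

    Assumption: a real system would use a proper stemmer or lemmatizer
    (this rule cannot turn 'countrymen' into 'countryman').
    """
    token = token.lower()
    if token.endswith("s") and len(token) > 3:
        token = token[:-1]
    return token


def build_inverted_index(docs):
    """Map each normalized term to the sorted list of doc IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[normalize(token)].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


# Hypothetical two-document collection; the IDs are made up.
docs = {
    1: "Friends, Romans, countrymen.",
    2: "A friend in Rome.",
}
index = build_inverted_index(docs)
print(index["friend"])  # posting list for the term 'friend'
```

A keyword query is then a lookup of posting lists, and Boolean AND/OR become set intersection/union over those lists.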

38
Measuring quality: How good is my search?
• Task: Get rid of mice in a politically correct way
• Info need: Info about removing mice without killing them
• Verbal form: How do I trap mice alive?
• Query: mouse trap
• The search engine runs the query against the corpus and returns results; query refinement feeds back into the query.
• What can go wrong between these steps: Misconception? Mistranslation? Misformulation?

39
Precision and recall
• Lack of precision leads to noise, too many false hits, too much work to review, which yields high cost of review.
• Lack of recall leads to missing relevant documents which yields risk.
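
Both measures can be computed directly from the set of retrieved documents and the set of truly relevant ones. The sketch below uses the standard definitions; the document IDs are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved; recall = hits / relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical review set: 4 documents retrieved, 5 actually relevant.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d2", "d5", "d6", "d7"]
p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.50 recall=0.40
```

Here half of the retrieved documents are noise (review cost), and three of the five relevant documents were missed (risk).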

40
Precision & recall: inversely related
• Increase precision: AND, W/5, NOT
• Increase recall: OR, *, ?, Thesaurus, Fuzzy
• Both: Quorum search
[Chart: Precision and Recall (vertical axis 0-120, horizontal axis 1-4)]

41
F1 VALUE: COMBINATION OF PRECISION & RECALL
The most commonly used measure to describe the quality of a system
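
For reference, the F1 value is conventionally defined as the harmonic mean of precision and recall. The symbols P and R are introduced here for illustration only:

```latex
F_1 = \frac{2 \cdot P \cdot R}{P + R}
```

For example, P = 0.5 and R = 0.4 give F1 = 0.4 / 0.9 ≈ 0.44. Because it is a harmonic mean, F1 stays low whenever either precision or recall is low.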

42
Human Performance
• When both precision and recall are over 80%, human performance is approached.
• This applies to the best humans.
• It can be argued that values over 80% are often subject to different interpretations and discussions.

Document classification for search

Now imagine 1.2 million dimensional …
2-dimensional 3-dimensional

45
Teach the computer what to look for …
• 3x more relevant documents than Boolean search
• No complex queries, just review documents
• 2x the total number of relevant documents is all that needs to be reviewed
• Accurately estimate the percentage of all relevant documents found at the end

47
• Logic
• Expert systems
• Reasoning
• Search
• Language
• Content extraction
• Text (document) classification
• Analytics for decision support
• Predictive analytics

48
How about Natural Language Processing (NLP)
POS Tagging
(part of speech)
Dependency
Grammars
Source: https://www.nltk.org/ - Stanford University

49
Conditional random fields (CRF) for sequence prediction

50
Corpus for Named Entity Recognition (NER)

51
Corpus for Sentiment Mining

52
Source:
https://www.cs.colorado.edu/~mozer/Teaching/syllabi/ProbabilisticModelsSpring2018/lectures/ConditionalRandomFields.pdf

53
Long Short-Term Memory (LSTM) networks are better at capturing long-term relations as seen in NLP
• Can deal with input of variable sizes.
• Better at learning the meaning of the same word in different locations (which is hard for CNN), e.g.: drink a lot of beers / or like to drink a lot
• Better at dealing with long-term dependencies

54
Matching Textual Occurrences to Real-World
Entities …

55
Co-reference & Anaphora Resolution
Source: SemEval 2018 Task 4: Character Identification on Multiparty Dialogues

57
LANGUAGE English
CITY New Brunswick, WASHINGTON
COMPANY J&J, Johnson & Johnson
COUNTRY Greece, Poland, Romania, United Kingdom
CURRENCY .02 USD, 21400000 USD, 48600000 USD, 59.47 USD, 70000000 USD
DATE 04-08
DAY Fri, Friday
NOUN_GROUP biotech drugs, bribery case, denying guilt, final growth frontier, foreign countries, giving gifts, holding corporations, intense revenue pressure, meaningful credit, medical device kickbacks, medical devices, multiple businesses, next several days, non-U.S. markets, only way, orthopedic hips, other countries, over-the-counter medicines, paid kickbacks, past year, paying kickbacks, same time, several new positions, similar violations, travel gifts
ORGANIZATION Department of Justice, Justice Department, SEC, Securities and Exchange Commission, University of Michigan
PEOPLES Iraqi
PERSON Erik Gordon, Mythili Raman, William Weldon
PLACE_REGION Europe
PRODUCT Benadryl, Tylenol
PROP_MISC Band-Aids, Food Program, Foreign Corrupt Practices Act, United Nations Oil
STATE N.J.
TIME 1:32 pm ET
TIME_PERIOD 13 years, five years, six months, three years
YEAR 2007
PROBLEM "We went to the government to report improper payments and have taken full responsibility for these actions," said William Weldon, Chairman and CEO of J&J., Last month federal health regulators took legal control of the plant where millions of bottles of defective medication were produced., The charges against J&J were brought under the Foreign Corrupt Practices Act, which bars publicly traded companies from bribing officials in other countries to get or retain business., The company will pay $21.4 million in criminal penalties for improper payments and return $48.6 million in illegal profits, according to the government., The SEC says J&J agents used fake contracts and sham companies to deliver the bribes.
SENTIMENT giving meaningful credit to companies that self-report, We are committed to holding corporations accountable for bribing foreign officials, what is honest
REQUEST make sure it complies with anti-bribery laws across its businesses

58
Text Mining the Lord of the Rings
• Automatic identification of key players (custodians)
• Automatic identification of locations.
• Automatic identification of travel patterns of key players.
• Visualize in time.

60
CCPA

GDPR: redaction, anonymization, pseudonymization

62
How does that work?
Search Pattern Recognition Text-Mining

HOW TO REPRESENT TEXT FOR MACHINE LEARNING AND EXTRACTION OF COMPLEX PATTERNS?
Legal data is primarily text-based.

Bag of Words (BoW)
Term        Antony and Cleopatra  Julius Caesar  The Tempest  Hamlet  Othello  Macbeth
Antony      1                     1              0            0       0        1
Brutus      1                     1              0            1       0        0
Caesar      1                     1              0            1       1        1
Calpurnia   0                     1              0            0       0        0
Cleopatra   1                     0              0            0       0        0
mercy       1                     0              1            1       1        1
worser      1                     0              1            1       1        0
(1 if play contains word, 0 otherwise)
Sec. 1.1
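
A binary term-document matrix of this kind can be built with a few lines of code. This is a minimal sketch: the three-document mini-corpus is hypothetical (it is not the Shakespeare collection), and whitespace splitting stands in for a real tokenizer.

```python
def incidence_matrix(docs):
    """Return (vocabulary, {term: [0/1 per document]}) for a dict of texts."""
    token_sets = [set(text.lower().split()) for text in docs.values()]
    vocab = sorted(set().union(*token_sets))
    matrix = {
        term: [1 if term in tokens else 0 for tokens in token_sets]
        for term in vocab
    }
    return vocab, matrix


# Hypothetical mini-corpus, one short text per document.
docs = {
    "doc_a": "antony praised caesar",
    "doc_b": "brutus killed caesar",
    "doc_c": "mercy for brutus",
}
vocab, matrix = incidence_matrix(docs)
for term in vocab:
    print(f"{term:10s}", matrix[term])
```

Each row is a term and each column a document, so a document is the column vector read top to bottom. Word order is discarded, which is why the representation is called a bag of words.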

Bag of Words Variation: Term-document
count matrices
• Consider the number of occurrences of a term in a document:
– Each document is a count vector in ℕ^|V|: a column below
Antony 157 73 0 0 0 0
Brutus 4 157 0 1 0 0
Caesar 232 227 0 2 1 1
Calpurnia 0 10 0 0 0 0
Cleopatra 57 0 0 0 0 0
mercy 2 0 3 5 5 1
worser 2 0 1 1 1 0
Sec. 6.2

TF-IDF weighting
• The tf-idf weight of a term is the product of its tf weight and its idf weight.
• Best known weighting scheme in information retrieval
• Relevancy increases with the number of occurrences within a document and with the rarity of the term in the collection
w_{t,d} = \log(1 + \mathrm{tf}_{t,d}) \times \log_{10}(N / \mathrm{df}_t)

Sec. 6.2.2
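
As a worked example, the weight can be computed as the product of a log-scaled term frequency and the inverse document frequency. The sketch assumes the variant w = log10(1 + tf) × log10(N / df) with base-10 logarithms for both factors; other tf-idf variants exist, and the numbers below are hypothetical.

```python
import math


def tf_idf(tf, df, n_docs):
    """tf-idf weight, assuming w = log10(1 + tf) * log10(N / df).

    tf: occurrences of the term in the document
    df: number of documents containing the term
    n_docs: total number of documents N in the collection
    """
    if df == 0:
        return 0.0  # term never occurs in the collection
    return math.log10(1 + tf) * math.log10(n_docs / df)


# Hypothetical numbers: a term occurring 9 times in a document and
# present in 100 of 10,000 documents.
w = tf_idf(tf=9, df=100, n_docs=10_000)
print(w)  # 1.0 * 2.0 = 2.0
```

A term that appears in every document gets idf = log10(1) = 0, so however often it occurs it contributes nothing to the weight.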

Binary → count → weight matrix
Antony 5.25 3.18 0 0 0 0.35
Brutus 1.21 6.1 0 1 0 0
Caesar 8.59 2.54 0 1.51 0.25 0
Calpurnia 0 1.54 0 0 0 0
Cleopatra 2.85 0 0 0 0 0
mercy 1.51 0 1.9 0.12 5.25 0.88
worser 1.37 0 0.11 4.15 0.25 1.95
Each document is now represented by a real-valued vector of tf-idf weights ∈ ℝ^|V|
But what happened to our linguistic context?
Sec. 6.3

Word Embeddings for more Context
• Pre-trained model
• Understand context better
• Transfer learning: the model already understands general aspects of language and subsequently only needs to be fine-tuned for a specific NLP task.
• No need for millions or billions of annotated training examples (when using deep learning).

69
Word Embeddings: Document Representation
derived with and used for Deep Learning*
Word2Vec, Doc2Vec, GloVe, FastText, ELMo, BERT, …
Remember: with TF-IDF we create a vector for each document. How can we do something similar for Deep Learning?
Idea behind Word Embeddings:
Use words from a vocabulary as input and embed them as vectors into a lower-dimensional space, forcing the system to create similar encodings for semantically related words and thereby include context.
*) but can also be used for SVM or other non-deep learning models.

Word2Vec
Mikolov, Tomas; et al. (2013). "Efficient Estimation of Word Representations in Vector Space". arXiv:1301.3781


Why is Word2Vec so popular, although it is language dependent?
It revolutionized the use of word embeddings by using continuous bag-of-words and skip-gram models to derive high-quality word embeddings.
Why: an unexpected side effect was compositionality: algebraic operations on word vectors result in a vector that is a semantic composite:
man + royal = king
man – king = woman – queen
…
See Gittens et al., Skip-Gram–Zipf+Uniform=VectorAdditivity, 2017 for theoretical justification of compositionality
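
The algebra on word vectors can be illustrated with hand-made toy vectors. Note the assumptions: the two-dimensional vectors below are invented for this sketch (axis 0 loosely stands for "royalty", axis 1 for "gender"). Real Word2Vec vectors are learned from text and have hundreds of dimensions, so real results are only approximate.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def nearest(target, vectors, exclude=()):
    """Word whose vector is most similar to target, skipping excluded words."""
    candidates = {w: v for w, v in vectors.items() if w not in exclude}
    return max(candidates, key=lambda w: cosine(target, candidates[w]))


# Hand-made toy vectors (hypothetical, not trained embeddings).
vectors = {
    "man": [0.1, 1.0],
    "woman": [0.1, -1.0],
    "king": [1.0, 1.0],
    "queen": [1.0, -1.0],
    "royal": [1.0, 0.0],
}

# king - man + woman should land closest to queen.
target = [
    k - m + w
    for k, m, w in zip(vectors["king"], vectors["man"], vectors["woman"])
]
answer = nearest(target, vectors, exclude={"king", "man", "woman"})
print(answer)
```

Excluding the query words from the candidates follows common practice in analogy tests, since the nearest vector is otherwise often one of the inputs.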

Uni-Directional and Bi-Directional Context
“I accessed the bank account”
A unidirectional contextual model would represent “bank” based on “I accessed the” but not “account.”
A bi-directional contextual model represents “bank” using both its previous and next context: “I accessed the ... account”
Both ELMo and BERT are bi-directional. ELMo is shallowly bi-directional, BERT deeply bi-directional.
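
The difference in visible context can be made concrete with a small sketch. It only shows which words each kind of model can see when representing a target word; it does not implement ELMo or BERT, and the function names are hypothetical.

```python
def left_context(tokens, i):
    """Words a left-to-right (unidirectional) model sees for tokens[i]."""
    return tokens[:i]


def both_contexts(tokens, i):
    """Words a bi-directional model sees: those before and after tokens[i]."""
    return tokens[:i], tokens[i + 1:]


tokens = "I accessed the bank account".split()
i = tokens.index("bank")
print(left_context(tokens, i))   # ['I', 'accessed', 'the']
print(both_contexts(tokens, i))  # (['I', 'accessed', 'the'], ['account'])
```

Only the bi-directional view contains "account", the word that disambiguates "bank" as a financial institution rather than a riverbank.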

BERT & ELMo: bi-directional models
Source: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova.
https://arxiv.org/abs/1810.04805

Differences Word Embeddings
Source: https://www.quora.com/What-are-the-main-differences-between-the-word-embeddings-of-ELMo-BERT-Word2vec-and-GloVe
*) BERT is deeply contextual and can deal with out-of-vocabulary words thanks to its fully connected bi-directional architecture and sub-word representation

76
Next Step: Find relations between entities
Source: SemEval 2014-2016 Task on ABSA (Pontiki et al.)

77
Specialized Legal Text Analytics for Predictive Analytics
Sue or settle?
Which court?
Which lawyer?

78
A Typical LegalTech Pipeline
Visualization
Analysis,
Clustering,
Machine
Learning
Predictions
Feature
Selection
Feature
Extraction
Text

79
What do you need to analyze legal text?
• Citations
• Conditional statements
• Constraints
• Courts
• Dates
• Definitions (“such as …”)
• Durations
• Regulations
• …
Source: LexNLP
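
Several of these elements can be approximated with pattern matching. The sketch below uses only Python's standard `re` module and does not reproduce the LexNLP API; the patterns and the sample clause are hypothetical, and real legal text needs far more robust rules.

```python
import re

# Illustrative patterns only (assumptions, not a complete grammar).
DATE = re.compile(
    r"\b(?:January|February|March|April|May|June|July|August|September|"
    r"October|November|December) \d{1,2}, \d{4}\b"
)
DURATION = re.compile(r"\b\d+ (?:day|week|month|year)s?\b")
CONDITION = re.compile(
    r"\b(?:if|unless|provided that|subject to)\b", re.IGNORECASE
)


def extract(text):
    """Return the dates, durations and conditional markers found in text."""
    return {
        "dates": DATE.findall(text),
        "durations": DURATION.findall(text),
        "conditions": CONDITION.findall(text),
    }


# Made-up clause for demonstration.
clause = (
    "This Agreement takes effect on December 1, 2006 and runs for 3 years, "
    "unless terminated with 30 days notice."
)
found = extract(clause)
print(found)
```

The extracted values can then serve as features in the later pipeline stages (feature selection, machine learning, predictions).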

81
Predicting Court Decisions
• 200 years of high court decisions
• Using the Supreme Court Database (SCDB)
• Around 240 categorical variables with hundreds of categorical values.
• Random forest.

82
Where is LegalTech most popular?
• Legal Research
– Case law
– IP & patents
• eDiscovery
– Legal fact finding
– GDPR
• Contract Law
– Contract review
– Smart contracts
• Legal Market Place
• Digital Courts
– Online dispute resolution

83
• Logic
• Expert systems
• Reasoning
• Search
• Language
• Content extraction
• Text (document) classification
• Analytics for decision support
• Predictive analytics

Will computers replace judges?
Acceptance speech
H.J. van den Herik
Kunnen Computers Rechtspreken? (Can Computers Judge?)
- June 21, 1991
Quote p. 33:
“Yes, computers can judge in specifically assigned areas of the law”
“Technology cannot replace the depth of judicial
knowledge, experience, and expertise in law
enforcement that prosecutors and defendants’ attorneys
possess. Complete evaluation and determination of
whether to hold or release an accused defendant on bail
for any particular defendant accused of any specific
crime requires every bit of these combined skills.”

Thank you!
Time for Q&A
https://www.linkedin.com/in/jscholtes/
https://textmining.nu
