On November 20, 2019, it was my great pleasure to present a special lecture on Artificial Intelligence and its Applications in the Legal Domain. In this lecture I discuss how the development of machines that can learn, reason and act intelligently – Artificial Intelligence (AI) – is advancing rapidly in the legal domain. In some areas, machine intelligence has already surpassed what the brightest human minds can achieve, especially in eDiscovery and the legal review of large data sets.
In others, machines still struggle with seemingly basic tasks. Nonetheless, breakthroughs in AI already have a profound impact on the legal profession. AI is set to improve our world now and will continue to do so in the future. At the same time, there is the fear of losing control.
This lecture was part of a larger series on AI organized by our department of data science and knowledge engineering: https://www.maastrichtuniversity.nl/events/artificial-intelligence.
More information can be found here: https://textmining.nu
Ai and applications in the legal domain studium generale maastricht 20191101
1. Artificial Intelligence
Applications in the Legal Domain
Prof dr ir Jan C. Scholtes
Studium Generale Maastricht
AI Lecture Series
November 20, 2019
https://textmining.nu
2. Prof dr ir Jan C. Scholtes
https://www.linkedin.com/in/jscholtes/
3. In the short term we overestimate the role of technology; in the long term we underestimate it. (Amara’s law)
Source: https://en.wikipedia.org/wiki/Roy_Amara
4. Who are the players in the Legal Technology marketplace?
• Law Firms
• Big Four (but also BDO, GT, …)
• Alternative Legal Service Providers
• Legal Tech Service Providers
• Corporate Legal
5. Who are these Alternative Legal Service Providers?
Source: https://legal.thomsonreuters.com/content/dam/ewp-m/documents/legal/en/pdf/reports/alsp-report-final.pdf
9. Maastricht Law & Tech Lab
• Modelling legal complexity
• Regulation of disruptive technology
• Legal issues of data processing and automated decision-making
11. Best AI for Legal Technology
• Predictive Analytics
• Reasoning
• Decision Support Analytics
• Machine Learning for classification
• Logic
• Expert Systems
• Search
• Content Extraction
• Natural Language Processing
13. Strength, Weakness & Risk of AI
Strength: Memory, Speed, Force, Vision, Sensory
Weakness: Judgement, Knowledge, Dealing with uncertainty and unexpected behavior, Creativity
Risk: Bias, not respecting human values in search of efficiency, …
14. Selection of Legal Technology Applications
• Legal Research: case law; IP & patents; knowledge management
• eDiscovery: document review and analysis; legal fact finding; answering regulatory & public records requests; internal investigations; compliance monitoring & auditing; criminal investigations; GDPR
• Contract Law: contract review; due diligence in M&A and restructuring; smart contracts
• Legal Market Place: best lawyers for your case; best court for litigation; predicting the outcome of court decisions
• Digital Courts: online dispute resolution
15. 1993: First large-scale usage of eDiscovery
• First fully digital court
• Data analytics & litigation support
• System copied to all United Nations-backed War Crime Tribunals and ongoing UN courts
Source: ZyLAB Technologies BV, Amsterdam
17. 2006: US Federal Rules of Civil Procedure
eDiscovery: December 1, 2006
Amended 2018
18. 2002-2012: The law in the age of exabytes
• Sheer volumes
• Continuing exponential growth
• Disastrous effects on a legal system
• Information Inflation
• “Search alone is no longer good enough”
20. Why eDiscovery?
• In the future, all legal fact-finding will be done on electronic data sets (email, phones, tablets, hard disks, USB storage, cloud, social media, MS-SharePoint, file shares, O365…) and less and less (or not at all) from paper files.
• Information requests from 3rd parties will be about the content of such electronic data sets.
• Future legal professionals must be able to deal with large electronic data sets.
– Take decisions based on facts and not based on guesses and assumptions!
– Answer information requests in a timely, accurate and complete manner.
– Avoid high cost, reputation damage, regulatory measures, business disruption and stress!
21. Selection of Legal Technology Applications
• Legal Research: case law; IP & patents; knowledge management
• eDiscovery: document review and analysis; legal fact finding; answering regulatory & public records requests; internal investigations; compliance monitoring & auditing; criminal investigations; GDPR
• Contract Law: contract review; due diligence in M&A and restructuring; smart contracts
• Legal Market Place: best lawyers for your case; best court for litigation; predicting the outcome of court decisions
• Digital Courts: online dispute resolution
27. Which AI tools are relevant for Legal Technology?
• Logic
• Expert systems
• Reasoning
• Search
• Language
• Content extraction
• Text (document) classification
• Analytics for decision support
• Predictive analytics
29. Logic and Legal Reasoning
• Law of Detachment: if p, then q. We find out that
p is true, therefore, q is true.
• Law of Syllogism: if p, then q. If q, then r. We
find out that p is true, therefore, r is true.
Example: the US First Amendment protects certain
kinds of expression from being banned. Nude dancing
is a form of expression protected by the First
Amendment. The government cannot ban people
from dancing without clothing…
https://www.mtsu.edu/first-amendment/article/27/nude-dancing
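The two laws above can be sketched as a small forward-chaining loop. This is only an illustration: the fact names and the encoding of the First Amendment example as two "if p then q" rules are assumptions made for the sketch, not a model of actual legal doctrine.

```python
def forward_chain(facts, rules):
    """Apply 'if premise then conclusion' rules until nothing new follows.

    One pass over a single rule is the Law of Detachment; chaining two
    rules gives the Law of Syllogism.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            if premise in known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical encoding of the slide's example as two chained rules.
rules = [
    ("is_nude_dancing", "is_protected_expression"),   # if p, then q
    ("is_protected_expression", "cannot_be_banned"),  # if q, then r
]

derived = forward_chain({"is_nude_dancing"}, rules)
print(sorted(derived))
```

Starting from the single fact p, the loop derives q and then r, which is the syllogism from the slide.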
30. Expert Systems – Symbolic AI
Rule sets, decision trees
Bayesian Networks
A method for explaining Bayesian networks for legal evidence with scenarios.
Vlek, C.S., Prakken, H., Renooij, S. et al. Artif Intell Law (2016) 24: 285.
https://doi.org/10.1007/s10506-016-9183-4
34. Legal Search is a Major Challenge
• Full-text search has been around since the early 1960s.
• With Google we feel we can find anything immediately, but ‘popularity-driven’ search is not suited for legal applications.
• Most IT-provided search does not work for lawyers.
37. Indexing pipeline (figure)
Documents to be indexed: Friends, Romans, countrymen.
→ Tokenizer → token stream: Friends Romans Countrymen
→ Linguistic modules → modified tokens: friend roman countryman
→ Indexer → inverted index:
friend → 2, 4
roman → 1, 2
countryman → 13, 16
Source: https://nlp.stanford.edu/IR-book/information-retrieval-book.html
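The tokenizer, linguistic-module and indexer steps of this slide can be sketched in a few lines. The `NORMALIZE` lookup table is a hypothetical stand-in for a real stemmer or lemmatizer, and the two example documents are made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical stand-in for the "linguistic modules" step: a tiny lookup
# table instead of a real stemmer or lemmatizer.
NORMALIZE = {"friends": "friend", "romans": "roman", "countrymen": "countryman"}


def tokenize(text):
    """Split text into alphabetic tokens, dropping punctuation."""
    return re.findall(r"[A-Za-z]+", text)


def normalize(token):
    """Lowercase a token and map it to its base form when one is known."""
    lowered = token.lower()
    return NORMALIZE.get(lowered, lowered)


def build_inverted_index(docs):
    """Map each normalized term to the sorted list of document IDs containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            postings[normalize(token)].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


docs = {
    1: "Romans, lend me your ears.",
    2: "Friends, Romans, countrymen.",
}
index = build_inverted_index(docs)
print(index["roman"])
```

A query for a term then becomes a dictionary lookup that returns the postings list, rather than a scan over every document.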
38. Measuring quality: How good is my search?
Task: Get rid of mice in a politically correct way
(Misconception?)
Info need: Info about removing mice without killing them
(Mistranslation?)
Verbal form: How do I trap mice alive?
(Misformulation?)
Query: mouse trap
→ Search engine (over the corpus) → Results → Query refinement
Source: https://nlp.stanford.edu/IR-book/information-retrieval-book.html
39. Precision and recall
• Lack of precision leads to noise: too many false hits and too much work to review, which yields a high cost of review.
• Lack of recall leads to missing relevant documents, which yields risk.
41. F1 VALUE: COMBINATION OF PRECISION & RECALL
The most commonly used measure to describe the quality of a system
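As a small worked example, precision, recall and F1 (the harmonic mean of the two) can be computed from a set of retrieved documents and a set of truly relevant ones. The document IDs below are made up for illustration.

```python
def precision_recall_f1(retrieved, relevant):
    """Return (precision, recall, F1) for a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical review result: 4 documents returned, 3 documents truly relevant.
retrieved = {"d1", "d2", "d3", "d4"}
relevant = {"d1", "d2", "d5"}
p, r, f1 = precision_recall_f1(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Here two of the four retrieved documents are relevant (precision 1/2) and two of the three relevant documents were found (recall 2/3), so F1 is 4/7.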
42. Human Performance
• When both precision and recall are over 80%, human performance is approached.
• This applies to the best humans.
• It can be argued that values over 80% are often subject to different interpretations and discussions.
44. Now imagine 1.2 million dimensional …
2-dimensional 3-dimensional
45. Teach the computer what to look for …
• 3x more relevant documents than Boolean search
• No complex queries, just review documents
• Only 2x the total number of relevant documents needs to be reviewed
• Accurately estimate the percentage of all relevant documents found at the end
Source: ZyLAB Technologies BV, Amsterdam
47. Which AI tools are relevant for Legal Technology?
• Logic
• Expert systems
• Reasoning
• Search
• Language
• Content extraction
• Text (document) classification
• Analytics for decision support
• Predictive analytics
48. How about Natural Language Processing (NLP)?
POS Tagging
(part of speech)
Dependency
Grammars
Source: https://www.nltk.org/ - Stanford University
53. Long Short-Term Memory (LSTM) networks are better at capturing long-term relations as seen in NLP
• Can deal with input of variable sizes.
• Better at learning the meaning of the same word in different locations (which is hard for CNNs), e.g. “drink a lot of beers” versus “like to drink a lot”
• Better at dealing with long-term dependencies
57. 57
LANGUAGE English
CITY New Brunswick, WASHINGTON
COMPANY J&J, Johnson & Johnson
COUNTRY Greece, Poland, Romania, United Kingdom
CURRENCY .02 USD, 21400000 USD, 48600000 USD, 59.47 USD, 70000000 USD
DATE 04-08
DAY Fri, Friday
NOUN_GROUP
biotech drugs, bribery case, denying guilt, final growth frontier, foreign countries, giving gifts, holding corporations, intense revenue pressure, meaningfu
credit, medical device kickbacks, medical devices, multiple businesses, next several days, non-U.S. markets, only way, orthopedic hips, other countries,
over-the-counter medicines, paid kickbacks, past year, paying kickbacks, same time, several new positions, similar violations, travel gifts
ORGANIZATION Department of Justice, Justice Department, SEC, Securities and Exchange Commission, University of Michigan
PEOPLES Iraqi
PERSON Erik Gordon, Mythili Raman, William Weldon
PLACE_REGION Europe
PRODUCT Benadryl, Tylenol
PROP_MISC Band-Aids, Food Program, Foreign Corrupt Practices Act, United Nations Oil
STATE N.J.
TIME 1:32 pm ET
TIME_PERIOD 13 years, five years, six months, three years
YEAR 2007
PROBLEM
"We went to the government to report improper payments and have taken full responsibility for these actions," said William Weldon, Chairman and CEO
of J&J., Last month federal health regulators took legal control of the plant where millions of bottles of defective medication were produced., The charges
against J&J were brought under the Foreign Corrupt Practices Act, which bars publicly traded companies from bribing officials in other countries to get or
retain business., The company will pay $21.4 million in criminal penalties for improper payments and return $48.6 million in illegal profits, according to
the government., The SEC says J&J agents used fake contracts and sham companies to deliver the bribes.
SENTIMENT giving meaningful credit to companies that self-report, We are committed to holding corporations accountable for bribing foreign officials, what is honest
REQUEST make sure it complies with anti-bribery laws across its businesses
Source: ZyLAB Technologies BV, Amsterdam
58. Text Mining the Lord of the Rings
• Automatic
identification of key
players
(custodians)
• Automatic
identification of
locations.
• Automatic
identification of
travel patterns of
key players.
• Visualize in time.
62. How does that work?
Search Pattern Recognition Text-Mining
63. HOW TO REPRESENT TEXT FOR MACHINE LEARNING AND EXTRACTION OF COMPLEX PATTERNS?
Legal data is primarily text-based.
64. Bag of Words (BoW)
Term | Antony and Cleopatra | Julius Caesar | The Tempest | Hamlet | Othello | Macbeth
Antony | 1 | 1 | 0 | 0 | 0 | 1
Brutus | 1 | 1 | 0 | 1 | 0 | 0
Caesar | 1 | 1 | 0 | 1 | 1 | 1
Calpurnia | 0 | 1 | 0 | 0 | 0 | 0
Cleopatra | 1 | 0 | 0 | 0 | 0 | 0
mercy | 1 | 0 | 1 | 1 | 1 | 1
worser | 1 | 0 | 1 | 1 | 1 | 0
(1 if play contains word, 0 otherwise)
Sec. 1.1
Source: https://nlp.stanford.edu/IR-book/information-retrieval-book.html
65. Bag of Words Variation: Term-document count matrices
• Consider the number of occurrences of a term in a document:
– Each document is a count vector in ℕ^|V|: a column below
Term | Antony and Cleopatra | Julius Caesar | The Tempest | Hamlet | Othello | Macbeth
Antony | 157 | 73 | 0 | 0 | 0 | 0
Brutus | 4 | 157 | 0 | 1 | 0 | 0
Caesar | 232 | 227 | 0 | 2 | 1 | 1
Calpurnia | 0 | 10 | 0 | 0 | 0 | 0
Cleopatra | 57 | 0 | 0 | 0 | 0 | 0
mercy | 2 | 0 | 3 | 5 | 5 | 1
worser | 2 | 0 | 1 | 1 | 1 | 0
Sec. 6.2
Source: https://nlp.stanford.edu/IR-book/information-retrieval-book.html
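A minimal sketch of how such term-document matrices can be built: each document becomes a count vector over a shared vocabulary, and the binary bag-of-words variant follows by thresholding the counts. The two toy documents and the whitespace tokenization are assumptions made for brevity.

```python
from collections import Counter


def term_document_counts(docs):
    """Return (vocabulary, {doc_name: count vector over that vocabulary})."""
    counts = {name: Counter(text.lower().split()) for name, text in docs.items()}
    vocabulary = sorted(set().union(*counts.values()))
    vectors = {name: [c[term] for term in vocabulary] for name, c in counts.items()}
    return vocabulary, vectors


def to_binary(vector):
    """Binary bag of words: 1 if the term occurs in the document, 0 otherwise."""
    return [1 if count > 0 else 0 for count in vector]


# Toy documents, made up for illustration.
docs = {
    "doc_a": "caesar praised brutus and caesar",
    "doc_b": "brutus showed no mercy",
}
vocabulary, vectors = term_document_counts(docs)
print(vocabulary)
print(vectors["doc_a"], to_binary(vectors["doc_a"]))
```

Word order is discarded in both variants; only which terms occur, and how often, survives in the vector.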
66. TF-IDF weighting
• The tf-idf weight of a term is the product of its
tf weight and its idf weight.
• Best known weighting scheme in information
retrieval
• Relevancy increases with the number of
occurrences within a document and with the
rarity of the term in the collection
$w_{t,d} = \log(1 + \mathrm{tf}_{t,d}) \times \log_{10}(N / \mathrm{df}_t)$
Sec. 6.2.2
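A minimal sketch of this weighting, reading the slide's formula as w = log(1 + tf) × log10(N / df), where tf is the term's count in the document, df the number of documents containing the term, and N the collection size. Using base 10 for the tf logarithm as well is an assumption; the function name is hypothetical.

```python
import math


def tf_idf_weight(tf, df, n_docs):
    """tf-idf weight of one term in one document.

    tf: occurrences of the term in the document
    df: number of documents that contain the term
    n_docs: total number of documents in the collection
    """
    if tf == 0 or df == 0:
        return 0.0
    # Assumption: base-10 logarithm for both factors.
    return math.log10(1 + tf) * math.log10(n_docs / df)


# A term occurring 9 times in a document and found in 10 of 1000 documents.
print(tf_idf_weight(tf=9, df=10, n_docs=1000))
```

A term that appears in every document gets weight 0 however often it occurs, because log10(N / df) is then log10(1) = 0.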
67. Binary → count → weight matrix
Term | Antony and Cleopatra | Julius Caesar | The Tempest | Hamlet | Othello | Macbeth
Antony | 5.25 | 3.18 | 0 | 0 | 0 | 0.35
Brutus | 1.21 | 6.1 | 0 | 1 | 0 | 0
Caesar | 8.59 | 2.54 | 0 | 1.51 | 0.25 | 0
Calpurnia | 0 | 1.54 | 0 | 0 | 0 | 0
Cleopatra | 2.85 | 0 | 0 | 0 | 0 | 0
mercy | 1.51 | 0 | 1.9 | 0.12 | 5.25 | 0.88
worser | 1.37 | 0 | 0.11 | 4.15 | 0.25 | 1.95
Each document is now represented by a real-valued vector of tf-idf weights ∈ ℝ^|V|
But what happened to our linguistic context?
Sec. 6.3
Source: https://nlp.stanford.edu/IR-book/information-retrieval-book.html
68. Faculty Humanities and Sciences
Word Embeddings for more Context
• Pre-trained model
• Understand context better
• Transfer learning: the model already understands general aspects of language, so it subsequently only needs to be fine-tuned for a specific NLP task.
• No need for millions or billions of annotated training examples (when using deep learning).
69. Word Embeddings: Document Representation derived with and used for Deep Learning*
Word2Vec, Doc2Vec, GloVe, FastText, ELMo, BERT, …
Remember: with TF-IDF we create a vector for each document. How can we do something similar for Deep Learning?
Idea behind Word Embeddings:
Use words from a vocabulary as input and embed them as vectors in a lower-dimensional space, forcing the system to create similar encodings for semantically related words and thereby to include context.
*) but can also be used for SVM or other non-deep learning models.
70. Word2Vec
Mikolov, Tomas; et al. (2013). "Efficient Estimation of Word Representations in Vector Space". arXiv:1301.3781
71. Mikolov, Tomas; et al. (2013). "Efficient Estimation of Word Representations in Vector Space". arXiv:1301.3781
72. Why is Word2Vec so popular, although it is language dependent?
It revolutionized the use of word embeddings by using continuous bag-of-words and skip-gram models to derive high-quality word embeddings.
Why: an unexpected side effect was compositionality: algebraic operations on word vectors result in a vector that is a semantic composite:
man + royal = king
man – king = woman – queen
…
See Gittens et al., “Skip-Gram – Zipf + Uniform = Vector Additivity”, 2017, for a theoretical justification of compositionality
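The kind of vector arithmetic behind "man + royal = king" can be illustrated with hand-made toy vectors. These three-dimensional vectors are invented for the sketch (real Word2Vec embeddings are learned from text and have hundreds of dimensions), so the example shows only the mechanics of adding vectors and looking up the nearest word by cosine similarity.

```python
import math

# Hand-made toy vectors with dimensions (male, female, royal).
# These are NOT trained embeddings; they are invented for illustration.
TOY_VECTORS = {
    "man": [1.0, 0.0, 0.0],
    "woman": [0.0, 1.0, 0.0],
    "royal": [0.0, 0.0, 1.0],
    "king": [1.0, 0.0, 1.0],
    "queen": [0.0, 1.0, 1.0],
}


def add(u, v):
    return [a + b for a, b in zip(u, v)]


def sub(u, v):
    return [a - b for a, b in zip(u, v)]


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest(vector, exclude=()):
    """Return the vocabulary word whose vector is most similar to `vector`."""
    candidates = [w for w in TOY_VECTORS if w not in exclude]
    return max(candidates, key=lambda w: cosine(vector, TOY_VECTORS[w]))


composite = add(TOY_VECTORS["man"], TOY_VECTORS["royal"])
analogy = add(sub(TOY_VECTORS["king"], TOY_VECTORS["man"]), TOY_VECTORS["woman"])
print(nearest(composite, exclude=("man", "royal")))
print(nearest(analogy, exclude=("king", "man", "woman")))
```

The input words are excluded from the lookup, as is common practice for analogy queries, so that the answer is not simply one of the operands.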
73. Uni-Directional and Bi-Directional Context
“I accessed the bank account”
A unidirectional contextual model would represent “bank” based on “I accessed the” but not “account.”
A bi-directional contextual model represents “bank” using both its previous and next context: “I accessed the ... account”
Both ELMo and BERT are bi-directional. ELMo is shallowly bi-directional, BERT deeply bi-directional.
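The difference in available context can be made concrete with the slide's own sentence. This sketch only shows which tokens each kind of model gets to look at when representing “bank”; it does not model how ELMo or BERT actually compute their representations, and the function names are hypothetical.

```python
def left_context(tokens, position):
    """Tokens visible to a left-to-right (unidirectional) model."""
    return tokens[:position]


def both_contexts(tokens, position):
    """Tokens visible to a bi-directional model: everything before and after."""
    return tokens[:position], tokens[position + 1:]


tokens = "I accessed the bank account".split()
position = tokens.index("bank")

print(left_context(tokens, position))
print(both_contexts(tokens, position))
```

Only the bi-directional view contains “account”, the word that disambiguates “bank” as a financial institution.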
74. BERT & ELMo: bi-directional models
Source: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova.
https://arxiv.org/abs/1810.04805
75. Differences Word Embeddings
Source: https://www.quora.com/What-are-the-main-differences-between-the-word-embeddings-of-ELMo-BERT-Word2vec-and-GloVe
*) BERT is deeply contextual and can deal with out-of-vocabulary words thanks to its fully connected bi-directional architecture and sub-word representation
76. Next Step: Find relations between entities
Source: SemEval 2014-2016 Task on ABSA (Pontiki et al.)
77. Specialized Legal Text Analytics for Predictive Analytics
Sue or settle?
Which court?
Which lawyer?
78. A Typical LegalTech Pipeline
Text → Feature Extraction → Feature Selection → Analysis, Clustering, Machine Learning, Predictions → Visualization
79. What do you need to analyze legal text?
• Citations
• Conditional statements
• Constraints
• Courts
• Dates
• Definitions (“such as …”)
• Durations
• Regulations
• …
Source: LexNLP
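A few of the items above (dates, durations, conditional statements) can be pulled out of a clause with plain regular expressions. This is a simplified sketch and not the LexNLP API: the patterns, the function name `extract_features` and the sample clause are all assumptions, and real legal text needs far more robust handling.

```python
import re

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)
# Hypothetical, deliberately narrow patterns for illustration only.
DATE_PATTERN = re.compile(r"\b(?:" + MONTHS + r") \d{1,2}, \d{4}\b")
DURATION_PATTERN = re.compile(r"\b\d+ (?:day|week|month|year)s?\b")
CONDITION_PATTERN = re.compile(r"\b(?:if|unless|provided that)\b", re.IGNORECASE)


def extract_features(text):
    """Return the dates, durations and conditional markers found in `text`."""
    return {
        "dates": DATE_PATTERN.findall(text),
        "durations": DURATION_PATTERN.findall(text),
        "conditionals": [m.group(0).lower() for m in CONDITION_PATTERN.finditer(text)],
    }


# Made-up sample clause.
clause = (
    "This agreement takes effect on December 1, 2006 and runs for 3 years, "
    "unless terminated earlier with 30 days notice."
)
features = extract_features(clause)
print(features)
```

Structured fields like these are what a downstream feature-selection or classification step would consume instead of raw text.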
81. Predicting Court Decisions
• 200 years of high court decisions
• Using SCDB (the Supreme Court Database)
• Around 240 categorical variables with 100s of categorical values.
• Random forest.
82. Where is LegalTech most popular?
• Legal Research: case law; IP & patents; knowledge management
• eDiscovery: document review and analysis; legal fact finding; answering regulatory & public records requests; internal investigations; compliance monitoring & auditing; criminal investigations; GDPR
• Contract Law: contract review; due diligence in M&A and restructuring; smart contracts
• Legal Market Place: best lawyers for your case; best court for litigation; predicting the outcome of court decisions
• Digital Courts: online dispute resolution
83. Which AI tools are relevant for Legal Technology?
• Logic
• Expert systems
• Reasoning
• Search
• Language
• Content extraction
• Text (document) classification
• Analytics for decision support
• Predictive analytics
84. Will computers replace judges?
Acceptance speech, H.J. van den Herik: Kunnen Computers Rechtspreken? (“Can Computers Judge?”), June 21, 1991
Quote p. 33:
“Yes, computers can judge in specifically assigned areas of the law”
“Technology cannot replace the depth of judicial
knowledge, experience, and expertise in law
enforcement that prosecutors and defendants’ attorneys
possess. Complete evaluation and determination of
whether to hold or release an accused defendant on bail
for any particular defendant accused of any specific
crime requires every bit of these combined skills.”
85. Thank you!
Time for Q&A
Prof dr ir Jan C. Scholtes
https://www.linkedin.com/in/jscholtes/
https://textmining.nu