On May 19, 2016, I hosted a workshop at Stanford's CodeX Center on ways to make legal data more open and accessible for computation. These are the slides from my presentation framing the issue.
Open Legal Data Workshop at Stanford
1. Open Legal Data Workshop
Stanford University CodeX Center
Harry Surden
Professor of Law, University of Colorado
Affiliated Faculty: Stanford CodeX Center
2. Overview
• Computation has revolutionized many fields
• Law is not one of them
• Data is required for Computational Analysis
• Legal data: neither accessible nor high-quality
3. Computational Legal Analysis
• Computational Law (Rules-based, deductive)
– Rules-based systems computing legal outcomes
– Represent laws in computer-understandable form
– Examples: TurboTax, computable contracts
• Machine Learning & Law (often Statistical)
– Algorithms that learn patterns from data
– Widely used: self-driving cars, translation, etc.
– Example: Supreme Court prediction project
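
To make the rules-based side concrete, here is a minimal sketch of a "computable contract" term. The clause, the `PaymentTerm` class, and the `amount_owed` function are hypothetical names invented for illustration; they do not come from any real contract or library. The point is only that once a rule is represented as data, a program can compute the legal outcome deductively.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PaymentTerm:
    """Hypothetical clause: pay `amount` by `due`, otherwise a flat late fee applies."""
    amount: float
    due: date
    late_fee: float


def amount_owed(term: PaymentTerm, paid_on: date) -> float:
    """Apply the clause: the late fee is added only when payment comes after the due date."""
    if paid_on > term.due:
        return term.amount + term.late_fee
    return term.amount


# Illustrative values only.
term = PaymentTerm(amount=1000.0, due=date(2016, 5, 19), late_fee=50.0)
on_time = amount_owed(term, date(2016, 5, 19))
late = amount_owed(term, date(2016, 5, 20))
```

A machine learning approach would differ in kind: instead of encoding the rule by hand, it would estimate patterns statistically from many past outcomes.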
4. Problem
• Computation has revolutionized:
– Finance, medicine, engineering, science, etc.
– Machine learning and computation used for
• Prediction, automation, outlier detection, analysis
• New drug discovery, etc.
• But computation has barely touched law
– Why?
5. To do computation we need data to analyze
• Think of Law as data to be analyzed
– Federal statutes and administrative rules
– State and local laws and codes
– Judicial orders and opinions
– Lawsuit motions and evidence, etc.
Quality legal data not widely available for analysis
6. The Legal Data Bottleneck
• Legal data exists, but it is not
– Openly accessible (behind pay-walls)
– Structured in a way that makes analysis feasible
• Lack of widely accessible legal data
– Bottleneck to really interesting work in
• Machine learning and Law
• Computational law
7. For really interesting computational work in law we need
• High-quality legal data that is
– Open and Accessible (little or no cost)
– Structured (machine readable)
– Standardized (common encoding formats)
– Coded (human-tagged and organized)
– Semantic (embedded with meaning)
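
As a rough sketch of what these properties might look like in practice, the record below represents a judicial opinion as typed fields (structured, semantic), carries human-assigned topic tags (coded), and serializes to JSON (a common encoding format). The `OpinionRecord` schema and all of its field names and values are assumptions made up for illustration, not an existing legal data standard.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class OpinionRecord:
    """Hypothetical schema for one judicial opinion; not an existing standard."""
    court: str
    decided: str                                      # ISO 8601 date string
    outcome: str                                      # e.g. "affirmed" or "reversed"
    topics: List[str] = field(default_factory=list)   # human-assigned tags
    cites: List[str] = field(default_factory=list)    # identifiers of cited opinions


# Placeholder values only.
record = OpinionRecord(
    court="Example District Court",
    decided="2016-05-19",
    outcome="affirmed",
    topics=["contracts"],
    cites=["example-case-001"],
)

# Serialize to JSON and read it back; any JSON-aware tool could consume `encoded`.
encoded = json.dumps(asdict(record), sort_keys=True)
decoded = OpinionRecord(**json.loads(encoded))
```

Because every record shares the same fields, a program can filter, count, or join opinions without first parsing free-form text.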
10. Possibilities
• With high quality, structured legal data:
– Predictions of federal and state court decisions
– Finding patterns or biases in legal data
– More computational law systems
– Advanced legal data visualizations
– Discovery of unknown connections or structures
– Outlier detection
– ... and many more
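
As one small illustration of the outlier detection item above, the sketch below flags unusual values with a simple z-score test. The numbers are invented for illustration (imagine days from filing to disposition for a handful of cases), and both the `find_outliers` helper and the threshold of 2.0 are assumptions, not a recommended method.

```python
from statistics import mean, pstdev
from typing import List


def find_outliers(values: List[float], threshold: float = 2.0) -> List[float]:
    """Return values whose distance from the mean exceeds `threshold` standard deviations."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, so nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Invented example data, not real court statistics.
days_to_disposition = [30, 32, 29, 31, 30, 33, 120]
flagged = find_outliers(days_to_disposition)
```

Even an analysis this simple is only possible when the underlying values are available in structured form rather than locked inside documents.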
11. Open Legal Data
• Legal data for computation that is:
– Open and Accessible (little or no cost)
– Structured (machine readable)
– Standardized (common encoding formats)
– Coded (human-tagged and organized)
– Semantic (embedded with meaning)