Open Legal Data Workshop at Stanford

On May 19, 2016, I hosted a workshop at Stanford's CodeX Center on ways to make legal data more open and accessible for computation. These are the slides from my presentation framing the issue.

  1. Open Legal Data Workshop
     Stanford University CodeX Center
     Harry Surden, Professor of Law, University of Colorado
     Affiliated Faculty: Stanford CodeX Center
  2. Overview
     • Computation has Revolutionized Many Fields
     • Law is not one of them
     • Data is required for Computational Analysis
     • Legal data: neither accessible nor high-quality
  3. Computational Legal Analysis
     • Computational Law (Rules-based, deductive)
       – Rules-based systems computing legal outcomes
       – Represent laws in computer-understandable form
       – Example: TurboTax; Computable contracts
     • Machine Learning & Law (often Statistical)
       – Algorithms that learn patterns from data
       – Widely Used: self-driving cars, translation, etc.
       – Example: Supreme Court prediction project
  4. Problem
     • Computation has revolutionized:
       – Finance, medicine, engineering, science, etc.
       – Machine learning and computation used for
         • Prediction, automation, outlier detection, analysis,
         • New drug discovery, etc.
     • But computation has barely touched law
       – Why?
  5. To do computation, we need data to analyze
     • Think of Law as data to be analyzed
       – Federal statutes and administrative rules
       – State and local laws and codes
       – Judicial orders and opinions
       – Lawsuit motions and evidence, etc.
     Quality legal data is not widely available for analysis
  6. The Legal Data Bottleneck
     • Legal data exists, but it is not
       – Openly accessible (behind pay-walls)
       – Structured in a way that makes analysis feasible
     • Lack of widely accessible legal data
       – Bottleneck to really interesting work in
         • Machine learning and Law
         • Computational law
  7. For really interesting computational work in law we need
     • High-quality legal data that is
       – Open and Accessible (little or no cost)
       – Structured (machine readable)
       – Standardized (common encoding formats)
       – Coded (human-tagged and organized)
       – Semantic (embedded with meaning)
  8. Possibilities (harrysurden.com)
  9. Possibilities (lexpredict.com)
  10. Possibilities
      • With high quality, structured legal data:
        – Predictions of federal and state court decisions
        – Finding patterns or biases in legal data
        – More computational law systems
        – Advanced legal data visualizations
        – Discovery of unknown connections or structures
        – Outlier detection
        – ... many more
  11. Open Legal Data
      • Legal data for computation that is:
        – Open and Accessible (little or no cost)
        – Structured (machine readable)
        – Standardized (common encoding formats)
        – Coded (human-tagged and organized)
        – Semantic (embedded with meaning)
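
To make the properties on the last slide a little more concrete, here is a minimal sketch in Python of what a structured, standardized, coded, and semantic record for a judicial opinion might look like. It is not taken from the slides: the `Opinion` class, its field names, and the sample records are all hypothetical and invented purely for illustration. The point is only that once opinions are available in a form like this, even simple analysis becomes a few lines of code.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Dict, List


@dataclass
class Opinion:
    """Hypothetical record for one judicial opinion (illustrative field names)."""
    opinion_id: str            # structured: a stable, machine-readable identifier
    court: str
    decided: str               # standardized: ISO 8601 date string
    outcome: str               # coded: a human-assigned label
    topics: List[str] = field(default_factory=list)  # coded: human-tagged subjects
    cites: List[str] = field(default_factory=list)   # semantic: links to other opinions


# Made-up sample data; these are not real cases.
records = [
    Opinion("op-001", "Example Court A", "2015-03-02", "affirmed", ["contracts"]),
    Opinion("op-002", "Example Court A", "2015-06-17", "reversed", ["patents"], ["op-001"]),
    Opinion("op-003", "Example Court B", "2016-01-11", "affirmed", ["contracts"], ["op-001"]),
]


def outcome_counts(opinions: List[Opinion]) -> Dict[str, int]:
    """Count how many opinions carry each coded outcome label."""
    counts: Dict[str, int] = {}
    for opinion in opinions:
        counts[opinion.outcome] = counts.get(opinion.outcome, 0) + 1
    return counts


# Serialize to JSON, one common encoding format that other tools can read.
serialized = json.dumps([asdict(r) for r in records], indent=2)

print(outcome_counts(records))
```

With real data, the same shape would support the possibilities listed earlier, such as looking for patterns in outcomes by court or following citation links between opinions.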