Data Science for Social Good:
#LegalNLP #AlgorithmicBias
https://www.linkedin.com/in/ponguru/
March 23 - 24, 2023
IIIT Una
Ponnurangam Kumaraguru (“PK”)
#ProfGiri CS IIIT Hyderabad
ACM Distinguished Member
TEDx Speaker
https://www.instagram.com/pk.profgiri/
What is Social Computing?
https://en.wikipedia.org/wiki/Social_computing
Legal AI for Indian Context
District courts are usually the first point of contact between the people and the judiciary.
Lower courts in India are burdened with a backlog of cases (~40 million as of 2021).
Documents filed in district courts in India are written in local languages.
Fig: Court hierarchy (Supreme Court, High Courts, District Courts)
Legal AI / NLP - Data
We collected ~900k district court case documents from Uttar Pradesh
All documents are in Hindi, written in Devanagari
Legal corpora exist for the European Court of Justice and Chinese courts, but none for Indian district courts
Legal AI / NLP - Data
There are around 300 different case types; the table shows the prominent ones
The majority of the case documents correspond to bail applications
Fig: Variation in number of case documents per district
Table: Case types in HLDC
Legal AI / NLP - Bail Documents
Fig: District-wise ratio of number of bail applications to total cases
Legal AI / NLP - Bail Prediction Model
In general, performance is lower in district-wise settings, possibly due to large variation across districts
Overall, summarization models perform better than Doc2Vec and simpler Transformer-based models
Legal AI / NLP for Indian Context
HLDC: Hindi Legal Documents Corpus
Legal AI / NLP for Indian Context - Takeaways
Indian legal documents are a rich source of domain-specific Indic-language corpora, readily available online.
Multiple tasks still need attention, especially for Indian settings
Legal Summarization
Case recommendations
Citation predictions / network
Sleeping beauty
Bias
Are Models Trained on Indian Legal Data Fair?
An initial investigation of fairness from the Indian perspective in the legal domain
Overview
We highlight the propagation of learnt algorithmic biases in the bail prediction task for models trained on Hindi legal documents.
Objective
Recent LegalNLP research targets judgement prediction and summarization
Deploying such models without evaluating bias can lead to unwarranted outcomes
Learned biases can be perpetuated into unfair decision-making
An evaluation and investigation of encoded biases helps to
Understand historical social disparities
Mitigate any potential harms in the future
Motivation
Sample 10,000 cases from HLDC
36% bail granted, 63% bail denied
Data Preparation
Fig: HLDC Snippet
Use two features
facts-and-arguments
decision
Basic Pre-processing – stop words removal, cleaning using regex
Each case represented by 7 features
5 – keywords of the case
2 – category of crime of the case
Data Preparation
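The basic pre-processing step (regex cleaning, then stop-word removal) could look roughly like the sketch below. The stop-word list, the regex and the function name `preprocess` are illustrative assumptions; the slides do not say which stop words or patterns were actually used.

```python
import re

# Hypothetical, tiny Hindi stop-word list for illustration only;
# the slides do not specify the list that was used.
STOP_WORDS = {"और", "का", "की", "के", "है", "में", "से", "को", "पर"}

# Anything that is not a Devanagari character (U+0900 to U+097F) or whitespace.
NON_DEVANAGARI = re.compile(r"[^\u0900-\u097F\s]")


def preprocess(text):
    """Clean a document with a regex, then drop stop words."""
    cleaned = NON_DEVANAGARI.sub(" ", text)
    # The danda marks sit inside the Devanagari block, so strip them explicitly.
    cleaned = cleaned.replace("।", " ").replace("॥", " ")
    return [tok for tok in cleaned.split() if tok not in STOP_WORDS]


tokens = preprocess("अभियुक्त की जमानत (Bail) 2021 में खारिज है।")
print(tokens)
```

Latin characters, ASCII digits and punctuation are dropped by the regex, and the remaining tokens are filtered against the stop-word set.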
Represent a case using keywords – LDA (Topic Modelling)
All cases assigned (top) two topics
10 keywords representing each topic
3 keywords for dominant topic
2 keywords for second-dominant topic
Data Preparation
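The keyword selection described above (top two topics per case, 3 keywords from the dominant topic and 2 from the second) can be sketched as follows. This assumes the topic model has already produced per-case topic weights and a ranked list of 10 keywords per topic; taking the highest-ranked keywords is an assumption, since the slides do not say how the 3 and 2 are chosen. The name `keyword_features` and the placeholder keywords are hypothetical.

```python
def keyword_features(topic_weights, topic_keywords):
    """Return 5 keywords: 3 from the dominant topic, 2 from the second.

    topic_weights:  dict mapping topic id -> weight of that topic in the case
    topic_keywords: dict mapping topic id -> list of 10 keywords, best first
    """
    # Rank topic ids by weight, highest first, and keep the top two.
    ranked = sorted(topic_weights, key=topic_weights.get, reverse=True)
    dominant, second = ranked[0], ranked[1]
    return topic_keywords[dominant][:3] + topic_keywords[second][:2]


# Placeholder data: 3 topics with 10 keywords each (assumed, not from HLDC).
keywords = {t: [f"topic{t}_kw{i}" for i in range(10)] for t in range(3)}
weights = {0: 0.1, 1: 0.6, 2: 0.3}
print(keyword_features(weights, keywords))
```

Together with the 2 crime-category features, these 5 keywords would give the 7 features per case mentioned earlier.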
Identify a subset of cases from the dataset using the theme
Sample cases having either a Hindu or a Muslim proper noun
Train a decision tree classifier
Model Training
For every case, we identify the
True Label
Model’s Predicted Label
Number of times the model’s prediction changes when the proper noun is replaced with another Hindu proper noun
Number of times the model’s prediction changes when the proper noun is replaced with another Muslim proper noun
Model Training
If the model changes its predictions from 0 (bail dismissed) to 1 (bail granted) more often when Muslim nouns are replaced by Hindu nouns than when Hindu nouns are replaced by Muslim nouns, then there exists a bias against Muslims
This bias may be due to inherent characteristics of the dataset
Model Training
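The name-swap test above can be sketched in a few lines. This is a minimal illustration, not the authors' code: `count_flips`, the toy `predict` function and the placeholder names (`A_name1`, `B_name1`, ...) are all hypothetical. The label convention (0 = bail dismissed, 1 = bail granted) follows the slide.

```python
def count_flips(predict, case_tokens, original_name, replacement_names):
    """Count prediction changes when one proper noun is swapped out.

    Returns (to_granted, to_denied): the number of 0 -> 1 flips and
    the number of 1 -> 0 flips across all replacement names.
    """
    base = predict(case_tokens)
    to_granted = to_denied = 0
    for name in replacement_names:
        swapped = [name if tok == original_name else tok for tok in case_tokens]
        new = predict(swapped)
        if base == 0 and new == 1:
            to_granted += 1
        elif base == 1 and new == 0:
            to_denied += 1
    return to_granted, to_denied


def predict(tokens):
    """Toy, deliberately biased stand-in for a trained classifier."""
    return 0 if any(tok.startswith("B_") for tok in tokens) else 1


case = ["B_name1", "theft"]
print(count_flips(predict, case, "B_name1", ["A_name1", "A_name2"]))  # (2, 0)
print(count_flips(predict, case, "B_name1", ["B_name2"]))             # (0, 0)
```

Comparing the 0 -> 1 counts for swaps in each direction, summed over all sampled cases, gives the asymmetry that the slide interprets as bias.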
Demographic Parity
Requires the outcome of a classifier to be independent of a protected attribute
Fairness Gap – Deviation of a trained classifier from ideal demographic parity
Evaluating Fairness
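One common way to quantify this deviation is the absolute difference in the rate of a given outcome between two groups. The slides do not give the exact formula used, so the sketch below is an assumption; `denial_rate` and `fairness_gap` are hypothetical names, and 0 denotes bail denied as in the earlier slide.

```python
def denial_rate(predictions):
    """Fraction of predictions equal to 0 (bail denied)."""
    return sum(1 for p in predictions if p == 0) / len(predictions)


def fairness_gap(preds_group_a, preds_group_b):
    """Absolute difference in denial rates between two groups.

    A value of 0 corresponds to ideal demographic parity on denial of bail.
    """
    return abs(denial_rate(preds_group_a) - denial_rate(preds_group_b))


# Toy predictions for two groups (made-up values, not HLDC results).
group_a = [0, 0, 1, 1]   # denial rate 0.5
group_b = [0, 1, 1, 1]   # denial rate 0.25
print(fairness_gap(group_a, group_b))  # 0.25
```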
Fig: Fairness Gap on Denial of Bail
Evaluating Fairness
Changes in Predictions for Theme: Hatya (Murder)
Results
Changes in Predictions for Theme: Dahej (Dowry)
Results
Ethical considerations
Results in no way indicate a bias in the judicial system of India (small dataset, many open-ended questions remain)
HLDC covers only Uttar Pradesh (UP) data
Identifying de-biasing methods
Initial investigation into bias and fairness for Indian legal data
Highlight preferentially encoded stereotypes that models might pick up in downstream tasks like bail prediction
Need for algorithmic approaches to mitigate the bias learned by these models
Conclusions
https://precog.iiit.ac.in/pages/publications.html
https://precog.iiit.ac.in/
Group pic & Selfie :)
Thanks!
Questions? pk.guru@iiit.ac.in
http://precog.iiit.ac.in/
@ponguru
pk.profgiri
linkedin/in/ponguru

Contenu connexe

Similaire à Data Science for Social Good: #LegalNLP #AlgorithmicBias

RAJASTHAN PCS J EXAM
RAJASTHAN PCS J EXAMRAJASTHAN PCS J EXAM
RAJASTHAN PCS J EXAMJudicial Adda
 
AI Summary eng.pptx
AI Summary eng.pptxAI Summary eng.pptx
AI Summary eng.pptx진희 이
 
NICOLE SHANAHAN TOA Nov 4
NICOLE SHANAHAN TOA Nov 4NICOLE SHANAHAN TOA Nov 4
NICOLE SHANAHAN TOA Nov 4Nicole Shanahan
 
Trusted, Transparent and Fair AI using Open Source
Trusted, Transparent and Fair AI using Open SourceTrusted, Transparent and Fair AI using Open Source
Trusted, Transparent and Fair AI using Open SourceAnimesh Singh
 
CaseMark Weekly Webinar: AI-in-Legal Q3'2023
CaseMark Weekly Webinar: AI-in-Legal Q3'2023CaseMark Weekly Webinar: AI-in-Legal Q3'2023
CaseMark Weekly Webinar: AI-in-Legal Q3'2023Scott Kveton
 
Artificial Intelligence and Machine Learning
Artificial Intelligence and Machine LearningArtificial Intelligence and Machine Learning
Artificial Intelligence and Machine LearningPolsinelli PC
 
ARTIFICIAL INTELLIGENCE ( Quot AI Quot ) IN THE LEGAL PROFESSION
ARTIFICIAL INTELLIGENCE ( Quot AI Quot ) IN THE LEGAL PROFESSIONARTIFICIAL INTELLIGENCE ( Quot AI Quot ) IN THE LEGAL PROFESSION
ARTIFICIAL INTELLIGENCE ( Quot AI Quot ) IN THE LEGAL PROFESSIONJames Heller
 
Digital Personal Data Protection Act, 2023: A Guide to the Applicability of t...
Digital Personal Data Protection Act, 2023: A Guide to the Applicability of t...Digital Personal Data Protection Act, 2023: A Guide to the Applicability of t...
Digital Personal Data Protection Act, 2023: A Guide to the Applicability of t...Spice Route Legal
 
A Case for Expectation Informed Design - Full
A Case for Expectation Informed Design - FullA Case for Expectation Informed Design - Full
A Case for Expectation Informed Design - Fullgloriakt
 
Law + Complexity & Prediction: Toward a Characterization of Legal Systems as ...
Law + Complexity & Prediction: Toward a Characterization of Legal Systems as ...Law + Complexity & Prediction: Toward a Characterization of Legal Systems as ...
Law + Complexity & Prediction: Toward a Characterization of Legal Systems as ...Daniel Katz
 
A Case for Expectation Informed Design
A Case for Expectation Informed DesignA Case for Expectation Informed Design
A Case for Expectation Informed Designgloriakt
 
Artificial Intelligence and Law - 
A Primer
Artificial Intelligence and Law - 
A Primer Artificial Intelligence and Law - 
A Primer
Artificial Intelligence and Law - 
A Primer Daniel Katz
 
iConference 2018 BIAS workshop keynote
iConference 2018 BIAS workshop keynoteiConference 2018 BIAS workshop keynote
iConference 2018 BIAS workshop keynoteAnsgar Koene
 
Racial Profiling Essay. APD Racial Profiling Document Racial Profiling Race...
Racial Profiling Essay. APD Racial Profiling Document  Racial Profiling  Race...Racial Profiling Essay. APD Racial Profiling Document  Racial Profiling  Race...
Racial Profiling Essay. APD Racial Profiling Document Racial Profiling Race...Ashley Matulevich
 

Similaire à Data Science for Social Good: #LegalNLP #AlgorithmicBias (20)

Relationship Between Big Data & AI
Relationship Between Big Data & AIRelationship Between Big Data & AI
Relationship Between Big Data & AI
 
RAJASTHAN PCS J EXAM
RAJASTHAN PCS J EXAMRAJASTHAN PCS J EXAM
RAJASTHAN PCS J EXAM
 
AI Summary eng.pptx
AI Summary eng.pptxAI Summary eng.pptx
AI Summary eng.pptx
 
NICOLE SHANAHAN TOA Nov 4
NICOLE SHANAHAN TOA Nov 4NICOLE SHANAHAN TOA Nov 4
NICOLE SHANAHAN TOA Nov 4
 
Trusted, Transparent and Fair AI using Open Source
Trusted, Transparent and Fair AI using Open SourceTrusted, Transparent and Fair AI using Open Source
Trusted, Transparent and Fair AI using Open Source
 
CaseMark Weekly Webinar: AI-in-Legal Q3'2023
CaseMark Weekly Webinar: AI-in-Legal Q3'2023CaseMark Weekly Webinar: AI-in-Legal Q3'2023
CaseMark Weekly Webinar: AI-in-Legal Q3'2023
 
Artificial Intelligence (AI) & Law.pptx Legal
Artificial Intelligence (AI) & Law.pptx  LegalArtificial Intelligence (AI) & Law.pptx  Legal
Artificial Intelligence (AI) & Law.pptx Legal
 
Artificial Intelligence and Machine Learning
Artificial Intelligence and Machine LearningArtificial Intelligence and Machine Learning
Artificial Intelligence and Machine Learning
 
ICBAI Paper (1)
ICBAI Paper (1)ICBAI Paper (1)
ICBAI Paper (1)
 
ARTIFICIAL INTELLIGENCE ( Quot AI Quot ) IN THE LEGAL PROFESSION
ARTIFICIAL INTELLIGENCE ( Quot AI Quot ) IN THE LEGAL PROFESSIONARTIFICIAL INTELLIGENCE ( Quot AI Quot ) IN THE LEGAL PROFESSION
ARTIFICIAL INTELLIGENCE ( Quot AI Quot ) IN THE LEGAL PROFESSION
 
Digital Personal Data Protection Act, 2023: A Guide to the Applicability of t...
Digital Personal Data Protection Act, 2023: A Guide to the Applicability of t...Digital Personal Data Protection Act, 2023: A Guide to the Applicability of t...
Digital Personal Data Protection Act, 2023: A Guide to the Applicability of t...
 
A Case for Expectation Informed Design - Full
A Case for Expectation Informed Design - FullA Case for Expectation Informed Design - Full
A Case for Expectation Informed Design - Full
 
Leading the Future
Leading the FutureLeading the Future
Leading the Future
 
Law + Complexity & Prediction: Toward a Characterization of Legal Systems as ...
Law + Complexity & Prediction: Toward a Characterization of Legal Systems as ...Law + Complexity & Prediction: Toward a Characterization of Legal Systems as ...
Law + Complexity & Prediction: Toward a Characterization of Legal Systems as ...
 
benfords Law
benfords Lawbenfords Law
benfords Law
 
A Case for Expectation Informed Design
A Case for Expectation Informed DesignA Case for Expectation Informed Design
A Case for Expectation Informed Design
 
Artificial Intelligence and Law - 
A Primer
Artificial Intelligence and Law - 
A Primer Artificial Intelligence and Law - 
A Primer
Artificial Intelligence and Law - 
A Primer
 
iConference 2018 BIAS workshop keynote
iConference 2018 BIAS workshop keynoteiConference 2018 BIAS workshop keynote
iConference 2018 BIAS workshop keynote
 
Racial Profiling Essay. APD Racial Profiling Document Racial Profiling Race...
Racial Profiling Essay. APD Racial Profiling Document  Racial Profiling  Race...Racial Profiling Essay. APD Racial Profiling Document  Racial Profiling  Race...
Racial Profiling Essay. APD Racial Profiling Document Racial Profiling Race...
 
Racial Profiling Essay
Racial Profiling EssayRacial Profiling Essay
Racial Profiling Essay
 

Plus de IIIT Hyderabad

Responsible & Safe AI Systems at ACM India ROCS at IIT Bombay
Responsible & Safe AI Systems at ACM India ROCS at IIT BombayResponsible & Safe AI Systems at ACM India ROCS at IIT Bombay
Responsible & Safe AI Systems at ACM India ROCS at IIT BombayIIIT Hyderabad
 
International Collaboration: Experiences, Challenges, Success stories
International Collaboration: Experiences, Challenges, Success storiesInternational Collaboration: Experiences, Challenges, Success stories
International Collaboration: Experiences, Challenges, Success storiesIIIT Hyderabad
 
Identify, Inspect and Intervene Multimodal Fake News
Identify, Inspect and Intervene Multimodal Fake NewsIdentify, Inspect and Intervene Multimodal Fake News
Identify, Inspect and Intervene Multimodal Fake NewsIIIT Hyderabad
 
It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...IIIT Hyderabad
 
Beyond the Surface: A Computational Exploration of Linguistic Ambiguity
Beyond the Surface: A Computational Exploration of Linguistic AmbiguityBeyond the Surface: A Computational Exploration of Linguistic Ambiguity
Beyond the Surface: A Computational Exploration of Linguistic AmbiguityIIIT Hyderabad
 
How to Write a (Good) Research Paper
How to Write a (Good) Research Paper How to Write a (Good) Research Paper
How to Write a (Good) Research Paper IIIT Hyderabad
 
Modeling Online User Interactions and their Offline effects on Socio-Technica...
Modeling Online User Interactions and their Offline effects on Socio-Technica...Modeling Online User Interactions and their Offline effects on Socio-Technica...
Modeling Online User Interactions and their Offline effects on Socio-Technica...IIIT Hyderabad
 
Privacy. Winter School on “Topics in Digital Trust”. IIT Bombay
Privacy. Winter School on “Topics in Digital Trust”. IIT BombayPrivacy. Winter School on “Topics in Digital Trust”. IIT Bombay
Privacy. Winter School on “Topics in Digital Trust”. IIT BombayIIIT Hyderabad
 
It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...IIIT Hyderabad
 
It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...IIIT Hyderabad
 
Leveraging Social Media for Financial Advice
Leveraging Social Media for Financial AdviceLeveraging Social Media for Financial Advice
Leveraging Social Media for Financial AdviceIIIT Hyderabad
 
Development of Stress Induction and Detection System to Study its Effect on B...
Development of Stress Induction and Detection System to Study its Effect on B...Development of Stress Induction and Detection System to Study its Effect on B...
Development of Stress Induction and Detection System to Study its Effect on B...IIIT Hyderabad
 
A Framework for Automatic Question Answering in Indian Languages
A Framework for Automatic Question Answering in Indian LanguagesA Framework for Automatic Question Answering in Indian Languages
A Framework for Automatic Question Answering in Indian LanguagesIIIT Hyderabad
 
A Framework For Automatic Question Answering in Indian Languages
A Framework For Automatic Question Answering in Indian LanguagesA Framework For Automatic Question Answering in Indian Languages
A Framework For Automatic Question Answering in Indian LanguagesIIIT Hyderabad
 
Exposing, Examining and Intervening Fake News
Exposing, Examining and Intervening Fake NewsExposing, Examining and Intervening Fake News
Exposing, Examining and Intervening Fake NewsIIIT Hyderabad
 
It's MY JOB: Identifying and Improving Content Quality for Online recruitmen...
 It's MY JOB: Identifying and Improving Content Quality for Online recruitmen... It's MY JOB: Identifying and Improving Content Quality for Online recruitmen...
It's MY JOB: Identifying and Improving Content Quality for Online recruitmen...IIIT Hyderabad
 
De-anonymizing, Preserving and Democratizing Data Privacy and Ownership
De-anonymizing, Preserving and Democratizing Data Privacy and OwnershipDe-anonymizing, Preserving and Democratizing Data Privacy and Ownership
De-anonymizing, Preserving and Democratizing Data Privacy and OwnershipIIIT Hyderabad
 
“It is our choices, Harry, that show what we truly are, far more than our abi...
“It is our choices, Harry, that show what we truly are, far more than our abi...“It is our choices, Harry, that show what we truly are, far more than our abi...
“It is our choices, Harry, that show what we truly are, far more than our abi...IIIT Hyderabad
 
What's Kooking? Characterizing India's Emerging Social Network, Koo
What's Kooking? Characterizing India's Emerging Social Network, KooWhat's Kooking? Characterizing India's Emerging Social Network, Koo
What's Kooking? Characterizing India's Emerging Social Network, KooIIIT Hyderabad
 

Plus de IIIT Hyderabad (20)

Responsible & Safe AI Systems at ACM India ROCS at IIT Bombay
Responsible & Safe AI Systems at ACM India ROCS at IIT BombayResponsible & Safe AI Systems at ACM India ROCS at IIT Bombay
Responsible & Safe AI Systems at ACM India ROCS at IIT Bombay
 
International Collaboration: Experiences, Challenges, Success stories
International Collaboration: Experiences, Challenges, Success storiesInternational Collaboration: Experiences, Challenges, Success stories
International Collaboration: Experiences, Challenges, Success stories
 
Identify, Inspect and Intervene Multimodal Fake News
Identify, Inspect and Intervene Multimodal Fake NewsIdentify, Inspect and Intervene Multimodal Fake News
Identify, Inspect and Intervene Multimodal Fake News
 
#ChatGPT #ResponsibleAI
#ChatGPT #ResponsibleAI#ChatGPT #ResponsibleAI
#ChatGPT #ResponsibleAI
 
It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...
 
Beyond the Surface: A Computational Exploration of Linguistic Ambiguity
Beyond the Surface: A Computational Exploration of Linguistic AmbiguityBeyond the Surface: A Computational Exploration of Linguistic Ambiguity
Beyond the Surface: A Computational Exploration of Linguistic Ambiguity
 
How to Write a (Good) Research Paper
How to Write a (Good) Research Paper How to Write a (Good) Research Paper
How to Write a (Good) Research Paper
 
Modeling Online User Interactions and their Offline effects on Socio-Technica...
Modeling Online User Interactions and their Offline effects on Socio-Technica...Modeling Online User Interactions and their Offline effects on Socio-Technica...
Modeling Online User Interactions and their Offline effects on Socio-Technica...
 
Privacy. Winter School on “Topics in Digital Trust”. IIT Bombay
Privacy. Winter School on “Topics in Digital Trust”. IIT BombayPrivacy. Winter School on “Topics in Digital Trust”. IIT Bombay
Privacy. Winter School on “Topics in Digital Trust”. IIT Bombay
 
It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...
 
It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...
 
Leveraging Social Media for Financial Advice
Leveraging Social Media for Financial AdviceLeveraging Social Media for Financial Advice
Leveraging Social Media for Financial Advice
 
Development of Stress Induction and Detection System to Study its Effect on B...
Development of Stress Induction and Detection System to Study its Effect on B...Development of Stress Induction and Detection System to Study its Effect on B...
Development of Stress Induction and Detection System to Study its Effect on B...
 
A Framework for Automatic Question Answering in Indian Languages
A Framework for Automatic Question Answering in Indian LanguagesA Framework for Automatic Question Answering in Indian Languages
A Framework for Automatic Question Answering in Indian Languages
 
A Framework For Automatic Question Answering in Indian Languages
A Framework For Automatic Question Answering in Indian LanguagesA Framework For Automatic Question Answering in Indian Languages
A Framework For Automatic Question Answering in Indian Languages
 
Exposing, Examining and Intervening Fake News
Exposing, Examining and Intervening Fake NewsExposing, Examining and Intervening Fake News
Exposing, Examining and Intervening Fake News
 
It's MY JOB: Identifying and Improving Content Quality for Online recruitmen...
 It's MY JOB: Identifying and Improving Content Quality for Online recruitmen... It's MY JOB: Identifying and Improving Content Quality for Online recruitmen...
It's MY JOB: Identifying and Improving Content Quality for Online recruitmen...
 
De-anonymizing, Preserving and Democratizing Data Privacy and Ownership
De-anonymizing, Preserving and Democratizing Data Privacy and OwnershipDe-anonymizing, Preserving and Democratizing Data Privacy and Ownership
De-anonymizing, Preserving and Democratizing Data Privacy and Ownership
 
“It is our choices, Harry, that show what we truly are, far more than our abi...
“It is our choices, Harry, that show what we truly are, far more than our abi...“It is our choices, Harry, that show what we truly are, far more than our abi...
“It is our choices, Harry, that show what we truly are, far more than our abi...
 
What's Kooking? Characterizing India's Emerging Social Network, Koo
What's Kooking? Characterizing India's Emerging Social Network, KooWhat's Kooking? Characterizing India's Emerging Social Network, Koo
What's Kooking? Characterizing India's Emerging Social Network, Koo
 

Dernier

computer application and construction management
computer application and construction managementcomputer application and construction management
computer application and construction managementMariconPadriquez1
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...VICTOR MAESTRE RAMIREZ
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfAsst.prof M.Gokilavani
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx959SahilShah
 
Solving The Right Triangles PowerPoint 2.ppt
Solving The Right Triangles PowerPoint 2.pptSolving The Right Triangles PowerPoint 2.ppt
Solving The Right Triangles PowerPoint 2.pptJasonTagapanGulla
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionDr.Costas Sachpazis
 
Arduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptArduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptSAURABHKUMAR892774
 
8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitter8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitterShivangiSharma879191
 
US Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionUS Department of Education FAFSA Week of Action
US Department of Education FAFSA Week of ActionMebane Rash
 
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catcherssdickerson1
 
Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...121011101441
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort servicejennyeacort
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxk795866
 
welding defects observed during the welding
welding defects observed during the weldingwelding defects observed during the welding
welding defects observed during the weldingMuhammadUzairLiaqat
 
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncWhy does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncssuser2ae721
 

Dernier (20)

computer application and construction management
computer application and construction managementcomputer application and construction management
computer application and construction management
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx
 
Solving The Right Triangles PowerPoint 2.ppt
Solving The Right Triangles PowerPoint 2.pptSolving The Right Triangles PowerPoint 2.ppt
Solving The Right Triangles PowerPoint 2.ppt
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
 
Arduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptArduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.ppt
 
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
 
8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitter8251 universal synchronous asynchronous receiver transmitter

Data Science for Social Good: #LegalNLP #AlgorithmicBias

  • 1. Data Science for Social Good: #LegalNLP #AlgorithmicBias. March 23 - 24, 2023, IIIT Una. Ponnurangam Kumaraguru (“PK”), #ProfGiri, CS, IIIT Hyderabad; ACM Distinguished Member; TEDx Speaker. https://www.linkedin.com/in/ponguru/ https://www.instagram.com/pk.profgiri/
  • 6. What is Social Computing? https://en.wikipedia.org/wiki/Social_computing
  • 8. Legal AI for Indian Context: District courts are usually the first point of contact between the people and the judiciary. Lower courts in India are burdened with a backlog of cases (~40 million as of 2021). Local languages are used in the documents filed in district courts in India. Court hierarchy: Supreme Court, High Courts, District Courts.
  • 9. Legal AI / NLP - Data: We collected ~900k district court case documents from Uttar Pradesh. All documents are in Hindi, written in Devanagari. There are legal corpora for the European Court of Justice and Chinese courts, but none for Indian district courts.
  • 10. Legal AI / NLP - Data: There are around 300 different case types; the table shows the prominent ones. The majority of the case documents correspond to bail applications. Figures: Variation in number of case documents per district; Case types in HLDC.
  • 11. Legal AI / NLP - Bail Documents. Fig: District-wise ratio of number of bail applications to total cases
  • 12. Legal AI / NLP - Bail Prediction Model: In general, the performance is lower in district-wise settings, possibly due to large variation across districts. Overall, summarization models perform better than Doc2Vec and simpler Transformer-based models.
  • 13. Legal AI / NLP for Indian Context. HLDC: Hindi Legal Documents Corpus
  • 14. Legal AI / NLP for Indian Context - Takeaways: Indian legal documents are a rich source of domain-specific Indic-language corpora, readily available online. Multiple tasks still need attention, especially for Indian settings: legal summarization, case recommendations, citation predictions / network, sleeping beauty, bias.
  • 15. Are Models Trained on Indian Legal Data Fair?
  • 16. Overview: An initial investigation of fairness from the Indian perspective in the legal domain.
  • 17. Objective: We highlight the propagation of learnt algorithmic biases in the bail prediction task for models trained on Hindi legal documents.
  • 18. Motivation: Recent LegalNLP research targets judgement prediction and summarization. Deploying such models without evaluating bias can lead to unwarranted outcomes and perpetuate unfair decision-making.
  • 19. Motivation (continued): An evaluation and investigation of encoded biases helps to understand historical social disparities and to mitigate potential harms in the future.
  • 20. Data Preparation: Sample 10,000 cases from HLDC; 36% bail granted, 63% bail denied. Fig: HLDC Snippet
  • 21. Data Preparation: Use two features: facts-and-arguments and decision. Basic pre-processing: stop-word removal and cleaning using regex. Each case is represented by 7 features: 5 for the keywords of the case and 2 for the category of crime of the case.
  • 22. Data Preparation: Represent a case using keywords obtained with LDA (topic modelling). All cases are assigned their top two topics, with 10 keywords representing each topic. A case takes 3 keywords from its dominant topic and 2 keywords from its second-dominant topic.
  • 23. Model Training: Identify a subset of cases from the dataset using the theme. Sample cases having either a Hindu or a Muslim proper noun. Train a Decision Tree classifier.
  • 24. Model Training: For every case, we identify the true label, the model’s predicted label, the number of times the model’s prediction changes when the proper noun is replaced with another Hindu proper noun, and the number of times the model’s prediction changes when the proper noun is replaced with another Muslim proper noun.
  • 25. Model Training: If the model changes its predictions from 0 (bail dismissed) to 1 (bail granted) more often when Muslim nouns are replaced by Hindu nouns than when Hindu nouns are replaced by Muslim nouns, then there exists a bias against Muslims. This bias may be due to inherent characteristics of the dataset.
  • 26. Evaluating Fairness: Demographic parity requires the outcome of a classifier to be independent of a protected attribute.
  • 27. Evaluating Fairness (continued): Fairness gap: the deviation of a trained classifier away from ideal demographic parity.
  • 28. Evaluating Fairness. Fig: Fairness Gap on Denial of Bail
  • 30. Results: Changes in Predictions for Theme: Hatya (Murder)
  • 31. Results: Changes in Predictions for Theme: Dahej (Dowry)
  • 32. Ethical considerations: Results in no way indicate a bias in the judicial system of India (small data set, many more open-ended questions). HLDC contains only UP data. Identifying de-biasing methods.
  • 33. Conclusions: Initial investigation into bias and fairness for Indian legal data. Highlight preferentially encoded stereotypes that models might pick up in downstream tasks like bail prediction. Need for algorithmic approaches to mitigate the bias learned by these models.
  • 36. Group pic & Selfie
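
The keyword representation from slides 21 and 22 (3 keywords from the dominant topic, 2 from the second-dominant topic) can be sketched as follows. This is only an illustration: it assumes the LDA step has already produced per-case topic weights and a ranked keyword list per topic, and it assumes the keywords taken are the highest-ranked ones, which the slides do not state. The function name `case_keywords` and the sample topics are hypothetical.

```python
from typing import Dict, List, Sequence


def case_keywords(topic_weights: Sequence[float],
                  topic_keywords: Dict[int, List[str]],
                  n_dominant: int = 3,
                  n_second: int = 2) -> List[str]:
    """Pick keywords from the two highest-weighted topics of one case."""
    if len(topic_weights) < 2:
        raise ValueError("need at least two topics")
    # Rank topic ids by weight, highest first (ties go to the lower id).
    ranked = sorted(range(len(topic_weights)),
                    key=lambda t: (-topic_weights[t], t))
    dominant, second = ranked[0], ranked[1]
    # Assumption: take the top-ranked keywords of each selected topic.
    return (topic_keywords[dominant][:n_dominant]
            + topic_keywords[second][:n_second])


# Hypothetical LDA output (the deck uses 10 keywords per topic).
topics = {
    0: ["theft", "property", "vehicle", "recovery"],
    1: ["dowry", "marriage", "husband", "demand"],
    2: ["murder", "weapon", "injury", "witness"],
}
weights = [0.1, 0.6, 0.3]  # topic weights for one case
keywords = case_keywords(weights, topics)
print(keywords)
```

The five keywords would then be combined with the two crime-category features to form the 7-feature representation described on slide 21.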
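
The name-substitution test from slides 24 and 25 can be sketched as a simple flip count. This is a minimal sketch, not the authors' code: `count_flips`, the placeholder names (`A1`, `B1`, ...) and the deliberately biased `toy_predict` are all hypothetical stand-ins for the trained Decision Tree classifier and the real proper-noun lists.

```python
from typing import Callable, Dict, Iterable


def count_flips(text: str,
                name: str,
                replacements: Iterable[str],
                predict: Callable[[str], int]) -> Dict[str, int]:
    """Count prediction changes when `name` is swapped for other names.

    Labels follow the slides: 0 = bail dismissed, 1 = bail granted.
    """
    original = predict(text)
    counts = {"0_to_1": 0, "1_to_0": 0}
    for other in replacements:
        if other == name:
            continue  # swapping a name with itself tells us nothing
        swapped = predict(text.replace(name, other))
        if original == 0 and swapped == 1:
            counts["0_to_1"] += 1
        elif original == 1 and swapped == 0:
            counts["1_to_0"] += 1
    return counts


# Placeholder name lists for two groups (hypothetical).
group_a = ["A1", "A2"]
group_b = ["B1", "B2"]


def toy_predict(text: str) -> int:
    """Deliberately biased toy model: grants bail only for group A names."""
    return 1 if any(tok in group_a for tok in text.split()) else 0


case = "applicant B1 seeks bail"
flips_to_a = count_flips(case, "B1", group_a, toy_predict)
flips_to_b = count_flips(case, "B1", group_b, toy_predict)
print(flips_to_a, flips_to_b)
```

Aggregating the `0_to_1` counts over all cases, once for each swap direction, gives the comparison described on slide 25.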
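
The fairness gap from slides 26 to 28 can be illustrated with a small calculation. The slides do not give the exact formula, so this sketch assumes one common choice: the absolute difference between two groups in the rate of a given outcome (here bail denial, label 0). The function names and the sample predictions are hypothetical.

```python
from typing import Hashable, Sequence


def outcome_rate(preds: Sequence[int],
                 groups: Sequence[Hashable],
                 group: Hashable,
                 outcome: int = 0) -> float:
    """Fraction of cases in `group` for which the model predicts `outcome`."""
    selected = [p for p, g in zip(preds, groups) if g == group]
    if not selected:
        raise ValueError(f"no cases for group {group!r}")
    return sum(1 for p in selected if p == outcome) / len(selected)


def fairness_gap(preds: Sequence[int],
                 groups: Sequence[Hashable],
                 group_a: Hashable,
                 group_b: Hashable,
                 outcome: int = 0) -> float:
    """Absolute difference in outcome rates; 0.0 means demographic parity."""
    return abs(outcome_rate(preds, groups, group_a, outcome)
               - outcome_rate(preds, groups, group_b, outcome))


# Hypothetical predictions: 0 = bail dismissed, 1 = bail granted.
preds = [0, 0, 1, 1, 0, 1, 1, 1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
gap = fairness_gap(preds, groups, "A", "B")
print(gap)
```

A gap of zero corresponds to ideal demographic parity for the chosen outcome; larger values indicate that the predicted denial rate depends more strongly on the protected attribute.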