Machine Learning and Affect Analysis
      Against Cyber-Bullying

             Michal Ptaszynski, Pawel Dybala,
             Tatsuaki Matsuba, Fumito Masui,
                      Rafal Rzepka, Kenji Araki
Presentation Outline
• Introduction
• What is Cyber-Bullying?
• Machine Learning Method for Cyber-bullying
  Detection
• Affect Analysis of Cyber-Bullying Data
• Conclusions & Future Work
Introduction
• A dialogue agent (companion) needs to be aware
  of undesirable activities (in or around the
  user)
• Application: Web security
  • Could these be stopped with Web-mining?
    • Bus-jack case, 2000, Japan
    • 9.11
Introduction
• Demi Moore Saves A Teen Through Twitter (03.19)
• We need an artificial Demi Moore!
New Threat: Cyber-Bullying
• cyber-bullying (or cyber-harassment, cyber-
  stalking)
     • Cyber-bullying happens “when the Internet, cell phones or
       other devices are used to send or post text or images
       intended to hurt or embarrass another person.”
         – The National Crime Prevention Council in America

     • Cyber-bullying “involves the use of information and
       communication technologies to support deliberate,
       repeated, and hostile behavior by an individual or group,
       that is intended to harm others.”
         – B. Belsey. Cyberbullying: An Emerging Threat for the “Always On”
           Generation, http://www.cyberbullying.ca/pdf/Cyberbullying
           Presentation Description.pdf
New Threat: Cyber-Bullying
• In Japan:
  – several suicide cases of cyber-bullying victims
  – Ministry of Education officially considers cyber-
    bullying a problem and produces a manual for
    spotting and handling the cyber-bullying cases.
     • Ministry of Education, Culture, Sports, Science and
       Technology, 2008:
        – 'Netto jou no ijime' ni kansuru taiou manyuaru jirei shuu
          (gakkou, kyouin muke)
        – ["Bullying on the net" Manual for handling and the collection
          of cases (directed to school teachers)] (in Japanese).
New Threat: Cyber-Bullying
• In Japan:
 – Volunteers (teachers, PTA members) started Online
   Patrol (OP) to spot CB cases…
 – But there is too much of it!
   (impossible to deal with
   all of it manually)
 → Need to help OP automatically spot cyber-bullying
Cyber-bullying Detection Method
• Construction of a lexicon of
  words distinctive of
  cyber-bullying
• Estimation of word
  similarity (to handle slang
  modifications of words)
• Classification of entries
  into harmful/non-harmful
• Ranking according to
  harmfulness
Lexicon Construction
• Words distinctive of cyber-bullying =
  vulgarities
  – In English: f**ck, b*tch, sh*t, c*nt, etc.
  – In Japanese: uzai (freaking annoying), kimoi
    (freaking ugly), etc.
  ↓
• Usually not recognized by parsers
Lexicon Construction
• Obtained cyber-bullying data (from Online
  Patrol of Japanese secondary school sites)*
• Read the data and manually identified 216
  distinctive vulgar words.
• Added them to the parser dictionary.
  Example: [dictionary entry shown as a figure]

*) From Human Rights Research Institute Against All Forms for Discrimination and Racism-MIE, Japan
Similarity Estimation
• Jargonization (online slang)
  – English: “CU” (see you [later]), “brah” (bro[ther],
    friend)
  – Japanese: [examples shown as a figure]

• Problem: variants of the same word will not be recognized, or
  will be recognized as separate words.
Similarity Estimation
• Use Levenshtein Distance
  – “The Levenshtein Distance between two strings is calculated
    as the minimum number of operations required to transform
    one string into another, where the available operations are
    only deletion, insertion or substitution of a single character.”

V. I. Levenshtein. Binary Code Capable of Correcting Deletions, Insertions and Reversals. Doklady
Akademii Nauk SSSR, Vol. 163, No. 4, pp. 845-848 (1965).
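The definition quoted above can be sketched with the standard dynamic-programming recurrence. This is a minimal illustration, not the authors' implementation; the function name and the example strings are assumptions made for the sketch.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop to use less memory.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] = distance between the first i-1 chars of a and first j chars of b
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning i chars of a into the empty string takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep when equal)
            ))
        previous = current
    return previous[-1]


# Illustrative pair: a lexicon word and a made-up slang-like variant.
print(levenshtein("uzai", "uzee"))  # 2
```

Only two rows of the table are kept at a time, so memory grows with the shorter string rather than with the product of both lengths.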
Similarity Estimation
• Add heuristic rules for optimization
• With the threshold set to 2, Precision was 58.9%
  before applying the rules and improved to 85.0%
  after applying them.
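A rough sketch of threshold-based matching against the lexicon, assuming a threshold of 2 as on the slide. The heuristic rules themselves are not reproduced here because the transcript does not preserve them; `edit_distance`, `match_lexicon`, and the toy lexicon are hypothetical names chosen for illustration.

```python
from functools import lru_cache


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via memoized recursion (compact, fine for short words)."""
    @lru_cache(maxsize=None)
    def d(i: int, j: int) -> int:
        if i == 0:
            return j
        if j == 0:
            return i
        return min(
            d(i - 1, j) + 1,                           # deletion
            d(i, j - 1) + 1,                           # insertion
            d(i - 1, j - 1) + (a[i - 1] != b[j - 1]),  # substitution
        )
    return d(len(a), len(b))


def match_lexicon(token: str, lexicon: list, threshold: int = 2) -> list:
    """Return lexicon entries within `threshold` edits of `token`, closest first."""
    scored = sorted((edit_distance(token, word), word) for word in lexicon)
    return [word for dist, word in scored if dist <= threshold]


lexicon = ["uzai", "kimoi"]            # two entries taken from the lexicon slide
print(match_lexicon("uzaa", lexicon))  # ['uzai']
print(match_lexicon("umai", lexicon))  # ['uzai'], although "umai" (tasty) is unrelated
```

The second call shows how a bare distance threshold can admit an unrelated short word, which is the kind of false match that additional rules would need to filter out.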
SVM Classification
• Support Vector Machines (SVM) are a method
  of supervised machine learning developed by
  Vapnik [14] and used for classification of data.
• Training data: 966 entries (750 harmful, 216
  non-harmful)
• Calculate result as balanced F-score (with
  Precision and Recall)
• Perform 10-fold cross validation on all data
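The evaluation measures named above can be sketched from gold and predicted labels as follows. This is a minimal illustration that treats "harmful" as the positive class; the function name and the five-item toy data are assumptions, not the authors' data.

```python
def precision_recall_f1(gold, predicted, positive="harmful"):
    """Precision, Recall and balanced F-score for one positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Balanced F-score: harmonic mean of Precision and Recall.
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Toy labels, made up for illustration only.
gold = ["harmful", "harmful", "harmful", "non-harmful", "non-harmful"]
pred = ["harmful", "harmful", "non-harmful", "harmful", "non-harmful"]
print(precision_recall_f1(gold, pred))
```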
SVM Classification
• 10-fold cross validation on all data
  – Divide the data into 10 parts
  – Use 9 for training and 1 for testing
  – Repeat 10 times and average the results.

  Precision=79.9%, Recall=98.3%  F=88.2%
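The fold procedure and the reported figures can be checked with a short sketch. It only builds the index splits and recomputes the balanced F-score from the reported Precision and Recall; it does not train an SVM. The function names are hypothetical, and the contiguous, unshuffled folds are an assumption (in practice the data would usually be shuffled first).

```python
def k_fold_indices(n_items: int, k: int = 10):
    """Yield (train_indices, test_indices) for k roughly equal, contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_items))
        yield train, test
        start += size


def balanced_f(precision: float, recall: float) -> float:
    """Harmonic mean of Precision and Recall."""
    return 2 * precision * recall / (precision + recall)


folds = list(k_fold_indices(966, k=10))  # 966 entries, as on the previous slide
print([len(test) for _, test in folds])  # six folds of 97, then four of 96

# Recompute F from the reported Precision and Recall.
print(round(100 * balanced_f(0.799, 0.983), 1))  # 88.2
```

An F-score averaged over folds need not equal the F-score of the averaged Precision and Recall in general; here the two agree to one decimal place.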
Affect Analysis of Cyber-Bullying Data
• The Affect Analysis system used:
  – ML-Ask:
     1. Determines Emotiveness
     2. Determines the types of emotions expressed

M. Ptaszynski, P. Dybala, R. Rzepka and K. Araki. Affecting Corpora: Experiments with Automatic Affect Annotation System
- A Case Study of the 2channel Forum -, In Proceedings of The Conference of the Pacific Association for Computational
Linguistics 2009 (PACLING-09), pp. 223-228 (2009).
Affect Analysis of Cyber-Bullying Data
• The Affect Analysis system used:
  – ML-Ask:
     1. Emotiveness:
        1.   Determine whether utterance is emotive (0/1)
        2.   Calculate emotive value of an utterance (0-5)
        3.   Number of emotive utterances in conversation
        4.   Approximate emotive value for all utterances
        5.   Determine number of emotiveness features:
             • Interjections
             • Exclamations
             • Vulgarities
             • Mimetic expressions
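The emotiveness steps above can be illustrated with a toy counter. This is not ML-Ask: the word lists are tiny made-up stand-ins (ML-Ask works on Japanese with its own databases), and capping the value at 5 is an assumption read off the 0-5 range on the slide.

```python
import re

# Hypothetical toy feature lists, for illustration only.
TOY_FEATURES = {
    "interjection": {"wow", "ugh", "oh"},
    "vulgarity": {"uzai", "kimoi"},
    "mimetic": {"doki-doki", "ira-ira"},
}


def emotive_features(utterance: str) -> dict:
    """Count each emotiveness feature type found in the utterance."""
    tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", utterance.lower())
    counts = {name: sum(tok in words for tok in tokens)
              for name, words in TOY_FEATURES.items()}
    counts["exclamation"] = utterance.count("!")
    return counts


def emotive_value(utterance: str, cap: int = 5) -> int:
    """Total number of emotive features, capped (the cap is an assumption)."""
    return min(sum(emotive_features(utterance).values()), cap)


def is_emotive(utterance: str) -> bool:
    """Step 1: an utterance is emotive (1) if any feature is present, else not (0)."""
    return emotive_value(utterance) > 0


print(emotive_features("Ugh, kimoi!"))
print(emotive_value("Ugh, kimoi!"))           # 3
print(is_emotive("The meeting is at noon."))  # False
```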
Affect Analysis of Cyber-Bullying Data
• The Affect Analysis system used:
  – ML-Ask:
     2. Determines the types of emotions expressed:
        One of 10 emotion types said to be the most appropriate
        for the Japanese language:
        ki/yorokobi (joy, delight), do/ikari (anger), ai/aware
        (sadness, gloom), fu/kowagari (fear), chi/haji (shame,
        shyness), ko/suki (liking, fondness), en/iya (dislike),
        ko/takaburi (excitement), an/yasuragi (relief) and
        kyo/odoroki (surprise, amazement)

        Based on an emotive expression database

A. Nakamura. Kanjo hyogen jiten [Dictionary of Emotive Expressions] (in Japanese), Tokyodo Publishing, Tokyo, 1993.
Affect Analysis of Cyber-Bullying Data
• Results
     1. Emotiveness:
        1.   Determine whether utterance is emotive
        2.   Calculate emotive value of an utterance
        3.   Number of emotive utterances in conversation
        4.   Approximate emotive value for all utterances
             → Many moderate indications: harmful data is less emotive
        5.   Determine number of emotiveness features:
             • Interjections
             • Exclamations
             • Vulgarities
             • Mimetic expressions
             → There are two distinctive features
Affect Analysis of Cyber-Bullying Data
• Results
  2. Emotion types
  – More positive emotions in non-harmful data
  – Slightly more negative emotions in harmful data
  – Detailed analysis: fondness is often used in irony
Conclusions
• New problem: Cyber-Bullying
• Prototype Machine Learning Method for Cyber-
  bullying Detection
  – Results not ideal, but somewhat encouraging
• Affect Analysis of Cyber-Bullying Data
  – Cyber-bullying is less “emotive” (cold irony)
  – Distinctive features of CB: vulgarities, mimetic
    expressions
  – Expressions of emotions considered positive are
    often used ironically
Future Work
• New vulgarities are created every day
  – Create a method for extraction of vulgarities
  – Find a syntactic model of vulgar expressions
• Implement in a web crawler that automatically
  performs online patrol (e.g. for school web
  sites)
Thank you for your attention.

Building Your Own AI Instance (TBLC AI )
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdf
 

Aisb cyberbullying

  • 1. Machine Learning and Affect Analysis Against Cyber-Bullying Michal Ptaszynski, Pawel Dybala, Tatsuaki Matsuba, Fumito Masui, Rafal Rzepka, Kenji Araki
  • 2. Presentation Outline • Introduction • What is Cyber-Bullying? • Machine Learning Method for Cyber-bullying Detection • Affect Analysis of Cyber-Bullying Data • Conclusions & Future Work
  • 3. Introduction • Dialogue agent–companion needs to be aware of undesirable activities (in or around the user) • Application: Web security • Could these be stopped with Web-mining? • Bus-jack case, 2000, Japan • 9.11
  • 4. Introduction • Dialogue agent–companion needs to be aware of undesirable activities (in or around the user) • Application: Web security • Could these be stopped with Web-mining? • Bus-jack case, 2000, Japan • 9.11 • Demi Moore Saves A Teen Through Twitter, 03.19
  • 5. Introduction • Dialogue agent–companion needs to be aware of undesirable activities (in or around the user) • Application: Web security • Could these be stopped with Web-mining? • Bus-jack case, 2000, Japan • 9.11 • Demi Moore Saves A Teen Through Twitter, 03.19 • We need an artificial Demi Moore!
  • 6. New Threat: Cyber-Bullying • cyber-bullying (or cyber-harassment, cyber-stalking) • Cyber-bullying happens “when the Internet, cell phones or other devices are used to send or post text or images intended to hurt or embarrass another person.” – The National Crime Prevention Council in America • cyber-bullying “involves the use of information and communication technologies to support deliberate, repeated, and hostile behavior by an individual or group, that is intended to harm others.” – B. Belsey. Cyberbullying: An Emerging Threat for the “Always On” Generation, http://www.cyberbullying.ca/pdf/Cyberbullying Presentation Description.pdf
  • 7. New Threat: Cyber-Bullying • In Japan: – several suicide cases of cyber-bullying victims – Ministry of Education officially considers cyber-bullying a problem and produces a manual for spotting and handling cyber-bullying cases. • Ministry of Education, Culture, Sports, Science and Technology, 2008: – 'Netto jou no ijime' ni kansuru taiou manyuaru jirei shuu (gakkou, kyouin muke) – ["Bullying on the net" Manual for handling and the collection of cases (directed to school teachers)] (in Japanese).
  • 8. New Threat: Cyber-Bullying • In Japan: – Volunteers (teachers, PTA members) started Online Patrol (OP) to spot CB cases… – But there is too much of it! (impossible to deal with all of it manually)
  • 9. New Threat: Cyber-Bullying • In Japan: – Volunteers (teachers, PTA members) started Online Patrol (OP) to spot CB cases… – But there is too much of it! (impossible to deal with all of it manually) • Need to help OP spot cyber-bullying automatically
  • 10. Cyber-bullying Detection Method • Construction of a lexicon of words distinctive of cyber-bullying • Estimation of word similarity (to handle slang modifications of words) • Classification of entries into harmful/non-harmful • Ranking according to harmfulness
  • 11. Lexicon Construction • Words distinctive of cyber-bullying = vulgarities – In English: f**ck, b*tch, sh*t, c*nt, etc. – In Japanese: uzai (freaking annoying), kimoi (freaking ugly), etc. ↓ • Usually not recognized by parsers
  • 12. Lexicon Construction • Obtained Cyber-bullying data (from Online Patrol of Japanese secondary school sites)* • Read the data and manually identified 216 distinctive vulgar words. Example: • Added to parser dictionary: *) From Human Rights Research Institute Against All Forms for Discrimination and Racism-MIE, Japan
  • 13. Similarity Estimation • Jargonization (online slang) – English: “CU” (see you [later]), “brah” (bro[ther], friend) – Japanese: *Problem: The same words will not be recognized or will be recognized as separate words.
  • 14. Similarity Estimation • Use Levenshtein Distance – “The Levenshtein Distance between two strings is calculated as the minimum number of operations required to transform one string into another, where the available operations are only deletion, insertion or substitution of a single character.” V. I. Levenshtein. Binary Code Capable of Correcting Deletions, Insertions and Reversals. Doklady Akademii Nauk SSSR, Vol. 163, No. 4, pp. 845-848 (1965).
  • 15. Similarity Estimation • Add heuristic rules for optimization • With the threshold set to 2, precision was 58.9% before applying the rules and improved to 85.0% after.
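The similarity step on slides 14 and 15 can be sketched in Python as a standard dynamic-programming Levenshtein distance plus a lexicon lookup. This is only an illustration under stated assumptions: the function names, the two-word lexicon (the romanized examples from slide 11), the test spellings, and the reading of "threshold 2" as "distance of at most 2" are all hypothetical. The heuristic rules that raised precision to 85.0% are not described on the slides and are not reproduced here.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character deletions, insertions or substitutions."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


def match_lexicon(word, lexicon, threshold=2):
    """Return lexicon entries within `threshold` edits of `word` (assumed rule)."""
    return [entry for entry in lexicon if levenshtein(word, entry) <= threshold]


LEXICON = ["uzai", "kimoi"]  # hypothetical mini-lexicon, not the 216-word list

print(levenshtein("kitten", "sitting"))  # 3
print(match_lexicon("uzaai", LEXICON))   # ['uzai']  (slang-style lengthening)
print(match_lexicon("kimono", LEXICON))  # ['kimoi'] (a false match)
```

The last call shows an unrelated word falling within two edits of a lexicon entry. False matches of this kind are one plausible reason why a bare threshold gives modest precision and extra rules help.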
  • 16. SVM Classification • Support Vector Machines (SVM) are a method of supervised machine learning developed by Vapnik [14] and used for classification of data. • Training data: 966 entries (750 harmful, 216 non-harmful) • Calculate result as balanced F-score (with Precision and Recall) • Perform 10-fold cross validation on all data
  • 17. SVM Classification • 10-fold cross validation on all data – Divide data into 10 parts – Use 9 for training and 1 for testing – Repeat 10 times and average the results. Precision=79.9%, Recall=98.3%, F=88.2%
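The evaluation arithmetic on slides 16 and 17 can be checked with a short Python sketch. It computes the balanced F-score as the harmonic mean of precision and recall, and splits 966 entries into 10 folds. The function names are hypothetical, and the split is contiguous and unshuffled for simplicity; the slides do not say how the data was ordered or whether folds were stratified, and the SVM itself is not shown.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n_items: int, k: int = 10):
    """Yield (train, test) index lists; every item is tested exactly once."""
    base, extra = divmod(n_items, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)  # spread the remainder
        test = list(range(start, start + size))
        train = list(range(start)) + list(range(start + size, n_items))
        yield train, test
        start += size


# Reported precision and recall reproduce the reported F-score.
print(round(100 * f_score(0.799, 0.983), 1))  # 88.2

# 966 entries, 10 folds: nine parts train, one part tests, ten times.
for train, test in k_fold_indices(966):
    print(len(train), len(test))
```

With 966 entries the folds cannot be equal, so six test folds hold 97 entries and four hold 96.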
  • 18. Affect Analysis of Cyber-Bullying Data • The Affect Analysis system used: – ML-Ask: 1. Determines Emotiveness 2. Determines the types of emotions expressed M. Ptaszynski, P. Dybala, R. Rzepka and K. Araki. Affecting Corpora: Experiments with Automatic Affect Annotation System - A Case Study of the 2channel Forum. In Proceedings of The Conference of the Pacific Association for Computational Linguistics 2009 (PACLING-09), pp. 223-228 (2009).
  • 19. Affect Analysis of Cyber-Bullying Data • The Affect Analysis system used: – ML-Ask: 1. Emotiveness: 1. Determine whether utterance is emotive (0/1) 2. Calculate emotive value of an utterance (0-5) 3. Number of emotive utterances in conversation 4. Approximate emotive value for all utterances 5. Determine number of emotiveness features: • Interjections • Exclamations • Vulgarities • Mimetic expressions
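The conversation-level measures on slide 19 (items 3 and 4) amount to simple aggregation over per-utterance scores. The Python sketch below is not ML-Ask's implementation: it assumes the per-utterance emotive values (0-5) are already available, treats any value above 0 as "emotive", and takes the mean as the approximate emotive value. The function name and the sample scores are hypothetical.

```python
def summarize_emotiveness(emotive_values):
    """Aggregate assumed per-utterance emotive values (integers from 0 to 5)."""
    n = len(emotive_values)
    # Assumption: an utterance counts as emotive (flag 1) when its value is > 0.
    emotive_flags = [1 if value > 0 else 0 for value in emotive_values]
    return {
        "emotive_utterances": sum(emotive_flags),
        "mean_emotive_value": sum(emotive_values) / n if n else 0.0,
    }


# Hypothetical scores for a five-utterance conversation.
print(summarize_emotiveness([0, 2, 5, 0, 1]))
```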
  • 20. Affect Analysis of Cyber-Bullying Data • The Affect Analysis system used: – ML-Ask: 2. Determines the types of emotions expressed: One of 10 emotion types said to be the most appropriate for the Japanese language: ki/yorokobi (joy, delight), do/ikari (anger), ai/aware (sadness, gloom), fu/kowagari (fear), chi/haji (shame, shyness), ko/suki (liking, fondness), en/iya (dislike), ko/takaburi (excitement), an/yasuragi (relief) and kyo/odoroki (surprise, amazement) Based on an emotive expression database A. Nakamura. Kanjo hyogen jiten [Dictionary of Emotive Expressions] (in Japanese), Tokyodo Publishing, Tokyo, 1993.
  • 21. Affect Analysis of Cyber-Bullying Data • Results
  • 22. Affect Analysis of Cyber-Bullying Data • Results • Several moderate indications that harmful data is less emotive 1. Emotiveness: 1. Determine whether utterance is emotive 2. Calculate emotive value of an utterance 3. Number of emotive utterances in conversation 4. Approximate emotive value for all utterances 5. Determine number of emotiveness features: • Interjections • Exclamations • Vulgarities • Mimetic expressions • There are two distinctive features: vulgarities and mimetic expressions
  • 23. Affect Analysis of Cyber-Bullying Data • Results 2. Emotion types – More positive emotions in non-harmful data – Slightly more negative emotions in harmful data – Detailed analysis: fondness is often used in irony
  • 24. Conclusions • New problem: Cyber-Bullying • Prototype Machine Learning Method for Cyber-bullying Detection – Results not ideal, but somewhat encouraging • Affect Analysis of Cyber-Bullying Data – Cyber-bullying is less “emotive” (cold irony) – Distinctive features of CB: vulgarities, mimetic expressions – Expressions of emotions considered positive are often used ironically
  • 25. Future Work • New vulgarities are created every day – Create a method for extraction of vulgarities – Find a syntactic model of vulgar expression • Implement in a web crawler that automatically performs online patrol (e.g. for school web sites)
  • 26. Thank you for your attention.