How machines learn to talk. Machine Learning for Conversational AI

Inaugural Lecture
By Professor Verena Rieser

  1. 1. How machines learn to talk. Machine Learning for Conversational AI Inaugural Lecture By Professor Verena Rieser
  2. 2. Historical Notes: Wolfgang von Kempelen’s speaking machine (1791); Joseph Faber’s Marvelous Talking Machine (1840)
  3. 3. Today’s Conversational Agents
  4. 4. Source: MIC Jan 2015. Market forecasts
  5. 5. The (voice) bots are coming… “Bots are the new apps” because they “fundamentally revolutionize how computing is experienced by everybody.” Microsoft’s CEO Nadella
  6. 6. Machine Learning for Conversational AI Systems
  7. 7. Machine Learning for Conversational AI Systems: Can we use machine learning for customer-facing applications? Which machine learning methods are suitable? Will future machines speak neuralese? What do we learn when learning from “big data”?
  8. 8. Machine Learning for Conversational AI • Task-driven Statistical Dialogue Systems – Reinforcement Learning – Results from the E2E Generation Challenge • Social Chatbots – Seq2Seq models – Amazon Alexa Challenge • Future challenges – Evaluation – Data – Ethics
  9. 9. Spoken Dialogue System Architecture: e.g. Rieser & Lemon, Comp. Ling. 2011, ACL’10, ’08, ’06; e.g. Rieser et al., ACL’05, ’09, ’10, ’16, EMNLP’12, ’15, ’17, EACL’09, ’14; e.g. Boidin & Rieser, Interspeech’09
  10. 10. Rule-based approaches: V. Rieser (MA thesis 2004): Hermine, the talking washing machine.* * Exhibited at CeBIT 2003.
  11. 11. Reinforcement Learning: $Q^{\pi}(s,a) = \sum_{s'} T^{a}_{ss'} \left[ R^{a}_{ss'} + \gamma V^{\pi}(s') \right]$; Bellman optimality equation (1952), see [Sutton and Barto, 1998]. V. Rieser (PhD thesis 2008): Bootstrapping Reinforcement Learning-based Dialogue Strategies.* * Winner of the Eduard-Martin Prize for outstanding research
  12. 12. Drawbacks of RL for dialogue • Requires many training episodes. – Simulated users [Rieser & Lemon, 2006] • Manual specification of learning problem. – What is a good reward function/ state space representation? [Rieser & Lemon, 2008] • System outputs are usually hand-crafted. – Mismatch between “what to say” and “how to say it” [Rieser & Lemon, 2009]
  13. 13. End-to-End Response Generation: input-output mapping • Learn from “raw” dialogue data (e.g. movie subtitles). • No semantic or pragmatic annotation required.
  14. 14. Sequence-to-Sequence models e.g. Shang et al., 2015; Vinyals & Le, 2015; Sordoni et al., 2015 Image from farizrahman4u/seq2seq
  15. 15. The E2E Data Set (50k data). Meaning representation: name[Loch Fyne], eatType[restaurant], food[Japanese], price[cheap], kid-friendly[yes]. Example texts: “Serving low cost Japanese style cuisine, Loch Fyne caters for everyone, including families with small children.” “Loch Fyne is a child friendly restaurant serving cheap Japanese food.” J. Novikova, O. Dusek and V. Rieser. The E2E Dataset: New Challenges For End-to-End Generation. 18th Annual SIGdial Meeting on Discourse and Dialogue (SIGDIAL 2017).* * Nominated for best paper award!
  16. 16. The E2E NLG Challenge 2017 • Submissions: 62 systems with diverse system architectures by 17 institutions from 11 countries, with about 1/3 of these submissions coming from industry. http://www.macs.hw.ac.uk/InteractionLab/E2E/
  17. 17. The E2E NLG Challenge 2017. Seq2Seq models vs. hand-engineered systems: + Natural sounding. - Complexity, length, diversity. - Miss out on information. - Overall quality ratings by users. • Neural NLG systems tend to settle for the most frequent options, thus penalising length and favouring high-frequency word sequences.
  18. 18. Machine Learning for Conversational AI • Task-driven Statistical Dialogue Systems – Reinforcement Learning – Results from the E2E Generation Challenge • Social Chatbots – Seq2Seq models – Amazon Alexa Challenge • Future challenges – Evaluation – Data – Ethics
  19. 19. The Amazon Alexa Prize 2016-2018
  20. 20. Competitors
  21. 21. AI vs. AI: Cleverbot (Carpenter 2011)
  22. 22. Neural models for Alexa? • BIG training data. – Reddit, Twitter, Movie Subtitles, Daytime TV transcripts… • Results:
  23. 23. Is big data good data? “I can sleep with as many people as I want to” (Reddit) “You will die” (Movies) “Shall I kill myself?” “Yes” (Twitter) “Shall I sell my stocks and shares?” “Sell, sell, sell” (Twitter)
  24. 24. Alana Architecture. Input: user utterance, social signals, current plan, state of the world, dialogue history. Bot Ensemble (chatbots), with example outputs: Persona: “What’s your favourite food?” “I love bytes.” News: “Here is what happened to Donald Trump. (news)” Facts: “Did you know that one day Mars will have a ring.” Wiki: “Leonard Cohen’s latest album is called ‘You Want It Darker’.” … Neural Ranker over the bots (Persona, News, Facts, Wiki, …). Multimodal output: • Speech • Actions • Gestures
  25. 25. Avg duration: 2.30 mins; 10% of calls over 10 mins; avg: 14.4 turns. Alexa developers
  26. 26.
  27. 27. Finalists vs rest. 3 finalists: • Heriot-Watt University • University of Washington • Czech Technical University
  28. 28. Final Leaderboard. Approx. 6000 conversations in final week
  29. 29. Las Vegas final • 2 conversations x 3 testers = 6 conversations • Rated by external judges • Prague: “I want to talk about baseball” • UW: “I want to talk about basketball” • HWU: “I want to talk about ….. …” Hi lie
  30. 30. (Amazon’s speech recogniser couldn’t recognise this….)
  31. 31. Alana: text chat
  32. 32. Lessons learnt • Evaluation standards define the game. • Learning from big data is only valid in restricted contexts. • Getting LOTS of real customer data is worth it! – over 360k rated customer interactions
  33. 33. Leaderboard 2018-05-15 For updates follow @alanathebot
  34. 34. Machine Learning for Conversational AI • Task-driven Statistical Dialogue Systems – Reinforcement Learning – Results from the E2E Generation Challenge • Social Chatbots – Seq2Seq models – Amazon Alexa Challenge • Future challenges – Evaluation – Data – Ethics
  35. 35. Disclaimer The following part of this talk contains examples which some listeners might find disturbing.
  36. 36. Ethical Issues with Conversational AI • Learning from biased data. • Sexual abuse and bullying by the user.
  37. 37. Learning from biased data
  38. 38. Learning from biased data
  39. 39. Learning from biased data
  40. 40. Pitfalls of learning from data XXXXX
  41. 41. A Neural Conversational Model http://neuralconvo.huggingface.co/ A re-implementation of: Oriol Vinyals and Quoc V. Le (2015). A Neural Conversational Model. ICML Deep Learning Workshop. d* f*
  42. 42. Ethical Conversational AI Systems Does learning from data introduce biases?
  43. 43. Ethical Issues with Conversational AI • Learning from biased data. • Sexual abuse and bullying by the user. 4% of customer conversations with our Alexa bot contain sexual harassment!
  44. 44. Ethical Responsibilities
  45. 45. How do current systems behave when faced with abuse? What are good mitigation strategies? Ethical Conversational AI Systems Does learning from data introduce biases?
  46. 46. • Approx. 4% of customer interactions in our corpus! • Fall into 4 categories as defined by the Linguistic Society of America: “Are you gay?” (Gender and Sexuality) “I love watching porn.” (Sexualised Comments) “You stupid b***.” (Sexualised Insults) “Will you have sex with me.” (Sexual Requests)
  47. 47. We insulted a lot of bots… • Commercial: – Amazon Alexa, Apple Siri, Google Home, Microsoft's Cortana. • Rule-based: – E.L.I.Z.A., Parry, A.L.I.C.E., Alley • Data-driven: – Cleverbot, NeuralConvo, Information Retrieval (Ritter et al. 2010) – “clean” in-house seq2seq model • Negative Baseline: 6 Adult-only bots.
  48. 48. How do different systems react? System types: Commercial; Data-driven; Adult-only. Observed reactions: Flirtatious; Chastising, Retaliation; Non-sense; Flirtatious; Swearing back; Avoiding to answer. Amanda Cercas Curry and Verena Rieser. How Ethical are Conversational Systems? Insights from the #MeTooAlexa Corpus on Sexual Harassment. 27th International Conference on Computational Linguistics (COLING), Santa Fe, New Mexico, USA.
  49. 49. Bias in the data? • Trained a seq2seq model on “clean” data. • Still encouraging/flirting back. User: “I love watching porn.” System: “What shows do you prefer?”
  50. 50. How do current systems behave when faced with abuse? What are good mitigation strategies? Ethical Conversational AI Systems Does learning from data introduce biases?
  51. 51. Conclusion • Machine Learning methods for Conversational AI • Neural methods for task-based systems produce natural, but often incorrect output. • Neural methods for open-domain systems are hard to control. • How should a system deal with edge cases, such as abuse?
  52. 52. Big thanks to my amazing team! Dr. Ondrej Dusek Dr. Simon Keizer Dr. Xingkun Liu Dr. Jekaterina Novikova Shubham Agarwal (PhD candidate) Amanda Cercas Curry (PhD candidate) Karin Sevegnani (PhD candidate) Xinnuo Xu (PhD candidate)
  53. 53. … my sponsors
  54. 54. … And my amazing husband! Prof. Oliver Lemon, “Dr.” Kati
  55. 55. Key References • Amanda Cercas Curry and Verena Rieser. #MeToo Alexa: How Conversational Systems Respond to Sexual Harassment. Second Workshop on Ethics in NLP, NAACL, 2018. • Ioannis Papaioannou, Amanda Cercas Curry, Jose L. Part, Igor Shalyminov, Xinnuo Xu, Yanchao Yu, Ondrej Dušek, Verena Rieser, Oliver Lemon. An Ensemble Model with Ranking for Social Dialogue. NIPS Workshop on Conversational AI, 2017. * Finalist in Amazon Alexa Challenge. • Jekaterina Novikova, Ondrej Dušek and Verena Rieser. The E2E Dataset: New Challenges For End-to-End Generation. 18th Annual SIGdial Meeting on Discourse and Dialogue (SIGDIAL), 2017. * Nominated for best paper. • Dimitra Gkatzia, Oliver Lemon and Verena Rieser. Natural Language Generation enhances human decision-making with uncertain information. Annual Meeting of the Association for Computational Linguistics (ACL), 2016. • Eshrag Refaee and Verena Rieser. A Hybrid Approach for Determining Sentiment Intensity of Arabic Twitter Phrases. 10th International Workshop on Semantic Evaluation (SemEval), 2016. * Winner of SemEval'16 challenge task 7. • Verena Rieser, Oliver Lemon and Simon Keizer. Natural Language Generation as Incremental Planning Under Uncertainty: Adaptive Information Presentation for Statistical Dialogue Systems. IEEE/ACM Transactions on Audio, Speech and Language Processing, Volume 22, Issue 5, 2014. • Verena Rieser and Oliver Lemon. Reinforcement Learning for Adaptive Dialogue Systems: A Data-driven Methodology for Dialogue Management and Natural Language Generation. Book Series: Theory and Applications of Natural Language Processing, Springer, 2011. * >7,500 downloads
  56. 56. Want to know more? • Study on our MSc on Conversational AI! • 2-year Conversion Course in AI – No prior knowledge in programming required! • 12 funded DataLab scholarships available. – Deadline: 31 May 2018 • Contact: MACSpgenquiries@hw.ac.uk
  57. 57. Get in touch! v.t.rieser@hw.ac.uk @verena_rieser https://www.linkedin.com/in/verena-rieser-3590b86/ https://sites.google.com/view/nlplab/
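
Note on slide 11 (Reinforcement Learning): the equation there says that the value of taking action a in state s is the expected immediate reward plus the discounted value of the successor state, weighted by the transition probabilities. A minimal Python sketch of that single computation follows. The two-state "dialogue" MDP, its state and action names, and all probabilities, rewards and values are hypothetical and chosen only for illustration; they are not taken from the lecture.

```python
from typing import Dict, Tuple

State = str
Action = str

def q_value(
    state: State,
    action: Action,
    transitions: Dict[Tuple[State, Action], Dict[State, float]],
    rewards: Dict[Tuple[State, Action, State], float],
    values: Dict[State, float],
    gamma: float,
) -> float:
    """Q(s, a) = sum over s' of T(s, a, s') * (R(s, a, s') + gamma * V(s'))."""
    return sum(
        prob * (rewards[(state, action, nxt)] + gamma * values[nxt])
        for nxt, prob in transitions[(state, action)].items()
    )

# Hypothetical toy problem: the system confirms a slot value; the user
# either accepts (slot filled) or the system has to ask again.
transitions = {
    ("ask_slot", "confirm"): {"slot_filled": 0.8, "ask_slot": 0.2},
}
rewards = {
    ("ask_slot", "confirm", "slot_filled"): 10.0,  # task progress
    ("ask_slot", "confirm", "ask_slot"): -1.0,     # cost of an extra turn
}
values = {"slot_filled": 0.0, "ask_slot": 5.0}     # assumed state values

q = q_value("ask_slot", "confirm", transitions, rewards, values, gamma=0.9)
print(f"Q(ask_slot, confirm) = {q:.2f}")
```

In a full reinforcement learner the state values would themselves be estimated iteratively rather than fixed by hand as they are here.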

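Note on slide 15 (The E2E Data Set): each input there is a flat list of slot[value] pairs paired with human-written texts. As a rough sketch, such an input string could be turned into a dictionary as shown below. The function name `parse_mr` and the regular expression are assumptions made for this illustration, not part of the E2E challenge tooling.

```python
import re
from typing import Dict

# A slot name (letters, digits, underscore, hyphen), optional whitespace,
# then the value in square brackets.
SLOT_PATTERN = re.compile(r"([\w-]+)\s*\[([^\]]*)\]")

def parse_mr(mr: str) -> Dict[str, str]:
    """Parse 'slot[value], slot[value], ...' into a slot -> value mapping."""
    return {slot: value.strip() for slot, value in SLOT_PATTERN.findall(mr)}

example_mr = (
    "name [Loch Fyne], eatType[restaurant], food[Japanese], "
    "price[cheap], kid-friendly[yes]"
)
parsed = parse_mr(example_mr)
for slot, value in parsed.items():
    print(f"{slot}: {value}")
```

A structured form like this makes it straightforward to check whether a generated text mentions every slot, which is the kind of missing information noted on slide 17.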