How to test an AI application

The slides (in English, except the title) of our "How to test an AI application?" webinar. The presenters, Mark Sevalnev and Kari Kakkonen, spoke in Finnish and used English slides. The webinar slides are partly from Knowit and partly (with permission) from STA Consulting, the provider of Knowit's AI testing course material.
We gave a quick overview of AI/ML as it stands today, because that understanding is needed to test software that has AI in it. Then we discussed how to test such an AI application, with some examples.
More on this topic in the ISTQB AI Testing course run by Knowit at Tieturi:
https://www.istqb.org/certifications/artificial-inteligence-tester
https://www.tieturi.fi/koulutus/istqb-ai-testing/
https://www.knowit.fi/kurssit-ja-tapahtumat/

  1. Tieturi webinar: How to test an AI application? Kari Kakkonen, Knowit, https://www.linkedin.com/in/karikakkonen. Mark Sevalnev, Knowit, https://www.linkedin.com/in/marksevalnev. Copyright Knowit Solutions Oy 2021
  2. A Nordic powerhouse for digital solutions: 4,000+ professionals; 6 countries (Sweden, Norway, Finland, Denmark, Germany and Poland); 4 business areas (Solutions, Experience, Connectivity and Insight); 468.0 MEUR net sales; 47.5 MEUR adjusted operating profit (EBITA); Nordic ESG champions with a clear vision to accelerate the sustainability agenda.
  3. Kari Kakkonen, Lead Testing Consultant. ROLES: • Knowit Solutions Oy, Director of Training and Competences, Lead Consultant, Trainer and Coach • Children's and testing author at Dragons Out Oy • TMMi, Board of Directors • Treasurer of Finnish Software Testing Board (FiSTB). ACHIEVEMENTS: • Tester of the Year in Finland 2021 • EuroSTAR European Testing Excellence Award 2021 • ISTQB Executive Committee 2015-2021 • Influencing testing since 1996 • Ranked among the 100 most influential IT persons in Finland (Tivi magazine) • Great number of presentations at Finnish and international conferences • TestausOSY/FAST founding member • Co-author of the Agile Testing Foundations book • Regular blogger for Tivi magazine. SERVICES: • ISTQB Advanced, Foundation and Agile Testing • A4Q AI and Software Testing • Knowit Quality Professional • DASA DevOps • Quality & Test process and organization development, Metrics, TMMi and other assessments • Agile testing, Scrum, Kanban, Lean • Leadership • Test automation, Mobile, Cloud, DevOps, AI • Quality, cost, benefits. EDUCATION: • ISTQB Expert Level Test Management & Advanced Full & Agile Tester certified • DASA DevOps, Scrum Master and SAFe certified • TMMi Professional, Assessor, Process Improver certified • SPICE provisional assessor certified • M.Sc. (Eng), Helsinki University of Technology (now Aalto University), Otaniemi, Espoo • Marketing studies, University of Wisconsin-Madison, USA. BUSINESS DOMAINS: • Wide range of business domain knowledge: embedded, industry, public, training, telecommunications, commerce, insurance, banking, pension. MORE INFORMATION: linkedin.com/in/karikakkonen/, twitter.com/kkakkonen, Dragonsout.com. 26.1.2023 © Copyright Knowit Trainings 2022
  4. Mark Sevalnev, Full Stack Developer. #AI/ML #AWS #Java #React. Mark has over 10 years of software development experience in three main areas: AI/ML prototyping, traditional software development, and computer science research. His passion is the fast-growing domain of AI/ML. Mark has worked with NLP, deep learning, classification and speech-to-text systems, and he has co-authored several scientific papers. ROLES: Full Stack Developer, AI Developer, Trainer. TECHNOLOGY: • Java • Python • React • Spring Boot • Node.js • Keras • TensorFlow • DialogFlow • AWS • Azure • Google Cloud • GitLab • Docker. COURSES AND CERTIFICATIONS: • AWS Certified Machine Learning, Specialty • AWS Certified Solutions Architect, Associate. EDUCATION: • M.Sc. (Technology), Theoretical Computer Science (major), Software Systems (minor), Aalto University. TECHNIQUES USED IN EXAMPLE PROJECTS: • Developing a 3D virtual avatar working as a service desk operative: the existing React.js code was modified to match different business requirements, API integrations were implemented with cloud services (AWS), and the login for chatbot conversations was designed. Main tools used: React.js, Dialogflow, AWS, and Google Cloud. • AI prediction algorithm to predict future values in the HR process, such as the time needed for recruitment. Skills needed: data investigation, fetching, cleaning and preparation; designing and implementing the AI/ML algorithm; and deploying the solution to the cloud. Main tools used: Python, Pandas, NumPy, Keras, and scikit-learn. • PoC for Optical Character Recognition (OCR): as AI developer, building ETL for OCR of scanned documents; deploying the solution to AWS Fargate running inside Docker containers; and configuring orchestration of the ETL pipeline using Airflow with the following subtasks: converting documents to grayscale, OCR, content classification with NLP, and storing results in Elasticsearch. MORE INFORMATION: linkedin.com/in/marksevalnev
  5. Agenda: • How does AI differ from ordinary software? • Areas of AI testing in machine learning training • Ways of testing AI
  6. Why right now? Four drivers behind the AI revolution: • Computation growth due to general-purpose GPUs • The rise of big data • Community-based achievements in deep learning • Open-source tools and frameworks. 26.1.2023 © Copyright Knowit Solutions 2020 | Version 2.0
  7. AI applications? Figure: 2019 AI landscape by FirstMark (a snippet), http://mattturck.com/wp-content/uploads/2019/07/2019_Matt_Turck_Big_Data_Landscape_Final_Fullsize.png
  8. AI as a paradigm shift: how is AI different from traditional software development? (Diagram labels: Input, Code, Output, with question marks.) Traditional approach: work focuses on coding rules. Machine learning: work focuses on collecting examples. © Copyright Knowit Oy 2020 | Confidential | Version 1.0
  9. Is AI better? For which set of problems is an AI-based approach superior? Figures: Li/Johnson/Yeung, CS231n, https://cs231n.github.io/classification/
  10. AI is broken? Why does well-trained image recognition fail in production? (Figure labels: Tank Classifier; outputs "not a tank", "tank", "tank", "tank", "tank".)
  11. Metaphor for AI learning: baby or alien?
  12. AI-specific challenges: how can biased data ruin AI performance? Figure: Harvard University, https://sitn.hms.harvard.edu/flash/2020/racial-discrimination-in-face-recognition-technology/ Figures: MIT, https://arxiv.org/abs/1901.10002
  13. Is AI different to test? Is AI a 'black box'? Is AI 'fragile'? Specifics of AI performance. Figure: https://www.researchgate.net/figure/One-pixel-attacks-created-with-the-proposed-algorithm-that-successfully-fooled-three_fig3_320609325
  14. Small detour into AI-related terms: • Features • Value space • Labels • Functions • Function weights • Model • Model training • Training and testing set • Fitting error
  15. What are the features? What are the labels? What is the value space? The slide shows real-world objects (pictures), what we measure from them, and what we feed to the AI model/function. Measured from the first object: x1 = 75 kg, x2 = 172 cm, x3 = blue, y = man. Fed to the model: x1 = 75, x2 = 172, x3 = 3, y = 0. Measured from the second object (an image, as pixel colour triples): x1 = (105,234,41), x2 = (45,24,44), x3 = (15,4,21), …, x307200 = (15,24,71), y = dog. Fed to the model: x1 = (105,234,41), x2 = (45,24,44), x3 = (15,4,21), …, x307200 = (15,24,71), y = 1. Value space: x1 represents a person's weight, so it can potentially take values from 40 kg to 200 kg; x2 represents a person's height, so it can potentially take values from 80 cm to 250 cm.
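The step from "what we measure" to "what we feed to the model" on slide 15 can be sketched in a few lines of Python. This is a minimal illustration, not the presenters' code: the slide only shows that "blue" became 3 and "man" became 0, so the other entries in the code tables, the function name `encode_person` and the range check are assumptions made for the example.

```python
# Hypothetical code tables: the slide only shows blue -> 3 and man -> 0.
COLOUR_CODES = {"brown": 1, "green": 2, "blue": 3}
LABEL_CODES = {"man": 0, "woman": 1}

# Value space from the slide: the range each numeric feature may take.
VALUE_SPACE = {"weight_kg": (40, 200), "height_cm": (80, 250)}


def encode_person(weight_kg, height_cm, colour, label):
    """Turn measured properties into the numeric features and label a model consumes."""
    for name, value in (("weight_kg", weight_kg), ("height_cm", height_cm)):
        low, high = VALUE_SPACE[name]
        if not low <= value <= high:
            # A value outside the value space is a data-quality finding.
            raise ValueError(f"{name}={value} is outside the value space [{low}, {high}]")
    features = [weight_kg, height_cm, COLOUR_CODES[colour]]
    return features, LABEL_CODES[label]


features, label = encode_person(75, 172, "blue", "man")
print(features, label)
```

For a tester, the value space is a ready-made source of boundary and out-of-range test inputs for the data pipeline.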
  16. What is a function? What are the function weights? • A mathematical function is a mapping that takes an input x and outputs y • Examples of functions: G = m*g ⇒ G = f(m); s = s0 + v0*t + ½*a*t^2 ⇒ s = f(t) • Every AI algorithm (neural network, regression line, decision tree) is a mathematical function, i.e. f(x) = y • x is the input representation, i.e. the set of properties (features) that describe the given input • y is the desired class, e.g. y=0 => 'this is a cat image', y=1 => 'this is a dog image' • Weights (parameters) are the 'moving parts' of a function, i.e. numbers that must be fixed (set) by training
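To make slide 16 concrete, here is a small Python sketch under stated assumptions. `position` is the slide's s = f(t) example; `classify` is a hypothetical linear classifier chosen only to show that the same input can map to a different class when the weights change. The weight values are invented for illustration.

```python
def position(t, s0=0.0, v0=0.0, a=9.81):
    """The slide's example s = s0 + v0*t + 1/2*a*t^2, i.e. s = f(t)."""
    return s0 + v0 * t + 0.5 * a * t ** 2


def classify(x, weights, bias):
    """A hypothetical linear model f(x) = y: weighted sum of features, thresholded at 0."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else 0


x = [75, 172, 3]  # the encoded features from the previous slide

# Same input, two different (made-up) settings of the 'moving parts'.
print(classify(x, weights=[0.0, 0.1, 0.0], bias=-17.0))
print(classify(x, weights=[0.0, 0.1, 0.0], bias=-18.0))
```

The code of `classify` never changes between the two calls; only the numbers do. That is why reviewing the source code of an ML component says little about its behaviour.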
  17. What are training and testing sets? What is model training? What is fitting error? (Figure label: error.)
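The three questions on slide 17 can be illustrated with a toy example. The sketch below is an assumption-laden stand-in for the slide's figure: it generates synthetic points around a line, splits them into a training set and a testing set, "trains" a model by ordinary least squares, and reports the fitting error as mean squared error on each set. The data, the 15/5 split and the function names are all invented for illustration.

```python
import random


def fit_line(points):
    """Model training: pick the slope and intercept that minimise squared error."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_squared_error(points, slope, intercept):
    """Fitting error: average squared gap between prediction and true value."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in points) / len(points)


rng = random.Random(0)  # fixed seed so the run is repeatable
data = [(x, 2.0 * x + 1.0 + rng.gauss(0, 0.5)) for x in range(20)]
rng.shuffle(data)
train, test = data[:15], data[15:]  # the model never sees `test` during training

slope, intercept = fit_line(train)
print("training error:", mean_squared_error(train, slope, intercept))
print("testing error: ", mean_squared_error(test, slope, intercept))
```

A testing error much larger than the training error is the usual sign that the model has memorised its examples instead of learning the underlying pattern.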
  18. How does AI, for instance a neural net, learn? https://www.datasciencecentral.com/the-approximation-power-of-neural-networks-with-python-codes/ https://en.wikipedia.org/wiki/Universal_approximation_theorem
  19. Is AI different to test? Is AI a 'black box'? Is AI 'fragile'? Specifics of AI performance. https://playground.tensorflow.org/
  20. Functional & Non-Functional Characteristics. Functional testing: what the system does. Non-functional testing: how the system does it. ISO 25010 Product Quality Model: • Functional Suitability: functional completeness, functional correctness, functional appropriateness • Performance Efficiency: time behaviour, resource utilisation, capacity • Compatibility: co-existence, interoperability • Usability: appropriateness recognizability, learnability, operability, user error protection, user interface aesthetics, accessibility • Reliability: maturity, availability, fault tolerance, recoverability • Security: confidentiality, integrity, non-repudiation, accountability, authenticity • Maintainability: modularity, reusability, analysability, modifiability, testability • Portability: adaptability, installability, replaceability. © STA Consulting
  21. Risks, Objectives and Acceptance Criteria. (Diagram labels: ISO/IEC 25010 Quality Characteristics; AI-Specific Quality Characteristics; Perceived Risks; Test Objectives; Acceptance Criteria.) The most important (highest-risk) system characteristics are used to generate test objectives and acceptance criteria for AI-based systems, including: • flexibility, adaptability and evolution • autonomy • probabilistic and non-deterministic systems • side-effects and reward hacking • ethics and safety • inappropriate bias • transparency, interpretability and explainability. © STA Consulting
  22. ML Workflow with Explicit Test Activities. (Diagram.) Train/test pipeline: Understand the Objectives; Framework & Algorithm Selection (Select a Framework, Select & Build the Algorithm); Data Preparation & Test (Prepare Data, Input Data Testing); Model Generation & Test (Model Generation, ML Model Testing). Production pipeline: Deploy the Model; Use, Monitor & Tune the Model (Use the Model, Monitor & Tune the Model). Items passed between activities: model objectives, framework & algorithm, tested model, deployed model, feedback. © STA Consulting
  23. Input Data Testing. Consists of: Pipeline Testing; Data Testing. Objective: • ensure that the data used by the system (for training and prediction) is of the highest quality. © STA Consulting
  24. ML Model Testing. Consists of: Dynamic Testing; Static Testing. Objective: • ensure that the generated model meets any functional and non-functional acceptance criteria. © STA Consulting
  25. ML Workflow with Life Cycles. (V-model diagram: Requirements Analysis and Acceptance Testing; Architectural Design and System Testing; Detailed Design and Component Integration Testing; Code and Component Testing. Other labels: requirements for overall system & non-AI components; requirements for ML model; ML model & operational pipeline; non-AI components.) The V-model is used as an example only: AI-based systems can be built using any life cycle, but test levels tend to remain the same. © STA Consulting
  26. AI-Specific Testing Issues. Several characteristics make the testing of AI-based systems especially challenging, such as: • self-learning systems • autonomy and autonomous systems • probabilistic and non-deterministic systems • complexity • automation bias • test data • concept drift • inappropriate bias • transparency, interpretability and explainability. © STA Consulting
  27. Example: Testing for Inappropriate Bias. Approaches: dynamic, black-box; static, white-box. Testing for algorithmic bias: • can involve analysis during model training, evaluation and tuning. Testing for sample bias: • can involve reviewing the source of data and the acquisition process • can involve reviewing the data pre-processing activities. Testing for inappropriate bias: • can be difficult because ML algorithms can use combinations of seemingly unrelated features to infer results (which are biased) • testing with an independent dataset can often detect bias • can involve measuring how changes in inputs affect system outputs for specific groups, similar to explainability testing • may be carried out in a production environment, or as part of testing prior to release. © STA Consulting
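Two of the dynamic checks on slide 27 can be sketched in plain Python: comparing outcomes per group on an independent dataset, and measuring whether changing only the group attribute changes the output. Everything here is hypothetical. `biased_model` is a deliberately unfair stand-in for a system under test, and the field names `group` and `score` and the tiny dataset are invented for the example.

```python
def biased_model(applicant):
    """Hypothetical system under test: it (inappropriately) uses the group attribute."""
    threshold = 50 if applicant["group"] == "A" else 60
    return 1 if applicant["score"] >= threshold else 0


def positive_rate_by_group(model, dataset):
    """Share of positive decisions per group on an independent dataset."""
    counts = {}
    for row in dataset:
        total, positives = counts.get(row["group"], (0, 0))
        counts[row["group"]] = (total + 1, positives + model(row))
    return {group: positives / total for group, (total, positives) in counts.items()}


def flip_changes_output(model, row, groups=("A", "B")):
    """True if changing only the group attribute changes the model's decision."""
    outputs = {model({**row, "group": group}) for group in groups}
    return len(outputs) > 1


# Independent dataset: identical score distribution in both groups.
dataset = [{"group": g, "score": s} for g in ("A", "B") for s in (45, 55, 65)]

rates = positive_rate_by_group(biased_model, dataset)
print(rates)
print(flip_changes_output(biased_model, {"group": "A", "score": 55}))
```

A real model rarely reads the group attribute this directly. As the slide notes, it may infer it from combinations of other features, so the flip test works best alongside the per-group comparison.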
  28. Test Methods and Techniques: • Adversarial Attacks and Data Poisoning • Pairwise Testing • Back-to-Back Testing • A/B Testing • Metamorphic Testing • Experience-Based Testing • Selecting Test Techniques for AI-Based Systems. © STA Consulting
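Of the techniques listed on slide 28, metamorphic testing is the easiest to show in a few lines, because it needs no expected output: it only checks that a known relation holds between two runs. The sketch below is an illustration, not an example from the webinar. It uses a toy 1-nearest-neighbour classifier and the relation "reordering the training examples must not change any prediction". The data and function names are assumptions.

```python
import random


def nearest_neighbour_predict(training_set, point):
    """Toy classifier: return the label of the closest training example."""
    def squared_distance(example):
        features, _label = example
        return sum((a - b) ** 2 for a, b in zip(features, point))

    _features, label = min(training_set, key=squared_distance)
    return label


def permutation_relation_holds(training_set, test_points, seed=0):
    """Metamorphic relation: shuffling the training data must not change predictions."""
    shuffled = list(training_set)
    random.Random(seed).shuffle(shuffled)
    return all(
        nearest_neighbour_predict(training_set, p) == nearest_neighbour_predict(shuffled, p)
        for p in test_points
    )


training_set = [((0, 0), "cat"), ((0, 1), "cat"), ((5, 5), "dog"), ((6, 5), "dog")]
test_points = [(1, 0), (5, 6), (2, 2)]  # chosen so that no distances tie

print(permutation_relation_holds(training_set, test_points))
```

The test points are picked to avoid distance ties, since a tie would make the result depend on example order and break the relation for a reason unrelated to any defect.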
  29. Example: Risks → Test Approaches. (Diagram labels: AI-Based System, with AI Components and Non-AI Components; Risk Analysis; Test Approach, with Specialized Testing and Conventional Testing.) © STA Consulting
  30. Test Environments for AI-Based Systems. • The test environments for AI-based systems have much in common with those for conventional systems, typically: the development environment at unit level, and a production-like test environment at system and acceptance levels • ML models, when tested in isolation, are typically tested within their development framework. © STA Consulting
  31. Where to learn more? • https://www.istqb.org/certifications/artificial-inteligence-tester • https://www.tieturi.fi/koulutus/istqb-ai-testing/ • 13-16.3.2023
  32. Thank you! kari.kakkonen@knowit.fi, https://linkedin.com/in/karikakkonen/; mark.sevalnev@knowit.fi, https://www.linkedin.com/in/marksevalnev
