[DSC Europe 22] Make some noise for AI in JavaScript - Sead Delalic


Noise suppression during audio calls is expected of any real-time communication platform. Good denoising relies on extensive use of modern neural networks. AI implies heavy processing, real time requires speed, and a large number of calls means a high demand for resources. The questions are therefore: can clients do the necessary processing, and can AI solutions be integrated on the client side? We present a neural network for noise suppression implemented as part of the Infobip WebRTC platform, describe a generic way of integrating AI solutions with client-side JavaScript with a special focus on real-time requirements, and present the final solution, based on RNNoise.

  1. Make some noise for AI in JavaScript Sead Delalić, PhD University of Sarajevo & Infobip
  2. Agenda • About us • Problem description • Proposed AI solution • WebRTC solution • Examples & Conclusion
  3. Infobip
  4. Sead Delalić • PhD in Computer Science • University of Sarajevo • Optimization, Data mining, AI • Senior Teaching Assistant at Faculty of Science • AI Consultant at Infobip • 5+ years of experience in AI industry • 25 research/scientific papers
  5. Problem description
  6. Remove noise from audio calls.
  7. Remove noise from audio calls. Don't remove anything useful.
  8. Questions Which signal is noise? Types of noise? Which signal is useful? How much time do we have for processing?
  9. Answers • Interested in ‣ hearing the voice of the person speaking.
  10. Answers • Interested in ‣ hearing the voice of the person speaking. • We don’t want ‣ static noise; ‣ air conditioner sound and similar noises; ‣ babble noise; ‣ typing sounds.
  11. Answers • Interested in ‣ hearing the voice of the person speaking. • We don’t want ‣ static noise; ‣ air conditioner sound and similar noises; ‣ babble noise; ‣ typing sounds. • Everything should be in real-time, while the call is ongoing. • The total delay must be below 200 ms, ideally below 100 ms.
  12. WebRTC Calls (one-on-one) Conferences
  13. DENOISING
  14. Solution
  15. Preliminary research Standard algorithms Filters Artificial intelligence Recurrent networks & Autoencoders
  16. Noise Suppression with ANNs
  17. Process Create dataset Implement preprocessing Model neural network Train neural network Use trained network
  18. Dataset • Synthesis: Clean speech + Noise = Noisy Speech • MS-SNSD and other datasets • Network Input: Noisy Speech • Network Output: Clean Speech
  19. Preprocessing • Preprocessing depends on the chosen approach. • A Hybrid DSP/Deep Learning Approach to Real-Time Full-Band Speech Enhancement ‣ Jean-Marc Valin ‣ Proceedings of IEEE Multimedia Signal Processing (MMSP) Workshop, arXiv:1709.08243, 2018. ‣ https://arxiv.org/abs/1709.08243
  20. Preprocessing 1. Audio signal divided into chunks 2. Fast Fourier Transform (FFT) 3. Mel Filter 4. Mel-frequency Cepstrum (MFCC) 5. Calculate deltas
  21. FFT
  22. How Fourier did it
  23. How we did it
  24. Mel Filter Bank
  25. MFCC
  26. RNNoise
  27. Training
  28. WebRTC implementation
  29. Web Audio API • Controlling audio on the Web ‣ Choose sources ‣ Add effects ‣ Create visualizations ‣ etc. • Audio nodes to create routing graph • Custom processor nodes
  30. Flow • Stream • Audio Worklet • Main Thread • Worker Thread
  31. Audio Worklet Main Thread Audio Worker Audio stream Input Buffer & Windowing Noisy Window Noisy Window Preprocessing & RNNoise & Postprocessing Noise-Suppressed Window Noise-Suppressed Window Output Buffer & Reproduction
  32. NumJs & TensorFlow.js • NumJs ‣ Scientific computing with JS ‣ FFT & matrix operations • TensorFlow.js ‣ Hardware-accelerated JS library for DL ‣ Convert Keras format to TFJS & run model in JS
  33. Examples
  34. Example – Original audio
  35. Example – Original audio
  36. Version 1.0
  37. Version 1.0
  38. Version 1.1
  39. Version 1.1
  40. Version 1.n
  41. Version 1.n
  42. Once again
  43. Once again
  44. More WebRTC examples
  45. More WebRTC examples
  46. Conclusion
  47. Conclusion • A hot topic in the WebRTC world • A solution for real-time usage ‣ Different types of noises and generalization ‣ Speech enhancement • Great room for improvement
  48. Make some noise for AI in JavaScript Sead Delalić, PhD delalic.sead@infobip.com | delalic.sead@pmf.unsa.ba
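
The dataset slide (slide 18) describes synthesis as "Clean speech + Noise = Noisy Speech". The slides do not say how the two signals are weighted, so the following is only a minimal sketch of one common choice: scaling the noise so the mixture reaches a requested signal-to-noise ratio. The function names `meanPower` and `mixAtSnr` and the `snrDb` parameter are hypothetical and do not come from the talk.

```typescript
// Illustrative sketch only: mix clean speech with noise at a target SNR (dB).
// Names and the SNR-based scaling are assumptions, not the talk's method.

function meanPower(x: Float32Array): number {
  let sum = 0;
  for (let i = 0; i < x.length; i++) sum += x[i] * x[i];
  return x.length > 0 ? sum / x.length : 0;
}

function mixAtSnr(
  clean: Float32Array,
  noise: Float32Array,
  snrDb: number
): Float32Array {
  if (noise.length === 0) throw new Error("noise must not be empty");
  const pClean = meanPower(clean);
  const pNoise = meanPower(noise);
  if (pNoise === 0) throw new Error("noise must not be silent");

  // SNR_dB = 10 * log10(pClean / (scale^2 * pNoise)), solved for scale.
  const scale = Math.sqrt(pClean / (pNoise * Math.pow(10, snrDb / 10)));

  const noisy = new Float32Array(clean.length);
  for (let i = 0; i < clean.length; i++) {
    // The noise clip is looped if it is shorter than the speech clip.
    // The requested SNR is exact only when the used noise has the same
    // mean power as the whole clip (for example, equal lengths).
    noisy[i] = clean[i] + scale * noise[i % noise.length];
  }
  return noisy;
}
```

In a training setup of this kind, `noisy` would be the network input and `clean` the target output, matching the input/output pairing on the slide.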

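The flow on slide 31 starts with "Input Buffer & Windowing": the audio worklet receives small blocks and has to assemble them into overlapping windows before the network can run. The sketch below shows one way such a buffer could work. It is plain TypeScript with no browser dependency; in a browser, `push` would be called from `AudioWorkletProcessor.process()`, which currently receives 128 samples per call. The class name `FrameAccumulator` and its parameters are hypothetical.

```typescript
// Illustrative sketch only: collect fixed-size render blocks into
// overlapping analysis windows. Not the talk's actual implementation.
class FrameAccumulator {
  private readonly buffer: Float32Array;
  private readonly frameSize: number;
  private readonly hopSize: number;
  private filled = 0;

  constructor(frameSize: number, hopSize: number) {
    if (hopSize <= 0 || hopSize > frameSize) {
      throw new Error("0 < hopSize <= frameSize required");
    }
    this.frameSize = frameSize;
    this.hopSize = hopSize;
    this.buffer = new Float32Array(frameSize);
  }

  // Push one block; returns every full window that became available.
  push(block: Float32Array): Float32Array[] {
    const frames: Float32Array[] = [];
    for (let i = 0; i < block.length; i++) {
      this.buffer[this.filled++] = block[i];
      if (this.filled === this.frameSize) {
        frames.push(this.buffer.slice()); // copy, so callers own the window
        // Keep the overlapping tail and drop the oldest hopSize samples.
        this.buffer.copyWithin(0, this.hopSize);
        this.filled = this.frameSize - this.hopSize;
      }
    }
    return frames;
  }
}
```

Assuming a 48 kHz stream (the slides do not state the sample rate), 40 ms windows with a 20 ms shift would correspond to `new FrameAccumulator(1920, 960)`, so a new window becomes available every 7.5 blocks of 128 samples. Buffering of this kind adds to the total delay, which the slides require to stay below 200 ms.
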
Editor's notes


  • We take the signal and split it into 40 ms chunks, advancing by 20 ms each time; then FFT, mel, MFCC, and deltas (looking at the previous frames), then delta-deltas; we obtain the gains and multiply the mel values by the gains.
  • ResetAfter, compatibility, and related issues
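
The first note above mentions deltas computed against previous frames, delta-deltas, and gains multiplied with the mel values, but gives no formulas. The sketch below assumes the simplest causal form, a first-order difference against the previous frame, and assumes the gains are per-band values limited to the range 0 to 1. The names `causalDeltas`, `buildFeatures`, and `applyBandGains` are hypothetical.

```typescript
// Illustrative sketch only: delta features and per-band gain application.
// The difference formula and the [0, 1] gain range are assumptions.
type Vec = number[];

// First-order causal difference: d[t] = x[t] - x[t-1], with d[0] = 0.
function causalDeltas(frames: Vec[]): Vec[] {
  return frames.map((cur, t) =>
    t === 0 ? cur.map(() => 0) : cur.map((v, k) => v - frames[t - 1][k])
  );
}

// Concatenate each frame's coefficients with its deltas and delta-deltas.
function buildFeatures(mfcc: Vec[]): Vec[] {
  const d = causalDeltas(mfcc);
  const dd = causalDeltas(d);
  return mfcc.map((c, t) => [...c, ...d[t], ...dd[t]]);
}

// Scale each mel band by its gain, clamping the gain to [0, 1].
function applyBandGains(melBands: Vec, gains: Vec): Vec {
  if (melBands.length !== gains.length) throw new Error("length mismatch");
  return melBands.map((e, k) => e * Math.min(1, Math.max(0, gains[k])));
}
```

Because the differences only look backwards, features for a frame are available as soon as that frame is, which suits the real-time constraint; a centered difference would need future frames and add delay.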
