This document is a project report on emotion speech detection (speech emotion recognition). It summarizes the goals of speech emotion recognition, introduces the Python libraries used, including librosa and JupyterLab, describes the hardware and software requirements and the RAVDESS dataset, and includes screenshots of the data exploration and model training process. The report concludes that an MLPClassifier model achieved 82.50% accuracy on recognizing emotions from speech.
1. 6th SEM PROJECT REPORT
On
Emotion Speech Detection
(CSE VI SEM MINI PROJECT)
2021-22
Submitted By:
Gourang Nagar
Roll no: 2015298
DS & AI
2. 1. Introduction
• What is Speech Emotion Recognition?
Speech Emotion Recognition, abbreviated as SER, is the task of recognizing human
emotion and affective states from speech. It capitalizes on the fact that the voice often
reflects underlying emotion through tone and pitch; this is also how animals such as
dogs and horses are able to understand human emotion.
• What is librosa?
Librosa is a Python library for analyzing audio and music. It offers a flat package layout,
standardized interfaces and names, backwards compatibility, modular functions, and readable
code. In this Python mini-project, it (and a few other packages) can be installed with pip.
• What is JupyterLab?
JupyterLab is an open-source, web-based UI for Project Jupyter. It has all the basic
functionality of the Jupyter Notebook, such as notebooks, terminals, text editors, file browsers,
and rich outputs, and it also provides improved support for third-party extensions.
3. SOFTWARE REQUIREMENTS
• Language : Python 3.8
• Operating system : Windows 10
HARDWARE REQUIREMENTS
• Processor : Intel Core i3, 7th Gen
• RAM : 128 MB
• Hard Disk : 20 GB
• Processor Speed : 300 MHz (minimum)
4. 4. DATASET
RAVDESS
The Ryerson Audio-Visual Database of
Emotional Speech and Song, which contains
recordings of 24 actors (12 male, 12 female)
vocalizing two lexically matched statements
in a neutral North American accent.
https://www.kaggle.com/datasets/uwrfkaggler/ravdess-emotional-speech-audio
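A detail worth noting about RAVDESS: each file name encodes metadata in seven two-digit fields, and the third field is the emotion code. The sketch below, assuming that standard naming convention (the helper name `emotion_from_filename` is ours, not from the report), shows how a label can be recovered from a file name with nothing more than the os module.

```python
import os

# RAVDESS emotion codes, taken from the dataset's naming convention.
EMOTIONS = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}

def emotion_from_filename(path):
    """Return the emotion label encoded in a RAVDESS file name.

    The third dash-separated field of the base name is the emotion code,
    e.g. "03-01-06-01-02-01-12.wav" -> "06" -> "fearful".
    """
    name = os.path.basename(path)
    code = name.split("-")[2]
    return EMOTIONS[code]

print(emotion_from_filename("03-01-06-01-02-01-12.wav"))  # fearful
```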
5. 5. Libraries used:
pandas - used to perform data manipulation and analysis
numpy - used to perform a wide variety of mathematical operations on arrays
matplotlib - used for data visualization and graphical plotting
os - used to handle files using system commands
seaborn - built on top of matplotlib, with similar plotting functionality
librosa - used to analyze sound files
librosa.display - used to display sound data as images
IPython.display.Audio - used to display and play audio in the notebook
warnings - used to control warning messages
10. 9. FINAL MODEL SCORE
In this Python mini project, we learned to recognize emotions
from speech.
We used an MLPClassifier for this, together with the soundfile
library to read the sound files and the librosa library to extract
features from them.
The model delivered an accuracy of 82.50%, which is good
enough for the scope of this project.
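The training step described above can be sketched as follows. This is a minimal illustration, not the report's actual pipeline: it uses a synthetic feature matrix in place of real MFCC features, and the hidden-layer size, iteration count, and random seeds are our own assumed values.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Stand-in feature matrix: 200 samples of 40 "MFCC-like" features forming
# two well-separated classes, in place of features extracted with librosa.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 40)), rng.normal(3, 1, (100, 40))])
y = np.array([0] * 100 + [1] * 100)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=9)

# A multi-layer perceptron with one hidden layer, in the spirit of the
# MLPClassifier mentioned in the report.
model = MLPClassifier(hidden_layer_sizes=(300,), max_iter=500, random_state=9)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Accuracy: {accuracy * 100:.2f}%")
```

With real RAVDESS features, `X` would hold one feature vector per audio file and `y` the emotion labels parsed from the file names.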