TY - GEN
T1 - Deep Learning Based Automated Lip Reading for Deaf
AU - Prathyakshini,
AU - Prathwini,
AU - Pratheeksha Hegde, N.
AU - Vaishali,
AU - Rashmi, N.
AU - Kumar, Archana Praveen
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Speech recognition systems play an integral role in numerous applications, from virtual assistants to accessibility tools. This paper offers a new perspective on speech recognition utilizing computer vision and deep learning techniques. The proposed system is trained on a sizable dataset comprising 700 video clips of individuals uttering predefined words, adding up to approximately 3 GB of data. Leveraging TensorFlow and Keras, a model architecture is designed incorporating convolutional and dense layers. The training process yielded promising results, with a training accuracy of 95.7% and a validation accuracy of 98.5%, indicative of robust classification performance. The integration of computer vision enriches the system's ability to extract meaningful features from audio-visual inputs, enhancing its overall recognition accuracy. The proposed method demonstrates significant potential for real-time speech recognition applications in areas such as human-computer interaction and assistive technologies.
AB - Speech recognition systems play an integral role in numerous applications, from virtual assistants to accessibility tools. This paper offers a new perspective on speech recognition utilizing computer vision and deep learning techniques. The proposed system is trained on a sizable dataset comprising 700 video clips of individuals uttering predefined words, adding up to approximately 3 GB of data. Leveraging TensorFlow and Keras, a model architecture is designed incorporating convolutional and dense layers. The training process yielded promising results, with a training accuracy of 95.7% and a validation accuracy of 98.5%, indicative of robust classification performance. The integration of computer vision enriches the system's ability to extract meaningful features from audio-visual inputs, enhancing its overall recognition accuracy. The proposed method demonstrates significant potential for real-time speech recognition applications in areas such as human-computer interaction and assistive technologies.
UR - https://www.scopus.com/pages/publications/85215009614
UR - https://www.scopus.com/pages/publications/85215009614#tab=citedBy
U2 - 10.1109/ICONAT61936.2024.10775250
DO - 10.1109/ICONAT61936.2024.10775250
M3 - Conference contribution
AN - SCOPUS:85215009614
T3 - 2024 3rd International Conference for Advancement in Technology, ICONAT 2024
BT - 2024 3rd International Conference for Advancement in Technology, ICONAT 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 3rd International Conference for Advancement in Technology, ICONAT 2024
Y2 - 13 September 2024 through 14 September 2024
ER -