
Speech-to-Noise Ratio and Voice-to-Noise Ratio of Voice Databases With Implications for Acoustic Voice Analysis

  • Duy Duong Nguyen*
  • Rijul Gupta
  • Dhanshree R. Gunjawate
  • John Holik
  • Craig Jin
  • Catherine Madill

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Objectives: This study aimed to examine the speech-to-noise ratio (SNR) and voice-to-noise ratio (VNR) in five freely available databases and one university voice database to clarify whether they meet the signal quality requirements for use in acoustic voice analysis.

Methods: This was a retrospective study that extracted the prolonged vowel /ɑ/, short phrases, and connected speech from 977 nondisordered and voice-disordered speakers in six preexisting voice and speech databases: Advanced Voice Function Assessment Databases (AVFAD), Massachusetts Eye and Ear Infirmary (MEEI), Perceptual Voice Qualities Database (PVQD), Saarbruecken Voice Database (SVD), Uncommon Voice, and University of Sydney Voice Assessment Clinic (USVAC). These vocal tasks were extracted from randomly selected study identities and were used to measure SNR and VNR with a Praat script. SNR and VNR were summarized using descriptive statistics and compared across databases using multivariate analysis of variance. Vocal task effects and group effects within each database were calculated.

Results: Signal quality varied greatly across databases, with SNR and VNR ranging widely from low to high values. Among the databases with measurable SNR and VNR from multiple available tasks, there were statistically significant effects of task and group on both SNR and VNR in AVFAD, USVAC, and PVQD, but not in Uncommon Voice. Overall, the vowel /ɑ/ and phrases had higher SNR and VNR than connected speech. Among the databases with a single measurable task, group effects were significant for MEEI and not significant for SVD.

Conclusions: These databases provided voice samples with variable signal quality, ranging from low to high levels compared with recommended values. The lower SNR and VNR values in connected speech tasks in these databases suggested that articulation factors affect these two measures. The differences in the two measures between nondisordered and disordered groups in some of the databases indicated the need to standardize vocal tasks, recording equipment and settings, recording environment, and personnel training to achieve consistent signal quality.

Original language: English
Journal: Journal of Voice
DOIs
Publication status: Accepted/In press - 2025

All Science Journal Classification (ASJC) codes

  • Otorhinolaryngology
  • Speech and Hearing
  • LPN and LVN
