Abstract
Automatic spoken dialect identification (DID) is challenging because the dialects of a given language are highly similar. The lack of sufficient data for training deep learning (DL) based DID systems, which is common for low-resource languages such as Romanian, makes the task harder still. While transfer learning can improve performance in such low-resource conditions, selecting a pre-trained model and a suitable layer for DID-specific feature extraction is not trivial, as the ability to encode DID-specific content varies widely across models and their layers. Furthermore, due to high inter-class similarities, the features obtained from such pre-trained models may be only weakly dialect-discriminative, necessitating explicit methods to improve the discriminability of the learned features. Motivated by these issues, we propose different methods to build efficient DID systems for identifying Romanian dialects in low-resource settings. Specifically, we first explore pre-trained models such as wav2vec2, HuBERT and XEUS for Romanian DID and determine the optimal layer of each for this task. Furthermore, to address the performance degradation caused by high inter-class similarities and intra-class variations, we employ metric learning techniques such as center loss (CL) and centroid similarity loss (CSL). The results indicate that a DID system trained on features from the 4th layer of the XEUS model, combined with CSL, gives the best performance on Romanian DID.
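As a rough illustration of the metric-learning idea mentioned in the abstract, the sketch below computes a center-loss-style penalty in plain Python: each feature vector is pulled toward the center of its own class, which reduces intra-class variation. This is not the authors' implementation. It assumes the commonly used form of center loss (half the squared Euclidean distance to the class center), averages over the batch, and takes the centers as simple batch means rather than learned parameters. The function names, the toy 2-D features, and the labels "a" and "b" are hypothetical; real features would come from a layer of a pre-trained speech model.

```python
from collections import defaultdict


def class_centers(features, labels):
    """Return the mean feature vector of each class (a simplification:
    center loss is usually trained with learned, incrementally updated centers)."""
    sums = {}
    counts = defaultdict(int)
    for x, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def center_loss(features, labels, centers):
    """Batch average of 0.5 * squared Euclidean distance between each
    feature vector and the center of its own class."""
    total = 0.0
    for x, y in zip(features, labels):
        c = centers[y]
        total += 0.5 * sum((v - cv) ** 2 for v, cv in zip(x, c))
    return total / len(features)


# Hypothetical toy data: four 2-D "embeddings" from two dialect classes.
toy_features = [[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [10.0, 12.0]]
toy_labels = ["a", "a", "b", "b"]
toy_centers = class_centers(toy_features, toy_labels)
toy_loss = center_loss(toy_features, toy_labels, toy_centers)
print(toy_centers, toy_loss)
```

In practice a term like this is typically added to the classification loss with a weighting factor, so that the features stay separable across classes while becoming more compact within each class. The abstract does not define CSL, so it is not sketched here.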
| Original language | English |
|---|---|
| Pages (from-to) | 204467-204479 |
| Number of pages | 13 |
| Journal | IEEE Access |
| Volume | 13 |
| Publication status | Accepted/In press - 2025 |
All Science Journal Classification (ASJC) codes
- General Computer Science
- General Materials Science
- General Engineering
Paper title: Efficient Romanian Dialect Identification in Low-resource conditions using Transfer-Learning and Metric-Learning