Improved machine learning framework with feature engineering and SHAP analysis for predicting wine quality

  • Rijwan Khan
  • Ankur Goyal
  • Hoshiyar Singh Kanyal
  • Deepak Parashar*
  • Sheelesh Kumar Sharma
  • Md Iqbal

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Wine quality assessment in the modern viticulture industry faces significant challenges due to reliance on subjective expert evaluation, leading to inconsistencies, scalability limitations, and high costs in commercial applications. Traditional approaches lack standardization and struggle with large-scale quality control processes. This comprehensive study evaluates the effectiveness of advanced machine learning techniques for wine quality prediction using a dataset of 6,497 samples (1,599 red wine and 4,898 white wine) with 14 original physicochemical features that were expanded to 34 through feature engineering, plus the quality target variable; class imbalance was addressed using SMOTE. Our methodology integrates traditional algorithms (Logistic Regression, SVM, KNN), ensemble methods (Random Forest, XGBoost, LightGBM, Voting and Stacking ensembles), deep neural networks, and a novel application of transfer learning from white wine quality models to enhance red wine quality prediction. Results demonstrate superior performance of ensemble methods across evaluation metrics, with Random Forest achieving up to 95% accuracy and a 0.994 AUC score in optimized configurations, while Voting and Stacking ensembles consistently delivered robust performance (81.5% accuracy, 85.3% F1 score) across varied testing conditions. Feature importance analysis using SHAP revealed that alcohol content, sulfur dioxide levels, and volatile acidity are the most influential predictors, with complex interaction patterns between chemical properties. Transfer learning showed promising results, with faster convergence but slightly lower accuracy than models trained from scratch. This research both advances the methodological framework for wine quality prediction and provides actionable insights for the wine industry by identifying critical chemical determinants of wine quality.
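The abstract names SMOTE as the class-imbalance remedy but gives no implementation details. As a rough illustration of the idea only, and not the authors' code, the sketch below creates synthetic minority-class samples by interpolating between a sample and one of its nearest same-class neighbours. The function name `smote`, its parameters, and the two-feature toy data are assumptions made for this example.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples for a single (minority) class.

    minority: list of equal-length numeric feature lists.
    Each synthetic sample lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest neighbours of the base sample (itself excluded).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation position along base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical toy data: [alcohol, volatile acidity] for a rare quality class.
rare_class = [[12.8, 0.30], [13.1, 0.28], [12.5, 0.35], [13.4, 0.26]]
new_points = smote(rare_class, n_new=6, k=2)
```

Because every synthetic point is a convex combination of two real samples, it stays inside the range already covered by the minority class. Oversampling of this kind is normally applied to the training split only, so that evaluation data contains no synthetic samples.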

Original language: English
Article number: 27
Journal: SN Applied Sciences
Volume: 8
Issue number: 1
DOIs
Publication status: Published - 01-2026

All Science Journal Classification (ASJC) codes

  • General Chemical Engineering
  • General Materials Science
  • General Environmental Science
  • General Engineering
  • General Physics and Astronomy
  • General Earth and Planetary Sciences
