Impact of Effective Word Vectors on Deep Learning Based Subjective Classification of Online Reviews

Priya B. Kamath*, M. Geetha, Dinesh U. Acharya, Ritika Nandi, Siddhaling Urolagin

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Sentiment analysis tasks are made considerably simpler by extracting subjective statements from online reviews, thereby reducing the overhead of the classifiers. The review dataset encompasses both subjective and objective sentences, where subjective writing expresses the author's opinions, and objective text presents factual information. Assessing the subjectivity of review statements involves categorizing them as objective or subjective. The effectiveness of word vectors plays a crucial role in this process, as they capture the semantics and contextual cues of subjective language. This study investigates the significance of employing sophisticated word vector representations to enhance the detection of subjective reviews. Several methodologies for generating word vectors have been investigated, encompassing both conventional approaches, such as Word2Vec and Global Vectors for word representation (GloVe), and recent innovations, such as Bidirectional Encoder Representations from Transformers (BERT), ALBERT, and Embeddings from Language Models (ELMo). These neural word embeddings were applied using Keras and Scikit-Learn. The analysis focuses on Cornell subjectivity review data within the restaurant domain, and performance metrics such as accuracy, F1-score, recall, and precision are assessed on a dataset containing subjective reviews. A wide range of conventional vector models and deep learning-based word embeddings are utilized for subjective review classification, frequently in combination with deep learning architectures such as Long Short-Term Memory (LSTM). Notably, pre-trained BERT-base word embeddings achieved an exceptional accuracy of 96.4%, surpassing all other models considered in this study. However, BERT-base is computationally expensive because of its larger architecture.
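The pipeline the abstract describes (map each review sentence to a word-vector representation, then classify it as subjective or objective) can be illustrated with a minimal sketch. The sentences, labels, and randomly initialized 50-dimensional vectors below are toy stand-ins for the Cornell subjectivity data and the pretrained embeddings (Word2Vec, GloVe, ELMo, ALBERT, BERT) studied in the paper; sentence vectors are formed by averaging word vectors, a common baseline rather than the paper's LSTM architecture.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy labelled sentences: 1 = subjective (opinion), 0 = objective (fact).
sentences = [
    ("the food was absolutely wonderful", 1),
    ("i hated the slow and rude service", 1),
    ("the restaurant seats forty people", 0),
    ("the menu lists ten main dishes", 0),
]

# Stand-in embeddings: in the study these come from pretrained models
# such as Word2Vec, GloVe, ELMo, ALBERT, or BERT; random vectors are
# used here purely for illustration.
rng = np.random.default_rng(0)
vocab = {w for text, _ in sentences for w in text.split()}
embeddings = {w: rng.normal(size=50) for w in vocab}

def sentence_vector(text):
    """Average the word vectors of a sentence (a simple baseline
    for turning token embeddings into one fixed-size feature vector)."""
    vecs = [embeddings[w] for w in text.split() if w in embeddings]
    return np.mean(vecs, axis=0)

# Build the feature matrix and train a subjectivity classifier.
X = np.stack([sentence_vector(text) for text, _ in sentences])
y = np.array([label for _, label in sentences])
clf = LogisticRegression().fit(X, y)

print(clf.predict(X))  # per-sentence subjective/objective predictions
```

Swapping the random vectors for pretrained embeddings, and the averaged-vector classifier for an LSTM over the token sequence, recovers the kind of setup the study evaluates.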

Original language: English
Pages (from-to): 736-747
Number of pages: 12
Journal: Journal of Machine and Computing
Volume: 4
Issue number: 3
DOIs
Publication status: Published - 07-2024

All Science Journal Classification (ASJC) codes

  • Computational Mechanics
  • Human-Computer Interaction
  • Computational Theory and Mathematics
  • Electrical and Electronic Engineering
