Preprocessing of Datasets Using Sequential and Parallel Approach: A Comparison

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Data preprocessing is a data mining technique that makes data ready for further processing according to the requirements. Preprocessing is required because the data may be incomplete or redundant, or may come from different sources and therefore require aggregation. Data can be processed either sequentially or in parallel, and several parallel frameworks, such as Hadoop, MPI, and CUDA, are available for this purpose. A survey was carried out to understand these parallel frameworks, and the sequential and parallel approaches are compared in terms of efficiency.
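The contrast the abstract draws can be illustrated with a minimal sketch. It is not the paper's implementation, which this record does not reproduce. The record layout (a name and a numeric value), the helper names `clean_record`, `drop_duplicates`, `preprocess_sequential`, and `preprocess_parallel`, and the choice to fill missing values with 0.0 are all assumptions made for illustration. A thread pool from the Python standard library stands in for the parallel side to keep the example portable; for CPU-bound cleaning, a process pool or one of the frameworks named in the abstract would be the more realistic choice.

```python
from concurrent.futures import ThreadPoolExecutor


def clean_record(record):
    """Normalise one (name, value) record.

    Assumption: a missing value (None) is filled with 0.0.
    """
    name, value = record
    return (name.strip().lower(), 0.0 if value is None else float(value))


def drop_duplicates(records):
    """Remove redundant records while keeping first-seen order."""
    seen, unique = set(), []
    for rec in records:
        if rec not in seen:
            seen.add(rec)
            unique.append(rec)
    return unique


def preprocess_sequential(records):
    """Clean the records one after another, then deduplicate."""
    return drop_duplicates(clean_record(r) for r in records)


def preprocess_parallel(records, workers=4):
    """Clean the records concurrently, then deduplicate.

    Executor.map returns results in input order, so the output
    matches the sequential version.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        cleaned = list(pool.map(clean_record, records))
    return drop_duplicates(cleaned)


if __name__ == "__main__":
    sample = [(" A ", 1), ("a", 1.0), ("b", None), ("c", "2.5")]
    print(preprocess_sequential(sample))
    print(preprocess_parallel(sample))
```

Both functions should return the same cleaned list; an efficiency comparison would then time each one on a dataset large enough for the parallel overhead to pay off.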

Original language: English
Title of host publication: Expert Clouds and Applications - Proceedings of ICOECA 2021
Editors: I. Jeena Jacob, Francisco M. Gonzalez-Longatt, Selvanayaki Kolandapalayam Shanmugam, Ivan Izonin
Publisher: Springer Science and Business Media Deutschland GmbH
Number of pages: 10
ISBN (Print): 9789811621253
Publication status: Published - 2022
Event: International Conference on Expert Clouds and Applications, ICOECA 2021 - Bangalore
Duration: 18-02-2021 → 19-02-2021

Publication series

Name: Lecture Notes in Networks and Systems
ISSN (Print): 2367-3370
ISSN (Electronic): 2367-3389


Conference: International Conference on Expert Clouds and Applications, ICOECA 2021

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Signal Processing
  • Computer Networks and Communications

