TY - GEN
T1 - Parallelized Hybrid Sorting Using Quick and Insertion Sort for Big Data
AU - Bairy, Maithri
AU - Pai, Prajna
AU - Gopalakrishna Kini, N.
AU - Jyothi Upadhya, K.
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
PY - 2025
Y1 - 2025
AB - In the context of big data, efficient sorting of massive datasets is essential to the performance of data-intensive applications such as database management, data analytics and scientific computing. This paper proposes a parallelized hybrid sorting algorithm that integrates Quick Sort and Insertion Sort to sort large-scale data more efficiently. The hybrid approach exploits the speed of Quick Sort on larger data partitions and applies Insertion Sort to smaller, nearly sorted subarrays, where it is more efficient. To further improve performance, two parallel implementations are developed, one using MPI and one using CUDA. The MPI-based implementations distribute the work across multiple processes with distributed memory and consolidate the sorted partitions at the root process using a k-way merge based on a min-heap. In contrast, the CUDA-based implementations exploit GPU parallelism: threads handle data segments independently, and the final merge is performed with a k-way merge based on a min-heap. Computation time and algorithm efficiency are measured for each method on large datasets. A comparison of the sequential, MPI and CUDA executions shows substantial performance improvements. For smaller datasets, such as 1000 elements, MPI yields a speedup of up to 141 times over sequential execution, while CUDA achieves a speedup of up to 428 times on larger datasets of 4 million elements. The drastic improvement observed with CUDA highlights the benefits of employing modern parallel and GPU-based methods to reduce computation time and enhance resource utilization.
UR - https://www.scopus.com/pages/publications/105023331569
UR - https://www.scopus.com/pages/publications/105023331569#tab=citedBy
U2 - 10.1007/978-981-96-9203-3_41
DO - 10.1007/978-981-96-9203-3_41
M3 - Conference contribution
AN - SCOPUS:105023331569
SN - 9789819692026
T3 - Lecture Notes in Electrical Engineering
SP - 503
EP - 513
BT - Recent Trends in Artificial Intelligence and Data Sciences - Select Proceedings of the 15th International Conference, CONFLUENCE 2025
A2 - Kumar, Sumit
A2 - Aggarwal, Garima
A2 - Unhelkar, Bhuvan
A2 - Pal, Raju
PB - Springer Science and Business Media Deutschland GmbH
T2 - 15th International Conference on Recent Trends in Artificial Intelligence and Data Sciences, CONFLUENCE 2025
Y2 - 16 January 2025 through 17 January 2025
ER -