TY - GEN

T1 - GPU Computing for Compute-Intensive Scientific Calculation

AU - Dubey, Sandhya Parasnath

AU - Kumar, M. Sathish

AU - Balaji, S.

PY - 2020/1/1

Y1 - 2020/1/1

N2 - The GPU has emerged as a platform that off-loads computation-intensive work from the CPU and performs numerical computations in less time. One such mathematical operation is matrix multiplication. Matrices are fundamental mathematical objects in scientific calculation, with applications in fields such as computer graphics, analysis of electrical circuits, computer networks, DNA sequence comparison, and protein structure prediction. This work presents a comparative analysis of scalar matrix multiplication in three modes: (i) sequential programming in C, (ii) a parallel implementation using OpenCL, and (iii) a parallel implementation using MPI. The testbed comprises input matrices ranging from a small size of 100 × 100 to a larger size of 800 × 12,800. We observe that parallel execution in OpenCL outperforms MPI and sequential C for higher-dimensional matrices; in contrast, sequential C outperforms both MPI and OpenCL for small-dimension matrices. Moreover, the OpenCL program attains a speedup of 9×. We therefore conclude that parallel execution of code is more efficient for computationally large data sizes and hence provides a potentially useful approach to addressing NP-complete problems.

AB - The GPU has emerged as a platform that off-loads computation-intensive work from the CPU and performs numerical computations in less time. One such mathematical operation is matrix multiplication. Matrices are fundamental mathematical objects in scientific calculation, with applications in fields such as computer graphics, analysis of electrical circuits, computer networks, DNA sequence comparison, and protein structure prediction. This work presents a comparative analysis of scalar matrix multiplication in three modes: (i) sequential programming in C, (ii) a parallel implementation using OpenCL, and (iii) a parallel implementation using MPI. The testbed comprises input matrices ranging from a small size of 100 × 100 to a larger size of 800 × 12,800. We observe that parallel execution in OpenCL outperforms MPI and sequential C for higher-dimensional matrices; in contrast, sequential C outperforms both MPI and OpenCL for small-dimension matrices. Moreover, the OpenCL program attains a speedup of 9×. We therefore conclude that parallel execution of code is more efficient for computationally large data sizes and hence provides a potentially useful approach to addressing NP-complete problems.

UR - http://www.scopus.com/inward/record.url?scp=85076828734&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85076828734&partnerID=8YFLogxK

U2 - 10.1007/978-981-15-0184-5_12

DO - 10.1007/978-981-15-0184-5_12

M3 - Conference contribution

AN - SCOPUS:85076828734

SN - 9789811501838

T3 - Advances in Intelligent Systems and Computing

SP - 131

EP - 140

BT - Soft Computing for Problem Solving - SocProS 2018, Volume 2

A2 - Das, Kedar Nath

A2 - Bansal, Jagdish Chand

A2 - Deep, Kusum

A2 - Nagar, Atulya K.

A2 - Pathipooranam, Ponnambalam

A2 - Naidu, Rani Chinnappa

PB - Springer

T2 - 8th International Conference on Soft Computing for Problem Solving, SocProS 2018

Y2 - 17 December 2018 through 19 December 2018

ER -