TY - GEN
T1 - Locality-aware PMI usage for efficient MPI startup
AU - Raffenetti, Ken
AU - Bayyapu, Neelima
AU - Durnov, Dmitry
AU - Takagi, Masamichi
AU - Balaji, Pavan
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/12
Y1 - 2018/12
N2 - In this paper, we examine usage of the Process Management Interface (PMI) during MPI_Init, specifically how PMI is used to exchange address information between peer processes in an MPI job. As node and core counts continue to increase in HPC systems, so does the amount of address data that processes need to exchange. We show that applying well-established locality-awareness techniques can significantly reduce the time spent in MPI_Init. We first present the use of shared memory to reduce the total amount of information retrieved from PMI. Next, by combining shared memory with a minimally connected set of processes, we further reduce the dependence on PMI and employ the HPC fabric to transfer the bulk of address data. Our approach is low impact, relying on functionality already provided by MPI libraries and process managers instead of new APIs and capabilities.
AB - In this paper, we examine usage of the Process Management Interface (PMI) during MPI_Init, specifically how PMI is used to exchange address information between peer processes in an MPI job. As node and core counts continue to increase in HPC systems, so does the amount of address data that processes need to exchange. We show that applying well-established locality-awareness techniques can significantly reduce the time spent in MPI_Init. We first present the use of shared memory to reduce the total amount of information retrieved from PMI. Next, by combining shared memory with a minimally connected set of processes, we further reduce the dependence on PMI and employ the HPC fabric to transfer the bulk of address data. Our approach is low impact, relying on functionality already provided by MPI libraries and process managers instead of new APIs and capabilities.
UR - https://www.scopus.com/pages/publications/85070798924
UR - https://www.scopus.com/inward/citedby.url?scp=85070798924&partnerID=8YFLogxK
U2 - 10.1109/CompComm.2018.8780930
DO - 10.1109/CompComm.2018.8780930
M3 - Conference contribution
AN - SCOPUS:85070798924
T3 - 2018 IEEE 4th International Conference on Computer and Communications, ICCC 2018
SP - 624
EP - 628
BT - 2018 IEEE 4th International Conference on Computer and Communications, ICCC 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 4th IEEE International Conference on Computer and Communications, ICCC 2018
Y2 - 7 December 2018 through 10 December 2018
ER -