The message passing community has long experience in harnessing heterogeneous computing resources into a single parallel message passing computation. This is useful for a variety of applications: some ``embarrassingly parallel'' applications may be able to utilize spare compute power in a large network of workstations; some applications may decompose naturally into components that are better suited to different platforms, e.g., a simulation component and a visualization component; other applications may be too large to fit in one system.
Such applications can be developed using standard interprocess communication protocols, such as sockets on TCP/IP. However, these protocols are at a lower level than the message passing interfaces defined by MPI [2]. Furthermore, if each subsystem is a parallel system, then MPI is likely to be used for ``intrasystem'' communication, in order to achieve the better performance that vendor MPI libraries provide, as compared to TCP/IP. It is then convenient to use MPI for ``intersystem'' communication as well.
MPI was designed with such heterogeneous applications in mind. For example, all message passing communication is typed, so that it is possible to perform data conversion when data is transferred across systems with different data representations. Indeed, there are several freely available implementations of MPI that run in a heterogeneous environment. These implementations use a common approach. An infrastructure is developed that provides a parallel virtual machine, on top of the multiple heterogeneous systems. Then, message passing is implemented on this parallel virtual machine. This approach has several deficiencies:
The MPI interoperability effort proposes to define a cross-implementation protocol for MPI that will enable heterogeneous computing. MPI implementations that support this protocol will be able to interoperate, so that a parallel message passing computation can span multiple systems while using the native vendor message passing library on each system. We propose to do this without adding any new functions to MPI. Instead, we propose to specify implementation-specific interfaces that enable interoperability. In the first phase, our goal is to support all point-to-point communication functions across systems, as well as collectives. We intend to phase in full MPI support over time. The initial binding will assume that intersystem communication uses one or more sockets between each pair of communicating systems, while intrasystem communication uses proprietary protocols, at the discretion of each vendor. Over time, we expect the socket interface to be expanded to allow for other industry-standard, stream-oriented protocols, such as ATM virtual channels.
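As a rough illustration of what typed, socket-based intersystem communication involves, the following sketch frames a message on a stream socket and encodes its payload in a canonical byte order, so that the receiver can convert it to its own representation. The header layout, the datatype codes, and the function names are all assumptions made for this example; they are not the wire format that the interoperability protocol will specify.

```python
import socket
import struct

# Hypothetical frame header (NOT the actual interoperability wire format):
# source rank, destination rank, tag, datatype code, element count,
# each an unsigned 32-bit integer in network (big-endian) byte order.
HEADER = struct.Struct("!IIIII")

# Hypothetical datatype codes mapped to struct format characters.
DATATYPES = {1: "i", 2: "d"}  # 1 = 32-bit integer, 2 = 64-bit float


def send_message(sock, src, dst, tag, dtype, values):
    """Encode values in canonical big-endian form and write one frame."""
    payload = struct.pack("!%d%s" % (len(values), DATATYPES[dtype]), *values)
    sock.sendall(HEADER.pack(src, dst, tag, dtype, len(values)) + payload)


def recv_exact(sock, nbytes):
    """Read exactly nbytes from a stream socket, which may return short reads."""
    chunks = []
    while nbytes > 0:
        chunk = sock.recv(nbytes)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-frame")
        chunks.append(chunk)
        nbytes -= len(chunk)
    return b"".join(chunks)


def recv_message(sock):
    """Read one frame and decode the payload into native Python values."""
    src, dst, tag, dtype, count = HEADER.unpack(recv_exact(sock, HEADER.size))
    fmt = "!%d%s" % (count, DATATYPES[dtype])
    values = struct.unpack(fmt, recv_exact(sock, struct.calcsize(fmt)))
    return src, dst, tag, dtype, list(values)
```

A pair of connected sockets from `socket.socketpair()` is enough to try this locally: call `send_message` on one end and `recv_message` on the other. Because the datatype code travels with the data, the receiver knows how to decode the payload regardless of the sender's native byte order.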
While efficient intersystem communication is important, the main performance goal of the design is not to slow down intrasystem communication: as long as there is no intersystem communication, native communication performance should not be affected by the hooks added to support interoperability. The design should also ensure that support for interoperability does not weaken availability or security on each system.
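One way to picture this performance goal is a send path in which the only cost added for a local destination is a single membership test. The sketch below is purely illustrative: the `Router` class and its callbacks are hypothetical names, and a real library would place such a check inside its own send routines rather than expose it.

```python
class Router:
    """Route a send to the native path or the intersystem path by rank.

    All names here are hypothetical and exist only for illustration.
    """

    def __init__(self, local_ranks, native_send, intersystem_send):
        self.local_ranks = frozenset(local_ranks)
        self.native_send = native_send
        self.intersystem_send = intersystem_send

    def send(self, dest, data):
        # Fast path: one set-membership test is the only overhead added
        # when the destination lives on the same system.
        if dest in self.local_ranks:
            return self.native_send(dest, data)
        # Slow path: hand the message to the intersystem transport.
        return self.intersystem_send(dest, data)
```

With ranks 0 and 1 registered as local, a send to rank 1 goes straight to the native callback, while a send to rank 4 is handed to the intersystem callback.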