Greg Morrow
and
Robert van de Geijn
Department of Computer Sciences
The University of Texas at Austin
Austin, Texas 78712
{morrow,rvdg}@cs.utexas.edu
An Extended Abstract Submitted to SC98
Mathematical software packages such as Mathematica, Matlab, HiQ, and others allow scientists and engineers to perform complex analysis and modeling in their familiar workstation environment. However, these systems are limited in the size of problem they can solve, because the memory and CPU power of a workstation are limited. Obviously, the benefit of the interactive software is lost if the problem takes two weeks to run. With the advent of inexpensive ``Beowulf''-type parallel machines [4], and the proliferation of parallel computers in general, it is natural to consider combining the user-friendliness and interactivity of the commercially available mathematical packages with the computing power of parallel machines.
We have implemented a system, which we call the PLAPACK-Mathpackage Interface (``PMI''), that allows a user of one of the supported mathematical packages to export computationally intensive problems to a parallel computer running PLAPACK. The interface consists of a set of functions the user calls from within the mathematical package that allow creating, filling, manipulating, and freeing matrix, vector, and scalar objects on the parallel computer. Both memory and CPU power scale linearly with the number of processing elements in the parallel machine. Thus, PMI allows the interactive software packages to break the bonds of the workstation to solve ever larger and more complex problems.
PMI is not the first attempt to exploit parallelism from within interactive mathematical software packages. MultiMATLAB [6] (from the Cornell Theory Center) and the MATLAB Toolbox [5] (from the University of Rostock, Germany) are extensions of the Matlab interpreter that essentially run on each node of the parallel machine. A similar product for Mathematica [10] is available from Mathconsult in Switzerland. Compiler-based systems such as FALCON [9] (from the University of Illinois), Otter [8] (from Oregon State University) and CONLAB [7] (from the University of Umeå, Sweden) start with Matlab script files and use compiler technology to create explicit message-passing codes, which then execute essentially independently of the script's original interactive software platform. This list is not exhaustive, but should give the idea that there are many approaches to this problem. Our approach is most similar to the MultiMATLAB and MATLAB Toolbox approaches, with one important difference. In PMI, the third-party software (Matlab, Mathematica, etc.) runs on only one node of the machine, rather than on all nodes. All parallel communication is handled from within PLAPACK.
This paper is organized as follows. Section 2 gives a brief overview of PLAPACK and of the interactive mathematical packages. Section 3 discusses some implementation details of PMI. Section 4 shows what PMI looks like from a user's point of view. Section 5 details some measurements of PMI's performance on a parallel system. Finally, Section 6 gives some concluding remarks.
PLAPACK (Parallel Linear Algebra Package) is an object-oriented system for dense linear algebra on parallel computers [3,1,2]. It is written in C, and uses the Message Passing Interface (MPI) for communication. It is distinguished by the fact that the programmer is not exposed to error-prone index computations. Instead, the concept of a ``view'' into a matrix or vector is used to allow for index-free programming, even of highly complex algorithms. PLAPACK's high-level abstraction and user-friendliness do not come at the expense of high performance. Our Cholesky factorization algorithm, for example, achieves over 360 MFLOPS per PE on 16 PEs of the Cray T3E-600 (300 MHz).
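To make the ``view'' idea concrete, the following is a minimal serial sketch in C. The struct layout and function names here are illustrative assumptions, not PLAPACK's actual API; the point is only that a view records an offset and dimensions into a parent matrix, so algorithms pass views around instead of computing indices.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative sketch only: PLAPACK's real object API differs. A "view"
 * references a submatrix of a parent matrix without copying data and
 * without exposing index arithmetic to the algorithm writer. */
typedef struct {
    double *data;   /* parent storage, column-major */
    int ld;         /* leading dimension of the parent */
    int m, n;       /* dimensions of this view */
    int r0, c0;     /* offset of the view within the parent */
} view_t;

/* Take a sub-view; (r, c) is relative to the existing view. */
static view_t view_submatrix(view_t a, int r, int c, int m, int n) {
    view_t v = { a.data, a.ld, m, n, a.r0 + r, a.c0 + c };
    return v;
}

/* Element access resolves the accumulated offsets in one place. */
static double *view_elem(view_t v, int i, int j) {
    return &v.data[(size_t)(v.c0 + j) * v.ld + (v.r0 + i)];
}
```

An algorithm such as a blocked factorization would repeatedly call something like `view_submatrix` to walk through the matrix, never touching raw indices itself.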
Interactive mathematics packages such as Matlab (from The MathWorks, www.mathworks.com), Mathematica (from Wolfram Research, www.wolfram.com), HiQ (from National Instruments, www.natinst.com), and others give their users access to sophisticated mathematics in an interactive workstation environment. The packages typically include functionality for linear algebra, curve fitting, differential equations, signal processing, and sophisticated graphics, among many other areas.
Because PMI can connect with any of these products, and in the interest of even-handedness, throughout the text we refer to the interactive package generically as ``X-lab,'' meaning any of the above products.
This section briefly describes the implementation of PMI. We begin by discussing the basic mechanism of communication in PMI. Then we describe the ``third-party'' part of the program, i.e. the part of the PMI software associated with a particular platform (Matlab, Mathematica, etc.). Next, we detail the PLAPACK side of the interface. Finally, we give some remarks about software layering in PMI.
Given that the third-party program possesses the features described above, the following outline shows how the X-lab side processes PMI commands.
Because of the quirks and idiosyncrasies of the third-party interface specifications, certain functions must be reimplemented for each supported third-party platform. We have attempted to layer our software in such a way that these non-portable parts of the code are isolated and small.
Figure 2 shows a diagram of the PMI layering. Notice that the non-portable sections are limited to the ``shared memory mechanism'' and the ``PMIPutData'' and ``PMIGetData'' modules. The shared memory mechanism is only slightly non-portable: one version works for all Unix platforms, but some slight changes are required for Windows NT.
The functions ``PMIPutData'' and ``PMIGetData'', which move data between PMI and the third-party software, must be reimplemented for each third-party platform.
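One way such put/get routines can be structured over a shared-memory buffer is a fixed header (command code, object handle, payload length) followed by the raw payload bytes. The header layout and names below are hypothetical, a sketch of the framing rather than PMI's actual wire format:

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical framing sketch for PMIPutData/PMIGetData-style transfers:
 * a fixed header followed by the payload, copied into a shared buffer.
 * The real PMI format is not documented here. */
typedef struct {
    uint32_t command;   /* e.g., a code identifying the PMI command */
    uint32_t object_id; /* handle of the parallel object involved */
    uint32_t nbytes;    /* payload length in bytes */
} pmi_header;

/* Pack header plus payload into the buffer; return total bytes written,
 * or 0 if the buffer is too small. */
static size_t pmi_pack(unsigned char *buf, size_t bufsize,
                       pmi_header h, const void *payload) {
    size_t need = sizeof h + h.nbytes;
    if (need > bufsize) return 0;
    memcpy(buf, &h, sizeof h);
    memcpy(buf + sizeof h, payload, h.nbytes);
    return need;
}

/* Unpack the header and point at the payload inside the buffer. */
static pmi_header pmi_unpack(const unsigned char *buf, const void **payload) {
    pmi_header h;
    memcpy(&h, buf, sizeof h);
    *payload = buf + sizeof h;
    return h;
}
```

Under this scheme, only the code that converts between the third-party package's native data representation and the flat payload needs to change per platform; the framing itself stays portable.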
This section describes PMI from a user's point of view. What does she do to initiate a session? What does she see from within her third party mathematical software? What does a sample application look like?
Let us suppose that we wish to have 16 PEs involved on the parallel side of PMI. We would then launch the PMI_plapack.x executable, specifying 16 processors on the command line. This looks something like the following:
% mpirun -np 16 PMI_plapack.x
PLAPACK software interface waiting for connection
Once the parallel executable is running, the user is free to start PMI from within her mathematical software. (Actually, the order in which the two sides of PMI are initiated is immaterial. However, a PMIOpen[] call from within Mathematica will block until the parallel executable is started.)
The first category consists of commands to initialize, finalize, and manipulate the environment. Examples of commands in this category are PMIOpen[], PMIClose[], and PMIVerbose[].
The second category of commands in PMI performs parallel object manipulations. The purpose of these commands is to create, free, query, and fill parallel matrices, multivectors, and multiscalars. Examples of commands in this category are PMICreateObject[], PMIFreeObject[], PMIAxpyToObject[], and PMIAxpyFromObject[]. The latter two functions put values into a parallel object and get values from a parallel object, respectively.
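The ``axpy'' in PMIAxpyToObject suggests accumulate semantics: scale a local block and add it into a region of the parallel object. A serial C sketch of that operation, with hypothetical names and a column-major layout assumed:

```c
#include <assert.h>

/* Serial sketch (hypothetical names) of accumulate-into-object semantics:
 * add alpha times an m-by-n local block X into matrix Y starting at
 * position (row, col). Column-major storage, leading dimensions ldx, ldy. */
static void axpy_to_object(int m, int n, double alpha,
                           const double *x, int ldx,
                           double *y, int ldy, int row, int col) {
    for (int j = 0; j < n; j++)
        for (int i = 0; i < m; i++)
            y[(col + j) * ldy + (row + i)] += alpha * x[j * ldx + i];
}
```

On the real parallel side, each contribution would additionally be routed to whichever PEs own the affected entries of the distributed matrix; the arithmetic, however, is just this accumulate.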
The third category of commands in PMI causes an action to be taken on parallel objects that already exist within the PMI parallel application (i.e., objects that have already been created and filled with data values). Examples of these commands are PMILU[] and PMIGemm[], which perform LU factorization and matrix multiplication, respectively.
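On the parallel side, commands in this category suggest a simple dispatch loop: each incoming message carries a command code, and the executable switches to the matching PLAPACK call. The codes and handler names below are illustrative assumptions, not PMI's actual internals:

```c
#include <assert.h>

/* Hypothetical dispatch sketch for the PLAPACK-side executable. Each
 * handler would invoke the corresponding PLAPACK routine on the named
 * parallel objects; here the handlers are stubs. */
enum { CMD_CREATE = 1, CMD_FREE, CMD_LU, CMD_GEMM, CMD_CLOSE };

static int handle_lu(void)   { return 0; /* would call PLAPACK's LU */ }
static int handle_gemm(void) { return 0; /* would call PLAPACK's gemm */ }

/* Returns 0 on success, 1 when the session should end, -1 on a bad code. */
static int pmi_dispatch(int command) {
    switch (command) {
    case CMD_LU:    return handle_lu();
    case CMD_GEMM:  return handle_gemm();
    case CMD_CLOSE: return 1;
    case CMD_CREATE:
    case CMD_FREE:  return 0; /* object management elided */
    default:        return -1;
    }
}
```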
This section presents a sample application. This program would be executed from within a Matlab session, and of course would require a copy of the PMI parallel application to be running as well. This example performs the following steps.
Figure 3 shows the above program from within the Matlab version of PMI.
We concentrate on properties of the interface itself, rather than on the properties of the parallel executable. (The parallel executable is simply a PLAPACK program in disguise, and performance numbers for PLAPACK are available in the literature and from the PLAPACK web page.) We do, however, show some speedup values to get an idea of overheads inherent in the interface.
The main performance metrics for PMI concern the speed of the shared-memory connection. In particular, we measure the latency of the connection (the time required to get a zero-length message to the parallel executable and back) and the inverse bandwidth (the time per byte of data sent to or received from the parallel executable). In addition to these measurements, we also provide a profile of the sample application described in the previous section.
| # PEs | latency (sec) | bandwidth (Mbyte/sec), nb=16 | bandwidth (Mbyte/sec), nb=32 |
| ----- | ------------- | ---------------------------- | ---------------------------- |
| 1     | 0.020         | 32.0                         | 43.0                         |
| 4     | 0.019         | 3.0                          | 7.3                          |
| 8     | 0.019         | 2.4                          | 2.7                          |
| 16    | 0.019         | 2.4                          | 2.7                          |

Table 1: Latency and bandwidth of the shared-memory connection for several mesh sizes and distribution blocksizes nb.
We report bandwidth measurements for several sizes of parallel mesh and values of distribution blocksize in Table 1.
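These measurements can be folded into the usual linear cost model for a transfer, time = latency + bytes / bandwidth, which gives a quick estimate of interface overhead for a given problem size. A minimal sketch:

```c
#include <assert.h>

/* Linear cost model for the shared-memory connection:
 * time(bytes) = latency + bytes / bandwidth. */
static double transfer_time(double bytes, double latency_s,
                            double bandwidth_bytes_per_s) {
    return latency_s + bytes / bandwidth_bytes_per_s;
}
```

For example, with the 16-PE, nb=32 figures from Table 1 (0.019 s latency, 2.7 Mbyte/sec), a 1 Mbyte transfer costs roughly 0.39 s, so for large transfers the bandwidth term dominates and the fixed latency is negligible.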
This paper has described the PLAPACK-Mathpackage Interface, a software connection that allows users to plug a parallel computer into the back of their favorite interactive mathematical software. We have given some details of the implementation, use, and performance of PMI. As yet, we have realized only a small subset of what can be done with this package. First, the package can be extended to support other software packages, subject only to the constraints referred to in the implementation section of this paper. Second, we intend to thoroughly test the connection of interactive software running on a workstation to a completely separate parallel machine. Third, we intend to incorporate more of the unique features of PLAPACK (for example, the use of ``views'' into matrices and vectors) into PMI. Finally, we are interested in experimenting with ``real applications.'' That is, we wish to take an existing Matlab or Mathematica application code and parallelize it through PMI.
This work was sponsored in part by the Intel Research Council. The PLAPACK project was sponsored in part by the Parallel Research on Invariant Subspace Methods (PRISM) project (ARPA grant P-95006), the NASA High Performance Computing and Communications Program's Earth and Space Sciences Project (NRA Grants NAG5-2497 and NAG5-2511), and the Environmental Molecular Sciences construction project at Pacific Northwest National Laboratory (PNNL) (PNNL is a multiprogram national laboratory operated by Battelle Memorial Institute for the U.S. Department of Energy under Contract DE-AC06-76RLO 1830).