Version 1.1, October 2001
© 2001 Forschungszentrum Jülich, ZAM, Germany
Bernd Mohr
OPARI is a source-to-source translation tool
that automatically adds all necessary calls to the
pomp runtime measurement library, which
collects runtime performance data of Fortran, C, or C++ OpenMP
applications. It is based on the idea of OpenMP pragma/directive
rewriting, which is described in detail in a paper
(PostScript, PDF) for LACSI'01.
OPARI was developed as part of the
KOJAK and
TAU projects.
DOWNLOAD
This software is
free but
copyright © 2001 by
Forschungszentrum Juelich, ZAM, Germany. By downloading and using this
software you automatically agree to comply with the regulations as
described in the
license agreement.
Sources in gzipped tar format
Version | Date | Description
1.1 | 17-Oct-2001 | Changes
1.0 | 28-Aug-2001 | Initial version
USAGE
Before compiling the source files of an OpenMP application, each file
needs to be transformed by a call to the
OPARI
tool. In addition, the application has to be linked against the
pomp runtime measurement library and the
OPARI runtime table file. The latter has to
be generated by using the
-table option to
OPARI either together with the
transformation of the
last input source file or with a separate call
to
OPARI after all transformations are done.
A call to
OPARI has the following syntax:
The options and parameters have the following meaning:
[-f77|-f90|-c|-c++]
| [OPTIONAL] Specifies the programming language
of the input source file. This option is only
necessary if the automatic language detection
based on the input file suffix fails.
|
[-nosrc]
| [OPTIONAL] If specified,
OPARI does not generate the
#line constructs that preserve the original source file
and line number information in the transformed output. This option might
be necessary if the OpenMP compiler does not
understand #line constructs. The default is to
generate #line constructs.
|
[-rcfile file]
| [OPTIONAL] OPARI
uses the file ./opari.rc to preserve state information
between calls to OPARI if the OpenMP
application consists of more than one source file. With the
-rcfile option the file file is
used instead. This can be useful if more than one application is
stored in the same directory or if the source files of an application
are stored in more than one directory.
|
-table tabfile
| Generate the OPARI runtime
table in file tabfile.
This option has to be used either together with the
call to OPARI for the transformation of
the last input source file or with a separate call to
OPARI after all transformations are done.
|
-disable constructs
| [OPTIONAL] Disable the instrumentation of
the more fine-grained OpenMP constructs such as !$OMP ATOMIC.
The argument constructs
is a comma-separated list of the constructs for which the
instrumentation should be disabled. Accepted tokens are
atomic, critical, master,
flush, single, and locks,
as well as sync to disable all of them.
|
infile
| Input file name.
|
[outfile]
| [OPTIONAL] Output file name. If not specified,
OPARI uses the name
infile.mod.suffix if
the input file is called infile.suffix.
|
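As a minimal illustration of what the #line constructs mentioned for -nosrc do, the following C sketch uses the standard #line directive to make the compiler attribute a statement to a different file and line. The file name original.c, the line number 42, and the function names are assumptions made up for this example; it does not reproduce the exact text OPARI emits.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper names; only the #line directive itself is standard C. */

/* Returns the line number the compiler attributes to the return statement. */
int reported_line(void)
{
#line 42 "original.c"
    return __LINE__; /* numbered as line 42 of "original.c" */
}

/* Returns the file name currently in effect for diagnostics. */
const char *reported_file(void)
{
    return __FILE__; /* "original.c": the #line above stays in effect */
}

/* Self-check: aborts if the remapping did not take effect. */
void check_line_mapping(void)
{
    assert(reported_line() == 42);
    assert(strcmp(reported_file(), "original.c") == 0);
}
```

Compiler diagnostics and debug information for the code after the directive then refer to original.c rather than to the transformed file, which is the information that -nosrc gives up.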
In addition to the modified output file,
OPARI
also generates a file named
infile.opari.inc. It
contains OpenMP region descriptors, one for each OpenMP region found in the
input file. The meaning of the descriptor fields is described in the file
pomp_lib.h.
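The default output name described for outfile (infile.mod.suffix for an input called infile.suffix) can be sketched in C as follows. This is only an illustration of the naming rule, not OPARI's own code: the function name default_outfile is hypothetical, the suffix is assumed to start at the last dot of the file name, and names without a suffix are rejected because the text above does not say how they are handled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Builds "base.mod.suffix" from "base.suffix" into out.
 * Returns 0 on success, -1 if the name has no suffix in its last
 * path component or if the result does not fit into outsize bytes.
 */
int default_outfile(const char *infile, char *out, size_t outsize)
{
    assert(infile != NULL && out != NULL);

    const char *slash = strrchr(infile, '/');
    const char *dot = strrchr(infile, '.');

    /* Assumption: a dot inside a directory name is not a suffix. */
    if (dot == NULL || dot == infile || (slash != NULL && dot < slash))
        return -1;

    int base_len = (int)(dot - infile);
    int n = snprintf(out, outsize, "%.*s.mod%s", base_len, infile, dot);
    return (n < 0 || (size_t)n >= outsize) ? -1 : 0;
}
```

Called with "file1.f90" and a sufficiently large buffer, the function stores "file1.mod.f90".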
In summary, the typical usage of OPARI consists of the following steps:
- Reset OPARI state information by removing
the state information file if it exists.
% rm -f opari.rc
- Call OPARI for each input source file
% opari file1.f90
...
% opari fileN.f90
- Generate the OPARI runtime table and
compile it using an ANSI C compiler
% opari -table opari.tab.c
% cc -c opari.tab.c
- Compile all modified output files *.mod.* using the OpenMP
compiler
- Link the resulting object files against the
OPARI runtime table
opari.tab.o and the pomp
runtime measurement library.
LIMITATIONS
OPARI understands all OpenMP constructs defined by the
Fortran 77/90 OpenMP 2.0 and
C/C++ OpenMP 1.0 specifications
as well as the OpenMP extension
INST directives/pragmas
and the alternative
POMP sentinel proposed in the LACSI'01 paper
(lacsi01.ps.gz /
lacsi01.pdf).
Limitations due to fuzzy parsing
Because
OPARI does not contain full parsers for
the supported programming languages, the following restrictions apply:
Fortran 77/90:
- The !$OMP END DO and
!$OMP END PARALLEL DO directives
are required (and not optional as described in the OpenMP specification).
- The atomic expression controlled by a
!$OMP ATOMIC
directive has to be on a line all by itself.
- If the measurement environment does not support the automatic
recording of user function entries and exits, the
OPARI runtime
measurement library has to be initialized by a
!$OMP INST INIT
directive prior to any other OpenMP directive.
C/C++:
- Structured blocks describing the extent of an OpenMP pragma
need to be either compound statements {....},
while loops, or simple statements. In addition,
for loops are supported after
omp for and omp
parallel for pragmas.
Complex statements like
if-then-else or do-while need to
be enclosed in a block ( {....} ).
- If the measurement environment does not support the automatic
recording of user function entries and exits, the
OPARI runtime measurement library has to
be initialized by an omp inst
init pragma prior to any other OpenMP pragma.
We did not find these limitations overly restrictive during our tests and
experiments. They rarely apply to well-written code. If they do, the
original source code can easily be fixed. Of course, it would be possible
to remove these limitations by enhancing
OPARI's
parsing sophistication.
Limitations due to source-to-source translation
In addition, because of some subtleties in the OpenMP standard specifications,
the transformations performed by
OPARI on the
source code level can differ from the same instrumentation done by a real
OpenMP compiler. Here is the list of limitations we currently know about:
- OPARI makes implicit barriers explicit.
Unfortunately, this method cannot be used for measuring the barrier waiting
time at the end of PARALLEL directives because they do not
allow a NOWAIT clause. Therefore, we add an explicit
barrier with corresponding performance interface calls here.
For OPARI, this means that two barriers are actually executed.
But the second (implicit) barrier should execute and succeed
immediately because the threads of the OpenMP team are already synchronized
by the first barrier.
- The OpenMP standard (unfortunately) allows compilers to ignore
NOWAITs, which means that in this case OPARI inserts an extra barrier and the
pomp functions get invoked on this extra
(and not the real) barrier.
- OPARI cannot instrument the (required)
internal synchronization inside !$OMP
WORKSHARE.
- Mark Bull's Microbenchmarks
show that some compilers use different implementations (with different
characteristics) for implicit and explicit barriers. If OPARI
changes implicit to explicit barriers, we
measure the wrong behavior on these compilers.
Of course, an OpenMP compiler can insert the
pomp
performance interface calls directly around the implicit barriers, thereby
avoiding the described overheads and discrepancies.