OPARI: OpenMP Pragma And Region Instrumentor

OPARI is a source-to-source translation tool that automatically adds all necessary calls to the pomp runtime measurement library, which collects runtime performance data from Fortran, C, or C++ OpenMP applications. It is based on the idea of OpenMP pragma/directive rewriting, which is described in detail in a paper (PostScript, PDF) for LACSI'01.

OPARI was developed as part of the KOJAK and TAU projects.

DOWNLOAD

This software is free, but it is copyright © 2001 by Forschungszentrum Juelich, ZAM, Germany. By downloading and using this software, you automatically agree to comply with the regulations described in the license agreement.

Sources in gzipped tar format
Version	Date	Description
1.1	17-Oct-2001	Changes
1.0	28-Aug-2001	Initial version

USAGE

Before compiling the source files of an OpenMP application, each file needs to be transformed by a call to the OPARI tool. In addition, the application has to be linked against the pomp runtime measurement library and the OPARI runtime table file. The latter has to be generated by using the -table option to OPARI either together with the transformation of the last input source file or with a separate call to OPARI after all transformations are done. A call to OPARI has the following syntax:

The options and parameters have the following meaning:

`[-f77|-f90|-c|-c++]`	[OPTIONAL] Specifies the programming language of the input source file. This option is only necessary if the automatic language detection based on the input file suffix fails.
`[-nosrc]`	[OPTIONAL] If specified, OPARI does not generate `#line` constructs during the transformation. These constructs preserve the original source file and line number information. This option might be necessary if the OpenMP compiler does not understand `#line` constructs. The default is to generate `#line` constructs.
`[-rcfile file]`	[OPTIONAL] OPARI uses the file `./opari.rc` to preserve state information between calls to OPARI if the OpenMP application consists of more than one source file. With the `-rcfile` option, the file `file` is used instead. This can be useful if more than one application is stored in the same directory or if the source files of an application are stored in more than one directory.
`-table tabfile`	Generate the OPARI runtime table in file `tabfile`. This option has to be used either together with the call to OPARI for the transformation of the last input source file or with a separate call to OPARI after all transformations are done.
`-disable constructs`	[OPTIONAL] Disable the instrumentation of the more fine-grained OpenMP constructs such as `!$OMP ATOMIC`. `constructs` is a comma-separated list of the constructs for which the instrumentation should be disabled. Accepted tokens are `atomic`, `critical`, `master`, `flush`, `single`, and `locks`, as well as `sync`, which disables all of them.
`infile`	Input file name.
`[outfile]`	[OPTIONAL] Output file name. If not specified, OPARI uses the name `infile.mod.suffix` if the input file is called `infile.suffix`.

In addition to the modified output file, OPARI also generates a file named `infile.opari.inc`. It contains OpenMP region descriptors, one for each OpenMP region found in the input file. The meaning of the descriptor fields is described in the file `pomp_lib.h`.

In summary, the typical usage of OPARI consists of the following steps:

Reset OPARI state information by removing the state information file if it exists.
```
    % rm -f opari.rc
    
```

Call OPARI for each input source file.

    % opari file1.f90
    ...
    % opari fileN.f90

Generate the OPARI runtime table and compile it using an ANSI C compiler.
```
    % opari -table opari.tab.c
    % cc -c opari.tab.c
    
```

Compile all modified output files `*.mod.*` using the OpenMP compiler.

Link the resulting object files against the OPARI runtime table `opari.tab.o` and the pomp runtime measurement library.

LIMITATIONS

OPARI understands all OpenMP constructs defined by the Fortran 77/90 OpenMP 2.0 and C/C++ OpenMP 1.0 specifications, as well as the OpenMP extension INST directives/pragmas and the alternative POMP sentinel proposed in lacsi01.ps.gz / lacsi01.pdf.

Limitations due to fuzzy parsing

Because OPARI does not contain full parsers for the supported programming languages, the following restrictions apply:

Fortran 77/90:

The !$OMP END DO and !$OMP END PARALLEL DO directives are required, although the OpenMP specification makes them optional.
The atomic expression controlled by a !$OMP ATOMIC directive has to be on a line all by itself.
If the measurement environment does not support the automatic recording of user function entries and exits, the OPARI runtime measurement library has to be initialized by a !$OMP INST INIT directive prior to any other OpenMP directive.

C/C++:

Structured blocks describing the extent of an OpenMP pragma need to be either compound statements {....}, while loops, or simple statements. In addition, for loops are supported after omp for and omp parallel for pragmas. Complex statements like if-then-else or do-while need to be enclosed in a block ( {....} ).
If the measurement environment does not support the automatic recording of user function entries and exits, the OPARI runtime measurement library has to be initialized by an omp inst init pragma prior to any other OpenMP pragma.

We did not find these limitations overly restrictive during our tests and experiments. They rarely apply to well-written code. If they do, the original source code can easily be fixed. Of course, it would be possible to remove these limitations by enhancing OPARI's parsing sophistication.

Limitations due to source-to-source translation

In addition, because of some subtleties in the OpenMP standard specifications, the transformations performed by OPARI at the source-code level can differ from the same instrumentation done by a real OpenMP compiler. Here is the list of limitations we currently know about:

OPARI makes implicit barriers explicit. Unfortunately, this method cannot be used for measuring the barrier waiting time at the end of PARALLEL directives because they do not allow a NOWAIT clause. Therefore, we add an explicit barrier with corresponding performance interface calls here. For OPARI, this means that actually two barriers get called. But the second (implicit) barrier should execute and succeed immediately because the threads of the OpenMP team are already synchronized by the first barrier.
The OpenMP standard (unfortunately) allows compilers to ignore NOWAITs, which means that in this case OPARI inserts an extra barrier and the pomp functions get invoked on this extra (and not the real) barrier.
OPARI cannot instrument the (required) internal synchronization inside !$OMP WORKSHARE.
Mark Bull's Microbenchmarks show that some compilers use different implementations (with different characteristics) for implicit and explicit barriers. If OPARI changes implicit to explicit barriers, we measure the wrong behavior on these compilers.

Of course, an OpenMP compiler can insert the pomp performance interface calls directly around the implicit barriers, thereby avoiding the described overheads and discrepancies.
