/[NO]OPTIMIZE
Controls whether the compiler optimizes the generated code and
which optimizations it applies.
The default is /OPTIMIZE, which is the same as /OPTIMIZE=(LEVEL=4,
INLINE=SPEED, NOLOOPS, NOPIPELINE, TUNE=GENERIC, UNROLL=0). For a
debugging session, use the negative form (/NOOPTIMIZE or
/OPTIMIZE=LEVEL=0) to ensure that the debugger has sufficient
information to locate errors in the source program.
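
As a minimal sketch of a debugging build, the commands below turn
optimization off. MYPROG.F90 is a hypothetical file name, and the
/DEBUG qualifiers are an assumption added here because they
normally accompany a debugging session.

    $ ! MYPROG.F90 is a hypothetical source file
    $ FORTRAN/NOOPTIMIZE/DEBUG MYPROG.F90
    $ LINK/DEBUG MYPROG
    $ RUN MYPROG
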
In most cases, using /OPTIMIZE makes the program execute faster.
The trade-off is that /OPTIMIZE can produce larger object modules
and longer compile times than /NOOPTIMIZE.
To allow full interprocedural optimization when compiling multiple
source modules, consider separating source file specifications with
plus signs (+), so the files are concatenated and compiled as one
program. Full interprocedural optimization can reduce overall
program execution time. Consider not concatenating source files
when the size of the source files is excessively large and the
amount of memory or disk space is limited.
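
To illustrate the difference, the two commands below contrast
concatenated and separate compilation. The file names are
hypothetical.

    $ ! Plus signs: the files are compiled as one program
    $ FORTRAN/OPTIMIZE MAIN.F90+SUBS.F90+UTILS.F90
    $ ! Commas: each file is compiled separately
    $ FORTRAN/OPTIMIZE MAIN.F90,SUBS.F90,UTILS.F90

With the comma form, the compiler sees only one file at a time, so
it cannot analyze calls that cross file boundaries.
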
INLINE=keyword
Controls the inlining performed by the compiler. The keyword can
be any of the following:
Keyword   Meaning
-------   -------
NONE      Suppresses all inlining of routines.
MANUAL    This is the same as INLINE=NONE for VSI Fortran.
SIZE      Inlines calls that the compiler determines will
          improve run-time performance without significantly
          increasing the size of the program.
SPEED     Inlines calls that the compiler determines will
          improve run-time performance, even where it may
          significantly increase the size of the program.
ALL       Inlines every procedure call that can be inlined
          while still generating correct code. Recursive
          routines will not cause an infinite loop at
          compile time.
/OPTIMIZE=INLINE is equivalent to /OPTIMIZE=(INLINE=SPEED).
/OPTIMIZE=NOINLINE is equivalent to /OPTIMIZE=(INLINE=NONE).
For all optimization levels other than 0, the inlining mode is
the one specified on the command line. If no inlining mode is
explicitly specified, the compiler derives it from the
optimization level, as follows:
Level   Inlining Mode
-----   -------------
0       NONE
1       NONE
2       NONE
3       NONE
4       SPEED
5       SPEED
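
For instance, an inlining mode can be given explicitly instead of
being derived from the level. This is a sketch only; PROG.F90 is a
hypothetical file name.

    $ ! Explicit mode overrides the one derived from LEVEL=3
    $ FORTRAN/OPTIMIZE=(LEVEL=3,INLINE=SIZE) PROG.F90
    $ ! Keep the other LEVEL=4 optimizations, suppress inlining
    $ FORTRAN/OPTIMIZE=(LEVEL=4,NOINLINE) PROG.F90
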
LEVEL=n
Controls the level of optimization performed by the compiler.
The "n" is an integer in the range 0 through 5. LEVEL=0 is the
same as /NOOPTIMIZE; LEVEL=4 is the same as /OPTIMIZE. The
following explains the level numbers:
Level Number   Meaning
------------   -------
LEVEL=0        Disables nearly all optimizations.
LEVEL=1        Enables local optimizations within the source
               program unit and recognition of common
               subexpressions.
LEVEL=2        Enables global optimizations and optimizations
               performed with LEVEL=1.
LEVEL=3        Enables additional global optimizations that
               improve speed (at the cost of extra code size)
               and optimizations performed with LEVEL=2.
LEVEL=4        Enables interprocedural analysis, automatic
               inlining of small procedures (with heuristics
               limiting the amount of extra code), and
               optimizations performed with LEVEL=3. LEVEL=4
               is the default.
LEVEL=5        Activates software pipelining, loop
               transformation optimizations, and optimizations
               performed with LEVEL=4. Loop transformation
               optimizations apply to array references within
               loops. Software pipelining allows instructions
               within a loop to "wrap around" and execute in a
               different iteration of the loop. In certain
               cases, loop transformation and software
               pipelining can improve run-time performance.
For more information about these LEVEL numbers, see the HP Fortran
for OpenVMS User Manual.
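
A single keyword needs no parentheses, as this short illustration
shows (PROG.F90 is a hypothetical file name):

    $ ! Request the highest optimization level
    $ FORTRAN/OPTIMIZE=LEVEL=5 PROG.F90
    $ ! Local optimizations only
    $ FORTRAN/OPTIMIZE=LEVEL=1 PROG.F90
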
[NO]LOOPS
Specifies a group of loop transformation optimizations that apply
to array references within loops. These optimizations can
improve the performance of the memory system and usually apply to
multiply nested loops.
The loops chosen for loop transformation optimizations are always
"counted" loops (which include DO or IF loops, but not DO WHILE
loops).
Conditions that typically prevent the loop transformation
optimizations from occurring include subprogram references that
are not inlined (such as an external function call), complicated
exit conditions, and uncounted loops.
The types of optimizations associated with this option are:
Loop blocking
Loop distribution
Loop fusion
Loop interchange
Loop scalar replacement
Outer loop unrolling
These optimizations can be specified for /OPTIMIZE=LEVEL=2 or
higher; they are performed by default if /OPTIMIZE=LEVEL=5 is in
effect.
[NO]PIPELINE
Applies instruction scheduling to certain innermost loops,
allowing instructions within a loop to "wrap around" and execute
in a different iteration of the loop. This can reduce the impact
of long-latency operations, resulting in faster loop execution.
/OPTIMIZE=PIPELINE also enables prefetching of data to reduce the
impact of cache misses.
This type of optimization can be specified for /OPTIMIZE=LEVEL=2
or higher; it is performed by default if /OPTIMIZE=LEVEL=5 is in
effect.
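
The LOOPS and PIPELINE keywords might be combined with a level as
sketched below. PROG.F90 is a hypothetical file name, and the
second command assumes that a negated keyword overrides the
LEVEL=5 default.

    $ ! Add loop transformations and pipelining to LEVEL=4
    $ FORTRAN/OPTIMIZE=(LEVEL=4,LOOPS,PIPELINE) PROG.F90
    $ ! Assumed: LEVEL=5 without software pipelining
    $ FORTRAN/OPTIMIZE=(LEVEL=5,NOPIPELINE) PROG.F90
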
TUNE=keyword (Alpha only)
Specifies the kind of optimized code to be generated. The
keyword can be any of the following:
Keyword   Meaning
-------   -------
GENERIC   Generates and schedules code that will execute
          well for all generations of Alpha processors.
          This provides generally efficient code for those
          cases where all processor generations are likely
          to be used.

HOST      Generates and schedules code optimized for the
          processor generation in use on the system being
          used for compilation.

EV4       Generates and schedules code optimized for the
          21064, 21064A, 21066, and 21068 implementations
          of the Alpha chip.

          Programs compiled with the EV4 option run without
          instruction emulation overhead on all Alpha
          processors.

EV5       Generates and schedules code optimized for the
          21164 implementation of the Alpha chip. This
          processor generation is faster than EV4.

          Programs compiled with the EV5 option run without
          instruction emulation overhead on all Alpha
          processors.

EV56      Generates code for some 21164 chip implementations
          that use the byte and word manipulation instruction
          extensions of the Alpha architecture.

          Programs compiled with the EV56 option may incur
          emulation overhead on EV4 and EV5 processors, but
          will still run correctly on OpenVMS Version 7.1 (or
          later) systems.

EV6       Generates and schedules code for the 21264 chip
          implementation that uses the following extensions
          to the base Alpha instruction set: BWX (Byte/Word
          manipulation) and MAX (Multimedia) instructions,
          square root and floating-point convert instructions,
          and count instructions.

          Programs compiled with the EV6 option may incur
          emulation overhead on EV4, EV5, EV56, and PCA56
          processors, but will still run correctly on OpenVMS
          Version 7.1 (or later) systems.

EV67      Generates and schedules code for the 21264A chip
          implementation that uses the following extensions
          to the base Alpha instruction set: BWX (Byte/Word
          manipulation) and MAX (Multimedia) instructions,
          square root and floating-point convert instructions,
          and CIX (Count) instructions.

          Programs compiled with the EV67 option may incur
          emulation overhead on EV4, EV5, EV56, EV6, and PCA56
          processors, but will still run correctly on OpenVMS
          Version 7.1 (or later) systems.

PCA56     Generates code for the 21164PC chip implementation
          that uses the byte and word manipulation instruction
          extensions and multimedia instruction extensions
          of the Alpha architecture.

          Programs compiled with the PCA56 keyword may incur
          emulation overhead on EV4, EV5, and EV56 processors,
          but will still run correctly on OpenVMS Version 7.1
          (or later) systems.

The default is /OPTIMIZE=TUNE=GENERIC.
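
By way of example, on an Alpha system the tuning target could be
selected as follows (PROG.F90 is a hypothetical file name):

    $ ! Tune for the processor of the compiling system
    $ FORTRAN/OPTIMIZE=TUNE=HOST PROG.F90
    $ ! Tune for 21264 systems, keeping the default level
    $ FORTRAN/OPTIMIZE=(LEVEL=4,TUNE=EV6) PROG.F90
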
UNROLL=n
Controls loop unrolling done by the optimizer. UNROLL=n unrolls
loop bodies n times, where "n" is an integer in the range 0
through 16. UNROLL=0 (the default) means the optimizer will use
its default unroll amount. For more information, see the
HP Fortran for OpenVMS User Manual.
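
An explicit unroll count might be requested as in this sketch.
PROG.F90 is a hypothetical file name, and the value 2 is an
arbitrary choice for illustration, not a recommendation.

    $ ! Unroll loop bodies 2 times instead of the default amount
    $ FORTRAN/OPTIMIZE=(LEVEL=4,UNROLL=2) PROG.F90

Whether a given count helps depends on the loop, so it is worth
timing the program with and without it.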