By instrumenting a limited subset of the application code instead of all its
methods, you can reduce the profiling overhead dramatically. Choosing one or
more root methods and instrumenting their call subgraphs is the recommended
technique if you profile a long-running application that repeatedly performs
the same or similar actions, e.g. a server-side application. However, if you
want to profile application startup or a short-running command-line application,
it may be more efficient to instrument most or all of the application code.
There are three instrumentation schemes in Profiler that determine
the order in which methods are instrumented and the likely number of instrumented
methods: Lazy instrumentation, Total instrumentation, and Eager instrumentation.
Lazy instrumentation scheme. This is the default scheme used in
the Predefined Tasks -> Analyze Performance -> Part of application
mode and is best for profiling long-running applications.
When this scheme is used, Profiler instruments methods in the following
way. First, the root method is instrumented. Next, whenever any instrumented
method m() is executed for the very first time, its code is
scanned and all methods that m() may call are instrumented
in turn. In this way, we ensure that the number of methods that are instrumented
is close to the number of methods that will actually be invoked by the target
application during its execution lifetime. By minimizing the number of methods
that are instrumented but not called from the root method, we reduce the
instrumentation time itself. We also avoid the overhead that may otherwise
be incurred if an instrumented method is not actually called from our root,
but is still called from some other part of the program.
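
The steps above can be sketched as a small simulation. This is only a toy model under stated assumptions, not Profiler's actual implementation: the call graph MAY_CALL, the method names, and the onExecute() hook are all hypothetical, and real instrumentation rewrites bytecode rather than updating a set.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of lazy instrumentation (hypothetical names, not Profiler's code).
class LazyInstrumentationSketch {
    // Hypothetical static call graph: method -> methods its code may call.
    static final Map<String, List<String>> MAY_CALL = Map.of(
            "root", List.of("a", "b"),
            "a", List.of("c"),
            "b", List.of("d"),
            "c", List.of(),
            "d", List.of("e"),
            "e", List.of());

    final Set<String> instrumented = new LinkedHashSet<>();
    final Set<String> executed = new LinkedHashSet<>();

    LazyInstrumentationSketch(String root) {
        instrumented.add(root); // first, only the root method is instrumented
    }

    // Invoked each time a method runs. On the very first execution of an
    // instrumented method, its possible callees are instrumented in turn.
    void onExecute(String method) {
        boolean firstTime = executed.add(method);
        if (firstTime && instrumented.contains(method)) {
            instrumented.addAll(MAY_CALL.getOrDefault(method, List.of()));
        }
    }

    // Replays an execution trace and returns the set of instrumented methods.
    static Set<String> simulate(String root, List<String> trace) {
        LazyInstrumentationSketch sketch = new LazyInstrumentationSketch(root);
        for (String method : trace) {
            sketch.onExecute(method);
        }
        return sketch.instrumented;
    }

    public static void main(String[] args) {
        // At run time only root -> a -> c is executed; b is never called.
        System.out.println(simulate("root", List.of("root", "a", "c")));
        // prints [root, a, b, c]
    }
}
```

In this run, d() and e() are never instrumented because b() never executes, while b() itself is instrumented without being called.
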
Nevertheless, the number of instrumented methods is usually greater than
the number of methods actually invoked. The primary reason for this is that
in a typical Java application, many methods are virtual (that is, instance
methods that subclasses may override).
Therefore, if there is a call looking like x.m() in the source
code, and variable x has type X, Profiler has to
instrument not only the X.m() method, but also all implementations
of method m() in subclasses of X. This is because
it is generally not possible to find out in advance what the actual class
of x at run time is going to be (in fact it may change many
times), and thus which particular implementation of m() is
going to be called.
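
A minimal illustration of why all overriding implementations have to be instrumented. The classes X, Y and Z and the helper call() are made up for this example; only the dispatch behavior itself is standard Java.

```java
// Hypothetical class hierarchy used to illustrate virtual dispatch.
class VirtualDispatchSketch {
    static class X {
        String m() { return "X.m"; }
    }

    static class Y extends X {
        @Override
        String m() { return "Y.m"; }
    }

    static class Z extends X {
        @Override
        String m() { return "Z.m"; }
    }

    // The declared type of x is X, but the method that actually runs
    // depends on the run-time class of the object that x refers to.
    static String call(X x) {
        return x.m();
    }

    public static void main(String[] args) {
        X[] receivers = { new X(), new Y(), new Z() };
        for (X x : receivers) {
            System.out.println(call(x)); // prints X.m, then Y.m, then Z.m
        }
    }
}
```

The single call site x.m() in call() reaches three different method bodies, so a tool that looks only at the code of call() has to treat all three as possible callees.
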
The lazy instrumentation scheme is best for profiling long-running applications,
but it may not work very well if you need to profile the application while
it is loading a lot of classes, or while instrumentation
is in progress. This is due to the fact that in the HotSpot JVM, which Profiler
uses, many application methods are usually compiled into machine code to improve
performance. If a method gets instrumented later, the JVM initially switches back
to interpretation of its bytecodes (later it can compile this method again).
A temporary switch to interpretation may happen even to some methods that
are not instrumented themselves, but call instrumented methods. For this
reason, during the time when Profiler is discovering the call graph, and for
some time afterwards, the application may run considerably slower than normal,
and the CPU profiling results obtained during this period are not representative
of its normal execution. It is recommended to run the application for some
time after initiating the call graph instrumentation (for example, if it's
a server-side application, you can make it process a few hundred or thousand
requests) and then discard the profiling results accumulated so far by
choosing Profile -> Reset Collected Results.
Results collected afterwards will reflect normal execution much more accurately.
Total instrumentation scheme. This scheme is used in the Analyze Performance
-> Entire application and Analyze Application Startup predefined profiling
tasks.
With this scheme selected, Profiler works like most other profilers: it
instruments all of the target application's methods as soon as the class
containing them is loaded. This scheme may result in a huge overhead
for long-running applications, but it usually gives more accurate results
and may even impose a smaller overhead if used for profiling application
startup or for profiling relatively small, short-running applications such
as command-line utilities. This is because it does not suffer from the repeated
method decompilation and recompilation problem of the lazy instrumentation
scheme, discussed above.
Eager instrumentation scheme. This scheme is a compromise between the two
schemes above. Unlike the lazy scheme, the eager
scheme locates all potentially reachable methods (that is, methods that
can be called by the root method directly or transitively) in a class as
soon as this class is loaded. Eager analysis for potential reachability
may lead to many more methods being instrumented than are called (sometimes
ten times as many or more), so this scheme is generally not recommended for long-running
applications. However, it usually results in a much smaller number of repeated
method decompilations and recompilations, and thus may be useful, for example,
when profiling an action that happens during application startup and
total instrumentation turns out to impose too much overhead.
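
The reachability analysis behind the eager scheme can be approximated by a transitive closure over the static call graph. The sketch below is a simplification under stated assumptions: it ignores class boundaries and class-loading order, and the graph MAY_CALL and the reachable() helper are hypothetical rather than part of Profiler.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of eager reachability analysis (hypothetical names).
class EagerReachabilitySketch {
    // Hypothetical static call graph: method -> methods its code may call.
    static final Map<String, List<String>> MAY_CALL = Map.of(
            "root", List.of("a", "b"),
            "a", List.of("c"),
            "b", List.of("d"),
            "c", List.of(),
            "d", List.of("e"),
            "e", List.of());

    // Collects every method the root can call directly or transitively,
    // whether or not it is ever executed at run time.
    static Set<String> reachable(String root, Map<String, List<String>> mayCall) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.add(root);
        while (!work.isEmpty()) {
            String method = work.poll();
            if (seen.add(method)) {
                work.addAll(mayCall.getOrDefault(method, List.of()));
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(reachable("root", MAY_CALL));
        // prints [root, a, b, c, d, e]
    }
}
```

All six methods are selected up front here, even if a given run only ever executes root(), a() and c(). That is the trade-off described above: more instrumented methods in exchange for fewer instrumentation events after the code is already running.
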