Debugging Models
When developing a new physics model, or using an existing one in new
regimes, it’s expected that things will occasionally go wrong and
you’ll need to debug the program. While debuggers like gdb are
very powerful, using them in parallel can be difficult unless you have
access to a dedicated parallel debugger. BOUT++ has some utilities
that make it easier to debug issues that only arise in parallel and/or
long-running simulations.
Loggers
The first of these is the standard “write to screen” approach, using the
output.write family of logging functions. If
you have a bug which is easily reproducible and occurs almost
immediately every time you run the code, then this is probably the
easiest way to hunt it down.
The main downside of (most of) these loggers is that if you have a lot of
output they will slow down simulations. Even if you use the --quiet
command line option to turn them off, they will still add some overhead.
The output_debug logger can be disabled entirely at
compile-time (so there will be no overhead at all), which means it’s
well suited to adding in-depth diagnostic or debugging information
that can be kept permanently in the code and only enabled if needed.
To enable the output_debug messages, configure BOUT++ with a CHECK
level >= 3. To enable them at lower check levels, configure BOUT++ with
-DENABLE_OUTPUT_DEBUG. When running BOUT++, add the -v -v flags to see
output_debug messages.
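To illustrate what “disabled at compile-time” means in general, here is a minimal standalone sketch. It is not BOUT++’s implementation: SketchLogger, SKETCH_ENABLE_DEBUG_LOG and capturedOutput are hypothetical names invented for this example, and it assumes a C++17 compiler.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical build-time switch (not a BOUT++ option); set with
// -DSKETCH_ENABLE_DEBUG_LOG=1 on the compiler command line.
#ifndef SKETCH_ENABLE_DEBUG_LOG
#define SKETCH_ENABLE_DEBUG_LOG 0
#endif

// Logger whose write() body is discarded by the compiler when Enabled is false.
template <bool Enabled>
class SketchLogger {
public:
  explicit SketchLogger(std::ostream& out) : out_(out) {}

  template <typename... Args>
  void write(const Args&... args) {
    if constexpr (Enabled) {
      (out_ << ... << args); // stream every argument in order
      out_ << '\n';
    }
    // When Enabled is false nothing is formatted or written. Note that the
    // argument expressions at the call site are still evaluated.
  }

private:
  std::ostream& out_;
};

// The logger type a program would use, selected by the build-time switch.
using DebugLog = SketchLogger<(SKETCH_ENABLE_DEBUG_LOG != 0)>;

// Helper for illustration: returns the text a logger writes for one message.
template <bool Enabled>
std::string capturedOutput() {
  std::ostringstream buffer;
  SketchLogger<Enabled> log(buffer);
  log.write("n = ", 42, ", dt = ", 0.5);
  return buffer.str();
}
```

Calling capturedOutput<true>() returns the formatted line, while capturedOutput<false>() returns an empty string because the disabled branch is never compiled. A macro-based logger can go one step further and avoid evaluating the arguments at all.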
Message Stack
The second utility BOUT++ has to help debugging is the message stack,
used through the TRACE (and related AUTO_TRACE) macro. These are very
useful when a bug only occurs after a long time of running, and/or
only occasionally. The TRACE macro can simply be dropped in
anywhere in the code:
{
  TRACE("Some message here"); // message pushed
} // Scope ends, message popped
This will push the message, then pop the message when the current scope ends. If an error occurs or BOUT++ crashes, any un-popped messages will be printed, along with the file name and line number, to help find where an error occurred. For example, given this snippet:
{
  TRACE("1. Outer-most scope");
  {
    TRACE("2. Middle scope");
    {
      TRACE("3. Scope not appearing in the output");
    }
    {
      TRACE("4. Inner-most scope");
      throw BoutException("Something went wrong");
    }
  }
}
we would see something like the following output:
====== Back trace ======
-> 4. Inner-most scope on line 58 of '/path/to/model.cxx'
-> 2. Middle scope on line 53 of '/path/to/model.cxx'
-> 1. Outer-most scope on line 51 of '/path/to/model.cxx'
====== Exception thrown ======
Something went wrong
The third TRACE message doesn’t appear in the output because we’ve
left its scope and it’s no longer relevant.
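The push-on-entry, pop-on-exit behaviour can be reproduced with a small scope guard. The following standalone sketch shows one way such a stack could work; it is not taken from the BOUT++ source, and sketchStack, ScopedMessage, SketchException and runNestedScopes are hypothetical names used only here. The exception copies the stack when it is constructed, because unwinding pops the messages before a handler runs.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical global message stack (illustration only).
inline std::vector<std::string>& sketchStack() {
  static std::vector<std::string> stack;
  return stack;
}

// Pushes a message on construction and pops it when the scope ends.
class ScopedMessage {
public:
  explicit ScopedMessage(std::string msg) {
    sketchStack().push_back(std::move(msg));
  }
  ~ScopedMessage() {
    assert(!sketchStack().empty());
    sketchStack().pop_back();
  }
  ScopedMessage(const ScopedMessage&) = delete;
  ScopedMessage& operator=(const ScopedMessage&) = delete;
};

// Exception that snapshots the stack at construction, innermost message first.
class SketchException : public std::runtime_error {
public:
  explicit SketchException(const std::string& what)
      : std::runtime_error(what),
        trace(sketchStack().rbegin(), sketchStack().rend()) {}

  std::vector<std::string> trace;
};

// Mirrors the nested-scope example above and returns the captured trace.
inline std::vector<std::string> runNestedScopes() {
  try {
    ScopedMessage outer("1. Outer-most scope");
    {
      ScopedMessage middle("2. Middle scope");
      {
        ScopedMessage gone("3. Scope not appearing in the output");
      } // message 3 popped here, before the throw
      {
        ScopedMessage inner("4. Inner-most scope");
        throw SketchException("Something went wrong");
      }
    }
  } catch (const SketchException& e) {
    return e.trace;
  }
  return {};
}
```

The trace returned by runNestedScopes() holds messages 4, 2 and 1 in that order, matching the order of the back trace shown above, and the stack itself is empty again once every scope has been left.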
The run-time overhead of this should be small, but can be removed
entirely if the compile-time flag -DCHECK is not defined or set to 0.
This turns off checking, and TRACE becomes an empty macro. This means
that TRACE macros can be left in your code
permanently, providing some simple diagnostics without compromising
performance, as well as demarcating separate sections with
user-friendly names.
If you need to capture runtime information in the message, you can use
the fmt syntax also used by the loggers:
TRACE("Value of i={}, some arbitrary {}", i, "string");
There is also an AUTO_TRACE macro that automatically captures the
name of the function it’s used in. This is used throughout the main
library, especially in functions where numerical issues are likely to
arise.
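A macro can pick up the enclosing function’s name through the standard predefined identifier __func__. The sketch below shows the idea in isolation; it is an assumption about how such a macro might be built, not the definition of AUTO_TRACE, and SKETCH_AUTO_TRACE, FunctionGuard, activeFunctions, innermostFunction and laplaceSolve are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stack of the functions currently being executed.
inline std::vector<std::string>& activeFunctions() {
  static std::vector<std::string> names;
  return names;
}

// Records a function name for as long as the guard object is alive.
class FunctionGuard {
public:
  explicit FunctionGuard(const char* name) {
    activeFunctions().emplace_back(name);
  }
  ~FunctionGuard() {
    assert(!activeFunctions().empty());
    activeFunctions().pop_back();
  }
  FunctionGuard(const FunctionGuard&) = delete;
  FunctionGuard& operator=(const FunctionGuard&) = delete;
};

// __func__ expands to the name of the function the macro is used in,
// so the caller does not have to write a message by hand.
#define SKETCH_AUTO_TRACE() FunctionGuard sketch_function_guard(__func__)

// Name of the innermost traced function, or an empty string if none.
inline std::string innermostFunction() {
  return activeFunctions().empty() ? std::string() : activeFunctions().back();
}

// Example function: the guard registers "laplaceSolve" automatically.
inline std::string laplaceSolve() {
  SKETCH_AUTO_TRACE();
  return innermostFunction();
}
```

Here laplaceSolve() returns its own name, and once it has returned innermostFunction() is empty again because the guard has been destroyed.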
Backtrace
Lastly, BOUT++ can also automatically print a backtrace in the event
of a crash. This is a compile-time option in the BOUT++ library
(-DBOUT_ENABLE_BACKTRACE=ON, which is the default and requires the
addr2line program to be installed). It also requires debug symbols to
be turned on (-DCMAKE_BUILD_TYPE=Debug or =RelWithDebInfo) in both
BOUT++ and the physics model. If debug symbols are only present in one
of them, the backtrace will be missing names for the other.
The output looks something like this:
...
Error encountered
====== Exception path ======
[bt] #10 ./backtrace() [0x40a27e]
_start at /home/abuild/rpmbuild/BUILD/glibc-2.33/csu/../sysdeps/x86_64/start.S:122
[bt] #9 /lib64/libc.so.6(__libc_start_main+0xd5) [0x7fecbfa28b25]
__libc_start_main at /usr/src/debug/glibc-2.33-4.1.x86_64/csu/../csu/libc-start.c:332
[bt] #8 ./backtrace() [0x40a467]
main at /path/to/BOUT-dev/build/../examples/backtrace/backtrace.cxx:32 (discriminator 9)
[bt] #7 /path/to/BOUT-dev/build/libbout++.so(_ZN6Solver8setModelEP12PhysicsModel+0xb5) [0x7fecc0ca2e93]
Solver::setModel(PhysicsModel*) at /path/to/BOUT-dev/build/../src/solver/solver.cxx:94
[bt] #6 /path/to/BOUT-dev/build/libbout++.so(_ZN12PhysicsModel10initialiseEP6Solver+0xc0) [0x7fecc0cad594]
PhysicsModel::initialise(Solver*) at /path/to/BOUT-dev/build/../include/bout/physicsmodel.hxx:93 (discriminator 5)
[bt] #5 ./backtrace() [0x40a986]
Backtrace::init(bool) at /path/to/BOUT-dev/build/../examples/backtrace/backtrace.cxx:27
[bt] #4 ./backtrace() [0x40a3cf]
f3() at /path/to/BOUT-dev/build/../examples/backtrace/backtrace.cxx:19
[bt] #3 ./backtrace() [0x40a3be]
f2(int) at /path/to/BOUT-dev/build/../examples/backtrace/backtrace.cxx:15
[bt] #2 ./backtrace() [0x40a386]
f1() at /path/to/BOUT-dev/build/../examples/backtrace/backtrace.cxx:13 (discriminator 2)
[bt] #1 ./backtrace(_ZN13BoutExceptionC1IA19_cJEEERKT_DpRKT0_+0xba) [0x40ae16]
BoutException::BoutException<char [19]>(char const (&) [19]) at /path/to/BOUT-dev/build/../include/bout/../boutexception.hxx:28 (discriminator 2)
This output tends to be much harder to read than the message stack
from TRACE macros, but the advantage is that it doesn’t require
any modifications to the code to use, and can give you more precise
location information.
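For readers curious where the raw frame lines come from, the sketch below collects a backtrace with the backtrace and backtrace_symbols functions from execinfo.h. This header is provided by glibc (and macOS) but is not part of standard C++, so the example assumes such a platform. Whether BOUT++ uses exactly these calls is an assumption here, and captureFrames is a hypothetical helper. Turning the addresses into file names and line numbers is a separate step, which is where a tool like addr2line and debug symbols come in.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <execinfo.h> // backtrace, backtrace_symbols (glibc / macOS)
#include <string>
#include <vector>

// Returns one string per stack frame, innermost frame first.
inline std::vector<std::string> captureFrames(int max_frames = 64) {
  assert(max_frames > 0);
  std::vector<void*> addresses(static_cast<std::size_t>(max_frames));

  // Fill the buffer with return addresses; n is the number actually stored.
  const int n = backtrace(addresses.data(), max_frames);

  // Convert addresses to strings such as "./prog(symbol+0x1a) [0x40a27e]".
  char** symbols = backtrace_symbols(addresses.data(), n);

  std::vector<std::string> frames;
  if (symbols != nullptr) {
    for (int i = 0; i < n; ++i) {
      frames.emplace_back(symbols[i]);
    }
    std::free(symbols); // only the array itself must be freed
  }
  return frames;
}
```

Each returned string contains the binary, an optional mangled symbol with an offset, and the address. Without debug symbols these addresses cannot be mapped back to source lines, which is why the build type matters.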