Welcome to BOUT++'s documentation!
The documentation is divided into the following sections:
- User documentation
- Developer Documentation
Introduction
BOUT++ is a C++ framework for writing plasma fluid simulations with an arbitrary number of equations in 3D curvilinear coordinates. More specifically, it is a multi-block structured finite difference (/volume) code in curvilinear coordinates, with some features to support unusual coordinate systems used in fusion plasma physics. It has been developed from the original BOUndary Turbulence 3D two-fluid edge simulation code written by X. Xu and M. Umansky at LLNL.
The aim of BOUT++ is to automate the common tasks needed for simulation codes, and to separate the complicated (and error-prone) details such as differential geometry, parallel communication, and file input/output from the user-specified equations to be solved. Thus the equations being solved are made clear, and can be easily changed with only minimal knowledge of the inner workings of the code. As far as possible, this allows the user to concentrate on the physics rather than worrying about the numerics. This doesn't mean that users don't have to think about numerical methods, so selecting differencing schemes and boundary conditions is discussed in this manual. The generality of BOUT++ of course also comes with a limitation: although there is a large class of problems which can be tackled by this code, there are many more problems which require a more specialised solver and which BOUT++ will not be able to handle. Hopefully this manual will enable you to test whether BOUT++ is suitable for your problem as quickly and painlessly as possible.
BOUT++ treats time integration and spatial operators separately, an approach called the Method of Lines (MOL). This means that BOUT++ consists of two main parts:
- A set of Ordinary Differential Equation (ODE) integrators, including implicit, explicit and IMEX schemes, such as Runge-Kutta and the CVODE solver from SUNDIALS. These don't "know" anything about the equations being solved, only requiring the time derivative of the system state. For example, they make no distinction between the different evolving fields, or the number of dimensions in the simulation. This kind of problem-specific information can be used to improve efficiency, and is usually supplied in the form of user-supplied preconditioners. See section Options for more details.
- A set of operators and data types for calculating time derivatives, given the system state. These calculate things like algebraic operations (+, -, *, / etc.), spatial derivatives, and some integral operators.
Each of these two parts treats the other as a (mostly) black box, and they communicate by exchanging arrays of data: the ODE integrator finds the system state at a given time and passes it to the problem-dependent code, which uses a combination of operators to calculate the time derivative. This time derivative is passed back to the ODE integrator, which updates the state, and the cycle continues. This scheme has some advantages in terms of flexibility: each part of the code doesn't depend on the details of the other, so it can be changed without requiring modifications to the other. Unfortunately, for many problems the details can make a big difference, so ways to provide problem-specific information to time integrators, such as preconditioners, are also provided.
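The exchange described above can be sketched in a few lines of Python. This is a generic illustration of the Method of Lines split, not BOUT++ code: the integrator function and right-hand-side function are hypothetical names chosen for this sketch, and they communicate only through the state and its time derivative.

```python
# Generic Method of Lines sketch (illustrative names, not the BOUT++ API).
# The integrator treats the right-hand side as a black box: it only asks
# for d(state)/dt at a given time and state, exactly the exchange
# described in the text.

def rk4_step(rhs, state, t, dt):
    """One classic 4th-order Runge-Kutta step over a list of values."""
    k1 = rhs(t, state)
    k2 = rhs(t + dt / 2, [s + dt / 2 * k for s, k in zip(state, k1)])
    k3 = rhs(t + dt / 2, [s + dt / 2 * k for s, k in zip(state, k2)])
    k4 = rhs(t + dt, [s + dt * k for s, k in zip(state, k3)])
    return [s + dt / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def decay_rhs(t, state):
    """Problem-dependent code: simple exponential decay dy/dt = -y."""
    return [-y for y in state]

# The "cycle" from the text: the integrator asks for the time derivative,
# updates the state, and repeats.
state = [1.0]
t, dt = 0.0, 0.01
for _ in range(100):
    state = rk4_step(decay_rhs, state, t, dt)
    t += dt
# state[0] is now approximately exp(-1)
```

Swapping `decay_rhs` for a different function changes the physics without touching the integrator, which is the flexibility the text describes.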
Though designed to simulate tokamak edge plasmas, the methods used are very general and almost any metric tensor can be specified, allowing the code to be used to perform simulations in (for example) slab, sheared slab, and cylindrical coordinates. The restrictions on the simulation domain are that the equilibrium must be axisymmetric (in the z coordinate), and that the parallelisation is done in the \(x\) and \(y\) (parallel to \(\mathbf{B}\)) directions.
After describing how to install BOUT++ (section Getting started), run the test suite (section Running the test suite) and a few examples (section Running BOUT++, more detail in section More examples), increasingly sophisticated ways to modify the problem being solved are introduced. The simplest way to modify a simulation case is by altering the input options, described in section BOUT++ options. Checking that the options are doing what you think they should be by looking at the output logs is described in section Running BOUT++, and an overview of the IDL analysis routines for data postprocessing and visualisation is given in section Postprocessing. Generating new grid files, particularly for tokamak equilibria, is described in section Generating input grids.
Up to this point, little programming experience has been assumed, but performing more drastic alterations to the physics model requires modifying C++ code. Section BOUT++ physics models describes how to write a new physics model specifying the equations to be solved, using ideal MHD as an example. The remaining sections describe in more detail aspects of using BOUT++: section Differential operators describes the differential operators and methods available; section Staggered grids covers the experimental staggered grid system.
Various sources of documentation are:
- This manual
- Most directories in the BOUT++ distribution contain a README file. This should describe briefly what the contents of the directory are and how to use them.
- Most of the code contains Doxygen comment tags (which are slowly getting better). Running doxygen on these files should therefore generate an HTML reference. This is probably going to be the most up-to-date documentation.
License and terms of use
Copyright 2010 B.D.Dudson, S.Farley, M.V.Umansky, X.Q.Xu
BOUT++ is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
BOUT++ is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License along with BOUT++. If not, see <https://www.gnu.org/licenses/>.
A copy of the LGPL license is in COPYING.LESSER. Since this is based on (and refers to) the GPL, this is included in COPYING.
BOUT++ is free software, but since it is a scientific code we also ask that you show professional courtesy when using this code:
- Since you are benefiting from work on BOUT++, we ask that you submit any improvements you make to the code to us by emailing Ben Dudson at bd512@york.ac.uk
- If you use BOUT++ results in a paper or professional publication, we ask that you send your results to one of the BOUT++ authors first so that we can check them. It is understood that in most cases if one or more of the BOUT++ team are involved in preparing results, then they should appear as co-authors.
- Publications or figures made with the BOUT++ code should acknowledge the BOUT++ code by citing B. Dudson et al., Comp. Phys. Comm. 2009 and/or other BOUT++ papers. See the file CITATION for details.
Getting started
This section goes through the process of getting, installing, and starting to run BOUT++.
The quickest way to get started is to use a pre-built binary. These take care of all dependencies, configuration and compilation. See section Docker image.
The remainder of this section will go through the following steps to manually install BOUT++. Only the basic functionality needed to use BOUT++ is described here; the next section (Advanced installation options) goes through more advanced options, configurations for particular machines, and how to fix some common problems.
- Obtaining a copy of BOUT++
- Installing dependencies
- Configuring BOUT++
- Configuring BOUT++ analysis codes
- Compiling BOUT++
- Running the test suite
- Installing BOUT++ (experimental)
Note: In this manual, commands to run in a BASH shell will begin with "$", and commands specific to CSH with a "%".
Pre-built binaries
Docker image
Docker is a widely used container system, which packages together the operating system environment, libraries and other dependencies into an image. This image can be downloaded and run reproducibly on a wide range of hosts, including Windows, Linux and OS X. Here is the starting page for instructions on installing Docker.
The BOUT++ docker images are hosted on Docker Hub for some releases and snapshots. Check the list of BOUT-next tags if you want a recent version of the BOUT++ "next" (development) branch. First download the image:
$ sudo docker pull boutproject/boutnext:9f4c663-petsc
then run:
$ sudo docker run --rm -it boutproject/boutnext:9f4c663-petsc
This should give a terminal in a "boutuser" home directory, in which there is "BOUT-next", containing BOUT++ configured and compiled with NetCDF, HDF5, SUNDIALS, PETSc and SLEPc. Python 3 is also installed, with ipython, NumPy, SciPy and Matplotlib libraries. To plot to screen, an X11 display is needed. Alternatively, a shared directory can be created to pass files between the docker image and host. The following commands both enable X11 and create a shared directory:
$ mkdir shared
$ sudo docker run --rm -it \
  -e DISPLAY -v $HOME/.Xauthority:/home/boutuser/.Xauthority --net=host \
  -v $PWD/shared:/home/boutuser/bout-img-shared \
  boutproject/boutnext:9f4c663-petsc
This should enable plotting from python, and files in the docker image put in /home/boutuser/bout-img-shared should be visible on the host in the "shared" directory.
If this is successful, then you can skip to section Running BOUT++.
Obtaining BOUT++
BOUT++ is hosted publicly on github at https://github.com/boutproject/BOUT-dev. You can download the latest stable version from https://github.com/boutproject/BOUT-dev/releases. If you want to develop BOUT++, you should use git to clone the repository. To obtain a copy of the latest version, run:
$ git clone git://github.com/boutproject/BOUT-dev.git
which will create a directory BOUT-dev containing the code. To get the latest changes later, go into the BOUT-dev directory and run:
$ git pull
Development is done on the "next" branch, which you can checkout with:
$ git checkout next
Installing dependencies
The bare-minimum requirements for compiling and running BOUT++ are:
- A C++ compiler that supports C++11
- An MPI compiler such as OpenMPI (www.open-mpi.org/), MPICH (https://www.mpich.org/) or LAM (www.lam-mpi.org/)
- The NetCDF library (https://www.unidata.ucar.edu/downloads/netcdf)
The FFTW3 library (http://www.fftw.org/) is also strongly recommended. Fourier transforms are used for some derivative methods, as well as the ShiftedMetric parallel transform which is used in the majority of BOUT++ tokamak simulations. Without FFTW3, these options will not be available.
Note
Only GCC versions >= 4.9 are supported. This is due to a bug in previous versions.
Note
If you use an Intel compiler, you must also make sure that you have a version of GCC that supports C++11 (GCC 4.8+).
On supercomputers, or in other environments that use a module system, you may need to load modules for both Intel and GCC.
On a cluster or supercomputer
If you are installing on a cluster or supercomputer then the MPI C++ compilers will already be installed, and on Cray or IBM machines will probably be called CC and xlC respectively.
On large facilities (e.g. NERSC or Archer), the compilers and libraries needed should already be installed, but you may need to load them to use them. It is common to organise libraries using the modules system, so try typing:
$ module avail
to get a list of available modules. Some instructions for specific machines can be found in Machine-specific installation. See your system's documentation on modules and which ones to load. If you don't know, or modules don't work, you can still install libraries in your home directory by following the instructions below for FFTW and NetCDF.
Ubuntu / Debian
On Ubuntu or Debian distributions if you have administrator rights then you can install MPICH2 and the needed libraries by running:
$ sudo apt-get install mpich2 libmpich2-dev
$ sudo apt-get install libfftw3-dev libnetcdf-dev libnetcdf-cxx-legacy-dev
On Ubuntu 16.04:
$ sudo apt-get install libmpich-dev libfftw3-dev libnetcdf-dev libnetcdf-cxx-legacy-dev
On Ubuntu 18.04:
$ sudo apt-get install mpich libmpich-dev libfftw3-dev libnetcdf-dev libnetcdf-cxx-legacy-dev git g++ make
$ sudo apt-get install python3 python3-distutils python3-pip python3-numpy python3-netcdf4 python3-scipy
$ pip3 install --user Cython
The first line should be sufficient to install BOUT++, while the 2nd and 3rd lines make sure that the tests work and that the Python interface can be built. Further, the encoding for Python needs to be UTF-8; it may be required to set export LC_CTYPE=C.utf8.
If you do not have administrator rights, so can't install packages, then you need to install these libraries from source into your home directory. See sections on installing MPI, installing FFTW and installing NetCDF.
Arch Linux
$ pacman -S openmpi fftw netcdf-cxx make gcc
Fedora
On Fedora the required libraries can be installed by running:
$ sudo dnf builddep bout++
This will install all the dependencies that are used to build BOUT++ for Fedora. Feel free to install only a subset of the suggested packages. For example, only mpich or openmpi is required. To load an MPI implementation, type:
$ module load mpi
After that the MPI library is loaded. Precompiled binaries are available for Fedora as well. To get a precompiled BOUT++, run:
$ # install the mpich version; openmpi is available as well
$ sudo dnf install bout++-mpich-devel
$ # get the python3 modules; python2 is available as well
$ sudo dnf install python3-bout++
Configuring BOUT++
To compile BOUT++, you first need to configure it.
Go into the BOUT-dev directory and run:
$ ./configure
If this finishes by printing a summary, and paths for IDL, Python, and Octave, then the libraries are set up and you can skip to the next section. If you see a message "ERROR: FFTW not found. Required by BOUT++" then make sure FFTW3 is installed (see the previous section on installing dependencies). If FFTW3 is installed in a non-standard location, you can specify the directory with the --with-fftw= option, e.g.:
$ ./configure --with-fftw=$HOME/local
Configure should now find FFTW, and search for the NetCDF library. If configure finishes successfully, then skip to the next section, but if you see a message "NetCDF support disabled" then configure couldn't find the NetCDF library. Unless you have another file format (like HDF5) installed, this will be followed by the message "ERROR: At least one file format must be supported". Check that you have NetCDF installed (see the previous section on installing dependencies). Like the FFTW3 library, if NetCDF is installed in a non-standard location then you can specify the directory with the --with-netcdf= option, e.g.:
$ ./configure --with-fftw=$HOME/local --with-netcdf=$HOME/local
which should now finish successfully, printing a summary of the configuration:
Configuration summary
PETSc support: no
SLEPc support: no
IDA support: yes
CVODE support: yes
ARKODE support: yes
NetCDF support: yes
Parallel-NetCDF support: no
HDF5 support: yes (parallel: no)
MUMPS support: no
If not, see Advanced installation options for some things you can try to resolve common problems.
CMake
There is now (experimental) support for CMake. You will need CMake > 3.9. CMake supports outofsource builds by default, which are A Good Idea. Basic configuration with CMake looks like:
$ mkdir build && cd build
$ cmake ..
You can then run make as usual.
You can see what build options are available with:
$ cmake .. -LH
...
// Enable backtrace
ENABLE_BACKTRACE:BOOL=ON
// Output coloring
ENABLE_COLOR:BOOL=ON
// Enable OpenMP support
ENABLE_OPENMP:BOOL=OFF
// Enable support for PETSc time solvers and inversions
USE_PETSC:BOOL=OFF
...
CMake uses the -D<variable>=<choice> syntax to control these variables. You can set <package>_ROOT to guide CMake in finding the various optional third-party packages (except for PETSc/SLEPc, which use <package>_DIR). CMake understands the usual environment variables for setting the compiler and compiler/linking flags, as well as having built-in options to control them and things like static vs shared libraries, etc. See the CMake documentation for more information.
A more complicated CMake configuration command might look like:
$ CC=mpicc CXX=mpic++ cmake .. \
  -DUSE_PETSC=ON -DPETSC_DIR=/path/to/petsc/ \
  -DUSE_SLEPC=ON -DSLEPC_DIR=/path/to/slepc/ \
  -DUSE_SUNDIALS=ON -DSUNDIALS_ROOT=/path/to/sundials \
  -DUSE_NETCDF=ON -DNetCDF_ROOT=/path/to/netcdf \
  -DENABLE_OPENMP=ON \
  -DENABLE_SIGFPE=OFF \
  -DCMAKE_BUILD_TYPE=Debug \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/path/to/install/BOUT++
You can write a CMake configuration file (CMakeLists.txt) for your physics model in only four lines:
project(blob2d LANGUAGES CXX)
find_package(bout++ REQUIRED)
add_executable(blob2d blob2d.cxx)
target_link_libraries(blob2d PRIVATE bout++::bout++)
You just need to give CMake the location where you installed BOUT++ via the CMAKE_PREFIX_PATH variable:
$ mkdir build && cd build
$ cmake .. -DCMAKE_PREFIX_PATH=/path/to/install/BOUT++
Natural Language Support
BOUT++ has support for languages other than English, using GNU gettext. If you are planning on installing BOUT++ (see section Installing BOUT++ (experimental)) then this should work automatically, but if you will be running BOUT++ from the directory you downloaded it into, then configure with the option:
./configure --localedir=$PWD/locale
This will enable BOUT++ to find the translations. When configure finishes, the configuration summary should contain a line like:
configure: Natural language support: yes (path: /home/user/BOUT-dev/locale)
where the path is the directory containing the translations.
See Natural language support for details of how to switch language when running BOUT++ simulations.
Configuring analysis routines
The BOUT++ installation comes with a set of useful routines which can be used to prepare inputs and analyse outputs. Most of this code is now in Python, though IDL was used for many years. Python is particularly useful because the test suite scripts and examples use Python, so to run these you'll need Python configured.
When the configure script finishes, it prints out the paths you need to get IDL, Python, and Octave analysis routines working. If you just want to compile BOUT++ then you can skip to the next section, but make a note of what configure printed out.
Python configuration
To use Python, you will need the NumPy and SciPy libraries. On Debian or Ubuntu these can be installed with:
$ sudo apt-get install python-scipy
which should then add all the other dependencies like NumPy. To test if everything is installed, run:
$ python -c "import scipy"
If not, see the SciPy website https://www.scipy.org for instructions on installing.
To use the BOUT++ Python routines, the path to tools/pylib should be added to the PYTHONPATH environment variable. Instructions for doing this are printed at the end of the configure script, for example:
Make sure that the tools/pylib directory is in your PYTHONPATH
e.g. by adding to your ~/.bashrc file
export PYTHONPATH=/home/ben/BOUT/tools/pylib/:$PYTHONPATH
To test if this command has worked, try running:
$ python -c "import boutdata"
If this doesn't produce any error messages then Python is configured correctly.
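To see why setting PYTHONPATH makes the import succeed, the hedged sketch below checks module resolution from inside Python. The tools/pylib path shown is the example path from the text above; prepending to sys.path inside a running process mirrors what PYTHONPATH does at interpreter start-up.

```python
# Sketch: how PYTHONPATH affects imports. Python seeds sys.path from
# PYTHONPATH at start-up, then resolves "import boutdata" by scanning
# sys.path in order. The BOUT directory below is the example path from
# the text, not necessarily present on your machine.
import importlib.util
import sys

def can_import(module_name):
    """Return True if module_name is resolvable on the current sys.path."""
    return importlib.util.find_spec(module_name) is not None

# Prepending inside the process mirrors what PYTHONPATH does at start-up:
sys.path.insert(0, "/home/ben/BOUT/tools/pylib")

print(can_import("os"))        # stdlib modules are always found
print(can_import("boutdata"))  # True only if tools/pylib really exists here
```

Running `can_import("boutdata")` is the same check as `python -c "import boutdata"`, but without raising an error on failure.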
IDL configuration
If you want to use IDL to analyse BOUT++ outputs, then the IDL_PATH environment variable should include the tools/idllib/ subdirectory included with BOUT++.
The required command (for Bash) is printed at the end of the BOUT++ configuration:
$ export IDL_PATH=...
After running that command, check that idl can find the analysis routines by running:
$ idl
IDL> .r collect
IDL> help, /source
You should see the function COLLECT in the BOUT/tools/idllib directory. If not, something is wrong with your IDL_PATH variable.
On some machines, modifying IDL_PATH causes problems, in which case you can try modifying the path inside IDL by running:
IDL> !path = !path + ":/path/to/BOUT-dev/tools/idllib"
where you should use the full path. You can get this by going to the tools/idllib directory and typing pwd. Once this is done you should be able to use collect and other routines.
Compiling BOUT++
Once BOUT++ has been configured, you can compile the bulk of the code by going to the BOUT-dev directory (the same one where you ran configure) and running:
$ make
(on OSX, FreeBSD, and AIX this should be gmake). This should print something like:
----  Compiling BOUT++  ----
CXX      = mpicxx
CFLAGS   = -O -DCHECK=2 -DSIGHANDLE \
           -DREVISION=13571f760cec446d907e1bbeb1d7a3b1c6e0212a \
           -DNCDF -DBOUT_HAS_PVODE
CHECKSUM = ff3fb702b13acc092613cfce3869b875
INCLUDE  = -I../include
Compiling field.cxx
Compiling field2d.cxx
Compiling field.cxx
Compiling field2d.cxx
At the end of this, you should see a file libbout++.a in the lib/ subdirectory of the BOUT++ distribution. If you get an error, please create an issue on Github including:
- Which machine you're compiling on
- The output from make, including the full error message
- The make.config file in the BOUT++ root directory
Running the test suite
BOUT++ comes with three sets of test suites: unit tests, integrated tests and method of manufactured solutions (MMS) tests. The easiest way to run all of them is to simply do:
$ make check
from the top-level directory. Alternatively, if you just want to run one of them individually, you can do:
$ make check-unit-tests
$ make check-integrated-tests
$ make check-mms-tests
Note: The integrated test suite currently uses the mpirun command to launch the runs, so won't work on machines which use a job submission system like PBS or SGE.
These tests should all pass, but if not please create an issue on Github containing:
- Which machine you're running on
- The make.config file in the BOUT++ root directory
- The run.log.* files in the directory of the test which failed
If the tests pass, congratulations! You now have a working installation of BOUT++. Unless you want to use some experimental features of BOUT++, skip to section Running BOUT++ to start running the code.
Installing BOUT++ (experimental)
Most BOUT++ users install and develop their own copies in their home directory, so do not need to install BOUT++ to a system directory. As of version 4.1 (August 2017), it is possible to install BOUT++ but this is not widely used and so should be considered experimental.
After configuring and compiling BOUT++ as above, BOUT++ can be installed to system directories by running as superuser or with sudo:
$ sudo make install
Danger
Do not do this unless you know what youâre doing!
This will install the following files under /usr/local/:
- /usr/local/bin/bout-config: a script which can be used to query the BOUT++ configuration and compile codes with BOUT++
- /usr/local/include/bout++/...: header files for BOUT++
- /usr/local/lib/libbout++.a: the main BOUT++ library
- /usr/local/lib/libpvode.a and /usr/local/lib/libpvpre.a: the PVODE library
- /usr/local/share/bout++/pylib/...: Python analysis routines
- /usr/local/share/bout++/idllib/...: IDL analysis routines
- /usr/local/share/bout++/make.config: a makefile configuration, used to compile many BOUT++ examples
To install BOUT++ under a different directory, use the prefix= variable, e.g. to install in your home directory:
$ make install prefix=$HOME/local/
You can also specify this prefix when configuring, in the usual way (see Configuring BOUT++):
$ ./configure --prefix=$HOME/local/
$ make
$ make install
More control over where files are installed is possible by passing options to configure, following the GNU conventions:
- --bindir= sets where bout-config will be installed (default /usr/local/bin)
- --includedir= sets where the bout++/*.hxx header files will be installed (default /usr/local/include)
- --libdir= sets where the libbout++.a, libpvode.a and libpvpre.a libraries are installed (default /usr/local/lib)
- --datadir= sets where idllib, pylib and make.config are installed (default /usr/local/share/)
After installing, check that you can run bout-config, e.g.:
$ bout-config --all
which should print out the list of configuration settings which bout-config can provide. If this doesn't work, check that the directory containing bout-config is in your PATH.
The Python and IDL analysis scripts can be configured using bout-config rather than manually setting paths as in Configuring analysis routines. Add this line to your startup file (e.g. $HOME/.bashrc):
export PYTHONPATH=`bout-config --python`:$PYTHONPATH
note the backticks around bout-config --python, not quotes. Similarly for IDL:
export IDL_PATH=`bout-config --idl`:'<IDL_DEFAULT>':$IDL_PATH
More details on using bout-config are in the section on makefiles.
Advanced installation options
This section describes some common issues encountered when configuring and compiling BOUT++, how to manually install dependencies if they are not available, and how to configure optional libraries like SUNDIALS and PETSc.
Optimisation and run-time checking
Configuring with --enable-checks=3 enables a lot of checks of operations performed by the field objects. This is very useful for debugging a code, and can be omitted once bugs have been removed. --enable-checks=2 enables less checking, omitting in particular the computationally expensive ones, while --enable-checks=0 disables most checks.
To get the most checking, both from BOUT++ and from the compiler, --enable-debug can be used. This enables checks at level 3, as well as debug flags, e.g. -g for gcc.
For (sometimes) more useful error messages, there is the --enable-track option. This keeps track of the names of variables and includes these in error messages.
To enable optimization, configure with --enable-optimize=3. This will try to set appropriate flags, but may not set the best ones. This should work well for gcc. Similar to checks, different levels can be specified, where 3 is high, and 0 means disabling all optimization. --enable-optimize=fast will set the -Ofast flag for gcc, which enables optimizations that are not standard-conforming, so proceed at your own risk.
Manually set compilation flags
You can set the following environment variables if you need more control over how BOUT++ is built:
- LDFLAGS: extra flags for linking, e.g. -L<library dir>
- LIBS: extra libraries for linking, e.g. -l<library>
- CPPFLAGS: preprocessor flags, e.g. -I<include dir>
- CXXFLAGS: compiler flags, e.g. -Wall
- SUNDIALS_EXTRA_LIBS: specifies additional libraries for linking to SUNDIALS, which are put at the end of the link command
It is possible to change flags for BOUT++ after running configure by editing the make.config file. Note that this is not recommended, as e.g. PVODE will not be built with these flags.
Machine-specific installation
These are some configurations which have been found to work on particular machines.
Archer
As of 20th April 2018, the following configuration should work:
$ module swap PrgEnv-cray PrgEnv-gnu/5.1.29
$ module load fftw
$ module load archer-netcdf/4.1.3
When using CMake on Cray systems like Archer, you need to pass -DCMAKE_SYSTEM_NAME=CrayLinuxEnvironment so that the Cray compiler wrappers are detected properly.
KNL @ Archer
To use the KNL system, configure BOUT++ as follows:
./configure MPICXX=CC --host=knl --with-netcdf --with-pnetcdf=no --with-hypre=no CXXFLAGS="-xMIC-AVX512 -D_GLIBCXX_USE_CXX11_ABI=0"
Atlas
./configure --with-netcdf=/usr/local/tools/hdf5-gnu-serial-1.8.1/lib --with-fftw=/usr/local --with-pdb=/usr/gapps/pact/new/lnx-2.5-ib/gnu
Cab
./configure --with-netcdf=/usr/local/tools/hdf5-gnu-serial-1.8.1/lib --with-fftw=/usr/local/tools/fftw3-3.2 --with-pdb=/usr/gapps/pact/new/lnx-2.5-ib/gnu
Edison
module swap PrgEnv-intel PrgEnv-gnu
module load fftw
./configure MPICC=cc MPICXX=CC --with-netcdf=/global/u2/c/chma/PUBLIC/netcdf_edison/netcdf --with-fftw=/opt/fftw/3.3.0.1/x86_64
Hoffman2
./configure --with-netcdf=/u/local/apps/netcdf/current --with-fftw=/u/local/apps/fftw3/current --with-cvode=/u/local/apps/sundials/2.4.0 --with-lapack=/u/local/apps/lapack/current
Hopper
module swap PrgEnv-pgi PrgEnv-gnu
module load netcdf
module swap netcdf netcdf/4.1.3
module swap gcc gcc/4.6.3
./configure MPICC=cc MPICXX=CC --with-fftw=/opt/fftw/3.2.2.1 --with-pdb=/global/homes/u/umansky/PUBLIC/PACT_HOPP2/pact
Hyperion
With the bash shell use:
export PETSC_DIR=~farley9/projects/petsc/petsc-3.2-p1
export PETSC_ARCH=arch-c
./configure --with-netcdf=/usr/local/tools/netcdf-gnu-4.1 --with-fftw=/usr/local MPICXX=mpiCC EXTRA_LIBS=-lcurl --with-petsc --with-cvode=~farley9/local --with-ida=~farley9/local
With the tcsh shell use:
setenv PETSC_DIR ~farley9/projects/petsc/petsc-3.2-p1
setenv PETSC_ARCH arch-c
./configure --with-netcdf=/usr/local/tools/netcdf-gnu-4.1 --with-fftw=/usr/local MPICXX=mpiCC EXTRA_LIBS=-lcurl --with-petsc --with-cvode=~farley9/local --with-ida=~farley9/local
Marconi
module load intel intelmpi fftw lapack
module load szip zlib/1.2.8--gnu--6.1.0
module load hdf5/1.8.17--intel--pe-xe-2017--binary
module load netcdf-cxx4
module load python
To compile for the SKL partition, configure with
./configure --enable-checks=0 CPPFLAGS="-Ofast -funroll-loops -xCORE-AVX512 -mtune=skylake" --host=skl
to enable AVX512 vectorization.
Note
As of 20/04/2018, an issue with the netcdf and netcdf-cxx4 modules means that you will need to remove -lnetcdf from EXTRA_LIBS in make.config after running ./configure and before running make. -lnetcdf also needs to be removed from bin/bout-config to allow a successful build of the python interface. Recreation of boutcore.pyx needs to be manually triggered if boutcore.pyx has already been created.
Ubgl
./configure --with-netcdf CXXFLAGS=-DMPICH_IGNORE_CXX_SEEK CFLAGS=-DMPICH_IGNORE_CXX_SEEK --with-pdb=/usr/gapps/pact/new_s/lnx-2.5-ib --with-netcdf=/usr/local/tools/netcdf/netcdf-4.1_c++
File formats
BOUT++ can currently use two different file formats, NetCDF-4 and HDF5, with experimental support for parallel flavours of both. NetCDF is a widely used format and so has many more tools for viewing and manipulating files. In particular, the NetCDF-4 library can produce files in either NetCDF-3 "classic" format, which is backwards-compatible with NetCDF libraries since 1994 (version 2.3), or in the newer NetCDF-4 format, which is based on (and compatible with) HDF5. HDF5 is another widely used format. If you have multiple libraries installed then BOUT++ can use them simultaneously, for example reading in grid files in NetCDF format, but writing output data in HDF5 format.
To enable NetCDF support, you will need to install NetCDF version 4.0.1 or later. Note that although the NetCDF-4 library is used for the C++ interface, by default BOUT++ writes the "classic" format. Because of this, you don't need to install zlib or HDF5 for BOUT++ NetCDF support to work. If you want to output to HDF5 then you need to first install the zlib and HDF5 libraries, and then compile NetCDF with HDF5 support.
When NetCDF is installed, a script nc-config should be put somewhere on the path. If this is found, then configure should have all the settings it needs. If it isn't found, then configure will search for the NetCDF include and library files.
Installing NetCDF from source
The latest versions of NetCDF have separated out the C++ API from the main C library. As a result, you will need to download and install both. Download the latest versions of the NetCDF-C and NetCDF-4 C++ libraries from https://www.unidata.ucar.edu/downloads/netcdf. As of January 2017, these are versions 4.4.1.1 and 4.3.0 respectively.
Untar the file and cd into the resulting directory:
$ tar xzvf netcdf-4.4.1.1.tar.gz
$ cd netcdf-4.4.1.1
Then run configure, make and make install:
$ ./configure --prefix=$HOME/local
$ make
$ make install
Sometimes configure can fail, in which case try disabling Fortran:
$ ./configure --prefix=$HOME/local --disable-fortran
$ make
$ make install
Similarly for the C++ API:
$ tar xzvf netcdf-cxx4-4.3.0.tar.gz
$ cd netcdf-cxx4-4.3.0
$ ./configure --prefix=$HOME/local
$ make
$ make install
You may need to set a couple of environment variables as well:
$ export PATH=$HOME/local/bin:$PATH
$ export LD_LIBRARY_PATH=$HOME/local/lib:$LD_LIBRARY_PATH
You should check where NetCDF actually installed its libraries. On some systems this will be $HOME/local/lib, but on others it may be, e.g., $HOME/local/lib64. Check which it is, and set $LD_LIBRARY_PATH appropriately.
OpenMP
BOUT++ can make use of OpenMP parallelism. To enable OpenMP, use the --enable-openmp flag to configure:
./configure --enable-openmp
OpenMP can be used to parallelise in more directions than can be achieved with MPI alone. For example, it is currently difficult to parallelise in X using pure MPI if FCI is used, and impossible to parallelise at all in Z with pure MPI.
OpenMP is now used in a large number of places, so a decent speedup can be achieved with OpenMP alone. Hybrid parallelisation with both MPI and OpenMP can lead to more significant speedups, but sometimes requires fine tuning of numerical parameters to achieve this. The optimum settings depend not just on your system, but also on your particular problem. We have tried to choose "sensible" defaults that work well for the most common cases, but this is not always possible. You may need to perform some testing yourself to find, e.g., the optimum split of OpenMP threads and MPI ranks.
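As a starting point for such testing, the thread/rank split is typically controlled like this (illustrative numbers assuming a hypothetical 8-core node; ./conduction is the example model used in the Quick start section later in this manual):

```shell
# Assumed layout: 2 MPI ranks x 4 OpenMP threads each = 8 cores in total.
export OMP_NUM_THREADS=4
echo "OpenMP threads per MPI rank: $OMP_NUM_THREADS"
# Then launch as usual, and compare timings for different splits:
# mpirun -np 2 ./conduction
```

Repeating the run with different rank/thread combinations (e.g. 8x1, 4x2, 1x8) is the most reliable way to find the best split for your machine and problem.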
One such parameter that can potentially have a significant effect (for some problem sizes on some machines) is the OpenMP schedule used in some of the OpenMP loops (specifically those using BOUT_FOR). This can be set using:
./configure --enable-openmp --with-openmp-schedule=<schedule>
with <schedule> being one of: static (the default), dynamic, guided, auto or runtime.
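If the schedule is configured as runtime, the choice can then be deferred to run time via the standard OpenMP environment variable OMP_SCHEDULE, for example:

```shell
# Only takes effect if BOUT++ was configured with --with-openmp-schedule=runtime.
export OMP_SCHEDULE="dynamic,4"   # schedule kind and optional chunk size
echo "OMP_SCHEDULE=$OMP_SCHEDULE"
# ./conduction                    # the run picks up the schedule at startup
```

This avoids recompiling when experimenting with different schedules.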
Note
If you want to use OpenMP with Clang, you will need Clang 3.7+, and either libomp or libiomp. You will be able to compile BOUT++ with OpenMP with lower versions of Clang, or using the GNU OpenMP library libgomp, but it will only run with a single thread.
Note
By default PVODE is built without OpenMP support. To enable this, add --enable-pvode-openmp to the configure command.
Note
OpenMP will attempt to use all available threads by default. This
can cause oversubscription problems on certain systems. You can
limit the number of threads OpenMP uses with the
OMP_NUM_THREADS
environment variable. See your system
documentation for more details.
SUNDIALS
The BOUT++ distribution includes a 1998 version of CVODE (then called PVODE) by Scott D. Cohen and Alan C. Hindmarsh, which is the default time integration solver. Whilst no serious bugs have been found in this code (as far as the authors are aware), several features such as user-supplied preconditioners and constraints cannot be used with this solver. Currently, BOUT++ also supports the SUNDIALS solvers CVODE, IDA and ARKODE, which are available from https://computation.llnl.gov/casc/sundials/main.html.
Note
BOUT++ currently supports SUNDIALS > 2.6, up to 4.1.0 as of March 2019. It is advisable to use the most recent version possible.
For a smooth install, it is recommended to install SUNDIALS from a dedicated install directory. The full installation guide is found in the downloaded .tar.gz, but we provide a step-by-step guide here to install it and make it compatible with BOUT++:
$ cd ~
$ mkdir -p install/sundials-install
$ cd install/sundials-install
$ # Move the downloaded sundials-4.1.0.tar.gz to sundials-install
$ tar -xzvf sundials-4.1.0.tar.gz
$ mkdir build && cd build
$ cmake \
    -DCMAKE_INSTALL_PREFIX=$HOME/local \
    -DLAPACK_ENABLE=ON \
    -DOPENMP_ENABLE=ON \
    -DMPI_ENABLE=ON \
    -DCMAKE_C_COMPILER=$(which mpicc) \
    -DCMAKE_CXX_COMPILER=$(which mpicxx) \
    ../sundials-4.1.0
$ make
$ make test
$ make install
The SUNDIALS IDA solver is a Differential-Algebraic Equation (DAE) solver, which evolves a system of the form \(\mathbf{f}(\mathbf{u},\dot{\mathbf{u}},t) = 0\). This allows algebraic constraints on variables to be specified.
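For instance, an ODE with an added algebraic constraint can be written in this residual form (an illustrative sketch, not a specific BOUT++ model):

```latex
% An ODE \dot{u}_1 = g(u_1, u_2, t) together with an algebraic
% constraint h(u_1, u_2) = 0, written as a single DAE residual:
\mathbf{f}(\mathbf{u}, \dot{\mathbf{u}}, t) =
\begin{pmatrix}
  \dot{u}_1 - g(u_1, u_2, t) \\
  h(u_1, u_2)
\end{pmatrix} = \mathbf{0}
```

An ordinary ODE solver can only evolve the first row; IDA treats both rows on an equal footing.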
To configure BOUT++ with SUNDIALS only (see section PETSc on how to build PETSc with SUNDIALS), go to the root directory of BOUT++ and type:
$ ./configure --with-sundials=/path/to/sundials/install
SUNDIALS will allow you to select at runtime which solver to use. See Options for more details on how to do this.
PETSc
BOUT++ can use PETSc (https://www.mcs.anl.gov/petsc/) for time integration and for solving elliptic problems, such as inverting Poisson and Helmholtz equations.
Currently, BOUT++ supports PETSc versions 3.4 to 3.9. To install PETSc version 3.4.5, use the following steps:
$ cd ~
$ wget http://ftp.mcs.anl.gov/pub/petsc/release-snapshots/petsc-3.4.5.tar.gz
$ tar -xzvf petsc-3.4.5.tar.gz
$ # Optional:
$ # rm petsc-3.4.5.tar.gz
$ cd petsc-3.4.5
To build PETSc without SUNDIALS, configure with:
$ ./configure \
    --with-clanguage=cxx \
    --with-mpi=yes \
    --with-precision=double \
    --with-scalar-type=real \
    --with-shared-libraries=0
Add --with-debugging=yes to ./configure in order to allow debugging.
Note
To build PETSc with SUNDIALS, install SUNDIALS as explained in section SUNDIALS, and append --with-sundials-dir=$HOME/local to ./configure.
Note
It is also possible to get PETSc to download and install MUMPS (see MUMPS), by adding:
--download-mumps \
--download-scalapack \
--download-blacs \
--download-fblaslapack=1 \
--download-parmetis \
--download-ptscotch \
--download-metis
to ./configure.
To make PETSc, type:
$ make PETSC_DIR=$HOME/petsc-3.4.5 PETSC_ARCH=arch-linux2-cxx-debug all
Should BLAS, LAPACK, or any other packages be missing, you will get an error, and a suggestion that you can append --download-<name-of-package> to the ./configure line. You may want to test that everything is configured properly. To do this, type:
$ make PETSC_DIR=$HOME/petsc-3.4.5 PETSC_ARCH=arch-linux2-cxx-debug test
To use PETSc, you have to define the PETSC_DIR and PETSC_ARCH environment variables to match how PETSc was built:
$ export PETSC_DIR=$HOME/petsc-3.4.5
$ export PETSC_ARCH=arch-linux2-cxx-debug
and add these lines to your startup file $HOME/.bashrc:
export PETSC_DIR=$HOME/petsc-3.4.5
export PETSC_ARCH=arch-linux2-cxx-debug
To configure BOUT++ with PETSc, go to the BOUT++ root directory, and type:
$ ./configure --with-petsc
You can configure BOUT++ against different PETSc installations either through the PETSC_DIR/ARCH variables as above, or by specifying them on the command line:
$ ./configure --with-petsc PETSC_DIR=/path/to/other/petsc PETSC_ARCH=otherarch
Note
Unfortunately, there are a variety of ways PETSc can be installed on a system, and it is hard to automatically work out how to compile against a particular installation. In particular, there are two PETSc-supported ways of installing PETSc that are subtly different.
The first way is as above, using PETSC_DIR and PETSC_ARCH. A second way is to use the --prefix argument to configure (much like traditional GNU configure scripts) when building PETSc. In this case, PETSC_DIR will be the path passed to --prefix and PETSC_ARCH will be empty. When configuring BOUT++, one can use --with-petsc=$PETSC_DIR as a shortcut in this case. This will NOT work if PETSc was installed with a PETSC_ARCH.
However, there are at least some Linux distributions that install PETSc in yet another way, and you may need to set PETSC_DIR/ARCH differently. For example, on Fedora, as of May 2018, you will need to configure and build BOUT++ like so:
$ ./configure --with-petsc=/usr/lib64/openmpi
$ PETSC_DIR=/usr make
Replace openmpi with the correct MPI implementation that you have installed.
LAPACK
BOUT++ comes with linear solvers for tridiagonal and band-diagonal systems. Some implementations of these solvers (for example Laplacian inversion, section Laplacian inversion) use LAPACK for efficient serial performance. This does not add new features, but may be faster in some cases. LAPACK is, however, written in FORTRAN 77, which can cause linking headaches. To enable these routines use:
$ ./configure --with-lapack
and to specify a nonstandard path:
$ ./configure --with-lapack=/path/to/lapack
MUMPS
This is still experimental, but does work on at least some systems at York. The PETSc library can be used to call MUMPS for directly solving matrices (e.g. for Laplacian inversions), or MUMPS can be used directly. To enable MUMPS, configure with:
$ ./configure --with-mumps
MUMPS has many dependencies, including ScaLAPACK and ParMETIS. Unfortunately, the exact dependencies and configuration of MUMPS vary a lot from system to system. The easiest way to get MUMPS installed is to install PETSc with MUMPS, or to supply the CPPFLAGS, LDFLAGS and LIBS environment variables to configure:
$ ./configure --with-mumps CPPFLAGS=-I/path/to/mumps/includes \
    LDFLAGS=-L/path/to/mumps/libs \
    LIBS="-ldmumps -lmumps_common -lother_libs_needed_for_mumps"
MPI compilers
These are usually called something like mpicc and mpiCC (or mpicxx), and the configure script will look for several common names. If your compilers aren't recognised then set them using:
$ ./configure MPICC=<your C compiler> MPICXX=<your C++ compiler>
NOTES:
- On LLNL's Grendel, mpicxx is broken. Use mpiCC instead by passing MPICXX=mpiCC to configure. You also need to specify this to the NetCDF library by passing CXX=mpiCC to NetCDF's configure.
Installing MPICH from source
In your home directory, create two subdirectories: one called "install" where we'll put the source code, and one called "local" where we'll install the MPI compiler:
$ cd
$ mkdir install
$ mkdir local
Download the latest stable version of MPICH from https://www.mpich.org/ and put the file in the "install" subdirectory created above. At the time of writing (January 2018), the file was called mpich-3.2.1.tar.gz. Untar the file:
$ tar -xzvf mpich-3.2.1.tar.gz
which will create a directory containing the source code. "cd" into this directory and run:
$ ./configure --prefix=$HOME/local
$ make
$ make install
Each of these might take a while. This is the standard way of installing software from source, and will also be used for installing libraries later. The --prefix= option specifies where the software should be installed. Since we don't have permission to write to the system directories (e.g. /usr/bin), we just use a subdirectory of our home directory. The configure command configures the install, finding the libraries and commands it needs. make compiles everything using the options found by configure. The final make install step copies the compiled code into the correct places under $HOME/local.
To be able to use the MPI compiler, you need to modify the PATH environment variable. To do this, run:
$ export PATH=$PATH:$HOME/local/bin
and add this to the end of your startup file $HOME/.bashrc. If you're using CSH rather than BASH, the command is:
% setenv PATH ${PATH}:${HOME}/local/bin
and the startup file is $HOME/.cshrc. You should now be able to run mpicc and so have a working MPI compiler.
Installing FFTW from source
If you haven't already, create directories "install" and "local" in your home directory:
$ cd
$ mkdir install
$ mkdir local
Download the latest stable version from http://www.fftw.org/download.html into the "install" directory. At the time of writing, this was called fftw-3.3.2.tar.gz. Untar this file, and "cd" into the resulting directory. As with the MPI compiler, configure and install the FFTW library into $HOME/local by running:
$ ./configure --prefix=$HOME/local
$ make
$ make install
Compiling and running under AIX
Most development and running of BOUT++ is done under Linux, with occasional use of FreeBSD and OS X. The configuration scripts are therefore heavily tested on these architectures. IBM's POWER architecture, however, runs AIX, which has some crucial differences that make compiling a pain.
- Under Linux/BSD, it's usual for a Fortran routine foo to appear under C as foo_, whilst under AIX the name is unchanged.
- MPI compiler scripts are usually given the names mpicc and either mpiCC or mpicxx. AIX uses mpcc and mpCC.
- Like BSD, the make command isn't compatible with GNU make, so you have to run gmake to compile everything.
- The POWER architecture is big-endian, unlike the little-endian Intel and AMD chips. This can cause problems with binary file formats.
SUNDIALS under AIX
To compile SUNDIALS, use:
export CC=cc
export CXX=xlC
export F77=xlf
export OBJECT_MODE=64
./configure --prefix=$HOME/local/ --with-mpicc=mpcc --with-mpif77=mpxlf CFLAGS=-maix64
You may get an error message like
make: Not a recognized flag: w
This is because the AIX make is being used, rather than gmake. The easiest way to fix this is to make a link to gmake in your local bin directory:
ln -s /usr/bin/gmake $HOME/local/bin/make
Running which make should now point to this local/bin/make; if not, you need to make sure that your bin directory appears first in the PATH:
export PATH=$HOME/local/bin:$PATH
If you see an error like this:
ar: 0707-126 ../../src/sundials/sundials_math.o is not valid with the current object file mode.
Use the -X option to specify the desired object mode.
then you need to set the environment variable OBJECT_MODE:
export OBJECT_MODE=64
Configuring BOUT++, you may get the error
configure: error: C compiler cannot create executables
In that case, you can try using:
./configure CFLAGS="-maix64"
When compiling, you may see warnings:
xlC_r: 1501-216 (W) command option -64 is not recognized - passed to ld
At this point, the main BOUT++ library should compile, and you can try compiling one of the examples. If linking fails with undefined NetCDF symbols like:
ld: 0711-317 ERROR: Undefined symbol: .NcError::NcError(NcError::Behavior)
ld: 0711-317 ERROR: Undefined symbol: .NcFile::is_valid() const
ld: 0711-317 ERROR: Undefined symbol: .NcError::~NcError()
ld: 0711-317 ERROR: Undefined symbol: .NcFile::get_dim(const char*) const
This is probably because the NetCDF libraries are 32-bit, whilst BOUT++ has been compiled as 64-bit. You can try compiling BOUT++ as 32-bit:
export OBJECT_MODE=32
./configure CFLAGS="-maix32"
gmake
If you still get undefined symbols, then go back to 64-bit, and edit make.config, replacing -lnetcdf_c++ with -lnetcdf64_c++, and -lnetcdf with -lnetcdf64. This can be done by running:
sed 's/netcdf/netcdf64/g' make.config > make.config.new
mv make.config.new make.config
Issues
Wrong install script
Before installing, make sure the correct version of install is being used by running:
$ which install
This should point to a system directory like /usr/bin/install.
Sometimes when IDL has been installed, this points to the IDL install (e.g. something like /usr/common/usg/idl/idl70/bin/install on Franklin). A quick way to fix this is to create a link from your local bin to the system install:
$ ln -s /usr/bin/install $HOME/local/bin/
"which install" should now print the install in your local bin directory.
Compiling cvode.cxx fails
Occasionally compiling the CVODE solver interface will fail with an error similar to:
cvode.cxx: In member function 'virtual int CvodeSolver::init(rhsfunc, bool, int, BoutR...
cvode.cxx:234:56: error: invalid conversion from 'int (*)(CVINT...
...
This is caused by different sizes of ints used in different versions of the CVODE library. The configure script tries to determine the correct type to use, but may fail in unusual circumstances. To fix this, edit src/solver/impls/cvode/cvode.cxx, and change line 48 from:
typedef int CVODEINT;
to:
typedef long CVODEINT;
Compiling with IBM xlC compiler fails
When using the xlC compiler, an error may occur:
variant.hpp(1568) parameter pack "Ts" was referenced but not expanded
The workaround is to change line 428 of externalpackages/mpark.variant/include/mpark/lib.hpp
from:
#ifdef MPARK_TYPE_PACK_ELEMENT
to:
#ifdef CAUSES_ERROR // MPARK_TYPE_PACK_ELEMENT
This will force an alternate implementation of type_pack_element to be defined. See also https://software.intel.com/en-us/forums/intel-c-compiler/topic/501502
Running BOUT++
Quick start
The examples/ directory contains some example physics models for a variety of fluid models. There are also some under tests/integrated/, which often just run a part of the code rather than a complete simulation. The simplest example to start with is examples/conduction/, which solves a single heat conduction equation for a 3D scalar field \(T\).
There are several files involved:
- conduction.cxx contains the source code which specifies the equation to solve. See Heat conduction for a line-by-line walkthrough of this file.
- conduct_grid.nc is the grid file, which in this case just specifies the number of grid points in \(X\) and \(Y\) (nx & ny) with everything else left as the default (e.g. grid spacings dx and dy are \(1\), and the metric tensor is the identity matrix). For details of the grid file format, see Generating input grids.
- generate.py is a Python script to create the grid file. In this case it just writes nx and ny.
- data/BOUT.inp is the settings file, specifying how many output timesteps to take, which differencing schemes to use, and many other things. In this case it's mostly empty, so the defaults are used.
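For illustration only (the shipped file is mostly empty), a hand-written data/BOUT.inp with a few common settings might look like the following; the option names match those echoed in the startup output later in this manual:

```shell
# Write an illustrative settings file. nout/timestep and the solver
# tolerances are assumptions for this sketch, not the shipped values.
mkdir -p data
cat > data/BOUT.inp <<'EOF'
nout = 50        # number of output timesteps
timestep = 100   # time between outputs

[solver]
atol = 1e-10     # absolute tolerance
rtol = 1e-05     # relative tolerance
EOF
grep "nout" data/BOUT.inp
```

Any option not set here falls back to its default, and the full set actually used is recorded in BOUT.settings after a run.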
First you need to compile the example:
$ gmake
which should print out something along the lines of:
Compiling conduction.cxx
Linking conduction
If you get an error, most likely during the linking stage, you may need to go back and make sure the libraries are all set up correctly. A common problem is mixing MPI implementations, for example compiling NetCDF using Open MPI and then BOUT++ with MPICH2. Unfortunately the solution is to recompile everything with the same compiler.
Then try running the example. If you're running on a standalone server, desktop or laptop then try:
$ mpirun -np 2 ./conduction
If you're running on a cluster or supercomputer, you should find out how to submit jobs. This varies, but usually on these bigger machines there will be a queueing system and you'll need to use qsub, msub, llsubmit or similar to submit jobs.
When the example runs, it should print a lot of output. This is
recording all the settings being used by the code, and is also written
to log files for future reference. The test should take a few seconds to
run, and produce a bunch of files in the data/
subdirectory.
- BOUT.log.* contains a log from each process, so because we ran with "-np 2" there should be 2 logs. The one from processor 0 will be the same as what was printed to the screen. This is mainly useful because if one process crashes it may only put an error message into its own log.
- BOUT.settings contains all the options used in the code, including options which were not set and so used the default values. It's in the same format as BOUT.inp, so it can be renamed and used to re-run simulations if needed. In some cases the options used have documentation, with a brief explanation of how they are used. In most cases the type the option is used as (e.g. int, BoutReal or bool) is given.
- BOUT.restart.*.nc are the restart files for the last time point. Currently each processor saves its own state in a separate file, but there is experimental support for parallel I/O. For the settings, see Input and Output.
- BOUT.dmp.*.nc contain the output data, including time history. As with the restart files, each processor currently outputs a separate file.
Restart files allow the run to be restarted from where it left off:
$ mpirun -np 2 ./conduction restart
This will delete the output data BOUT.dmp.*.nc files, and start again. If you want to keep the output from the first run, add "append":
$ mpirun -np 2 ./conduction restart append
which will then append any new outputs to the end of the old data files. For more information on restarting, see Restarting runs.
To see some of the other command-line options, try "-h":
$ ./conduction -h
and see the section on options (BOUT++ options).
To analyse the output of the simulation, cd into the data subdirectory and start Python or IDL (skip to Using IDL for IDL).
Analysing the output using Python
In order to analyse the output of the simulation using Python, you will first need to have set up Python to use the BOUT++ libraries boutdata and boututils; see section Python configuration for how to do this. The analysis routines have some requirements such as SciPy; see section Requirements for details.
To print a list of variables in the output files, one way is to use the DataFile class. This is a wrapper around the various NetCDF and HDF5 libraries for Python:
>>> from boututils.datafile import DataFile
>>> DataFile("BOUT.dmp.0.nc").list()
To collect a variable, reading in the data as a NumPy array:
>>> from boutdata.collect import collect
>>> T = collect("T")
>>> T.shape
Note that the order of the indices is different in Python and IDL: in Python, 4D variables are arranged as [t, x, y, z]. To show an animation:
>>> from boututils.showdata import showdata
>>> showdata(T[:,0,:,0])
The first index of the array passed to showdata is assumed to be time, and the remaining indices are plotted. In this example we pass a 2D array [t,y], so showdata will animate a line plot.
Analysing the output using IDL
First, list the variables in one of the data files:
IDL> print, file_list("BOUT.dmp.0.nc")
iteration MXSUB MYSUB MXG MYG MZ NXPE NYPE BOUT_VERSION t_array ZMAX ZMIN T
All of these except "T" are in all output files; they contain information about the layout of the mesh so that the data can be put in the correct place. The most useful variable is "t_array", which is a 1D array of simulation output times. To read this, we can use the collect function:
IDL> time = collect(var="t_array")
IDL> print, time
1.10000 1.20000 1.30000 1.40000 1.50000 ...
The number of variables in an output file depends on the model being solved, which in this case consists of a single scalar field "T". To read this into IDL, again use collect:
IDL> T = collect(var="T")
IDL> help, T
T FLOAT = Array[5, 64, 1, 20]
This is a 4D variable, arranged as [x, y, z, t]. The \(x\) direction has 5 points, consisting of 2 boundary points on either side and one evolving point in the middle. This case only solves a 1D problem in \(y\) with 64 points, so to display an animation of this:
IDL> showdata, T[2,*,0,*]
which selects the only evolving \(x\) point, all \(y\), the only \(z\) point, and all time points. If given 3D variables, showdata will display an animated surface
IDL> showdata, T[*,*,0,*]
and to make this a coloured contour plot
IDL> showdata, T[*,*,0,*], /cont
The equivalent commands in Python (remembering the [t, x, y, z] index ordering) are as follows:
>>> from boutdata.collect import collect
>>> from boututils.showdata import showdata
>>> T = collect("T")
>>> showdata(T[:,2,:,0])
Natural language support
If you have locales installed, and have configured the locale path correctly (see Natural Language Support), then the LANG environment variable selects the language to use. Currently BOUT++ only has support for the fr, de, es, zh_TW and zh_CN locales, e.g.:
LANG=zh_TW.utf8 ./conduction
which should produce an output like:
BOUT++ 版本 4.3.0
修訂: 667c19c136fc3e72fcd7c7b2109d44886fdf818d
MD5 checksum: 2263dc17fa414179c7ad87c3972f624b
代碼於 Nov 21 2019 17:26:55 編譯
...
or
LANG=es_ES.utf8 ./conduction
which should produce:
VersiĂłn de BOUT++ 4.3.0
RevisiĂłn: 667c19c136fc3e72fcd7c7b2109d44886fdf818d
MD5 checksum: 2263dc17fa414179c7ad87c3972f624b
CĂłdigo compilado en Nov 21 2019 en 17:26:55
...
The name of the locale (zh_TW.utf8 or es_ES.utf8 above) can be different on different machines. To see a list of available locales on your system, try running:
locale -a
If you are missing a locale you need, see your distribution's help, or try the Arch wiki page on locales.
When things go wrong
BOUT++ is still under development, and so occasionally you may be lucky enough to discover a new bug. This is particularly likely if you're modifying the physics module source code (see BOUT++ physics models), when you need a way to debug your code too.
- Check the end of each processor's log file (tail data/BOUT.log.*). When BOUT++ exits before it should, what is printed to screen is just the output from processor 0. If an error occurred on another processor then the error message will be written to its log file instead.
- By default when an error occurs a kind of stack trace is printed, which shows which functions were being run (most recent first). This should give a good indication of where an error occurred. If this stack isn't printed, make sure checking is set to level 2 or higher (./configure --enable-checks=2).
- If the error is due to non-finite numbers, increase the checking level (./configure --enable-checks=3) to perform more checking of values and (hopefully) find an error as soon as possible after it occurs.
- If the error is a segmentation fault, you can try a debugger such as gdb or totalview. You will likely need to compile with some debugging flags (./configure --enable-debug).
- You can also enable exceptions on floating point errors (./configure --enable-sigfpe), though the majority of these types of errors should be caught with checking level set to 3.
- Expert users can try AddressSanitizer, which is a tool that comes with recent versions of GCC and Clang. To enable AddressSanitizer, include -fsanitize=leak -fsanitize=address -fsanitize=undefined in CXXFLAGS when configuring BOUT++, or add them to BOUT_FLAGS.
Startup output
When BOUT++ is run, it produces a lot of output initially, mainly listing the options which have been used so you can check that it's doing what you think it should be. It's generally a good idea to scan over this to see if there are any important warnings or errors. Each processor outputs its own log file BOUT.log.# and the log from processor 0 is also sent to the screen. This output may look a little different if it's out of date, but the general layout will probably be the same.
First comes the introductory blurb:
BOUT++ version 1.0
Revision: c8794400adc256480f72c651dcf186fb6ea1da49
MD5 checksum: 8419adb752f9c23b90eb50ea2261963c
Code compiled on May 11 2011 at 18:22:37
B.Dudson (University of York), M.Umansky (LLNL) 2007
Based on BOUT by Xueqiao Xu, 1999
The version number (1.0 here) gets increased occasionally after some major feature has been added. To help match simulations to code versions, the Git revision of the core BOUT++ code and the date and time it was compiled is recorded. Because code could be modified from the revision, an MD5 checksum of all the code is also calculated. This information makes it possible to verify precisely which version of the code was used for any given run.
Next comes the compiletime options, which depend on how BOUT++ was configured (see Compiling BOUT++):
Compiletime options:
Checking enabled, level 2
Signal handling enabled
netCDF support enabled
Parallel NetCDF support disabled
This says that some runtime checking of values is enabled, that the code will try to catch segmentation faults to print a useful error, that NetCDF files are supported, but that the parallel flavour isnât.
The processor number comes next:
Processor number: 0 of 1
This will always be processor number â0â on screen as only the output from processor â0â is sent to the terminal. After this the core BOUT++ code reads some options:
Option /nout = 50 (data/BOUT.inp)
Option /timestep = 100 (data/BOUT.inp)
Option /grid = slab.6b5.r1.cdl (data/BOUT.inp)
Option /dump_float = true (default)
Option /non_uniform = false (data/BOUT.inp)
Option /restart = false (default)
Option /append = false (default)
Option /dump_format = nc (data/BOUT.inp)
Option /StaggerGrids = false (default)
This lists each option and the value it has been assigned. For every option the source of the value being used is also given. If a value had been given on the command line then (command line) would appear after the option:
Setting X differencing methods
First : Second order central (C2)
Second : Second order central (C2)
Upwind : Third order WENO (W3)
Flux : Split into upwind and central (SPLIT)
Setting Y differencing methods
First : Fourth order central (C4)
Second : Fourth order central (C4)
Upwind : Third order WENO (W3)
Flux : Split into upwind and central (SPLIT)
Setting Z differencing methods
First : FFT (FFT)
Second : FFT (FFT)
Upwind : Third order WENO (W3)
Flux : Split into upwind and central (SPLIT)
This is a list of the differential methods for each direction. These are
set in the BOUT.inp file ([ddx]
, [ddy]
and [ddz]
sections),
but can be overridden for individual operators. For each direction,
numerical methods can be specified for first and second central
difference terms, upwinding terms of the form
\({{\frac{\partial f}{\partial t}}} = {{\boldsymbol{v}}}\cdot\nabla f\),
and flux terms of the form
\({{\frac{\partial f}{\partial t}}} = \nabla\cdot({{\boldsymbol{v}}}f)\).
By default the flux terms are just split into a central and an upwinding
term.
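The default SPLIT treatment corresponds to the standard product-rule expansion of the divergence, with the first term handled by the upwind scheme and the second by central differencing:

```latex
% Splitting a flux term into an upwind (advection) part and a central part:
\nabla\cdot\left(\boldsymbol{v} f\right)
  = \boldsymbol{v}\cdot\nabla f + f\,\nabla\cdot\boldsymbol{v}
```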
In brackets is the code used to specify each method in BOUT.inp. A list of available methods is given in Differencing methods:
Setting grid format
Option /grid_format = (default)
Using NetCDF format for file 'slab.6b5.r1.cdl'
Loading mesh
Grid size: 10 by 64
Option /mxg = 2 (data/BOUT.inp)
Option /myg = 2 (data/BOUT.inp)
Option /NXPE = 1 (default)
Option /mz = 65 (data/BOUT.inp)
Option /twistshift = false (data/BOUT.inp)
Option /TwistOrder = 0 (default)
Option /ShiftOrder = 0 (default)
Option /shiftxderivs = false (data/BOUT.inp)
Option /IncIntShear = false (default)
Option /BoundaryOnCell = false (default)
Option /StaggerGrids = false (default)
Option /periodicX = false (default)
Option /async_send = false (default)
Option /zmin = 0 (data/BOUT.inp)
Option /zmax = 0.0028505 (data/BOUT.inp)
WARNING: Number of inner y points 'ny_inner' not found. Setting to 32
Optional quantities (such as ny_inner in this case) which are not specified are given a default (best-guess) value, and a warning is printed:
EQUILIBRIUM IS SINGLE NULL (SND)
MYPE_IN_CORE = 0
DXS = 0, DIN = 1. DOUT = 1
UXS = 0, UIN = 1. UOUT = 1
XIN = 1, XOUT = 1
Twistshift:
At this point, BOUT++ reads the grid file, and works out the topology of the grid, and connections between processors. BOUT++ then tries to read the metric coefficients from the grid file:
WARNING: Could not read 'g11' from grid. Setting to 1.000000e+00
WARNING: Could not read 'g22' from grid. Setting to 1.000000e+00
WARNING: Could not read 'g33' from grid. Setting to 1.000000e+00
WARNING: Could not read 'g12' from grid. Setting to 0.000000e+00
WARNING: Could not read 'g13' from grid. Setting to 0.000000e+00
WARNING: Could not read 'g23' from grid. Setting to 0.000000e+00
These warnings are printed because the coefficients have not been specified in the grid file, and so the metric tensor is set to the default identity matrix:
WARNING: Could not read 'zShift' from grid. Setting to 0.000000e+00
WARNING: Z shift for radial derivatives not found
To get radial derivatives, the quasi-ballooning coordinate method is used. The upshot of this is that to get radial derivatives, interpolation in Z is needed. This should also always be set to FFT:
WARNING: Twistshift angle 'ShiftAngle' not found. Setting from zShift
Option /twistshift_pf = false (default)
Maximum error in diagonal inversion is 0.000000e+00
Maximum error in off-diagonal inversion is 0.000000e+00
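These two figures are presumably the maximum deviations of the metric product from the identity matrix, for the diagonal and off-diagonal entries respectively:

```latex
% Diagonal and off-diagonal errors in the metric inversion check:
\max_{i} \left| \sum_j g_{ij}\,g^{ji} - 1 \right|
\qquad\text{and}\qquad
\max_{i \neq k} \left| \sum_j g_{ij}\,g^{jk} \right|
```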
If only the contravariant components (g11 etc.) of the metric tensor are specified, the covariant components (g_11 etc.) are calculated by inverting the metric tensor matrix. Error estimates are then calculated by computing \(g_{ij}g^{jk}\) as a check. Since no metrics were specified in the input, the metric tensor was set to the identity matrix, making inversion easy and the error tiny:
WARNING: Could not read 'J' from grid. Setting to 0.000000e+00
WARNING: Jacobian 'J' not found. Calculating from metric tensor
Maximum difference in Bxy is 1.444077e02
Calculating differential geometry terms
Communicating connection terms
Boundary regions in this processor: core, sol, target, target,
done
Setting file formats
Using NetCDF format for file 'data/BOUT.dmp.0.nc'
The Laplacian inversion code is initialised, and prints out the options used:
Initialising Laplacian inversion routines
Option comms/async = true (default)
Option laplace/filter = 0.2 (default)
Option laplace/low_mem = false (default)
Option laplace/use_pdd = false (default)
Option laplace/all_terms = false (default)
Option laplace/laplace_nonuniform = false (default)
Using serial algorithm
Option laplace/max_mode = 26 (default)
After this comes the physics modulespecific output:
Initialising physics module
Option solver/type = (default)
.
.
.
This typically lists the options used, and useful/important normalisation factors etc.
Finally, once the physics module has been initialised, and the current values loaded, the solver can be started:
Initialising solver
Option /archive = 1 (default)
Option /dump_format = nc (data/BOUT.inp)
Option /restart_format = nc (default)
Using NetCDF format for file 'nc'
Initialising PVODE solver
Boundary region inner X
Boundary region outer X
3d fields = 2, 2d fields = 0 neq=84992, local_N=84992
This last line gives the number of equations being evolved (in this case 84992), and the number of these on this processor (here 84992).
Option solver/mudq = 16 (default)
Option solver/mldq = 16 (default)
Option solver/mukeep = 0 (default)
Option solver/mlkeep = 0 (default)
The absolute and relative tolerances come next:
Option solver/atol = 1e-10 (data/BOUT.inp)
Option solver/rtol = 1e-05 (data/BOUT.inp)
Option solver/use_precon = false (default)
Option solver/precon_dimens = 50 (default)
Option solver/precon_tol = 0.0001 (default)
This next option specifies the maximum number of internal timesteps which CVODE will take between outputs:
Option solver/mxstep = 500 (default)
Option fft/fft_measure = false (default)
Running simulation
Run started at : Wed May 11 18:23:20 2011
Option /wall_limit = 1 (default)
Per-timestep output
At the beginning of a run, just after the last line in the previous section, a header is printed out as a guide:
Sim Time  |  RHS evals  | Wall Time |  Calc    Inv   Comm    I/O   SOLVER
Each timestep (the one specified in BOUT.inp, not the internal timestep), BOUT++ prints out something like:
1.001e+02        76       2.27e+02    87.1    5.3    1.0    0.0    6.6
This gives the simulation time; the number of times the time-derivatives (RHS) were evaluated; the wall-time this took to run; and percentages for the time spent in different parts of the code:
Calc: time spent doing calculations such as multiplications, derivatives etc.
Inv: time spent in inversion code (i.e. inverting Laplacians), including any communication which may be needed to do the inversion.
Comm: time spent communicating variables (outside the inversion routine).
I/O: time spent writing dump and restart files to disk. Most of the time this should not be an issue.
SOLVER: time spent in the implicit solver code.
The output sent to the terminal (not the log files) also includes a run time, and estimated remaining time.
Restarting runs
Every output timestep, BOUT++ writes a set of files named "BOUT.restart.#.nc" where "#" is the processor number (for parallel output, a single file "BOUT.restart.nc" is used). To restart from where the previous run finished, just add the keyword restart to the end of the command, for example:
$ mpirun -np 2 ./conduction restart
Equivalently, put "restart=true" near the top of the BOUT.inp input file. Note that this will overwrite the existing data in the "BOUT.dmp.*.nc" files. If you want to append to them instead then add the keyword append to the command, for example:
$ mpirun -np 2 ./conduction restart append
or put "append=true" near the top of the BOUT.inp input file.
When restarting simulations BOUT++ will by default output the initial state, unless it is appending to existing data files, in which case it will not output until the first timestep is completed. To override this behaviour, you can specify the option dump_on_restart manually. If dump_on_restart is true then the initial state will always be written out; if false then it never will be (regardless of the values of restart and append).
If you need to restart from a different point in your simulation, or the BOUT.restart files become corrupted, you can either use archived restart files, or create new restart files. Archived restart files have names like "BOUT.restart_0020.#.nc", and are written every 20 outputs by default. To change this, set "archive" in the BOUT.inp file. To use these files, they must be renamed to "BOUT.restart.#.nc". A useful tool to do this is "rename":
$ rename 's/_0020//' *.nc
will strip out "_0020" from any file names ending in ".nc".
If you don't have archived restarts, or want to start from a different timepoint, there are Python routines for creating new restart files. If your PYTHONPATH environment variable is set up (see Configuring analysis routines) then you can use the boutdata.restart.create function in tools/pylib/boutdata/restart.py:
>>> from boutdata.restart import create
>>> create(final=10, path='data', output='.')
The above will take time point 10 from the BOUT.dmp.* files in the "data" directory. For each one, it will output a BOUT.restart file in the output directory ".".
Stopping simulations
If you need to stop a simulation early, this can be done with Ctrl-C in a terminal, but this stops the simulation immediately without shutting down cleanly. Most of the time this will be fine, but interrupting a simulation while it is writing data to file could result in inconsistent or corrupted data.
Stop file
Note: this method needs to be enabled before the simulation starts, by setting stopCheck=true on the command line or in the input options:
$ mpirun -np 4 ./conduction stopCheck=true
or by setting stopCheck=true in the top section of BOUT.inp.
At every output time, the monitor checks for the existence of a file, by default called BOUT.stop, in the same directory as the output data. If the file exists then the monitor signals the time integration solver to quit. This should result in a clean shutdown.
To stop a simulation using this method, just create an empty file in the output directory:
$ mpirun -np 4 ./conduction stopCheck=true
...
$ touch data/BOUT.stop
Just remember to delete the file afterwards.
Send signal USR1
Another option is to send the user-defined signal 1 (USR1):
$ mpirun -np 4 ./conduction &
...
$ killall -s USR1 conduction
Note that this will stop all conduction simulations on this node. Many HPC systems provide tools to send signals to the compute nodes, such as qsig on ARCHER.
To stop just one simulation, the bout-stop-script can send a signal based on the path of the simulation data directory:
$ mpirun -np 4 ./conduction &
...
$ bout-stop-script data
This will stop the simulation cleanly, and:
$ mpirun -np 4 ./conduction &
...
$ bout-stop-script data -force
will kill the simulation immediately.
Manipulating restart files
It is sometimes useful to change the number of processors used in a simulation, or to modify restart files in various ways. For example, a 3D turbulence simulation might start with a quick 2D simulation with diffusive transport to reach a steady state. The restart files can then be extended into 3D, noise added to seed instabilities, and the files split over more processors.
Routines to modify restart files are in tools/pylib/boutdata/restart.py:
>>> from boutdata import restart
>>> help(restart)
Changing number of processors
To change the number of processors, use the redistribute function:
>>> from boutdata import restart
>>> restart.redistribute(32, path="../oldrun", output=".")
where in this example 32 is the number of processors desired; path sets the path to the existing restart files, and output is the path where the new restart files should go.
Note: make sure that path and output are different.
If your simulation is divided in the X and Y directions then you should also specify the number of processors in the X direction, NXPE:
>>> restart.redistribute(32, path="../oldrun", output=".", nxpe=8)
Note: currently this routine doesn't check that this split is consistent with branch cuts, e.g. for X-point tokamak simulations. If an inconsistent choice is made then the BOUT++ restart will fail.
Note: it is a good idea to set nxpe in the BOUT.inp file to be consistent with what you set here. If it is inconsistent then the restart will fail, but the error message may not be particularly enlightening.
BOUT++ physics models
Once you have tried some example codes, and generally got the hang of running BOUT++ and analysing the results, there will probably come a time when you want to change the equations being solved. This section demonstrates how a BOUT++ physics model is put together. It assumes you have a working knowledge of C or C++, but you don't need to be an expert: most of the messy code is hidden away from the physics model. There are several good books on C and C++, but I'd recommend online tutorials over books because there are a lot more of them, they're quicker to scan through, and they're cheaper.
Many of the examples which come with BOUT++ are physics models, and can be used as a starting point. Some relatively simple examples are blob2d (2D plasma filament/blob propagation), hasegawa-wakatani (2D turbulence), finite-volume/fluid (1D compressible fluid) and gas-compress (up to 3D compressible fluid). Some of the integrated tests (under tests/integrated) use either physics models (e.g. test-delp2 and test-drift-instability), or define their own main function (e.g. test-io and test-cyclic).
Heat conduction
The conduction example solves a 1D heat conduction equation:
\[\frac{\partial T}{\partial t} = \nabla_{||}(\chi\,\partial_{||} T)\]
The source code to solve this is in conduction.cxx, which we show here:
#include <bout/physicsmodel.hxx>

class Conduction : public PhysicsModel {
private:
  Field3D T;    // Evolving temperature equation only
  BoutReal chi; // Parallel conduction coefficient

protected:
  // This is called once at the start
  int init(bool UNUSED(restarting)) override {
    // Get the options
    auto& options = Options::root()["conduction"];

    // Read from BOUT.inp, setting default to 1.0
    // The doc() provides some documentation in BOUT.settings
    chi = options["chi"].doc("Conduction coefficient").withDefault(1.0);

    // Tell BOUT++ to solve T
    SOLVE_FOR(T);

    return 0;
  }

  int rhs(BoutReal UNUSED(time)) override {
    mesh->communicate(T); // Communicate guard cells

    ddt(T) = Div_par_K_Grad_par(chi, T); // Parallel diffusion Div_{||}( chi * Grad_{||}(T) )

    return 0;
  }
};

BOUTMAIN(Conduction);
Let's go through it line-by-line. First, we include the header that defines the PhysicsModel class:
#include <bout/physicsmodel.hxx>
This also brings in the header files that we need for the rest of the code. Next, we need to define a new class, Conduction, that inherits from PhysicsModel:
class Conduction : public PhysicsModel {
The PhysicsModel contains both the physical variables we want to evolve, like the temperature:
Field3D T; // Evolving temperature equation only
as well as any physical or numerical coefficients. In this case, we only have the parallel conduction coefficient, chi:
BoutReal chi; // Parallel conduction coefficient
A Field3D represents a 3D scalar quantity, while a BoutReal represents a single number. See the later section on Variables for more information.
After declaring our model variables, we need to define two functions: an initialisation function, init, that is called to set up the simulation and specify which variables are evolving in time; and a "right-hand side" function, rhs, that calculates the time derivatives of our evolving variables:
int init(bool restarting) override {
...
}
int rhs(BoutReal time) override {
...
}
PhysicsModel::init() takes as input a bool (true or false) that tells it whether or not the model is being restarted, which can be useful if something only needs to be done once before the simulation starts properly. The simulation (physical) time is passed to PhysicsModel::rhs() as a BoutReal.
The override keyword just lets the compiler know that we're overriding a method in the base class, and is not important to understand here.
Initialisation
During initialisation (the init function), the conduction example first reads an option from the input settings file (data/BOUT.inp by default):
auto& options = Options::root()["conduction"];
chi = options["chi"].doc("Conduction coefficient").withDefault(1.0);
This first gets a section called "conduction", then requests an option called "chi" inside this section. If this setting is not found, then the default value of 1.0 will be used. To set this value the BOUT.inp file contains:
[conduction]
chi = 1.0
which defines a section called "conduction", and within that section a variable called "chi". This value can also be overridden by specifying the setting on the command line:
$ ./conduction conduction:chi=2
where conduction:chi means the variable "chi" in the section "conduction". When this option is read, a message is printed to the BOUT.log files, giving the value used and the source of that value:
Option conduction:chi = 1 (data/BOUT.inp)
For more information on options and input files, see BOUT++ options, as well as the documentation for the Options class.
After reading the chi option, the init method then specifies which variables to evolve using the SOLVE_FOR macro:
// Tell BOUT++ to solve T
SOLVE_FOR(T);
This tells the BOUT++ time integration solver to evolve the variable T, setting its initial value from the input settings. It looks in a section with the same name as the variable (T here) for the variables "scale" and "function":
[T] # Settings for the T variable
scale = 1.0 # Size of the initial perturbation
function = gauss(y-pi, 0.2) # The form of the initial perturbation. y from 0 to 2*pi
The function is evaluated using expressions which can involve the x, y and z coordinates. More details are given in section Initialisation of time evolved variables.
Finally an error code is returned: 0 indicates no error. If init returns non-zero then the simulation will stop.
Time evolution
During time evolution, the time integration method (ODE integrator) calculates the system state (here T) at a given time. It then calls the PhysicsModel::rhs() function, which should calculate the time derivatives of all the evolving variables. In this case the job of the rhs function is to calculate ddt(T), the partial derivative of the variable T with respect to time, given the value of T:
\[\frac{\partial T}{\partial t} = \nabla_{||}(\chi\,\partial_{||} T)\]
The first thing the rhs function does is communicate the guard (halo) cells using Mesh::communicate():
mesh->communicate(T);
This is because BOUT++ does not (generally) do communications, but leaves it up to the user to decide when the most efficient or convenient time to do them is. Before we can take derivatives of a variable (here T), the values of the function must be known in the boundaries and guard cells, which requires communication between processors. By default the values in the guard cells are set to NaN, so if they are accidentally used without first communicating then the code should crash fairly quickly with a non-finite number error.
Once the guard cells have been communicated, we calculate the right hand side (RHS) of the equation above:
ddt(T) = Div_par_K_Grad_par(chi, T);
The function Div_par_K_Grad_par() is a function in the BOUT++ library which calculates the divergence in the parallel (y) direction of a constant multiplied by the gradient of a function in the parallel direction.
As with the init code, a non-zero return value indicates an error and will stop the simulation.
Running the model
The very last thing we need to do in our physics model is to define a main function. Here, we do it with the BOUTMAIN macro:
BOUTMAIN(Conduction);
You can define your own main() function, but for most cases this is enough. The macro expands to something like:
int main(int argc, char **argv) {
  BoutInitialise(argc, argv); // Initialise BOUT++

  Conduction *model = new Conduction(); // Create a model
  Solver *solver = Solver::create();    // Create a solver

  solver->setModel(model);          // Specify the model to solve
  solver->addMonitor(bout_monitor); // Monitor the solver

  solver->solve(); // Run the solver

  delete model;
  delete solver;

  BoutFinalise(); // Finished with BOUT++
  return 0;
}
This initialises the main BOUT++ library, creates the PhysicsModel and Solver, runs the solver, and finally cleans up the model, solver and library.
Magnetohydrodynamics (MHD)Â¶
When going through this section, it may help to refer to the finished
code, which is given in the file mhd.cxx
in the BOUT++ examples
directory under orszagtang
. The equations to be solved are:
As in the heat conduction example, a class is created which inherits from PhysicsModel and defines init and rhs functions:
class MHD : public PhysicsModel {
private:
int init(bool restarting) override {
...
}
int rhs(BoutReal t) override {
...
}
};
The init function is called once at the start of the simulation, and should set up the problem, specifying which variables are to be evolved. The argument restarting is false the first time a problem is run, and true if loading the state from a restart file. The rhs function is called every timestep, and should calculate the time-derivatives for a given state. In both cases returning non-zero tells BOUT++ that an error occurred.
Variables
We need to define the variables to evolve as member variables (so they can be used in both init and rhs). For ideal MHD, we need two 3D scalar fields, density \(\rho\) and pressure \(p\), and two 3D vector fields, velocity \(v\) and magnetic field \(B\):
class MHD : public PhysicsModel {
private:
Field3D rho, p; // 3D scalar fields
Vector3D v, B; // 3D vector fields
...
};
Scalar and vector fields behave much as you would expect: Field3D objects can be added, subtracted, multiplied and divided, so the following examples are all valid operations:
Field3D a, b, c;
BoutReal r;

a = b + c; a = b - c;
a = b * c; a = r * b;
a = b / c; a = b / r; a = r / b;
Similarly, vector objects can be added/subtracted from each other, multiplied/divided by scalar fields and real numbers, for example:
Vector3D a, b, c;
Field3D f;
BoutReal r;
a = b + c; a = b - c;
a = b * f; a = b * r;
a = b / f; a = b / r;
In addition, the dot and cross products are represented by the * and ^ symbols:
Vector3D a, b, c;
Field3D f;

f = a * b; // Dot product
a = b ^ c; // Cross product
For both scalar and vector field operations, so long as the result of an operation is of the correct type, the usual C/C++ shorthand notation can be used:
Field3D a, b;
Vector3D v, w;
a += b; v *= a; v -= w; v ^= w; // valid
v *= w; // NOT valid: result of dot product is a scalar
Note: the operator precedence of ^ is lower than +, * and /, so it is recommended to surround a ^ b with parentheses.
Evolution equations
At this point we can tell BOUT++ which variables to evolve, and where the state and time-derivatives will be stored. This is done using the bout_solve(variable, name) function in your physics model's init:
int init(bool restarting) {
bout_solve(rho, "density");
bout_solve(p, "pressure");
v.covariant = true; // evolve covariant components
bout_solve(v, "v");
B.covariant = false; // evolve contravariant components
bout_solve(B, "B");
return 0;
}
The name given to this function will be used in the output and restart data files. These will be automatically read and written depending on input options (see BOUT++ options). Input options based on these names are also used to initialise the variables.
If the name of the variable in the output file is the same as the variable name, you can use a shorthand macro. In this case, we could use this shorthand for v and B:
SOLVE_FOR(v);
SOLVE_FOR(B);
To make this even shorter, multiple fields can be passed to SOLVE_FOR (up to 10 at the time of writing). There are also the macros SOLVE_FOR2, SOLVE_FOR3, ..., SOLVE_FOR6, which are used in many models. Our initialisation code becomes:
int init(bool restarting) override {
...
bout_solve(rho, "density");
bout_solve(p, "pressure");
v.covariant = true; // evolve covariant components
B.covariant = false; // evolve contravariant components
SOLVE_FOR(v, B);
...
return 0;
}
Vector quantities can be stored in either covariant or contravariant form. The value of the Vector3D::covariant property when PhysicsModel::bout_solve() (or SOLVE_FOR) is called determines the form which is evolved in time and saved to the output file.
The equations to be solved can now be written in the rhs function. The value passed to the function (BoutReal t) is the simulation time, which is only needed if your equations contain time-dependent sources or similar terms. To refer to the time-derivative of a variable var, use ddt(var). The ideal MHD equations can be written as:
int rhs(BoutReal t) override {
  ddt(rho) = -V_dot_Grad(v, rho) - rho*Div(v);
  ddt(p) = -V_dot_Grad(v, p) - g*p*Div(v);
  ddt(v) = -V_dot_Grad(v, v) + ((Curl(B)^B) - Grad(p)) / rho;
  ddt(B) = Curl(v^B);
  return 0;
}
Here the differential operators vector = Grad(scalar), scalar = Div(vector), and vector = Curl(vector) are used. For the density and pressure equations, the \(\mathbf{v}\cdot\nabla\rho\) term could be written as v*Grad(rho), but this would then use central differencing in the Grad operator. Instead, the function V_dot_Grad() uses upwinding methods for these advection terms. In addition, the Grad() function will not operate on vector objects (since the result is neither scalar nor vector), so the \(\mathbf{v}\cdot\nabla\mathbf{v}\) term CANNOT be written as v*Grad(v).
Input options
Note that in the above equations an extra parameter g has been used for the ratio of specific heats. To enable this to be set in the input options file (see BOUT++ options), we use the Options object in the initialisation function:
class MHD : public PhysicsModel {
private:
  BoutReal g; // Ratio of specific heats

  int init(bool restarting) override {
    auto& globalOptions = Options::root();
    auto& options = globalOptions["mhd"];
    OPTION(options, g, 5.0 / 3.0);
...
This specifies that an option called "g" in a section called "mhd" should be put into the variable g. If the option could not be found, or was of the wrong type, the variable is set to the default value of \(5/3\). The value used will be printed to the output file, so if g is not set in the input file the following line will appear:
Option mhd:g = 1.66667 (default)
Options can be read as integers, reals, booleans and strings in the same way. To keep options specific to the physics model separate, they should be put in their own section; here the "mhd" section has been used.
Most of the time, the name of the variable (e.g. g) will be the same as the identifier in the options file ("g"). In this case, there is the macro:
OPTION(options, g, 5.0/3.0);
which is equivalent to:
g = options["g"].withDefault( 5.0/3.0 );
See BOUT++ options for more details of how to use the input options.
Communication
If you plan to run BOUT++ on more than one processor, any operations involving derivatives will require knowledge of data stored on other processors. To handle the necessary parallel communication, there is the mesh->communicate function. This takes care of where the data needs to go to/from, and only needs to be told which variables to transfer.
If you only need to communicate a small number of variables (up to 5 currently) then just call the Mesh::communicate() function directly. For the MHD code, we need to communicate the variables rho, p, v, B at the beginning of the PhysicsModel::rhs() function before any derivatives are calculated:
int rhs(BoutReal t) override {
  mesh->communicate(rho, p, v, B);
If you need to communicate lots of variables, or want to change at run-time which variables are evolved (e.g. depending on input options), then you can create a group of variables and communicate them later. To do this, first create a FieldGroup object, in this case called comms, then use the add method. This method does no communication, but records which variables to transfer when the communication is done later:
class MHD : public PhysicsModel {
private:
FieldGroup comms;
int init(bool restarting) override {
...
comms.add(rho);
comms.add(p);
comms.add(v);
comms.add(B);
...
The comms.add() routine can be given any number of variables at once (there's no practical limit on the total number of variables which are added to a FieldGroup), so this can be shortened to:
comms.add(rho, p, v, B);
To perform the actual communication, call the mesh->communicate function with the group. In this case we need to communicate all these variables before performing any calculations, so call this function at the start of the rhs routine:
int rhs(BoutReal t) override {
  mesh->communicate(comms);
...
In many situations there may be several groups of variables which can be communicated at different times. The mesh->communicate function consists of a call to Mesh::send() followed by Mesh::wait(), which can be done separately to interleave calculations and communications. This will speed up the code if parallel communication bandwidth is a problem for your simulation.
In our MHD example, the calculation of ddt(rho) and ddt(p) does not require B, so we could first communicate rho, p, and v, send B, and do some calculations whilst communications are performed:
int rhs(BoutReal t) override {
  mesh->communicate(rho, p, v); // Sends and receives rho, p and v
  comm_handle ch = mesh->send(B); // Only send B

  ddt(rho) = ...
  ddt(p) = ...

  mesh->wait(ch); // Now wait for B to arrive

  ddt(v) = ...
  ddt(B) = ...

  return 0;
}
This scheme is not used in mhd.cxx, partly for clarity, and partly because communications are currently not a significant bottleneck (too much inefficiency elsewhere!).
When a differential is calculated, points on neighbouring cells are assumed to be in the guard cells. There is no way to calculate the result of the differential in the guard cells, and so after every differential operator the values in the guard cells are invalid. Therefore, if you take the output of one differential operator and use it as input to another differential operator, you must perform communications (and set boundary conditions) first. See Differential operators.
Error handling
Finding where bugs have occurred in a (fairly large) parallel code is a difficult problem. This is more of a concern for developers of BOUT++ (see the developers' manual), but it is still useful for users to be able to hunt down bugs in their own code, or help narrow down where a bug could be occurring.
If you have a bug which is easily reproducible, i.e. it occurs almost immediately every time you run the code, then the easiest way to hunt it down is to insert lots of output.write statements (see Logging output). Things get harder when a bug only occurs after a long time of running, and/or only occasionally. For this type of problem, a useful tool can be the message stack. An easy way to use the message stack is the TRACE macro:
{
TRACE("Some message here"); // message pushed
} // Scope ends, message popped
This will push the message, then pop it when the current scope ends (except when an exception occurs). The error message will also have the file name and line number appended, to help find where an error occurred. The run-time overhead of this should be small, but it can be removed entirely if the compile-time flag CHECK is not defined or is set to 0. This turns off checking, and TRACE becomes an empty macro. It is possible to use standard printf-like formatting with the TRACE macro, for example:
{
  TRACE("The value of i is %d and this is an arbitrary %s", i, "string"); // message pushed
} // Scope ends, message popped
In the mhd.cxx example each part of the rhs function is trace'd. If an error occurs then at least the equation where it happened will be printed:
{
  TRACE("ddt(rho)");
  ddt(rho) = -V_dot_Grad(v, rho) - rho*Div(v);
}
Boundary conditions
All evolving variables have boundary conditions applied automatically before the rhs function is called (or afterwards if the boundaries are being evolved in time). Which condition is applied depends on the options file settings (see Boundary conditions). If you want to disable this and apply your own boundary conditions, then set the boundary condition to none in the BOUT.inp options file.
In addition to evolving variables, it's sometimes necessary to impose boundary conditions on other quantities which are not explicitly evolved.
The simplest way to set a boundary condition is to specify it as text; for example, to apply a Dirichlet boundary condition:
Field3D var;
...
var.applyBoundary("dirichlet");
The format is exactly the same as in the options file. Each time this is called it must parse the text, and create and destroy boundary objects. To avoid this overhead, and to allow different boundary conditions for each region, it's better to first set the boundary conditions you want to use in init, then just apply them every time:
class MHD : public PhysicsModel {
Field3D var;
int init(bool restarting) override {
...
var.setBoundary("myVar");
...
}
int rhs(BoutReal t) override {
...
var.applyBoundary();
...
}
};
This will look in the options file for a section called [myvar] (upper or lower case doesn't matter), in the same way that evolving variables are handled. In fact this is precisely what is done: inside PhysicsModel::bout_solve() (or SOLVE_FOR) the Field3D::setBoundary method is called, and then after rhs the Field3D::applyBoundary() method is called on each evolving variable. This method also gives you the flexibility to apply different boundary conditions on different boundary regions (e.g. radial boundaries and target plates); the first method just applies the same boundary condition to all boundaries.
Another way to set the boundaries is to copy them from another variable:
Field3D a, b;
...
a.setBoundaryTo(b); // Copy b's boundaries into a
...
Note that this will copy the value at the boundary, which is halfway between mesh points. This is not the same as copying the guard cells from field b to field a. The value at the boundary is calculated using a second-order central difference. For example, if there is one boundary cell, so that a(0,y,z) is the boundary cell and a(1,y,z) is in the domain, then the boundary would be set so that:
a(0,y,z) + a(1,y,z) = b(0,y,z) + b(1,y,z)
rearranged as:
a(0,y,z) = -a(1,y,z) + b(0,y,z) + b(1,y,z)
To copy the boundary cells (and communication guard cells), iterate over them:
BOUT_FOR(i, a.getRegion("RGN_GUARDS")) {
a[i] = b[i];
}
See Iterating over fields for more details on iterating over custom regions.
Custom boundary conditions
The boundary conditions supplied with the BOUT++ library cover the most common situations, but cannot cover all of them. If the boundary condition you need isn't available, then it's quite straightforward to write your own. First you need to make sure that your boundary condition isn't going to be overwritten. To do this, set the boundary condition to "none" in the BOUT.inp options file, and BOUT++ will leave that boundary alone. For example:
[P]
bndry_all = dirichlet
bndry_xin = none
bndry_xout = none
would set all boundaries for the variable âPâ to zero value, except for the X inner and outer boundaries which will be left alone for you to modify.
To set an X boundary condition, it's necessary to test whether the processor is at the left boundary (first in X) or the right boundary (last in X). Note that it might be both if NXPE = 1, or neither if NXPE > 2.
Field3D f;
...
if(mesh->firstX()) {
  // At the left of the X domain
  // Set f[0:1][*][*] i.e. first two points in X, all Y and all Z
  for(int x=0; x < 2; x++)
    for(int y=0; y < mesh->LocalNy; y++)
      for(int z=0; z < mesh->LocalNz; z++) {
        f(x,y,z) = ...
      }
}
if(mesh->lastX()) {
  // At the right of the X domain
  // Set last two points in X
  for(int x=mesh->LocalNx-2; x < mesh->LocalNx; x++)
    for(int y=0; y < mesh->LocalNy; y++)
      for(int z=0; z < mesh->LocalNz; z++) {
        f(x,y,z) = ...
      }
}
Note that the size of the local mesh including guard cells is given by Mesh::LocalNx, Mesh::LocalNy, and Mesh::LocalNz. The functions Mesh::firstX() and Mesh::lastX() return true only if the current processor is at the left or right of the X domain respectively.
Setting custom Y boundaries is slightly more complicated than X
boundaries, because target or limiter plates could cover only part of
the domain. Rather than use a for
loop to iterate over the points
in the boundary, we need to use a more general iterator:
Field3D f;
...
RangeIterator it = mesh->iterateBndryLowerY();
for(it.first(); !it.isDone(); it++) {
  // it.ind contains the x index
  for(int y=2; y>=0; y--)   // Boundary width 3 points
    for(int z=0; z<mesh->LocalNz; z++) {
      ddt(f)(it.ind,y,z) = 0.; // Set time-derivative to zero in boundary
    }
}
This would set the time-derivative of f to zero in a boundary of width 3 in Y (from 0 to 2 inclusive). In the same way mesh->iterateBndryUpperY() can be used to iterate over the upper boundary:
RangeIterator it = mesh->iterateBndryUpperY();
for(it.first(); !it.isDone(); it++) {
  // it.ind contains the x index
  for(int y=mesh->LocalNy-3; y<mesh->LocalNy; y++) // Boundary width 3 points
    for(int z=0; z<mesh->LocalNz; z++) {
      ddt(f)(it.ind,y,z) = 0.; // Set time-derivative to zero in boundary
    }
}
Initial profiles
Up to this point the code is evolving total density, pressure etc. This has advantages for clarity, but has problems numerically: for small perturbations, rounding error and tolerances in the time-integration mean that linear dispersion relations are not calculated correctly. The solution to this is to write all equations in terms of an initial "background" quantity and a time-evolving perturbation, for example \(\rho(t) \rightarrow \rho_0 + \tilde{\rho}(t)\). For this reason, the initialisation of all variables passed to the PhysicsModel::bout_solve function is a combination of small-amplitude gaussians and waves; the user is expected to have performed this separation into background and perturbed quantities.
To read in a quantity from a grid file, there is the mesh->get
function:
Field2D Ni0; // Background density
int init(bool restarting) override {
...
mesh->get(Ni0, "Ni0");
...
}
As with the input options, most of the time the name of the variable in the physics code will be the same as the name in the grid file to avoid confusion. In this case, you can just use:
GRID_LOAD(Ni0);
which is equivalent to:
mesh->get(Ni0, "Ni0");
(see Mesh::get()
).
Output variables
BOUT++ always writes the evolving variables to file, but often it's useful to add other variables to the output. For convenience you might want to write the normalised starting profiles or other non-evolving values to file. For example:
Field2D Ni0;
...
GRID_LOAD(Ni0);
dump.add(Ni0, "Ni0", 0);
where the '0' at the end means the variable should only be written to file once at the start of the simulation. For convenience there are some macros, e.g.:
SAVE_ONCE(Ni0);
is equivalent to:
dump.add(Ni0, "Ni0", 0);
(see Datafile::add()
). In some situations you might also want to write
some data to a different file. To do this, create a Datafile
object:
Datafile mydata;
in init
, you then:
(optional) Initialise the file, passing it the options to use. If you skip this step, default (sane) options will be used. This lets you enable or disable output, use parallel I/O, set whether files are opened and closed every time, etc.:
mydata = Datafile(Options::getRoot()->getSection("mydata"));
which would use options in a section [mydata] in BOUT.inp.
Open the file for writing:
mydata.openw("mydata.nc")
(see Datafile::openw()). By default this only specifies the file name; actual opening of the file happens later when the data is written. If you are not using parallel I/O, the processor number is also inserted into the file name before the last '.', so "mydata.nc" becomes "mydata.0.nc", "mydata.1.nc" etc. The file format used depends on the extension, so ".nc" will open NetCDF, and ".hdf5" or ".h5" an HDF5 file (see e.g. src/fileio/datafile.cxx line 139, which calls src/fileio/dataformat.cxx line 23, which then calls the file format interface e.g. src/fileio/impls/netcdf/nc_format.cxx line 172).
Add variables to the file:
// Not evolving. Every time the file is written, this will be overwritten
mydata.add(variable, "name");
// Evolving. Will output a sequence of values
mydata.add(variable2, "name2", 1);
Whenever you want to write values to the file, for example in
rhs
or a monitor, just call:
mydata.write();
(see Datafile::write()
). To collect the data afterwards, you can
specify the prefix to collect. In Python (see
collect()
):
>>> var = collect("name", prefix="mydata")
By default the prefix is "BOUT.dmp".
Variable attributes
An experimental feature is the ability to add attributes to output variables. Do this with Datafile::setAttribute():
dump.setAttribute(variable, attribute, value);
where variable
is the name of the variable; attribute
is the
name of the attribute, and value
can be either a string or an
integer. For example:
dump.setAttribute("Ni0", "units", "m^-3");
Reduced MHD
The MHD example presented previously covered some of the functions
available in BOUT++, which can be used for a wide variety of models.
There are however several other significant functions and classes
which are commonly used, which will be illustrated using the
reconnect-2field example. This solves equations for \(A_\parallel\) and vorticity \(U\), with \(\phi\) and \(j_\parallel\) given by \(\nabla_\perp^2 \phi = B_0 U\) and \(j_\parallel = \nabla_\perp^2 A_\parallel\) (as implemented in the code below).
First create the variables which are going to be evolved, ensure theyâre communicated:
class TwoField : public PhysicsModel {
private:
  Field3D U, Apar; // Evolving variables
  int init(bool restarting) override {
    SOLVE_FOR(U, Apar);
    return 0;
  }
  int rhs(BoutReal t) override {
    mesh->communicate(U, Apar);
    return 0;
  }
};
In order to calculate the time derivatives, we need the auxiliary variables \(\phi\) and \(j_{}\). Calculating \(j_{}\) from \(A_{}\) is a straightforward differential operation, but getting \(\phi\) from \(U\) means inverting a Laplacian.
Field3D U, Apar;
Field3D phi, jpar; // Auxiliary variables
int init(bool restarting) override {
  SOLVE_FOR(U, Apar);
  SAVE_REPEAT(phi, jpar); // Save variables in output file
  return 0;
}
int rhs(BoutReal t) override {
  phi = invert_laplace(mesh->Bxy*U, phi_flags); // Solve for phi
  mesh->communicate(U, Apar, phi); // Communicate phi
  jpar = Delp2(Apar); // Calculate jpar
  mesh->communicate(jpar); // Communicate jpar
  return 0;
}
Note that the Laplacian inversion code takes care of boundary regions, so U doesn't need to be communicated first. The differential operator Delp2, like all differential operators, needs the values in the guard cells, so Apar needs to be communicated before calculating jpar. Since we will need to take derivatives of jpar later, this needs to be communicated as well.
int rhs(BoutReal t) override {
  ...
  mesh->communicate(jpar);
  ddt(U) = -b0xGrad_dot_Grad(phi, U) + SQ(mesh->Bxy)*Grad_par(jpar / mesh->Bxy);
  ddt(Apar) = -Grad_par(phi) / beta_hat - eta*jpar / beta_hat;
}
Logging output
Logging should be used to report simulation progress, record information, and warn about potential problems. BOUT++ includes a simple logging facility which supports both C printf and C++ iostream styles. For example:
output.write("This is an integer: %d, and this a real: %e\n", 5, 2.0);
output << "This is an integer: " << 5 << ", and this a real: " << 2.0 << endl;
Messages sent to output
on processor 0 will be printed to console
and saved to BOUT.log.0
. Messages from all other processors will
only go to their log files, BOUT.log.#
where #
is the
processor number.
Note: If an error occurs on a processor other than processor 0, then the error message will usually only be in the log file, not printed to console. If BOUT++ crashes but no error message is printed, try looking at the ends of all log files:
$ tail BOUT.log.*
For finer control over which messages are printed, several outputs are available, listed in the table below.
Name | Usage
output_debug | For highly verbose output messages that are normally not needed. Must be enabled with a compile-time switch
output_info | For information such as which options are being used
output_progress | For information about the current progress
output_warn | For warnings
output_error | For errors
Controlling logging level
By default all of the outputs except output_debug are saved to the log and printed to console (processor 0 only). To reduce the volume of output, the command line argument -q (quiet) reduces the output level by one, and -v (verbose) increases it by one. Running with -q in the command line arguments suppresses the output_info messages, so that they will not appear in the console or log file. Running with -q -q suppresses everything except output_warn and output_error.
To enable the output_debug messages, first configure BOUT++ with debug messages enabled by adding -DDEBUG_ENABLED to BOUT_FLAGS in make.config, and then recompile with make clean; make. When running BOUT++, add a -v flag to see the output_debug messages.
Updating Physics Models from v3 to v4
Version 4.0.0 of BOUT++ introduced several features which break backwards compatibility. If you already have physics models, you will most likely need to update them to work with version 4. The main breaking changes which you are likely to come across are:
- Using round brackets () instead of square brackets [] for indexing fields
- Moving components of Mesh related to the metric tensor and "real space" out into a new object, Coordinates
- Changing some Field3D member functions into non-member functions
- The shifted metric method has changed in version 4, so that fields are stored in orthogonal X-Z coordinates rather than field-aligned coordinates. This has implications for boundary conditions and post-processing. See Parallel Transforms for more information.
A new tool is provided, bin/bout_3to4.py, which can identify these changes and fix most of them automatically. Simply run this program on your physics model to see how to update it to work with version 4:
$ ${BOUT_TOP}/bin/bout_3to4.py my_model.cxx
The output of this command will show you how to fix each problem it identifies. To automatically apply them, you can use the --replace option:
$ ${BOUT_TOP}/bin/bout_3to4.py --replace my_model.cxx
Also in version 4 is a new syntax for looping over each point in a field. See Iterating over fields for more information.
More examples
The code and input files in the examples/
subdirectory are for
research, demonstrating BOUT++, and to check for broken functionality.
Some proper unit tests have been implemented, but this is something
which needs improving. The examples which were published in [Dudson2009] were drift-instability, interchange-instability and orszag-tang.
[Dudson2009] https://doi.org/10.1016/j.cpc.2009.03.008
advect1d
The model in gas_compress.cxx solves the compressible gas dynamics equations for the density \(n\), velocity \(\mathbf{V}\), and pressure \(P\).
drift-instability
The physics code 2fluid.cxx implements a set of reduced Braginskii two-fluid equations, similar to those solved by the original BOUT code. This evolves 6 variables: density, electron and ion temperatures, parallel ion velocity, parallel current density and vorticity.
Input grid files are the same as the original BOUT code, but the output format is different.
interchange-instability
sod-shock
Makefiles and compiling BOUT++
BOUT++ has its own makefile system, which can be used to build the library, the examples, and your own physics models. In all makefiles, BOUT_TOP is required!
These makefiles are sufficient for most uses, but for more complicated cases an executable script bout-config can be used to get the compilation flags (see bout-config script).
Executables example
If writing an example (or physics module that executes) then the makefile is very simple:
BOUT_TOP = ../..
SOURCEC = <filename>.cxx
include $(BOUT_TOP)/make.config
where BOUT_TOP refers to the relative (or absolute) location of the BOUT++ directory (the one that includes /lib and /src), and SOURCEC is the name of your file, e.g. gas_compress.cxx.
Optionally, it is possible to specify TARGET
which defines what the
executable should be called (e.g. if you have multiple source files).
Thatâs it!
Multiple subdirectories
Large physics modules can have many files, and it can be helpful to
organise these into subdirectories. An example of how to do this is in
examples/make_subdir
.
In the top level, list the directories
DIRS = fuu bar
In the makefile in each subdirectory, specify
TARGET = sub
then specify the path to the toplevel directory
MODULE_DIR = ..
and the name of the subdirectory that the makefile is in
SUB_NAME = fuu
Modules example
If you are writing a new module (or concrete implementation) to go into the BOUT++ library, then it is again pretty simple:
BOUT_TOP = ../..
SOURCEC = communicator.cxx difops.cxx geometry.cxx grid.cxx \
interpolation.cxx topology.cxx
SOURCEH = $(SOURCEC:%.cxx=%.h)
TARGET = lib
include $(BOUT_TOP)/make.config
TARGET must be lib to signify you are adding to libbout++.a. The other variables should be pretty self-explanatory.
Adding a new subdirectory to 'src'
No worries, just make sure to edit src/makefile
to add it to the
DIRS
variable.
bout-config script
The bout-config script is in the bin subdirectory of the BOUT++ distribution, and is generated by configure. This script can be used to get the compilers, flags and settings used to compile BOUT++. To get a list of available options:
$ bout-config --help
To get the library linking flags, for example:
$ bout-config --libs
This script can be used in makefiles to compile BOUT++ alongside other libraries. The easiest way is to use bout-config to find the make.config file which contains the settings. For example the heat conduction example can be compiled with the following makefile:
SOURCEC = conduction.cxx
include $(shell bout-config --config-file)
This includes the make.config file installed with bout-config, rather than using the BOUT_TOP variable.
A different way to use bout-config is to get the compiler and linker flags, and use them in your own makefile, for example:
CXX=`bout-config --cxx`
CFLAGS=`bout-config --cflags`
LD=`bout-config --ld`
LDFLAGS=`bout-config --libs`

conduction: conduction.cxx
	$(CXX) $(CFLAGS) -c conduction.cxx -o conduction.o
	$(LD) -o conduction conduction.o $(LDFLAGS)
A more general example is in examples/make-script.
.
Variable initialisation
Variables in BOUT++ are not initialised automatically, but must be
explicitly given a value. For example the following code declares a
Field3D
variable then attempts to access a particular element:
Field3D f; // Declare a variable
f(0,0,0) = 1.0; // Error!
This results in an error because the data array to store values in f
has not been allocated. Allocating data can be done in several ways:
Initialise with a value:
Field3D f = 0.0; // Allocates memory, fills with zeros
f(0,0,0) = 1.0;  // ok
This cannot be done at a global scope, since it requires the mesh to already exist and have a defined size.
Set to a scalar value:
Field3D f;
f = 0.0;         // Allocates memory, fills with zeros
f(0,0,0) = 1.0;  // ok
Note that setting a field equal to another field has the effect of making both fields share the same underlying data. This behaviour is similar to how NumPy arrays behave in Python.
Field3D g = 0.0; // Allocates memory, fills with zeros
Field3D f = g;   // f now shares memory with g
f(0,0,0) = 1.0;  // g also modified
To ensure that a field has a unique underlying memory array, call the Field3D::allocate() method before writing to individual indices:
Field3D f;
f.allocate();    // Allocates memory, values undefined
f(0,0,0) = 1.0;  // ok
In a BOUT++ simulation some variables are typically evolved in time. The initialisation of these variables is handled by the time integration solver.
Initialisation of time evolved variables
Each variable being evolved has its own section, with the same name as the output data. For example, the high-\(\beta\) model has variables "P", "jpar", and "U", and so has sections [P], [jpar], [U] (not case sensitive).
Expressions
The recommended way to initialise a variable is to use the function option for each variable:
[p]
function = 1 + gauss(x-0.5)*gauss(y)*sin(z)
This evaluates an analytic expression to initialise the \(P\) variable. Expressions can include the usual operators (+, -, *, /), including ^ for exponents. The following values are also already defined:
Name | Description
x | \(x\) position between \(0\) and \(1\)
y | \(y\) position between \(0\) and \(2\pi\) (excluding the last point)
z | \(z\) position between \(0\) and \(2\pi\) (excluding the last point)
pi, π | \(3.1415\ldots\)
Table: Initialisation expression values
By default, \(x\) is defined as i / (nx - 2*MXG), where MXG is the width of the boundary region, by default 2. Hence \(x\) actually goes from 0 on the leftmost point to (nx-1)/(nx-4) on the rightmost point. This is not a particularly good definition, but for most cases it's sufficient to create some initial profiles. For some problems like island reconnection simulations, it's useful to define \(x\) in a particular way which is more symmetric than the default. To do this, set in BOUT.inp
To do this, set in BOUT.inp
[mesh]
symmetricGlobalX = true
This will change the definition of \(x\) to i / (nx - 1), so \(x\) is then between \(0\) and \(1\) everywhere.
By default the expressions are evaluated in a fieldaligned coordinate system,
i.e. if you are using the [mesh]
option paralleltransform = shifted
,
the input f
will have f = fromFieldAligned(f)
applied before being
returned. To switch off this behaviour and evaluate the input expressions in
coordinates with orthogonal xz (i.e. toroidal \(\{\psi,\theta,\phi\}\)
coordinates when using paralleltransform = shifted
), set in BOUT.inp
[input]
transform_from_field_aligned = false
The functions in Table 1 are also available in expressions.
Name | Description
abs(x) | Absolute value \(|x|\)
asin(x), acos(x), atan(x), atan(y,x) | Inverse trigonometric functions
ballooning(x) | Ballooning transform ((1), Fig. 3)
ballooning(x,n) | Ballooning transform, using \(n\) terms (default 3)
cos(x) | Cosine
cosh(x) | Hyperbolic cosine
exp(x) | Exponential
tanh(x) | Hyperbolic tangent
gauss(x) | Gaussian \(\exp(-x^2/2) / \sqrt{2\pi}\)
gauss(x, w) | Gaussian \(\exp[-x^2/(2w^2)] / (w\sqrt{2\pi})\)
H(x) | Heaviside function: \(1\) if \(x > 0\) otherwise \(0\)
log(x) | Natural logarithm
max(x,y,...) | Maximum (variable arguments)
min(x,y,...) | Minimum (variable arguments)
mixmode(x) | A mixture of Fourier modes
mixmode(x, seed) | seed determines random phase (default 0.5)
power(x,y) | Exponent \(x^y\)
sin(x) | Sine
sinh(x) | Hyperbolic sine
sqrt(x) | \(\sqrt{x}\)
tan(x) | Tangent
erf(x) | The error function
TanhHat(x, width, centre, steepness) | The hat function \(\frac{1}{2}(\tanh[s (x-[c-\frac{w}{2}])] - \tanh[s (x-[c+\frac{w}{2}])])\)
fmod(x,y) | The modulo operator; returns the floating-point remainder
For field-aligned tokamak simulations, the Y direction is along the field, and in the core this will have a discontinuity at the twist-shift location where field-lines are matched onto each other. To handle this, the ballooning function applies a truncated ballooning transformation to construct a smooth initial perturbation.
There is an example code test-ballooning which compares methods of setting initial conditions with the ballooning transform.
The mixmode(x) function is a mixture of Fourier modes of the form:
where \(\phi\) is a random phase between \(-\pi\) and \(+\pi\), which depends on the seed. The factor in front of each term is chosen so that the 4th harmonic (\(i=4\)) has the highest amplitude. This is useful mainly for initialising turbulence simulations, where a mixture of mode numbers is desired.
Initialising variables with the FieldFactory class
This class provides a way to generate a field with a specified form. For
example to create a variable var
from options we could write
FieldFactory f(mesh);
Field2D var = f.create2D("var");
This will look for an option called "var", and use that expression to initialise the variable var. This could then be set in the BOUT.inp file or on the command line:
var = gauss(x-0.5,0.2)*gauss(y)*sin(3*z)
To do this, FieldFactory
implements a recursive descent
parser to turn a string containing something like
"gauss(x-0.5,0.2)*gauss(y)*sin(z)" into values in a
Field3D
or Field2D
object. Examples are
given in the testfieldfactory
example:
FieldFactory f(mesh);
Field2D b = f.create2D("1 - x");
Field3D d = f.create3D("gauss(x-0.5,0.2)*gauss(y)*sin(z)");
This is done by creating a tree of FieldGenerator
objects
which then generate the field values:
class FieldGenerator {
public:
virtual ~FieldGenerator() { }
virtual FieldGenerator* clone(const list<FieldGenerator*> args) {return NULL;}
virtual BoutReal generate(int x, int y, int z) = 0;
};
All classes inheriting from FieldGenerator
must implement
a FieldGenerator::generate()
function, which returns the
value at the given (x,y,z)
position. Classes should also implement
a FieldGenerator::clone()
function, which takes a list of
arguments and creates a new instance of its class. This takes as input
a list of other FieldGenerator
objects, allowing a
variable number of arguments.
The simplest generator is a fixed numerical value, which is
represented by a FieldValue
object:
class FieldValue : public FieldGenerator {
public:
FieldValue(BoutReal val) : value(val) {}
BoutReal generate(int x, int y, int z) { return value; }
private:
BoutReal value;
};
Adding a new function
To add a new function to the FieldFactory, a new
FieldGenerator
class must be defined. Here we will use
the example of the sinh
function, implemented using a class
FieldSinh
. This takes a single argument as input, but
FieldPI
takes no arguments, and
FieldGaussian
takes either one or two. Study these after
reading this to see how these are handled.
First, edit src/field/fieldgenerators.hxx
and add a class
definition:
class FieldSinh : public FieldGenerator {
public:
FieldSinh(FieldGenerator* g) : gen(g) {}
~FieldSinh() {if(gen) delete gen;}
FieldGenerator* clone(const list<FieldGenerator*> args);
BoutReal generate(int x, int y, int z);
private:
FieldGenerator *gen;
};
The gen member is used to store the input argument, and to make sure it's deleted properly we add some code to the destructor. The
constructor takes a single input, the FieldGenerator
argument to the sinh
function, which is stored in the member
gen
.
Next edit src/field/fieldgenerators.cxx
and add the implementation
of the clone
and generate
functions:
FieldGenerator* FieldSinh::clone(const list<FieldGenerator*> args) {
if(args.size() != 1) {
throw ParseException("Incorrect number of arguments to sinh function. Expecting 1, got %d", args.size());
}
return new FieldSinh(args.front());
}
BoutReal FieldSinh::generate(int x, int y, int z) {
  return sinh(gen->generate(x,y,z));
}
The clone
function first checks the number of arguments using
args.size()
. This is used in FieldGaussian
to handle
different numbers of input, but in this case we throw a
ParseException if the number of inputs isn't
one. clone then creates a new FieldSinh object,
then creates a new FieldSinh
object,
passing the first argument ( args.front()
) to the constructor
(which then gets stored in the gen
member variable).
The generate
function for sinh
just gets the value of the input
by calling gen->generate(x,y,z), calculates sinh of it and
, calculates sinh
of it and
returns the result.
The clone function means that the parsing code can make copies of any FieldGenerator class if it's given a single instance to start with. The final step is therefore to give the
FieldFactory
class an instance of this new
generator. Edit the FieldFactory
constructor
FieldFactory::FieldFactory()
in
src/field/field_factory.cxx
and add the line:
addGenerator("sinh", new FieldSinh(NULL));
Thatâs it! This line associates the string "sinh"
with a
FieldGenerator
. Even though FieldFactory
doesnât know what type of FieldGenerator
it is, it can
make more copies by calling the clone
member function. This is a
useful technique for polymorphic objects in C++ called the "Virtual Constructor" idiom.
Parser internals
When a FieldGenerator
is added using the addGenerator
function, it is entered into a std::map
which maps strings to
FieldGenerator
objects (include/field_factory.hxx
):
map<string, FieldGenerator*> gen;
Parsing a string into a tree of FieldGenerator
objects is
done by first splitting the string up into separate tokens, such as operators like "*", brackets "(", and names like "sinh", then recognising patterns in the stream of tokens. Recognising tokens is done in src/field/field_factory.cxx:
char FieldFactory::nextToken() {
...
This returns the next token, and sets the variable char curtok to the same value. This can be one of:
- -1 if the next token is a number. The variable BoutReal curval is set to the value of the token
- -2 for a string (e.g. "sinh", "x" or "pi"). This includes anything which starts with a letter, and contains only letters, numbers, and underscores. The string is stored in the variable string curident.
- 0 to mean end of input
- The character itself if none of the above. Since letters and numbers are taken care of (see above), this includes brackets and operators like '+' and '-'.
The parsing stage turns these tokens into a tree of
FieldGenerator
objects, starting with the parse()
function:
FieldGenerator* FieldFactory::parse(const string &input) {
...
which puts the input string into a stream so that nextToken()
can
use it, then calls the parseExpression()
function to do the actual
parsing:
FieldGenerator* FieldFactory::parseExpression() {
...
This breaks down expressions in stages, starting with writing every expression as:
expression := primary [ op primary ]
i.e. a primary expression, and optionally an operator and another
primary expression. Primary expressions are handled by the
parsePrimary()
function, so first parsePrimary()
is called, and
then parseBinOpRHS
which checks if there is an operator, and if so
calls parsePrimary()
to parse it. This code also takes care of
operator precedence by keeping track of the precedence of the current
operator. Primary expressions are then further broken down and can
consist of either a number, a name (identifier), a minus sign and a
primary expression, or brackets around an expression:
primary := number
        := identifier
        := '-' primary
        := '(' expression ')'
        := '[' expression ']'
The minus sign case is needed to handle the unary minus, e.g. "-x".
Identifiers are handled in parseIdentifierExpr()
which handles
either variable names, or functions
identifier := name
:= name '(' expression [ ',' expression [ ',' ... ] ] ')'
i.e. a name, optionally followed by brackets containing one or more
expressions separated by commas. Names without brackets are treated the
same as those with empty brackets, so "x"
is the same as "x()"
.
A list of inputs (list<FieldGenerator*> args;
) is created, the
gen
map is searched to find the FieldGenerator
object
corresponding to the name, and the list of inputs is passed to the
objectâs clone
function.
Boundary conditions
Like the variable initialisation, boundary conditions can be set for each variable in individual sections, with default values in a section [All]. Boundary conditions are specified for each variable, being applied to the variable itself during initialisation, and to the time-derivatives at each timestep. They are a combination of a basic boundary condition and optional modifiers.
When finding the boundary condition for a variable var
on a boundary
region, the options are checked in order from most to least specific:
1. Section var, bndry_ + region name. Depending on the mesh file, regions of the grid are given labels. Currently these are core, sol, pf and target, which are intended for tokamak edge simulations. Hence the variables checked are bndry_core, bndry_pf etc.
2. Section var, bndry_ + boundary side. These names are xin, xout, yup and ydown.
3. Section var, variable bndry_all
4. The same settings again, except in section All.
The default setting for everything is therefore bndry_all
in the
All
section.
Boundary conditions are given names, with optional arguments in brackets. Currently implemented boundary conditions are:
- dirichlet : Set to zero
- dirichlet(<number>) : Set to some number, e.g. dirichlet(1) sets the boundary to \(1.0\)
- neumann : Zero gradient
- robin : A combination of zero-gradient and zero-value, \(a f + b\frac{\partial f}{\partial x} = g\), where the syntax is robin(a, b, g)
- constgradient : Constant gradient across boundary
- zerolaplace : Laplacian = 0, decaying solution (X boundaries only)
- zerolaplace2 : Laplacian = 0, using coefficients from the Laplacian inversion and Delp2 operator
- constlaplace : Laplacian = const, decaying solution (X boundaries only)
The zero- or constant-Laplacian boundary conditions work as follows: the boundary is filled by solving \(\nabla_\perp^2 f = 0\) (or a constant), which when Fourier transformed in \(z\) becomes \(\frac{\partial^2 \hat{f}}{\partial x^2} = k_z^2 \hat{f}\), which has the solution \(\hat{f} = A e^{k_z x} + B e^{-k_z x}\). Assuming that the solution should decay away from the domain, on the inner \(x\) boundary \(B = 0\), and on the outer boundary \(A = 0\). Boundary modifiers change the behaviour of boundary conditions, and more than one modifier can be used. Currently the following are available:
- relax : Relaxing boundaries. Evolve the variable towards the given boundary condition at a given rate
- shifted : Apply boundary conditions in orthogonal X-Z coordinates, rather than field-aligned
- width : Modifies the width of the region over which the boundary condition is applied
These are described in the following subsections.
Relaxing boundaries
All boundaries can be modified to be "relaxing", a combination of a zero-gradient time-derivative and whatever boundary condition they are applied to. The idea is that this prevents sharp discontinuities at boundaries during transients, whilst maintaining the desired boundary condition on longer timescales. In some cases this can improve the numerical stability and timestep.
For example, relax(dirichlet) will make a field \(f\) at point \(i\) in the boundary follow a point \(i-1\) in the domain:
\(\frac{\partial f_i}{\partial t} = \frac{\partial f_{i-1}}{\partial t} + \frac{1}{\tau}\left(f_i^{bc} - f_i\right)\)
where \(f_i^{bc}\) is the value required by the underlying boundary condition, and \(\tau\) is a timescale for the boundary (currently set to 0.1, but will be a global option). When the time-derivatives are slow close to the boundary, the boundary relaxes to the desired condition (Dirichlet in this case), but when the time-derivatives are large then the boundary approaches Neumann to reduce discontinuities.
By default, the relaxation rate is set to \(10\) (i.e. a timescale
of \(\tau=0.1\)). To change this, give the rate as the second
argument e.g. relax(dirichlet, 2)
would relax to a Dirichlet
boundary condition at a rate of \(2\).
Shifted boundaries
By default boundary conditions are applied in field-aligned coordinates, where \(y\) is along field-lines but \(x\) has a discontinuity at the twist-shift location. If radial derivatives are being done in shifted coordinates where \(x\) and \(z\) are orthogonal, then boundary conditions should also be applied in shifted coordinates. To do this, the shifted boundary modifier applies a \(z\) shift, applies the boundary condition, then shifts back. For example:
bndry_core = shifted( neumann )
would ensure that radial derivatives were zero in shifted coordinates on the core boundary.
Changing the width of boundaries
To change the width of a boundary region, the width
modifier changes
the width of a boundary region before applying the boundary condition,
then changes the width back afterwards. To use, specify the boundary
condition and the width, for example
bndry_core = width( neumann , 4 )
would apply a Neumann boundary condition on the innermost 4 cells in the core, rather than the usual 2. When combining with other boundary modifiers, this should be applied first e.g.
bndry_sol = width( relax( dirichlet ), 3)
would relax the last 3 cells towards zero, whereas
bndry_sol = relax( width( dirichlet, 3) )
would only apply to the usual 2, since relax didn't use the updated width.
Limitations:
- Because it modifies then restores a globally-used BoundaryRegion, this code is not thread safe.
- Boundary conditions can't be applied across processors, and no checks are done that the width asked for fits within a single processor.
Examples
This example is taken from the UEDGE benchmark test (in examples/uedge-benchmark):
[All]
bndry_all = neumann # Default for all variables, boundaries
[Ni]
bndry_target = neumann
bndry_core = relax(dirichlet(1.)) # 1e13 cm^-3 on core boundary
bndry_all = relax(dirichlet(0.1)) # 1e12 cm^-3 on other boundaries
[Vi]
bndry_ydown = relax(dirichlet(1.41648)) # 3.095e4/Vi_x
bndry_yup = relax(dirichlet(-1.41648))
The variable Ni (density) is set to a Neumann boundary condition on the targets (yup and ydown), relaxes towards \(1\) on the core boundary, and relaxes to \(0.1\) on all other boundaries. Note that the bndry_target = neumann needs to be in the [Ni] section: if we just had
[All]
bndry_all = neumann # Default for all variables, boundaries

[Ni]
bndry_core = relax(dirichlet(1.)) # 1e13 cm^-3 on core boundary
bndry_all = relax(dirichlet(0.1)) # 1e12 cm^-3 on other boundaries
then the "target" boundary condition for Ni would first search the [Ni] section for bndry_target, then for bndry_all in the [Ni] section. This is set to relax(dirichlet(0.1)), not the Neumann condition desired.
Boundary regions
The boundary condition code needs ways to loop over the boundary regions, without needing to know the details of the mesh.
At the moment two mechanisms are provided: A RangeIterator over upper and lower Y boundaries, and a vector of BoundaryRegion objects.
// Boundary region iteration
virtual const RangeIterator iterateBndryLowerY() const = 0;
virtual const RangeIterator iterateBndryUpperY() const = 0;
bool hasBndryLowerY();
bool hasBndryUpperY();
bool BoundaryOnCell; // NB: DOESN'T REALLY BELONG HERE
The RangeIterator class is an iterator which allows looping over a set of indices. For example, in src/solver/solver.cxx, to loop over the upper Y boundary of a 2D variable var:
for(RangeIterator xi = mesh->iterateBndryUpperY(); !xi.isDone(); xi++) {
...
}
The BoundaryRegion class is defined in include/boundary_region.hxx.
Boundary regions
Different regions of the boundary such as "core", "sol" etc. are labelled by the Mesh class (i.e. BoutMesh), which implements a member function defined in mesh.hxx:
// Boundary regions
virtual vector<BoundaryRegion*> getBoundaries() = 0;
This returns a vector of pointers to BoundaryRegion objects, each of which describes a boundary region with a label, a BndryLoc location (i.e. inner x, outer x, lower y, upper y or all), and iterator functions for looping over the points. This class is defined in boundary_region.hxx:
/// Describes a region of the boundary, and a means of iterating over it
class BoundaryRegion {
public:
  BoundaryRegion();
  BoundaryRegion(const string &name, int xd, int yd);
  virtual ~BoundaryRegion();

  string label;      // Label for this boundary region
  BndryLoc location; // Which side of the domain is it on?

  int x, y;   // Indices of the point in the boundary
  int bx, by; // Direction of the boundary: [x+bx][y+by] is going outwards

  virtual void first() = 0;
  virtual void next() = 0;  // Loop over every element from inside out (in X or Y first)
  virtual void nextX() = 0; // Just loop over X
  virtual void nextY() = 0; // Just loop over Y
  virtual bool isDone() = 0; // Returns true if outside domain. Can use this with nested nextX, nextY
};
Example: to loop over all points in BoundaryRegion *bndry, use
for(bndry->first(); !bndry->isDone(); bndry->next()) {
  ...
}
Inside the loop, bndry->x and bndry->y are the indices of the point, whilst bndry->bx and bndry->by are unit vectors out of the domain. The loop is over all the points from the domain outwards, i.e. the point [bndry->x - bndry->bx][bndry->y - bndry->by] will always be defined.
Sometimes it's useful to be able to loop over just one direction along the boundary. To do this, use nextX() or nextY() rather than next(). It is also possible to loop over both dimensions using:
for(bndry->first(); !bndry->isDone(); bndry->nextX())
  for(; !bndry->isDone(); bndry->nextY()) {
...
}
Boundary operations
On each boundary, conditions must be specified for each variable. The different conditions are imposed by BoundaryOp objects, which set the values in the boundary region such that they obey e.g. Dirichlet or Neumann conditions. The BoundaryOp class is defined in boundary_op.hxx:
/// An operation on a boundary
class BoundaryOp {
public:
  BoundaryOp() {bndry = NULL;}
  BoundaryOp(BoundaryRegion *region);

  // Note: All methods must implement clone, except for modifiers (see below)
  virtual BoundaryOp* clone(BoundaryRegion *region, const list<string> &args);

  /// Apply a boundary condition on field f
  virtual void apply(Field2D &f) = 0;
  virtual void apply(Field3D &f) = 0;
  virtual void apply(Vector2D &f);
  virtual void apply(Vector3D &f);

  /// Apply a boundary condition on ddt(f)
  virtual void apply_ddt(Field2D &f);
  virtual void apply_ddt(Field3D &f);
  virtual void apply_ddt(Vector2D &f);
  virtual void apply_ddt(Vector3D &f);

  BoundaryRegion *bndry;
};
(where the implementations have been removed for clarity). The class holds a pointer to a BoundaryRegion object specifying which region this boundary condition operates on.
Boundary conditions need to be imposed on the initial conditions (after PhysicsModel::init()), and on the time derivatives (after PhysicsModel::rhs()). The apply() functions are therefore called during initialisation and given the evolving variables, whilst the apply_ddt() functions are passed the time derivatives.
To implement a boundary operation, as a minimum apply(Field2D), apply(Field3D) and clone() need to be implemented: by default apply(Vector) calls the apply(Field) functions on each component individually, and the apply_ddt() functions just call the apply() functions.
Example: Neumann boundary conditions are defined in boundary_standard.hxx:
/// Neumann (zero-gradient) boundary condition
class BoundaryNeumann : public BoundaryOp {
public:
  BoundaryNeumann() {}
  BoundaryNeumann(BoundaryRegion *region) : BoundaryOp(region) { }
  BoundaryOp* clone(BoundaryRegion *region, const list<string> &args);

  void apply(Field2D &f);
  void apply(Field3D &f);
};
and implemented in boundary_standard.cxx:
void BoundaryNeumann::apply(Field2D &f) {
  // Loop over all elements and set equal to the next point in
  for(bndry->first(); !bndry->isDone(); bndry->next())
    f[bndry->x][bndry->y] = f[bndry->x - bndry->bx][bndry->y - bndry->by];
}

void BoundaryNeumann::apply(Field3D &f) {
  for(bndry->first(); !bndry->isDone(); bndry->next())
    for(int z=0;z<mesh->LocalNz;z++)
      f[bndry->x][bndry->y][z] = f[bndry->x - bndry->bx][bndry->y - bndry->by][z];
}
This is all that's needed in this case since there's no difference between applying Neumann conditions to a variable and to its time derivative, and Neumann conditions for vectors are just Neumann conditions on each vector component.
To create a boundary condition, we need to give it a boundary region to operate over:
BoundaryRegion *bndry = ...
BoundaryOp *op = new BoundaryNeumann(bndry);
The clone function is used by BoundaryFactory to create boundary operations, given a single object as a template. It can take additional arguments as a list of strings; see the explanation in Boundary factory.
Boundary modifiers
To create more complicated boundary conditions from simple ones (such as the Neumann condition above), boundary operations can be modified by wrapping them in a BoundaryModifier object, defined in boundary_op.hxx:
class BoundaryModifier : public BoundaryOp {
public:
  virtual BoundaryOp* clone(BoundaryOp *op, const list<string> &args) = 0;
protected:
  BoundaryOp *op;
};
Since BoundaryModifier inherits from BoundaryOp, a modified boundary operation is just another boundary operation and can be treated the same way (the Decorator pattern). Boundary modifiers can also be nested inside each other to create even more complicated boundary operations. Note that the clone function differs from the BoundaryOp one: instead of a BoundaryRegion to operate on, modifiers are passed a BoundaryOp to modify.
Currently the only modifier is BoundaryRelax, defined in boundary_standard.hxx:
/// Convert a boundary condition to a relaxing one
class BoundaryRelax : public BoundaryModifier {
public:
  BoundaryRelax(BoutReal rate) {r = fabs(rate);}
  BoundaryOp* clone(BoundaryOp *op, const list<string> &args);

  void apply(Field2D &f);
  void apply(Field3D &f);
  void apply_ddt(Field2D &f);
  void apply_ddt(Field3D &f);
private:
  BoundaryRelax() {} // Must be initialised with a rate
  BoutReal r;
};
Boundary factory
The boundary factory creates new boundary operations from input strings, for example turning "relax(dirichlet)" into a relaxing Dirichlet boundary operation on a given region. It is defined in boundary_factory.hxx as a singleton, so to get a pointer to the boundary factory use
BoundaryFactory *bfact = BoundaryFactory::getInstance();
and to delete this singleton, free memory, and clean up at the end, use:
BoundaryFactory::cleanup();
Because users should be able to add new boundary conditions during PhysicsModel::init(), boundary conditions are not hard-wired into BoundaryFactory. Instead, boundary conditions must be registered with the factory, passing an instance which can later be cloned. This is done in bout++.cxx for the standard boundary conditions:
BoundaryFactory* bndry = BoundaryFactory::getInstance();
bndry->add(new BoundaryDirichlet(), "dirichlet");
...
bndry->addMod(new BoundaryRelax(10.), "relax");
where the add function registers BoundaryOp objects, whereas addMod registers BoundaryModifier objects. Note: the objects passed to BoundaryFactory will be deleted when cleanup() is called.
When a boundary operation is added, it is given a name such as "dirichlet", and similarly for the modifiers ("relax" above). These labels and object pointers are stored internally in BoundaryFactory in maps defined in boundary_factory.hxx:
// Database of available boundary conditions and modifiers
map<string, BoundaryOp*> opmap;
map<string, BoundaryModifier*> modmap;
These are then used by BoundaryFactory::create():
/// Create a boundary operation object
BoundaryOp* create(const string &name, BoundaryRegion *region);
BoundaryOp* create(const char* name, BoundaryRegion *region);
to turn a string such as "relax(dirichlet)" and a BoundaryRegion pointer into a BoundaryOp object. These functions are implemented in boundary_factory.cxx, starting around line 42. The parsing is done recursively by matching the input string to one of:
modifier(<expression>, arg1, ...)
modifier(<expression>)
operation(arg1, ...)
operation
The <expression> part is then resolved into a BoundaryOp object by calling create(<expression>, region).
When an operator or modifier is found, it is created from the pointer stored in the opmap or modmap maps using the clone method, passing a list<string> reference containing any arguments. It's up to the operation implementation to ensure that the correct number of arguments is passed, and to parse them into floats or other types.
Example: the Dirichlet boundary condition can take an optional argument to change the value the boundary is set to. In boundary_standard.cxx:
BoundaryOp* BoundaryDirichlet::clone(BoundaryRegion *region, const list<string> &args) {
  if(!args.empty()) {
    // First argument should be a value
    stringstream ss;
    ss << args.front();
    BoutReal val;
    ss >> val;
    return new BoundaryDirichlet(region, val);
  }
  return new BoundaryDirichlet(region);
}
If no arguments are passed, i.e. the string was "dirichlet" or "dirichlet()", then the args list is empty and the default value (0.0) is used. If one or more arguments are given, then the first argument is parsed into a BoutReal and used to create a new BoundaryDirichlet object. Any further arguments are currently ignored; a warning should probably be printed.
To set boundary conditions on a field, FieldData methods are defined in field_data.hxx:
// Boundary conditions
void setBoundary(const string &name); ///< Set the boundary conditions
void setBoundary(const string ®ion, BoundaryOp *op); ///< Manually set
virtual void applyBoundary() {}
virtual void applyTDerivBoundary() {};
protected:
vector<BoundaryOp*> bndry_op; // Boundary conditions
The FieldData::setBoundary() method is implemented in field_data.cxx. It first gets a vector of pointers to BoundaryRegions from the mesh, then loops over these, calling BoundaryFactory::createFromOptions() for each one and adding the resulting boundary operator to the FieldData::bndry_op vector.
Testing
There are three types of test used in BOUT++, in order of complexity: unit tests, integrated tests, and "method of manufactured solutions" (MMS) tests. Unit tests are very short, quick tests that test a single "unit", usually a single function or method. Integrated tests are longer tests that range from tests that need a lot of set up and check multiple conditions, to full physics model tests. MMS tests check the numerical properties of operators, such as the error scaling of derivatives.
There is a test suite that runs through all of the unit tests, and selected integrated and MMS tests. The easiest way to run this is with:
$ make check
We expect that any new feature or function implemented in BOUT++ also has some corresponding tests, and strongly prefer unit tests.
Automated tests and code coverage
BOUT++ uses Travis CI to automatically run the test suite on every push to the GitHub repository, as well as on every submitted Pull Request. The Travis settings are in .travis.yml. Pull requests that fail the tests will not be merged.
We also gather information on how well the unit tests cover the library using CodeCov, the settings for which are stored in .codecov.yml.
Unit tests
The unit test suite aims to be a comprehensive set of tests that run very fast and ensure the basic functionality of BOUT++ is correct. At the time of writing, we have around 500 tests that run in less than a second. Because these tests run very quickly, they should be run on every commit (or even more often!). For more information on the unit tests, see tests/unit/README.md.
You can run the unit tests with:
$ make check-unit-tests
Integrated tests
This set of tests is designed to check that different components of the BOUT++ library work together. These tests are more expensive than the unit tests, but are expected to be run on at least every pull request, and the majority on every commit.
You can run the integrated tests with:
$ make check-integrated-tests
The test suite is in the tests/integrated directory, and is run using the test_suite python script. tests/integrated/test_suite_list contains a list of the subdirectories to run (e.g. test-io, test-laplace, interchange-instability). In each of those subdirectories the script runtest is executed, and the return value is used to determine whether the test passed or failed.
All tests should be short, otherwise it discourages people from running the tests before committing changes. A test should take a few minutes or less on a typical desktop, and ideally only a few seconds. If you have a large simulation which you want to stop anyone from breaking, find starting parameters which are as sensitive as possible, so that the simulation can be run quickly.
Custom test requirements
Some tests require particular libraries or environments, so should be skipped if these are not available. To do this, each runtest script can contain a line starting with #requires, followed by a Python expression which evaluates to True or False. For example, a test which doesn't work if both ARKODE and PETSc are used:
#requires not (arkode and petsc)
or if there were a test which required PETSc to be available, it could specify
#requires petsc
Currently the requirements which can be combined are travis, netcdf, pnetcdf, hdf5, pvode, cvode, ida, lapack, petsc, slepc, mumps, arkode, openmp and make. The make requirement is set to True when the tests are being compiled (but not run), and False when the scripts are run. It's used for tests which do not have a compilation stage.
Method of Manufactured Solutions
The Method of Manufactured Solutions (MMS) is a rigorous way to check that a numerical algorithm is implemented correctly. A known solution is specified (manufactured), and the code output can then be checked for convergence to this solution at the expected rate.
To enable testing by MMS, set the input option mms to true:
[solver]
mms = true
This will have the following effect:
 For each evolving variable, the solution will be used to initialise and to calculate the error
 For each evolving variable, a source function will be read from the input file and added to the time derivative.
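The idea behind the source function is standard MMS practice: if the model evolves

\[ \frac{\partial f}{\partial t} = L(f) \]

for some spatial operator \(L\), and a manufactured solution \(f_M(x,y,z,t)\) is chosen, then the source which must be added to the time derivative is

\[ S = \frac{\partial f_M}{\partial t} - L(f_M) \]

so that \(f_M\) exactly satisfies \(\partial f/\partial t = L(f) + S\), and the numerical error can be measured directly against \(f_M\).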
Note
The convergence behaviour of derivatives using FFTs is quite different to the finite difference methods: once the highest frequency in the manufactured solution is resolved, the accuracy will jump enormously, and after that, finer grids will not increase the accuracy. Whereas with finite difference methods, accuracy varies smoothly as the grid is refined.
Choosing manufactured solutions
Manufactured solutions must be continuous and have continuous derivatives. Common mistakes:
 Don't use terms multiplying coordinates together, e.g. x * z or y * z. These are not periodic in \(y\) and/or \(z\), so will give strange answers and usually no convergence. Instead use x * sin(z) or similar, which is periodic.
Timing
To time parts of the code, and calculate the percentage of time spent in communications, file I/O, etc., there is the Timer class defined in include/bout/sys/timer.hxx. To use it, just create a Timer object at the beginning of the function you want to time:
#include <bout/sys/timer.hxx>

void someFunction() {
  Timer timer("test");
  ...
}
Creating the object starts the timer; when the function returns, the object goes out of scope and its destructor stops the timer.
class Timer {
public:
  Timer();
  Timer(const std::string &label);
  ~Timer();

  double getTime();
  double resetTime();
};
The empty constructor is equivalent to setting label = "". Constructors call a private function getInfo(), which looks up the timer_info structure corresponding to the label in a map<string, timer_info*>. If no such structure exists, then one is created. This structure is defined as:
struct timer_info {
  double time;    ///< Total time
  bool running;   ///< Is the timer currently running?
  double started; ///< Start time
};
Since each timer can only have one entry in the map, creating two timers with the same label at the same time will lead to trouble. Hence this code is not thread-safe.
The member functions getTime() and resetTime() both return the current time. getTime() only returns the time without modifying the timer, whereas resetTime() also resets the timer to zero.
If you don't have the object, you can still get and reset the time using static methods:
double Timer::getTime(const std::string &label);
double Timer::resetTime(const std::string &label);
These look up the timer_info structure, and perform the same task as their non-static namesakes. These functions are used by the monitor function in bout++.cxx to print the percentage timing information.
BOUT++ options
The inputs to BOUT++ are a text file containing options, command-line options, and for complex grids a binary grid file in NetCDF or HDF5 format. Generating input grids for tokamaks is described in Generating input grids. The grid file describes the size and topology of the X-Y domain, metric tensor components and usually some initial profiles. The option file specifies the size of the domain in the symmetric direction (Z), and controls how the equations are evolved, e.g. differencing schemes to use, and boundary conditions. In most situations, the grid file will be used in many different simulations, but the options may be changed frequently.
All options used in a simulation are saved to a BOUT.settings file. This includes values which are not explicitly set in BOUT.inp.
BOUT.inp input file
The text input file BOUT.inp is always in a subdirectory called data for all examples. The files include comments (starting with either ; or #) and should be fairly self-explanatory. The format is the same as a Windows INI file, consisting of name = value pairs.
Any type which can be read from a stream using the >>
operator can
be stored in an option (see later for the implementation details).
Supported value types include:
 Integers
 Real values
 Booleans
 Strings
Options are also divided into sections, which start with the section name in square brackets.
[section1]
something = 132      # an integer
another = 5.131      # a real value
enabled = true       # a boolean
label = "some text"  # a string
Option names can contain almost any character except "=" and ":", including unicode. If they start with a number or ., or contain arithmetic symbols (+-*/^), brackets ((){}[]), whitespace or a comma, then they will need to be escaped in expressions. See below for how this is done.
Subsections can also be used, separated by colons â:â, e.g.
[section:subsection]
Numerical quantities can be plain numbers or expressions:
short_pi = 3.145
foo = 6 * 9
Variables can even reference other variables:
pressure = temperature * density
temperature = 12
density = 3
Note that variables can be used before their definition; all variables are first read, and then processed afterwards. The value pi is already defined, as is π, and can be used in expressions.
Uses for expressions include initialising variables (Expressions and input sources), defining grids (Generating input grids), and MMS convergence tests (Method of Manufactured Solutions).
Expressions can include addition (+), subtraction (-), multiplication (*), division (/) and exponentiation (^) operators, with the usual precedence rules. In addition to π, expressions can use predefined variables x, y, z and t to refer to the spatial and time coordinates.
A number of functions are defined, listed in Table 1. One slightly unusual feature is that if a number comes before a symbol or an opening bracket ( then a multiplication is assumed: 2x+3y^2 is the same as 2*x + 3*y^2, which with the usual precedence rules is the same as (2*x) + (3*(y^2)).
All expressions are calculated in floating point and then converted to an integer when read inside BOUT++. The conversion is done by rounding to the nearest integer, but throws an error if the floating point value is not within \(10^{-3}\) of an integer. This is to minimise unexpected behaviour. If you want to round a result to an integer, use the round function:
bad_integer = 256.4
ok_integer = round(256.4)
Note that it is still possible to read bad_integer as a real number, since the type is determined by how it is used. Have a look through the examples to see how the options are used.
Special symbols in Option names
If option names start with numbers or ., or contain symbols such as + and -, then these symbols need to be escaped in expressions, or they will be treated as arithmetic operators like addition or subtraction. To escape a single character, \ (backslash) can be used; for example plasma\-density * 10 would read the option plasma-density and multiply it by 10, e.g.
plasma-density = 1e19
2ndvalue = 10
value = plasma\-density * \2ndvalue
To escape multiple characters, ` (backquote) can be used:
plasma-density = 1e19
2ndvalue = 10
value = `plasma-density` * `2ndvalue`
The character : cannot be part of an option or section name, and cannot be escaped, as it is always used to separate sections.
Command line options
Command-line switches are:

Switch          Description
-h, --help      Prints a help message and quits
-v, --verbose   Outputs more messages to BOUT.log files
-q, --quiet     Outputs fewer messages to log files
-d <directory>  Look in <directory> for input/output files (default "data")
-f <file>       Use OPTIONS given in <file>
-o <file>       Save used OPTIONS given to <file> (default BOUT.settings)
In addition, all options in the BOUT.inp file can be set on the command line, and will override those set in BOUT.inp. The most commonly used are "restart" and "append", described in Running BOUT++. If values are not given for command-line arguments, then the value is set to true, so putting restart is equivalent to restart=true.
Values can be specified on the command line for other settings, such as the fraction of a torus to simulate (ZPERIOD):
./command zperiod=10
Remember: no spaces around the "=" sign. Like the BOUT.inp file, setting names are not case sensitive.
Sections are separated by colons ":", so to set the solver type (Options) you can either put this in BOUT.inp:
[solver]
type = rk4
or put solver:type=rk4 on the command line. This capability is used in many test suite cases to change the parameters for each run.
General options
At the top of the BOUT.inp file (before any section headers), options which affect the core code are listed. These are common to all physics models, and the most useful of them are:
NOUT = 100 # number of timepoints output
TIMESTEP = 1.0 # time between outputs
which set the number of outputs, and the time step between them. Note that this has nothing to do with the internal timestep used to advance the equations, which is adjusted automatically. What timestep to use depends on many factors, but for high-\(\beta\) reduced MHD ELM simulations reasonable choices are 1.0 for the first part of a run (to handle initial transients), then around 10.0 for the linear phase. Once nonlinear effects become important, you will have to reduce the timestep to around 0.1.
Most large clusters or supercomputers have a limit on how long a job can run for, called "wall time" because it's the time taken according to a clock on the wall, as opposed to the CPU time actually used. If this is the case, you can use the option
wall_limit = 10 # wall clock limit (in hours)
BOUT++ will then try to quit cleanly before this time runs out. Setting a negative value (the default is -1) means no limit.
Often it's useful to be able to restart a simulation from a chosen point, either to reproduce a previous run, or to modify the settings and rerun. A restart file is output every timestep, but this is overwritten each time, and so the simulation can only be continued from the end of the last simulation. Whilst it is possible to create a restart file from the output data afterwards, it's much easier if you have the restart files. Using the option
archive = 20
saves a copy of the restart files every 20 timesteps, which can then be used as a starting point.
Grids
You can set the size of the computational grid in the mesh section of the input file (see Generating input grids for more information):
[mesh]
nx = 16 # Number of points in X
ny = 16 # Number of points in Y
nz = 32 # Number of points in Z
It is recommended, but not necessary, that this be \(\texttt{nz} = 2^n\), i.e. \(1,2,4,8,\ldots\). This is because FFTs are usually slightly faster with power-of-two length arrays, and FFTs are used quite frequently in many models.
Note
In previous versions of BOUT++, nz was constrained to be a power of two, and had to be specified as a power of two plus one (i.e. a number of the form \(2^n + 1\) like \(2, 3, 5, 9,\ldots\)) in order to account for an additional, unused, point in Z. Both of these conditions were relaxed in BOUT++ 4.0. If you use an input file from a previous version, check that this superfluous point is not included in nz.
Since the Z dimension is periodic, the domain size is specified as multiples or fractions of \(2\pi\). To specify a fraction of \(2\pi\), use
ZPERIOD = 10
This specifies a Z range from \(0\) to \(2\pi / {\texttt{ZPERIOD}}\), and is useful for simulation of tokamaks to make sure that the domain is an integer fraction of a torus. If instead you want to specify the Z range directly (for example if Z is not an angle), there are the options
ZMIN = 0.0
ZMAX = 0.1
which specify the range in multiples of \(2\pi\).
In BOUT++, grids can be split between processors in both X and Y directions. By default BOUT++ automatically divides the grid in both X and Y, finding the decomposition with domains closest to square, whilst satisfying constraints. These constraints are:
 Every processor must have the same size and shape domain
 Branch cuts, mostly at X-points, must be on processor boundaries. This is because the connection between grid points is modified in BOUT++ by changing which processors communicate.
To specify a splitting manually, the number of processors in the X direction can be specified:
NXPE = 1 # Set number of X processors
Alternatively, the number in the Y direction can be specified (if both are given, NXPE takes precedence and NYPE is ignored):
NYPE = 1 # Set number of Y processors
If you need to specify complex input values, e.g. numerical values from experiment, you may want to use a grid file. The grid file to use is specified relative to the root directory where the simulation is run (i.e. running "ls ./data/BOUT.inp" gives the options file). You can use the global option grid, or mesh:file:
grid = "data/cbm18_8_y064_x260.nc"
# Alternatively:
[mesh]
file = "data/cbm18_8_y064_x260.nc"
Communications
The communication system has a section [comms]
, with a true/false
option async
. This determines whether asynchronous MPI sends are
used; which method is faster varies (though not by much) with machine
and problem.
Differencing methods
Differencing methods are specified in the sections [mesh:ddx], [mesh:ddy], [mesh:ddz] and [mesh:diff], one for each dimension. The [mesh:diff] section is only used if the section for the dimension does not contain an option for the differencing method.
Note that [mesh] is the name of the section passed to the mesh constructor, which is most often mesh, but could have another name, e.g. if multiple meshes are used.
 first, the method used for first derivatives
 second, the method for second derivatives
 fourth, the method for fourth derivatives
 upwind, the method for upwinding terms
 flux, the method for conservation law terms
The methods which can be specified include U1, U4, C2, C4, W2, W3 and FFT. Apart from FFT, the first letter gives the type of method (U = upwind, C = central, W = WENO), and the number gives the order.
The staggered derivatives can be specified as FirstStag; if that value is not set, then First is checked. Note that for staggered quantities, if the staggered quantity in a dimension is not set, the staggered quantity in the [mesh:diff] section is checked first. This is useful, as the staggered quantities are more restricted in the available choices than the non-staggered differencing operators.
Model-specific options
The options which affect a specific physics model vary, since they are defined in the physics module itself (see Input options). They should have a separate section; for example, the high-\(\beta\) reduced MHD code uses options in a section called [highbeta].
There are three places to look for these options: the BOUT.inp file, the physics model C++ code, and the output logs. The physics module author should ideally have an example input file, with commented options explaining what they do; alternatively they may have put comments in the C++ code for the module. Another way is to look at the output logs: when BOUT++ is run, (nearly) all options used are printed out with their default values. This won't provide much explanation of what they do, but may be useful anyway. See Post-processing for more details.
Input and Output
The format of the output (dump) files can be controlled, if support for more than one output format has been configured, by setting the top-level option dump_format to one of the recognised file extensions: "nc" for NetCDF; "hdf5", "hdf" or "h5" for HDF5. For example, to select HDF5 instead of the default NetCDF format, put
dump_format = hdf5
before any section headers. The output (dump) files with time-history are controlled by settings in a section called "output". Restart files contain a single time-slice, and are controlled by a section called "restart". The options available are listed in Table 2.
Option     Description                                      Default value
enabled    Writing is enabled                               true
floats     Write floats rather than doubles                 false
flush      Flush the file to disk after each write          true
guards     Output guard cells                               true
openclose  Reopen the file for each write, and close after  true
parallel   Use parallel I/O                                 false
enabled is useful mainly for doing performance or scaling tests, where you want to exclude I/O from the timings. floats can be used to reduce the size of the output files: files are stored as double by default, but setting floats = true changes the output to single-precision floats.
To enable parallel I/O for either output or restart files, set
parallel = true
in the output or restart section. If you have compiled BOUT++ with a parallel I/O library such as pnetcdf (see Advanced installation options), then rather than outputting one file per processor, all processors will output to the same file. For restart files this is particularly useful, as it means that you can restart a job with a different number of processors. Note that this feature is still experimental, and incomplete: output dump files are not yet supported by the collect routines.
Implementation
To control the behaviour of BOUT++ a set of options is used, with options organised into sections which can be nested. To represent this tree structure there is the Options class defined in bout++/include/options.hxx.
To access the options, there is a static function (singleton):
auto& options = Options::root();
which returns a reference (type Options&). Note that without the & the options tree will be copied, so any changes made will not be retained in the global tree. Options can be set by assigning, treating options as a map or dictionary:
options["nout"] = 10; // Integer
options["restart"] = true; // bool
Internally these values are stored in a variant type, which supports commonly used types including strings, integers, real numbers and fields (2D and 3D). Since strings can be stored, any type can be assigned, so long as it can be streamed to a string (using the << operator and a std::stringstream).
Often it's useful to see where an option setting has come from, e.g. the name of the options file or "command line". To specify a source, use the assign function to assign values:
options["nout"].assign(10, "manual");
A value cannot be assigned twice with different values from the same source ("manual" in this example). This is to catch a common error in which a setting is inconsistently specified in an input file. To force a value to change, overwriting the existing value (if any):
options["nout"].force(20, "manual");
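These assign/force semantics can be illustrated with a small Python sketch. This is a hypothetical mini-option class of ours, not BOUT++'s real Options implementation:

```python
class MiniOption:
    """A toy option holding a value and the source that set it.
    Illustrates the assign/force rules described above (hypothetical)."""

    def __init__(self):
        self.value = None
        self.source = None

    def assign(self, value, source):
        # Assigning a different value from the same source is an error,
        # catching settings specified inconsistently in an input file
        if self.source == source and self.value != value:
            raise ValueError(f"Option already set to {self.value} from '{source}'")
        self.value, self.source = value, source

    def force(self, value, source):
        # Overwrite unconditionally
        self.value, self.source = value, source

nout = MiniOption()
nout.assign(10, "manual")      # First assignment: OK
nout.assign(10, "manual")      # Same value, same source: OK
try:
    nout.assign(20, "manual")  # Different value, same source: error
except ValueError:
    pass
nout.force(20, "manual")       # force overwrites the existing value
```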
Subsections are created as they are accessed, so a value in a subsection could be set using:
auto& section = options["mysection"];
section["myswitch"] = true;
or just:
options["mysection"]["myswitch"] = true;
To get options, they can be assigned to a variable:
int nout = options["nout"];
If the option is not found then a BoutException will be thrown. A default value can be given, which will be used if the option has not been set:
int nout = options["nout"].withDefault(1);
If options is not const, then the given default value will be cached. If a default value has already been cached for this option, then the default values must be consistent: a BoutException is thrown if inconsistent default values are detected.
The default can also be set from another option. This may be useful if two or more options should usually be changed together:
BoutReal value2 = options["value2"].withDefault(options["value1"]);
Note that if the result should be a real number (e.g. BoutReal) then withDefault should be given a real. Otherwise it will convert the number to an integer:
BoutReal value = options["value"].withDefault(42); // Convert to integer
BoutReal value = options["value"].withDefault(42.0); // ok
auto value = options["value"].withDefault<BoutReal>(42); // ok
It is common for BOUT++ models to read in many settings whose variable name is the same as the option name (e.g. "nout" here). A convenient macro reads options into an already-defined variable:
int nout;
OPTION(options, nout, 1);
where the first argument is a section, second argument is the variable whose name will also be used as the option string, and third argument is the default value.
Every time an option is accessed, a message is written to output_info. This message includes the value used and the source of that value. By default this message is printed to the terminal and saved in the log files, but this can be disabled by changing the logging level: add -q to the command line to reduce the logging level. See section Logging output for more details about logging.
The type to be returned can also be specified as a template argument:
BoutReal nout = options["nout"].as<BoutReal>();
Any type can be used which can be streamed (operator >>) from a stringstream. There are special implementations for bool, int and BoutReal which enable the use of expressions in the input file. The type can also be specified to withDefault, or will be inferred from the argument:
BoutReal nout = options["nout"].withDefault<BoutReal>(1);
Documentation
Options can be given a doc attribute describing what they do. This documentation will then be written to the BOUT.settings file at the end of a run:
Te0 = options["Te0"].doc("Temperature in eV").withDefault(30.0);
The .doc() function returns a reference (Options&) so it can be chained with the withDefault or as functions, or used as part of an assignment:
options["value"].doc("Useful setting info") = 42;
This string is stored in the attributes of the option:
std::string docstring = options["value"].attributes["doc"];
Older interface
Some code in BOUT++ currently uses an older interface to Options which uses pointers rather than references. Both interfaces are currently supported, but use of the newer interface above is encouraged.
To access the options, there is a static function (singleton):
Options *options = Options::getRoot();
which gives the top-level (root) options object. Setting options is done using the set() methods, which are currently defined for int, BoutReal, bool and string. For example:
options->set("nout", 10);     // Set an integer
options->set("restart", true); // A bool
Often it's useful to see where an option setting has come from, e.g. the name of the options file or "command line". To specify a source, pass it as a third argument:
options->set("nout", 10, "manual");
To create a section, just use getSection: if it doesn't exist it will be created:
Options *section = options->getSection("mysection");
section->set("myswitch", true);
To get options, use the get() method, which takes the name of the option, the variable to set, and the default value:
int nout;
options->get("nout", nout, 1);
Internally, Options converts all types to strings and does type conversion when needed, so the following code works:
Options *options = Options::getRoot();
options->set("test", "123");
int val;
options->get("test", val, 1);
This is because often the type of the option is not known at the time when it's set, but only when it's requested.
Reading options
To allow different input file formats, each file parser implements the OptionParser interface defined in bout++/src/sys/options/optionparser.hxx:
class OptionParser {
public:
  virtual void read(Options *options, const string &filename) = 0;
private:
};
and so just needs to implement a single function which reads a given file name and inserts the options into the given Options object.
To use these parsers and read in a file, there is the OptionsReader class defined in bout++/include/optionsreader.hxx:
class OptionsReader {
public:
  void read(Options *options, const char *file, ...);
  void parseCommandLine(Options *options, int argc, char **argv);
};
This is a singleton object which is accessed using:
OptionsReader *reader = OptionsReader::getInstance();
so to read a file BOUT.inp in a directory given in a variable data_dir, the following code is used in bout++.cxx:
Options *options = Options::getRoot();
OptionsReader *reader = OptionsReader::getInstance();
reader->read(options, "%s/BOUT.inp", data_dir);
To parse command line arguments as options, the OptionsReader class has a method:
reader->parseCommandLine(options, argc, argv);
This is currently quite rudimentary and needs improving.
Reading and writing to NetCDF
If NetCDF-4 support is enabled, then the OptionsNetCDF class provides an experimental way to read and write options. To use this class:
#include "options_netcdf.hxx"
using bout::experimental::OptionsNetCDF;
Examples are in the integrated test tests/integrated/test-options-netcdf/.
To write the current Options tree (e.g. from BOUT.inp) to a NetCDF file:
OptionsNetCDF("settings.nc").write(Options::root());
and to read it in again:
Options data = OptionsNetCDF("settings.nc").read();
Fields can also be stored and written:
Options fields;
fields["f2d"] = Field2D(1.0);
fields["f3d"] = Field3D(2.0);
OptionsNetCDF("fields.nc").write(fields);
This should allow the input settings and evolving variables to be combined into a single tree (see above on joining trees) and written to the output dump or restart files.
Reading fields is a bit more difficult. Currently 1D data is read as an Array<BoutReal>, 2D as Matrix<BoutReal> and 3D as Tensor<BoutReal>. These can be extracted directly from the Options tree, or converted to a Field:
Options fields_in = OptionsNetCDF("fields.nc").read();
Field2D f2d = fields_in["f2d"].as<Field2D>();
Field3D f3d = fields_in["f3d"].as<Field3D>();
Note that by default, reading as Field2D or Field3D will use the global bout::globals::mesh. To use a different mesh, or a different cell location, pass a field which the result should be similar to:
Field3D example = ... // Some existing field
Field3D f3d = fields_in["f3d"].as<Field3D>(example);
Metadata, like the Mesh pointer, will be taken from example.
Currently, converting from Matrix or Tensor types only works if the data in the Matrix or Tensor is the same size as the Field. In the case of grid files, each field only needs part of the global values. Some kind of mapping from the global index to the local index is needed, probably defined by Mesh. For now it should be possible to be compatible with the current system, so that all quantities from the grid file are accessed through Mesh::get.
Time dependence
When writing NetCDF files, some variables should have a time dimension added, and then be appended to each time they are written. This has been implemented using an attribute: if variables in the Options tree have an attribute "time_dimension", then that is used as the name of the time dimension in the output file. This allows multiple time dimensions (e.g. high frequency diagnostics and low frequency outputs) to exist in the same file:
Options data;
data["scalar"] = 1.0;
data["scalar"].attributes["time_dimension"] = "t";
data["field"] = Field3D(2.0);
data["field"].attributes["time_dimension"] = "t";
OptionsNetCDF("time.nc").write(data);
// Update time-dependent values. This can be done without `force` if the time_dimension
// attribute is set
data["scalar"] = 2.0;
data["field"] = Field3D(3.0);
// Append data to file
OptionsNetCDF("time.nc", OptionsNetCDF::FileMode::append).write(data);
Some issues:
- Currently all variables in the Options tree are written when passed to OptionsNetCDF::write. This means that variables with different time dimensions should be stored in different Options trees, so they can be written at different times. One possibility is to have an optional argument to write, so that only variables with one specified time dimension are updated.
FFT
There is one global option for Fourier transforms, fft_measure (default: false). Setting this to true enables the FFTW_MEASURE mode when performing FFTs, otherwise FFTW_ESTIMATE is used:
[fft]
fft_measure = true
In FFTW_MEASURE mode, FFTW runs and measures how long several FFTs take, and tries to find the optimal method.
Note
Technically, FFTW_MEASURE is non-deterministic, and enabling fft_measure may result in slightly different answers from run to run, or be dependent on the number of MPI processes. This may be important if you are trying to benchmark or measure the performance of your code. See the FFTW FAQ for more information.
Generating input grids
The simulation mesh describes the number and topology of grid points, the spacing between them, and the coordinate system. For many problems, a simple mesh can be created using options.
[mesh]
nx = 260 # X grid size
ny = 256 # Y grid size
dx = 0.1 # X mesh spacing
dy = 0.1 # Y mesh spacing
The above options will create a \(260\times 256\) mesh in X and Y (the MZ option sets the Z resolution), with a mesh spacing of \(0.1\) in both directions. By default the coordinate system is Cartesian (the metric tensor is the identity matrix), but this can be changed by specifying the metric tensor components.
Integer quantities such as nx can be numbers (like "260"), or expressions (like "256 + 2*MXG").
A common use is to make the x and z dimensions have the same number of points, when x has mxg boundary cells on each boundary but z does not (since it is usually periodic):
[mesh]
nx = nz + 2*mxg # X grid size
nz = 256 # Z grid size
mxg = 2
Note that the variable nz can be used before its definition; all variables are first read, and then processed afterwards.
All expressions are calculated in floating point and then converted to an integer. The conversion is done by rounding to the nearest integer, but throws an error if the floating point value is not within 1e-3 of an integer. This is to minimise unexpected behaviour. If you want to round a result to an integer, use the round function:
[mesh]
nx = 256.4 # Error!
nx = round(256.4) # ok
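The conversion rule can be sketched in Python. The function name and error message are ours; only the round-with-tolerance behaviour mirrors the description above:

```python
def expr_to_integer(value, tol=1e-3):
    """Round a floating-point expression result to the nearest integer,
    raising an error if it is not within tol of an integer."""
    nearest = round(value)
    if abs(value - nearest) > tol:
        raise ValueError(f"{value} is not within {tol} of an integer")
    return int(nearest)

print(expr_to_integer(256.0))     # accepted
print(expr_to_integer(256.0004))  # accepted: within 1e-3 of 256
try:
    expr_to_integer(256.4)        # rejected, like nx = 256.4 above
except ValueError as e:
    print("Error:", e)
```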
Real (floating-point) values can also be expressions, allowing quite complicated analytic inputs. For example, in the example test-griddata:
# Screw pinch
rwidth = 0.4
Rxy = 0.1 + rwidth*x # Radius from axis [m]
L = 10 # Length of the device [m]
dy = L/ny
hthe = 1.0
Zxy = L * y / (2*pi)
Bpxy = 1.0 # Axial field [T]
Btxy = 0.1*Rxy # Azimuthal field [T]
Bxy = sqrt(Btxy^2 + Bpxy^2)
dr = rwidth / nx
dx = dr * Bpxy * Rxy
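As a cross-check, the screw-pinch expressions above can be evaluated directly in Python. The grid sizes here are arbitrary example values of ours, and the helper function is for illustration only:

```python
import math

nx, ny = 68, 64            # Example grid sizes (arbitrary choice)
rwidth = 0.4
L = 10.0                   # Length of the device [m]
dy = L / ny

def screw_pinch(x):
    """Evaluate the expressions above at a normalised radius x in [0, 1]."""
    Rxy = 0.1 + rwidth * x              # Radius from axis [m]
    Bpxy = 1.0                          # Axial field [T]
    Btxy = 0.1 * Rxy                    # Azimuthal field [T]
    Bxy = math.sqrt(Btxy**2 + Bpxy**2)  # Total field magnitude [T]
    dx = (rwidth / nx) * Bpxy * Rxy
    return Rxy, Bxy, dx

Rxy0, Bxy0, dx0 = screw_pinch(0.0)  # Innermost point
# Near the axis the azimuthal field is small, so |B| is close to Bpxy = 1
```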
These expressions use the same mechanism as used for variable initialisation (Expressions): x is a variable from \(0\) to \(1\) in the domain which is uniform in index space; y and z go from \(0\) to \(2\pi\). As with variable initialisation, common trigonometric and mathematical functions can be used. In the above example, some variables depend on each other, for example dy depends on L and ny. The order in which these variables are defined doesn't matter, so L could be defined below dy, but circular dependencies are not allowed. If the variables are defined in the same section (as dy and L are) then no section prefix is required. To refer to a variable in a different section, prefix the variable with the section name, e.g. "section:variable".
More complex meshes can be created by supplying an input grid file to describe the grid points, geometry, and starting profiles. Currently BOUT++ supports either NetCDF or HDF5 format binary files. During startup, BOUT++ looks in the grid file for the following variables. If any are not found, a warning will be printed and the default values used.
- X and Y grid sizes (integers): nx and ny [REQUIRED]
- Differencing quantities in 2D arrays: dx[nx][ny] and dy[nx][ny]. If these are not found they will be set to 1.
- Diagonal terms of the metric tensor \(g^{ij}\): g11[nx][ny], g22[nx][ny], and g33[nx][ny]. If not found, these will be set to 1.
- Off-diagonal metric tensor \(g^{ij}\) elements: g12[nx][ny], g13[nx][ny], and g23[nx][ny]. If not found, these will be set to 0.
- Z shift for interpolation between field-aligned coordinates and shifted coordinates (see manual/coordinates.pdf). Perpendicular differential operators are calculated in shifted coordinates when ShiftXderivs in mesh/mesh.hxx is enabled. ShiftXderivs can be set in the root section of BOUT.inp as ShiftXderivs = true. The shifts must be provided in the grid file in a field zshift[nx][ny]. If not found, zshift is set to zero.
The remaining quantities determine the topology of the grid. These are based on tokamak single/double-null configurations, but can be adapted to many other situations.
- Separatrix locations: ixseps1 and ixseps2. If neither is given, both are set to nx (i.e. all points in the closed "core" region). If only ixseps1 is found, ixseps2 is set to nx, and if only ixseps2 is found, ixseps1 is set to -1.
- Branch-cut locations: jyseps1_1, jyseps1_2, jyseps2_1, and jyseps2_2.
- Twist-shift matching condition: ShiftAngle[nx], for field-aligned coordinates. This is applied in the "core" region between indices jyseps2_2 and jyseps1_1 + 1, if either TwistShift = True is enabled in the options file or, in general, the TwistShift flag in mesh/impls/bout/boutmesh.hxx is enabled by other means. BOUT++ automatically reads the twist shifts from the grid file if the shifts are stored in a field ShiftAngle[nx]; ShiftAngle must be given in the grid file or grid options if TwistShift = True.
The only quantities which are required are the sizes of the grid. If these are the only quantities specified, then the coordinates revert to Cartesian.
This section describes how to generate inputs for tokamak equilibria. If you're not interested in tokamaks then you can skip to the next section.
The directory tokamak_grids contains code to generate input grid files for tokamaks. These can be used by the 2fluid and highbeta_reduced modules, and are (mostly) compatible with inputs to the BOUT-06 code.
BOUT++ Topology
Basic
In order to handle tokamak geometry, BOUT++ contains an internal topology which is determined by the branch-cut locations (jyseps1_1, jyseps1_2, jyseps2_1, and jyseps2_2) and the separatrix locations (ixseps1 and ixseps2).
The separatrix locations, ixseps1 and ixseps2, give the indices in the x domain where the first and second separatrices are located. If ixseps1 == ixseps2 then there is a single separatrix representing the boundary between the core region and the SOL region, and the grid is a connected double null configuration. If ixseps1 > ixseps2 then there are two separatrices and the inner separatrix is ixseps2, so the tokamak is an upper double null. If ixseps1 < ixseps2 then there are two separatrices and the inner separatrix is ixseps1, so the tokamak is a lower double null.
In other words: let us for illustrative purposes say that ixseps1 > ixseps2 (see Fig. 4), and that we have a field f(x,y,z) with a global x index which includes ghost points. f(x<=ixseps2,y,z) will then be periodic in the y direction; f(ixseps2<x<=ixseps1,y,z) will have its boundary conditions in the y direction set by the lowermost ydown and yup; and for f(ixseps1<x,y,z) the boundary conditions in the y direction will be set by the uppermost ydown and yup. As for now, there is no difference between the two sets of upper and lower ydown and yup boundary conditions (unless manually specified, see Custom boundary conditions).
These values are set either in the grid file or in BOUT.inp. Fig. 4 shows schematically how ixseps is used.
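The classification rules above can be summarised in a short sketch (the helper function is ours, for illustration only, not BOUT++ code):

```python
def classify_topology(ixseps1, ixseps2):
    """Classify a grid from its separatrix indices, following the
    rules described above (illustrative helper)."""
    if ixseps1 == ixseps2:
        # Single separatrix between core and SOL
        return "connected double null"
    if ixseps1 > ixseps2:
        # Inner separatrix is ixseps2
        return "upper double null"
    # Inner separatrix is ixseps1
    return "lower double null"

print(classify_topology(32, 32))  # connected double null
print(classify_topology(34, 30))  # upper double null
print(classify_topology(30, 34))  # lower double null
```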
The branch cut locations, jyseps1_1, jyseps1_2, jyseps2_1, and jyseps2_2, split the y domain into logical regions defining the SOL, the PFR (private flux region) and the core of the tokamak. This is also illustrated in Fig. 4. If jyseps1_2 == jyseps2_1 then the grid is a single null configuration, otherwise the grid is a double null configuration.
Advanced
The internal domain in BOUT++ is deconstructed into a series of logically rectangular subdomains with boundaries determined by the ixseps and jyseps parameters. The boundaries coincide with processor boundaries, so the number of grid points within each subdomain must be an integer multiple of ny/nypes, where ny is the number of grid points in y and nypes is the number of processors used to split the y domain. Processor communication across the domain boundaries is then handled internally. Fig. 5 shows schematically how the different regions of a double null tokamak with ixseps1 = ixseps2 are connected together via communications.
Note
To ensure that each subdomain follows logically, the jyseps indices must adhere to the following conditions:
jyseps1_1 >= -1
jyseps2_1 >= jyseps1_1 + 1
jyseps1_2 >= jyseps2_1
jyseps2_2 >= jyseps1_2
jyseps2_2 <= ny - 1
To ensure that communications work, branch cuts must align with processor boundaries.
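These ordering conditions can be checked with a few lines of Python (a sketch of ours; here jyseps1_1 = -1 is taken to mean the lower inner divertor leg is absent, as in BOUT++ grid files):

```python
def jyseps_consistent(jyseps1_1, jyseps2_1, jyseps1_2, jyseps2_2, ny):
    """Return True if the jyseps indices satisfy the ordering
    conditions in the note above (illustrative sketch)."""
    return (jyseps1_1 >= -1
            and jyseps2_1 >= jyseps1_1 + 1
            and jyseps1_2 >= jyseps2_1
            and jyseps2_2 >= jyseps1_2
            and jyseps2_2 <= ny - 1)

# A typical single-null set of indices (jyseps1_2 == jyseps2_1)
print(jyseps_consistent(15, 32, 32, 48, 64))  # True
print(jyseps_consistent(15, 10, 32, 48, 64))  # False: jyseps2_1 too small
```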
Implementations
In BOUT++ each processor has a logically rectangular domain, so any branch cuts needed for X-point geometry (see Fig. 5) must be at processor boundaries.
In the standard "bout" mesh (src/mesh/impls/bout/), the communication is controlled by the variables:
int UDATA_INDEST, UDATA_OUTDEST, UDATA_XSPLIT;
int DDATA_INDEST, DDATA_OUTDEST, DDATA_XSPLIT;
int IDATA_DEST, ODATA_DEST;
These control the behavior of the communications as shown in Fig. 6.
In the Y direction, each boundary region (Up and Down in Y) can be split into two, with 0 <= x < UDATA_XSPLIT going to the processor index UDATA_INDEST, and UDATA_XSPLIT <= x < LocalNx going to UDATA_OUTDEST. Similarly for the Down boundary. Since there are no branch-cuts in the X direction, there is just one destination for the Inner and Outer boundaries. In all cases a negative processor number means that there's a domain boundary, so no communication is needed.
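The split rule for the Up boundary can be written out explicitly (a hypothetical helper of ours; the real logic lives in boutmesh.cxx):

```python
def up_destination(x, xsplit, indest, outdest):
    """Destination processor for an Up-boundary point at local x index.
    A negative result means a physical boundary: no communication."""
    return indest if x < xsplit else outdest

# With UDATA_XSPLIT = 4, points 0..3 go inward to processor 4, the rest to 5
dests = [up_destination(x, 4, 4, 5) for x in range(8)]
# With a negative inner destination there is a domain boundary instead
boundary = [up_destination(x, 4, -1, 5) for x in range(8)]
```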
The communication control variables are set in the topology() function in src/mesh/impls/bout/boutmesh.cxx, starting around line 2056. First the function default_connections() sets the topology to be a rectangle.
To change the topology, the function set_connection checks that the requested branch cut is on a processor boundary, and changes the communications consistently so that communications are two-way and there are no "dangling" communications.
3D variables
BOUT++ was originally designed for tokamak simulations where the input equilibrium varies only in X-Y, and Z is used as the axisymmetric toroidal angle direction. In those cases it is often convenient to have input grids which are only 2D, and to allow the Z dimension to be specified independently, such as in the options file. The problem then is how to store 3D variables in the grid file.
Two representations are now supported for 3D variables:
1. A Fourier representation. If the size of the toroidal domain is not specified in the grid file (nz is not defined), then 3D fields are stored as Fourier components. In the Z dimension the coefficients must be stored as
\[[n = 0, n = 1 (\textrm{real}), n = 1 (\textrm{imag}), n = 2 (\textrm{real}), n = 2 (\textrm{imag}), \ldots ]\]
where \(n\) is the toroidal mode number. The size of the array must therefore be odd in the Z dimension, to contain a constant (\(n=0\)) component followed by real/imaginary pairs for the non-axisymmetric components. If you are using IDL to create a grid file, there is a routine in tools/idllib/bout3dvar.pro for converting between BOUT++'s real and Fourier representations.
2. Real space, as values on grid points. If nz is set in the grid file, then 3D variables in the grid file must have size nx \(\times\) ny \(\times\) nz. These are then read directly into Field3D variables as required.
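The Fourier layout can be made concrete with a small Python sketch. The discrete transform and its normalisation here are our illustrative choices; only the ordering of the coefficients (constant component, then real/imaginary pairs) follows the description above:

```python
import math

def pack_bout_fourier(values, nmodes):
    """Pack one row of periodic Z values into the layout
    [n=0, n=1 (real), n=1 (imag), n=2 (real), n=2 (imag), ...].
    A small DFT written out directly so the layout is explicit."""
    N = len(values)
    out = [sum(values) / N]  # n = 0 (constant) component
    for n in range(1, nmodes + 1):
        re = sum(v * math.cos(2*math.pi*n*k/N) for k, v in enumerate(values)) / N
        im = -sum(v * math.sin(2*math.pi*n*k/N) for k, v in enumerate(values)) / N
        out += [re, im]
    return out  # odd length: 1 + 2*nmodes

# A pure n=1 cosine: only the n=1 real coefficient is non-zero
N = 16
signal = [math.cos(2*math.pi*k/N) for k in range(N)]
coeffs = pack_bout_fourier(signal, nmodes=2)
# coeffs ≈ [0, 0.5, 0, 0, 0] with this normalisation
```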
From EFIT files
An IDL code called "Hypnotoad" has been developed to create BOUT++ input files from R-Z equilibria. This can read EFIT "g" files, find flux surfaces, and calculate metric coefficients. The code is in tools/tokamak_grids/gridgen, and has its own manual under the doc subdirectory.
From ELITE and GATO files
Currently, conversions exist for ELITE .eqin and GATO dskgato equilibrium files. Conversion of these into BOUT++ input grids is done in two stages: first, both these input files are converted into a common NetCDF format which describes the Grad-Shafranov equilibrium. These intermediate files are then converted to BOUT++ grids using an interactive IDL script.
Generating equilibria
The directory tokamak_grids/shifted_circle contains IDL code to generate shifted circle (large aspect ratio) Grad-Shafranov equilibria.
Zoidberg grid generator
The Zoidberg grid generator creates inputs for the Flux Coordinate Independent (FCI) parallel transform (section Parallel Transforms). The domain is divided into a set of 2D grids in the XZ coordinates, and the magnetic field is followed along the Y coordinate from each 2D grid to where it either intersects the forward and backward grid, or hits a boundary.
The simplest code which creates an output file is:
import zoidberg
# Define the magnetic field
field = zoidberg.field.Slab()
# Define the grid points
grid = zoidberg.grid.rectangular_grid(10,10,10)
# Follow magnetic fields from each point
maps = zoidberg.make_maps(grid, field)
# Write everything to file
zoidberg.write_maps(grid, field, maps, gridfile="grid.fci.nc")
As in the above code, creating an output file consists of the following steps:
- Define a magnetic field
- Define the grid points. This can be broken down into:
  - Define 2D "poloidal" grids
  - Form a 3D grid by putting 2D grids together along the Y direction
- Create maps from each 2D grid to its neighbours
- Save grids, fields and maps to file
Each of these stages can be customised to handle more complicated magnetic fields, more complicated grids, and particular output formats. Details of the functionality available are described in the sections below, and there are several examples in the examples/zoidberg directory.
Rectangular grids
An important input to Zoidberg is the size of the domain in Y, and whether the domain is periodic in Y. By default rectangular_grid makes a non-periodic rectangular box which is of length 10 in the Y direction. This means that there are boundaries at \(y=0\) and at \(y=10\). rectangular_grid puts the y slices at equally spaced intervals, and puts the first and last points half an interval away from the boundaries in y. In this case, with 10 points in y (the second argument to rectangular_grid(nx, ny, nz)), the y locations are \(\left(0.5, 1.5, 2.5, \ldots, 9.5\right)\). At each of these y locations rectangular_grid defines a rectangular 2D poloidal grid in the XZ coordinates, by default with a length of 1 in each direction and centred on \(x=0, z=0\).
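The slice placement can be reproduced in a couple of lines (a sketch of the behaviour described above, not zoidberg's code):

```python
ylength = 10.0
ny = 10
dy = ylength / ny
# First and last slices sit half an interval away from the y boundaries
ycoords = [(i + 0.5) * dy for i in range(ny)]
# ycoords == [0.5, 1.5, 2.5, ..., 9.5]
```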
These 2D poloidal grids are then put together into a 3D Grid. This process can be customised by separating step 2 (the rectangular_grid call) into stages 2a and 2b. For example, to create a periodic rectangular grid we could use the following:
import numpy as np
import zoidberg

# Create a 10x10 grid in XZ with sides of length 1
poloidal_grid = zoidberg.poloidal_grid.RectangularPoloidalGrid(10, 10, 1.0, 1.0)
# Define the length of the domain in y
ylength = 10.0
# Define the y locations
ycoords = np.linspace(0.0, ylength, 10, endpoint=False)
# Create the 3D grid by putting together 2D poloidal grids
grid = zoidberg.grid.Grid(poloidal_grid, ycoords, ylength, yperiodic=True)
In the above code the length of the domain in the y direction needs to be given to Grid so that it knows where to put the boundaries (if not periodic), or where to wrap the domain (if periodic). The array of y locations ycoords can be arbitrary, but note that finite difference methods (like FCI) work best if the grid point spacing varies smoothly.
A more realistic example is creating a grid for a MAST tokamak equilibrium from a GEqdsk input file (this is in examples/zoidberg/tokamak.py):
import numpy as np
import zoidberg

field = zoidberg.field.GEQDSK("g014220.00200") # Read magnetic field
grid = zoidberg.grid.rectangular_grid(100, 10, 100,
                                      1.5 - 0.1,  # Range in R (max - min)
                                      2*np.pi,    # Toroidal angle
                                      3.,         # Range in Z
                                      xcentre=(1.5 + 0.1)/2, # Middle of grid in R
                                      yperiodic=True) # Periodic in toroidal angle
# Create the forward and backward maps
maps = zoidberg.make_maps(grid, field)
# Save to file
zoidberg.write_maps(grid, field, maps, gridfile="grid.fci.nc")
# Plot grid points and the points they map to in the forward direction
zoidberg.plot.plot_forward_map(grid, maps)
In the last example only one poloidal grid was created (a RectangularPoloidalGrid) and then reused for each y slice. We can instead define a different grid for each y position. For example, to define a grid which expands along y we could do:
import numpy as np
import zoidberg
from zoidberg.poloidal_grid import RectangularPoloidalGrid

ylength = 10.0
ycoords = np.linspace(0.0, ylength, 10, endpoint=False)
# Create a list of poloidal grids, one for each y location
poloidal_grids = [ RectangularPoloidalGrid(10, 10, 1.0 + y/10., 1.0 + y/10.)
                   for y in ycoords ]
# Create the 3D grid by putting together 2D poloidal grids
grid = zoidberg.grid.Grid(poloidal_grids, ycoords, ylength, yperiodic=True)
Note: currently there is an assumption that the number of X and Z points is the same on every poloidal grid. The shape of the grid can however be completely different. The construction of a 3D Grid is the same in all cases, so for now we will concentrate on producing different poloidal grids.
More general grids
The FCI technique is not restricted to rectangular grids; in particular, Zoidberg can handle structured grids in an annulus with quite complicated shapes. The StructuredPoloidalGrid class handles quite general geometries, but still assumes that the grid is structured and logically rectangular. Currently it also assumes that the z index is periodic.
One way to create this grid is to define the grid points manually, e.g.:
import numpy as np
import zoidberg

# First argument is minor radius, second is angle
r, theta = np.meshgrid(np.linspace(1, 2, 10),
                       np.linspace(0, 2*np.pi, 10),
                       indexing="ij")
R = r * np.sin(theta)
Z = r * np.cos(theta)
poloidal_grid = zoidberg.poloidal_grid.StructuredPoloidalGrid(R, Z)
For more complicated shapes than circles, Zoidberg comes with an elliptic grid generator which needs to be given only the inner and outer boundaries:
import zoidberg

inner = zoidberg.rzline.shaped_line(R0=3.0, a=0.5,
                                    elong=1.0, triang=0.0, indent=1.0,
                                    n=50)
outer = zoidberg.rzline.shaped_line(R0=2.8, a=1.5,
                                    elong=1.0, triang=0.0, indent=0.2,
                                    n=50)
poloidal_grid = zoidberg.poloidal_grid.grid_elliptic(inner, outer,
                                                     100, 100, show=True)
which should produce the figure below:
Grids aligned to flux surfaces
The elliptic grid generator can be used to generate grids whose inner and/or outer boundaries align with magnetic flux surfaces. All it needs is two RZline objects as generated by zoidberg.rzline.shaped_line, one for the inner boundary and one for the outer boundary. RZline objects represent periodic lines in RZ (XZ coordinates), with interpolation using splines.
To create an RZline object for a flux surface, we first need to find where the flux surface is. To do this we can use a Poincare plot: start at a point and follow the magnetic field a number of times around the periodic y direction (e.g. toroidal angle). Every time the field line reaches a y location of interest, mark the position, building up a scattered set of points which all lie on the same flux surface.
At the moment this will not work correctly for slab geometries, but expects closed flux surfaces such as in a stellarator or tokamak. A simple test case is a straight stellarator:
import zoidberg
field = zoidberg.field.StraightStellarator(I_coil=0.4, yperiod=10)
By default StraightStellarator calculates the magnetic field due to four coils which spiral around the axis at a distance \(r=0.8\) in a classical stellarator configuration. The yperiod argument is the period in y after which the coils return to their starting locations.
To visualise the Poincare plot for this stellarator field, pass the MagneticField object to zoidberg.plot.plot_poincare, together with start location(s) and periodicity information:
zoidberg.plot.plot_poincare(field, 0.4, 0.0, 10.0)
which should produce the following figure:
The inputs here are the starting location \(\left(x,z\right) = \left(0.4, 0.0\right)\), and the periodicity in the y direction (10.0). By default this will integrate from the given starting location 40 times (the revs option) around the y domain (0 to 10).
To create an RZline from these Poincare plots, we need a list of points in order around the line. Since the points on a flux surface in a Poincare plot will not generally be in order, we need to find the best fit, i.e. the shortest path which passes through all the points without crossing itself. In general this is a known hard problem, but fortunately in this case the nearest neighbour algorithm seems to be quite robust provided there are enough points.
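The nearest-neighbour idea can be sketched in a few lines. This toy version is ours; zoidberg's line_from_points is more robust:

```python
import math

def nearest_neighbour_order(points):
    """Order scattered (R, Z) points into a loop by repeatedly stepping
    to the closest unvisited point. Illustrative sketch only."""
    path = [points[0]]
    remaining = list(points[1:])
    while remaining:
        nearest = min(remaining, key=lambda p: math.dist(p, path[-1]))
        remaining.remove(nearest)
        path.append(nearest)
    return path

# Points on a circle, given in a scrambled order, are recovered as a loop
n = 12
circle = [(math.cos(2*math.pi*i/n), math.sin(2*math.pi*i/n)) for i in range(n)]
scrambled = circle[::5] + circle[1::5] + circle[2::5] + circle[3::5] + circle[4::5]
ordered = nearest_neighbour_order(scrambled)
# Every step in the ordered path is between adjacent circle points
```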
An example of calculating a Poincare plot on a single y slice (y=0) and producing an RZline is:
import zoidberg
from zoidberg.fieldtracer import trace_poincare

rzcoord, ycoords = trace_poincare(field, 0.4, 0.0, 10.0,
                                  y_slices=[0])
R = rzcoord[:, 0, 0]
Z = rzcoord[:, 0, 1]
line = zoidberg.rzline.line_from_points(R, Z)
line.plot()
Note: Currently there is no checking that the line created is a good solution. The line could cross itself, but this has to be diagnosed manually at the moment. If the line is not a good approximation to the flux surface, increase the number of points by setting the revs keyword (y revolutions) in the trace_poincare call.
In general the points along this line are not evenly distributed, but tend to cluster together in some regions and have large gaps in others. The elliptic grid generator places grid points on the boundaries which are uniform in the index of the RZline it is given. Passing a very uneven set of points will therefore result in a poor quality mesh. To avoid this, define a new RZline by placing points at equal distances along the line:
line = line.equallySpaced()
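The idea behind equallySpaced can be sketched with NumPy: compute the cumulative arc length along the line, then interpolate back to equally spaced arc lengths. This is only an illustrative open-line sketch (the real RZline is periodic), not the zoidberg implementation:

```python
import numpy as np

def equally_spaced(R, Z, n):
    """Resample an ordered open line (R, Z) to n points at equal
    distances along the line, using linear interpolation."""
    d = np.hypot(np.diff(R), np.diff(Z))       # segment lengths
    s = np.concatenate([[0.0], np.cumsum(d)])  # arc length at each point
    s_new = np.linspace(0.0, s[-1], n)         # equally spaced arc lengths
    return np.interp(s_new, s, R), np.interp(s_new, s, Z)

# Unevenly spaced points along a straight line in R
R = np.array([0.0, 0.1, 0.15, 1.0])
Z = np.zeros(4)
R2, Z2 = equally_spaced(R, Z, 5)
print(np.allclose(R2, [0.0, 0.25, 0.5, 0.75, 1.0]))  # -> True
```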
The example zoidberg/straight-stellarator-curvilinear.py puts the above methods together to create a grid file for a straight stellarator.
Sections below now describe each part of Zoidberg in more detail. Further documentation of the API can be found in the docstrings and unit tests.
Magnetic fields
The magnetic field is represented by a MagneticField class, in zoidberg.field.
Magnetic fields can be defined in either cylindrical or Cartesian coordinates:
 In Cartesian coordinates all (x,y,z) directions have the same units of length
 In cylindrical coordinates the y coordinate is assumed to be an angle, so that the distance in y is given by \(ds = R dy\) where \(R\) is the major radius.
Which coordinate system is used is controlled by the Rfunc method, which should return the major radius if using a cylindrical coordinate system, or None for a Cartesian coordinate system (the default).
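The consequence for distances along y can be illustrated with a tiny hypothetical helper (not part of zoidberg):

```python
# In cylindrical coordinates the y coordinate is an angle, so a step dy
# corresponds to a physical distance ds = R * dy; in Cartesian, ds = dy.
def y_arc_length(dy, R=None):
    """Distance along y for a step dy; R=None means Cartesian."""
    return dy if R is None else R * dy

print(y_arc_length(0.1))         # Cartesian
print(y_arc_length(0.1, R=2.0))  # cylindrical, major radius 2
```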
Several implementations inherit from MagneticField, and provide Bxfunc, Byfunc and Bzfunc, which give the components of the magnetic field in the x, y and z directions respectively. These should be in the same units (e.g. Tesla) for both Cartesian and cylindrical coordinates, but the way they are integrated changes depending on the coordinate system.
Using these functions, the MagneticField class provides Bmag and field_direction methods, which are called by the field line tracer routines (in zoidberg.field_tracer).
Slabs and curved slabs
The simplest magnetic field is a straight slab geometry:
import zoidberg
field = zoidberg.field.Slab()
By default this has a magnetic field \(\mathbf{B} = \left(0, 1, 0.1 + x\right)\).
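Since Bx = 0 here, field lines in this slab can be traced analytically, which makes a handy sanity check (plain Python, not a zoidberg routine):

```python
# In the default slab field B = (0, 1, 0.1 + x), field lines satisfy
# dx/dy = Bx/By = 0 and dz/dy = Bz/By = 0.1 + x.
def trace_slab(x0, z0, y):
    x = x0                     # Bx = 0: x stays constant
    z = z0 + (0.1 + x0) * y    # dz/dy is constant along the line
    return x, z

x, z = trace_slab(0.5, 0.0, 10.0)
print(x, z)  # -> 0.5 6.0
```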
A variant is a curved slab, which is defined in cylindrical coordinates and has a given major radius (default 1):
import zoidberg
field = zoidberg.field.CurvedSlab()
Note that this uses a large aspectratio approximation, so the major radius is constant across the domain (independent of x).
Straight stellarator
This is generated by four coils with alternating currents arranged on the edge of a circle, which spiral around the axis:
import zoidberg
field = zoidberg.field.StraightStellarator()
Note
This requires Sympy to generate the magnetic field, so if it is unavailable an exception will be raised.
GEqdsk files
This format is commonly used for axisymmetric tokamak equilibria, for example output from EFIT equilibrium reconstruction. It consists of the poloidal flux psi, describing the magnetic field in R and Z, together with the toroidal magnetic field Bt, given by a 1D function f(psi) = R*Bt which depends only on psi:
import zoidberg
field = zoidberg.field.GEQDSK("gfile.eqdsk")
VMEC files
The VMEC format describes 3D magnetic fields in toroidal geometry, but only includes closed flux surfaces:
import zoidberg
field = zoidberg.field.VMEC("w7x.wout")
Plotting the magnetic field
Routines to plot the magnetic field are in zoidberg.plot. They include Poincare plots and 3D field line plots.
For example, to make a Poincare plot from a MAST equilibrium:
import numpy as np
import zoidberg
field = zoidberg.field.GEQDSK("g014220.00200")
zoidberg.plot.plot_poincare(field, 1.4, 0.0, 2*np.pi, interactive=True)
This creates a flux surface starting at \(R=1.4\) and \(Z=0.0\). The fourth input (2*np.pi) is the periodicity in the \(y\) direction. Since this magnetic field is symmetric in y (toroidal angle), this parameter only affects the toroidal planes where the points are plotted. The interactive=True argument to plot_poincare generates a new set of points for every click on the plot window.
Creating poloidal grids
The FCI technique is used for derivatives along the magnetic field (in Y), and doesn't restrict the form of the grid in the XZ poloidal planes. A 3D grid created by Zoidberg is a collection of 2D planes (poloidal grids), connected together by interpolations along the magnetic field. To define a 3D grid we first need to define the 2D poloidal grids.
Two types of poloidal grids can currently be created: Rectangular grids, and curvilinear structured grids. All poloidal grids have the following methods:
- getCoordinate(), which returns the real space (R,Z) coordinates of a given (x,z) index, or derivatives thereof
- findIndex(), which returns the (x,z) index of a given (R,Z) coordinate (in general floating point)
- metric(), which returns the 2D metric tensor
- plot(), which plots the grid
Rectangular grids
To create a rectangular grid, pass the number of points and lengths in the x and z directions to RectangularPoloidalGrid:
import zoidberg
rect = zoidberg.poloidal_grid.RectangularPoloidalGrid( nx, nz, Lx, Lz )
By default the middle of the rectangle is at \(\left(R,Z\right) = \left(0,0\right)\), but this can be changed with the Rcentre and Zcentre options.
Curvilinear structured grids
To create a structured curvilinear grid, inner and outer boundary lines are needed (two RZline objects). The shaped_line function creates RZline shapes with the following formula:

\(R = R_0 - b + \left(a + b\cos\theta\right)\cos\left(\theta + \delta\sin\theta\right)\)

\(Z = \left(1 + \epsilon\right)a\sin\theta\)

where \(R_0\) is the major radius, \(a\) is the minor radius, \(\epsilon\) is the elongation (elong), \(\delta\) the triangularity (triang), and \(b\) the indentation (indent).
Post-processing
The majority of the existing analysis and post-processing code is written in Python. Routines to read BOUT++ output data, usually called "collect" because they collect data from multiple files, are also available in IDL, Matlab, Mathematica and Octave. All these post-processing routines are in the tools directory, with Python modules in tools/pylib. A summary of available routines is in Python routines; see below for how to install the requirements.
Python routines
Requirements
The Python tools provided with BOUT++ make heavy use of numpy and scipy, as well as matplotlib for the plotting routines. In order to read BOUT++ output in Python, you will need either netcdf4 or h5py.
While we try to ensure that the Python tools are compatible with both Python 2 and 3, we officially only support Python 3.
If you are developing BOUT++, you may also need Jinja2 to edit some of the generated code (see Field2D/Field3D Arithmetic Operators for more information).
You can install most of the required Python modules by running
$ pip3 install --user -r requirements.txt
in the directory where you have unpacked BOUT++. This will install supported versions of numpy, scipy, netcdf4, matplotlib and jinja2.
Note
If you have difficulties installing SciPy, please see their installation instructions
Reading BOUT++ data
To read data from a BOUT++ simulation into Python, there is a collect routine. This gathers together the data from multiple processors, taking care of the correct layout.
from boutdata.collect import collect
Ni = collect("Ni") # Collect the variable "Ni"
The result is an up-to-4D array, Ni in this case. The array is a BoutArray object: BoutArray is a wrapper class for NumPy's ndarray which adds an "attributes" member variable containing a dictionary of attributes. The array is ordered [t,x,y,z]:
>>> Ni.shape
(10, 1, 2, 3)
so Ni would have 10 time slices, 1 point in x, 2 in y, and 3 in z. This should correspond to the grid size used in the simulation.
Since the collected data is a NumPy array, all the useful routines
in NumPy, SciPy and Matplotlib can be used for further analysis.
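For example, some typical NumPy operations on a collected [t,x,y,z] array (using a synthetic array here in place of collect output):

```python
import numpy as np

# A synthetic [t,x,y,z] array stands in for a collected variable like Ni.
Ni = np.random.default_rng(42).normal(size=(10, 1, 2, 3))

mean_profile = Ni.mean(axis=(0, 2, 3))            # time/y/z average vs x
fluct = Ni - Ni.mean(axis=-1, keepdims=True)      # subtract the z average
rms_t = np.sqrt((fluct**2).mean(axis=(1, 2, 3)))  # RMS fluctuation vs time

print(mean_profile.shape, rms_t.shape)  # -> (1,) (10,)
```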
The attributes of the data give:
- the bout_type of the variable: one of {'Field3D_t', 'Field2D_t', 'scalar_t'} for time-evolving variables, or {'Field3D', 'Field2D', 'scalar'} for time-independent variables
- its location, one of {'CELL_CENTRE', 'CELL_XLOW', 'CELL_YLOW', 'CELL_ZLOW'}. See Staggered grids.
>>> Ni.attributes["bout_type"]
'Field3D_t'
>>> Ni.attributes["location"]
'CELL_CENTRE'
Attributes can also be read using the attributes routine:
from boutdata.collect import attributes
attribs = attributes("Ni")
The result is a dictionary (map) of attribute name to attribute value.
If the data has fewer than 4 dimensions, the dimension function can be used to check which dimensions are available:
from boutdata.collect import dimension
print(dimension("Ni"))
print(dimension("dx"))
The first will print [t, x, y, z] as expected, while the second will print [x, y], as dx is neither evolved in time nor has a z dependency.
To access both the input options (in the BOUT.inp file) and output data, there
is the BoutData
class.
>>> from boutdata.data import BoutData
>>> d = BoutData(path=".")
where the path is optional, and should point to the directory containing the BOUT.inp (input) and BOUT.dmp.* (output) files. This will return a dictionary with keys "path" (the given path to the data), "options" (the input options) and "outputs" (the output data). The tree of options can be printed:
>>> print(d["options"])
options
 timestep = 50
 myg = 0
 nout = 50
 mxg = 2
 all
  bndry_all = neumann
  scale = 0.0
 phisolver
  fourth_order = true
...
and accessed as a tree of dictionaries:
>>> print(d["options"]["phisolver"]["fourth_order"])
true
Currently the values are either integers, floats, or strings, so in the above example "true" is a string, not a Boolean.
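If a Boolean is wanted, the string has to be converted explicitly, for example with a small hypothetical helper (not part of boutdata):

```python
# Option values come back as int, float or str; interpret BOUT.inp
# boolean strings explicitly.
def option_as_bool(value):
    if isinstance(value, str):
        return value.lower() in ("true", "yes", "on", "1")
    return bool(value)

print(option_as_bool("true"), option_as_bool("false"))  # -> True False
```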
In a similar way the outputs are available as dictionary keys:
>>> print(d["outputs"])
ZMAX
rho_s
zperiod
BOUT_VERSION
...
>>> d["outputs"]["rho_s"]
0.00092165524660235405
There are several modules available for reading NetCDF files, so to provide a consistent interface, file access is wrapped into a class, DataFile. This provides a simple interface for reading and writing files from any of the following modules: netCDF4, Scientific.IO.NetCDF, and scipy.io.netcdf. The DataFile class also allows access to HDF5 files through the same interface, using the h5py module. To open a file using DataFile:
from boututils.datafile import DataFile
f = DataFile("file.nc") # Open the file
var = f.read("variable") # Read a variable from the file
f.close() # Close the file
or similarly for an HDF5 file
from boututils.datafile import DataFile
f = DataFile("file.hdf5") # Open the file
var = f.read("variable") # Read a variable from the file
f.close() # Close the file
A more robust way to read from DataFiles is to use the context manager syntax:
from boututils.datafile import DataFile
with DataFile("file.hdf5") as f: # Open the file
    var = f.read("variable") # Read a variable from the file
This way the DataFile is automatically closed at the end of the with block, even if there is an error in f.read. To list the variables in a file:
>>> f = DataFile("test_io.grd.nc")
>>> print(f.list())
['f3d', 'f2d', 'nx', 'ny', 'rvar', 'ivar']
and to list the names of the dimensions
>>> print(f.dimensions("f3d"))
('x', 'y', 'z')
or to get the sizes of the dimensions
>>> print(f.size("f3d"))
[12, 12, 5]
or the dictionary of attributes
>>> print(f.attributes("f3d"))
{}
To read in all the variables in a file into a dictionary, there is the file_import function:
from boututils.file_import import file_import
grid = file_import("grid.nc")
Python analysis routines
The analysis and post-processing routines are currently divided into two Python modules: boutdata, which contains BOUT++-specific things like collect, and boututils, which contains more generic useful routines.
To plot data, a convenient wrapper around matplotlib is plotdata:
from boutdata import collect
n = collect("n") # Read data as NumPy array [t,x,y,z]
from boututils.plotdata import plotdata
plotdata(n[1,:,0,:])
If given a 2D array as in the above example, plotdata produces a contour plot (using matplotlib pyplot.contourf) with colour bar. If given a 1D array then it will plot a line plot (using pyplot.plot).
It is sometimes useful to see an animation of a simulation. To do this there is showdata, which again is a wrapper around matplotlib:
from boutdata import collect
n = collect("n") # Read data as NumPy array [t,x,y,z]
from boututils.showdata import showdata
showdata(n[:,:,0,:])
This always assumes that the first index is time and will be animated over. The above example animates the variable n in time, at each time point plotting a contour plot in the x and z dimensions. The colour range is kept constant by default. If a 2D array is given to showdata then a line plot will be drawn at each time, with the scale being kept constant.
Reading BOUT++ output into IDL
There are several routines provided for reading data from BOUT++ output into IDL. In the directory containing the BOUT++ output files (usually data/), you can list the variables available using:
IDL> print, file_list("BOUT.dmp.0.nc")
Ajpar Apar BOUT_VERSION MXG MXSUB MYG MYSUB MZ NXPE NYPE Ni Ni0 Ni_x Te0 Te_x
Ti0 Ti_x ZMAX ZMIN iteration jpar phi rho rho_s t_array wci
The file_list procedure just returns an array, listing all the variables in a given file.
One thing new users can find confusing is that different simulations may have very different outputs. This is because BOUT++ is not a single physics model: the variables evolved and written to file are determined by the model, and will be very different between (for example) full MHD and reduced Braginskii models. There are however some variables which all BOUT++ output files contain:
- BOUT_VERSION, which gives the version number of BOUT++ which produced the file. This is mainly to help output processing codes handle changes to the output file format. For example, BOUT++ version 0.30 introduced 2D domain decomposition, which needs to be handled when collecting data.
- MXG, MYG: the sizes of the X and Y guard cells
- MXSUB: the number of X grid points in each processor. This does not include the guard cells, so the total X size of each field will be MXSUB + 2*MXG.
- MYSUB: the number of Y grid points per processor (like MXSUB)
- MZ: the number of Z points
- NXPE, NYPE: the number of processors in the X and Y directions. NXPE * MXSUB + 2*MXG = NX, NYPE * MYSUB = NY
- ZMIN, ZMAX: the range of Z in fractions of \(2\pi\)
- iteration: the last timestep in the file
- t_array: an array of times
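The grid-size relations in the list above can be checked with a little arithmetic (illustrative numbers; real values come from the dump files):

```python
# Reconstructing global grid sizes from per-processor layout values.
MXG, MYG = 2, 2        # guard cells in X and Y
MXSUB, MYSUB = 16, 16  # interior points per processor
NXPE, NYPE = 2, 4      # processors in X and Y

NX = NXPE * MXSUB + 2 * MXG  # global X size includes guard cells
NY = NYPE * MYSUB            # global Y size excludes guard cells

print(NX, NY)  # -> 36 64
```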
Most of these, particularly those concerned with grid size and processor layout, are used by post-processing routines such as collect, and are seldom needed directly. To read a single variable from a file, there is the file_read function:
IDL> wci = file_read("BOUT.dmp.0.nc", "wci")
IDL> print, wci
9.58000e+06
To read in all the variables in a file into a structure, use the file_import function:
IDL> d = file_import("BOUT.dmp.0.nc")
IDL> print, d.wci
9.58000e+06
This is often used to read in the entire grid file at once. Doing this for output data files can take a long time and use a lot of memory.
Reading from individual files is fine for scalar quantities and time arrays, but reading arrays which are spread across processors (i.e. evolving variables) is tedious to do manually. Instead, there is the collect function to automate this:
IDL> ni = collect(var="ni")
Variable 'ni' not found
-> Variables are case-sensitive: Using 'Ni'
Reading from .//BOUT.dmp.0.nc: [0-35][2-6] -> [0-35][0-4]
This function takes care of the case, so that reading 'ni' is automatically corrected to 'Ni'. The result is a 4D variable:
IDL> help, ni
NI FLOAT = Array[36, 5, 64, 400]
with the indices [X, Y, Z, T]. Note that in the output files, these variables are stored in [T, X, Y, Z] format instead, but this is changed by collect. Sometimes you don't want to read in the entire array (which may be very large). To read in only a subset, there are several optional keywords with [min, max] ranges:
IDL> ni = collect(var="Ni", xind=[10,20], yind=[2,2], zind=[0,31],
tind=[300,399])
Reading from .//BOUT.dmp.0.nc: [10-20][4-4] -> [10-20][2-2]
IDL> help, ni
NI FLOAT = Array[11, 1, 32, 100]
Summary of IDL file routines
The file_* functions can currently only read/write NetCDF files; HDF5 is not supported yet.
Open a NetCDF file:
handle = file_open("filename", /write, /create)
Array of variable names:
list = file_list(handle)
list = file_list("filename")
Number of dimensions:
nd = file_ndims(handle, "variable")
nd = file_ndims("filename", "variable")
Read a variable from file. Inds = [xmin, xmax, ymin, ymax, ...]
data = file_read(handle, "variable", inds=inds)
data = file_read("filename", "variable", inds=inds)
Write a variable to file. For NetCDF it tries to match up dimensions, and defines new dimensions when needed
status = file_write(handle, "variable", data)
Close a file after use
file_close, handle
To read in all the data in a file into a structure:
data = file_import("filename")
and to write a structure to file:
status = file_export("filename", data)
IDL analysis routines
Now that the BOUT++ results have been read into IDL, all the usual analysis and plotting routines can be used. In addition, there are many useful routines included in the idllib subdirectory. There is a README file which describes what each of these routines does, but some of the most useful ones are listed here. All these examples assume there is a variable P which has been read into IDL as a 4D [x,y,z,t] variable:
- fft_deriv and fft_integrate, which differentiate and integrate periodic functions.
- get_integer, get_float, and get_yesno, which request integers, floats and a yes/no answer from the user respectively.
- showdata animates 1- or 2-dimensional variables. Useful for quickly displaying results in different ways. This is useful for taking a quick look at the data, but can also produce bitmap outputs for turning into a movie for presentation. To show an animated surface plot at a particular poloidal location (32 here):

  IDL> showdata, p[*,32,*,*]

  To turn this into a contour plot:

  IDL> showdata, p[*,32,*,*], /cont

  To show a slice through this at a particular toroidal location (0 here):

  IDL> showdata, p[*,32,0,*]

  There are a few other options and ways to show data using this code; see the README file, or comments in showdata.pro. Instead of plotting to screen, showdata can produce a series of numbered bitmap images by using the bmp option:

  IDL> showdata, p[*,32,*,*], /cont, bmp="result_"

  which will produce images called result_0000.bmp, result_0001.bmp and so on. Note that the plotting should not be obscured or minimised, since this works by plotting to screen, then grabbing an image of the resulting plot.
- moment_xyzt takes a 4D variable (such as those from collect), and calculates RMS, DC and AC components in the Z direction.
- safe_colors, a general routine for IDL which arranges the color table so that colors are numbered 1 (black), 2 (red), 3 (green), 4 (blue). Useful for plotting, and used by many other routines in this library.
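For reference, the quantities moment_xyzt computes are easy to reproduce in Python for a [x,y,z,t] array (a sketch of the definitions, not the IDL implementation):

```python
import numpy as np

# For a [x,y,z,t] array: the DC part is the average over z, the AC part
# the remainder, and the RMS the root-mean-square over z.
def moments_z(p):
    dc = p.mean(axis=2)                 # [x,y,t]
    ac = p - dc[:, :, None, :]          # [x,y,z,t]
    rms = np.sqrt((p**2).mean(axis=2))  # [x,y,t]
    return rms, dc, ac

p = np.random.default_rng(1).normal(size=(4, 3, 8, 5))
rms, dc, ac = moments_z(p)
print(np.allclose(ac.mean(axis=2), 0.0))  # -> True
```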
There are many other useful routines in the idllib directory. See the idllib/README file for a short description of each one.
Matlab routines
These are Matlab routines for collecting data, showing animations and performing some basic analysis. To use these routines, either copy them (from tools/matlablib) directly to your present working directory, or add a path to tools/matlablib before analysis:
>> addpath <full_path_BOUT_directory>/tools/matlablib/
The first routine, to collect data and import it into Matlab for further analysis, is:
>> var = import_dmp(path,var_name);
Here, path is the path where the output data in NetCDF format has been dumped, and var_name is the name of the variable which the user wants to load for further analysis. For example, to load the "P" variable from the present working directory:
>> P = import_dmp('.','P');
Variable "P" can be in any of the [X,Y,Z,T]/[X,Y,Z]/[X,Y]/constant formats. Suppose we are importing a large data set in [X,Y,Z,T] format. Such data files are often very large, and Matlab may run out of memory, or take too much time, loading data for all time steps. To overcome this limitation of import_dmp, another routine, import_data_netcdf, is provided. It serves all the purposes of import_dmp, but also gives the user the freedom to import data at only a few specific time steps.
>> var = import_data_netcdf(path,var_name,nt,ntsp);
Here, path and var_name are the same as described before; nt is the number of time steps the user wishes to load, and ntsp is the interval (in written time steps) at which the data is read.
>> P = import_data_netcdf('.','P',5,100);
Variable "P" has been imported from the present working directory for 5 time steps. As the original NetCDF data contains time information for 500 steps (assuming NT=500 in the BOUT++ simulation), the user picks only 5 time steps at intervals of ntsp, i.e. 100 here. Details of other Matlab routines provided with the BOUT++ package can be found in README.txt in the tools/matlablib directory. Matlab users can develop their own routines using the ncread, ncinfo, ncwrite, ncdisp, netcdf etc. functions provided with Matlab.
Mathematica routines
A package to read BOUT++ output data into Mathematica is in tools/mathematicalib. To read data into Mathematica, first add this directory to Mathematica's path by putting
AppendTo[$Path,"/full/path/to/BOUT/tools/mathematicalib"]
in your Mathematica startup file (usually $HOME/.Mathematica/Kernel/init.m). To use the package, call
Import["BoutCollect.m"]
from inside Mathematica. Then you can use e.g.
f=BoutCollect[variable,path->"data"]
or
f=bc[variable,path->"data"]
"bc" is a shorthand for "BoutCollect". All options supported by the Python collect() function are included, though Info does nothing yet.
Octave routines
There is minimal support for reading data into Octave, which has been tested on Octave 3.2. It requires the octcdf library to access NetCDF files.
f = bcollect() # optional path argument is "." by default
f = bsetxrange(f, 1, 10) # Set ranges
# Same for y, z, and t (NOTE: indexing from 1!)
u = bread(f, "U") # Finally read the variable
The python boutcore module
Installing
Installing boutcore can be tricky. Ideally it should be just
./configure --enable-shared
make -j 4 python
but getting all the dependencies can be difficult. make python creates the python3 module.
If problems arise, it might be worth checking out a fresh copy of the BOUT++ module, to reduce the risk of causing issues with the old BOUT++ installation. This is especially true if you are trying to run boutcore not on the compute nodes of a supercomputer but rather on post-processing/login/... nodes.
To use boutcore on the login node, a self-compiled version of MPI may be required, as the provided one may only be for the compute nodes. Further, NumPy header files are required, therefore NumPy needs to be compiled as well.
Further, the header files need to be exposed to the boutcore Cython compilation, e.g. by adding them to _boutcore_build/boutcore.pyx.in. It seems both NUMPY/numpy/core/include and NUMPY/build/src.linux-x86_64-2.7/numpy/core/include/numpy need to be added, where NUMPY is the path of the NumPy directory.
For running boutcore on the post-processing nodes, fftw3 needs to be compiled as well, if certain FFTW routines are used. Note that FFTW needs to be configured with --enable-shared.
After installing MPI, e.g. in ~/local/mpich, BOUT++ needs to be configured with something like:
./configure --enable-shared MPICC=~/local/mpich/bin/mpicc MPICXX=~/local/mpich/bin/mpicxx --with-fftw=~/local/fftw/
--enable-shared is required, so that PVODE etc. are compiled as position-independent code.
If you are running Fedora, you can install prebuilt binaries:
sudo dnf copr enable davidsch/bout
sudo dnf install python3-bout++-mpich
module load mpi/mpich-$(arch)
Purpose
The boutcore module exposes (part of) the BOUT++ C++ library to Python. It allows, for example, BOUT++ derivatives to be calculated in Python.
State
Field3D and Field2D are working. If other fields are needed, please open an issue.
Fields can be accessed directly using the [] operators, which take slice objects. To get all data, f3d.getAll() is equivalent to f3d[:,:,:] and returns a numpy array. This array can be addressed with e.g. [] operators, and then the field can be set again with f3d.setAll(numpyarray). It is also possible to set part of a Field3D with the [] operators. Addition, multiplication etc. are all available.
The derivatives should all be working; if you find a missing one, please open an issue.
Vectors are not exposed yet.
Functions
Examples
Some trivial post processing:
import boutcore
import numpy as np
args="-d data -f BOUT.settings -o BOUT.post".split(" ")
boutcore.init(args)
dens=boutcore.Field3D.fromCollect("n",path="data")
temp=boutcore.Field3D.fromCollect("T",path="data")
pres=dens*temp
dpdz=boutcore.DDZ(pres,outloc="CELL_ZLOW")
A simple MMS test:
import boutcore
import numpy as np
boutcore.init("-d data -f BOUT.settings -o BOUT.post")
errors=[]
for nz in [64,128,256]:
    boutcore.setOption("meshz:nz","%d"%nz)
    mesh=boutcore.Mesh(OptionSection="meshz")
    f=boutcore.create3D("sin(z)",mesh)
    sim=boutcore.DDZ(f)
    ana=boutcore.create3D("cos(z)",mesh)
    err=sim-ana
    err=boutcore.max(boutcore.abs(err))
    errors.append(err)
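Once the errors list is filled, the observed order of accuracy follows from successive grid doublings; a self-contained sketch with synthetic, second-order-looking error values:

```python
from math import log2

# For grid doublings, observed order = log2(err[i] / err[i+1]).
# Synthetic errors chosen to look exactly second order.
errors = [0.25, 0.0625, 0.015625]
orders = [log2(e1 / e2) for e1, e2 in zip(errors, errors[1:])]
print(orders)  # -> [2.0, 2.0]
```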
A real example - unstagger data:
import boutcore
boutcore.init("-d data -f BOUT.settings -o BOUT.post")
# uses location from dump  is already staggered
upar=boutcore.Field3D.fromCollect("Upar")
upar=boutcore.interp_to(upar,"CELL_CENTRE")
# convert to numpy array
upar=upar.getAll()
A real example - check derivative contributions:
#!/usr/bin/env python
import boutcore
from boutcore import *
import numpy as np
from netCDF4 import Dataset
import sys

if len(sys.argv) > 1:
    path=sys.argv[1]
else:
    path="data"
times=collect("t_array",path=path)
boutcore.init("-d data -f BOUT.settings -o BOUT.post")
with Dataset(path+'/vort.nc', 'w', format='NETCDF4') as outdmp:
    phiSolver=Laplacian()
    phi=Field3D.fromCollect("n",path=path,tind=0,info=False)
    zeros=phi.getAll()*0
    phi.setAll(zeros)
    outdmp.createDimension('x',zeros.shape[0])
    outdmp.createDimension('y',zeros.shape[1])
    outdmp.createDimension('z',zeros.shape[2])
    outdmp.createDimension('t',None)
    t_array_=outdmp.createVariable('t_array','f4',('t'))
    t_array_[:]=times
    ExB     = outdmp.createVariable('ExB'    ,'f4',('t','x','y','z'))
    par_adv = outdmp.createVariable('par_adv','f4',('t','x','y','z'))
    def setXGuards(phi,phi_arr):
        # Set the X guard cells (first and last two points) at every z
        tmp=phi.getAll()
        for z in range(tmp.shape[2]):
            phi[0,:,z]=phi_arr
            phi[1,:,z]=phi_arr
            phi[-2,:,z]=phi_arr
            phi[-1,:,z]=phi_arr
    with open(path+"/equilibrium/phi_eq.dat","rb") as inf:
        phi_arr=np.fromfile(inf,dtype=np.double)
    bm="BRACKET_ARAKAWA_OLD"
    for tind in range(len(times)):
        vort = Field3D.fromCollect("vort",path=path,tind=tind,info=False)
        U    = Field3D.fromCollect("U"   ,path=path,tind=tind,info=False)
        setXGuards(phi,phi_arr)
        phi=phiSolver.solve(vort,phi)
        ExB[tind,:,:,:]=(bracket(phi, vort, bm, "CELL_CENTRE")).getAll()
        par_adv[tind,:,:,:]=(Vpar_Grad_par(U, vort)).getAll()
Functions - undocumented
Functions - special and inherited
Time integration
Options
BOUT++ can be compiled with several different time-integration solvers, and at minimum should have Runge-Kutta (RK4) and PVODE (BDF/Adams) solvers available.
The solver library used is set using the solver:type option, so either in BOUT.inp:
[solver]
type = rk4 # Set the solver to use
or on the command line by adding solver:type=pvode, for example:
mpirun -np 4 ./2fluid solver:type=rk4
NB: Make sure there are no spaces around the "=" sign: solver:type =pvode won't work (probably). Table 3 gives a list of time integration solvers, along with any compile-time options needed to make the solver available.
Name | Description | Compile options
---- | ----------- | ---------------
euler | Euler explicit method (example only) | Always available
rk4 | Runge-Kutta 4th-order explicit method | Always available
rkgeneric | Generic Runge-Kutta explicit methods | Always available
karniadakis | Karniadakis explicit method | Always available
rk3ssp | 3rd-order Strong Stability Preserving | Always available
splitrk | Split RK3-SSP and RK-Legendre | Always available
pvode | 1998 PVODE with BDF method | Always available
cvode | SUNDIALS CVODE. BDF and Adams methods | --with-cvode
ida | SUNDIALS IDA. DAE solver | --with-ida
arkode | SUNDIALS ARKODE IMEX solver | --with-arkode
petsc | PETSc TS methods | --with-petsc
imexbdf2 | IMEX-BDF2 scheme | --with-petsc
Each solver can have its own settings which work in slightly different ways, but some common settings and the solvers they are used in are given in Table 4.
Option | Description | Solvers used
------ | ----------- | ------------
atol | Absolute tolerance | rk4, pvode, cvode, ida, imexbdf2
rtol | Relative tolerance | rk4, pvode, cvode, ida, imexbdf2
mxstep | Maximum internal steps per output step | rk4, imexbdf2
max_timestep | Maximum timestep | rk4, cvode
timestep | Starting timestep | rk4, karniadakis, euler, imexbdf2
adaptive | Adapt timestep? (Y/N) | rk4, imexbdf2
use_precon | Use a preconditioner? (Y/N) | pvode, cvode, ida, imexbdf2
mudq, mldq | BBD preconditioner settings | pvode, cvode, ida
mukeep, mlkeep | BBD preconditioner settings | pvode, cvode, ida
maxl | Maximum number of linear iterations | cvode, imexbdf2
use_jacobian | Use user-supplied Jacobian? (Y/N) | cvode
adams_moulton | Use Adams-Moulton method rather than BDF | cvode
diagnose | Collect and print additional diagnostics | cvode, imexbdf2
The most commonly changed options are the absolute and relative solver tolerances, atol and rtol, which should be varied to check convergence.
CVODE
The most commonly used time integration solver is CVODE, or its older version PVODE. CVODE has several advantages over PVODE, including better support for preconditioning and diagnostics.
Enabling diagnostics output using solver:diagnose=true will print a set of outputs for each timestep similar to:
CVODE: nsteps 51, nfevals 69, nniters 65, npevals 126, nliters 79
-> Newton iterations per step: 1.274510e+00
-> Linear iterations per Newton iteration: 1.215385e+00
-> Preconditioner evaluations per Newton: 1.938462e+00
-> Last step size: 1.026792e+00, order: 5
-> Local error fails: 0, nonlinear convergence fails: 0
-> Stability limit order reductions: 0
1.000e+01 149 2.07e+01 78.3 0.0 10.0 0.9 10.8
When diagnosing slow performance, key quantities to look for are
nonlinear convergence failures, and the number of linear iterations per
Newton iteration. A large number of failures, and close to 5 linear
iterations per Newton iteration are a sign that the linear solver is not
converging quickly enough, and hitting the default limit of 5
iterations. This limit can be modified using the solver:maxl
setting. Giving it a large value e.g. solver:maxl=1000
will show how
many iterations are needed to solve the linear system. If the number of
iterations becomes large, this may be an indication that the system is
poorly conditioned, and a preconditioner might help improve performance.
See Preconditioning.
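For example, a BOUT.inp fragment to turn on these diagnostics and raise the linear iteration limit might look like (values illustrative):

```
[solver]
type = cvode
diagnose = true   # print per-timestep statistics
maxl = 1000       # raise the linear iteration limit to see how many are needed
```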
IMEX-BDF2
This is an IMplicit-EXplicit time integration solver, which allows the evolving function to be split into two parts: one which has relatively long timescales and can be integrated using explicit methods, and a part which has short timescales and must be integrated implicitly. The order of accuracy is variable (up to 4th-order currently), and an adaptive timestep can be used.
To use the IMEX-BDF2 solver, set the solver type to imexbdf2, e.g. on the command line add solver:type=imexbdf2, or in the options file:
[solver]
type = imexbdf2
The order of the method is set to 2 by default, but can be increased up to a maximum of 4:
[solver]
type = imexbdf2
maxOrder = 3
This is a multistep method, so the state from previous steps is used to construct the next one. This means that at the start, when there are no previous steps, the order is limited to 1 (the backwards Euler method). Similarly, the second step is limited to order 2, and so on. At the moment the order is not adapted, so it just increases until reaching maxOrder.
At each step the explicit (non-stiff) part of the function is called, and combined with previous timestep values. The implicit part of the function is then solved using PETSc's SNES, which consists of a nonlinear solver (usually a modified Newton iteration), each iteration of which requires a linear solve (usually GMRES). Settings which affect this implicit part of the solve are:
Option | Default | Description
atol | 1e-16 | Absolute tolerance on SNES solver
rtol | 1e-10 | Relative tolerance on SNES solver
max_nonlinear_it | 5 | Maximum number of nonlinear iterations. If adaptive timestepping is used then failure will cause timestep reduction
maxl | 20 | Maximum number of linear iterations. If adaptive, failure will cause timestep reduction
predictor | 1 | Starting guess for the nonlinear solve. Specifies order of extrapolating polynomial
use_precon | false | Use user-supplied preconditioner?
matrix_free | true | Use Jacobian-free methods? If false, calculates the Jacobian matrix using finite differences
use_coloring | true | If not matrix free, use coloring to speed up calculation of the Jacobian
Note that the SNES tolerances atol and rtol are set very conservatively by default. More reasonable values might be 1e-10 and 1e-5, but these must be explicitly asked for in the input options.
The predictor extrapolates from previous timesteps to get a starting estimate for the value at the next timestep. This estimate is then used to initialise the SNES nonlinear solve. The value is the order of the extrapolating polynomial, so 1 (the default) is a linear extrapolation from the last two steps, and 0 is the same as the last step. A value of -1 uses the explicit update to the state as the starting guess, i.e. assuming that the implicit part of the problem is small. This is usually not a good guess.
To diagnose what is happening in the time integration, for example to see why it is failing to converge or why timesteps are small, there are two settings which can be set to true:
 diagnose outputs a summary at each output time, similar to CVODE. This contains information such as the last timestep, the average number of iterations, and the number of convergence failures.
 verbose prints information at every internal step, with more detail on the values used to modify timesteps and the reasons for solver failures.
By default adaptive timestepping is turned on, using several factors to modify the timestep:
 If the nonlinear solver (SNES) fails to converge, either because it diverges or because it exceeds the iteration limits max_nonlinear_it or maxl, the timestep is reduced by a factor of 2 and the step is tried again, giving up after 10 failures.
 Every nadapt internal timesteps (default 4), the error is checked by taking the timestep twice: once with the current order of accuracy, and once with one order of accuracy lower. The difference between the solutions is then used to estimate the timestep required to achieve the required tolerances. If this is much larger or smaller than the current timestep, then the timestep is modified.
 The timestep is kept within user-specified maximum and minimum ranges.
The options which control this behaviour are:
Option | Default | Description
adaptive | true | Turns on adaptive timestepping
timestep | output timestep | If adaptive, sets the starting timestep. If not adaptive, the timestep is fixed at this value
dtMin | 1e-10 | Minimum timestep
dtMax | output timestep | Maximum timestep
mxstep | 1e5 | Maximum number of internal steps between outputs
nadapt | 4 | How often the error is checked and the timestep adjusted
adaptRtol | 1e-3 | Target relative tolerance for the adaptive timestep
scaleCushDown | 1.0 | Timestep scale factor below which the timestep is modified. By default the timestep is always reduced
scaleCushUp | 1.5 | Timestep scale factor (based on adaptRtol) above which the timestep will be modified. Currently the timestep increase is limited to 25%
SplitRKÂ¶
The splitrk solver type uses Strang splitting to combine two explicit Runge-Kutta schemes:
 2nd-order Runge-Kutta-Legendre method for the diffusion (parabolic) part. These schemes use multiple stages to increase stability, rather than accuracy; the method is always 2nd-order, but the stable timestep for diffusion problems increases as the square of the number of stages. The number of stages is an input option, and can be arbitrarily large.
 3rd-order SSPRK3 scheme for the advection (hyperbolic) part: http://www.cscamm.umd.edu/tadmor/pub/linearstability/GottliebShuTadmor.SIREV01.pdf
Each timestep consists of
 A half timestep of the diffusion part
 A full timestep of the advection part
 A half timestep of the diffusion part
Options to control the behaviour of the solver are:
Option | Default | Description
timestep | output timestep | If adaptive, sets the starting timestep. If not adaptive, the timestep is fixed at this value
nstages | 10 | Number of stages in the RKL step. Must be > 1
diagnose | false | Print diagnostic information
And the adaptive timestepping options:
Option | Default | Description
adaptive | true | Turn on adaptive timestepping
atol | 1e-10 | Absolute tolerance
rtol | 1e-5 | Relative tolerance
max_timestep | output timestep | Maximum internal timestep
max_timestep_change | 2 | Maximum factor by which the timestep can be changed at each step
mxstep | 1000 | Maximum number of internal steps before output
adapt_period | 1 | Number of internal steps between tolerance checks
ODE integrationÂ¶
The Solver class can be used to solve systems of ODEs inside a physics model: multiple Solver objects can exist besides the main one used for time integration. Example code is in examples/test-integrate.
To use this feature, systems of ODEs must be represented by a class derived from PhysicsModel.
class MyFunction : public PhysicsModel {
public:
int init(bool restarting) {
// Initialise ODE
// Add variables to solver as usual
solver->add(result, "result");
...
}
int rhs(BoutReal time) {
// Specify derivatives of fields as usual
ddt(result) = ...
}
private:
Field3D result;
};
To solve this ODE, create a new Solver object:
Solver* ode = Solver::create(Options::getRoot()->getSection("ode"));
This will look in the section [ode] in the options file.
Important: To prevent this solver overwriting the main restart files
with its own restart files, either disable restart files:
[ode]
enablerestart = false
or specify a different directory to put the restart files:
[ode]
restartdir = ode # Restart files ode/BOUT.restart.0.nc, ...
Create a model object, and pass it to the solver:
MyFunction* model = new MyFunction();
ode->setModel(model);
Finally tell the solver to perform the integration:
ode->solve(5, 0.1);
The first argument is the number of steps to take, and the second is the size of each step. These can also be specified in the options, so calling
ode->solve();
will cause ode to look in the input for the nout and timestep options:
[ode]
nout = 5
timestep = 0.1
Finally, delete the model and solver when finished:
delete model;
delete ode;
Note: If an ODE needs to be solved multiple times, at the moment it is recommended to delete the solver, and create a new one each time.
PreconditioningÂ¶
At every time step, an implicit scheme such as BDF has to solve a nonlinear problem to find the next solution. This is usually done using Newton's method, each step of which involves solving a linear (matrix) problem. For \(N\) evolving variables the Jacobian \(\mathcal{J}\) is an \(N\times N\) matrix, and so can be very large. By default matrix-free methods are used, in which the product of the Jacobian \(\mathcal{J}\) with a vector is approximated by finite differences (see next subsection), and so this matrix never needs to be explicitly calculated. Solving this matrix equation can still be difficult, particularly as \(\delta t\) gets large compared with some timescales in the system (i.e. a stiff problem).
A preconditioner is a function which quickly finds an approximate solution to this matrix, speeding up convergence to a solution. A preconditioner does not need to include all the terms in the problem being solved, as the preconditioner only affects the convergence rate and not the final solution. A good preconditioner can therefore concentrate on solving the parts of the problem with the fastest timescales.
A simple example [1] is a coupled wave equation, solved in the test-precon example code:
\[\frac{\partial u}{\partial t} = \partial_{||} v \qquad \frac{\partial v}{\partial t} = \partial_{||} u\]
First, calculate the Jacobian of this set of equations by taking partial derivatives of the time-derivatives with respect to each of the evolving variables. In this case \(\frac{\partial u}{\partial t}\) doesn't depend on \(u\), nor \(\frac{\partial v}{\partial t}\) on \(v\), so the diagonal is empty. Since the equations are linear, the Jacobian doesn't depend on \(u\) or \(v\), and so
\[\mathcal{J} = \begin{pmatrix} 0 & \partial_{||} \\ \partial_{||} & 0 \end{pmatrix}\]
In general for nonlinear functions \(\mathcal{J}\) gives the change in time-derivatives in response to changes in the state variables \(u\) and \(v\).
In implicit time stepping, the preconditioner needs to solve an equation of the form
\[\left(\mathcal{I} - \gamma\mathcal{J}\right)\mathbf{x} = \mathbf{b}\]
where \(\mathcal{I}\) is the identity matrix, and \(\gamma\) depends on the time step and method (e.g. \(\gamma = \delta t\) for the backwards Euler method). For the simple wave equation problem, this is
\[\begin{pmatrix} 1 & -\gamma\partial_{||} \\ -\gamma\partial_{||} & 1 \end{pmatrix} \begin{pmatrix} x_u \\ x_v \end{pmatrix} = \begin{pmatrix} b_u \\ b_v \end{pmatrix}\]
This matrix can be block inverted using Schur factorisation [2]
\[\begin{pmatrix} \mathbf{E} & \mathbf{U} \\ \mathbf{L} & \mathbf{D} \end{pmatrix}^{-1} = \begin{pmatrix} \mathcal{I} & -\mathbf{E}^{-1}\mathbf{U} \\ 0 & \mathcal{I} \end{pmatrix} \begin{pmatrix} \mathbf{E}^{-1} & 0 \\ 0 & \mathbf{P}_{Schur}^{-1} \end{pmatrix} \begin{pmatrix} \mathcal{I} & 0 \\ -\mathbf{L}\mathbf{E}^{-1} & \mathcal{I} \end{pmatrix}\]
where \(\mathbf{P}_{Schur} = \mathbf{D} - \mathbf{L}\mathbf{E}^{-1}\mathbf{U}\). Using this, the wave problem becomes:
\[\begin{pmatrix} x_u \\ x_v \end{pmatrix} = \begin{pmatrix} \mathcal{I} & \gamma\partial_{||} \\ 0 & \mathcal{I} \end{pmatrix} \begin{pmatrix} \mathcal{I} & 0 \\ 0 & \left(1 - \gamma^2\partial_{||}^2\right)^{-1} \end{pmatrix} \begin{pmatrix} \mathcal{I} & 0 \\ \gamma\partial_{||} & \mathcal{I} \end{pmatrix} \begin{pmatrix} b_u \\ b_v \end{pmatrix} \qquad (2)\]
The preconditioner is implemented by defining a function of the form
int precon(BoutReal t, BoutReal gamma, BoutReal delta) {
...
}
which takes as input the current time, the \(\gamma\) factor appearing above, and \(\delta\), which is only important for constrained problems (not discussed here… yet). The current state of the system is stored in the state variables (here u and v), whilst the vector to be preconditioned is stored in the time derivatives (here ddt(u) and ddt(v)). At the end of the preconditioner the result should be in the time derivatives. A preconditioner which is just the identity matrix, and so does nothing, is therefore:
int precon(BoutReal t, BoutReal gamma, BoutReal delta) {
  return 0;
}
To implement the preconditioner in equation (2), first apply the rightmost matrix to the given vector:
int precon(BoutReal t, BoutReal gamma, BoutReal delta) {
  mesh->communicate(ddt(u));
  // ddt(u) = ddt(u);
  ddt(v) = gamma*Grad_par(ddt(u)) + ddt(v);
Note that since the preconditioner is linear, it doesn't depend on \(u\) or \(v\). As in the RHS function, since we are taking a differential of ddt(u), it first needs to be communicated to exchange guard cell values.
The second matrix doesn't alter \(u\), but solves a parabolic equation in the parallel direction. There is a solver class to do this called InvertPar, which solves the equation \((A + B\partial_{||}^2)x = b\) where \(A\) and \(B\) are Field2D or constants [3]. In PhysicsModel::init() we create one of these solvers:
InvertPar *inv; // Parallel inversion class
int init(bool restarting) {
...
inv = InvertPar::Create();
inv->setCoefA(1.0);
...
}
In the preconditioner we then use this solver to update \(v\):
inv->setCoefB(-SQ(gamma));
ddt(v) = inv->solve(ddt(v));
which solves \(ddt(v) \rightarrow \left(1 - \gamma^2\partial_{||}^2\right)^{-1} ddt(v)\). The final matrix just updates \(u\) using this new solution for \(v\):
mesh->communicate(ddt(v));
ddt(u) = ddt(u) + gamma*Grad_par(ddt(v));
Finally, boundary conditions need to be imposed, which should be consistent with the conditions used in the RHS:
ddt(u).applyBoundary("dirichlet");
ddt(v).applyBoundary("dirichlet");
To use the preconditioner, pass the function to the solver in PhysicsModel::init():
int init(bool restarting) {
solver->setPrecon(precon);
...
}
then in the BOUT.inp settings file switch on the preconditioner:
[solver]
type = cvode # Need CVODE or PETSc
use_precon = true # Use preconditioner
rightprec = false # Use Right preconditioner (default left)
Jacobian functionÂ¶
DAE constraint equationsÂ¶
Using the IDA or IMEX-BDF2 solvers, BOUT++ can solve Differential Algebraic Equations (DAEs), in which algebraic constraints are used for some variables. Examples of how this is used are in the examples/constraints subdirectory.
First the variable to be constrained is added to the solver, in a similar way to time integrated variables. For example
Field3D phi;
...
solver->constraint(phi, ddt(phi), "phi");
The first argument is the variable to be solved for (constrained). The second argument is the field to contain the residual (error). In this example the time derivative field ddt(phi) is used, but it could be another Field3D variable. The solver will attempt to find a solution to the first argument (phi here) such that the second argument (ddt(phi)) is zero to within tolerances.
In the RHS function the residual should be calculated. In this example (examples/constraints/drift-wave-constraint) we have:
ddt(phi) = Delp2(phi) - Vort;
so the time integration solver includes the algebraic constraint Delp2(phi) = Vort, i.e. \(\nabla_\perp^2\phi = \omega\).
IMEX-BDF2¶
This is an implicit-explicit multistep method, which uses the PETSc library for the SNES nonlinear solver. To use this solver, BOUT++ must have been configured with PETSc support, and the solver type set to imexbdf2:
[solver]
type = imexbdf2
For examples of using IMEX-BDF2, see the examples/IMEX/ subdirectory, in particular the diffusion-nl, drift-wave and drift-wave-constraint examples.
The time step is currently fixed (not adaptive), and defaults to the output timestep. To set a smaller internal timestep, the solver:timestep option can be set. If the timestep is too large, then the explicit part of the problem may become unstable, or the implicit part may fail to converge.
The implicit part of the problem can be solved matrix-free, in which case the Jacobian-vector product is approximated using finite differences. This is currently the default, and can be set on the command-line using the options:
solver:matrix_free=true -snes_mf
Note the -snes_mf flag which is passed to PETSc. When using a matrix-free solver, the Jacobian is not calculated and so the amount of memory used is minimal. However, since the Jacobian is not known, many standard preconditioning methods cannot be used, and so in many cases a custom preconditioner is needed to obtain good convergence.
An experimental feature uses PETSc's ability to calculate the Jacobian using finite differences. This can then speed up the linear solve, and allows more options for preconditioning. To enable this option:
solver:matrix_free=false
There are two ways to calculate the Jacobian: a brute force method, set up by this call to PETSc, which is generally very slow, and a "coloring" scheme which can be quite fast and is the default. Coloring uses knowledge of where the non-zero values are in the Jacobian to work out which rows can be calculated simultaneously. The coloring code in IMEX-BDF2 currently assumes that every field is coupled to every other field in a star pattern: one cell on each side, a 7-point stencil for 3D fields. If this is not the case for your problem, then the solver may not converge.
The brute force method can be useful for comparing the Jacobian structure, so to turn off coloring:
solver:use_coloring=false
Using MatView calls, or the -mat_view PETSc option, the non-zero structure of the Jacobian can be plotted or printed.
Monitoring the simulation outputÂ¶
Monitoring of the solution can be done at two levels: output monitoring, and timestep monitoring. Output monitoring occurs only when data is written to file, whereas timestep monitoring happens every timestep and is therefore (usually) much more frequent. Examples of both are in examples/monitor and examples/monitor-newapi.
Output monitoring: At every output timestep the solver calls a monitor method of the BoutMonitor class, which writes the output dump file, calculates and prints timing information and estimated time remaining. If you want to run additional code or write data to a different file, you can implement the outputMonitor method of PhysicsModel:
int outputMonitor(BoutReal simtime, int iter, int nout)
The first input is the current simulation time, the second is the output number, and the last is the total number of outputs requested. This method is called by a monitor object PhysicsModel::modelMonitor, which writes the restart files at the same time. You can change the frequency at which the monitor is called by calling, in PhysicsModel::init:
modelMonitor.setTimestep(new_timestep)
where new_timestep is a BoutReal which is either timestep*n or timestep/n for an integer n. Note that this will change the frequency of writing restarts as well as of calling outputMonitor().
You can also add custom monitor object(s) for more flexibility.
You can call your output monitor class whatever you like, but it must be a subclass of Monitor and provide the method call, which takes 4 inputs and returns an int:
class MyOutputMonitor : public Monitor {
int call(Solver *solver, BoutReal simtime, int iter, int NOUT) {
...
}
};
The first input is the solver object, the second is the current simulation time, the third is the output number, and the last is the total number of outputs requested. To get the solver to call this function every output time, define a MyOutputMonitor object as a member of your PhysicsModel:
MyOutputMonitor my_output_monitor;
and put in your PhysicsModel::init() code:
solver->addMonitor(my_output_monitor);
If you want to later remove a monitor, you can do so with:
solver->removeMonitor(my_output_monitor);
A simple example using this monitor is:
class MyOutputMonitor : public Monitor {
public:
  MyOutputMonitor(BoutReal timestep=-1) : Monitor(timestep) {}
  int call(Solver *solver, BoutReal simtime, int iter, int NOUT) override;
};

int MyOutputMonitor::call(Solver *solver, BoutReal simtime, int iter, int NOUT) {
  output.write("Output monitor, time = %e, step %d of %d\n",
               simtime, iter, NOUT);
  return 0;
}

MyOutputMonitor my_monitor;

int init(bool restarting) {
  solver->addMonitor(my_monitor);
}
See the monitor example (examples/monitor) for the full code.
Timestep monitoring: This uses functions instead of objects. First define a monitor function:
int my_timestep_monitor(Solver *solver, BoutReal simtime, BoutReal lastdt) {
...
}
where simtime will again contain the current simulation time, and lastdt the last timestep taken. Add this function to the solver:
solver->addTimestepMonitor(my_timestep_monitor);
Timestep monitoring is disabled by default, unlike output monitoring. To enable timestep monitoring, set in the options file (BOUT.inp):
[solver]
monitor_timestep = true
or put solver:monitor_timestep=true on the command line. When this is enabled, it will change how solvers like CVODE and PVODE (the default solvers) are used. Rather than being run in NORMAL mode, they will instead be run in SINGLE_STEP mode (see the SUNDIALS notes here: https://computation.llnl.gov/casc/sundials/support/notes.html). This may in some cases be less efficient.
Implementation internalsÂ¶
The solver is the interface between BOUT++ and the time-integration code such as SUNDIALS. All solvers implement the Solver class interface (see src/solver/generic_solver.hxx).
First, all the fields which are to be evolved need to be added to the solver. These are always added in pairs, the first specifying the field, and the second the time-derivative:
void add(Field2D &v, Field2D &F_v, const char* name);
This is normally called in the PhysicsModel::init()
initialisation routine.
Some solvers (e.g. IDA) can support constraints, which need to be added
in the same way as evolving fields:
bool constraints();
void constraint(Field2D &v, Field2D &C_v, const char* name);
The constraints() function tests whether or not the current solver supports constraints. The format of constraint(...) is the same as add, except that now the solver will attempt to make C_v zero. If constraint is called when the solver doesn't support them, then an error should occur.
If the physics model implements a preconditioner or Jacobian-vector multiplication routine, these can be passed to the solver during initialisation:
typedef int (*PhysicsPrecon)(BoutReal t, BoutReal gamma, BoutReal delta);
void setPrecon(PhysicsPrecon f); // Specify a preconditioner
typedef int (*Jacobian)(BoutReal t);
void setJacobian(Jacobian j); // Specify a Jacobian
If the solver doesnât support these functions then the calls will just be ignored.
Once the problem to be solved has been specified, the solver can be initialised using:
int init(rhsfunc f, int argc, char **argv, bool restarting, int nout, BoutReal tstep);
which returns an error code (0 on success). This is currently called in bout++.cxx:
if(solver.init(rhs, argc, argv, restart, NOUT, TIMESTEP)) {
output.write("Failed to initialise solver. Aborting\n");
return(1);
}
which passes the (physics module) RHS function PhysicsModel::rhs() to the solver, along with the number and size of the output steps.
The simulation is then run by calling:
typedef int (*MonitorFunc)(BoutReal simtime, int iter, int NOUT);
int run(MonitorFunc f);
[1] Taken from a talk by L. Chacon, available here: https://bout2011.llnl.gov/pdf/talks/Chacon_bout2011.pdf
[2] See the paper https://arxiv.org/abs/1209.2054 for an application to two-fluid equations
[3] This InvertPar class can handle cases with closed field-lines and twist-shift boundary conditions for tokamak simulations
Parallel TransformsÂ¶
In most BOUT++ simulations the Y coordinate is parallel to the magnetic field. In particular, if the magnetic field \(\mathbf{B}\) can be expressed in Clebsch form as \(\mathbf{B} = \nabla z \times \nabla x\), then the Clebsch operators can be used. See section Differential operators for more details.
The structure of the magnetic field can be simple, as in a slab geometry, but in many cases it is quite complicated. In a tokamak, for example, the magnetic shear causes deformation of grid cells and numerical issues. One way to overcome this is to transform between local coordinate systems, interpolating in the toroidal (Z) direction when calculating gradients along the magnetic field. This is called the shifted metric method. In more general geometries such as stellarators, the magnetic field can have a 3D structure and stochastic regions. In this case the interpolation becomes 2D (in X and Z), and is known as the Flux Coordinate Independent (FCI) method.
To handle these different cases in the same code, the BOUT++ mesh implements different ParallelTransform classes. Each Field3D contains a pointer to the values up and down in the Y direction, called yup and ydown. These values are calculated during communication:
Field3D f(0.0); // f allocated, set to zero
f.yup(); // error: f.yup not allocated
mesh->communicate(f);
f.yup(); // ok
f.ydown()(0,1,0); // ok
In the case of slab geometry, yup and ydown point to the original field (f). For this reason the value of f along the magnetic field from f(x,y,z) is given by f.ydown()(x,y-1,z) and f.yup()(x,y+1,z). To take a second derivative along Y using the Field3D iterators (section Iterating over fields):
Field3D result;
result.allocate(); // Need to allocate before indexing
for(const auto &i : result.region(RGN_NOBNDRY)) {
result[i] = f.yup()[i.yp()] - 2.*f[i] + f.ydown()[i.ym()];
}
Note the use of yp() and ym() to increase and decrease the Y index.
Field-aligned grid¶
The default ParallelTransform is the identity transform, which sets yup() and ydown() to point to the same field. In the input options the setting is:
[mesh]
paralleltransform = identity
This then uses the ParallelTransformIdentity class to calculate the yup and ydown fields.
This is mostly useful for slab geometries, where for a straight magnetic field the grid is either periodic in the y-direction or ends on a y-boundary. By setting the global option TwistShift = true and providing a ShiftAngle in the gridfile or [mesh] options, a branch cut can be introduced between the beginning and end of the y-domain.
ParallelTransformIdentity can also be used in non-slab geometries. Then TwistShift = true should be set, so that a twist-shift boundary condition is applied on closed field lines, as field-line following coordinates are not periodic in poloidal angle. Note that it is not recommended to use ParallelTransformIdentity with toroidal geometries, as magnetic shear will make the radial derivatives inaccurate away from the outboard midplane (which is normally chosen as the zero point for the integrated shear).
Shifted metricÂ¶
The shifted metric method is selected using:
[mesh]
paralleltransform = shifted
so that the mesh uses the ShiftedMetric class to calculate parallel transforms. During initialisation, this class reads a quantity zShift from the input or grid file. If zShift is not found then qinty is read instead. If neither is found then the angle is zero, and this method becomes the same as the identity transform. For each X and Z index, the zShift variable should contain the toroidal angle of a magnetic field line at \(z=0\), starting at \(\phi=0\) at a reference location \(\theta_0\).
Note that here \(\theta_0\) does not need to be constant in X (radius), since it is only the relative shifts between Y locations which matters.
FCI methodÂ¶
To use the FCI method for parallel transforms, set
[mesh]
paralleltransform = fci
which causes the FCITransform class to be used for parallel transforms. This reads four variables (3D fields) from the input grid: forward_xt_prime, forward_zt_prime, backward_xt_prime, and backward_zt_prime. These give the cell indices, not in general integers, in the forward (yup) and backward (ydown) directions. These are arranged so that forward_xt_prime(x,y,z) is the x index at y+1. Hence f.yup()(x,y+1,z) is calculated using forward_xt_prime(x,y,z) and forward_zt_prime(x,y,z), whilst f.ydown()(x,y-1,z) is calculated using backward_xt_prime(x,y,z) and backward_zt_prime(x,y,z).
Tools for calculating these mappings include Zoidberg, a Python tool which carries out fieldline tracing and generates FCI inputs.
Laplacian inversionÂ¶
A common problem in plasma models is to solve an equation of the form
\[d\nabla_\perp^2 f + \frac{1}{c_1}\left(\nabla_\perp c_2\right)\cdot\nabla_\perp f + a f = b \qquad (3)\]
For example, \(\nabla_\perp^2\phi = \omega\) appears in reduced MHD for the vorticity inversion, and a similar equation is solved for \(j_{||}\). Alternative formulations and ways to invert equation (3) can be found in section LaplaceXY and LaplaceXZ.
Several implementations of the Laplacian solver are available, which are selected by changing the "type" setting. The currently available implementations are listed in table Table 5.
Name | Description | Requirements
cyclic | Serial/parallel. Gathers boundary rows onto one processor. |
petsc | Serial/parallel. Lots of methods, no Boussinesq | PETSc (section PETSc)
multigrid | Serial/parallel. Geometric multigrid, no Boussinesq |
naulin | Serial/parallel. Iterative treatment of non-Boussinesq terms |
serial_tri | Serial only. Thomas algorithm for tridiagonal system. | Lapack (section LAPACK)
serial_band | Serial only. Enables 4th-order accuracy | Lapack (section LAPACK)
spt | Parallel only (NXPE>1). Thomas algorithm. |
mumps | Serial/parallel. Direct solver | MUMPS (section MUMPS)
pdd | Parallel Diagonally Dominant algorithm. Experimental |
shoot | Shooting method. Experimental |
Usage of the Laplacian inversion¶
In BOUT++, equation (3) can be solved in two ways. The first method Fourier transforms in the \(z\)-direction, whilst the other solves the full two-dimensional problem by matrix inversion. The derivation of \(\nabla_\perp^2 f\) for a general coordinate system can be found in the Field-aligned coordinates section. What is important to note is that if \(g_{xy}\) and \(g_{yz}\) are non-zero, BOUT++ neglects the \(y\)-parallel derivatives when using the Laplacian and LaplaceXZ solvers.
By neglecting the \(y\)-derivatives (or if \(g_{xy}=g_{yz}=0\)), equation (3) can be solved one \(y\) plane at a time.
The first approach utilizes the fact that it is possible to Fourier transform the equation in \(z\) (using some assumptions described in section Numerical implementation), and solve a tridiagonal system for each mode. These inversion problems are band-diagonal (tridiagonal in the case of 2nd-order differencing), and so inversions can be very efficient: \(O(n_z \log n_z)\) for the FFTs and \(O(n_x)\) for tridiagonal inversion using the Thomas algorithm, where \(n_x\) and \(n_z\) are the number of grid points in the \(x\) and \(z\) directions respectively.
In the second approach, the full 2D system is solved. The available solvers for this approach are "multigrid", using a multigrid algorithm; "naulin", using an iterative scheme to correct the FFT-based approach; or "petsc", using KSP linear solvers from the PETSc library (this requires PETSc to be built with BOUT++).
The Laplacian class is defined in invert_laplace.hxx and solves problems formulated like equation (3). To use this class, first create an instance of it:
Laplacian *lap = Laplacian::create();
By default, this will use the options in a section called "laplace", but it can be given a different section as an argument. By default \(d = 1\), \(a = 0\), and \(c_1=c_2=1\). To set the values of these coefficients, there are the setCoefA(), setCoefC1(), setCoefC2(), setCoefC() (which sets both \(c_1\) and \(c_2\) to its argument), and setCoefD() methods:
Field2D a = ...;
lap->setCoefA(a);
lap->setCoefC(0.5);
Arguments can be Field2D, Field3D, or BoutReal values. Note that FFT solvers will use only the DC part of Field3D arguments.
Settings for the inversion can be set in the input file under the section laplace (default), or whichever settings section name was specified when the Laplacian class was created. Commonly used settings are listed in tables Table 6 to Table 9.
In particular, boundary conditions on the \(x\) boundaries can be set using the inner_boundary_flags and outer_boundary_flags variables, as detailed in table Table 8. Note that DC ("direct-current") refers to the \(k = 0\) Fourier component, and AC ("alternating-current") refers to \(k \neq 0\) Fourier components. Non-Fourier solvers use the AC options (and ignore the DC ones). Multiple boundary conditions can be selected by adding the required boundary condition flag values together. For example, inner_boundary_flags = 3 will set a Neumann boundary condition on both AC and DC components.
It is pertinent to note here that the boundary in BOUT++ is by default defined to be located half way between the first guard point and the first point inside the domain. For example, when a Dirichlet boundary condition is set using inner_boundary_flags = 0, 16, or 32, then the first guard point \(f_{-}\) will be set to \(f_{-} = 2v - f_{+}\), where \(f_{+}\) is the first grid point inside the domain, and \(v\) is the value to which the boundary is being set.
The global_flags, inner_boundary_flags, outer_boundary_flags and flags values can also be set from within the physics module, using setGlobalFlags, setInnerBoundaryFlags, setOuterBoundaryFlags and setFlags:
lap->setGlobalFlags(Global_Flags_Value);
lap->setInnerBoundaryFlags(Inner_Flags_Value);
lap->setOuterBoundaryFlags(Outer_Flags_Value);
lap->setFlags(Flags_Value);
Name | Meaning | Default value
type | Which implementation to use. See table Table 5 | cyclic
filter | Filter out modes above \((1 - \text{filter})\times k_{max}\), if using a Fourier solver | 0
maxmode | Filter modes with \(n >\) maxmode | MZ/2
all_terms | Include first derivative terms | true
nonuniform | Include corrections for non-uniform meshes (dx not constant) | Same as global non_uniform. See here
global_flags | Sets global inversion options. See table Laplace global flags | 0
inner_boundary_flags | Sets boundary conditions on inner boundary. See table Laplace boundary flags | 0
outer_boundary_flags | Sets boundary conditions on outer boundary. See table Laplace boundary flags | 0
flags | DEPRECATED. Sets global solver options and boundary conditions. See Laplace flags or invert_laplace.cxx | 0
include_yguards | Perform inversion in \(y\)-boundary guard cells | false
Flag | Meaning | Code variable
0 | No global option set |
1 | Zero DC component (Fourier solvers) | INVERT_ZERO_DC
2 | Set initial guess to 0 (iterative solvers) | INVERT_START_NEW
4 | Equivalent to outer_boundary_flags = 128, inner_boundary_flags = 128 | INVERT_BOTH_BNDRY_ONE
8 | Use 4th-order differencing (apparently not actually implemented anywhere!) | INVERT_4TH_ORDER
16 | Set constant component (\(k_x = k_z = 0\)) to zero | INVERT_KX_ZERO
Flag | Meaning | Code variable
---- | ------- | -------------
0 | Dirichlet (set boundary to 0) |
1 | Neumann on DC component (set gradient to 0) | INVERT_DC_GRAD
2 | Neumann on AC component (set gradient to 0) | INVERT_AC_GRAD
4 | Zero or decaying Laplacian on AC components (\(\frac{\partial^2}{\partial x^2}+k_z^2\) vanishes/decays) | INVERT_AC_LAP
8 | Use symmetry to enforce zero value or gradient (redundant for 2nd order now) | INVERT_SYM
16 | Set boundary condition to values in boundary guard cells of the second argument, x0, of Laplacian::solve(const Field3D &b, const Field3D &x0). May be combined with any combination of 0, 1 and 2, i.e. a Dirichlet or Neumann boundary condition set to values which are \(\neq 0\) or \(f(y)\) | INVERT_SET
32 | Set boundary condition to values in boundary guard cells of the RHS, b, in Laplacian::solve(const Field3D &b, const Field3D &x0). May be combined with any combination of 0, 1 and 2, i.e. a Dirichlet or Neumann boundary condition set to values which are \(\neq 0\) or \(f(y)\) | INVERT_RHS
64 | Zero or decaying Laplacian on DC components (\(\frac{\partial^2}{\partial x^2}\) vanishes/decays) | INVERT_DC_LAP
128 | Assert that there is only one guard cell in the \(x\)-boundary | INVERT_BNDRY_ONE
256 | DC value is set to parallel gradient, \(\nabla_\parallel f\) | INVERT_DC_GRADPAR
512 | DC value is set to inverse of parallel gradient, \(1/\nabla_\parallel f\) | INVERT_DC_GRADPARINV
1024 | Boundary condition for inner "boundary" of cylinder | INVERT_IN_CYLINDER
Flag | Meaning
---- | -------
1 | Zero-gradient DC on inner (X) boundary. Default is zero-value
2 | Zero-gradient AC on inner boundary
4 | Zero-gradient DC on outer boundary
8 | Zero-gradient AC on outer boundary
16 | Zero DC component everywhere
32 | Not used currently
64 | Set width of boundary to 1 (default is MXG)
128 | Use 4\(^{th}\)-order band solver (default is 2\(^{nd}\)-order tridiagonal)
256 | Attempt to set zero Laplacian AC component on inner boundary by combining 2nd- and 4th-order differencing at the boundary. Ignored if tridiagonal solver used
512 | Zero Laplacian AC on outer boundary
1024 | Symmetric boundary condition on inner boundary
2048 | Symmetric outer boundary condition
To perform the inversion, there is the solve method:
x = lap->solve(b);
There are also functions compatible with older versions of the BOUT++ code, but these are deprecated:
Field2D a, c, d;
invert_laplace(b, x, flags, &a, &c, &d);
and
x = invert_laplace(b, flags, &a, &c, &d);
The input b and output x are 3D fields, and the coefficients a, c, and d are pointers to 2D fields. To omit any of the three coefficients, set them to NULL.
Numerical implementation¶
We will here go through the implementation of the Laplacian inversion algorithm as it is performed in BOUT++. We would like to solve the following equation for \(f\)
BOUT++ neglects the parallel \(y\)-derivatives if \(g_{xy}\) and \(g_{yz}\) are non-zero when using the solvers Laplacian and LaplaceXZ. For these two solvers, equation (4) becomes (see Field-aligned coordinates for derivation)
Using tridiagonal solvers¶
Since there are no parallel \(y\)-derivatives if \(g_{xy}=g_{yz}=0\) (or if they are neglected), equation (4) will only contain derivatives with respect to \(x\) and \(z\) for the dependent variable. The hope is that the modes in the periodic \(z\) direction will decouple, so that in the end we only have to invert for the \(x\) coordinate.
If the modes decouple when Fourier transforming equation (5), we can use a tridiagonal solver to solve the equation for each Fourier mode.
Using the discrete Fourier transform
we see that the modes will not decouple if a term consists of a product of two terms which depend on \(z\), as this would give terms like
Thus, in order to use a tridiagonal solver, \(a\), \(c_1\), \(c_2\) and \(d\) cannot be functions of \(z\). Because of this, the \({{\boldsymbol{e}}}^z \partial_z c_2\) term in equation (5) is zero. Thus the tridiagonal solvers solve equations of the form
after using the discrete Fourier transform (see section Derivatives of the Fourier transform), we get
which gives
As nothing in equation (6) couples points in \(y\) together (since we neglected the \(y\)-derivatives if \(g_{xy}\) and \(g_{yz}\) were non-zero), we can solve \(y\)-plane by \(y\)-plane. Also, as the modes are decoupled, we may solve equation (6) \(k\) mode by \(k\) mode in addition to \(y\)-plane by \(y\)-plane.
The second-order centred approximation of the first and second derivatives in \(x\) reads
This gives
collecting point by point
We now introduce
which inserted in equation (7) gives
This can be formulated as the matrix equation
where the matrix \(A\) is tridiagonal. The boundary conditions are set by setting the first and last rows in \(A\) and \(B_z\).
The tridiagonal solvers previously required \(c_1 = c_2\) in equation (4), but from version 4.3 allow \(c_1 \neq c_2\).
Using PETSc solvers¶
When using PETSc, all terms of equation (5) are used when inverting to find \(f\). Note that when using PETSc, we do not Fourier decompose in the \(z\)-direction, so it may take substantially longer to find the solution. As with the tridiagonal solver, the fields are sliced in the \(y\)-direction, and a solution is found for one \(y\)-plane at a time.
Before solving, equation (5) is rewritten to the form \(A{{\boldsymbol{x}}} ={{\boldsymbol{b}}}\) (however, the full \(A\) is not expanded in memory). To do this, a row \(i\) in the matrix \(A\) is indexed from the bottom left of the two-dimensional field, \((0,0) = 0\), to the top right, \((\texttt{meshx}-1, \texttt{meshz}-1) = \texttt{meshx}\cdot\texttt{meshz}-1\). This is done in such a way that a row \(i\) in \(A\) increments by \(1\) for an increase of \(1\) in the \(z\)-direction, and by \(\texttt{meshz}\) for an increase of \(1\) in the \(x\)-direction, where the variables \(\texttt{meshx}\) and \(\texttt{meshz}\) represent the number of points of the field in the given direction.
Similarly to equation (7), the discretised version of equation (5) can be written. Doing the same for the full two dimensional case yields:
Second order approximation
Fourth order approximation
To determine the coefficient for each node point, it is convenient to introduce some quantities
In addition, we have:
Second order approximation (5point stencil)
Fourth order approximation (9point stencil)
This gives
The coefficients \(c_{i+m,j+n}\) are finally set according to the appropriate order of discretisation. The coefficients can be found in the file petsc_laplace.cxx.
Example: The 5-point stencil¶
Let us now consider the 5-point stencil for a mesh with \(3\) inner points in the \(x\)-direction and \(3\) inner points in the \(z\)-direction. The \(z\) direction will be periodic, and the \(x\) direction will have the boundaries halfway between the grid point and the first ghost point (see Fig. 10).
Applying the \(5\)-point stencil to point \(f_{22}\) of this mesh will result in Fig. 11.
We want to solve a problem of the form \(A{{\mathbf{x}}}={{\mathbf{b}}}\). We will order \({{\mathbf{x}}}\) in row-major order (so that \(z\) varies faster than \(x\)). Further, we put the inner \(x\) boundary points first in \({{\mathbf{x}}}\), and the outer \(x\) boundary points last in \({{\mathbf{x}}}\). The matrix problem for our mesh can then be written as in Fig. 12.
As we are using a row-major implementation, the global indices of the matrix will be as in Fig. 13.
Implementation internals¶
The Laplacian inversion code solves the equation:
where \(x\) and \(b\) are 3D variables, whilst \(a\), \(c_1\), \(c_2\) and \(d\) are 2D variables for the FFT solvers, or 3D variables otherwise. Several different algorithms are implemented for Laplacian inversion, and they differ between serial and parallel versions. Serial inversion can currently be done either using a tridiagonal solver (Thomas algorithm), or a band solver (allowing \(4^{th}\)-order differencing).
To support multiple implementations, a base class Laplacian is defined in include/invert_laplace.hxx. This defines a set of functions which all implementations must provide:
class Laplacian {
public:
  virtual void setCoefA(const Field2D &val) = 0;
  virtual void setCoefC(const Field2D &val) = 0;
  virtual void setCoefD(const Field2D &val) = 0;
  virtual const FieldPerp solve(const FieldPerp &b) = 0;
};
At minimum, all implementations must provide a way to set coefficients, and a solve function which operates on a single FieldPerp (X-Z) object at once. Several other functions are also virtual, so default code exists but can be overridden by an implementation.
For convenience, the Laplacian base class also defines a function to calculate the coefficients in a tridiagonal matrix:
void tridagCoefs(int jx, int jy, int jz, dcomplex &a, dcomplex &b,
dcomplex &c, const Field2D *c1coef = nullptr,
const Field2D *c2coef = nullptr,
const Field2D *d=nullptr);
For the user of the class, some static functions are defined:
static Laplacian* create(Options *opt = nullptr);
static Laplacian* defaultInstance();
The create function allows new Laplacian implementations to be created, based on options. To use the options in the [laplace] section, just use the default:
Laplacian* lap = Laplacian::create();
The code for the Laplacian base class is in src/invert/laplace/invert_laplace.cxx. The actual creation of new Laplacian implementations is done in the LaplaceFactory class, defined in src/invert/laplace/laplacefactory.cxx. This file includes all the headers for the implementations, and chooses which one to create based on the type setting in the input options. This factory therefore provides a single point of access to the underlying Laplacian inversion implementations.
Each of the implementations is in a subdirectory of src/invert/laplace/impls and is discussed below.
Serial tridiagonal solver¶
This is the simplest implementation, and is in src/invert/laplace/impls/serial_tri/
Serial band solver¶
This is a band solver which performs a \(4^{th}\)-order inversion. Currently this is only available when NXPE=1; when more than one processor is used in \(x\), the Laplacian algorithm currently reverts to \(3^{rd}\)-order.
SPT parallel tridiagonal¶
This is a reference code which performs the same operations as the serial code. To invert a single X-Z slice (FieldPerp object), data must pass from the innermost processor (mesh->PE_XIND = 0) to the outermost (mesh->PE_XIND = mesh->NXPE-1) and back again.
Some parallelism is achieved by running several inversions simultaneously, so while processor 1 is inverting Y=0, processor 0 is starting on Y=1. This works well as long as the number of slices to be inverted is greater than the number of X processors (MYSUB > mesh->NXPE). If MYSUB < mesh->NXPE then not all processors can be busy at once, and so efficiency will fall sharply. Fig. 14 shows the usage of 4 processors inverting a set of 3 poloidal slices (i.e. MYSUB=3).
PDD algorithm¶
This is the Parallel Diagonally Dominant (PDD) algorithm. It's very fast, but achieves this by neglecting some cross-processor terms. For ELM simulations, it has been found that these terms are important, so this method is not usually used.
Cyclic algorithm¶
This is now the default solver in both serial and parallel. It is an FFT-based solver using a cyclic reduction algorithm.
Multigrid solver¶
A solver using a geometric multigrid algorithm was introduced by projects in 2015 and 2016 of CCFE and the EUROfusion HLST.
Naulin solver¶
This scheme was introduced for BOUT++ by Michael Løiten in the CELMA code, and the iterative algorithm is detailed in his thesis [Løiten2017].
The iteration can be under-relaxed (see naulin_laplace.cxx for more details of the implementation). A factor \(0 < \text{underrelax\_factor} <= 1\) is used, with a value of 1 corresponding to no under-relaxation. If the iteration starts to diverge (the error increases on any step), the underrelax_factor is reduced by a factor of 0.9, and the iteration is restarted from the initial guess. The initial value of underrelax_factor, which underrelax_factor is set to at the beginning of each call to solve, can be set by the option initial_underrelax_factor (default is 1.0) in the appropriate section of the input file ([laplace] by default). Reducing the value of initial_underrelax_factor may speed up convergence in some cases. Some statistics from the solver are written to the output files to help in choosing this value. With <i> being the number of the LaplaceNaulin solver, counting in the order they are created in the physics model:
- naulinsolver<i>_mean_underrelax_counts gives the mean number of times underrelax_factor had to be reduced to get the iteration to converge. If this is much above 0, it is probably worth reducing initial_underrelax_factor.
- naulinsolver<i>_mean_its is the mean number of iterations taken to converge. Try to minimise this when adjusting initial_underrelax_factor.
[Løiten2017]  Michael Løiten, "Global numerical modeling of magnetized plasma in a linear device", 2017, https://celmaproject.github.io/.
LaplaceXY¶
Perpendicular Laplacian solver in X-Y.
In 2D (XY), the \(g_{xy}\) component can be dropped since this depends on integrated shear \(I\) which will cancel with the \(g_{xz}\) component. The \(z\) derivative is zero and so this simplifies to
The divergence operator in conservative form is
and so the perpendicular Laplacian in XY is
In field-aligned coordinates, the metrics in the \(y\) derivative term become:
In the LaplaceXY operator this is implemented in terms of fluxes at cell faces.
Notes:
 The ShiftXderivs option must be true for this to work, since it assumes that \(g^{xz} = 0\)
LaplaceXZ¶
This is a Laplacian inversion code in X-Z, similar to the Laplacian solver described in Laplacian inversion. The difference is in the form of the Laplacian equation solved, and the approach used to derive the finite difference formulae. The equation solved is:
where \(A\) and \(B\) are coefficients, \(b\) is the known RHS vector (e.g. vorticity), \(f\) is the unknown quantity to be calculated (e.g. potential), and \(\nabla_\perp f\) is the same as in equation (8), but with the parallel \(y\)-derivatives neglected even when \(g_{xy}\), \(g_{yz}\) and \(g_{xz}\) are non-vanishing. The Laplacian is written in conservative form like the LaplaceXY solver, and discretised in terms of fluxes through cell faces.
The header file is include/bout/invert/laplacexz.hxx. The solver is constructed by using the LaplaceXZ::create() function:
LaplaceXZ *lap = LaplaceXZ::create(mesh);
Note that a pointer to a Mesh object must be given, which for now is the global variable mesh. By default the options section laplacexz is used, so to set the type of solver created, set in the options
[laplacexz]
type = petsc # Set LaplaceXZ type
or on the command line: laplacexz:type=petsc.
The coefficients must be set using setCoefs. All coefficients must be set at the same time:
lap->setCoefs(1.0, 0.0);
Constants, Field2D or Field3D values can be passed. If the implementation doesn't support Field3D values then the average over \(z\) will be used as a Field2D value.
To perform the inversion, call the solve function:
Field3D vort = ...;
Field3D phi = lap->solve(vort, 0.0);
The second input to solve is an initial guess for the solution, which can be used by iterative schemes, e.g. using PETSc.
Implementations¶
The currently available implementations are:
- cyclic: This implementation assumes coefficients are constant in \(Z\), and uses FFTs in \(z\) and a complex tridiagonal solver in \(x\) for each \(z\) mode (the CyclicReduction solver). Code in src/invert/laplacexz/impls/cyclic/.
- petsc: This uses the PETSc KSP interface to solve a matrix with coefficients varying in both \(x\) and \(z\). To improve efficiency of direct solves, a different matrix is used for preconditioning. When the coefficients are updated, the preconditioner matrix is not usually updated. This means that LU factorisations of the preconditioner can be reused. Since this factorisation is a large part of the cost of direct solves, this should greatly reduce the run-time.
Test case¶
The code in examples/test-laplacexz is a simple test case for LaplaceXZ. First it creates a LaplaceXZ object:
LaplaceXZ *inv = LaplaceXZ::create(mesh);
For this test the petsc implementation is the default:
[laplacexz]
type = petsc
ksptype = gmres # Iterative method
pctype = lu # Preconditioner
By default the LU preconditioner is used. PETSc's built-in factorisation only works in serial, so for parallel solves a different package is needed. This is set using:
factor_package = superlu_dist
This setting can be "petsc" for the built-in (serial) code, or one of "superlu", "superlu_dist", "mumps", or "cusparse".
Then we set the coefficients:
inv->setCoefs(Field3D(1.0), Field3D(0.0));
Note that the scalars need to be cast to fields (Field2D or Field3D), otherwise the call is ambiguous. Using the PETSc command-line flag -mat_view ::ascii_info, information on the assembled matrix is printed:
$ mpirun -np 2 ./test-laplacexz -mat_view ::ascii_info
...
Matrix Object: 2 MPI processes
  type: mpiaij
  rows=1088, cols=1088
  total: nonzeros=5248, allocated nonzeros=5248
  total number of mallocs used during MatSetValues calls =0
    not using I-node (on process 0) routines
...
which confirms that the matrix element preallocation is setting the correct number of nonzero elements, since no additional memory allocation was needed.
A field to invert is created using FieldFactory:
Field3D rhs = FieldFactory::get()>create3D("rhs",
Options::getRoot(),
mesh);
which is currently set to a simple function in the options:
rhs = sin(x - z)
and then the system is solved:
Field3D x = inv>solve(rhs, 0.0);
Using the PETSc command-line flags -ksp_monitor to monitor the iterative solve, and -mat_superlu_dist_statprint to monitor SuperLU_dist, we get:
Nonzeros in L 19984
Nonzeros in U 19984
nonzeros in L+U 38880
nonzeros in LSUB 11900
NUMfact space (MB) sum(procs): L\U 0.45 all 0.61
Total highmark (MB): All 0.62 Avg 0.31 Max 0.36
Mat conversion(PETSc->SuperLU_DIST) time (max/min/avg):
4.69685e-05 / 4.69685e-05 / 4.69685e-05
EQUIL time 0.00
ROWPERM time 0.00
COLPERM time 0.00
SYMBFACT time 0.00
DISTRIBUTE time 0.00
FACTOR time 0.00
Factor flops 1.073774e+06 Mflops 222.08
SOLVE time 0.00
SOLVE time 0.00
Solve flops 8.245800e+04 Mflops 28.67
0 KSP Residual norm 5.169560044060e+02
SOLVE time 0.00
Solve flops 8.245800e+04 Mflops 60.50
SOLVE time 0.00
Solve flops 8.245800e+04 Mflops 49.86
1 KSP Residual norm 1.359142853145e-12
So after the initial setup and factorisation, the system is solved in one iteration using the LU direct solve.
As a test of reusing the preconditioner, the coefficients are then modified:
inv->setCoefs(Field3D(2.0), Field3D(0.1));
and solved again:
SOLVE time 0.00
Solve flops 8.245800e+04 Mflops 84.15
0 KSP Residual norm 5.169560044060e+02
SOLVE time 0.00
Solve flops 8.245800e+04 Mflops 90.42
SOLVE time 0.00
Solve flops 8.245800e+04 Mflops 98.51
1 KSP Residual norm 2.813291076609e+02
SOLVE time 0.00
Solve flops 8.245800e+04 Mflops 94.88
2 KSP Residual norm 1.688683980433e+02
SOLVE time 0.00
Solve flops 8.245800e+04 Mflops 87.27
3 KSP Residual norm 7.436784980024e+01
SOLVE time 0.00
Solve flops 8.245800e+04 Mflops 88.77
4 KSP Residual norm 1.835640800835e+01
SOLVE time 0.00
Solve flops 8.245800e+04 Mflops 89.55
5 KSP Residual norm 2.431147365563e+00
SOLVE time 0.00
Solve flops 8.245800e+04 Mflops 88.00
6 KSP Residual norm 5.386963293959e-01
SOLVE time 0.00
Solve flops 8.245800e+04 Mflops 93.50
7 KSP Residual norm 2.093714782067e-01
SOLVE time 0.00
Solve flops 8.245800e+04 Mflops 91.91
8 KSP Residual norm 1.306701698197e-02
SOLVE time 0.00
Solve flops 8.245800e+04 Mflops 89.44
9 KSP Residual norm 5.838501185134e-04
SOLVE time 0.00
Solve flops 8.245800e+04 Mflops 81.47
Note that this time there is no factorisation step, but the direct solve is still very effective.
Blob2d comparison¶
The example examples/blob2d-laplacexz is the same as examples/blob2d but with LaplaceXZ rather than Laplacian.
Tests on one processor: using the Boussinesq approximation, so that the matrix elements are not changed, the cyclic solver produces output:
1.000e+02 125 8.28e-01 71.8 8.2 0.4 0.6 18.9
2.000e+02 44 3.00e-01 69.4 8.1 0.4 2.1 20.0
whilst the PETSc solver with LU preconditioner outputs:
1.000e+02 146 1.15e+00 61.9 20.5 0.5 0.9 16.2
2.000e+02 42 3.30e-01 58.2 20.2 0.4 3.7 17.5
so the PETSc direct solver seems to take only slightly longer than the cyclic solver. For comparison, GMRES with Jacobi preconditioning gives:
1.000e+02 130 2.66e+00 24.1 68.3 0.2 0.8 6.6
2.000e+02 78 1.16e+00 33.8 54.9 0.3 1.1 9.9
and with SOR preconditioner:
1.000e+02 124 1.54e+00 38.6 50.2 0.3 0.4 10.5
2.000e+02 45 4.51e-01 46.8 37.8 0.3 1.7 13.4
When the Boussinesq approximation is not used, the PETSc solver with LU preconditioning, resetting the preconditioner every 100 solves gives:
1.000e+02 142 3.06e+00 23.0 70.7 0.2 0.2 6.0
2.000e+02 41 9.47e-01 21.0 72.1 0.3 0.6 6.1
i.e. around three times slower than the Boussinesq case. When using the Jacobi preconditioner:
1.000e+02 128 2.59e+00 22.9 70.8 0.2 0.2 5.9
2.000e+02 68 1.18e+00 26.5 64.6 0.2 0.6 8.1
For comparison, the Laplacian solver using the tridiagonal solver as preconditioner gives:
1.000e+02 222 5.70e+00 17.4 77.9 0.1 0.1 4.5
2.000e+02 172 3.84e+00 20.2 74.2 0.2 0.2 5.2
or with Jacobi preconditioner:
1.000e+02 107 3.13e+00 15.8 79.5 0.1 0.2 4.3
2.000e+02 110 2.14e+00 23.5 69.2 0.2 0.3 6.7
The LaplaceXZ solver does not appear to be dramatically faster in serial than the Laplacian solver when the matrix coefficients are modified every solve. When matrix elements are not modified then the solve time is competitive with the tridiagonal solver.
As a test, timing only the setCoefs call for the non-Boussinesq case gives:
1.000e+02 142 1.86e+00 83.3 9.5 0.2 0.3 6.7
2.000e+02 41 5.04e-01 83.1 8.0 0.3 1.2 7.3
so around 60% of the run-time is in setting the coefficients, and the remaining \(\sim 40\)% in the solve itself.
Differential operators¶
There are a huge number of possible ways to perform differencing in computational fluid dynamics, and BOUT++ is intended to be able to implement a large number of them. This means that the way differentials are handled internally is quite involved; see the developer's manual for the full gory details. Much of the time this detail is not all that important, and certainly not while learning to use BOUT++. Default options are therefore set which work most of the time, so you can start using the code without getting bogged down in these details.
In order to handle many different differencing methods and operations, many layers are used, each of which handles just part of the problem. The main division is between differencing methods (such as 4th-order central differencing), and differential operators (such as \(\nabla_\parallel\)).
Differencing methods¶
Methods are typically implemented on 5-point stencils (although exceptions are possible) and are divided into three categories:
- Central-differencing methods, for diffusion operators \(\frac{df}{dx}\), \(\frac{d^2f}{dx^2}\). Each method has a short code, and they currently include:
  - C2: 2\(^{nd}\) order \(f_{-1} - 2f_0 + f_1\)
  - C4: 4\(^{th}\) order \((-f_{-2} + 16f_{-1} - 30f_0 + 16f_1 - f_2)/12\)
  - S2: 2\(^{nd}\) order smoothing derivative
  - W2: 2\(^{nd}\) order CWENO
  - W3: 3\(^{rd}\) order CWENO
- Upwinding methods for advection operators \(v_x\frac{df}{dx}\):
  - U1: 1\(^{st}\) order upwinding
  - U2: 2\(^{nd}\) order upwinding
  - U3: 3\(^{rd}\) order upwinding
  - U4: 4\(^{th}\) order upwinding
  - C2: 2\(^{nd}\) order central
  - C4: 4\(^{th}\) order central
  - W3: 3\(^{rd}\) order Weighted Essentially Non-Oscillatory (WENO)
- Flux conserving and limiting methods for terms of the form \(\frac{d}{dx}(v_x f)\):
  - U1: 1\(^{st}\) order upwinding
  - C2: 2\(^{nd}\) order central
  - C4: 4\(^{th}\) order central
Special methods:
- FFT: Classed as a central method, Fourier transform method in the Z (axisymmetric) direction only. Currently available for first and second order central differences
- SPLIT: A flux method that splits into upwind and central terms \(\frac{d}{dx}(v_x f) = v_x\frac{df}{dx} + f\frac{dv_x}{dx}\)
WENO methods avoid overshoots (Gibbs phenomena) at sharp gradients such as shocks, but the simple 1st-order method has very large artificial diffusion. WENO schemes are a development of the ENO reconstruction schemes, which combine good handling of sharp-gradient regions with high accuracy in smooth regions.
The stencil-based methods are built from a kernel that combines the data in a stencil to produce a single BoutReal (note that upwind/flux methods take extra information about the flow, either a BoutReal or another stencil). It is not anticipated that the user would wish to apply one of these kernels directly, so documentation is not provided here for how to do so. If this is of interest, please look at include/bout/index_derivs.hxx. Internally, these kernel routines are combined within a functor struct that uses a BOUT_FOR loop over the domain to provide a routine that will apply the kernel to every point, calculating the derivative everywhere. These routines are registered in the appropriate DerivativeStore and identified by the direction of the differential, the staggering, the type (central/upwind/flux) and a key such as "C2". The typical user does not need to interact with this store; instead one can add the following to the top of your physics module:
#include <derivs.hxx>
to provide access to the following routines. These take care of selecting the appropriate method from the store and ensuring the input/output field locations are compatible.
Function | Formula
-------- | -------
DDX(f) | \(\partial f / \partial x\)
DDY(f) | \(\partial f / \partial y\)
DDZ(f) | \(\partial f / \partial z\)
D2DX2(f) | \(\partial^2 f / \partial x^2\)
D2DY2(f) | \(\partial^2 f / \partial y^2\)
D2DZ2(f) | \(\partial^2 f / \partial z^2\)
D4DX4(f) | \(\partial^4 f / \partial x^4\)
D4DY4(f) | \(\partial^4 f / \partial y^4\)
D4DZ4(f) | \(\partial^4 f / \partial z^4\)
D2DXDZ(f) | \(\partial^2 f / \partial x\partial z\)
D2DYDZ(f) | \(\partial^2 f / \partial y\partial z\)
VDDX(f, g) | \(f \partial g / \partial x\)
VDDY(f, g) | \(f \partial g / \partial y\)
VDDZ(f, g) | \(f \partial g / \partial z\)
FDDX(f, g) | \(\partial/\partial x( f \cdot g )\)
FDDY(f, g) | \(\partial/\partial y( f \cdot g )\)
FDDZ(f, g) | \(\partial/\partial z( f \cdot g )\)
By default the method used will be the one specified in the options input file (see Differencing methods), but most of these methods can take an optional std::string argument (or a DIFF_METHOD argument, to be deprecated), specifying exactly which method to use.
User registered methods¶
Note
The following may be considered advanced usage.
It is possible for the user to define their own differencing routines, either by supplying a stencil-using kernel or by writing their own functor that calculates the differential everywhere. It is then possible to register these methods with the derivative store (for any direction, staggering etc.). For examples, please look at include/bout/index_derivs.hxx to see how these approaches work.
Here is a verbose example showing how the C2 method is implemented.
DEFINE_STANDARD_DERIV(DDX_C2, "C2", 1, DERIV::Standard) {
  return 0.5*(f.p - f.m);
};
Here DEFINE_STANDARD_DERIV is a macro that acts on the kernel return 0.5*(f.p - f.m); and produces the functor that will apply the differencing method over an entire field. The macro takes several arguments:
- the first (DDX_C2) is the name of the generated functor. This needs to be unique and allows advanced users to refer to a specific derivative functor without having to go through the derivative store if desired.
- the second ("C2") is the string key used to refer to this specific method when registering/retrieving it from the derivative store.
- the third (1) is the number of guard cells required to use this method (i.e. here the stencil consists of three values: the field at the current point and one point either side). This can be 1 or 2.
- the fourth (DERIV::Standard) identifies the type of method, here a central method.
Alongside DEFINE_STANDARD_DERIV there are also DEFINE_UPWIND_DERIV, DEFINE_FLUX_DERIV and the staggered versions DEFINE_STANDARD_DERIV_STAGGERED, DEFINE_UPWIND_DERIV_STAGGERED and DEFINE_FLUX_DERIV_STAGGERED.
To register this method with the derivative store in X and Z with no staggering for both field types, we can then use the following code:
produceCombinations<Set<WRAP_ENUM(DIRECTION, X), WRAP_ENUM(DIRECTION, Z)>,
Set<WRAP_ENUM(STAGGER, None)>,
Set<TypeContainer<Field2D, Field3D>>,
Set<DDX_C2>>
someUniqueNameForDerivativeRegistration(registerMethod{});
For the common case where the user wishes to register the method in X, Y and Z and for both field types, we provide the helper macros REGISTER_DERIVATIVE and REGISTER_STAGGERED_DERIVATIVE, which could be used as REGISTER_DERIVATIVE(DDX_C2).
To simplify matters further we provide the REGISTER_STANDARD_DERIVATIVE, REGISTER_UPWIND_DERIVATIVE, REGISTER_FLUX_DERIVATIVE, REGISTER_STANDARD_STAGGERED_DERIVATIVE, REGISTER_UPWIND_STAGGERED_DERIVATIVE and REGISTER_FLUX_STAGGERED_DERIVATIVE macros that can define and register a stencil-using kernel in a single step. For example:
REGISTER_STANDARD_DERIVATIVE(DDX_C2, "C2", 1, DERIV::Standard) { return 0.5*(f.p - f.m); };
will define the DDX_C2 functor and register it with the derivative store using key "C2" for all three directions and both fields with no staggering.
Mixed second-derivative operators¶
Coordinate derivatives commute, as long as the coordinates are globally well-defined, i.e.
When using paralleltransform = shifted or paralleltransform = fci (see Parallel Transforms), we do not have globally well-defined coordinates. In those cases the coordinate systems are field-aligned, but the grid points are at constant toroidal angle. The field-aligned coordinates are defined locally, on planes of constant \(y\). There are different coordinate systems for each plane. However, within each local coordinate system the derivatives do commute. \(y\)-derivatives are taken in the local field-aligned coordinate system, so mixed derivatives are calculated as
D2DXDY(f) = DDX(DDY(f))
D2DYDZ(f) = DDZ(DDY(f))
This order is a choice; the alternative order is also possible. Using second-order central difference operators for the \(y\)-derivatives, we could calculate (not worrying about communications or boundary conditions here)
Field3D D2DXDY(Field3D f) {
  auto result{emptyFrom(f)};
  auto& coords = *f.getCoordinates();
  auto dfdx_yup = DDX(f.yup());
  auto dfdx_ydown = DDX(f.ydown());
  BOUT_FOR(i, f.getRegion()) {
    result[i] = (dfdx_yup[i.yp()] - dfdx_ydown[i.ym()]) / (2. * coords.dy[i]);
  }
  return result;
}
This would give equivalent results to the previous form [1], as yup and ydown give the values of f one grid point along the magnetic field in the local field-aligned coordinate system.
The \(x\)-\(z\) derivative is unaffected as it is taken entirely on a plane of constant \(y\) anyway. It is evaluated as
D2DXDZ(f) = DDZ(DDX(f))
As the z direction is periodic and the z grid is not split across processors, DDZ does not require any guard cells. By taking DDZ second, we do not have to communicate or set boundary conditions on the result of DDX or DDY before taking DDZ.
The derivatives in D2DXDY(f) are applied in two steps. First dfdy = DDY(f) is calculated; dfdy is communicated and has a boundary condition applied so that all the x-guard cells are filled. The boundary condition is free_o3 by default (3rd-order extrapolation into the boundary cells), but can be specified with the fifth argument to D2DXDY (see Boundary conditions for possible options). Second, DDX(dfdy) is calculated and returned from the function.
[1] Equivalent, but not exactly the same numerically. Expanding out the derivatives in second-order central-difference form shows that the two differ in the grid points at which they evaluate dx and dy. As long as the grid spacings are smooth this should not affect the order of accuracy of the scheme (?).
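On a uniform, doubly periodic grid (a simplification of BOUT++'s field-aligned mesh, not its API; the `Grid` type and helper functions below are invented for illustration), the two orderings of the mixed derivative agree to rounding error, which is the sense in which footnote [1] calls the forms equivalent:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

using Grid = std::vector<std::vector<double>>;

// Second-order central difference in the first (x) index, periodic wrap-around
Grid ddx(const Grid& f, double dx) {
    int nx = f.size(), ny = f[0].size();
    Grid r(nx, std::vector<double>(ny));
    for (int i = 0; i < nx; ++i)
        for (int j = 0; j < ny; ++j)
            r[i][j] = (f[(i + 1) % nx][j] - f[(i + nx - 1) % nx][j]) / (2. * dx);
    return r;
}

// Second-order central difference in the second (y) index, periodic wrap-around
Grid ddy(const Grid& f, double dy) {
    int nx = f.size(), ny = f[0].size();
    Grid r(nx, std::vector<double>(ny));
    for (int i = 0; i < nx; ++i)
        for (int j = 0; j < ny; ++j)
            r[i][j] = (f[i][(j + 1) % ny] - f[i][(j + ny - 1) % ny]) / (2. * dy);
    return r;
}

// Maximum pointwise difference between ddx(ddy(f)) and ddy(ddx(f))
double mixed_derivative_mismatch(const Grid& f, double dx, double dy) {
    Grid a = ddx(ddy(f, dy), dx);
    Grid b = ddy(ddx(f, dx), dy);
    double m = 0.;
    for (size_t i = 0; i < f.size(); ++i)
        for (size_t j = 0; j < f[0].size(); ++j)
            m = std::max(m, std::fabs(a[i][j] - b[i][j]));
    return m;
}
```

On the non-commuting field-aligned mesh the two orderings would differ by more than rounding, which is why the documentation fixes one order.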
Non-uniform meshes
Todo: examples/test-nonuniform seems to not work?
Setting non_uniform = true in the BOUT.inp options file enables corrections
to second derivatives in \(X\) and \(Y\). This correction is given by writing derivatives as:

\(\frac{\partial f}{\partial x} = \frac{1}{\Delta x}\frac{\partial f}{\partial i}\)

where \(i\) is the cell index number. The second derivative is therefore given by

\(\frac{\partial^2 f}{\partial x^2} = \frac{1}{\Delta x^2}\frac{\partial^2 f}{\partial i^2} + \frac{1}{\Delta x}\frac{\partial}{\partial i}\left(\frac{1}{\Delta x}\right)\frac{\partial f}{\partial i}\)

The correction factor \(\partial/\partial i(1/\Delta x)\) can be calculated automatically, but you can also specify d2x in the grid file, which is

\(\texttt{d2x} = \frac{\partial^2 x}{\partial i^2} = \frac{\partial \Delta x}{\partial i}\)

The correction factor is then calculated from d2x using

\(\frac{\partial}{\partial i}\left(\frac{1}{\Delta x}\right) = -\frac{1}{\Delta x^2}\frac{\partial \Delta x}{\partial i} = -\frac{\texttt{d2x}}{\Delta x^2}\)
Note: There is a separate switch in the Laplacian inversion code which enables or disables non-uniform mesh corrections.
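The correction comes from applying the chain rule in index space: with \(\Delta x = \partial x/\partial i\), the second \(x\)-derivative picks up a term proportional to \(\partial/\partial i(1/\Delta x)\). A minimal sketch (illustrative chain-rule code, not the BOUT++ implementation; the grid stretching and tolerance below are arbitrary choices):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Second x-derivative on a non-uniform grid using index-space derivatives:
//   d2f/dx2 = (1/dx^2) d2f/di2 + (1/dx) * d/di(1/dx) * df/di
// where i is the (uniform) cell index and dx = dx/di. The factor d/di(1/dx)
// plays the role of the non_uniform correction described in the text.
double d2fdx2(const std::vector<double>& x, const std::vector<double>& f, int i) {
    auto dxi = [&](int j) { return 0.5 * (x[j + 1] - x[j - 1]); }; // dx/di at j
    double dfdi = 0.5 * (f[i + 1] - f[i - 1]);                     // df/di
    double d2fdi2 = f[i + 1] - 2.0 * f[i] + f[i - 1];              // d2f/di2
    double dx = dxi(i);
    double corr = 0.5 * (1.0 / dxi(i + 1) - 1.0 / dxi(i - 1));     // d/di (1/dx)
    return d2fdi2 / (dx * dx) + corr * dfdi / dx;
}
```

For a smoothly stretched grid and a polynomial test function the result converges to the analytic second derivative at second order.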
General operators
These are differential operators valid in a general coordinate system.
where we have defined
not to be confused with the Christoffel symbol of the second kind (see the coordinates manual for more details).
Clebsch operators
Another set of operators assume that the equilibrium magnetic field is written in Clebsch form as
where
is the background equilibrium magnetic field.
Function | Formula
Grad_par | \(\partial^0_{\|} = \mathbf{b}_0\cdot\nabla = \frac{1}{\sqrt{g_{yy}}}{{\frac{\partial }{\partial y}}}\)
Div_par | \(\nabla^0_{\|}f = B_0\partial^0_{\|}(\frac{f}{B_0})\)
Grad2_par2 | \(\partial^2_{\|}\phi = \partial^0_{\|}(\partial^0_{\|}\phi) = \frac{1}{\sqrt{g_{yy}}}{{\frac{\partial}{\partial y}}}\left(\frac{1}{\sqrt{g_{yy}}}\right){{\frac{\partial \phi}{\partial y}}} + \frac{1}{g_{yy}}\frac{\partial^2\phi}{\partial y^2}\)
Laplace_par | \(\nabla_{\|}^2\phi = \nabla\cdot\mathbf{b}_0\mathbf{b}_0\cdot\nabla\phi = \frac{1}{J}{{\frac{\partial}{\partial y}}}\left(\frac{J}{g_{yy}}{{\frac{\partial \phi}{\partial y}}}\right)\)
Laplace_perp | \(\nabla_\perp^2 = \nabla^2 - \nabla_{\|}^2\)
Delp2 | Perpendicular Laplacian, neglecting all \(y\) derivatives. The Laplacian solver performs the inverse operation
brackets | Poisson brackets. The Arakawa option neglects the parallel (\(y\)) derivatives if \(g_{xy}\) and \(g_{yz}\) are non-zero
We have that
In a Clebsch coordinate system \({{\boldsymbol{B}}} = \nabla z \times \nabla x = \frac{1}{J}{{\boldsymbol{e}}}_y\), \(g_{yy} = {{\boldsymbol{e}}}_y\cdot{{\boldsymbol{e}}}_y = J^2B^2\), and so the \(\nabla y\) term cancels out:
The bracket operators
The bracket operator brackets(phi, f, method) aims to differentiate equations of the form
Notice that when we use the Arakawa scheme, \(y\)-derivatives are neglected if \(g_{xy}\) and \(g_{yz}\) are non-zero. Examples of usage of the brackets can be found in examples/MMS/advection or examples/blob2d.
Finite volume, conservative finite difference methods
These schemes aim to conserve the integral of the advected quantity over the domain. If \(f\) is being advected, then
is conserved, where the index \(i\) refers to cell index. This is done by calculating fluxes between cells: whatever leaves one cell is added to another. There are several caveats to this:
- Boundary fluxes can still lead to changes in the total, unless no-flow boundary conditions are used
- When using an implicit time integration scheme, such as the default PVODE / CVODE, the total is not guaranteed to be conserved, but may vary depending on the solver tolerances.
- There will always be a small rounding error, even with double precision.
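The flux-form idea can be sketched in one dimension (generic first-order upwind code, not FV::Div_par; the function names and no-flow treatment are illustrative):

```cpp
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// One conservative update step for df/dt = -d(v f)/dx on a 1D grid of n cells.
// Fluxes live on the n+1 cell faces; each interface flux is subtracted from
// one cell and added to its neighbour, so with the boundary fluxes held at
// zero (no-flow) the discrete total is conserved to rounding error.
std::vector<double> advect_step(const std::vector<double>& f, double v,
                                double dx, double dt) {
    int n = f.size();
    std::vector<double> flux(n + 1, 0.0); // face fluxes; ends stay 0 (no-flow)
    for (int i = 1; i < n; ++i) {
        // first-order upwind value at face i (between cells i-1 and i)
        double fface = (v > 0) ? f[i - 1] : f[i];
        flux[i] = v * fface;
    }
    std::vector<double> result(n);
    for (int i = 0; i < n; ++i)
        result[i] = f[i] - dt / dx * (flux[i + 1] - flux[i]);
    return result;
}

// Discrete total of the advected quantity
double total(const std::vector<double>& f) {
    return std::accumulate(f.begin(), f.end(), 0.0);
}
```

With non-zero boundary fluxes the total would change, which is the first caveat above.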
The methods can be used by including the header:

#include "bout/fv_ops.hxx"

Note: The methods are defined in the namespace FV. Some methods (those with templates) are defined in the header, but others are defined in src/mesh/fv_ops.cxx.
Parallel divergence Div_par
This function calculates the divergence of a flow in \(y\) (parallel to the magnetic field), given the advection velocity.

template<typename CellEdges = MC>
const Field3D Div_par(const Field3D &f_in, const Field3D &v_in,
                      const Field3D &a, bool fixflux=true);

where f_in is the quantity being advected (e.g. density), and v_in is the parallel advection velocity. The third input, a, is the maximum wave speed, which multiplies the dissipation term in the method.

ddt(n) = FV::Div_par( n, v, cs );

By default the MC slope limiter is used to calculate cell edges, but this can be changed at compile time, e.g.:

ddt(n) = FV::Div_par<FV::Fromm>( n, v, cs );

A list of available limiters is given in section Slope limiters below.
Example and convergence test
The example code examples/finite-volume/fluid/ solves the Euler equations for a 1D adiabatic fluid, using FV::Div_par() for the advection terms.
where \(n\) is the density, \(p\) is the pressure, and \(nv_{\|}\) is the momentum in the direction parallel to the magnetic field. The operator \(\nabla_{\|}\) represents the divergence of a parallel flow (Div_par), and \(\partial_{\|} = \mathbf{b}\cdot\nabla\) is the gradient in the parallel direction.
There is a convergence test using the Method of Manufactured Solutions (MMS) for this example. See section Method of Manufactured Solutions for details of the testing method. Running the runtest script should produce a convergence graph.
Parallel diffusion
The parallel diffusion operator calculates \(\nabla_{\|}\left[k\partial_{\|}\left(f\right)\right]\)

const Field3D Div_par_K_Grad_par(const Field3D &k, const Field3D &f,
                                 bool bndry_flux=true);

This is done by calculating the flux \(k\partial_{\|}\left(f\right)\) on cell boundaries using central differencing.
Advection in 3D
This operator calculates \(\nabla\cdot\left( n \mathbf{v} \right)\) where \(\mathbf{v}\) is a 3D vector. It is written in flux form by discretising the expression
Like the Div_par operator, a slope limiter is used to calculate the value of the field \(n\) on cell boundaries. By default this is the MC method, but it can be set as a template parameter:

template<typename CellEdges = MC>
const Field3D Div_f_v(const Field3D &n, const Vector3D &v, bool bndry_flux);
Slope limiters
Here limiters are implemented as slope limiters: the value of a given quantity is calculated at the faces of a cell based on the cell-centre values. Several slope limiters are defined in fv_ops.hxx:
- Upwind: first-order upwinding, in which the left and right edges of the cell are the same as the centre (zero slope).
- Fromm: a second-order scheme which is a fixed weighted average of upwinding and central difference schemes.
- MinMod: a second-order scheme which switches between the upwind and downwind gradient, choosing the one with the smallest absolute value. If the gradients have different signs, as at a maximum or minimum, then the method reverts to first-order upwinding (zero slope).
- MC (Monotonised Central): a second-order scheme which switches between central, upwind and downwind differencing in a similar way to MinMod. It has smaller dissipation than MinMod, so is the default.
Staggered grids
By default, all quantities in BOUT++ are defined at cell centre, and all derivative methods map cell-centred quantities to cell centres. Switching on staggered grid support in BOUT.inp:

StaggerGrids = true

allows quantities to be defined on cell boundaries. Functions such as DDX now have to handle all possible combinations of input and output locations, in addition to the possible derivative methods.
Several things are not currently implemented, which probably should be:
- Only 3D fields currently have a cell location attribute. The location (cell centre etc.) of 2D fields is ignored at the moment. The rationale for this is that 2D fields are assumed to be slowly-varying equilibrium quantities for which it won't matter so much. Still, this needs to be improved in future.
- Twist-shift and X shifting still treat all quantities as cell-centred.
- No boundary condition functions yet account for cell location.
Currently, BOUT++ does not support values at cell corners; values can only be defined at cell centre, or at the lower X, Y, or Z boundaries.
Once staggered grids are enabled, two types of stencil are needed: those which map between the same cell location (e.g. cell-centred values to cell-centred values), and those which map to different locations (e.g. cell-centred to lower X).
Central differencing using 4-point stencil:

Input | Output | Actions
CENTRE | CENTRE | Central stencil
CENTRE | XLOW | Lower staggered stencil
XLOW | CENTRE | Upper staggered stencil
XLOW | Any | Staggered stencil to CENTRE, then interpolate
CENTRE | Any | Central stencil, then interpolate
Any | Any | Interpolate to centre, use central stencil, then interpolate

Table: DDX actions depending on input and output locations. Uses first match.
Derivatives of the Fourier transform
By using the definition of the Fourier transform, we have
this gives
where we have used that \(f(x,y,\pm\infty)=0\) in order to have a well-defined Fourier transform. This means that
In our case, we are dealing with periodic boundary conditions. Strictly speaking, the Fourier transform does not exist in such cases, but it is possible to define a Fourier transform in a limit which in the end leads to the Fourier series [2]. By discretising the spatial domain, it is no longer possible to represent an infinite number of Fourier modes, but only \(N+1\) modes, where \(N\) is the number of points (this includes the modes with negative frequencies, and the zeroth offset mode). For the discrete Fourier transform, we have
where \(k\) is the mode number and \(N\) is the number of points in \(z\). If we call the sampling points of \(z\) \(z_Z\), where \(Z = 0, 1 \ldots N-1\), we have that \(z_Z = Z \text{d}z\). As our domain goes from \([0, 2\pi[\), we have (since we have one less line segment than points) \(\text{d}z (N-1) = L_z = 2\pi - \text{d}z\), which gives \(\text{d}z = \frac{2\pi}{N}\). Inserting this in equation (10) yields
The discrete version of equation (9) thus gives
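The discrete \(z\)-derivative described here can be sketched with a naive DFT (illustrative only; BOUT++ uses an FFT library, and the mode-number bookkeeping below, with signed mode numbers and the unmatched Nyquist mode dropped, is a standard choice assumed by the author):

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <vector>

// Spectral z-derivative on a periodic grid of N points covering [0, 2*pi):
// forward DFT, multiply mode k by i*k, inverse DFT. A naive O(N^2) DFT is
// used here for clarity.
std::vector<double> spectral_ddz(const std::vector<double>& f) {
    using cplx = std::complex<double>;
    const int N = f.size();
    const cplx I(0.0, 1.0);
    std::vector<cplx> F(N, 0.0);
    for (int k = 0; k < N; ++k)          // forward DFT
        for (int n = 0; n < N; ++n)
            F[k] += f[n] * std::exp(-2.0 * M_PI * I * double(k * n) / double(N));
    std::vector<double> result(N, 0.0);
    for (int n = 0; n < N; ++n) {        // inverse DFT of i*k * F[k]
        cplx sum = 0.0;
        for (int k = 0; k < N; ++k) {
            // signed mode number: modes above N/2 represent negative frequencies
            int ks = (k <= N / 2) ? k : k - N;
            if (2 * k == N) ks = 0;      // drop the unmatched Nyquist mode
            sum += I * double(ks) * F[k]
                   * std::exp(2.0 * M_PI * I * double(k * n) / double(N));
        }
        result[n] = sum.real() / N;
    }
    return result;
}
```

For a single resolved mode the derivative is exact to rounding error, which is the attraction of spectral differentiation in the periodic \(z\) direction.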
[2] For more detail see Bracewell, R. N., "The Fourier Transform and Its Applications", 3rd Edition, chapter 10.
Algebraic operators
BOUT++ provides a wide variety of algebraic operators acting on fields. The algebraic operators are listed in Table 11. For a completely up-to-date list, see the Non-member functions part of field2d.hxx, field3d.hxx, and fieldperp.hxx.
Name | Description
min(f, allpe=true, region) | Minimum (optionally over all processes)
max(f, allpe=true, region) | Maximum (optionally over all processes)
pow(lhs, rhs, region) | \(\mathtt{lhs}^\mathtt{rhs}\)
sqrt(f, region) | \(\sqrt{f}\)
abs(f, region) | \(|f|\)
exp(f, region) | \(e^f\)
log(f, region) | \(\log(f)\)
sin(f, region) | \(\sin(f)\)
cos(f, region) | \(\cos(f)\)
tan(f, region) | \(\tan(f)\)
sinh(f, region) | \(\sinh(f)\)
cosh(f, region) | \(\cosh(f)\)
tanh(f, region) | \(\tanh(f)\)
floor(f, region) | Returns a field with the floor of f at each point
filter(f, n, region) | Calculate the amplitude of the Fourier mode in the z-direction with mode number n
lowpass(f, nmax, region) | Remove Fourier modes (in the z-direction) with mode number higher than nmax
lowpass(f, nmax, nmin, region) | Remove Fourier modes (in the z-direction) with mode number higher than nmax or lower than nmin
shiftZ(f, angle, region) | Rotate f by angle in the z-direction; \(\mathtt{angle}/2\pi\) is the fraction of the domain to shift by, so angle is in radians if the total size of the domain is \(2\pi\)
DC(f, region) | The average in the z-direction of f (DC stands for direct current, i.e. the constant part of f as opposed to the AC, alternating current, or fluctuating part)
These operators take a region argument, whose values can be [3] (see Iterating over fields):
- RGN_ALL, which is the whole mesh;
- RGN_NOBNDRY, which skips all boundaries;
- RGN_NOX, which skips the x boundaries;
- RGN_NOY, which skips the y boundaries.
The default value for the region argument is RGN_ALL, which should work in all cases. However, the region argument can be used for optimisation, to skip calculations in guard cells if it is known that those results will not be needed (for example, if no derivatives of the result will be calculated). Since these operators can be relatively expensive compared to addition, subtraction, or multiplication, this can be a useful performance improvement.
[3] More regions may be added in future, for example to act on only subsets of the physical domain.
Staggered grids
Until now all quantities have been cell-centred, i.e. both velocities and conserved quantities were defined at the same locations. This is because these methods are simple, and this was the scheme used in the original BOUT. This class of methods can however be susceptible to grid-grid oscillations, and so most shock-capturing schemes involve densities and velocities (for example) which are not defined at the same location: their grids are staggered.
By default BOUT++ runs with all quantities at cell centre. To enable staggered grids, set:

StaggerGrids = true

in the top section of the BOUT.inp file. The test-staggered example illustrates how to use staggered grids in BOUT++.
There are four possible locations in a grid cell where a quantity can be defined in BOUT++: centre, lower X, lower Y, and lower Z. These are illustrated in Fig. 17.
To specify the location of a variable, use the method Field3D::setLocation() with one of the CELL_LOC locations: CELL_CENTRE, CELL_XLOW, CELL_YLOW, or CELL_ZLOW.
The key lines in the test-staggered example which specify the locations of the evolving variables are:

Field3D n, v;

int init(bool restart) {
  v.setLocation(CELL_YLOW); // Staggered relative to n
  SOLVE_FOR(n, v);
  ...

which makes the velocity v staggered to the lower side of the cell in Y, whilst the density n remains cell-centred.
Note
If BOUT++ was configured with checks enabled, Field3D::setLocation() will throw an exception if you don't have staggered grids turned on and try to set the location to something other than CELL_CENTRE. If you want to be able to run your model with and without staggered grids, you should do something like:

if (v.getMesh()->StaggerGrids) {
  v.setLocation(CELL_YLOW);
}

Compiling BOUT++ with checks turned off will instead cause Field3D::setLocation() to silently set the location to CELL_CENTRE if staggered grids are off, regardless of what you pass it.
Arithmetic operations can only be performed between variables with the same location. When performing a calculation at one location, to include a variable from a different location, use the interpolation routines. Include the header file

#include <interpolation.hxx>

then use the interp_to(field, location, region) function. For example, given a CELL_CENTRE field n and a CELL_YLOW field v, to calculate n*v at CELL_YLOW, call interp_to(n, CELL_YLOW)*v, whose result will be CELL_YLOW as n is interpolated.
Note
The region argument is optional but useful (see Iterating over fields for more on regions). The default RGN_ALL reproduces the historical behaviour of BOUT++, which communicates before returning the result from interp_to. Communication is necessary because the result of interpolation in the guard cells depends on data from another process (except, currently, in the case of interpolation in the z-direction, which can be done without communication because all the z-points are on the same process).
Using RGN_NOBNDRY, no communication is performed (so interp_to is faster, potentially significantly faster when using many processes) and all the guard cells are invalid. Whichever region is used, the boundary guard cells are invalid, since no boundary condition is applied in interp_to. If the guard cells are needed (e.g. to calculate a derivative), a boundary condition must be applied explicitly to the result.
RGN_NOX and RGN_NOY currently have identical behaviour to RGN_ALL, because at present BOUT++ has no functions for single-direction communication which could in principle be used in these cases (if the combination of region and direction of interpolation allows it). x- or y-interpolation can never be calculated in guard cells without communication, because the corner guard cells are never valid.
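What such an interpolation does can be sketched generically (this is not the BOUT++ interp_to implementation; the fourth-order face stencil and the periodic wrap-around are illustrative assumptions):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Interpolate a cell-centred quantity to the lower cell faces on a periodic
// 1D grid, so that it can multiply a face-centred (staggered) quantity.
// Fourth-order interpolation: face[i] lies midway between centres i-1 and i.
std::vector<double> to_lower_face(const std::vector<double>& f) {
    int n = f.size();
    auto idx = [n](int i) { return ((i % n) + n) % n; }; // periodic wrap
    std::vector<double> face(n);
    for (int i = 0; i < n; ++i)
        face[i] = (9.0 * (f[idx(i - 1)] + f[i])
                   - (f[idx(i - 2)] + f[idx(i + 1)])) / 16.0;
    return face;
}
```

On a parallel, non-periodic mesh the wrap-around accesses would instead read guard cells, which is why interp_to may need to communicate first.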
Differential operators by default return fields which are defined at the same location as their inputs, so here Grad_par(v) would be CELL_YLOW. If this is not what is wanted, give the location of the result as an additional argument: Grad_par(v, CELL_CENTRE) uses staggered differencing to produce a result which is defined at the cell centres. As with the arithmetic operators, if you ask for the result to be staggered in a different direction from the input, then the differencing will be to cell centre and the result then interpolated. For example, Grad_par(v, CELL_XLOW) would first perform staggered differencing from CELL_YLOW to get a result at CELL_CENTRE, and then interpolate the result to CELL_XLOW.
Advection operators which take two arguments return a result which is defined at the location of the field being advected. For example, Vpar_Grad_par(v, f) calculates \(v \nabla_{\|} f\) and returns a result at the same location as f. If v and f are defined at the same locations then centred differencing is used; if one is centred and the other staggered then staggered differencing is used; and if both are staggered to different locations then the behaviour is less well defined (don't do it). As with other differential operators, the required location of the result can be given as an optional argument.
Eigenvalue solver
By using the SLEPc library, BOUT++ can be used as an eigenvalue solver to find the eigenvectors and eigenvalues of sets of equations.
Configuring with SLEPc
The BOUT++ interface has been tested with SLEPc version 3.4.3, itself compiled with PETSc 3.4.2. SLEPc version 3.4 should work, but other versions are not yet supported.
SLEPc options
Time derivatives can be taken directly from the RHS function, or by advancing the simulation in time by a relatively large increment. This second method acts to damp high-frequency components.
Non-local heat flux models
Spitzer-Harm heat flux
The Spitzer-Harm heat flux \(q_{SH}\) is calculated using
where \(n_e\) is the electron density in \(m^{-3}\), \(T_e\) is the electron temperature in eV, \(\kappa_0 = 13.58\), and \(Z\) is the average ion charge. The resulting expression is in units of \(eV/m^2/s\).
The thermal collision time \(\tau_{ei,T} = \lambda_{ei,T} / v_{T}\) is calculated using the thermal mean free path and thermal velocity:
where it is assumed that \(n_i = n_e\), and the following are used:
Note: If comparing to online notes, \(\kappa_0\frac{Z+0.24}{Z+4.2} \simeq 3.2\); a different definition of the collision time \(\tau_{ei}\) is used here, but the other factors are included so that the heat flux \(q_{SH}\) is the same here as in those notes.
SNB model
The SNB model calculates a correction to the Spitzer-Harm heat flux, solving a diffusion equation for each of a set of energy groups with normalised energy \(\beta = E_g / eT_e\), where \(E_g\) is the energy of the group.
where \(\nabla_{\|}\) is the divergence of a parallel flux, and \(\partial_{\|}\) is a parallel gradient. \(U_g = W_g q_{SH}\) is the contribution to the Spitzer-Harm heat flux from a group:
The modified mean free paths for each group are:
From the quantities \(H_g\) for each group, the SNB heat flux is:
In fluid models we actually want the divergence of the heat flux, rather than the heat flux itself. We therefore rearrange to get:
and so calculate the divergence of the heat flux as:
The Helmholtz-type equation along the magnetic field is solved using a tridiagonal solver. The parallel divergence term is currently split into a second derivative term and a first derivative correction:
Using the SNB model
To use the SNB model, first include the header:

#include <bout/snb.hxx>

then create an instance:

HeatFluxSNB snb;

By default this will use options in a section called "snb", but if needed a different Options& section can be given to the constructor:

HeatFluxSNB snb(Options::root()["mysnb"]);

The options are listed in the table below.
Name | Meaning | Default value
beta_max | Maximum energy group to consider (multiple of eT) | 10
ngroups | Number of energy groups | 40
r | Scaling down the electron-electron mean free path | 2
The divergence of the heat flux can then be calculated:

Field3D Div_q = snb.divHeatFlux(Te, Ne);

where Te is the temperature in eV, and Ne is the electron density in \(m^{-3}\). The result is in eV per \(m^3\) per second, so multiplying by \(e=1.602\times 10^{-19}\) will give Watts per cubic meter.
To compare to the Spitzer-Harm result, pass a pointer to a Field3D as the third argument. This field will be set to the Spitzer-Harm value:

Field3D Div_q_SH;
Field3D Div_q = snb.divHeatFlux(Te, Ne, &Div_q_SH);

This is used in the examples discussed below.
Example: Linear perturbation
The examples/conduction-snb example calculates the heat flux for a given density and temperature profile, comparing the SNB and Spitzer-Harm fluxes. The sinusoidal.py case uses a periodic domain of length 1 meter and a small (0.01 eV) perturbation to the temperature. The temperature is varied from 1 eV to 1 keV, so that the mean free path varies. This is done for different SNB settings, changing the number of groups and the maximum \(\beta\):

$ python sinusoidal.py

This should output a file snb-sinusoidal.png and display the results, shown in figure Fig. 18.
Example: Nonlinear heat flux
A nonlinear test is also included in examples/conduction-snb: a step function in temperature from around 200 eV to 950 eV over a distance of around 0.1 mm, at an electron density of \(5\times 10^{26}\) per cubic meter:

$ python step.py

This should output a file snb-step.png, shown in figure Fig. 19.
Field-aligned coordinates
Author: B. Dudson§, M. V. Umansky, L. C. Wang, X. Q. Xu, L. L. LoDestro (§Department of Physics, University of York, UK; Lawrence Livermore National Laboratory, USA; IFTS, China)
Introduction
This manual covers the field-aligned coordinate system used in many BOUT++ tokamak models, along with useful derivations and expressions.
Orthogonal toroidal coordinates
Starting with an orthogonal toroidal coordinate system \(\left(\psi, \theta, \zeta\right)\), where \(\psi\) is the poloidal flux, \(\theta\) the poloidal angle (from \(0\) to \(2\pi\)), and \(\zeta\) the toroidal angle (also \(0\) to \(2\pi\)), the magnetic field \({\boldsymbol{B}}\) can be expressed as
The magnitudes of the unit vectors are
where \({h_\theta}\) is the poloidal arc length per radian. The coordinate system is right-handed, so \(\hat{{\boldsymbol{e}}}_\psi\times\hat{{\boldsymbol{e}}}_\theta = \hat{{\boldsymbol{e}}}_\zeta\), \(\hat{{\boldsymbol{e}}}_\zeta\times\hat{{\boldsymbol{e}}}_\psi = \hat{{\boldsymbol{e}}}_\theta\) and \(\hat{{\boldsymbol{e}}}_\theta\times\hat{{\boldsymbol{e}}}_\zeta = \hat{{\boldsymbol{e}}}_\psi\). The covariant metric coefficients are
and the magnitudes of the reciprocal vectors are therefore
Because the coordinate system is orthogonal, \(g^{ii} = 1/g_{ii}\), and so the cross-products can be calculated as
Similarly,
Field-aligned coordinates
In order to efficiently simulate (predominantly) field-aligned structures, grid points are placed in a field-aligned coordinate system. We define \(\sigma_{B\theta} \equiv {B_{\text{pol}}}/ \left|{B_{\text{pol}}}\right|\), i.e. the sign of the poloidal field. The new coordinates \(\left(x,y,z\right)\) are defined by:
where \(\nu\) is the local field-line pitch, given by
where \(F={B_{\text{tor}}}R\) is a function only of \(\psi\) (sometimes called the poloidal current function).
The coordinate system is chosen so that \(x\) increases radially outwards, from the plasma to the wall. The sign of the toroidal field \({B_{\text{tor}}}\) can then be either positive or negative.
The contravariant basis vectors are therefore
The term in square brackets is the integrated local shear:
Magnetic field
The magnetic field is given in Clebsch form by:
The contravariant components of this are then
i.e. \({\boldsymbol{B}}\) can be written as
and the covariant components calculated using \(g_{ij}\) as
The unit vector in the direction of equilibrium \({\boldsymbol{B}}\) is therefore
Jacobian and metric tensors
The Jacobian of this coordinate system is
which can be either positive or negative, depending on the sign of \({B_{\text{pol}}}\). The contravariant metric tensor is given by:
and the covariant metric tensor:
Differential operators
The derivative of a scalar field \(f\) along the unperturbed magnetic field \({\boldsymbol{b}}_0\) is given by
whilst the parallel divergence is given by
Using equation (25), the Laplacian operator is given by
Using equation (24) for \(\nabla^2x = G^x\) etc, the values are
Neglecting some parallel derivative terms, the perpendicular Laplacian can be written:
The second derivative along the equilibrium field
A common expression (the Poisson bracket in reduced MHD) is (from equation (29)):
The perpendicular nabla operator:
J x B in field-aligned coordinates
Components of the magnetic field in field-aligned coordinates:
and
Calculate the current \({\boldsymbol{J}}= \frac{1}{\mu_0}\nabla\times {\boldsymbol{B}}\)
since \({B_{\text{tor}}}R\) is a flux-surface quantity, and \({\boldsymbol{B}}\) is axisymmetric.
The second term can be simplified, again using the fact that \({B_{\text{tor}}}R\) is constant on flux-surfaces:
From these, calculate covariant components:
Calculate \({\boldsymbol{J}}\times{\boldsymbol{B}}\) using
gives
Covariant components of \(\nabla P\):
and contravariant:
Hence equating contravariant x components of \({\boldsymbol{J}}\times{\boldsymbol{B}}= \nabla P\),
Use this to calculate \({h_\theta}\) profiles (need to fix \({h_\theta}\) at one radial location).
Close to X-points, the above expression becomes singular, so a better way to write it is:
For solving force balance by adjusting \(P\) and \(f\) profiles, the form used is
A quick way to calculate \(f\) is to rearrange this to:
and then integrate this using LSODE.
Parallel current
and from equation (12):
since \(J_{\|} = b^yJ_y\),
Curvature
For reduced MHD, we need to calculate the curvature term \({\boldsymbol{b}}\times{\boldsymbol{\kappa}}\), where \({\boldsymbol{\kappa}} = \left({\boldsymbol{b}}\cdot\nabla\right){\boldsymbol{b}}= -{\boldsymbol{b}}\times\left(\nabla\times{\boldsymbol{b}}\right)\). Rearranging, this becomes:
Components of \(\nabla\times{\boldsymbol{b}}\) are:
giving:
therefore,
Using equation (13):
we can rewrite the above components as:
Curvature from div(b/B)
The vector \({\boldsymbol{b}}\times{\boldsymbol{\kappa}}\) is an approximation of
so it can be derived from the original expression. Using the contravariant components of \({\boldsymbol{b}}\), and the curl operator in curvilinear coordinates (see appendix):
This can be simplified using
to give
The first and second terms in \(\left({\boldsymbol{b}}\times{\boldsymbol{\kappa}}\right)^z\) almost cancel, so by expanding out \(\nu\) a better expression is
Curvature of a single line
The curvature vector can be calculated from the field-line toroidal coordinates \(\left(R,Z,\phi\right)\) as follows. The line element is given by
Hence the tangent vector is
where \(s\) is the distance along the field-line. From this, the curvature vector is given by
i.e.
We want the components of \({\boldsymbol{b}}\times{\boldsymbol{\kappa}}\), and since the vector \({\boldsymbol{b}}\) is just the tangent vector \({\boldsymbol{T}}\) above, this can be written using the cross-products
This vector must then be dotted with \(\nabla\psi\), \(\nabla\theta\), and \(\nabla\phi\). This is done by writing these vectors in cylindrical coordinates:
An alternative is to use
and that the tangent vector \({\boldsymbol{T}} = {\boldsymbol{b}}\). This gives
and so because \(d\phi / ds = {B_{\text{tor}}}/ \left(RB\right)\)
Taking the cross-product of the tangent vector with the curvature in equation (15) above gives
The components in fieldaligned coordinates can then be calculated:
Curvature in toroidal coordinates
In toroidal coordinates \(\left(\psi,\theta,\phi\right)\), the \({\boldsymbol{b}}\) vector is
The curl of this vector is
where \(1/\sqrt{g} = {B_{\text{pol}}}/{h_\theta}\). Therefore, in terms of unit vectors:
psi derivative of the B field
This is needed to calculate the magnetic shear, and is one way to get the curvature. The simplest way is to use finite differencing, but there is another way using local derivatives (implemented using DCT).
Using
we get
and so
The derivatives of \({B_{\text{pol}}}\) in \(R\) and \(Z\) are:
For the toroidal field, \({B_{\text{tor}}}= f/R\)
As above, \({\frac{\partial R}{\partial \psi}} = \nabla R \cdot\nabla\psi / \left(R{B_{\text{pol}}}\right)^2\), and since \(\nabla R\cdot\nabla R = 1\),
similarly,
and so the variation of toroidal field with \(\psi\) is
From the definition \(B=\sqrt{{B_{\text{tor}}}^2 + {B_{\text{pol}}}^2}\),
Parallel derivative of the B field
To get the parallel gradients of the \(B\) field components, start with
Using the fact that \(R{B_{\text{tor}}}\) is constant along \(s\),
which gives
The poloidal field can be calculated from
Using equation (16), \(\nabla\psi \cdot \nabla\psi\) can also be written as
and so (unsurprisingly)
Hence
Which gives
Magnetic shear from J x B
Rearranging the radial force balance equation (13) gives
Magnetic shear
The field-line pitch is given by
and so
The last three terms are given in the previous section, but \(\partial{h_\theta}/\partial\psi\) needs to be evaluated.
psi derivative of \({h_\theta}\)
From the expression for curvature (equation (14)), and using \(\nabla x \cdot \nabla \psi = {\sigma_{B\theta}}\left(R{B_{\text{pol}}}\right)^2\) and \(\nabla z\cdot\nabla \psi = {\sigma_{B\theta}}I \left(R{B_{\text{pol}}}\right)^2\),
The second and third terms partly cancel, and using \({\frac{\partial I}{\partial y}} = {\sigma_{B\theta}} {\frac{\partial \nu}{\partial x}}\),
Writing
and using \(B{\frac{\partial B}{\partial x}} = {B_{\text{tor}}}{\frac{\partial {B_{\text{tor}}}}{\partial x}} + {B_{\text{pol}}}{\frac{\partial {B_{\text{pol}}}}{\partial x}}\), this simplifies to give
This can be transformed into an expression for \({\frac{\partial {h_\theta}}{\partial x}}\) involving only derivatives along fieldlines. Writing \(\nabla R = {\frac{\partial R}{\partial \psi}}\nabla\psi + {\frac{\partial R}{\partial \theta}}\nabla\theta\),
Using (16),
and so
Substituting this and equation (17) for \({\boldsymbol{\kappa}}\cdot\nabla\psi\) into equation (18) the \({\frac{\partial R}{\partial x}}\) term cancels with part of the \({\boldsymbol{\kappa}}\cdot\nabla\psi\) term, simplifying to
Shifted radial derivatives
The coordinate system given by equation (11) and used in the above sections has a problem: there is a special poloidal location \(\theta_0\) where the radial basis vector \({\boldsymbol{e}}_x\) is purely in the \(\nabla\psi\) direction. Moving away from this location, the coordinate system becomes sheared in the toroidal direction.
Making the substitution
we also get the mixed derivative
and second-order \(x\) derivative
Perpendicular Laplacian
transforms to
The extra term involving \(I\) disappears, but only if both the \(x\) and \(z\) first derivatives are taken into account:
with
where \(J={h_\theta}/ {B_{\text{pol}}}\) is the Jacobian. Transforming into \(\psi\) derivatives, the middle term of equation (20) cancels the \(I\) term in equation (19), but introduces another \(I\) term (first term in equation (20)). This term cancels with the \(\nabla^2 x\) term when \({\frac{\partial }{\partial x}}\) is expanded, so the full expression for \(\nabla_\perp^2\) using \(\psi\) derivatives is:
In orthogonal (psi, theta, zeta) flux coordinates
For comparison, the perpendicular Laplacian can be derived in orthogonal "flux" coordinates
The Laplacian operator is given by
parallel derivative by
and so
Hence in orthogonal flux coordinates, the perpendicular Laplacian is:
where the neglected terms are first-order derivatives. The coefficient of the second-order \(z\) derivative differs from equation (21), and equation (22) still contains a derivative in \(\theta\). This shows that the transformation made to get equation (21) doesn't result in the same answer as orthogonal flux coordinates: equation (21) is in field-aligned coordinates.
Note that in the limit of \({B_{\text{pol}}}= B\), both equations (21) and (22) are the same, as they should be.
Operator \(\mathbf{B}\times\nabla\phi\cdot\nabla A\)
Useful identities
\(\mathbf{b}\times\mathbf{\kappa}\cdot\nabla\psi \simeq RB_\zeta\partial_\parallel\ln B\)
Using \(\mathbf{b}\times\mathbf{\kappa} \simeq \frac{B}{2}\nabla\times\frac{\mathbf{b}}{B}\), and working in orthogonal \(\left(\psi, \theta, \zeta\right)\) coordinates. The magnetic field unit vector is:
and using the definition of curl (equation (26)) we can write
so that when dotted with \(\nabla\psi\), only the first bracket survives. The parallel gradient is
Neglecting \(\zeta\) derivatives, since the equilibrium is axisymmetric,
Since \(B_\zeta R\) is a flux function, this can be written as
and so
Differential geometry
Warning
Several mistakes have been found (and have now been corrected) in this section, so it should be proofread before removing this warning! The following are notes from [haeseler].
Sets of vectors \(\left\{\mathbf{A, B, C}\right\}\) and \(\left\{\mathbf{a, b, c}\right\}\) are reciprocal if
which implies that \(\left\{\mathbf{A, B, C}\right\}\) and \(\left\{\mathbf{a, b, c}\right\}\) are each linearly independent. Equivalently,
Either of these sets can be used as a basis, and any vector \(\mathbf{w}\) can be represented as \(\mathbf{w} = \left(\mathbf{w\cdot a}\right)\mathbf{A} + \left(\mathbf{w\cdot b}\right){\boldsymbol{B}}+ \left(\mathbf{w\cdot c}\right)\mathbf{C}\) or as \(\mathbf{w} = \left(\mathbf{w\cdot A}\right)\mathbf{a} + \left(\mathbf{w\cdot B}\right){\boldsymbol{b}} + \left(\mathbf{w\cdot C}\right)\mathbf{c}\). In the Cartesian coordinate system, the basis vectors are reciprocal to themselves so this distinction is not needed. For a general set of coordinates \(\left\{u^1, u^2, u^3\right\}\), tangent basis vectors can be defined. If the Cartesian coordinates of a point are given by \(\left(x, y, z\right) = \mathbf{R}\left(u^1, u^2, u^3\right)\) then the tangent basis vectors are:
and in general these will vary from point to point. The scale factor or metric coefficient \(h_i = \left|{\boldsymbol{e}}_i\right|\) is the distance moved for a unit change in \(u^i\), and the unit vector is \(\hat{{\boldsymbol{e}}}_i = {\boldsymbol{e}}_i/h_i\). The nabla operator is defined by:
From the chain rule, \(d\mathbf{R} = \frac{\partial\mathbf{R}}{\partial u^i}du^i = {\boldsymbol{e}}_idu^i\) and substituting \(\Phi = u^i\)
which can only be true if \(\nabla u^i\cdot{\boldsymbol{e}}_j = \delta^i_j\) i.e. if
Since the sets of vectors \(\left\{{\boldsymbol{e}}^i\right\}\) and \(\left\{{\boldsymbol{e}}_i\right\}\) are reciprocal, any vector \(\mathbf{D}\) can be written as \(\mathbf{D} = D_i{\boldsymbol{e}}^i = D^i{\boldsymbol{e}}_i\) where \(D_i = \mathbf{D\cdot e}_i\) are the covariant components and \(D^i = \mathbf{D\cdot e}^i\) are the contravariant components. To convert between covariant and contravariant components, define the metric coefficients \(g_{ij} = \mathbf{e_i\cdot e_j}\) and \(g^{ij} = \mathbf{e^i\cdot e^j}\) so that \({\boldsymbol{e}}_i = g_{ij}{\boldsymbol{e}}^j\). \(g_{ij}\) and \(g^{ij}\) are symmetric and if the basis is orthogonal then \(g_{ij}=g^{ij} = 0\) for \(i\neq j\) i.e. the metric is diagonal.
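As a concrete illustration, raising and lowering indices with a diagonal metric can be sketched in a few lines of C++ (a stand-alone toy, not BOUT++ code; all names are invented):

```cpp
#include <array>

using Vec3 = std::array<double, 3>;

// Lower an index: D_i = g_{ij} D^j. For a diagonal metric only the
// g_{ii} terms survive, so each component is scaled independently.
Vec3 toCovariant(const Vec3& Dup, const Vec3& g_diag) {
  return {g_diag[0] * Dup[0], g_diag[1] * Dup[1], g_diag[2] * Dup[2]};
}

// Raise an index: D^i = g^{ij} D_j, using g^{ii} = 1/g_{ii} for a
// diagonal (orthogonal) metric.
Vec3 toContravariant(const Vec3& Ddown, const Vec3& g_diag) {
  return {Ddown[0] / g_diag[0], Ddown[1] / g_diag[1], Ddown[2] / g_diag[2]};
}
```

Round-tripping through both conversions recovers the original components, reflecting \(g^{ik}g_{kj} = \delta^i_j\).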
For a general set of coordinates, the nabla operator can be expressed as
and for a general set of (differentiable) coordinates \(\left\{u^i\right\}\), the Laplacian is given by
which can be expanded as
where \(G^j\) must not be mistaken for the so-called connection coefficients (i.e. the Christoffel symbols of the second kind). Setting \(\phi = u^k\) in equation (23) gives \(\nabla^2u^k = G^k\). Expanding (23) and setting \(\left\{u^i\right\} = \left\{x, y, z\right\}\) gives
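Assuming the standard divergence form \(\nabla^2\phi = \frac{1}{J}\partial_i\left(Jg^{ij}\partial_j\phi\right)\), the expansion can be sanity-checked numerically. The sketch below (illustrative only, not BOUT++ code) evaluates the radial part in cylindrical coordinates, where \(J = r\) and \(g^{rr} = 1\), so that \(\nabla^2 r^2 = 4\):

```cpp
#include <cmath>

// Finite-difference check of the curvilinear Laplacian formula
//   laplacian(phi) = (1/J) * d/dr ( J * g^rr * dphi/dr )
// for the radial part in cylindrical coordinates (J = r, g^rr = 1).
// For phi = r^2 the exact result is 4 everywhere.
double laplacian_radial(double (*phi)(double), double r, double h) {
  // Flux F(r) = J * g^rr * dphi/dr, via central differences
  auto F = [phi, h](double rr) {
    return rr * (phi(rr + h) - phi(rr - h)) / (2.0 * h);
  };
  return (F(r + h) - F(r - h)) / (2.0 * h) / r;
}

double phi_test(double r) { return r * r; }
```

For this quadratic test function the central differences are exact, so the result is 4 to rounding error independent of the step size h.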
The curl is defined as:
The cross-product relation between contravariant and covariant vectors is:
Derivation of operators in the BOUT++ Clebsch system
The Clebsch coordinate system in BOUT++ is set up as follows.
We have
Further on
The parallel and perpendicular gradients
We have that
and that
so that
The perpendicular gradients in Laplacian inversion
In the Laplacian inversion BOUT++ currently neglects the parallel \(y\) derivatives if \(g_{xy}\) and \(g_{yz}\) are nonzero, thus
The Laplacian
We would here like to find an expression for the Laplacian
In general we have (using equation (2.6.39) in D'Haeseleer [haeseler])
and that
In our case \(A \to {\nabla}\), so that
Thus
where we have defined [1]
By writing the terms out, we get
We now use that the metric tensor is symmetric (by definition), so that \(g^{ij}=g^{ji}\) and \(g_{ij}=g_{ji}\), and that the partial derivatives commute for smooth functions, \(\partial_i\partial_j=\partial_j\partial_i\). This gives
Notice that \(G^i\) does not operate on \(\partial_i\), but rather that the two are multiplied together.
The parallel Laplacian
We have that
we have that
so that by equation (28),
The perpendicular Laplacian
For the perpendicular Laplacian, we have that
The perpendicular Laplacian in Laplacian inversion
Notice that BOUT++ currently assumes small parallel gradients in the dependent variable in Laplacian inversion if \(g_{xy}\) and \(g_{yz}\) are nonzero (if these are zero, the derivation can be done directly from equation (27) instead), so that
The Poisson bracket operator
We will here derive the bracket operators, as they are used in BOUT++.
The electrostatic ExB velocity
Under electrostatic conditions, we have \({\boldsymbol{v}}_E = \frac{\nabla\phi\times{\boldsymbol{b}}}{B}\), which is similar to \({\boldsymbol{v}}={\boldsymbol{k}}\times\nabla\psi\) found in incompressible fluid flow.
The electrostatic ExB advection
The electrostatic \(E\times B\) advection operator thus becomes
where we have used the definition of the Poisson bracket
The pure solenoidal advection is thus
The brackets operator in BOUT++
Notice that the bracket(phi, f) operators in BOUT++ return \(\frac{\nabla\phi\times{\boldsymbol{b}}}{B}\cdot\nabla f\) rather than \(\nabla\phi\times{\boldsymbol{b}}\cdot\nabla f\).
Notice also that the Arakawa brackets neglect the \(\partial_y\) (parallel) derivative terms if \(g_{xy}\) and \(g_{yz}\) are nonzero, so for the Arakawa brackets BOUT++ returns
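For illustration, Arakawa's second-order conservative bracket \(\left[f,g\right] = \partial_x f\,\partial_z g - \partial_z f\,\partial_x g\) on a uniform grid can be sketched in a few lines. This is a stand-alone toy, not the actual BOUT++ BRACKET_ARAKAWA implementation:

```cpp
// Toy sketch of Arakawa's second-order bracket [f, g] on a uniform
// (x, z) grid, evaluated at interior point (i, j). The three stencil
// variants Jpp, Jpx, Jxp are the standard Arakawa discretisations.
double arakawa(const double f[5][5], const double g[5][5],
               int i, int j, double dx, double dz) {
  double Jpp = (f[i+1][j] - f[i-1][j]) * (g[i][j+1] - g[i][j-1])
             - (f[i][j+1] - f[i][j-1]) * (g[i+1][j] - g[i-1][j]);
  double Jpx = f[i+1][j] * (g[i+1][j+1] - g[i+1][j-1])
             - f[i-1][j] * (g[i-1][j+1] - g[i-1][j-1])
             - f[i][j+1] * (g[i+1][j+1] - g[i-1][j+1])
             + f[i][j-1] * (g[i+1][j-1] - g[i-1][j-1]);
  double Jxp = f[i+1][j+1] * (g[i][j+1] - g[i+1][j])
             - f[i-1][j-1] * (g[i-1][j] - g[i][j-1])
             - f[i-1][j+1] * (g[i][j+1] - g[i-1][j])
             + f[i+1][j-1] * (g[i+1][j] - g[i][j-1]);
  // Averaging the three variants conserves energy and enstrophy
  return (Jpp + Jpx + Jxp) / (12.0 * dx * dz);
}
```

The averaged scheme inherits the antisymmetry \([f,g]=-[g,f]\) of the continuous bracket, which can be checked directly at any interior point.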
Divergence of ExB velocity
Using
the divergence of the \({\boldsymbol{E}}\times{\boldsymbol{B}}\) velocity can be written as
The second term on the right is identically zero (the curl of a gradient). The first term on the right can be expanded as
Using
this becomes:
Alternatively, equation (30) can be expanded as
[haeseler]  (1, 2) D'Haeseleer, W. D.: Flux Coordinates and Magnetic Field Structure, Springer-Verlag, 1991, ISBN 3-540-52419-3
Footnotes
[1]  Notice that \(G^i\) is not the same as the Christoffel symbols of the second kind (also known as the connection coefficients, \(\Gamma^i_{jk}={\boldsymbol{e}}^i\cdot\partial_k {\boldsymbol{e}}_j\)), although the derivations of the two are quite similar. We find that \(\Gamma^i_{ji}={\boldsymbol{e}}^i\cdot\partial_i {\boldsymbol{e}}_j = {\nabla\cdot}{\boldsymbol{e}}_j\), whereas using equation (28) leads to \(G^i={\boldsymbol{e}}^i\cdot\partial_i {\boldsymbol{e}}^j = {\nabla\cdot} {\boldsymbol{e}}^j\), since \(g^{ji}=g^{ij}\) due to symmetry.
BOUT++ preconditioning
Author: B. Dudson, University of York
Introduction
This manual describes some of the ways BOUT++ could (and in some cases does) support preconditioning, Jacobian calculations and other methods to speed up simulations. It assumes that you're familiar with how BOUT++ works internally.
Some notation: The ODE being solved is of the form
Here the state vector \(f = \left(f_0, f_1, f_2, \ldots\right)^T\) is a vector containing the evolving (3D) variables \(f_i\left(x,y,z\right)\).
The Jacobian of this system is then
The order of the elements in the vector \({\mathbf{f}}\) is determined by the solver code and SUNDIALS, so here we just assume that there exists a map \(\mathbb{I}\) between a global index \(k\) and (variable, position), i.e. \(\left(i,x,y,z\right)\)
and its inverse
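A minimal sketch of such an index map and its inverse, assuming a hypothetical ordering with the variable index varying fastest (the real ordering is fixed by the solver and SUNDIALS):

```cpp
// Toy index map between a global index k and (variable, position).
// The layout chosen here (var fastest, then z, y, x) is an assumption
// for illustration only; the actual ordering lives in the solver code.
struct Loc { int var, x, y, z; };

struct IndexMap {
  int nvars, nx, ny, nz;

  // (i, x, y, z) -> global index k
  int toGlobal(const Loc& l) const {
    return ((l.x * ny + l.y) * nz + l.z) * nvars + l.var;
  }

  // global index k -> (i, x, y, z), inverting the map above
  Loc fromGlobal(int k) const {
    Loc l;
    l.var = k % nvars; k /= nvars;
    l.z = k % nz; k /= nz;
    l.y = k % ny; k /= ny;
    l.x = k;
    return l;
  }
};
```

Round-tripping fromGlobal(toGlobal(l)) recovers l for any in-range location, which is the defining property of the map \(\mathbb{I}\) and its inverse.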
Some problem-specific operations which can be used to speed up the time-stepping:
 Jacobianvector multiply: Given a vector, multiply it by \({\mathbb{J}}\)
 Preconditioner multiply: Given a vector, multiply by an approximate inverse of \(\mathbb{M} = \mathbb{I} - \gamma\mathbb{J}\)
 Calculate the stencils i.e. nonzero elements in \({\mathbb{J}}\)
 Calculate the nonzero elements of \({\mathbb{J}}\)
Physics problems
Some interesting physics problems of increasing difficulty
Resistive drift-interchange instability
A "simple" test problem of 2 fields, which produces non-trivial turbulent results. Supports resistive drift-wave and interchange instabilities.
Reduced 3-field MHD
This is a 3-field system of pressure \(P\), magnetic flux \(\psi\) and vorticity \(U\):
The coupled set of equations to be solved are therefore
The Jacobian of this system is therefore:
Where the blue terms are only included in nonlinear simulations.
This Jacobian has large dense blocks because of the Laplacian inversion terms (involving \(\nabla_\perp^{2}\)), which couple together all points in an X-Z plane. The way to make \({\mathbb{J}}\) sparse is to solve for \(\phi\) as a constraint (using e.g. the IDA solver), which moves the Laplacian inversion into the preconditioner.
Solving \(\phi\) as a constraint
The evolving state vector becomes
UEDGE equations
The UEDGE benchmark is a 4-field model with the following equations:
This set of equations is convenient in that no inversion is needed, so the Jacobian is sparse everywhere. The state vector is
The Jacobian is:
If instead the state vector is
then the Jacobian is
2-fluid turbulence
Jacobian-vector multiply
This is currently implemented in the CVODE (SUNDIALS) solver.
Preconditioner-vector multiply
Reduced 3-field MHD
The matrix \(\mathbb{M}\) to be inverted can therefore be written
where
For small flow velocities, the inverse of \(\mathbb{D}\) can be approximated using the Binomial theorem:
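As a toy check of this truncated Binomial (Neumann) series, note that \(\left(\mathbb{I}+\mathbb{A}+\mathbb{A}^2\right)\left(\mathbb{I}-\mathbb{A}\right) = \mathbb{I}-\mathbb{A}^3\), so the approximation error is third order in \(\mathbb{A}\). A 2×2 sketch with an arbitrary small matrix (not taken from the physics above):

```cpp
#include <array>

// Neumann-series sketch: if D = I - A with A small, then
// D^-1 ~ I + A + A^2, with error A^3 * D^-1. Illustrative only.
using Mat2 = std::array<std::array<double, 2>, 2>;

Mat2 matmul(const Mat2& a, const Mat2& b) {
  Mat2 c{};
  for (int i = 0; i < 2; i++)
    for (int j = 0; j < 2; j++)
      for (int k = 0; k < 2; k++)
        c[i][j] += a[i][k] * b[k][j];
  return c;
}

// Approximate (I - A)^-1 by I + A + A^2
Mat2 approxInverse(const Mat2& A) {
  Mat2 A2 = matmul(A, A);
  Mat2 r{};
  for (int i = 0; i < 2; i++)
    for (int j = 0; j < 2; j++)
      r[i][j] = (i == j ? 1.0 : 0.0) + A[i][j] + A2[i][j];
  return r;
}
```

Multiplying the approximation by \(\mathbb{I}-\mathbb{A}\) gives exactly \(\mathbb{I}-\mathbb{A}^3\), so for entries of \(\mathbb{A}\) of order \(0.1\) the residual is of order \(10^{-3}\).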
Following [chacon2008], [chacon2002], \(\mathbb{M}\) can be rewritten as
The Schur factorization of \(\mathbb{M}\) yields ([chacon2008])
where \(\mathbb{P}_{Schur} = \mathbb{D} - \mathbb{L}\mathbb{E}^{-1}\mathbb{U}\) is the Schur complement. Note that this inversion is exact so far. Since \(\mathbb{E}\) is block-diagonal, and \(\mathbb{D}\) can be easily approximated using equation (32), this simplifies the problem to inverting \(\mathbb{P}_{Schur}\), which is much smaller than \(\mathbb{M}\).
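With scalar stand-ins for the blocks \(\mathbb{E}\), \(\mathbb{U}\), \(\mathbb{L}\), \(\mathbb{D}\), the Schur-factorised solve of \(\mathbb{M}x = b\) looks like the sketch below (a toy; in practice the blocks are operators and the Schur complement is inverted approximately):

```cpp
// Scalar sketch of a Schur-complement solve for the block system
//   [E U; L D] [x1; x2] = [b1; b2]
// First eliminate x1, solve the Schur complement P = D - L E^-1 U
// for x2, then back-substitute for x1.
void schurSolve(double E, double U, double L, double D,
                double b1, double b2, double& x1, double& x2) {
  double Einv = 1.0 / E;
  double P = D - L * Einv * U;        // Schur complement
  x2 = (b2 - L * Einv * b1) / P;      // reduced solve
  x1 = (b1 - U * x2) * Einv;          // back-substitution
}
```

For scalar blocks the factorisation is exact, matching the statement above that the inversion is exact before any approximations are made.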
A possible approximation to \(\mathbb{P}_{Schur}\) is to neglect:
 All drive terms
 the curvature term \(\mathbb{L}_P\)
 the \(J_{0}\) term in \(\mathbb{L}_\psi\)
 All nonlinear terms (blue terms in equation (31)), including perpendicular terms (so \(\mathbb{D} = \mathbb{I}\))
This gives
where the commutation of parallel and perpendicular derivatives is also an approximation. This remaining term is just the shear Alfvén wave propagating along field lines, the fastest wave supported by these equations.
Stencils
Jacobian calculation
The (sparse) Jacobian matrix elements can be calculated automatically from the physics code by keeping track of the (linearised) operations going through the RHS function.
For each point, keep the value (as usual), plus the nonzero elements in that row of \({\mathbb{J}}\) and the constant, i.e. result = Ax + b. Elements are tracked using the product rule.
class Field3D {
  BoutReal data[ngx][ngy][ngz]; // The data as now
  int JacIndex;                 // Variable index in Jacobian
  SparseMatrix *jac;            // Set of rows for indices (JacIndex,*,*,*)
};
JacIndex is set by the solver, so for the system above P.JacIndex = 0, psi.JacIndex = 1 and U.JacIndex = 2. All other fields are given JacIndex = -1.
SparseMatrix stores the nonzero Jacobian components for the set of rows corresponding to this variable. Evolving variables do not have an associated SparseMatrix object, but any fields which result from operations on evolving fields will have one.
[chacon2008]  (1, 2) Chacón, L.: An optimal, parallel, fully implicit Newton-Krylov solver for three-dimensional viscoresistive magnetohydrodynamics, Phys. Plasmas 15, 056103 (2008)
[chacon2002]  Chacón, L., Knoll, D. A., Finn, J. M.: An implicit, nonlinear reduced resistive MHD solver, J. Comput. Phys. 178, 15-36 (2002)
Geometry and Differential Operators
Author: X. Q. Xu
Geometry
In an axisymmetric toroidal system, the magnetic field can be expressed as \({\bf B}=I(\psi)\nabla\zeta+\nabla\zeta\times\nabla\psi\),
where \(\psi\) is the poloidal flux, \(\theta\) is the poloidal angle-like coordinate, and \(\zeta\) is the toroidal angle. Here, \(I(\psi)=RB_t\). The two important geometrical parameters are the curvature, \(\bf \kappa\), and the local pitch, \(\nu(\psi,\theta)=I(\psi){\bf \cal J}/R^2\).
The local pitch \(\nu(\psi,\theta)\) is related to the MHD safety factor \(q\) by \(\hat q(\psi)=(2\pi)^{-1}\oint\nu(\psi,\theta)\,d\theta\) in the closed flux surface region, and \(\hat q(\psi)=(2\pi)^{-1}\int_{inboard}^{outboard}\nu(\psi,\theta)\,d\theta\) in the scrape-off layer. Here \({\bf \cal J}=(\nabla\psi\times\nabla\theta\cdot\nabla\zeta)^{-1}\) is the coordinate Jacobian, \(R\) is the major radius, and \(Z\) is the vertical position.
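As a quick numerical illustration of the flux-surface average \(\hat q(\psi)=(2\pi)^{-1}\oint\nu(\psi,\theta)\,d\theta\), the sketch below uses a midpoint rule and an arbitrary illustrative pitch profile \(\nu = 2 + \cos\theta\) (for which \(\hat q = 2\)); it is not tied to any real equilibrium:

```cpp
#include <cmath>

const double TWO_PI = 6.283185307179586;

// Midpoint-rule evaluation of q_hat = (2*pi)^-1 * integral(nu dtheta)
// over a closed flux surface, discretised with n poloidal points.
double safetyFactor(double (*nu)(double), int n) {
  double sum = 0.0;
  double dtheta = TWO_PI / n;
  for (int i = 0; i < n; i++) {
    sum += nu((i + 0.5) * dtheta) * dtheta;
  }
  return sum / TWO_PI;
}

// Illustrative pitch profile only; its poloidal average is 2
double nu_test(double theta) { return 2.0 + std::cos(theta); }
```

Because the integrand is smooth and periodic, the midpoint rule converges extremely quickly here, so even modest n recovers \(\hat q = 2\) to machine-level accuracy.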
Differential Operators
For such an axisymmetric equilibrium the metric coefficients are functions of \(\psi\) and \(\theta\) only. Three spatial differential operators appear in the equations, given as: \({\bf v_E}\cdot\nabla_\perp\), \(\nabla_\parallel\) and \(\nabla_\perp^2\).
where the coordinate Jacobian and metric coefficients are defined as follows:
Concentric circular cross section inside the separatrix without the SOL
For a concentric circular cross section inside the separatrix without the SOL, the differential operators reduce to:
Field-aligned coordinates with \(\theta\) as the coordinate along the field line
A suitable coordinate mapping between field-aligned ballooning coordinates (\(x\), \(y\), \(z\)) and the usual flux coordinates (\(\psi\), \(\theta\), \(\zeta\)) is
as shown in Fig. 1. The area covered by the square ABCD in the usual flux coordinates is the same as the parallelogram ABEF in the field-aligned coordinates. The magnetic separatrix is denoted by \(\psi=\psi_s\). In this choice of coordinates, \(x\) is a flux surface label; \(y\), the poloidal angle, is also the coordinate along the field line; and \(z\) is a field line label within the flux surface.
The coordinate Jacobian and metric coefficients are defined as follows:
Here \(h\) is the local minor radius, \(I_s\) is the integrated local shear, and \(y_0\) is an arbitrary integration parameter, which, depending on the choice of Jacobian, determines the location where \(I_s=0\). The disadvantage of this choice of coordinates is that the Jacobian diverges near the X-point as \(B_p\rightarrow 0\), and its effect spreads over the entire flux surfaces near the separatrix as a result of the coordinate transformation for \(z\). Therefore a better set of coordinates is needed for X-point divertor geometry. The derivatives are obtained from the chain rule as follows:
In the field-aligned ballooning coordinates, the parallel differential operator is simple, involving only the one coordinate \(y\)
which requires only a few grid points. The total axisymmetric drift operator becomes
The perturbed \({\bf E}\times {\bf B}\) drift operator becomes
When the conventional turbulence ordering (\(k_\parallel \ll k_\perp\)) is used, the perturbed \({\bf E}\times {\bf B}\) drift operator can be further reduced to a simple form
where \(\partial/\partial\theta\simeq \nu\partial/\partial z\) is used. In the perturbed \({\bf E}\times {\bf B}\) drift operator, the poloidal and radial derivatives are written in the usual flux \((\psi,\theta,\zeta)\) coordinates in order to allow various options for valid discretisations. The general Laplacian operator for the potential is