Run MrBayes Along With Beagle-lib

omicx.cc

March 21, 2022

MrBayes 3 is a program for Bayesian inference and model choice across a large space of phylogenetic and evolutionary models. BEAGLE is a high-performance library that performs the core calculations at the heart of most Bayesian and maximum-likelihood phylogenetics packages, and it can substantially accelerate MrBayes analyses.

This post shows how to run MrBayes with BEAGLE.

1. Install BEAGLE and MrBayes

First, install and set up BEAGLE by following the post "Install beagle-lib for NVidia Tesla V100 on Debian buster".

The easiest way to install MrBayes on Debian Bullseye is:

$ sudo apt update
$ sudo apt install mrbayes mrbayes-mpi

To build MrBayes from source instead, the short version is:

$ git clone --depth=1 https://github.com/NBISweden/MrBayes.git
$ cd MrBayes
$ ./configure
$ make && sudo make install

See the INSTALL file in the source tree for more information.

2. Check available BEAGLE resources

Available BEAGLE resources for MrBayes can be found by these steps:

First, run the mb command to open the MrBayes prompt:

$ mb


                            MrBayes 3.2.7a x86_64

                      (Bayesian Analysis of Phylogeny)

              Distributed under the GNU General Public License


               Type "help" or "help <command>" for information
                     on the commands that are available.

                   Type "about" for authorship and general
                       information about the program.


MrBayes >

Then run the showbeagle command, which lists all available resources:

MrBayes > showbeagle

   Available resources reported by beagle library:
	Resource 0:
	Name: CPU
	Flags: PROCESSOR_CPU PRECISION_DOUBLE PRECISION_SINGLE COMPUTATION_SYNCH
             EIGEN_REAL EIGEN_COMPLEX SCALING_MANUAL SCALING_AUTO
             SCALING_ALWAYS SCALING_DYNAMIC SCALERS_RAW SCALERS_LOG
             VECTOR_NONE VECTOR_SSE THREADING_NONE THREADING_CPP

	Resource 1:
	Name: Tesla P100-PCIE-12GB
	Desc: Global memory (MB): 12198 | Clock speed (Ghz): 1.33 | Number of cores: 3584
	Flags: PROCESSOR_GPU PRECISION_DOUBLE PRECISION_SINGLE COMPUTATION_ASYNCH
             COMPUTATION_SYNCH EIGEN_REAL EIGEN_COMPLEX SCALING_MANUAL
             SCALING_AUTO SCALING_ALWAYS SCALING_DYNAMIC SCALERS_RAW
             SCALERS_LOG VECTOR_NONE THREADING_NONE

   BEAGLE version: 3.1.2
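
When several resources are listed, it can help to pull out just the resource numbers and names, for example to choose a value for Beagleresource. The following is a minimal sketch, assuming the showbeagle output has been saved to a text file (here called showbeagle.txt, a hypothetical name) and follows the layout shown above. list_beagle_resources is an illustrative helper, not part of MrBayes or BEAGLE.

```shell
# Hypothetical helper: read showbeagle output on stdin and print one line
# per resource in the form "<number><TAB><name>".
list_beagle_resources() {
  awk '
    # "Resource 1:" -> remember the resource number without the colon
    /^[ \t]*Resource [0-9]+:/ { sub(/:.*/, "", $2); id = $2 }
    # "Name: CPU" -> strip the label and print number and name
    /^[ \t]*Name:/            { sub(/^[ \t]*Name:[ \t]*/, ""); print id "\t" $0 }
  '
}

# Usage, assuming the output was saved to showbeagle.txt:
#   list_beagle_resources < showbeagle.txt
```

The number in the first column is the value that Beagleresource expects when a specific resource is selected.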

3. Run MrBayes with BEAGLE support

By default, MrBayes will use its built-in single-precision SSE likelihood calculators on CPUs. They are quite fast on most machines, and should be similar in performance to the CPU routines in the BEAGLE library. The set command is used to switch to the BEAGLE likelihood calculators, and to set various BEAGLE-related options.

The current BEAGLE settings can be checked with the help set command. The BEAGLE-related settings are:

Parameter	Options	Current Setting
Usebeagle	Yes/No	Yes
Beagleresource	<number>	99
Beagledevice	CPU/GPU	CPU
Beagleprecision	Single/Double	Single
Beaglescaling	Always/Dynamic	Dynamic
Beaglesse	Yes/No	No
Beaglethreads	Yes/No	Yes
Beaglethreadcount	<number>	99
Beaglefloattips	Yes/No	No

These options are explained below:

Option	Description	Note
Usebeagle	Set this option to `Yes` to attempt to use the BEAGLE library to compute the phylogenetic likelihood on a variety of high-performance hardware including multicore CPUs and GPUs.	Some models in MrBayes are not yet supported by BEAGLE.
Beagleresource	Set this option to the number of a specific resource you wish to use with BEAGLE (use `Showbeagle` to see the list of available resources).	Set to `99` for auto-resource selection.
Beagledevice	Set this option to `GPU` or `CPU` to select processor.
Beagleprecision	Select `Single` or `Double` precision computation.
Beaglescaling	`Always` rescales partial likelihoods at each evaluation. `Dynamic` rescales less frequently and should run faster.
Beaglesse	Use SSE instructions on Intel CPU processors.
Beaglethreads	Use threading for parallelism on multi-core CPU processors.
Beaglethreadcount	Set maximum number of CPU threads to be used by BEAGLE.	Set to `99` for auto-threading.
Beaglefloattips	Use floating-point representation for tip sequence data.	Can result in improved performance on GPU devices at the cost of additional memory usage.

When running BEAGLE on CPUs, the preferred option is likely to be the double-precision SSE code with dynamic scaling:

MrBayes > set usebeagle=yes beagledevice=cpu beagleprecision=double beaglescaling=dynamic beaglesse=yes

The default MrBayes likelihood calculator uses single-precision SSE code, which should theoretically be the fastest option on the CPU. Currently, BEAGLE only supports double-precision SSE code, which is slower. However, the code we use to call BEAGLE supports smart dynamic scaling, unlike the code calling the default calculator, which may well compensate for the slowdown caused by the increased precision. Therefore, you need to test both options before knowing which one will be faster on your machine.

To use the GPU, you simply switch from CPU to GPU (if you have an available GPU). With the GPU, the fastest option should be the single-precision code with dynamic scaling:

MrBayes > set usebeagle=yes beagledevice=gpu beagleprecision=single beaglescaling=dynamic

Note: The SSE option is not applicable in the GPU case.

The GPU code can be a lot faster than the CPU code, particularly for amino acid and codon models. However, the length of the sequences also influences the speed-up. In general, the longer the sequences are, the better GPU performance you can expect. If the sequences are short, the overhead involved in shuffling data to and from the GPU may well overshadow any performance gain you get in the computation step. Try the various calculator options out in short runs before you decide on the best option for longer runs.
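
The short comparison runs suggested above can be scripted. The sketch below assumes MrBayes is run in batch mode by passing mb a Nexus file that contains a mrbayes block. write_beagle_batch, the file names, and ngen=10000 are illustrative choices, not recommendations from MrBayes or BEAGLE, and the model is left at its defaults.

```shell
# Hypothetical helper: write a MrBayes batch file that loads a data set,
# applies one BEAGLE configuration, and runs a short MCMC.
#   $1 = Nexus data file, $2 = batch file to write, remaining args = "set" options
write_beagle_batch() {
  data_file=$1
  batch_file=$2
  shift 2
  {
    echo "#NEXUS"
    echo "begin mrbayes;"
    echo "  set autoclose=yes nowarn=yes;"   # do not prompt at the end of the run
    echo "  execute $data_file;"
    echo "  set $*;"                         # the BEAGLE options under test
    echo "  mcmc ngen=10000;"                # short trial run (assumed length)
    echo "end;"
  } > "$batch_file"
}

# Usage, assuming data.nex is your alignment and mb is on the PATH:
#   write_beagle_batch data.nex cpu.nex usebeagle=yes beagledevice=cpu \
#     beagleprecision=double beaglescaling=dynamic beaglesse=yes
#   write_beagle_batch data.nex gpu.nex usebeagle=yes beagledevice=gpu \
#     beagleprecision=single beaglescaling=dynamic
#   time mb cpu.nex
#   time mb gpu.nex
```

Comparing the wall-clock times of the two trial runs gives a rough indication of which configuration is worth using for the full analysis.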
