Run MrBayes Along With Beagle-lib

MrBayes 3 is a program for Bayesian inference and model choice across a large space of phylogenetic and evolutionary models. BEAGLE is a high-performance library that performs the core calculations at the heart of most Bayesian and maximum-likelihood phylogenetics packages. BEAGLE can substantially accelerate MrBayes analyses.

This post shows how to run MrBayes with BEAGLE.

1. Install BEAGLE and MrBayes

First, install and set up BEAGLE by following "Install beagle-lib for NVidia Tesla V100 on Debian buster".

The easiest way to install MrBayes on Debian Bullseye is:

$ sudo apt update
$ sudo apt install mrbayes mrbayes-mpi

Alternatively, build MrBayes from source. A minimal build is:

$ git clone --depth=1 https://github.com/NBISweden/MrBayes.git
$ cd MrBayes
$ ./configure
$ make && sudo make install

See the INSTALL file in the source tree for more information.
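
After either install route, a quick sanity check can confirm that mb is on the PATH and whether it is dynamically linked against BEAGLE. The sketch below is not part of MrBayes; check_mb is a hypothetical helper name, and it assumes a Linux system with ldd and a dynamically linked build (libhmsbeagle is the name of BEAGLE's shared library).

```shell
#!/bin/sh
# Sketch: report where mb is installed and whether it links against BEAGLE.
# Assumptions: Linux with ldd available; mb is dynamically linked.
# check_mb is a hypothetical helper name used only for illustration.
check_mb() {
    mb_path=$(command -v mb) || { echo "mb: not found"; return 1; }
    echo "mb: $mb_path"
    # libhmsbeagle is BEAGLE's shared library; a static build would not show it.
    if ldd "$mb_path" 2>/dev/null | grep -qi 'hmsbeagle'; then
        echo "beagle: linked"
    else
        echo "beagle: not linked (or statically linked)"
    fi
}

check_mb || true
```

If the library does not show up here, the showbeagle command inside MrBayes is the more reliable test.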

2. Check available BEAGLE resources

To list the BEAGLE resources available to MrBayes, follow these steps.

First, run the mb command to open the MrBayes prompt:

$ mb


                            MrBayes 3.2.7a x86_64

                      (Bayesian Analysis of Phylogeny)

              Distributed under the GNU General Public License


               Type "help" or "help <command>" for information
                     on the commands that are available.

                   Type "about" for authorship and general
                       information about the program.


MrBayes > 

Then run the showbeagle command, which lists all available resources:

MrBayes > showbeagle

   Available resources reported by beagle library:
	Resource 0:
	Name: CPU
	Flags: PROCESSOR_CPU PRECISION_DOUBLE PRECISION_SINGLE COMPUTATION_SYNCH
             EIGEN_REAL EIGEN_COMPLEX SCALING_MANUAL SCALING_AUTO
             SCALING_ALWAYS SCALING_DYNAMIC SCALERS_RAW SCALERS_LOG
             VECTOR_NONE VECTOR_SSE THREADING_NONE THREADING_CPP

	Resource 1:
	Name: Tesla P100-PCIE-12GB
	Desc: Global memory (MB): 12198 | Clock speed (Ghz): 1.33 | Number of cores: 3584
	Flags: PROCESSOR_GPU PRECISION_DOUBLE PRECISION_SINGLE COMPUTATION_ASYNCH
             COMPUTATION_SYNCH EIGEN_REAL EIGEN_COMPLEX SCALING_MANUAL
             SCALING_AUTO SCALING_ALWAYS SCALING_DYNAMIC SCALERS_RAW
             SCALERS_LOG VECTOR_NONE THREADING_NONE

   BEAGLE version: 3.1.2
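
For scripted use, the same listing can be condensed to one line per resource. The sketch below assumes that mb accepts commands on standard input and that the output keeps the Resource/Name/Flags layout shown above; summarize_resources is a hypothetical helper name, not a MrBayes or BEAGLE tool.

```shell
#!/bin/sh
# Sketch: reduce showbeagle output to "number<TAB>processor<TAB>name" lines.
# Assumption: output follows the Resource/Name/Flags layout shown above.
# summarize_resources is a hypothetical helper name.
summarize_resources() {
    awk '
        /Resource [0-9]+:/ { sub(/:.*/, "", $2); id = $2 }
        /Name:/            { sub(/^[ \t]*Name:[ \t]*/, ""); name = $0 }
        /Flags:/           {
            kind = "other"
            if ($0 ~ /PROCESSOR_GPU/) kind = "GPU"
            else if ($0 ~ /PROCESSOR_CPU/) kind = "CPU"
            print id "\t" kind "\t" name
        }
    '
}

# Assumption: mb reads commands from standard input when they are piped in.
if command -v mb >/dev/null 2>&1; then
    printf 'showbeagle\nquit\n' | mb | summarize_resources
else
    echo "mb not found; skipping" >&2
fi
```

The first column is the resource number reported by BEAGLE, so a GPU row identifies the value to use when selecting that device explicitly.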

3. Run MrBayes with BEAGLE support

By default, MrBayes will use its built-in single-precision SSE likelihood calculators on CPUs. They are quite fast on most machines, and should be similar in performance to the CPU routines in the BEAGLE library. The set command is used to switch to the BEAGLE likelihood calculators, and to set various BEAGLE-related options.

The current BEAGLE settings can be checked with the help set command. The BEAGLE-related settings are:

Parameter            Options          Current Setting
Usebeagle            Yes/No           Yes
Beagleresource       <number>         99
Beagledevice         CPU/GPU          CPU
Beagleprecision      Single/Double    Single
Beaglescaling        Always/Dynamic   Dynamic
Beaglesse            Yes/No           No
Beaglethreads        Yes/No           Yes
Beaglethreadcount    <number>         99
Beaglefloattips      Yes/No           No

These options are explained below:

  • Usebeagle: Set this option to Yes to attempt to use the BEAGLE library to compute the phylogenetic likelihood on a variety of high-performance hardware, including multicore CPUs and GPUs. Note that some models in MrBayes are not yet supported by BEAGLE.
  • Beagleresource: Set this option to the number of the specific resource you wish to use with BEAGLE (use Showbeagle to see the list of available resources). Set it to 99 for automatic resource selection.
  • Beagledevice: Set this option to GPU or CPU to select the processor.
  • Beagleprecision: Select Single or Double precision computation.
  • Beaglescaling: Always rescales partial likelihoods at each evaluation. Dynamic rescales less frequently and should run faster.
  • Beaglesse: Use SSE instructions on Intel CPU processors.
  • Beaglethreads: Use threading for parallelism on multi-core CPU processors.
  • Beaglethreadcount: Set the maximum number of CPU threads to be used by BEAGLE. Set it to 99 for automatic threading.
  • Beaglefloattips: Use a floating-point representation for tip sequence data. This can improve performance on GPU devices at the cost of additional memory usage.

When running BEAGLE on CPUs, the preferred option is likely to be the double-precision SSE code with dynamic scaling:

MrBayes > set usebeagle=yes beagledevice=cpu beagleprecision=double beaglescaling=dynamic beaglesse=yes

The default MrBayes likelihood calculator uses single-precision SSE code, which should theoretically be the fastest option on the CPU. Currently, BEAGLE only supports double-precision SSE code, which is slower. However, the code MrBayes uses to call BEAGLE supports smart dynamic scaling, unlike the code calling the default calculator, and this may well compensate for the slowdown caused by the increased precision. You therefore need to test both options to find out which one is faster on your machine.

To use the GPU, you simply switch from CPU to GPU (if you have an available GPU). With the GPU, the fastest option should be the single-precision code with dynamic scaling:

MrBayes > set usebeagle=yes beagledevice=gpu beagleprecision=single beaglescaling=dynamic
  • Note: The SSE option is not applicable in the GPU case.

The GPU code can be a lot faster than the CPU code, particularly for amino acid and codon models. However, the length of the sequences also influences the speed-up. In general, the longer the sequences are, the better GPU performance you can expect. If the sequences are short, the overhead involved in shuffling data to and from the GPU may well overshadow any performance gain you get in the computation step. Try the various calculator options out in short runs before you decide on the best option for longer runs.
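
One way to run such short comparison runs is to generate a small batch file per calculator and time each one. The script below is only a sketch under stated assumptions: DATA, NGEN, write_batch, run_case, and the bench_* file names are hypothetical and chosen for illustration; DATA must point to your own NEXUS data file (primates.nex is used here as a placeholder), and the timing relies on date +%s, which gives whole seconds only.

```shell
#!/bin/sh
# Sketch: time a short MCMC run for each likelihood calculator.
# Hypothetical names: DATA, NGEN, write_batch, run_case, bench_*.nex/.log.
DATA=${DATA:-primates.nex}   # assumption: replace with your own data file
NGEN=${NGEN:-10000}          # short run, for comparison only

# Print a NEXUS batch file that loads DATA, applies the given "set"
# options, and runs a single short analysis with default model settings.
write_batch() {
    cat <<EOF
#NEXUS
begin mrbayes;
  set autoclose=yes nowarn=yes;
  execute $DATA;
  set $1;
  mcmc nruns=1 ngen=$NGEN samplefreq=100;
  quit;
end;
EOF
}

# $1 = label, $2 = options passed to the MrBayes set command.
run_case() {
    write_batch "$2" > "bench_$1.nex"
    if command -v mb >/dev/null 2>&1 && [ -f "$DATA" ]; then
        start=$(date +%s)
        mb "bench_$1.nex" > "bench_$1.log" 2>&1
        echo "$1: $(( $(date +%s) - start )) s"
    else
        echo "$1: wrote bench_$1.nex (mb or $DATA missing, not run)"
    fi
}

run_case builtin "usebeagle=no"
run_case cpu "usebeagle=yes beagledevice=cpu beagleprecision=double beaglescaling=dynamic beaglesse=yes"
run_case gpu "usebeagle=yes beagledevice=gpu beagleprecision=single beaglescaling=dynamic"
```

Because the runs are short, treat the timings as a rough ranking rather than a precise benchmark, and repeat them with a larger NGEN if two options come out close.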

References

  1. MrBayes 3.2 manual.
  2. How to set-up CUDA, BEAGLE, and MrBayes/BEAST