Run MrBayes Along With Beagle-lib
MrBayes 3 is a program for Bayesian inference and model choice across a large space of phylogenetic and evolutionary models. BEAGLE is a high-performance library that performs the core calculations at the heart of most Bayesian and maximum-likelihood phylogenetics packages. BEAGLE can speed up MrBayes analyses substantially.
This post shows how to run MrBayes with BEAGLE.
1. Install BEAGLE and MrBayes
Please follow "Install beagle-lib for NVidia Tesla V100 on Debian buster" to install and set up BEAGLE first.
The easiest way to install MrBayes on Debian Bullseye is:
$ sudo apt update
$ sudo apt install mrbayes mrbayes-mpi
Alternatively, build MrBayes from source. A very short version of the build is:
$ git clone --depth=1 https://github.com/NBISweden/MrBayes.git
$ cd MrBayes
$ ./configure
$ make && sudo make install
Please check INSTALL for more information.
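After either install route, it can help to confirm that the mb binary is on your PATH and that it was actually linked against BEAGLE. The following is a minimal sketch, not part of MrBayes itself: the function name check_mb_beagle is made up for illustration, and it assumes BEAGLE is installed as a shared library whose file name contains libhmsbeagle and that ldd is available (Linux).

```shell
#!/bin/sh
# check_mb_beagle: report where `mb` lives and whether it links to BEAGLE.
# Assumption: the BEAGLE shared library file name contains "libhmsbeagle".
check_mb_beagle() {
    mb_path=$(command -v mb) || { echo "mb: not found"; return 1; }
    echo "mb: $mb_path"
    if ldd "$mb_path" 2>/dev/null | grep -q 'libhmsbeagle'; then
        echo "beagle: linked"
    else
        echo "beagle: not linked (or ldd unavailable)"
        return 2
    fi
}

# Run the check once; do not abort the shell if it fails.
check_mb_beagle || true
```

If the second line reports that BEAGLE is not linked, the showbeagle step in the next section is the more reliable test, since a statically linked build would not show up in ldd.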
2. Check available BEAGLE resources
The BEAGLE resources available to MrBayes can be listed as follows.
First, run the mb command to enter the MrBayes prompt:
$ mb
MrBayes 3.2.7a x86_64
(Bayesian Analysis of Phylogeny)
Distributed under the GNU General Public License
Type "help" or "help <command>" for information
on the commands that are available.
Type "about" for authorship and general
information about the program.
MrBayes >
Then run the showbeagle command, which lists all available resources:
MrBayes > showbeagle
Available resources reported by beagle library:
Resource 0:
Name: CPU
Flags: PROCESSOR_CPU PRECISION_DOUBLE PRECISION_SINGLE COMPUTATION_SYNCH
EIGEN_REAL EIGEN_COMPLEX SCALING_MANUAL SCALING_AUTO
SCALING_ALWAYS SCALING_DYNAMIC SCALERS_RAW SCALERS_LOG
VECTOR_NONE VECTOR_SSE THREADING_NONE THREADING_CPP
Resource 1:
Name: Tesla P100-PCIE-12GB
Desc: Global memory (MB): 12198 | Clock speed (Ghz): 1.33 | Number of cores: 3584
Flags: PROCESSOR_GPU PRECISION_DOUBLE PRECISION_SINGLE COMPUTATION_ASYNCH
COMPUTATION_SYNCH EIGEN_REAL EIGEN_COMPLEX SCALING_MANUAL
SCALING_AUTO SCALING_ALWAYS SCALING_DYNAMIC SCALERS_RAW
SCALERS_LOG VECTOR_NONE THREADING_NONE
BEAGLE version: 3.1.2
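The resource numbers in this listing are the values that the Beagleresource option expects. If you save the listing to a file, a small filter can reduce it to one line per resource. This is only a sketch: list_beagle_resources is a hypothetical helper name, and it assumes the output keeps the "Resource N:", "Name:" and "Flags:" layout shown above.

```shell
#!/bin/sh
# list_beagle_resources: read `showbeagle` output on stdin and print
# "<number> <CPU|GPU|?> <name>" for every resource found.
list_beagle_resources() {
    awk '
        function emit() { if (id != "") print id, kind, name }
        /^[ \t]*Resource [0-9]+:/ {
            emit()                      # print the previous resource, if any
            id = $2; sub(/:$/, "", id)  # "0:" -> "0"
            kind = "?"; name = ""
        }
        /^[ \t]*Name:/ { name = $0; sub(/^[ \t]*Name:[ \t]*/, "", name) }
        /PROCESSOR_GPU/ { kind = "GPU" }
        /PROCESSOR_CPU/ { kind = "CPU" }
        END { emit() }
    '
}

# Demo on a shortened copy of the listing above.
list_beagle_resources <<'EOF'
Resource 0:
    Name: CPU
    Flags: PROCESSOR_CPU PRECISION_DOUBLE PRECISION_SINGLE
Resource 1:
    Name: Tesla P100-PCIE-12GB
    Flags: PROCESSOR_GPU PRECISION_DOUBLE PRECISION_SINGLE
EOF
# prints:
# 0 CPU CPU
# 1 GPU Tesla P100-PCIE-12GB
```

If your build of mb accepts commands on standard input (an assumption, not verified here), something like `printf 'showbeagle\nquit\n' | mb | list_beagle_resources` may let you skip the intermediate file.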
3. Run MrBayes with BEAGLE support
By default, MrBayes uses its built-in single-precision SSE likelihood calculators on CPUs. They are quite fast on most machines and should be similar in performance to the CPU routines in the BEAGLE library. The set command switches to the BEAGLE likelihood calculators and sets various BEAGLE-related options.
The current BEAGLE settings can be checked with the help set command. The BEAGLE-related settings are:
Parameter | Options | Current Setting |
---|---|---|
Usebeagle | Yes/No | Yes |
Beagleresource | <number> | 99 |
Beagledevice | CPU/GPU | CPU |
Beagleprecision | Single/Double | Single |
Beaglescaling | Always/Dynamic | Dynamic |
Beaglesse | Yes/No | No |
Beaglethreads | Yes/No | Yes |
Beaglethreadcount | <number> | 99 |
Beaglefloattips | Yes/No | No |
These options are explained below:
Option | Description | Note |
---|---|---|
Usebeagle | Set this option to Yes to attempt to use the BEAGLE library to compute the phylogenetic likelihood on a variety of high-performance hardware including multicore CPUs and GPUs. | Some models in MrBayes are not yet supported by BEAGLE. |
Beagleresource | Set this option to the number of a specific resource you wish to use with BEAGLE (use Showbeagle to see the list of available resources). | Set to 99 for auto-resource selection. |
Beagledevice | Set this option to GPU or CPU to select the processor. | |
Beagleprecision | Select Single or Double precision computation. | |
Beaglescaling | Always rescales partial likelihoods at each evaluation. Dynamic rescales less frequently and should run faster. | |
Beaglesse | Use SSE instructions on Intel CPU processors. | |
Beaglethreads | Use threading for parallelism on multi-core CPU processors. | |
Beaglethreadcount | Set maximum number of CPU threads to be used by BEAGLE. | Set to 99 for auto-threading. |
Beaglefloattips | Use floating-point representation for tip sequence data. | Can result in improved performance on GPU devices at the cost of additional memory usage. |
When running BEAGLE on CPUs, the preferred option is likely to be the double-precision SSE code with dynamic scaling:
MrBayes > set usebeagle=yes beagledevice=cpu beagleprecision=double beaglescaling=dynamic beaglesse=yes
The default MrBayes likelihood calculator uses single-precision SSE code, which should theoretically be the fastest option on the CPU. Currently, BEAGLE only supports double-precision SSE code, which is slower. However, the code we use to call BEAGLE supports smart dynamic scaling, unlike the code calling the default calculator, which may well compensate for the slowdown caused by the increased precision. Therefore, you need to test both options before knowing which one will be faster on your machine.
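One way to test both CPU options is to prepare two short batch files that differ only in their set line. The sketch below is an illustration under several assumptions: write_mb_batch is a made-up helper, data.nex stands in for your own alignment file, and the mcmc values are placeholders chosen only to keep the run short. It uses the standard MrBayes commands set, execute, mcmc and quit inside a mrbayes block.

```shell
#!/bin/sh
# write_mb_batch OUTFILE DATAFILE SET_OPTIONS
# Write a MrBayes batch file that loads DATAFILE, applies SET_OPTIONS with
# `set`, and runs one short chain. Values are benchmark placeholders only.
write_mb_batch() {
    cat > "$1" <<EOF
#NEXUS
begin mrbayes;
    set autoclose=yes nowarn=yes;
    execute $2;
    set $3;
    mcmc nruns=1 ngen=10000 samplefreq=100;
    quit;
end;
EOF
}

# data.nex is a hypothetical alignment file; substitute your own.
write_mb_batch bench_builtin.nex data.nex "usebeagle=no"
write_mb_batch bench_beagle_cpu.nex data.nex \
    "usebeagle=yes beagledevice=cpu beagleprecision=double beaglescaling=dynamic beaglesse=yes"
```

Each file can then be run with, for example, `time mb bench_builtin.nex`. Because nowarn=yes is set, both runs overwrite the same output files derived from the data file name, which is acceptable for a throwaway benchmark but not for a real analysis.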
To use the GPU, you simply switch from CPU to GPU (if you have an available GPU). With the GPU, the fastest option should be the single-precision code with dynamic scaling:
MrBayes > set usebeagle=yes beagledevice=gpu beagleprecision=single beaglescaling=dynamic
- Note: The SSE option is not applicable in the GPU case.
The GPU code can be a lot faster than the CPU code, particularly for amino acid and codon models. However, the length of the sequences also influences the speed-up. In general, the longer the sequences are, the better GPU performance you can expect. If the sequences are short, the overhead involved in shuffling data to and from the GPU may well overshadow any performance gain you get in the computation step. Try the various calculator options out in short runs before you decide on the best option for longer runs.
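The short trial runs suggested above can be timed with a small wrapper. This is a rough sketch rather than a benchmark tool: time_run is a hypothetical helper, the three batch file names are placeholders for files you prepare yourself (one per calculator configuration, each containing its own set line and a short mcmc run), and the timing has whole-second resolution only.

```shell
#!/bin/sh
# time_run LABEL COMMAND [ARGS...]
# Run COMMAND with its output discarded and print
# "<elapsed seconds> <label> (exit <status>)".
time_run() {
    label=$1; shift
    start=$(date +%s)
    status=0
    "$@" > /dev/null 2>&1 || status=$?
    end=$(date +%s)
    echo "$((end - start)) $label (exit $status)"
}

# Hypothetical batch files; only those that exist are run.
if command -v mb > /dev/null 2>&1; then
    for batch in bench_builtin.nex bench_beagle_cpu.nex bench_beagle_gpu.nex; do
        [ -f "$batch" ] && time_run "$batch" mb "$batch"
    done | sort -n
fi
```

The output is sorted so the fastest configuration comes first. Since start-up cost and GPU data transfer weigh more heavily on very short chains, it is worth repeating the comparison with a chain length closer to your real analysis before committing to one setting.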