There are nearly 14 million viral genome sequences right now in the GISAID EpiCoV ™ database. It is not likely to infer the phylogenetic relationships for such a huge dataset by traditional maximum likelyhood or Bayesian methods in a shor time period. The UShER package was developed to generate ultra-large phylogenetic tree of SARS-CoV-2 genomes. The algorithm of the UShER program is to place new samples onto an existing phylogeny using maximum parsimony method. It is able to place given SARS-CoV-2 genome sequences into the GISAID global phylogeny in a couple of hours. This program is particularly helpful in understanding the relationships of newly sequenced SARS-CoV-2 genomes with each other and with previously sequenced genomes in a global phylogeny.
The UShER package is composed by four programs:
UShER: a program that rapidly places new samples onto an existing phylogeny using maximum parsimony.
matUtils: a toolkit for querying, interpreting and manipulating the mutation-annotated trees (MATs).
matOptimize: a program to rapidly and effectively optimize a mutation-annotated tree (MAT) for parsimony using subtree pruning and regrafting (SPR) moves within a user-defined radius.
RIPPLES: a program that uses a phylogenomic technique to rapidly and sensitively detect recombinant nodes and their ancestors in a mutation-annotated tree (MAT).
The taxoniumtools and Taxonium website are used to display the MAT generated by UShER.
MrBayes 3 is a program for Bayesian inference and model choice across a large space of phylogenetic and evolutionary models. And BEAGLE is a high-performance library that can perform the core calculations at the heart of most Bayesian and Maximum Likelihood phylogenetics packages. BEAGLE is able to accelerate MrBayes analyses very much.
This post introduces the way to run MrBayes along with BEAGLE.
BEAST (and BEAST 2, of course) is widely uesed for phylogenetic inference. But it is quite complicated. This post represents some useful tips while using BEAST.