BIO bootstrapping with PHYLIP
Bootstrapping PHYLIP trees
A very brief overview of how to produce bootstrap results for PHYLIP trees using PROML.
Principle
- Create multiple bootstrapped copies (e.g. 100) of your input data using seqboot.
- Run your tree estimation program of choice using the M input option (analyze multiple data sets).
- Use the program consense to calculate your consensus tree.
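To make the last step concrete: the support value reported for a group in a consensus tree is essentially the fraction of bootstrap trees in which that group appears. The following is a minimal Python sketch of that counting idea only. It is not how consense is implemented, and both the representation of a tree as a set of clades (each a frozenset of taxon names) and the function name clade_support are assumptions made for illustration.

```python
from collections import Counter


def clade_support(trees):
    """Return the fraction of trees in which each clade occurs.

    `trees` is a list of trees; each tree is given as a set of clades,
    and each clade is a frozenset of taxon names (illustrative format).
    """
    if not trees:
        return {}
    counts = Counter()
    for clades in trees:
        # set() guards against a clade being listed twice in one tree
        counts.update(set(clades))
    return {clade: n / len(trees) for clade, n in counts.items()}


# Three toy "bootstrap trees" over taxa A-D (made-up data)
ab = frozenset({"A", "B"})
cd = frozenset({"C", "D"})
ac = frozenset({"A", "C"})
toy_trees = [{ab, cd}, {ab, cd}, {ac}]

support = clade_support(toy_trees)
print(round(support[ab], 2))  # clade {A, B} occurs in 2 of 3 trees
```

With 100 bootstrap replicates, a clade found in 87 of the trees would get a support of 0.87 under this scheme.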
Input data
Create a PHYLIP input file with the usual infile filename. Something like this:
7 77
KilA_ESCCO ---------R AKDGYINATS MCRTAGKLLS DYTRLLSRDM GIPISEIQSF
Mbp1_SACCE IHSTGSIMKR KKDDWVNATH ILKAANFAKA KRTRILEKEV LKE--THEKV
Mbp1_NEUCR -VNNVAVMRR RHDDWVNATH ILKAAGFDKP ARTRILEREV QKD--THEKI
Mbp1_CANAL VTSEGPIMRR KKDSWINATH ILKIAKFPKA KRTRILEKDV QTG--IHEKV
Mbp1_USTMA IINNVAVMRR RSDDWLNATQ ILKVVGLDKP QRTRVLEREI QKG--IHEKV
Mbp1_ASPNI -----SVMRR RSDDWINATH ILKVAGFDKP ARTRILEREV QKG--VHEKV
Mbp1_SCHPO -IKGVSVMRR RRDSWLNATQ ILKVADFDKP QRTRVLERQV QIG--AHEKV
KGGRPENQGT WVHPDIAINL AQ-----
QGGFGKYQGT WVPLNIAKQL AEKFSVY
QGGYGRYQGT WIPLEQAEAL ARRNNIY
QGGYGKYQGT YVPLDLGAAI ARNFGVY
QGGYGKYQGT WIPLDVAIEL AERYNI-
QGGYGKYQGT WIPLQEGRQL AERNNI-
QGGYGKYQGT WVPFQRGVDL ATKYKV-
seqboot
- Read the documentation for the seqboot program.
- Run seqboot on your infile.
- Set your parameters. I have used the defaults for this example. The random seed should be of the form 4n+1 (e.g. 5, 9, 13).
- The usual outfile is created. Here is the first bootstrap replicate from the run.
7 77
KilA_ESCCO ---------- -RKKGGGYIA TTMMCCRRRL SIISSEIQQQ GGRRRNQQQQ GTWVPIIIAI
Mbp1_SACCE HHSSTGSIMK KRKKDDDWVA TTIILLKRRL E----THEEE GGFFFYQQQQ GTWVLIIIAK
Mbp1_NEUCR VVNNNVAVMR RRHHDDDWVA TTIILLKRRL E----THEEE GGYYYYQQQQ GTWILQQQAE
Mbp1_CANAL TTSSEGPIMR RRKKSSSWIA TTIILLKRRL E----IHEEE GGYYYYQQQQ GTYVLLLLGA
Mbp1_USTMA IINNNVAVMR RRSSDDDWLA TTIILLKRRL E----IHEEE GGYYYYQQQQ GTWILVVVAI
Mbp1_ASPNI ------SVMR RRSSDDDWIA TTIILLKRRL E----VHEEE GGYYYYQQQQ GTWILEEEGR
Mbp1_SCHPO IIKKGVSVMR RRRRSSSWLA TTIILLKRRL E----AHEEE GGYYYYQQQQ GTWVFRRRGV
INNLLAAAQQ Q------
KQQLLAAAEE EKKSSVY
EAALLAAARR RRRNNIY
AAAIIAAARR RNNGGVY
IEELLAAAEE ERRNNI-
RQQLLAAAEE ERRNNI-
VDDLLAAATT TKKKKV-
Note how approximately 1/3 of the columns are duplicates of another column.
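The duplicated columns follow directly from sampling with replacement: when as many columns are drawn as the alignment has, on average about 37% (1/e) of the original columns are never picked, so a similar share of the replicate consists of repeats. Below is a minimal Python sketch of the resampling idea. It is not seqboot's actual code: it uses Python's own random number generator, the function name bootstrap_columns is made up for this example, and sorting the drawn indices is an assumption chosen because the replicate above appears to keep columns in their original order.

```python
import random


def bootstrap_columns(seqs, seed=1):
    """Resample alignment columns with replacement.

    `seqs` maps sequence names to equal-length aligned strings.
    Returns (replicate, picks): the resampled alignment and the sorted
    list of sampled column indices.
    """
    length = len(next(iter(seqs.values())))
    rng = random.Random(seed)  # Python's generator, not PHYLIP's
    # Draw `length` column indices with replacement; sorting keeps the
    # drawn columns in their original left-to-right order.
    picks = sorted(rng.randrange(length) for _ in range(length))
    replicate = {
        name: "".join(s[i] for i in picks) for name, s in seqs.items()
    }
    return replicate, picks


# First ten alignment columns of two sequences from the example input
toy = {"KilA_ESCCO": "---------R", "Mbp1_SACCE": "IHSTGSIMKR"}
replicate, picks = bootstrap_columns(toy, seed=5)
unused = len(next(iter(toy.values()))) - len(set(picks))
print(replicate)
print("original columns not sampled:", unused)
```

The same column index is applied to every sequence, so each replicate column is still a valid alignment column; only the mix of columns changes.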
proml
The output of seqboot works for most of the tree estimation programs. Be aware that running time will increase by a factor of 100 for 100 bootstrap replicates.
- Read the documentation for the proml program.
- Rename the previous outfile as the new infile.
- Run proml on your infile.
- Set your parameters. I have used the defaults for this example, except for choosing the M option for multiple datasets and, as prompted, D for data (not weights), the number of replicates (100), a random seed, and "jumbling" only once. (While this is running, you can read about common input options, such as what "jumble" means here.)
- The usual outfile and outtree are created. Have a look.
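One quick check when looking at outtree is that it holds one tree per bootstrap replicate. PHYLIP writes trees in Newick format, in which each tree ends with a semicolon, so counting semicolons gives the number of trees. This is a minimal sketch, assuming the file is called outtree in the current directory and that taxon names contain no semicolons; the function name count_newick_trees is made up for this example.

```python
from pathlib import Path


def count_newick_trees(text):
    """Count Newick trees in `text`, assuming each tree ends with ';'."""
    return text.count(";")


# Toy stand-in for the contents of an outtree file (made-up trees)
sample = "((A:0.1,B:0.2):0.05,C:0.3);\n((A:0.1,C:0.2):0.05,B:0.3);\n"
n_sample = count_newick_trees(sample)
print(n_sample)  # → 2

# With a real run, read the file instead (the path is an assumption)
outtree = Path("outtree")
if outtree.exists():
    print(count_newick_trees(outtree.read_text()))
```

For the run described here, the count should come out as 100.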
Notes
Further reading and resources