Reference tree for APSES domains
How this tree was computed
The input sequence alignment was derived from the ProbCons MSA of 74 fungal APSES domains (from Assignment 3). The PHYLIP input file appears below. Columns in regions of uncertain alignment were deleted, as were large gapped sections. The tree shown below was then built with promlk, the PHYLIP program that constructs a maximum-likelihood (ML) tree under the assumption of a "molecular clock", using default parameters plus global optimization (running time > 4 h on my workstation).
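The text above does not say how the deleted columns were chosen, and it was most likely done by inspection. As a rough automated stand-in for the "large gapped sections" part only, the sketch below drops every column whose gap fraction exceeds a threshold. The function name `drop_gappy_columns`, the 0.5 threshold and the toy alignment are all assumptions made for illustration; they are not the procedure or the data used for this tree.

```python
def drop_gappy_columns(alignment, max_gap_fraction=0.5):
    """Return a copy of the alignment without heavily gapped columns.

    alignment: dict mapping sequence name -> aligned sequence.
    A column is kept if its fraction of '-' characters is at most
    max_gap_fraction (an illustrative threshold, not a recommendation).
    """
    names = list(alignment)
    if not names:
        return {}
    length = len(alignment[names[0]])
    if any(len(alignment[n]) != length for n in names):
        raise ValueError("all aligned sequences must have the same length")

    keep = []
    for col in range(length):
        gaps = sum(1 for n in names if alignment[n][col] == "-")
        if gaps / len(names) <= max_gap_fraction:
            keep.append(col)

    # Rebuild every sequence from the retained columns only.
    return {n: "".join(alignment[n][c] for c in keep) for n in names}


# Hypothetical toy alignment: the third column is a gap in 2 of 3 sequences.
toy = {
    "seq_a": "MK-RT",
    "seq_b": "MK-RS",
    "seq_c": "MRARS",
}
trimmed = drop_gappy_columns(toy, max_gap_fraction=0.5)
```

Regions of uncertain alignment cannot be detected by gap counting alone, so a filter like this would only be a first pass before manual editing.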
The program retree was then used on the output treefile to rotate particular clades around their branch points. This arranges the species within each clade, as nearly as possible, in the order in which they appear in the reference cladogram. Although this is a "cosmetic" change (rotations around branch points do not change the topology of a tree), it facilitates analysis, especially when evaluating how many species are present in each clade and which species may be missing.
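The claim that rotations leave the topology unchanged can be checked directly on Newick treefiles. The sketch below is a minimal illustration, not part of PHYLIP: it parses a simple rooted Newick string (unquoted labels only), sorts the children of every node into a canonical order, and compares the results. The helper names and the four-leaf example trees are assumptions made for illustration.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested tuples of leaf names.

    Branch lengths and internal node labels are ignored.
    Quoted labels are not supported in this sketch.
    """
    text = "".join(text.split()).rstrip(";")  # drop whitespace and final ';'
    pos = 0

    def skip_label_and_length():
        nonlocal pos
        while pos < len(text) and text[pos] not in ",()":
            pos += 1

    def node():
        nonlocal pos
        if pos < len(text) and text[pos] == "(":
            pos += 1  # consume "("
            children = [node()]
            while pos < len(text) and text[pos] == ",":
                pos += 1  # consume ","
                children.append(node())
            if pos >= len(text) or text[pos] != ")":
                raise ValueError("unbalanced parentheses in Newick string")
            pos += 1  # consume ")"
            skip_label_and_length()
            return tuple(children)
        start = pos
        while pos < len(text) and text[pos] not in ",():":
            pos += 1
        name = text[start:pos]
        skip_label_and_length()
        return name

    return node()


def canonical(tree):
    """Order-independent form: children of every node are sorted."""
    if isinstance(tree, str):
        return tree
    return tuple(sorted((canonical(child) for child in tree), key=repr))


def same_topology(newick_a, newick_b):
    """True if two rooted trees differ only by rotations at branch points."""
    return canonical(parse_newick(newick_a)) == canonical(parse_newick(newick_b))


original = "((A:1,B:1):1,(C:1,D:1):1);"
rotated = "((D:1,C:1):1,(B:1,A:1):1);"    # same clades, children swapped
regrouped = "((A:1,C:1):1,(B:1,D:1):1);"  # different clades

print(same_topology(original, rotated))    # True
print(same_topology(original, regrouped))  # False
```

Because promlk assumes a molecular clock and therefore produces a rooted tree, comparing rooted forms is appropriate here; comparing unrooted trees would need an extra re-rooting step.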
The APSES Reference Tree
+------11:9301 MAGGR +-88 +-87 +------12:9978 GIBZE | | +-86 +---------10:2599 ASPTE | | +-------------------85 +---------13:3009 ASPNI | | | +----------9:1244 ASPFU +-------------77 | | +--------------------8:6482 CANGL | | | | | | +----6:XBP1 SACCE | +---------78 +----84 | | +-83 +----5:3869 EREGO | | | | | | | +----------7:9773 DEBHA | +--------79 | | +----------4:5262 KLULA | +-80 | | +---------3:0918 CANAL | +-81 | | +--------2:5499 YARLI | +-82 | +--------1:0925 USTMA | | +-19:0837 NEUCR | +----98 | | +-18:8552 MAGGR | +-97 | | | +----21:PHD1 SACCE +-76 | +-99 | | +--96 +----20:9680 CANGL | | | | | | | | +-23:0305 GIBZE | | +-95 +--100 | | | | +-22:3440 ASPNI | | | | | | | | +-25:8847 CANGL | | | +-----101 | | +-91 +-24:5299 KLULA | | | | | | | | +-16:3001 EREGO | | | | +-94 | | | | +-93 +-15:SOK2 SACCE | | +-90 | | | | | | | +---92 +-14:9785 DEBHA | | | | | | | | | +----17:1102 YARLI | | | | | | | +---------26:2292 YARLI | | | | +---------------------------------89 +-31:8256 ASPTE | | +----106 | | | +-30:5125 ASPFU | | +-104 | | | | +-29:0447 DEBHA | | +103 +---105 | | | | +-28:1513 CANAL | +102 | | | +----------32:4197 CANAL | | | +----------27:4237 CANAL | | +----53:5548 ASPTE | +130 | | +----54:1770 YARLI | | | +-----129 +-55:MBP1 MAGGR | | | +-132 | | | | +-56:MBP1 GIBZE | | +131 | | | +-----59:MBP1 NEUCR | | +133 | | | +-58:4319 ASPNI | | +---134 | | +-57:MBP1 ASPFU | | | | +-64:MBP1 ASPNI | | +----140 | +-----128 | | +-63:MBP1 ASPTE | | | +139 +141 -75 | | | | +-62:4232 ASPFU | | | | | | | | +---137 +-------65:5821 NEUCR | | | | | | | | | | +---61:2974 MAGGR | | | | +-138 | | | +136 +---60:0560 GIBZE | | | | | | | | | | +----------69:9090 CRYNE | | | | | | | | | | +142 +------67:1485 USTMA | | +135 | +144 | | | +143 +------66:5496 SCHPO | | | | | +116 | +--------68:MBP1 USTMA | | | | | | | | +-------71:MBP1 YARLI | | | +---145 | | | +-------70:MBP1 CRYNE | | | | | | +---52:MBP1 SACCE | | | +126 | | | | | +-50:MBP1 EREGO | | | | +127 | | | 
+-----------123 +-51:MBP1 KLULA | | | | | | | | | | +-----49:MBP1 CANGL | | | | +124 | | | | | +-47:MBP1 DEBHA | +------------115 +117 +--125 | | | | +-48:MBP1 CANAL | | | | | | | | +---------41:6370 EREGO | | | +-------118 | | | | +---------46:4890 KLULA | | | +119 | | | | +--------45:4966 CANGL | | | +120 | | | | +--------44:SWI4 SACCE | +108 | +121 | | | | | +-42:7246 DEBHA | | | | +-----122 | | | | +-43:2876 CANAL | | | | | | | +---------------------40:MBP1 SCHPO | | | | | | +--------39:2267 NEUCR | | | | | | | | +--35:3762 MAGGR | | | | +---113 | | +-------------------------109 | +--36:5459 GIBZE +------107 | +112 | | | | +--------38:7766 ASPNI | | | +114 | +110 +--------37:6132 SCHPO | | | | +--33:6355 ASPTE | +---111 | +--34:3510 ASPFU | | +--------73:9901 DEBHA | +------------------------147 +146 +--------72:3412 CANAL | +-----------------------------------74:6166 SCHPO
PHYLIP Input File
74 72
MBP1_SACCEIMKRKKDDW-VNATHILKAANF-AKA--KRTREKVQGGFGKYQGTWVPLNIAKQLAEKF--SVYDQLK-PLF
MBP1_YARLIVMRRKSDGW-VNATHILKVAGF-DKP--QRTREKVQGGYGKYQGTWVPLERAREIATLY--DVDSHLA-PIF
5821_NEUCRVMRRRHDDW-VNATHILKAAGF-DKP--ARTREKIQGGYGRYQGTWIPLEQAEALARRN--NIYERLK-PIF
9090_CRYNEVMRRRSDAY-LNATQILKVAGF-DKP--QRTREKVQGGYGKYQGTWIPIERGLALAKQY--GVEDILR-PII
MBP1_ASPNIVMRRRSDDW-INATHILKVAGF-DKP--ARTREKVQGGYGKYQGTWIPLQEGRQLAERN--NILDKLL-PIF
MBP1_KLULAIMKRKADNW-VNATHILKAAKF-PKA--KRTREKVQGGFGKYQGTWIPLELASKLAEKF--EVLDELK-PLF
MBP1_GIBZEVMRRRNDSW-LNATQILKVAGV-DKG--KRTKEKVQGGYGKYQGTWIKFERGLQVCRQY--GVEELLR-PLL
MBP1_ASPTEVMRRRADDW-INATHILKVAGF-DKP--ARTREKVQGGYGKYQGTWIPLPEGRLLAERN--NIIDKLR-PIF
MBP1_CANALIMRRKKDSW-INATHILKIAKF-PKA--KRTREKVQGGYGKYQGTYVPLDLGAAIARNF--GVYDVLK-PIF
MBP1_CANGLIMKRKNDGW-VNATHILKAANF-AKA--KRTREKVQGGFGKYQGTWVPLNIAINLAEKF--DVYQDLK-PLF
1770_YARLIVMRRRTDSS-LNATQILKVAGV-EKS--KRTKEKVQGGYGKYQGTWIPYERGVDLCRQY--SVYDVLQ-PLL
2974_MAGGRVMRRRVDDW-INATHILKAAGF-DKP--ARTREKVQGGYGKYQGTWIPLEAGEALAHRN--NIFDRLR-PIF
1485_USTMAVMRRRGDGW-LNATQILKIAGI-EKT--RRTKEKIQGGYGKFQGTWIPLQRAQQVAAEY--NVSHLLQ-PIL
MBP1_USTMAVMRRRSDDW-LNATQILKVVGL-DKP--QRTREKVQGGYGKYQGTWIPLDVAIELAERY--NIQGLLQ-PIT
0560_GIBZEVMRRRSDDW-INATHILKAAGF-DKP--ARTREKIQGGYGKYQGTWIPLESGQALAERH--SVIDRLR-PIF
4232_ASPFU-MRRRGDDW-INATHILKVAGF-DKP--ARTREKVQGGYGKYQGTWIPLHEGRLLAERN--NIIDKLR-PIF
MBP1_CRYNEVMRRASDSW-VNATQILKVAGV-HKS--ARTKEKIQGGYGKYQGTWVPLDRGRDLAEQY--GVGSYLS-SVF
MBP1_NEUCRVMRRQKDGW-VNATQILKVANI-DKG--RRTKEKVQGGYGKYQGTWIPFERGLEVCRQY--GVEELLS-KLL
MBP1_DEBHAIMRRKLDSW-INATHILKIAKF-PKA--KRTREKVQGGYGKYQGTYVPLDLGADIAKNF--GVFDSLR-PIF
2876_CANALIMRRCKDDW-VNATQILKCCNF-PKA--KRTKEKVQGGFGRFQGTWIPLEDARRLAKTY--GVTEELA-PVL
MBP1_MAGGRVMKRIGDSK-LNATQILKVAGV-EKG--KRTKEKVQGGYGKYQGTWIKYERALEVCRQY--GVEELLR-PLL
4319_ASPNIVMKRRSDGW-LNATQILKVAGV-VKA--RRTKEKVQGGYGKYQGTWVNYQRGVELCREY--HVEELLR-PLL
MBP1_ASPFUVMKRRSDSW-LNATQILKVAGV-VKA--RRTKEKVQGGYGKYQGTWVNYQRGVELCREY--HVEELLR-PLL
MBP1_SCHPOVMRRRRDSW-LNATQILKVADF-DKP--QRTREKVQGGYGKYQGTWVPFQRGVDLATKY--KVDGIMS-PIL
5548_ASPTEVMKRRSDSW-LNATQILKVAGV-VKA--RRTKEKVQGGYGKYQGTWVNYQRGVDLCREY--HVEELLR-PLL
5496_SCHPOLMKRCHDNW-LNATQILKIAEL-DKP--RRTREKIQGGCGKYQGTWVPSERAVELAHEY--NVFDLIQ-PLI
7246_DEBHAIMRRCKDDW-VNATQILKCCNF-PKA--KRTKEKIQGGYGRFQGTWIPLADAQRLAASY--GVTPDLA-PVL
MBP1_EREGOIMKRKADDW-VNATHILKAAKF-AKA--KRTREKVQGGFGKYQGTWVPLDIARRLAQKF--EVLEELR-PLF
6370_EREGOVMRRLHDDW-VNITQVFKVATF-SKT--QRTKEKIQGGYGRFQGTWIPLDSAKGLVAKY--EITDIVVLTVI
SWI4_SACCEVMRRTKDDW-INITQVFKIAQF-SKT--KRTKEKVQGGYGRFQGTWIPLDSAKFLVNKY--EIIDPVVNSIL
4890_KLULAIMRRCNDNW-LNITQVFKAGSF-TKA--QRTKEKIQGGYGRFQGTWIPWESTKYLVEKY--NINNKVVKRIV
4966_CANGLVMRRTMDDW-VNVTQVFKIAQF-SKT--QRTKEKVQGGYGRFQGTWVPLEAAKFMTTKY--NIDNPVVNTI-
9785_DEBHAVVRRADNNM-INGTKLLNVAQM-TRG--RRDGHVVKIGSMHLKGVWIPFERALAMAQRE--GIVDLLY-PLF
3009_ASPNIVMWDYNIGL-VRTTHLFKCNDY-SKT--TPAKHSITGGALAAQGYWMPYEAAKAIAATFCWKIRFALT-PLF
SOK2_SACCEVVRRADNDM-VNGTKLLNVTKM-TRG--RRDGHVVKIGSMHLKGVWIPFERALAIAQRE--KIADYLY-PLF
9680_CANGLVVRRADNDM-VNGTKLLNVTGM-TRG--RRDGDVVKGGPMTLKGVWIPIDRARAIARQE--GIEQWLY-PLF
3001_EREGOVVRRADNDM-INGTKLLNVAKM-TRG--RRDGHVVKIGSMHLKGVWIPFERALALAQRE--KIVDMLF-PLF
4197_CANALVVRRADNNM-INGTKLLNVAQM-TRG--RRDGHVVKIGSMHLKGVWIPFERALAMAQRE--QIVDMLY-PLF
4237_CANALVVRRADNNM-INGTKLLNVAQM-TRG--RRDGHVVKIGSMHLKGVWIPFERALAMAQRE--QIVDMLY-PLF
8256_ASPTEVARREDNSM-INGTKLLNVAGM-TRG--RRDGHVVKIGPMHLKGVWIPFERALEFANKE--KITDLLY-PLF
3440_ASPNIVARREDNGM-INGTKLLNVAGM-TRG--RRDGNVVKIGPMHLKGVWIPFDRALEFANKE--KITDLLY-PLF
2292_YARLIVARREDNDM-INGTKLLNVAGM-TRG--RRDGHVVKAGAMHLKGVWIPYDRALEFANKE--KIIDLLF-PLF
1102_YARLIVARREDNNM-INGTKLLNVVGM-TRG--RRDGHVVKIGAMHLKGVWIPYERALAFAQRE--RIVDVLY-PLF
5125_ASPFUVARREDNHM-INGTKLLNVAGM-TRG--RRDGHVVKIGPMHLKGVWIPFERALEFANKE--KITDLLY-PLF
PHD1_SACCEVVRRADNNM-INGTKLLNVTKM-TRG--RRDGEVVKIGSMHLKGVWIPFERAYILAQRE--QILDHLY-PLF
8847_CANGLVVRRADNDM-INGTKLLNVTKM-TRG--KRDGKVVKIGSMHLKGVWIPFERALFIAKRE--KIVDLLY-PLF
5499_YARLIIIWDYHTGY-VHLTGLWKAIGN-SKA--DIVKRRVRGGYLKIQGTWVPYDIARALASRTCYFIRFALI-PLF
5299_KLULAVVRRADNDM-INGTKLLNVTRM-TRG--RRDGHVVKIGSMHLKGVWIPFERALVMAQRE--KIVDLLY-ALF
0305_GIBZEVARREDNHM-INGTKLLNVAGM-TRG--RRDGHVVKIGPMHLKGVWIPYDRALDFANKE--KITELLY-PLF
0837_NEUCRVARREDNAM-INGTKLLNVAGM-TRG--RRDGHVVKIGPMHLKGVWIPFERALDFANKE--KITELLY-PLF
8552_MAGGRVARREDNHM-INGTKLLNVAGM-TRG--RRDGHVVKIGPMHLKGVWIPFERALDFANKE--KITELLY-PLF
0447_DEBHAVSRREDTNY-VNGTKLLNVAGM-TRG--KRDGSVVKVGAMNLKGVWIPFERASEIARNE--GIDGLLY-PLF
9978_GIBZEVMWDYNIGL-VRMTPFFKCRGY-GKT--IPAKHSITGGSIAAQGYWMPYRCAKAICATFCHPIAGALI-PIF
1513_CANALVSRREDTNY-INGTKLLNVIGM-TRG--KRDGNVVKVGSMNLKGVWIPFDRAYEIARNE--GVDSLLY-PLF
6132_SCHPO-LRRCPDSY-FNISQILRLAGT-SSS--ENAKENVDSKHPQIDGVWVPYDRAISIAKRY--GVYEILQ-PLI
1244_ASPFUVMWDYNIGL-VRTTHLFKCNDY-SKM--LNA-HSITGGALAAQGYWMPYEAAKAVAATFCWKIRHALT-PLF
0925_USTMAMMIDVDTSF-VRFTSITQALGK-NKV--NFGRTKLKGGYLSIQGTWLPFDLAKELSRRIAWEIRDHLV-PLF
2599_ASPTEIMWDYNIGL-VRTTPLFRSQNY-SKT--TPAKHSITGGAIVKPGYWIPFEAAKAVAATFCWRIRYALT-PIF
9773_DEBHAIIWDYETGF-VHLTGIWKASIN-DEVKADIVKKRIRGGFLKIQGTWLPFDLCKMLAKRFCYHIRFQLI-PIF
0918_CANALVIWDYETGW-VHLTGIWKASLT-IDGKADIVKKRIRGGFLKIQGTWLPYKLCKILARRFCYYLRYSLI-PIF
9901_DEBHAILRRVQDSY-INISQLFSILLKISEA--QLTNSSGGHEVRDLRGLWIPYDRAVSLALKF--DIYELAK-SLF
7766_ASPNILMRRSKDGY-VSATGMFKIAFP-WAK--LEEETRPESEDEIAGNVWISPVLALELAAEY--KMYDWVR-ALL
5459_GIBZELMRRSYDGF-VSATGMFKASFP-YAE--ASDESLPTSHEETAGNVWIPPEQALILAEEY--KISPWIR-ALL
2267_NEUCRLMRRSQDGY-ISATGMFKATFP-YAS--QEEESIPTSSEETAGNVWIPPEQALILAEEY--QITPWIR-ALL
3510_ASPFULMRRSKDGY-VSATGMFKIAFP-WAK--LEEETREGSEDEIAGNIWVSPLLALELAKEY--QMYDWVR-ALL
3762_MAGGRLMRRSSDGY-VSATGMFKATFP-YAD--AEDESLPASKEETAGNVWISPDQALALAEEY--SIATWIR-ALL
3412_CANALVLRRVQDSF-VNVTQLFQILIKLPTS--QVDNGSSSHQNIYLQGIWIPYDKAVNLALKF--DIYEITK-KLF
6166_SCHPOLMRMAKDSS-ISATSMFRSAFP-KAT--QEEEDNLNIEDKRVAGLWVPPADALALAKDY--SMTPFIN-ALL
XBP1_SACCE---------------RDLICQS-YKD--F--LKRIRGGYIKIQGTWLPMEISRLLCLRFCFPIRYFLV-PIF
6355_ASPTETY-FLMDGY-VSATGMFKIAFP-WAK--LDEESREESEDEIAGNVWISPKLALELAGEY--QMYNWVR-ALL
9301_MAGGRVMWDYGCGL-VRMTHFFKCRGY-TKT--VPGKYSITGGSISAQESPIDREEAESMYGRSMQAQAQQQG-PLR
5262_KLULAYI---DLHWHLNP------TLS-TLL--G--QKRIRGGYIKIQGTWLPYPVSKELCSRFCYPLRYLLV-PLF
3869_EREGOYT---DVHWNVDPTWKQRLCRL-YQQ--E--KKRIRGGYIKIQGTWLPMEICKRLCIRFCFPIRYFLV-PIF
6482_CANGLSVNYLDFHW-FDISEKVRSQIF-EQF--K--QQRIRGGYIKIQGTWVPWYIAKLICIRFCFPIRYLLV-PIF
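The input above follows the PHYLIP layout: a header giving the number of sequences and the number of columns (74 and 72), then one record per sequence in which the first 10 characters are the name and the rest is the aligned sequence. The sketch below reads such data by splitting on whitespace, so it does not depend on how the records are broken across lines. It assumes that names are exactly 10 characters and that neither names nor sequences contain spaces, which holds for this file but not for PHYLIP files in general. The function name `read_phylip_tokens` is hypothetical, and the sample reuses the first two records from the file with the header adjusted to match.

```python
def read_phylip_tokens(text, name_width=10):
    """Read PHYLIP-style data into a dict of name -> aligned sequence.

    Assumes fixed-width names without spaces and one whitespace-free
    token per record (name immediately followed by the sequence).
    """
    tokens = text.split()
    if len(tokens) < 2:
        raise ValueError("missing PHYLIP header")
    n_seqs, n_cols = int(tokens[0]), int(tokens[1])

    records = {}
    for token in tokens[2:]:
        name, seq = token[:name_width], token[name_width:]
        if len(seq) != n_cols:
            raise ValueError(f"{name}: expected {n_cols} columns, got {len(seq)}")
        if name in records:
            raise ValueError(f"duplicate sequence name: {name}")
        records[name] = seq

    if len(records) != n_seqs:
        raise ValueError(f"expected {n_seqs} sequences, got {len(records)}")
    return records


# First two records of the input file, with the header changed from "74 72" to "2 72".
sample = """2 72
MBP1_SACCEIMKRKKDDW-VNATHILKAANF-AKA--KRTREKVQGGFGKYQGTWVPLNIAKQLAEKF--SVYDQLK-PLF
MBP1_YARLIVMRRKSDGW-VNATHILKVAGF-DKP--QRTREKVQGGYGKYQGTWVPLERAREIATLY--DVDSHLA-PIF
"""
records = read_phylip_tokens(sample)
for name, seq in records.items():
    print(name, len(seq))
```

The length and count checks catch records that were truncated or merged during copying, a useful sanity test before starting a run that takes hours.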