Reference tree for APSES domains

From "A B C"
Revision as of 22:06, 4 December 2006 by Boris (talk | contribs)
How this tree was computed

The input alignment was derived from the ProbCons MSA of 74 fungal APSES domains (from Assignment 3); the PHYLIP input file appears below. Columns in regions of uncertain alignment were deleted, as were large gapped sections. The tree below was then constructed with promlk from the PHYLIP suite (an ML tree building program that constructs a tree under the assumption of a "molecular clock"), using default parameters plus global optimization (running time > 4 h on my workstation).
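
The column deletion described above was presumably done by inspection. As a rough illustration of the gap-related part only, the following Python sketch drops columns whose gap fraction exceeds a threshold. The function name drop_gappy_columns, the 0.5 cutoff and the toy sequences are assumptions made for this example; they are not the procedure or data used for this tree, and judging "uncertain alignment" still needs a human eye.

```python
def drop_gappy_columns(seqs, max_gap_fraction=0.5, gap_chars="-"):
    """Remove alignment columns in which the fraction of gap
    characters is larger than max_gap_fraction (hypothetical cutoff)."""
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")

    keep = []
    for col in range(length):
        gaps = sum(1 for s in seqs if s[col] in gap_chars)
        if gaps / len(seqs) <= max_gap_fraction:
            keep.append(col)

    # Rebuild every sequence from the retained columns only.
    return ["".join(s[c] for c in keep) for s in seqs]


# Toy alignment (made up): column 3 is a gap in three of four rows.
toy = ["MK-RA", "MK-RA", "MR-KA", "MKWR-"]
trimmed = drop_gappy_columns(toy, max_gap_fraction=0.5)
print(trimmed)
```

The last column is kept because only one of four sequences has a gap there, which is below the cutoff.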

The program retree was used on the output treefile to rotate particular clades around their branch points. This was done to arrange the species within a clade as nearly as possible in the order in which they appear in the reference cladogram. While this is a "cosmetic" change (rotations around branch points do not change the topology of a tree), it facilitates analysis, especially when evaluating how many species are present in each clade and which species may be missing.
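
To see why a rotation is only cosmetic, it helps to compare the set of clades before and after. The Python sketch below represents a tree as nested tuples, rotates the children at one branch point, and checks that every clade (the set of leaves under an internal node) is unchanged even though the written order differs. The three leaves are taken from the small clade at nodes 83 and 84 as read from the ASCII tree; the helper names are made up for this example and have nothing to do with how retree works internally.

```python
def leaves(tree):
    """Return the set of leaf names below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clades(tree):
    """Return one leaf set per internal node; this captures the topology."""
    if isinstance(tree, str):
        return set()
    result = {leaves(tree)}
    for child in tree:
        result |= clades(child)
    return result


def rotate(tree):
    """Reverse the order of the children at this branch point."""
    if isinstance(tree, str):
        return tree
    return tuple(reversed(tree))


def to_newick(tree):
    """Write the nested tuples as a Newick string (no branch lengths)."""
    if isinstance(tree, str):
        return tree
    return "(" + ",".join(to_newick(child) for child in tree) + ")"


original = (("XBP1_SACCE", "3869_EREGO"), "9773_DEBHA")
rotated = rotate(original)

print(to_newick(original) + ";")
print(to_newick(rotated) + ";")
print(clades(original) == clades(rotated))
```

The two Newick strings differ, but the clade sets are equal, which is what "the topology does not change" means here.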

The APSES Reference Tree

                                                  +------11:9301 MAGGR
                                               +-88  
                                            +-87  +------12:9978 GIBZE
                                            |  |  
                                         +-86  +---------10:2599 ASPTE
                                         |  |  
                    +-------------------85  +---------13:3009 ASPNI
                    |                    |  
                    |                    +----------9:1244 ASPFU
     +-------------77  
     |              |          +--------------------8:6482 CANGL
     |              |          |  
     |              |          |                  +----6:XBP1 SACCE
     |              +---------78            +----84  
     |                         |         +-83     +----5:3869 EREGO
     |                         |         |  |  
     |                         |         |  +----------7:9773 DEBHA
     |                         +--------79  
     |                                   |  +----------4:5262 KLULA
     |                                   +-80  
     |                                      |  +---------3:0918 CANAL
     |                                      +-81  
     |                                         |  +--------2:5499 YARLI
     |                                         +-82  
     |                                            +--------1:0925 USTMA
     |  
     |                                                        +-19:0837 NEUCR
     |                                                  +----98  
     |                                                  |     +-18:8552 MAGGR
     |                                               +-97  
     |                                               |  |  +----21:PHD1 SACCE
  +-76                                               |  +-99  
  |  |                                           +--96     +----20:9680 CANGL
  |  |                                           |   |  
  |  |                                           |   |    +-23:0305 GIBZE
  |  |                                        +-95   +--100  
  |  |                                        |  |        +-22:3440 ASPNI
  |  |                                        |  |  
  |  |                                        |  |       +-25:8847 CANGL
  |  |                                        |  +-----101  
  |  |                                     +-91          +-24:5299 KLULA
  |  |                                     |  |  
  |  |                                     |  |          +-16:3001 EREGO
  |  |                                     |  |       +-94  
  |  |                                     |  |    +-93  +-15:SOK2 SACCE
  |  |                                  +-90  |    |  |  
  |  |                                  |  |  +---92  +-14:9785 DEBHA
  |  |                                  |  |       |  
  |  |                                  |  |       +----17:1102 YARLI
  |  |                                  |  |  
  |  |                                  |  +---------26:2292 YARLI
  |  |                                  |  
  |  +---------------------------------89                +-31:8256 ASPTE
  |                                     |         +----106  
  |                                     |         |      +-30:5125 ASPFU
  |                                     |     +-104  
  |                                     |     |   |     +-29:0447 DEBHA
  |                                     |  +103   +---105  
  |                                     |  |  |         +-28:1513 CANAL
  |                                     +102  |  
  |                                        |  +----------32:4197 CANAL
  |                                        |  
  |                                        +----------27:4237 CANAL
  |  
  |                                                +----53:5548 ASPTE
  |                                             +130  
  |                                             |  +----54:1770 YARLI
  |                                             |  
  |                                     +-----129      +-55:MBP1 MAGGR
  |                                     |       |  +-132  
  |                                     |       |  |   +-56:MBP1 GIBZE
  |                                     |       +131  
  |                                     |          |  +-----59:MBP1 NEUCR
  |                                     |          +133  
  |                                     |             |     +-58:4319 ASPNI
  |                                     |             +---134  
  |                                     |                   +-57:MBP1 ASPFU
  |                                     |  
  |                                     |                     +-64:MBP1 ASPNI
  |                                     |              +----140  
  |                             +-----128              |      |  +-63:MBP1 ASPTE
  |                             |       |           +139      +141  
-75                             |       |           |  |         +-62:4232 ASPFU
  |                             |       |           |  |  
  |                             |       |     +---137  +-------65:5821 NEUCR
  |                             |       |     |     |  
  |                             |       |     |     |   +---61:2974 MAGGR
  |                             |       |     |     +-138  
  |                             |       |  +136         +---60:0560 GIBZE
  |                             |       |  |  |  
  |                             |       |  |  |  +----------69:9090 CRYNE
  |                             |       |  |  |  |  
  |                             |       |  |  +142     +------67:1485 USTMA
  |                             |       +135     |  +144  
  |                             |          |     +143  +------66:5496 SCHPO
  |                             |          |        |  
  |                          +116          |        +--------68:MBP1 USTMA
  |                          |  |          |  
  |                          |  |          |     +-------71:MBP1 YARLI
  |                          |  |          +---145  
  |                          |  |                +-------70:MBP1 CRYNE
  |                          |  |  
  |                          |  |                   +---52:MBP1 SACCE
  |                          |  |                +126  
  |                          |  |                |  |  +-50:MBP1 EREGO
  |                          |  |                |  +127  
  |                          |  |  +-----------123     +-51:MBP1 KLULA
  |                          |  |  |             |  
  |                          |  |  |             |  +-----49:MBP1 CANGL
  |                          |  |  |             +124  
  |                          |  |  |                |    +-47:MBP1 DEBHA
  |           +------------115  +117                +--125  
  |           |              |     |                     +-48:MBP1 CANAL
  |           |              |     |  
  |           |              |     |         +---------41:6370 EREGO
  |           |              |     +-------118  
  |           |              |               |  +---------46:4890 KLULA
  |           |              |               +119  
  |           |              |                  |  +--------45:4966 CANGL
  |           |              |                  +120  
  |           |              |                     |  +--------44:SWI4 SACCE
  |        +108              |                     +121  
  |        |  |              |                        |       +-42:7246 DEBHA
  |        |  |              |                        +-----122  
  |        |  |              |                                +-43:2876 CANAL
  |        |  |              |  
  |        |  |              +---------------------40:MBP1 SCHPO
  |        |  |  
  |        |  |                           +--------39:2267 NEUCR
  |        |  |                           |  
  |        |  |                           |           +--35:3762 MAGGR
  |        |  |                           |     +---113  
  |        |  +-------------------------109     |     +--36:5459 GIBZE
  +------107                              |  +112  
           |                              |  |  |  +--------38:7766 ASPNI
           |                              |  |  +114  
           |                              +110     +--------37:6132 SCHPO
           |                                 |  
           |                                 |     +--33:6355 ASPTE
           |                                 +---111  
           |                                       +--34:3510 ASPFU
           |  
           |                             +--------73:9901 DEBHA
           |  +------------------------147  
           +146                          +--------72:3412 CANAL
              |  
              +-----------------------------------74:6166 SCHPO
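
One way to check which species are present in a clade, and which are missing, is to tally the five-letter species codes on its leaf lines. The Python sketch below does this for the six leaves under node 123, copied from the tree above with the leading branch characters shortened. The regular expression, the function name and the list ALL_SPECIES (simply the 16 codes that occur somewhere in this tree, not necessarily every species in the reference cladogram) are assumptions made for this example.

```python
import re
from collections import Counter

# Matches a leaf label such as "52:MBP1 SACCE" at the end of a tree line.
LEAF = re.compile(r"(\d+):(\S+)\s+([A-Z]{5})\s*$")

# The species codes that appear in the tree on this page (assumed list).
ALL_SPECIES = [
    "ASPFU", "ASPNI", "ASPTE", "CANAL", "CANGL", "CRYNE", "DEBHA", "EREGO",
    "GIBZE", "KLULA", "MAGGR", "NEUCR", "SACCE", "SCHPO", "USTMA", "YARLI",
]


def species_in_clade(tree_lines):
    """Count how often each species code occurs among the leaf lines."""
    counts = Counter()
    for line in tree_lines:
        match = LEAF.search(line)
        if match:  # internal-node lines have no "number:name CODE" label
            counts[match.group(3)] += 1
    return counts


clade_123 = [
    "+---52:MBP1 SACCE",
    "+126",
    "|  +-50:MBP1 EREGO",
    "+127",
    "+-51:MBP1 KLULA",
    "+-----49:MBP1 CANGL",
    "+-47:MBP1 DEBHA",
    "+-48:MBP1 CANAL",
]

present = species_in_clade(clade_123)
missing = sorted(set(ALL_SPECIES) - set(present))
print(sorted(present))
print(missing)
```

A count greater than one for a species would point to paralogues within the clade; a species in the missing list may reflect a gene loss or a sequence that was not found.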


PHYLIP Input File

  74  72
MBP1_SACCEIMKRKKDDW-VNATHILKAANF-AKA--KRTREKVQGGFGKYQGTWVPLNIAKQLAEKF--SVYDQLK-PLF
MBP1_YARLIVMRRKSDGW-VNATHILKVAGF-DKP--QRTREKVQGGYGKYQGTWVPLERAREIATLY--DVDSHLA-PIF
5821_NEUCRVMRRRHDDW-VNATHILKAAGF-DKP--ARTREKIQGGYGRYQGTWIPLEQAEALARRN--NIYERLK-PIF
9090_CRYNEVMRRRSDAY-LNATQILKVAGF-DKP--QRTREKVQGGYGKYQGTWIPIERGLALAKQY--GVEDILR-PII
MBP1_ASPNIVMRRRSDDW-INATHILKVAGF-DKP--ARTREKVQGGYGKYQGTWIPLQEGRQLAERN--NILDKLL-PIF
MBP1_KLULAIMKRKADNW-VNATHILKAAKF-PKA--KRTREKVQGGFGKYQGTWIPLELASKLAEKF--EVLDELK-PLF
MBP1_GIBZEVMRRRNDSW-LNATQILKVAGV-DKG--KRTKEKVQGGYGKYQGTWIKFERGLQVCRQY--GVEELLR-PLL
MBP1_ASPTEVMRRRADDW-INATHILKVAGF-DKP--ARTREKVQGGYGKYQGTWIPLPEGRLLAERN--NIIDKLR-PIF
MBP1_CANALIMRRKKDSW-INATHILKIAKF-PKA--KRTREKVQGGYGKYQGTYVPLDLGAAIARNF--GVYDVLK-PIF
MBP1_CANGLIMKRKNDGW-VNATHILKAANF-AKA--KRTREKVQGGFGKYQGTWVPLNIAINLAEKF--DVYQDLK-PLF
1770_YARLIVMRRRTDSS-LNATQILKVAGV-EKS--KRTKEKVQGGYGKYQGTWIPYERGVDLCRQY--SVYDVLQ-PLL
2974_MAGGRVMRRRVDDW-INATHILKAAGF-DKP--ARTREKVQGGYGKYQGTWIPLEAGEALAHRN--NIFDRLR-PIF
1485_USTMAVMRRRGDGW-LNATQILKIAGI-EKT--RRTKEKIQGGYGKFQGTWIPLQRAQQVAAEY--NVSHLLQ-PIL
MBP1_USTMAVMRRRSDDW-LNATQILKVVGL-DKP--QRTREKVQGGYGKYQGTWIPLDVAIELAERY--NIQGLLQ-PIT
0560_GIBZEVMRRRSDDW-INATHILKAAGF-DKP--ARTREKIQGGYGKYQGTWIPLESGQALAERH--SVIDRLR-PIF
4232_ASPFU-MRRRGDDW-INATHILKVAGF-DKP--ARTREKVQGGYGKYQGTWIPLHEGRLLAERN--NIIDKLR-PIF
MBP1_CRYNEVMRRASDSW-VNATQILKVAGV-HKS--ARTKEKIQGGYGKYQGTWVPLDRGRDLAEQY--GVGSYLS-SVF
MBP1_NEUCRVMRRQKDGW-VNATQILKVANI-DKG--RRTKEKVQGGYGKYQGTWIPFERGLEVCRQY--GVEELLS-KLL
MBP1_DEBHAIMRRKLDSW-INATHILKIAKF-PKA--KRTREKVQGGYGKYQGTYVPLDLGADIAKNF--GVFDSLR-PIF
2876_CANALIMRRCKDDW-VNATQILKCCNF-PKA--KRTKEKVQGGFGRFQGTWIPLEDARRLAKTY--GVTEELA-PVL
MBP1_MAGGRVMKRIGDSK-LNATQILKVAGV-EKG--KRTKEKVQGGYGKYQGTWIKYERALEVCRQY--GVEELLR-PLL
4319_ASPNIVMKRRSDGW-LNATQILKVAGV-VKA--RRTKEKVQGGYGKYQGTWVNYQRGVELCREY--HVEELLR-PLL
MBP1_ASPFUVMKRRSDSW-LNATQILKVAGV-VKA--RRTKEKVQGGYGKYQGTWVNYQRGVELCREY--HVEELLR-PLL
MBP1_SCHPOVMRRRRDSW-LNATQILKVADF-DKP--QRTREKVQGGYGKYQGTWVPFQRGVDLATKY--KVDGIMS-PIL
5548_ASPTEVMKRRSDSW-LNATQILKVAGV-VKA--RRTKEKVQGGYGKYQGTWVNYQRGVDLCREY--HVEELLR-PLL
5496_SCHPOLMKRCHDNW-LNATQILKIAEL-DKP--RRTREKIQGGCGKYQGTWVPSERAVELAHEY--NVFDLIQ-PLI
7246_DEBHAIMRRCKDDW-VNATQILKCCNF-PKA--KRTKEKIQGGYGRFQGTWIPLADAQRLAASY--GVTPDLA-PVL
MBP1_EREGOIMKRKADDW-VNATHILKAAKF-AKA--KRTREKVQGGFGKYQGTWVPLDIARRLAQKF--EVLEELR-PLF
6370_EREGOVMRRLHDDW-VNITQVFKVATF-SKT--QRTKEKIQGGYGRFQGTWIPLDSAKGLVAKY--EITDIVVLTVI
SWI4_SACCEVMRRTKDDW-INITQVFKIAQF-SKT--KRTKEKVQGGYGRFQGTWIPLDSAKFLVNKY--EIIDPVVNSIL
4890_KLULAIMRRCNDNW-LNITQVFKAGSF-TKA--QRTKEKIQGGYGRFQGTWIPWESTKYLVEKY--NINNKVVKRIV
4966_CANGLVMRRTMDDW-VNVTQVFKIAQF-SKT--QRTKEKVQGGYGRFQGTWVPLEAAKFMTTKY--NIDNPVVNTI-
9785_DEBHAVVRRADNNM-INGTKLLNVAQM-TRG--RRDGHVVKIGSMHLKGVWIPFERALAMAQRE--GIVDLLY-PLF
3009_ASPNIVMWDYNIGL-VRTTHLFKCNDY-SKT--TPAKHSITGGALAAQGYWMPYEAAKAIAATFCWKIRFALT-PLF
SOK2_SACCEVVRRADNDM-VNGTKLLNVTKM-TRG--RRDGHVVKIGSMHLKGVWIPFERALAIAQRE--KIADYLY-PLF
9680_CANGLVVRRADNDM-VNGTKLLNVTGM-TRG--RRDGDVVKGGPMTLKGVWIPIDRARAIARQE--GIEQWLY-PLF
3001_EREGOVVRRADNDM-INGTKLLNVAKM-TRG--RRDGHVVKIGSMHLKGVWIPFERALALAQRE--KIVDMLF-PLF
4197_CANALVVRRADNNM-INGTKLLNVAQM-TRG--RRDGHVVKIGSMHLKGVWIPFERALAMAQRE--QIVDMLY-PLF
4237_CANALVVRRADNNM-INGTKLLNVAQM-TRG--RRDGHVVKIGSMHLKGVWIPFERALAMAQRE--QIVDMLY-PLF
8256_ASPTEVARREDNSM-INGTKLLNVAGM-TRG--RRDGHVVKIGPMHLKGVWIPFERALEFANKE--KITDLLY-PLF
3440_ASPNIVARREDNGM-INGTKLLNVAGM-TRG--RRDGNVVKIGPMHLKGVWIPFDRALEFANKE--KITDLLY-PLF
2292_YARLIVARREDNDM-INGTKLLNVAGM-TRG--RRDGHVVKAGAMHLKGVWIPYDRALEFANKE--KIIDLLF-PLF
1102_YARLIVARREDNNM-INGTKLLNVVGM-TRG--RRDGHVVKIGAMHLKGVWIPYERALAFAQRE--RIVDVLY-PLF
5125_ASPFUVARREDNHM-INGTKLLNVAGM-TRG--RRDGHVVKIGPMHLKGVWIPFERALEFANKE--KITDLLY-PLF
PHD1_SACCEVVRRADNNM-INGTKLLNVTKM-TRG--RRDGEVVKIGSMHLKGVWIPFERAYILAQRE--QILDHLY-PLF
8847_CANGLVVRRADNDM-INGTKLLNVTKM-TRG--KRDGKVVKIGSMHLKGVWIPFERALFIAKRE--KIVDLLY-PLF
5499_YARLIIIWDYHTGY-VHLTGLWKAIGN-SKA--DIVKRRVRGGYLKIQGTWVPYDIARALASRTCYFIRFALI-PLF
5299_KLULAVVRRADNDM-INGTKLLNVTRM-TRG--RRDGHVVKIGSMHLKGVWIPFERALVMAQRE--KIVDLLY-ALF
0305_GIBZEVARREDNHM-INGTKLLNVAGM-TRG--RRDGHVVKIGPMHLKGVWIPYDRALDFANKE--KITELLY-PLF
0837_NEUCRVARREDNAM-INGTKLLNVAGM-TRG--RRDGHVVKIGPMHLKGVWIPFERALDFANKE--KITELLY-PLF
8552_MAGGRVARREDNHM-INGTKLLNVAGM-TRG--RRDGHVVKIGPMHLKGVWIPFERALDFANKE--KITELLY-PLF
0447_DEBHAVSRREDTNY-VNGTKLLNVAGM-TRG--KRDGSVVKVGAMNLKGVWIPFERASEIARNE--GIDGLLY-PLF
9978_GIBZEVMWDYNIGL-VRMTPFFKCRGY-GKT--IPAKHSITGGSIAAQGYWMPYRCAKAICATFCHPIAGALI-PIF
1513_CANALVSRREDTNY-INGTKLLNVIGM-TRG--KRDGNVVKVGSMNLKGVWIPFDRAYEIARNE--GVDSLLY-PLF
6132_SCHPO-LRRCPDSY-FNISQILRLAGT-SSS--ENAKENVDSKHPQIDGVWVPYDRAISIAKRY--GVYEILQ-PLI
1244_ASPFUVMWDYNIGL-VRTTHLFKCNDY-SKM--LNA-HSITGGALAAQGYWMPYEAAKAVAATFCWKIRHALT-PLF
0925_USTMAMMIDVDTSF-VRFTSITQALGK-NKV--NFGRTKLKGGYLSIQGTWLPFDLAKELSRRIAWEIRDHLV-PLF
2599_ASPTEIMWDYNIGL-VRTTPLFRSQNY-SKT--TPAKHSITGGAIVKPGYWIPFEAAKAVAATFCWRIRYALT-PIF
9773_DEBHAIIWDYETGF-VHLTGIWKASIN-DEVKADIVKKRIRGGFLKIQGTWLPFDLCKMLAKRFCYHIRFQLI-PIF
0918_CANALVIWDYETGW-VHLTGIWKASLT-IDGKADIVKKRIRGGFLKIQGTWLPYKLCKILARRFCYYLRYSLI-PIF
9901_DEBHAILRRVQDSY-INISQLFSILLKISEA--QLTNSSGGHEVRDLRGLWIPYDRAVSLALKF--DIYELAK-SLF
7766_ASPNILMRRSKDGY-VSATGMFKIAFP-WAK--LEEETRPESEDEIAGNVWISPVLALELAAEY--KMYDWVR-ALL
5459_GIBZELMRRSYDGF-VSATGMFKASFP-YAE--ASDESLPTSHEETAGNVWIPPEQALILAEEY--KISPWIR-ALL
2267_NEUCRLMRRSQDGY-ISATGMFKATFP-YAS--QEEESIPTSSEETAGNVWIPPEQALILAEEY--QITPWIR-ALL
3510_ASPFULMRRSKDGY-VSATGMFKIAFP-WAK--LEEETREGSEDEIAGNIWVSPLLALELAKEY--QMYDWVR-ALL
3762_MAGGRLMRRSSDGY-VSATGMFKATFP-YAD--AEDESLPASKEETAGNVWISPDQALALAEEY--SIATWIR-ALL
3412_CANALVLRRVQDSF-VNVTQLFQILIKLPTS--QVDNGSSSHQNIYLQGIWIPYDKAVNLALKF--DIYEITK-KLF
6166_SCHPOLMRMAKDSS-ISATSMFRSAFP-KAT--QEEEDNLNIEDKRVAGLWVPPADALALAKDY--SMTPFIN-ALL
XBP1_SACCE---------------RDLICQS-YKD--F--LKRIRGGYIKIQGTWLPMEISRLLCLRFCFPIRYFLV-PIF
6355_ASPTETY-FLMDGY-VSATGMFKIAFP-WAK--LDEESREESEDEIAGNVWISPKLALELAGEY--QMYNWVR-ALL
9301_MAGGRVMWDYGCGL-VRMTHFFKCRGY-TKT--VPGKYSITGGSISAQESPIDREEAESMYGRSMQAQAQQQG-PLR
5262_KLULAYI---DLHWHLNP------TLS-TLL--G--QKRIRGGYIKIQGTWLPYPVSKELCSRFCYPLRYLLV-PLF
3869_EREGOYT---DVHWNVDPTWKQRLCRL-YQQ--E--KKRIRGGYIKIQGTWLPMEICKRLCIRFCFPIRYFLV-PIF
6482_CANGLSVNYLDFHW-FDISEKVRSQIF-EQF--K--QQRIRGGYIKIQGTWVPWYIAKLICIRFCFPIRYLLV-PIF
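
Before a run of several hours it is worth checking that the input file is consistent with its header. The Python sketch below reads a file laid out like the one above (a header with the number of sequences and columns, then one line per sequence with a 10-character name immediately followed by the aligned residues) and verifies both counts. The function name parse_phylip is made up for this example, the sample is the first three entries truncated to 12 columns for brevity, and the parser only handles this simple one-line-per-sequence layout, not interleaved or wrapped PHYLIP files.

```python
def parse_phylip(text, name_width=10):
    """Parse a simple PHYLIP alignment: header line, then one line per
    sequence with a fixed-width name directly followed by the residues."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_seqs, n_cols = (int(x) for x in lines[0].split())

    records = {}
    for ln in lines[1:]:
        name = ln[:name_width].strip()
        seq = ln[name_width:].replace(" ", "")
        if len(seq) != n_cols:
            raise ValueError(f"{name}: expected {n_cols} columns, got {len(seq)}")
        if name in records:
            raise ValueError(f"duplicate sequence name: {name}")
        records[name] = seq

    if len(records) != n_seqs:
        raise ValueError(f"expected {n_seqs} sequences, got {len(records)}")
    return records


# First three entries of the file above, truncated to 12 columns.
sample = """\
  3  12
MBP1_SACCEIMKRKKDDW-VN
MBP1_YARLIVMRRKSDGW-VN
5821_NEUCRVMRRRHDDW-VN
"""

records = parse_phylip(sample)
for name, seq in records.items():
    print(name, len(seq))
```

Run against the full file, the same check should confirm 74 sequences of 72 columns each, as stated in the header.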