Difference between revisions of "Reference APSES domains (reference species)"

From "A B C"
Line 38: Line 38:
 
To make the interpretation of alignments and gene trees easier, the Mbp1 orthologues for all species were named accordingly (e.g. <code>Mbp1_ASPFU</code>). All yeast genes were given the yeast-gene-name  (e.g. <code>Sok2_SACCE</code>). All other sequences were named according to the yeast gene they share the most identities with, where the last digit was replaced with A, B, C - as required.  (e.g. <code>SokA_ASHGO</code>). Note that relabeling sequences does not change the data or its interpretation, it is just helpful. Finally the squences were sorted to have the Mbp1 orthologues first, then all other sequences by organism.
 
  
====The final 74 sequences====
+
====The final 69 sequences====
  
  >MBP1_SACCE NP_010227 024..107
+
  >Mbp1_SACCE (79  ids)  NP_010227   (024..102)
  SIMKRKKDDWVNATHILKAANFAKAKRTRILEKEVLKETHEKVQGGFGKYQGTWVPLNIAKQLAEKFSVYDQLKPLFDFTQTDG
+
  SIMKRKKDDWVNATHILKAANFAKAKRTRILEKEVLKETHEKVQGGFGKYQGTWVPLNIAKQLAEKFSVYDQLKPLFDF
  >MBP1_YARLI XP_500257 022..105
+
  >Mbp1_ASHGO (66  ids)  NP_986147    (031..109)
  AVMRRKSDGWVNATHILKVAGFDKPQRTRILEKEVQKGVHEKVQGGYGKYQGTWVPLERAREIATLYDVDSHLAPIFNYDDEDG
+
  SIMKRKADDWVNATHILKAAKFAKAKRTRILEKEVIKDTHEKVQGGFGKYQGTWVPLDIARRLAQKFEVLEELRPLFDF
  >5821_NEUCR XP_955821 037..118
+
  >Mbp1_ASPFU (49  ids)  XP_754232    (001..077)
  VMRRRHDDWVNATHILKAAGFDKPARTRILEREVQKDTHEKIQGGYGRYQGTWIPLEQAEALARRNNIYERLKPIFEFQPGN
+
  MRRRGDDWINATHILKVAGFDKPARTRILEREVQKGTHEKVQGGYGKYQGTWIPLHEGRLLAERNNIIDKLRPIFDY
  >9090_CRYNE XP_569090 036..117
+
  >Mbp1_ASPNI (50 ids) XP_660758   (028..106)
  AVMRRRSDAYLNATQILKVAGFDKPQRTRVLEREVQKGEHEKVQGGYGKYQGTWIPIERGLALAKQYGVEDILRPIIDYVPT
+
  SVMRRRSDDWINATHILKVAGFDKPARTRILEREVQKGVHEKVQGGYGKYQGTWIPLQEGRQLAERNNILDKLLPIFDY
  >MBP1_ASPNI XP_660758 028..110
+
  >Mbp1_ASPTE (49 ids) XP_001213217 (028..106)
  SVMRRRSDDWINATHILKVAGFDKPARTRILEREVQKGVHEKVQGGYGKYQGTWIPLQEGRQLAERNNILDKLLPIFDYVAGD
+
  SVMRRRADDWINATHILKVAGFDKPARTRILEREVQKGVHEKVQGGYGKYQGTWIPLPEGRLLAERNNIIDKLRPIFDY
  >MBP1_KLULA XP_454189 025..108
+
  >Mbp1_CANAL (53 ids) XP_723071   (026..103)
  SIMKRKADNWVNATHILKAAKFPKAKRTRILEKEVITDTHEKVQGGFGKYQGTWIPLELASKLAEKFEVLDELKPLFDFTQQEG
+
  IMRRKKDSWINATHILKIAKFPKAKRTRILEKDVQTGIHEKVQGGYGKYQGTYVPLDLGAAIARNFGVYDVLKPIFEF
  >MBP1_GIBZE XP_384396 045..129
+
  >Mbp1_CANGL (71  ids)  XP_445458   (024..102)
  AVMRRRNDSWLNATQILKVAGVDKGKRTKILEKEIQTGEHEKVQGGYGKYQGTWIKFERGLQVCRQYGVEELLRPLLTYDMGQDG
+
  SIMKRKNDGWVNATHILKAANFAKAKRTRILEKEVLKEMHEKVQGGFGKYQGTWVPLNIAINLAEKFDVYQDLKPLFDF
  >MBP1_ASPTE XP_001213217 028..110
+
  >Mbp1_COPCI (43 ids) EAU84310    (025..103)
  SVMRRRADDWINATHILKVAGFDKPARTRILEREVQKGVHEKVQGGYGKYQGTWIPLPEGRLLAERNNIIDKLRPIFDYVAGD
+
  AVMRRRSDSWLNATQILKVAGFDKPQRTRVLEREVQKGEHEKVQGGYGKYQGTWIPLERGMQLAKQYNCEHLLRPIIEF
  >MBP1_CANAL XP_723071 026..108
+
  >Mbp1_CRYNE (47 ids) XP_570545    (133..211)
  IMRRKKDSWINATHILKIAKFPKAKRTRILEKDVQTGIHEKVQGGYGKYQGTYVPLDLGAAIARNFGVYDVLKPIFEFQYIEG
+
  SVMRRASDSWVNATQILKVAGVHKSARTKILEKEVLNGIHEKIQGGYGKYQGTWVPLDRGRDLAEQYGVGSYLSSVFDF
  >MBP1_CANGL XP_445458 024..107
+
  >Mbp1_DEBHA (50 ids) XP_458784    (027..104)
  SIMKRKNDGWVNATHILKAANFAKAKRTRILEKEVLKEMHEKVQGGFGKYQGTWVPLNIAINLAEKFDVYQDLKPLFDFSEENG
+
  IMRRKLDSWINATHILKIAKFPKAKRTRILEKDVQTGVHEKVQGGYGKYQGTYVPLDLGADIAKNFGVFDSLRPIFEF
  >1770_YARLI XP_501770 036..116
+
  >Mbp1_GIBZE (48 ids) XP_390560    (040..117)
  AVMRRRTDSSLNATQILKVAGVEKSKRTKILEKEILTGAHEKVQGGYGKYQGTWIPYERGVDLCRQYSVYDVLQPLLAFDP
+
  VMRRRSDDWINATHILKAAGFDKPARTRILERDVQKDVHEKIQGGYGKYQGTWIPLESGQALAERHSVIDRLRPIFEY
  >2974_MAGGR XP_362974 121..199
+
  >Mbp1_KLULA (64 ids) XP_454189    (025..103)
  VMRRRVDDWINATHILKAAGFDKPARTRILEREVQKDQHEKVQGGYGKYQGTWIPLEAGEALAHRNNIFDRLRPIFEFS
+
  SIMKRKADNWVNATHILKAAKFPKAKRTRILEKEVITDTHEKVQGGFGKYQGTWIPLELASKLAEKFEVLDELKPLFDF
  >1485_USTMA XP_761485 182..262
+
  >Mbp1_MAGGR (48 ids) XP_362974    (040..117)
  AVMRRRGDGWLNATQILKIAGIEKTRRTKILEKSILTGEHEKIQGGYGKFQGTWIPLQRAQQVAAEYNVSHLLQPILEFDP
+
  VMRRRVDDWINATHILKAAGFDKPARTRILEREVQKDQHEKVQGGYGKYQGTWIPLEAGEALAHRNNIFDRLRPIFEF
  >MBP1_USTMA XP_762343 026..107
+
  >Mbp1_NEUCR (50 ids) XP_955821    (037..114)
  AVMRRRSDDWLNATQILKVVGLDKPQRTRVLEREIQKGIHEKVQGGYGKYQGTWIPLDVAIELAERYNIQGLLQPITSYVPS
+
  VMRRRHDDWVNATHILKAAGFDKPARTRILEREVQKDTHEKIQGGYGRYQGTWIPLEQAEALARRNNIYERLKPIFEF
  >0560_GIBZE XP_390560 040..120
+
  >Mbp1_PICST (52 ids) XP_001386821 (026..103)
  VMRRRSDDWINATHILKAAGFDKPARTRILERDVQKDVHEKIQGGYGKYQGTWIPLESGQALAERHSVIDRLRPIFEYVQG
+
  IMRRKLDSWINATHILKIAKFPKAKRTRILEKDVQTGVHEKVQGGYGKYQGTYVPLELGRDIAKNFGVFDILKPIFDF
  >4232_ASPFU XP_754232 001..081
+
>Mbp1_SCHPO (43 ids) NP_595496    (027..103)
  MRRRGDDWINATHILKVAGFDKPARTRILEREVQKGTHEKVQGGYGKYQGTWIPLHEGRLLAERNNIIDKLRPIFDYVAGD
+
  MKRCHDNWLNATQILKIAELDKPRRTRILEKFAQKGLHEKIQGGCGKYQGTWVPSERAVELAHEYNVFDLIQPLIEY
  >MBP1_CRYNE XP_570545 133..214
+
  >Mbp1_USTMA (41 ids) XP_762343    (026..104)
  SVMRRASDSWVNATQILKVAGVHKSARTKILEKEVLNGIHEKIQGGYGKYQGTWVPLDRGRDLAEQYGVGSYLSSVFDFVPS
+
  AVMRRRSDDWLNATQILKVVGLDKPQRTRVLEREIQKGIHEKVQGGYGKYQGTWIPLDVAIELAERYNIQGLLQPITSY
  >MBP1_NEUCR XP_962967 071..155
+
  >Mbp1_YARLI (49 ids) XP_500257    (022..100)
  AVMRRQKDGWVNATQILKVANIDKGRRTKILEKEIQIGEHEKVQGGYGKYQGTWIPFERGLEVCRQYGVEELLSKLLTHNRGQEG
+
  AVMRRKSDGWVNATHILKVAGFDKPQRTRILEKEVQKGVHEKVQGGYGKYQGTWVPLERAREIATLYDVDSHLAPIFNY
  >MBP1_DEBHA XP_458784 027..109
+
  >Swi4_ASHGO (58 ids) NP_986370    (043..115)
  IMRRKLDSWINATHILKIAKFPKAKRTRILEKDVQTGVHEKVQGGYGKYQGTYVPLDLGADIAKNFGVFDSLRPIFEFTYVEG
+
  VMRRLHDDWVNITQVFKVATFSKTQRTKILEKESADISHEKIQGGYGRFQGTWIPLDSAKGLVAKYEITDIVV
  >2876_CANAL XP_712876 006..088
+
  >Sok2_ASHGO (67 ids) NP_983001   (352..425)
  SIMRRCKDDWVNATQILKCCNFPKAKRTKILEKGVQQGLHEKVQGGFGRFQGTWIPLEDARRLAKTYGVTEELAPVLFLDFSD
 
  >MBP1_MAGGR XP_365024 131..210
 
  AVMKRIGDSKLNATQILKVAGVEKGKRTKILEKEIQTGEHEKVQGGYGKYQGTWIKYERALEVCRQYGVEELLRPLLEYN
 
  >4319_ASPNI XP_664319 119..198
 
  AVMKRRSDGWLNATQILKVAGVVKARRTKTLEKEIAAGEHEKVQGGYGKYQGTWVNYQRGVELCREYHVEELLRPLLEYD
 
  >MBP1_ASPFU XP_748947 105..184
 
  AVMKRRSDSWLNATQILKVAGVVKARRTKTLEKEIAAGEHEKVQGGYGKYQGTWVNYQRGVELCREYHVEELLRPLLEYD
 
  >MBP1_SCHPO NP_593032 027..110
 
  SVMRRRRDSWLNATQILKVADFDKPQRTRVLERQVQIGAHEKVQGGYGKYQGTWVPFQRGVDLATKYKVDGIMSPILSLDIDEG
 
  >5548_ASPTE XP_001215548 007..086
 
  AVMKRRSDSWLNATQILKVAGVVKARRTKTLEKEIAAGEHEKVQGGYGKYQGTWVNYQRGVDLCREYHVEELLRPLLEYD
 
  >5496_SCHPO NP_595496 026..106
 
  LMKRCHDNWLNATQILKIAELDKPRRTRILEKFAQKGLHEKIQGGCGKYQGTWVPSERAVELAHEYNVFDLIQPLIEYSGS
 
>7246_DEBHA XP_457246 028..109
 
  IMRRCKDDWVNATQILKCCNFPKAKRTKILEKGVQQGLHEKIQGGYGRFQGTWIPLADAQRLAASYGVTPDLAPVLYLDASD
 
  >MBP1_EREGO NP_986147 031..114
 
  SIMKRKADDWVNATHILKAAKFAKAKRTRILEKEVIKDTHEKVQGGFGKYQGTWVPLDIARRLAQKFEVLEELRPLFDFTRRDG
 
  >6370_EREGO NP_986370 043..124
 
  VMRRLHDDWVNITQVFKVATFSKTQRTKILEKESADISHEKIQGGYGRFQGTWIPLDSAKGLVAKYEITDIVVLTVINFQPD
 
  >SWI4_SACCE NP_011036 060..141
 
  VMRRTKDDWINITQVFKIAQFSKTKRTKILEKESNDMQHEKVQGGYGRFQGTWIPLDSAKFLVNKYEIIDPVVNSILTFQFD
 
  >4890_KLULA XP_454890 119..200
 
  IMRRCNDNWLNITQVFKAGSFTKAQRTKILEKEANEIKHEKIQGGYGRFQGTWIPWESTKYLVEKYNINNKVVKRIVEFIPD
 
  >4966_CANGL XP_444966 062..140
 
  VMRRTMDDWVNVTQVFKIAQFSKTQRTKILEKESTNMKHEKVQGGYGRFQGTWVPLEAAKFMTTKYNIDNPVVNTILSF
 
  >9785_DEBHA XP_459785 307..380
 
  SVVRRADNNMINGTKLLNVAQMTRGRRDGILKSEKVRHVVKIGSMHLKGVWIPFERALAMAQREGIVDLLYPLF
 
  >3009_ASPNI XP_663009 131..216
 
TVMWDYNIGLVRTTHLFKCNDYSKTTPAKMLNQNPGLRDICHSITGGALAAQGYWMPYEAAKAIAATFCWKIRFALTPLFGDNFPD
 
>SOK2_SACCE NP_013729 436..509
 
  SVVRRADNDMVNGTKLLNVTKMTRGRRDGILKAEKIRHVVKIGSMHLKGVWIPFERALAIAQREKIADYLYPLF
 
  >9680_CANGL XP_449680 143..216
 
  TVVRRADNDMVNGTKLLNVTGMTRGRRDGILKNEPVRDVVKGGPMTLKGVWIPIDRARAIARQEGIEQWLYPLF
 
  >3001_EREGO NP_983001 352..425
 
 
  SVVRRADNDMINGTKLLNVAKMTRGRRDGILKAEKVRHVVKIGSMHLKGVWIPFERALALAQREKIVDMLFPLF
 
  SVVRRADNDMINGTKLLNVAKMTRGRRDGILKAEKVRHVVKIGSMHLKGVWIPFERALALAQREKIVDMLFPLF
  >4197_CANAL XP_714197 227..300
+
  >MbpB_ASPFU (22  ids)  XP_751244    (151..225)
  SVVRRADNNMINGTKLLNVAQMTRGRRDGILKSEKVRHVVKIGSMHLKGVWIPFERALAMAQREQIVDMLYPLF
+
  VMWDYNIGLVRTTHLFKCNDYSKMLNANPGLREICHSITGGALAAQGYWMPYEAAKAVAATFCWKIRHALTPLFG
  >4237_CANAL XP_714237 228..301
+
  >MbpA_ASPFU (40  ids)  XP_748947    (105..183)
  SVVRRADNNMINGTKLLNVAQMTRGRRDGILKSEKVRHVVKIGSMHLKGVWIPFERALAMAQREQIVDMLYPLF
+
  AVMKRRSDSWLNATQILKVAGVVKARRTKTLEKEIAAGEHEKVQGGYGKYQGTWVNYQRGVELCREYHVEELLRPLLEY
  >8256_ASPTE XP_001218256 139..211
+
  >Sok2_ASPFU (58  ids)  XP_755125    (152..224)
  VARREDNSMINGTKLLNVAGMTRGRRDGILKSEKIRHVVKIGPMHLKGVWIPFERALEFANKEKITDLLYPLF
+
VARREDNHMINGTKLLNVAGMTRGRRDGILKSEKVRHVVKIGPMHLKGVWIPFERALEFANKEKITDLLYPLF
  >3440_ASPNI XP_663440 152..224
+
>MbpB_ASPNI (19  ids)  XP_001392970 (124..203)
 +
ISWDYNVGLVLTRSLFKCNGHPKTAPAKVLKMNPGLGDISHSITGGALVGQGYWMPFRAAKALATTFCWNIRFVLTPMFG
 +
>SokB_ASPNI (21  ids)  XP_663009    (131..211)
 +
TVMWDYNIGLVRTTHLFKCNDYSKTTPAKMLNQNPGLRDICHSITGGALAAQGYWMPYEAAKAIAATFCWKIRFALTPLFG
 +
  >MbpA_ASPNI (40  ids)  XP_001391313 (118..196)
 +
AVMKRRSDSWLNATQILKVAGVVKARRTKTLEKEIAAGEHEKVQGGYGKYQGTWVNYQRGVELCREYHVEELLRPLLEY
 +
  >SokA_ASPNI (56  ids)  XP_663440   (152..224)
 
  VARREDNGMINGTKLLNVAGMTRGRRDGILKSEKVRNVVKIGPMHLKGVWIPFDRALEFANKEKITDLLYPLF
 
  VARREDNGMINGTKLLNVAGMTRGRRDGILKSEKVRNVVKIGPMHLKGVWIPFDRALEFANKEKITDLLYPLF
  >2292_YARLI XP_502292 285..357
+
  >Sok2_ASPNI (58 ids) XP_001390623 (153..225)
VARREDNDMINGTKLLNVAGMTRGRRDGILKGEKLRHVVKAGAMHLKGVWIPYDRALEFANKEKIIDLLFPLF
 
>1102_YARLI XP_501102 130..202
 
  VARREDNNMINGTKLLNVVGMTRGRRDGILKTEKIRHVVKIGAMHLKGVWIPYERALAFAQRERIVDVLYPLF
 
  >5125_ASPFU XP_755125 152..224
 
 
  VARREDNHMINGTKLLNVAGMTRGRRDGILKSEKVRHVVKIGPMHLKGVWIPFERALEFANKEKITDLLYPLF
 
  VARREDNHMINGTKLLNVAGMTRGRRDGILKSEKVRHVVKIGPMHLKGVWIPFERALEFANKEKITDLLYPLF
  >PHD1_SACCE NP_012881 208..281
+
  >MbpB_ASPTE (21  ids)  XP_001212599 (130..212)
  SVVRRADNNMINGTKLLNVTKMTRGRRDGILRSEKVREVVKIGSMHLKGVWIPFERAYILAQREQILDHLYPLF
+
IMWDYNIGLVRTTPLFRSQNYSKTTPAKVLDANPGLREISHSITGGAIVAQDKPGYWIPFEAAKAVAATFCWRIRYALTPIFG
  >8847_CANGL XP_448847 224..297
+
>MbpA_ASPTE (40  ids)  XP_001215548 (007..085)
 +
  AVMKRRSDSWLNATQILKVAGVVKARRTKTLEKEIAAGEHEKVQGGYGKYQGTWVNYQRGVDLCREYHVEELLRPLLEY
 +
  >Sok2_ASPTE (59  ids)  XP_001218256 (139..211)
 +
VARREDNSMINGTKLLNVAGMTRGRRDGILKSEKIRHVVKIGPMHLKGVWIPFERALEFANKEKITDLLYPLF
 +
>MbpC_CANAL (22  ids)  XP_723412    (087..178)
 +
VLRRVQDSFVNVTQLFQILIKLEVLPTSQVDNYFDNEILSNLKYFGSSSNTPQYLDLRKHQNIYLQGIWIPYDKAVNLALKFDIYEITKKLF
 +
>MbpB_CANAL (25  ids)  XP_710918    (256..346)
 +
VIWDYETGWVHLTGIWKASLTIDGSNVSPSHLKADIVKLLESTPKEYQQYIKRIRGGFLKIQGTWLPYKLCKILARRFCYYLRYSLIPIFG
 +
>MbpA_CANAL (48  ids)  XP_712970    (006..082)
 +
SIMRRCKDDWVNATQILKCCNFPKAKRTKILEKGVQQGLHEKVQGGFGRFQGTWIPLEDARKLAKTYGVTEELAPVL
 +
>Sok2_CANAL (49  ids)  XP_711513    (469..541)
 +
VSRREDTNYINGTKLLNVIGMTRGKRDGILKTEKIKNVVKVGSMNLKGVWIPFDRAYEIARNEGVDSLLYPLF
 +
>Phd1_CANAL (65  ids)  XP_714237    (228..301)
 +
SVVRRADNNMINGTKLLNVAQMTRGRRDGILKSEKVRHVVKIGSMHLKGVWIPFERALAMAQREQIVDMLYPLF
 +
>SokA_CANGL (56  ids)  XP_449680    (143..216)
 +
TVVRRADNDMVNGTKLLNVTGMTRGRRDGILKNEPVRDVVKGGPMTLKGVWIPIDRARAIARQEGIEQWLYPLF
 +
>Swi4_CANGL (61  ids)  XP_444966    (062..140)
 +
VMRRTMDDWVNVTQVFKIAQFSKTQRTKILEKESTNMKHEKVQGGYGRFQGTWVPLEAAKFMTTKYNIDNPVVNTILSF
 +
>Sok2_CANGL (64  ids)  XP_448847   (224..297)
 
  SVVRRADNDMINGTKLLNVTKMTRGKRDGILRSEKYRKVVKIGSMHLKGVWIPFERALFIAKREKIVDLLYPLF
 
  SVVRRADNDMINGTKLLNVTKMTRGKRDGILRSEKYRKVVKIGSMHLKGVWIPFERALFIAKREKIVDLLYPLF
  >5499_YARLI XP_505499 080..165
+
  >MbpA_COPCI (26  ids)  EAU85126    (059..139)
  IIWDYHTGYVHLTGLWKAIGNSKADIVKLIDNSPDLEAVIRRVRGGYLKIQGTWVPYDIARALASRTCYFIRFALIPLFGQDFPGT
+
IMMDIDDGYILWTGIWKALGNSKADIVKMIDSQPDLAPLIRRVRGGYLKIQGTWMPYEVALKLSRRVAWPIRHDLVPLFGF
  >5299_KLULA XP_455299 386..459
+
>MbpA_CRYNE (42  ids)  XP_569090    (036..114)
 +
AVMRRRSDAYLNATQILKVAGFDKPQRTRVLEREVQKGEHEKVQGGYGKYQGTWIPIERGLALAKQYGVEDILRPIIDY
 +
>MbpB_DEBHA (26  ids)  XP_459773    (187..275)
 +
IIWDYETGFVHLTGIWKASINDEVNTHRNLKADIVKLLESTPKQYHQHIKRIRGGFLKIQGTWLPFDLCKMLAKRFCYHIRFQLIPIFG
 +
>Swi4_DEBHA (26  ids)  XP_459901    (067..158)
 +
ILRRVQDSYINISQLFSILLKIGHLSEAQLTNFLNNEILTNTQYLSSGGSNPQFNDLRNHEVRDLRGLWIPYDRAVSLALKFDIYELAKSLF
 +
>MbpA_DEBHA (45  ids)  XP_457246    (028..103)
 +
IMRRCKDDWVNATQILKCCNFPKAKRTKILEKGVQQGLHEKIQGGYGRFQGTWIPLADAQRLAASYGVTPDLAPVL
 +
>SokA_DEBHA (50  ids)  XP_460447    (213..285)
 +
VSRREDTNYVNGTKLLNVAGMTRGKRDGILKTEKTKSVVKVGAMNLKGVWIPFERASEIARNEGIDGLLYPLF
 +
>Sok2_DEBHA (64  ids)  XP_459785    (307..380)
 +
SVVRRADNNMINGTKLLNVAQMTRGRRDGILKSEKVRHVVKIGSMHLKGVWIPFERALAMAQREGIVDLLYPLF
 +
>MbpB_GIBZE (21  ids)  XP_389978    (139..219)
 +
  AVMWDYNIGLVRMTPFFKCRGYGKTIPAKMLGLNPGLKEITHSITGGSIAAQGYWMPYRCAKAICATFCHPIAGALIPIFG
 +
  >MbpA_GIBZE (39  ids)  XP_384396    (045..123)
 +
AVMRRRNDSWLNATQILKVAGVDKGKRTKILEKEIQTGEHEKVQGGYGKYQGTWIKFERGLQVCRQYGVEELLRPLLTY
 +
>Sok2_GIBZE (55  ids)  XP_390305    (226..298)
 +
VARREDNHMINGTKLLNVAGMTRGRRDGILKSEKVRHVVKIGPMHLKGVWIPYDRALDFANKEKITELLYPLF
 +
>Swi4_KLULA (50  ids)  XP_454890    (119..197)
 +
IMRRCNDNWLNITQVFKAGSFTKAQRTKILEKEANEIKHEKIQGGYGRFQGTWIPWESTKYLVEKYNINNKVVKRIVEF
 +
>Sok2_KLULA (67  ids)  XP_455299   (386..459)
 
  SVVRRADNDMINGTKLLNVTRMTRGRRDGILKAEKIRHVVKIGSMHLKGVWIPFERALVMAQREKIVDLLYALF
 
  SVVRRADNDMINGTKLLNVTRMTRGRRDGILKAEKIRHVVKIGSMHLKGVWIPFERALVMAQREKIVDLLYALF
  >0305_GIBZE XP_390305 226..298
+
  >MbpB_MAGGR (20  ids)  XP_369301    (096..176)
  VARREDNHMINGTKLLNVAGMTRGRRDGILKSEKVRHVVKIGPMHLKGVWIPYDRALDFANKEKITELLYPLF
+
  TVMWDYGCGLVRMTHFFKCRGYTKTVPGKVLNQNHGLKDITYSITGGSISAQGYWMPFACARAVCATFCHPIAGALIPIFG
  >0837_NEUCR XP_960837 139..211
+
  >MbpA_MAGGR (39  ids)  XP_365024    (131..209)
  VARREDNAMINGTKLLNVAGMTRGRRDGILKSEKVRHVVKIGPMHLKGVWIPFERALDFANKEKITELLYPLF
+
  AVMKRIGDSKLNATQILKVAGVEKGKRTKILEKEIQTGEHEKVQGGYGKYQGTWIKYERALEVCRQYGVEELLRPLLEY
  >8552_MAGGR XP_368552 127..199
+
  >Sok2_MAGGR (57  ids)  XP_368552   (133..205)
 
  VARREDNHMINGTKLLNVAGMTRGRRDGILKSEKMRHVVKIGPMHLKGVWIPFERALDFANKEKITELLYPLF
 
  VARREDNHMINGTKLLNVAGMTRGRRDGILKSEKMRHVVKIGPMHLKGVWIPFERALDFANKEKITELLYPLF
  >0447_DEBHA XP_460447 213..285
+
  >MbpA_NEUCR (40  ids)  XP_962967    (071..147)
  VSRREDTNYVNGTKLLNVAGMTRGKRDGILKTEKTKSVVKVGAMNLKGVWIPFERASEIARNEGIDGLLYPLF
+
  AVMRRQKDGWVNATQILKVANIDKGRRTKILEKEIQIGEHEKVQGGYGKYQGTWIPFERGLEVCRQYGVEELLSKLL
  >9978_GIBZE XP_389978 139..218
+
  >MbpA_PICST (46  ids)  XP_001383745 (006..081)
  AVMWDYNIGLVRMTPFFKCRGYGKTIPAKMLGLNPGLKEITHSITGGSIAAQGYWMPYRCAKAICATFCHPIAGALIPIF
+
  IMRRCKDDWVNATQILKCCNFPKAKRTKILEKGVQQGLHEKVQGGFGRFQGTWIPLPDAQRLATMYGVTADAAPVL
  >1513_CANAL XP_711513 469..541
+
  >SokA_PICST (49  ids)  XP_001385235 (239..311)
  VSRREDTNYINGTKLLNVIGMTRGKRDGILKTEKIKNVVKVGSMNLKGVWIPFDRAYEIARNEGVDSLLYPLF
+
  VSRREDTNFVNGTKLLNVIGMTRGKRDGILKTEKTRNVVKVGSMNLKGVWIPFDRAFEIARNEGVDEALHPLF
  >6132_SCHPO NP_596132 088..165
+
  >Sok2_PICST (64  ids)  XP_001383609 (194..267)
  LRRCPDSYFNISQILRLAGTSSSENAKELDDIIESGDYENVDSKHPQIDGVWVPYDRAISIAKRYGVYEILQPLISFN
+
  SVVRRADNNMINGTKLLNVAQMTRGRRDGILKSEKVRHVVKIGSMHLKGVWIPFERALAMAQREGIVDLLYPLF
  >1244_ASPFU XP_751244 151..230
+
  >Sok2_SACCE (74  ids)  EDN64408    (435..508)
  VMWDYNIGLVRTTHLFKCNDYSKMLNANPGLREICHSITGGALAAQGYWMPYEAAKAVAATFCWKIRHALTPLFGLDFPS
+
  SVVRRADNDMVNGTKLLNVTKMTRGRRDGILKAEKIRHVVKIGSMHLKGVWIPFERALAIAQREKIADYLYPLF
  >0925_USTMA XP_760925 057..143
+
  >Phd1_SACCE (74  ids)  NP_012881    (208..281)
  TMMIDVDTSFVRFTSITQALGKNKVNFGRLVKTCPALDPHITKLKGGYLSIQGTWLPFDLAKELSRRIAWEIRDHLVPLFGYDFPST
+
  SVVRRADNNMINGTKLLNVTKMTRGRRDGILRSEKVREVVKIGSMHLKGVWIPFERAYILAQREQILDHLYPLF
  >2599_ASPTE XP_001212599 130..218
+
  >Swi4_SACCE (79  ids)  EDN63086    (060..138)
  IMWDYNIGLVRTTPLFRSQNYSKTTPAKVLDANPGLREISHSITGGAIVAQDKPGYWIPFEAAKAVAATFCWRIRYALTPIFGLDFPSQ
+
  VMRRTKDDWINITQVFKIAQFSKTKRTKILEKESNDMQHEKVQGGYGRFQGTWIPLDSAKFLVNKYEIIDPVVNSILTF
  >9773_DEBHA XP_459773 187..274
+
  >MbpB_SCHPO (21 ids) NP_596132    (088..164)
  IIWDYETGFVHLTGIWKASINDEVNTHRNLKADIVKLLESTPKQYHQHIKRIRGGFLKIQGTWLPFDLCKMLAKRFCYHIRFQLIPIF
+
  LRRCPDSYFNISQILRLAGTSSSENAKELDDIIESGDYENVDSKHPQIDGVWVPYDRAISIAKRYGVYEILQPLISF
  >0918_CANAL XP_710918 256..352
+
  >MbpA_SCHPO (41 ids) NP_593032    (027..104)
  VIWDYETGWVHLTGIWKASLTIDGSNVSPSHLKADIVKLLESTPKEYQQYIKRIRGGFLKIQGTWLPYKLCKILARRFCYYLRYSLIPIFGTDFPDS
+
  SVMRRRRDSWLNATQILKVADFDKPQRTRVLERQVQIGAHEKVQGGYGKYQGTWVPFQRGVDLATKYKVDGIMSPILS
  >9901_DEBHA XP_459901 067..158
+
  >MbpA_USTMA (24 ids) XP_760925    (057..138)
  ILRRVQDSYINISQLFSILLKIGHLSEAQLTNFLNNEILTNTQYLSSGGSNPQFNDLRNHEVRDLRGLWIPYDRAVSLALKFDIYELAKSLF
+
  TMMIDVDTSFVRFTSITQALGKNKVNFGRLVKTCPALDPHITKLKGGYLSIQGTWLPFDLAKELSRRIAWEIRDHLVPLFGY
  >7766_ASPNI XP_657766 089..163
+
  >Swi4_USTMA (42 ids) XP_761485    (182..260)
  LMRRSKDGYVSATGMFKIAFPWAKLEEERSEREYLKTRPETSEDEIAGNVWISPVLALELAAEYKMYDWVRALLD
+
  AVMRRRGDGWLNATQILKIAGIEKTRRTKILEKSILTGEHEKIQGGYGKFQGTWIPLQRAQQVAAEYNVSHLLQPILEF
  >5459_GIBZE XP_385459 077..154
+
  >MbpB_YARLI (26 ids) XP_505499    (080..159)
  LMRRSYDGFVSATGMFKASFPYAEASDEDAERKYIKSLPTTSHEETAGNVWIPPEQALILAEEYKISPWIRALLDPTP
+
  IIWDYHTGYVHLTGLWKAIGNSKADIVKLIDNSPDLEAVIRRVRGGYLKIQGTWVPYDIARALASRTCYFIRFALIPLFG
  >2267_NEUCR XP_962267 085..162
+
  >MbpA_YARLI (44 ids) XP_501770    (036..114)
  LMRRSQDGYISATGMFKATFPYASQEEEEAERKYIKSIPTTSSEETAGNVWIPPEQALILAEEYQITPWIRALLDPSD
+
  AVMRRRTDSSLNATQILKVAGVEKSKRTKILEKEILTGAHEKVQGGYGKYQGTWIPYERGVDLCRQYSVYDVLQPLLAF
  >3510_ASPFU XP_753510 089..163
+
  >SokA_YARLI (55 ids) CAB45654    (144..216)
  LMRRSKDGYVSATGMFKIAFPWAKLEEEKAEREYLKTREGTSEDEIAGNIWVSPLLALELAKEYQMYDWVRALLD
+
  VARREDNDMINGTKLLNVAGMTRGRRDGILKGEKLRHVVKAGAMHLKGVWIPYDRALEFANKEKIIDLLFPLF
  >3762_MAGGR XP_363762 084..161
+
  >Sok2_YARLI (60 ids) XP_501102    (130..202)
  LMRRSSDGYVSATGMFKATFPYADAEDEEAERNYIKSLPATSKEETAGNVWISPDQALALAEEYSIATWIRALLDPTD
+
  VARREDNNMINGTKLLNVVGMTRGRRDGILKTEKIRHVVKIGAMHLKGVWIPYERALAFAQRERIVDVLYPLF
  >3412_CANAL XP_723412 087..178
 
  VLRRVQDSFVNVTQLFQILIKLEVLPTSQVDNYFDNEILSNLKYFGSSSNTPQYLDLRKHQNIYLQGIWIPYDKAVNLALKFDIYEITKKLF
 
  >6166_SCHPO NP_596166 062..140
 
  LMRMAKDSSISATSMFRSAFPKATQEEEDLEMRWIRDNLNPIEDKRVAGLWVPPADALALAKDYSMTPFINALLEASST
 
  >XBP1_SACCE NP_012165 314..400
 
  RDLICQSYKDFLINELGPDQIDLPNLNPANFTKRIRGGYIKIQGTWLPMEISRLLCLRFCFPIRYFLVPIFGPDFPKDCESWYLAHQ
 
  >6355_ASPTE XP_001216355 084..167
 
  TYFLMDGYVSATGMFKIAFPWAKLDEERSEREYLKSREETSEDEIAGNVWISPKLALELAGEYQMYNWVRALLDPTDIVQSPS
 
  >9301_MAGGR XP_369301 092..188
 
  EEYTVMWDYGCGLVRMTHFFKCRGYTKTVPGKVLNQNHGLKDITYSITGGSISAQESPNFGRMVIDRELVAHATREAESMYGRSMQAQAQQQGPLR
 
  >5262_KLULA XP_455262 301..388
 
  QQKWNKWFQRESFSTYIDLHWHKLNPTLSTLLGQSYDAKIPFERMVKRIRGGYIKIQGTWLPYPVSKELCSRFCYPLRYLLVPLFGPDFPEKCEYWY
 
  >3869_EREGO NP_983869 277..365
 
  YTDVHWNQVDPTWKQRLCRLYQQEKNLDFTPEFQDCYKRIRGGYIKIQGTWLPMEICKRLCIRFCFPIRYFLVPIFGEGFLQECHNWYF
 
  >6482_CANGL XP_446482 300..390
 
  SVNYLDFHWFDISEKVRSQIFEQFKQHLEKDRNVDCSTIPKAEEYIQRIRGGYIKIQGTWVPWYIAKLICIRFCFPIRYLLVPIFGEQFPV
 

Revision as of 05:11, 10 October 2007


Multi FASTA file of all APSES domains in fungal proteins.

Executing the PSI-BLAST search

The starting point of this list is a BLAST search with one known APSES domain sequence. This query sequence - the Mbp1 APSES domain - was defined as follows, based on Pfam profile 02292: APSES.

>Yeast Mbp1 APSES domain (AA 24..102 of NP_010227)
SIMKRKKDDWVNATHILKAANFAKAKRTRILEKEVLKETHEKVQGGFGKY
QGTWVPLNIAKQLAEKFSVYDQLKPLFDF

A PSI-BLAST search was run against the nr subset of GenPept without further restrictions. The default parameters for PSI-BLAST were used, except that the BLOSUM45 matrix was selected and the E-value cutoff was reduced from 10.0 to 1.0.
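For readers who prefer a command line, the settings described above can be sketched as an argument list for the BLAST+ psiblast program. This is only an illustration under assumptions: the search described here was not necessarily run this way, the query file name mbp1_apses.fa and the iteration cap of 10 are hypothetical, and the inclusion threshold of 0.005 is the value quoted in the text that follows.

```python
def build_psiblast_command(query_fasta, database="nr", max_iterations=10):
    """Return an argument list for the BLAST+ `psiblast` program.

    The matrix and E-value settings mirror the text; `max_iterations`
    is an assumed upper bound, not a value taken from the original search.
    """
    return [
        "psiblast",
        "-query", query_fasta,            # FASTA file with the Mbp1 APSES domain
        "-db", database,                  # protein database to search
        "-matrix", "BLOSUM45",            # scoring matrix used instead of the default
        "-evalue", "1.0",                 # reporting cutoff, reduced from 10.0
        "-inclusion_ethresh", "0.005",    # threshold for inclusion in the profile
        "-num_iterations", str(max_iterations),
    ]


# Print the command so it can be inspected or pasted into a shell.
print(" ".join(build_psiblast_command("mbp1_apses.fa")))
```

The function only builds the command; running it would require a local BLAST+ installation and a local copy of the database.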

The search converged after 6 iterations, i.e. PSI-BLAST found no additional hits above the inclusion threshold E-value of 0.005. In total, 164 sequences were found and contributed to the profile. However, some of these sequences are redundant, i.e. they are matches to the same amino acid sequence in different database entries, and some are from organisms other than the ones we are considering in the assignment. Even though these latter sequences are removed later, it was appropriate to include them initially: they contribute to the information in the PSI-BLAST search profile and improve the sensitivity and specificity of the search.

It would certainly be possible, albeit somewhat tedious, to edit the list of proteins manually by checking or unchecking which hits to include. I have written a short Perl script for this task, solely to be able to rename the sequences at the same time. This is not required; RefSeq / GenPept accession numbers will do just fine to name the sequences, but the final analysis is easier to do if the sequence labels actually tell us something about the organisms they came from and which other sequences they might be similar to.

After removing redundant sequences, sequence fragments that did not span the entire Mbp1 APSES domain, and sequences from fungi that are not in the list of organisms for this course, 69 sequences remained for analysis.
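The three filtering steps can be sketched in a few lines of Python. This is not the Perl script mentioned above, only an illustration under assumptions: records are hypothetical (label, sequence) pairs whose label ends in a five-letter species code, and a plain minimum length stands in for the check that a hit spans the entire domain.

```python
def filter_hits(records, allowed_species, min_length):
    """Keep hits from allowed species, drop short fragments and duplicates.

    records: iterable of (label, sequence) pairs; the label is assumed to
    end in "_SPECIES", e.g. "Mbp1_SACCE".
    min_length: assumed stand-in for "spans the entire domain".
    """
    seen = set()
    kept = []
    for label, sequence in records:
        species = label.rsplit("_", 1)[-1]
        if species not in allowed_species:
            continue  # organism is not on the course list
        if len(sequence) < min_length:
            continue  # fragment that does not cover the domain
        key = (species, sequence)
        if key in seen:
            continue  # same sequence from another database entry
        seen.add(key)
        kept.append((label, sequence))
    return kept
```

Duplicates are detected per species, so identical domains from two different organisms are both kept.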


Constructing the multi-FASTA file

A multi-FASTA file is the default input format for many MSA programs; it is simply a file that contains more than one FASTA-formatted sequence.
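As an illustration of how simple the format is, a minimal reader might look like the sketch below. The function name read_multi_fasta is hypothetical, and the sketch assumes plain text in which every record starts with a ">" header line followed by one or more sequence lines.

```python
def read_multi_fasta(text):
    """Parse multi-FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines between or inside records
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)  # sequence may be wrapped over several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records
```

Wrapped sequence lines are joined, so a record written over two lines yields the same sequence as the same record written on one line.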

The PSI-BLAST search has already defined the sequences from each source protein that are similar to the APSES search profile. We only need to extract them in a convenient way from the search results. NCBI offers a number of options to format the result page; they are reached from a link at the top of the BLAST results page, "Reformat these Results". The principal options for the format are:

  • Pairwise: the default
  • Pairwise with identities: shows only differences to the query sequence
  • query anchored with/without identities: looks something like a multiple sequence alignment, with hyphens for gaps; insertions relative to the query are displayed below the sequence
  • flat-query anchored with/without identities: this now looks like a multiple sequence alignment (in fact it is one: all sequences are aligned to the profile)
  • hit-table: gives only the numerical parameters describing the quality of the matches

When we select the flat-query anchored with/without identities option, it is reasonably straightforward to obtain the aligned sequences, copy and paste them into a Word document and convert that into a multi-FASTA format with a few Edit > Replace commands.
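The same conversion can be scripted instead of done with Edit > Replace. The sketch below rests on assumptions about the pasted text that should be checked against the actual output: each alignment line has four whitespace-separated fields (identifier, start position, residues, end position), the "without identities" variant was chosen so that residues appear as letters, and gaps are hyphens. The function name anchored_to_fasta is hypothetical.

```python
def anchored_to_fasta(text):
    """Convert flat query-anchored alignment text into multi-FASTA text.

    Assumed line layout: "<id> <start> <residues> <end>". Lines that do
    not fit this layout (headers, blank lines) are skipped.
    """
    fragments = {}  # id -> list of residue chunks, in order of appearance
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue
        seq_id, start, residues, end = fields
        if not (start.isdigit() and end.isdigit()):
            continue
        # Drop gap characters: a FASTA record holds the unaligned sequence.
        fragments.setdefault(seq_id, []).append(residues.replace("-", ""))
    return "".join(
        f">{seq_id}\n{''.join(chunks)}\n" for seq_id, chunks in fragments.items()
    )
```

Chunks that belong to the same identifier are joined across alignment blocks, and records come out in the order in which their identifiers first appear.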

Renaming sequences

To make the interpretation of alignments and gene trees easier, the Mbp1 orthologues for all species were named accordingly (e.g. Mbp1_ASPFU). All yeast genes were given the yeast gene name (e.g. Sok2_SACCE). All other sequences were named according to the yeast gene they share the most identities with, with the last digit replaced by A, B, C as required (e.g. SokA_ASHGO). Note that relabeling sequences does not change the data or its interpretation; it is just helpful. Finally, the sequences were sorted to place the Mbp1 orthologues first, followed by all other sequences by organism.
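The naming and sorting rules can be written down as a small sketch. Both function names are hypothetical, and one detail is an assumption: the Mbp1 group is ordered alphabetically by species code here, whereas the list in this document places the yeast sequence first.

```python
import string


def paralogue_label(yeast_gene, species, index):
    """Build a label such as SokA_ASHGO from Sok2, ASHGO and index 0.

    The last digit of the yeast gene name is replaced by A, B, C, ...
    """
    if not yeast_gene[-1].isdigit():
        raise ValueError("expected a gene name ending in a digit")
    letter = string.ascii_uppercase[index]
    return f"{yeast_gene[:-1]}{letter}_{species}"


def sort_labels(labels):
    """Sort labels: Mbp1 orthologues first, then all others by organism."""
    def key(label):
        gene, species = label.rsplit("_", 1)
        # False sorts before True, so Mbp1 labels come first.
        return (gene != "Mbp1", species)
    return sorted(labels, key=key)
```

Because Python's sort is stable, sequences from the same organism keep their original relative order.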

The final 69 sequences

>Mbp1_SACCE (79  ids)  NP_010227    (024..102)
SIMKRKKDDWVNATHILKAANFAKAKRTRILEKEVLKETHEKVQGGFGKYQGTWVPLNIAKQLAEKFSVYDQLKPLFDF
>Mbp1_ASHGO (66  ids)  NP_986147    (031..109)
SIMKRKADDWVNATHILKAAKFAKAKRTRILEKEVIKDTHEKVQGGFGKYQGTWVPLDIARRLAQKFEVLEELRPLFDF
>Mbp1_ASPFU (49  ids)  XP_754232    (001..077)
MRRRGDDWINATHILKVAGFDKPARTRILEREVQKGTHEKVQGGYGKYQGTWIPLHEGRLLAERNNIIDKLRPIFDY
>Mbp1_ASPNI (50  ids)  XP_660758    (028..106)
SVMRRRSDDWINATHILKVAGFDKPARTRILEREVQKGVHEKVQGGYGKYQGTWIPLQEGRQLAERNNILDKLLPIFDY
>Mbp1_ASPTE (49  ids)  XP_001213217 (028..106)
SVMRRRADDWINATHILKVAGFDKPARTRILEREVQKGVHEKVQGGYGKYQGTWIPLPEGRLLAERNNIIDKLRPIFDY
>Mbp1_CANAL (53  ids)  XP_723071    (026..103)
IMRRKKDSWINATHILKIAKFPKAKRTRILEKDVQTGIHEKVQGGYGKYQGTYVPLDLGAAIARNFGVYDVLKPIFEF
>Mbp1_CANGL (71  ids)  XP_445458    (024..102)
SIMKRKNDGWVNATHILKAANFAKAKRTRILEKEVLKEMHEKVQGGFGKYQGTWVPLNIAINLAEKFDVYQDLKPLFDF
>Mbp1_COPCI (43  ids)  EAU84310     (025..103)
AVMRRRSDSWLNATQILKVAGFDKPQRTRVLEREVQKGEHEKVQGGYGKYQGTWIPLERGMQLAKQYNCEHLLRPIIEF
>Mbp1_CRYNE (47  ids)  XP_570545    (133..211)
SVMRRASDSWVNATQILKVAGVHKSARTKILEKEVLNGIHEKIQGGYGKYQGTWVPLDRGRDLAEQYGVGSYLSSVFDF
>Mbp1_DEBHA (50  ids)  XP_458784    (027..104)
IMRRKLDSWINATHILKIAKFPKAKRTRILEKDVQTGVHEKVQGGYGKYQGTYVPLDLGADIAKNFGVFDSLRPIFEF
>Mbp1_GIBZE (48  ids)  XP_390560    (040..117)
VMRRRSDDWINATHILKAAGFDKPARTRILERDVQKDVHEKIQGGYGKYQGTWIPLESGQALAERHSVIDRLRPIFEY
>Mbp1_KLULA (64  ids)  XP_454189    (025..103)
SIMKRKADNWVNATHILKAAKFPKAKRTRILEKEVITDTHEKVQGGFGKYQGTWIPLELASKLAEKFEVLDELKPLFDF
>Mbp1_MAGGR (48  ids)  XP_362974    (040..117)
VMRRRVDDWINATHILKAAGFDKPARTRILEREVQKDQHEKVQGGYGKYQGTWIPLEAGEALAHRNNIFDRLRPIFEF
>Mbp1_NEUCR (50  ids)  XP_955821    (037..114)
VMRRRHDDWVNATHILKAAGFDKPARTRILEREVQKDTHEKIQGGYGRYQGTWIPLEQAEALARRNNIYERLKPIFEF
>Mbp1_PICST (52  ids)  XP_001386821 (026..103)
IMRRKLDSWINATHILKIAKFPKAKRTRILEKDVQTGVHEKVQGGYGKYQGTYVPLELGRDIAKNFGVFDILKPIFDF
>Mbp1_SCHPO (43  ids)  NP_595496    (027..103)
MKRCHDNWLNATQILKIAELDKPRRTRILEKFAQKGLHEKIQGGCGKYQGTWVPSERAVELAHEYNVFDLIQPLIEY
>Mbp1_USTMA (41  ids)  XP_762343    (026..104)
AVMRRRSDDWLNATQILKVVGLDKPQRTRVLEREIQKGIHEKVQGGYGKYQGTWIPLDVAIELAERYNIQGLLQPITSY
>Mbp1_YARLI (49  ids)  XP_500257    (022..100)
AVMRRKSDGWVNATHILKVAGFDKPQRTRILEKEVQKGVHEKVQGGYGKYQGTWVPLERAREIATLYDVDSHLAPIFNY
>Swi4_ASHGO (58  ids)  NP_986370    (043..115)
VMRRLHDDWVNITQVFKVATFSKTQRTKILEKESADISHEKIQGGYGRFQGTWIPLDSAKGLVAKYEITDIVV
>Sok2_ASHGO (67  ids)  NP_983001    (352..425)
SVVRRADNDMINGTKLLNVAKMTRGRRDGILKAEKVRHVVKIGSMHLKGVWIPFERALALAQREKIVDMLFPLF
>MbpB_ASPFU (22  ids)  XP_751244    (151..225)
VMWDYNIGLVRTTHLFKCNDYSKMLNANPGLREICHSITGGALAAQGYWMPYEAAKAVAATFCWKIRHALTPLFG
>MbpA_ASPFU (40  ids)  XP_748947    (105..183)
AVMKRRSDSWLNATQILKVAGVVKARRTKTLEKEIAAGEHEKVQGGYGKYQGTWVNYQRGVELCREYHVEELLRPLLEY
>Sok2_ASPFU (58  ids)  XP_755125    (152..224)
VARREDNHMINGTKLLNVAGMTRGRRDGILKSEKVRHVVKIGPMHLKGVWIPFERALEFANKEKITDLLYPLF
>MbpB_ASPNI (19  ids)  XP_001392970 (124..203)
ISWDYNVGLVLTRSLFKCNGHPKTAPAKVLKMNPGLGDISHSITGGALVGQGYWMPFRAAKALATTFCWNIRFVLTPMFG
>SokB_ASPNI (21  ids)  XP_663009    (131..211)
TVMWDYNIGLVRTTHLFKCNDYSKTTPAKMLNQNPGLRDICHSITGGALAAQGYWMPYEAAKAIAATFCWKIRFALTPLFG
>MbpA_ASPNI (40  ids)  XP_001391313 (118..196)
AVMKRRSDSWLNATQILKVAGVVKARRTKTLEKEIAAGEHEKVQGGYGKYQGTWVNYQRGVELCREYHVEELLRPLLEY
>SokA_ASPNI (56  ids)  XP_663440    (152..224)
VARREDNGMINGTKLLNVAGMTRGRRDGILKSEKVRNVVKIGPMHLKGVWIPFDRALEFANKEKITDLLYPLF
>Sok2_ASPNI (58  ids)  XP_001390623 (153..225)
VARREDNHMINGTKLLNVAGMTRGRRDGILKSEKVRHVVKIGPMHLKGVWIPFERALEFANKEKITDLLYPLF
>MbpB_ASPTE (21  ids)  XP_001212599 (130..212)
IMWDYNIGLVRTTPLFRSQNYSKTTPAKVLDANPGLREISHSITGGAIVAQDKPGYWIPFEAAKAVAATFCWRIRYALTPIFG
>MbpA_ASPTE (40  ids)  XP_001215548 (007..085)
AVMKRRSDSWLNATQILKVAGVVKARRTKTLEKEIAAGEHEKVQGGYGKYQGTWVNYQRGVDLCREYHVEELLRPLLEY
>Sok2_ASPTE (59  ids)  XP_001218256 (139..211)
VARREDNSMINGTKLLNVAGMTRGRRDGILKSEKIRHVVKIGPMHLKGVWIPFERALEFANKEKITDLLYPLF
>MbpC_CANAL (22  ids)  XP_723412    (087..178)
VLRRVQDSFVNVTQLFQILIKLEVLPTSQVDNYFDNEILSNLKYFGSSSNTPQYLDLRKHQNIYLQGIWIPYDKAVNLALKFDIYEITKKLF
>MbpB_CANAL (25  ids)  XP_710918    (256..346)
VIWDYETGWVHLTGIWKASLTIDGSNVSPSHLKADIVKLLESTPKEYQQYIKRIRGGFLKIQGTWLPYKLCKILARRFCYYLRYSLIPIFG
>MbpA_CANAL (48  ids)  XP_712970    (006..082)
SIMRRCKDDWVNATQILKCCNFPKAKRTKILEKGVQQGLHEKVQGGFGRFQGTWIPLEDARKLAKTYGVTEELAPVL
>Sok2_CANAL (49  ids)  XP_711513    (469..541)
VSRREDTNYINGTKLLNVIGMTRGKRDGILKTEKIKNVVKVGSMNLKGVWIPFDRAYEIARNEGVDSLLYPLF
>Phd1_CANAL (65  ids)  XP_714237    (228..301)
SVVRRADNNMINGTKLLNVAQMTRGRRDGILKSEKVRHVVKIGSMHLKGVWIPFERALAMAQREQIVDMLYPLF
>SokA_CANGL (56  ids)  XP_449680    (143..216)
TVVRRADNDMVNGTKLLNVTGMTRGRRDGILKNEPVRDVVKGGPMTLKGVWIPIDRARAIARQEGIEQWLYPLF
>Swi4_CANGL (61  ids)  XP_444966    (062..140)
VMRRTMDDWVNVTQVFKIAQFSKTQRTKILEKESTNMKHEKVQGGYGRFQGTWVPLEAAKFMTTKYNIDNPVVNTILSF
>Sok2_CANGL (64  ids)  XP_448847    (224..297)
SVVRRADNDMINGTKLLNVTKMTRGKRDGILRSEKYRKVVKIGSMHLKGVWIPFERALFIAKREKIVDLLYPLF
>MbpA_COPCI (26  ids)  EAU85126     (059..139)
IMMDIDDGYILWTGIWKALGNSKADIVKMIDSQPDLAPLIRRVRGGYLKIQGTWMPYEVALKLSRRVAWPIRHDLVPLFGF
>MbpA_CRYNE (42  ids)  XP_569090    (036..114)
AVMRRRSDAYLNATQILKVAGFDKPQRTRVLEREVQKGEHEKVQGGYGKYQGTWIPIERGLALAKQYGVEDILRPIIDY
>MbpB_DEBHA (26  ids)  XP_459773    (187..275)
IIWDYETGFVHLTGIWKASINDEVNTHRNLKADIVKLLESTPKQYHQHIKRIRGGFLKIQGTWLPFDLCKMLAKRFCYHIRFQLIPIFG
>Swi4_DEBHA (26  ids)  XP_459901    (067..158)
ILRRVQDSYINISQLFSILLKIGHLSEAQLTNFLNNEILTNTQYLSSGGSNPQFNDLRNHEVRDLRGLWIPYDRAVSLALKFDIYELAKSLF
>MbpA_DEBHA (45  ids)  XP_457246    (028..103)
IMRRCKDDWVNATQILKCCNFPKAKRTKILEKGVQQGLHEKIQGGYGRFQGTWIPLADAQRLAASYGVTPDLAPVL
>SokA_DEBHA (50  ids)  XP_460447    (213..285)
VSRREDTNYVNGTKLLNVAGMTRGKRDGILKTEKTKSVVKVGAMNLKGVWIPFERASEIARNEGIDGLLYPLF
>Sok2_DEBHA (64  ids)  XP_459785    (307..380)
SVVRRADNNMINGTKLLNVAQMTRGRRDGILKSEKVRHVVKIGSMHLKGVWIPFERALAMAQREGIVDLLYPLF
>MbpB_GIBZE (21  ids)  XP_389978    (139..219)
AVMWDYNIGLVRMTPFFKCRGYGKTIPAKMLGLNPGLKEITHSITGGSIAAQGYWMPYRCAKAICATFCHPIAGALIPIFG
>MbpA_GIBZE (39  ids)  XP_384396    (045..123)
AVMRRRNDSWLNATQILKVAGVDKGKRTKILEKEIQTGEHEKVQGGYGKYQGTWIKFERGLQVCRQYGVEELLRPLLTY
>Sok2_GIBZE (55  ids)  XP_390305    (226..298)
VARREDNHMINGTKLLNVAGMTRGRRDGILKSEKVRHVVKIGPMHLKGVWIPYDRALDFANKEKITELLYPLF
>Swi4_KLULA (50  ids)  XP_454890    (119..197)
IMRRCNDNWLNITQVFKAGSFTKAQRTKILEKEANEIKHEKIQGGYGRFQGTWIPWESTKYLVEKYNINNKVVKRIVEF
>Sok2_KLULA (67  ids)  XP_455299    (386..459)
SVVRRADNDMINGTKLLNVTRMTRGRRDGILKAEKIRHVVKIGSMHLKGVWIPFERALVMAQREKIVDLLYALF
>MbpB_MAGGR (20  ids)  XP_369301    (096..176)
TVMWDYGCGLVRMTHFFKCRGYTKTVPGKVLNQNHGLKDITYSITGGSISAQGYWMPFACARAVCATFCHPIAGALIPIFG
>MbpA_MAGGR (39  ids)  XP_365024    (131..209)
AVMKRIGDSKLNATQILKVAGVEKGKRTKILEKEIQTGEHEKVQGGYGKYQGTWIKYERALEVCRQYGVEELLRPLLEY
>Sok2_MAGGR (57  ids)  XP_368552    (133..205)
VARREDNHMINGTKLLNVAGMTRGRRDGILKSEKMRHVVKIGPMHLKGVWIPFERALDFANKEKITELLYPLF
>MbpA_NEUCR (40  ids)  XP_962967    (071..147)
AVMRRQKDGWVNATQILKVANIDKGRRTKILEKEIQIGEHEKVQGGYGKYQGTWIPFERGLEVCRQYGVEELLSKLL
>MbpA_PICST (46  ids)  XP_001383745 (006..081)
IMRRCKDDWVNATQILKCCNFPKAKRTKILEKGVQQGLHEKVQGGFGRFQGTWIPLPDAQRLATMYGVTADAAPVL
>SokA_PICST (49  ids)  XP_001385235 (239..311)
VSRREDTNFVNGTKLLNVIGMTRGKRDGILKTEKTRNVVKVGSMNLKGVWIPFDRAFEIARNEGVDEALHPLF
>Sok2_PICST (64  ids)  XP_001383609 (194..267)
SVVRRADNNMINGTKLLNVAQMTRGRRDGILKSEKVRHVVKIGSMHLKGVWIPFERALAMAQREGIVDLLYPLF
>Sok2_SACCE (74  ids)  EDN64408     (435..508)
SVVRRADNDMVNGTKLLNVTKMTRGRRDGILKAEKIRHVVKIGSMHLKGVWIPFERALAIAQREKIADYLYPLF
>Phd1_SACCE (74  ids)  NP_012881    (208..281)
SVVRRADNNMINGTKLLNVTKMTRGRRDGILRSEKVREVVKIGSMHLKGVWIPFERAYILAQREQILDHLYPLF
>Swi4_SACCE (79  ids)  EDN63086     (060..138)
VMRRTKDDWINITQVFKIAQFSKTKRTKILEKESNDMQHEKVQGGYGRFQGTWIPLDSAKFLVNKYEIIDPVVNSILTF
>MbpB_SCHPO (21  ids)  NP_596132    (088..164)
LRRCPDSYFNISQILRLAGTSSSENAKELDDIIESGDYENVDSKHPQIDGVWVPYDRAISIAKRYGVYEILQPLISF
>MbpA_SCHPO (41  ids)  NP_593032    (027..104)
SVMRRRRDSWLNATQILKVADFDKPQRTRVLERQVQIGAHEKVQGGYGKYQGTWVPFQRGVDLATKYKVDGIMSPILS
>MbpA_USTMA (24  ids)  XP_760925    (057..138)
TMMIDVDTSFVRFTSITQALGKNKVNFGRLVKTCPALDPHITKLKGGYLSIQGTWLPFDLAKELSRRIAWEIRDHLVPLFGY
>Swi4_USTMA (42  ids)  XP_761485    (182..260)
AVMRRRGDGWLNATQILKIAGIEKTRRTKILEKSILTGEHEKIQGGYGKFQGTWIPLQRAQQVAAEYNVSHLLQPILEF
>MbpB_YARLI (26  ids)  XP_505499    (080..159)
IIWDYHTGYVHLTGLWKAIGNSKADIVKLIDNSPDLEAVIRRVRGGYLKIQGTWVPYDIARALASRTCYFIRFALIPLFG
>MbpA_YARLI (44  ids)  XP_501770    (036..114)
AVMRRRTDSSLNATQILKVAGVEKSKRTKILEKEILTGAHEKVQGGYGKYQGTWIPYERGVDLCRQYSVYDVLQPLLAF
>SokA_YARLI (55  ids)  CAB45654     (144..216)
VARREDNDMINGTKLLNVAGMTRGRRDGILKGEKLRHVVKAGAMHLKGVWIPYDRALEFANKEKIIDLLFPLF
>Sok2_YARLI (60  ids)  XP_501102    (130..202)
VARREDNNMINGTKLLNVVGMTRGRRDGILKTEKIRHVVKIGAMHLKGVWIPYERALAFAQRERIVDVLYPLF
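
The header lines above follow a regular pattern: label, a count followed by "ids", accession number, and residue range. A sketch of a parser for that pattern is given below; the function name and the returned field names are hypothetical, and reading the "ids" value as the number of identities is an interpretation of the renaming section rather than something the list itself states.

```python
import re

HEADER = re.compile(
    r"^>([A-Za-z0-9]+)_([A-Z]{5})"   # gene label and five-letter species code
    r"\s+\(\s*(\d+)\s+ids\)"         # count given as "(79  ids)"
    r"\s+(\S+)"                      # accession number
    r"\s+\((\d+)\.\.(\d+)\)\s*$"     # residue range, e.g. (024..102)
)


def parse_header(line):
    """Split one header line of the list into its fields."""
    match = HEADER.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized header: {line!r}")
    gene, species, ids, accession, start, end = match.groups()
    return {
        "gene": gene,
        "species": species,
        "ids": int(ids),
        "accession": accession,
        "start": int(start),
        "end": int(end),
        "span": int(end) - int(start) + 1,  # number of residues in the range
    }
```

For the first record, the parsed range 024..102 gives a span of 79 residues, which can be compared with the length of the sequence line as a sanity check.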