Reference alignment for KilA-N domains (MUSCLE, reference species)

Aligned KilA-N domain family



Alignment of KilA-N domains in selected genome-sequenced fungi, generated with the MUSCLE webserver.


This is a MUSCLE alignment of KilA-N domains found by PSI-BLAST in genome-sequenced fungi. Genes have been renamed after their most similar counterparts in Saccharomyces cerevisiae. A gene carries the name of the orthologous Saccharomyces cerevisiae gene (Mbp1, Swi4, Phd1, Sok2 or Xbp1) if it fulfills the reciprocal best match (RBM) criterion; if it is a paralogue, its name consists of the first three letters of the most similar yeast orthologue's gene name plus A, B, C ... In addition, the yeast Xbp1 gene has been added to the alignment (bold). In BLAST scans, Saccharomyces cerevisiae Xbp1 often yields only partial matches. It turns out that an entire KilA-N domain can be defined for Xbp1; it has one 84 amino acid insertion and another 17 amino acid insertion relative to the consensus KilA-N domains, and the match is therefore split into separate HSPs. I have included the sequence here, but removed all residues that are inserted relative to the closely related Debaryomyces hansenii orthologue. Failure to recognize domain similarity when one or more long indels are present is a frequently encountered problem that automatic algorithms do not address well.
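Removing residues that are inserted relative to an orthologue can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used for this page: the function name strip_insertions is hypothetical, it assumes a pairwise alignment of the two sequences is already available as two gapped strings of equal length, and the toy sequences are made up rather than taken from Xbp1.

```python
def strip_insertions(query_aln: str, ref_aln: str, gap: str = "-") -> str:
    """Drop every query position that lies in a column where the reference has a gap.

    Both arguments are rows of the same pairwise alignment (hypothetical input);
    gaps in the query itself are kept, so the result stays aligned to the reference.
    """
    if len(query_aln) != len(ref_aln):
        raise ValueError("aligned sequences must have equal length")
    return "".join(q for q, r in zip(query_aln, ref_aln) if r != gap)


# Toy example: "DEF" is an insertion in the query relative to the reference.
query = "ACDEFGHIK"
reference = "AC---GHIK"
trimmed = strip_insertions(query, reference)
print(trimmed)
```

The trimmed sequence has exactly one character per reference residue, which lets a sequence with long insertions be aligned together with the rest of the family.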

The output is sorted in the same order as the input file. Some residues that were proposed to be important for DNA binding were manually highlighted in red (-) and blue (+) in the Mbp1_SACCE sequence. Two structurally important residues have been colored green.

   

MUSCLE (3.6) multiple sequence alignment


Mbp1_SACCE      SIMKRKKDDWVNATHILKAA--------NFA------KAKRTRILEKEV--LKETHEKVQGG----FGKYQ--------GTWVPLNIAKQLAEK--FSVYD-QLKPLFDF
Mbp1_ASHGO      SIMKRKADDWVNATHILKAA--------KFA------KAKRTRILEKEV--IKDTHEKVQGG----FGKYQ--------GTWVPLDIARRLAQK--FEVLE-ELRPLFDF
Mbp1_ASPFU      --MRRRGDDWINATHILKVA--------GFD------KPARTRILEREV--QKGTHEKVQGG----YGKYQ--------GTWIPLHEGRLLAER--NNIID-KLRPIFDY
Mbp1_ASPNI      SVMRRRSDDWINATHILKVA--------GFD------KPARTRILEREV--QKGVHEKVQGG----YGKYQ--------GTWIPLQEGRQLAER--NNILD-KLLPIFDY
Mbp1_ASPTE      SVMRRRADDWINATHILKVA--------GFD------KPARTRILEREV--QKGVHEKVQGG----YGKYQ--------GTWIPLPEGRLLAER--NNIID-KLRPIFDY
Mbp1_CANAL      -IMRRKKDSWINATHILKIA--------KFP------KAKRTRILEKDV--QTGIHEKVQGG----YGKYQ--------GTYVPLDLGAAIARN--FGVYD-VLKPIFEF
Mbp1_CANGL      SIMKRKNDGWVNATHILKAA--------NFA------KAKRTRILEKEV--LKEMHEKVQGG----FGKYQ--------GTWVPLNIAINLAEK--FDVYQ-DLKPLFDF
Mbp1_COPCI      AVMRRRSDSWLNATQILKVA--------GFD------KPQRTRVLEREV--QKGEHEKVQGG----YGKYQ--------GTWIPLERGMQLAKQ--YNCEH-LLRPIIEF
Mbp1_CRYNE      SVMRRASDSWVNATQILKVA--------GVH------KSARTKILEKEV--LNGIHEKIQGG----YGKYQ--------GTWVPLDRGRDLAEQ--YGVGS-YLSSVFDF
Mbp1_DEBHA      -IMRRKLDSWINATHILKIA--------KFP------KAKRTRILEKDV--QTGVHEKVQGG----YGKYQ--------GTYVPLDLGADIAKN--FGVFD-SLRPIFEF
Mbp1_GIBZE      -VMRRRSDDWINATHILKAA--------GFD------KPARTRILERDV--QKDVHEKIQGG----YGKYQ--------GTWIPLESGQALAER--HSVID-RLRPIFEY
Mbp1_KLULA      SIMKRKADNWVNATHILKAA--------KFP------KAKRTRILEKEV--ITDTHEKVQGG----FGKYQ--------GTWIPLELASKLAEK--FEVLD-ELKPLFDF
Mbp1_MAGGR      -VMRRRVDDWINATHILKAA--------GFD------KPARTRILEREV--QKDQHEKVQGG----YGKYQ--------GTWIPLEAGEALAHR--NNIFD-RLRPIFEF
Mbp1_NEUCR      -VMRRRHDDWVNATHILKAA--------GFD------KPARTRILEREV--QKDTHEKIQGG----YGRYQ--------GTWIPLEQAEALARR--NNIYE-RLKPIFEF
Mbp1_PICST      -IMRRKLDSWINATHILKIA--------KFP------KAKRTRILEKDV--QTGVHEKVQGG----YGKYQ--------GTYVPLELGRDIAKN--FGVFD-ILKPIFDF
MbpA_SCHPO      --MKRCHDNWLNATQILKIA--------ELD------KPRRTRILEKFA--QKGLHEKIQGG----CGKYQ--------GTWVPSERAVELAHE--YNVFD-LIQPLIEY
Mbp1_USTMA      AVMRRRSDDWLNATQILKVV--------GLD------KPQRTRVLEREI--QKGIHEKVQGG----YGKYQ--------GTWIPLDVAIELAER--YNIQG-LLQPITSY
Mbp1_YARLI      AVMRRKSDGWVNATHILKVA--------GFD------KPQRTRILEKEV--QKGVHEKVQGG----YGKYQ--------GTWVPLERAREIATL--YDVDS-HLAPIFNY
Swi4_ASHGO      -VMRRLHDDWVNITQVFKVA--------TFS------KTQRTKILEKES--ADISHEKIQGG----YGRFQ--------GTWIPLDSAKGLVAK--YEITD-IVV-----
Sok2_ASHGO      SVVRRADNDMINGTKLLNVA--------KMT------RGRRDGILKAEK-----VRHVVKIG----SMHLK--------GVWIPFERALALAQR--EKIVD-MLFPLF--
Xbp1_ASPFU      -VMWDYNIGLVRTTHLFKCN--------DYS-----------KMLNANP-GLREICHSITGG----ALAAQ--------GYWMPYEAAKAVAATFCWKIRH-ALTPLFG-
MbpA_ASPFU      AVMKRRSDSWLNATQILKVA--------GVV------KARRTKTLEKEI--AAGEHEKVQGG----YGKYQ--------GTWVNYQRGVELCRE--YHVEE-LLRPLLEY
Sok2_ASPFU      -VARREDNHMINGTKLLNVA--------GMT------RGRRDGILKSEK-----VRHVVKIG----PMHLK--------GVWIPFERALEFANK--EKITD-LLYPLF--
XbpA_ASPNI      -ISWDYNVGLVLTRSLFKCN--------GHP------KTAPAKVLKMNP-GLGDISHSITGG----ALVGQ--------GYWMPFRAAKALATTFCWNIRF-VLTPMFG-
Xbp1_ASPNI      TVMWDYNIGLVRTTHLFKCN--------DYS------KTTPAKMLNQNP-GLRDICHSITGG----ALAAQ--------GYWMPYEAAKAIAATFCWKIRF-ALTPLFG-
MbpA_ASPNI      AVMKRRSDSWLNATQILKVA--------GVV------KARRTKTLEKEI--AAGEHEKVQGG----YGKYQ--------GTWVNYQRGVELCRE--YHVEE-LLRPLLEY
SokA_ASPNI      -VARREDNGMINGTKLLNVA--------GMT------RGRRDGILKSEK-----VRNVVKIG----PMHLK--------GVWIPFDRALEFANK--EKITD-LLYPLF--
Sok2_ASPNI      -VARREDNHMINGTKLLNVA--------GMT------RGRRDGILKSEK-----VRHVVKIG----PMHLK--------GVWIPFERALEFANK--EKITD-LLYPLF--
Xbp1_ASPTE      -IMWDYNIGLVRTTPLFRSQ--------NYS------KTTPAKVLDANP-GLREISHSITGG----AIVAQDKP-----GYWIPFEAAKAVAATFCWRIRY-ALTPIFG-
MbpA_ASPTE      AVMKRRSDSWLNATQILKVA--------GVV------KARRTKTLEKEI--AAGEHEKVQGG----YGKYQ--------GTWVNYQRGVDLCRE--YHVEE-LLRPLLEY
Sok2_ASPTE      -VARREDNSMINGTKLLNVA--------GMT------RGRRDGILKSEK-----IRHVVKIG----PMHLK--------GVWIPFERALEFANK--EKITD-LLYPLF--
MbpC_CANAL      -VLRRVQDSFVNVTQLFQILIKLE----VLP------TSQVDNYFDNEI--LSNLKYFGSSSNTPQYLDLRKHQNIYLQGIWIPYDKAVNLALK--FDIYE-ITKKLF--
Xbp1_CANAL      -VIWDYETGWVHLTGIWKASLTID----GSNVSPSHLKADIVKLLESTPKEYQQYIKRIRGG----FLKIQ--------GTWLPYKLCKILARRFCYYLRY-SLIPIFG-
MbpA_CANAL      SIMRRCKDDWVNATQILKCC--------NFP------KAKRTKILEKGV--QQGLHEKVQGG----FGRFQ--------GTWIPLEDARKLAKT--YGVTE-ELAPVL--
Sok2_CANAL      -VSRREDTNYINGTKLLNVI--------GMT------RGKRDGILKTEK-----IKNVVKVG----SMNLK--------GVWIPFDRAYEIARN--EGVDS-LLYPLF--
Phd1_CANAL      SVVRRADNNMINGTKLLNVA--------QMT------RGRRDGILKSEK-----VRHVVKIG----SMHLK--------GVWIPFERALAMAQR--EQIVD-MLYPLF--
SokA_CANGL      TVVRRADNDMVNGTKLLNVT--------GMT------RGRRDGILKNEP-----VRDVVKGG----PMTLK--------GVWIPIDRARAIARQ--EGIEQ-WLYPLF--
Swi4_CANGL      -VMRRTMDDWVNVTQVFKIA--------QFS------KTQRTKILEKES--TNMKHEKVQGG----YGRFQ--------GTWVPLEAAKFMTTK--YNIDNPVVNTILSF
Sok2_CANGL      SVVRRADNDMINGTKLLNVT--------KMT------RGKRDGILRSEK-----YRKVVKIG----SMHLK--------GVWIPFERALFIAKR--EKIVD-LLYPLF--
Xbp1_COPCI      -IMMDIDDGYILWTGIWKAL--------GNS------KADIVKMIDSQP-DLAPLIRRVRGG----YLKIQ--------GTWMPYEVALKLSRRVAWPIRH-DLVPLFGF
MbpA_CRYNE      AVMRRRSDAYLNATQILKVA--------GFD------KPQRTRVLEREV--QKGEHEKVQGG----YGKYQ--------GTWIPIERGLALAKQ--YGVED-ILRPIIDY
Xbp1_DEBHA      -IIWDYETGFVHLTGIWKASINDEVNTHRNL------KADIVKLLESTPKQYHQHIKRIRGG----FLKIQ--------GTWLPFDLCKMLAKRFCYHIRF-QLIPIFG-
MbpB_DEBHA      -ILRRVQDSYINISQLFSILLKIG----HLS------EAQLTNFLNNEI--LTNTQYLSSGGSNPQFNDLRNHEVRDLRGLWIPYDRAVSLALK--FDIYE-LAKSLF--
MbpA_DEBHA      -IMRRCKDDWVNATQILKCC--------NFP------KAKRTKILEKGV--QQGLHEKIQGG----YGRFQ--------GTWIPLADAQRLAAS--YGVTP-DLAPVL--
SokA_DEBHA      -VSRREDTNYVNGTKLLNVA--------GMT------RGKRDGILKTEK-----TKSVVKVG----AMNLK--------GVWIPFERASEIARN--EGIDG-LLYPLF--
Sok2_DEBHA      SVVRRADNNMINGTKLLNVA--------QMT------RGRRDGILKSEK-----VRHVVKIG----SMHLK--------GVWIPFERALAMAQR--EGIVD-LLYPLF--
Xbp1_GIBZE      AVMWDYNIGLVRMTPFFKCR--------GYG------KTIPAKMLGLNP-GLKEITHSITGG----SIAAQ--------GYWMPYRCAKAICATFCHPIAG-ALIPIFG-
MbpA_GIBZE      AVMRRRNDSWLNATQILKVA--------GVD------KGKRTKILEKEI--QTGEHEKVQGG----YGKYQ--------GTWIKFERGLQVCRQ--YGVEE-LLRPLLTY
Sok2_GIBZE      -VARREDNHMINGTKLLNVA--------GMT------RGRRDGILKSEK-----VRHVVKIG----PMHLK--------GVWIPYDRALDFANK--EKITE-LLYPLF--
Swi4_KLULA      -IMRRCNDNWLNITQVFKAG--------SFT------KAQRTKILEKEA--NEIKHEKIQGG----YGRFQ--------GTWIPWESTKYLVEK--YNINNKVVKRIVEF
Sok2_KLULA      SVVRRADNDMINGTKLLNVT--------RMT------RGRRDGILKAEK-----IRHVVKIG----SMHLK--------GVWIPFERALVMAQR--EKIVD-LLYALF--
Xbp1_MAGGR      TVMWDYGCGLVRMTHFFKCR--------GYT------KTVPGKVLNQNH-GLKDITYSITGG----SISAQ--------GYWMPFACARAVCATFCHPIAG-ALIPIFG-
MbpA_MAGGR      AVMKRIGDSKLNATQILKVA--------GVE------KGKRTKILEKEI--QTGEHEKVQGG----YGKYQ--------GTWIKYERALEVCRQ--YGVEE-LLRPLLEY
Sok2_MAGGR      -VARREDNHMINGTKLLNVA--------GMT------RGRRDGILKSEK-----MRHVVKIG----PMHLK--------GVWIPFERALDFANK--EKITE-LLYPLF--
MbpA_NEUCR      AVMRRQKDGWVNATQILKVA--------NID------KGRRTKILEKEI--QIGEHEKVQGG----YGKYQ--------GTWIPFERGLEVCRQ--YGVEE-LLSKLL--
MbpA_PICST      -IMRRCKDDWVNATQILKCC--------NFP------KAKRTKILEKGV--QQGLHEKVQGG----FGRFQ--------GTWIPLPDAQRLATM--YGVTA-DAAPVL--
SokA_PICST      -VSRREDTNFVNGTKLLNVI--------GMT------RGKRDGILKTEK-----TRNVVKVG----SMNLK--------GVWIPFDRAFEIARN--EGVDE-ALHPLF--
Sok2_PICST      SVVRRADNNMINGTKLLNVA--------QMT------RGRRDGILKSEK-----VRHVVKIG----SMHLK--------GVWIPFERALAMAQR--EGIVD-LLYPLF--
Sok2_SACCE      SVVRRADNDMVNGTKLLNVT--------KMT------RGRRDGILKAEK-----IRHVVKIG----SMHLK--------GVWIPFERALAIAQR--EKIAD-YLYPLF--
Phd1_SACCE      SVVRRADNNMINGTKLLNVT--------KMT------RGRRDGILRSEK-----VREVVKIG----SMHLK--------GVWIPFERAYILAQR--EQILD-HLYPLF--
Swi4_SACCE      -VMRRTKDDWINITQVFKIA--------QFS------KTKRTKILEKES--NDMQHEKVQGG----YGRFQ--------GTWIPLDSAKFLVNK--YEIIDPVVNSILTF
Xbp1_SACCE      CVIWSHDSGYVFMTGIWRLYQDVMKGLINLPRGDSNIKPELRDLICQSYKDFANFTKRIRGG----YIKIQ--------GTWLPMEISRLLCLRFCFPIRY-FLVPIFG-
MbpB_SCHPO      --LRRCPDSYFNISQILRLA--------GTS------SSENAKELDDII--ESGDYENVDSK----HPQID--------GVWVPYDRAISIAKR--YGVYE-ILQPLISF
Mbp1_SCHPO      SVMRRRRDSWLNATQILKVA--------DFD------KPQRTRVLERQV--QIGAHEKVQGG----YGKYQ--------GTWVPFQRGVDLATK--YKVDG-IMSPILS-
MbpA_USTMA      TMMIDVDTSFVRFTSITQAL--------GKN------KVNFGRLVKTCP-ALDPHITKLKGG----YLSIQ--------GTWLPFDLAKELSRRIAWEIRD-HLVPLFGY
MbpB_USTMA      AVMRRRGDGWLNATQILKIA--------GIE------KTRRTKILEKSI--LTGEHEKIQGG----YGKFQ--------GTWIPLQRAQQVAAE--YNVSH-LLQPILEF
MbpB_YARLI      -IIWDYHTGYVHLTGLWKAI--------GNS------KADIVKLIDNSP-DLEAVIRRVRGG----YLKIQ--------GTWVPYDIARALASRTCYFIRF-ALIPLFG-
Xbp1_YARLI      AVMRRRTDSSLNATQILKVA--------GVE------KSKRTKILEKEI--LTGAHEKVQGG----YGKYQ--------GTWIPYERGVDLCRQ--YSVYD-VLQPLLAF
SokA_YARLI      -VARREDNDMINGTKLLNVA--------GMT------RGRRDGILKGEK-----LRHVVKAG----AMHLK--------GVWIPYDRALEFANK--EKIID-LLFPLF--
Sok2_YARLI      -VARREDNNMINGTKLLNVV--------GMT------RGRRDGILKTEK-----IRHVVKIG----AMHLK--------GVWIPYERALAFAQR--ERIVD-VLYPLF--
                          .    .                            .                                  * ::       .
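The last line of the alignment is the conservation line: in this CLUSTAL-style output, "*" marks columns in which all sequences have the same residue, while ":" and "." mark columns with strongly and weakly similar residues. As a rough sketch, the fully conserved columns can be recomputed from the text of the alignment as follows. The function names parse_alignment and conserved_columns are hypothetical, the parser assumes that every sequence line consists of a name and a sequence chunk separated by whitespace, and the toy alignment is made up.

```python
def parse_alignment(text: str) -> dict[str, str]:
    """Collect 'name  sequence' lines into a dict, joining chunks of multi-block output."""
    seqs: dict[str, str] = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the program header, and the conservation line
        # (which starts with whitespace).
        if len(parts) != 2 or line[0].isspace():
            continue
        name, chunk = parts
        seqs[name] = seqs.get(name, "") + chunk
    return seqs


def conserved_columns(seqs: dict[str, str], gap: str = "-") -> list[int]:
    """Return 0-based indices of columns where every sequence has the same residue."""
    rows = list(seqs.values())
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all aligned sequences must have equal length")
    return [
        i for i in range(width)
        if rows[0][i] != gap and all(row[i] == rows[0][i] for row in rows)
    ]


# Toy alignment (made-up sequences, not taken from the alignment above).
toy = """\
seqA      MKR-GTW
seqB      MRRAGVW
seqC      -KR-GLW
"""
toy_seqs = parse_alignment(toy)
print(conserved_columns(toy_seqs))
```

Indices are 0-based and refer to alignment columns, not to residue numbers in any single sequence.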