Difference between revisions of "Reference species for fungi"

From "A B C"
 
(17 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
<div id="BIO">
 
<div id="BIO">
 
<div class="b1">
 
<div class="b1">
Fungal reference species
+
Reference fungi data
 
</div>
 
</div>
  
Line 10: Line 10:
  
  
Many bioinformatics procedures depend on the comparison of sequences between species. To make good use of evolutionary information, we should choose species that span the breadth of observations, and that are not biased towards a particular branch of the phylogenetic tree. To keep procedures manageable, the number of speciescannot be "too large". For fungi, we make use of recent phylogenetic studies that establish the branching order of the entire {{WP|Taxonomic_rank|kingdom}}, and we choose ten representatives for clades at the '''Class''' or subphylum level. To illustrate the "class"level, for animals the class level contains e.g. bony and cartilaginous fishes, amphibians, reptiles, birds and mammals.
+
Many bioinformatics procedures depend on the comparison of sequences between species. To make good use of evolutionary information, we should choose species that span the breadth of observations, and that are not biased towards a particular branch of the phylogenetic tree. To keep procedures manageable, the number of species cannot be "too large". For fungi, we make use of recent phylogenetic studies that establish the branching order of the entire {{WP|Taxonomic_rank|kingdom}}, and we choose ten representatives for clades at the {{WP|Class_(biology)|'''Class'''}} or subphylum level. To illustrate the "class" level: for animals, the classes include e.g. bony and cartilaginous fishes, segmented worms, amphibians, reptiles, birds and mammals, i.e. the familiar, very broad categories. For example, a reference species list of animals, divided along class levels, might include the zebrafish, the African clawed frog, the fruit fly, humans, the raven, the oyster, and so on.
 +
Even though they are all fungi, our reference species are no more similar to each other than these animals are.
  
__NOTOC__
 
  
 +
__TOC__
  
To select a set of diverse species, the whole set of names of genome-sequenced fungi was loaded into the NCBI's  [http://www.ncbi.nlm.nih.gov/Taxonomy/CommonTree/wwwcmt.cgi '''Common Taxonomic Tree'''] tree tool. Then ten representative species were selected. The selected species are:
+
{{Vspace}}
  
* Phylum Basidiomyceta
+
==Reference species==
** ''Wallemia sebi'' (WALSE)&nbsp;&nbsp;&nbsp; <small>Subphylum Basidiomycota incertae sedis</small>
 
** ''Puccinia Graminis'' (PUCGR)&nbsp;&nbsp;&nbsp; <small>Subphylum Pucciniomycotina</small>
 
** ''Ustilago maydis'' (USTMA)&nbsp;&nbsp;&nbsp; <small>Subphylum Ustilaginomycotina</small>
 
** ''Cryptococcus neoformans'' (CRYNE)&nbsp;&nbsp;&nbsp; <small>Subphylum Agaricomycotina; Class Tremellomycetes</small>
 
** ''Coprinopsis cinerea'' (COPCI)&nbsp;&nbsp;&nbsp; <small>Subphylum Agaricomycotina; Class Agaricomycetes</small>
 
* Phylum Ascomycota
 
** ''Schizosaccharomyces pombe'' (SCHPO)&nbsp;&nbsp;&nbsp; <small>Subphylum Taphrinomycotina</small>
 
** ''Aspergillus nidulans'' (ASPNI)&nbsp;&nbsp;&nbsp; <small>Subphylum Pezizomycotina; Class Eurotiomycetes</small>
 
** ''Neurospora crassa'' (NEUCR)&nbsp;&nbsp;&nbsp; <small>Subphylum Pezizomycotina; Class Sordariomycetes</small>
 
** ''Bipolaris oryzae'' (BIPOR)&nbsp;&nbsp;&nbsp; <small>Subphylum Pezizomycotina; Class Dothideomycetes</small>
 
** ''Saccharomyces cerevisiae'' (SACCE)&nbsp;&nbsp;&nbsp; <small>Subphylum Saccharomycotina</small>
 
  
 +
To select a set of diverse species, the names of all genome-sequenced fungi were loaded into the NCBI's [http://www.ncbi.nlm.nih.gov/Taxonomy/CommonTree/wwwcmt.cgi '''Common Taxonomic Tree'''] tool. Then ten representative species were manually selected as being well distributed across the tree. The selected species are:
  
 +
<table cellpadding="5">
 +
<tr class="sh">
 +
  <td>'''''Name'''''</td>
 +
  <td>'''<tt>BICODE</tt>'''</td>
 +
  <td>'''<tt>tax ID</tt>'''</td>
 +
  <td><small>Classification</small></td>
 +
</tr>
 +
<tr><td colspan="4" class="sp"></td></tr>
 +
<tr class="s2"><td colspan="4">'''Phylum Ascomycota'''</td></tr>
 +
<tr class="s1">
 +
  <td>''Aspergillus nidulans''</td>
 +
  <td><tt> ASPNI </tt></td>
 +
  <td><tt>162425</tt></td>
 +
  <td><small>Subphylum Pezizomycotina; Class Eurotiomycetes</small></td>
 +
</tr>
  
 +
<tr class="s2">
 +
  <td>''Bipolaris oryzae''</td>
 +
  <td><tt> BIPOR </tt></td>
 +
  <td><tt>101162</tt></td>
 +
  <td><small>Subphylum Pezizomycotina; Class Dothideomycetes</small></td>
 +
</tr>
 +
 +
<tr class="s1">
 +
  <td>''Neurospora crassa''</td>
 +
  <td><tt> NEUCR </tt></td>
 +
  <td><tt>5141</tt></td>
 +
  <td><small>Subphylum Pezizomycotina; Class Sordariomycetes</small></td>
 +
</tr>
 +
 +
<tr class="s2">
 +
  <td>''Saccharomyces cerevisiae''</td>
 +
  <td><tt> SACCE </tt></td>
 +
  <td><tt>4932</tt></td>
 +
  <td><small>Subphylum Saccharomycotina</small></td>
 +
</tr>
 +
 +
<tr class="s1">
 +
  <td>''Schizosaccharomyces pombe''</td>
 +
  <td><tt> SCHPO </tt></td>
 +
  <td><tt>4896</tt></td>
 +
  <td><small>Subphylum Taphrinomycotina</small></td>
 +
</tr>
 +
 +
<tr><td colspan="4" class="sp"></td></tr>
 +
 +
<tr><td colspan="4" class="s2">'''Phylum Basidiomycota'''</td></tr>
 +
 +
 +
<tr class="s1">
 +
  <td>''Coprinopsis cinerea''</td>
 +
  <td><tt> COPCI </tt></td>
 +
  <td><tt>5346</tt></td>
 +
  <td><small>Subphylum Agaricomycotina; Class Agaricomycetes</small></td>
 +
</tr>
 +
 +
<tr class="s2">
 +
  <td>''Cryptococcus neoformans''</td>
 +
  <td><tt> CRYNE </tt></td>
 +
  <td><tt>5207</tt></td>
 +
  <td><small>Subphylum Agaricomycotina; Class Tremellomycetes</small></td>
 +
</tr>
 +
 +
<tr class="s1">
 +
  <td>''Puccinia graminis''</td>
 +
  <td><tt> PUCGR </tt></td>
 +
  <td><tt>5297</tt></td>
 +
  <td><small>Subphylum Pucciniomycotina</small></td>
 +
</tr>
 +
 +
<tr class="s2">
 +
  <td>''Ustilago maydis''</td>
 +
  <td><tt> USTMA </tt></td>
 +
  <td><tt>5270</tt></td>
 +
  <td><small>Subphylum Ustilaginomycotina</small></td>
 +
</tr>
 +
 +
<tr class="s1">
 +
  <td>''Wallemia mellicola''</td>
 +
  <td><tt> WALME </tt></td>
 +
  <td><tt>1708541</tt></td>
 +
  <td><small>Subphylum Wallemiales incertae sedis</small></td>
 +
</tr>
 +
 +
</table>
 +
 +
{{Vspace}}
 +
 +
 +
===Entrez===
 
;Entrez selection code, e.g. for BLAST searches
 
;Entrez selection code, e.g. for BLAST searches
 
<source lang="text">
 
<source lang="text">
"Wallemia sebi"[organism] OR
+
"Wallemia mellicola"[organism] OR
 
"Puccinia graminis"[organism] OR
 
"Puccinia graminis"[organism] OR
 
"Ustilago maydis"[organism] OR
 
"Ustilago maydis"[organism] OR
Line 43: Line 123:
 
"Neurospora crassa"[organism] OR
 
"Neurospora crassa"[organism] OR
 
"Bipolaris oryzae"[organism] OR
 
"Bipolaris oryzae"[organism] OR
"Saccharomyces cerevisiae[organism]"
+
"Saccharomyces cerevisiae"[organism]
 
</source>
 
</source>
 +
 +
{{Vspace}}
 +
 +
===Tax ID===
  
 
;Taxonomy IDs, e.g. for the NCBI taxonomy browser
 
;Taxonomy IDs, e.g. for the NCBI taxonomy browser
Line 51: Line 135:
 
4932
 
4932
 
5141
 
5141
5207
 
 
5270
 
5270
 
5297
 
5297
 
5346
 
5346
 +
5207
 
101162
 
101162
148960
 
 
162425
 
162425
 +
1708541
 +
</source>
 +
 +
{{Vspace}}
 +
 +
===Trees===
 +
 +
;Text tree, based on the NCBI Taxonomy Common Tree
 +
<source lang="text">
 +
 +
Dikarya
 +
|
 +
+--Basidiomycota
 +
|  |
 +
|  +-Agaricomycotina
 +
|  |  +-Wallemia mellicola
 +
|  |  +-Coprinopsis cinerea
 +
|  |  +-Cryptococcus neoformans
 +
|  |
 +
|  +-Puccinia graminis
 +
|  +-Ustilago maydis
 +
|
 +
+--Ascomycota
 +
    |
 +
    +-Schizosaccharomyces pombe
 +
    |
 +
    +-saccharomyceta
 +
        +-Saccharomyces cerevisiae
 +
        |
 +
        +-leotiomyceta
 +
          +-Aspergillus nidulans
 +
          +-Neurospora crassa
 +
          +-Bipolaris oryzae
 +
 
</source>
 
</source>
  
;Phylyp tree format, e.g. to plot cladograms
+
{{Vspace}}
 +
 
 +
;Phylip tree format, e.g. to plot cladograms
 
<source lang="text">
 
<source lang="text">
 
(
 
(
 
(
 
(
'Wallemia sebi':4,
+
'Wallemia mellicola':4,
 
'Puccinia graminis':4,
 
'Puccinia graminis':4,
 
'Ustilago maydis':4,
 
'Ustilago maydis':4,
Line 87: Line 206:
 
</source>
 
</source>
  
 +
{{Vspace}}
  
 
; Cladogram, drawn with the Phylip program <code>retree</code>
 
; Cladogram, drawn with the Phylip program <code>retree</code>
 
+
<source lang="text">
 
                 ┌──────────── Schizosaccharomyces pombe
 
                 ┌──────────── Schizosaccharomyces pombe
 
                 │   
 
                 │   
Line 108: Line 228:
 
                 ├───────────── Puccinia graminis
 
                 ├───────────── Puccinia graminis
 
                 │   
 
                 │   
                 └───────────── Wallemia sebi
+
                 └───────────── Wallemia mellicola
   
+
  </source>
 +
 
 +
{{Vspace}}
 +
 
 +
==Strains==
 +
 
 +
Genomes are sequenced from a single, defined '''strain''' of a species, and several genomes from different strains of the same species may be deposited in the database. This can be a problem: there may be many strains, and they are not all annotated to the same level. If you work with an organism at the strain level, make sure you know which strain is considered the "reference" in the field.
 +
 
 +
<table cellpadding="5">
 +
<tr class="sh">
 +
  <td>'''''Species'''''</td>
 +
  <td>'''Strain'''</td>
 +
  <td>'''<tt>tax ID</tt>'''</td>
 +
</tr>
 +
<tr><td colspan="4" class="sp"></td></tr>
 +
<tr class="s1">
 +
  <td>''Aspergillus nidulans''</td>
 +
  <td>[https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id=227321 ''Aspergillus nidulans'' FGSC A4]</td>
 +
  <td><tt> 227321 </tt></td>
 +
</tr>
 +
 
 +
<tr class="s2">
 +
  <td>''Bipolaris oryzae''</td>
 +
  <td>[https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id=930090 ''Bipolaris oryzae'' ATCC 44560]</td>
 +
  <td><tt> 930090 </tt></td>
 +
</tr>
 +
 
 +
<tr class="s1">
 +
  <td>''Neurospora crassa''</td>
 +
  <td>[https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id=367110 ''Neurospora crassa'' OR74A]</td>
 +
  <td><tt> 367110 </tt></td>
 +
</tr>
 +
 
 +
<tr class="s2">
 +
  <td>''Saccharomyces cerevisiae''</td>
 +
  <td>[https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id=559292 ''Saccharomyces cerevisiae'' S288C]</td>
 +
  <td><tt> 559292 </tt></td>
 +
</tr>
 +
 
 +
<tr class="s1">
 +
  <td>''Schizosaccharomyces pombe''</td>
 +
  <td>[https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id=284812 ''Schizosaccharomyces pombe'' 972h-]</td>
 +
  <td><tt> 284812 </tt></td>
 +
</tr>
 +
 
 +
<tr class="s2">
 +
  <td>''Coprinopsis cinerea''</td>
 +
  <td>[https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id=240176 ''Coprinopsis cinerea'' okayama7#130]</td>
 +
  <td><tt> 240176 </tt></td>
 +
</tr>
 +
 
 +
<tr class="s1">
 +
  <td>''Cryptococcus neoformans''</td>
 +
  <td>[https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id=214684 ''Cryptococcus neoformans'' var. neoformans JEC21]</td>
 +
  <td><tt> 214684 </tt></td>
 +
</tr>
 +
 
 +
<tr class="s2">
 +
  <td>''Puccinia graminis''</td>
 +
  <td>[https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id=418459 ''Puccinia graminis'' f. sp. tritici CRL 75-36-700-3]</td>
 +
  <td><tt> 418459 </tt></td>
 +
</tr>
 +
 
 +
<tr class="s1">
 +
  <td>''Ustilago maydis''</td>
 +
  <td>[https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id=237631 ''Ustilago maydis'' 521]</td>
 +
  <td><tt> 237631 </tt></td>
 +
</tr>
 +
 
 +
<tr class="s2">
 +
  <td>''Wallemia mellicola''</td>
 +
  <td>[https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id=671144 ''Wallemia mellicola'' CBS 633.66]</td>
 +
  <td><tt> 671144 </tt></td>
 +
</tr>
 +
 
 +
</table>
 +
 
 +
Note that the tax-IDs for the species and strains are indeed different!
 +
 
 +
{{Vspace}}
 +
 
 +
==R Code==
 +
 
 +
;A vector of binomial species names.
 +
<pre>
 +
REFspecies <- c("Aspergillus nidulans",
 +
                "Bipolaris oryzae",
 +
                "Coprinopsis cinerea",
 +
                "Cryptococcus neoformans",
 +
                "Neurospora crassa",
 +
                "Puccinia graminis",
 +
                "Saccharomyces cerevisiae",
 +
                "Schizosaccharomyces pombe",
 +
                "Ustilago maydis",
 +
                "Wallemia mellicola"
 +
)
 +
</pre>
 +
 
 +
;A data frame of binomial species and TaxIDs.
 +
<pre>
 +
refTaxa <- data.frame(
 +
              ID = as.integer(c(162425,
 +
                                101162,
 +
                                5141,
 +
                                4932,
 +
                                4896,
 +
                                5346,
 +
                                5207,
 +
                                5297,
 +
                                5270,
 +
                                1708541)),
 +
              species = c("Aspergillus nidulans",
 +
                          "Bipolaris oryzae",
 +
                          "Neurospora crassa",
 +
                          "Saccharomyces cerevisiae",
 +
                          "Schizosaccharomyces pombe",
 +
                          "Coprinopsis cinerea",
 +
                          "Cryptococcus neoformans",
 +
                          "Puccinia graminis",
 +
                          "Ustilago maydis",
 +
                          "Wallemia mellicola"),
 +
              stringsAsFactors = FALSE)
 +
</pre>
 +
 
 +
;A data frame of the genome-sequenced strains in which the Mbp1 orthologues are annotated.
 +
 
 +
<pre>
 +
library(jsonlite)
 +
taxa <- fromJSON('[
 +
  { "ID" : 227321,
 +
  "species" : "Aspergillus nidulans FGSC A4"},
 +
  { "ID" : 930090,
 +
  "species" : "Bipolaris oryzae ATCC 44560"},
 +
  { "ID" : 367110,
 +
  "species" : "Neurospora crassa OR74A"},
 +
  { "ID" : 559292,
 +
  "species" : "Saccharomyces cerevisiae S288C"},
 +
  { "ID" : 284812,
 +
  "species" : "Schizosaccharomyces pombe 972h-"},
 +
  { "ID" : 240176,
 +
  "species" : "Coprinopsis cinerea okayama7#130"},
 +
  { "ID" : 214684,
 +
  "species" : "Cryptococcus neoformans var. neoformans JEC21"},
 +
  { "ID" : 418459,
 +
  "species" : "Puccinia graminis f. sp. tritici CRL 75-36-700-3"},
 +
  { "ID" : 237631,
 +
  "species" : "Ustilago maydis 521"},
 +
  { "ID" : 671144,
 +
  "species" : "Wallemia mellicola CBS 633.66"}
 +
  ]
 +
')
 +
taxa
 +
#        ID                                          species
 +
# 1  227321                    Aspergillus nidulans FGSC A4
 +
# 2  930090                      Bipolaris oryzae ATCC 44560
 +
# 3  367110                          Neurospora crassa OR74A
 +
# 4  559292                  Saccharomyces cerevisiae S288C
 +
# 5  284812                  Schizosaccharomyces pombe 972h-
 +
# 6  240176                Coprinopsis cinerea okayama7#130
 +
# 7  214684    Cryptococcus neoformans var. neoformans JEC21
 +
# 8  418459 Puccinia graminis f. sp. tritici CRL 75-36-700-3
 +
# 9  237631                              Ustilago maydis 521
 +
# 10 671144                    Wallemia mellicola CBS 633.66
 +
</pre>
 +
 
 +
 
 +
{{Vspace}}
 +
 
 +
==Mbp1 orthologues==
 +
 
 +
;RBMs to MBP1_SACCE
 +
 
 +
<table cellpadding="5">
 +
<tr class="sh"><td>name</td><td><small>''Originally...''</small></td><td>RefSeqID</td><td> UniProtID </td></tr>
 +
<tr class="s2"><td>MBP1_ASPNI</td><td>AN3154</td><td>[https://www.ncbi.nlm.nih.gov/protein/67525393 XP_660758]</td><td>[http://www.uniprot.org/uniprot/Q5B8H6 Q5B8H6]</td></tr>
 +
<tr class="s1"><td>MBP1_BIPOR</td><td>COCMIDRAFT_338</td><td>[https://www.ncbi.nlm.nih.gov/protein/627818929 XP_007682304]</td><td>[http://www.uniprot.org/uniprot/W6ZM86 W6ZM86]</td></tr>
 +
<tr class="s2"><td>MBP1_NEUCR</td><td>Swi4</td><td>[https://www.ncbi.nlm.nih.gov/protein/85075775 XP_955821]</td><td>[http://www.uniprot.org/uniprot/Q7RW59 Q7RW59]</td></tr>
 +
<tr class="s1"><td>MBP1_SACCE</td><td>Mbp1</td><td>[https://www.ncbi.nlm.nih.gov/protein/6320147 NP_010227]</td><td>[http://www.uniprot.org/uniprot/P39678 P39678]</td></tr>
 +
<tr class="s2"><td>MBP1_SCHPO</td><td>Res2</td><td>[https://www.ncbi.nlm.nih.gov/protein/19113944 NP_593032]</td><td>[http://www.uniprot.org/uniprot/P41412 P41412]</td></tr>
 +
<tr class="s1"><td>MBP1_COPCI</td><td>&nbsp;</td><td>[https://www.ncbi.nlm.nih.gov/protein/299748003 XP_001837394]</td><td>[http://www.uniprot.org/uniprot/A8NYC6 A8NYC6]</td></tr>
 +
<tr class="s2"><td>MBP1_CRYNE</td><td>&nbsp;</td><td>[https://www.ncbi.nlm.nih.gov/protein/58263360 XP_569090]</td><td>[http://www.uniprot.org/uniprot/Q5KMQ9 Q5KMQ9]</td></tr>
 +
<tr class="s1"><td>MBP1_PUCGR</td><td>PGTG_08863</td><td>[https://www.ncbi.nlm.nih.gov/protein/403167277 XP_003327086]</td><td>[http://www.uniprot.org/uniprot/E3KED4 E3KED4]</td></tr>
 +
<tr class="s2"><td>MBP1_USTMA</td><td>UMAG_11222</td><td>[https://www.ncbi.nlm.nih.gov/protein/758987770 XP_011392621]</td><td>[http://www.uniprot.org/uniprot/A0A0D1DP35 A0A0D1DP35]</td></tr>
 +
<tr class="s1"><td>MBP1_WALME</td><td>&nbsp;</td><td>[https://www.ncbi.nlm.nih.gov/protein/588255750 XP_006957051]</td><td>[http://www.uniprot.org/uniprot/I4YGC0 I4YGC0]</td></tr>
 +
</table>
  
 +
{{Vspace}}
  
&nbsp;
 
 
==Further reading and resources==
 
==Further reading and resources==
 
{{#pmid: 22114356}}
 
{{#pmid: 22114356}}
Line 118: Line 422:
 
<!-- <div class="reference-box">[http://www.ncbi.nlm.nih.gov]</div> -->
 
<!-- <div class="reference-box">[http://www.ncbi.nlm.nih.gov]</div> -->
  
 +
{{Vspace}}
  
&nbsp;
 
 
[[Category:Bioinformatics]]
 
[[Category:Bioinformatics]]
  
 
</div>
 
</div>

Latest revision as of 01:05, 18 September 2020

Reference fungi data


Explanation and definition for the "reference species" we use for the course.


Many bioinformatics procedures depend on the comparison of sequences between species. To make good use of evolutionary information, we should choose species that span the breadth of observations, and that are not biased towards a particular branch of the phylogenetic tree. To keep procedures manageable, the number of species cannot be "too large". For fungi, we make use of recent phylogenetic studies that establish the branching order of the entire kingdom, and we choose ten representatives for clades at the Class or subphylum level. To illustrate the "class" level: for animals, the classes include e.g. bony and cartilaginous fishes, segmented worms, amphibians, reptiles, birds and mammals, i.e. the familiar, very broad categories. For example, a reference species list of animals, divided along class levels, might include the zebrafish, the African clawed frog, the fruit fly, humans, the raven, the oyster, and so on. Even though they are all fungi, our reference species are no more similar to each other than these animals are.



 

Reference species

To select a set of diverse species, the names of all genome-sequenced fungi were loaded into the NCBI's Common Taxonomic Tree tool. Then ten representative species were manually selected as being well distributed across the tree. The selected species are:


Name | BICODE | tax ID | Classification
Phylum Ascomycota
Aspergillus nidulans | ASPNI | 162425 | Subphylum Pezizomycotina; Class Eurotiomycetes
Bipolaris oryzae | BIPOR | 101162 | Subphylum Pezizomycotina; Class Dothideomycetes
Neurospora crassa | NEUCR | 5141 | Subphylum Pezizomycotina; Class Sordariomycetes
Saccharomyces cerevisiae | SACCE | 4932 | Subphylum Saccharomycotina
Schizosaccharomyces pombe | SCHPO | 4896 | Subphylum Taphrinomycotina
Phylum Basidiomycota
Coprinopsis cinerea | COPCI | 5346 | Subphylum Agaricomycotina; Class Agaricomycetes
Cryptococcus neoformans | CRYNE | 5207 | Subphylum Agaricomycotina; Class Tremellomycetes
Puccinia graminis | PUCGR | 5297 | Subphylum Pucciniomycotina
Ustilago maydis | USTMA | 5270 | Subphylum Ustilaginomycotina
Wallemia mellicola | WALME | 1708541 | Subphylum Wallemiales incertae sedis
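
The BICODE column appears to follow a simple pattern: the first three letters of the genus plus the first two letters of the species epithet, in upper case. The following is a minimal sketch under that assumption; the function name makeBiCode is hypothetical and is not taken from the course material.

```r
# Hypothetical helper: derive a five-letter code from a binomial name.
# Assumption: code = first 3 letters of genus + first 2 letters of species,
# upper-cased. This matches the codes in the table above.
makeBiCode <- function(binomial) {
  parts <- strsplit(binomial, " ", fixed = TRUE)   # list of c(genus, species)
  vapply(parts, function(p) {
    toupper(paste0(substr(p[1], 1, 3), substr(p[2], 1, 2)))
  }, character(1))
}

refNames <- c("Aspergillus nidulans",
              "Wallemia mellicola",
              "Saccharomyces cerevisiae")
codes <- makeBiCode(refNames)
print(codes)
```

Deriving the codes from the names, rather than typing them by hand, keeps the two columns consistent when a species is replaced.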


 


Entrez

Entrez selection code, e.g. for BLAST searches
"Wallemia mellicola"[organism] OR
"Puccinia graminis"[organism] OR
"Ustilago maydis"[organism] OR
"Cryptococcus neoformans"[organism] OR
"Coprinopsis cinerea"[organism] OR
"Schizosaccharomyces pombe"[organism] OR
"Aspergillus nidulans"[organism] OR
"Neurospora crassa"[organism] OR
"Bipolaris oryzae"[organism] OR
"Saccharomyces cerevisiae"[organism]


 

Tax ID

Taxonomy IDs, e.g. for the NCBI taxonomy browser
4896
4932
5141
5270
5297
5346
5207
101162
162425
1708541
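
Both lists above can be generated from vectors instead of being maintained by hand. The sketch below builds the Entrez selection string from the species names and, as an assumption not stated on this page, a second string that restricts by taxonomy ID using the txid prefix. The variable names are illustrative only.

```r
# Species names and taxonomy IDs as listed above.
refSpecies <- c("Wallemia mellicola", "Puccinia graminis", "Ustilago maydis",
                "Cryptococcus neoformans", "Coprinopsis cinerea",
                "Schizosaccharomyces pombe", "Aspergillus nidulans",
                "Neurospora crassa", "Bipolaris oryzae",
                "Saccharomyces cerevisiae")
refTaxIDs <- c(4896L, 4932L, 5141L, 5270L, 5297L, 5346L, 5207L,
               101162L, 162425L, 1708541L)

# "<name>"[organism] terms joined with OR, as in the Entrez section.
nameQuery <- paste(sprintf('"%s"[organism]', refSpecies), collapse = " OR ")

# Assumption: Entrez also accepts taxonomy IDs in the form txid<ID>[organism].
idQuery <- paste(sprintf("txid%d[organism]", refTaxIDs), collapse = " OR ")

cat(nameQuery, "\n")
cat(idQuery, "\n")
```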


 

Trees

Text tree, based on the NCBI Taxonomy Common Tree
Dikarya
 |
 +--Basidiomycota
 |   |
 |   +-Agaricomycotina
 |   |  +-Wallemia mellicola
 |   |  +-Coprinopsis cinerea
 |   |  +-Cryptococcus neoformans
 |   |
 |   +-Puccinia graminis
 |   +-Ustilago maydis
 |
 +--Ascomycota
     |
     +-Schizosaccharomyces pombe
     |
     +-saccharomyceta
        +-Saccharomyces cerevisiae
        |
        +-leotiomyceta
           +-Aspergillus nidulans
           +-Neurospora crassa
           +-Bipolaris oryzae


 
Phylip tree format, e.g. to plot cladograms
(
(
'Wallemia mellicola':4,
'Puccinia graminis':4,
'Ustilago maydis':4,
(
'Coprinopsis cinerea':4,
'Cryptococcus neoformans':4
)Agaricomycotina:4
)Basidiomycota:4,
(
(
(
'Aspergillus nidulans':4,
'Bipolaris oryzae':4,
'Neurospora crassa':4
)leotiomyceta:4,
'Saccharomyces cerevisiae':4
)saccharomyceta:4,
'Schizosaccharomyces pombe':4
)Ascomycota:4
)Dikarya:4;
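
As a quick consistency check, the tip labels can be pulled out of the tree string with base R. This is only a sketch: it assumes that every tip label is enclosed in single quotes, as in the tree above, and it is not a general parser for this tree format.

```r
# The tree from above, written on one line.
newick <- paste0(
  "(('Wallemia mellicola':4,'Puccinia graminis':4,'Ustilago maydis':4,",
  "('Coprinopsis cinerea':4,'Cryptococcus neoformans':4)Agaricomycotina:4)",
  "Basidiomycota:4,((('Aspergillus nidulans':4,'Bipolaris oryzae':4,",
  "'Neurospora crassa':4)leotiomyceta:4,'Saccharomyces cerevisiae':4)",
  "saccharomyceta:4,'Schizosaccharomyces pombe':4)Ascomycota:4)Dikarya:4;")

# Extract everything between single quotes, then strip the quotes.
quoted <- regmatches(newick, gregexpr("'[^']+'", newick))[[1]]
tips <- gsub("'", "", quoted, fixed = TRUE)

# Count opening and closing parentheses; they must balance.
nOpen  <- lengths(regmatches(newick, gregexpr("(", newick, fixed = TRUE)))
nClose <- lengths(regmatches(newick, gregexpr(")", newick, fixed = TRUE)))

print(sort(tips))
print(c(open = nOpen, close = nClose))
```

If the number of tips differs from ten, or the parentheses do not balance, the tree string was probably damaged when it was copied.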


 
Cladogram, drawn with the Phylip program retree
                 ┌──────────── Schizosaccharomyces pombe
                 │  
                 │                          ┌───────────── Aspergillus nidulans
   ┌─────────────+                          │  
   │             │             ┌────────────+───────────── Bipolaris oryzae
   │             │             │            │  
   │             └─────────────+            └───────────── Neurospora crassa
   │                           │  
 ──+                           └──────────── Saccharomyces cerevisiae
   │  
   │                           ┌──────────── Cryptococcus neoformans
   │             ┌─────────────+  
   │             │             └───────────── Coprinopsis cinerea
   │             │  
   └─────────────+───────────── Ustilago maydis
                 │  
                 ├───────────── Puccinia graminis
                 │  
                 └───────────── Wallemia mellicola


 

Strains

Genomes are sequenced from a single, defined strain of a species, and several genomes from different strains of the same species may be deposited in the database. This can be a problem: there may be many strains, and they are not all annotated to the same level. If you work with an organism at the strain level, make sure you know which strain is considered the "reference" in the field.

Species | Strain | tax ID
Aspergillus nidulans | Aspergillus nidulans FGSC A4 | 227321
Bipolaris oryzae | Bipolaris oryzae ATCC 44560 | 930090
Neurospora crassa | Neurospora crassa OR74A | 367110
Saccharomyces cerevisiae | Saccharomyces cerevisiae S288C | 559292
Schizosaccharomyces pombe | Schizosaccharomyces pombe 972h- | 284812
Coprinopsis cinerea | Coprinopsis cinerea okayama7#130 | 240176
Cryptococcus neoformans | Cryptococcus neoformans var. neoformans JEC21 | 214684
Puccinia graminis | Puccinia graminis f. sp. tritici CRL 75-36-700-3 | 418459
Ustilago maydis | Ustilago maydis 521 | 237631
Wallemia mellicola | Wallemia mellicola CBS 633.66 | 671144

Note that the tax-IDs for the species and strains are indeed different!
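
Because species and strain IDs differ, it can help to keep an explicit mapping between them. The sketch below links a few strains to their species by checking which binomial name the strain name begins with. The data frame and column names are illustrative, and the approach assumes that each strain name starts with exactly one of the species names.

```r
# A subset of the species and strain tables above.
species <- data.frame(
  ID      = c(4932L, 4896L, 1708541L),
  species = c("Saccharomyces cerevisiae",
              "Schizosaccharomyces pombe",
              "Wallemia mellicola"),
  stringsAsFactors = FALSE)

strains <- data.frame(
  ID     = c(559292L, 284812L, 671144L),
  strain = c("Saccharomyces cerevisiae S288C",
             "Schizosaccharomyces pombe 972h-",
             "Wallemia mellicola CBS 633.66"),
  stringsAsFactors = FALSE)

# For each strain, find the row of the species whose name is a prefix of it.
idx <- vapply(strains$strain, function(s) {
  which(substring(s, 1, nchar(species$species)) == species$species)
}, integer(1), USE.NAMES = FALSE)

strainMap <- data.frame(strainID  = strains$ID,
                        speciesID = species$ID[idx],
                        strain    = strains$strain,
                        stringsAsFactors = FALSE)
print(strainMap)
```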


 

R Code

A vector of binomial species names.
REFspecies <- c("Aspergillus nidulans",
                "Bipolaris oryzae",
                "Coprinopsis cinerea",
                "Cryptococcus neoformans",
                "Neurospora crassa",
                "Puccinia graminis",
                "Saccharomyces cerevisiae",
                "Schizosaccharomyces pombe",
                "Ustilago maydis",
                "Wallemia mellicola"
)
A data frame of binomial species and TaxIDs.
refTaxa <- data.frame(
              ID = as.integer(c(162425,
                                101162,
                                5141,
                                4932,
                                4896,
                                5346,
                                5207,
                                5297,
                                5270,
                                1708541)),
              species = c("Aspergillus nidulans",
                          "Bipolaris oryzae",
                          "Neurospora crassa",
                          "Saccharomyces cerevisiae",
                          "Schizosaccharomyces pombe",
                          "Coprinopsis cinerea",
                          "Cryptococcus neoformans",
                          "Puccinia graminis",
                          "Ustilago maydis",
                          "Wallemia mellicola"),
              stringsAsFactors = FALSE)
A data frame of the genome-sequenced strains in which the Mbp1 orthologues are annotated.
library(jsonlite)
taxa <- fromJSON('[
  { "ID" : 227321,
  "species" : "Aspergillus nidulans FGSC A4"},
  { "ID" : 930090,
  "species" : "Bipolaris oryzae ATCC 44560"},
  { "ID" : 367110,
  "species" : "Neurospora crassa OR74A"},
  { "ID" : 559292,
  "species" : "Saccharomyces cerevisiae S288C"},
  { "ID" : 284812,
  "species" : "Schizosaccharomyces pombe 972h-"},
  { "ID" : 240176,
  "species" : "Coprinopsis cinerea okayama7#130"},
  { "ID" : 214684,
  "species" : "Cryptococcus neoformans var. neoformans JEC21"},
  { "ID" : 418459,
  "species" : "Puccinia graminis f. sp. tritici CRL 75-36-700-3"},
  { "ID" : 237631,
  "species" : "Ustilago maydis 521"},
  { "ID" : 671144,
  "species" : "Wallemia mellicola CBS 633.66"}
  ]
')
taxa
#        ID                                          species
# 1  227321                     Aspergillus nidulans FGSC A4
# 2  930090                      Bipolaris oryzae ATCC 44560
# 3  367110                          Neurospora crassa OR74A
# 4  559292                   Saccharomyces cerevisiae S288C
# 5  284812                  Schizosaccharomyces pombe 972h-
# 6  240176                 Coprinopsis cinerea okayama7#130
# 7  214684    Cryptococcus neoformans var. neoformans JEC21
# 8  418459 Puccinia graminis f. sp. tritici CRL 75-36-700-3
# 9  237631                              Ustilago maydis 521
# 10 671144                    Wallemia mellicola CBS 633.66


 

Mbp1 orthologues

RBMs to MBP1_SACCE
name | Originally... | RefSeqID | UniProtID
MBP1_ASPNI | AN3154 | XP_660758 | Q5B8H6
MBP1_BIPOR | COCMIDRAFT_338 | XP_007682304 | W6ZM86
MBP1_NEUCR | Swi4 | XP_955821 | Q7RW59
MBP1_SACCE | Mbp1 | NP_010227 | P39678
MBP1_SCHPO | Res2 | NP_593032 | P41412
MBP1_COPCI | | XP_001837394 | A8NYC6
MBP1_CRYNE | | XP_569090 | Q5KMQ9
MBP1_PUCGR | PGTG_08863 | XP_003327086 | E3KED4
MBP1_USTMA | UMAG_11222 | XP_011392621 | A0A0D1DP35
MBP1_WALME | | XP_006957051 | I4YGC0
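
For use in scripts, the orthologue table can be held in a data frame like the ones in the R Code section. The sketch below covers three rows only. The column names are illustrative, and the URL pattern is the one used for the UniProt links in this page's wiki markup.

```r
# Three rows of the orthologue table above.
mbp1 <- data.frame(
  name    = c("MBP1_SACCE", "MBP1_SCHPO", "MBP1_WALME"),
  refSeq  = c("NP_010227", "NP_593032", "XP_006957051"),
  uniProt = c("P39678", "P41412", "I4YGC0"),
  stringsAsFactors = FALSE)

# Recover the species code from the protein name (the part after "MBP1_").
mbp1$code <- sub("^MBP1_", "", mbp1$name)

# Build the UniProt link for each entry.
mbp1$uniProtURL <- paste0("http://www.uniprot.org/uniprot/", mbp1$uniProt)

print(mbp1[, c("name", "code", "uniProtURL")])
```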


 

Further reading and resources

Ebersberger et al. (2012) A consistent phylogenetic backbone for the fungi. Mol Biol Evol 29:1319-34. (pmid: 22114356)

The kingdom of fungi provides model organisms for biotechnology, cell biology, genetics, and life sciences in general. Only when their phylogenetic relationships are stably resolved, can individual results from fungal research be integrated into a holistic picture of biology. However, and despite recent progress, many deep relationships within the fungi remain unclear. Here, we present the first phylogenomic study of an entire eukaryotic kingdom that uses a consistency criterion to strengthen phylogenetic conclusions. We reason that branches (splits) recovered with independent data and different tree reconstruction methods are likely to reflect true evolutionary relationships. Two complementary phylogenomic data sets based on 99 fungal genomes and 109 fungal expressed sequence tag (EST) sets analyzed with four different tree reconstruction methods shed light from different angles on the fungal tree of life. Eleven additional data sets address specifically the phylogenetic position of Blastocladiomycota, Ustilaginomycotina, and Dothideomycetes, respectively. The combined evidence from the resulting trees supports the deep-level stability of the fungal groups toward a comprehensive natural system of the fungi. In addition, our analysis reveals methodologically interesting aspects. Enrichment for EST encoded data, a common practice in phylogenomic analyses, introduces a strong bias toward slowly evolving and functionally correlated genes. Consequently, the generalization of phylogenomic data sets as collections of randomly selected genes cannot be taken for granted. A thorough characterization of the data to assess possible influences on the tree reconstruction should therefore become a standard in phylogenomic analyses.