Difference between revisions of "Reference species for fungi"


Revision as of 14:40, 4 October 2016

Reference fungi data


Explanation and definition for the "reference species" we use for the course.


Many bioinformatics procedures depend on comparing sequences between species. To make good use of evolutionary information, we should choose species that span the breadth of observations and that are not biased towards a particular branch of the phylogenetic tree. To keep procedures manageable, the number of species cannot be too large. For fungi, we make use of recent phylogenetic studies that establish the branching order of the entire kingdom, and we choose ten representatives for clades at the class or subphylum level. To illustrate the "class" level: for animals, the classes include e.g. bony and cartilaginous fishes, segmented worms, amphibians, reptiles, birds and mammals, i.e. the familiar, very broad categories. A reference species list of animals, divided along class levels, might thus include the zebrafish, the African clawed frog, the fruit fly, humans, the raven and the oyster. Even though they are all fungi, our reference species are no more similar to each other than these animals are.





Reference species

To select a set of diverse species, the names of all genome-sequenced fungi were loaded into the NCBI's Common Taxonomic Tree tool (http://www.ncbi.nlm.nih.gov/Taxonomy/CommonTree/wwwcmt.cgi). Ten representative species that are well distributed across the tree were then selected manually. The selected species are:


Name                       BICODE  Tax ID   Classification
Phylum Ascomycota
Aspergillus nidulans       ASPNI   162425   Subphylum Pezizomycotina; Class Eurotiomycetes
Bipolaris oryzae           BIPOR   101162   Subphylum Pezizomycotina; Class Dothideomycetes
Neurospora crassa          NEUCR   5141     Subphylum Pezizomycotina; Class Sordariomycetes
Saccharomyces cerevisiae   SACCE   4932     Subphylum Saccharomycotina
Schizosaccharomyces pombe  SCHPO   4896     Subphylum Taphrinomycotina
Phylum Basidiomycota
Coprinopsis cinerea        COPCI   5346     Subphylum Agaricomycotina; Class Agaricomycetes
Cryptococcus neoformans    CRYNE   5207     Subphylum Agaricomycotina; Class Tremellomycetes
Puccinia graminis          PUCGR   5297     Subphylum Pucciniomycotina
Ustilago maydis            USTMA   5270     Subphylum Ustilaginomycotina
Wallemia mellicola         WALME   1708541  Subphylum Wallemiales incertae sedis
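
The BICODE column appears to follow a simple rule: the first three letters of the genus plus the first two letters of the species epithet, in upper case. The source does not state this rule, so the following is only a sketch under that assumption; the function name makeBiCode is hypothetical.

```r
# Hypothetical helper: derive a five-letter code from a binomial name.
# Assumption (inferred from the table, not stated in the source):
# code = first 3 letters of the genus + first 2 letters of the species.
makeBiCode <- function(binomial) {
  parts <- strsplit(binomial, " ", fixed = TRUE)
  vapply(parts, function(p) {
    toupper(paste0(substr(p[1], 1, 3), substr(p[2], 1, 2)))
  }, character(1))
}

makeBiCode(c("Aspergillus nidulans", "Saccharomyces cerevisiae"))
# "ASPNI" "SACCE"
```

Applied to all ten names in the table, this rule reproduces the listed codes.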


 


Entrez

Entrez selection code, e.g. for BLAST searches
"Wallemia mellicola"[organism] OR
"Puccinia graminis"[organism] OR
"Ustilago maydis"[organism] OR
"Cryptococcus neoformans"[organism] OR
"Coprinopsis cinerea"[organism] OR
"Schizosaccharomyces pombe"[organism] OR
"Aspergillus nidulans"[organism] OR
"Neurospora crassa"[organism] OR
"Bipolaris oryzae"[organism] OR
"Saccharomyces cerevisiae"[organism]
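
A selection string like the one above does not need to be typed by hand. As a minimal sketch, it could be assembled from a vector of species names; the variable names species and entrezQuery are illustrative only.

```r
# Build an Entrez organism filter from a vector of binomial names.
# Variable names are illustrative, not taken from the course code.
species <- c("Wallemia mellicola",
             "Puccinia graminis",
             "Ustilago maydis",
             "Cryptococcus neoformans",
             "Coprinopsis cinerea",
             "Schizosaccharomyces pombe",
             "Aspergillus nidulans",
             "Neurospora crassa",
             "Bipolaris oryzae",
             "Saccharomyces cerevisiae")

# Wrap each name in quotes, add the [organism] field tag,
# then join the terms with OR, one term per line.
terms <- sprintf('"%s"[organism]', species)
entrezQuery <- paste(terms, collapse = " OR\n")

cat(entrezQuery, "\n")
```

Generating the string from the same vector that is used elsewhere keeps the spelling of the names consistent across searches.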


 

Tax ID

Taxonomy IDs, e.g. for the NCBI taxonomy browser
4896
4932
5141
5270
5297
5346
5207
101162
162425
1708541


 

Trees

Text tree, based on the NCBI Taxonomy Common Tree
Dikarya
 |
 +--Basidiomycota
 |   |
 |   +-Agaricomycotina
 |   |  +-Wallemia mellicola
 |   |  +-Coprinopsis cinerea
 |   |  +-Cryptococcus neoformans
 |   |
 |   +-Puccinia graminis
 |   +-Ustilago maydis
 |
 +--Ascomycota
     |
     +-Schizosaccharomyces pombe
     |
     +-saccharomyceta
        +-Saccharomyces cerevisiae
        |
        +-leotiomyceta
           +-Aspergillus nidulans
           +-Neurospora crassa
           +-Bipolaris oryzae


 
Newick tree format (as read by Phylip), e.g. to plot cladograms
(
(
'Wallemia mellicola':4,
'Puccinia graminis':4,
'Ustilago maydis':4,
(
'Coprinopsis cinerea':4,
'Cryptococcus neoformans':4
)Agaricomycotina:4
)Basidiomycota:4,
(
(
(
'Aspergillus nidulans':4,
'Bipolaris oryzae':4,
'Neurospora crassa':4
)leotiomyceta:4,
'Saccharomyces cerevisiae':4
)saccharomyceta:4,
'Schizosaccharomyces pombe':4
)Ascomycota:4
)Dikarya:4;
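
Before passing a hand-edited tree string to a plotting program, it can help to check it. The sketch below uses only base R; it assumes the tree is held in a single string (here called nwk, an illustrative name), checks that the parentheses balance, and extracts the quoted leaf labels.

```r
# The tree from above as one string (variable name is illustrative).
nwk <- paste0(
  "((",
  "'Wallemia mellicola':4,'Puccinia graminis':4,'Ustilago maydis':4,",
  "('Coprinopsis cinerea':4,'Cryptococcus neoformans':4)Agaricomycotina:4",
  ")Basidiomycota:4,",
  "(((",
  "'Aspergillus nidulans':4,'Bipolaris oryzae':4,'Neurospora crassa':4",
  ")leotiomyceta:4,'Saccharomyces cerevisiae':4",
  ")saccharomyceta:4,'Schizosaccharomyces pombe':4",
  ")Ascomycota:4",
  ")Dikarya:4;")

# Count opening and closing parentheses; they must match.
chars  <- strsplit(nwk, "")[[1]]
nOpen  <- sum(chars == "(")
nClose <- sum(chars == ")")

# Leaf labels are the single-quoted strings; strip the quotes.
tips <- gsub("'", "", regmatches(nwk, gregexpr("'[^']+'", nwk))[[1]])

nOpen == nClose   # TRUE
length(tips)      # 10
```

This only validates the syntax loosely; a dedicated phylogenetics package would be needed to parse and draw the tree.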


 
Cladogram, drawn with the Phylip program retree
                 ┌──────────── Schizosaccharomyces pombe
                 │  
                 │                          ┌───────────── Aspergillus nidulans
   ┌─────────────+                          │  
   │             │             ┌────────────+───────────── Bipolaris oryzae
   │             │             │            │  
   │             └─────────────+            └───────────── Neurospora crassa
   │                           │  
 ──+                           └──────────── Saccharomyces cerevisiae
   │  
   │                           ┌──────────── Cryptococcus neoformans
   │             ┌─────────────+  
   │             │             └───────────── Coprinopsis cinerea
   │             │  
   └─────────────+───────────── Ustilago maydis
                 │  
                 ├───────────── Puccinia graminis
                 │  
                 └───────────── Wallemia mellicola


 

R

A vector of binomial species names.
REFspecies <- c("Aspergillus nidulans",
                "Bipolaris oryzae",
                "Coprinopsis cinerea",
                "Cryptococcus neoformans",
                "Neurospora crassa",
                "Puccinia graminis",
                "Saccharomyces cerevisiae",
                "Schizosaccharomyces pombe",
                "Ustilago maydis",
                "Wallemia mellicola"
)
A data frame of binomial species and TaxIDs.
refTaxa <- data.frame(
              ID = as.integer(c(162425,
                                101162,
                                5141,
                                4932,
                                4896,
                                5346,
                                5207,
                                5297,
                                5270,
                                1708541)),
              species = c("Aspergillus nidulans",
                          "Bipolaris oryzae",
                          "Neurospora crassa",
                          "Saccharomyces cerevisiae",
                          "Schizosaccharomyces pombe",
                          "Coprinopsis cinerea",
                          "Cryptococcus neoformans",
                          "Puccinia graminis",
                          "Ustilago maydis",
                          "Wallemia mellicola"),
              stringsAsFactors = FALSE)
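
As an illustration of how such a data frame might be used, the sketch below looks up a taxonomy ID by name and sorts the rows alphabetically by species. It rebuilds the data frame so that it runs on its own; the helper name taxID is hypothetical.

```r
# Self-contained copy of the reference data (same values as above).
refTaxa <- data.frame(
  ID = c(162425L, 101162L, 5141L, 4932L, 4896L,
         5346L, 5207L, 5297L, 5270L, 1708541L),
  species = c("Aspergillus nidulans", "Bipolaris oryzae",
              "Neurospora crassa", "Saccharomyces cerevisiae",
              "Schizosaccharomyces pombe", "Coprinopsis cinerea",
              "Cryptococcus neoformans", "Puccinia graminis",
              "Ustilago maydis", "Wallemia mellicola"),
  stringsAsFactors = FALSE)

# Hypothetical helper: return the taxonomy ID for a binomial name,
# or NA if the name is not in the table. Matching is exact and
# case-sensitive, so spelling must agree with the table.
taxID <- function(name) {
  refTaxa$ID[match(name, refTaxa$species)]
}

taxID("Saccharomyces cerevisiae")   # 4932
taxID("Puccinia Graminis")          # NA: capitalisation differs

# Rows in alphabetical order of species name.
sortedTaxa <- refTaxa[order(refTaxa$species), ]
```

Because the lookup is case-sensitive, the names in the data frame need to be spelled exactly as in the vector of species names.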



 

Mbp1 orthologues

Reciprocal best matches (RBMs) to MBP1_SACCE
    name               RefSeqID       UniProtID
 1  MBP1_ASPNI         XP_660758      Q5B8H6
 2  MBP1_BIPOR         XP_007682304   W6ZM86
 3  MBP1_NEUCR         XP_955821      Q7RW59
 4  MBP1_SACCE         NP_010227      P39678
 5  MBP1_SCHPO (Res2)  NP_593032      P41412
 6  MBP1_COPCI         XP_001837394   A8NYC6
 7  MBP1_CRYNE         XP_569090      Q5KMQ9
 8  MBP1_PUCGR         XP_003327086   E3KED4
 9  MBP1_USTMA         XP_011392621   A0A0D1DP35
10  MBP1_WALME         XP_006957051   I4YGC0


 

Further reading and resources

Ebersberger et al. (2012) A consistent phylogenetic backbone for the fungi. Mol Biol Evol 29:1319-34. (pmid: 22114356)

The kingdom of fungi provides model organisms for biotechnology, cell biology, genetics, and life sciences in general. Only when their phylogenetic relationships are stably resolved, can individual results from fungal research be integrated into a holistic picture of biology. However, and despite recent progress, many deep relationships within the fungi remain unclear. Here, we present the first phylogenomic study of an entire eukaryotic kingdom that uses a consistency criterion to strengthen phylogenetic conclusions. We reason that branches (splits) recovered with independent data and different tree reconstruction methods are likely to reflect true evolutionary relationships. Two complementary phylogenomic data sets based on 99 fungal genomes and 109 fungal expressed sequence tag (EST) sets analyzed with four different tree reconstruction methods shed light from different angles on the fungal tree of life. Eleven additional data sets address specifically the phylogenetic position of Blastocladiomycota, Ustilaginomycotina, and Dothideomycetes, respectively. The combined evidence from the resulting trees supports the deep-level stability of the fungal groups toward a comprehensive natural system of the fungi. In addition, our analysis reveals methodologically interesting aspects. Enrichment for EST encoded data (a common practice in phylogenomic analyses) introduces a strong bias toward slowly evolving and functionally correlated genes. Consequently, the generalization of phylogenomic data sets as collections of randomly selected genes cannot be taken for granted. A thorough characterization of the data to assess possible influences on the tree reconstruction should therefore become a standard in phylogenomic analyses.