Difference between revisions of "BIN-FUNC-GO"
Revision as of 12:38, 16 September 2020
Gene Ontology
(Ontologies in knowledge engineering, GO and GOA)
Abstract:
Introduction to the Gene Ontology (GO) and Gene Ontology Annotations (GOA).
Objectives:
Outcomes:
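Deliverables:
- Time management: Before you begin, estimate how long it will take you to complete this unit. Then, record in your course journal: the number of hours you estimated, the number of hours you worked on the unit, and the amount of time that passed between start and completion of this unit.
- Journal: Document your progress in your Course Journal. Some tasks may ask you to include specific items in your journal. Don't overlook these.
- Insights: If you find something particularly noteworthy about this unit, make a note in your insights! page.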
Prerequisites:
This unit builds on material covered in the following prerequisite units:
- BIN-FUNC-Databases (Molecular Function Databases)
Introduction
The Gene Ontology project is the most influential contributor to the definition of function in computational biology, and the use of GO terms and GO annotations is ubiquitous.
Task:
- Read the introductory notes on the Gene Ontology project to define and annotate gene function.
- Browse through the paper describing the 2017 update on the GO database and tools; in particular take note of the LEGO initiative that aims to build systems and pathway models by combining suitable GO terms.
The Gene Ontology Consortium (2017) Expansion of the Gene Ontology knowledgebase and resources. Nucleic Acids Res 45:D331-D338. (pmid: 27899567)
Remember the three separate component ontologies of GO:
- Molecular function
- Biological process
- Cellular component
GO evidence codes
Annotations can be based on literature data or on computational inference, and it is important to note how the curator has justified an annotation in order to evaluate how much trust we should place in it. GO uses evidence codes to make this process transparent. When computing with the ontology, we may want to filter (exclude) particular annotations in order to avoid tautologies: for example, if we were to infer functional relationships between homologous genes, we should exclude annotations that were themselves based on the same or a similar inference, and compute only with the actual experimental data.
The following evidence codes are in current use. If you want to exclude inferred annotations, restrict the codes you use to the experimental ones: EXP, IDA, IPI, IMP, IEP, and perhaps IGI, although the interpretation of genetic interactions can require assumptions. The codes are ubiquitous and important: you need to know what they mean and imply when working with GOA data.
- Automatically-assigned Evidence Codes
  - IEA: Inferred from Electronic Annotation
- Curator-assigned Evidence Codes
  - Experimental Evidence Codes
    - EXP: Inferred from Experiment
    - IDA: Inferred from Direct Assay
    - IPI: Inferred from Physical Interaction
    - IMP: Inferred from Mutant Phenotype
    - IEP: Inferred from Expression Pattern
    - IGI: Inferred from Genetic Interaction
  - Computational Analysis Evidence Codes
    - ISS: Inferred from Sequence or Structural Similarity
    - ISO: Inferred from Sequence Orthology
    - ISA: Inferred from Sequence Alignment
    - ISM: Inferred from Sequence Model
    - IGC: Inferred from Genomic Context
    - IBA: Inferred from Biological aspect of Ancestor
    - IBD: Inferred from Biological aspect of Descendant
    - IKR: Inferred from Key Residues
    - IRD: Inferred from Rapid Divergence
    - RCA: Inferred from Reviewed Computational Analysis
  - Author Statement Evidence Codes
    - TAS: Traceable Author Statement
    - NAS: Non-traceable Author Statement
  - Curator Statement Evidence Codes
    - IC: Inferred by Curator
    - ND: No biological Data available
For further details, see the Guide to GO Evidence Codes and the GO Evidence Code Decision Tree.
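The evidence-code filter described above can be sketched in a few lines of Python. This is a minimal sketch, not a definitive implementation: it assumes annotations arrive as tab-separated GAF-style rows (gene symbol in column 3, GO ID in column 5, evidence code in column 7, aspect in column 9, comment lines starting with "!"). The function name `experimental_only`, the helper `make_row`, and all sample rows are made up for illustration.

```python
# Experimental evidence codes named in the text above. IGI is kept optional
# because interpreting genetic interactions can require assumptions.
EXPERIMENTAL = {"EXP", "IDA", "IPI", "IMP", "IEP"}


def experimental_only(gaf_lines, include_igi=False):
    """Yield (symbol, go_id, evidence, aspect) for experimentally supported rows."""
    allowed = EXPERIMENTAL | ({"IGI"} if include_igi else set())
    for line in gaf_lines:
        # Skip blank lines and GAF comment/header lines.
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete annotation row
        symbol, go_id, evidence, aspect = cols[2], cols[4], cols[6], cols[8]
        if evidence in allowed:
            yield symbol, go_id, evidence, aspect


def make_row(symbol, go_id, evidence, aspect):
    """Build a made-up, truncated GAF-style row (first nine columns only)."""
    return "\t".join(
        ["SGD", "S000000000", symbol, "", go_id, "PMID:0000000", evidence, "", aspect]
    )


# Hypothetical annotations: gene names and evidence assignments are invented.
SAMPLE = [
    "!gaf-version: 2.2",
    make_row("GENE_A", "GO:0000082", "IDA", "P"),
    make_row("GENE_B", "GO:0000082", "IEA", "P"),
    make_row("GENE_C", "GO:0000082", "IGI", "P"),
    make_row("GENE_D", "GO:0000082", "ISS", "P"),
]

for row in experimental_only(SAMPLE, include_igi=True):
    print(row)
```

Keeping IGI behind a flag mirrors the "and perhaps IGI" caveat: the default is the stricter set, and including genetic interactions is an explicit decision.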
GO tools
For many projects, the simplest approach will be to download the GO ontology itself. It is a well constructed, easily parseable file that is well suited for computation.
Bioconductor has a large number of packages that supply and analyze GO and GOA data.
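To illustrate why the downloaded ontology file is easy to compute with, here is a minimal Python sketch of a parser. It assumes the file is in the OBO flat-file format, with `[Term]` stanzas made of `key: value` lines such as `id`, `name`, `namespace`, `is_a`, and `relationship: part_of`. The function name `parse_obo` and the sample stanzas (with `TOY:` identifiers) are made up for illustration. A real project would more likely use an existing parser.

```python
def parse_obo(lines):
    """Return {term_id: {"name", "namespace", "is_a": [...], "part_of": [...]}}."""
    terms = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("["):
            # A new stanza begins; only [Term] stanzas are collected.
            if line == "[Term]":
                current = {"name": "", "namespace": "", "is_a": [], "part_of": []}
            else:
                current = None
            continue
        if current is None or ": " not in line:
            continue  # header lines, blank lines, or non-Term stanzas
        key, value = line.split(": ", 1)
        if key == "id":
            terms[value] = current
        elif key in ("name", "namespace"):
            current[key] = value
        elif key == "is_a":
            # Drop the trailing "! parent name" comment.
            current["is_a"].append(value.split(" ! ")[0].strip())
        elif key == "relationship" and value.startswith("part_of "):
            current["part_of"].append(value.split()[1])
    return terms


# Made-up stanzas in OBO style; the TOY identifiers are not real GO terms.
SAMPLE_OBO = """\
format-version: 1.2

[Term]
id: TOY:0000001
name: toy process
namespace: biological_process

[Term]
id: TOY:0000002
name: toy subprocess
namespace: biological_process
is_a: TOY:0000001 ! toy process

[Term]
id: TOY:0000003
name: toy step
namespace: biological_process
relationship: part_of TOY:0000002 ! toy subprocess

[Typedef]
id: part_of
name: part of
"""

toy_terms = parse_obo(SAMPLE_OBO.splitlines())
for term_id, term in toy_terms.items():
    print(term_id, term["name"], term["is_a"], term["part_of"])
```

The result is a plain dictionary keyed by term ID, which is enough to follow is-a and part-of links upward from any term.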
AmiGO
AmiGO is a convenient online GO browser developed by the Gene Ontology Consortium and hosted on their website.
AmiGO - Gene products
Task:
- Navigate to the GO homepage.
- Enter Mbp1 into the search box to initiate a search for the yeast Mbp1 transcription factor (as gene or protein name).
- There are three categories of hits: Ontology terms directly associated with the search string, Genes and gene products annotated to terms in GOA, and Annotations of terms to any of the genes. As usual, we need to be wary of keyword searches since they rarely identify a unique gene, so we check the Genes... category first. Follow the link.
- From the resulting table you can easily identify the correct gene. Follow its link to the associated Gene Information page. Study the information on that page.
- Note that this page lists Associations, i.e. GO terms that have been associated with Mbp1 in GOA.
AmiGO - Associations
GO annotations for a protein are called associations.
Task:
- Use the Results count selector and increase the number of annotations to show all Gene Product Associations on the page. Note the evidence codes.
- Note that you can expand the left-hand menu for detailed filtering. Click on Ontology (aspect) to show or hide the terms for the three different component ontologies, the GO "aspects": F, C, and P (what were these again? This is one thing you must remember.).
- The most specific annotation on the page seems to be "positive regulation of transcription involved in G1/S transition of mitotic cell cycle". Follow the link.
- Note that you can now filter for organisms. Restrict the organism to Saccharomyces cerevisiae S288C by clicking on the green (+) sign. Note that you now see all yeast genes that are annotated to this term! This is an effective way to build system membership information from the bottom up.
- There are a number of tabs available for different views on the data: Annotations, Graph Views, Inferred Tree View, Neighborhood, and Mappings. Visit them.
- The link to QuickGO from the Graph Views tab gives you the entire ancestor chart of the term, with clickable term nodes. You need to consider the ancestor terms to expand searches for related, collaborating genes. For example, if a gene is annotated to a "positive regulation ..." term, you will need to consider genes associated with the cognate "negative regulation ..." or just "regulation ..." terms as well to get a complete picture of the gene's activities.
- Neighborhood refers to the ancestors and children of a term.
- Study the information available on that page and through the tabs on the page, especially the graph view.
- Navigate to the Inferred Tree View tab. Note that terms are labelled with icons that signify the category of the relationship: P: "part-of", I: "is-a", and R: "regulates". Find the two-removed ancestor node: "GO:0000082 G1/S transition of mitotic cell cycle", of which GO:0071931 is a part. Follow the link.
- On the Annotations tab of GO:0000082, filter the list to S. cerevisiae genes. As of today there are 143 annotated genes. Are these genes specifically annotated to that term, or does the list include genes that are annotated to descendants of the term?
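The difference between genes annotated directly to a term and genes annotated to any of its descendants can be made concrete with a small Python sketch. Everything here is a toy under stated assumptions: the graph contains only upward is-a/part-of links, the intermediate node `TOY:intermediate` is a placeholder for the term that sits between GO:0071931 and GO:0000082, and `GENE_X` is an invented gene. The function names `ancestors` and `genes_annotated` are made up for illustration. The sketch shows the two ways of counting; it does not claim which one AmiGO uses.

```python
# Upward links (is-a or part-of), child -> list of parents.
# "TOY:intermediate" is a hypothetical placeholder, not a real GO term.
PARENTS = {
    "GO:0071931": ["TOY:intermediate"],
    "TOY:intermediate": ["GO:0000082"],
    "GO:0000082": [],
}

# Direct annotations, gene -> set of terms. GENE_X is invented.
DIRECT = {
    "MBP1": {"GO:0071931"},
    "GENE_X": {"GO:0000082"},
}


def ancestors(term, parents):
    """Return every term reachable from `term` by following parent links."""
    seen = set()
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def genes_annotated(term, direct, parents, propagate=True):
    """Genes annotated to `term`, optionally including its descendants."""
    genes = set()
    for gene, terms in direct.items():
        for annotated in terms:
            if annotated == term:
                genes.add(gene)
            elif propagate and term in ancestors(annotated, parents):
                genes.add(gene)
    return genes


print(sorted(genes_annotated("GO:0000082", DIRECT, PARENTS, propagate=False)))
print(sorted(genes_annotated("GO:0000082", DIRECT, PARENTS, propagate=True)))
```

Comparing the two result sets for the same term is a quick way to check which counting rule a gene list follows.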
GO Slims
GO is large and very detailed, and the need for somewhat more high-level descriptions in model organisms is met by the GO slim datasets that are curated by some of the main model-organism databases and consortia. Follow the link and read more about GO slims (short).
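One way to picture what a GO slim does is to map a detailed term upward to the nearest terms that belong to the slim. The Python sketch below does this on a toy graph. It is an assumption-laden illustration, not the procedure used by any official mapping tool: the `TOY:` identifiers, the slim set, and the function name `map_to_slim` are all made up, and "nearest" here simply means "stop climbing a path once a slim term is reached".

```python
from collections import deque

# Toy ontology: child -> parents. None of these are real GO terms.
TOY_PARENTS = {
    "TOY:0000005": ["TOY:0000003", "TOY:0000004"],
    "TOY:0000004": ["TOY:0000002"],
    "TOY:0000003": ["TOY:0000001"],
    "TOY:0000002": ["TOY:0000001"],
    "TOY:0000001": [],
}

# Hypothetical slim: a small subset of high-level terms.
TOY_SLIM = {"TOY:0000001", "TOY:0000002"}


def map_to_slim(term, parents, slim):
    """Return the slim terms first reached when climbing upward from `term`."""
    hits = set()
    seen = {term}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        if current in slim:
            hits.add(current)
            continue  # do not climb past a slim term on this path
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return hits


for term in ("TOY:0000005", "TOY:0000004", "TOY:0000002"):
    print(term, sorted(map_to_slim(term, TOY_PARENTS, TOY_SLIM)))
```

A term with several parents can map to more than one slim term, which is why the function returns a set rather than a single identifier.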
Self-evaluation
Notes
Further reading, links and resources
The Gene Ontology Consortium (2017) Expansion of the Gene Ontology knowledgebase and resources. Nucleic Acids Res 45:D331-D338. (pmid: 27899567)
Gene Ontology Consortium (2012) The Gene Ontology: enhancements for 2011. Nucleic Acids Res 40:D559-64. (pmid: 22102568)
Gene Ontology Consortium (2010) The Gene Ontology in 2010: extensions and refinements. Nucleic Acids Res 38:D331-5. (pmid: 19920128)
Bastos et al. (2011) Application of gene ontology to gene identification. Methods Mol Biol 760:141-57. (pmid: 21779995)
du Plessis et al. (2011) The what, where, how and why of gene ontology--a primer for bioinformaticians. Brief Bioinformatics 12:723-35. (pmid: 21330331)
Hackenberg & Matthiesen (2010) Algorithms and methods for correlating experimental results with annotation databases. Methods Mol Biol 593:315-40. (pmid: 19957156)
Carol Goble on the tension between purists and pragmatists in life-science ontology construction. Plenary talk at SOFG2...
Goble & Wroe (2004) The Montagues and the Capulets. Comp Funct Genomics 5:623-32. (pmid: 18629186)
Harris (2008) Developing an ontology. Methods Mol Biol 452:111-24. (pmid: 18563371)
Dimmer et al. (2007) Methods for gene ontology annotation. Methods Mol Biol 406:495-520. (pmid: 18287709)
If in doubt, ask! If anything about this learning unit is not clear to you, do not proceed blindly but ask for clarification. Post your question on the course mailing list: others are likely to have similar problems. Or send an email to your instructor.
About ...
Author:
- Boris Steipe <boris.steipe@utoronto.ca>
Created:
- 2017-08-05
Modified:
- 2017-11-12
Version:
- 1.0
Version history:
- 1.0 First live
- 0.1 First stub
This copyrighted material is licensed under a Creative Commons Attribution 4.0 International License. Follow the link to learn more.