Difference between revisions of "BIN-FUNC-GO"
m |
m |
||
Line 40: | Line 40: | ||
<!-- included from "ABC-unit_components.wtxt", section: "notes-prerequisites" --> | <!-- included from "ABC-unit_components.wtxt", section: "notes-prerequisites" --> | ||
You need to complete the following units before beginning this one: | You need to complete the following units before beginning this one: | ||
− | *[[BIN-FUNC-Databases]] | + | *[[BIN-FUNC-Databases|BIN-FUNC-Databases (Molecular Function Databases)]] |
{{Vspace}} | {{Vspace}} | ||
Line 221: | Line 221: | ||
}} | }} | ||
+ | ==Quick GO== | ||
+ | ==GO Slims== | ||
+ | |||
+ | http://www.geneontology.org/page/go-slim-and-subset-guide | ||
Revision as of 12:32, 6 October 2017
Gene Ontology
Keywords: Ontologies in knowledge engineering, GO and GOA
Contents
This unit is under development. There is some contents here but it is incomplete and/or may change significantly: links may lead to nowhere, the contents is likely going to be rearranged, and objectives, deliverables etc. may be incomplete or missing. Do not work with this material until it is updated to "live" status.
Abstract
...
This unit ...
Prerequisites
You need to complete the following units before beginning this one:
Objectives
...
Outcomes
...
Deliverables
- Time management: Before you begin, estimate how long it will take you to complete this unit. Then, record in your course journal: the number of hours you estimated, the number of hours you worked on the unit, and the amount of time that passed between start and completion of this unit.
- Journal: Document your progress in your Course Journal. Some tasks may ask you to include specific items in your journal. Don't overlook these.
- Insights: If you find something particularly noteworthy about this unit, make a note in your insights! page.
Evaluation
Evaluation: NA
- This unit is not evaluated for course marks.
Contents
Task:
- Read the introductory notes on the Gene Ontology project to define and annotate gene function.
GO
Introduction
Harris (2008) Developing an ontology. Methods Mol Biol 452:111-24. (pmid: 18563371) |
Hackenberg & Matthiesen (2010) Algorithms and methods for correlating experimental results with annotation databases. Methods Mol Biol 593:315-40. (pmid: 19957156) |
GO
The Gene Ontology project is the most influential contributor to the definition of function in computational biology and the use of GO terms and GO annotations is ubiquitous.
GO: the Gene Ontology project [ link ] [ page ] Expand... | ![]() |
du Plessis et al. (2011) The what, where, how and why of gene ontology--a primer for bioinformaticians. Brief Bioinformatics 12:723-35. (pmid: 21330331) |
The GO actually comprises three separate ontologies:
- Molecular function
- ...
- Biological Process
- ...
- Cellular component
- ...
GO terms
GO terms comprise the core of the information in the ontology: a carefully crafted definition of a term in any of GO's separate ontologies.
GO relationships
The nature of the relationships is as much a part of the ontology as the terms themselves. GO uses three categories of relationships:
- is a
- part of
- regulates
GO annotations
The GO terms are conceptual in nature, and while they represent our interpretation of biological phenomena, they do not intrinsically represent biological objects, such a specific genes or proteins. In order to link molecules with these concepts, the ontology is used to annotate genes. The annotation project is referred to as GOA.
Dimmer et al. (2007) Methods for gene ontology annotation. Methods Mol Biol 406:495-520. (pmid: 18287709) |
GO evidence codes
Annotations can be made according to literature data or computational inference and it is important to note how an annotation has been justified by the curator to evaluate the level of trust we should have in the annotation. GO uses evidence codes to make this process transparent. When computing with the ontology, we may want to filter (exclude) particular terms in order to avoid tautologies: for example if we were to infer functional relationships between homologous genes, we should exclude annotations that have been based on the same inference or similar, and compute only with the actual experimental data.
The following evidence codes are in current use; if you want to exclude inferred anotations you would restrict the codes you use to the ones shown in bold: EXP, IDA, IPI, IMP, IEP, and perhaps IGI, although the interpretation of genetic interactions can require assumptions.
- Automatically-assigned Evidence Codes
- IEA: Inferred from Electronic Annotation
- Curator-assigned Evidence Codes
- Experimental Evidence Codes
- EXP: Inferred from Experiment
- IDA: Inferred from Direct Assay
- IPI: Inferred from Physical Interaction
- IMP: Inferred from Mutant Phenotype
- IGI: Inferred from Genetic Interaction
- IEP: Inferred from Expression Pattern
- Computational Analysis Evidence Codes
- ISS: Inferred from Sequence or Structural Similarity
- ISO: Inferred from Sequence Orthology
- ISA: Inferred from Sequence Alignment
- ISM: Inferred from Sequence Model
- IGC: Inferred from Genomic Context
- IBA: Inferred from Biological aspect of Ancestor
- IBD: Inferred from Biological aspect of Descendant
- IKR: Inferred from Key Residues
- IRD: Inferred from Rapid Divergence
- RCA: inferred from Reviewed Computational Analysis
- Author Statement Evidence Codes
- TAS: Traceable Author Statement
- NAS: Non-traceable Author Statement
- Curator Statement Evidence Codes
- IC: Inferred by Curator
- ND: No biological Data available
For further details, see the Guide to GO Evidence Codes and the GO Evidence Code Decision Tree.
GO tools
For many projects, the simplest approach will be to download the GO ontology itself. It is a well constructed, easily parseable file that is well suited for computation. For details, see Computing with GO on this wiki.
AmiGO
practical work with GO: at first via the AmiGO browser AmiGO is a GO browser developed by the Gene Ontology consortium and hosted on their website.
AmiGO - Gene products
Task:
- Navigate to the GO homepage.
- Enter
SOX2
into the search box to initiate a search for the human SOX2 transcription factor (WP, HUGO) (as gene or protein name). - There are a number of hits in various organisms: sulfhydryl oxidases and (sex determining region Y)-box genes. Check to see the various ways by which you could filter and restrict the results.
- Select Homo sapiens as the species filter and set the filter. Note that this still does not give you a unique hit, but ...
- ... you can identify the Transcription factor SOX-2 and follow its gene product information link. Study the information on that page.
- Later, we will need Entrez Gene IDs. The GOA information page provides these as GeneID in the External references section. Note it down. With the same approach, find and record the Gene IDs (a) of the functionally related Oct4 (POU5F1) protein, (b) the human cell-cycle transcription factor E2F1, (c) the human bone morphogenetic protein-4 transforming growth factor BMP4, (d) the human UDP glucuronosyltransferase 1 family protein 1, an enzyme that is differentially expressed in some cancers, UGT1A1, and (d) as a positive control, SOX2's interaction partner NANOG, which we would expect to be annotated as functionally similar to both Oct4 and SOX2.
AmiGO - Associations
GO annotations for a protein are called associations.
Task:
- Open the associations information page for the human SOX2 protein via the link in the right column in a separate tab. Study the information on that page.
- Note that you can filter the associations by ontology and evidence code. You have read about the three GO ontologies in your previous assignment, but you should also be familiar with the evidence codes. Click on any of the evidence links to access the Evidence code definition page and study the definitions of the codes. Make sure you understand which codes point to experimental observation, and which codes denote computational inference, or say that the evidence is someone's opinion (TAS, IC etc.). Note: it is good practice - but regrettably not universally implemented standard - to clearly document database semantics and keep definitions associated with database entries easily accessible, as GO is doing here. You won't find this everywhere, but as a user please feel encouraged to complain to the database providers if you come across a database where the semantics are not clear. Seriously: opaque semantics make database annotations useless.
- There are many associations (around 60) and a good way to select which ones to pursue is to follow the most specific ones. Set
IDA
as a filter and among the returned terms selectGO:0035019
– somatic stem cell maintenance in the Biological Process ontology. Follow that link. - Study the information available on that page and through the tabs on the page, especially the graph view.
- In the Inferred Tree View tab, find the genes annotated to this go term for homo sapiens. There should be about 55. Click on the number behind the term. The resulting page will give you all human proteins that have been annotated with this particular term. Note that the great majority of these is via the
IEA
evidence code.
Quick GO
GO Slims
http://www.geneontology.org/page/go-slim-and-subset-guide
Further reading, links and resources
Gene Ontology Consortium (2012) The Gene Ontology: enhancements for 2011. Nucleic Acids Res 40:D559-64. (pmid: 22102568) |
Bastos et al. (2011) Application of gene ontology to gene identification. Methods Mol Biol 760:141-57. (pmid: 21779995) |
Gene Ontology Consortium (2010) The Gene Ontology in 2010: extensions and refinements. Nucleic Acids Res 38:D331-5. (pmid: 19920128) |
Carol Goble on the tension between purists and pragmatists in life-science ontology construction. Plenary talk at SOFG2...
Goble & Wroe (2004) The Montagues and the Capulets. Comp Funct Genomics 5:623-32. (pmid: 18629186) |
Gajiwala & Burley (2000) Winged helix proteins. Curr Opin Struct Biol 10:110-6. (pmid: 10679470) |
Aravind et al. (2005) The many faces of the helix-turn-helix domain: transcription regulation and beyond. FEMS Microbiol Rev 29:231-62. (pmid: 15808743) |
Notes
Self-evaluation
If in doubt, ask! If anything about this learning unit is not clear to you, do not proceed blindly but ask for clarification. Post your question on the course mailing list: others are likely to have similar problems. Or send an email to your instructor.
About ...
Author:
- Boris Steipe <boris.steipe@utoronto.ca>
Created:
- 2017-08-05
Modified:
- 2017-08-05
Version:
- 0.1
Version history:
- 0.1 First stub
This copyrighted material is licensed under a Creative Commons Attribution 4.0 International License. Follow the link to learn more.