Difference between revisions of "ABC-INT-GO categories"
Line 1: | Line 1: | ||
<div id="ABC"> | <div id="ABC"> | ||
− | <div style="padding:5px; border: | + | <div style="padding:5px; border:4px solid #000000; background-color:#e19fa7; font-size:300%; font-weight:400; color: #000000; width:100%;"> |
Integrator Unit: GO term categories | Integrator Unit: GO term categories | ||
<div style="padding:5px; margin-top:20px; margin-bottom:10px; background-color:#e19fa7; font-size:30%; font-weight:200; color: #000000; "> | <div style="padding:5px; margin-top:20px; margin-bottom:10px; background-color:#e19fa7; font-size:30%; font-weight:200; color: #000000; "> | ||
Line 14: | Line 14: | ||
<b>Abstract:</b><br /> | <b>Abstract:</b><br /> | ||
<section begin=abstract /> | <section begin=abstract /> | ||
− | This page integrates | + | This page integrates learning units on code development and data structures, graph theory and the Gene Ontology. |
<section end=abstract /> | <section end=abstract /> | ||
</div> | </div> | ||
Line 21: | Line 21: | ||
<b>Deliverables:</b><br /> | <b>Deliverables:</b><br /> | ||
<section begin=deliverables /> | <section begin=deliverables /> | ||
− | |||
<li><b>Integrator unit</b>: Deliverables can be submitted for course marks. See below for details.</li> | <li><b>Integrator unit</b>: Deliverables can be submitted for course marks. See below for details.</li> | ||
<section end=deliverables /> | <section end=deliverables /> | ||
Line 28: | Line 27: | ||
<section begin=prerequisites /> | <section begin=prerequisites /> | ||
<b>Prerequisites:</b><br /> | <b>Prerequisites:</b><br /> | ||
− | |||
This unit builds on material covered in the following prerequisite units:<br /> | This unit builds on material covered in the following prerequisite units:<br /> | ||
*[[BIN-FUNC-Semantic_similarity|BIN-FUNC-Semantic_similarity (Measuring "Semantic Similarity" in Ontologies)]] | *[[BIN-FUNC-Semantic_similarity|BIN-FUNC-Semantic_similarity (Measuring "Semantic Similarity" in Ontologies)]] | ||
Line 41: | Line 39: | ||
− | {{ | + | {{SLEEP}} |
{{Smallvspace}} | {{Smallvspace}} | ||
Line 52: | Line 50: | ||
=== Evaluation === | === Evaluation === | ||
− | |||
This "Integrator Unit" is not for evaluation. We will work through this unit in class to illustrate the process of translating requirements to tasks, and bringing a project to a defined conclusion, supported by the ABC knowledge network. | This "Integrator Unit" is not for evaluation. We will work through this unit in class to illustrate the process of translating requirements to tasks, and bringing a project to a defined conclusion, supported by the ABC knowledge network. | ||
Line 62: | Line 59: | ||
{{Smallvspace}} | {{Smallvspace}} | ||
== Contents == | == Contents == | ||
− | |||
{{Smallvspace}} | {{Smallvspace}} | ||
===Scenario=== | ===Scenario=== | ||
Line 103: | Line 99: | ||
--> | --> | ||
− | |||
− | |||
− | |||
− | |||
== Further reading, links and resources == | == Further reading, links and resources == | ||
<!-- {{#pmid: 19957275}} --> | <!-- {{#pmid: 19957275}} --> | ||
<!-- {{WWW|WWW_GMOD}} --> | <!-- {{WWW|WWW_GMOD}} --> | ||
<!-- <div class="reference-box">[http://www.ncbi.nlm.nih.gov]</div> --> | <!-- <div class="reference-box">[http://www.ncbi.nlm.nih.gov]</div> --> | ||
+ | == Notes == | ||
+ | <references /> | ||
{{Vspace}} | {{Vspace}} | ||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
<div class="about"> | <div class="about"> | ||
Line 141: | Line 123: | ||
*1.0 First live version | *1.0 First live version | ||
</div> | </div> | ||
− | |||
− | |||
{{CC-BY}} | {{CC-BY}} | ||
+ | [[Category:ABC-units]] | ||
+ | {{INTEGRATOR}} | ||
+ | {{SLEEP}} | ||
+ | {{EVAL}} | ||
</div> | </div> | ||
<!-- [END] --> | <!-- [END] --> |
Revision as of 01:38, 23 September 2020
Integrator Unit: GO term categories
(Integrator unit: define GO term selection)
Abstract:
This page integrates learning units on code development and data structures, graph theory and the Gene Ontology.
Deliverables:
Prerequisites:
This unit builds on material covered in the following prerequisite units:
Evaluation
This "Integrator Unit" is not for evaluation. We will work through this unit in class to illustrate the process of translating requirements to tasks, and bringing a project to a defined conclusion, supported by the ABC knowledge network.
- Report option
- Work through the tasks described in the scenario.
- Document your results in a short report on a subpage of your User page on the Student Wiki. Describe your methods (R-code!) in an appendix.
Contents
Scenario
This is panel B' of Figure 3 from Zhang et al.'s analysis of essential genes of Plasmodium falciparum by saturation mutagenesis[1]. A typical question for large-scale experiments that discover sets of genes is: what do these genes do? Is there a trend among functional categories? As you see, the data is derived from experiments, but the interpretation is entirely dependent on how these functional categories are defined.
How were these categories defined? Is there a principled way to do so?
The Gene Ontology contains more than 270,000 terms and these are far too many categories to provide a meaningful overview such as the one that Zhang et al. required. GO maintains a number of subsets, but the generic "GO slim" also contains more than 2,200 terms.
Your task is:
- to consider how a meaningful, balanced subset of GO terms can be defined at a resolution that the user can specify, say, between 10 and 100 terms;
- to write R code to produce such a subset, paying attention to R coding style, and best practice of software design, data management, and reproducible research;
- to document and evaluate your results.
The code should use go-basic.obo as its input; it can also use the Homo sapiens GOA data. Its output should be useful as input to a function that reads GOA data and a list of gene symbols, and associates each gene symbol with a GO term.
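To make the input and output requirements concrete, here is a minimal sketch of the data flow. The unit asks for R; Python is used here only to illustrate the logic. Everything in it is an assumption made for illustration: the `TOY:` identifiers and the embedded text stand in for go-basic.obo, the annotation dictionary stands in for parsed GOA data, only `is_a` relationships are followed, and the function names `parse_obo`, `ancestors` and `assign_genes` are hypothetical.

```python
# Toy stand-in for go-basic.obo (hypothetical IDs, not real GO terms).
TOY_OBO = """\
format-version: 1.2

[Term]
id: TOY:0001
name: root process

[Term]
id: TOY:0002
name: metabolism
is_a: TOY:0001 ! root process

[Term]
id: TOY:0003
name: transport
is_a: TOY:0001 ! root process

[Term]
id: TOY:0004
name: lipid metabolism
is_a: TOY:0002 ! metabolism

[Term]
id: TOY:0005
name: retired term
is_obsolete: true

[Typedef]
id: part_of
name: part of
"""


def parse_obo(text):
    """Return {term_id: set of is_a parent ids}, skipping obsolete terms."""
    stanzas = []
    current = None  # dict for the [Term] stanza being read, otherwise None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):  # any stanza header starts a new stanza
            current = {"is_a": set()} if line == "[Term]" else None
            if current is not None:
                stanzas.append(current)
        elif current is not None and ": " in line:
            tag, value = line.split(": ", 1)
            value = value.split(" ! ")[0].strip()  # drop trailing "! name"
            if tag == "is_a":
                current["is_a"].add(value)
            else:
                current[tag] = value
    return {s["id"]: s["is_a"] for s in stanzas
            if "id" in s and s.get("is_obsolete") != "true"}


def ancestors(term, parents):
    """All terms reachable via is_a edges, including the term itself."""
    seen, stack = set(), [term]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(parents.get(t, ()))
    return seen


def assign_genes(annotations, subset, parents):
    """Map each gene symbol to the subset terms at or above its annotations."""
    result = {}
    for gene, terms in annotations.items():
        hits = set()
        for t in terms:
            hits |= ancestors(t, parents) & subset
        result[gene] = sorted(hits)
    return result


parents = parse_obo(TOY_OBO)
# Hypothetical annotations, standing in for gene symbol / GO ID pairs from GOA.
annotations = {"geneA": ["TOY:0004"], "geneB": ["TOY:0003"], "geneC": ["TOY:0001"]}
assigned = assign_genes(annotations, {"TOY:0002", "TOY:0003"}, parents)
print(assigned)
```

In this sketch a gene can map to several subset terms, or to none when it is annotated only above the subset (geneC here). How to handle both cases is one of the requirements a project plan would need to pin down.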
- Note
- This task is open in the sense that there are potentially many suitable solutions: I expect that balancing terms will consider the number of children on the GO graph and/or the number of genes annotated to the various branches in GOA. Presumably your first task is to explore the various concepts around this task, formulate precise requirements, and to write a project plan.
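One of many possible readings of "balancing" can be sketched as a greedy refinement: start with the root, and repeatedly replace the selected term that covers the most annotated genes with its children, until the requested number of terms is reached. This is only an illustration under stated assumptions, not a reference solution. It is written in Python although the unit asks for R, the graph and gene sets are invented toy data, and `genes_under` and `balanced_subset` are hypothetical names.

```python
def genes_under(term, children, direct, cache):
    """Genes annotated to a term or to any of its descendants (memoized)."""
    if term not in cache:
        genes = set(direct.get(term, ()))
        for child in children.get(term, ()):
            genes |= genes_under(child, children, direct, cache)
        cache[term] = genes
    return cache[term]


def balanced_subset(root, children, direct, max_terms):
    """Greedily split the most heavily annotated term until max_terms is reached."""
    cache = {}
    selected = {root}
    while len(selected) < max_terms:
        candidates = []
        for t in selected:
            kids = set(children.get(t, ()))
            proposal = (selected - {t}) | kids
            # Only consider splits that keep the subset within the requested size.
            if kids and len(proposal) <= max_terms:
                n_genes = len(genes_under(t, children, direct, cache))
                candidates.append((n_genes, t, proposal))
        if not candidates:
            break  # nothing left that can be split within the size limit
        # Largest gene count wins; ties are broken by term name for determinism.
        _, _, selected = max(candidates, key=lambda c: (c[0], c[1]))
    return selected


# Toy graph (parent -> children) and toy direct annotations (term -> genes).
children = {
    "root": ["metabolism", "transport", "signaling"],
    "metabolism": ["lipid metabolism", "protein metabolism"],
    "transport": ["ion transport"],
}
direct = {
    "lipid metabolism": {"g1", "g2", "g3"},
    "protein metabolism": {"g4", "g5"},
    "ion transport": {"g6"},
    "signaling": {"g7", "g8"},
    "transport": {"g9"},
}

print(sorted(balanced_subset("root", children, direct, 3)))
print(sorted(balanced_subset("root", children, direct, 4)))
```

The sketch has known gaps that a real design would have to address: genes annotated directly to a term that gets split are no longer covered by the subset, terms with several parents can end up below more than one selected term, and counting children instead of genes would give a different balance.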
Self-evaluation
Further reading, links and resources
Notes
- [1] Zhang et al. (2018) Uncovering the essential genes of the human malaria parasite Plasmodium falciparum by saturation mutagenesis. Science 360:. (pmid: 29724925) Severe malaria is caused by the apicomplexan parasite Plasmodium falciparum. Despite decades of research, the distinct biology of these parasites has made it challenging to establish high-throughput genetic approaches to identify and prioritize therapeutic targets. Using transposon mutagenesis of P. falciparum in an approach that exploited its AT-rich genome, we generated more than 38,000 mutants, saturating the genome and defining mutability and fitness costs for over 87% of genes. Of 5399 genes, our study defined 2680 genes as essential for optimal growth of asexual blood stages in vitro. These essential genes are associated with drug resistance, represent leading vaccine candidates, and include approximately 1000 Plasmodium-conserved genes of unknown function. We validated this approach by testing proteasome pathways for individual mutants associated with artemisinin sensitivity.
About ...
Author:
- Boris Steipe <boris.steipe@utoronto.ca>
Created:
- 2018-09-18
Modified:
- 2018-09-18
Version:
- 1.0
Version history:
- 1.0 First live version
This copyrighted material is licensed under a Creative Commons Attribution 4.0 International License. Follow the link to learn more.