Difference between revisions of "Computational Systems Biology Main Page"
Line 121: | Line 121: | ||
<section begin=CSB_main_grading /> | <section begin=CSB_main_grading /> | ||
+ | |||
+ | ===The "Knowledge Network"=== | ||
+ | |||
+ | Supporting learning units for this course are organized in a "Knowledge Network" of self-contained units that can be worked on according to students' individual needs and timing. Here is the '''detailed map'''. It contains links to all of the units. | ||
+ | |||
+ | |||
+ | * <command>-Click to open the Learning Units Map in a new tab, scale for detail. | ||
+ | [[File:BCB420-Units.svg|thumb|500px|none|link=http://steipe.biochemistry.utoronto.ca/abc/assets/BCB420-Units.svg|'''A map of the BCB420 learning units.''']] | ||
+ | * Hover over a learning unit to see its keywords. | ||
+ | * Click on a learning unit to open the associated page. | ||
+ | * The nodes of the learning unit network are colour-coded: | ||
+ | **<span style="background-color: #b3dbce;"> Live units </span> are green | ||
+ | **<span style="background-color: #d9ead5;"> Units under development </span> are light green. These are still in progress. | ||
+ | **<span style="background-color: #f2fafa;"> Stubs </span> (placeholders) are pale. These still need basic contents. | ||
+ | **<span style="background-color: #97bed5;"> Milestone units </span> are blue. These collect a number of prerequisites to simplify the network. | ||
+ | **<span style="background-color: #e19fa7;"> Integrator units </span> are red. These embody the main goals of the course. | ||
+ | **<span style="background-color: #f4d7b7;"> Units that require revision </span> are pale orange. | ||
+ | *Units that have a <span style="background-color: #eeeeee; border:solid 2px #000000;"> black border </span> have deliverables that can be submitted for credit. Visit the node for details. | ||
+ | *Arrows point from a prerequisite unit to a unit that builds on its contents. | ||
+ | |||
+ | (Many new units will be added to the map as the course progresses, reload the map frequently.) | ||
+ | |||
+ | |||
+ | {{Vspace}} | ||
+ | |||
+ | ====Navigating the course==== | ||
+ | |||
+ | Everything starts with the following three units: | ||
+ | *[[FND-Wiki_editing|Introduction to editing Wiki pages]] | ||
+ | :{{#lst:FND-Wiki_editing|abstract}} | ||
+ | |||
+ | *[[FND-Journal|Your Course Journal]] | ||
+ | :{{#lst:FND-Journal|abstract}} | ||
+ | |||
+ | *[[ABC-Insights|The "insights!" page]] | ||
+ | :{{#lst:ABC-Insights|abstract}} | ||
+ | |||
+ | * Once you have completed these three units, get started '''immediately''' on the Introduction-to-R units. You need time and practice, practice, practice<ref>[https://tapas.io/episode/923459 It's practice!]</ref> to acquire the programming skills you will need for the course. | ||
+ | |||
+ | * Whenever you want to take a break from studying R, get done with the other preparatory units. | ||
+ | |||
+ | At the end of our preparatory phase (after week 2) we will hold a comprehensive, non-trivial quiz on the preparatory units and on R basics. | ||
+ | |||
+ | |||
+ | <!-- | ||
+ | Everything leads to the ''Integrator Units''. These cover four large areas of bioinformatics that make up the explicit goals of the course: | ||
+ | |||
+ | (i) algorithms and statistics;<br /> | ||
+ | (ii) structural modelling and interpretation;<br /> | ||
+ | iii) gene annotation; and;<br /> | ||
+ | (iv) phylogenetic analysis. | ||
+ | |||
+ | The knowledge and skills you need to work on these ''Integrator Units'' can be obtained from the other learning units that are shown on the [http://steipe.biochemistry.utoronto.ca/abc/assets/ABC-units_map.svg learning units map] as prerequisites. Note that "prerequisites" in this context does not mean you '''must''' do one thing before you can do another, the arrows simply point out which units assume what prior knowledge. You can acquire that knowledge in whatever sequence makes sense to '''you''', and you don't have to learn from the learning units of this course at all. Just make sure that you submit enough general learning units for evaluation along the way. And document what you are doing in your Course Journal. Also, remember that '''all''' the material is cumulative - my evaluation of your work implicitly includes all of the prerequisite material. | ||
+ | --> | ||
+ | |||
+ | {{Vspace}} | ||
===Grading and Activities=== | ===Grading and Activities=== | ||
Line 193: | Line 249: | ||
====Journals==== | ====Journals==== | ||
− | + | Start forming a habit and even get marks for it too ... | |
Revision as of 01:02, 9 January 2018
Computational Systems Biology
Course Wiki for BCB420 (Computational Systems Biology) and JTB2020 (Applied Bioinformatics).
This is our main tool to coordinate information, activities and projects in University of Toronto's computational systems biology course BCB420. If you are not one of our students, this site is unlikely to be useful. If you are here because you are interested in general aspects of bioinformatics or computational biology, you may want to review the Wikipedia article on bioinformatics, or visit Wikiomics. Contact boris.steipe(at)utoronto.ca with any questions you may have.
Warning – this page and all associated course pages are currently under intense revision to prepare for the 2018 Winter session. This course will be delivered in part via an "inverted", knowledge-network format, and in part via a project-centred format. I will contact students later in December to enrol everyone in the course mailing list.
BCB420 / JTB2020
These are the course pages for BCB420H (Computational Systems Biology). Welcome, you're in the right place.
These are also the course pages for JTB2020H (Applied Bioinformatics). How come? Why is JTB2020 not the graduate equivalent of BCB410 (Applied Bioinformatics)? Let me explain. When this course was conceived as a required part of the (then so-called) Collaborative PhD Program in Proteomics and Bioinformatics in 2003, there was an urgent need to bring graduate students to a minimal level of computer skills and programming; prior experience was virtually nonexistent. Fortunately, the field has changed and our current graduate students are usually quite competent, at least in some practical aspects of computational biology. In this course we profit from the rich and diverse knowledge of the problem domain our graduate students have, while bringing everyone up to a level of competence in the practical, computational aspects.
- The 2018 course...
In this course we pursue a task in computational systems biology of human genes in a project-oriented format. This will proceed in three phases:
- First, we will review basic computational skills and bioinformatics knowledge to bring everyone to the same level. In all likelihood you will need to start with these tasks well in advance of the actual lectures. This phase will end with a comprehensive quiz in week 3;
- Next we'll focus on data integration and definition of features. As an example, we will integrate gene expression data from different experiments into a common set of features. Each student will contribute data from one experiment. The results of this phase will be the topic of our first Oral Exam;
- Finally, we will each adopt a biological "system" in human cells and use machine learning methods to attempt to refine its gene membership and assign roles to its member genes. The results will form the basis of our final Oral Exam;
- There are several meta-skills that you will pick up "on the side": these include time management; working according to best practice of reproducible research in a collaborative environment on GitHub; report writing; and keeping a scientific lab journal.
Organization
- Dates
- BCB420/JTB2020 is a Winter Term course.
- Lectures: Tuesdays, 16:00 to 18:00. (Classes start at 10 minutes past the hour.)
- Final Exam: None for this course.
- Location
- MS 4171 (Medical Sciences Building).
- Departmental information
- For BCB420 see the BCB420 Biochemistry Department Course Web page.
- For JTB2020 see the JTB2020 Course Web page for general information.
Prerequisites and Preparation
This course has formal prerequisites of BCH441H1 (Bioinformatics) or CSB472H1 (Computational Genomics and Bioinformatics). I have no way of knowing what is being taught in CSB472, and no way of confirming how much you remember from any of your previous courses, like BCH441 or BCB410. Moreover, there are many alternative ways to become familiar with important course contents. Thus I generally enforce prerequisites only very weakly, and you should not assume at all that having taken any particular combination of courses will have prepared you sufficiently. What I try to do instead is make the contents of the course very explicit. If your preparation is lacking, you will have to expend a very significant amount of effort. This is certainly possible, but whether you will succeed will depend on your motivation and aptitude.
The course requires (i) a solid understanding of molecular biology, (ii) introductory-level knowledge of bioinformatics, and (iii) a working knowledge of the R programming language.
The preparation material detailed below will be the subject of our first Quiz in the third week of class.
- 1 – A working knowledge of R ...
- Work through the R tutorial on this site and complete the tasks and exercises in the tutorial and the associated scripts.
- 2 – A basic knowledge of Bioinformatics ...
- 3 – Project specific prereading ...
Vogelstein et al. (2013) Cancer genome landscapes. Science 339:1546-58. (pmid: 23539594) [ PubMed ] [ DOI ] Over the past decade, comprehensive sequencing efforts have revealed the genomic landscapes of common forms of human cancer. For most cancer types, this landscape consists of a small number of "mountains" (genes altered in a high percentage of tumors) and a much larger number of "hills" (genes altered infrequently). To date, these studies have revealed ~140 genes that, when altered by intragenic mutations, can promote or "drive" tumorigenesis. A typical tumor contains two to eight of these "driver gene" mutations; the remaining mutations are passengers that confer no selective growth advantage. Driver genes can be classified into 12 signaling pathways that regulate three core cellular processes: cell fate, cell survival, and genome maintenance. A better understanding of these pathways is one of the most pressing needs in basic cancer research. Even now, however, our knowledge of cancer genomes is sufficient to guide the development of more effective approaches for reducing cancer morbidity and mortality.
Leiserson et al. (2015) Pan-cancer network analysis identifies combinations of rare somatic mutations across pathways and protein complexes. Nat Genet 47:106-14. (pmid: 25501392) [ PubMed ] [ DOI ] Cancers exhibit extensive mutational heterogeneity, and the resulting long-tail phenomenon complicates the discovery of genes and pathways that are significantly mutated in cancer. We perform a pan-cancer analysis of mutated networks in 3,281 samples from 12 cancer types from The Cancer Genome Atlas (TCGA) using HotNet2, a new algorithm to find mutated subnetworks that overcomes the limitations of existing single-gene, pathway and network approaches. We identify 16 significantly mutated subnetworks that comprise well-known cancer signaling pathways as well as subnetworks with less characterized roles in cancer, including cohesin, condensin and others. Many of these subnetworks exhibit co-occurring mutations across samples. These subnetworks contain dozens of genes with rare somatic mutations across multiple cancers; many of these genes have additional evidence supporting a role in cancer. By illuminating these rare combinations of mutations, pan-cancer network analyses provide a roadmap to investigate new diagnostic and therapeutic opportunities across cancer types.
Leiserson et al. (2016) A weighted exact test for mutually exclusive mutations in cancer. Bioinformatics 32:i736-i745. (pmid: 27587696) [ PubMed ] [ DOI ] MOTIVATION: The somatic mutations in the pathways that drive cancer development tend to be mutually exclusive across tumors, providing a signal for distinguishing driver mutations from a larger number of random passenger mutations. This mutual exclusivity signal can be confounded by high and highly variable mutation rates across a cohort of samples. Current statistical tests for exclusivity that incorporate both per-gene and per-sample mutational frequencies are computationally expensive and have limited precision. RESULTS: We formulate a weighted exact test for assessing the significance of mutual exclusivity in an arbitrary number of mutational events. Our test conditions on the number of samples with a mutation as well as per-event, per-sample mutation probabilities. We provide a recursive formula to compute P-values for the weighted test exactly as well as a highly accurate and efficient saddlepoint approximation of the test. We use our test to approximate a commonly used permutation test for exclusivity that conditions on per-event, per-sample mutation frequencies. However, our test is more efficient and it recovers more significant results than the permutation test. We use our Weighted Exclusivity Test (WExT) software to analyze hundreds of colorectal and endometrial samples from The Cancer Genome Atlas, which are two cancer types that often have extremely high mutation rates. On both cancer types, the weighted test identifies sets of mutually exclusive mutations in cancer genes with fewer false positives than earlier approaches. AVAILABILITY AND IMPLEMENTATION: See http://compbio.cs.brown.edu/projects/wext for software. CONTACT: braphael@cs.brown.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
The "Knowledge Network"
Supporting learning units for this course are organized in a "Knowledge Network" of self-contained units that can be worked on according to students' individual needs and timing. Here is the detailed map. It contains links to all of the units.
- <command>-Click to open the Learning Units Map in a new tab, scale for detail.
- Hover over a learning unit to see its keywords.
- Click on a learning unit to open the associated page.
- The nodes of the learning unit network are colour-coded:
- Live units are green
- Units under development are light green. These are still in progress.
- Stubs (placeholders) are pale. These still need basic contents.
- Milestone units are blue. These collect a number of prerequisites to simplify the network.
- Integrator units are red. These embody the main goals of the course.
- Units that require revision are pale orange.
- Units that have a black border have deliverables that can be submitted for credit. Visit the node for details.
- Arrows point from a prerequisite unit to a unit that builds on its contents.
(Many new units will be added to the map as the course progresses; reload the map frequently.)
Everything starts with the following three units:
- Introduction to editing Wiki pages
This should be the first learning unit you work with, since your Course Journal will be kept on a Wiki, as well as all other deliverables. This unit includes an introduction to authoring Wikitext and the structure of Wikis, in particular how different pages live in separate "Namespaces". The unit also covers the standard markup conventions - "Wikitext markup" - the same conventions that are used on Wikipedia - as well as some extensions that are specific to our Course- and Student Wiki. We also discuss page categories that help keep a Wiki organized, licensing under a Creative Commons Attribution license, and how to add licenses and other page components through template codes.
- Your Course Journal
Keeping a journal is an essential task in a laboratory. To practice keeping a technical journal, you will document your activities as you are working through the material of the course. A significant part of your term grade will be given for this Course Journal. This unit introduces components and best practice for lab- and course journals and includes a wiki-source template to begin your own journal on the Student Wiki.
- The "insights!" page
In parallel with your other work, you will maintain an insights! page on which you collect valuable insights and learning experiences of the course. Through this you ask yourself: what does this material mean - for the field, and for myself?
- Once you have completed these three units, get started immediately on the Introduction-to-R units. You need time and practice, practice, practice[1] to acquire the programming skills you will need for the course.
- Whenever you want to take a break from studying R, get done with the other preparatory units.
At the end of our preparatory phase (after week 2) we will hold a comprehensive, non-trivial quiz on the preparatory units and on R basics.
Grading and Activities
Activity | Weight BCB420 (Undergraduates) | Weight JTB2020 (Graduates) |
Self-evaluation and Feedback session on preparatory material ("Quiz"[2]) | 20 marks | 20 marks |
First Oral Exam | 20 marks | 15 marks |
Second Oral Exam | 30 marks | 25 marks |
Journal | 25 marks | 25 marks |
Insights | 5 marks | 5 marks |
Manuscript Draft | – | 10 marks |
Total | 100 marks | 100 marks |
Oral Exams
Contents and reflection of participation ...
Journals
Start forming a habit and even get marks for it too ...
Marks adjustments
I do not adjust marks towards a target mean and variance (i.e. there will be no "belling" of grades). I feel strongly that such "normalization" detracts from a collaborative and mutually supportive learning environment. If your classmate gets a great mark because you helped them with a difficult concept, this should never bring down your mark through class average adjustments. Collaborate as much as possible; it is a great way to learn. But do keep it honest and carefully consider our rules on Plagiarism and Academic Misconduct.
Prerequisites
You must have taken an introductory bioinformatics course as a prerequisite, or otherwise acquired the necessary knowledge. Therefore I expect familiarity with the material of my BCH441 course. If you have not taken BCH441, please update your knowledge and skills before the course starts. I will not make accommodations for lack of prerequisites. Please check the syllabus for this course below to find whether you need to catch up on additional material, and peruse this site to find the information you may need. A (non-exhaustive) overview of topics and useful links is linked here.
Timetable and syllabus
Syllabus and activities in progress for the 2018 Winter Term ...
Week | In class | This week |
1 | Tuesday, January 9 | |
2 | Tuesday, January 16 | |
3 | Tuesday, January 23 | |
4 | Tuesday, January 30 | |
5 | Tuesday, February 6 | |
6 | Tuesday, February 13 | |
– | Tuesday, February 20 | |
7 | Tuesday, February 27 | |
8 | Tuesday, March 6 | |
9 | Tuesday, March 13 | |
10 | Tuesday, March 20 | |
11 | Tuesday, March 27 | |
12 | Tuesday, April 3 | |
In depth...
Resources
- Course related
- Student Wiki
- The Course Google Group.
- Netiquette for the Group mailing list
Notes
- ↑ It's practice!
- ↑ I call these activities Quiz sessions for brevity; however, they are not quizzes in the usual sense, since they rely on self-evaluation and immediate feedback.