Computational Systems Biology Main Page


Computational Systems Biology

Course Wiki for BCB420 (Computational Systems Biology) and JTB2020 (Applied Bioinformatics).


 

This is our main tool to coordinate information, activities and projects in the University of Toronto's computational systems biology course BCB420. If you are not one of our students, this site is unlikely to be useful. If you are here because you are interested in general aspects of bioinformatics or computational biology, you may want to review the Wikipedia article on bioinformatics, or visit Wikiomics. Contact boris.steipe(at)utoronto.ca with any questions you may have.


 

Warning – this page and all associated course pages are currently under intense revision to prepare for the 2018 Winter session. This course will be delivered in part via an "inverted", knowledge-network format, and in part via a project-centred format. I will contact students later in December to enrol everyone in the course mailing list.


 


 


 

BCB420 / JTB2020

These are the course pages for BCB420H (Computational Systems Biology). Welcome, you're in the right place.

These are also the course pages for JTB2020H (Applied Bioinformatics). How come? Why is JTB2020 not the graduate equivalent of BCB410 (Applied Bioinformatics)? Let me explain. When this course was conceived as a required part of the (then so-called) Collaborative PhD Program in Proteomics and Bioinformatics in 2003, there was an urgent need to bring graduate students to a minimal level of computer skills and programming; prior experience was virtually nonexistent. Fortunately, the field has changed and our current graduate students are usually quite competent, at least in some practical aspects of computational biology. In this course we profit from the rich and diverse knowledge of the problem domain that our graduate students have, while bringing everyone up to a level of competence in the practical, computational aspects.


The 2018 course...

In this course we pursue a task in the computational systems biology of human genes in a project-oriented format. This will proceed in three phases:

  • First, we will review basic computational skills and bioinformatics knowledge to bring everyone to the same level. In all likelihood you will need to start with these tasks well in advance of the actual lectures. This phase will end with a comprehensive quiz in week 3;
  • Next we'll focus on data integration and definition of features. As an example, we will integrate gene expression data from different experiments into a common set of features. Each student will contribute data from one experiment. The results of this phase will be the topic of our first Oral Exam;
  • Finally, we will each adopt a biological "system" in human cells and use machine learning methods to attempt to refine its gene membership and assign roles to its member genes. The results will form the basis of our final Oral Exam;
  • There are several meta-skills that you will pick up "on the side": time management; working according to best practices of reproducible research in a collaborative environment on GitHub; report writing; and keeping a scientific lab journal.
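
As a rough illustration of what "integrating gene expression data from different experiments into a common set of features" could look like, here is a minimal sketch. It is written in Python only to keep it self-contained (the course itself works in R), and it is not the method used in class: the per-experiment z-score normalization, the function names, the experiment labels and the expression values are all assumptions made up for this example.

```python
from statistics import mean, pstdev


def zscore_by_experiment(expression):
    """Rescale one experiment's values to mean 0 and standard deviation 1.

    `expression` maps a gene symbol to a measured value. Standardizing within
    each experiment is one simple (assumed) way to make experiments that were
    measured on different scales comparable.
    """
    values = list(expression.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values identical: no variation to scale by.
        return {gene: 0.0 for gene in expression}
    return {gene: (value - mu) / sigma for gene, value in expression.items()}


def integrate(experiments):
    """Combine experiments into one gene-by-experiment feature table.

    Returns the column names (experiment labels, sorted) and a dict mapping
    each gene to a list of normalized values, one per column. A gene that was
    not measured in an experiment gets None in that column.
    """
    normalized = {name: zscore_by_experiment(data)
                  for name, data in experiments.items()}
    columns = sorted(normalized)
    genes = sorted(set().union(*(data.keys() for data in normalized.values())))
    table = {gene: [normalized[name].get(gene) for name in columns]
             for gene in genes}
    return columns, table


# Toy input: labels and numbers are invented purely for illustration.
experiments = {
    "exp_A": {"TP53": 10.0, "KRAS": 20.0, "MYC": 30.0},
    "exp_B": {"TP53": 1.0, "KRAS": 3.0},
}
columns, features = integrate(experiments)
```

Each student's experiment would contribute one column in such a table. Keeping missing measurements as explicit `None` values, rather than dropping the gene, leaves the decision about imputation or filtering to a later step.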



Organization

Dates
BCB420/JTB2020 is a Winter Term course.
Lectures: Tuesdays, 16:00 to 18:00. (Classes start at 10 minutes past the hour.)
Final Exam: None for this course.


Location
MS 4171 (Medical Sciences Building).


Departmental information
For BCB420 see the BCB420 Biochemistry Department Course Web page.
For JTB2020 see the JTB2020 Course Web page for general information.


Prerequisites and Preparation

This course has formal prerequisites of BCH441H1 (Bioinformatics) or CSB472H1 (Computational Genomics and Bioinformatics). I have no way of knowing what is being taught in CSB472, and no way of confirming how much you remember from any of your previous courses, like BCH441 or BCB410. Moreover, there are many alternative ways to become familiar with important course contents. Thus, I enforce prerequisites only very weakly, and you should not assume at all that having taken any particular combination of courses will have prepared you sufficiently. What I try to do instead is make the contents of the course very explicit. If your preparation is lacking, you will have to expend a very significant amount of effort. This is certainly possible, but whether you will succeed will depend on your motivation and aptitude.

The course requires (i) a solid understanding of molecular biology, (ii) introductory-level knowledge of bioinformatics, and (iii) a working knowledge of the R programming language.

The preparation material detailed below will be the subject of our first Quiz in the third week of class.

1 – A working knowledge of R ...
  • Work through the R tutorial on this site and complete the tasks and exercises in the tutorial and the associated scripts.


2 – A basic knowledge of Bioinformatics ...
Here is a list of detailed, introductory bioinformatics tutorials. If you have taken BCH441, the material will serve as a review. If you have not taken BCH441, this will help you get up to speed with basic concepts and R code.


3 – Project specific prereading ...
  • Vogelstein et al. (2013) Cancer genome landscapes. Science 339:1546-58. (pmid: 23539594)

    Over the past decade, comprehensive sequencing efforts have revealed the genomic landscapes of common forms of human cancer. For most cancer types, this landscape consists of a small number of "mountains" (genes altered in a high percentage of tumors) and a much larger number of "hills" (genes altered infrequently). To date, these studies have revealed ~140 genes that, when altered by intragenic mutations, can promote or "drive" tumorigenesis. A typical tumor contains two to eight of these "driver gene" mutations; the remaining mutations are passengers that confer no selective growth advantage. Driver genes can be classified into 12 signaling pathways that regulate three core cellular processes: cell fate, cell survival, and genome maintenance. A better understanding of these pathways is one of the most pressing needs in basic cancer research. Even now, however, our knowledge of cancer genomes is sufficient to guide the development of more effective approaches for reducing cancer morbidity and mortality.

  • Leiserson et al. (2015) Pan-cancer network analysis identifies combinations of rare somatic mutations across pathways and protein complexes. Nat Genet 47:106-14. (pmid: 25501392)

    Cancers exhibit extensive mutational heterogeneity, and the resulting long-tail phenomenon complicates the discovery of genes and pathways that are significantly mutated in cancer. We perform a pan-cancer analysis of mutated networks in 3,281 samples from 12 cancer types from The Cancer Genome Atlas (TCGA) using HotNet2, a new algorithm to find mutated subnetworks that overcomes the limitations of existing single-gene, pathway and network approaches. We identify 16 significantly mutated subnetworks that comprise well-known cancer signaling pathways as well as subnetworks with less characterized roles in cancer, including cohesin, condensin and others. Many of these subnetworks exhibit co-occurring mutations across samples. These subnetworks contain dozens of genes with rare somatic mutations across multiple cancers; many of these genes have additional evidence supporting a role in cancer. By illuminating these rare combinations of mutations, pan-cancer network analyses provide a roadmap to investigate new diagnostic and therapeutic opportunities across cancer types.

  • Leiserson et al. (2016) A weighted exact test for mutually exclusive mutations in cancer. Bioinformatics 32:i736-i745. (pmid: 27587696)

    MOTIVATION: The somatic mutations in the pathways that drive cancer development tend to be mutually exclusive across tumors, providing a signal for distinguishing driver mutations from a larger number of random passenger mutations. This mutual exclusivity signal can be confounded by high and highly variable mutation rates across a cohort of samples. Current statistical tests for exclusivity that incorporate both per-gene and per-sample mutational frequencies are computationally expensive and have limited precision. RESULTS: We formulate a weighted exact test for assessing the significance of mutual exclusivity in an arbitrary number of mutational events. Our test conditions on the number of samples with a mutation as well as per-event, per-sample mutation probabilities. We provide a recursive formula to compute P-values for the weighted test exactly as well as a highly accurate and efficient saddlepoint approximation of the test. We use our test to approximate a commonly used permutation test for exclusivity that conditions on per-event, per-sample mutation frequencies. However, our test is more efficient and it recovers more significant results than the permutation test. We use our Weighted Exclusivity Test (WExT) software to analyze hundreds of colorectal and endometrial samples from The Cancer Genome Atlas, which are two cancer types that often have extremely high mutation rates. On both cancer types, the weighted test identifies sets of mutually exclusive mutations in cancer genes with fewer false positives than earlier approaches. AVAILABILITY AND IMPLEMENTATION: See http://compbio.cs.brown.edu/projects/wext for software. CONTACT: braphael@cs.brown.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.




Grading and Activities

 
Activity                                                                    Weight: BCB420             Weight: JTB2020
                                                                            (Undergraduates)           (Graduates)
Self-evaluation and Feedback session on preparatory material ("Quiz"[1])    20 marks                   20 marks
First Oral Exam                                                             20 marks                   15 marks
Second Oral Exam                                                            30 marks                   25 marks
Journal                                                                     25 marks                   25 marks
Insights                                                                    5 marks                    5 marks
Manuscript Draft                                                            -                          10 marks
Total                                                                       100 marks                  100 marks


 

Oral Exams

Contents and reflection of participation ...


 

Journals

Try out forming a habit and get marks for it too ...


 

Marks adjustments

I do not adjust marks towards a target mean and variance (i.e. there will be no "belling" of grades). I feel strongly that such "normalization" detracts from a collaborative and mutually supportive learning environment. If your classmate gets a great mark because you helped them with a difficult concept, this should never bring down your own mark through class average adjustments. Collaborate as much as possible; it is a great way to learn. But do keep it honest and carefully consider our rules on Plagiarism and Academic Misconduct.

Prerequisites

You must have taken an introductory bioinformatics course as a prerequisite, or otherwise acquired the necessary knowledge. Therefore I expect familiarity with the material of my BCH441 course. If you have not taken BCH441, please update your knowledge and skills before the course starts. I will not make accommodations for lack of prerequisites. Please check the syllabus for this course below to find whether you need to catch up on additional material, and peruse this site to find the information you may need. A (non-exhaustive) overview of topics and useful links is linked here.


Timetable and syllabus

Syllabus and activities in progress for the 2018 Winter Term ...


 


Week 1 (in class: Tuesday, January 9)
  • No class meeting this day!
  • Preparations I
  • Syllabus
  • Projects
  • Important dates
  • Grading
  • Organization
  • Signup to mailing list and Student Wiki.

Week 2 (in class: Tuesday, January 16)
  • First class meeting
  • Review of preparatory materials (you should have worked through all of the materials in preparation for class).
  • Practice quiz on preparations (not for credit)
  • Preparations II
  • Defining the class projects

Week 3 (in class: Tuesday, January 23)
  • First Quiz
  • Data Integration I
  • Data sources and workflows
  • Development principles
  • Writing R packages
  • Collaboration tools

Week 4 (in class: Tuesday, January 30)
  • ...
  • Data Integration II

Week 5 (in class: Tuesday, February 6)
  • ...
  • Data Integration III

Week 6 (in class: Tuesday, February 13)
  • Finish Data Integration tasks
  • Discuss and adopt Systems tasks
  • First Oral Exams

Reading Week (Tuesday, February 20)
  • No class meeting - Reading Week
  • Systems readings

Week 7 (in class: Tuesday, February 27)
  • ...
  • Systems I

Week 8 (in class: Tuesday, March 6)
  • ...
  • Systems II

Week 9 (in class: Tuesday, March 13)
  • ...
  • Wednesday March 14: BCB420S drop date
  • Systems III

Week 10 (in class: Tuesday, March 20)
  • ...
  • Systems IV

Week 11 (in class: Tuesday, March 27)
  • ...
  • Systems V

Week 12 (in class: Tuesday, April 3)
  • Finish up and review Systems tasks
  • Final Oral Exams


 




In depth...


Resources

Course related




 

Notes

  1. I call these activities Quiz sessions for brevity; however, they are not quizzes in the usual sense, since they rely on self-evaluation and immediate feedback.