Difference between revisions of "Bioinformatics Main Page"


Revision as of 19:35, 5 July 2017

BCH441 - Bioinformatics

Welcome to the BCH441 Course Wiki.

These wiki pages are provided to coordinate information, activities and projects in the introductory bioinformatics course taught by Boris Steipe at the University of Toronto. If you are not one of my students, you can still browse this site; however, only users with a login account can edit or contribute material. If you are here because you are interested in general aspects of bioinformatics or computational biology, you may want to review the Wikipedia article on bioinformatics, or visit Wikiomics. Contact boris.steipe(at)utoronto.ca with any questions you may have.




This course is currently undergoing a fundamental revision. The contents will remain approximately the same, but the delivery, activities and assessments will comprise an entirely novel format for an undergraduate course. Watch this space for updates; only some of the information on this page is relevant for the 2017 Fall Term.



The Course

BCH441 (BCH1441) is an introduction to current bioinformatics for life science students and the specialists in the BCB Program. The course provides an overview of the sources of biomolecular data, data annotation and integration, and the interpretation of results through evidence-based reasoning. This includes the components (sequence, structure, and function), the relationships in phylogeny and in the networks of interactions and regulation, and the “systems” through which we conceptually organize our knowledge.


Specific contents include:

  • large, public biomolecular data resources,
  • DNA and protein sequences and sequence analysis,
  • pairwise and multiple sequence alignment,
  • fast database searches to discover homologues,
  • protein structure interpretation and homology modeling,
  • phylogenetic analysis - tree building and interpretation,
  • work with genome-scale data,
  • functional annotation with Gene Ontology and other resources,
  • relationships discovered through co-expression and protein-protein interactions, and
  • introduction to systems-level concepts.

Practical, hands-on tasks and assignments will introduce public data resources and analysis tools. Along with improving general computer literacy, you will learn to use the programming language and statistical workbench R, with a special emphasis on the everyday tasks of data preparation and analysis that have become indispensable for any life-science laboratory. (Yes, you will learn programming.)

The course is complemented by BCB420 / JTB2020 (offered in the Winter Term) which consolidates aspects of cutting-edge computational systems biology in a project context.

BCH441H1F is the undergraduate course code.
BCH1441H1F is the cross-listed course code for graduate students.


Coordinator

Boris Steipe


 


Dates

BCH441/BCH1441 is a Fall Term course; contact times are Tuesdays, 17:00 to 20:00. These are listed nominally as Tutorials T5 and Lectures T6-8, but we will use the time in variable configurations.

First day of class – Tuesday, September 12. You must attend the first class.
We need this time to go over the course delivery and organizational details, and to get you an account on the Course Wiki and on the course mailing list. Your personal presence is a requirement of the course. Please do not enrol in the course if your travel or other plans prevent you from attending the first class session.


 

Location

LM 161 (Lash Miller Building)



Student Wiki

Many of the class activities will take place interactively on a separate Wiki site (the "Student Wiki"). You will create a personalized user page there, and use it to submit materials as required.

This Wiki is not accessible to the general public; you need an account, which we will register after the first class session.


 



Contact

Course communication will take place on the Quercus discussion section. We'll see how this goes. If it's not suitable for our needs we'll find an alternative.


 



Office hours

(Virtual) face-to-face meetings are by appointment, if required. However, we will be able to resolve almost all issues by e-mail. You will find that discussions by e-mail are both more efficient and more effective than meetings. Moreover, e-mail discussions leave you with a document trail of what was discussed, can contain links to information sources, and make it easier to share points of general interest with the class.


 



Prerequisites

Introductory courses in biochemistry and molecular biology provide the background for this course. This background can be obtained through the listed prerequisites: BCH210H1/BCH242Y1; BCH311H1/MGY311Y1/PSL350H1[1]. Alternatively, the course coordinator can grant special permission.

I generally waive prerequisites if you can convince me that you are willing and able to make up for material that you have not covered previously. This should be your informed and responsible decision, not mine.

You must have access to the Internet via your own computer. From time to time it may be necessary to bring your computer to class. If you do not have a laptop computer that is set up to work in the University's wireless network, contact me so we can figure out how to work around any issues.


 

Exclusions & Enrolment controls

none


 

Printed material

This is an electronic-submission-only course, but if you must print material, consider printing double-sided. Learn how at the Print-Double-Sided Student Initiative. Printing of course material is expressly discouraged since the material is updated frequently.


Material below this point is no longer current and may change significantly for the 2017 Fall Term!



General

We will make an attempt to teach BCH441H following an inverted teaching model. Concepts will be introduced through background reading and extensive, hands-on assignments. We will use the classroom time

  • to assess contents-milestones in a weekly quiz;
  • to discuss fine points, perspectives, and to resolve uncertainties; and
  • to introduce concepts for the upcoming week.


Recommended textbooks

Depending on your background, various levels of textbooks may be suitable. I will bring my evaluation copies to class so you can have a look.
Understanding Bioinformatics (Zvelebil & Baum) is a decent general introduction to many aspects of bioinformatics. It was published in 2007, and an updated version is urgently needed. Still, some of the basics (like the algorithm for optimal sequence alignment) don't change. (Amazon) (Indigo) (ABE books)
Practical Bioinformatics (Agostino) covers some of the material of the BCH441 exercises. Expect a no-nonsense introduction to the most basic material. I have my pet peeves about this book (as I have for many others, e.g. why in the world do they still teach CLUSTAL when all available studies demonstrate it to be the least accurate MSA algorithm by a margin?), but if you haven't taken BCH441, this may serve you well. And if you did take BCH441, it may consolidate some ideas that I wasn't clear about. (Amazon) (Indigo) (ABE books)
If you are aware of more recent good textbooks, or have your own opinions about these or other books, let me know.


 

Grading and Activities

 

Activity                                   Weight: BCH441 (Undergraduates)   Weight: BCH1441 (Graduates)
11 Self-assessment and Feedback sessions   44 marks (11 x 4)                 22 marks (11 x 2)
Bioinformatics project                     26 marks (11 + 11 + 4)            26 marks
"Classroom" participation                  10 marks (2 + 8)                  10 marks
Thesis Project                             -                                 22 marks
Final exam                                 20 marks                          20 marks
Total                                      100 marks                         100 marks


A note on marking

It is not my policy to adjust marks towards a target mean and variance (i.e. there will be no "belling" of grades). I feel strongly that such "normalization" detracts from a collaborative and mutually supportive learning environment. If your classmate gets a great mark because you helped them with a difficult concept, this should never bring down your own mark through class average adjustments. Collaborate as much as possible; it is a great way to learn. However, I may adjust marks if we phrase questions ambiguously on quizzes, or if I decide that the final exam was too long.

 

Timetable and syllabus

 




 

PREPARATION

 


Week 1
In class (Tuesday, Sept. 13):
  • Organization
  • Syllabus
  • Important dates
  • First assignment
  • Projects
  • Grading
  • Signup to mailing list and Student Wiki.
  • Introduction to bioinformatics and computational biology
Readings: R Tutorial
Assignment: Assignment 1
In class (Tuesday, Sept. 20): Quiz 1. Remember to bring your red pen!

Perspectives:

Customizing R and R Studio. Subsetting and filtering of vectors, arrays and lists.


 

DATA

 


Week 2
In class (Tuesday, Sept. 20):
  • Abstractions
  • Data modelling
  • Key Public Databases (NCBI, EBI)
Readings: Lecture 02: Annotated Notes
Assignment: Assignment 2
In class (Tuesday, Sept. 27): Quiz 2

Perspectives ... data modelling.


 

SEQUENCE ANALYSIS

 


Week 3
In class (Tuesday, Sept. 27):
  • Introduction to the sequence abstraction
  • EMBOSS and other sequence analysis tools
Readings: TBD
Assignment: Assignment 3
In class (Tuesday, Oct. 4): Quiz 3

Perspectives ... machine learning.


 

SEQUENCE ALIGNMENT

 


Week 4
In class (Tuesday, Oct. 4):
  • Introduction to homology
  • Optimal sequence alignment
  • Sequence database searches: BLAST, PSI-BLAST et al.
  • Multiple sequence alignment.
Readings:
  Lecture 04: Annotated Notes (Part 1)
  Lecture 04: Annotated Notes (Part 2)
Assignment: Assignment 4
In class (Tuesday, Oct. 11): TBD

Perspectives ... TBD


 

3D STRUCTURE

 


Week 5
In class (Tuesday, Oct. 11):
  • 3D structures
  • The PDB
  • Structure interpretation
  • Structural domains
Readings: Week 05: Annotated Notes (PDF 55.5 MB)
Assignment: Assignment 5
In class (Tuesday, Oct. 18): Quiz 4

Perspectives ... TBD


 

FUNCTION

 


Week 6
In class (Tuesday, Oct. 18):
  • The concept of function
  • Function annotation
  • Function databases
  • GO: the gene ontology
  • Function prediction strategies
Readings: Week 06: Annotated Notes (PDF 23.1 MB)
Assignment: Assignment 6
In class (Tuesday, Oct. 25): Quiz 5 and 6

Perspectives ... computing semantic similarity


 

PHYLOGENETIC ANALYSIS

 


Week 7
In class (Tuesday, Oct. 25):
  • Phylogenetic analysis principles
  • Building trees
  • Tree interpretation
  • Inference from phylogenies
  • Signals of selective pressure and recent change
Readings: Week 07: Annotated Notes (PDF 15.7 MB)
Assignment: Assignment 7
In class (Tuesday, Nov. 1): Quiz 7

Perspectives ... Traces of selective pressure


At midnight: Project stage 1 is due.


 

STRUCTURE PREDICTION

 


Week 8
Note: Nov. 8 - no class due to Fall Break.
In class (Tuesday, Nov. 1):
  • Homology modelling of protein structure
  • Protein structure forcefields
  • Molecular dynamics
  • de novo prediction
Readings: Week 08: Annotated Notes (PDF 12.7 MB)
Assignment: Assignment 8
In class (Tuesday, Nov. 15): Quiz 8

Perspectives ... Using Rosetta


 

GENOMICS

 


Week 9
In class (Tuesday, Nov. 15):
  • Genome sequencing
  • Genome annotation
  • Genome databases and browsers
  • Human genomics
Readings: Week 09: Annotated Notes (PDF 20.2 MB)
Assignment: Assignment 9
In class (Tuesday, Nov. 22): Quiz 9

Perspectives ... Popular pipelines


 

EXPRESSION ANALYSIS

 


Week 10
In class (Tuesday, Nov. 22):
  • Measuring gene expression levels: microarrays vs. NGS
  • GEO - Microarrays and RNAseq
  • GEO2R and RNAseq alternatives
  • Discovering differentially expressed genes
Readings: Week 10: Annotated Notes (PDF 15.8 MB)
Assignment: Assignment 10
In class (Tuesday, Nov. 29): Quiz 10

Perspectives ...


 

PROTEIN-PROTEIN INTERACTIONS

 


Week 11
In class (Tuesday, Nov. 29):
  • Concepts of protein-protein interactions
  • Interaction databases
  • Graph theory
  • Interactome
  • other -omes
Readings: Week 11: Annotated Notes (PDF 12.2 MB)
Assignment: Assignment 11
In class (Tuesday, Dec. 6): Quiz 11

Perspectives ... Computing on graphs


 

EXPLORATIONS

 


Week 12
In class (Tuesday, Dec. 6):
  • Automation of queries
  • Integration of data
  • Principles of Exploratory Data Analysis (EDA)
    • Plotting
    • "Features"
    • Clustering
Readings: TBD


 


Resources

Course related


 

Contents related

 


 
Forums
BioStar: General bioinformatics, computational-, and systems biology questions (timesink warning!)
Reddit: the bioinformatics "subreddit" (timesink warning!)
R-help: The R programming language
Stack Overflow: R-related questions
BioConductor Support: for all questions about the BioConductor Project
Cross Validated: statistics-related questions on Stack Exchange



Notes

  1. Please check the official Calendar for the academic year to confirm.