Bioinformatics Main Page


BCH441 - Bioinformatics

Welcome to the BCH441 Course Wiki.

These wiki pages are provided to coordinate information, activities and projects in the introductory bioinformatics course taught by Boris Steipe at the University of Toronto. If you are not one of my students, you can still browse this site; however, only users with a login account can edit or contribute material. If you are here because you are interested in general aspects of bioinformatics or computational biology, you may want to review the Wikipedia article on bioinformatics, or visit Wikiomics. Contact boris.steipe(at)utoronto.ca with any questions you may have.



BCH441/BCH1441 will have its first class meeting TODAY (Tuesday, September 13) at 18:00 (no tutorial session that day). We have moved to a larger classroom and will meet in WI1016 (Wilson Hall, New College) throughout the term. Note: this is a room change - the original room, MS4279, is no longer current.

Our first meeting is the all-important coordination meeting. Do not miss it. You will receive 2 marks of your participation score for your presence in person. Sorry, no make-up or exceptions.


All materials on this site are currently undergoing revisions.


Auditors: since the course contents are transmitted to a large degree through the course-restricted mailing list and Wiki, BCH441 is not suitable for auditing. Sorry.





The Course

BCH441 (BCH1441) is an introduction to current bioinformatics for life science students and the specialists in the BCB Program. The course provides an overview of the sources of biomolecular data, data annotation and integration, and the interpretation of results through evidence-based reasoning. This includes the components (sequence, structure, and function), the relationships in phylogeny and in the networks of interactions and regulation, and the “systems” through which we conceptually organize our knowledge.


Specific contents include:

  • large, public biomolecular data resources,
  • DNA and protein sequences and sequence analysis,
  • pairwise and multiple sequence alignment,
  • fast database searches to discover homologues,
  • protein structure interpretation and homology modeling,
  • phylogenetic analysis - tree building and interpretation,
  • work with genome-scale data,
  • functional annotation with Gene Ontology and other resources,
  • relationships discovered through co-expression and protein-protein interactions, and
  • introduction to systems-level concepts.

Practical, weekly, hands-on assignments will introduce public data resources and analysis tools. Along with improving general computer literacy, you will learn to use the programming language and statistical workbench R, with a special emphasis on the kind of everyday tasks of data preparation and analysis that have become indispensable for any life-science laboratory. (Yes, you will learn programming.) Application of the material in a systems-biology oriented project will round off the course.

The course is complemented by BCB420 / JTB2020 (offered in the Winter Term) which consolidates aspects of cutting-edge computational systems biology in a project context.

BCH441H1F is the undergraduate course code.
BCH1441H1F is the cross-listed course code for graduate students.


Organization

General
We will make an attempt to teach BCH441H following an inverted teaching model. Concepts will be introduced through background reading and extensive, hands-on assignments. We will use the classroom time
  • to assess contents-milestones in a weekly quiz;
  • to discuss fine points, perspectives, and to resolve uncertainties; and
  • to introduce concepts for the upcoming week.


Dates
BCH441/BCH1441 is a Fall Term course.
Tutorial sessions: Tuesday, 17:00 to 18:00 for open discussion of lecture material, in-class quizzes, quiz debriefings, and other activities. Class begins at 10 minutes past the hour; don't be late. That's rude.
Lectures: right after the tutorials: Tuesday, 18:00 to 20:00.


Location
WI 1016 (Wilson Hall, New College). Note: this is a room change - the original room was MS4279.


The Student Wiki.
Many of your activities will take place on a Wiki site (the "Student Wiki").


The Mailing List.
All course announcements and all course discussion (outside of class) will take place on a mailing list. We use Google Groups for this purpose. You will be subscribed to the list in the first class. You will not be able to participate fully in the course if you are not subscribed. Make sure you are subscribed with the email address you use most frequently and set your preferences to immediate delivery - "Digest" or "Web only" delivery won't allow you to participate actively in discussions and your participation mark will suffer.


Coordinator
Boris Steipe


Office hours
Face-to-face meetings are by appointment, as required. You will find that meetings that we prepare by e-mail are both more efficient and more effective. Moreover, e-mail leaves you with a document trail of what was discussed, and we can share points of general interest more easily with the class.


Prerequisites
Introductory courses in biochemistry and molecular biology provide the background for the course. This background might be obtained through the listed prerequisites: BCH210H1/BCH242Y1; BCH311H1/MGY311Y1/PSL350H1[1]. Alternatively, special permission of the course coordinator can be granted.
It is assumed that students have access to the Internet via their own computer.


Exclusions & Enrolment controls
none


Printed material
This is an electronic-submission-only course, but if you must print material, consider printing double-sided. Learn how at the Print-Double-Sided Student Initiative. Printing of course material is expressly discouraged since the material is updated frequently.


Recommended textbooks

Depending on your background, various levels of textbooks may be suitable. I will bring my evaluation copies to class so you can have a look.
Understanding Bioinformatics (Zvelebil & Baum) is a decent general introduction to many aspects of bioinformatics. It was published in 2007, and an updated version is urgently needed. Still, some of the basics (like the algorithm for optimal sequence alignment) don't change. (Amazon) (Indigo) (ABE books)
Practical Bioinformatics (Agostino) covers some of the material of the BCH441 exercises. Expect a no-nonsense introduction to the most basic material. I have my pet peeves about this book (as I have for many others, e.g. why in the world do they still teach CLUSTAL when all available studies demonstrate it to be the least accurate MSA algorithm by a margin?), but if you haven't taken BCH441, this may serve you well. And if you did take BCH441, it may consolidate some ideas that I wasn't clear about. (Amazon) (Indigo) (ABE books)
If you are aware of more recent good textbooks, or have your own opinions about these or other books, let me know.


 

Grading and Activities

 

Activity | Weight: BCH441 (Undergraduates) | Weight: BCH1441 (Graduates)
11 Self-assessment and Feedback sessions | 44 marks (11 x 4) | 22 marks (11 x 2)
Bioinformatics project | 26 marks (5 + 12 + 9) | 26 marks
"Classroom" participation | 10 marks (2 + 8) | 10 marks
Thesis Project | - | 22 marks
Final exam | 20 marks | 20 marks
Total | 100 marks | 100 marks


A note on marking

It is not my policy to adjust marks towards a target mean and variance (i.e. there will be no "belling" of grades). I feel strongly that such "normalization" detracts from a collaborative and mutually supportive learning environment. If your classmate gets a great mark because you helped them with a difficult concept, this should never bring down your own mark through class average adjustments. Collaborate as much as possible; it is a great way to learn. However, I may adjust marks if we phrase questions ambiguously on quizzes, or if I decide that the final exam was too long.

 

Timetable and syllabus

 



Syllabus and assignments will still be in flux for a few weeks.


 

PREPARATION

 


Week 1
In class, Tuesday, Sept. 13:
  • Organization
  • Syllabus
  • Important dates
  • First assignment
  • Projects
  • Grading
  • Signup to mailing list and Student Wiki.
  • Introduction to bioinformatics and computational biology
Readings: R Tutorial
Assignment: Assignment 1
In class, Tuesday, Sept. 20: Quiz 1. Remember to bring your red pen!

Perspectives:

Customizing R and R Studio. Subsetting and filtering of vectors, arrays and lists.


 

DATA

 


Week 2
In class, Tuesday, Sept. 20:
  • Abstractions
  • Data modelling
  • Key Public Databases (NCBI, EBI)
Readings: TBD
Assignment: Assignment 2
In class, Tuesday, Sept. 27: Quiz 2

Perspectives ... data modelling.


 

SEQUENCE ANALYSIS

 


Week 3
In class, Tuesday, Sept. 27:
  • Introduction to the sequence abstraction
  • EMBOSS and other sequence analysis tools
Readings: TBD
Assignment: Assignment 3
In class, Tuesday, Oct. 4: Quiz 3

Perspectives ... machine learning.


 

SEQUENCE ALIGNMENT

 


Week 4
In class, Tuesday, Oct. 4:
  • Introduction to homology
  • Optimal sequence alignment
  • Sequence database searches: BLAST, PSI-BLAST et al.
  • Multiple sequence alignment.
Readings: TBD
Assignment: Assignment 4
Tuesday, Oct. 11: TBD

Perspectives ... TBD


 

3D STRUCTURE

 


Week 5
Tuesday, Oct. 11:
  • 3D structures
  • The PDB
  • Structure interpretation
  • Structural domains
Readings: TBD
Assignment: Assignment 5
In class, Tuesday, Oct. 18: Quiz 4 and 5

Perspectives ... TBD


 

FUNCTION

 


Week 6
In class, Tuesday, Oct. 18:
  • The concept of function
  • Function annotation
  • Function databases
  • GO: the gene ontology
  • Function prediction strategies
Readings: TBD
Assignment: Assignment 6
In class, Tuesday, Oct. 25: Quiz 6

Perspectives ... computing semantic similarity

At midnight: Project stage 1 is due.


 

PHYLOGENETIC ANALYSIS

 


Week 7
In class, Tuesday, Oct. 25:
  • Phylogenetic analysis principles
  • Building trees
  • Tree interpretation
  • Inference from phylogenies
  • Signals of selective pressure and recent change
Readings: TBD
Assignment: Assignment 7
In class, Tuesday, Nov. 1: Quiz 7

Perspectives ... Traces of selective pressure


 

STRUCTURE PREDICTION

 


Week 8
In class, Tuesday, Nov. 1:
Note: Nov. 8 - no class due to Fall Break.
  • Homology modelling of protein structure
  • Protein structure forcefields
  • Molecular dynamics
  • de novo prediction
Readings: TBD
Assignment: Assignment 8
In class, Tuesday, Nov. 15: Quiz 8

Perspectives ... Using Rosetta


 

GENOME ANALYSIS

 


Week 9
In class, Tuesday, Nov. 15:
  • Genome sequencing
  • Genome annotation
  • Genome databases and browsers
  • Human genomics
Readings: TBD
Assignment: Assignment 9
In class, Tuesday, Nov. 22: Quiz 9

Perspectives ... Popular pipelines


 

EXPRESSION ANALYSIS

 


Week 10
In class, Tuesday, Nov. 22:
  • Measuring gene expression levels: microarrays vs. NGS
  • GEO
  • Discovering differentially expressed genes
Readings: TBD
Assignment: Assignment 10
In class, Tuesday, Nov. 29: Quiz 10

Perspectives ... What does FDR mean?


 

PROTEIN-PROTEIN INTERACTIONS

 


Week 11
In class, Tuesday, Nov. 29:
  • Concepts of protein-protein interactions
  • Interaction databases
  • Graph theory
  • Interactome
  • other -omes
Readings: TBD
Assignment: Assignment 11
In class, Tuesday, Dec. 6: Quiz 11

Perspectives ... TBD


 

SYSTEMS

 


Week 12
In class, Tuesday, Dec. 6:
  • Definition of systems
  • Properties of biological networks and pathways as a reflection of systems
  • Pathway and system databases
  • Phenotypes and cheminformatics
  • Introduction to Exploratory Data Analysis (EDA)
Readings: TBD


 


 




 

Resources

Course related


 

Contents related

 


 
Forums
BioStar: General bioinformatics, computational-, and systems biology questions (timesink warning!)
Reddit: the bioinformatics "subreddit" (timesink warning!)
R-help: The R programming language
Stack Overflow: R-related questions
BioConductor Support: for all questions about the BioConductor Project
Cross Validated: statistics-related questions on Stack Exchange



Notes

  1. Please check the official Calendar for the academic year to confirm.