Lecture 04
======Slide 032======
[[Image:L04_s032.jpg|frame|none|Lecture 04, Slide 032<br>
You should be familiar with these most fundamental descriptors; they come up time and time again in the literature. Here is a series of highly readable reviews on topics of medical statistics by Jonathan Ball and coauthors:<br>
<br>
*(1) [http://ccforum.com/content/6/1/66 Presenting and summarising data]<br>
*(2) [http://ccforum.com/content/6/2/143 Samples and populations]<br>
*(3) [http://ccforum.com/content/6/3/222 Hypothesis testing and P values]<br>
*(4) [http://ccforum.com/content/6/4/335 Sample size calculations]<br>
*(5) [http://ccforum.com/content/6/5/424 Comparison of means]<br>
*(6) [http://ccforum.com/content/6/6/509 Nonparametric methods]<br>
*(7) [http://ccforum.com/content/7/6/451 Correlation and regression]<br>
*(8) [http://ccforum.com/content/8/1/46 Qualitative data - tests of association]<br>
*(9) [http://ccforum.com/content/8/2/130 One-way analysis of variance]<br>
*(10) [http://ccforum.com/content/8/3/196 Further nonparametric methods]<br>
*(11) [http://ccforum.com/content/8/4/287 Assessing risk]<br>
*(12) [http://ccforum.com/content/8/5/389 Survival analysis]<br>
*(13) [http://ccforum.com/content/8/6/508 Receiver operating characteristic curves]<br>
*(14) [http://ccforum.com/content/9/1/112 Logistic regression]
]]
Latest revision as of 23:51, 22 September 2007
Sequence Analysis
- What you should take home from this part of the course
- Understand key concepts in probabilistic pattern representation and matching, especially PSSMs (position-specific scoring matrices). Understand that machine-learning tools such as HMMs (Hidden Markov Models) and NNs (Neural Networks) can be used for probabilistic pattern matching and classification.
- Understand the concept of a sequence logo.
- Be familiar with the SignalP Web server.
- Know the basic concepts of statistics and probability theory and the key terms of descriptive statistics.
- Understand probability tables in principle.
- Have encountered important probability distributions.
- Understand the different error types.
- Understand the terms significance, confidence interval and statistical test.
- Be familiar with the concepts and strategy of simulation testing, and understand why its simplicity makes an important contribution to computational biology.
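To make the first two points concrete, here is a minimal Python sketch of how a PSSM and the column heights of a sequence logo can be derived from a set of aligned sites. It is an illustration only, not the method used on the slides: the function names, the additive pseudocount of 1, the uniform background of 0.25 per nucleotide and the omission of a small-sample correction are all assumptions made for this example.

```python
import math
from collections import Counter

ALPHABET = "ACGT"  # assumption: DNA sites; a protein PSSM would use 20 letters


def column_frequencies(sequences, pseudocount=1.0):
    """Per-column base frequencies with a simple additive pseudocount."""
    length = len(sequences[0])
    total = len(sequences) + pseudocount * len(ALPHABET)
    freqs = []
    for i in range(length):
        counts = Counter(seq[i] for seq in sequences)
        freqs.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
    return freqs


def pssm(freqs, background=0.25):
    """Log-odds scores in bits against a uniform background (assumed)."""
    return [{b: math.log2(f[b] / background) for b in ALPHABET} for f in freqs]


def information_content(freqs):
    """Logo column heights: 2 bits minus the Shannon entropy of the column.

    No small-sample correction is applied in this sketch.
    """
    return [2.0 + sum(p * math.log2(p) for p in f.values()) for f in freqs]


def score(pssm_matrix, candidate):
    """Sum of the per-position log-odds scores for a candidate site."""
    return sum(column[base] for column, base in zip(pssm_matrix, candidate))
```

With the hypothetical sites `["ACGT", "ACGA", "ACGC", "ACGG"]`, the first three columns are conserved and carry information, while the fourth column is evenly mixed and has a logo height of zero; a candidate that matches the conserved columns scores higher than one that does not.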
- Links summary
- WebLogo
- Tom Schneider's Sequence Logo pages (and introductions to information theory)
- The SignalP server
- Exercises
- If you assume that an 80-mer oligonucleotide can be synthesized with 99.9% coupling efficiency per step and a 0.2% chance of coupling a leftover nucleotide from the previous synthesis step, what is the probability that a randomly picked clone of a gene built with this oligonucleotide has the correct sequence?
- In a recent doctoral thesis defence the candidate claimed that in a microarray expression analysis he was able to show reciprocal regulation of two genes (one related to immune stimulation, the other related to immune suppression): this would mean that whenever one gene is upregulated, the other is downregulated, and vice versa. The claim was based on observing this effect in eight of ten experiments. Expression levels were scored semiquantitatively on a five-level scale (++, +, 0, -, --). Given that such experiments have experimental error as well as biological variability, sketch a simulation test that would analyse whether in fact a significant (anti)correlation had been observed, or whether this result could just as well be due to meaningless fluctuations.
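One possible way to approach both exercises is sketched below in Python. This is a sketch under stated assumptions, not a model answer. For the first exercise it assumes 80 independent synthesis steps and treats every carry-over coupling as an error; whether an 80-mer needs 79 or 80 couplings, and whether a leftover nucleotide could be the correct base by chance, are modelling choices left open here. For the second exercise it assumes a null model in which both genes' scores are drawn independently and uniformly from the five levels; with real data one would more plausibly resample or shuffle the observed scores. All function and parameter names are hypothetical.

```python
import random


def p_correct_oligo(length=80, coupling=0.999, carryover=0.002):
    """Probability that every step couples and no step adds a leftover base.

    Assumes `length` independent steps; any carry-over counts as an error.
    """
    return (coupling * (1.0 - carryover)) ** length


SCORES = [-2, -1, 0, 1, 2]  # numeric stand-ins for --, -, 0, +, ++


def reciprocal_count(gene_a, gene_b):
    """Number of experiments in which the genes move in opposite directions."""
    return sum(1 for a, b in zip(gene_a, gene_b) if a * b < 0)


def simulate_p_value(observed=8, n_experiments=10, trials=50_000, seed=0):
    """Fraction of random data sets at least as 'reciprocal' as observed.

    Null model (an assumption): scores of both genes are independent and
    uniformly distributed over SCORES.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a = [rng.choice(SCORES) for _ in range(n_experiments)]
        b = [rng.choice(SCORES) for _ in range(n_experiments)]
        if reciprocal_count(a, b) >= observed:
            hits += 1
    return hits / trials
```

Calling `p_correct_oligo()` gives the probability under the stated assumptions, and `simulate_p_value()` estimates how often pure chance produces eight or more reciprocal experiments out of ten. The point of the simulation is that the null model can be changed (for example, to one in which most scores are 0) by editing a single line, and the test then answers the question for that model without any further theory.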
Lecture Slides
Slide 001
Slide 002
Slide 003
Slide 004
Slide 005
Slide 006
Slide 007
Slide 008
Slide 009
Slide 010
Slide 011
Slide 012
Slide 013
Slide 014
Slide 015
Slide 016
Slide 017
Slide 018
Slide 019
Slide 020
Slide 021
Slide 022
Slide 023
Slide 024
Slide 025
Slide 026
Slide 027
deleted
Slide 028
Slide 029
Slide 030
Slide 031
Slide 032
Slide 033
Slide 034
Slide 035
Slide 036
Slide 037
Slide 038
Slide 039
Slide 040
Slide 041
Slide 042
Slide 043
Slide 044
Slide 045
Slide 046
Slide 047
Slide 048
Slide 049
Slide 050
Slide 051
Slide 052
Slide 053
Slide 054
Slide 055
Slide 056
Slide 057
Slide 058
Slide 059
Slide 060
Slide 061
Slide 062
Slide 063
Slide 064
Slide 065
Slide 066
Slide 067
Slide 068
Slide 069
Slide 070
Slide 071
Slide 072
Slide 073
Slide 074
Slide 075