BIO Assignment Week 11
Assignment for Week 11
Protein-Protein Interactions
Note! This assignment is currently inactive. Major and minor unannounced changes may be made at any time.
Concepts and activities (and reading, if applicable) for this assignment will be topics on next week's quiz.
Data Sources
Interaction databases face problems similar to those of sequence databases: the need for standards for abstracting biological concepts into computable objects, data integrity, search and retrieval, and the metrics of comparison. There is, however, an added complication: interactions are rarely all-or-none, and high-throughput experimental methods have large false-positive and false-negative rates. This makes it necessary to define confidence scores for interactions. In addition to experimental methods, there are also a variety of methods for computational interaction prediction. Yet even though careful, small-scale laboratory experiments are the "gold standard", different curation efforts working from the same experimental publication usually arrive at different results, with as little as 42% overlap between databases being reported.
Currently, the best integrated protein-protein interaction database is likely iRefWeb, built on the iRefIndex (which, incidentally, is available via an R package on CRAN). Funding and support for interaction databases are very patchy, and we have seen far, far too many promising resources fall into irrelevance for lack of updating. iRef is a case in point. The Wodak lab's iRefWeb represents iRefIndex version 13, from 2013. iRefIndex has since been updated to version 14.0 (in 2015), but Ian Donaldson, who built this resource, is now a freelance research scientist...
Another excellent database, and perhaps the one with the most stable, continuous curation effort, is the EBI's IntAct database.
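To make the idea of a confidence score concrete, here is a minimal Python sketch of one simple way to combine several lines of evidence into a single score. It assumes that each score is a probability between 0 and 1 and that the evidence sources are independent (a "noisy-OR" combination). The function name `combine_scores` and the example values are illustrative only; real databases use their own, more elaborate scoring schemes.

```python
from math import prod


def combine_scores(scores):
    """Combine independent evidence probabilities with a noisy-OR rule.

    Assumption: each score is the probability (0..1) that the interaction
    is real according to one evidence source, and sources are independent.
    """
    scores = list(scores)
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError(f"score out of range [0, 1]: {s}")
    # The interaction is unsupported only if every source is wrong.
    return 1.0 - prod(1.0 - s for s in scores)


# Two sources, each 50% confident, together give 75%.
print(combine_scores([0.5, 0.5]))
```

Note that under this rule adding evidence can only raise the score, which is one reason why the independence assumption matters: correlated sources would be counted twice.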
Task:
- Find interactors for yeast Mbp1 in both iRefWeb and IntAct.
- Are they largely the same?
- The various visualization options of iRefWeb are currently not working.
- ... but the ones at IntAct are. Click on the Graph tab.
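If you want to answer the "largely the same?" question with numbers rather than by eye, you could paste the two interactor lists into a short script. The following is a minimal sketch in Python; the function name `compare_interactors` and the gene names are placeholders (not real query results), and it assumes that both databases report interactors under comparable names, which in practice often requires mapping identifiers first.

```python
def compare_interactors(list_a, list_b):
    """Compare two lists of interactor names, ignoring case and whitespace."""
    a = {name.strip().upper() for name in list_a}
    b = {name.strip().upper() for name in list_b}
    shared = a & b
    union = a | b
    return {
        "shared": sorted(shared),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
        # Jaccard index: shared names as a fraction of all names seen.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Placeholder names, NOT actual database output.
from_irefweb = ["GENE_A", "gene_b", "GENE_C"]
from_intact = ["GENE_B", "GENE_C", "GENE_D"]
result = compare_interactors(from_irefweb, from_intact)
print(result["shared"], result["jaccard"])
```

With the placeholder lists above, two of four distinct names are shared, so the Jaccard index is 0.5.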
Then what?
If you are like me, you would now like to be able to link expression profiles, information about known complexes, GO annotations, knock-out phenotypes etc. etc. Too bad.
Data visualization and analysis
If you are serious about working with interaction networks, sooner or later you will be working with Cytoscape. It is more or less the standard among "professional" systems biologists. But it is not an online tool.
Task:
- Navigate to the Cytoscape homepage and find out what the program does and how to install it. Many tutorials are available online. But this is software that needs to be downloaded and installed, and it definitely has a learning curve.
The state of integrated online interaction viewers these days is actually pretty dismal. Have a look at this article that discusses the gap between what one would need to do, and what is offered:
Jeanquartier et al. (2015) Integrated web visualizations for protein-protein interaction databases. BMC Bioinformatics 16:195. (pmid: 26077899)
The online resource that comes out as the best is the one at the String database.
Task:
- Navigate to the String database and search for Saccharomyces cerevisiae Mbp1 interactors.
- Visualize the network. Add a few proteins by clicking the (+) button two or three times.
- Click on a node to get a synopsis of its function.
- Explore the "confidence", "evidence" and "actions" networks for the retrieved interactors.
- Not all interacting proteins are also predicted to have a functional relationship with Mbp1. Do you agree?
- Explore the clustering and layout options. Do you understand what they do?
- Explore the Views on
- Neighborhood (not relevant for our query though)
- Fusion (also not relevant for our query)
- Occurrence
- Coexpression
- Experiments
- Database, and
- Textmining
Each of these is a method for predicting functional relationships. Figure out how each one contributes to the evidence for a functional interaction between Mbp1 and its predicted functional partners. I find the Occurrence view a unique and intriguing tool: visualizing in which organisms groups of genes are either all absent or all present allows one to quickly establish functional clusters.
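The reasoning behind the Occurrence view can be sketched in a few lines of Python. This is only an illustration of the idea, not String's actual method: it assumes each gene is described by a presence/absence pattern over the same ordered list of organisms, and it groups genes whose patterns are identical. The function name `group_by_profile`, the gene names and the patterns are all invented for the example.

```python
from collections import defaultdict


def group_by_profile(profiles):
    """Group genes whose presence/absence pattern across organisms is identical.

    profiles: dict mapping gene name -> sequence of 0/1 values, one per
    organism, with the same organism order for every gene.
    """
    groups = defaultdict(list)
    for gene, profile in profiles.items():
        groups[tuple(bool(x) for x in profile)].append(gene)
    # Keep only patterns shared by at least two genes.
    return sorted(sorted(genes) for genes in groups.values() if len(genes) > 1)


# Placeholder data: gene names and patterns are invented for illustration.
example = {
    "geneA": [1, 1, 0, 1],
    "geneB": [1, 1, 0, 1],
    "geneC": [0, 1, 1, 1],
    "geneD": [1, 1, 0, 1],
    "geneE": [0, 1, 1, 1],
    "geneF": [1, 0, 0, 0],
}
print(group_by_profile(example))
# prints [['geneA', 'geneB', 'geneD'], ['geneC', 'geneE']]
```

Requiring identical patterns is the strictest possible criterion; a more tolerant version would group genes whose patterns differ in only a few organisms.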
In summary, String is a convincingly well-built tool for exploring functional relationships between proteins.
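The Mbp1 search from the task above can in principle also be scripted, since String offers a REST-style interface in addition to the web pages. The sketch below only builds a request URL and does not send it. The base URL, the `interaction_partners` method and the parameter names are written from memory of String's public API and are assumptions to check against the current API documentation; `string_partners_url` is a hypothetical helper name. 4932 is the NCBI taxonomy ID for Saccharomyces cerevisiae.

```python
from urllib.parse import urlencode

# Assumed base URL of the String REST interface; verify before use.
STRING_API_BASE = "https://string-db.org/api"


def string_partners_url(identifier, species, limit=10, output_format="tsv"):
    """Build (but do not send) a request URL for a protein's interaction partners."""
    params = {
        "identifiers": identifier,  # protein name or ID to look up
        "species": species,         # NCBI taxonomy ID
        "limit": limit,             # maximum number of partners to return
    }
    return f"{STRING_API_BASE}/{output_format}/interaction_partners?{urlencode(params)}"


print(string_partners_url("MBP1", 4932))
```

Pasting the printed URL into a browser is an easy way to see whether the request is accepted before writing any code that parses the result.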
Links and resources
Razick et al. (2008) iRefIndex: a consolidated protein interaction database with provenance. BMC Bioinformatics 9:405. (pmid: 18823568)
Mora & Donaldson (2011) iRefR: an R package to manipulate the iRefIndex consolidated protein interaction database. BMC Bioinformatics 12:455. (pmid: 22115179)
Footnotes and references
Ask, if things don't work for you!
- If anything about the assignment is not clear to you, please ask on the mailing list. You can be certain that others will have had similar problems. Success comes from joining the conversation.
- Do consider how to ask your questions so that a meaningful answer is possible:
- How to create a Minimal, Complete, and Verifiable example on stackoverflow and ...
- How to make a great R reproducible example are required reading.