Difference between revisions of "BIN-PPI-Databases"
m |
m |
||
(8 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
− | <div id=" | + | <div id="ABC"> |
− | + | <div style="padding:5px; border:1px solid #000000; background-color:#b3dbce; font-size:300%; font-weight:400; color: #000000; width:100%;"> | |
Protein-Protein Interaction Databases | Protein-Protein Interaction Databases | ||
− | + | <div style="padding:5px; margin-top:20px; margin-bottom:10px; background-color:#b3dbce; font-size:30%; font-weight:200; color: #000000; "> | |
− | + | (IntAct, iRef,) | |
− | + | </div> | |
− | |||
− | |||
− | |||
− | IntAct, iRef, | ||
</div> | </div> | ||
− | {{ | + | {{Smallvspace}} |
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
+ | <div style="padding:5px; border:1px solid #000000; background-color:#b3dbce33; font-size:85%;"> | ||
+ | <div style="font-size:118%;"> | ||
+ | <b>Abstract:</b><br /> | ||
+ | <section begin=abstract /> | ||
+ | Exploring IntAct and BioGRID PPI databases. | ||
+ | <section end=abstract /> | ||
</div> | </div> | ||
− | < | + | <!-- ============================ --> |
− | == | + | <hr> |
− | < | + | <table> |
− | ... | + | <tr> |
− | + | <td style="padding:10px;"> | |
− | + | <b>Objectives:</b><br /> | |
− | + | This unit will ... | |
− | + | * ... introduce issues surrounding the collection and curation of protein-protein interactions in databases; | |
− | == | + | * ... explore the Web interfaces to IntAct and BioGRID; |
− | === | + | * ... discuss the limitations of interaction predictions based on homology ; |
− | < | + | </td> |
− | <!-- | + | <td style="padding:10px;"> |
− | You need the following preparation before beginning this unit. If you are not familiar with this material from courses you took previously, you need to prepare yourself from other information sources: | + | <b>Outcomes:</b><br /> |
− | < | + | After working through this unit you ... |
+ | * ... can access IntAct and BioGRID and discover interactions with a protein of interest. | ||
+ | </td> | ||
+ | </tr> | ||
+ | </table> | ||
+ | <!-- ============================ --> | ||
+ | <hr> | ||
+ | <b>Deliverables:</b><br /> | ||
+ | <section begin=deliverables /> | ||
+ | <li><b>Time management</b>: Before you begin, estimate how long it will take you to complete this unit. Then, record in your course journal: the number of hours you estimated, the number of hours you worked on the unit, and the amount of time that passed between start and completion of this unit.</li> | ||
+ | <li><b>Journal</b>: Document your progress in your [[FND-Journal|Course Journal]]. Some tasks may ask you to include specific items in your journal. Don't overlook these.</li> | ||
+ | <li><b>Insights</b>: If you find something particularly noteworthy about this unit, make a note in your [[ABC-Insights|'''insights!''' page]].</li> | ||
+ | <section end=deliverables /> | ||
+ | <!-- ============================ --> | ||
+ | <hr> | ||
+ | <section begin=prerequisites /> | ||
+ | <b>Prerequisites:</b><br /> | ||
+ | You need the following preparation before beginning this unit. If you are not familiar with this material from courses you took previously, you need to prepare yourself from other information sources:<br /> | ||
*<b>Biomolecules</b>: The molecules of life; nucleic acids and amino acids; the genetic code; protein folding; post-translational modifications and protein biochemistry; membrane proteins; biological function. | *<b>Biomolecules</b>: The molecules of life; nucleic acids and amino acids; the genetic code; protein folding; post-translational modifications and protein biochemistry; membrane proteins; biological function. | ||
− | + | This unit builds on material covered in the following prerequisite units:<br /> | |
− | + | *[[BIN-PPI-Concepts|BIN-PPI-Concepts (Protein-Protein Interaction (PPI) Concepts)]] | |
− | *[[BIN-PPI-Concepts]] | + | <section end=prerequisites /> |
+ | <!-- ============================ --> | ||
+ | </div> | ||
− | {{ | + | {{Smallvspace}} |
− | |||
− | |||
− | |||
− | {{ | + | {{Smallvspace}} |
− | + | __TOC__ | |
− | |||
− | |||
{{Vspace}} | {{Vspace}} | ||
− | === | + | === Evaluation === |
− | < | + | <b>Evaluation: NA</b><br /> |
− | < | + | <div style="margin-left: 2rem;">This unit is not evaluated for course marks.</div> |
− | + | == Contents == | |
− | + | {{Smallvspace}} | |
− | + | In high-throughput biology, the genome was the beginning. As [http://en.wikipedia.org/wiki/Sydney_Brenner Sydney Brenner] has phrased it: we have now written the "white-pages" of the cell, fulfilling the "CAP-criterion" (Comprehensive, Accurate and Permanent). The next level is figuring out the way the parts work - if you will, the "Yellow Pages" - and many of us expect that substantial progress can be made by mapping their interactions. After all, physiological function can be described to a large part as the result of physical interaction. | |
− | |||
− | |||
− | + | Please note that there are different types of '''physical interactions'''. We most often think of '''complexes''', either stable or transient homo- or heterooligomers when we speak of physical interactions. But there are also interactions between '''substrates and products''' and not all of them correspond to classical enzymatic pathways. Phosphorylation and dephosphorylation are processes of key importance in signal transduction and acetylation/deacetylation plays a critical role in regulatory pathways. Here, the substrates are proteins and the interaction with the modifying enzyme is of course a physical interaction. | |
− | + | '''Genetic interactions''' on the other hand are another story. Here the word ''interaction'' is used in an entirely different sense: it is not synonymous with ''contact'' it is synonymous with ''influence''. In fact, most proteins that display genetic interactions would '''not''' be expected to interact physically as well. See the [[FND-PPI-Physical_vs_genetic]] unit for details. | |
− | |||
− | |||
− | |||
− | |||
− | |||
{{Vspace}} | {{Vspace}} | ||
− | |||
− | |||
− | |||
− | |||
− | |||
{{Task|1= | {{Task|1= | ||
*Read the introductory notes on {{ABC-PDF|BIN-PPI-Databases|protein-protein interaction databases}}. | *Read the introductory notes on {{ABC-PDF|BIN-PPI-Databases|protein-protein interaction databases}}. | ||
+ | |||
+ | * Read | ||
+ | {{#pmid: 27115627}} | ||
+ | {{#pmid: 30476227}} | ||
+ | |||
}} | }} | ||
Line 100: | Line 99: | ||
'''Interaction databases''' have similar problems as sequence databases: the need for standards for abstracting biological concepts into computable objects, data integrity, search and retrieval, and the metrics of comparison. There is however an added complication: interactions are rarely all-or-none, and the high-throughput experimental methods have large false-positive and false-negative rates. This makes it necessary to define '''confidence scores''' for interactions. On top of experimental methods, there are also a variety of methods for {{WP|Protein–protein_interaction_prediction|computational interaction prediction}}. However, even though the "gold standard" are careful, small-scale laboratory experiments, different curated efforts on the same experimental publication usually lead to different results - with as little as 42% overlap between databases being reported. | '''Interaction databases''' have similar problems as sequence databases: the need for standards for abstracting biological concepts into computable objects, data integrity, search and retrieval, and the metrics of comparison. There is however an added complication: interactions are rarely all-or-none, and the high-throughput experimental methods have large false-positive and false-negative rates. This makes it necessary to define '''confidence scores''' for interactions. On top of experimental methods, there are also a variety of methods for {{WP|Protein–protein_interaction_prediction|computational interaction prediction}}. However, even though the "gold standard" are careful, small-scale laboratory experiments, different curated efforts on the same experimental publication usually lead to different results - with as little as 42% overlap between databases being reported. | ||
− | Currently, likely the best integrated protein-protein interaction database is [http://www.ebi.ac.uk/intact/ '''IntAct'''], at the EBI, which besides curating interactions from the literature hosts interactions from the IMEx consortium | + | Currently, likely the best integrated protein-protein interaction database is [http://www.ebi.ac.uk/intact/ '''IntAct'''], at the EBI, which, besides curating interactions from the literature, hosts interactions from the IMEx consortium = an extensive data-sharing agreement between a number of general and specialized source databases. |
− | {{ | + | {{Vspace}} |
{{task|1= | {{task|1= | ||
* Access [http://www.ebi.ac.uk/intact/ '''IntAct'''] and enter the UniProt ID for yeast Mbp1 <tt>P39678</tt>. | * Access [http://www.ebi.ac.uk/intact/ '''IntAct'''] and enter the UniProt ID for yeast Mbp1 <tt>P39678</tt>. | ||
− | * | + | *The EBI search directly returns a table of pairwise interactions; both partners are listed as a pair and, in each pair one of the partners should be "Mbp1", or "YDL056W"" (the systematic name of yeast Mbp1). |
− | * | + | *How many different physical interaction detection methods do the IntAct records list? Follow the links and read their definitions. <small>('''Bravo''' to the IntAct developers, for '''defining''' their terms. In a better world, all the semantics of our databases should be similarly defined to be meaningful.)</small> |
− | But | + | But now what? |
− | If you are like me, you would | + | If you are like me, you would like to be able to link expression profiles, information about known complexes, GO annotations, knock-out phenotypes etc. etc. Not on the Web. |
}} | }} | ||
Line 119: | Line 118: | ||
+ | Next, we explore the BioGRID interaction database. BioGrid stores physical and genetic interactions. | ||
+ | {{Smallvspace}} | ||
+ | {{Task|1= | ||
+ | * Access the [http://www.thebiogrid.org/ the '''BioGRID'''] database at the Samuel-Lunenfeld Research Institute, Mount Sinai Hospital, Toronto. Search for interactions of the Mbp1 gene by entering the gene name into the form field. | ||
+ | *Follow the correct link in BioGrid for ''saccharomyces cerevisiae'' Mbp1 (YDL056W). All genes listed in that table have demonstrated interactions with Mbp1. | ||
+ | * List what general experimental type(s) the BioGrid interactors come from. (In particular note the difference between <span style="background-color:#FFCC00;">yellow</span> and <span style="background-color:#00FF44;">green</span> boxes). | ||
+ | You will note that some, but not all physical interactions listed by BioGRID and IntAct are the '''same''' according to a restrictive interpretation: '''same organism, same proteins, same experiment, same publication'''. | ||
− | + | * Which of the IntAct Mbp1 interactions are the same in BioGrid? | |
+ | * Check whether all of the interactions between the regulators of the ''G<sub>1</sub>/S'' phase as per the digram in the "Systems Concepts" PDF are present in BioGRID interactions. | ||
+ | }} | ||
− | + | {{Vspace}} | |
− | {{#pmid: | + | Now, what about MYSPE? Could you infer interactions between proteins whose orthologs interact in another species? Such predictions are called ''interologs'' (''inter''acting homo''logs''). Unfortunately, that does not appear to be the case. Confident prediction of interologs can only be achieved in cases of >80% joint sequence identity of both pairs<ref>{{#pmid:16854211}}</ref>, a level of similarity that (I believe) none of our Mbp1 proteins achieves. Does this mean the pathways and interactions are not conserved? Certainly not. We expect a very high degree of conservation of the system's function, but we can't say for sure whether any two specific proteins interact in a different species the same way they interact in yeast. All we can do is to use annotation transfer for hypothesis generation. But that is a useful starting point. |
+ | {{Vspace}} | ||
− | + | == Further reading, links and resources == | |
+ | <!--{{#pmid: 22115179}}--><!-- iRefR: an R package to manipulate the iRefIndex consolidated protein interaction database --> | ||
+ | {{#pmid: 23028270}}<!-- What evidence is there for the homology of protein-protein interactions? --> | ||
+ | {{#pmid: 22689642}}<!-- BIANA Interolog Prediction Server --> | ||
+ | {{#pmid: 27074302}}<!-- Predicting Protein-Protein Interactions from the Molecular to the Proteome Level --> | ||
== Notes == | == Notes == | ||
− | |||
− | |||
<references /> | <references /> | ||
{{Vspace}} | {{Vspace}} | ||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
<div class="about"> | <div class="about"> | ||
Line 188: | Line 159: | ||
:2017-08-05 | :2017-08-05 | ||
<b>Modified:</b><br /> | <b>Modified:</b><br /> | ||
− | : | + | :2020-09-22 |
<b>Version:</b><br /> | <b>Version:</b><br /> | ||
− | : | + | :1.1 |
<b>Version history:</b><br /> | <b>Version history:</b><br /> | ||
+ | *1.1 2020 Update | ||
+ | *1.0 First live | ||
*0.1 First stub | *0.1 First stub | ||
</div> | </div> | ||
− | |||
− | |||
{{CC-BY}} | {{CC-BY}} | ||
+ | [[Category:ABC-units]] | ||
+ | {{UNIT}} | ||
+ | {{LIVE}} | ||
</div> | </div> | ||
<!-- [END] --> | <!-- [END] --> |
Latest revision as of 07:42, 23 September 2020
Protein-Protein Interaction Databases
(IntAct, iRef,)
Abstract:
Exploring IntAct and BioGRID PPI databases.
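Objectives:
This unit will ...
- ... introduce issues surrounding the collection and curation of protein-protein interactions in databases;
- ... explore the Web interfaces to IntAct and BioGRID;
- ... discuss the limitations of interaction predictions based on homology.
Outcomes:
After working through this unit you ...
- ... can access IntAct and BioGRID and discover interactions with a protein of interest.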
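Deliverables:
- Time management: Before you begin, estimate how long it will take you to complete this unit. Then, record in your course journal: the number of hours you estimated, the number of hours you worked on the unit, and the amount of time that passed between start and completion of this unit.
- Journal: Document your progress in your Course Journal. Some tasks may ask you to include specific items in your journal. Don't overlook these.
- Insights: If you find something particularly noteworthy about this unit, make a note in your insights! page.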
Prerequisites:
You need the following preparation before beginning this unit. If you are not familiar with this material from courses you took previously, you need to prepare yourself from other information sources:
- Biomolecules: The molecules of life; nucleic acids and amino acids; the genetic code; protein folding; post-translational modifications and protein biochemistry; membrane proteins; biological function.
This unit builds on material covered in the following prerequisite units:
- BIN-PPI-Concepts (Protein-Protein Interaction (PPI) Concepts)
Evaluation
Evaluation: NA
This unit is not evaluated for course marks.
Contents
In high-throughput biology, the genome was the beginning. As Sydney Brenner has phrased it: we have now written the "White Pages" of the cell, fulfilling the "CAP-criterion" (Comprehensive, Accurate and Permanent). The next level is figuring out the way the parts work - if you will, the "Yellow Pages" - and many of us expect that substantial progress can be made by mapping their interactions. After all, physiological function can be described in large part as the result of physical interaction.
Please note that there are different types of physical interactions. When we speak of physical interactions, we most often think of complexes: stable or transient homo- or heterooligomers. But there are also interactions between substrates and products, and not all of them correspond to classical enzymatic pathways. Phosphorylation and dephosphorylation are processes of key importance in signal transduction, and acetylation/deacetylation plays a critical role in regulatory pathways. Here, the substrates are proteins, and the interaction with the modifying enzyme is of course a physical interaction.
Genetic interactions, on the other hand, are another story. Here the word interaction is used in an entirely different sense: it is not synonymous with contact; it is synonymous with influence. In fact, most proteins that display genetic interactions would not be expected to interact physically as well. See the FND-PPI-Physical_vs_genetic unit for details.
Task:
- Read the introductory notes on protein-protein interaction databases.
- Read
Licata & Orchard (2016) The MIntAct Project and Molecular Interaction Databases. Methods Mol Biol 1415:55-69. (pmid: 27115627) |
Oughtred et al. (2019) The BioGRID interaction database: 2019 update. Nucleic Acids Res 47:D529-D541. (pmid: 30476227) |
Data Sources
Interaction databases face problems similar to those of sequence databases: the need for standards for abstracting biological concepts into computable objects, data integrity, search and retrieval, and the metrics of comparison. There is however an added complication: interactions are rarely all-or-none, and the high-throughput experimental methods have large false-positive and false-negative rates. This makes it necessary to define confidence scores for interactions. On top of experimental methods, there are also a variety of methods for computational interaction prediction. However, even though careful, small-scale laboratory experiments are the "gold standard", different curation efforts working from the same experimental publication usually lead to different results - with as little as 42% overlap between databases being reported.
Currently, the best-integrated protein-protein interaction database is probably IntAct, at the EBI, which, besides curating interactions from the literature, hosts interactions from the IMEx consortium, an extensive data-sharing agreement between a number of general and specialized source databases.
Task:
- Access IntAct and enter the UniProt ID for yeast Mbp1: P39678.
- The EBI search directly returns a table of pairwise interactions; both partners are listed as a pair and, in each pair, one of the partners should be "Mbp1" or "YDL056W" (the systematic name of yeast Mbp1).
- How many different physical interaction detection methods do the IntAct records list? Follow the links and read their definitions. (Bravo to the IntAct developers, for defining their terms. In a better world, all the semantics of our databases should be similarly defined to be meaningful.)
But now what?
If you are like me, you would like to be able to link expression profiles, information about known complexes, GO annotations, knock-out phenotypes etc. etc. Not on the Web.
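The detection-method count asked for in the task above can also be done offline. The following is only a minimal sketch, assuming the IntAct results have been exported in the tab-delimited PSI-MI TAB (MITAB) format, in which column 7 holds the interaction detection method written as psi-mi:"MI:nnnn"(name). The helper names (make_row, count_detection_methods) are hypothetical, and the three rows are made-up placeholders with invented partner IDs, not actual IntAct records.

```python
import csv
import io
import re
from collections import Counter


def make_row(id_a, id_b, method):
    """Build one toy 15-column MITAB row; unused columns are filled with '-'."""
    row = ["-"] * 15
    row[0], row[1], row[6] = id_a, id_b, method
    return "\t".join(row)


# Illustrative placeholders only: the partner accessions (X0000n) are invented.
TOY_MITAB = "\n".join([
    make_row("uniprotkb:P39678", "uniprotkb:X00001",
             'psi-mi:"MI:0018"(two hybrid)'),
    make_row("uniprotkb:X00002", "uniprotkb:P39678",
             'psi-mi:"MI:0676"(tandem affinity purification)'),
    make_row("uniprotkb:P39678", "uniprotkb:X00003",
             'psi-mi:"MI:0018"(two hybrid)'),
])

# Matches entries such as psi-mi:"MI:0018"(two hybrid)
METHOD_RE = re.compile(r'psi-mi:"(MI:\d+)"\(([^)]*)\)')


def count_detection_methods(mitab_text, protein="uniprotkb:P39678"):
    """Count detection methods over all rows in which `protein` is a partner."""
    counts = Counter()
    reader = csv.reader(io.StringIO(mitab_text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    for row in reader:
        # Skip header/comment lines and rows that are too short to parse.
        if len(row) < 7 or row[0].startswith("#"):
            continue
        if protein not in (row[0], row[1]):
            continue
        for mi_id, name in METHOD_RE.findall(row[6]):
            counts[(mi_id, name)] += 1
    return counts


if __name__ == "__main__":
    for (mi_id, name), n in sorted(count_detection_methods(TOY_MITAB).items()):
        print(mi_id, name, n)
```

The number of distinct methods is then simply the number of keys in the returned counter. Working from a downloaded table in this way is also the first step towards joining interactions with other data, which the Web interface does not offer.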
Next, we explore the BioGRID interaction database. BioGRID stores physical and genetic interactions.
Task:
- Access the BioGRID database at the Samuel-Lunenfeld Research Institute, Mount Sinai Hospital, Toronto. Search for interactions of the Mbp1 gene by entering the gene name into the form field.
- Follow the correct link in BioGRID for Saccharomyces cerevisiae Mbp1 (YDL056W). All genes listed in that table have demonstrated interactions with Mbp1.
- List the general experimental type(s) from which the BioGRID interactions come. (In particular, note the difference between yellow and green boxes.)
You will note that some, but not all, physical interactions listed by BioGRID and IntAct are the same according to a restrictive interpretation: same organism, same proteins, same experiment, same publication.
- Which of the IntAct Mbp1 interactions are the same in BioGRID?
- Check whether all of the interactions between the regulators of the G1/S phase, as shown in the diagram in the "Systems Concepts" PDF, are present in the BioGRID interactions.
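The restrictive interpretation of "same" used in this task (same organism, same proteins, same experiment, same publication) can be expressed as set intersection over a four-part key. This is only a sketch under stated assumptions: the Interaction record, the make and compare helpers, the partner names and the publication identifiers are all hypothetical placeholders, and the method names are assumed to have been mapped to a common vocabulary already, which the two databases do not do for you.

```python
from typing import FrozenSet, NamedTuple


class Interaction(NamedTuple):
    """One interaction record, reduced to the four fields that define 'same'."""
    taxon: str                 # organism, e.g. an NCBI taxonomy ID
    proteins: FrozenSet[str]   # unordered pair, so A-B equals B-A
    method: str                # experiment type, assumed already harmonised
    publication: str           # publication identifier


def make(taxon, a, b, method, publication):
    """Normalise one record so that partner order and letter case do not matter."""
    return Interaction(taxon, frozenset((a.upper(), b.upper())),
                       method.lower(), publication)


def compare(db_a, db_b):
    """Return the shared records and the fraction shared (intersection / union)."""
    shared = set(db_a) & set(db_b)
    union = set(db_a) | set(db_b)
    fraction = len(shared) / len(union) if union else 0.0
    return shared, fraction


# Made-up placeholder records, not actual database content.
intact = {
    make("559292", "Mbp1", "PartnerA", "two hybrid", "PUB-1"),
    make("559292", "Mbp1", "PartnerB", "pull down", "PUB-2"),
}
biogrid = {
    make("559292", "PartnerA", "MBP1", "Two hybrid", "PUB-1"),
    make("559292", "Mbp1", "PartnerB", "two hybrid", "PUB-3"),
}

if __name__ == "__main__":
    shared, fraction = compare(intact, biogrid)
    print(len(shared), "shared record(s); overlap fraction:", round(fraction, 2))
```

In this toy example both databases list an Mbp1-PartnerB interaction, but the records differ in experiment and publication, so they do not count as the same under the restrictive interpretation. Relaxing the key (for example, dropping the method) is one way to see how the definition of "same" changes the reported overlap.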
Now, what about MYSPE? Could you infer interactions between proteins whose orthologs interact in another species? Such predictions are called interologs (interacting homologs). Unfortunately, such inference does not appear to be reliable in general. Confident prediction of interologs can only be achieved in cases of >80% joint sequence identity of both pairs[1], a level of similarity that (I believe) none of our Mbp1 proteins achieves. Does this mean the pathways and interactions are not conserved? Certainly not. We expect a very high degree of conservation of the system's function, but we can't say for sure whether any two specific proteins interact in a different species the same way they interact in yeast. All we can do is use annotation transfer for hypothesis generation. But that is a useful starting point.
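The 80% threshold can be turned into a simple filter for candidate interologs. This is a sketch only: it assumes that "joint sequence identity" means the geometric mean of the percent identities of the two ortholog pairs, a definition used in the interolog literature but not spelled out in this unit. The function names and the identity values in the usage example are hypothetical.

```python
import math


def joint_identity(identity_a, identity_b):
    """Geometric mean of two percent identities (assumed definition of 'joint').

    identity_a: % identity between protein A and its ortholog A'
    identity_b: % identity between protein B and its ortholog B'
    """
    for value in (identity_a, identity_b):
        if not 0.0 <= value <= 100.0:
            raise ValueError("percent identity must lie between 0 and 100")
    return math.sqrt(identity_a * identity_b)


def is_confident_interolog(identity_a, identity_b, threshold=80.0):
    """True if the joint identity exceeds the threshold (>80% by default)."""
    return joint_identity(identity_a, identity_b) > threshold


if __name__ == "__main__":
    # Invented identity values, for illustration only.
    for pair in [(90.0, 85.0), (95.0, 40.0)]:
        print(pair, round(joint_identity(*pair), 1), is_confident_interolog(*pair))
```

Because the geometric mean is pulled down by the weaker of the two pairs, one highly conserved partner cannot compensate for a poorly conserved one. Pairs that fail the filter are still candidates for hypothesis generation; they are just not confident predictions.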
Further reading, links and resources
Lewis et al. (2012) What evidence is there for the homology of protein-protein interactions?. PLoS Comput Biol 8:e1002645. (pmid: 23028270) |
Garcia-Garcia et al. (2012) BIPS: BIANA Interolog Prediction Server. A tool for protein-protein interaction inference. Nucleic Acids Res 40:W147-51. (pmid: 22689642) |
Keskin et al. (2016) Predicting Protein-Protein Interactions from the Molecular to the Proteome Level. Chem Rev 116:4884-909. (pmid: 27074302) |
Notes
About ...
Author:
- Boris Steipe <boris.steipe@utoronto.ca>
Created:
- 2017-08-05
Modified:
- 2020-09-22
Version:
- 1.1
Version history:
- 1.1 2020 Update
- 1.0 First live
- 0.1 First stub
This copyrighted material is licensed under a Creative Commons Attribution 4.0 International License. Follow the link to learn more.