Difference between revisions of "BIO Assignment Week 11"


Revision as of 17:55, 4 October 2015

Assignment for Week 11
Protein-Protein Interactions

< Assignment 10  

Note! This assignment is currently inactive. Major and minor unannounced changes may be made at any time.

 
 

Concepts and activities (and reading, if applicable) for this assignment will be topics on next week's quiz.



 

Introduction

Data Sources

  • iRefIndex
  • iRefWeb (screen-scraping iRefWeb et al.)


Interaction databases face problems similar to those of sequence databases: they need standards for abstracting biological concepts into computable objects, and they must address data integrity, search and retrieval, and metrics of comparison. There is, however, an added complication: interactions are rarely all-or-none, and high-throughput experimental methods have large false-positive and false-negative rates. This makes it necessary to define confidence scores for interactions.
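To make the idea of a confidence score concrete, here is a minimal sketch of one common way to combine several pieces of evidence for the same interaction: a "noisy-OR", which assumes the pieces of evidence are independent. This is an illustration only and not the scoring scheme of any particular database; the function name and the example reliabilities are hypothetical.

```python
from math import prod

def combined_confidence(evidence_scores):
    """Noisy-OR combination of per-experiment reliabilities.

    Each score is the (assumed) probability that one piece of evidence
    is correct. Assuming independence, the interaction is supported
    unless every piece of evidence is wrong.
    """
    for score in evidence_scores:
        if not 0.0 <= score <= 1.0:
            raise ValueError("each score must lie in [0, 1]")
    return 1.0 - prod(1.0 - score for score in evidence_scores)

# Hypothetical reliabilities for three experiments reporting the same pair.
scores = [0.5, 0.7, 0.9]
print(round(combined_confidence(scores), 3))
```

Note how weak evidence still adds up: three individually uncertain observations give a higher combined score than any single one, which is why the independence assumption deserves scrutiny in practice.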


 

Introductory reading

Turner et al. (2010) iRefWeb: interactive analysis of consolidated protein interaction data and their supporting evidence. Database (Oxford) 2010:baq023. (pmid: 20940177)

We present iRefWeb, a web interface to protein interaction data consolidated from 10 public databases: BIND, BioGRID, CORUM, DIP, IntAct, HPRD, MINT, MPact, MPPI and OPHID. iRefWeb enables users to examine aggregated interactions for a protein of interest, and presents various statistical summaries of the data across databases, such as the number of organism-specific interactions, proteins and cited publications. Through links to source databases and supporting evidence, researchers may gauge the reliability of an interaction using simple criteria, such as the detection methods, the scale of the study (high- or low-throughput) or the number of cited publications. Furthermore, iRefWeb compares the information extracted from the same publication by different databases, and offers means to follow-up possible inconsistencies. We provide an overview of the consolidated protein-protein interaction landscape and show how it can be automatically cropped to aid the generation of meaningful organism-specific interactomes. iRefWeb can be accessed at: http://wodaklab.org/iRefWeb. Database URL: http://wodaklab.org/iRefWeb/


 

Contents

  • Abstraction and standards
  • Databases
  • Confidence scores
Mora & Donaldson (2011) iRefR: an R package to manipulate the iRefIndex consolidated protein interaction database. BMC Bioinformatics 12:455. (pmid: 22115179)

BACKGROUND: The iRefIndex addresses the need to consolidate protein interaction data into a single uniform data resource. iRefR provides the user with access to this data source from an R environment. RESULTS: The iRefR package includes tools for selecting specific subsets of interest from the iRefIndex by criteria such as organism, source database, experimental method, protein accessions and publication identifier. Data may be converted between three representations (MITAB, edgeList and graph) for use with other R packages such as igraph, graph and RBGL. The user may choose between different methods for resolving redundancies in interaction data and how n-ary data is represented. In addition, we describe a function to identify binary interaction records that possibly represent protein complexes. We show that the user choice of data selection, redundancy resolution and n-ary data representation all have an impact on graphical analysis. CONCLUSIONS: The package allows the user to control how these issues are dealt with and communicate them via an R-script written using the iRefR package - this will facilitate communication of methods, reproducibility of network analyses and further modification and comparison of methods by researchers.
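The three representations mentioned in the abstract above (MITAB, edge list, graph) can be illustrated with a small sketch. It is written in Python rather than R and does not use or reproduce the iRefR API; the function names and the example identifiers are hypothetical. The only format assumption is that the first two tab-separated columns of a PSI-MITAB row identify interactors A and B.

```python
from collections import defaultdict

def mitab_to_edge_list(lines):
    """Extract (interactor A, interactor B) pairs from tab-separated rows."""
    edges = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        columns = line.rstrip("\n").split("\t")
        if len(columns) < 2:
            continue  # not a usable interaction row
        edges.append((columns[0], columns[1]))
    return edges

def edge_list_to_graph(edges):
    """Build an undirected graph as a dict of neighbour sets."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)

# Made-up rows for illustration; real MITAB rows carry many more columns.
rows = [
    "#uidA\tuidB\tmethod",
    "protA\tprotB\tmethod_x",
    "protB\tprotC\tmethod_y",
]
edges = mitab_to_edge_list(rows)
graph = edge_list_to_graph(edges)
print(edges)
print(sorted(graph["protB"]))
```

A sketch like this ignores redundancy resolution and n-ary records, which the abstract identifies as choices that affect the resulting graph.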


 

Further reading and resources

Standards
Orchard & Hermjakob (2011) Data standardization by the HUPO-PSI: how has the community benefitted?. Methods Mol Biol 696:149-60. (pmid: 21063946)

The groundwork allowing the systematic capture of proteomics data has now largely been completed, with the design and publication of exchange formats and interchange standards by the Human Proteome Organisation Proteomics Standards Initiative (HUPO-PSI). Our focus can now shift to gathering the ever-increasing amounts of generated data, and finding novel ways to catalog and present it so that a deeper understanding of basic science, health, and disease can be gained by scientists mining these increasingly rich resources.

Data
Razick et al. (2008) iRefIndex: a consolidated protein interaction database with provenance. BMC Bioinformatics 9:405. (pmid: 18823568)

BACKGROUND: Interaction data for a given protein may be spread across multiple databases. We set out to create a unifying index that would facilitate searching for these data and that would group together redundant interaction data while recording the methods used to perform this grouping. RESULTS: We present a method to generate a key for a protein interaction record and a key for each participant protein. These keys may be generated by anyone using only the primary sequence of the proteins, their taxonomy identifiers and the Secure Hash Algorithm. Two interaction records will have identical keys if they refer to the same set of identical protein sequences and taxonomy identifiers. We define records with identical keys as a redundant group. Our method required that we map protein database references found in interaction records to current protein sequence records. Operations performed during this mapping are described by a mapping score that may provide valuable feedback to source interaction databases on problematic references that are malformed, deprecated, ambiguous or unfound. Keys for protein participants allow for retrieval of interaction information independent of the protein references used in the original records. CONCLUSION: We have applied our method to protein interaction records from BIND, BioGrid, DIP, HPRD, IntAct, MINT, MPact, MPPI and OPHID. The resulting interaction reference index is provided in PSI-MITAB 2.5 format at http://irefindex.uio.no. This index may form the basis of alternative redundant groupings based on gene identifiers or near sequence identity groupings.
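The key construction described in the abstract above can be sketched as follows. This is a hedged illustration in the spirit of the method, not the iRefIndex implementation: the abstract only states that keys derive from the primary sequence, the taxonomy identifier and the Secure Hash Algorithm, so the choice of SHA-1, the base64 encoding, the sequence normalization and the function names are all assumptions, and the sequences are made up.

```python
import base64
import hashlib

def _sha1_base64(text):
    """SHA-1 digest of an ASCII string, base64-encoded without padding."""
    digest = hashlib.sha1(text.encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii").rstrip("=")

def participant_key(sequence, taxon_id):
    """Key for one protein: hash of its sequence plus its taxonomy ID."""
    normalized = "".join(sequence.split()).upper()  # assumed normalization
    return _sha1_base64(normalized) + str(taxon_id)

def interaction_key(participant_keys):
    """Key for an interaction record: hash of the sorted participant keys,
    so that the order in which participants are listed does not matter."""
    return _sha1_base64("".join(sorted(participant_keys)))

# Made-up sequences; 4932 is the NCBI taxonomy ID of Saccharomyces cerevisiae.
key_a = participant_key("MKVLTAGQ", 4932)
key_b = participant_key("MSTNPKPQ", 4932)

# Two records listing the same proteins in a different order share one key
# and would therefore fall into the same redundant group.
print(interaction_key([key_a, key_b]) == interaction_key([key_b, key_a]))
```

Because the keys depend only on sequence and taxon, they are independent of which accession a source database used for the protein, which is the property the abstract emphasizes.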

Ooi et al. (2010) Databases of protein-protein interactions and complexes. Methods Mol Biol 609:145-59. (pmid: 20221918)

In the current understanding, translation of genomic sequences into proteins is the most important path for realization of genome information. In exercising their intended function, proteins work together through various forms of direct (physical) or indirect interaction mechanisms. For a variety of basic functions, many proteins form a large complex representing a molecular machine or a macromolecular super-structural building block. After several high-throughput techniques for detection of protein-protein interactions had matured, protein interaction data became available in a large scale and curated databases for protein-protein interactions (PPIs) are a new necessity for efficient research. Here, their scope, annotation quality, and retrieval tools are reviewed. In addition, attention is paid to portals that provide unified access to a variety of such databases with added annotation value.

Wodak et al. (2011) High-throughput analyses and curation of protein interactions in yeast. Methods Mol Biol 759:381-406. (pmid: 21863499)

The yeast Saccharomyces cerevisiae is the model organism in which protein interactions have been most extensively analyzed. The vast majority of these interactions have been characterized by a variety of sophisticated high-throughput techniques probing different aspects of protein association. This chapter summarizes the major techniques, highlights their complementary nature, discusses the data they produce, and highlights some of the biases from which they suffer. A main focus is the key role played by computational methods for processing, analyzing, and validating the large body of noisy data produced by the experimental procedures. It also describes how computational methods are used to extend the coverage and reliability of protein interaction data by integrating information from heterogeneous sources and reviews the current status of literature-curated data on yeast protein interactions stored in specialized databases.

Musso et al. (2011) Filtering and interpreting large-scale experimental protein-protein interaction data. Methods Mol Biol 781:295-309. (pmid: 21877287)

Rarely acting in isolation, it is invariably the physical associations among proteins that define their biological activity, necessitating the study of the cellular meshwork of protein-protein interactions (PPI) before a full appreciation of gene function can be achieved. The past few years have seen a marked expansion in both the sheer volume and number of organisms for which high-quality interaction data is available, with high-throughput interaction screening and detection techniques showing consistent improvement both in scale and sensitivity. Although techniques for large-scale PPI mapping are increasingly being applied to new organisms, including human, there is a corresponding need to rigorously evaluate, benchmark, and impartially filter the results. This chapter explores methods for PPI dataset evaluation, including a survey of previous techniques applied by landmark studies in the field and a discussion of promising new experimental approaches. We further outline practical suggestions and useful tools for interpreting newly generated PPI data. As the majority of large-scale experimental data has been generated for the budding yeast S. cerevisiae, most of the techniques and datasets described are from the perspective of this model unicellular eukaryote; however, extensions to other organisms including mammals are mentioned where possible.

Jain & Bader (2010) An improved method for scoring protein-protein interactions using semantic similarity within the gene ontology. BMC Bioinformatics 11:562. (pmid: 21078182)

BACKGROUND: Semantic similarity measures are useful to assess the physiological relevance of protein-protein interactions (PPIs). They quantify similarity between proteins based on their function using annotation systems like the Gene Ontology (GO). Proteins that interact in the cell are likely to be in similar locations or involved in similar biological processes compared to proteins that do not interact. Thus the more semantically similar the gene function annotations are among the interacting proteins, the more likely the interaction is physiologically relevant. However, most semantic similarity measures used for PPI confidence assessment do not consider the unequal depth of term hierarchies in different classes of cellular location, molecular function, and biological process ontologies of GO and thus may over- or under-estimate similarity. RESULTS: We describe an improved algorithm, Topological Clustering Semantic Similarity (TCSS), to compute semantic similarity between GO terms annotated to proteins in interaction datasets. Our algorithm considers unequal depth of biological knowledge representation in different branches of the GO graph. The central idea is to divide the GO graph into sub-graphs and score PPIs higher if participating proteins belong to the same sub-graph as compared to if they belong to different sub-graphs. CONCLUSIONS: The TCSS algorithm performs better than other semantic similarity measurement techniques that we evaluated in terms of their performance on distinguishing true from false protein interactions, and correlation with gene expression and protein families. We show an average improvement of 4.6 times the F1 score over Resnik, the next best method, on our Saccharomyces cerevisiae PPI dataset and 2 times on our Homo sapiens PPI dataset using cellular component, biological process and molecular function GO annotations.

Databases
Yu et al. (2004) Annotation transfer between genomes: protein-protein interologs and protein-DNA regulogs. Genome Res 14:1107-18. (pmid: 15173116)

Proteins function mainly through interactions, especially with DNA and other proteins. While some large-scale interaction networks are now available for a number of model organisms, their experimental generation remains difficult. Consequently, interolog mapping--the transfer of interaction annotation from one organism to another using comparative genomics--is of significant value. Here we quantitatively assess the degree to which interologs can be reliably transferred between species as a function of the sequence similarity of the corresponding interacting proteins. Using interaction information from Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, and Helicobacter pylori, we find that protein-protein interactions can be transferred when a pair of proteins has a joint sequence identity >80% or a joint E-value <10^-70. (These "joint" quantities are the geometric means of the identities or E-values for the two pairs of interacting proteins.) We generalize our interolog analysis to protein-DNA binding, finding such interactions are conserved at specific thresholds between 30% and 60% sequence identity depending on the protein family. Furthermore, we introduce the concept of a "regulog"--a conserved regulatory relationship between proteins across different species. We map interologs and regulogs from yeast to a number of genomes with limited experimental annotation (e.g., Arabidopsis thaliana) and make these available through an online database at http://interolog.gersteinlab.org. Specifically, we are able to transfer approximately 90,000 potential protein-protein interactions to the worm. We test a number of these in two-hybrid experiments and are able to verify 45 overlaps, which we show to be statistically significant.
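The "joint" quantities in the abstract above are simple to compute. The following sketch applies the stated thresholds (joint identity above 80%, or joint E-value below 10^-70) to a pair of ortholog matches. The function names and the example numbers are hypothetical; only the geometric mean and the two thresholds come from the abstract.

```python
from math import sqrt

def joint_identity(identity_a, identity_b):
    """Geometric mean of the two percent identities (each 0..100)."""
    return sqrt(identity_a * identity_b)

def joint_evalue(evalue_a, evalue_b):
    """Geometric mean of the two E-values."""
    return sqrt(evalue_a * evalue_b)

def is_transferable(identity_a, identity_b, evalue_a, evalue_b,
                    min_identity=80.0, max_evalue=1e-70):
    """True if the interaction passes either threshold from the abstract."""
    return (joint_identity(identity_a, identity_b) > min_identity
            or joint_evalue(evalue_a, evalue_b) < max_evalue)

# Hypothetical case: protein A matches its ortholog at 90% identity,
# protein B matches its ortholog at 75% identity.
print(round(joint_identity(90.0, 75.0), 1))
print(is_transferable(90.0, 75.0, 1e-50, 1e-40))
```

The geometric mean penalizes an unbalanced pair: one highly conserved partner cannot fully compensate for a poorly conserved one.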

Szklarczyk et al. (2011) The STRING database in 2011: functional interaction networks of proteins, globally integrated and scored. Nucleic Acids Res 39:D561-8. (pmid: 21045058)

An essential prerequisite for any systems-level understanding of cellular functions is to correctly uncover and annotate all functional interactions among proteins in the cell. Toward this goal, remarkable progress has been made in recent years, both in terms of experimental measurements and computational prediction techniques. However, public efforts to collect and present protein interaction information have struggled to keep up with the pace of interaction discovery, partly because protein-protein interaction information can be error-prone and require considerable effort to annotate. Here, we present an update on the online database resource Search Tool for the Retrieval of Interacting Genes (STRING); it provides uniquely comprehensive coverage and ease of access to both experimental as well as predicted interaction information. Interactions in STRING are provided with a confidence score, and accessory information such as protein domains and 3D structures is made available, all within a stable and consistent identifier space. New features in STRING include an interactive network viewer that can cluster networks on demand, updated on-screen previews of structural information including homology models, extensive data updates and strongly improved connectivity and integration with third-party resources. Version 9.0 of STRING covers more than 1100 completely sequenced organisms; the resource can be reached at http://string-db.org.




 

Interaction prediction

Interologs for YFO...


 

Visualizing Interactions

Cytoscape is a program originally written in Trey Ideker's lab at the Institute for Systems Biology that is now a thriving, open-source community project for the development of a biology-oriented network display and analysis tool.


Kohl et al. (2011) Cytoscape: software for visualization and analysis of biological networks. Methods Mol Biol 696:291-303. (pmid: 21063955)

Substantial progress has been made in the field of "omics" research (e.g., Genomics, Transcriptomics, Proteomics, and Metabolomics), leading to a vast amount of biological data. In order to represent large biological data sets in an easily interpretable manner, this information is frequently visualized as graphs, i.e., a set of nodes and edges. Nodes are representations of biological molecules and edges connect the nodes depicting some kind of relationship. Obviously, there is a high demand for computer-based assistance for both visualization and analysis of biological data, which are often heterogeneous and retrieved from different sources. This chapter focuses on software tools that assist in visual exploration and analysis of biological networks. Global requirements for such programs are discussed. Utilization of visualization software is exemplified using the widely used Cytoscape tool. Additional information about the use of Cytoscape is provided in the Notes section. Furthermore, special features of alternative software tools are highlighted in order to assist researchers in the choice of an adequate program for their specific requirements.


Cytoscape is now available as version 3 and should be straightforward to download and install.


Cytoscape tutorials ([2])
Montojo et al. (2010) GeneMANIA Cytoscape plugin: fast gene function predictions on the desktop. Bioinformatics 26:2927-8. (pmid: 20926419)

The GeneMANIA Cytoscape plugin brings fast gene function prediction capabilities to the desktop. GeneMANIA identifies the most related genes to a query gene set using a guilt-by-association approach. The plugin uses over 800 networks from six organisms and each related gene is traceable to the source network used to make the prediction. Users may add their own interaction networks and expression profile data to complement or override the default data. AVAILABILITY AND IMPLEMENTATION: The GeneMANIA Cytoscape plugin is implemented in Java and is freely available at http://www.genemania.org/plugin/.

Merico et al. (2011) Visualizing gene-set enrichment results using the Cytoscape plug-in enrichment map. Methods Mol Biol 781:257-77. (pmid: 21877285)

Gene-set enrichment analysis finds functionally coherent gene-sets, such as pathways, that are statistically overrepresented in a given gene list. Ideally, the number of resulting sets is smaller than the number of genes in the list, thus simplifying interpretation. However, the increasing number and redundancy of gene-sets used by many current enrichment analysis resources work against this ideal. "Enrichment Map" is a Cytoscape plug-in that helps overcome gene-set redundancy and aids in the interpretation of enrichment results. Gene-sets are organized in a network, where each set is a node and links represent gene overlap between sets. Automated network layout groups related gene-sets into network clusters, enabling the user to quickly identify the major enriched functional themes and more easily interpret enrichment results.
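The network construction described in the abstract above (each gene-set is a node, links represent gene overlap) can be sketched in a few lines. This is not the plug-in's code: the use of the Jaccard coefficient as the overlap measure, the threshold value, the function names and the toy gene-sets are all assumptions made for illustration.

```python
from itertools import combinations

def jaccard(genes_a, genes_b):
    """Shared genes divided by all genes in either set."""
    a, b = set(genes_a), set(genes_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def overlap_edges(gene_sets, threshold=0.25):
    """Return (set name, set name, overlap) for pairs at or above threshold."""
    edges = []
    for name_a, name_b in combinations(sorted(gene_sets), 2):
        score = jaccard(gene_sets[name_a], gene_sets[name_b])
        if score >= threshold:
            edges.append((name_a, name_b, score))
    return edges

# Toy gene-sets with made-up gene names.
gene_sets = {
    "set_A": {"g1", "g2", "g3"},
    "set_B": {"g2", "g3", "g4"},
    "set_C": {"g9"},
}
print(overlap_edges(gene_sets))
```

Redundant gene-sets end up connected by strong edges, so a layout algorithm pulls them into the same cluster, while unrelated sets such as set_C stay isolated.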

Platform
Shannon et al. (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13:2498-504. (pmid: 14597658)

Cytoscape is an open source software project for integrating biomolecular interaction networks with high-throughput expression data and other molecular states into a unified conceptual framework. Although applicable to any system of molecular components and interactions, Cytoscape is most powerful when used in conjunction with large databases of protein-protein, protein-DNA, and genetic interactions that are increasingly available for humans and model organisms. Cytoscape's software Core provides basic functionality to layout and query the network; to visually integrate the network with expression profiles, phenotypes, and other molecular states; and to link the network to databases of functional annotations. The Core is extensible through a straightforward plug-in architecture, allowing rapid development of additional computational analyses and features. Several case studies of Cytoscape plug-ins are surveyed, including a search for interaction pathways correlating with changes in gene expression, a study of protein complexes involved in cellular recovery to DNA damage, inference of a combined physical/functional interaction network for Halobacterium, and an interface to detailed stochastic/kinetic gene regulatory models.

Cline et al. (2007) Integration of biological networks and gene expression data using Cytoscape. Nat Protoc 2:2366-82. (pmid: 17947979)

Cytoscape is a free software package for visualizing, modeling and analyzing molecular and genetic interaction networks. This protocol explains how to use Cytoscape to analyze the results of mRNA expression profiling, and other functional genomics and proteomics experiments, in the context of an interaction network obtained for genes of interest. Five major steps are described: (i) obtaining a gene or protein network, (ii) displaying the network using layout algorithms, (iii) integrating with gene expression and other functional attributes, (iv) identifying putative complexes and functional modules and (v) identifying enriched Gene Ontology annotations in the network. These steps provide a broad sample of the types of analyses performed by Cytoscape.

Killcoyne et al. (2009) Cytoscape: a community-based framework for network modeling. Methods Mol Biol 563:219-39. (pmid: 19597788)

Cytoscape is a general network visualization, data integration, and analysis software package. Its development and use has been focused on the modeling requirements of systems biology, though it has been used in other fields. Cytoscape's flexibility has encouraged many users to adopt it and adapt it to their own research by using the plugin framework offered to specialize data analysis, data integration, or visualization. Plugins represent collections of community-contributed functionality and can be used to dynamically extend Cytoscape functionality. This community of users and developers has worked together since Cytoscape's initial release to improve the basic project through contributions to the core code and public offerings of plugin modules. This chapter discusses what Cytoscape does, why it was developed, and the extensions numerous groups have made available to the public. It also describes the development of a plugin used to investigate a particular research question in systems biology and walks through an example analysis using Cytoscape.

Smoot et al. (2011) Cytoscape 2.8: new features for data integration and network visualization. Bioinformatics 27:431-2. (pmid: 21149340)

Cytoscape is a popular bioinformatics package for biological network visualization and data integration. Version 2.8 introduces two powerful new features--Custom Node Graphics and Attribute Equations--which can be used jointly to greatly enhance Cytoscape's data integration and visualization capabilities. Custom Node Graphics allow an image to be projected onto a node, including images generated dynamically or at remote locations. Attribute Equations provide Cytoscape with spreadsheet-like functionality in which the value of an attribute is computed dynamically as a function of other attributes and network properties. AVAILABILITY AND IMPLEMENTATION: Cytoscape is a desktop Java application released under the Library Gnu Public License (LGPL). Binary install bundles and source code for Cytoscape 2.8 are available for download from http://cytoscape.org.

Plugins
Rivera et al. (2010) NeMo: Network Module identification in Cytoscape. BMC Bioinformatics 11 Suppl 1:S61. (pmid: 20122237)

BACKGROUND: As the size of the known human interactome grows, biologists increasingly rely on computational tools to identify patterns that represent protein complexes and pathways. Previous studies have shown that densely connected network components frequently correspond to community structure and functionally related modules. In this work, we present a novel method to identify densely connected and bipartite network modules based on a log odds score for shared neighbours. RESULTS: To evaluate the performance of our method (NeMo), we compare it to other widely used tools for community detection including kMetis, MCODE, and spectral clustering. We test these methods on a collection of synthetically constructed networks and the set of MIPS human complexes. We apply our method to the CXC chemokine pathway and find a high scoring functional module of 12 disconnected phospholipase isoforms. CONCLUSION: We present a novel method that combines a unique neighbour-sharing score with hierarchical agglomerative clustering to identify diverse network communities. The approach is unique in that we identify both dense network and dense bipartite network structures in a single approach. Our results suggest that the performance of NeMo is better than or competitive with leading approaches on both real and synthetic datasets. We minimize model complexity and generalization error in the Bayesian spirit by integrating out nuisance parameters. An implementation of our method is freely available for download as a plugin to Cytoscape through our website and through Cytoscape itself.

Oesper et al. (2011) WordCloud: a Cytoscape plugin to create a visual semantic summary of networks. Source Code Biol Med 6:7. (pmid: 21473782)

BACKGROUND: When biological networks are studied, it is common to look for clusters, i.e. sets of nodes that are highly inter-connected. To understand the biological meaning of a cluster, the user usually has to sift through many textual annotations that are associated with biological entities. FINDINGS: The WordCloud Cytoscape plugin generates a visual summary of these annotations by displaying them as a tag cloud, where more frequent words are displayed using a larger font size. Word co-occurrence in a phrase can be visualized by arranging words in clusters or as a network. CONCLUSIONS: WordCloud provides a concise visual summary of annotations which is helpful for network analysis and interpretation. WordCloud is freely available at http://baderlab.org/Software/WordCloudPlugin.
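The tag-cloud idea in the abstract above (more frequent words get a larger font) reduces to counting words and scaling the counts. The sketch below assumes a simple linear scaling between a minimum and a maximum font size; the plugin's actual scaling rule is not stated in the abstract, and the function name, tokenization and example annotations are hypothetical.

```python
import re
from collections import Counter

def font_sizes(annotations, min_size=10, max_size=40):
    """Map each word to a font size that grows linearly with its count."""
    counts = Counter(
        word
        for text in annotations
        for word in re.findall(r"[a-z]+", text.lower())
    )
    if not counts:
        return {}
    lowest, highest = min(counts.values()), max(counts.values())
    span = highest - lowest
    if span == 0:
        # All words equally frequent: give them all the same size.
        return {word: max_size for word in counts}
    return {
        word: min_size + (max_size - min_size) * (count - lowest) / span
        for word, count in counts.items()
    }

# Made-up annotations for the nodes of one cluster.
annotations = ["DNA repair", "DNA replication", "DNA repair protein"]
sizes = font_sizes(annotations)
for word in sorted(sizes, key=sizes.get, reverse=True):
    print(word, sizes[word])
```

In practice one would also filter out uninformative words, since very common annotation terms would otherwise dominate the cloud.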

Razick et al. (2011) iRefScape. A Cytoscape plug-in for visualization and data mining of protein interaction data from iRefIndex. BMC Bioinformatics 12:388. (pmid: 21975162)

BACKGROUND: The iRefIndex consolidates protein interaction data from ten databases in a rigorous manner using sequence-based hash keys. Working with consolidated interaction data comes with distinct challenges: data are redundant, overlapping, highly interconnected and may be collected and represented using different curation practices. These phenomena were quantified in our previous studies. RESULTS: The iRefScape plug-in for the Cytoscape graphical viewer addresses these challenges. We show how these factors impact on data-mining tasks and how our solutions resolve them in a simple and efficient manner. A uniform accession space is used to limit redundancy and support search expansion and searching on multiple accession types. Multiple node and edge features support data filtering and mining. Node colours and features supply information about search result provenance. Overlapping evidence is presented using a multi-graph and a bi-partite representation is used to distinguish binary and n-ary source data. Searching for interactions between sets of proteins is supported and specifically includes searches on disease-related genes found in OMIM. Finally, a synchronized adjacency-matrix view facilitates visualization of relationships between sets of user defined groups. CONCLUSIONS: The iRefScape plug-in will be of interest to advanced users of interaction data. The plug-in provides access to a consolidated data set in a uniform accession space while remaining faithful to the underlying source data. Tools are provided to facilitate a range of tasks from a simple search to knowledge discovery. The plug-in uses a number of strategies that will be of interest to other plug-in developers.

Morris et al. (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12:436. (pmid: 22070249)

BACKGROUND: In the post-genomic era, the rapid increase in high-throughput data calls for computational tools capable of integrating data of diverse types and facilitating recognition of biologically meaningful patterns within them. For example, protein-protein interaction data sets have been clustered to identify stable complexes, but scientists lack easily accessible tools to facilitate combined analyses of multiple data sets from different types of experiments. Here we present clusterMaker, a Cytoscape plugin that implements several clustering algorithms and provides network, dendrogram, and heat map views of the results. The Cytoscape network is linked to all of the other views, so that a selection in one is immediately reflected in the others. clusterMaker is the first Cytoscape plugin to implement such a wide variety of clustering algorithms and visualizations, including the only implementations of hierarchical clustering, dendrogram plus heat map visualization (tree view), k-means, k-medoid, SCPS, AutoSOME, and native (Java) MCL. RESULTS: Results are presented in the form of three scenarios of use: analysis of protein expression data using a recently published mouse interactome and a mouse microarray data set of nearly one hundred diverse cell/tissue types; the identification of protein complexes in the yeast Saccharomyces cerevisiae; and the cluster analysis of the vicinal oxygen chelate (VOC) enzyme superfamily. For scenario one, we explore functionally enriched mouse interactomes specific to particular cellular phenotypes and apply fuzzy clustering. For scenario two, we explore the prefoldin complex in detail using both physical and genetic interaction clusters. For scenario three, we explore the possible annotation of a protein as a methylmalonyl-CoA epimerase within the VOC superfamily. Cytoscape session files for all three scenarios are provided in the Additional Files section. CONCLUSIONS: The Cytoscape plugin clusterMaker provides a number of clustering algorithms and visualizations that can be used independently or in combination for analysis and visualization of biological data sets, and for confirming or generating hypotheses about biological function. Several of these visualizations and algorithms are only available to Cytoscape users through the clusterMaker plugin. clusterMaker is available via the Cytoscape plugin manager.

Complex Analysis



 

That is all.


 

Links and resources

 


Footnotes and references


 

Ask, if things don't work for you!

If anything about the assignment is not clear to you, please ask on the mailing list. You can be certain that others will have had similar problems. Success comes from joining the conversation.


