Enrichment

Enrichment Analysis

This page is a placeholder or still under development; it is here principally to establish the logical framework of the site. The material on this page is correct but incomplete.

Enrichment analysis addresses the question: do genes in a set have a non-trivial property in common? The methodologies discussed here have applications in many fields of computational biology.

Introductory reading

Tilford & Siemers (2009) Gene set enrichment analysis. Methods Mol Biol 563:99-121. (pmid: 19597782)

Relative Enrichment

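Relative Enrichment is the ratio of the fraction of elements of interest in an observed set to the fraction of elements of interest in a reference set. A value above 1 means the elements of interest are over-represented in the observed set; a value below 1 means they are under-represented.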
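The definition above can be written out as a short calculation. This is a minimal sketch; the function name, argument names and the example counts are hypothetical and chosen only for illustration.

```python
def relative_enrichment(observed_hits, observed_size, reference_hits, reference_size):
    """Ratio of the fraction of elements of interest in the observed set
    to the fraction of elements of interest in the reference set."""
    if observed_size <= 0 or reference_size <= 0:
        raise ValueError("set sizes must be positive")
    if reference_hits <= 0:
        raise ValueError("reference set contains no elements of interest")
    observed_fraction = observed_hits / observed_size
    reference_fraction = reference_hits / reference_size
    return observed_fraction / reference_fraction


# Hypothetical counts: 12 of 100 observed genes carry an annotation,
# versus 300 of 10000 genes in the reference set.
ratio = relative_enrichment(12, 100, 300, 10000)
print(round(ratio, 2))
```

The ratio alone says nothing about statistical significance: with small sets, a large ratio can easily arise by chance.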

Functional Annotation Analysis (FAA)

Functional Annotation Analysis (FAA) analyses the enrichment of properties in a set of genes. Such properties may include GO terms, EC codes, membership in pathways, coregulation, etc. A good resource for FAA is the DAVID database and server.
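One common way to judge whether an annotation is enriched in a gene list is a one-sided hypergeometric test, which is equivalent to a one-sided Fisher's exact test. The sketch below illustrates that general idea only; it is not claimed to be the exact statistic that DAVID reports, and the function name and example counts are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn without replacement from a
    background of N genes, of which K carry the annotation."""
    if not (0 <= K <= N and 0 <= n <= N):
        raise ValueError("inconsistent counts")
    upper = min(n, K)  # largest possible number of annotated genes in the list
    total = comb(N, n)  # number of ways to draw any n genes
    tail = sum(
        comb(K, i) * comb(N - K, n - i)  # ways to draw exactly i annotated genes
        for i in range(max(k, 0), upper + 1)
    )
    return tail / total


# Hypothetical counts: background of 20 genes, 5 annotated;
# a list of 5 genes contains 3 annotated ones.
p_value = hypergeom_upper_tail(3, 5, 5, 20)
print(p_value)
```

When many annotation terms are tested against the same gene list, the resulting p-values need a multiple-testing correction before they are interpreted.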

Gene Set Enrichment Analysis (GSEA)

Gene Set Enrichment Analysis (GSEA) analyses the enrichment of members of a predefined gene set in a set of experimentally observed genes. Such predefined gene sets may come from pathway components, interaction clusters, genes that have particular transcription factor binding sites in common, etc. The default resource is the GSEA software, distributed via the Broad Institute GSEA homepage.
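The core statistic of GSEA, as described by Subramanian et al. (2005), is a running sum over a ranked gene list: it increases when a gene belongs to the predefined set and decreases otherwise, and the enrichment score is the maximum deviation from zero. The following is a minimal sketch of that running sum, not the GSEA software itself; the function name, parameter names and toy data are hypothetical, and the permutation step used to assess significance is omitted.

```python
def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Running-sum enrichment score over a ranked gene list.

    ranked_genes: gene identifiers, ordered by the ranking metric.
    scores: ranking metric for each gene, in the same order.
    gene_set: the predefined set being tested.
    p: weight exponent; p=0 gives an unweighted, Kolmogorov-Smirnov-like statistic.
    """
    in_set = [g in gene_set for g in ranked_genes]
    n_hits = sum(in_set)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")
    # Each hit contributes in proportion to |score|**p; misses contribute equally.
    hit_weights = [abs(s) ** p if hit else 0.0 for s, hit in zip(scores, in_set)]
    total_weight = sum(hit_weights)
    if total_weight == 0:
        raise ValueError("all hits have zero weight")
    running, best = 0.0, 0.0
    for weight, hit in zip(hit_weights, in_set):
        running += weight / total_weight if hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running  # keep the maximum deviation from zero, with its sign
    return best


# Toy data: six genes ranked from most up- to most down-regulated.
genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
metric = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
top_es = enrichment_score(genes, metric, {"g1", "g2"}, p=0)
bottom_es = enrichment_score(genes, metric, {"g5", "g6"}, p=0)
print(top_es, bottom_es)
```

A positive score indicates that the set is concentrated near the top of the ranking, a negative score that it is concentrated near the bottom.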

Exercises

Task:

Work through the DAVID tutorial published in Nature Protocols:

Huang et al. (2009) Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc 4:44-57. (pmid: 19131956)

Access the Web version of the article; it conveniently contains the required links.

For your analysis, use Demo List 2, which is provided on the DAVID site. Remember to read the description of the gene list.

Do not use any of the Java tools. As of this writing, Java applets in Web browsers are considered fundamentally insecure; Java should be disabled in your browser.

For each of the analysis steps, think clearly about whether the results support or contradict your expectations about the data. Feel free to discuss your expectations and findings on the mailing list.

If there are any problems with the assignment, contact me!

Further reading and resources

Tan et al. (2013) Network2Canvas: network visualization on a canvas with enrichment analysis. Bioinformatics 29:1872-8. (pmid: 23749960)

Takemasa et al. (2012) Potential biological insights revealed by an integrated assessment of proteomic and transcriptomic data in human colorectal cancer. Int J Oncol 40:551-9. (pmid: 22025299)

Merico et al. (2011) Visualizing gene-set enrichment results using the Cytoscape plug-in enrichment map. Methods Mol Biol 781:257-77. (pmid: 21877285)

Irizarry et al. (2009) Gene set enrichment analysis made simple. Stat Methods Med Res 18:565-75. (pmid: 20048385)

Abatangelo et al. (2009) Comparative study of gene set enrichment methods. BMC Bioinformatics 10:275. (pmid: 19725948)

BACKGROUND: The analysis of high-throughput gene expression data with respect to sets of genes rather than individual genes has many advantages. A variety of methods have been developed for assessing the enrichment of sets of genes with respect to differential expression. In this paper we provide a comparative study of four of these methods: Fisher's exact test, Gene Set Enrichment Analysis (GSEA), Random-Sets (RS), and Gene List Analysis with Prediction Accuracy (GLAPA). The first three methods use associative statistics, while the fourth uses predictive statistics. We first compare all four methods on simulated data sets to verify that Fisher's exact test is markedly worse than the other three approaches. We then validate the other three methods on seven real data sets with known genetic perturbations and then compare the methods on two cancer data sets where our a priori knowledge is limited. RESULTS: The simulation study highlights that none of the three method outperforms all others consistently. GSEA and RS are able to detect weak signals of deregulation and they perform differently when genes in a gene set are both differentially up and down regulated. GLAPA is more conservative and large differences between the two phenotypes are required to allow the method to detect differential deregulation in gene sets. This is due to the fact that the enrichment statistic in GLAPA is prediction error which is a stronger criteria than classical two sample statistic as used in RS and GSEA. This was reflected in the analysis on real data sets as GSEA and RS were seen to be significant for particular gene sets while GLAPA was not, suggesting a small effect size. We find that the rank of gene set enrichment induced by GLAPA is more similar to RS than GSEA. More importantly, the rankings of the three methods share significant overlap. 
CONCLUSION: The three methods considered in our study recover relevant gene sets known to be deregulated in the experimental conditions and pathologies analyzed. There are differences between the three methods and GSEA seems to be more consistent in finding enriched gene sets, although no method uniformly dominates over all data sets. Our analysis highlights the deep difference existing between associative and predictive methods for detecting enrichment and the use of both to better interpret results of pathway analysis. We close with suggestions for users of gene set methods.

Subramanian et al. (2005) Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci USA 102:15545-50. (pmid: 16199517)
