Expected Preparations:

  • The Central Dogma: Regulation of transcription and translation; Protein biosynthesis and degradation; Quality control.
  • [BIN-EXPR] Analysis
  • [BIN] Data_integration

If you are not already familiar with the prior knowledge listed above, you need to prepare yourself from other information sources. The units listed above are part of this course and contain important preparatory material.

Keywords: NCBI GEO: finding and analyzing expression profiles

Objectives:

This unit will …

  • … introduce the contents and utilities of the GEO mRNA expression database.

Outcomes:

After working through this unit you …

  • … can access GEO, find expression datasets and analyze them with the provided tools.


Deliverables:

Time management: Before you begin, estimate how long it will take you to complete this unit. Then, record in your course journal: the number of hours you estimated, the number of hours you worked on the unit, and the amount of time that passed between start and completion of this unit.

Journal: Document your progress in your Course Journal. Some tasks may ask you to include specific items in your journal. Don’t overlook these.

Insights: If you find something particularly noteworthy about this unit, make a note in your insights! page.


Evaluation:

NA: This unit is not evaluated for course marks.

Contents

Introduction to the contents and utilities of the GEO mRNA expression database.

 

The transcriptome is the set of a cell’s mRNA molecules. It originates mostly from the genome, and it results mostly in the proteome. RNA that is transcribed from the genome is not yet fit for translation but must first be processed: splicing is ubiquitous[1], and RNA editing has additionally been encountered in many species. Some authors therefore refer to the exome (the set of transcribed exons) to indicate the actual coding sequence.

Microarray technology (the quantitative, sequence-specific hybridization of labelled nucleotides in chip format) was the first domain of “high-throughput biology”. Today, it has largely been replaced by RNA-seq: quantification of transcribed mRNA by high-throughput sequencing and mapping of reads to genes. Quantifying gene expression levels in a tissue-, development-, or response-specific way has yielded detailed insight into cellular function at the molecular level, and recent single-cell sequencing experiments add a new level of precision. But not all transcripts map to genes: we increasingly realize that the transcriptome is not merely a passive buffer of expressed information on its way to being translated into proteins, but contains multiple levels of complex regulation through hybridization of small nuclear RNAs[2].
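As a minimal illustration of what quantifying expression from mapped reads can look like, the sketch below scales raw per-gene read counts to counts per million (CPM). CPM is only one common normalization, chosen here for illustration; this unit does not prescribe it, and the gene names and counts are hypothetical.

```python
def counts_per_million(counts):
    """Scale raw per-gene read counts to counts per million mapped reads."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample contains no mapped reads")
    return {gene: n * 1_000_000 / total for gene, n in counts.items()}


# Hypothetical read counts for three genes in a single sample
raw = {"geneA": 150, "geneB": 50, "geneC": 800}
cpm = counts_per_million(raw)
print(cpm)
```

Dividing by the total number of mapped reads makes samples with different sequencing depths comparable; it does not correct for gene length.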

NCBI’s GEO database stores expression data and experiment metadata and makes them publicly available.

Task…

  • Navigate to the GEO database;
  • Open the article below;
  • Read through the article and explore the features it discusses on the GEO Website.

Clough, Emily and Tanya Barrett. (2016). “The Gene Expression Omnibus Database”. Methods in Molecular Biology (Clifton, N.J.) 1418:93–110.
[PMID: 27008011] [DOI: 10.1007/978-1-4939-3578-9_5]

The Gene Expression Omnibus (GEO) database is an international public repository that archives and freely distributes high-throughput gene expression and other functional genomics data sets. Created in 2000 as a worldwide resource for gene expression studies, GEO has evolved with rapidly changing technologies and now accepts high-throughput data for many other data applications, including those that examine genome methylation, chromatin structure, and genome-protein interactions. GEO supports community-derived reporting standards that specify provision of several critical study elements including raw data, processed data, and descriptive metadata. The database not only provides access to data for tens of thousands of studies, but also offers various Web-based tools and strategies that enable users to locate data relevant to their specific interests, as well as to visualize and analyze the data. This chapter includes detailed descriptions of methods to query and download GEO data and use the analysis and visualization tools. The GEO homepage is at http://www.ncbi.nlm.nih.gov/geo/.
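Besides the Web interface, GEO records can also be searched programmatically. The following is a minimal sketch, not a method taken from the article: it assumes NCBI's E-utilities ESearch endpoint with the Entrez database name gds (GEO DataSets) and JSON output. The helper names (build_geo_search_url, parse_esearch, search_geo) and the sample response with its IDs are hypothetical and serve only to illustrate the idea.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities ESearch endpoint
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_geo_search_url(term, retmax=20):
    """Build an ESearch URL that queries the GEO DataSets database (db=gds)."""
    params = {"db": "gds", "term": term, "retmax": retmax, "retmode": "json"}
    return EUTILS_ESEARCH + "?" + urlencode(params)


def parse_esearch(payload):
    """Return (total hit count, list of record IDs) from an ESearch JSON reply."""
    result = json.loads(payload)["esearchresult"]
    return int(result["count"]), list(result["idlist"])


def search_geo(term, retmax=20):
    """Run the query against NCBI (requires network access)."""
    with urlopen(build_geo_search_url(term, retmax)) as response:
        return parse_esearch(response.read().decode("utf-8"))


# Hypothetical reply, shaped like ESearch JSON output, so that the parsing
# step can be tried without network access. The IDs are made up.
SAMPLE_PAYLOAD = json.dumps(
    {"esearchresult": {"count": "2", "idlist": ["200000001", "200000002"]}}
)

print(build_geo_search_url("cell cycle", retmax=5))
print(parse_esearch(SAMPLE_PAYLOAD))
```

With network access, a call such as search_geo("cell cycle", retmax=5) would return the number of matching records and up to five record IDs, which can then be looked up on the GEO Website.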

 

Questions, comments

If in doubt, ask! If anything about this content is not clear to you, do not proceed but ask for clarification. If you have ideas about how to make this material better, let’s hear them. We are aiming to compile a list of FAQs for all learning units, and your contributions will count towards your participation marks.

Improve this page! If you have questions or comments, please post them on the Quercus Discussion board with a subject line that includes the name of the unit.

References

Page ID: BIN-EXPR-GEO

Author:
Boris Steipe ( <boris.steipe@utoronto.ca> )
Created:
2017-08-05
Last modified:
2022-09-14
Version:
1.1
Version History:
–  1.1 2020 Updates
–  1.0 First live version
–  0.1 First stub
Tagged with:
–  Unit
–  Live

 

[END]


  1. Strictly speaking, splicing is a eukaryotic achievement; however, there are examples of splicing in prokaryotes as well.

  2. ↩︎