Expected Preparations:

  [BIN] Storing_data
 
  The units listed above are part of this course and contain important preparatory material.  

Keywords: Database principles for bioinformatics

Objectives:

To describe construction principles for database systems;

To introduce some general aspects of database use in bioinformatics;

To explore the current NAR database and Web service issues.

Outcomes:

You understand the need for the four ACID requirements to ensure “transactional” integrity of databases;

You are familiar with a spectrum of database and Web service offerings in bioinformatics.


Deliverables:

Time management: Before you begin, estimate how long it will take you to complete this unit. Then, record in your course journal: the number of hours you estimated, the number of hours you worked on the unit, and the amount of time that passed between start and completion of this unit.

Journal: Document your progress in your Course Journal. Some tasks may ask you to include specific items in your journal. Don’t overlook these.

Insights: If you find something particularly noteworthy about this unit, make a note in your insights! page.


Evaluation:

NA: This unit is not evaluated for course marks.

Contents

Large, scalable, multi-user database systems require a fair amount of technology under the hood. In particular, they need to fulfill the ACID requirements (atomicity, consistency, isolation, durability) that ensure database integrity. This unit introduces these principles, and then moves on to an overview of current bioinformatics databases and Web services.

In this unit we develop the technical context of bioinformatics databases and get a perspective on the multitude of data offerings in the field. Data and service offerings have no clearly defined boundaries, and many sites offer a mix of both. Thus we explore current Web services as well, to define the landscape.
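To make the ACID requirements mentioned above a little more concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is an illustration only, not part of the course material: the table `stock`, the lab names, and the functions `make_db`, `move_units` and `snapshot` are all hypothetical. The sketch shows atomicity (a transfer that fails halfway is rolled back as a whole) and consistency (a CHECK constraint rejects negative stock). Isolation and durability cannot be demonstrated meaningfully with a single in-memory connection and are not covered here.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical 'stock' table."""
    conn = sqlite3.connect(":memory:")
    # Consistency rule: a lab can never hold a negative number of units.
    conn.execute(
        "CREATE TABLE stock ("
        " lab TEXT PRIMARY KEY,"
        " units INTEGER NOT NULL CHECK (units >= 0))"
    )
    with conn:  # commit the initial rows as one transaction
        conn.executemany(
            "INSERT INTO stock VALUES (?, ?)",
            [("labA", 5), ("labB", 0)],
        )
    return conn


def move_units(conn, src, dst, n):
    """Move n units from src to dst as a single transaction.

    Returns True if the transfer was committed, False if it was rolled back.
    """
    try:
        # 'with conn' commits if the block succeeds and rolls back
        # every statement in the block if an exception is raised.
        with conn:
            conn.execute(
                "UPDATE stock SET units = units + ? WHERE lab = ?", (n, dst)
            )
            # This statement violates the CHECK constraint when src
            # does not hold enough units; the credit above is then undone.
            conn.execute(
                "UPDATE stock SET units = units - ? WHERE lab = ?", (n, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def snapshot(conn):
    """Return the current table contents as a {lab: units} dictionary."""
    return dict(conn.execute("SELECT lab, units FROM stock ORDER BY lab"))


if __name__ == "__main__":
    db = make_db()
    print(move_units(db, "labA", "labB", 3), snapshot(db))
    print(move_units(db, "labA", "labB", 10), snapshot(db))
```

In the second call the credit to `labB` is executed first and succeeds, but the debit from `labA` fails, so the database returns to the state it had before the call. Without a transaction, the table would be left with units that were credited but never debited.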

Task…

  1. Read the introductory notes on construction principles for large, multi-user, scalable database systems (PDF).
  2. Visit the current Database Issue of NAR and browse the titles.
  3. Read the editorial article in this issue:

    Rigden, Daniel J and Xosé M Fernández. (2021). “The 2021 Nucleic Acids Research database issue and the online molecular biology database collection”. Nucleic Acids Research 49(D1):D1–D9.
    [PMID: 33396976] [DOI: 10.1093/nar/gkaa1216]

    The 2021 Nucleic Acids Research database Issue contains 189 papers spanning a wide range of biological fields and investigation. It includes 89 papers reporting on new databases and 90 covering recent changes to resources previously published in the Issue. A further ten are updates on databases most recently published elsewhere. Seven new databases focus on COVID-19 and SARS-CoV-2 and many others offer resources for studying the virus. Major returning nucleic acid databases include NONCODE, Rfam and RNAcentral. Protein family and domain databases include COG, Pfam, SMART and Panther. Protein structures are covered by RCSB PDB and dispersed proteins by PED and MobiDB. In metabolism and signalling, STRING, KEGG and WikiPathways are featured, along with returning KLIFS and new DKK and KinaseMD, all focused on kinases. IMG/M and IMG/VR update in the microbial and viral genome resources section, while human and model organism genomics resources include Flybase, Ensembl and UCSC Genome Browser. Cancer studies are covered by updates from canSAR and PINA, as well as newcomers CNCdatabase and Oncovar for cancer drivers. Plant comparative genomics is catered for by updates from Gramene and GreenPhylDB. The entire Database Issue is freely available online on the Nucleic Acids Research website (https://academic.oup.com/nar). The NAR online Molecular Biology Database Collection has been substantially updated, revisiting nearly 1000 entries, adding 90 new resources and eliminating 86 obsolete databases, bringing the current total to 1641 databases. It is available at https://www.oxfordjournals.org/nar/database/c/.

  4. Visit the current Web Service Issue of NAR and browse the titles. Read the editorial article in this issue:

    No Authors Listed. (2021). “Editorial: the 19th annual Nucleic Acids Research web server issue 2021”. Nucleic Acids Research 49(W1):W1–W4.
    [PMID: 34212204] [DOI: 10.1093/nar/gkab525]

  5. For both issues: find one article that you find particularly interesting, intriguing, or surprising, cite it (correctly) in your journal, and comment on it.

Questions, comments

If in doubt, ask! If anything about this content is not clear to you, do not proceed but ask for clarification. If you have ideas about how to make this material better, let’s hear them. We are aiming to compile a list of FAQs for all learning units, and your contributions will count towards your participation marks.

Improve this page! If you have questions or comments, please post them on the Quercus Discussion board with a subject line that includes the name of the unit.

References

Page ID: BIN-Databases

Author:
Boris Steipe ( <boris.steipe@utoronto.ca> )
Created:
2017-08-05
Last modified:
2022-09-14
Version:
1.1
Version History:
–  1.1 Annual update … and a task
–  1.0 First live version.
–  0.1 First stub
Tagged with:
–  Unit
–  Live
–  Has lecture slides

 

[END]