Revision as of 03:40, 28 October 2012

Sequence analysis


This page is a placeholder, or under current development; it is here principally to establish the logical framework of the site. The material on this page is correct, but incomplete.


Summary ...



 

Contents

Examples

Motifs

Disorder

Signal peptides

Bendtsen et al. (2004) Improved prediction of signal peptides: SignalP 3.0. J Mol Biol 340:783-95. (pmid: 15223320)

We describe improvements of the currently most popular method for prediction of classically secreted proteins, SignalP. SignalP consists of two different predictors based on neural network and hidden Markov model algorithms, where both components have been updated. Motivated by the idea that the cleavage site position and the amino acid composition of the signal peptide are correlated, new features have been included as input to the neural network. This addition, combined with a thorough error-correction of a new data set, have improved the performance of the predictor significantly over SignalP version 2. In version 3, correctness of the cleavage site predictions has increased notably for all three organism groups, eukaryotes, Gram-negative and Gram-positive bacteria. The accuracy of cleavage site prediction has increased in the range 6-17% over the previous version, whereas the signal peptide discrimination improvement is mainly due to the elimination of false-positive predictions, as well as the introduction of a new discrimination score for the neural network. The new method has been benchmarked against other available methods. Predictions can be made at the publicly available web server

Emanuelsson et al. (2007) Locating proteins in the cell using TargetP, SignalP and related tools. Nat Protoc 2:953-71. (pmid: 17446895)

Determining the subcellular localization of a protein is an important first step toward understanding its function. Here, we describe the properties of three well-known N-terminal sequence motifs directing proteins to the secretory pathway, mitochondria and chloroplasts, and sketch a brief history of methods to predict subcellular localization based on these sorting signals and other sequence properties. We then outline how to use a number of internet-accessible tools to arrive at a reliable subcellular localization prediction for eukaryotic and prokaryotic proteins. In particular, we provide detailed step-by-step instructions for the coupled use of the amino-acid sequence-based predictors TargetP, SignalP, ChloroP and TMHMM, which are all hosted at the Center for Biological Sequence Analysis, Technical University of Denmark. In addition, we describe and provide web references to other useful subcellular localization predictors. Finally, we discuss predictive performance measures in general and the performance of TargetP and SignalP in particular.

Petersen et al. (2011) SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat Methods 8:785-6. (pmid: 21959131)


Secondary Structure

Pirovano & Heringa (2010) Protein secondary structure prediction. Methods Mol Biol 609:327-48. (pmid: 20221928)

While the prediction of a native protein structure from sequence continues to remain a challenging problem, over the past decades computational methods have become quite successful in exploiting the mechanisms behind secondary structure formation. The great effort expended in this area has resulted in the development of a vast number of secondary structure prediction methods. Especially the combination of well-optimized/sensitive machine-learning algorithms and inclusion of homologous sequence information has led to increased prediction accuracies of up to 80%. In this chapter, we will first introduce some basic notions and provide a brief history of secondary structure prediction advances. Then a comprehensive overview of state-of-the-art prediction methods will be given. Finally, we will discuss open questions and challenges in this field and provide some practical recommendations for the user.
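
Accuracy figures for secondary structure prediction, such as the one quoted in the abstract above, are commonly reported as a per-residue, three-state score (often called Q3). Reading the quoted figure that way is an assumption, since the abstract does not name the measure. A minimal Python sketch of such a score follows; the function name `q3_accuracy` and the example strings are hypothetical and serve only as illustration.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Return the fraction of residues whose predicted state matches the observed one.

    Both arguments are strings over a three-state alphabet
    (H = helix, E = strand, C = coil), one character per residue.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have equal length")
    if not observed:
        raise ValueError("empty input")
    # Count positions where the two state strings agree.
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Made-up state strings, not real predictions or annotations.
observed = "CCHHHHHHCCEEEECC"
predicted = "CCHHHHHCCCEEEEEC"
print(f"Q3 = {q3_accuracy(predicted, observed):.2%}")
```

A per-residue score like this says nothing about whether whole helices or strands are placed correctly, so it is best read as a coarse summary.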


Transmembrane Helices

BENCHMARK OF MEMBRANE HELIX PREDICTIONS FROM SEQUENCE

Location

Integrated tools

Ooi et al. (2009) ANNIE: integrated de novo protein sequence annotation. Nucleic Acids Res 37:W435-40. (pmid: 19389726)

Function prediction of proteins with computational sequence analysis requires the use of dozens of prediction tools with a bewildering range of input and output formats. Each of these tools focuses on a narrow aspect and researchers are having difficulty obtaining an integrated picture. ANNIE is the result of years of close interaction between computational biologists and computer scientists and automates an essential part of this sequence analytic process. It brings together over 20 function prediction algorithms that have proven sufficiently reliable and indispensable in daily sequence analytic work and are meant to give scientists a quick overview of possible functional assignments of sequence segments in the query proteins. The results are displayed in an integrated manner using an innovative AJAX-based sequence viewer. ANNIE is available online at: http://annie.bii.a-star.edu.sg. This website is free and open to all users and there is no login requirement.

http://annie.bii.a-star.edu.sg


   

Further reading and resources