Expected Preparations:
Keywords: Structural domains; CATH domain database; SCOP domain database; CDART domain database
Objectives:
This unit will …
Outcomes:
After working through this unit you …
Deliverables:
Time management: Before you begin, estimate how long it will take you to complete this unit. Then, record in your course journal: the number of hours you estimated, the number of hours you worked on the unit, and the amount of time that passed between start and completion of this unit.
Journal: Document your progress in your Course Journal. Some tasks may ask you to include specific items in your journal. Don’t overlook these.
Insights: If you find something particularly noteworthy about this unit, make a note in your insights! page.
Evaluation: NA: This unit is not evaluated for course marks.
Structural definition of domains allows a classification of protein structures, which in turn supports the discovery of distant relationships.
Task…
The Annotations tab of PDB entries lets you search by SCOP, CATH and InterPro annotations: clicking on one of the categories finds all PDB structures that share that category. But let’s have a quick look at CATH itself.
Task…
There are not many members of this family, so the information we get is not much more than what we got at the PDB. But we can see where our domain falls in the CATH hierarchy by noting the CATH ID 3.10.260.10.
Through this exploration you can get a sense of where this fold fits in the “structural domain universe”.
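A CATH ID such as 3.10.260.10 encodes a path through the hierarchy: each additional number descends one level, from Class to Architecture to Topology to Homologous superfamily. The following minimal Python sketch splits an ID into these cumulative nodes. The function name cath_lineage is only an illustration and is not part of any CATH tool.

```python
# The four levels encoded in a CATH ID, from most general to most specific.
CATH_LEVELS = ("Class", "Architecture", "Topology", "Homologous superfamily")


def cath_lineage(cath_id):
    """Return (level name, cumulative ID) pairs for a four-part CATH ID."""
    parts = cath_id.strip().split(".")
    if len(parts) != len(CATH_LEVELS) or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected four dot-separated integers, got {cath_id!r}")
    # Each level is identified by the ID truncated to that depth.
    return [(level, ".".join(parts[: i + 1])) for i, level in enumerate(CATH_LEVELS)]


for level, node in cath_lineage("3.10.260.10"):
    print(f"{level:<24}{node}")
```

Each node printed this way (3, 3.10, 3.10.260, 3.10.260.10) corresponds to one level you can browse at CATH to see which other folds share that branch.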
Task…
What precisely constitutes an APSES domain is a matter of definition, as we will explore in the following task.
Task…
Access the InterPro information page for Mbp1 at the EBI: http://www.ebi.ac.uk/interpro/protein/P39678
Mouse over the domain annotations and note down the residue ranges for the annotated domains covering the N-terminus. You should find:
Follow the links to the respective InterPro and Pfam domain definition pages and read about the domain. Each domain definition describes essentially the same biomolecule, but they have distinct and partially overlapping sequence ranges.
Navigate to the NCBI page for the Mbp1 protein and click on CDD Search Results.
Hover over the Pfam KilA-N annotation in the linked page, note the highlight in the table below, and note down the annotated range - called “Interval” on this page. Hint: it is different from the annotation you find at Interpro.
Open ChimeraX and load the 1BM8 structure.
Type camera sbs to turn side-by-side stereo viewing on.
Select the entire protein chain and colour it white (residues 4 to 102, practically identical to the IPR003163 APSES domain definition).
Next, use the “Sequence Window” to select specific residue ranges:
Choose Tools ▸ Sequence ▸ Show Sequence Viewer to open the sequence window, select the sequence corresponding to the IPR018004 (KilA-N) annotation and colour this fragment yellow. You can get the sequence number of a residue in the sequence window when you hover the pointer over it, but do confirm that the sequence numbering that ChimeraX displays matches the numbering of the InterPro domain definition.
Then select the residue range of Pfam 04383, the KilA-N domain as defined by CDD, and colour that fragment orange.
Finally, choose the residues for PF04383, the KilA-N domain as defined by InterPro, and colour them red.
Study this in a side-by-side stereo view and get a sense for how the extra sequence beyond the KilA-N domain(s) is part of the structure, and how the integrity of the folded structure would be affected if these fragments were missing.
Type size stickRadius 0.4 to give the bonds more volume.
Orient the protein well, save the resulting image as a JPEG and upload it to your Journal on the Wiki.
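If you prefer typing commands to clicking in the sequence window, the colouring steps above can be sketched as a short list of ChimeraX commands, generated here with Python. This is only a sketch under assumptions: the three residue ranges are invented placeholders that you must replace with the ranges you noted down, the chain ID A and the output file name are assumed, and the helper colour_commands is hypothetical. Check the command syntax against the ChimeraX documentation before relying on it.

```python
def colour_commands(pdb_id, chain, regions, image_file):
    """Build ChimeraX command lines that colour residue ranges in the given order."""
    lines = [
        f"open {pdb_id}",          # fetch the structure from the PDB
        "camera sbs",              # side-by-side stereo
        f"color /{chain} white",   # whole chain white first
    ]
    for start, end, colour in regions:
        if start > end:
            raise ValueError(f"invalid range {start}-{end}")
        # Later ranges are painted over earlier ones where they overlap.
        lines.append(f"color /{chain}:{start}-{end} {colour}")
    lines.append("size stickRadius 0.4")
    lines.append(f"save {image_file}")
    return lines


# PLACEHOLDER ranges, not real annotations: substitute the ranges you recorded.
placeholder_regions = [
    (20, 90, "yellow"),  # stands in for IPR018004
    (25, 85, "orange"),  # stands in for the CDD range
    (30, 80, "red"),     # stands in for the InterPro PF04383 range
]

for command in colour_commands("1bm8", "A", placeholder_regions, "mbp1_domains.jpg"):
    print(command)
```

The printed lines can be typed one by one at the ChimeraX command line. Orient the protein before issuing the final save command.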
There is a rather important lesson in this: domain definitions may be fluid, their boundaries may be computationally derived from sequence comparisons across many families, and they do not necessarily correspond to the situation in specific structures. In our example, you saw that the more restrictive KilA-N domain definitions omit two beta-strands at the N-terminus that are well integrated with the rest of the structure, and that may even modulate DNA binding through their interactions with the back of the “wing” domain. Database definitions of structural domains are important guides, but they cannot replace your detailed judgement. Make sure you understand this well.
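To make the comparison concrete, you can compute which residues of the structure a given domain definition leaves out. The sketch below assumes inclusive residue ranges. The structure range 4 to 102 is the one used in the task, while the domain range is a made-up placeholder to be replaced with the ranges from your notes. The function name uncovered is hypothetical.

```python
def uncovered(structure, domain):
    """Return the inclusive residue ranges of `structure` not covered by `domain`."""
    s_start, s_end = structure
    # Clip the domain definition to the residues present in the structure.
    d_start = max(domain[0], s_start)
    d_end = min(domain[1], s_end)
    if d_start > d_end:
        # No overlap at all: the whole structure is uncovered.
        return [structure]
    gaps = []
    if d_start > s_start:
        gaps.append((s_start, d_start - 1))  # N-terminal residues left out
    if d_end < s_end:
        gaps.append((d_end + 1, s_end))      # C-terminal residues left out
    return gaps


structure_range = (4, 102)    # residues in the 1BM8 chain, as used in the task
placeholder_domain = (20, 90)  # PLACEHOLDER, not a real annotation

print(uncovered(structure_range, placeholder_domain))
```

Running this once per domain definition shows how many residues at each terminus fall outside that definition, which you can then compare with the secondary structure elements you see in ChimeraX.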
Dawson, Natalie L et al. (2017). “CATH: an expanded resource to predict protein function through structure and sequence”. Nucleic Acids Research 45(D1):D289–D295. [PMID: 27899584] [DOI: 10.1093/nar/gkw1098]
Das, Sayoni and Christine A Orengo. (2016). “Protein function annotation using protein domain family resources”. Methods (San Diego, Calif.) 93:24–34. [PMID: 26434392] [DOI: 10.1016/j.ymeth.2015.09.029]
Sillitoe, Ian et al. (2015). “CATH: comprehensive structural and functional annotations for genome sequences”. Nucleic Acids Research 43(Database issue):D376–81. [PMID: 25348408] [DOI: 10.1093/nar/gku947]
If in doubt, ask! If anything about this content is not clear to you, do not proceed but ask for clarification. If you have ideas about how to make this material better, let’s hear them. We are aiming to compile a list of FAQs for all learning units, and your contributions will count towards your participation marks.
Improve this page! If you have questions or comments, please post them on the Quercus Discussion board with a subject line that includes the name of the unit.
[END]