Difference between revisions of "BIN-SX-Small molecules"

From "A B C"
m (Created page with "<div id="BIO"> <div class="b1"> "Small Molecule" Structure" </div> {{Vspace}} <div class="keywords"> <b>Keywords:</b>  A small-molecule structure tutorial </div...")
 
m
 
(24 intermediate revisions by the same user not shown)
Line 1: Line 1:
<div id="BIO">
+
<div id="ABC">
  <div class="b1">
+
<div style="padding:5px; border:1px solid #000000; background-color:#b3dbce; font-size:300%; font-weight:400; color: #000000; width:100%;">
 
"Small Molecule" Structure
 
"Small Molecule" Structure
  </div>
+
<div style="padding:5px; margin-top:20px; margin-bottom:10px; background-color:#b3dbce; font-size:30%; font-weight:200; color: #000000; ">
 +
(A small-molecule structure tutorial)
 +
</div>
 +
</div>
 +
 
 +
{{Smallvspace}}
  
  {{Vspace}}
+
 
 
+
<div style="padding:5px; border:1px solid #000000; background-color:#b3dbce33; font-size:85%;">
<div class="keywords">
+
<div style="font-size:118%;">
<b>Keywords:</b>&nbsp;
+
<b>Abstract:</b><br />
A small-molecule structure tutorial
+
<section begin=abstract />
 +
Working with small-molecule structures: creating a model, finding complexes in the PDB that contain the molecule, and superimposing model and structure.
 +
<section end=abstract />
 +
</div>
 +
<!-- ============================  -->
 +
<hr>
 +
<table>
 +
<tr>
 +
<td style="padding:10px;">
 +
<b>Objectives:</b><br />
 +
This unit will ...
 +
* ... introduce options to model small molecules;
 +
* ... demonstrate how to find PDB complexes that contain the molecule;
 +
* ... teach how to superimpose model and structure;
 +
</td>
 +
<td style="padding:10px;">
 +
<b>Outcomes:</b><br />
 +
After working through this unit you ...
 +
* ... can model small molecule structures and superimpose them onto cognate molecules in a protein structure ligand binding site.
 +
</td>
 +
</tr>
 +
</table>
 +
<!-- ============================  -->
 +
<hr>
 +
<b>Deliverables:</b><br />
 +
<section begin=deliverables />
 +
<li><b>Time management</b>: Before you begin, estimate how long it will take you to complete this unit. Then, record in your course journal: the number of hours you estimated, the number of hours you worked on the unit, and the amount of time that passed between start and completion of this unit.</li>
 +
<li><b>Journal</b>: Document your progress in your [[FND-Journal|Course Journal]]. Some tasks may ask you to include specific items in your journal. Don't overlook these.</li>
 +
<li><b>Insights</b>: If you find something particularly noteworthy about this unit, make a note in your [[ABC-Insights|'''insights!''' page]].</li>
 +
<section end=deliverables />
 +
<!-- ============================  -->
 +
<hr>
 +
<section begin=prerequisites />
 +
<b>Prerequisites:</b><br />
 +
You need the following preparation before beginning this unit. If you are not familiar with this material from courses you took previously, you need to prepare yourself from other information sources:<br />
 +
*<b>Biomolecules</b>: The molecules of life; nucleic acids and amino acids; the genetic code; protein folding; post-translational modifications and protein biochemistry; membrane proteins; biological function.
 +
This unit builds on material covered in the following prerequisite units:<br />
 +
*[[BIN-SX-Chimera|BIN-SX-Chimera (UCSF ChimeraX: Structure Visualization and Analysis)]]
 +
<section end=prerequisites />
 +
<!-- ============================  -->
 
</div>
 
</div>
  
{{Vspace}}
+
{{Smallvspace}}
 +
 
 +
 
 +
 
 +
{{Smallvspace}}
  
  
Line 19: Line 67:
  
  
{{STUB}}
+
=== Evaluation ===
  
{{Vspace}}
+
This learning unit can be evaluated for a maximum of 5 marks. To submit material for credit for this unit ...
 +
<ol>
 +
<li>Create a new page on the student Wiki as a subpage of your User Page.</li>
 +
<li>Put all of your writing to submit on this one page.</li>
 +
<li>When you are done with everything, go to the [https://q.utoronto.ca/courses/180416/assignments Quercus '''Assignments''' page] and open the first Learning Unit that you have not submitted yet. Paste the URL of your Wiki page into the form, and click on '''Submit Assignment'''.</li>
 +
</ol>
  
 +
Your link can be submitted only once and cannot be edited, but you may change your Wiki page at any time. However, only the last version before the due date will be marked. All later edits will be silently ignored.
  
</div>
+
{{Smallvspace}}
<div id="ABC-unit-framework">
 
== Abstract ==
 
<!-- included from "../components/BIN-SX-Small_molecules.components.wtxt", section: "abstract" -->
 
...
 
  
{{Vspace}}
+
;Short report option
 +
:'''1.''' Visit the [http://steipe.biochemistry.utoronto.ca/abc/students/index.php/BIN-SX-Small_molecules_choices '''small-molecule selection page on the Student Wiki'''] and choose one small molecule to work with.
 +
:'''2.''' Summarize what this molecule is. Draw the structure of the molecule in one of the molecular editors described in the unit and produce a SMILES string. Find a suitable PDB complex structure and superimpose your model in ChimeraX. Report on the quality of the superposition and create an informative stereo image that illustrates how your model is situated in the ligand binding site relative to the experimentally determined compound.
  
 +
;Perform all visualization and analysis on the command line (if you use the menu, the "log" will record the command line equivalent and you can copy it from there) and record your commands in a section of your report. The visualization must be fully reproducible from your documentation, i.e. when I copy/paste your commands, I need to get the same image that you are showing (except for rotation/translation/scaling).
  
== This unit ... ==
 
=== Prerequisites ===
 
<!-- included from "../components/BIN-SX-Small_molecules.components.wtxt", section: "prerequisites" -->
 
<!-- included from "ABC-unit_components.wtxt", section: "notes-external_prerequisites" -->
 
You need the following preparation before beginning this unit. If you are not familiar with this material from courses you took previously, you need to prepare yourself from other information sources:
 
<!-- included from "FND-prerequisites.wtxt", section: "biomolecules" -->
 
*<b>Biomolecules</b>: The molecules of life; nucleic acids and amino acids; the genetic code; protein folding; post-translational modifications and protein biochemistry; membrane proteins; biological function.
 
<!-- included from "ABC-unit_components.wtxt", section: "notes-prerequisites" -->
 
You need to complete the following units before beginning this one:
 
*[[BIN-PDB]]
 
  
{{Vspace}}
+
:'''3.''' When you are done, submit the link to your page via Quercus as described above.
  
 +
<!--
 +
; Quiz option
 +
: Open the [http://steipe.biochemistry.utoronto.ca/abc/students/index.php/Signup-BIN-PPI-Analysis_Quiz  '''signup-page for the quiz for this unit (linked from here)'''] and add your name. Your name must be signed up by 12:00 of the day of the Quiz to ensure copies of the quiz are available for all participants.
 +
#include("./data/ABC-unit_components.txt", section = "quiz-mechanics")
 +
-->
 +
<!--
 +
; R-code option
 +
:Submit code according to the following requirements. Make sure your code is documented.
 +
-->
 +
<!--
 +
; Option to write a "Self-Evaluation Question"
 +
: You can submit  a "Self-Evaluation Question" for at most '''one''' of your  assignments.
 +
: Write a "Self-evaluation Question" (with a model solution) that explores a significant, non-trivial aspect of studying how to work with EBI resources within this learning unit. Ensure that the question is feasible, given the existing content of the unit - or coordinate an extension of the contents with your instructor. Ensure your question pursues a high-level learning goal, it should allow others to demonstrate understanding, critical analysis, and/or the capacity to integrate and synthesize knowledge, not merely test memorization. Ensure that your question is specific, not ambiguous, vague or tangential to the contents. Ensure you are testing '''valuable''' knowledge and skills, not Cargo Cult. Apply the [[ABC-Rubrics| '''marking rubrics''']] in spirit to satisfy yourself of the quality of your contribution. Obviously, details of evaluation will vary with the question. Use the format and code templates that you find on the [[Self_evaluation_questions|'''Self evaluation questions page''']] -  but don't assume those examples are already models of excellent contributions. Note: assume that approximately the same amount of work is expected for all evaluation options. Consequently, the standard of excellence for this option will be quite high.
 +
:# Create a new page on the student Wiki as a subpage of your User Page. Develop your question there.
 +
:# When you are done with everything, add the following category tag '''to the end of page''':
 +
::<code><nowiki>[[Category:EVAL-BIN-SX-Small_molecules]]</nowiki></code>.
  
=== Objectives ===
+
Once the page has been saved with this tag, it is considered "submitted".
<!-- included from "../components/BIN-SX-Small_molecules.components.wtxt", section: "objectives" -->
+
'''Do not''' change your submission after this tag has been added. The page will be marked and the category tag will be removed by the instructor.-->
...
+
== Contents ==
 
 
{{Vspace}}
 
  
 +
{{Task|1=
 +
*Read the introductory notes on {{ABC-PDF|BIN-SX-Small_molecules|working with "small molecule" structure}}.
 +
}}
  
=== Outcomes ===
+
<!-- cf. Phil Fradkin's ChemoGenomics BCB410-2015 -->
<!-- included from "../components/BIN-SX-Small_molecules.components.wtxt", section: "outcomes" -->
 
...
 
  
 
{{Vspace}}
 
{{Vspace}}
  
 +
=== Modeling small molecules ===
  
=== Deliverables ===
+
"Small" molecules are solvent, ligands, substrates, products, prosthetic groups, drugs - in short, essentially everything that is not made by DNA-, RNA-polymerases or the ribosome. Whereas the biopolymers are still front and centre in our quest to understand molecular biology, small molecules are crucial for our quest to interact with the inventory of the cell, create useful products, or advance medicine.
<!-- included from "../components/BIN-SX-Small_molecules.components.wtxt", section: "deliverables" -->
 
<!-- included from "ABC-unit_components.wtxt", section: "deliverables-time_management" -->
 
*<b>Time management</b>: Before you begin, estimate how long it will take you to complete this unit. Then, record in your course journal: the number of hours you estimated, the number of hours you worked on the unit, and the amount of time that passed between start and completion of this unit.
 
<!-- included from "ABC-unit_components.wtxt", section: "deliverables-journal" -->
 
*<b>Journal</b>: Document your progress in your [[FND-Journal|course journal]].
 
<!-- included from "ABC-unit_components.wtxt", section: "deliverables-insights" -->
 
*<b>Insights</b>: If you find something particularly noteworthy about this unit, make a note in your [[ABC-Insights|insights! page]].
 
  
{{Vspace}}
+
A number of public repositories make small-molecule information available, such as [http://pubchem.ncbi.nlm.nih.gov/ PubChem] at the NCBI, the ligand collection at the [http://pdb.org '''PDB'''], the [http://www.ebi.ac.uk/chebi/ ChEBI] database at the European Bioinformatics Institute, the Canadian [http://www.drugbank.ca DrugBank], or the [http://cactus.nci.nih.gov/ncidb2.2/ NCI database browser] at the US National Cancer Institute. One general way to export topology information from these services is to use {{WP|SMILES|SMILES strings}}&mdash;a shorthand notation for the composition and topology of chemical compounds.
  
  
=== Evaluation ===
+
{{Task|1=
<!-- included from "../components/BIN-SX-Small_molecules.components.wtxt", section: "evaluation" -->
+
;Caffeine at PubChem
<!-- included from "ABC-unit_components.wtxt", section: "eval-none" -->
+
# Access [http://pubchem.ncbi.nlm.nih.gov/ PubChem].
<b>Evaluation: NA</b><br />
+
# Enter "caffeine" as a search term in the '''Compound''' tab. A number of matches to this keyword search are returned.
:This unit is not evaluated for course marks.
+
# Click on the [http://pubchem.ncbi.nlm.nih.gov/compound/2519 top hit - 1,3,7-Trimethylxanthine, the Caffeine molecule]. Note that the page contains among other items:
 +
## A 2D structural sketch;
 +
## An idealized 3D structural conformer, for which you can download coordinates in several formats;
 +
## The IUPAC name: <tt>1,3,7-trimethylpurine-2,6-dione</tt>;
 +
## The CAS identifier <code>58-08-2</code> which is a unique identifier and can be used as a cross-reference ID;
 +
## The {{WP|SMILES|SMILES strings|SMILES string}} <code>CN1C{{=}}NC2{{=}}C1C({{=}}O)N(C({{=}}O)N2C)C</code>;
 +
## ... and much more.
 +
}}
  
{{Vspace}}
 
  
 +
{{Task|1=
 +
;Caffeine at DrugBank
 +
# Access [https://www.drugbank.ca DrugBank].
 +
# Enter "Caffeine" in the search form and submit the search.
 +
# Click on the [https://www.drugbank.ca/drugs/DB00201 hit to "Caffeine" itself]. Note that the page contains among other items:
 +
## A detailed description
 +
## A 2D structural sketch with a link to 3D options;
 +
## Synonyms, including the IUPAC name: <tt>1,3,7-trimethylpurine-2,6-dione</tt>;
 +
## ... and much more.
 +
# Follow the link [https://www.drugbank.ca/structures/small_molecule_drugs/DB00201 to the 3D options] and note the options for downloading information, including the SMILES string and PDB formatted coordinates.
 +
}}
  
</div>
 
<div id="BIO">
 
== Contents ==
 
<!-- included from "../components/BIN-SX-Small_molecules.components.wtxt", section: "contents" -->
 
...
 
  
{{Vspace}}
+
That's great, but let's sketch our own version of caffeine. Several versions of Peter Ertl's {{WP|JME_editor|Java Molecular Editor (JME)}} are offered online; PubChem offers this functionality via its '''Sketcher''' tool, and the PDB has a similar sketching tool on its ligand search page.
  
 +
{{Task|1=
 +
# Return to the [http://pubchem.ncbi.nlm.nih.gov/ PubChem homepage].
 +
# Click on '''Draw Structure'''.
 +
# Sketch the structure of caffeine. I find the editor quite intuitive but clicking on the '''Help''' button will give you a quick, structured overview. Make sure you define your double-bonds correctly.
 +
# '''Export''' the SMILES string of your compound to your project folder.
 +
}}
  
== Further reading, links and resources ==
 
<!-- {{#pmid: 19957275}} -->
 
<!-- {{WWW|WWW_GMOD}} -->
 
<!-- <div class="reference-box">[http://www.ncbi.nlm.nih.gov]</div> -->
 
  
{{Vspace}}
+
=== Translating SMILES to structure ===
  
 +
ChimeraX can translate SMILES strings to coordinates<ref>There are also several online servers that translate SMILES strings to idealized structures, for example the [http://cactus.nci.nih.gov/translate/ online SMILES translation service] at the NCI.</ref>.
  
== Notes ==
+
{{Task|1=
<!-- included from "../components/BIN-SX-Small_molecules.components.wtxt", section: "notes" -->
+
# Open ChimeraX.
<!-- included from "ABC-unit_components.wtxt", section: "notes" -->
+
# Select '''Tools''' &rarr; '''Structure&nbsp;Editing''' &rarr; '''Build&nbsp;Structure'''.
<references />
+
# In the '''Build Structure''' window, select the '''SMILES string''' button, paste the string from your file, and click '''Apply'''.
 +
# The caffeine molecule will be generated and visualized in the graphics window. This is a "stick" representation.
 +
# You can rotate it with your mouse, pinch to scale, <shift> drag to translate.
 +
# Use the '''Actions''' &rarr; '''Atoms/Bonds''' &rarr; '''ball &amp; stick''' or '''sphere''' menu items to change appearance.
 +
# Use the '''Actions''' &rarr; '''Color''' &rarr; '''by element''' menu to change colors.
 +
# Change the display back to stick and use '''Actions''' &rarr; '''Surface''' &rarr; '''show''' to add a solvent accessible surface. Choosing this command triggers the calculation of the surface, which is then available as an individually selectable object. With default parameters, the surface is a bit rough for this small molecule. Type <code>surface gridSpacing 0.1</code> to increase the resolution five-fold from its default 0.5&nbsp;Å.
 +
# By default, the surface inherits the colour of the atoms it envelops. To change the colour of the surface, use the '''Actions''' &rarr; '''Color''' &rarr; '''all options''' menu. Click the '''surfaces''' button to indicate that the color choice should be applied to the surface object (note what else you can apply color to...), then choose '''cornflower blue'''.
 +
# Use the '''Actions''' &rarr; '''Surface''' &rarr; '''transparency''' &rarr; '''50%''' menu to see atoms and bonds that are covered by the surface. I find a soft lighting usually works best: <code>lighting soft</code>
 +
# To begin working with molecules in "true" 3D, type <code>camera sbs</code>.
 +
# Your structure should look something like what you see below.
  
{{Vspace}}
 
  
 +
{{Stereo|Caffeine_stereo.jpg|'''Wall-eye stereo view''' of the caffeine structure, surrounded by a transparent molecular surface. The image for the left eye is on the left side.
 +
}}
  
</div>
 
<div id="ABC-unit-framework">
 
== Self-evaluation ==
 
<!-- included from "../components/BIN-SX-Small_molecules.components.wtxt", section: "self-evaluation" -->
 
<!--
 
=== Question 1===
 
  
Question ...
+
}}
  
<div class="toccolours mw-collapsible mw-collapsed" style="width:800px">
 
Answer ...
 
<div class="mw-collapsible-content">
 
Answer ...
 
  
</div>
+
===Superposition===
  </div>
 
  
  {{Vspace}}
+
To investigate a small molecule structure variant in the context of a complex, we need to superimpose it with an existing ligand.
  
-->
+
{{Task|1=
 +
* Open the structure <tt>3G6M</tt> in ChimeraX. This is one of the hits returned from the PDB search for caffeine - a fungal chitinase for which caffeine is a potent inhibitor.
 +
* Choose '''Select''' &rarr; '''Residue''' &rarr; '''CFF''' (CFF is the PDB three-letter code for this hetero compound), then '''Select''' &rarr; '''Invert (selected models)''', '''Actions''' &rarr; '''Atoms/Bonds''' &rarr; '''hide''' and '''Actions''' &rarr; '''Cartoon''' &rarr; '''hide''' to show only the caffeine molecules - there are two. Select the one with residue ID 1, and again '''Actions''' &rarr; '''Atoms/Bonds''' &rarr; '''hide'''. The remaining CFF molecule has residue ID 427.
  
{{Vspace}}
+
To superimpose the structures, we can't use the standard "match" option, because that only works for protein or DNA molecules. Instead, we need to explicitly define matching pairs of atoms through ChimeraX's command line interface. The command line interface is a very powerful way to issue ChimeraX commands, but it has a bit of a learning curve since we need to use a precise model/residue/atom selection syntax.
  
 +
* Open the User help page and study the select command options. Note how models are specified with a "#" sigil, residues with a ":" or "::" sigil, and atoms with an "@" sigil.
 +
* The command we need is <tt>align</tt>, and we need to feed the command atoms in exactly the order of the pairs that the superposition algorithm should superimpose. To identify the atom names, we can hover over them with the mouse, or we can select the residue/atom and choose '''Actions''' &rarr; '''Label''' &rarr; '''name'''. If we superimpose the four nitrogen atoms, the command that superimposes the model we built from the SMILES string onto the structure may be: <code>align #1@N1,N2,N4,N3 to #2:427@N3,N1,N7,N9</code>. The exact atom names in the model structure depend on how the SMILES string was written.
 +
* Note how the two structures are virtually identical - in this case, there are only very small coordinate differences because the conformational degrees of freedom are very much constrained in the xanthine heterocycle. But there '''are''' differences nevertheless. One molecule is an idealized structure, the other a structure that has been determined by a high-resolution experiment.
 +
* Turn the protein structure back on, and study how the ligand is bound.
  
 +
}}
  
 
{{Vspace}}
 
{{Vspace}}
  
 +
== Further reading, links and resources ==
  
<!-- included from "ABC-unit_components.wtxt", section: "ABC-unit_ask" -->
+
* http://www.ebi.ac.uk/chebi/
 +
* https://pubchem.ncbi.nlm.nih.gov/
  
----
+
== Notes ==
 +
<references />
  
 
{{Vspace}}
 
{{Vspace}}
  
<b>If in doubt, ask!</b> If anything about this learning unit is not clear to you, do not proceed blindly but ask for clarification. Post your question on the course mailing list: others are likely to have similar problems. Or send an email to your instructor.
 
 
----
 
 
{{Vspace}}
 
  
 
<div class="about">
 
<div class="about">
Line 155: Line 226:
 
:2017-08-05
 
:2017-08-05
 
<b>Modified:</b><br />
 
<b>Modified:</b><br />
:2017-08-05
+
:2020-10-07
 
<b>Version:</b><br />
 
<b>Version:</b><br />
:0.1
+
:1.2
 
<b>Version history:</b><br />
 
<b>Version history:</b><br />
 +
*1.2 Edit policy update
 +
*1.1 2020 Updates - rewrite for changed Websites and using ChimeraX
 +
*1.0 First live version
 
*0.1 First stub
 
*0.1 First stub
 
</div>
 
</div>
[[Category:ABC-units]]
 
<!-- included from "ABC-unit_components.wtxt", section: "ABC-unit_footer" -->
 
  
 
{{CC-BY}}
 
{{CC-BY}}
  
 +
[[Category:ABC-units]]
 +
{{UNIT}}
 +
{{LIVE}}
 
</div>
 
</div>
 
<!-- [END] -->
 
<!-- [END] -->

Latest revision as of 05:01, 10 October 2020

"Small Molecule" Structure

(A small-molecule structure tutorial)


 


Abstract:

Working with small-molecule structures: creating a model, finding complexes in the PDB that contain the molecule, and superimposing model and structure.


Objectives:
This unit will ...

  • ... introduce options to model small molecules;
  • ... demonstrate how to find PDB complexes that contain the molecule;
  • ... teach how to superimpose model and structure;

Outcomes:
After working through this unit you ...

  • ... can model small molecule structures and superimpose them onto cognate molecules in a protein structure ligand binding site.

Deliverables:

  • Time management: Before you begin, estimate how long it will take you to complete this unit. Then, record in your course journal: the number of hours you estimated, the number of hours you worked on the unit, and the amount of time that passed between start and completion of this unit.
  • Journal: Document your progress in your Course Journal. Some tasks may ask you to include specific items in your journal. Don't overlook these.
  • Insights: If you find something particularly noteworthy about this unit, make a note in your insights! page.

Prerequisites:
    You need the following preparation before beginning this unit. If you are not familiar with this material from courses you took previously, you need to prepare yourself from other information sources:

    • Biomolecules: The molecules of life; nucleic acids and amino acids; the genetic code; protein folding; post-translational modifications and protein biochemistry; membrane proteins; biological function.

    This unit builds on material covered in the following prerequisite units:

    • BIN-SX-Chimera (UCSF ChimeraX: Structure Visualization and Analysis)


     



     



     


    Evaluation

    This learning unit can be evaluated for a maximum of 5 marks. To submit material for credit for this unit ...

    1. Create a new page on the student Wiki as a subpage of your User Page.
    2. Put all of your writing to submit on this one page.
    3. When you are done with everything, go to the Quercus Assignments page and open the first Learning Unit that you have not submitted yet. Paste the URL of your Wiki page into the form, and click on Submit Assignment.

    Your link can be submitted only once and cannot be edited, but you may change your Wiki page at any time. However, only the last version before the due date will be marked. All later edits will be silently ignored.


     
    Short report option
    1. Visit the small-molecule selection page on the Student Wiki and choose one small molecule to work with.
    2. Summarize what this molecule is. Draw the structure of the molecule in one of the molecular editors described in the unit and produce a SMILES string. Find a suitable PDB complex structure and superimpose your model in ChimeraX. Report on the quality of the superposition and create an informative stereo image that illustrates how your model is situated in the ligand binding site relative to the experimentally determined compound.
    Perform all visualization and analysis on the command line (if you use the menu, the "log" will record the command line equivalent and you can copy it from there) and record your commands in a section of your report. The visualization must be fully reproducible from your documentation, i.e. when I copy/paste your commands, I need to get the same image that you are showing (except for rotation/translation/scaling).


    3. When you are done, submit the link to your page via Quercus as described above.

    Contents

    Task:

    • Read the introductory notes on working with "small molecule" structure.


     

    Modeling small molecules

    "Small" molecules are solvent, ligands, substrates, products, prosthetic groups, drugs - in short, essentially everything that is not made by DNA-, RNA-polymerases or the ribosome. Whereas the biopolymers are still front and centre in our quest to understand molecular biology, small molecules are crucial for our quest to interact with the inventory of the cell, create useful products, or advance medicine.

    A number of public repositories make small-molecule information available, such as PubChem at the NCBI, the ligand collection at the PDB, the ChEBI database at the European Bioinformatics Institute, the Canadian DrugBank, or the NCI database browser at the US National Cancer Institute. One general way to export topology information from these services is to use SMILES strings—a shorthand notation for the composition and topology of chemical compounds.


    Task:

    Caffeine at PubChem
    1. Access PubChem.
    2. Enter "caffeine" as a search term in the Compound tab. A number of matches to this keyword search are returned.
    3. Click on the top hit - 1,3,7-Trimethylxanthine, the Caffeine molecule. Note that the page contains among other items:
      1. A 2D structural sketch;
      2. An idealized 3D structural conformer, for which you can download coordinates in several formats;
      3. The IUPAC name: 1,3,7-trimethylpurine-2,6-dione;
      4. The CAS identifier 58-08-2 which is a unique identifier and can be used as a cross-reference ID;
      5. The SMILES string CN1C=NC2=C1C(=O)N(C(=O)N2C)C;
      6. ... and much more.
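To make the idea of a SMILES string as "composition and topology" concrete, here is a minimal Python sketch that reads the heavy-atom composition and the number of rings directly off the caffeine string listed above. It is not a SMILES parser: it assumes the string uses only unbracketed atoms and single-digit ring-closure labels, and the function names are hypothetical.

```python
import re
from collections import Counter

# Caffeine SMILES string as listed on the PubChem compound page.
CAFFEINE_SMILES = "CN1C=NC2=C1C(=O)N(C(=O)N2C)C"

# Unbracketed atom symbols; two-letter symbols come first so that "Cl"
# is not read as "C" followed by a stray "l". Lower-case letters denote
# aromatic atoms. Bracket atoms such as [Na+] are deliberately not
# handled in this sketch (assumption: the input does not contain any).
ATOM_PATTERN = re.compile(r"Cl|Br|[BCNOPSFI]|[bcnops]")


def count_heavy_atoms(smiles: str) -> Counter:
    """Count non-hydrogen atoms per element in a simple SMILES string."""
    symbols = ATOM_PATTERN.findall(smiles)
    return Counter(symbol.capitalize() for symbol in symbols)


def count_ring_closures(smiles: str) -> int:
    """Count ring closures: each single-digit label opens and closes once."""
    digits = [ch for ch in smiles if ch.isdigit()]
    return len(digits) // 2


if __name__ == "__main__":
    print(count_heavy_atoms(CAFFEINE_SMILES))
    print(count_ring_closures(CAFFEINE_SMILES))
```

Comparing the heavy-atom counts of your own sketched molecule with those of the database entry is a quick sanity check before you translate the string to coordinates. Hydrogens are implicit in this notation and are not counted.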


    Task:

    Caffeine at DrugBank
    1. Access DrugBank.
    2. Enter "Caffeine" in the search form and submit the search.
    3. Click on the hit to "Caffeine" itself. Note that the page contains among other items:
      1. A detailed description
      2. A 2D structural sketch with a link to 3D options;
      3. Synonyms, including the IUPAC name: 1,3,7-trimethylpurine-2,6-dione;
      4. ... and much more.
    4. Follow the link to the 3D options and note the options for downloading information, including the SMILES string and PDB formatted coordinates.


    That's great, but let's sketch our own version of caffeine. Several versions of Peter Ertl's Java Molecular Editor (JME) are offered online; PubChem offers this functionality via its Sketcher tool, and the PDB has a similar sketching tool on its ligand search page.

    Task:

    1. Return to the PubChem homepage.
    2. Click on Draw Structure.
    3. Sketch the structure of caffeine. I find the editor quite intuitive but clicking on the Help button will give you a quick, structured overview. Make sure you define your double-bonds correctly.
    4. Export the SMILES string of your compound to your project folder.


    Translating SMILES to structure

    ChimeraX can translate SMILES strings to coordinates[1].

    Task:

    1. Open ChimeraX.
    2. Select Tools → Structure Editing → Build Structure.
    3. In the Build Structure window, select the SMILES string button, paste the string from your file, and click Apply.
    4. The caffeine molecule will be generated and visualized in the graphics window. This is a "stick" representation.
    5. You can rotate it with your mouse, pinch to scale, <shift> drag to translate.
    6. Use the Actions → Atoms/Bonds → ball & stick or sphere menu items to change appearance.
    7. Use the Actions → Color → by element menu to change colors.
    8. Change the display back to stick and use Actions → Surface → show to add a solvent accessible surface. Choosing this command triggers the calculation of the surface, which is then available as an individually selectable object. With default parameters, the surface is a bit rough for this small molecule. Type surface gridSpacing 0.1 to increase the resolution five-fold from its default 0.5 Å.
    9. By default, the surface inherits the colour of the atoms it envelops. To change the colour of the surface, use the Actions → Color → all options menu. Click the surfaces button to indicate that the color choice should be applied to the surface object (note what else you can apply color to...), then choose cornflower blue.
    10. Use the Actions → Surface → transparency → 50% menu to see atoms and bonds that are covered by the surface. I find a soft lighting usually works best: lighting soft
    11. To begin working with molecules in "true" 3D, type camera sbs.
    12. Your structure should look something like what you see below.
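Because the report asks for a visualization that can be reproduced from recorded commands, it can help to collect the typed commands in a command file as you go. The following is a minimal Python sketch that writes the three commands quoted in the steps above into a text file, one per line. The file name `caffeine_view.cxc` and the helper name are hypothetical, and the menu-driven steps are left out on purpose: their command equivalents should be copied from the ChimeraX log rather than guessed.

```python
from pathlib import Path

# Only the commands quoted verbatim in the steps above. Menu actions are
# omitted; copy their command-line equivalents from the ChimeraX log.
CAFFEINE_VIEW_COMMANDS = [
    "surface gridSpacing 0.1",  # finer surface grid than the 0.5 Å default
    "lighting soft",            # soft lighting
    "camera sbs",               # side-by-side stereo
]


def write_command_script(commands, path):
    """Write one command per line and return the path of the script."""
    script = Path(path)
    script.write_text("\n".join(commands) + "\n", encoding="utf-8")
    return script


if __name__ == "__main__":
    out = write_command_script(CAFFEINE_VIEW_COMMANDS, "caffeine_view.cxc")
    print(out.read_text(encoding="utf-8"))
```

Keeping the commands in a plain list makes it easy to paste the same lines into the report and to replay them in a fresh session.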


    Figure (Caffeine stereo.jpg): Wall-eye stereo view of the caffeine structure, surrounded by a transparent molecular surface. The image for the left eye is on the left side.


    Superposition

    To investigate a small molecule structure variant in the context of a complex, we need to superimpose it with an existing ligand.

    Task:

    • Open the structure 3G6M in ChimeraX. This is one of the hits returned from the PDB search for caffeine - a fungal chitinase for which caffeine is a potent inhibitor.
    • Choose Select → Residue → CFF (CFF is the PDB three-letter code for this hetero compound), then Select → Invert (selected models), Actions → Atoms/Bonds → hide and Actions → Cartoon → hide to show only the caffeine molecules - there are two. Select the one with residue ID 1, and again Actions → Atoms/Bonds → hide. The remaining CFF molecule has residue ID 427.

    To superimpose the structures, we can't use the standard "match" option, because that only works for protein or DNA molecules. Instead, we need to explicitly define matching pairs of atoms through ChimeraX's command line interface. The command line interface is a very powerful way to issue ChimeraX commands, but it has a bit of a learning curve since we need to use a precise model/residue/atom selection syntax.

    • Open the User help page and study the select command options. Note how models are specified with a "#" sigil, residues with a ":" or "::" sigil, and atoms with an "@" sigil.
    • The command we need is align, and we need to feed the command atoms in exactly the order of the pairs that the superposition algorithm should superimpose. To identify the atom names, we can hover over them with the mouse, or we can select the residue/atom and choose Actions → Label → name. If we superimpose the four nitrogen atoms, the command that superimposes the model we built from the SMILES string onto the structure may be: align #1@N1,N2,N4,N3 to #2:427@N3,N1,N7,N9. The exact atom names in the model structure depend on how the SMILES string was written.
    • Note how the two structures are virtually identical - in this case, there are only very small coordinate differences because the conformational degrees of freedom are very much constrained in the xanthine heterocycle. But there are differences nevertheless. One molecule is an idealized structure, the other a structure that has been determined by a high-resolution experiment.
    • Turn the protein structure back on, and study how the ligand is bound.
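Since the order of atoms in the two halves of the align command defines which atoms are paired, it is easy to get the two lists out of step when typing them by hand. The following minimal Python sketch assembles the command string from an explicit list of (model atom, reference atom) pairs, using the "#" model, ":" residue and "@" atom sigils described above. The helper names are hypothetical, and the model atom names are the ones from the example, which depend on how your SMILES string was written.

```python
def atom_spec(model, atom_names, residue=None):
    """Build an atom specifier: #model, optional :residue, then @atoms."""
    spec = f"#{model}"
    if residue is not None:
        spec += f":{residue}"
    return spec + "@" + ",".join(atom_names)


def align_command(pairs, model, reference, reference_residue):
    """Build an align command from (model_atom, reference_atom) pairs.

    Keeping each pair together in one list keeps the two atom lists
    in the same order.
    """
    if not pairs:
        raise ValueError("at least one atom pair is required")
    model_atoms = [m for m, _ in pairs]
    reference_atoms = [r for _, r in pairs]
    return (
        f"align {atom_spec(model, model_atoms)} "
        f"to {atom_spec(reference, reference_atoms, reference_residue)}"
    )


# Pairing from the example above. The model atom names (first element of
# each pair) are illustrative; check them against the labels in your model.
NITROGEN_PAIRS = [("N1", "N3"), ("N2", "N1"), ("N4", "N7"), ("N3", "N9")]

if __name__ == "__main__":
    print(align_command(NITROGEN_PAIRS, model=1, reference=2,
                        reference_residue=427))
```

The printed string can be pasted into the ChimeraX command line. Writing the pairs out explicitly also documents, for your report, which atoms you considered equivalent.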


     

    Further reading, links and resources

    • http://www.ebi.ac.uk/chebi/
    • https://pubchem.ncbi.nlm.nih.gov/

    Notes

    1. There are also several online servers that translate SMILES strings to idealized structures, for example the online SMILES translation service at the NCI.


     


    About ...
     
    Author:

    Boris Steipe <boris.steipe@utoronto.ca>

    Created:

    2017-08-05

    Modified:

    2020-10-07

    Version:

    1.2

    Version history:

    • 1.2 Edit policy update
    • 1.1 2020 Updates - rewrite for changed Websites and using ChimeraX
    • 1.0 First live version
    • 0.1 First stub

    This copyrighted material is licensed under a Creative Commons Attribution 4.0 International License.