
Using AutoDock 4 with AutoDockTools: A Tutorial

Written by Ruth Huey and Garrett M. Morris

The Scripps Research Institute
Molecular Graphics Laboratory
10550 N. Torrey Pines Rd.
La Jolla, California 92037-1000
USA

8 January 2008

Contents

Introduction
    Before We Start
FAQ – Frequently Asked Questions

Looking at Dockings
    Exercise One: Reading Docking Logs
        Procedure
    Exercise Two: Visualizing Docked Conformations
        Procedure
    Exercise Three: Clustering Conformations
        Procedure
    Two Step QA Analysis of AutoDock Results
    Exercise Four: Visualizing Conformations in Context
        Procedure

Setting Up a Docking
    Exercise Five: PDB Files are Not Perfect: Editing a PDB file
        Procedure
    Exercise Six: Preparing a ligand file for AutoDock
        Procedure
    Exercise Seven: Preparing the flexible residue file
        Procedure
    Exercise Eight: Preparing the macromolecule file
        Procedure
    Exercise Nine: Preparing the grid parameter file
        Procedure
    Exercise Ten: Starting AutoGrid 4
        Procedure
    Exercise Eleven: Preparing the docking parameter file
        Procedure
    Exercise Twelve: Starting AutoDock 4
        Procedure
    Files for Exercises
        Input Files
        Results Files
        Useful Scripts in AutoDockTools/Utilities24
        Customization Options for ADT

Appendices
    Appendix 1: Dashboard Widget
    Appendix 2: PMV Basics
    Appendix 3: Docking Parameters
        Parameters common to SA, GA, GALS
        Simulated Annealing Specific Parameters
        Genetic Algorithm Specific Parameters
        Local Search Specific Parameters
        Clustering keywords
    Appendix 4: Conformation Player

Introduction

This tutorial will introduce you to docking using the AutoDock suite of programs. We will use a graphical user interface called AutoDockTools, or ADT, that helps a user easily set up the two molecules for docking, launches the external number-crunching jobs in AutoDock and, when the dockings are completed, lets the user interactively visualize the docking results in 3D.

Before We Start

Note: this applies only if you are at The Scripps Research Institute. These commands are for people attending the tutorial given at Scripps.

We will be starting the graphical user interface to AutoDock from the command line. To do this, open a Terminal window and then type this at the UNIX, Mac OS X or Linux prompt:

    % source /tsri/python/share/bin/setpath4.csh
    % cd tutorial
    % adt4

FAQ – Frequently Asked Questions

1. Where should I start ADT?
   You should always start ADT in the same directory as the macromolecule and ligand files. You can start ADT from the command line in a Terminal by typing “adt” and pressing Return or Enter.

2. Should I always add hydrogens?
   Yes. For both the macromolecule and the ligand, you should always add hydrogens, compute Gasteiger charges and then merge the non-polar hydrogens. Polar hydrogens are hydrogens that are bonded to electronegative atoms like oxygen and nitrogen. Non-polar hydrogens are hydrogens bonded to carbon atoms.

3. How many AutoGrid grid maps do I need?
   You need one AutoGrid map for every atom type in the ligand, plus an electrostatics map and a desolvation map. E.g.: for ethanol, C2H5OH, you would need C, OA and HD maps plus an electrostatics ‘e’ map plus a desolvation ‘d’ map.

4. Why should all the total charges on the residues be an integer?
   This is because it is assumed that the residue is interchangeable with others, and that no electrons are withdrawn or received by adjacent residues. In proteins, e.g., arginines should have a total charge of 1.000 if they are protonated, or 0.000 if they are neutral.

5. How easy will it be to get good docking results?
   In general, the more rotatable bonds in the ligand, the more difficult it will be to find good binding modes in repeated docking experiments.

6. How big should the AutoGrid grid box be?
   The grid volume should be large enough to at least allow the ligand to rotate freely, even when the ligand is in its most fully extended conformation.

7. Can I identify potential binding sites of a ligand on a protein with AutoDock?
   Yes. If you do not know where the ligand binds, you can build a grid volume that is big enough to cover the entire surface of the protein, using a larger grid spacing than the default value of 0.375 Å, and more grid points in each dimension. Then you can perform preliminary docking experiments with AutoDock to see if there are particular regions of the protein that are preferred by the ligand. This is sometimes referred to as “blind docking”.

   Then, in a second round of docking experiments, you can build smaller grids around these potential binding sites and dock in these smaller grids.

   If the protein is very large, then you can break it up into overlapping grids and dock into each of these grid sets, e.g. one covering the top half, one covering the lower half, and one covering the middle half. The third-party tool, BDT, automates this process; see http://autodock.scripps.edu/resources.
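The rule in question 3 (one map per ligand atom type, plus ‘e’ and ‘d’) can be written down as a short Python sketch. The function name required_maps is hypothetical and is not part of ADT; the file-name pattern stem.TYPE.map is assumed from the map name hsg1.OA.map used later in this tutorial.

```python
def required_maps(stem, ligand_atom_types):
    """List the grid map files needed for a ligand.

    stem              -- assumed map file stem, e.g. "hsg1"
    ligand_atom_types -- AutoDock atom types found in the ligand, in any
                         order and possibly repeated, e.g. ["C", "OA", "HD"]
    """
    seen = []
    for atom_type in ligand_atom_types:
        if atom_type not in seen:  # one affinity map per distinct atom type
            seen.append(atom_type)
    # Every docking also needs the electrostatics and desolvation maps.
    return ["%s.%s.map" % (stem, t) for t in seen + ["e", "d"]]


if __name__ == "__main__":
    # Ethanol, C2H5OH: atom types C, OA and HD
    for name in required_maps("hsg1", ["C", "C", "OA", "HD"]):
        print(name)
```

For ethanol this lists three atomic affinity maps followed by the ‘e’ and ‘d’ maps, five files in all.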

Exercise One: Reading Docking Logs

AutoDock’s search for the best ways to fit a ligand molecule into a receptor results in a docking log file that contains a detailed record of the docking. By convention, these results files have the extension “.dlg”. Reading a docking log or a set of docking logs into ADT is the first step in analyzing the results of docking experiments.

In this exercise we will use the file ‘ind.dlg’ from a previous AutoDock docking of the clinically-approved HIV-1 protease inhibitor, Indinavir, to HIV-1 protease. It contains many details that are output as AutoDock parses the input files and reports what it finds. For example, when AutoDock opens each AutoGrid map, it reports opening the map file and how many data points it read in. When it parses the input ligand file, it reports building various internal data structures. After the input phase, AutoDock begins the specified number of runs. It reports which run number it is starting; it may report specifics about each generation or simulated annealing cycle. After completing the runs, AutoDock begins an analysis phase of the conformational similarity of the dockings. At the very end, it reports a summary of the time taken and outputs the words ‘Successful Completion’. The level of output detail is controlled by the parameter “outlev” in the docking parameter file. For dockings using the LGA algorithm, minimal output (‘outlev 0’) is recommended.

The key results in a docking log are the docked structures or conformations found at the end of each run, the energies of these docked structures and their similarities to each other. The similarity of docked structures is measured by computing the root-mean-square deviation, rmsd, between the coordinates of the atoms and creating a clustering of the conformations based on these rmsd values.

The docking results consist of the PDBQT-format Cartesian coordinates of the atoms in the docked molecule, along with the state variables that describe this docked conformation and position, and the docked energies.

Procedure:

1. Analyze → Dockings → Open
   First, you need to choose the AutoDock log file you would like to analyze. This command opens a file browser that lets you choose a file with the extension .dlg.

   Note: If there is a previous Docking instance in the viewer, you are asked whether you want to add this DLG to the previous Docking instance. This can be done when the same AutoGrid map files, ligand, and DPF files were used for both docking experiments. In this case the total number of docked conformations is reported.

   Choose ind.dlg.

   Note: WARNING messages from the docking log can be viewed in the Python shell. To do so, open the Python shell and type

       mv.docked.warnings

   Related commands in the same menu: Clear removes dockings, Select changes the current docking, and Open All reads all DLG files in a directory you specify.

   Reading a docking log creates a Docking instance in the viewer. A Conformation instance is created for each docked result found in the docking log. A Conformation represents a specific state of the ligand and has either a particular set of state variables from which all the ligand atoms’ coordinates can be computed, or the coordinates themselves. Conformations also have energies: docked energy, binding energy, and possibly per-atom electrostatic and vdW energies. AutoDock 4 computes the free energy of binding. It reports a detailed energy breakdown.

   ADT reports how many docked conformations were read in from the DLG and tells you how to visualize the docked conformations or ‘states’.

2. Analyze → Conformations → Load
   This opens the ind Conformation Chooser, which gives you a concise view of the energies and clusters of the docked results.

   Note: AutoDock clusters by first sorting all the docked conformations from lowest energy (best docking) to highest. The best overall docked conformation is used as the ‘seed’ for the first cluster. Then the coordinates of the second-best conformation are compared with those of the best to calculate the root-mean-square deviation between the two conformations. If the calculated rms value is smaller than the specified cutoff, which is 0.5 by default, that conformation is added to the ‘bin’ containing the best conformation. If not, the second becomes the reference for a second ‘bin’. Then the rms between the third conformation and the ‘best’ is computed. If it is close enough, it is added to the first bin. If not, it is compared with the seed of the second bin, and so on.

   The lower panel lists the docked conformations for the ligand, grouped according to the clustering performed at the end of the AutoDock calculation. Double-clicking on an entry in this list makes that entry the current conformation of the ligand. This results in the ligand being displayed in the viewer with new coordinates. The input conformation is always the first entry in this list.

   The upper panel displays information about the current conformation:

   • Its overall rank. For example, the best result is always 1_1: the lowest-energy cluster, best individual in that cluster.
   • Docked Energy is the sum of the intermolecular and internal energy components.
   • Cluster RMS is the root-mean-square difference (rms) between this individual and the seed for the cluster. 1_1 is the seed for the first cluster, so its Cluster RMS is 0.0.
   • Ref RMS is the rms between this conformation and the specified reference structure. If no reference structure is specified in the DPF, the input structure is used as the reference.
   • freeEnergy is the sum of the intermolecular energy plus the torsion entropy penalty, which is a constant times the number of rotatable bonds in the ligand.
   • kI is calculated from the Docked Energy.

   Double-click on the ind 1_1 entry to put the ligand in the best docked conformation. Look at the information displayed in the top panel. Scroll down through the list to see how many clusters were formed with your docking results. Notice the range in energy between the ‘best’ docking and the seed of the last cluster.

3. Drag the lower right corner down to reveal the buttons at the bottom. Click Dismiss.
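The rms values shown in the Conformation Chooser can be illustrated with a few lines of Python. This is a minimal sketch rather than ADT’s own code: the function name rmsd and the plain lists of (x, y, z) tuples are assumptions made for illustration. The sketch assumes both conformations list the same atoms in the same order, and it applies no superposition and no symmetry handling.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two conformations.

    coords_a, coords_b -- lists of (x, y, z) tuples, same atoms, same order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("conformations must have the same, non-zero atom count")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Hypothetical two-atom conformations, coordinates in Angstroms.
    seed = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
    other = [(0.0, 0.3, 0.0), (1.5, 0.3, 0.0)]
    print(rmsd(seed, seed))   # a seed compared with itself
    print(rmsd(seed, other))  # every atom shifted by the same amount
```

Comparing a conformation with itself gives 0.0, which is why the seed of each cluster reports a Cluster RMS of 0.0.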

Exercise Two: Visualizing Docked Conformations

This exercise lets you visualize the docked conformations of the current Docking instance, which was created in the last exercise by reading ind.dlg. The ‘best’ docking result can be considered to be the conformation with the lowest (docked) energy. Alternatively, it can be selected based on its rms deviation from a reference structure.

At the end of each docking run, AutoDock outputs a result which is the lowest energy conformation of the ligand it found during that run. This conformation is a combination of translation, quaternion and torsion angles and is characterized by intermolecular energy, internal energy and torsional energy. The first two of these combined give the ‘docking energy’ while the first and third give the ‘binding energy.’ AutoDock also breaks down the total energy into a vdW energy and an electrostatic energy for each atom.

Procedure:

1. Analyze → Conformations → Play
   This opens a ConformationPlayer (CP) you can use to examine the docked conformations of ind.pdbqt. The CP has a current list of conformations (its sequence) and a current ID list. These two lists vary depending on the last sequence of menu buttons. Here the sequence list consists of all the docked conformations, ordered by run. The ID list is [0, 1, 2, 3, ..., 10]. “0” is reserved for the original, input conformation.

   See Appendix 4 for an extended tour of the ConformationPlayer and its buttons.

   • Type-in entry at center: gives random access to any conformation by its id. Valid ids depend on which menu button was last used to start the player.
   • Black arrow buttons next to the entry: change to the next or previous conformation in the current list.
   • White arrow buttons: start play according to the current play mode parameters (see below). Clicking on an active white arrow button stops play. [While a play button is active, its icon is changed to double vertical bars.]
   • Double black arrow buttons: start play as fast as possible in the specified direction.
   • Double black arrow plus line buttons: advance to the beginning or end of the conformation list.
   • Ampersand button: opens the Set Play Options widget (see Appendix 4).
   • Quatrefoil button: closes the player.

   Step through the sequence of conformations one by one using the black arrows.

   Open the Set Play Options widget by clicking on the Ampersand button. Set the conformation to 4. Change the coloring scheme to vdw or elect stat in the dropdown menu labelled Color by.

   Note: These coloring schemes are not available until you have changed the conformation at least once using the Conformation Player.

   Set the Play Mode to continuously in 1 direction from the Play Mode menu. Click on the forward white arrow. Click again to stop play.

   Open the Play Parameters widget and adjust the frame rate to 3.

   Display information about each conformation by opening the Conformation Info widget: click on Show Info.

In the next exercise, we will use the Build button to add new molecules to the Viewer.
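The way the three energy terms combine, as described at the start of this exercise, can be sketched in Python. The class and function names below are hypothetical and are not ADT objects. The torsional term follows the description in Exercise One (a constant times the number of rotatable bonds); the coefficient used in the example is a placeholder, not AutoDock’s actual value.

```python
from dataclasses import dataclass


@dataclass
class EnergyTerms:
    """Energy components of one docked conformation (kcal/mol)."""
    intermolecular: float
    internal: float
    torsional: float

    @property
    def docked_energy(self):
        # first + second term: intermolecular plus internal energy
        return self.intermolecular + self.internal

    @property
    def binding_energy(self):
        # first + third term: intermolecular plus torsional energy
        return self.intermolecular + self.torsional


def torsional_penalty(n_rotatable_bonds, coefficient):
    """A constant times the number of rotatable bonds in the ligand."""
    return coefficient * n_rotatable_bonds


if __name__ == "__main__":
    # Illustrative numbers only; 0.25 is a placeholder coefficient.
    terms = EnergyTerms(intermolecular=-10.0, internal=-1.5,
                        torsional=torsional_penalty(4, 0.25))
    print("docked energy :", terms.docked_energy)
    print("binding energy:", terms.binding_energy)
```

Because the torsional term is a penalty, the binding energy of a flexible ligand is less favorable than its intermolecular energy alone.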

Exercise Three: Clustering Conformations

An AutoDock docking experiment usually has several solutions. The reliability of a docking result depends on the similarity of its final docked conformations. One way to measure the reliability of a result is to compare the lowest energy conformations by their rmsd to one another, and to group them into families of similar conformations or “clusters.”

Note: lower energies are “better” and, in the genetic algorithm, “fitter.”

The DPF keyword, analysis, determines whether clustering is done by AutoDock. As you will see below, it is also possible to cluster conformations with ADT. By default, AutoDock clusters docked results at 0.5 Å rmsd. This process involves ordering all of the conformations by docked energy, from lowest to highest. The lowest energy conformation is used as the seed for the first cluster. Next, the second conformation is compared to the first. If it is within the rmsd tolerance, it is added to the first cluster. If not, it becomes the first member of a new cluster. This process is repeated with the rest of the docked results, grouping them into families of similar conformations.

First we will examine the AutoDock clustering that we read in from ind.dlg. Next we will make new clusterings at different rms values.

Procedure:

Note: If you have read more than one docking log into the current Docking, or if the results did not include clustering, you must Recluster to create a clustering before you can show one.

Note: If clusterings were performed using several different rms tolerance values, the menu option Analyze → Clusterings → Show would open a widget containing a list of the available rms values. Be sure to click only once, and to click delicately, on this list to open a new interactive histogram. (Otherwise, you may get several identical windows.)

1. Analyze → Clusterings → Show
   Opens an instance of a Python object, an interactive histogram chart labelled ‘ind 1:rms 2.0 clustering’. This chart has bars which represent the clusters computed at the specified rmsd. The bars are sorted by the energy of the lowest-energy conformation in each cluster and start off colored blue.

   For example, the lowest energy conformation in the second bar is 2_1. The height of the bar represents how many conformations are in that cluster. Clicking on a bar makes that cluster the current sequence for the ligand’s CP, and its color changes to red.

   The Conformation # Info Widget shows you both refRMSD, the rmsd between the current reference and the displayed conformation, and clRMSD, the rmsd between the displayed conformation and the lowest energy conformation in this cluster. As described above in the tour of the CP, you can set the reference coordinates to those of any of the docked conformations when it is the current conformation. When viewing clustering results, this is especially useful because it allows you to examine the rmsd between cluster members. To do this, choose a cluster and use the arrow key to step forward to its lowest energy conformation, e.g. 1-1. Set the rms reference to this conformation. Now, stepping through the cluster will show you the rms difference between the lowest energy member of this cluster, i.e. 1-1, and the rest of the conformations in this cluster.

   You can change clusters by picking a different bar in the interactive histogram chart. You can save this histogram as a PostScript file: from the interactive histogram’s menu select Edit → Write to open a file browser for you to enter a filename. Make sure to use the “.ps” extension. Select File → Exit to close.

   The active site of the HIV protease has C2 symmetry. You can probably see evidence of this by examining the clusters of docked indinavir molecules.* Step 1 is to build a copy of the lowest energy conformation: cluster 1, conformation 1. First display it via the CP, then click the Build button. Try clicking on the second bar in the histogram and display the lowest energy member of the second cluster by using the arrow keys next to the entry. If this result doesn’t show C2 symmetry, try another cluster bar. You should see the symmetry-related docked conformations.

   * Note: To facilitate comparing the docked conformations, choose File → Preferences → Set Commands to be Applied on Objects, then select colorByMolecules. When this is on, each time a new molecule is added to the viewer (up to a current limit of 20), it is colored differently.

2. Analyze → Clusterings → Recluster
   Opens a widget that lets you enter a series of new rms tolerances as floating point numbers separated by spaces. These will be used to perform new clustering operations on the docked results. The time-consuming step in clustering is computing a difference matrix between the conformations to be compared. Larger rms values require fewer comparisons; conformations that are more similar require fewer comparisons. If you type a name in the OutputfileName: entry, a clustering output file will be written. Our convention is to use the extension “.clust” for these files.

   Type in a list of RMSD tolerances separated by spaces, thus: 1.0 2.0 3.0, and click on OK. For our example, this should be very fast. You can visualize the new clusterings by repeating Step 1.
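The clustering procedure described at the start of this exercise can be sketched in Python as follows. This is an illustration under stated assumptions, not the code used by AutoDock or ADT: the function names, the (energy, coordinates) tuples and the use of “less than or equal to” for “within the tolerance” are all choices made here, and symmetry-related atoms are not handled.

```python
import math


def rmsd(coords_a, coords_b):
    """rmsd between two equal-length lists of (x, y, z) tuples."""
    total = sum((p - q) ** 2
                for a, b in zip(coords_a, coords_b)
                for p, q in zip(a, b))
    return math.sqrt(total / len(coords_a))


def cluster_by_rmsd(conformations, tolerance=0.5):
    """Group conformations into clusters seeded by their lowest-energy member.

    conformations -- list of (energy, coords) tuples
    tolerance     -- rmsd tolerance in Angstroms
    Returns a list of clusters; each cluster is a list of indices into
    `conformations`, seed first, clusters ordered by seed energy.
    """
    # Order all conformations by energy, from lowest to highest.
    order = sorted(range(len(conformations)),
                   key=lambda i: conformations[i][0])
    clusters = []
    for i in order:
        coords = conformations[i][1]
        for members in clusters:
            seed_coords = conformations[members[0]][1]
            if rmsd(coords, seed_coords) <= tolerance:
                members.append(i)  # joins the first cluster it is close to
                break
        else:
            clusters.append([i])   # becomes the seed of a new cluster
    return clusters


if __name__ == "__main__":
    # Hypothetical one-atom "conformations", just to exercise the logic.
    docked = [(-7.0, [(5.0, 0.0, 0.0)]),
              (-9.0, [(0.0, 0.0, 0.0)]),
              (-8.0, [(0.3, 0.0, 0.0)])]
    for tol in (0.5, 10.0):
        print(tol, cluster_by_rmsd(docked, tol))
```

Running the sketch with a small and a large tolerance shows the effect of reclustering: as the tolerance grows, conformations that formed separate clusters merge into one.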

Two Step QA Analysis of AutoDock Results

For quality assurance, after you have read in the docking log(s):

1. Evaluate convergence to determine the thoroughness of the search:
   Analyze → Clusterings → Show

   The basic premise is that if you use a large enough number of evaluations, the results will cluster. That is, there is a small number of ‘best’ results which will be found if you look ‘long enough’. The question you are answering here is ‘were the conditions of the docking experiment sufficient to find these results?’ If the results do not show reasonable clustering, you may want to repeat the docking calculation after increasing the number of evaluations, ga_num_evals, in the DPF. When docking ligands with more than 8-10 active torsions, you will probably need to increase the number of evaluations by a factor of ten or more.

2. Evaluate the chemical reasonableness of the best results by examining the interactions between the receptor and the best docked conformation(s).

   Click on the lowest energy cluster in the clustering shown in step one. Put the ligand in the lowest energy conformation using the ConformationPlayer.

   Analyze → Macromolecule → Open
   If hsg1_rigid.pdbqt cannot be found in the current directory, a file browser opens to ask you to specify where it can be found.

   Note: In the pharmaceutical industry, medicinal chemists may visually inspect hundreds of docked structures for chemical reasonableness during the drug discovery process.

   Look at the interactions between the ligand and nearby atoms in the receptor:

   • Is the ligand bound inside a pocket in the receptor?
   • Are non-polar atoms in the ligand docked near non-polar atoms in the receptor? Are polar atoms in the ligand docked near polar atoms in the receptor?
   • If you know that a particular residue or residues in the protein interact with the ligand, is that interaction shown in the docked result?
   • Do the interactions seem reasonable in the context of what you know about your ligand-receptor from other experimental results such as mutation studies?

The interpretation of AutoDock results is open-ended. In large part it depends on your chemical insight and creativity. Docked poses of the ligand may suggest chemical modifications such as side-group substitutions, etc.
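When comparing several docking experiments, it can help to reduce each clustering histogram to a few numbers. The sketch below is one informal way to do that and is not an ADT command: the function name cluster_summary and the choice of statistics are assumptions made for illustration, and the tutorial does not define any numerical threshold for ‘reasonable clustering’.

```python
def cluster_summary(cluster_sizes):
    """Summarize a clustering histogram.

    cluster_sizes -- number of conformations in each cluster, ordered as in
                     the histogram (lowest-energy cluster first).
    """
    if not cluster_sizes:
        raise ValueError("no clusters to summarize")
    total = sum(cluster_sizes)
    return {
        "n_conformations": total,
        "n_clusters": len(cluster_sizes),
        # share of all runs that ended in the lowest-energy cluster
        "lowest_energy_fraction": cluster_sizes[0] / total,
        # share of all runs that ended in the most populated cluster
        "largest_fraction": max(cluster_sizes) / total,
    }


if __name__ == "__main__":
    # Hypothetical histogram: 10 runs falling into 3 clusters.
    for key, value in cluster_summary([6, 3, 1]).items():
        print(key, value)
```

Many small clusters with a low fraction in the lowest-energy cluster are the pattern that would suggest repeating the docking with more evaluations, as described in step 1 above.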

Exercise Four: Visualizing Conformations in Context

Ultimately, the goal of a docking experiment is to illustrate the docked result in the context of the macromolecule, explaining the docking in terms of the overall energy landscape. The interactions between the ligand and the macromolecule are driven by energy composed of van der Waals (vdW), electrostatic, hydrogen bonding and desolvation component energies.

This exercise has three parts:

First, to evaluate the chemical reasonableness of our results, we will represent the macromolecule as a solvent-excluded surface using msms and check whether the ligand has docked in a ‘pocket’ on the receptor and whether the pairwise interactions between atoms in the ligand and atoms in the receptor are reasonable.

Next, we will explore the energy landscape of the binding site, representing it using isocontours. This view of the docking can elucidate the binding mechanism and suggest chemical modifications of the ligand.

Finally, we will introduce two ways of visualizing all the docked structures at once, the ‘overall’ binding pattern.

Note: If your docking included flexible residues, delete hsg1 from the Viewer, read in the rigid molecule hsg1_rigid.pdbqt and use it instead of hsg1 in this Exercise.

If hsg1_rigid is still in your viewer, skip Step 1. Instead use Display → Show/Hide Molecule to display it if it is not visible. Also, undisplay any docked conformations you may have already built.

Procedure:

1. Analyze → Macromolecule → Open
   If hsg1_rigid.pdbqt cannot be found in the current directory, a file browser opens to ask you to specify where it can be found.

2. Visualize binding in a pocket by displaying molecular surfaces:

   TSRI tutorials only: Improve the msms surface display by turning off depth-cueing. To do this, press D on the keyboard.

   In the Dashboard, click on the circle under MS in the ind row and in the hsg1_rigid row. Click on the diamond under Atom to color the ligand surface by the element of the underlying atom. Click on the diamond under DS to color hsg1_rigid by David Goodsell colors.

   This view of the docking allows you to see how the docked ligand ‘fits’ into the macromolecule. Clicking on the circles under MS allows you to show and hide the various msms surfaces.

Alternatively, you may want to visualize the docked conformations in the context of the energy grids. This may be useful for computer-aided drug design.

In the steps that follow, you will visualize the oxygen affinity map as an isocontour, then display only two residues in the active site of hsg1 and finally see atom O2 of indinavir sitting in a pocket of oxygen affinity between the two ASP25 residues. This is our most complicated exercise.

3. Analyze → Grids → Open
   This opens a list chooser of the grids used in this docking. Select hsg1.OA.map.

   Note: The grid isocontours are colored by atom type. (See Exercise One for a list of these colors.)

   The AutoGrid map file is read into the viewer, creating an instance of a Grid. This map is visualized as an isocontour in 3D. This means that every point in the grid box whose value is equal to the isocontour level will be connected together by lines or polygons. You can change the isocontour level, which is an energy in kcal/mol; the step between grid points for sampling the grid values; and whether to show the isocontoured regions as lines or filled (solid) polygons. You can also toggle the visibility of the Grid and its bounding box.

   To illustrate the kind of information you can obtain from the atomic affinity grid maps, try this:

   1. Set the IsoValue to –0.5; if you type into the slider entry, remember to press Return or Enter.
   2. Display hsg1_rigid.pdbqt; if it is not present in the viewer, use Analyze → Macromolecule → Open.
   3. Choose Select → Select From String and type ASP25 into the Residue field, then click Add. Click Yes to change the selection level if necessary and Dismiss to close the Select From String widget.
   4. Choose Display → Sticks And Balls. This opens the Display Sticks and Balls: widget. Increase the quality to 15 and click OK. Choose Color → By Atom Type, select ‘balls’ and ‘sticks’ in the widget that opens, and click OK.
   5. Choose a low-energy docked conformation using the CP.

   Now you can rotate the objects in the viewer. You will see that the single selected atom in the inhibitor, IND201:O2, is buried in a pocket of oxygen affinity. If you Build (see below) other low-energy docked conformations, you should be able to see the same O2 atom sitting in this region.

   Click Display Map and Show Box to undisplay the isocontour and its bounding box before you Dismiss this panel.

[4. Analyze → Dockings → Show as Spheres]
   This command is good for getting a ‘bird’s eye view’ of the docking results. It represents each docked conformation by a sphere placed at the average position of the coordinates of all the atoms in that conformation. Clicking on the name of a docking log in the list makes the spheres representing its results visible only if the associated ligand is visible.

   Click on ind.dlg in the list. You can change the radii of the spheres, their color and their smoothness (or “quality”). [You may need to reduce the radii to 0.03 Å to see the overlapping dockings of our result.]

   This command gives you a nice overview of the distribution of the docked results.

[5. CP → Build All]
   This builds a new molecule for each docked conformation in the current set bound to the CP. This gives you a quick idea of the overall results of your docking experiment.

6. Display specific interactions between the ligand and the receptor:
   Analyze → Dockings → Show Interactions

   This radically changes the display: it replaces the background color with white, displays the ligand molecular surface, shows spheres around atoms in the receptor which are hydrogen-bonded or in close contact with atoms in the ligand, and displays secondary structure for sequences of 3 or more residues in the receptor which are interacting with the ligand. The GUI for this command lets you turn on and off different parts of this specialized display.

   When you are finished experimenting with it, click on Revert to return to the previous Viewer settings.

Now that we have shown you the results of docking, we’re going to go back to the very start to show you how to set up molecules for docking and how t
