ASA (ACCESSIBLE SURFACE AREA) 
          Accessible Surface Area is defined as the area (measured in square
          angstroms) of the molecular surface which is in contact with solvent.
          It may also be described as the area over which the centre of a
          solvent molecule of radius 1.4 angstroms can move while maintaining
          unobstructed contact with the van der Waals surface of the molecule.
          The concept of accessible surface area provides a quantitative
          definition of the exterior and interior of proteins and other
          macromolecules.  

VDWSA (VAN DER WAALS SURFACE AREA)
          Van der Waals surface area is the actual exposed or visible area of
          a molecule assuming the atomic surface is defined by the van der
          Waals radius of each component atom in the molecule.  It may also
          be described as the area over which the centre of an infinitely
          small point (of radius 0.0 angstroms) can move while maintaining
          unobstructed contact with the van der Waals surface.  The van der
          Waals surface area is sometimes known as the atomic or molecular
          surface area.  It is nearly equal to the sum of the "contact" and
          "reentrant" surface areas described by Richards (1977).  Note that
          the van der Waals surface area is always smaller than the ASA and
          that it is measured in square angstroms.
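
The two definitions above differ only in the probe radius, which the
following minimal sketch tries to illustrate.  It is NOT Richmond's
analytical ACCESS algorithm; it is a simple point-sampling estimate in
the style of Shrake and Rupley, and the function names, the number of
sample points and the golden-spiral point layout are assumptions made
for illustration only.

```python
import math

def sphere_points(n):
    """Return n roughly evenly spaced unit vectors (golden-spiral layout)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for k in range(n):
        z = 1.0 - 2.0 * (k + 0.5) / n
        r = math.sqrt(max(0.0, 1.0 - z * z))
        phi = k * golden
        points.append((r * math.cos(phi), r * math.sin(phi), z))
    return points

def surface_area(atoms, probe=1.4, n_points=960):
    """Estimate the total surface area in square angstroms.

    atoms is a list of (x, y, z, vdw_radius) tuples.
    probe=1.4 estimates the ASA; probe=0.0 estimates the VDWSA.
    """
    unit = sphere_points(n_points)
    total = 0.0
    for i, (xi, yi, zi, ri) in enumerate(atoms):
        big_r = ri + probe                      # radius of the expanded sphere
        exposed = 0
        for ux, uy, uz in unit:
            px, py, pz = xi + big_r * ux, yi + big_r * uy, zi + big_r * uz
            buried = False
            for j, (xj, yj, zj, rj) in enumerate(atoms):
                if j == i:
                    continue
                limit = rj + probe
                if (px - xj) ** 2 + (py - yj) ** 2 + (pz - zj) ** 2 < limit * limit:
                    buried = True               # point lies inside a neighbour
                    break
            if not buried:
                exposed += 1
        total += 4.0 * math.pi * big_r * big_r * exposed / n_points
    return total
```

For a single isolated atom every sample point is exposed, so the result
is just the area of a sphere of radius (van der Waals radius + probe).
With overlapping atoms the probe=0.0 result comes out smaller than the
probe=1.4 result, as noted above.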

..............................................................................
To run their programs:
	- pdbtodia    (pdb to diamond)
	- addradii    (add radius)
	- access      (calculate area)
I have re-written the first two, and they are part of area.c
..............................................................................

The area(ACCESS) program was written by:
     ACCESS - T. J. Richmond J. Mol. Biol. 178:63-89 (1984)
This includes all of the fortran programs.

The C interface for the program was added on by:
     * Authors...: David Wishart, Leigh Willard                 *
     * Supervisor: Brian Sykes                                  *
     * Location..: University of Alberta                        *
     *             Protein Engineering Network of               *
     *             Centres of Excellence                        *

The following documentation is that which was included in the original
(ACCESS) programs.

..............................................................................
Date: Sat, 16 May 1992 11:29 PST
From: LAURA@uclaue.mbi.ucla.edu
Subject: solvent.doc
To: dsw@rigel.biochem.ualberta.ca

Date: February 4, 1986
To: Potential Users of Solvent Free Energy Program
From: David Eisenberg

Revised 2/25/86 by William Wilcox

	The following is a brief, hastily prepared summary of the
programs needed to compute solvent free energies.  Please write or call
me to clarify or expand directions.  Much of the documentation is
contained within the code.  All of the programs, other than the free
energy calculator FRENS and I/O conversion programs (CHMTODIA, BRKTODIA,
and CHARGE), were written by Tim Richmond and others at the MRC Lab,
Cambridge. They are supplied here with Richmond's permission, because
FRENS depends entirely on them. 

	It is possible to run all of these programs from a single com 
file.  Example com files are Totalchm.com for CHARMM-formatted coordinate files
and Totalbrk.com for Brookhaven-formatted files.

	1. BRKTODIA converts Brookhaven-format coordinates to
Diamond-format coordinates.  The Diamond format is used by the following
program.  As for all the programs here, a sample command file is
supplied.  In most of these command files, the input and output file
names are the third and fourth lines, after the default directory. 
A related program, CHMTODIA, converts CHARMM-formatted coordinates to 
Diamond format.

	2. ADDRADII adds a Van der Waals radius to each atom as
specified in the program.  In my work, I have used the standard values
supplied.  The coefficients (atomic solvation parameters - ASPs) of
COEF.ENG were determined, assuming these radii, and for any other radii
new ASPs need to be determined. 

	3. ACCESS calculates accessible areas.  In my work I have used a
probe of 1.4 A radius.  Again, for another radius, new ASPs would need
to be determined.  Also, the present ASPs are not appropriate for
contact or atomic areas; only for accessible areas.  The program has
been redimensioned for 3000 atoms. 

	4.  CHARGE converts the output of ACCESS into one suitable for
FRENS.  It renames atoms on Glu, Asp, Arg, His, and C-terminal residues
so that the atom with the largest exposed surface area will be the
charged atom.  It also renames N-terminal residues so that FRENS knows
that these are charged.  There are some restrictions on the order of
some of the input atom types, but these shouldn't pose a problem with
CHARMM or Brookhaven files.  See the code for more details. 

	5. FRENS sums accessible areas, each weighted by its appropriate
ASP, according to Eisenberg & McLachlan, Nature, 319, 199-203 (1986). 
FREN.COM gives the files that must be called, including the areas from
ACCESS, an output file, and three data files:  ATOMSS.ENG is a look up
table that gives the atom type (non-polar, polar, charged O, charged N,
or S) for each atom found commonly in proteins.  Other atom types (such
as iron) must be treated on an individual basis. REFAREASS.ENG is a look
up table that gives the reference area for each atom.  These are from
Shrake and Rupley [J. Mol. Biol., 79, 351-371 (1973)].  The reference area
is the accessible area in the hypothetical extended chain.  COEF.ENG has
a title line followed by two identical lines of ASP values for the 5
atom types.  In normal use, these ASP values are not changed.  They are
those of Eisenberg & McLachlan, cited above. 

	Several energies are given by FRENS.  The one called net energy
is delta G sub S, the atomic solvation energy of Equation 4 of Eisenberg
& McLachlan. 
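
	The weighted sum in step 5 can be sketched as follows.  This is
only an illustration, not the FRENS code.  The ASP values below are those
commonly quoted from Eisenberg & McLachlan (in cal per mol per square
angstrom); the values actually used by FRENS are whatever is in COEF.ENG,
so check them there.  The class names, the function name and the
conversion to kcal/mol are assumptions.

```python
# Atomic solvation parameters, cal/(mol * A^2).  Assumed values; the
# authoritative ones are in COEF.ENG.
ASP = {
    "non-polar": 16.0,    # C
    "polar": -6.0,        # uncharged N and O
    "charged O": -24.0,
    "charged N": -50.0,
    "S": 21.0,
}

def solvation_energy(atoms, asp=ASP):
    """Sum ASP-weighted areas.

    atoms is an iterable of (atom_class, accessible_area, reference_area),
    with areas in square angstroms.  Returns (folded, reference, net) in
    kcal/mol, where net = folded - reference.
    """
    atoms = list(atoms)
    folded = sum(asp[cls] * area for cls, area, _ in atoms) / 1000.0
    reference = sum(asp[cls] * ref for cls, _, ref in atoms) / 1000.0
    return folded, reference, folded - reference
```

A non-polar atom that loses accessible area on folding makes a negative
(favourable) contribution to the net energy, and a charged atom that
loses area makes a positive one.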

Date: Sat, 16 May 1992 11:29 PST
From: LAURA@uclaue.mbi.ucla.edu
Subject: hydrophobic_energy.doc
To: dsw@rigel.biochem.ualberta.ca

Hydrophobic energies:			 	by Laura Wesson 
       						Laura@Uclaue.mbi.ucla.edu

The hydrophobic energy command file, hydrophobic_energy.com, uses Protein Data
Bank formatted coordinate files as input.
    To massage this coordinate file so it will work with the com file, check
the following:
       o At the end of sequences, if they terminate with a carboxy group,
         make sure that the two terminal oxygens are labelled O and OX, OT
         or OXT.  Files from the Brookhaven database usually end with OXT.
         The oxygen which is labelled OX, OT or OXT should be the last atom
         in its residue.  The other oxygen can be anywhere in the last 
         residue.  If you just have one terminal oxygen, it should be 
         changed to OT2.  
       o If you have a segment which starts with a nitrogen that should be
         uncharged, make sure that that nitrogen atom isn't listed first.
         Otherwise its name will be changed to NT and it will be regarded
         as a charged amino group.
       o Check for non-standard residue names, such as ACE.  If you want 
         these to be treated as regular protein residues, give them a 
         standard residue name and give the atoms names which occur in 
         that residue.  Make sure that charged oxygens and charged 
         nitrogens are given atom names which are charged in the standard
         residue.
       o If you have non-protein groups, you have three ways to deal with 
         them:
           a) Calculate the hydrophobic energy of the protein as if
              the groups were not there.
              To do this, make sure that the coordinates for the groups are
              in HETATM records.
           b) Use the groups to mask off the accessible area of the protein,
              but don't calculate any hydrophobic energy coming from the
              groups themselves.
              To do this, put the coordinates for the groups in ATOM records,
              but make sure the residue names for the groups are NOT standard.
           c) Calculate hydrophobic energy coming from the groups themselves,
              where possible.  Since there are atomic solvation parameters
              only for C, O, N, and S, any other atom coordinates will be
              used only to mask off other atoms.
              To do this:
              Put the coordinates for the groups in ATOM records.
              For ions with charge that is shared among several oxygens, such
              as sulfates and phosphates, label the residue CYS.  If it is a
              sulfate, call the sulfur SG.
              If the overall charge of the ion is -2, for example, label two
              of the oxygens OX and the others O.
              The command file will consider as charged the oxygens with the
              greatest surface area, e.g. in a sulfate the two most exposed
              oxygens will be considered as charged.
              For other groups, such as a heme group, change the residue name
              to CYS.
              If an atom is a carbon, change the atom name to C.
              If an atom is an uncharged oxygen, change the atom name to
              OT2.
              If an atom is a charged oxygen, change the atom name to OT1.
              If an atom is an uncharged nitrogen, change the atom name to
              N, and make sure it doesn't come first in the residue.
              If an atom is a charged nitrogen, change the atom name to NT.
              If it's a sulfur, change the atom name to SG.
              Otherwise, leave the atom name alone -- the atom will only be
              used for its masking effect.
              Note:  the command file doesn't care if you repeat atom names.
       o Give the coordinate file a name with the extension .pdb, e.g.
         1MLP.PDB
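
The first check in the list above (the carboxy-terminal oxygen labels)
can be automated roughly as follows.  This is a minimal sketch, not part
of the distributed programs.  It assumes the standard PDB column layout
for ATOM records and only looks at the last residue in the file, so a
file with several segments would need to be split first.  The function
names are hypothetical.

```python
TERMINAL_OXYGENS = {"OX", "OT", "OXT"}

def last_residue_atom_names(pdb_lines):
    """Atom names of the final residue found in ATOM records, in file order."""
    names, current = [], None
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        residue = (line[21], line[22:27])   # chain ID, residue number + insertion code
        if residue != current:
            current, names = residue, []
        names.append(line[12:16].strip())   # atom name, columns 13-16
    return names

def check_carboxy_terminus(pdb_lines):
    """Return a list of warnings about the C-terminal oxygen labels."""
    names = last_residue_atom_names(pdb_lines)
    warnings = []
    terminal = [n for n in names if n in TERMINAL_OXYGENS]
    if not terminal:
        warnings.append("no OX, OT or OXT atom in the last residue")
    else:
        if names[-1] not in TERMINAL_OXYGENS:
            warnings.append("%s is not the last atom in its residue" % terminal[0])
        if "O" not in names:
            warnings.append("no atom named O in the last residue")
    return warnings
```

An empty list means the last residue has both oxygens and the OX, OT or
OXT atom comes last, which is what the command file expects.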

    Now, fix the command file:
       o The addradii program produces an output file which includes 
         coordinates and the Van der Waals radii.  As input, the addradii 
         program expects residue ranges.  In the command file as it is, there 
         is one residue range, from 1 to 10000.
         So every residue with a number in this range will be processed.
         It's possible to just take part of the protein for hydrophobic 
         residue calculations, by specifying a more restrictive residue 
         range.  Or you could use more than one residue range, giving 
         different ranges different identifiers, e.g.

$ RUN fre:ADDRADII
dia_FILE 
radii_FILE 
   10   50 1
  100  150 2

       In any case, a blank line terminates the list of residue ranges.
       Also, the addradii program is given a list of radii for non-protein
       atoms.  In these records, the residue name, atom name and radius are 
       listed.  The format of the example command file must be used.  Again, 
       a blank line terminates the list of radii for unusual atoms.
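
       If the residue ranges are generated by a script, something like the
       following sketch could write them.  The field widths (5, 5 and 2
       characters) are only inferred from the two example lines above; the
       actual format expected by ADDRADII is not given here, so treat
       the widths and the function name as assumptions and compare the
       result against the example command file.

```python
def residue_range_lines(ranges):
    """Format (first_residue, last_residue, identifier) tuples for addradii.

    The widths are guessed from the example in this document.  A blank
    line is appended because a blank line terminates the list.
    """
    lines = ["%5d%5d%2d" % (first, last, ident) for first, last, ident in ranges]
    lines.append("")
    return lines
```

       For the example above, residue_range_lines([(10, 50, 1), (100, 150, 2)])
       reproduces the two range lines followed by the terminating blank line.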

       Next, the ACCESS program calculates the accessible surface area for
       each atom.
       This program has a limit of 3000 atoms.  If you have more than this,
       use the ACCESS_BIG program, which has a limit of 50,000 atoms.
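
       The choice between the two programs can be made by counting atoms
       first, as in this small sketch.  It assumes that only ATOM records
       reach ACCESS (HETATM records being ignored, as in option a above);
       the function names are hypothetical.

```python
ACCESS_LIMIT = 3000        # atom limit of ACCESS, from this document
ACCESS_BIG_LIMIT = 50000   # atom limit of ACCESS_BIG, from this document

def count_atoms(pdb_lines):
    """Count ATOM records (assumption: HETATM records are not passed on)."""
    return sum(1 for line in pdb_lines if line.startswith("ATOM"))

def choose_access_program(n_atoms):
    """Return the name of the program that can handle n_atoms atoms."""
    if n_atoms <= ACCESS_LIMIT:
        return "ACCESS"
    if n_atoms <= ACCESS_BIG_LIMIT:
        return "ACCESS_BIG"
    raise ValueError("%d atoms exceeds the ACCESS_BIG limit" % n_atoms)
```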

       Then, the CHARGE program reassigns atom types of shared-charge oxygen
       pairs, and does a few other things.  For example, for a carboxy group,
       the charged oxygen is given the name OT1, and the uncharged oxygen is
       given the name OT2.
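
       The rule that the most exposed oxygens are the charged ones can be
       sketched like this.  It is not the CHARGE code, only an illustration
       of the rule as described in this document; the function name is
       hypothetical, and the handling of ties (the earlier atom wins) is an
       assumption.

```python
def assign_oxygen_names(areas, n_charged=1):
    """Name shared-charge oxygens from their accessible areas.

    areas is a list of accessible areas, one per oxygen, in file order.
    The n_charged oxygens with the largest area are named OT1 (charged);
    the rest are named OT2 (uncharged).
    """
    order = sorted(range(len(areas)), key=lambda i: areas[i], reverse=True)
    charged = set(order[:n_charged])
    return ["OT1" if i in charged else "OT2" for i in range(len(areas))]
```

       For a carboxy group n_charged is 1.  For a sulfate with an overall
       charge of -2, n_charged would be 2.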

       Then, FRENS calculates the hydrophobic energy contributions of each
       atom.  The total hydrophobic folding energy is given near the end of
       the FRENS output file, as GNETTOT.

    Now, to actually RUN this:  Suppose your input Protein Data Bank coordinate 
file is called 1mlp.pdb.  Put it in the [guest.work] directory,  and type this:
$@[guest.asp]hydrophobic_energy 1mlp

or, 
$submit [guest.asp]hydrophobic_energy/para=1mlp

to put it in the batch queue (often advisable!)

You will hopefully end up with a bunch of files, e.g. if the input file was
1MLP.PDB, you will get:

1MLP.ACCESS
1MLP.CHG
1MLP.DIA
1MLP.FRENS
1MLP.PDB
1MLP.RADII

The file 1MLP.FRENS is the final output.  It gives detailed information on
the contributions of each atom to the hydrophobic energy.  Near the end of
this file there is a section which looks like this:

 AREA AND ENERGY SUMS FOR PROTEIN AND RESIDUE AND ATOM CLASSES



 GNETTOT    =    -434.5743 AREATOT  =   20100.8125
 GREFTOT =        513.1716 AREFTOT  =   77661.0000
 GTOT    =         78.5978
 GSTABTOT=       -819.0070



GREFTOT is the calculated transfer energy of the UNFOLDED protein from 
octanol to water.  In this example, it would cost 513 kcal/mol to transfer
this protein from octanol to water, if it was unfolded.

GTOT is the calculated transfer energy of the FOLDED protein from 
octanol to water.  In this example, it would cost 79 kcal/mol to transfer
this protein from octanol to water, if it was folded.  The cost is lower
than for the unfolded protein because hydrophobic portions of the protein
have been buried during the folding process.

GNETTOT is the hydrophobic folding energy of the protein.  Note 
GNETTOT = GTOT - GREFTOT.  In this case, folding releases about 435 kcal/mol
of hydrophobic energy (GNETTOT = -434.6).

AREATOT is the exposed surface area of the protein, when folded.

AREFTOT is the exposed surface area of the protein, when unfolded.
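
The summary section can be read back with a few lines of code, which
also makes it easy to confirm that GNETTOT = GTOT - GREFTOT.  This is a
minimal sketch that assumes the "NAME = value" layout shown in the
example above; the function name is hypothetical and the rest of the
FRENS output file is not handled.

```python
import re

SAMPLE = """\
 AREA AND ENERGY SUMS FOR PROTEIN AND RESIDUE AND ATOM CLASSES

 GNETTOT    =    -434.5743 AREATOT  =   20100.8125
 GREFTOT =        513.1716 AREFTOT  =   77661.0000
 GTOT    =         78.5978
 GSTABTOT=       -819.0070
"""

def parse_frens_summary(text):
    """Return a dict of the NAME = value pairs found in text."""
    pairs = re.findall(r"([A-Z]+)\s*=\s*(-?\d+(?:\.\d+)?)", text)
    return {name: float(value) for name, value in pairs}

summary = parse_frens_summary(SAMPLE)
# Difference between the reported and the recomputed net energy.
residual = summary["GTOT"] - summary["GREFTOT"] - summary["GNETTOT"]
# Fraction of the unfolded-chain area that is still exposed when folded.
exposed_fraction = summary["AREATOT"] / summary["AREFTOT"]
```

For the example values the residual is well under 0.01 kcal/mol, which
is just rounding in the printed numbers.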

There are various checks that should be done:
    o The command file runs a program called checkatoms on the input PDB file.
      This program checks to see whether a residue has more or fewer than the
      standard number of atoms.  If it has one more, this residue may be the
      carboxy terminus of a segment.  If it has fewer, atoms may be missing
      in the coordinate file.
    o Check the 1MLP.RADII file to see that non-protein atoms were given the
      right radii.  If not, change the input to the ADDRADII program.
    o Check the beginning and end of protein segments in the 1MLP.CHG file.
      The beginning of a segment, if it is an amino group, should be an atom
      called NT.  At the end of a segment, if it is a carboxy group, there
      should be two oxygens labelled OT1 and OT2.  OT1 has the bigger area.
      If you have non-protein groups, check that the beginning and end of
      the group have not been changed to NT or OT1/OT2, if that is not
      appropriate.
      If the beginning of a segment is NOT a charged nitrogen, make sure it
      doesn't start with NT.  If it does, change the location of the nitrogen
      record in the input PDB file so that it isn't the first atom in the
      sequence.
    o Check the 1MLP.FRENS file for error messages containing the words
      "not recognized".  If an atom or residue is not recognized, then it
      will only be used to mask other atoms from the solvent; no hydrophobic
      energy contributions will be calculated from the atom or residue
      itself.  The atom or residue may not have a name that FRENS recognizes.
      If that isn't OK, look in the FRENS.FOR program for a list of the
      atom names and residue names.
    o Look for "zzzap!" messages in the output of the command file.
      These are miscellaneous error messages from the programs.  You could
      try to fix them or call me.
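
The kind of test that checkatoms performs (the first check above) can be
sketched as follows.  This is not the checkatoms source, which is not
reproduced in this document.  It assumes the standard PDB column layout,
a coordinate file without hydrogens, and the usual non-hydrogen atom
counts for the 20 standard residues; the function name and the message
wording are hypothetical.

```python
# Number of non-hydrogen atoms in each standard residue (no terminal oxygen).
HEAVY_ATOM_COUNT = {
    "ALA": 5, "ARG": 11, "ASN": 8, "ASP": 8, "CYS": 6,
    "GLN": 9, "GLU": 9, "GLY": 4, "HIS": 10, "ILE": 8,
    "LEU": 8, "LYS": 9, "MET": 8, "PHE": 11, "PRO": 7,
    "SER": 6, "THR": 7, "TRP": 14, "TYR": 12, "VAL": 7,
}

def check_atom_counts(pdb_lines):
    """Return (chain, residue_number, residue_name, message) for odd residues."""
    counts = {}
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        key = (line[21], line[22:27], line[17:20])   # chain, number, name
        counts[key] = counts.get(key, 0) + 1
    report = []
    for (chain, number, name), n in counts.items():
        expected = HEAVY_ATOM_COUNT.get(name)
        if expected is None:
            message = "non-standard residue name"
        elif n == expected:
            continue
        elif n == expected + 1:
            message = "one extra atom: possible carboxy terminus"
        elif n < expected:
            message = "atoms may be missing"
        else:
            message = "more atoms than standard"
        report.append((chain, number.strip(), name, message))
    return report
```

Residues with exactly the standard number of atoms are not reported, so
an empty list means nothing unusual was found.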



