The PSI suite of ab initio quantum chemistry programs is the result of an ongoing attempt by a cadre of graduate students, postdoctoral associates, and professors to produce code that is efficient but also easy to extend to new theoretical methods. Significant effort has been devoted to the development of libraries which are robust and easy to use. Some of the earliest contributions to what is now referred to as ``PSI'' include a direct configuration interaction (CI) program (Robert Lucchese, 1976, now at Texas A&M), the well-known graphical unitary group CI program (Bernie Brooks, 1977-78, now at N.I.H.), and the original integrals code (Russ Pitzer, 1978, now at Ohio State). From 1978-1987, the package was known as the BERKELEY suite, and after the Schaefer group moved to the Center for Computational Quantum Chemistry at the University of Georgia, the package was renamed PSI. Thanks primarily to the efforts of Curt Janssen (Sandia Labs, Livermore) and Ed Seidl (LLNL), the package was ported to UNIX systems, and substantially improved with new input formats and a C-based I/O system.
In 1999, an extensive effort began to develop PSI3 -- a PSI suite with a completely new face. As a result of this effort, all of the legacy Fortran code was removed, and everything was rewritten in C and C++, including new integral/derivative integral, coupled cluster, and CI codes. In addition, new I/O libraries have been added, as well as an improved checkpoint file structure and greater automation of typical tasks such as geometry optimization and frequency analysis. The package has the capability to determine wavefunctions, energies, analytic gradients, and various molecular properties based on a variety of theories, including spin-restricted, spin-unrestricted, and restricted open-shell Hartree-Fock (RHF, UHF, and ROHF); configuration interaction (CI) (including a variety of multireference CI's and full CI); coupled-cluster (CC) including CC with variationally optimized orbitals; second-order Møller-Plesset perturbation theory (MPPT) including explicitly correlated second-order Møller-Plesset energy (MP2-R12); and complete-active-space self-consistent field (CASSCF) theory. By January 2008, all of the C code in PSI3 had been converted to C++ to enable a path toward more object-oriented design and a single-executable framework that will facilitate code reuse and ease efforts at parallelization. At this same time, all of the legacy I/O routines from PSI2 were removed, greatly streamlining the libciomr.a library.
The purpose of this manual is to provide a reasonably detailed overview of the source code and programming philosophy of PSI3, such that programmers interested in contributing to the code will have an easier task. Section 2 gives a succinct explanation of the steps required to obtain the source code from the main repository at Virginia Tech. (Installation instructions are given separately in the installation manual or in $PSI3/INSTALL.) Section 3 discusses the essential elements of a C-language PSI3 program, with emphasis on the input parsing and I/O functions. Section 4 provides documentation of a number of other important libraries, including libchkpt.a, the library of functions for reading from the checkpoint file; libqt.a, the Quantum Trio miscellaneous function library; and libiwl.a, the library for reading and writing one- and two-electron integrals in the ``integrals with labels'' format. Section 5 offers advice on appropriate programming style for PSI3 code, and section 6 describes the structure of the package's Makefiles. Section 6.3 gives a brief overview of the steps necessary to add a new module to PSI3, section 7 gives some suggestions on debugging it, and section 8 explains conventions for documenting it. The appendices provide important reference material, including the currently accepted PSI3 citation and format information for some of the most important text files used by PSI3 modules.
The Subversion revision-control system (SVN) (subversion.tigris.org) provides a convenient means by which programmers may obtain the latest (or any previous) version of the PSI3 source from the main repository or a branch version, add new code to the source tree or modify existing PSI3 modules, and then make changes and additions available to other programmers by checking the modifications back into the main repository. SVN also provides a ``safety net'' in that any erroneous modifications to the code may be easily removed once they have been identified. This section describes how to use SVN to access and modify the PSI3 source code. (Note that compilation and installation instructions are given in a separate document.)
The main repository for the PSI3 source code is currently maintained by the Crawford group at Virginia Tech. To check out the code, one must first obtain an SVN account by emailing crawdad@vt.edu. Once you have a login ID and password, you can access the repository via a secure, SSL-based WebDAV connection, but first you must decide which version of the code you need.
The PSI3 SVN repository contains three top-level directories:
https://sirius.chem.vt.edu/svn/psi3/
The PSI3 repository is comprised of a main trunk and several release branches. The branch you should use depends on the sort of work you plan for the codes:
Fig. 1 provides a schematic of the SVN revision-control structure and branch labeling. Two release branches are shown, the current stable branch, named psi-3-4, and a planned future release, to be named psi-3-5. The tags on the branches indicate release snapshots, where bugs have been fixed and the code has been or will be exported for public distribution. The dotted lines in the figure indicate merge points: just prior to each public release, changes made to the code on the stable release branch will be merged into the main trunk.
A frequently encountered problem is what to do about bug fixes that are necessary for uninterrupted development of the code on the main trunk. As Rule 1 of the above policy states, all bug fixes of the code already in the recent stable release must go on the corresponding branch, not on the main trunk. The next step depends on the severity of the bug:
The following are some of the most commonly used SVN commands for checking out and updating working copies of the PSI3 source code.
To check out a working copy of the head of the main trunk:
svn co https://sirius.chem.vt.edu/svn/psi3/trunk/ psi3
To check out a working copy of the head of a specific release branch,
e.g., the branch labelled psi-3-4:
svn co https://sirius.chem.vt.edu/svn/psi3/branches/psi-3-4 psi3
Note that subsequent svn update commands in this working copy will provide updates only on the chosen branch. Note also that after you have checked out a fresh working copy of the code you must run the autoconf command to generate a configure script for building the code. (See the installation manual for configuration, compilation, and testing instructions.)
For each of the above commands, the working copy of your code will be placed in the directory psi3, regardless of your choice of branch. In this manual, we will refer to this directory from now on as $PSI3. Subsequent SVN commands are usually run within this top-level directory.
To update your current working copy to include the latest revisions:
svn update
Notes: (a) This will update only the revisions on your current branch; (b) The old -d and -P flags required by CVS are not necessary with SVN.
To convert your working copy to the head of a specific branch:
svn switch https://sirius.chem.vt.edu/svn/psi3/branches/psi-3-4
To convert your working copy to the head of the main trunk:
svn switch https://sirius.chem.vt.edu/svn/psi3/trunk/
To find out what branch your working copy is on, run this in your
top-level PSI3 source directory:
svn info | grep URL
This will return the SVN directory from which your working copy was taken, e.g.,
URL: https://sirius.chem.vt.edu/svn/psi3/branches/psi-3-4
Some words of advice:
If you have changes to Psi binaries or libraries which already exist, one of two series of steps is necessary to check these changes in to the main repository. The first series may be followed if all changes have been made only to files which already exist in the current version. The second series should be followed if new files must be added to the code in the repository.
The svn ci command in both of these sequences will examine all of the code in the current libciomr directory against the current version of the code in the main repository. Any files which have been altered (and for which no conflicts with newer versions exist!) will be identified and checked in to the main repository (as well as the new file in the second situation).
SVN requires that you include a comment on your changes. However, unlike CVS, SVN prefers that you put your comments on the command-line rather than editing a text file. I prefer the CVS way, but this is a minor pain compared to all the advantages of SVN, in my opinion.
If the programmer is adding a new executable module or library to the PSI3 repository, a number of important conventions should be followed:
Assume the new code is an executable module and is named great_code. The directory containing the new code must contain only those files which are to be checked in to the repository! Then the following steps will check in a new piece of code to the main repository:
If the code in the main repository has been altered, other users' working copies will of course not automatically be updated. In general, it is only necessary to execute the following steps in order to completely update your working copy of the code:
This will examine each entry in your working copy and compare it to the most recent version in the main repository. When the file in the main repository is more recent, your version of the code will be updated. If you have made changes to your version, but the version in the main repository has not changed, the altered code will be identified to you with an ``M''. If you have made changes to your version of the code, and one or more newer versions have been updated in the main repository, SVN will examine the two versions and attempt to merge them - this process often reveals conflicts, however, and is sometimes unsuccessful. You will be notified of any conflicts that arise (labelled with a ``C'') and you must resolve them manually.
If new directories have been added to the repository, the update above will automatically add them to your working copy. However, you may need to re-run autoconf and configure ( $objdir/config.status -recheck is a convenient command) to be able to build the new code.
The following steps will remove a source code file named bad_code.F from a binary module named great_code:
This will check the main repository and provide you with the code as it stood exactly on February 17th, 2002.
svn log detci.cc
Checking the log files is a very useful way to see what recent changes might be causing new problems with the code.
Your working copy of the PSI3 source code includes a number of important subdirectories:
After compilation and installation, the $prefix directory contains the executable codes and other necessary files. NB: The files in this area should never be directly modified; rather, the working copy should be modified and the PSI3 Makefile hierarchy should handle installation of any changes. The structure of the installation area is:
To function as part of the PSI package, a program must incorporate certain required elements. This section will discuss the header files, global variables, and functions required to integrate a new C++ module into PSI3. Here is a minimal PSI3 program, whose elements are described below. Note that we are using C++ namespaces to avoid conflicting names between modules, as we are moving toward a single-executable design. However, for legacy reasons certain globals and the gprgid() function need to have C-linkage.
#include <cstdio>
#include <cstdlib>
#include <libipv1/ip_lib.h>
#include <psifiles.h>
#include <libqt/qt.h>
#include <libciomr/libciomr.h>
#include <libchkpt/chkpt.h>
#include <libpsio/psio.h>
extern "C" {
FILE *infile, *outfile;
char *psi_file_prefix;
}
// begin module-specific namespace
namespace psi { namespace MODULE_NAME {
// global variables, function declarations, and
// #define statements here
}} // close namespace psi::MODULE_NAME
// main needs to be in the global namespace
// but give it access to the psi::MODULE_NAME namespace
using namespace psi::MODULE_NAME;
int main(int argc, char *argv[])
{
psi_start(&infile, &outfile, &psi_file_prefix,
argc-1, argv+1, 0);
ip_cwk_add(":MODULE_NAME"); // MODULE_NAME all caps here
psio_init(); psio_ipv1_config();
/* to start timing, tstart(outfile); */
/* Insert code here */
/* to end timing, tstop(outfile); */
psio_done();
psi_stop(infile, outfile, psi_file_prefix);
}
// this needs to be global namespace also
extern "C" {
char *gprgid(void)
{
char *prgid = "MODULE_NAME";
return(prgid);
}
}
// all other stuff is in a special namespace
namespace psi { namespace MODULE_NAME {
// other stuff below
double some_function(int x) {
// code
}
}} // close namespace psi::MODULE_NAME
In the above example, we have included the typical C++ and PSI header files, although for your specific module you may not need all of these, or perhaps you may need additional ones (such as string.h or math.h). The PSI include files used in this example are libipv1/ip_lib.h (the input parser, described in section 3.2), psifiles.h (definitions of all the PSI file numbers for I/O), libqt/qt.h (the ``quantum trio'' library, containing miscellaneous math and utility functions), libciomr/libciomr.h (the old PSI I/O and math routines library - although it contains no I/O anymore), libchkpt/chkpt.h (a library for accessing the checkpoint file to obtain quantities such as the SCF or nuclear repulsion energy), and libpsio/psio.h (the PSI I/O library, see section 3.3). These include files contain function declarations for all of the functions contained in those libraries.
Note that all PSI modules require three global variables with C linkage (i.e., inside an extern "C" block): infile, outfile, and psi_file_prefix. Each PSI module must also have a C-linkage function called gprgid() defined as shown. The main() function must be in global scope, and other functions should be inside a namespace with the name of the module (which is further contained inside a psi namespace). Consult a C++ book if you are unfamiliar with namespaces.
The integer function main() must be able to handle command-line arguments required by the PSI3 libraries. In particular, all PSI3 modules must be able to pass to the function psi_start() arguments for the user's input and output filenames, as well as a global file prefix to be used for naming standard binary and text data files. (NB: the default names for user input and output are input.dat and output.dat, respectively, though any name may be used.) The current standard for command-line arguments is for all module-specific arguments (e.g., -quiet, used in detci) to appear before the input, output, and prefix values. The psi_start() function expects to find at most these last three arguments, so the programmer should pass as argv[] the pointer to the first non-module-specific argument. The above example is appropriate for a PSI3 module that requires no command-line arguments apart from the input/output/prefix globals. See the PSI3 modules input and detci for more sophisticated examples. The final argument to psi_start() is an integer whose value indicates whether the output file should be overwritten (1) or appended (0). Most PSI3 modules should choose to append.
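As a rough illustration of the argument convention described above, the following sketch separates leading module-specific flags from the trailing input/output/prefix arguments. The helper name first_generic_arg and the choice of -quiet as the only recognized flag are assumptions made for this example, not part of any PSI3 library; the input and detci modules show how real modules handle this.

```cpp
#include <cstring>

// Hypothetical helper (not a PSI3 library function): returns the index of
// the first command-line argument that is NOT a module-specific flag.
// In this sketch the only module-specific flag is "-quiet".
int first_generic_arg(int argc, char *argv[], bool *quiet)
{
  int i = 1;                 // argv[0] is the program name
  *quiet = false;
  while (i < argc && std::strcmp(argv[i], "-quiet") == 0) {
    *quiet = true;           // remember the flag for the module's own use
    ++i;                     // and skip past it
  }
  return i;                  // everything from here on belongs to psi_start()
}
```

If first is the returned index, the module would then pass argc - first and argv + first to psi_start(). When no module-specific flags are present, first is 1 and this reduces to the argc-1, argv+1 form used in the minimal program.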
The psi_start() function initializes the user's input and output files and sets the global variables infile, outfile, and psi_file_prefix, based on (in order of priority) the above command-line arguments or the environment variables PSI_INPUT, PSI_OUTPUT, and PSI_PREFIX. The value of the global file prefix can also be specified in the user's input file. The psi_start() function also initializes the input parser and sets up a default keyword tree (described in detail in section 3.2). This step is required even if the program will not do any input parsing, because some of the functionality of the input parser is assumed by libciomr.a and libpsio.a. For instance, opening a binary file via psio_open() (see section 3.3) requires parsing the files section of the user's input so that a unit number (e.g. 52) can be translated into a filename. The psi_stop() function shuts down the input parser and closes the user's input and output files.
Timing information (when the program starts and stops, and how much user, system, and wall-clock time it requires) can be printed to the output file by adding calls to tstart() and tstop() (from libciomr.a).
The sole purpose of the simple function gprgid() is to provide the input parser a means to determine the name of the current program. This allows the input parser to add the name of the program to the input parsing keyword tree. This function is used by libpsio.a, though the functionality it provides is rarely used.
In all but the most trivial of modules, you will probably need to split your code into multiple files. The PSI3 convention is to put the main() function, gprgid(), and the allocation of infile, outfile, and psi_file_prefix into a file with the same name as that of the module (and a .cc extension). Other C++ source files should have everything wrapped within the psi::MODULE_NAME namespace. Any module-specific header files should look like this:
#ifndef _psi_src_bin_MODULE_NAME_h
#define _psi_src_bin_MODULE_NAME_h
// if you need infile, outfile, and psi_file_prefix in the header,
// include them like this:
extern "C" {
extern FILE *infile, *outfile;
extern char *psi_file_prefix;
}
namespace psi { namespace MODULE_NAME {
/* header stuff goes here */
}} // namespace psi::MODULE_NAME
#endif // header guard
If you add infile, etc., to a header file, make sure they are within an extern "C" statement and in the global namespace. Since these variables are defined in MODULE_NAME.cc, you should also precede these variables with extern to tell the compiler they've been allocated in another module (e.g., extern FILE *infile). However, that means you then wouldn't be able to include that header file in MODULE_NAME.cc, because then you'd be telling the compiler both that infile, etc., are allocated elsewhere (according to extern FILE *infile in the header file) and also that they are allocated in the current file (FILE *infile in MODULE_NAME.cc), an obvious contradiction. Most of the official PSI3 modules use a trick of defining or undefining a variable called EXTERN to avoid this apparent paradox and allow the use of the same header file containing global variables (often called globals.h) in MODULE_NAME.cc and all other C++ source files.
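One common form of the EXTERN trick is sketched below. The layout is an assumption for illustration and individual modules may arrange it differently: a source file that only needs declarations defines EXTERN before including the header, while MODULE_NAME.cc includes the header without defining it, so the variables are defined exactly once.

```cpp
#include <cstdio>

// ---- contents of a hypothetical globals.h ----
// Every source file other than MODULE_NAME.cc would write
//     #define EXTERN
//     #include "globals.h"
// so that EXTERN expands to "extern" (declarations only).
// MODULE_NAME.cc includes the header WITHOUT defining EXTERN, so EXTERN
// expands to nothing and the variables are defined there.
#ifdef EXTERN
#  undef EXTERN
#  define EXTERN extern
#else
#  define EXTERN
#endif

extern "C" {
EXTERN FILE *infile, *outfile;
EXTERN char *psi_file_prefix;
}
// ---- end of hypothetical globals.h ----
```

As written, this fragment plays the role of MODULE_NAME.cc: EXTERN was not defined beforehand, so the three globals are defined here with C linkage.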
As always, you are encouraged to avoid use of global variables when at all possible. It is customary to wrap variables that would otherwise be global into data structures such as MOInfo (for things like the number of orbitals) and Params (for user-specified parameters). In the next stage of PSI development, these commonly-used data structures will be standardized as new C++ objects for maximum code re-use and flexibility.
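A minimal sketch of this idea follows. The field names (nirreps, orbspi, print_lvl, and so on) and the helper count_orbitals are hypothetical; each module chooses the contents of its own MOInfo and Params structures.

```cpp
#include <cstddef>
#include <string>
#include <vector>

namespace psi { namespace MODULE_NAME {

// Hypothetical contents: orbital-related quantities gathered in one place.
struct MOInfo {
  int nirreps;                 // number of irreducible representations
  std::vector<int> orbspi;     // number of orbitals in each irrep
  MOInfo() : nirreps(0) {}
};

// Hypothetical contents: user-specified parameters with their defaults.
struct Params {
  int print_lvl;
  double convergence;
  std::string wfn;
  Params() : print_lvl(1), convergence(1e-7), wfn("SCF") {}
};

// Functions take the structure they need as an argument
// instead of reading free-standing global variables.
inline int count_orbitals(const MOInfo &mo)
{
  int n = 0;
  for (std::size_t h = 0; h < mo.orbspi.size(); ++h) n += mo.orbspi[h];
  return n;
}

}} // namespace psi::MODULE_NAME
```

Passing such structures explicitly keeps the dependencies of each function visible in its signature.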
The format of input.dat follows certain rules which should probably be referred to as the PSI input grammar. There is a description of most of those rules in the PSI3 User's Manual. A complete definition of the PSI input grammar is encoded in parse.y (see below). To read a grammar we need a parser, the first component of libipv1.a. Then the identified lexical elements of input.dat (keywords and keyword values) need to be scanned for the presence of ``forbidden'' characters (e.g. a space may not be a part of a string unless the string is placed between parentheses). This task is performed by the lexical scanner, the second component of libipv1.a. Finally, the scanned-in keyword-value pairs are stored in a hierarchical data structure (a tree). When a particular option is needed, the set of stored keywords and values is searched for the one queried and the value is returned. In this way, options of varying type can be assigned: rather than having a line of integers, each corresponding to a program variable, mnemonic character strings can be parsed and interpreted into program variables. It is also easier to implement default options, allowing a more spartan input deck. The set of input-parsing routines in libipv1.a is really not complicated to use, but the manner in which data is stored is somewhat painful to grasp at first.
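The storage-and-lookup idea can be pictured with the toy model below. It is not the libipv1 implementation: the class KeywordTree, its two-level map, and the rule that sections are searched in the order they were added are all assumptions made for the sketch.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for the keyword tree: section -> (keyword -> value).
class KeywordTree {
  typedef std::map<std::string, std::string> Section;
  std::map<std::string, Section> tree_;
  std::vector<std::string> cwk_;   // current working keywords (sections)

 public:
  // Store one keyword/value pair under a section such as ":DEFAULT".
  void set(const std::string &section, const std::string &kwd,
           const std::string &value) {
    tree_[section][kwd] = value;
  }

  void cwk_add(const std::string &section) { cwk_.push_back(section); }
  void cwk_clear() { cwk_.clear(); }

  // Search only the sections on the current-working-keyword list
  // (in this sketch, in the order in which they were added).
  bool lookup(const std::string &kwd, std::string *value) const {
    for (std::size_t i = 0; i < cwk_.size(); ++i) {
      std::map<std::string, Section>::const_iterator s = tree_.find(cwk_[i]);
      if (s == tree_.end()) continue;
      Section::const_iterator k = s->second.find(kwd);
      if (k != s->second.end()) {
        *value = k->second;
        return true;
      }
    }
    return false;   // caller falls back on its default
  }
};
```

A failed lookup leaves the caller's variable untouched, which is how a program-supplied default survives when the user omits a keyword.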
The following is a list of the names of the individual source files in libipv1 and a summary of their contents. After that is a list of the syntax of specific functions and their use. Last is a simple illustration of the use of this library, taken mostly from cscf.
void ip_cwk_clear();
Clears current working keyword. Used when initializing input or switching
from one section to another (:DEFAULT and :CSCF to :INTCO, for instance).
void ip_cwk_add(char *kwd);
Adds kwd to the list of current working keywords. Allows parsing of
variables under that keyword out of the input file (files) which has
(have) been read or will be read in the future using ip_append.
The keyword kwd can only be removed from the list of current working
keywords by purging the entire list using ip_cwk_clear.
You must ensure that the keyword strings begin with a colon.
int ip_count(char *kwd, int *count, int n);
Counts the elements in the n'th element of the array kwd.
int ip_boolean(char *kwd, int *bool, int n);
Parses n'th element of kwd as boolean (true, 1, yes; false, 0, no)
into 1 or 0 returned in bool.
int ip_exist(char *kwd, int n);
Returns 1 if n'th element of kwd exists. Unfortunately, n must be 0.
int ip_data(char *kwd, char *conv, void *value, int n
[, int o1, ..., int on]);
Looks for keyword kwd, finds the value associated with it,
converts it according to the format specification given in
conv, and stores the result in value. Note that
value is a void * so this routine can handle any data
type, but it is the programmer's responsibility to ensure that the
pointer passed to this routine is of the appropriate pointer type for
the data. The value found by the input parser depends on the value of
n and any optional additional arguments. n is the
number of additional arguments. If n is 0, then there are no
additional arguments, and the keyword has only one value associated
with it. If the keyword has an array associated with it, then
n is 1 and the one additional argument is which element of the
array to pick. If kwd specifies an array of arrays, then
n is 2, the first additional argument is the number of the
first array, and the second argument is the number of the element
within that array, etc. Deep in here, the code calls a
sscanf(read, conv, value);, so that's the real meaning of
variables.
int ip_string(char *kwd, char **value, int n, [int o1, ..., int on]);
Parses the string associated with kwd and stores it in value.
The role of n and optional arguments is the same as that
described above for ip_data().
int ip_value(char *kwd, ip_value_t **ip_val, int n);
Grabs the section of keyword tree at kwd and stores it in
ip_val
for the programmer's use - this is usually not used, since you need to
understand the structure of ip_value_t.
int ip_int_array(char *kwd, int *arr, int n);
Reads n integers into array arr.
void ip_set_uppercase(int uc);
Sets parsing to case sensitive if uc==0, I think.
void ip_initialize(FILE *in, FILE *out);
Calls yyparse(); followed by ip_cwk_clear(); followed by
ip_internal_values();. This routine reads the entire input deck
and stores it into the keyword tree for access later.
void ip_append(FILE *in, FILE *out);
Same thing as ip_initialize();, except this doesn't clear the
cwk first. Used for parsing another input file, such
as intco.dat.
void ip_done();
Frees up the keyword tree.
void ip_print_tree(FILE *out, ip_keyword_tree_t *tree);
Prints out tree to out. If tree is set to NULL,
then the current working keyword tree will be printed out.
This function is useful for debugging problems with parsing.
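The role of n and the optional index arguments in ip_data() can be illustrated with a small stand-alone model. The Value type and the function get_data below are hypothetical and greatly simplified; only the final sscanf conversion step is taken from the description above, and the return codes are placeholders for the library's IPE_* error codes.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>
#include <vector>

// Toy value node: either a scalar string or an array of further values.
struct Value {
  std::string scalar;            // meaningful when children is empty
  std::vector<Value> children;   // non-empty when the value is an array
};

// Walk n levels down using the n optional indices, then convert the
// scalar found there with sscanf. Returns 0 on success, nonzero otherwise.
int get_data(const Value &root, const char *conv, void *out, int n, ...)
{
  const Value *v = &root;
  va_list ap;
  va_start(ap, n);
  for (int level = 0; level < n; ++level) {
    int idx = va_arg(ap, int);
    if (idx < 0 || idx >= static_cast<int>(v->children.size())) {
      va_end(ap);
      return 1;                          // index out of range
    }
    v = &v->children[idx];
  }
  va_end(ap);
  if (!v->children.empty()) return 2;    // still an array: too few indices
  // As in the library, the caller must pass a pointer matching conv.
  return std::sscanf(v->scalar.c_str(), conv, out) == 1 ? 0 : 3;
}
```

For an array keyword such as MOORDER = (3 1 2), a call with n = 1 and index 0 would retrieve 3, mirroring the ip_data("MOORDER","%d",&iorder[i],1,i) idiom used in cscf.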
From cscf.cc:
#include <libipv1/ip_lib.h>
#include <libpsio/psio.h>
int main(int argc,char* argv[])
{
using namespace psi::cscf;
...
psi_start(&infile,&outfile,&psi_file_prefix,argc-1, argv+1, 0);
ip_cwk_add(":SCF");
From scf_input.cc:
errcod = ip_string("LABEL",&alabel,0);
if(errcod == IPE_OK) fprintf(outfile," label = %s\n",alabel);
reordr = 0; /* this sets the default that will be used in case the
user hasn't specified this keyword */
errcod = ip_boolean("REORDER",&reordr,0);
if(reordr) {
errcod = ip_count("MOORDER",&size,0);
for(i=0; i < size ; i++) {
errcod = ip_data("MOORDER","%d",&iorder[i],1,i);
errchk(errcod,"MOORDER");
}
}
second_root = 0;
if (twocon) {
errcod = ip_boolean("SECOND_ROOT",&second_root,0);
}
if(iopen) {
errcod = ip_count("SOCC",&size,0);
if(errcod == IPE_OK && size != num_ir) {
fprintf(outfile,"\n SOCC array is the wrong size\n");
fprintf(outfile," is %d, should be %d\n",size,num_ir);
exit(1);
}
if(errcod != IPE_OK) {
fprintf(outfile,"\n try adding some electrons buddy!\n");
fprintf(outfile," need SOCC\n");
ip_print_tree(outfile,NULL);
exit(1);
}
Almost all PSI3 modules must exchange data with raw binary (also called ``direct-access'') files. However, rather than using low-level C or Fortran functions such as read() or write(), PSI3 uses a flexible, but fast I/O system that gives the programmer and user control over the organization and storage of data. Some of the features of the PSI I/O system, libpsio, include:
The TOC structure of PSI binary files provides several advantages over older I/O systems. For example, data items in the TOC are identified by keyword strings (e.g., "Nuclear Repulsion Energy") and the global address of an entry is known only to the TOC itself, never to the programmer. Hence, if the programmer wishes to read or write an entire TOC entry, he/she is required to provide only the TOC keyword and the entry size (in bytes) to obtain the data. Furthermore, the TOC makes it possible to read only pieces of TOC entries (say a single buffer of a large list of two-electron integrals) by providing the appropriate TOC keyword, a size, and a starting address relative to the beginning of the TOC entry. In short, the TOC design hides all information about the global structure of the direct access file from the programmer and allows him/her to be concerned only with the structure of individual entries. The current TOC is written to the end of the file when it is closed.
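The following in-memory model is a sketch of the table-of-contents idea only; the class TocFile and its methods are hypothetical and do not reproduce the libpsio implementation or its on-disk layout. It shows how a keyword lookup can hide the global position of the data and how the requested size can be checked against the extent of the entry.

```cpp
#include <cstddef>
#include <cstring>
#include <map>
#include <string>
#include <vector>

// Toy "file" whose entries are located only through a keyword table.
class TocFile {
  struct Extent { std::size_t start, length; };
  std::map<std::string, Extent> toc_;   // keyword -> position of the data
  std::vector<char> data_;              // stands in for the file contents

 public:
  // Append a new entry, or overwrite an existing one in place.
  bool write_entry(const std::string &key, const char *buf, std::size_t size) {
    std::map<std::string, Extent>::iterator it = toc_.find(key);
    if (it == toc_.end()) {
      Extent e;
      e.start = data_.size();
      e.length = size;
      data_.insert(data_.end(), buf, buf + size);
      toc_[key] = e;
      return true;
    }
    if (size > it->second.length) return false;   // would pass the entry's end
    if (size > 0) std::memcpy(&data_[it->second.start], buf, size);
    return true;
  }

  // Read an entry; the caller never sees where it lives in the file.
  bool read_entry(const std::string &key, char *buf, std::size_t size) const {
    std::map<std::string, Extent>::const_iterator it = toc_.find(key);
    if (it == toc_.end()) return false;           // no such entry
    if (size > it->second.length) return false;   // request exceeds the entry
    if (size > 0) std::memcpy(buf, &data_[it->second.start], size);
    return true;
  }
};
```

Unlike this model, the library functions described below report a missing entry by printing an error and exiting rather than by returning a status.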
Thus the direct-access file itself is viewed as a series of pages, each of which contains an identical number of bytes. The global address of the beginning of a given entry is stored on the TOC as a page/offset pair comprised of the starting page and byte-offset on that page where the data reside. The entry-relative page/offset addresses which the programmer must provide work in exactly the same manner, but the 0/0 position is taken to be the beginning of the TOC entry rather than the beginning of the file.
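The page/offset arithmetic can be sketched as follows. The page length of 4096 bytes, the Address type, and all function names are assumptions for this example; the library defines its own page length and address type.

```cpp
// Assumed page length in bytes (illustrative; the library sets its own).
const unsigned long PAGE_LEN = 4096;

// Stand-in for a page/offset pair.
struct Address {
  unsigned long page;
  unsigned long offset;
};

// Convert a page/offset pair to a plain byte count and back.
unsigned long to_bytes(Address a) { return a.page * PAGE_LEN + a.offset; }

Address from_bytes(unsigned long n)
{
  Address a;
  a.page = n / PAGE_LEN;
  a.offset = n % PAGE_LEN;
  return a;
}

// Global address of a position given relative to the start of a TOC entry.
Address global_address(Address entry_start, Address relative)
{
  return from_bytes(to_bytes(entry_start) + to_bytes(relative));
}

// Address of the byte that follows a transfer of `size` bytes from `start`.
Address advance(Address start, unsigned long size)
{
  return from_bytes(to_bytes(start) + size);
}
```

With these assumptions, an entry that begins at page 2, offset 4000 and an entry-relative address of 0/200 correspond to the global address 3/104, and reading 4800 bytes from the entry-relative address 0/0 ends at 1/704.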
int psio_init(void): Before any files may be opened or the basic read/write functions of libpsio may be used, the global data needed by the library functions must be initialized using this function.
int psio_ipv1_config(void): For the library to operate properly, its configuration must be read from the input file or from user's .psirc file. This call MUST immediately follow int psio_init();.
int psio_done(void): When all interaction with the direct-access files is complete, this function is used to free the library's global memory.
int psio_open(ULI unit, int status): Opens the direct access file identified by unit. The status flag is a boolean used to indicate if the file is new (0) or if it already exists and is being re-opened (1). If specified in the user input file, the file will be automatically opened as a multivolume (striped) file, and each page of data will be read from or written to each volume in succession.
int psio_close(ULI unit, int keep): Closes a direct access file identified by unit. The keep flag is a boolean used to indicate if the file's volumes should be deleted (0) or retained (1) after being closed.
int psio_read_entry(ULI unit, char *key, char *buffer, ULI size): Used to read an entire TOC entry identified by the string key from unit into the array buffer. The number of bytes to be read is given by size, but this value is only used to ensure that the read request does not exceed the end of the entry. If the entry does not exist, an error is printed to stderr and the program will exit.
int psio_write_entry(ULI unit, char *key, char *buffer, ULI size): Used to write an entire TOC entry identified by the string key from the array buffer to unit. The number of bytes to be written is given by size. If the entry already exists and its data is being overwritten, the value of size is used to ensure that the write request does not exceed the end of the entry.
int psio_read(ULI unit, char *key, char *buffer, ULI size, psio_address sadd, psio_address *eadd): Used to read a fragment of size bytes of a given TOC entry identified by key from unit into the array buffer. The starting address is given by sadd and the ending address (that is, the entry-relative address of the next byte in the file) is returned in *eadd.
int psio_write(ULI unit, char *key, char *buffer, ULI size, psio_address sadd, psio_address *eadd): Used to write a fragment of size bytes of a given TOC entry identified by key from the array buffer to unit. The starting address is given by sadd and the ending address (that is, the entry-relative address of the next byte in the file) is returned in *eadd.
The page/offset address pairs required by the preceding read and write functions are supplied via variables of the data type psio_address, defined by:
typedef struct {
ULI page;
ULI offset;
} psio_address;
The macro PSIO_ZERO provides a convenient value
for the 0/0 page/offset address.
int psio_tocprint(ULI unit, FILE *outfile): Prints the TOC of unit in a readable form to outfile, including entry keywords and global starting/ending addresses. (tocprint is also the name of a PSI3 utility module which prints a file's TOC to stdout.)
int psio_toclen(ULI unit, FILE *outfile): Returns the number of entries in the TOC of unit.
int psio_tocdel(ULI unit, char *key): Deletes the TOC entry corresponding to key. NB that this function only deletes the entry's reference from the TOC itself and does not remove the corresponding data from the file. Hence, it is possible to introduce data "holes" into the file.
int psio_tocclean(ULI unit, char *key): Deletes the TOC entry corresponding to key and all subsequent entries. As with psio_tocdel(), this function only deletes the entry references from the TOC itself and does not remove the corresponding data from the file. This function is still under construction.
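The multivolume (striped) layout mentioned in the description of psio_open() can be pictured with the sketch below. It assumes a simple round-robin assignment of pages to volumes starting with volume 0; the type StripeLocation and the function locate_page are hypothetical, and the library's actual bookkeeping may differ.

```cpp
// Where a given global page lands when a file is striped over nvol volumes.
struct StripeLocation {
  unsigned long volume;       // which volume holds the page
  unsigned long local_page;   // page index within that volume
};

// Assumption: pages are dealt out to the volumes in succession,
// page 0 to volume 0, page 1 to volume 1, and so on, wrapping around.
// nvol must be at least 1.
StripeLocation locate_page(unsigned long page, unsigned long nvol)
{
  StripeLocation loc;
  loc.volume = page % nvol;
  loc.local_page = page / nvol;
  return loc;
}
```

Under this assumption, consecutive pages of a large entry land on different volumes, which is what allows the I/O load to be spread across several disks.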
#include <cstdio>
#include <cstdlib>
#include <libipv1/ip_lib.h>
#include <libpsio/psio.h>
#include <libciomr/libciomr.h>
extern "C" {
FILE *infile, *outfile;
char *psi_file_prefix;
}
using namespace psi::MODULE_NAME;
int main(int argc, char* argv[])
{
int i, M, N;
double enuc, *some_data;
psio_address next; /* Special page/offset structure */
psi_start(&infile,&outfile,&psi_file_prefix,argc-1, argv+1, 0);
ip_cwk_add(":MODULE_NAME"); // MODULE_NAME in all caps
tstart(outfile);
/* Initialize the I/O system */
psio_init(); psio_ipv1_config();
/* Open the file and write an energy */
psio_open(31, PSIO_OPEN_NEW);
enuc = 12.3456789;
psio_write_entry(31, "Nuclear Repulsion Energy", (char *) &enuc,
sizeof(double));
psio_close(31,1);
/* Read M rows of an MxN matrix from a file (M and N must be set beforehand) */
some_data = init_array(M*N); /* flat storage; init_matrix() returns double ** */
psio_open(91, PSIO_OPEN_OLD);
next = PSIO_ZERO;/* Note use of the special macro */
for(i=0; i < M; i++)
psio_read(91, "Some Coefficients", (char *) (some_data + i*N),
N*sizeof(double), next, &next);
psio_close(91,0);
/* Close the I/O system */
psio_done();
tstop(outfile);
ip_done();
psi_stop(infile, outfile, psi_file_prefix);
exit(0);
}
extern "C" {
char *gprgid()
{
char *prgid = "CODE_NAME";
return(prgid);
}
}
The interface to the PSI3 I/O system has been designed to mimic that of the old wreadw() and wwritw() routines of libciomr (see the next section of this manual). The table of contents system introduces a few complications that users of the library should be aware of:
In this section we will consider these libraries in greater detail.
The libchkpt.a library is a collection of functions used to access the PSI3 checkpoint file (file32), the file which contains the most frequently used information about the computation, such as molecular geometry, basis set, HF determinant, etc. Previously, the checkpoint file was a fixed-format file which was accessed using the old PSI3 I/O system. In the spring of 2002, however, it was changed to use the new libpsio.a I/O system, and it is now free format. That is, any programmer can add content to the file at will. The old checkpoint file interface has been updated to access the new underlying I/O system. It is mandatory that the checkpoint file be accessed via the libchkpt.a functions only.
#include <cstdio>
#include <cstdlib>
#include <libipv1/ip_lib.h>
#include <libciomr/libciomr.h>
#include <libpsio/psio.h>
#include <libchkpt/chkpt.h>
extern "C" {
  FILE *infile, *outfile;
  char *psi_file_prefix;
}
using namespace psi::MODULE_NAME;
int main(int argc, char* argv[])
{
  int nmo;
  double escf, etot;
  double *evals;
  double **scf;
  psi_start(&infile, &outfile, &psi_file_prefix,
            argc-1, argv+1, 0);
  ip_cwk_add(":MODULE_NAME"); // MODULE_NAME all caps here
  psio_init(); psio_ipv1_config();
  tstart(outfile); /* start timing; paired with tstop() below */
  /*------------------------------------
    now initialize the checkpoint structure
    and begin reading info
  ------------------------------------*/
  chkpt_init(PSIO_OPEN_OLD);
  escf = chkpt_rd_escf();
  evals = chkpt_rd_evals();
  scf = chkpt_rd_scf();
  nmo = chkpt_rd_nmo();
  chkpt_wt_etot(-1000.0);
  etot = chkpt_rd_etot();
  chkpt_close();
  /*--------------------------------------------
    print out info to see what has been read in
  --------------------------------------------*/
  fprintf(outfile,"\n\n\tEscf = %20.10lf\n",escf);
  fprintf(outfile,"\tEtot = %20.10lf\n",etot);
  fprintf(outfile,"SCF EIGENVECTOR\n");
  eivout(scf,evals,nmo,nmo,outfile);
  psio_done();
  tstop(outfile);
  ip_done();
  psi_stop(infile,outfile,psi_file_prefix);
  exit(0);
}
/*-------------------------------------------------
  don't forget to add the obligatory gprgid section
-------------------------------------------------*/
extern "C" {
  char *gprgid()
  {
    char *prgid = "MODULE_NAME";
    return(prgid);
  }
}
| Arguments: | the libpsio status marker PSIO_OPEN_OLD; also requires that the input parser be initialized so that it can open the checkpoint file. |
| Returns: | zero. Perhaps this will change some day. |
int chkpt_close()
Closes the checkpoint file, frees memory, etc.
| Arguments: | none, but chkpt_init must already have been called for this to work. |
| Returns: | zero. Perhaps this, too, will change one day. |
| Arguments: | takes no arguments. |
| Returns: | a string, like "CISD", or "MCSCF" or some other wavefunction designation. |
char *chkpt_rd_label()
Reads the main checkpoint file label.
| Arguments: | takes no arguments. |
| Returns: | calculation label. |
char *chkpt_rd_sym_label()
Reads the label for the point group.
| Arguments: | takes no arguments. |
| Returns: | point group label. |
| Arguments: | takes no arguments. |
| Returns: | an array of labels (strings) which denote the irreps for the point group in which the molecule is considered, _regardless_ of whether there exist any symmetry orbitals which transform as that irrep. |
char **chkpt_rd_hfsym_labs()
Read in the symmetry labels only for those irreps
which have basis functions.
| Arguments: | takes no arguments. |
| Returns: | an array of labels (strings) which denote the irreps which have basis functions (in Cotton ordering). For DZ or STO-3G water, for example, in C2v symmetry, this is an array of three labels: A1, B1, and B2. |
| Arguments: | takes no arguments. |
| Returns: | the +/- dimensionality of ALPHA and BETA vectors of coupling coefficients for open shells. |
int chkpt_rd_max_am()
Reads in the maximum orbital quantum number of AOs in the basis.
| Arguments: | takes no arguments. |
| Returns: | the maximum orbital quantum number of AOs in the basis. |
int chkpt_rd_mxcoef()
Reads the value of the constant mxcoef.
| Arguments: | takes no arguments. |
| Returns: | the sum of the squares of the number of symmetry orbitals for each irrep. This gives the number of elements in the non-zero symmetry blocks of the SCF eigenvector. For STO-3G water, with 4, 0, 1, and 2 symmetry orbitals in the four irreps, mxcoef = 4*4 + 0*0 + 1*1 + 2*2 = 21. |
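The definition of mxcoef above translates directly into a short loop. The sketch below is illustrative only: mxcoef_from_sopi is a hypothetical helper, not a libchkpt function, and it takes the per-irrep symmetry orbital counts as a plain vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum of squared block dimensions: the number of elements in the non-zero
// symmetry blocks of the SCF eigenvector.
int mxcoef_from_sopi(const std::vector<int> &sopi) {
  int total = 0;
  for (std::size_t h = 0; h < sopi.size(); ++h) {
    assert(sopi[h] >= 0);
    total += sopi[h] * sopi[h];
  }
  return total;
}
```

For a molecule with 4, 0, 1, and 2 symmetry orbitals in its four irreps (the STO-3G water case in C2v), this gives 16 + 0 + 1 + 4 = 21.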
int chkpt_rd_nao()
Reads in the total number of atomic orbitals (read: Cartesian Gaussian
functions).
| Arguments: | takes no arguments. |
| Returns: | total number of atomic orbitals. |
int chkpt_rd_natom()
Reads in the total number of atoms.
| Arguments: | takes no arguments. |
| Returns: | total number of atoms. |
int chkpt_rd_ncalcs()
Reads in the total number of calculations in the checkpoint file
(was always 1 in old libfile30.a, probably still is for now).
| Arguments: | takes no arguments. |
| Returns: | total number of calculations in the checkpoint file. |
int chkpt_rd_nirreps()
Reads in the total number of irreducible representations
in the point group in which the molecule is being considered.
| Arguments: | takes no arguments. |
| Returns: | total number of irreducible representations. |
int chkpt_rd_nmo()
Reads in the total number of molecular orbitals (may be different
from the number of basis functions).
| Arguments: | takes no arguments. |
| Returns: | total number of molecular orbitals. |
int chkpt_rd_nprim()
Reads in the total number of primitive Gaussian functions
(only primitives of _symmetry independent_ atoms are counted!).
| Arguments: | takes no arguments. |
| Returns: | total number of primitive Gaussian functions. |
int chkpt_rd_nshell()
Reads in the total number of shells. For example, the DZP basis set for
a carbon atom (contraction scheme (9s5p1d/4s2p1d)) has a total of 15 basis
functions, 15 primitives, and 7 shells. Shells of _all_ atoms are counted
(not only those of the symmetry independent atoms; compare chkpt_rd_nprim).
| Arguments: | takes no arguments. |
| Returns: | total number of shells. |
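The relation between shells and basis functions can be checked with a small calculation. This sketch assumes the DZP carbon set consists of four s shells, two p shells, and one d shell (seven shells in total), and uses the standard counts of (l+1)(l+2)/2 Cartesian or 2l+1 spherical harmonic functions per shell of angular momentum l. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Number of Cartesian Gaussians in a shell of angular momentum l.
int cartesian_functions(int l) { return (l + 1) * (l + 2) / 2; }

// Number of spherical harmonic Gaussians in a shell of angular momentum l.
int spherical_functions(int l) { return 2 * l + 1; }

// Total functions over a list of shell angular momenta (0 = s, 1 = p, ...).
int count_functions(const std::vector<int> &shell_am, bool spherical) {
  int total = 0;
  for (std::size_t i = 0; i < shell_am.size(); ++i) {
    assert(shell_am[i] >= 0);
    total += spherical ? spherical_functions(shell_am[i])
                       : cartesian_functions(shell_am[i]);
  }
  return total;
}
```

With shell angular momenta {0, 0, 0, 0, 1, 1, 2}, the spherical harmonic count is 15, matching the number of basis functions quoted above, while the Cartesian count is 16 because a Cartesian d shell carries six functions.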
int chkpt_rd_nso()
Reads in the total number of symmetry-adapted basis functions (read:
Cartesian or Spherical Harmonic Gaussians).
| Arguments: | takes no arguments. |
| Returns: | total number of SOs. |
int chkpt_rd_nsymhf()
Reads in the total number of irreps
in the point group in which the molecule is being considered which
have non-zero number of basis functions. For STO-3G or DZ water, for
example, this is three, even though nirreps is 4 (compare
int chkpt_rd_nirreps()).
| Arguments: | takes no arguments. |
| Returns: | total number of irreducible representations with a non-zero number of basis functions. |
int chkpt_rd_num_unique_atom()
Reads in the number of symmetry unique atoms.
| Arguments: | takes no arguments. |
| Returns: | number of symmetry unique atoms. |
int chkpt_rd_num_unique_shell()
Reads in the number of symmetry unique shells.
| Arguments: | takes no arguments. |
| Returns: | number of symmetry unique shells. |
int chkpt_rd_phase_check()
Reads the phase flag, which is 1 if the orbital phases have been checked
and is 0 otherwise (phase checking just helps ensure the arbitrary phases
of the orbitals are consistent from one geometry to the next, which helps
various guessing or extrapolation schemes).
| Arguments: | takes no arguments. |
| Returns: | flag. |
int chkpt_rd_ref()
Reads the reference type from the flag in the checkpoint file.
0 = RHF, 1 = UHF, 2 = ROHF, 3 = TCSCF.
| Arguments: | takes no arguments. |
| Returns: | flag indicating the reference. |
int chkpt_rd_rottype()
Reads the rigid rotor type the molecule represents.
0 = asymmetric, 1 = symmetric, 2 = spherical, 3 = linear, 6 = atom.
| Arguments: | takes no arguments. |
| Returns: | rigid rotor type. |
| Arguments: | takes no arguments. |
| Returns: | an array nshell long that maps shells from the angmom-ordered to the canonical (in the order of appearance) order. |
chkpt_rd_atom_position()
Reads in symmetry positions of atoms.
Allowed values are as follows:
| Arguments: | takes no arguments. |
| Returns: | an array of symmetry positions of atoms. |
int *chkpt_rd_clsdpi()
Reads in an array which has an element for each irrep of the
point group of the molecule (n.b. not just the ones
with a non-zero number of basis functions). Each element
contains the number of doubly occupied MOs for that irrep.
| Arguments: | takes no arguments. |
| Returns: | the number of doubly occupied MOs per irrep. |
int *chkpt_rd_openpi()
Reads in an array which has an element for each irrep of the
point group of the molecule (n.b. not just the ones
with a non-zero number of basis functions). Each element
contains the number of singly occupied MOs for that irrep.
| Arguments: | takes no arguments. |
| Returns: | the number of singly occupied MOs per irrep. |
int *chkpt_rd_orbspi()
Reads in the number of MOs in each irrep.
| Arguments: | takes no arguments. |
| Returns: | the number of MOs in each irrep. |
int *chkpt_rd_shells_per_am()
Reads in the number of shells in each angmom block.
| Arguments: | takes no arguments. |
| Returns: | the number of shells in each angmom block. |
chkpt_rd_sloc()
Read in an array of pointers to the first AO
from each shell.
| Arguments: | takes no arguments. |
| Returns: | an array nshell long of pointers to the first AO from each shell. |
chkpt_rd_sloc_new()
Read in an array of pointers to the first basis
function (not AO as chkpt_rd_sloc does)
from each shell.
| Arguments: | takes no arguments. |
| Returns: | an array nshell long of pointers to the first basis function from each shell. |
int *chkpt_rd_snuc()
Reads in an array of pointers to the nuclei on which shells are centered.
| Arguments: | takes no arguments. |
| Returns: | an array nshell long of pointers to the nuclei on which shells are centered. |
int *chkpt_rd_snumg()
Reads in array of the numbers of the primitive
Gaussians in the shells.
| Arguments: | takes no arguments. |
| Returns: | an array nshell long of the numbers of the primitive Gaussians in shells. |
int *chkpt_rd_sprim()
Reads in pointers to the first primitive
from each shell.
| Arguments: | takes no arguments. |
| Returns: | an array nshell long of pointers to the first primitive from each shell. |
chkpt_rd_sopi()
Read in the number of symmetry-adapted basis functions in each symmetry block.
| Arguments: | takes no arguments. |
| Returns: | an array nirreps long of the numbers of symmetry orbitals in symmetry blocks. |
int *chkpt_rd_stype()
Reads in angular momentum numbers of
the shells.
| Arguments: | takes no arguments. |
| Returns: | an array nshell long of the angular momentum numbers of the shells. |
int *chkpt_rd_symoper()
Read in the mapping array between "canonical" ordering
of the symmetry operations of the point group and the
one defined in symmetry.h.
| Arguments: | takes no arguments. |
| Returns: | a mapping array nirrep long |
int *chkpt_rd_ua2a()
Read in the mapping array from the symmetry-unique atom
list to the full atom list.
| Arguments: | takes no arguments. |
| Returns: | a mapping array num_unique_atom long |
int *chkpt_rd_us2s()
Read in the mapping array from the symmetry-unique shell list
to the full shell list.
| Arguments: | takes no arguments. |
| Returns: | a mapping array num_unique_shell long |
| Arguments: | takes no arguments. |
| Returns: | a matrix of integers. Each row corresponds to a particular symmetry operation, while each column corresponds to a particular atom. The value of ict[2][1], then, should be interpreted in the following manner: upon application of the third symmetry operation of the relevant point group, the second atom is placed in the location originally occupied by atom number ict[2][1]. |
int **chkpt_rd_shell_transm()
Reads in the transformation matrix for the shells. Each row of the
matrix is the orbit of the shell under symmetry operations of the point
group.
| Arguments: | takes no arguments. |
| Returns: | a matrix of nshell*nirreps integers. |
| Arguments: | takes no arguments. |
| Returns: | the correlation energy. |
double chkpt_rd_enuc()
Reads in the nuclear repulsion energy
| Arguments: | takes no arguments. |
| Returns: | the nuclear repulsion energy. |
double chkpt_rd_eref()
Reads in the reference energy (may be different from HF energy).
| Arguments: | takes no arguments. |
| Returns: | the reference energy. |
double chkpt_rd_escf()
Reads in the SCF HF energy.
| Arguments: | takes no arguments. |
| Returns: | the SCF HF energy. |
double chkpt_rd_etot()
The total energy, be it HF, CISD, CCSD, or whatever! This is
the preferred function to use for geometry optimization via energies,
printing energies in analysis, etc., since this value is valid whatever
the calculation type.
| Arguments: | takes no arguments. |
| Returns: | The total energy. |
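| Arguments: | takes no arguments. |
| Returns: | an array of _all_ of the SCF eigenvalues, ordered by irrep, and by increasing energy within each irrep (e.g. for STO-3G water, the four a1 eigenvalues come first in order of increasing energy, followed by the single b1 eigenvalue and then the two b2 eigenvalues). |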
double *chkpt_rd_exps()
Reads in the exponents of the primitive Gaussian functions.
| Arguments: | takes no arguments. |
| Returns: | an array of doubles. |
double *chkpt_rd_zvals()
Reads in nuclear charges.
| Arguments: | takes no arguments. |
| Returns: | an array natom long of nuclear charges (as doubles). |
| Arguments: | int irrep, designates the desired symmetry block |
| Returns: | a square matrix with orbspi[irrep] rows. The eigenvectors are stored with the column index denoting MOs and the row index denoting SOs: this means that scf_vector[i][j] is the contribution of the i-th SO to the j-th MO. |
double **chkpt_rd_ccvecs()
Reads in a matrix whose rows are the
ALPHA (ccvecs[0]) and BETA (ccvecs[1]) matrices of coupling
coefficients for open shells, stored in lower triangular form.
Coupling coefficients are defined NOT as in
C.C.J. Roothaan, Rev. Mod. Phys. 32, 179 (1960), as is stated in the
manual pages for CSCF, but according to Pitzer (no reference yet),
and are **different** from those in Yamaguchi, Osamura, Goddard, and
Schaefer's book "Analytic Derivative Methods in Ab Initio Molecular
Electronic Structure Theory".
The relationship between Pitzer's and Yamaguchi's conventions is
as follows: ALPHA = 1-2*a, BETA = 1+4*b, where a and b are
the alphas and betas for open shells
defined on pp. 69-70 of Dr. Yamaguchi's book.
| Arguments: | takes no arguments. |
| Returns: | double **ccvecs, a 2 by abs(iopen) matrix whose rows are the coupling coefficient matrices for open shells in packed form. For the definition of iopen, see chkpt_rd_iopen(). |
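The conversion between the two conventions stated above (ALPHA = 1-2*a, BETA = 1+4*b) can be written as a one-line mapping per coefficient. The struct and function names below are hypothetical and serve only to illustrate the formulas as given in the text.

```cpp
#include <cassert>

// One pair of coupling coefficients in the convention used by the
// checkpoint file (called Pitzer's convention in the text).
struct PitzerCoupling {
  double alpha;
  double beta;
};

// Convert coefficients a and b from the convention of Yamaguchi et al.
// using the relations quoted above.
PitzerCoupling from_yamaguchi(double a, double b) {
  PitzerCoupling c;
  c.alpha = 1.0 - 2.0 * a;
  c.beta = 1.0 + 4.0 * b;
  return c;
}
```

Applying this element by element to packed lower-triangular arrays of a and b would yield arrays laid out like ccvecs[0] and ccvecs[1].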
chkpt_rd_contr_full()
Reads in the normalized contraction coefficients.
| Arguments: | takes no arguments. |
| Returns: | a matrix MAXANGMOM (a constant defined in ???) by the total number of primitives nprim; each primitive Gaussian contributes to only one shell (and one basis function, of course), so most of these values are zero. |
double **chkpt_rd_geom()
Reads in the cartesian geometry.
| Arguments: | takes no arguments. |
| Returns: | The cartesian geometry is returned as a matrix of doubles. The row index is the atomic index, and the column is the cartesian direction index (x=0, y=1, z=2). Therefore, geom[2][0] would be the x-coordinate of the third atom. |
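As an illustration of the geom[atom][xyz] layout, the sketch below computes a nuclear repulsion energy from a geometry matrix and an array of nuclear charges like the one returned by chkpt_rd_zvals(). It is a standalone example using std::vector rather than the library's own arrays; the function name is hypothetical, and it assumes coordinates in bohr so that the result is in hartree.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// geom[i][0..2] are the x, y, z coordinates of atom i; zvals[i] is its
// nuclear charge. Returns the sum over atom pairs of Z_i * Z_j / r_ij.
double nuclear_repulsion(const std::vector<std::vector<double> > &geom,
                         const std::vector<double> &zvals) {
  assert(geom.size() == zvals.size());
  double enuc = 0.0;
  for (std::size_t i = 0; i < geom.size(); ++i) {
    for (std::size_t j = 0; j < i; ++j) {  // each pair counted once
      double dx = geom[i][0] - geom[j][0];
      double dy = geom[i][1] - geom[j][1];
      double dz = geom[i][2] - geom[j][2];
      enuc += zvals[i] * zvals[j] / std::sqrt(dx * dx + dy * dy + dz * dz);
    }
  }
  return enuc;
}
```

In a real module the value obtained this way could be compared against chkpt_rd_enuc() as a consistency check.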
chkpt_rd_lagr()
chkpt_rd_alpha_lagr()
chkpt_rd_beta_lagr()
Reads in an (RHF, alpha UHF, beta UHF) Lagrangian matrix in MO basis.
| Arguments: | takes no arguments. |
| Returns: | a matrix nmo by nmo. |
double **chkpt_rd_scf()
double **chkpt_rd_alpha_scf()
double **chkpt_rd_beta_scf()
Reads in the (RHF, alpha UHF, beta UHF) eigenvector.
| Arguments: | takes no arguments. |
| Returns: | a square matrix of dimensions nmo by nmo (see: chkpt_rd_nmo()). The symmetry blocks of the SCF vector appear on the diagonal of this matrix. |
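Because the symmetry blocks sit on the diagonal of the full matrix, the block for one irrep can be located from running sums of the per-irrep orbital counts. The sketch below is a standalone illustration with hypothetical function names; it assumes the number of SOs equals the number of MOs in every irrep, so that each block is square and the same offset applies to rows and columns.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<double> > Matrix;

// offset[h] is the index of the first row/column of irrep h's block.
std::vector<int> block_offsets(const std::vector<int> &orbspi) {
  std::vector<int> offset(orbspi.size(), 0);
  for (std::size_t h = 1; h < orbspi.size(); ++h)
    offset[h] = offset[h - 1] + orbspi[h - 1];
  return offset;
}

// Copy the square diagonal block for irrep h out of the full matrix.
Matrix extract_block(const Matrix &full, const std::vector<int> &orbspi,
                     std::size_t h) {
  assert(h < orbspi.size());
  const std::size_t start = static_cast<std::size_t>(block_offsets(orbspi)[h]);
  const std::size_t n = static_cast<std::size_t>(orbspi[h]);
  assert(start + n <= full.size());
  Matrix block(n, std::vector<double>(n, 0.0));
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = 0; j < n; ++j)
      block[i][j] = full[start + i][start + j];
  return block;
}
```

An irrep with no orbitals simply yields an empty block, and the offsets of the irreps that follow it are unaffected.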
chkpt_rd_schwartz()
Reads in the table of maxima of Schwartz integrals (ij|ij)
for each shell doublet.
| Arguments: | takes no arguments. |
| Returns: | NULL if no table is present in the checkpoint file, a matrix nshell by nshell otherwise. |
chkpt_rd_usotao_new()
Reads in an AO to SO transformation matrix.
| Arguments: | takes no arguments. |
| Returns: | a nso by nao matrix of doubles. |
chkpt_rd_usotbf()
Reads in a basis function to SO transformation matrix.
| Arguments: | takes no arguments. |
| Returns: | a nso by nso matrix of doubles. |
chkpt_rd_zmat()
Reads in the z-matrix
| Arguments: | takes no arguments. |
| Returns: | struct *z_entry natom long. |
The functions previously documented in this manual have been removed
because that documentation is now out of date. Documentation of the
library is now created directly from the source code using the
doxygen program and is available at
http://www.psicode.org/doc/libs/doxygen/html.
In the context of programming, style can refer to many things. Foremost, it refers to the format of the source code: how to use indentation, when to add comments, how to name variables, etc. It can also refer to many other issues, such as code organization, modularity, and efficiency. Of course, stylistic concerns are often matters of individual taste, but the validity and portability of the code will often ultimately depend on stylistic decisions made in the process of code development. Hence some stylistic choices are viewed as universally bad (e.g. not prototyping every function just because ``the code compiles and runs fine as is'', etc.). Admittedly, it is easy to not have any style, but it takes years to learn what makes a good one. A good programming style can reduce debugging and maintenance times dramatically. For a large package such as PSI3, it is very important to adopt a style which makes the code easy for others to understand and modify. This section will give a few brief pointers on what we consider to be a good style in programming.
Of course, for very simple programs design and implementation may be combined and documentation may consist of one line. However, for more complex programs it is recommended that the five stages are followed. This means that you should spend only about 20-40% of your time writing source code! Our experience shows that following this scheme results in the most efficient approach to programming in the long run.
To learn more on each stage of the software writing process, you may want to refer to Stroustrup's ``C++ Programming Language'' book (3rd Ed.) as the most common reference source not dedicated solely to one narrow subject. Besides being an excellent description of C++, it is also an introduction to writing software as well. Particular attention is paid to the issue of program design.
In C programs, we also consider it a good idea to place all the #include statements in a file such as includes.h, which is subsequently included in each relevant C source file. This is helpful because if a new header file needs to be added, it can simply be added to includes.h. Furthermore, if a source file suddenly needs to have access to a global variable or function prototype which is already present in one of the header files, then no changes need to be made; the header file is already included. A downside to this approach is that each header file is included in every source file which includes includes.h, regardless of whether a particular header file is actually needed by that source file; this could potentially lead to longer compile times, but it isn't likely to make a discernible difference, at least in C.
Along similar lines, it is helpful to define all global variables in one location (in the main program file, or else within globals.c), and they should be declared within another standard location (perhaps globals.h, or common.h). Similarly, if functions are used in several different source code files, the programmer may wish to place all function prototype declarations in a single header file, with the same name as the program or library, or perhaps called protos.h.
It is very common that statements within loops are indented. Loops within loops are indented yet again, and so on. This practice is near-universal and very helpful. Computational chemistry programs often require many nested loops. The consequence of this is that lines can be quite long, due to all those spaces before each line in the innermost loops. If the lines become longer than 80 characters, they are hard to read within a single window; please try to keep your lines to 80 characters or less. This means that you should use about 2-4 spaces per indentation level.
The matching of braces, and so forth, is more variable, and we recommend you follow the convention of The C Programming Language, by Kernighan and Ritchie, or perhaps the style found in other PSI3 modules.
PSI3 programs have certain conventions in place for the names of the most common variables, as shown in Table 1.
| Quantity | Variable(s) |
| Number of atoms | na, natom, num_atoms |
| Number of atoms * 3 | natom3, num_atoms3 |
| Nuclear repulsion energy | enuc, repnuc |
| SCF energy | escf |
| Number of atomic orbitals | nbfao, num_ao, nao |
| Number of symmetry orbitals | nbfso, num_so, nso |
| Size of lower triangle of AO's, SO's | nbatri, nbstri; ntri |
| Input file pointer | infile |
| Output file pointer | outfile |
| Offset array | ioff |
| Number of irreps | num_ir, nirreps |
| Open-shell flag | iopen |
| Number of orbitals per irrep | orbs_per_irrep, orbspi, mopi |
| Number of closed-shells per irrep | docc, clsd_per_irrep, clsdpi |
| Number of open-shells per irrep | socc, open_per_irrep, openpi |
| Orbital symmetry array | orbsym |
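The offset array ioff and the lower-triangle sizes in Table 1 are usually used together to address symmetric matrices stored in packed form. The sketch below shows the common convention ioff[i] = i(i+1)/2, with element (i,j), i >= j, stored at position ioff[i] + j. The helper names are hypothetical, and individual PSI3 modules may set up ioff in their own way.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// ioff[i] = i*(i+1)/2: offset of row i in a packed lower triangle.
std::vector<int> make_ioff(int n) {
  assert(n >= 0);
  std::vector<int> ioff(static_cast<std::size_t>(n), 0);
  for (int i = 1; i < n; ++i) ioff[i] = ioff[i - 1] + i;
  return ioff;
}

// Packed index of element (i, j) of a symmetric matrix; the larger index
// selects the row so that (i, j) and (j, i) map to the same position.
int packed_index(const std::vector<int> &ioff, int i, int j) {
  return (i >= j) ? ioff[i] + j : ioff[j] + i;
}

// Number of elements in the lower triangle (diagonal included) of an
// n x n matrix.
int ntri(int n) { return n * (n + 1) / 2; }
```

For seven basis functions, for instance, ntri gives 28 stored elements instead of the 49 of the full square matrix.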
A practice which is probably preferable is to have a different print flag (boolean) for each of the major intermediates used by a program, and to have an overall print option (decimal) whose value determines the printing verbosity for the quantities without a specific printing option. The overall print option should be specified by a keyword PRINT_LVL, and its action should be as in Table 2.
| 0 | Almost no printing; to be used by driver programs with -quiet option |
| 1 | Usual printing (default) |
| 2 | Verbose printing |
| 3 | Some debugging information |
| 4 | Substantial debugging information |
| 5 | Print almost all intermediates unless arrays too large |
| 6 | Print everything |
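One way to combine per-quantity flags with an overall print level is sketched below. The PrintOptions struct, its field names, and the helper function are hypothetical; only the meaning of the levels follows Table 2.

```cpp
#include <cassert>

// Hypothetical option bundle: one overall verbosity plus a specific flag
// for one major intermediate.
struct PrintOptions {
  int print_lvl;   // overall level, 0 to 6 as in Table 2
  bool print_mos;  // dedicated flag for the MO coefficients
};

// Quantities without a dedicated flag are printed when the overall level
// reaches the level at which they become relevant.
bool print_at_level(const PrintOptions &opts, int min_level) {
  return opts.print_lvl >= min_level;
}
```

A module would then guard the MO printout with opts.print_mos alone, and guard a debugging intermediate with something like print_at_level(opts, 3).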
Having said this, we will argue against excessive commenting: don't add a comment every time you do i++! It will actually make your code harder to read. Be sensible.
As of spring 2002, we have adopted the doxygen program to automatically generate source code documentation. This program scans the source code and looks for special codes which tell it to add the given comment block to the documentation list. The program is very fancy and can generate documentation in man, html, latex, and rtf formats. The file psi3.dox is the doxygen configuration file. The source code should be commented in the following way to work with doxygen.
The first file of each library defines a ``module'' via a special comment line:
/*! \defgroup PSIO libpsio: The PSI I/O Library */
Note the exclamation mark above -- it is required by doxygen. The line above defines the PSIO key and associates it with the title ``The PSI I/O Library.'' Each file belonging to this group will have a special comment of the following form:
/*!
** \file
** \ingroup PSIO
** \brief A brief descriptor of the file should go here
**
** A more detailed description of the file can go here
*/
This tells doxygen that this file should be documented, it should be added to the list of documented files, and it belongs to the PSIO group. Do not put the actual filename after the \file directive, because current versions of doxygen have trouble when duplicate filenames appear in different modules. Leaving the filename blank after the \file directive lets doxygen create a unique filename using part of the file path.
All functions should be commented as in the following:
/*!
** PSIO_CLOSE(): Closes a multivolume PSI direct access file.
**
** \param unit = The PSI unit number used to identify the file to all read
** and write functions.
** \param keep = Boolean to indicate if the file should be deleted (0) or
** retained (1).
**
** Returns: always returns 0
**
** \ingroup PSIO
*/
int psio_close(ULI unit, int keep)
...
This will add the function psio_close to the list, associate it with the PSIO module, and define the various arguments.
Please note: In addition to listing all the parameters and return values, it is very valuable to explain what the function actually does. Add this explanation immediately after the function name (see above). This explanation might be a few words, or an entire paragraph, as necessary.
It is possible to include formulas in the doxygen documentation
and to have them properly formatted when output to HTML or LaTeX.
If the formula appears in the running text of a doxygen comment,
enclose it within a pair of \f$ commands,
and format it according to LaTeX rules. To make the formula
centered on a new line, enclose it within \f[ and \f]. If the
formula is to be in an environment other than simple math mode
(e.g., an eqnarray), then begin the environment with
\f{environment} and end it with \f}, where environment is something
like eqnarray*. According to the doxygen documentation,
the program can have trouble recovering from typos in formulas, and to
get rid of a typo in a formula it may be necessary to remove the file
formula.repository from the HTML directory.
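A short illustration of the formula commands inside a function comment follows. The function gaussian_primitive is a made-up example and is not part of any PSI3 library; the comment layout imitates the psio_close example above.

```cpp
#include <cassert>
#include <cmath>

/*!
** gaussian_primitive(): Evaluates an unnormalized s-type primitive
** Gaussian, \f$ g(r) = e^{-\alpha r^2} \f$, at a distance r from its
** center.
**
** The same expression, centered on its own line:
** \f[
**   g(r) = e^{-\alpha r^2}
** \f]
**
** \param alpha = The orbital exponent.
** \param r = Distance from the center of the Gaussian.
**
** Returns: the value of the primitive at r
*/
double gaussian_primitive(double alpha, double r) {
  return std::exp(-alpha * r * r);
}
```

The inline form is convenient for short expressions in running text, while the centered form suits definitions that deserve their own line in the generated HTML or LaTeX.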
Makefiles consist of rules which describe how to carry out commands. For example, a rule might explain how to compile a single source file, or how to link all the object files into the executable, or perhaps how to clean up all the object files. A rule has the following form
target: dependencies
	command
	command
	...
The target is the name of the rule, e.g. the name of the program
or file to be compiled. The first rule given in the Makefile is
the default. The dependencies are the names of files (often
names of other targets, as well) on which the construction of the
target depends. A particular target does not necessarily have to have
dependencies. The commands are the actual commands to be
executed once all the dependencies are complete. Note that a
<TAB> must be used to indent commands under the target
name; if you use spaces or don't indent you'll get a (not entirely
clear) error message. Makefiles may also contain variable
definitions, which can make the file simpler.
As an example, consider the Makefile.in file associated with cscf:
srcdir = @srcdir@
VPATH = @srcdir@
include ../MakeVars
PSILIBS = -lPSI_file30 -lPSI_chkpt -lPSI_iwl -lPSI_psio -lPSI_ciomr -lPSI_ipv1
TRUESRC = \
cscf.c cleanup.c dft_inputs.c diis.c dmat.c \
dmat_2.c ecalc.c errchk.c findit.c \
formg2.c formgc.c formgo.c form_vec.c gprgid.c init_scf.c \
packit_c.c packit_o.c rdone.c rdtwo.c rotate_vector.c scf_input.c \
scf_iter.c scf_iter_2.c schmit.c sdot.c shalf.c check_rot.c phases.c\
guess.c sortev.c occ_fun.c init_uhf.c cmatsplit.c dmatuhf.c \
findit_uhf.c uhf_iter.c schmit_uhf.c diis2_uhf.c formg_direct.c \
orb_mix.c
BINOBJ = $(TRUESRC:%.c=%.o)
ALLOC =
include ../MakeRules
ifneq ($(DODEPEND),no)
$(BINOBJ:%.o=%.d): $(DEPENDINCLUDE)
include $(BINOBJ:%.o=%.d)
endif
install_man:: cscf.1
	$(MKDIRS) $(mandir)/man1
	$(INSTALL_INCLUDE) $^ $(mandir)/man1
The @string@ directives tell the configure script where to insert certain variables it has determined from the system. This Makefile input also includes two external Makefiles, MakeVars and MakeRules, both of which are in the parent directory. These files contain (not surprisingly) numerous necessary variables (e.g. the local C compiler name) and rules (e.g. how to generate the module itself) for compilation and installation of cscf. Similar files exist for the PSI libraries as well. We recommend that programmers spend some time studying the PSI Makefile structure.
Now you are ready to work on the code. Changes to source files (including the Makefile) should be made to the files in $PSI3/src/bin/great_code, and all compilations should be run in $prefix/src/bin/great_code.
Most interactive debuggers allow the programmer to specify multiple source code search directories using simple command-line options. For example, if one were debugging the cscf program and needed access to the libciomr.a library source code in addition to that of cscf, one could use gdb's ``dir'' command to search several source code directories:
dir $PSI/src/lib/libciomr
Additionally, such commands can be placed in the user's $HOME/.gdbinit file. In dbx, the ``use'' command specifies multiple source directories.
setenv MANPATH /usr/local/psi3-bin/doc/man:/usr/share/man
The usual man path should be added after the PSI3 part and will be different for different systems. Different directories are separated by colons.
Note also that full documentation should also include citations to any relevant publications upon which the code may be based.
If you have just added a new module for performing, say, multireference coupled cluster, and you would like to add a test case to the current test suite, here is what you should do.
Please contact one of the authors of PSI3 before making any major changes or if you have a problem adding a new test case. Remember, if all else fails, read the source code.
The following is a list of special items that should be kept in mind while developing PSI code.
PSI uses several text files to store certain types of information. Storing
information in text files makes it much easier for users to inspect
and manipulate that
information, provided that the user understands the format of that file.
In the following file format descriptions, the x, y, and z coordinates of
nucleus i are written x_i, y_i, and z_i, respectively; internal coordinates
are likewise indexed by i; and
E denotes the sum of the electronic energy and nuclear repulsion energy.
geom.dat
A vectorized format which is appropriate for the routines in libipv1 or iomr is employed in geom.dat. Generally, the first line of geom.dat is
%%
Though this does not affect the parsing routines in libipv1, or any of the common programs which read geom.dat (i.e. rgeom or ugeom), some PSI2 modules (bmat, etc.) expected this line and would muddle up geom.dat if it was not present. geom.dat will frequently have several entries, with the topmost being the most recent addition by bmat.
format: n = number of atoms.
[format listing (1) is not recoverable from this copy]
fconst.dat
This file contains the force constant matrix produced by optking or intder95. Because the force constant matrix is symmetric, only the lower triangle is stored here. The force constant matrix may be represented in either cartesian or internal coordinates, depending upon what flags were used when intder95 was run to produce fconst.dat. optking is the program which uses fconst.dat most frequently, and it assumes that the force constant matrix will be in terms of the internal coordinates as defined in input.dat or intco.dat. For this reason, it is best to have intder95 produce a fconst.dat in internal coordinates. The order of internal coordinates is determined by the order set up in input.dat or intco.dat. The totally symmetric coordinates always come first, followed by all asymmetric coordinates.
In the following format, the diagonal entry for index i is the force
constant for internal coordinate i, and the off-diagonal entry for
indices i and j is the force constant for
the mixed displacement of internal coordinates i and j.
format: n = total number of internal coordinates in intco.dat
or input.dat.
[format listings (2) and (3) are not recoverable from this copy]
file11.dat
The number of atoms (n), the total energy
as predicted by the final wavefunction, the
cartesian geometry, the cartesian gradients, the atomic charges (Z_i),
and a label are all contained in file11. The exact nature of the label
depends upon the type of wavefunction for which the gradient was calculated.
The first part of the label is determined by the label keyword in input.dat.
If an SCF gradient is run, then the calculation type (calctype), and
derivative type (dertype) will also appear.
If a correlated gradient has been run, calctype
[CI, CCSD, or CCSD(T)] and derivative type (FIRST) appear.
file11 will
frequently have several entries, with the last entry being the latest
addition by cints -deriv1.
format:
[Format (4): listing not recoverable from the source.]
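A reader for one file11 entry might look like the sketch below. The line layout it uses is an assumption, since the text only lists the quantities the file contains: a label line, a line holding n and the total energy, n lines of `Z x y z`, then n lines of gradient components. The function name `parse_file11_entry` and the sample values are hypothetical and should be checked against a real file11 before use.

```python
def parse_file11_entry(lines):
    """Parse one file11 entry.

    ASSUMED layout (verify against a real file):
      line 0        : label
      line 1        : n  E
      next n lines  : Z x y z
      next n lines  : gx gy gz
    """
    label = lines[0].strip()
    n_text, e_text = lines[1].split()
    n, energy = int(n_text), float(e_text)
    if len(lines) < 2 + 2 * n:
        raise ValueError("entry is shorter than the assumed layout requires")

    charges, geometry, gradient = [], [], []
    for row in lines[2:2 + n]:
        z, x, y, zc = (float(tok) for tok in row.split())
        charges.append(z)
        geometry.append((x, y, zc))
    for row in lines[2 + n:2 + 2 * n]:
        gradient.append(tuple(float(tok) for tok in row.split()))

    return {"label": label, "n": n, "energy": energy,
            "charges": charges, "geometry": geometry, "gradient": gradient}


# Made-up sample entry for illustration; not output from a real calculation.
sample = """example label
2 -1.5
1.0 0.0 0.0 -0.7
1.0 0.0 0.0 0.7
0.0 0.0 0.01
0.0 0.0 -0.01
""".splitlines()
entry = parse_file11_entry(sample)
```

Because file11 usually holds several entries with the newest last, a caller would typically split the file into entries of 2 + 2n lines and keep the final one.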
file12.dat
Internal coordinate values and gradients, the number of atoms (n), and the total energy (E) may be found in file12. file12 is produced by intder95, which can convert cartesian gradients into internal gradients. Generally, file12 will have several entries, with each entry corresponding to an entry in the file11 of interest.
format:
[Format (5): listing not recoverable from the source.]
file12a.dat
In order to calculate second derivatives from gradients taken at geometries finitely displaced from a particular geometry, intdif requires a file12a. This file contains essentially the same information as file12, but each entry also has information concerning which internal coordinate (numintco) was displaced in the gradient calculation and by how much (disp) it was displaced.
format:
[Format (6): listing not recoverable from the source.]
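To illustrate how a displaced coordinate and its displacement size turn gradients into second derivatives, the sketch below applies a standard central difference. It is not intdif's actual algorithm: it assumes gradients are available at both +disp and -disp along the displaced coordinate, and the function name `central_difference_column` and the model energy surface are hypothetical.

```python
def central_difference_column(grad_plus, grad_minus, disp):
    """Approximate one column of the second-derivative matrix.

    grad_plus / grad_minus are internal-coordinate gradients evaluated with
    coordinate k displaced by +disp and -disp; the result approximates
    d(gradient)/d(q_k). ASSUMPTION: both displacements are available.
    """
    if disp == 0:
        raise ValueError("displacement must be nonzero")
    return [(gp - gm) / (2.0 * disp) for gp, gm in zip(grad_plus, grad_minus)]


def model_gradient(q1, q2):
    """Gradient of the model energy 0.5*(2.0*q1**2 + 2*0.5*q1*q2 + 1.0*q2**2)."""
    return [2.0 * q1 + 0.5 * q2, 0.5 * q1 + 1.0 * q2]


# Displace coordinate 1 by +/- 0.01 about the reference point (0.1, -0.2).
disp = 0.01
column = central_difference_column(model_gradient(0.1 + disp, -0.2),
                                   model_gradient(0.1 - disp, -0.2),
                                   disp)
# For this quadratic model the exact second derivatives are 2.0 and 0.5.
```

Repeating this for every displaced coordinate fills the matrix one column at a time, which is why each entry needs to record both the coordinate index and the step size.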
file15.dat
The cartesian Hessian matrix is found in file15. The first line of this file gives the number of atoms (n) and, in case you are curious, six times the number of atoms (sixtimesn).
format:
[Format (7): listing not recoverable from the source.]
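The header described above gives a cheap consistency check when reading file15. The sketch below is a hedged illustration only: it assumes that the header is followed by the full 3n x 3n Hessian as whitespace-separated numbers in row-major order, which the text does not state. The function name `read_file15` and the sample values are hypothetical.

```python
def read_file15(text):
    """Read a cartesian Hessian under an ASSUMED layout.

    ASSUMPTIONS: first line is 'n sixtimesn'; the remaining tokens are the
    9*n*n elements of the 3n x 3n matrix in row-major order.
    """
    lines = text.strip().splitlines()
    n, sixtimesn = (int(tok) for tok in lines[0].split())
    if sixtimesn != 6 * n:
        raise ValueError("header is inconsistent: second value is not 6*n")

    values = [float(tok) for line in lines[1:] for tok in line.split()]
    dim = 3 * n
    if len(values) != dim * dim:
        raise ValueError(f"expected {dim * dim} Hessian elements, got {len(values)}")
    return [values[r * dim:(r + 1) * dim] for r in range(dim)]


# Made-up single-atom example for illustration only.
sample15 = """1 6
0.1 0.0 0.0
0.0 0.2 0.0
0.0 0.0 0.3
"""
hessian = read_file15(sample15)
```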
file16.dat
The second derivatives of the total energy with respect to the internal coordinates are found in file16. As in file15, the number of atoms (n) and six times that number (sixtimesn) are given.
format:
[Format (8): listing not recoverable from the source.]
file17.dat
First derivatives of the cartesian dipole moments (
format:
[Format (9): listing not recoverable from the source.]
file18.dat
First derivatives of the cartesian dipole moments (
format:
[Format (10): listing not recoverable from the source.]
The translation was initiated by T. Daniel Crawford on 2009-02-16