Specialized Terms
Review of Literature
    Introduction to Hydrocarbons: Types and properties
    Nomenclature
  History of Isomer Counting and Enumeration
    Mathematical Counting
    Computerized Enumeration
The Algorithm
  Processes and Methods
    Data Structure
    Main Program
    Development Process
Results
  Output
    Specific Output
    General Output
  Analysis
    Accuracy
    Efficiency
  Applications
  Conclusions
Appendix
  Main Program Code: enumerate.cpp
  Data Structure Code
    Header File: Tree.h
    Source Code: Tree.cpp
Works Cited & Endnotes
Acknowledgements

Computerized Isomer Enumeration of the Alkane Series
© 2002 Kevin Ballard

Abstract

An algorithm and data structure were written for the purpose of determining the individual structures of all the isomers of an alkane, a hydrocarbon characterized by a nonlooping branch structure. This was accomplished by cycling through possible structures, one at a time, and comparing different sections of each structure against itself during the creation process. Each structure would either be deemed unique and recorded, or discarded, based on a set of priority rules. The resulting computations for the number of isomers for alkanes with carbon contents up to 18 match the literature exactly. However, after 18 carbons a small number of excess structures are returned. The algorithm’s efficiency is its greatest accomplishment, far surpassing that of previous attempts. It uses only a very small amount of memory, and exhibits a linear relationship between the required computational time and the number of isomers calculated.

Specialized Terms

The first terms that need clarification are counting and enumeration. In reference to the calculation of isomers, the term ‘counting’ means finding the number of isomers of a given formula, without necessarily determining anything further about those isomers. The term ‘enumeration’ means finding the number of isomers, as well as all individual structures, for the given formula.
Counting has historically been performed by mathematical equations, whereas enumeration has been a result of computer programs, which can store vast amounts of data.

Two terms that have specific meaning to isomer enumeration are ‘irredundant’ and ‘exhaustive.’ The term ‘irredundant’ means that a particular algorithm does not generate any extra or duplicate structures; that is, every structure it generates is unique. The term ‘exhaustive’ simply means that an algorithm calculates all possible isomers without exception. A perfect algorithm needs to be both exhaustive and irredundant.

When referring to chemical structures, the terms cyclic and acyclic have important meaning. In a cyclic compound the bonds between atoms form at least one closed loop or ring. An acyclic compound is one that does not create a closed loop but is characterized by a branching structure. Finally, the term degree, or valence, when used in reference to an atom, means the number of bonds that atom will form.

Review of Literature

Introduction to Hydrocarbons: Types and properties

A hydrocarbon is any chemical compound containing only carbon and hydrogen. Hydrocarbons can be separated into acyclic and cyclic types. The acyclic hydrocarbons are characterized by a branched tree structure, and can be separated into three categories.

1. Alkanes contain only single (σ) bonds, and have the general formula CnH2n+2.
2. Alkenes contain a carbon-carbon double bond (one σ and one π bond), and have the general formula CnH2n.
3. Alkynes contain a carbon-carbon triple bond (one σ and two π bonds), and have the general formula CnH2n-2.

The cyclic hydrocarbons are characterized by a closed structure, creating a ring. Cyclic hydrocarbons can be separated into two categories.

1. The cycloalkanes contain only single (σ) bonds, and have the general formula CnH2n.
2. The aromatic hydrocarbons are based on permutations of the benzene molecule, which has the formula C6H6 (Brescia 566-582).
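The general formulas listed above can be illustrated with a few lines of C++. This is only a minimal sketch and is not taken from the author's program: the names Family, minCarbons, hydrogenCount, and molecularFormula are hypothetical, and the alkene and alkyne cases assume exactly one double or triple bond in the molecule.

```cpp
#include <cassert>
#include <string>

// Hydrocarbon families with a simple general formula (hypothetical enum,
// introduced only for this illustration).
enum class Family { Alkane, Alkene, Alkyne, Cycloalkane };

// Smallest carbon count for which each family exists:
// a multiple bond needs two carbons, a ring needs three.
int minCarbons(Family family) {
    switch (family) {
        case Family::Alkane:      return 1;
        case Family::Alkene:      return 2;
        case Family::Alkyne:      return 2;
        case Family::Cycloalkane: return 3;
    }
    return 1;  // not reached for valid enum values
}

// Number of hydrogen atoms for n carbon atoms, per the general formulas.
int hydrogenCount(Family family, int n) {
    assert(n >= minCarbons(family));
    switch (family) {
        case Family::Alkane:      return 2 * n + 2;  // CnH2n+2
        case Family::Alkene:      return 2 * n;      // CnH2n
        case Family::Alkyne:      return 2 * n - 2;  // CnH2n-2
        case Family::Cycloalkane: return 2 * n;      // CnH2n
    }
    return -1;  // not reached for valid enum values
}

// Molecular formula as plain text, e.g. "C5H12"; a count of 1 is omitted.
std::string molecularFormula(Family family, int n) {
    std::string formula = "C";
    if (n > 1) {
        formula += std::to_string(n);
    }
    formula += "H" + std::to_string(hydrogenCount(family, n));
    return formula;
}
```

For instance, molecularFormula(Family::Alkane, 5) yields C5H12, the formula of pentane. Alkenes and cycloalkanes with the same carbon count share a formula (CnH2n), so the molecular formula alone does not tell the two families apart.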
Isomers are chemical compounds that share the same chemical formula but differ in the arrangement of their atoms (555-556). For example, pentane (C5H12) has the following three isomers (the hydrogens are omitted for ease of reading):

[Figure: the three structural isomers of pentane] (Brescia 569).

Nomenclature

In order to systematically name all possible hydrocarbon isomers, the International Union of Pure and Applied Chemistry (IUPAC) created a set of regulations. The first step is to find the longest continuous carbon chain, which then becomes the root of the name. In the next step, the carbons on the longest chain are numbered in order, starting at the end that gives the carbons with branches the lowest numbers. Next, each branch is named as if it were a non-branching alkane, except the –ane ending is replaced with –yl. Then each branch has the number of its position on the main chain written in front of it. Any branches with the same number of carbons are combined, with all of their position numbers in front of the name, and an appropriate prefix denoting the number of these branches. Any branches that have their own branches are named as if they were separate, and added in parentheses. The process for alkenes and alkynes is the same, except the double or triple bond has to be on the longest chain, and it is given numbering priority. A structure that has no branches is given the n- prefix before its root name (Brescia 573-574). The following are examples of alkanes and their nomenclature:

[Figure: example alkanes with their IUPAC names]

History of Isomer Counting and Enumeration

Mathematical Counting

The counting and enumeration of chemical isomers has been of interest to chemists and mathematicians for more than 125 years, yet the need for an efficient algorithm still exists. Originally, the task was to mathematically calculate the number of isomers without necessarily generating any structures.
The first such attempt to calculate the number of isomers within the alkane series was carried out by Cayley in 1875. He realized that the number of isomers of an alkane with a given number of carbon atoms depends on the number of isomers of the alkane with one fewer carbon atom. Using this fact, and mathematical graph theory, he predict