A graphical method is presented for displaying the patterns in a set of aligned sequences. The characters representing the sequence are stacked on top of each other for each position in the aligned sequences. The height of each letter is made proportional to its frequency, and the letters are sorted so the most common one is on top. The height of the entire ...
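The stacking rule described above can be sketched in a few lines of Python. This is only an illustration, not the authors' program: the function name `logo_stacks` and the toy sites are hypothetical, the input is assumed to be gap-free DNA, and the total stack height is assumed to be the information content of the column (2 bits minus the column's entropy), with no small-sample correction.

```python
import math
from collections import Counter

def logo_stacks(aligned, alphabet="ACGT"):
    """For each column, return (letter, height_in_bits) pairs, tallest last.

    Assumes gap-free sequences of equal length over `alphabet`.
    """
    max_bits = math.log2(len(alphabet))  # 2 bits for DNA
    stacks = []
    for column in zip(*aligned):
        counts = Counter(column)
        total = len(column)
        freqs = {b: counts[b] / total for b in alphabet if counts[b]}
        entropy = -sum(f * math.log2(f) for f in freqs.values())
        info = max_bits - entropy  # assumed total height of the stack
        # Each letter's height is proportional to its frequency; sorting by
        # height puts the most common letter last, i.e. on top of the stack.
        stack = sorted(((b, f * info) for b, f in freqs.items()),
                       key=lambda pair: pair[1])
        stacks.append(stack)
    return stacks

# Hypothetical toy alignment, for illustration only.
sites = ["ACGT", "ACGA", "ACTT", "AGGT"]
stacks = logo_stacks(sites)
```

In this toy alignment the first column is all `A`, so its stack is a single letter 2 bits tall, while the second column (three `C`, one `G`) gets a shorter stack with `C` on top.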
Repressors, polymerases, ribosomes and other macromolecules bind to specific nucleic acid sequences. They can find a binding site only if the sequence has a recognizable pattern. We define a measure of the information (R_sequence) in the sequence patterns at binding sites. It allows one to investigate how information is distributed across the sites and to ...
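As a rough guide to what such a measure looks like, the following is the commonly cited form of R_sequence for DNA, written without the small-sample correction. The symbols f(b, l) (frequency of base b at position l) and H(l) (uncertainty at position l) are notation assumed here; the truncated abstract does not give them.

```latex
H(l) = -\sum_{b \in \{A,C,G,T\}} f(b,l)\,\log_2 f(b,l)
\qquad
R_{sequence}(l) = 2 - H(l)
\qquad
R_{sequence} = \sum_{l} R_{sequence}(l) \quad \text{(bits per site)}
```

A position where one base always occurs contributes 2 bits; a position where all four bases are equally likely contributes 0 bits.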
We have used a "Perceptron" algorithm to find a weighting function which distinguishes E. coli translational initiation sites from all other sites in a library of over 78,000 nucleotides of mRNA sequence. The "Perceptron" examined sequences as linear representations. The "Perceptron" is more successful at finding gene beginnings than our previous searches ...
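A minimal sketch of how a perceptron can learn such a weighting function is given below. It is not the authors' implementation: the one-hot encoding, the function names (`one_hot`, `train_perceptron`, `is_site`), and the toy sequences are all assumptions made for illustration, and the classic perceptron update rule is used.

```python
BASES = "ACGT"

def one_hot(seq):
    """Encode a sequence as a flat 0/1 vector, four entries per position."""
    return [1 if base == b else 0 for base in seq for b in BASES]

def score(x, weights, threshold):
    return sum(w * xi for w, xi in zip(weights, x)) - threshold

def train_perceptron(positives, negatives, max_epochs=100):
    """Classic perceptron rule: adjust weights only on misclassified examples."""
    weights = [0] * (4 * len(positives[0]))
    threshold = 0
    data = ([(one_hot(s), 1) for s in positives] +
            [(one_hot(s), -1) for s in negatives])
    for _ in range(max_epochs):
        errors = 0
        for x, label in data:
            predicted = 1 if score(x, weights, threshold) > 0 else -1
            if predicted != label:
                weights = [w + label * xi for w, xi in zip(weights, x)]
                threshold -= label
                errors += 1
        if errors == 0:  # every training example is classified correctly
            break
    return weights, threshold

def is_site(seq, weights, threshold):
    return score(one_hot(seq), weights, threshold) > 0

# Hypothetical toy data: "sites" share a G at the second position.
positives = ["AGGA", "AGGT"]
negatives = ["TTTT", "CCCC", "ACCA"]
weights, threshold = train_perceptron(positives, negatives)
```

Because the toy data are linearly separable, training stops once every example is classified correctly. Real initiation-site data need not be separable, which is why the sketch caps the number of epochs.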
Single molecules perform a variety of tasks in cells, from replicating, controlling and translating the genetic material to sensing the outside environment. These operations all require that specific actions take place. In a sense, each molecule must make tiny decisions. To make a decision, each "molecular machine" must dissipate an energy P_y in the ...
We characterize the Shine and Dalgarno sequence of 124 known gene beginnings. This information is used to make "rules" which help distinguish gene beginnings from other sites in a library of over 78,000 bases of mRNA. Gene beginnings are found to have information besides the initiation codon and Shine and Dalgarno sequence which can be used to make better ...
Like macroscopic machines, molecular-sized machines are limited by their material components, their design, and their use of power. One of these limits is the maximum number of states that a machine can choose from. The logarithm to the base 2 of the number of states is defined to be the number of bits of information that the machine could "gain" during its ...
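The definition in the last sentence can be written compactly. The symbols M (number of distinguishable states) and B (bits) are labels assumed here for illustration.

```latex
B = \log_2 M
\qquad \text{e.g. } M = 4 \;\Rightarrow\; B = \log_2 4 = 2 \text{ bits}
```

So a machine that selects one of the four bases at a position can gain at most 2 bits there.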
A graphical method is presented for displaying how binding proteins and other macromolecules interact with individual bases of nucleotide sequences. Characters representing the sequence are either oriented normally and placed above a line indicating favorable contact, or upside-down and placed below the line indicating unfavorable contact. The positive or ...
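One way to obtain signed per-base heights of this kind is an individual-information weight matrix, sketched below. This is an assumption about the underlying calculation, since the abstract is truncated before it says how the heights are set: each base b at position l is given the weight 2 + log2 f(b, l), so bases more frequent than 1/4 score above the line and rarer ones below. The function names and the pseudocount (added to avoid taking the logarithm of zero) are hypothetical choices for this sketch.

```python
import math

BASES = "ACGT"

def weight_matrix(aligned, pseudocount=1.0):
    """Per-position weights 2 + log2 f(b, l), with a smoothing pseudocount."""
    matrix = []
    for column in zip(*aligned):
        total = len(column) + pseudocount * len(BASES)
        matrix.append({
            b: 2 + math.log2((column.count(b) + pseudocount) / total)
            for b in BASES
        })
    return matrix

def walker_heights(seq, matrix):
    """Signed height for each base of `seq`: positive is drawn above the
    line (favorable), negative upside-down below it (unfavorable)."""
    return [matrix[l][b] for l, b in enumerate(seq)]

# Hypothetical toy alignment, for illustration only.
sites = ["ACGT", "ACGA", "ACTT", "AGGT"]
matrix = weight_matrix(sites)
good = walker_heights("ACGT", matrix)
bad = walker_heights("TCGT", matrix)
```

Here every base of `ACGT` matches the majority of the toy sites and scores above the line, while the leading `T` of `TCGT` never occurs at that position and scores below it.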
An information theory based multiple alignment ("Malign") method was used to align the DNA binding sequences of the OxyR and Fis proteins, whose sequence conservation is so spread out that it is difficult to identify the sites. In the algorithm described here, the information content of the sequences is used as a unique global criterion for the quality of ...
This primer is written for molecular biologists who are unfamiliar with information theory. Its purpose is to introduce you to these ideas so that you can understand how to apply them to [...]. Most of the material in this primer can also be found in introductory texts on information theory. Although Shannon's original paper on the theory of information [10] is ...