- T. F. Smith, M. S. Waterman
- J. Mol. Biol.
An interactive, menu-driven system of programmes written in Fortran was developed to use the three main nucleotide sequence libraries and one amino acid sequence library. It runs on a small 16-bit minicomputer with limited main memory and mass storage. The software uses a minimum of system function calls and should be transportable to microcomputers with minimal rewriting. Software has also been written to create secondary databases containing the nucleotide triplet values (4^3 = 64 classes) derived from the sequence libraries. Using this secondary set, a given sequence and its reverse complement, once reduced to their trinucleotide values, can be compared with all sequences present in the libraries in about forty minutes on a PDP 11/10 minicomputer using the correlation statistic. Because the statistic in this case may not be assumed to be normally distributed, we have termed it a quasi-correlation coefficient (Qr).
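The comparison described above can be illustrated with a short Python sketch. It is not the original Fortran implementation, and the abstract does not give the exact definition of Qr. The sketch assumes that the "trinucleotide values" are counts of overlapping triplets in each of the 64 classes and that the statistic is an ordinary Pearson correlation between two 64-element count vectors. All function names (`triplet_counts`, `quasi_correlation`, `rank_library`) are hypothetical.

```python
from itertools import product
from math import sqrt

BASES = "ACGT"
# Map each of the 4**3 = 64 possible triplets to an index 0..63.
TRIPLET_INDEX = {"".join(t): i for i, t in enumerate(product(BASES, repeat=3))}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def triplet_counts(seq):
    """Count overlapping triplets (assumed meaning of 'trinucleotide values')."""
    seq = seq.upper()
    counts = [0] * 64
    for i in range(len(seq) - 2):
        idx = TRIPLET_INDEX.get(seq[i:i + 3])
        if idx is not None:  # skip triplets containing ambiguous bases such as N
            counts[idx] += 1
    return counts


def quasi_correlation(x, y):
    """Pearson correlation of two count vectors (assumed form of Qr)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # correlation is undefined for a constant vector
    return sxy / sqrt(sxx * syy)


def rank_library(query, library):
    """Score every library sequence against the query and its reverse complement."""
    profiles = [triplet_counts(query), triplet_counts(reverse_complement(query))]
    scores = []
    for name, seq in library.items():
        target = triplet_counts(seq)
        best = max(quasi_correlation(p, target) for p in profiles)
        scores.append((name, best))
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

Reducing each library sequence to a fixed 64-element vector once, and storing it in a secondary database, means a search only has to compare short vectors rather than full sequences, which is plausibly what makes a whole-library scan feasible on a machine with limited memory.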