- Lloyd Allison, Trevor I. Dix
- Inf. Process. Lett.
- 1986

- Minh Duc Cao, Trevor I. Dix, Lloyd Allison, Chris Mears
- 2007 Data Compression Conference (DCC'07)
- 2007

This paper introduces a novel algorithm for biological sequence compression that makes use of both statistical properties and repetition within sequences. A panel of experts is maintained to estimate the probability distribution of the next symbol in the sequence to be encoded. Expert probabilities are combined to obtain the final distribution. The…

- D R Powell, L Allison, T I Dix
- Journal of Theoretical Biology
- 2000

Alignment algorithms can be used to infer a relationship between sequences when the true relationship is unknown. Simple alignment algorithms use a cost function that gives a fixed cost to each possible point mutation: mismatch, deletion, or insertion. These algorithms tend to find optimal alignments that have many small gaps. It is more biologically plausible…

Early work on proteins identified the existence of helices and extended sheets in protein secondary structures, a high-level classification which remains popular today. Using the Snob program for information-theoretic Minimum Message Length (MML) classification, we are able to take the protein dihedral angles as determined by X-ray crystallography, and…

- Trevor I. Dix, Dorota H. Kieronska
- Computer Applications in the Biosciences
- 1988

Restriction site mapping programs construct maps by generating permutations of fragments and checking for consistency. Unfortunately, many consistent maps are often obtained within the experimental error bounds, even though there is only one actual map. A particularly efficient algorithm is presented that aims to minimize error bounds between restriction…

- Ramesh Ram, Madhu Chetty, Trevor I. Dix
- IEEE Congress on Evolutionary Computation
- 2006

In this paper, a number of existing and novel techniques are considered for ordering cloned extracts from the genome of an organism based on fingerprinting data. A metric is defined for comparing the quality of the clone order for each technique. Simulated annealing is used in combination with several different objective functions. Empirical results with…

- Lloyd Allison, David R. Powell, Trevor I. Dix
- Comput. J.
- 1999

A population of sequences is called non-random if there is a statistical model and an associated compression algorithm that allows members of the population to be compressed, on average. Any available statistical model of a population should be incorporated into algorithms for alignment of the sequences, and doing so changes the rank order of possible…

- Minh Duc Cao, Trevor I Dix, Lloyd Allison
- Advances in Experimental Medicine and Biology
- 2011

A biological compression model, the expert model, is presented which is superior to existing compression algorithms in both compression performance and speed. The model is able to compress whole eukaryotic genomes. Most importantly, the model provides a framework for knowledge discovery from biological data. It can be used for repeat element discovery, sequence…

- Lloyd Allison, Linda Stern, Timothy Edgoose, Trevor I. Dix
- Computers & Chemistry
- 2000

A new statistical model for DNA considers a sequence to be a mixture of regions with little structure and regions that are approximate repeats of other subsequences, i.e. instances of repeats do not need to match each other exactly. Both forward- and reverse-complementary repeats are allowed. The model has a small number of parameters which are fitted to…