Peter C. R. Lane

Ferret is a fast and effective tool for detecting similarities in a group of files. Applying it to the PAN’09 corpus required modifications to meet the requirements of the competition, mainly to deal with the very large number of files and the large size of some of them, and to automate some of the decisions that would normally be made by a human operator.
Computer implementations of theoretical concepts play an ever-increasing role in the development and application of scientific ideas. As the scale of such implementations increases from relatively small models and empirical setups to overarching frameworks from which many kinds of results may be obtained, it is important to consider the methodology by which…
Quantitative predictions for complex scientific theories are often obtained by running simulations on computational models. In order for a theory to meet with widespread acceptance, it is important that the model be reproducible and comprehensible by independent researchers. However, the complexity of computational models can make the task of replication…
Locating documents carrying positive or negative favourability is an important application within media analysis. This article presents some empirical results on the challenges facing a machine-learning approach to this kind of opinion mining. Some of the challenges include: the often considerable imbalance in the distribution of positive and negative…
Definition: A chunk is a meaningful unit of information built from smaller pieces of information, and chunking is the process of creating a new chunk. Thus, a chunk can be seen as a collection of elements that have strong associations with one another, but weak associations with elements belonging to other chunks. Chunks, which can be of different sizes, are…
Many academic staff will recognise that unusual shared elements in student submissions trigger suspicion of inappropriate collusion. These elements may be odd phrases, strange constructs, peculiar layout, or spelling mistakes. In this paper we review twenty-nine approaches to source-code plagiarism detection, showing that the majority focus on overall file…
Creating a plausible Unified Theory of Cognition (UTC) requires considerable effort from large, potentially distributed, teams. Computational Cognitive Architectures (CCAs) provide researchers with a concrete medium for connecting different cognitive theories to facilitate development of a robust, unambiguous UTC. However, due to wide dissemination of…