Steven J. Lynden

Service-based approaches are rising to prominence because of their potential to meet the requirements for distributed application development in e-business and e-science. The emergence of a service-oriented view of hardware and software resources raises the question of how database management systems and technologies can best be deployed or adapted for ...
Discovering complex associations, anomalies and patterns in distributed data sets is gaining popularity in a range of scientific, medical and business applications. Various algorithms are employed to perform data analysis within a domain, ranging from statistical to machine learning and AI-based techniques. Several issues need to be addressed, however, to ...
To handle the growing amount of available RDF data effectively, a scalable and flexible RDF data processing framework is needed. We previously proposed a Hadoop-based framework, which takes advantage of scalable and fault-tolerant distributed processing technologies, originally proposed as Google's distributed file system and MapReduce parallel ...
OGSA-DQP is a service-based distributed query processor that is able to execute queries over data services and combine data integration with data analysis by invoking Web services. OGSA-DQP currently supports only one type of data source, relational databases wrapped using OGSA-DAI (a middleware tool that exposes XML or relational database management ...
Uncertainty is an important factor that influences social evolution in natural and artificial environments. Here we distinguish between three aspects of uncertainty. Environmental uncertainty is the variance of resources in the environment, perceived uncertainty is the variance of the resource distribution as perceived by the organism and effective ...
In adaptive query processing, the way in which a query is evaluated is changed in the light of feedback obtained from the environment during query evaluation. Such feedback may, for example, establish that misleading selectivity estimates were used when the query was compiled, leading to the optimizer choosing an inappropriate join order or unsuitable join ...
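The feedback idea in the abstract above can be illustrated with a small sketch. It is not the method of the paper: the function names, the factor-of-two tolerance, and the rule of ordering operators by observed selectivity are all assumptions made for illustration.

```python
def observed_selectivity(rows_in, rows_out):
    """Fraction of input rows that survived an operator at run time."""
    return rows_out / rows_in if rows_in else 0.0


def needs_replanning(estimated, observed, tolerance=2.0):
    """Return True when the observed selectivity is off by more than a
    factor of `tolerance` (a hypothetical threshold) from the estimate."""
    if estimated <= 0 or observed <= 0:
        return estimated != observed
    ratio = observed / estimated
    return ratio > tolerance or ratio < 1.0 / tolerance


def reorder_by_selectivity(selectivities):
    """Order operator names so the most selective (smallest fraction) runs first."""
    return sorted(selectivities, key=selectivities.get)


# Compile-time estimate said 1% of rows would pass; half of them actually did.
estimate = 0.01
actual = observed_selectivity(rows_in=1000, rows_out=500)
if needs_replanning(estimate, actual):
    plan = reorder_by_selectivity({"join_a": actual, "join_b": 0.05})
```

Here `plan` places `join_b` before `join_a`, because the run-time feedback shows that `join_a` filters far fewer rows than the estimate suggested.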
MapReduce has become a popular method for data processing, in particular for large scale datasets, due to its accessibility as a scalable yet convenient programming paradigm. Data processing tasks often involve joins, and the repartition and fragment-replicate joins are two widely-used join algorithms utilised within the MapReduce framework. This paper ...
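The two join algorithms named above can be sketched in plain Python, without a MapReduce runtime. This is a minimal single-process illustration, assuming records are tuples and join keys are given as tuple positions; the function names are hypothetical and the sketch does not reproduce the paper's own approach.

```python
from collections import defaultdict


def repartition_join(left, right, key_left, key_right):
    """Reduce-side join: group both inputs by join key, then pair them up."""
    # "Map" phase: route every record to the group of its join key,
    # keeping left and right records apart (the role of the source tag).
    groups = defaultdict(lambda: ([], []))
    for rec in left:
        groups[rec[key_left]][0].append(rec)
    for rec in right:
        groups[rec[key_right]][1].append(rec)
    # "Reduce" phase: cross product of the two sides within each key group.
    return [(l, r) for ls, rs in groups.values() for l in ls for r in rs]


def fragment_replicate_join(large_fragments, small, key_large, key_small):
    """Map-side join: the small input is replicated to every fragment of the large one."""
    # Hash table over the small input; in a cluster each map task would hold a copy.
    table = defaultdict(list)
    for rec in small:
        table[rec[key_small]].append(rec)
    out = []
    for fragment in large_fragments:  # each fragment stands for one map task
        for rec in fragment:
            for match in table.get(rec[key_large], []):
                out.append((rec, match))
    return out


orders = [(1, "book"), (2, "pen"), (1, "lamp"), (3, "desk")]
users = [(1, "ann"), (2, "bob")]
joined_a = repartition_join(orders, users, 0, 0)
joined_b = fragment_replicate_join([orders[:2], orders[2:]], users, 0, 0)
```

Both calls produce the same three matched pairs, in a different order. The repartition join moves both inputs by key, while the fragment-replicate join moves only the small input, which is why it suits cases where one input fits in memory.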
Managing resources in large scale distributed systems is an important concern for both Peer-2-Peer and Computational Grid systems, and is a complex and time sensitive process. Although existing Peer-2-Peer systems are divided into those that support computation (CPU) sharing or data sharing, users in a Computational Grid generally need to share both. ...
The use of RDF (Resource Description Framework) data is a cornerstone of the Semantic Web. RDF data embedded in Web pages may be indexed using semantic search engines; however, RDF data is often stored in databases, accessible via Web Services using the SPARQL query language for RDF, which form part of the Deep Web, which is not accessible using search ...