MSc Thesis

  Wing Lung Ngai
In the age of information, our society generates data at an increasing and already alarming rate. To keep up with the rapid increase in the amount of available data, academics have shown strong interest in the emerging research field of Big Data Processing (BDP), which explores technologies aimed at efficiently processing enormous amounts of data. As a result, many different types of BDP systems, for example systems specialized in large-scale graph processing, have been added into the Big…

Machine Learning to Uncover Correlations Between Software Code Changes and Test Results

The goal of this thesis is to create a test execution model that uses supervised machine learning techniques to predict potential points of failure in a set of tests, thereby reducing the number of test cases that must be executed to test changes in code.

Keeping Fairness Alive: Design and formal verification of optimistic fair exchange protocols

The work in this thesis has been carried out at the Centre for Mathematics and Computer Science (CWI), under the auspices of the research school IPA (Institute for Programming research and

Hybrid Spread-Spectrum TCP for Combating Fraudulent Cyber Activities against Reconnaissance Attacks

A system is established that opens ports on a firewall when a host generates connection attempts on a set of pre-specified closed ports; the firewall rules are then dynamically modified to allow the host that sent the connection attempts to connect over specific port(s).

Association Rules of DCI Patient Clusters and Reliability of Clustering Analysis

This study validates the reliability of the previous work by applying three alternative clustering methods, by comparing the results of two-step clustering analysis with the Perceived Severity Index (PSI), and by validating the characteristics of patient clusters using association rules.

Portfolio Liquidity Risk: A Practitioner Perspective

Since the recent financial crisis, global institutions and governments have striven to reduce both individual bank risk and systematic risks. New regulatory requirements have been proposed and then

Association rules mining in vertically partitioned databases

Analytical versus observational fragilities: the case of Pettino (L’Aquila) damage data database

A damage database of 131 reinforced concrete (RC) buildings, collected after the 2009 L’Aquila (Italy) earthquake, is employed for the evaluation of observational fragility curves. The specific

Balancing Wind and Batteries: Towards Predictive Verification of Smart Grids

To mitigate state space explosion, this work exploits structural properties of the model to implement an iterative exploration method that reuses pre-computed values as wind data is updated, and shows the method’s feasibility and versatility across grid configurations and time scales.

The Prospects for Biogas Integration with Fuel Cells in the United Kingdom

Anaerobic digestion (AD) presence is emerging in the UK because it has numerous environmental benefits as a waste management strategy and produces valuable biogas. This work shows that up to 5.5% of

Traffic Flow Optimization using Reinforcement Learning

A method to determine speed limits is presented that combines a traffic flow model with reinforcement learning techniques, and it is shown to significantly reduce congestion under high traffic demand.



Pregel: a system for large-scale graph processing

A model for processing large graphs that has been designed for efficient, scalable and fault-tolerant implementation on clusters of thousands of commodity computers, and its implied synchronicity makes reasoning about programs easier.

Spark: Cluster Computing with Working Sets

Spark can outperform Hadoop by 10x in iterative machine learning jobs, and can be used to interactively query a 39 GB dataset with sub-second response time.

Hadoop: The Definitive Guide

This comprehensive resource demonstrates how to use Hadoop to build reliable, scalable, distributed systems: programmers will find details for analyzing large datasets, and administrators will learn how to set up and run Hadoop clusters.

MapReduce: simplified data processing on large clusters

This presentation explains how the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks.

Closure Strategies for Turbulent and Transitional Flows

This book provides a comprehensive account of the state-of-the-art in predicting turbulent and transitional flows by some of the world’s leaders in these fields and will prove indispensable for all working in CFD.

The MOLEN polymorphic processor

A microarchitecture based on reconfigurable hardware emulation is proposed to allow high-speed reconfiguration and execution of the processor. To prove the viability of the proposal, it was evaluated experimentally with the MPEG-2 encoder and decoder and a Xilinx Virtex II Pro FPGA.

Co‐evolutionary design for development: influences shaping engineering design and implementation in Nepal and the global village

This paper calls for a new generation of engineers, and the technologies they will invent and implement, to meet the basic human needs for security, broadly defined. The greatest threats to security

The drinking water response to the Indian Ocean tsunami, including the role of household water treatment

Purpose – To document the drinking water component of the humanitarian response to the Great Sumatra‐Andaman earthquake of December 26, 2004, including a focus on the promotion of household water

An evaluation of water urns to maintain domestic water quality

Unprotected shallow wells have been used for centuries as a source of domestic water in Africa. In Zimbabwe and many other African countries, such wells are still an important, and perhaps the only