Computational resource management for data-driven applications with deadline constraints
As more monitoring sensors are deployed on physical infrastructures, the volume of data that must be processed grows correspondingly. Data measured or collected by sensors is typically processed either at its destination or "in-transit" (i.e., between data capture and delivery to a user). When such data is processed in-transit over a shared distributed computing infrastructure, it is useful to provide elastic computational capability that can be adapted to processing requirements and demand. Where Service Level Agreements (SLAs) have been pre-agreed, the available computational capacity must be shared so that no Quality of Service (QoS) constraint in those SLAs is violated. This is particularly challenging for time-critical applications and for highly variable, unpredictable rates of data generation (e.g., in Smart Grid applications, where energy usage patterns may change unpredictably). Previously, we proposed a Reference net-based architectural model for supporting QoS for multiple concurrent data streams processed (prior to delivery to a user) over a shared infrastructure. In this paper, we describe a practical realisation of this architecture using the OpenNebula Cloud platform. We consider our infrastructure to be composed of a number of nodes, each of which has multiple processing units and data buffers. We use the "token bucket" model to regulate, on a per-stream basis, the data injection rate into each node. We subsequently demonstrate how a streaming pipeline can be supported and managed using a dynamic control strategy at each node.
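To make the per-stream rate regulation concrete, the following is a minimal sketch of a generic token bucket, not the implementation described in this paper. The class name `TokenBucket`, its fields, the helper `admit`, and the stream identifiers are all hypothetical, and time is passed in explicitly so that the behaviour is deterministic. Each stream has its own bucket: tokens accumulate at a fixed rate up to a burst capacity, and a data item is admitted into the node only if enough tokens are available.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Generic token bucket (illustrative; all names are assumptions)."""
    rate: float          # tokens added per second (assumed per-stream rate)
    capacity: float      # maximum number of tokens, i.e. the allowed burst
    tokens: float = 0.0  # tokens currently available
    last: float = 0.0    # time of the last refill, in seconds

    def refill(self, now: float) -> None:
        # Add tokens for the elapsed time, never exceeding the capacity.
        if now > self.last:
            earned = (now - self.last) * self.rate
            self.tokens = min(self.capacity, self.tokens + earned)
            self.last = now

    def try_consume(self, now: float, cost: float = 1.0) -> bool:
        # Admit one data item of the given cost if enough tokens remain.
        self.refill(now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


def admit(buckets: dict, stream_id: str, now: float, cost: float = 1.0) -> bool:
    # One bucket per stream: each stream is regulated independently.
    return buckets[stream_id].try_consume(now, cost)


# Two hypothetical streams with different agreed rates at the same node.
buckets = {
    "stream-a": TokenBucket(rate=2.0, capacity=4.0, tokens=4.0),
    "stream-b": TokenBucket(rate=1.0, capacity=1.0, tokens=1.0),
}
burst_a = [admit(buckets, "stream-a", now=0.0) for _ in range(5)]
burst_b = [admit(buckets, "stream-b", now=0.0) for _ in range(2)]
```

In this sketch a stream that exceeds its rate is simply refused at that instant; a real node could instead hold the item in one of its data buffers until tokens become available, which is a design choice this example does not model.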