Yu Shyang Tan

The inability to effectively track data in cloud computing environments is becoming one of the top concerns for cloud stakeholders. This inability has two main causes. First, there is a lack of data-tracking tools built for clouds. Second, current logging mechanisms are designed only from a system-centric perspective. There is a need for data-centric …
Data leakage from cloud computing environments is a fundamental cloud security concern for both end users and cloud service providers. A literature survey revealed the inadequacies of current technologies and the need for a new methodology. This position paper discusses the requirements and proposes a novel auditing …
Semantic Web efforts aim to bring the WWW to a state in which all its content can be interpreted by machines; the ultimate goal being a machine-processable Web of Knowledge. We strongly believe that adding a mechanism to extract and compute concepts from the Semantic Web will help to achieve this vision. However, there are a number of open questions that …
While provenance research is common in distributed systems, many proposed solutions do not address the security of systems and the accountability of data stored in those systems. In this paper, we survey provenance solutions that were proposed to address the problems of system security and data accountability in distributed systems. From our survey, we derive …
In this paper, we present our design of a Processing Element (PE) Aware MapReduce-based framework, Pamar. Pamar is designed to support distributed computing on clusters whose nodes have asymmetric PE configurations. Pamar's main goal is to allow users to seamlessly utilize different kinds of processing elements (e.g., CPUs or GPUs) …
Biological sequence alignment involves arranging DNA, RNA, or protein sequences to find similarities between them, which helps reveal the structural, functional, and evolutionary relationships among organisms. Next-generation sequencing has generated billions of sequences, making it increasingly infeasible for …
There are many ways to build a predictive model from data. Besides the numerous classification and regression algorithms to choose from, there are countless useful data transformations that can be applied before modeling. To assist in discovering good predictive analytics workflows, we recently introduced a collaborative analytics system that allows workflow …
Domain Name System (DNS) servers provide a critical service to users and applications on the Internet. At the top of the DNS hierarchy are the Root DNS servers, which provide pointers to Top Level Domain servers. Over the past years, the number of Root DNS servers has grown to cope with the exponential growth of the Internet. This paper reports on …
Mechanisms used by many current state-of-the-art cloud frameworks for managing users' access to cloud resources adopt an "authenticate-and-forget" approach: users with a valid account can access and use cloud resources for an indefinite amount of time. This arrangement introduces problems such as resource hogging in resource-limited cloud setups (e.g. …