Santonu Sarkar

We present in this paper a new set of metrics that measure the quality of modularization of a non-object-oriented software system. We have proposed a set of design principles to capture the notion of modularity and defined metrics centered around these principles. These metrics characterize the software from a variety of perspectives: structural,(More)
One of the difficulties in maintaining a large software system is the absence of documented business domain topics and correlation between these domain topics and source code. Without such a correlation, people without any prior application knowledge would find it hard to comprehend the functionality of the system. Latent Dirichlet Allocation (LDA), a(More)
The metrics formulated to date for characterizing the modularization quality of object-oriented software have considered module and class to be synonymous concepts. But a typical class in object-oriented programming exists at too low a level of granularity in large object-oriented software consisting of millions of lines of code. A typical module (sometimes(More)
In industries such as banking, retail, transportation, and telecommunications, large software systems support numerous work processes and develop over many years. Throughout their evolution, such systems are subject to repeated debugging and feature enhancements. Consequently, they gradually deviate from the intended architecture and deteriorate into(More)
We present a new set of metrics for analyzing the interaction between the modules of a large software system. We believe that these metrics would be important to any automatic or semi-automatic code modularization algorithm. The metrics are based on the rationale that code partitioning should be based on the principle of similarity of service provided by(More)
A MapReduce scheduling algorithm plays a critical role in managing large clusters of hardware nodes and meeting multiple quality requirements by controlling the order and distribution of user, job, and task execution. A comprehensive and structured survey of the scheduling algorithms proposed so far is presented here using a novel multidimensional(More)
Sharing of physical infrastructure using virtualization presents an opportunity to improve the overall resource utilization. It is extremely important for a Software as a Service (SaaS) provider to understand the characteristics of the business application workload in order to size and place the virtual machine (VM) containing the application. A typical(More)
The true essence of virtualization technology is the ability to allow one or more workloads to share the underlying physical resources, thereby bringing about significant cost savings. However, in order to maximize the cost savings from this disruptive technology, it is essential to adopt optimal resource management techniques. These techniques broadly(More)
In the era of global outsourcing, maintenance and enhancement activities are performed in distributed locations. In most cases, domain expertise is not available, which increases the complexity manifold. A critical success factor in such a scenario is to have a collaborative platform for managing and sharing the domain-specific knowledge across(More)