A Parallel K-Medoids Algorithm for Clustering based on MapReduce
Monitoring and managing large-scale applications is a complex task because execution data is syntactic and unstructured. Traditional application monitoring and management solutions, which apply analysis techniques to unstructured, syntactic log information, are limited because such information cannot be readily used to find related events or to correlate them with other related information from applications. Our proposed solution of semantically formalized logging fills this gap by introducing formal semantics and combining them in a meaningful way to enable automated monitoring and management of applications. Such formalized and well-structured log information helps analytical solutions automate the monitoring and management process as far as possible. However, while formalizing and structuring the log information, we encountered missing and incomplete data, which hinders this process. In this paper, we tackle this problem and propose a social network analysis based solution that handles incomplete and missing data from application execution, computes the missing values where possible, and uses the result in our proposed solution of semantically formalized and structured logs, together with adapted data mining techniques, to enable automated and effective application monitoring and management. Using an industrial use-case application, we demonstrate how historical data from application execution is stored using semantic logging and used with standard social network analysis techniques to find missing values in incomplete data and to perform application monitoring and management.