Spark-based log data analysis for reconstruction of cybercrime events in cloud environment
MapReduce is widely applied in data-intensive and compute-intensive applications and is an important programming model for cloud computing. Hadoop is an open-source implementation of MapReduce that processes terabytes of data on commodity hardware. We apply the Hadoop MapReduce programming model to web log files in order to obtain hit counts for a specific web application. The system stores the log files in the Hadoop Distributed File System (HDFS) and computes the results with Map and Reduce functions. Experimental results report the hit count for each field in the log file. Because the MapReduce runtime parallelizes the work, response time is also reduced.
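The hit-count computation can be illustrated with a minimal, single-process sketch of the Map and Reduce steps. This is not the implementation evaluated here: it assumes log lines in the Apache Common Log Format, counts hits per requested path only, and simulates the shuffle phase in memory rather than running on Hadoop. The names `map_hits`, `reduce_hits`, and `run_job` are hypothetical.

```python
from collections import defaultdict


def map_hits(line):
    """Map step: emit (path, 1) for one log line, or nothing if it is malformed.

    Assumes Common Log Format, where the request is the first quoted field,
    e.g. "GET /index.html HTTP/1.1".
    """
    parts = line.split('"')
    if len(parts) < 3:
        return []
    request = parts[1].split()
    if len(request) < 2:
        return []
    return [(request[1], 1)]


def reduce_hits(key, values):
    """Reduce step: sum all counts emitted for one key."""
    return key, sum(values)


def run_job(lines):
    """Local stand-in for the framework: map, group by key, then reduce."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in map_hits(line):
            groups[key].append(value)
    return dict(reduce_hits(key, values) for key, values in groups.items())


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the experiments.
    sample = [
        '10.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
        '10.0.0.2 - - [10/Oct/2023:13:55:40 +0000] "GET /login HTTP/1.1" 200 512',
        '10.0.0.1 - - [10/Oct/2023:13:56:01 +0000] "GET /index.html HTTP/1.1" 304 0',
    ]
    print(run_job(sample))
```

Counting hits for another field (for example client address or status code) would only change the key emitted by the map step. On a cluster, the grouping done in `run_job` is performed by the framework's shuffle phase, and map tasks run in parallel over blocks of the input file.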