Estimation Accuracy on Execution Time of Run-Time Tasks in a Heterogeneous Distributed Environment
@article{Liu2016EstimationAO,
  title={Estimation Accuracy on Execution Time of Run-Time Tasks in a Heterogeneous Distributed Environment},
  author={Qi Liu and Weidong Cai and Dandan Jin and Jian Shen and Zhangjie Fu and Xiaodong Liu and Nigel Linge},
  journal={Sensors (Basel, Switzerland)},
  year={2016},
  volume={16}
}
Distributed computing has developed tremendously since cloud computing was proposed in 2006, and has played a vital role in promoting the rapid growth of data collection and analysis models, e.g., the Internet of Things, Cyber-Physical Systems, Big Data Analytics, etc. Hadoop has become a data convergence platform for sensor networks. As one of its core components, MapReduce facilitates the allocation, processing and mining of collected large-scale data, where speculative execution strategies help solve…
12 Citations
Near-Data Prediction Based Speculative Optimization in a Distribution Environment
- Computer Science
- 2019
An SE-optimized strategy that can be used for near-data prediction, effectively improves the accuracy of alternative tasks, and performs better in heterogeneous Hadoop environments in various situations, which benefits both consumers and cloud platforms.
Designing a MapReduce performance model in distributed heterogeneous platforms based on benchmarking approach
- Computer Science
- The Journal of Supercomputing
- 2020
A model based on MapReduce phases for predicting the execution time of jobs in a heterogeneous cluster is presented, and a novel heuristic method is designed, which significantly reduces the makespan of the jobs.
An Adaptively Speculative Execution Strategy Based on Real-Time Resource Awareness in a Multi-Job Heterogeneous Environment
- Computer Science
- KSII Trans. Internet Inf. Syst.
- 2017
An adaptive SE strategy (ASE) is presented in Hadoop-2.6.0, and the ASE strategy largely improves the performance of MRV2 in terms of job execution time and resource consumption in a multi-job environment.
Estimating runtime of a job in Hadoop MapReduce
- Computer Science
- Journal of Big Data
- 2020
A new method to estimate the runtime of a job by considering essential and efficient parameters that have a higher impact on runtime is proposed, and the results show that the average error rate is less than 12% when estimating runtime for the first run and less than 8.5% when a profile or history of the job exists.
A Machine Learning Approach for Predicting Execution Time of Spark Jobs
- Computer Science
- Alexandria Engineering Journal
- 2018
A Hadoop Yarn Scheduling Based on Node Computing Capability and Data Locality in Heterogeneous Environments
- Computer Science
- 2018
A resource allocation algorithm based on node computing capability and data locality is proposed in this paper, which can effectively reduce the completion time and improve resource utilization in Hadoop.
ANN based execution time prediction model and assessment of input parameters through ISM
- Computer Science
- Int. Arab J. Inf. Technol.
- 2020
An Artificial Neural Network (ANN) based prediction model is proposed to predict the execution time of tasks and provides 21.72% reduction in mean relative error compared to other state-of-the-art methods.
Toward Approximating Job Completion Time in Vehicular Clouds
- Computer Science
- IEEE Transactions on Intelligent Transportation Systems
- 2019
The main contribution of this paper is to offer easy-to-compute approximations of job completion time when estimates of the first or the first two moments of the intervening random variables are available.
A Distributed Parallel Algorithm Based on Low-Rank and Sparse Representation for Anomaly Detection in Hyperspectral Images
- Computer Science
- Sensors
- 2018
This paper proposes a novel distributed parallel algorithm (DPA) by redesigning key operators of LRASR in terms of the MapReduce model to accelerate LRASR on cloud computing architectures and demonstrates that the newly developed DPA achieves very high speedups when accelerating LRASR, in addition to maintaining similar accuracies.
MLP-ANN-Based Execution Time Prediction Model and Assessment of Input Parameters Through Structural Modeling
- Computer Science
- 2020
A multilayer perceptron–artificial neural network (MLP-ANN)-based prediction model is proposed to predict the execution time of tasks in cloud environment and provides 21.7% reduction in mean relative error compared to other state-of-the-art methods.
References
SHOWING 1-10 OF 47 REFERENCES
A Heuristic Speculative Execution Strategy in Heterogeneous Distributed Environments
- Computer Science
- 2014 Sixth International Symposium on Parallel Architectures, Algorithms and Programming
- 2014
This paper proposes a novel speculative execution strategy in heterogeneous environments, ERUL, to improve the estimation of tasks' remaining time, and indicates that the Hadoop-ERUL strategy not only estimates running tasks' remaining execution time more accurately, but also reduces job running time by 26% compared to Hadoop-LATE.
Improving MapReduce Performance Using Smart Speculative Execution Strategy
- Computer Science
- IEEE Transactions on Computers
- 2014
A new strategy, maximum cost performance (MCP), is developed which improves the effectiveness of speculative execution significantly and can run jobs up to 39 percent faster and improve the cluster throughput by up to 44 percent compared to Hadoop-0.21.
Design adaptive task allocation scheduler to improve MapReduce performance in heterogeneous clouds
- Computer Science
- J. Netw. Comput. Appl.
- 2015
Maestro: Replica-Aware Map Scheduling for MapReduce
- Computer Science
- 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)
- 2012
This work proposes a novel scheduling algorithm for map tasks, named Maestro, to improve the overall performance of the MapReduce computation and achieves around 95% local map executions, reduces speculative map tasks by 80% and results in an improvement of up to 34% in the execution time.
Improving MapReduce Performance with Partial Speculative Execution
- Computer Science
- Journal of Grid Computing
- 2015
This paper proposes the Partial Speculative Execution (PSE) strategy, a strategy to make speculative tasks start from the checkpoint of original tasks, which can eliminate the costs of re-reading, re-copying, and re-computing the processed data.
A Smart Strategy for Speculative Execution Based on Hardware Resource in a Heterogeneous Distributed Environment
- Computer Science
- 2016
Some pitfalls in the proposed strategy have been corrected and computer hardware has been taken into consideration (HWC-Speculation) in Hadoop-2.6; results show that the method can correctly find a slow task and that the performance of MRV2 is improved.
Improving MapReduce Performance in Heterogeneous Environments
- Computer Science
- OSDI
- 2008
A new scheduling algorithm, Longest Approximate Time to End (LATE), that is highly robust to heterogeneity and can improve Hadoop response times by a factor of 2 in clusters of 200 virtual machines on EC2.
Novel heuristic speculative execution strategies in heterogeneous distributed environments
- Computer Science
- Comput. Electr. Eng.
- 2016
Energy-Aware Scheduling of MapReduce Jobs for Big Data Applications
- Computer Science
- IEEE Transactions on Parallel and Distributed Systems
- 2015
This paper proposes two heuristic algorithms, called energy-aware MapReduce scheduling algorithms (EMRSA-I and EMRSA-II), that find the assignments of map and reduce tasks to the machine slots in order to minimize the energy consumed when executing the application.
A New Speculative Execution Algorithm Based on C4.5 Decision Tree for Hadoop
- Computer Science
- ICYCSEE
- 2015
In this paper, a new Speculative Execution algorithm based on the C4.5 Decision Tree, SECDT, is designed for Hadoop; it can predict execution time more accurately than other speculative execution methods, hence reducing the job completion time.