Corpus ID: 63126631

HADOOP ARCHITECTURE AND FAULT TOLERANCE BASED HADOOP CLUSTERS IN GEOGRAPHICALLY DISTRIBUTED DATA CENTER

@inproceedings{Cowsalya2015HADOOPAA,
  title={HADOOP ARCHITECTURE AND FAULT TOLERANCE BASED HADOOP CLUSTERS IN GEOGRAPHICALLY DISTRIBUTED DATA CENTER},
  author={T. Cowsalya},
  year={2015}
}
In today’s era of computer science, storing and computing data is a very important phase. These days, even petabytes and exabytes of storage are not adequate for the large number of databases that contain large data sets. Organizations therefore use Hadoop, a big-data software framework, in their applications. Hadoop is designed to store and process large volumes of data reliably. While using geographically distributed data centers there may be a…
Analysis and implementation of reactive fault tolerance techniques in Hadoop: a comparative study
An analysis of notable fault tolerance techniques examines the impact of different performance metrics under variable datasets with variable fault injections, showing that, in terms of response time, the Byzantine technique outperforms the retrying and checkpointing techniques when a single node is killed.
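The retrying and checkpointing techniques compared above can be illustrated with a minimal sketch. This is not the paper's implementation; the function names and the fault-injection mechanism are illustrative assumptions, contrasting restart-from-scratch retries with chunk-level checkpointing so only work since the last checkpoint is redone after a failure.

```python
import random

def flaky_task(data, fail_prob):
    """Simulates a task that fails with probability fail_prob (fault injection)."""
    if random.random() < fail_prob:
        raise RuntimeError("injected node failure")
    return sum(data)

def run_with_retry(data, fail_prob, max_retries=5):
    """Retrying: rerun the whole task from scratch after each failure."""
    for attempt in range(max_retries):
        try:
            return flaky_task(data, fail_prob), attempt
        except RuntimeError:
            continue
    raise RuntimeError("task failed after all retries")

def run_with_checkpoints(data, fail_prob, chunk=100):
    """Checkpointing: persist partial results so only the current chunk reruns."""
    total, done = 0, 0
    while done < len(data):
        part = data[done:done + chunk]
        try:
            total += flaky_task(part, fail_prob)
            done += chunk          # checkpoint: progress up to here is saved
        except RuntimeError:
            pass                   # only this chunk is retried, not the whole job
    return total
```

Under a high fault rate, retrying repeats all completed work on every failure, while checkpointing bounds the repeated work to one chunk, which is why the two techniques diverge as dataset size and fault-injection rate grow.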
Política Customizada de Balanceamento de Réplicas para o HDFS Balancer do Apache Hadoop (Customized Replica Balancing Policy for the Apache Hadoop HDFS Balancer)
A customized balancing policy for the HDFS Balancer is proposed, based on a system of priorities that can be adapted and configured according to usage demands, making balancing more flexible.
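A priority-driven balancing policy of the kind described above can be sketched as a weighted scoring function over candidate DataNodes. This is a hypothetical illustration, not the proposed policy: the criterion names and weights are assumptions standing in for the configurable priorities.

```python
from dataclasses import dataclass

@dataclass
class DataNode:
    name: str
    used: float      # bytes currently stored
    capacity: float  # total bytes available

    @property
    def utilization(self):
        return self.used / self.capacity

def balance_order(nodes, priorities):
    """Rank nodes as replica-move destinations (lowest score = preferred).
    `priorities` maps a criterion name to a weight, mimicking a policy
    that operators configure per deployment (names here are illustrative)."""
    def score(node):
        return (priorities.get("utilization", 1.0) * node.utilization
                + priorities.get("used_space", 0.0) * node.used)
    return sorted(nodes, key=score)
```

Changing the weight map changes which imbalance the balancer attacks first, which is the sense in which such a policy is "adapted and configured according to usage demands."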
Data Systems Fault Coping for Real-time Big Data Analytics Required Architectural Crucibles
It is argued that new architectures, methods, and tools for handling and analyzing Big Data systems functioning in real time must be designed to address and mitigate faults arising from real-time streaming processes, while ensuring that variables such as synchronization, redundancy, and latency are addressed.
Recognizing Threats From Unknown Real-Time Big Data System Faults
Processing big data in real time creates threats to the validity of the knowledge produced. This chapter discusses problems that may occur within the real-time data and the risks to the knowledge…

References

Hadoop high availability through metadata replication
A metadata replication based solution that enables Hadoop high availability by removing the single point of failure in Hadoop; it presents several unique features for Hadoop, such as a runtime-configurable synchronization mode.
Fault Tolerance in Hadoop for Work Migration
Hadoop [1] is an open-source software framework implemented in Java and designed to be used on large distributed systems. Hadoop is a project of the Apache Software Foundation…
Hadoop: The Definitive Guide
This comprehensive resource demonstrates how to use Hadoop to build reliable, scalable, distributed systems: programmers will find details for analyzing large datasets, and administrators will learn how to set up and run Hadoop clusters.
The Hadoop Distributed File System
The architecture of HDFS is described, and experience using HDFS to manage 25 petabytes of enterprise data at Yahoo! is reported.
Introduction to Hadoop. Dell technical white paper, April 2015. http://i.dell.com/sites/content/business/solutions/whitepapers/en/Documents/hadoop-introduction.pdf
The Hadoop Distributed File System: Architecture and Design, 2007. http://hadoop.apache.org/common/docs/r0.18.0/hdfs_design.pdf
Fault tolerance techniques for distributed systems. IBM developerWorks, 2004. http://www.ibm.com/developerworks/rational/library/114.htm
Fault-Tolerance Techniques in Distributed Systems