Corpus ID: 27020051

A Big Data Hadoop Architecture for Online Analysis

Author: Ramlal Naik
Big Data is a collection of data that is too large or complex to process using on-hand database management tools or data processing applications. Big Data has recently become one of the important issues in the networking world. Hadoop is a distributed paradigm used to manipulate large amounts of data. This manipulation covers not only storage but also processing of the data. Hadoop is normally used for data-intensive applications. It actually holds the huge amount of data and upon…

In this paper we discuss the various challenges of Big Data and the problems that arise from the continuous explosion of data resulting from the likes of social media and other online sources to gain access to…
Big data emerging technologies: A CaseStudy with analyzing twitter data using apache hive
Presents a comprehensive study of the major emerging Big Data technologies, highlighting their important features and how they work, together with a comparative study between them, and reports a performance analysis of Apache Hive queries over Twitter tweets that measures the MapReduce CPU time spent and the total time taken to finish the job.
Simplified HDFS Architecture with Blockchain Distribution of Metadata
Big data storage has become one of the great challenges due to the rapid growth in the volume, variety, velocity and veracity of data from various sources like social sites, Internet of Things, mobile…
Towards a Full Big Data Based Solution for Computationally Intensive Problems : Solving Motif Finding Problem as a Case Study
The challenges of implementing the Motif Finding (MF) problem on the services provided by a cloud storage platform are addressed, effective parallelization is accomplished, and the performance of such an integrated solution is evaluated.
Survey on analyzing and processing of EHR Medical data using Matlab and Hadoop
In today’s world, data is growing at a very high pace, a phenomenon we term Big Data. Big Data can be described as a large volume of data, which can be structured or unstructured. It…
Combining spark and snort technologies for detection of network attacks and anomalies: assessment of performance for the big data framework
Performance measurements of the proposed combined framework, which processes security data in a parallel computing environment to detect network attacks and anomalies, confirm its high efficiency for analyzing network traffic and security events.
DDoS Detection System: Using a Set of Classification Algorithms Controlled by Fuzzy Logic System in Apache Spark
A dynamic DDoS attack detection system based on three main components: classification algorithms, a distributed system, and a fuzzy logic system that dynamically selects an algorithm from a set of prepared classification algorithms that detect different DDoS patterns.
A Survey Paper on Machine Learning Approaches to Intrusion Detection
This proposed dissertation discusses the classification of various security attacks and the intrusion detection tools that can detect intrusion patterns and then forestall a break-in, thereby protecting the system from cyber-attacks.
An Overview: Stochastic Gradient Descent Classifier, Linear Discriminant Analysis, Deep Learning and Naive Bayes Classifier Approaches to Network Intrusion Detection
The aim of Network Intrusion Detection Systems is to detect anomalous patterns either while the attack is unfolding or after there is evidence that an intrusion has occurred.


Discretized Streams: An Efficient and Fault-Tolerant Model for Stream Processing on Large Clusters
D-Streams support a new recovery mechanism that improves efficiency over the traditional replication and upstream backup solutions in streaming databases: parallel recovery of lost state across the cluster.
Detecting DDoS attacks with Hadoop
This work proposes a novel DDoS detection method based on Hadoop that implements an HTTP GET flooding detection algorithm in MapReduce on the distributed computing platform.