Classification of Big Data With Application to Imaging Genetics

  • Magnus Orn Ulfarsson, Frosti Palsson, Jakob Sigurdsson, Johannes R. Sveinsson
  • Proceedings of the IEEE
Big data applications, such as medical imaging and genetics, typically generate datasets that consist of few observations n on many more variables p, a scenario that we denote as p ≫ n. Traditional data processing methods are often insufficient for extracting information out of big data. This calls for the development of new algorithms that can deal with the size, complexity, and special structure of such datasets. In this paper, we consider the problem of classifying p ≫ n data and propose…
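
One well-known reason traditional methods struggle when p ≫ n is that the p × p sample covariance matrix has rank at most n − 1, so it cannot be inverted as classical linear discriminant analysis requires. The sketch below illustrates only that general fact; it is not the method proposed in the paper, which the truncated abstract does not describe. The data are synthetic, and the helper names (sample_covariance, matrix_rank) and the sizes n = 5, p = 12 are assumptions chosen for illustration. Exact fractions are used so that the rank is computed without rounding error.

```python
import random
from fractions import Fraction


def sample_covariance(X):
    """Return the p x p sample covariance of X (n rows, p columns)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in X]
    return [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def matrix_rank(A):
    """Exact rank via Gaussian elimination (entries should be Fractions)."""
    M = [row[:] for row in A]
    n_rows, n_cols = len(M), len(M[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        # Find a row at or below the current rank with a nonzero entry.
        pivot = next((r for r in range(rank, n_rows) if M[r][col] != 0), None)
        if pivot is None:
            continue
        M[rank], M[pivot] = M[pivot], M[rank]
        # Eliminate the entries below the pivot.
        for r in range(rank + 1, n_rows):
            factor = M[r][col] / M[rank][col]
            for c in range(col, n_cols):
                M[r][c] -= factor * M[rank][c]
        rank += 1
    return rank


# Synthetic p >> n data: n = 5 observations on p = 12 variables (illustrative sizes).
random.seed(0)
n, p = 5, 12
X = [[Fraction(random.randint(-5, 5)) for _ in range(p)] for _ in range(n)]

S = sample_covariance(X)
rank = matrix_rank(S)
print(f"covariance is {p} x {p}, rank = {rank} (at most n - 1 = {n - 1})")
```

Because the rank is below p, the covariance matrix is singular. This is the kind of difficulty that the regularized, diagonal, and sparse discriminant methods in the listings below are designed to work around.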

LDT-MRF: Log decision tree and map reduce framework to clinical big data classification

A novel big data classification method, the Log Decision Tree and Map Reduce Framework (LDT-MRF), is proposed for parallel data classification; a novel parameter termed Logentropy is used to select the best feature attribute for classification.

Optimized Decision tree rules using divergence based grey wolf optimization for big data classification in health care

A new algorithm, divergence based grey wolf optimization (DGWO), is introduced and compared against conventional methods such as the firefly algorithm, artificial bee colony algorithm, particle swarm optimization algorithm, genetic algorithm, and grey wolf optimizer algorithm.

Big data analytics in medical engineering and healthcare: methods, advances and challenges

Advances and technological progress in big data analytics for healthcare are introduced, including artificial intelligence (AI) with big data, infrastructure and cloud computing, advanced computation and data processing, privacy and cybersecurity, health economic outcomes and technology management, and smart healthcare with sensing, wearable devices, and the Internet of Things.

Adaptive hybrid optimization enabled stack autoencoder-based MapReduce framework for big data classification

A big data classification model based on an optimization-enabled MapReduce framework was developed for the effective management of data using the adaptive E-Bat algorithm, which outperformed the existing methods in terms of TPR and accuracy.

A Review of Big Data Resource Management: Using Smart Grid Systems as a Case Study

This article addresses resource management from the perspective of smart grids and discusses it in terms of the vulnerabilities and security risks to data and information being transmitted or received, as well as big data analytics.

Unsupervised and Supervised Feature Extraction Methods for Hyperspectral Images Based on Mixtures of Factor Analyzers

A framework is proposed that automatically extracts the most important features for classification from an HSI using a mixture of factor analyzers (MFA), deep MFA, and supervised MFA.

A Review of Online Sequential Extreme Learning Machines

This work reviews the most important and latest works in the OS-ELM family and covers two topics: improved versions of OS-ELM, which aim to overcome the disadvantages of OS-ELM, and extended versions, whose goal is to add some specialties to OS-ELM.

Rider Chicken Optimization Algorithm-Based Recurrent Neural Network for Big Data Classification in Spark Architecture

An effective classification method named Rider Chicken Optimization Algorithm-based Recurrent Neural Network (RCOA-based RNN) is proposed to perform big data classification in Spark architecture and to address complex classification problems in a reasonable time.

Big Data Analytics in Healthcare Systems

  • Lidong Wang, C. Alexander
  • Medicine, Computer Science
    International Journal of Mathematical, Engineering and Management Sciences
  • 2019
Healthcare data, big data in healthcare systems, and the applications and advantages of big data analytics in healthcare are introduced, and the technological progress of big data in healthcare, such as cloud computing and stream processing, is presented.

Intelligent cloud workflow management and scheduling method for big data applications

A cloud workflow scheduling strategy based on an intelligent algorithm is proposed and realized, and two-tier scheduling of cloud workflow tasks is achieved by adjusting the combination strategy for cloud service resources.



Gene selection using support vector machines with non-convex penalty

A unified procedure for simultaneous gene selection and cancer classification is provided, achieving high accuracy in both aspects and a successive quadratic algorithm is proposed to convert the non-differentiable and non-convex optimization problem into easily solved linear equation systems.

Shrinkage‐based Diagonal Discriminant Analysis and Its Applications in High‐Dimensional Data

The studies show that the proposed shrinkage‐based and regularization diagonal discriminant methods have lower misclassification rates than existing methods in many cases.

Sparse Discriminant Analysis

This work proposes sparse discriminant analysis, a method for performing linear discriminant analysis with a sparseness criterion imposed such that classification and feature selection are performed simultaneously in the high-dimensional setting.

Regularized linear discriminant analysis and its application in microarrays.

Through both simulated and real-life data, it is shown that this method performs very well in multivariate classification problems, often outperforms the PAM method, and can be as competitive as support vector machine classifiers.

Penalized classification using Fisher's linear discriminant

  • D. Witten, R. Tibshirani
  • Computer Science
    Journal of the Royal Statistical Society. Series B, Statistical methodology
  • 2011
This work proposes penalized LDA, which is a general approach for penalizing the discriminant vectors in Fisher's discriminant problem in a way that leads to greater interpretability, and uses a minorization–maximization approach to optimize it efficiently when convex penalties are applied to the discriminating vectors.

Sparse Variable PCA Using Geodesic Steepest Descent

A new svPCA based on a statistical model is proposed, which gives access to a range of modeling and inferential tools, including a novel form of Bayesian information criterion (BIC) for tuning parameter selection.

Automatic classification of MR scans in Alzheimer's disease.

Assessment of how well support vector machines assigned individual diagnoses, and of whether data-sets combined from multiple scanners and different centres could be used to obtain effective classification of scans, suggests an important role for computer-based diagnostic image analysis in clinical practice.