Classification of Big Data With Application to Imaging Genetics

Authors: Magnus Orn Ulfarsson, Frosti Palsson, Jakob Sigurdsson, Johannes R. Sveinsson
Journal: Proceedings of the IEEE
Big data applications, such as medical imaging and genetics, typically generate datasets that consist of few observations n on many more variables p, a scenario that we denote as p ≫ n. Traditional data processing methods are often insufficient for extracting information out of big data. This calls for the development of new algorithms that can deal with the size, complexity, and the special structure of such datasets. In this paper, we consider the problem of classifying p ≫ n data and propose…
LDT-MRF: Log decision tree and map reduce framework to clinical big data classification
A novel method for big data classification, the Log Decision Tree and Map Reduce Framework (LDT-MRF), is proposed for parallel data classification; a novel parameter termed Logentropy is used to select the best feature attribute for data classification.
Optimized Decision tree rules using divergence based grey wolf optimization for big data classification in health care
A new algorithm, divergence based grey wolf optimization (DGWO), is introduced and compared with conventional methods such as the firefly algorithm, artificial bee colony algorithm, particle swarm optimization algorithm, genetic algorithm, and grey wolf optimizer.
Big data analytics in medical engineering and healthcare: methods, advances and challenges
Advances and technological progress in big data analytics for healthcare are introduced, including artificial intelligence (AI) with big data, infrastructure and cloud computing, advanced computation and data processing, privacy and cybersecurity, health economic outcomes and technology management, and smart healthcare with sensing, wearable devices and the Internet of Things.
Adaptive hybrid optimization enabled stack autoencoder-based MapReduce framework for big data classification
A big data classification model based on the optimization-enabled MapReduce framework was developed for the effective management of data using the adaptive E-Bat algorithm, which outperformed the existing methods in terms of TPR and accuracy.
Integrating Cuckoo search-Grey wolf optimization and Correlative Naive Bayes classifier with Map Reduce model for big data classification
Three metrics (accuracy, sensitivity, and specificity) are used to evaluate the performance of the proposed CGCNB-MRM approach, which achieved 80.7% accuracy with 84.5% sensitivity and 76.9% specificity, demonstrating its effectiveness in big data classification.
Unsupervised and Supervised Feature Extraction Methods for Hyperspectral Images Based on Mixtures of Factor Analyzers
A framework is proposed that automatically extracts the most important features for classification from an HSI, based on a mixture of factor analyzers (MFA), deep MFA, and supervised MFA.
Rider Chicken Optimization Algorithm-Based Recurrent Neural Network for Big Data Classification in Spark Architecture
An effective classification method named Rider Chicken Optimization Algorithm-based Recurrent Neural Network (RCOA-based RNN) is proposed to perform big data classification in the Spark architecture and to address complex classification problems in a reasonable time.
Big Data Analytics in Healthcare Systems
  • Lidong Wang, C. Alexander
  • Computer Science
    International Journal of Mathematical, Engineering and Management Sciences
  • 2019
Healthcare data, big data in healthcare systems, and the applications and advantages of big data analytics in healthcare are introduced, and the technological progress of big data in healthcare, such as cloud computing and stream processing, is presented.
Intelligent cloud workflow management and scheduling method for big data applications
A cloud workflow scheduling strategy based on an intelligent algorithm is proposed and implemented, and two-tier scheduling of cloud workflow tasks is realized by adjusting the combination strategy for cloud service resources.
Big Data Trends and Analytics: A Survey
This paper discusses the concept of big data, its characteristics and challenges; its main focus is on data generated in various sectors, analytics, and the various tools used to manage data.


Discovering genetic associations with high-dimensional neuroimaging phenotypes: A sparse reduced-rank regression approach
This work proposes sparse reduced rank regression (sRRR), a strategy for multivariate modelling of high-dimensional imaging responses and genetic covariates and shows that sRRR offers a promising alternative for detecting brain-wide, genome-wide associations.
Classification of gene microarrays by penalized logistic regression.
Classification of patient samples is an important aspect of cancer diagnosis and treatment. The support vector machine (SVM) has been successfully applied to microarray cancer diagnosis problems.
Shrinkage-based diagonal discriminant analysis and its applications in high-dimensional data.
The studies show that the proposed shrinkage-based and regularization diagonal discriminant methods have lower misclassification rates than existing methods in many cases.
Gene selection using support vector machines with non-convex penalty
A unified procedure for simultaneous gene selection and cancer classification is provided, achieving high accuracy in both aspects, and a successive quadratic algorithm is proposed to convert the non-differentiable and non-convex optimization problem into easily solved linear equation systems.
Sparse reduced-rank regression detects genetic associations with voxel-wise longitudinal phenotypes in Alzheimer's disease
The application of a penalized multivariate model, sparse reduced-rank regression (sRRR), for the genome-wide detection of markers associated with voxel-wise longitudinal changes in the brain caused by Alzheimer's disease is discussed.
Sparse Discriminant Analysis
This work proposes sparse discriminant analysis, a method for performing linear discriminant analysis with a sparseness criterion imposed such that classification and feature selection are performed simultaneously in the high-dimensional setting.
Regularized linear discriminant analysis and its application in microarrays.
Through both simulated data and real-life data, it is shown that this method performs very well in multivariate classification problems, often outperforms the PAM method, and can be as competitive as support vector machine classifiers.
Sure independence screening for ultrahigh dimensional feature space
Summary. Variable selection plays an important role in high dimensional statistical modelling which nowadays appears in many areas and is key to various scientific discoveries. For problems of large
The doubly regularized support vector machine
The standard L2-norm support vector machine (SVM) is a widely used tool for classification problems. The L1-norm SVM is a variant of the standard L2-norm SVM, that constrains the L1-norm of the fitted
Sparse Variable PCA Using Geodesic Steepest Descent
A new svPCA is proposed, which is based on a statistical model, and this gives access to a range of modeling and inferential tools, and a novel form of Bayesian information criterion (BIC) for tuning parameter selection.