Feature Selection and Assessment of Lung Cancer Sub-types by Applying Predictive Models

Sara González, Daniel Castillo, Juan Manuel Gálvez, Ignacio Rojas, Luis Javier Herrera
The main goal of this study is to identify a robust set of genes capable of discriminating among the different sub-types of lung cancer: Small Cell Lung Carcinoma (SCLC), Adenocarcinoma (ACC), Squamous Cell Carcinoma (SCC) and Large Cell Lung Carcinoma (LCLC). To this end, a global differential gene expression analysis was performed using gene expression microarray data publicly available from the NCBI/GEO platform. Once the analysis was done, a total of 60…
Non-small-cell lung cancer classification via RNA-Seq and histology imaging probability fusion
A classification model using a late fusion methodology can considerably help clinicians distinguish between the aforementioned lung cancer subtypes compared with using each source of information separately, and can be applied to any cancer type or disease with heterogeneous sources of information.
KnowSeq R-Bioc package: The automatic smart gene expression tool for retrieving relevant biological knowledge
The KnowSeq R/Bioc package is designed as powerful, scalable and modular software focused on automating and assembling renowned bioinformatic tools with new features and functionalities. It comprises…
Improved Salp Swarm Algorithm based on opposition based learning and novel local search algorithm for feature selection
An improved version of the Salp Swarm Algorithm (ISSA) is proposed in this study to solve feature selection problems and select the optimal subset of features in wrapper mode. The study demonstrates that ISSA outperforms all baseline algorithms in terms of fitness values, accuracy, convergence curves, and feature reduction on most of the datasets used.


Gene expression profiling reveals novel biomarkers in nonsmall cell lung cancer
Gene sequences were differentially expressed as a function of tumor type, stage and differentiation grade, and high upregulation was observed for KRT15 and PKP1, which may be good markers to distinguish squamous‐cell carcinoma samples.
Multiclass classification for skin cancer profiling based on the integration of heterogeneous gene expression series
A novel methodological approach involving the integration of several heterogeneous skin cancer series, and a later multiclass classifier design, is proposed to provide the clinicians with an intelligent diagnosis support tool based on the use of a robust set of selected biomarkers, which simultaneously distinguishes among different cancer-related skin states.
Leukemia multiclass assessment and classification from Microarray and RNA-seq technologies integration at gene expression level
This work presents an integration of multiple Microarray and RNA-seq platforms, which has led to the design of a multiclass study by collecting samples from the four main types of leukemia. An innovative parameter referred to as coverage is also presented.
k-Nearest neighbor models for microarray gene expression analysis and clinical outcome prediction
This study identified factors that contribute to the MAQC-II project performance variation, and validated a KNN data analysis protocol using a newly generated clinical data set with 478 neuroblastoma patients.
Identification of Key Transcription Factors Associated with Lung Squamous Cell Carcinoma
  • F. Zhang, Xia Chen, +4 authors Hong Shi
  • Biology, Medicine
  • Medical science monitor : international medical journal of experimental and clinical research
  • 2017
NFIC, BRCA1, and NFATC2 might be the key transcription factors in the development of lung SCC by regulating the genes involved in cell cycle and DNA replication pathways.
Gene selection and classification of microarray data using random forest
It is shown that random forest has comparable performance to other classification methods, including DLDA, KNN, and SVM, and that the new gene selection procedure yields very small sets of genes (often smaller than alternative methods) while preserving predictive accuracy.
Unique microRNA molecular profiles in lung cancer diagnosis and prognosis.
Results indicate that miRNA expression profiles are diagnostic and prognostic markers of lung cancer.
The role of desmoglein-3 in the diagnosis of squamous cell carcinoma of the lung.
DSG3 was over-expressed in SQCCs but had very limited expression in both adenocarcinomas and non-neoplastic lungs, and can be a useful ancillary marker to separate SQCC from other subtypes of lung cancer.
Integration of RNA-Seq data with heterogeneous microarray data for breast cancer profiling
A new model is proposed to find the gene signature of breast cancer cell lines through the integration of heterogeneous data from different breast cancer datasets, obtained from microarray and RNA-Seq technologies; its performance was validated using previously unseen samples.
A Review of Feature Selection and Feature Extraction Methods Applied on Microarray Data
Various ways of performing dimensionality reduction on high-dimensional microarray data are summarised to provide a clearer idea of when to use each one of them for saving computational time and resources.