Class discovery in galaxy classification

David Bazell and David J. Miller
The Astrophysical Journal
In recent years, automated, supervised classification techniques have been fruitfully applied to labeling and organizing large astronomical databases. These methods require off-line classifier training, based on labeled examples from each of the (known) object classes. In practice, only a small batch of labeled examples, hand-labeled by a human expert, may be available for training. Moreover, there may be no labeled examples for some classes present in the data; i.e., the database may contain… 
Objective Subclass Determination of Sloan Digital Sky Survey Spectroscopically Unclassified Objects
We analyze a portion of the SDSS photometric catalog, consisting of approximately 10,000 objects that have been spectroscopically classified into stars, galaxies, QSOs, late-type stars, and unknown
Robust machine learning applied to astronomical data sets. I. Star-galaxy classification of the Sloan Digital Sky Survey DR3 using decision trees
We provide classifications for all 143 million nonrepeat photometric objects in the Third Data Release of the SDSS using decision trees trained on 477,068 objects with SDSS spectroscopic data. We
Unobserved classes and extra variables in high-dimensional discriminant analysis
In supervised classification problems, the test set may contain data points belonging to classes not observed in the learning phase. Moreover, the same units in the test data may be measured on a set
Mining the SDSS Archive. I. Photometric Redshifts in the Nearby Universe
We present a supervised neural network approach to the determination of photometric redshifts. The method was fine-tuned to match the characteristics of the Sloan Digital Sky Survey, and as base of
Comparing Pattern Recognition Feature Sets for Sorting Triples in the FIRST Database
Pattern recognition techniques have been used with increasing success for coping with the tremendous amounts of data being generated by automated surveys. Usually this process involves construction
A Bayesian approach to star–galaxy classification
The Bayesian formalism developed here can be applied to improve the reliability of any star–galaxy classification schemes based on the measured values of morphology statistics alone.
Data Mining and Machine Learning in Astronomy
We review the current state of data mining and machine learning in astronomy. 'Data Mining' can have a somewhat mixed connotation from the point of view of a researcher in this field. If used
Les modèles de mélange, un outil utile pour la classification semi-supervisée
A survey of semi-supervised classification is given, with a detailed account of how to use it with generative models, which take into account the information contained in unlabeled data when learning the model parameters.
Classification and Anomaly Detection for Astronomical Survey Data
A star-galaxy separator for the UKIRT Infrared Deep Sky Survey (UKIDSS) and a novel anomaly detection method for cross-matched astronomical datasets, which prevents the use of methods unable to handle missing values and makes direct comparison between objects difficult.
A galaxy classification grid that better recognises early-type galaxy morphology
A. Graham. Monthly Notices of the Royal Astronomical Society, 2019.
A modified galaxy classification scheme for local galaxies is presented. It builds upon the Aitken-Jeans nebula sequence, by expanding the Jeans-Hubble tuning fork diagram, which itself contained key


A Mixture Model and EM-Based Algorithm for Class Discovery, Robust Classification, and Outlier Rejection in Mixed Labeled/Unlabeled Data Sets
A novel mixture model is proposed which treats as observed data not only the feature vector and the class label, but also the fact of label presence/absence for each sample, to address problems involving both the known, and unknown classes.
Morphological Classification of Galaxies by Artificial Neural Networks
We explore a method for automatic morphological classification of galaxies by an Artificial Neural Network algorithm. The method is illustrated using 13 galaxy parameters measured by machine
Text Classification from Labeled and Unlabeled Documents using EM
This paper shows that the accuracy of learned text classifiers can be improved by augmenting a small number of labeled training documents with a large pool of unlabeled documents, and presents two extensions to the algorithm that improve classification accuracy under these conditions.
We discuss the application of a class of machine learning algorithms known as decision trees to the process of galactic classification. In particular, we explore the application of oblique decision
Ensembles of Classifiers for Morphological Galaxy Classification
We compare the use of three algorithms for performing automated morphological galaxy classification using a sample of 800 galaxies. Classifiers are created using a single training set as well as
Model-based Gaussian and non-Gaussian clustering
The classification maximum likelihood approach is sufficiently general to encompass many current clustering algorithms, including those based on the sum of squares criterion and on the
The effect of unlabeled samples in reducing the small sample size problem and mitigating the Hughes phenomenon
By using additional unlabeled samples that are available at no extra cost, performance may be improved and the Hughes phenomenon thereby mitigated; more representative estimates can also be obtained.
Sloan Digital Sky Survey: Early Data Release
The Sloan Digital Sky Survey (SDSS) is an imaging and spectroscopic survey that will eventually cover approximately one-quarter of the celestial sphere and collect spectra of ≈10^6 galaxies, 100,000
Elements of Information Theory
The author examines the role of entropy, inequality, and randomness in the design of codes and the construction of codes in the rapidly changing environment.