Data Mining and Machine Learning in Astronomy

  title={Data Mining and Machine Learning in Astronomy},
  author={Nicholas M. Ball (Herzberg Institute of Astrophysics, Victoria, BC, Canada) and Robert J. Brunner (Department of Physics Astronomy, Univ. of Illinois at Urbana-Champaign)},
  journal={arXiv: Instrumentation and Methods for Astrophysics},
We review the current state of data mining and machine learning in astronomy. 'Data Mining' can have a somewhat mixed connotation from the point of view of a researcher in this field. If used correctly, it can be a powerful approach, holding the potential to fully exploit the exponentially increasing amount of available data, promising great scientific advance. However, if misused, it can be little more than the black-box application of complex computing algorithms that may give little physical… 

Scientific Data Mining in Astronomy

  • K. Borne
  • Physics, Computer Science
    Next Generation of Data Mining
  • 2008
To facilitate data-driven discoveries in astronomy, a new data-oriented research paradigm for astronomy and astrophysics is envisioned -- astroinformatics, which is described as both a research approach and an educational imperative for modern data-intensive astronomy.

Surveying the reach and maturity of machine learning and artificial intelligence in astronomy

  • C. Fluke, C. Jacobs
  • Computer Science, Physics
    Wiley Interdiscip. Rev. Data Min. Knowl. Discov.
  • 2020
This review surveys contemporary, published literature on machine learning and artificial intelligence in astronomy and astrophysics for applications as diverse as discovering extrasolar planets, transient objects, quasars, and gravitationally lensed systems.

On the application of machine learning in astronomy and astrophysics: A text‐mining‐based scientometric analysis

A text‐mining‐based scientometric analysis of scientific documents published over the last three decades shows how application of AI/ML to the fields of astronomy/astrophysics represents an established and rapidly growing field of research that is crucial to obtaining scientific understanding of the universe.

Application of Decision Trees for Classifying Astronomical Objects

The ParDTLT algorithm, which possesses these characteristics, was applied in this work to the SDSS catalogue of astronomical objects, with the aim of obtaining decision rules that help astronomers understand the behavior patterns of different kinds of astronomical objects.

Machine-learning in astronomy

The first public release of the generic neural network training algorithm, called SkyNet, is discussed and its application to astronomical problems is demonstrated, focusing on its use in the BAMBI package for accelerated Bayesian inference in cosmology, and the identification of gamma-ray bursters.

Discussion on "Techniques for Massive-Data Machine Learning in Astronomy" by A. Gray

  • N. Ball
  • Physics, Computer Science
  • 2011
This discussion focuses on the questions raised by the practical application of astroinformatics and astrostatistics algorithms to real astronomical datasets, and what is needed to maximally leverage their potential to improve the science return.

Exploratory Analysis of Light Curves: A Case-Study in Astronomy Data Understanding

A case study of ongoing work on the exploratory analysis of unclassified light curves, which demonstrates the merit of a customized exploratory approach and discusses the scalability of the proposed method.

Machine-assisted discovery of relationships in astronomy

High-volume feature-rich data sets are becoming the bread-and-butter of 21st century astronomy but present significant challenges to scientific discovery. In particular, identifying scientifically

Success of Machine Learning algorithms in Dynamical Mass Measurements of Galaxy Clusters

In recent years, machine learning (ML) algorithms have been successfully employed in Astronomy for analyzing and interpreting the data collected from various surveys. The need for new robust and

From Supervised to Unsupervised Support Vector Machines and Applications in Astronomy

  • F. Gieseke
  • Computer Science
    KI - Künstliche Intelligenz
  • 2013
This work presents two optimization strategies that address this task and evaluates the potential of the resulting implementations on real-world data sets, including an example from the field of astronomy.



The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd Edition

This major new edition features many topics not covered in the original, including graphical models, random forests, ensemble methods, least angle regression and path algorithms for the lasso, non-negative matrix factorization, and spectral clustering.

Neural networks in astronomy

Scientific Data Mining - A Practical Perspective

Technological advances are enabling scientists to collect vast amounts of data in fields such as medicine, remote sensing, astronomy, and high-energy physics, to address the modern problem of data overload in science and engineering domains.

Mining Very Large Databases with Parallel Processing

Mining Very Large Databases with Parallel Processing addresses the problem of large-scale data mining, describing advances in the integration of three areas of computer science: "intelligent" (machine learning-based) data mining techniques, relational databases, and parallel processing.

Data Mining with Decision Trees - Theory and Applications

  • L. Rokach, O. Maimon
  • Computer Science
    Series in Machine Perception and Artificial Intelligence
  • 2007
This 2nd Edition is dedicated entirely to the field of decision trees in data mining; to cover all aspects of this important technique, as well as improved or new methods and techniques developed after the publication of the first edition.

Robust Machine Learning Applied to Astronomical Data Sets. I. Star-Galaxy Classification of the Sloan Digital Sky Survey DR3 Using Decision Trees

We provide classifications for all 143 million nonrepeat photometric objects in the Third Data Release of the SDSS using decision trees trained on 477,068 objects with SDSS spectroscopic data. We

Classifying bent-double galaxies

The work was performed while using the catalog from the FIRST survey to classify galaxies with a bent-double morphology, meaning those galaxies that appear to be bent in shape.

Nearest-Neighbor Methods in Learning and Vision: Theory and Practice (Neural Information Processing)

This volume presents theoretical and practical discussions of nearest-neighbor (NN) methods in machine learning and examines computer vision as an application domain in which the benefit of these advanced methods is often dramatic.

Support vector machines and kd-tree for separating quasars from large survey data bases

The study shows that both kd-tree and SVMs are effective automated algorithms to classify point sources and can be applied for the photometric preselection of quasar candidates for large survey projects in order to optimise the efficiency of telescopes.
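
As a rough illustration of how a kd-tree can drive point-source classification, the sketch below builds a small kd-tree and labels a query object by majority vote among its k nearest neighbours. This is only a minimal sketch and not the implementation used in the study: the nearest-neighbour voting scheme, the two colour-like features, the toy training set, and the names build_kdtree, knn, and classify are all assumptions made for the example.

```python
import math
from collections import Counter


def build_kdtree(points, labels, depth=0):
    """Recursively build a kd-tree; each node is a dict, empty subtrees are None."""
    if not points:
        return None
    axis = depth % len(points[0])  # cycle through the feature axes
    order = sorted(range(len(points)), key=lambda i: points[i][axis])
    mid = len(order) // 2
    left, right = order[:mid], order[mid + 1:]
    return {
        "point": points[order[mid]],
        "label": labels[order[mid]],
        "axis": axis,
        "left": build_kdtree([points[i] for i in left],
                             [labels[i] for i in left], depth + 1),
        "right": build_kdtree([points[i] for i in right],
                              [labels[i] for i in right], depth + 1),
    }


def knn(node, query, k, best=None):
    """Return the k nearest (distance, label) pairs to query, sorted by distance."""
    if best is None:
        best = []
    if node is None:
        return best
    best.append((math.dist(node["point"], query), node["label"]))
    best.sort(key=lambda pair: pair[0])
    del best[k:]  # keep only the k closest candidates seen so far
    diff = query[node["axis"]] - node["point"][node["axis"]]
    if diff < 0:
        near, far = node["left"], node["right"]
    else:
        near, far = node["right"], node["left"]
    knn(near, query, k, best)
    # Visit the far side only if the splitting plane is closer than the
    # current k-th neighbour, or if fewer than k neighbours were found.
    if len(best) < k or abs(diff) < best[-1][0]:
        knn(far, query, k, best)
    return best


def classify(tree, query, k=3):
    """Label the query by majority vote among its k nearest neighbours."""
    votes = Counter(label for _, label in knn(tree, query, k))
    return votes.most_common(1)[0][0]


# Hypothetical two-feature training set (for example, two photometric colours).
train_points = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.3),
                (1.1, 0.9), (1.0, 1.2), (0.9, 1.0)]
train_labels = ["quasar", "quasar", "quasar", "star", "star", "star"]
tree = build_kdtree(train_points, train_labels)
```

With this toy data, classify(tree, (0.2, 0.2)) votes among the three low-valued points, while classify(tree, (1.0, 1.0)) votes among the three high-valued ones. The pruning test in knn is what lets a kd-tree skip large parts of a catalogue instead of comparing the query against every object.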

Support Vector Machines

This book explains the principles that make support vector machines (SVMs) a successful modelling and prediction tool for a variety of applications and provides a unique in-depth treatment of both fundamental and recent material on SVMs that so far has been scattered in the literature.