Clustering longitudinal life‐course sequences using mixtures of exponential‐distance models

Authors: Keefe Murphy, Thomas Brendan Murphy, Raffaella Piccarreta, Isobel Claire Gormley
Journal: Journal of the Royal Statistical Society: Series A (Statistics in Society)
Sequence analysis is an increasingly popular approach for analysing life courses represented by ordered collections of activities experienced by subjects over time. Here, we analyse a survey data set containing information on the career trajectories of a cohort of Northern Irish youths tracked between the ages of 16 and 22. We propose a novel, model-based clustering approach suited to the analysis of such data from a holistic perspective, with the aims of estimating the number of typical career…


What matters in differences between life trajectories: a comparative review of sequence dissimilarity measures
The study shows that there is no universally optimal distance index, and that the choice of a measure depends on which aspect the authors want to focus on, and introduces novel ways of measuring dissimilarities that overcome some flaws in existing measures.
Model-based clustering of categorical time series
Two approaches for model-based clustering of categorical time series based on time-homogeneous first-order Markov chains are discussed. For Markov chain clustering the individual transition…
Strings of Adulthood: A Sequence Analysis of Young British Women’s Work-Family Trajectories
Employment, union formation and childbearing are central processes within young individuals’ transition to adulthood. These processes interact in highly complex ways, and they shape actual…
A framework for dissimilarity-based partitioning clustering of categorical time series
A new framework is proposed for clustering categorical time series using a dissimilarity-based partitioning method and a modified version of the k-modes algorithm specifically designed to provide a better characterization of the clusters.
Mixture Hidden Markov Models for Sequence Data: The seqHMM Package in R
The seqHMM package in R is designed for the efficient modeling of sequences and other categorical time series data containing one or multiple subjects with one or several interdependent sequences using HMMs and MHMMs.
Model-based biclustering of clickstream data
A model-based clustering approach relying on a mixture of first-order Markov models is considered, and states are clustered along with users, providing a biclustering framework with good results.
ClickClust: An R Package for Model-Based Clustering of Categorical Sequences
The R package ClickClust is a new piece of software devoted to finite mixture modeling and model-based clustering of categorical sequences based on finite mixtures of Markov models that extends the original clustering problem to a biclustering framework.
The analysis of early life courses: Complex descriptions of the transition to adulthood
The quantitative analysis of life courses has to deal with a complex pattern of interrelated events and trajectories. Such a complex pattern needs complex measurement tools, even if only to describe…
Gaussian parsimonious clustering models with covariates and a noise component
This paper addresses the equivalent aims of including covariates in Gaussian parsimonious clustering models and incorporating parsimonious covariance structures into all special cases of the Gaussian mixture of experts framework.
Handbook of Mixed Membership Models and Their Applications
This handbook spans more than 20 years of the editors' and contributors' statistical work in the field and explores the use of the models in various application settings, including survey data, population genetics, text analysis, image processing and annotation, and molecular biology.