Optimizing Semantic Coherence in Topic Models
A novel statistical topic model, based on an automated evaluation metric for semantic coherence, that significantly improves topic quality in a large-scale document collection from the National Institutes of Health (NIH).
A Reductions Approach to Fair Classification
- Alekh Agarwal, A. Beygelzimer, Miroslav Dudík, J. Langford, H. Wallach
- Computer Science
- ICML
- 6 March 2018
The key idea is to reduce fair classification to a sequence of cost-sensitive classification problems, whose solutions yield a randomized classifier with the lowest (empirical) error subject to the desired constraints.
Rethinking LDA: Why Priors Matter
The prior structure advocated substantially increases the robustness of topic models to variations in the number of topics and to the highly skewed word frequency distributions common in natural language.
Evaluation methods for topic models
It is demonstrated experimentally that commonly used methods are unlikely to accurately estimate the probability of held-out documents, and two alternative methods that are both accurate and efficient are proposed.
Topic modeling: beyond bag-of-words
- H. Wallach
- Computer Science
- ICML
- 25 June 2006
A hierarchical generative probabilistic model that incorporates both n-gram statistics and latent topic variables by extending a unigram topic model to include properties of a hierarchical Dirichlet bigram language model is explored.
Datasheets for datasets
- Timnit Gebru, Jamie H. Morgenstern, Kate Crawford
- Computer Science
- Communications of the ACM
- 23 March 2018
Documentation to facilitate communication between dataset creators and dataset consumers is presented.
Polylingual Topic Models
- David Mimno, H. Wallach, Jason Naradowsky, David A. Smith, A. McCallum
- Computer Science, Linguistics
- EMNLP
- 6 August 2009
This work introduces a polylingual topic model that discovers topics aligned across multiple languages and demonstrates its usefulness in supporting machine translation and tracking topic trends across languages.
Language (Technology) is Power: A Critical Survey of “Bias” in NLP
A greater recognition of the relationships between language and social hierarchies is urged, encouraging researchers and practitioners to articulate their conceptualizations of “bias” and to center work around the lived experiences of members of communities affected by NLP systems.
Improving Fairness in Machine Learning Systems: What Do Industry Practitioners Need?
- Kenneth Holstein, Jennifer Wortman Vaughan, Hal Daumé, Miroslav Dudík, H. Wallach
- Computer Science
- CHI
- 13 December 2018
This first systematic investigation of commercial product teams' challenges and needs for support in developing fairer ML systems identifies areas of alignment and disconnect between the challenges faced by teams in practice and the solutions proposed in the fair ML research literature.
Manipulating and Measuring Model Interpretability
- Forough Poursabzi-Sangdeh, D. Goldstein, J. Hofman, Jennifer Wortman Vaughan, H. Wallach
- Computer Science, Psychology
- CHI
- 21 February 2018
A sequence of pre-registered experiments showed participants functionally identical models that varied only in two factors commonly thought to make machine learning models more or less interpretable: the number of features and the transparency of the model (i.e., whether the model internals are clear or black box).