Human-AI Collaboration via Conditional Delegation: A Case Study of Content Moderation

Vivian Lai, Samuel Carton, Rajat Bhatnagar, Qingzi Vera Liao, Yunfeng Zhang, and Chenhao Tan. CHI Conference on Human Factors in Computing Systems.
Despite impressive performance on many benchmark datasets, AI models can still make mistakes, especially on out-of-distribution examples. It remains an open question how such imperfect models can be used effectively in collaboration with humans. Prior work has focused on AI assistance that helps people make individual high-stakes decisions, which does not scale to the large volume of relatively low-stakes decisions such as moderating social media comments. Instead, we propose conditional…
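The core idea of conditional delegation can be sketched as follows. Everything below is an illustrative assumption, not the paper's implementation: the rule format, function names, and decision labels are placeholders for whatever trustworthy-region rules human moderators would actually author.

```python
# Hedged sketch of conditional delegation: the AI decides only on comments
# that fall inside human-authored "trustworthy region" rules; all other
# comments are delegated to a human moderator.

def conditional_delegation(comments, model_predict, rules):
    """Split a batch of comments between the AI and human review.

    `model_predict` and `rules` are illustrative placeholders.
    """
    ai_decisions, human_queue = {}, []
    for comment in comments:
        if any(rule(comment) for rule in rules):
            # A trustworthy-region rule matched: let the AI decide.
            ai_decisions[comment] = model_predict(comment)
        else:
            # Outside the trusted region: delegate to a human.
            human_queue.append(comment)
    return ai_decisions, human_queue

# Toy usage: trust the AI only on comments containing an obvious keyword.
rules = [lambda c: "badword" in c.lower()]
model = lambda c: "remove"
ai, humans = conditional_delegation(
    ["This has BADWORD", "subtle sarcasm"], model, rules
)
```

The point of the split is workload, not accuracy alone: only the ambiguous remainder reaches human moderators.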

On the Effect of Information Asymmetry in Human-AI Teams

It is demonstrated that humans can use contextual information to adjust the AI's decision, resulting in complementary team performance (CTP); as in many real-world situations, humans have access to contextual information that the AI lacks.

Reliable Decision from Multiple Subtasks through Threshold Optimization: Content Moderation in the Wild

This study formulates real-world scenarios of content moderation and introduces a simple yet effective threshold optimization method that searches for the optimal thresholds of the multiple subtasks to make reliable moderation decisions in a cost-effective way.
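A minimal sketch of such a threshold search, assuming per-subtask scores in [0, 1] and illustrative false-positive/false-negative costs; the paper's actual search procedure, aggregation rule, and cost model may differ.

```python
from itertools import product

def optimize_thresholds(scores, labels, grid, fn_cost=5.0, fp_cost=1.0):
    """Grid-search one threshold per subtask.

    A comment is flagged if ANY subtask score meets its threshold
    (an assumed aggregation rule); costs are illustrative.
    """
    best, best_cost = None, float("inf")
    n_subtasks = len(scores[0])
    for thresholds in product(grid, repeat=n_subtasks):
        cost = 0.0
        for subtask_scores, label in zip(scores, labels):
            flagged = any(s >= t for s, t in zip(subtask_scores, thresholds))
            if flagged and label == 0:
                cost += fp_cost       # wrongly moderated a benign comment
            elif not flagged and label == 1:
                cost += fn_cost       # missed a comment that needed moderation
        if cost < best_cost:
            best, best_cost = thresholds, cost
    return best, best_cost

# Toy data: two subtasks (e.g. toxicity, spam); label 1 = should moderate.
scores = [(0.9, 0.1), (0.2, 0.8), (0.3, 0.2)]
labels = [1, 1, 0]
thresholds, cost = optimize_thresholds(scores, labels, grid=[0.25, 0.5, 0.75])
```

Making missed moderation (`fn_cost`) more expensive than over-moderation (`fp_cost`) is one way to encode the cost-sensitivity the summary describes; real deployments would pick these from platform policy.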

SoK: Content Moderation in Social Media, from Guidelines to Enforcement, and Research to Practice

To counter online abuse and misinformation, social media platforms have been establishing content moderation guidelines and employing various moderation policies. The goal of this paper is to study…



Understanding the Effect of Out-of-distribution Examples and Interactive Explanations on Human-AI Decision Making

A clear difference between in-distribution and out-of-distribution examples is demonstrated, and mixed results for interactive explanations are observed: while interactive explanations improve human perception of the AI assistance's usefulness, they may reinforce human biases and lead to limited performance improvement.

Does the Whole Exceed its Parts? The Effect of AI Explanations on Complementary Team Performance

This work conducts mixed-method user studies on three datasets, where an AI with accuracy comparable to humans helps participants solve a task (explaining itself in some conditions), and observes complementary improvements from AI augmentation that were not increased by explanations.

Are Explanations Helpful? A Comparative Study of the Effects of Explanations in AI-Assisted Decision-Making

This paper presents a comparison of the effects of a set of established XAI methods in AI-assisted decision making, and highlights three desirable properties that ideal AI explanations should satisfy: improving people's understanding of the AI model, helping people recognize the model's uncertainty, and supporting people's calibrated trust in the model.

OpenCrowd: A Human-AI Collaborative Approach for Finding Social Influencers via Open-Ended Answers Aggregation

OpenCrowd is presented, a unified Bayesian framework that seamlessly incorporates machine learning and crowdsourcing for effectively finding social influencers, and it is empirically shown that the approach is particularly useful for finding micro-influencers, who are directly engaged with smaller audiences.

Effect of confidence and explanation on accuracy and trust calibration in AI-assisted decision making

It is shown that confidence scores can help calibrate people's trust in an AI model, but trust calibration alone is not sufficient to improve AI-assisted decision making, which may also depend on whether the human can bring enough unique knowledge to complement the AI's errors.

Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking

In a user study in which participants used the system to aid their own assessment of claims, the results suggest that individuals tend to trust the system: participants' accuracy in assessing claims improved when they were exposed to correct model predictions.

Updates in Human-AI Teams: Understanding and Addressing the Performance/Compatibility Tradeoff

It is shown that updates that increase AI performance may actually hurt team performance, and a re-training objective is proposed to improve the compatibility of an update by penalizing new errors.
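One way such a compatibility penalty could look, as a hedged sketch: upweight the loss on "new errors", i.e. examples the old model classified correctly but the updated model gets wrong. The weighting scheme and names below are assumptions for illustration, not the paper's exact re-training objective.

```python
def compatibility_loss(new_correct, old_correct, base_loss, penalty=2.0):
    """Average per-example loss with extra weight on regressions.

    new_correct / old_correct: per-example booleans for the updated and
    previous model; base_loss: per-example base losses. `penalty` is an
    illustrative multiplier applied only to new errors.
    """
    total = 0.0
    for nc, oc, loss in zip(new_correct, old_correct, base_loss):
        # A "new error" breaks user trust built on the old model's behavior.
        weight = penalty if (oc and not nc) else 1.0
        total += weight * loss
    return total / len(base_loss)
```

Intuitively, the penalty trades a little raw accuracy for behavioral consistency with the old model, which is what keeps the human teammate's mental model valid after the update.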

Proxy tasks and subjective measures can be misleading in evaluating explainable AI systems

This work conducted two online experiments and one in-person think-aloud study to evaluate two currently common techniques for evaluating XAI systems: using proxy, artificial tasks such as how well humans predict the AI's decision from the given explanations, and using subjective measures of trust and preference as predictors of actual performance.

Towards Unbiased and Accurate Deferral to Multiple Experts

This work proposes a framework that simultaneously learns a classifier and a deferral system, with the deferral system choosing to defer to one or more human experts on inputs where the classifier has low confidence.
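A simplified stand-in for such a deferral system: the paper learns classifier and deferral jointly, whereas the fixed confidence threshold and expert-accuracy estimates below are illustrative assumptions.

```python
def defer_or_predict(probs, expert_ids, expert_accuracy, conf_threshold=0.8):
    """Return the model's prediction when it is confident; otherwise defer
    to the expert with the highest estimated accuracy.

    probs: class-probability list from the classifier;
    expert_accuracy: assumed per-expert accuracy estimates.
    """
    top_class = max(range(len(probs)), key=lambda i: probs[i])
    if probs[top_class] >= conf_threshold:
        return ("model", top_class)          # classifier is confident enough
    best_expert = max(expert_ids, key=lambda e: expert_accuracy[e])
    return ("defer", best_expert)            # route to the strongest expert

# Toy usage with two experts of assumed accuracy.
acc = {"expert_a": 0.9, "expert_b": 0.7}
confident = defer_or_predict([0.05, 0.95], ["expert_a", "expert_b"], acc)
uncertain = defer_or_predict([0.55, 0.45], ["expert_a", "expert_b"], acc)
```

Always routing to the single best expert is the simplification here; the unbiased-deferral problem the paper addresses arises precisely because experts differ in accuracy and workload across inputs.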

Explaining models: an empirical study of how explanations impact fairness judgment

An empirical study with four types of programmatically generated explanations to understand how they impact people's fairness judgments of ML systems shows that certain explanations are considered inherently less fair, while others can enhance people's confidence in the fairness of the algorithm.