Su Lin Blodgett

Though dialectal language is increasingly abundant on social media, few resources exist for developing NLP tools to handle such language. We conduct a case study of dialectal language in online conversational text by investigating African-American English (AAE) on Twitter. We propose a distantly supervised model to identify AAE-like language from…
While language identification works well on standard texts, it performs much worse on social media language, in particular dialectal language, even for English. First, to support work on English language identification, we contribute a new dataset of tweets annotated for English versus non-English, with attention to ambiguity, code-switching, and automatic…
We highlight an important frontier in algorithmic fairness: disparity in the quality of natural language processing algorithms when applied to language from authors of different social groups. For example, current systems sometimes analyze the language of females and minorities more poorly than they do of whites and males. We conduct an empirical analysis of…
We explore two techniques which use color to make sense of statistical text models. One method uses in-text annotations to illustrate a model’s view of particular tokens in particular documents. Another uses a high-level, “words-as-pixels” graphic to display an entire corpus. Together, these methods offer both zoomed-in and zoomed-out perspectives into a…