A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents
- Arman Cohan, Franck Dernoncourt, Nazli Goharian
- Computer Science
- North American Chapter of the Association for…
- 1 April 2018
This work proposes the first model for abstractive summarization of single, longer-form documents (e.g., research papers), consisting of a new hierarchical encoder that models the discourse structure of a document, and an attentive discourse-aware decoder to generate the summary.
CEDR: Contextualized Embeddings for Document Ranking
- Sean MacAvaney, Andrew Yates, Arman Cohan, Nazli Goharian
- Computer Science
- Annual International ACM SIGIR Conference on…
- 15 April 2019
This work investigates how two pretrained contextualized language models (ELMo and BERT) can be utilized for ad-hoc document ranking. It proposes a joint approach that incorporates BERT's classification vector into existing neural models and shows that this approach outperforms state-of-the-art ad-hoc ranking baselines.
Depression and Self-Harm Risk Assessment in Online Forums
- Andrew Yates, Arman Cohan, Nazli Goharian
- Computer Science
- Conference on Empirical Methods in Natural…
- 1 September 2017
This work introduces a large-scale general forum dataset consisting of users with self-reported depression diagnoses matched with control users. It also proposes methods for identifying posts in support communities that may indicate a risk of self-harm, and demonstrates that this approach outperforms strong previously proposed methods.
Hate speech detection: Challenges and solutions
- Sean MacAvaney, Hao-Ren Yao, Eugene Yang, Katina Russell, Nazli Goharian, O. Frieder
- Computer Science
- PLoS ONE
- 20 August 2019
This work identifies and examines challenges faced by online automatic approaches for hate speech detection in text, and proposes a multi-view SVM approach that achieves near state-of-the-art performance, while being simpler and producing more easily interpretable decisions than neural methods.
SMHD: a Large-Scale Resource for Exploring Online Language Usage for Multiple Mental Health Conditions
- Arman Cohan, Bart Desmet, Andrew Yates, Luca Soldaini, Sean MacAvaney, Nazli Goharian
- Computer Science
- International Conference on Computational…
- 1 June 2018
This paper investigates the creation of high-precision patterns to identify self-reported diagnoses of nine different mental health conditions, and obtains high-quality labeled data without the need for manual labelling.
Scientific Article Summarization Using Citation-Context and Article’s Discourse Structure
- Arman Cohan, Nazli Goharian
- Computer Science
- Conference on Empirical Methods in Natural…
- 1 September 2015
This work shows that the proposed summarization approach for scientific articles, which takes advantage of citation context and the document discourse model, improves over existing summarization approaches in terms of ROUGE scores on the TAC 2014 scientific summarization dataset (greater than 30% improvement over the best-performing baseline).
Ambiguity measure feature-selection algorithm
- Saket S. R. Mengle, Nazli Goharian
- Computer Science
- J. Assoc. Inf. Sci. Technol.
- 1 May 2009
This work presents the ambiguity measure (AM) feature-selection algorithm, which selects the most unambiguous features from the feature set. The algorithm performs consistently better than the naive Bayes text classifier and significantly reduces the training time for the SVM algorithm.
Expansion via Prediction of Importance with Contextualization
- Sean MacAvaney, F. M. Nardini, R. Perego, N. Tonellotto, Nazli Goharian, O. Frieder
- Computer Science
- Annual International ACM SIGIR Conference on…
- 29 April 2020
This work proposes a representation-based ranking approach that explicitly models the importance of each term using a contextualized language model and performs passage expansion by propagating that importance to similar terms, narrowing the gap between inexpensive and cost-prohibitive passage ranking approaches.
Efficient Document Re-Ranking for Transformers by Precomputing Term Representations
- Sean MacAvaney, F. M. Nardini, R. Perego, N. Tonellotto, Nazli Goharian, O. Frieder
- Computer Science
- Annual International ACM SIGIR Conference on…
- 29 April 2020
The proposed approach, called PreTTR (Precomputing Transformer Term Representations), considerably reduces the query-time latency of deep transformer networks, making these networks more practical to use in a real-time ranking scenario.
Triaging Mental Health Forum Posts
- Arman Cohan, S. Young, Nazli Goharian
- Psychology, Computer Science
- CLPsych@HLT-NAACL
- 1 June 2016
This work approaches the task of automatically triaging forum posts as a multiclass classification problem, using a supervised classifier with lexical, psycholinguistic, and topic modeling features, among others, to identify critical cases.
...