Abusive Language Detection in Online User Content
- Chikashi Nobata, J. Tetreault, A. Thomas, Yashar Mehdad, Yi Chang
- Computer Science, The Web Conference
- 11 April 2016
A machine-learning-based method for detecting hate speech in online user comments from two domains, which outperforms a state-of-the-art deep learning approach, together with a corpus of user comments annotated for abusive language, the first of its kind.
Yahoo! Learning to Rank Challenge Overview
This paper provides an overview and an analysis of the challenge, along with a detailed description of the released datasets, which were used internally at Yahoo! for learning the web search ranking function.
Attributed Network Embedding for Learning in a Dynamic Environment
- Jundong Li, Harsh Dani, Xia Hu, Jiliang Tang, Yi Chang, Huan Liu
- Computer Science, International Conference on Information and…
- 6 June 2017
DANE first provides an offline method for a consensus embedding and then leverages matrix perturbation theory to keep the resulting embeddings fresh in an online manner; extensive experiments corroborate the effectiveness and efficiency of the proposed framework.
Towards recency ranking in web search
This paper proposes a retrieval system that automatically detects and responds to recency-sensitive queries, along with several training methodologies important for training recency-sensitive rankers.
Robust early-learning: Hindering the memorization of noisy labels
The memorization effects of deep networks show that they will first memorize training data with clean labels and then those with noisy labels. The early stopping method therefore can be exploited for…
What is Tumblr: a statistical overview and comparison
Tumblr is found to have richer content than other microblogging platforms, and it combines characteristics of social networking, the traditional blogosphere, and social media.
Time is of the essence: improving recency ranking using Twitter data
A method is proposed that uses the micro-blogging data stream to detect fresh URLs and to compute novel and effective features for ranking them, and it is demonstrated to improve the effectiveness of the portal web search engine for real-time web search.
A Survey of Signed Network Mining in Social Media
A review of mining signed networks in the context of social media, with a discussion of some promising research directions and new frontiers of signed network mining.
Active Learning for Ranking through Expected Loss Optimization
- Bo Long, O. Chapelle, Ya Zhang, Yi Chang, Zhaohui Zheng, B. Tseng
- Computer Science, IEEE Transactions on Knowledge and Data…
- 19 July 2010
This paper derives a novel algorithm, expected discounted cumulative gain (DCG) loss optimization (ELO-DCG), to select the most informative examples, investigates both query-level and document-level active learning for ranking, and proposes a two-stage ELO-DCG algorithm that incorporates both query and document selection into active learning.
Mining social media with social theories: a survey
Some key social theories in mining social media, their verification approaches, interesting findings, and state-of-the-art algorithms are reviewed.