Want to be Retweeted? Large Scale Analytics on Factors Impacting Retweet in Twitter Network
TLDR
It is found that, amongst content features, URLs and hashtags have strong relationships with retweetability, and that the number of followers and followees as well as the age of the account seem to affect retweetability, while, interestingly, the number of past tweets does not predict retweetability of a user's tweet.
Crowdsourcing user studies with Mechanical Turk
TLDR
Although micro-task markets have great potential for rapidly collecting user measurements at low costs, it is found that special care is needed in formulating tasks in order to harness the capabilities of the approach.
The Case for Learned Index Structures
TLDR
The idea of replacing core components of a data management system with learned models has far-reaching implications for future system designs, and this work provides just a glimpse of what might be possible.
A taxonomy of visualization techniques using the data state reference model
  • Ed H. Chi
  • Computer Science
    IEEE Symposium on Information Visualization…
  • 9 October 2000
TLDR
The paper shows that the Data State Model not only helps researchers understand the space of design, but also helps implementers understand how information visualization techniques can be applied more broadly.
Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts
TLDR
This work proposes a novel multi-task learning approach, Multi-gate Mixture-of-Experts (MMoE), which explicitly learns to model task relationships from data and demonstrates the performance improvements by MMoE on real tasks including a binary classification benchmark, and a large-scale content recommendation system at Google.
Tweets from Justin Bieber's heart: the dynamics of the location field in user profiles
TLDR
The first in-depth study of user behavior with regard to the location field in Twitter user profiles found that a user's country and state can in fact be determined easily with decent accuracy, indicating that users implicitly reveal location information, with or without realizing it.
He says, she says: conflict and coordination in Wikipedia
TLDR
The growth of non-direct work in Wikipedia is examined and the development of tools to characterize conflict and coordination costs in Wikipedia is described, which may inform the design of new collaborative knowledge systems.
Top-K Off-Policy Correction for a REINFORCE Recommender System
TLDR
This work presents a general recipe for addressing biases in a production top-K recommender system at YouTube, built with a policy-gradient-based algorithm, i.e. REINFORCE, and proposes a novel top-K off-policy correction to account for the policy recommending multiple items at a time.
Using information scent to model user information needs and actions and the Web
TLDR
Two computational methods for understanding the relationship between user needs and user actions are described, both of which use a concept called “information scent”: the subjective sense of value and cost of accessing a page based on perceptual cues.
AntisymmetricRNN: A Dynamical System View on Recurrent Neural Networks
TLDR
This paper draws connections between recurrent networks and ordinary differential equations and proposes a special form of recurrent networks called AntisymmetricRNN, able to capture long-term dependencies thanks to the stability property of its underlying differential equation.