• Publications
Why we twitter: understanding microblogging usage and communities
TLDR
People are found to use microblogging to talk about their daily activities and to seek or share information. User intentions are also analyzed at the community level to show how users with similar intentions connect with each other.
MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices
TLDR
MobileBERT is a thin version of BERT_LARGE equipped with bottleneck structures and a carefully designed balance between self-attention and feed-forward networks; it can be generically applied to various downstream NLP tasks via simple fine-tuning.
Large Batch Optimization for Deep Learning: Training BERT in 76 minutes
TLDR
The empirical results demonstrate the superior performance of LAMB across various tasks such as BERT and ResNet-50 training with very little hyperparameter tuning, and the optimizer enables use of very large batch sizes of 32868 without any degradation of performance.
Evolutionary spectral clustering by incorporating temporal smoothness
TLDR
This paper proposes two frameworks that incorporate temporal smoothness in evolutionary spectral clustering and demonstrates that the proposed methods provide the optimal solutions to the relaxed versions of the corresponding evolutionary k-means clustering problems.
BigNAS: Scaling Up Neural Architecture Search with Big Single-Stage Models
TLDR
BigNAS is proposed, an approach that challenges the conventional wisdom that post-processing of the weights is necessary to get good prediction accuracies. It trains a single set of shared weights on ImageNet and uses these weights to obtain child models whose sizes range from 200 to 1000 MFLOPs.
Why We Twitter: An Analysis of a Microblogging Community
TLDR
It is found that people use microblogging primarily to talk about their daily activities and to seek or share information and that users with similar intentions connect with each other.
On evolutionary spectral clustering
TLDR
This article proposes two frameworks that incorporate temporal smoothness in evolutionary spectral clustering and demonstrates that the proposed methods provide the optimal solutions to the relaxed versions of the corresponding evolutionary k-means clustering problems.
Identifying opinion leaders in the blogosphere
TLDR
The InfluenceRank algorithm ranks blogs according to not only how important they are compared to other blogs, but also how novel the information they can contribute to the network is.
Reducing BERT Pre-Training Time from 3 Days to 76 Minutes
TLDR
The LAMB optimizer is proposed, which helps to scale the batch size to 65536 without losing accuracy, and is a general optimizer that works for both small and large batch sizes and does not need hyper-parameter tuning besides the learning rate.
Personalized recommendation driven by information flow
We propose that the information access behavior of a group of people can be modeled as an information flow issue, in which people intentionally or unintentionally influence and inspire each other, ...