• Publications
Attentional Factorization Machines: Learning the Weight of Feature Interactions via Attention Networks
TLDR
This paper proposes a novel model named Attentional Factorization Machine (AFM), which learns the importance of each feature interaction from data via a neural attention network and consistently outperforms the state-of-the-art deep learning methods Wide&Deep and DeepCross with a much simpler structure and fewer model parameters.
Hierarchical Recurrent Neural Encoder for Video Representation with Application to Captioning
TLDR
This paper proposes a new approach, the Hierarchical Recurrent Neural Encoder (HRNE), which exploits video temporal structure over a longer range by reducing the length of the input information flow and composing multiple consecutive inputs at a higher level.
DeepSaliency: Multi-Task Deep Neural Network Model for Salient Object Detection
TLDR
This paper proposes a multi-task deep saliency model based on a fully convolutional neural network with global input (whole raw images) and global output (whole saliency maps), and presents a graph Laplacian regularized nonlinear regression model for saliency refinement.
Dynamic Network Embedding by Modeling Triadic Closure Process
TLDR
This paper presents a novel representation learning approach, DynamicTriad, which preserves both the structural information and the evolution patterns of a given network, and shows that it can be applied effectively to identify telephone fraud in a mobile network and to predict whether a user will repay a loan in a loan network.
A Unified MRC Framework for Named Entity Recognition
TLDR
This paper formulates named entity recognition (NER) as a machine reading comprehension (MRC) task, which naturally handles the entity overlap issue in nested NER: extracting two overlapping entities with different categories requires answering two independent questions.
Video Question Answering via Gradually Refined Attention over Appearance and Motion
TLDR
This paper proposes an end-to-end model which gradually refines its attention over the appearance and motion features of the video using the question as guidance and demonstrates the effectiveness of the model by analyzing the refined attention weights during the question answering procedure.
Hallucinating faces: LPH super-resolution and neighbor reconstruction for residue compensation
TLDR
The proposed locality preserving hallucination (LPH) algorithm combines locality preserving projection (LPP) and radial basis function (RBF) regression together to hallucinate the global high-resolution face.
Supervised Coupled Dictionary Learning with Group Structures for Multi-modal Retrieval
TLDR
This paper introduces coupled dictionary learning (DL) into supervised sparse coding for multi-modal (cross-media) retrieval, an approach called Supervised coupled dictionary learning with group structures for Multi-modal retrieval (SliM2), and formulates the multi-modal mapping as a constrained dictionary learning problem.
HST-LSTM: A Hierarchical Spatial-Temporal Long-Short Term Memory Network for Location Prediction
TLDR
This paper proposes a Spatial-Temporal Long-Short Term Memory (ST-LSTM) model, which naturally combines spatial-temporal influence into LSTM to mitigate the problem of data sparsity, and evaluates it on a real-world trajectory data set.
Cross-media semantic representation via bi-directional learning to rank
TLDR
This paper proposes a general cross-media ranking algorithm, called the Bi-directional Cross-Media Semantic Representation Model (Bi-CMSRM), which optimizes the bi-directional listwise ranking loss with a latent space embedding.