Corpus ID: 237266377

Fastformer: Additive Attention Can Be All You Need

Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang
Transformer is a powerful model for text understanding. However, it is inefficient because its complexity is quadratic in the input sequence length. Although there are many methods for Transformer acceleration, they are still either inefficient on long sequences or not effective enough. In this paper, we propose Fastformer, an efficient Transformer model based on additive attention. In Fastformer, instead of modeling the pair-wise interactions between tokens, we first use additive attention… 
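The abstract's key point is that additive attention summarizes a whole sequence in linear time, rather than computing all pairwise token interactions. A minimal sketch of that pooling step, assuming numpy and a hypothetical learned scoring vector `w` (the full Fastformer additionally mixes the resulting global vector back into keys and values):

```python
import numpy as np

def additive_attention_pool(X, w):
    """Pool a (seq_len, dim) sequence into one global (dim,) vector.

    X: token representations, shape (seq_len, dim)
    w: hypothetical learnable scoring vector, shape (dim,)
    Cost is O(seq_len * dim), linear in sequence length, unlike
    the O(seq_len^2) pairwise scores of standard self-attention.
    """
    scores = X @ w / np.sqrt(X.shape[1])       # one score per token
    alpha = np.exp(scores - scores.max())       # stable softmax
    alpha /= alpha.sum()                        # attention weights
    return alpha @ X                            # weighted sum of tokens
```

Because the output is a convex combination of the token vectors, it always lies inside their per-dimension range.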

DCAN: Diversified News Recommendation with Coverage-Attentive Networks

This work proposes a personalized news recommendation model called DCAN that captures multi-grained user-news matching signals through news encoders and user encoders and improves the diversity of news recommendations with minimal sacrifice in accuracy.

Personalized News Recommendation: Methods and Challenges

A novel perspective on personalized news recommendation, organized around its core problems and the associated techniques and challenges, is proposed, instead of following the conventional taxonomy of news recommendation methods.

A News Recommendation Model Based on Time Awareness and News Relevance

  • Shaojun Ren, Chongyang Shi
  • Computer Science
    2022 IEEE 23rd International Conference on Information Reuse and Integration for Data Science (IRI)
  • 2022
A news recommendation model based on time awareness and news relevance, which combines various kinds of news auxiliary information and user-news interaction data in the form of a heterogeneous graph, and mines the temporal relationships in users' click sequences for news recommendation.

FUM: Fine-grained and Fast User Modeling for News Recommendation

The core idea of FUM is to concatenate the clicked news into a long document and transform user modeling into a document modeling task with both intra-news and inter-news word-level interactions.

A Study on Deep Learning based News Recommender Systems

This study is intended to analyze the various technologies, difficulties, opportunities, and current state-of-the-art technologies that are used in the news recommender system to solve the news recommendation problem.

VLSNR: Vision-Linguistics Coordination Time Sequence-aware News Recommendation

This work proposes a vision-linguistics coordinated, time sequence-aware news recommendation approach that uses an attentional GRU network to adequately model user preference over time, and constructs a large-scale multimodal news recommendation dataset, V-MIND.

User recommendation system based on MIND dataset

The core of the system uses the GloVe algorithm for word embeddings and representation, and a multi-head attention layer calculates the attention over words to generate a list of recommended news.

Improving Graph-Based Movie Recommender System Using Cinematic Experience

A new graph-based movie recommender system that utilizes sentiment and emotion information along with user ratings; evaluation against well-known conventional models and state-of-the-art graph-based models shows that the proposed IGMC-based models coupled with emotion and sentiment information are superior to the compared models.

Recommendation Systems: An Insight Into Current Development and Future Research Challenges

A gentle introduction to recommendation systems is provided, describing the task they are designed to solve and the challenges faced in research, and an extension to the standard taxonomy is presented, to better reflect the latest research trends, including the diverse use of content and temporal information.

A Novel Perspective to Look At Attention: Bi-level Attention-based Explainable Topic Modeling for News Classification

A novel practical framework is proposed by utilizing a two-tier attention architecture to decouple the complexity of explanation and the decision-making process and is applied in the context of a news article classification task.

Neural News Recommendation with Multi-Head Self-Attention

A neural news recommendation approach with multi-head self-attention that learns news representations from news titles by modeling the interactions between words, and applies additive attention to learn more informative news and user representations by selecting important words and news.

MIND: A Large-scale Dataset for News Recommendation

This paper presents a large-scale dataset named MIND, constructed from the user click logs of Microsoft News, which contains 1 million users and more than 160k English news articles, each of which has rich textual content such as title, abstract and body.

Empowering News Recommendation with Pre-trained Language Models

Personalized news recommendation is an essential technique for online news services. News articles usually contain rich textual content, and accurate news modeling is important for personalized news… 

Jointly modeling aspects, ratings and sentiments for movie recommendation (JMARS)

A probabilistic model based on collaborative filtering and topic modeling is proposed that captures the interest distribution of users and the content distribution of movies; it provides a link between interest and relevance on a per-aspect basis and allows differentiating between positive and negative sentiments on a per-aspect basis.

Ups and Downs: Modeling the Visual Evolution of Fashion Trends with One-Class Collaborative Filtering

This paper builds novel models for the One-Class Collaborative Filtering setting, where the goal is to estimate users' fashion-aware personalized ranking functions based on their past feedback and combines high-level visual features extracted from a deep convolutional neural network, users' past feedback, as well as evolving trends within the community.

Item Silk Road: Recommending Items from Information Domains to Social Users

This work presents a novel Neural Social Collaborative Ranking (NSCR) approach, which seamlessly sews up the user-item interactions in information domains and user-user connections in SNSs.

Adam: A Method for Stochastic Optimization

This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
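The summary above names Adam's two ingredients: exponential moving averages of the gradient and its square (the "lower-order moments"), with bias correction for their zero initialization. A minimal sketch of one update, assuming numpy; the function name `adam_step` is illustrative, while the default hyperparameters are the ones recommended in the paper:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update for parameters theta at step t (t starts at 1)."""
    m = b1 * m + (1 - b1) * grad          # first moment: running mean of grad
    v = b2 * v + (1 - b2) * grad ** 2     # second moment: running mean of grad^2
    m_hat = m / (1 - b1 ** t)             # bias-corrected estimates
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

Early in training the effective step is close to `lr` in magnitude regardless of the raw gradient scale, which is what makes Adam robust to ill-scaled objectives.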

Synthesizer: Rethinking Self-Attention in Transformer Models

The true importance and contribution of the dot product-based self-attention mechanism on the performance of Transformer models is investigated and a model that learns synthetic attention weights without token-token interactions is proposed, called Synthesizer.
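The Synthesizer idea summarized above is that attention weights need not come from query-key dot products at all. A minimal sketch of its "random" variant, assuming numpy: the weight matrix `W` is a free learned parameter, independent of the input tokens, and only the value mixing depends on the sequence:

```python
import numpy as np

def random_synthesizer_attention(V, W):
    """Synthesizer 'random' variant: attention weights come from a learned
    (seq_len, seq_len) matrix W and never depend on token-token interactions.

    V: value vectors, shape (seq_len, dim)
    """
    A = np.exp(W - W.max(axis=-1, keepdims=True))   # stable row-wise softmax
    A /= A.sum(axis=-1, keepdims=True)
    return A @ V                                     # mix values with fixed weights
```

With `W` all zeros, every row of `A` is uniform and each output position is simply the mean of the value vectors.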

Hi-Transformer: Hierarchical Interactive Transformer for Efficient and Effective Long Document Modeling

A hierarchical interactive Transformer (Hi-Transformer) is proposed for efficient and effective long document modeling; it first learns sentence representations and then learns document representations, and uses a hierarchical pooling method to obtain the document embedding.

Attention is All you Need

A new simple network architecture, the Transformer, based solely on attention mechanisms and dispensing with recurrence and convolutions entirely, is proposed; it generalizes well to other tasks, as shown by applying it successfully to English constituency parsing with both large and limited training data.
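The attention mechanism this paper introduced, and which Fastformer's additive attention approximates in linear time, is scaled dot-product attention: softmax(QKᵀ/√d_k)V. A minimal single-head sketch, assuming numpy:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V.

    Q: queries, shape (n_q, d_k); K: keys, shape (n_k, d_k);
    V: values, shape (n_k, d_v). Returns shape (n_q, d_v).
    Computing the (n_q, n_k) score matrix is the quadratic cost
    that Fastformer and other efficient Transformers avoid.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                     # pairwise scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # row-wise softmax
    return weights @ V
```

When all keys are identical, the softmax is uniform and every query simply averages the values, a handy sanity check for an implementation.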