
Fastformer: Additive Attention Can Be All You Need

@article{Wu2021FastformerAA,
  title={Fastformer: Additive Attention Can Be All You Need},
  author={Chuhan Wu and Fangzhao Wu and Tao Qi and Yongfeng Huang},
  journal={ArXiv},
  year={2021},
  volume={abs/2108.09084}
}
Transformer is a powerful model for text understanding. However, it is inefficient due to its quadratic complexity with respect to input sequence length. Although there are many methods for Transformer acceleration, they are still either inefficient on long sequences or not effective enough. In this paper, we propose Fastformer, which is an efficient Transformer model based on additive attention. In Fastformer, instead of modeling the pair-wise interactions between tokens, we first use additive attention…
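The abstract is cut off, but the central idea it describes is replacing pair-wise token interactions with additive attention that pools the whole sequence into global context vectors, bringing the cost down from quadratic to linear in sequence length. The NumPy snippet below is a minimal, hedged sketch of that idea, not the authors' implementation: the single-head layout, the weight vectors `wq`/`wk`, and the element-wise modulation steps are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def additive_attention_pool(x, w):
    # x: (n, d) token matrix, w: (d,) scoring vector.
    # Each token gets a scalar score w . x_i; a softmax over tokens gives
    # weights, and the pooled summary is the weighted sum of tokens.
    # Cost is O(n * d): no n x n attention matrix is ever formed.
    alpha = softmax(x @ w)                                  # (n,)
    return alpha @ x                                        # (d,)

def fastformer_style_layer(q, k, v, wq, wk):
    # Hypothetical single-head sketch of additive-attention interaction:
    # 1) pool queries into a global query vector,
    # 2) modulate keys with it element-wise, then pool into a global key,
    # 3) modulate values with the global key.
    global_q = additive_attention_pool(q, wq)               # (d,)
    global_k = additive_attention_pool(k * global_q, wk)    # (d,)
    return v * global_k                                     # (n, d)

# Toy usage: 8 tokens with hidden size 4.
rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(8, 4)) for _ in range(3))
out = fastformer_style_layer(q, k, v, rng.normal(size=4), rng.normal(size=4))
print(out.shape)  # (8, 4)
```

Every step above touches each token once, which is what makes the overall cost linear rather than quadratic in sequence length.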

DCAN: Diversified News Recommendation with Coverage-Attentive Networks

This work proposes a personalized news recommendation model called DCAN that captures multi-grained user-news matching signals through news encoders and user encoders and improves the diversity of news recommendations with minimal sacrifice in accuracy.

Personalized News Recommendation: Methods and Challenges

A novel perspective to understand personalized news recommendation based on its core problems and the associated techniques and challenges is proposed, instead of following the conventional taxonomy of news recommendation methods.

A News Recommendation Model Based on Time Awareness and News Relevance

  • Shaojun Ren, Chongyang Shi
  • Computer Science
    2022 IEEE 23rd International Conference on Information Reuse and Integration for Data Science (IRI)
  • 2022
A news recommendation model based on time awareness and news relevance, which combines various news auxiliary information and user-news interaction data in the form of a heterogeneous graph, and mines the temporal relationships in the user click sequence for news recommendation.

FUM: Fine-grained and Fast User Modeling for News Recommendation

The core idea of FUM is to concatenate the clicked news into a long document and transform user modeling into a document modeling task with both intra-news and inter-news word-level interactions.

VLSNR: Vision-Linguistics Coordination Time Sequence-aware News Recommendation

This work proposes a vision-linguistics coordinated, time sequence-aware news recommendation framework that uses an attentional GRU network to model user preference over time adequately, and constructs a large-scale multimodal news recommendation dataset, V-MIND.

User recommendation system based on MIND dataset

The core of the system uses the GloVe algorithm for word embeddings and representation, and a multi-head attention layer computes word-level attention to generate a list of recommended news.

Improving Graph-Based Movie Recommender System Using Cinematic Experience

A new graph-based movie recommender system that utilizes sentiment and emotion information along with user ratings is evaluated against well-known conventional models and state-of-the-art graph-based models; the results show that the proposed IGMC-based models coupled with emotion and sentiment are superior to the compared models.

Recommendation Systems: An Insight Into Current Development and Future Research Challenges

A gentle introduction to recommendation systems is provided, describing the task they are designed to solve and the challenges faced in research, and an extension to the standard taxonomy is presented, to better reflect the latest research trends, including the diverse use of content and temporal information.

A Novel Perspective to Look At Attention: Bi-level Attention-based Explainable Topic Modeling for News Classification

A novel practical framework is proposed that utilizes a two-tier attention architecture to decouple the complexity of explanation from the decision-making process, and is applied in the context of a news article classification task.

GRAM: Fast Fine-tuning of Pre-trained Language Models for Content-based Collaborative Filtering

This work proposes GRAM (GRadient Accumulation for Multi-modality in CCF), which exploits the fact that a given item often appears multiple times within a batch of interaction histories, and significantly improves training efficiency.

References


Neural News Recommendation with Multi-Head Self-Attention

A neural news recommendation approach that uses multi-head self-attention to learn news representations from news titles by modeling the interactions between words, and applies additive attention to learn more informative news and user representations by selecting important words and news.

MIND: A Large-scale Dataset for News Recommendation

This paper presents a large-scale dataset named MIND, constructed from the user click logs of Microsoft News, which contains 1 million users and more than 160k English news articles, each of which has rich textual content such as title, abstract and body.

Empowering News Recommendation with Pre-trained Language Models

Personalized news recommendation is an essential technique for online news services. News articles usually contain rich textual content, and accurate news modeling is important for personalized news recommendation.

Fine-grained Interest Matching for Neural News Recommendation

FIM, a Fine-grained Interest Matching method for neural news recommendation, hierarchically constructs multi-level representations for each news item via stacked dilated convolutions and performs fine-grained matching between segment pairs of each browsed news article and the candidate news at each semantic level.

Jointly modeling aspects, ratings and sentiments for movie recommendation (JMARS)

A probabilistic model based on collaborative filtering and topic modeling is proposed that captures the interest distribution of users and the content distribution for movies; it provides a link between interest and relevance on a per-aspect basis and allows differentiating between positive and negative sentiments on a per-aspect basis.

Ups and Downs: Modeling the Visual Evolution of Fashion Trends with One-Class Collaborative Filtering

This paper builds novel models for the One-Class Collaborative Filtering setting, where the goal is to estimate users' fashion-aware personalized ranking functions based on their past feedback and combines high-level visual features extracted from a deep convolutional neural network, users' past feedback, as well as evolving trends within the community.

Item Silk Road: Recommending Items from Information Domains to Social Users

This work presents a novel Neural Social Collaborative Ranking (NSCR) approach, which seamlessly sews up the user-item interactions in information domains and user-user connections in SNSs.

Poolingformer: Long Document Modeling with Pooling Attention

Experimental results show that Poolingformer sits atop three official leaderboards measured by F1, outperforming previous state-of-the-art models by 1.9 points, and results on the arXiv benchmark continue to demonstrate its superior performance.

Adam: A Method for Stochastic Optimization

This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
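For context, the "adaptive estimates of lower-order moments" referred to above are exponential moving averages of the gradient and its element-wise square. A single bias-corrected Adam update, using the commonly cited default hyperparameters, can be sketched as follows:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    # Exponential moving averages of the gradient (first moment)
    # and its element-wise square (second moment).
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias correction compensates for the zero initialization of m and v
    # (t is the 1-indexed step count).
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Per-coordinate step scaled by the second-moment estimate.
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```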

Synthesizer: Rethinking Self-Attention in Transformer Models

The true importance and contribution of the dot-product-based self-attention mechanism to the performance of Transformer models are investigated, and a model that learns synthetic attention weights without token-token interactions, called Synthesizer, is proposed.
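For illustration, the dense variant described in the Synthesizer paper predicts a full row of attention logits for each token from that token alone, so no query-key dot products between tokens are computed. The snippet below is a hedged approximation of that idea; the two-layer scoring network and the weight shapes are chosen only for illustration.

```python
import numpy as np

def dense_synthesizer_attention(x, W1, b1, W2, b2, V):
    # x: (n, d) tokens. W1: (d, h), W2: (h, n), V: (d, d_v).
    # Each token is mapped on its own to n attention logits via a small
    # two-layer network, so no token-token dot products are computed.
    hidden = np.maximum(x @ W1 + b1, 0.0)                 # (n, h), ReLU
    logits = hidden @ W2 + b2                             # (n, n) synthetic logits
    logits -= logits.max(axis=-1, keepdims=True)
    attn = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
    return attn @ (x @ V)                                 # (n, d_v)
```

One consequence of this formulation is that the output dimension of the scoring network is tied to a fixed maximum sequence length, since each token must emit one logit per position.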