A land-based tank aquaculture system coupled with effluent treatment ponds: Effect of stocking density on the growth performance of gibel carp (Carassius gibelio)
- Huacheng Li, Jieya Liu, Dapeng Li
- 1 April 2025
Environmental Science, Biology
M3Rec: A Context-Aware Offline Meta-Level Model-Based Reinforcement Learning Approach for Cold-Start Recommendation
- Yanan Wang, Yong Ge, Z. Li, Li Li, Rui Chen
- 25 April 2024
Computer Science
ACM Trans. Inf. Syst.
This article addresses the cold-start challenge in RL-based recommender systems by proposing a novel context-aware offline meta-level model-based RL approach for user adaptation, and it introduces a mutual information constraint between the user model and the recommendation agent.
Double Wins: Boosting Accuracy and Efficiency of Graph Neural Networks by Reliable Knowledge Distillation
- Qiaoyu Tan, D. Zha, Xia Hu
- 1 December 2023
Computer Science
A novel Reliable Knowledge Distillation framework for MLP optimization (RKDMLP) is proposed, which shows strong promise in achieving a “sweet point” in co-optimizing model accuracy and efficiency.
Chain-of-Query: Unleashing the Power of LLMs in SQL-Aided Table Understanding via Multi-Agent Collaboration
- Songyuan Sui, Hongyi Liu, Xia Hu
- 14 August 2025
Computer Science
IJCNLP-AACL
CoQ adopts natural-language-style representations of table schemas to abstract away structural noise and enhance understanding. It achieves substantial accuracy improvements and significantly lower invalid SQL rates than prior generic LLM-based, SQL-aided, and hybrid baselines, confirming its effectiveness in table understanding.
Addressing Delayed Feedback in Conversion Rate Prediction: A Domain Adaptation Approach
- Leisheng Yu, Yanxiao Cai, Xia Hu
- 9 December 2024
Computer Science
This work proposes a simple framework that redefines CVR prediction under delayed feedback as an unsupervised domain adaptation (UDA) problem, by integrating existing click-through rate (CTR) or CVR models with UDA algorithms.
Exploration into RL-based Language Model Finetuning
- Rui Chen
Computer Science
The report examines methods for training LLMs via preference finetuning from a reinforcement learning point of view. It starts with a baseline that runs SFT on smoltalk followed by DPO on ultrafeedback, then implements a novel approach, AGRO, which claims to provide a unified framework for most existing RL finetuning methods.
ERNIE 5.0 Technical Report
- Haifeng Wang, Hua Wu, Ziyuan Gao
- 4 February 2026
Computer Science
ERNIE 5.0 represents the first production-scale realization of a trillion-parameter unified autoregressive model that supports both multimodal understanding and generation and systematically addresses the challenges of scaling reinforcement learning to unified foundation models.
Fairness-Aware Mutual Information for Multimodal Recommendation
- Dan Lu, Xu Chen, Rui Chen, Shiqing Wu, Guandong Xu
- 16 August 2024
Computer Science
This work proposes a modality-guided representation learning framework using fairness-aware mutual information to disentangle sensitive and non-sensitive information from modal embeddings, and adopts a dual mutual information objective to decompose modal embeddings.