LADDER: A Human-Level Bidding Agent for Large-Scale Real-Time Online Auctions

@article{Wang2017LADDERAH,
  title={LADDER: A Human-Level Bidding Agent for Large-Scale Real-Time Online Auctions},
  author={Yu Wang and Jiayi Liu and Yuxiang Liu and Jun Hao and Yang He and Jinghe Hu and Weipeng P. Yan and Mantian Li},
  journal={ArXiv},
  year={2017},
  volume={abs/1708.05565}
}
We present LADDER, the first deep reinforcement learning agent that can successfully learn control policies for large-scale real-world problems directly from raw inputs composed of high-level semantic information. The agent is based on an asynchronous stochastic variant of DQN (Deep Q Network) named DASQN. The inputs of the agent are plain-text descriptions of states of a game of incomplete information, i.e., real-time, large-scale online auctions, and the rewards are auction profits of very…

Key Quantitative Results

  • We apply the agent to an essential portion of JD's online RTB (real-time bidding) advertising business and find that it easily beats the former state-of-the-art bidding policy, which had been carefully engineered and calibrated by human experts. During JD.com's June 18th anniversary sale, the agent increased the company's ad revenue from that portion by more than 50%, while the advertisers' ROI (return on investment) also improved significantly.
  • We evaluated LADDER on a significant portion of JD DSP business with an online A/B test. The experimental results indicate that the industry was far from solving the RTB problem: LADDER easily outperformed the human-expert-calibrated ECPM policy. During JD.com’s June 18th anniversary sale, the agent raised the company’s ad revenue from that portion by more than 50%, while the advertisers’ ROI also improved by as much as 17%.
