DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries

Authors: Yue Wang, Vitor Campanholo Guizilini, Tianyuan Zhang, Yilun Wang, Hang Zhao, Justin Solomon
We introduce a framework for multi-camera 3D object detection. In contrast to existing works, which estimate 3D bounding boxes directly from monocular images or use depth prediction networks to generate input for 3D object detection from 2D information, our method manipulates predictions directly in 3D space. Our architecture extracts 2D features from multiple camera images and then uses a sparse set of 3D object queries to index into these 2D features, linking 3D positions to multi-view images… 
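
To make the 3D-to-2D indexing step concrete, here is a minimal NumPy sketch of how one 3D reference point per object query could be projected into each camera and used to look up 2D features. It is an illustration under stated assumptions, not the authors' implementation: the function name `sample_multiview_features`, the array layouts, the nearest-neighbor lookup, and the plain averaging over cameras are all hypothetical choices made for brevity, and the abstract does not specify any of them.

```python
import numpy as np


def sample_multiview_features(ref_points, projections, feature_maps):
    """Look up per-query features from multi-camera 2D feature maps.

    ref_points:   (Q, 3) 3D reference points, one per object query.
    projections:  (C, 3, 4) matrices mapping homogeneous 3D points to
                  feature-map pixel coordinates (assumed layout).
    feature_maps: (C, H, W, D) 2D features, one map per camera.

    Returns (Q, D) features averaged over the cameras that see each
    point, and a (Q, C) boolean visibility mask.
    """
    num_queries = ref_points.shape[0]
    num_cams, height, width, _ = feature_maps.shape

    # Homogeneous coordinates: (Q, 4).
    homog = np.concatenate([ref_points, np.ones((num_queries, 1))], axis=1)

    # Project every point into every camera: (Q, C, 3).
    cam_pts = np.einsum("cij,qj->qci", projections, homog)

    # Points behind a camera are never visible in it.
    depth = cam_pts[..., 2]
    in_front = depth > 1e-5
    safe_depth = np.where(in_front, depth, 1.0)  # avoid division by zero

    # Perspective divide, then round to the nearest feature-map cell.
    # (Nearest-neighbor is a simplification chosen for this sketch.)
    col = np.rint(cam_pts[..., 0] / safe_depth).astype(int)
    row = np.rint(cam_pts[..., 1] / safe_depth).astype(int)

    valid = (
        in_front
        & (col >= 0) & (col < width)
        & (row >= 0) & (row < height)
    )

    # Clip so indexing is safe; invalid samples are zeroed out below.
    col = np.clip(col, 0, width - 1)
    row = np.clip(row, 0, height - 1)

    cam_idx = np.arange(num_cams)[None, :]        # (1, C), broadcasts to (Q, C)
    sampled = feature_maps[cam_idx, row, col]     # (Q, C, D)
    sampled = sampled * valid[..., None]

    # Average over the cameras in which the point is visible.
    counts = np.maximum(valid.sum(axis=1, keepdims=True), 1)
    return sampled.sum(axis=1) / counts, valid
```

In this sketch a query whose reference point falls outside every camera simply receives a zero feature vector; the visibility mask is returned so that a caller can handle that case differently.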

