Corpus ID: 237213708

DeepScale: An Online Frame Size Adaptation Approach to Accelerate Visual Multi-object Tracking

@inproceedings{Nalaie2021DeepScaleAO,
  title={DeepScale: An Online Frame Size Adaptation Approach to Accelerate Visual Multi-object Tracking},
  author={Keivan Nalaie and Rong Zheng},
  year={2021}
}
In surveillance and search-and-rescue applications, it is important to perform multi-object tracking (MOT) in real time on low-end devices. Today's MOT solutions employ deep neural networks, which tend to have high computation complexity. Recognizing the effect of frame size on tracking performance, we propose DeepScale, a model-agnostic frame size selection approach that operates on top of existing fully convolutional network-based trackers to accelerate tracking throughput. In the training…
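The abstract describes selecting a frame size online before running an FCN-based tracker. The sketch below is a minimal, hypothetical illustration of such a controller; the candidate resolutions, confidence thresholds, and the `tracker.update` interface are assumptions for illustration only, not DeepScale's actual (learned) selection mechanism.

```python
# Hypothetical sketch of online frame-size adaptation in front of an
# FCN-based tracker. Candidate sizes, thresholds, and the tracker API
# are illustrative assumptions, not DeepScale's actual controller.
import cv2
import numpy as np

CANDIDATE_SIZES = [(1088, 608), (864, 480), (576, 320)]  # (width, height), assumed

def choose_frame_size(prev_confidences, low=0.45, high=0.70):
    """Pick a resolution based on how confident the tracker was on the last frame."""
    if len(prev_confidences) == 0:
        return CANDIDATE_SIZES[0]      # nothing tracked yet: use full size to (re)detect
    mean_conf = float(np.mean(prev_confidences))
    if mean_conf > high:
        return CANDIDATE_SIZES[-1]     # easy scene: smallest frame, fastest inference
    if mean_conf > low:
        return CANDIDATE_SIZES[1]      # medium difficulty
    return CANDIDATE_SIZES[0]          # hard scene: full resolution

def track_video(frames, tracker):
    prev_conf = []
    results = []
    for frame in frames:
        w, h = choose_frame_size(prev_conf)
        resized = cv2.resize(frame, (w, h))
        # `tracker` stands for any FCN-based MOT model exposing a per-frame
        # call that returns boxes and detection confidences (assumed interface).
        boxes, confidences = tracker.update(resized)
        prev_conf = confidences
        results.append(boxes)
    return results
```

Throughput then scales with the chosen resolution: easy frames are processed at a smaller size, while difficult frames fall back to full resolution to preserve tracking accuracy.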

References

SHOWING 1-10 OF 34 REFERENCES
TubeTK: Adopting Tubes to Track Multi-Object in a One-Step Training Model
Proposes TubeTK, a concise end-to-end model that needs only one-step training, introducing the "bounding-tube" to indicate the temporal-spatial locations of objects in a short video clip; it achieves state-of-the-art performance even without ready-made detection results.
MOTS: Multi-Object Tracking and Segmentation
Creates dense pixel-level annotations for two existing tracking datasets using a semi-automatic annotation procedure, and proposes a new baseline method that jointly addresses detection, tracking, and segmentation with a single convolutional network.
Simple online and realtime tracking with a deep association metric
Integrates appearance information to improve the performance of SORT, reducing the number of identity switches while achieving overall competitive performance at high frame rates.
Discriminative Appearance Modeling with Multi-track Pooling for Real-time Multi-object Tracking
Simultaneously considers all tracks during memory updating, with only a small spatial overhead, via a novel multi-track pooling module, and proposes a training strategy adapted to multi-track pooling that generates hard tracking episodes online.
MOT20: A benchmark for multi object tracking in crowded scenes
Presents the MOT20 benchmark, consisting of 8 new sequences depicting very crowded, challenging scenes, which makes it possible to evaluate state-of-the-art multiple object tracking methods in extremely crowded scenarios.
AdaScale: Towards Real-time Video Object Detection Using Adaptive Scaling
Proposes AdaScale, which adaptively selects the input image scale to improve both accuracy and speed for video object detection, showing that rescaling the image to a lower resolution can sometimes yield better accuracy.
AR-Net: Adaptive Frame Resolution for Efficient Action Recognition
Proposes AR-Net (Adaptive Resolution Network), which selects on the fly the optimal resolution for each frame, conditioned on the input, for efficient action recognition in long untrimmed videos.
Focal Loss for Dense Object Detection
Addresses the extreme foreground-background class imbalance encountered during training of dense detectors by reshaping the standard cross-entropy loss so that it down-weights the loss assigned to well-classified examples; the resulting Focal Loss focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector.
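For reference, the focal loss described above has the closed form FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). The sketch below is a straightforward NumPy implementation of the binary case; alpha = 0.25 and gamma = 2.0 are the commonly reported defaults, not values taken from this page.

```python
# Minimal NumPy sketch of the binary focal loss:
#   FL(p_t) = -alpha_t * (1 - p_t)**gamma * log(p_t)
# alpha=0.25 and gamma=2.0 are commonly used defaults; tune per task.
import numpy as np

def focal_loss(probs, labels, alpha=0.25, gamma=2.0, eps=1e-7):
    """probs: predicted foreground probabilities in (0, 1); labels: 0/1 ground truth."""
    probs = np.clip(probs, eps, 1.0 - eps)
    p_t = np.where(labels == 1, probs, 1.0 - probs)        # probability of the true class
    alpha_t = np.where(labels == 1, alpha, 1.0 - alpha)    # class-balancing weight
    loss = -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)   # down-weights easy examples
    return loss.mean()
```

With gamma = 0 and alpha = 0.5 this reduces (up to a constant factor) to the standard cross-entropy loss; increasing gamma suppresses the contribution of well-classified examples.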
Near-Online Multi-target Tracking with Aggregated Local Flow Descriptor
  • Wongun Choi
  • Computer Science
  • 2015 IEEE International Conference on Computer Vision (ICCV)
  • 2015
Proposes a novel Aggregated Local Flow Descriptor (ALFD) that encodes the relative motion pattern between a pair of temporally distant detections using long-term interest point trajectories (IPTs); ablative analysis verifies the superiority of the ALFD metric over other conventional affinity metrics.