Corpus ID: 237346900

TIMo - A Dataset for Indoor Building Monitoring with a Time-of-Flight Camera

@article{Schneider2021TIMoA,
  title={TIMo - A Dataset for Indoor Building Monitoring with a Time-of-Flight Camera},
  author={Pascal Schneider and Yuriy Anisimov and Raisul Islam and Bruno Mirbach and Jason R. Rambach and Frederic Grandidier and Didier Stricker},
  journal={ArXiv},
  year={2021},
  volume={abs/2108.12196}
}
We present TIMo (Time-of-flight Indoor Monitoring), a dataset for video-based monitoring of indoor spaces captured using a time-of-flight (ToF) camera. The resulting depth videos feature people performing a set of different predefined actions, for which we provide detailed annotations. Person detection for people counting and anomaly detection are the two targeted applications. Most existing surveillance video datasets provide either grayscale or RGB videos. Depth information, on the other hand…
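The page does not include sample loading code, and the abstract does not specify the on-disk format of the depth videos. As an illustration only, here is a minimal Python sketch for reading a single depth frame, assuming the common RGB-D convention of 16-bit PNG files storing depth in millimeters; the function name, file layout, and 10 m range cap are all hypothetical:

```python
import numpy as np
import cv2

def load_depth_frame(path, max_depth_mm=10000):
    """Load one ToF depth frame and prepare it for visualization.

    Hypothetical example: TIMo's actual file layout and depth encoding
    are not specified in the abstract above, so this assumes a 16-bit
    PNG storing depth in millimeters, a common RGB-D convention.
    """
    depth = cv2.imread(path, cv2.IMREAD_UNCHANGED)  # uint16 array, H x W
    if depth is None:
        raise FileNotFoundError(path)
    depth_m = depth.astype(np.float32) / 1000.0  # millimeters -> meters
    # Scale to [0, 1] for display; clip returns beyond the assumed range.
    vis = np.clip(depth.astype(np.float32) / max_depth_mm, 0.0, 1.0)
    return depth_m, vis
```

Converting to meters once at load time keeps downstream person detection and anomaly detection code independent of the sensor's integer encoding.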

References

Showing 1-10 of 25 references
TICaM: A Time-of-flight In-car Cabin Monitoring Dataset
TLDR: A synthetic dataset of in-car cabin images with the same multi-modality of images and annotations, providing a unique and extremely beneficial combination of synthetic and real data for effectively training cabin monitoring systems and evaluating domain adaptation approaches.
ScanNet: Richly-Annotated 3D Reconstructions of Indoor Scenes
TLDR: This work introduces ScanNet, an RGB-D video dataset containing 2.5M views in 1513 scenes annotated with 3D camera poses, surface reconstructions, and semantic segmentations, and shows that using this data helps achieve state-of-the-art performance on several 3D scene understanding tasks.
UTD-MHAD: A multimodal dataset for human action recognition utilizing a depth camera and a wearable inertial sensor
TLDR: Describes a freely available dataset, named UTD-MHAD, which consists of four temporally synchronized data modalities (RGB videos, depth videos, skeleton positions, and inertial signals) captured from a Kinect camera and a wearable inertial sensor for a comprehensive set of 27 human actions.
NTU RGB+D 120: A Large-Scale Benchmark for 3D Human Activity Understanding
TLDR: This work introduces a large-scale dataset for RGB+D human action recognition, collected from 106 distinct subjects and containing more than 114 thousand video samples and 8 million frames, and investigates a novel one-shot 3D activity recognition problem on this dataset.
RGBD Datasets: Past, Present and Future
  • Michael Firman
  • Computer Science
  • 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
  • 2016
TLDR: This paper reviews datasets across eight categories: semantics, object pose estimation, camera tracking, scene reconstruction, object tracking, human actions, faces and identification, and considers which datasets have succeeded in driving computer vision forward and why.
NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis
TLDR: A large-scale dataset for RGB+D human action recognition with more than 56 thousand video samples and 4 million frames, collected from 40 distinct subjects, is introduced, and a new recurrent neural network structure is proposed to model the long-term temporal correlation of the features for each body part and utilize them for better action classification.
Comparison of Kinect V1 and V2 Depth Images in Terms of Accuracy and Precision
TLDR: A systematic comparison of the Kinect v1 and Kinect v2 is presented, investigating the accuracy and precision of the devices for their usage in the context of 3D reconstruction, SLAM, or visual odometry.
Joint 2D-3D-Semantic Data for Indoor Scene Understanding
TLDR: A dataset of large-scale indoor spaces that provides a variety of mutually registered modalities from 2D, 2.5D, and 3D domains, with instance-level semantic and geometric annotations, enabling the development of joint and cross-modal learning models and potentially unsupervised approaches utilizing the regularities present in large-scale indoor spaces.
SALT: A Semi-automatic Labeling Tool for RGB-D Video Sequences
TLDR: SALT, a tool for semi-automatically annotating RGB-D video sequences, is introduced; it generates 3D bounding boxes for full six-degrees-of-freedom (DoF) object poses, as well as pixel-level instance segmentation masks for both RGB and depth.
Real-World Anomaly Detection in Surveillance Videos
TLDR: The experimental results show that the MIL method for anomaly detection achieves a significant improvement in anomaly detection performance compared to state-of-the-art approaches; results for several recent deep learning baselines on anomalous activity recognition are also provided.
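The last entry above is directly relevant to TIMo's anomaly detection use case: it treats each video as a bag of temporal segments and trains a segment scoring network with a multiple instance learning (MIL) ranking objective. Below is a minimal PyTorch sketch of that objective, assuming per-segment anomaly scores produced by some scoring network; the margin of 1 follows the paper's formulation, while the lambda weights here are illustrative:

```python
import torch

def mil_ranking_loss(scores_anomalous, scores_normal,
                     lam_smooth=8e-5, lam_sparse=8e-5):
    """MIL ranking objective in the spirit of Sultani et al. (CVPR 2018).

    Both inputs are 1-D tensors of per-segment anomaly scores in [0, 1]
    for one anomalous and one normal video bag (e.g. 32 segments each).
    The lambda weights are illustrative, not tuned values.
    """
    # Hinge term: the top-scoring segment of the anomalous bag should
    # outrank the top-scoring segment of the normal bag by a margin of 1.
    hinge = torch.clamp(1.0 - scores_anomalous.max() + scores_normal.max(),
                        min=0.0)
    # Temporal smoothness: scores of adjacent segments should vary gradually.
    smoothness = ((scores_anomalous[1:] - scores_anomalous[:-1]) ** 2).sum()
    # Sparsity: anomalies are rare, so most segment scores should stay low.
    sparsity = scores_anomalous.sum()
    return hinge + lam_smooth * smoothness + lam_sparse * sparsity
```

Only bag-level labels (anomalous vs. normal video) are needed for training, which is what makes this formulation attractive for surveillance footage where frame-level anomaly annotations are expensive to obtain.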