Learning Safety Equipment Detection using Virtual Worlds

  title={Learning Safety Equipment Detection using Virtual Worlds},
  author={Marco Di Benedetto and Enrico Meloni and Giuseppe Amato and F. Falchi and Claudio Gennaro},
  journal={2019 International Conference on Content-Based Multimedia Indexing (CBMI)},
Nowadays, the possibilities offered by state-of-the-art deep neural networks allow the creation of systems capable of recognizing and indexing visual content with very high accuracy. The performance of these systems relies on the availability of high-quality training sets containing a large number of examples (e.g. millions), in addition to the machine learning tools themselves. For several applications, very good training sets can be obtained, for example, crawling (noisily) annotated images… 

Learning accurate personal protective equipment detection from virtual worlds

This paper generated photo-realistic synthetic image sets to train deep learning models to recognize the correct use of personal safety equipment during at-risk work activities, and demonstrated that training with the generated synthetic training set, combined with a domain adaptation phase, is an effective solution for applications where no training set is available.

Learning to Detect Fallen People in Virtual Worlds

This work proposes a deep-learning-based computer vision approach for human fall detection that relies on widely available standard RGB cameras and trains a general-purpose object detector using a virtual world dataset in addition to real-world images.

Machine learning using synthetic images for detecting dust emissions on construction sites

A framework that overcomes the challenges of lacking sufficient imagery data for training computer vision algorithms to monitor construction dust is established and experimental results indicate that training dust detection algorithms with only synthetic images can achieve acceptable performance on real-world images.

Panoptic Segmentation in Industrial Environments using Synthetic and Real Data

Experiments show that the use of synthetic images drastically reduces the number of real images needed to obtain reasonable panoptic segmentation performance, and that the generated images are automatically labeled and hence effortless to obtain.

Instance Segmentation of Personal Protective Equipment using a Multi-stage Transfer Learning Process

Soft biometric object classes from the Open Images V5 and DeepFashion2 datasets are proposed to pre-train a mask segmentation network to detect and segment personal protective equipment in the workplace.

Relatable Clothing: Detecting Visual Relationships between People and Clothing

The release of the Relatable Clothing Dataset is presented, which contains 35287 person-clothing pairs and segmentation masks for the development of "worn" and "unworn" classification models, and a novel soft attention unit is proposed for performing "worn" and "unworn" classification using deep neural networks.

Relatable Clothing: Soft-Attention Mechanism for Detecting Worn/Unworn Objects

A novel visual relationship model architecture for “worn” and “unworn” clothing detection is proposed, which makes use of a soft attention mechanism for feature fusion between a conventional ResNet backbone and the authors' novel person-clothing mask feature extraction architecture.

Digital Twins: A Survey on Enabling Technologies, Challenges, Trends and Future Prospects

The paper provides a deep insight into the technology, lists design goals and objectives, highlights design challenges and limitations across industries, discusses research and commercial developments, provides its applications and use cases, and covers developments to date.

Artificial intelligence in construction asset management: a review of present status, challenges and future opportunities

This study conducted the first state-of-the-art research on AI for building asset management using bibliometric tools to identify prominent institutions, topics, and journals and identified three main trends that can be a reference point for future studies made by practitioners or researchers.



Unsupervised domain adaptation of virtual and real worlds for pedestrian detection

The transductive SVM (T-SVM) learning algorithm is explored in order to adapt virtual and real worlds for pedestrian detection and the use of unsupervised domain adaptation techniques that avoid human intervention during the adaptation process is proposed.

Driving in the Matrix: Can virtual worlds replace human-generated annotations for real world tasks?

A method to incorporate photo-realistic computer images from a simulation engine to rapidly generate annotated data that can be used for the training of machine learning algorithms, which offers the possibility of accelerating deep learning's application to sensor-based classification problems like those that appear in self-driving cars.

Virtual and Real World Adaptation for Pedestrian Detection

A domain adaptation framework, V-AYLA, in which different techniques are used to collect a few pedestrian samples from the target domain and combine them with the many examples of the source domain in order to train a domain-adapted pedestrian classifier that will operate in the target domain.

Training Deep Networks with Synthetic Data: Bridging the Reality Gap by Domain Randomization

This work presents a system for training deep neural networks for object detection using synthetic images that relies upon the technique of domain randomization, in which the parameters of the simulator are randomized in non-realistic ways to force the neural network to learn the essential features of the object of interest.
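As a rough illustration of the domain randomization idea summarized above, the following minimal sketch draws each simulator setting independently from a wide range, so that no two generated scenes share the same lighting, viewpoint, or textures. Everything here is an assumption made for illustration: the `SceneParams` fields, the value ranges, and the `sample_scene` and `sample_dataset` helpers are hypothetical and are not taken from the paper or from any particular simulator.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class SceneParams:
    """Hypothetical simulator settings; names and ranges are illustrative only."""
    light_intensity: float    # arbitrary units
    camera_yaw_deg: float     # viewing angle around the object
    camera_distance_m: float  # distance from camera to object
    texture_id: int           # index into an assumed bank of random textures
    num_distractors: int      # random clutter shapes added to the scene


def sample_scene(rng: random.Random, num_textures: int = 100) -> SceneParams:
    """Draw one scene configuration, sampling every parameter independently."""
    return SceneParams(
        light_intensity=rng.uniform(0.2, 2.0),
        camera_yaw_deg=rng.uniform(0.0, 360.0),
        camera_distance_m=rng.uniform(1.0, 10.0),
        texture_id=rng.randrange(num_textures),
        num_distractors=rng.randint(0, 10),
    )


def sample_dataset(n: int, seed: int = 0) -> List[SceneParams]:
    """Generate n randomized scene configurations, reproducible from the seed."""
    rng = random.Random(seed)
    return [sample_scene(rng) for _ in range(n)]
```

In a real pipeline each sampled configuration would be passed to a renderer that also emits bounding-box labels; that step depends on the simulator and is left out here.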

VIVID: Virtual Environment for Visual Deep Learning

A new Virtual Environment for Visual Deep Learning (VIVID) is presented, which offers large-scale diversified indoor and outdoor scenes and leverages the advanced human skeleton system, which enables us to simulate numerous complex human actions.

Learning appearance in virtual scenarios for pedestrian detection

Detecting pedestrians in images is a key functionality to avoid vehicle-to-pedestrian collisions. The most promising detectors rely on appearance-based pedestrian classifiers trained with labelled

Playing for Data: Ground Truth from Computer Games

It is shown that associations between image patches can be reconstructed from the communication between the game and the graphics hardware, which enables rapid propagation of semantic labels within and across images synthesized by the game, with no access to the source code or the content.

Beyond Grand Theft Auto V for Training, Testing and Enhancing Deep Learning in Self Driving Cars

The efficacy and flexibility of a "GTA-V"-like virtual environment is expected to provide an efficient well-defined foundation for the training and testing of Convolutional Neural Networks for safe driving.

Training a convolutional neural network for multi-class object detection using solely virtual world data

This work developed a CNN-based multi-class detection system that was trained solely on virtual world data and achieves competitive results compared to state-of-the-art detection systems.

The Pascal Visual Object Classes Challenge: A Retrospective

A review of the Pascal Visual Object Classes challenge from 2008-2012 and an appraisal of the aspects of the challenge that worked well, and those that could be improved in future challenges.