Comparison of state-of-the-art deep learning APIs for image multi-label classification using semantic metrics

Adam Kubany, Shimon Ben Ishay, Ruben-sacha Ohayon, Armin Shmilovici, Lior Rokach, Tomer Doitshman
Expert Syst. Appl.
Abstract: Image understanding heavily relies on accurate multi-label classification. In recent years, deep learning algorithms have become very successful for such tasks, and various commercial and open-source APIs have been released for public use. However, these APIs are often trained on different datasets, which, besides affecting their performance, might pose a challenge to their performance evaluation. This challenge concerns the different object-class dictionaries of the APIs’ training…
LCP-Net: A local context-perception deep neural network for medical image segmentation
A deep neural network (LCP-Net) is proposed that perceives multi-scale context information in images and improves segmentation accuracy for small objects, together with a novel improved cross-entropy loss (DDCLoss) that adaptively adjusts the loss weight according to the certainty and deviation distance of the predicted pixel value.
IMU Data and GPS Position Information Direct Fusion Based on LSTM
The trained LSTM is a dependable fusion method for combining IMU data and GPS position information to estimate position and has no cumulative divergence error compared to SINS (computed).
An efficient deep Convolutional Neural Network based detection and classification of Acute Lymphoblastic Leukemia
  • Pradeep Kumar Das, S. Meher
  • Computer Science
    Expert Syst. Appl.
  • 2021
An efficient deep CNN framework is proposed to mitigate this issue and yield more accurate ALL detection, and a novel probability-based weight factor is suggested, which plays a significant role in efficiently hybridizing MobileNetV2 and ResNet18 while preserving the benefits of both approaches.
Gated recurrent units and temporal convolutional network for multilabel classification
A new ensemble method for managing multilabel classification is proposed, which combines a set of gated recurrent units and temporal convolutional neural networks trained with variants of the Adam optimization approach, and is shown to outperform the state-of-the-art.
How Viewer Tuning, Presence and Attention Respond to Ad Content and Predict Brand Search Lift
New technology measures TV viewer tuning, presence and attention, enabling the first distinctions between TV ad viewability and actual ad viewing. We compare new and traditional viewing metrics,


CNN-RNN: A Unified Framework for Multi-label Image Classification
The proposed CNN-RNN framework learns a joint image-label embedding to characterize the semantic label dependency as well as the image-label relevance, and it can be trained end-to-end from scratch to integrate both kinds of information in a unified framework.
Learning Deep Latent Space for Multi-Label Classification
A novel deep neural networks based model, Canonical Correlated AutoEncoder (C2AE), is proposed, which allows end-to-end learning and prediction with the ability to exploit label dependency, and can be easily extended to address the learning problem with missing labels.
Multi-task deep neural network for multi-label learning
  • Yan Huang, Wei Wang, Liang Wang, T. Tan
  • Computer Science
    2013 IEEE International Conference on Image Processing
  • 2013
A multi-task deep neural network (MT-DNN) architecture is proposed to handle the multi-label learning problem, in which the learning of each label is defined as a binary classification task. It generalizes the single classification task of a traditional DNN into multiple binary classification tasks by defining the output layer with a negative class node and a positive class node for each label.
An extensive experimental comparison of methods for multi-label learning
The results of the analysis show that for multi-label classification the best performing methods overall are random forests of predictive clustering trees (RF-PCT) and hierarchy of multi-label classifiers (HOMER), followed by binary relevance (BR) and classifier chains (CC).
Deep Residual Learning for Image Recognition
This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
This work proposes a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified unit and derives a robust initialization method that particularly considers the rectifier nonlinearities.
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals, and further merges RPN and Fast R-CNN into a single network by sharing their convolutional features.
Rethinking the Inception Architecture for Computer Vision
This work explores ways to scale up networks that use the added computation as efficiently as possible, through suitably factorized convolutions and aggressive regularization.
Show and tell: A neural image caption generator
This paper presents a generative model based on a deep recurrent architecture that combines recent advances in computer vision and machine translation and that can be used to generate natural sentences describing an image.
Mask R-CNN
This work presents a conceptually simple, flexible, and general framework for object instance segmentation that outperforms all existing, single-model entries on every task, including the COCO 2016 challenge winners.