Deceiving Google’s Cloud Video Intelligence API Built for Summarizing Videos
@article{Hosseini2017DeceivingGC,
  title={Deceiving Google’s Cloud Video Intelligence API Built for Summarizing Videos},
  author={Hossein Hosseini and Baicen Xiao and Radha Poovendran},
  journal={2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)},
  year={2017},
  pages={1305-1309}
}
Despite the rapid progress of the techniques for image classification, video annotation has remained a challenging task. […] A demonstration website has also been launched, which allows anyone to select a video for annotation. The API then detects the video labels (objects within the video) as well as shot labels (description of the video events over time). In this paper, we examine the usability of Google's Cloud Video Intelligence API in adversarial environments. In particular, we…
17 Citations
Adversarial Video Captioning
- Computer Science, 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W)
- 2019
This is the first successful method for targeted attacks against a video captioning model; it injects 'subliminal' perturbations into the video stream and forces the model to output a chosen caption with up to 0.981 cosine similarity to the target caption.
Google's Cloud Vision API is Not Robust to Noise
- Computer Science, 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)
- 2017
When sufficient noise is added to an image, the Google Cloud Vision API generates completely different outputs for the noisy image, while a human observer still perceives its original content, suggesting that the API could readily benefit from noise filtering without the need to update its image analysis algorithms.
Sparse Adversarial Perturbations for Videos
- Computer Science, AAAI
- 2019
An l2,1-norm based optimization algorithm is proposed to compute sparse adversarial perturbations for videos; action recognition is chosen as the target task, and networks with a CNN+RNN architecture serve as threat models to verify the method.
Adversarial Evasion Noise Attacks Against TensorFlow Object Detection API
- Computer Science, 2020 15th International Conference for Internet Technology and Secured Transactions (ICITST)
- 2020
Low-density additive noise is shown to improve the performance of the ML models, to the point that it could be considered for addition as a new feature vector.
When George Clooney Is Not George Clooney: Using GenAttack to Deceive Amazon's and Naver's Celebrity Recognition APIs
- Computer Science, SEC
- 2018
This work presents a novel way to generate adversarial example images using an evolutionary genetic algorithm (GA), demonstrating the practicability of generating adversarial examples and successfully fooling state-of-the-art commercial image recognition systems.
Negative Adversarial Example Generation Against Naver's Celebrity Recognition API
- Computer Science, WDC@AsiaCCS
- 2022
This work generates adversarial images against Naver's celebrity recognition API, demonstrates that it is extremely easy to fool online DNN-based APIs using adversarial examples, and discusses possible negative impacts resulting from these adversarial examples.
Image Processing and Location based Image Querier(LBIQ)
- Computer Science, 2020 Third International Conference on Smart Systems and Inventive Technology (ICSSIT)
- 2020
An LBIQ takes a single input, an image, and returns a set of attributes that can be processed further for convenience of service.
Avoiding the Hypothesis-Only Bias in Natural Language Inference via Ensemble Adversarial Training
- Computer Science, EMNLP
- 2020
It is shown that the bias can be reduced in the sentence representations by using an ensemble of adversaries, encouraging the model to jointly decrease the accuracy of these different adversaries while fitting the data.
Gone at Last: Removing the Hypothesis-Only Bias in Natural Language Inference via Ensemble Adversarial Training
- Computer Science, EMNLP
- 2020
It is shown that using an ensemble of adversaries can prevent the bias from being relearned after the model training is completed, further improving how well the model generalises to different NLI datasets.
Enhancing robustness of machine learning systems via data transformations
- Computer Science, 2018 52nd Annual Conference on Information Sciences and Systems (CISS)
- 2018
The use of data transformations as a defense against evasion attacks on ML classifiers is effective against the best known evasion attacks from the literature, resulting in a two-fold increase in the resources required by a white-box adversary with knowledge of the defense.
24 References
A Generic Framework for Video Annotation via Semi-Supervised Learning
- Computer Science, IEEE Transactions on Multimedia
- 2012
A Fast Graph-based Semi-Supervised Multiple Instance Learning (FGSSMIL) algorithm is proposed to jointly explore small-scale expert-labeled videos and large-scale unlabeled videos to train the models. Results compared with the state of the art are promising and demonstrate the effectiveness and efficiency of the proposed approach.
Efficiently Scaling Up Video Annotation with Crowdsourced Marketplaces
- Computer Science, ECCV
- 2010
This work has created a public framework for dividing the work of labeling video data into micro-tasks that can be completed by huge labor pools available through crowdsourced marketplaces and leverages more sophisticated interpolation between key frames to maximize performance given a budget.
Unified Video Annotation via Multigraph Learning
- Computer Science, IEEE Transactions on Circuits and Systems for Video Technology
- 2009
This paper shows that various crucial factors in video annotation, including multiple modalities, multiple distance functions, and temporal consistency, all correspond to different relationships among video units, and hence they can be represented by different graphs, and proposes optimized multigraph-based semi-supervised learning (OMG-SSL), which aims to simultaneously tackle these difficulties in a unified scheme.
Interactive Video Indexing With Statistical Active Learning
- Computer Science, IEEE Transactions on Multimedia
- 2012
A novel active learning approach based on the optimum experimental design criteria in statistics is proposed; it simultaneously exploits each sample's local structure, relevance, density, and diversity information, and makes use of both labeled and unlabeled data.
Beyond Distance Measurement: Constructing Neighborhood Similarity for Video Annotation
- Computer Science, IEEE Transactions on Multimedia
- 2009
This work proposes a novel neighborhood similarity measure, which explores the local sample and label distributions, and shows that the neighborhood similarity between two samples simultaneously takes into account three characteristics: their distance, the distribution difference of the surrounding samples, and the distribution difference of the surrounding labels.
Deceiving Google's Perspective API Built for Detecting Toxic Comments
- Computer Science, ArXiv
- 2017
It is shown that an adversary can subtly modify a highly toxic phrase so that the system assigns it a significantly lower toxicity score, and that this attack can consistently reduce toxicity scores to the level of non-toxic phrases.
Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation
- Computer Science, 2014 IEEE Conference on Computer Vision and Pattern Recognition
- 2014
This paper proposes a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012 -- achieving a mAP of 53.3%.
Very Deep Convolutional Networks for Large-Scale Image Recognition
- Computer Science, ICLR
- 2015
This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
The Limitations of Deep Learning in Adversarial Settings
- Computer Science, 2016 IEEE European Symposium on Security and Privacy (EuroS&P)
- 2016
This work formalizes the space of adversaries against deep neural networks (DNNs) and introduces a novel class of algorithms to craft adversarial samples based on a precise understanding of the mapping between inputs and outputs of DNNs.
Assistive tagging: A survey of multimedia tagging with human-computer joint exploration
- Computer Science, CSUR
- 2012
Along with the explosive growth of multimedia data, automatic multimedia tagging has attracted great interest of various research communities, such as computer vision, multimedia, and information…