• Corpus ID: 6181178

Using Convolutional Neural Networks and Transfer Learning to Perform Yelp Restaurant Photo Classification

@inproceedings{Stanford2016UsingCN,
  title={Using Convolutional Neural Networks and Transfer Learning to Perform Yelp Restaurant Photo Classification},
  author={Diveesh Singh},
  year={2016}
}
For the Yelp Image Classification Kaggle Challenge, we apply transfer learning with a modified VGGNet to predict 9 specific restaurant attributes from images. We explore two approaches to making these predictions: a naive model that assigns each restaurant's attributes to every one of its pictures, and a more sophisticated method, multiple instance learning (MIL). The naive model achieved a mean F1 score of 0.533, while the MIL model achieved a mean F1 score of 0.618…
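
The two approaches named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract is truncated and does not say how photo-level outputs are combined, so the max pooling below is the standard MIL assumption swapped in for illustration, and the per-restaurant averaging of F1 is likewise an assumption. All function names, the 0.5 threshold, and the toy probabilities (3 attributes instead of 9) are hypothetical.

```python
from collections import defaultdict


def propagate_labels(photo_to_restaurant, restaurant_labels):
    """Naive labeling: each photo inherits its restaurant's attribute vector."""
    return {photo: list(restaurant_labels[rest])
            for photo, rest in photo_to_restaurant.items()}


def aggregate(photo_probs, photo_to_restaurant, pool="mean", threshold=0.5):
    """Pool per-photo probabilities into one 0/1 vector per restaurant."""
    bags = defaultdict(list)
    for photo, probs in photo_probs.items():
        bags[photo_to_restaurant[photo]].append(probs)

    def reduce(column):
        # "max": one confident photo is enough (standard MIL assumption).
        # "mean": every photo of the restaurant votes equally.
        return max(column) if pool == "max" else sum(column) / len(column)

    return {rest: [int(reduce(col) >= threshold) for col in zip(*rows)]
            for rest, rows in bags.items()}


def mean_f1(truth, pred):
    """Average over restaurants of F1 = 2*TP / (|true labels| + |predicted labels|)."""
    scores = []
    for rest, y in truth.items():
        p = pred[rest]
        tp = sum(1 for a, b in zip(y, p) if a == 1 and b == 1)
        denom = sum(y) + sum(p)
        # Assumption: an empty truth matched by an empty prediction scores 1.
        scores.append(2 * tp / denom if denom else 1.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: 2 restaurants, 3 photos, 3 attributes.
photo_to_restaurant = {"p1": "r1", "p2": "r1", "p3": "r2"}
photo_probs = {"p1": [0.9, 0.2, 0.1], "p2": [0.3, 0.2, 0.7], "p3": [0.1, 0.8, 0.4]}
truth = {"r1": [1, 0, 1], "r2": [0, 1, 0]}

mean_pred = aggregate(photo_probs, photo_to_restaurant, pool="mean")
max_pred = aggregate(photo_probs, photo_to_restaurant, pool="max")
print(mean_f1(truth, mean_pred), mean_f1(truth, max_pred))
```

In this toy case the third attribute of `r1` is visible in only one of its two photos, so averaging dilutes it below the threshold while max pooling keeps it. That is the intuition behind treating a restaurant as a bag of photos rather than labeling each photo independently.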

An Extensive Study and Comparison of the Various Approaches to Object Detection using Deep Learning
TLDR
After an extensive comparison, considering the given evaluation metrics and time constraints of a real-time smart surveillance system, the YOLO architecture and its variants are found to be the most efficient.
COCO-Bridge: Common Objects in Context Dataset and Benchmark for Structural Detail Detection of Bridges
TLDR
This research investigated the parameters required for detail identification and evaluated performance enhancements to the annotation process for COCO-Bridge, an annotated dataset on which a convolutional neural network can be trained to identify specific structural details.
GM-RKB WikiText Error Correction Task and Baselines
TLDR
The GM-RKB WikiText Error Correction Task for the automatic detection and correction of typographical errors in WikiText annotated pages is introduced and two supervised baseline WikiFixer error correction methods are designed and evaluated.
Arabic Sign Language Recognition through Deep Neural Networks Fine-Tuning
TLDR
Transfer learning and fine-tuning of deep convolutional neural networks are used to improve the accuracy of recognizing 32 hand gestures from the Arabic sign language.

References

Food-101 - Mining Discriminative Components with Random Forests
TLDR
A novel method to mine discriminative parts using Random Forests (rf), which allows us to mine for parts simultaneously for all classes and to share knowledge among them, and compares nicely to other state-of-the-art component-based classification methods.
Very Deep Convolutional Networks for Large-Scale Image Recognition
TLDR
This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning
TLDR
Clear empirical evidence that training with residual connections accelerates the training of Inception networks significantly is given and several new streamlined architectures for both residual and non-residual Inception Networks are presented.
Multiple Instance Boosting for Object Detection
TLDR
MILBoost adapts the feature selection criterion of MILBoost to optimize the performance of the Viola-Jones cascade to show the advantage of simultaneously learning the locations and scales of the objects in the training set along with the parameters of the classifier.
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
TLDR
This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals, and further merges RPN and Fast R-CNN into a single network by sharing their convolutional features.
Visual tracking with online Multiple Instance Learning
TLDR
It is shown that using Multiple Instance Learning (MIL) instead of traditional supervised learning avoids these problems, and can therefore lead to a more robust tracker with fewer parameter tweaks.
Accelerating t-SNE using tree-based algorithms
TLDR
Variants of the Barnes-Hut algorithm and of the dual-tree algorithm that approximate the gradient used for learning t-SNE embeddings in O(N log N) are developed and shown to substantially accelerate t-SNE, making it possible to learn embeddings of data sets with millions of objects.
Deep learning starter code
  • 2016
Introducing the Yelp Restaurant Photo Classification Challenge
  • Retrieved March
  • 2016
Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning. arXiv
  • 2016