• Corpus ID: 44061055

Classifying cooking object's state using a tuned VGG convolutional neural network

  • Rahul Paul
  • Published 23 May 2018
  • Computer Science
  • ArXiv
In robotics, knowing an object's state and recognizing the desired state are very important: objects in different states require different grasping and manipulation strategies. To analyze objects across states, a dataset of cooking objects was created. Cooking involves various cutting techniques needed for different dishes (e.g., diced, julienne, etc.). Identifying each of these states of cooking objects…
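
The tuned-VGG transfer-learning approach the abstract describes can be sketched roughly as follows. This is an illustrative Keras sketch, not the paper's code: the number of state classes, the classifier-head sizes, and the learning rate are all hypothetical choices.

```python
# Illustrative sketch of tuning VGG16 for cooking-state classification.
# NUM_STATES and the classifier-head layout are hypothetical assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_STATES = 7  # hypothetical number of cooking-state classes (diced, julienne, ...)

# In practice weights="imagenet" loads the pretrained filters that transfer
# learning reuses; weights=None is used here only to avoid the weight download.
base = tf.keras.applications.VGG16(weights=None, include_top=False,
                                   input_shape=(224, 224, 3))
base.trainable = False  # freeze the convolutional base; train only the new head

model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),   # new fully connected head
    layers.Dropout(0.5),                    # regularization
    layers.Dense(NUM_STATES, activation="softmax"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```

Freezing the base and training only the new head is the usual first stage of "tuning" a pretrained network; selected top convolutional blocks can later be unfrozen for fine-tuning at a lower learning rate.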

Cooking Object's State Identification Without Using Pretrained Model

This paper proposes a CNN trained from scratch on the dataset from the cooking state recognition challenge, achieving 65.8% accuracy on the unseen test dataset.

Classifying States of Cooking Objects Using Convolutional Neural Network

A robust deep convolutional neural network is designed for classifying the state of the cooking objects from scratch and is evaluated by using various techniques, such as adjusting architecture layers, tuning key hyperparameters, and using different optimization techniques to maximize the accuracy of state classification.

Cooking Actions Inference based on Ingredient’s Physical Features

This paper developed a framework for inferring the executable cooking actions of ingredients, compensating for the common-sense knowledge that humans have but robots lack, and tuned the existing VGG16 convolutional neural network to learn the physical features of ingredients.

Food Arrangement Framework for Cooking Robots

The proposed food arrangement framework for a robot to automatically serve meals is evaluated and it is demonstrated that a UR3 robot is capable of serving a steak meal using a spatula-like end-effector.

Pouring Sequence Prediction using Recurrent Neural Network

A recurrent neural network (RNN) was used to learn the complex pouring sequence and to predict unseen pouring sequences, and dynamic time warping (DTW) was used to evaluate the prediction performance of the trained model.
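
The DTW evaluation mentioned above aligns a predicted sequence against the ground truth even when the two drift in time. A minimal sketch of the standard DTW cost recursion, with illustrative sequences rather than the paper's pouring data:

```python
# Minimal dynamic time warping (DTW) sketch for comparing two 1-D sequences,
# as used to score a predicted sequence against a reference (illustrative only).
def dtw_distance(a, b):
    """Return the DTW alignment cost between sequences a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # step in a only
                                 cost[i][j - 1],      # step in b only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]

# Identical sequences align with zero cost; time-stretched copies stay cheap.
print(dtw_distance([0, 1, 2, 3], [0, 1, 2, 3]))  # 0.0
print(dtw_distance([0, 0, 1, 1], [0, 1]))        # 0.0
```

Because DTW tolerates local time warping, it penalizes a prediction for wrong values rather than for being slightly early or late, which suits variable-speed sequences such as pouring.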

Robustness Analysis for VGG-16 Model in Image Classification of Post-Hurricane Buildings

  • Haoyang Li, Xinyi Wang
  • Environmental Science
    2021 2nd International Conference on Big Data & Artificial Intelligence & Software Engineering (ICBASE)
  • 2021
Hurricanes are destructive natural disasters that cause dramatic loss of life and damage to property. Locating houses damaged by a hurricane manually is quite time-consuming and labor-intensive…

Tuned Inception V3 for Recognizing States of Cooking Ingredients

  • Kin Tek Ng
  • Computer Science
    State Recognition symposium
  • 2019
This paper proposes a fine tuned convolutional neural network that makes use of transfer learning by reusing the Inception V3 pre-trained model and is trained and validated on a cooking dataset consisting of eleven states.

References

Identifying Object States in Cooking-Related Images

In this paper, objects and ingredients in cooking videos are explored and the most frequent objects are analyzed and a dataset of images containing those objects and their states is created.

Im2Calories: Towards an Automated Mobile Vision Food Diary

A system is presented which can recognize the contents of a meal from a single image and then predict its nutritional content, such as calories, significantly outperforming previous work.

Food-101 - Mining Discriminative Components with Random Forests

A novel method to mine discriminative parts using Random Forests (RF), which allows parts to be mined simultaneously for all classes and knowledge to be shared among them, and compares favorably to other state-of-the-art component-based classification methods.

ImageNet classification with deep convolutional neural networks

A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.

What’s Cookin’? Interpreting Cooking Videos using Text, Speech and Vision

A novel method for aligning a sequence of instructions to a video of someone carrying out a task, based on a deep convolutional neural network, that outperforms simpler techniques based on keyword spotting.

DeepFood: Deep Learning-Based Food Image Recognition for Computer-Aided Dietary Assessment

A new convolutional neural network (CNN)-based food image recognition algorithm is proposed to improve the accuracy of dietary assessment by analyzing food images captured by mobile devices (e.g., smartphones).

Visualizing and Understanding Convolutional Networks

A novel visualization technique is introduced that gives insight into the function of intermediate feature layers and the operation of the classifier in large convolutional network models; used in a diagnostic role, it finds model architectures that outperform Krizhevsky et al. on the ImageNet classification benchmark.

Task Prediction in Cooking Activities Using Hierarchical State Space Markov Chain and Object Based Task Grouping

The work done in the paper is first of its kind as it focuses on task prediction rather than task recognition, and can be easily adapted to any complex activity supporting various annotation schemes and activity models.

Very Deep Convolutional Networks for Large-Scale Image Recognition

This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.

A Hybrid Discriminative/Generative Approach for Modeling Human Activities

A hybrid approach to recognizing activities is presented, which combines boosting to discriminatively select useful features and learn an ensemble of static classifiers to recognize different activities, with hidden Markov models (HMMs) to capture the temporal regularities and smoothness of activities.