• Corpus ID: 232105220

Cooking Object's State Identification Without Using Pretrained Model

Author: Md Sadman Sakib
Robotic cooking has recently become a very promising field. To execute a recipe, a robot has to recognize different objects and their states. In contrast to object recognition, state identification has received little attention. It is nonetheless very important, because different recipes might require different states of an object. Moreover, robotic grasping depends on the state. Pretrained models usually perform very well on this type of task. Our challenge was to handle this problem without using any…

Classifying cooking object's state using a tuned VGG convolutional neural network
The work presented in this paper focuses on classification between various object states rather than task recognition or recipe prediction, and the framework can be easily adapted to any other object state classification task.
Identifying Object States in Cooking-Related Images
This paper explores objects and ingredients in cooking videos, analyzes the most frequent objects, and creates a dataset of images containing those objects and their states.
Joint Object and State Recognition Using Language Knowledge
Experiments on a dataset of cooking objects show that using a language knowledge graph on top of a deep neural network effectively enhances object and state classification.
Very Deep Convolutional Networks for Large-Scale Image Recognition
This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
A Survey of Knowledge Representation in Service Robotics
This paper focuses on knowledge representations, notably how researchers over the past decades have typically gathered, represented, and reproduced knowledge to solve problems, and on the key distinction between such representations and the useful learning models that have been extensively introduced and studied in recent years.
Deep Residual Learning for Image Recognition
This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
Object-object interaction affordance learning
An image-based visual servoing approach that uses the learned motion features of the affordance in interaction as the control goals to control a robot to perform manipulation tasks.
AI Meets Physical World - Exploring Robot Cooking
  • Yu Sun
  • Computer Science, Engineering
  • 2018
Recent research efforts to bring computer intelligence into the physical world, so that robots can perform physically interactive manipulation tasks, have developed new grasping strategies that let robots hold objects with a firm grasp to withstand disturbances during physical interactions.
Functional Object-Oriented Network: Construction & Expansion
This work builds upon the functional object-oriented network (FOON), a structured knowledge representation which is constructed from observations of human activities and manipulations, and discusses two means of generalization: expanding the network through the use of object similarity to create new functional units from those the authors already have, and compressing the functional units by object categories rather than specific objects.
Functional object-oriented network for manipulation learning
The paper describes FOON's structure and an approach to form a universal FOON with extracted knowledge from online instructional videos, demonstrating the flexibility of FOON in creating a novel and adaptive means of solving a problem using knowledge gathered from multiple sources.