Food Ingredients Recognition Through Multi-label Learning

Marc Bolaños, Aina Ferrà, Petia Radeva
Automatically constructing a food diary that tracks the ingredients consumed can help people follow a healthy diet. We tackle the problem of food ingredients recognition as a multi-label learning problem. We propose a method for adapting a high-performing state-of-the-art CNN to act as a multi-label predictor that learns recipes in terms of their lists of ingredients. We show that, given a picture, our model is able to predict its list of ingredients, even if the recipe…
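
As a rough illustration of what "multi-label predictor" means here, the sketch below scores every ingredient independently with a sigmoid and keeps those above a threshold, instead of picking a single class with a softmax. The abstract does not spell out the output layer or the loss, so the sigmoid, the 0.5 threshold, the binary cross-entropy loss, and the ingredient names and logit values are all illustrative assumptions.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits, names, threshold=0.5):
    """Return every label whose sigmoid probability reaches the threshold.

    Unlike softmax classification, any number of labels can be active.
    """
    return [name for name, z in zip(names, logits) if sigmoid(z) >= threshold]


def binary_cross_entropy(logits, targets):
    """Mean binary cross-entropy over labels (assumed training loss)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, t in zip(logits, targets):
        p = sigmoid(z)
        total -= t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
    return total / len(logits)


# Hypothetical ingredient vocabulary and hypothetical CNN output scores.
INGREDIENTS = ["tomato", "cheese", "basil", "egg"]
logits = [2.0, 0.3, -1.5, -3.0]

predicted = predict_labels(logits, INGREDIENTS)
loss = binary_cross_entropy(logits, [1, 1, 0, 0])
```

With these made-up scores, `predicted` contains `tomato` and `cheese`. Lowering `threshold` trades precision for recall.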

A Study of Multi-Task and Region-Wise Deep Learning for Food Ingredient Recognition

An insightful analysis is provided of three compelling issues in ingredient recognition: recognition at either the image level or the region level, pooling at either a single or multiple image scales, and learning in either a single-task or multi-task manner.

Zero-Shot Ingredient Recognition by Multi-Relational Graph Convolutional Network

A multi-relational GCN (graph convolutional network) is introduced that integrates ingredient hierarchy, attributes, and co-occurrence for zero-shot ingredient recognition, and sheds light on this task.

Few-shot Food Recognition via Multi-view Representation Learning

A Multi-View Few-Shot Learning (MVFSL) framework is proposed to explore additional ingredient information for few-shot food recognition; it also extends two other types of networks, namely the Siamese Network and the Matching Network, by introducing ingredient information.

Food Ingredients Identification from Dish Images by Deep Learning

This work chooses 35 kinds of everyday ingredients as identification categories, constructs three novel datasets for building the ingredient identification models, and proposes two approaches to ingredient identification.

CuisineNet: Food Attributes Classification using Multi-scale Convolution Network

A deep learning model based on multi-scale convolutional networks is proposed for extracting more accurate features from input images; it yields 65% and 62% average F1 scores on the validation and test sets, outperforming the state-of-the-art models.
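
For readers unfamiliar with the metric, the sketch below shows one common way an "average F1 score" is computed: F1 per class, then an unweighted (macro) mean over classes. The summary does not say which averaging scheme the paper uses, so macro averaging is an assumption here, and the labels are made-up toy data.

```python
def per_class_f1(y_true, y_pred, label):
    """F1 for one class: harmonic mean of its precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    if tp == 0:
        return 0.0  # no correct predictions for this class
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes that appear."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(per_class_f1(y_true, y_pred, lab) for lab in labels) / len(labels)


# Toy ground truth and predictions (hypothetical cuisine labels).
y_true = ["italian", "italian", "thai", "thai", "mexican"]
y_pred = ["italian", "thai", "thai", "thai", "italian"]

score = macro_f1(y_true, y_pred)
```

Macro averaging weights rare and frequent classes equally, which is why a class the model never gets right (here `mexican`) pulls the average down noticeably.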

Applying Deep Learning for Food Image Analysis

This project adapts a model by concatenating an ontology layer, a multidimensional layer that contains the relations between the elements, to help during the classification process; the resulting model is able to simultaneously predict two food-related tasks: dish and ingredients.

Where and What Am I Eating? Image-Based Food Menu Recognition

This model, based on Convolutional Neural Networks and Recurrent Neural Networks, is able to learn a language model that generalizes to never-seen dish names without retraining.

Learning From Web Recipe-Image Pairs for Food Recognition: Problem, Baselines and Performance

An empirical study is conducted to provide insights into whether the features learnt from noisy pair data are resilient and can capture the correspondence between the visual and textual modalities in recipes.

Ingredient-Guided Cascaded Multi-Attention Network for Food Recognition

An Ingredient-Guided Cascaded Multi-Attention Network (IG-CMAN) is developed that sequentially localizes multiple informative image regions at multiple scales, moving from category-level to ingredient-level guidance in a coarse-to-fine manner, and that provides explainability for the localized regions.



Exploring Food Detection Using CNNs

An overview of recent advances in food detection is given, and a model based on the GoogLeNet convolutional neural network, principal component analysis, and a support vector machine is proposed that outperforms the state of the art on two public food/non-food datasets.

Learning to Make Better Mistakes: Semantics-aware Visual Food Recognition

A visual food recognition framework that integrates the inherent semantic relationships among fine-grained classes by formulating a multi-task loss function on top of a convolutional neural network architecture, which further exploits the rich semantic information.

Recipe recognition with large multimodal food dataset

This paper compares and evaluates leading vision-based and text-based technologies on a new, very large multimodal dataset (UPMC Food-101) containing about 100,000 recipes across 101 food categories, and presents recipe recognition experiments on this dataset using visual information, textual information, and their fusion.

Wide-Slice Residual Networks for Food Recognition

This work introduces a new deep scheme that is designed to handle the food structure, and first introduces a slice convolution block to capture the vertical food traits that are common to a large number of categories.

Food-101 - Mining Discriminative Components with Random Forests

A novel method to mine discriminative parts using Random Forests (RF), which makes it possible to mine parts simultaneously for all classes and to share knowledge among them, and which compares favorably with other state-of-the-art component-based classification methods.

CNN-RNN: A Unified Framework for Multi-label Image Classification

The proposed CNN-RNN framework learns a joint image-label embedding to characterize the semantic label dependency as well as the image-label relevance, and it can be trained end-to-end from scratch to integrate both kinds of information in a unified framework.

Is Saki #delicious?: The Food Perception Gap on Instagram and Its Relation to Health

This work uses data for 1.9 million images from Instagram from the US to look at systematic differences in how a machine would objectively label an image compared to how a human subjectively does, and shows that this difference, which it calls the "perception gap", relates to a number of health outcomes observed at the county level.

CNN-Based Food Image Segmentation Without Pixel-Wise Annotation

A CNN-based food image segmentation method that requires no pixel-wise annotation is proposed; it outperforms R-CNN on food region detection as well as on the PASCAL VOC detection task.

Rethinking the Inception Architecture for Computer Vision

This work explores ways to scale up networks that use the added computation as efficiently as possible, through suitably factorized convolutions and aggressive regularization.
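
To make "factorized convolutions" concrete, the sketch below counts weights for a large kernel versus a stack of smaller ones covering the same receptive field: two 3x3 layers in place of one 5x5, and a 1x3 followed by a 3x1 in place of one 3x3. The channel width of 64 is an arbitrary assumption, biases are ignored, and the helper name `conv_weights` is hypothetical.

```python
def conv_weights(kernel_h: int, kernel_w: int, c_in: int, c_out: int) -> int:
    """Number of weights in one convolution layer (bias terms ignored)."""
    return kernel_h * kernel_w * c_in * c_out


C = 64  # assumed number of input and output channels for every layer

# One 5x5 convolution versus two stacked 3x3 convolutions
# (both cover a 5x5 receptive field).
single_5x5 = conv_weights(5, 5, C, C)
stacked_3x3 = 2 * conv_weights(3, 3, C, C)

# One 3x3 convolution versus an asymmetric 1x3 followed by a 3x1.
single_3x3 = conv_weights(3, 3, C, C)
asymmetric = conv_weights(1, 3, C, C) + conv_weights(3, 1, C, C)

ratio_5x5 = stacked_3x3 / single_5x5   # 18/25 of the original weights
ratio_3x3 = asymmetric / single_3x3    # 6/9 of the original weights
```

With equal input and output channels the ratios do not depend on `C`: the stacked 3x3 pair uses 72% of the 5x5 weights, and the asymmetric pair uses about 67% of the 3x3 weights.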

Multi-Label Classification: An Overview

The task of multi-label classification is introduced, the sparse related literature is organized into a structured presentation, and comparative experimental results for several multi-label classification methods are presented.