FS-COCO: Towards Understanding of Freehand Sketches of Common Objects in Context

@article{Chowdhury2022FSCOCOTU,
  title={FS-COCO: Towards Understanding of Freehand Sketches of Common Objects in Context},
  author={Pinaki Nath Chowdhury and Aneeshan Sain and Yulia Gryaditskaya and Ayan Kumar Bhunia and Tao Xiang and Yi-Zhe Song},
  journal={ArXiv},
  year={2022},
  volume={abs/2203.02113}
}
We advance sketch research to scenes with FS-COCO, the first dataset of freehand scene sketches. With practical applications in mind, we collect sketches that convey scene content well yet can be drawn within a few minutes by a person with any level of sketching skill. Our dataset comprises 10,000 freehand scene vector sketches with per-point space-time information, drawn by 100 non-expert individuals and offering both object- and scene-level abstraction. Each sketch is augmented with its text description…
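To make the "vector sketches with per-point space-time information" concrete, below is a minimal sketch of how such data could be represented in Python. The class, field names, and example values are illustrative assumptions for exposition only and do not reflect the actual FS-COCO file format.

```python
# Hypothetical representation of a freehand vector sketch with per-point
# space-time information; field names are assumptions, not the FS-COCO schema.
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float, float]  # (x, y, timestamp in seconds)


@dataclass
class SceneSketch:
    sketch_id: str
    caption: str                                               # the sketch's text description
    strokes: List[List[Point]] = field(default_factory=list)   # one point list per pen-down stroke

    def duration(self) -> float:
        """Total drawing time, from the first to the last recorded point."""
        times = [t for stroke in self.strokes for (_, _, t) in stroke]
        return max(times) - min(times) if times else 0.0


# Example: a two-stroke sketch drawn over roughly four seconds.
sketch = SceneSketch(
    sketch_id="example_0001",
    caption="a giraffe standing next to a tree",
    strokes=[
        [(0.10, 0.20, 0.0), (0.15, 0.25, 0.5), (0.20, 0.30, 1.0)],
        [(0.50, 0.60, 3.0), (0.55, 0.65, 4.0)],
    ],
)
print(sketch.duration())  # 4.0
```

Storing per-stroke point lists with timestamps keeps both the spatial trajectory and the temporal order of drawing, which is what enables stroke-level and time-based analyses of abstraction.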


SceneTrilogy: On Scene Sketches and its Relationship with Text and Photo
For the first time, we extend multi-modal scene understanding to include free-hand scene sketches. This uniquely results in a trilogy of scene data modalities (sketch, text, and photo)…

References

SHOWING 1-10 OF 65 REFERENCES
Latent Normalizing Flows for Many-to-Many Cross-Domain Mappings
TLDR
A novel semi-supervised framework that models shared information between domains and domain-specific information separately, aligned with an invertible neural network, which allows learning diverse many-to-many mappings between the two domains.
SketchyCOCO: Image Generation From Freehand Scene Sketches
TLDR
This work introduces the first method for automatic image generation from scene-level freehand sketches, allowing controllable image generation by specifying the synthesis goal via freehand sketches, and builds a large-scale composite dataset, SketchyCOCO, to support and evaluate the solution.
A Neural Representation of Sketch Drawings
We present sketch-rnn, a recurrent neural network (RNN) able to construct stroke-based drawings of common objects. The model is trained on thousands of crude human-drawn images representing hundreds of classes.
Microsoft COCO: Common Objects in Context
We present a new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding.
Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting
TLDR
This paper proposes two novel cross-modal translation pre-text tasks for self-supervised feature learning, Vectorization and Rasterization, and shows that the learned encoder modules benefit both raster-based and vector-based downstream approaches to analysing hand-drawn data.
Sketch-a-Net that Beats Humans
TLDR
A multi-scale multi-channel deep neural network framework that yields sketch recognition performance surpassing that of humans; it not only delivers the best performance on the largest human sketch dataset to date, but is also small in size, making efficient training possible using just CPUs.
SceneSketcher: Fine-Grained Image Retrieval with Scene Sketches
TLDR
This paper proposes a graph-embedding-based method to learn the similarity measure between images and scene sketches, effectively modeling multi-modal information including the size and appearance of objects as well as their layout.
Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks
TLDR
This paper proposes a new learning method, Oscar (Object-Semantics Aligned Pre-training), which uses object tags detected in images as anchor points to significantly ease the learning of alignments.
COCO-Stuff: Thing and Stuff Classes in Context
TLDR
An efficient stuff annotation protocol based on superpixels is introduced, which leverages the original thing annotations; the speed-versus-quality trade-off of the protocol is quantified, and the relation between annotation time and boundary complexity is explored.
How do humans sketch objects?
TLDR
This paper is the first large-scale exploration of human sketches, developing a bag-of-features sketch representation and using multi-class support vector machines, trained on the sketch dataset, to classify sketches.
…