Corpus ID: 237635447

Localizing Infinity-shaped fishes: Sketch-guided object localization in the wild

  • Pau Riba, Sounak Dey, Ali Furkan Biten, Josep Lladós
  • Published 24 September 2021
  • Computer Science
  • ArXiv
This work investigates the problem of sketch-guided object localization (SGOL), where human sketches are used as queries to localize objects in natural images. In this cross-modal setting, we first contribute a tough-to-beat baseline that, without any SGOL-specific training, outperforms previous works on a fixed set of classes. The baseline is useful for analyzing the performance of SGOL approaches against simple yet powerful existing methods. We advance prior…

Sketch-Guided Object Localization in Natural Images
A novel cross-modal attention scheme is proposed that guides the region proposal network (RPN) to generate object proposals relevant to the sketch query that are later scored against the query to obtain final localization.
FoveaBox: Beyound Anchor-Based Object Detection
Without bells and whistles, FoveaBox achieves state-of-the-art single-model performance on the standard COCO and Pascal VOC object detection benchmarks and avoids all computation and hyper-parameters related to anchor boxes, to which the final detection performance is often sensitive.
Sketch Me That Shoe
A deep triplet-ranking model for instance-level SBIR is developed with a novel data augmentation and staged pre-training strategy to alleviate the issue of insufficient fine-grained training data.
Doodle to Search: Practical Zero-Shot Sketch-Based Image Retrieval
This paper proposes a novel ZS-SBIR framework to jointly model sketches and photos in a common embedding space, together with a novel strategy to mine the mutual information among domains, specifically engineered to alleviate the domain gap.
One-Shot Object Detection with Co-Attention and Co-Excitation
A novel CoAE framework that develops a squeeze-and-co-excitation scheme to adaptively emphasize correlated feature channels, helping to uncover relevant proposals and eventually the target objects, and that designs a margin-based ranking loss for implicitly learning a metric to predict the similarity of a region proposal to the underlying query.
Sketch Less for More: On-the-Fly Fine-Grained Sketch-Based Image Retrieval
A reinforcement-learning-based cross-modal retrieval framework that directly optimizes the rank of the ground-truth photo over a complete sketch drawing episode and introduces a novel reward scheme that circumvents the problems related to irrelevant sketch strokes, thus providing a more consistent rank list during retrieval.
Sketch-a-Net: A Deep Neural Network that Beats Humans
It is shown that state-of-the-art deep networks specifically engineered for photos of natural objects fail to perform well on sketch recognition, regardless of whether they are trained using photos or sketches.
Sketch-a-Segmenter: Sketch-Based Photo Segmenter Generation
It is shown, for the first time, that it is possible to generate a photo-segmentation model of a novel category using just a single sketch and, furthermore, to exploit the unique fine-grained characteristics of sketches to produce more detailed segmentation.
Instance Segmentation by Jointly Optimizing Spatial Embeddings and Clustering Bandwidth
This work proposes a new clustering loss function for proposal-free instance segmentation that pulls the spatial embeddings of pixels belonging to the same instance together and jointly learns an instance-specific clustering bandwidth, maximizing the intersection-over-union of the resulting instance mask.
One-Shot Segmentation in Clutter
An improved model that attends to multiple candidate locations, generates segmentation proposals to mask out background clutter, and selects among the segmented objects is introduced, suggesting that such image recognition models based on an iterative refinement of object detection and foreground segmentation may provide a way to deal with highly cluttered scenes.