Match Cutting: Finding Cuts with Smooth Visual Transitions

@article{Chen2022MatchCF,
  title={Match Cutting: Finding Cuts with Smooth Visual Transitions},
  author={Boris Chen and Amir Ziai and Rebecca Tucker and Yuchen Xie},
  journal={ArXiv},
  year={2022},
  volume={abs/2210.05766}
}
A match cut is a transition between a pair of shots that uses similar framing, composition, or action to fluidly bring the viewer from one scene to the next. Match cuts are frequently used in film, television, and advertising. However, finding shots that work together is a highly manual and time-consuming process that can take days. We propose a modular and flexible system to efficiently find high-quality match cut candidates starting from millions of shot pairs. We annotate and release a dataset of…
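
The paper's pipeline is not reproduced here, but the core ranking step it describes (scoring millions of shot pairs and keeping the best candidates) can be sketched minimally. The sketch below assumes shot-level embeddings from some pretrained visual encoder; the function name and upstream encoder are hypothetical, not the authors' exact method:

```python
import torch
import torch.nn.functional as F

def top_match_cut_candidates(embeddings: torch.Tensor, k: int = 100):
    """Rank shot pairs by cosine similarity of their embeddings.

    embeddings: (N, D) tensor, one row per shot, produced by some
    pretrained visual encoder (hypothetical upstream step).
    Returns the top-k (i, j) index pairs with i < j and their scores.
    """
    z = F.normalize(embeddings, dim=1)          # unit-norm rows
    sim = z @ z.T                               # (N, N) cosine similarities
    n = sim.size(0)
    # Mask the diagonal and lower triangle so each pair is scored once.
    mask = torch.triu(torch.ones(n, n, dtype=torch.bool), diagonal=1)
    scores = sim[mask]
    idx = mask.nonzero()                        # (num_pairs, 2) index pairs
    top = scores.topk(min(k, scores.numel()))
    return idx[top.indices], top.values

# Toy usage with random embeddings standing in for real shot features.
pairs, scores = top_match_cut_candidates(torch.randn(8, 512), k=5)
```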

References

Showing 1-10 of 77 references

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

A new scaling method is proposed that uniformly scales all dimensions of depth, width, and resolution using a simple yet highly effective compound coefficient; the method's effectiveness is demonstrated by scaling up MobileNets and ResNet.
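
As an illustration of the compound coefficient, the sketch below scales depth, width, and resolution as α^φ, β^φ, γ^φ using the constants reported in the EfficientNet paper; the helper function itself is illustrative:

```python
import math

# Compound scaling: depth grows as alpha**phi, width as beta**phi,
# resolution as gamma**phi, with alpha * beta**2 * gamma**2 ≈ 2
# (constants reported in the EfficientNet paper).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15

def scale_network(base_depth: int, base_width: int, base_res: int, phi: float):
    """Return (depth, width, resolution) after compound scaling."""
    depth = math.ceil(base_depth * ALPHA ** phi)
    width = math.ceil(base_width * BETA ** phi)
    res = math.ceil(base_res * GAMMA ** phi)
    return depth, width, res

print(scale_network(base_depth=18, base_width=64, base_res=224, phi=1.0))
```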

Learning Transferable Visual Models From Natural Language Supervision

It is demonstrated that the simple pre-training task of predicting which caption goes with which image is an efficient and scalable way to learn SOTA image representations from scratch on a dataset of 400 million (image, text) pairs collected from the internet.
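
A minimal sketch of that pre-training task, written as the symmetric contrastive loss over an image-text similarity matrix; the batch embeddings below are random placeholders, not real encoder outputs:

```python
import torch
import torch.nn.functional as F

def clip_contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss: each image should match its own caption.

    img_emb, txt_emb: (B, D) embeddings from separate image/text encoders
    (placeholder inputs here; the real model learns them jointly).
    """
    img = F.normalize(img_emb, dim=1)
    txt = F.normalize(txt_emb, dim=1)
    logits = img @ txt.T / temperature          # (B, B) similarity matrix
    labels = torch.arange(logits.size(0))       # matching pairs on diagonal
    return (F.cross_entropy(logits, labels) +
            F.cross_entropy(logits.T, labels)) / 2

loss = clip_contrastive_loss(torch.randn(4, 512), torch.randn(4, 512))
```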

PyTorch: An Imperative Style, High-Performance Deep Learning Library

This paper details the principles that drove the implementation of PyTorch and how they are reflected in its architecture, and explains how the careful and pragmatic implementation of the key components of its runtime enables them to work together to achieve compelling performance.
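
A small example of the imperative style the paper describes: operations execute eagerly, and autograd records them as the Python code runs:

```python
import torch

# Eager execution: each operation runs immediately, and autograd
# records it so gradients can be computed afterwards.
x = torch.randn(3, requires_grad=True)
y = (x ** 2).sum()
y.backward()
print(x.grad)   # equals 2 * x
```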

Moonrise Kingdom. Indian Paintbrush / American Empirical Pictures, 2012

Video Swin Transformer

Ze Liu, J. Ning, Han Hu, et al. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
This paper advocates an inductive bias of locality in video Transformers, which leads to a better speed-accuracy trade-off compared to previous approaches which compute self-attention globally even with spatial-temporal factorization.
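
A minimal sketch of that locality bias: tokens on a (T, H, W) grid are partitioned into non-overlapping 3D windows so self-attention can be restricted to each window (illustrative shapes only; the paper's full model also shifts the windows between layers):

```python
import torch

def window_partition_3d(x, window=(2, 4, 4)):
    """Split a (B, T, H, W, C) token grid into non-overlapping 3D windows.

    Returns (num_windows * B, t*h*w, C); attention applied within each
    window touches only local tokens, which is the locality inductive bias.
    Assumes T, H, W are divisible by the window size (illustrative only).
    """
    B, T, H, W, C = x.shape
    t, h, w = window
    x = x.view(B, T // t, t, H // h, h, W // w, w, C)
    x = x.permute(0, 1, 3, 5, 2, 4, 6, 7).contiguous()
    return x.view(-1, t * h * w, C)

windows = window_partition_3d(torch.randn(1, 8, 16, 16, 96))
print(windows.shape)  # (num_windows, tokens_per_window, channels)
```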

RAFT: Recurrent All-Pairs Field Transforms for Optical Flow

We introduce Recurrent All-Pairs Field Transforms (RAFT), a new deep network architecture for optical flow. RAFT extracts per-pixel features, builds multi-scale 4D correlation volumes for all pairs of pixels, and iteratively updates a flow field through a recurrent unit that performs lookups on the correlation volumes.
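
A minimal sketch of the all-pairs correlation volume (the core data structure; RAFT's recurrent update operator and multi-scale pooling are omitted):

```python
import torch

def all_pairs_correlation(f1, f2):
    """Correlate every pixel of frame 1 with every pixel of frame 2.

    f1, f2: (B, C, H, W) feature maps from a shared encoder.
    Returns a (B, H, W, H, W) volume; RAFT additionally pools this
    volume at multiple scales and looks it up during recurrent updates.
    """
    B, C, H, W = f1.shape
    a = f1.view(B, C, H * W)
    b = f2.view(B, C, H * W)
    corr = torch.einsum('bci,bcj->bij', a, b) / C ** 0.5
    return corr.view(B, H, W, H, W)

vol = all_pairs_correlation(torch.randn(1, 64, 32, 32),
                            torch.randn(1, 64, 32, 32))
```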

Billion-Scale Similarity Search with GPUs

This paper proposes a novel design for k-selection that enables the construction of high-accuracy brute-force, approximate, and compressed-domain search based on product quantization, and applies it in different similarity search scenarios.
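
A small usage sketch with the faiss library the paper describes (IVF coarse quantizer plus product quantization; the parameters are illustrative, not tuned):

```python
import faiss
import numpy as np

d, nlist, m = 128, 100, 8                 # dim, coarse cells, PQ sub-vectors
xb = np.random.rand(10000, d).astype('float32')   # database vectors
xq = np.random.rand(5, d).astype('float32')       # query vectors

quantizer = faiss.IndexFlatL2(d)          # coarse quantizer
index = faiss.IndexIVFPQ(quantizer, d, nlist, m, 8)  # 8 bits per sub-code
index.train(xb)                           # learn cells and PQ codebooks
index.add(xb)
index.nprobe = 10                         # cells visited per query
D, I = index.search(xq, 5)                # distances and neighbor ids
```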

XGBoost: A Scalable Tree Boosting System

This paper proposes a novel sparsity-aware algorithm for sparse data and a weighted quantile sketch for approximate tree learning, and provides insights on cache access patterns, data compression, and sharding to build a scalable tree boosting system called XGBoost.
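
A small usage sketch of the xgboost training API on toy data (hyperparameters are illustrative):

```python
import numpy as np
import xgboost as xgb

# Toy binary classification data standing in for a real dataset.
X = np.random.rand(1000, 20)
y = (X[:, 0] + X[:, 1] > 1).astype(int)

dtrain = xgb.DMatrix(X, label=y)          # handles sparse input natively
params = {'objective': 'binary:logistic', 'max_depth': 4, 'eta': 0.1}
booster = xgb.train(params, dtrain, num_boost_round=50)
preds = booster.predict(xgb.DMatrix(X))   # probabilities in [0, 1]
```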

Deep Residual Learning for Image Recognition

This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
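
A minimal residual block in PyTorch: the convolutions learn a residual F(x) that is added back to the identity shortcut (simplified relative to the paper's blocks):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """Basic block: output = relu(F(x) + x), with F two 3x3 convolutions.
    Simplified (same channel count, stride 1) relative to the paper."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + x)              # identity shortcut

y = ResidualBlock(64)(torch.randn(1, 64, 32, 32))
```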

Empirical Evaluation of Rectified Activations in Convolutional Network

The experiments suggest that incorporating a non-zero slope for the negative part of rectified activation units could consistently improve results, and they cast doubt on the common belief that sparsity is the key to good performance in ReLU networks.
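
The non-zero negative slope the paper evaluates is what leaky ReLU provides; a one-line comparison in PyTorch:

```python
import torch
import torch.nn.functional as F

x = torch.tensor([-2.0, -0.5, 0.0, 1.0])
print(F.relu(x))                            # zeros out negatives
print(F.leaky_relu(x, negative_slope=0.1))  # keeps a small negative slope
```
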
...