Actor and Observer: Joint Modeling of First and Third-Person Videos

@inproceedings{Sigurdsson2018ActorAO,
  title={Actor and Observer: Joint Modeling of First and Third-Person Videos},
  author={Gunnar A. Sigurdsson and Abhinav Gupta and Cordelia Schmid and Ali Farhadi and Karteek Alahari},
  booktitle={2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2018},
  pages={7396-7404}
}
Abstract

Several theories in cognitive neuroscience suggest that when people interact with the world, or simulate interactions, they do so from a first-person egocentric perspective, and seamlessly transfer knowledge between third-person (observer) and first-person (actor). Despite this, learning such models for human action recognition has not been achievable due to the lack of data. This paper takes a step in this direction, with the introduction of Charades-Ego, a large-scale dataset of paired first…

Citations

Publications citing this paper (showing 1-10 of 21 citations):

• What I See Is What You See: Joint Attention Learning for First and Third Person Video Co-analysis (cites background, methods & results; highly influenced; 15 excerpts)

• Visual-GPS: Ego-Downward and Ambient Video Based Person Location Association (cites methods & background; highly influenced; 6 excerpts)

• H+O: Unified Egocentric Recognition of 3D Hand-Object Poses and Interactions (cites background; 1 excerpt)

• Egocentric Meets Top-View (cites background; 1 excerpt)

• EPIC-Fusion: Audio-Visual Temporal Binding for Egocentric Action Recognition (cites background; 1 excerpt)

• Dynamic Motion Representation for Human Action Recognition (cites background; 2 excerpts)

• Beyond the Camera: Neural Networks in World Coordinates (cites background; 1 excerpt)


Citation Statistics

• 6 highly influenced citations

• Averaged 7 citations per year from 2018 through 2020

References

Publications referenced by this paper (showing 1-4 of 4 references):

• Learning Image Representations Tied to Ego-Motion (highly influential; 6 excerpts)

• Detecting activities of daily living in first-person camera views (highly influential; 5 excerpts)

• Unsupervised Learning of Visual Representations Using Videos (highly influential; 5 excerpts)

• Discovering important people and objects for egocentric video summarization (highly influential; 4 excerpts)