Coherent Multi-sentence Video Description with Variable Level of Detail

@article{Rohrbach2014CoherentMV,
  title={Coherent Multi-sentence Video Description with Variable Level of Detail},
  author={Anna Rohrbach and Marcus Rohrbach and Wei Qiu and Annemarie Friedrich and Manfred Pinkal and Bernt Schiele},
  journal={ArXiv},
  year={2014},
  volume={abs/1403.6173}
}
  • Anna Rohrbach, Marcus Rohrbach, Wei Qiu, Annemarie Friedrich, Manfred Pinkal, Bernt Schiele
  • Published 2014
  • Computer Science
  • ArXiv
  • Humans can easily describe what they see in a coherent way and at varying level of detail. However, existing approaches for automatic video description focus on generating only single sentences and are not able to vary the descriptions’ level of detail. In this paper, we address both of these limitations: for a variable level of detail we produce coherent multi-sentence descriptions of complex videos. To understand the difference between detailed and short descriptions, we collect and analyze a…


