Hierarchical Label Propagation and Discovery for Machine Generated Email

@inproceedings{Wendt2016HierarchicalLP,
  title={Hierarchical Label Propagation and Discovery for Machine Generated Email},
  author={James Bradley Wendt and Michael Bendersky and Lluis Garcia Pueyo and Vanja Josifovski and Balint Miklos and Ivo Krka and Amitabh Saikia and Jie Yang and Marc-Allen Cartright and Sujith Ravi},
  booktitle={WSDM},
  year={2016}
}
Machine-generated documents such as email or dynamic web pages are single instantiations of a pre-defined structural template. As such, they can be viewed as a hierarchy of template and document-specific content. This hierarchical template representation has several important advantages for document clustering and classification. First, templates capture common topics among the documents, while filtering out potentially noisy variability such as personal information. Second, template…
Highly Cited
This paper has 17 citations.

From This Paper

Figures, tables, results, and topics from this paper.

Key Quantitative Results

  • We demonstrate that the template label propagation achieves more than 91% precision and 93% recall, while increasing the label coverage by more than 11%.
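
The excerpt above does not describe how labels are propagated, so the following is only a minimal sketch of one plausible reading of "template label propagation". It assumes a simple majority vote per template, which is not confirmed by the paper. The names `propagate_labels`, `coverage`, and `min_agreement`, and the sample data, are hypothetical and used purely for illustration.

```python
from collections import Counter, defaultdict


def propagate_labels(docs, min_agreement=0.9):
    """Copy a template's dominant label onto its unlabeled documents.

    ASSUMPTION: majority voting with an agreement threshold is an
    illustrative stand-in, not the method from the paper.
    Each doc is a dict with a 'template' id and a 'label' (None if unlabeled).
    """
    # Count the labels seen on each template's labeled documents.
    votes = defaultdict(Counter)
    for doc in docs:
        if doc["label"] is not None:
            votes[doc["template"]][doc["label"]] += 1

    # Keep a template-level label only when the labeled documents mostly agree.
    template_label = {}
    for template, counter in votes.items():
        label, count = counter.most_common(1)[0]
        if count / sum(counter.values()) >= min_agreement:
            template_label[template] = label

    # Push the template label down to documents that have no label yet.
    propagated = []
    for doc in docs:
        label = doc["label"]
        if label is None:
            label = template_label.get(doc["template"])
        propagated.append({**doc, "label": label})
    return propagated


def coverage(docs):
    """Fraction of documents that carry a label."""
    return sum(doc["label"] is not None for doc in docs) / len(docs)


# Hypothetical sample data: three templates, half of the documents labeled.
docs = [
    {"template": "t1", "label": "receipt"},
    {"template": "t1", "label": "receipt"},
    {"template": "t1", "label": None},
    {"template": "t2", "label": "travel"},
    {"template": "t2", "label": None},
    {"template": "t3", "label": None},
]
propagated = propagate_labels(docs)
```

In this toy data, the unlabeled documents of templates `t1` and `t2` inherit their template's label, while `t3` stays unlabeled because none of its documents carries a label. Comparing `coverage(docs)` with `coverage(propagated)` shows the kind of coverage increase the result above refers to, though the figures from this sketch have no bearing on the paper's reported numbers.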
