Incorporating site-level knowledge to extract structured data from web forums

@inproceedings{Yang2009IncorporatingSK,
  title={Incorporating site-level knowledge to extract structured data from web forums},
  author={Jiang-Ming Yang and Rui Cai and Yida Wang and Jun Zhu and Lei Zhang and Wei-Ying Ma},
  booktitle={WWW},
  year={2009}
}
Web forums have become an important data resource for many web applications, but extracting structured data from unstructured web forum pages remains challenging because of complex page layouts and unrestricted user-created posts. In this paper, we study the problem of structured data extraction from various web forum sites. Our goal is to find a solution that is as general as possible for extracting structured data, such as post title, post author, post time, and post content from any…

Semantic Scholar estimates that this publication has 65 citations based on the available data.
